Bridging Learning Analytics and Cognitive Computing for Big Data Classification in Micro-Learning Video Collections


Moving towards the next generation of personalized learning environments requires intelligent, analytics-driven approaches for advanced learning contexts with enriched digital content. Micro-Learning through Massive Open Online Courses is gaining popularity as a novel paradigm for delivering short educational videos in small, pre-organized chunks over time, so that learners can acquire knowledge in a manageable way. However, with the ever-increasing number of videos, it has become challenging to organize and search them by category. In this paper, we address this problem by bridging Learning Analytics and Cognitive Computing to analyze the content of large video collections, going beyond traditional term-based methods. We propose an efficient and effective approach to automatically classify a collection of educational videos into pre-existing categories, which uses (i) a Speech-to-Text tool to obtain video transcripts, (ii) Natural Language Processing and Cognitive Computing methods to extract semantic concepts and keywords from the transcripts as their representation, and (iii) Apache Spark as the Big Data technology for scalability. Several classifiers are trained on the feature vectors extracted by the Cognitive Computing tools. We then compare our approach with other combinations of state-of-the-art feature types and classifiers on a large-scale dataset we collected from Coursera. Based on the experimental results, we expect that our approach can facilitate the development of Learning Analytics tools powered by Cognitive Computing, supporting content managers in micro-learning video management while improving how learners search for videos.
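
To make the flow from transcript to category concrete, the following is a minimal Python sketch under stated assumptions, not the implementation evaluated in this paper. The concept lexicon, the function names, and the toy transcripts are all hypothetical. The lexicon lookup stands in for the Cognitive Computing concept extractor, and a simple nearest-centroid classifier stands in for the classifiers trained on Apache Spark. Transcripts are assumed to have already been produced by a Speech-to-Text tool.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stand-in for a Cognitive Computing concept extractor:
# it maps surface keywords to semantic concepts.
CONCEPT_LEXICON = {
    "matrix": "linear_algebra",
    "vector": "linear_algebra",
    "neuron": "machine_learning",
    "gradient": "machine_learning",
    "cell": "biology",
    "protein": "biology",
}


def extract_features(transcript):
    """Turn a transcript into a normalized concept-frequency vector."""
    tokens = transcript.lower().split()
    counts = Counter(CONCEPT_LEXICON[t] for t in tokens if t in CONCEPT_LEXICON)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(concept, 0.0) for concept, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train_centroids(labeled_transcripts):
    """Average the feature vectors of each category (nearest-centroid model)."""
    sums = defaultdict(Counter)
    sizes = Counter()
    for transcript, category in labeled_transcripts:
        sizes[category] += 1
        for concept, weight in extract_features(transcript).items():
            sums[category][concept] += weight
    return {
        category: {c: w / sizes[category] for c, w in vec.items()}
        for category, vec in sums.items()
    }


def classify(transcript, centroids):
    """Assign the category whose centroid is most similar to the transcript."""
    features = extract_features(transcript)
    return max(centroids, key=lambda cat: cosine(features, centroids[cat]))


# Toy usage with invented transcripts and category names.
training = [
    ("the matrix times a vector", "Math"),
    ("gradient updates each neuron", "Computer Science"),
    ("a protein inside the cell", "Life Sciences"),
]
centroids = train_centroids(training)
predicted = classify("multiply the vector by the matrix", centroids)
```

The point of the sketch is that videos are compared on the concepts they mention rather than on raw terms, which is the distinction from term-based methods drawn above. At scale, the feature extraction and training steps would be distributed rather than run in a single process.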