2022-09-12 Paper Report: MOOCCube: A Large-scale Data Repository for NLP Applications in MOOCs

Report title: MOOCCube: A Large-scale Data Repository for NLP Applications in MOOCs

Paper source: ACL 2020


Jifan Yu1*, Gan Luo1*, Tong Xiao1, Qingyang Zhong1, Yuquan Wang1, Wenzheng Feng1, Junyi Luo1, Chenyu Wang1, Lei Hou1,2,3, Juanzi Li1,2,3+, Zhiyuan Liu1,2,3, Jie Tang1,2,3


1 Dept. of Computer Sci. & Tech., Tsinghua University, China 100084

2 KIRC, Institute for Artificial Intelligence, Tsinghua University, China 100084

3 Beijing National Research Center for Information Science and Technology, China 100084


Report time: September 12, 2022, 2:00 PM



The prosperity of Massive Open Online Courses (MOOCs) provides fodder for many NLP and AI research for education applications, e.g., course concept extraction, prerequisite relation discovery, etc. However, the publicly available datasets of MOOC are limited in size with few types of data, which hinders advanced models and novel attempts in related topics. Therefore, we present MOOCCube, a large-scale data repository of over 700 MOOC courses, 100k concepts, 8 million student behaviors with an external resource. Moreover, we conduct a prerequisite discovery task as an example application to show the potential of MOOCCube in facilitating relevant research. The data repository is now available at

