Inconsistencies in the Definition and Annotation of Student Engagement in Virtual Learning Datasets: A Critical Review

Paper Title

Inconsistencies in the Definition and Annotation of Student Engagement in Virtual Learning Datasets: A Critical Review

Authors

Khan, Shehroz S., Abedi, Ali, Colella, Tracey

Abstract

Background: Student engagement (SE) in virtual learning can have a major impact on meeting learning objectives and program dropout risks. Developing Artificial Intelligence (AI) models for automatic SE measurement requires annotated datasets. However, existing SE datasets suffer from inconsistent definitions and annotation protocols mostly unaligned with the definition of SE in educational psychology. This issue could be misleading in developing generalizable AI models and make it hard to compare the performance of these models developed on different datasets. The objective of this critical review was to explore the existing SE datasets and highlight inconsistencies in terms of differing engagement definitions and annotation protocols. Methods: Several academic databases were searched for publications introducing new SE datasets. The datasets containing students' single- or multi-modal data in online or offline computer-based virtual learning sessions were included. The definition and annotation of SE in the existing datasets were analyzed based on our defined seven dimensions of engagement annotation: sources, data modalities, timing, temporal resolution, level of abstraction, combination, and quantification. Results: Thirty SE measurement datasets met the inclusion criteria. The reviewed SE datasets used very diverse and inconsistent definitions and annotation protocols. Unexpectedly, very few of the reviewed datasets used existing psychometrically validated scales in their definition of SE. Discussion: The inconsistent definition and annotation of SE are problematic for research on developing comparable AI models for automatic SE measurement. Some of the existing SE definitions and protocols in settings other than virtual learning that have the potential to be used in virtual learning are introduced.
