Paper title
MITAO: a tool for enabling scholars in the Humanities to use Topic Modelling in their studies
Paper authors
Paper abstract
Automatic text analysis methods, such as Topic Modelling, are gaining much attention in the Humanities. However, scholars need extensive coding skills to use such methods appropriately, and the need for this technical expertise prevents their broad adoption in Humanities research. In this paper, to help Humanities scholars with no or limited coding skills use Topic Modelling, we introduce MITAO, a web-based tool that allows the definition of a visual workflow embedding various automatic text analysis operations. MITAO also allows one to store the workflow and the results of its execution and to share both with other researchers, which enables the reproducibility of the analysis. We present an example application of Topic Modelling with MITAO using a collection of English abstracts of the articles published in "Umanistica Digitale". The results returned by MITAO are shown with dynamic web-based visualizations, which gave us preliminary insights into how the topics treated in the articles published in "Umanistica Digitale" evolved over time. All the results, along with the defined workflows, are published and accessible for further studies.
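To make "the evolution of the topics over time" concrete, the following is a minimal sketch of one way such a trend could be computed once a topic model has assigned topic weights to each abstract. It is not MITAO's implementation: the function name `mean_topic_share_by_year`, the three-topic setup, and the sample weights are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical input: one (publication_year, topic_weights) pair per abstract.
# The weights stand in for the per-document topic distribution a topic model
# would return; the numbers are invented for illustration only.
doc_topics = [
    (2017, [0.7, 0.2, 0.1]),
    (2017, [0.5, 0.3, 0.2]),
    (2018, [0.2, 0.6, 0.2]),
    (2019, [0.1, 0.3, 0.6]),
    (2019, [0.3, 0.1, 0.6]),
]


def mean_topic_share_by_year(docs):
    """Average each topic's weight over the documents published in a year."""
    sums = {}
    counts = defaultdict(int)
    for year, weights in docs:
        if year not in sums:
            sums[year] = [0.0] * len(weights)
        for i, w in enumerate(weights):
            sums[year][i] += w
        counts[year] += 1
    # Return the years in chronological order so the trend reads left to right.
    return {
        year: [s / counts[year] for s in sums[year]]
        for year in sorted(sums)
    }


trend = mean_topic_share_by_year(doc_topics)
for year, shares in trend.items():
    print(year, [round(s, 2) for s in shares])
```

Plotting each topic's yearly mean share as a line would give a rough static counterpart to the dynamic visualizations mentioned in the abstract.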