Paper title

The Complexity of Comparative Text Analysis -- "The Gardener is always the Murderer" says the Fourth Machine

Authors

Marcus Weber, Konstantin Fackeldey

Abstract

There is a heated debate about how far computers can map the complexity of text analysis compared to the abilities of a whole team of human researchers. A "deep" analysis of a given text is still beyond the capabilities of modern computers. At the heart of existing computational text analysis algorithms are operations on real numbers, such as addition and multiplication according to the rules of algebraic fields. However, the process of "comparing" has a very precise mathematical structure, which differs from that of an algebraic field: it can be expressed using Boolean rings. We build on this structure and define the corresponding algebraic equations, lifting algorithms of comparative text analysis onto the "correct" algebraic basis. From this point of view, we can investigate the question of the computational complexity of comparative text analysis.
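To make the contrast between field arithmetic and Boolean-ring arithmetic concrete, here is a minimal Python sketch. It is not the authors' construction: it only assumes that each text has been reduced to a set of features (the feature names are hypothetical) and uses the standard Boolean ring on sets, where "addition" is symmetric difference and "multiplication" is intersection.

```python
# Illustrative only: a Boolean ring on feature sets, not the paper's algorithm.

def add(a: frozenset, b: frozenset) -> frozenset:
    """Ring addition: symmetric difference (features found in exactly one text)."""
    return a ^ b

def mul(a: frozenset, b: frozenset) -> frozenset:
    """Ring multiplication: intersection (features shared by both texts)."""
    return a & b

ZERO = frozenset()  # additive identity: the empty feature set

# Hypothetical feature sets extracted from three texts.
text_a = frozenset({"garden", "knife", "alibi"})
text_b = frozenset({"garden", "butler", "alibi"})
text_c = frozenset({"knife", "butler"})

shared = mul(text_a, text_b)      # what the two texts have in common
differing = add(text_a, text_b)   # where the two texts differ

# Identities that hold in a Boolean ring but not in the field of real numbers:
assert mul(text_a, text_a) == text_a   # x * x = x (idempotent multiplication)
assert add(text_a, text_a) == ZERO     # x + x = 0 (every element is its own negative)

# Multiplication still distributes over addition, as in any ring.
assert mul(text_a, add(text_b, text_c)) == add(mul(text_a, text_b), mul(text_a, text_c))
```

Comparing a text with itself yields no difference (`x + x = 0`) and leaves the shared content unchanged (`x * x = x`), which is the kind of behaviour real-number addition and multiplication cannot reproduce.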
