Title
Language-independence of DisCoCirc's Text Circuits: English and Urdu
Authors
Abstract
DisCoCirc is a newly proposed framework for representing the grammar and semantics of texts using compositional, generative circuits. While it constitutes a development of the Categorical Distributional Compositional (DisCoCat) framework, it exposes radically new features. In particular, [14] suggested that DisCoCirc goes some way toward eliminating grammatical differences between languages. In this paper we sketch an argument that this is indeed the case for restricted fragments of English and Urdu. We first develop DisCoCirc for a fragment of Urdu, as was done for English in [14]. There is a simple translation from English grammar to Urdu grammar, and vice versa. We then show that differences in grammatical structure between English and Urdu, which relate primarily to the ordering of words and phrases, vanish when passing to DisCoCirc circuits.