Paper Title

The geometry of integration in text classification RNNs

Paper Authors

Kyle Aitken, Vinay V. Ramasesh, Ankush Garg, Yuan Cao, David Sussillo, Niru Maheswaranathan

Paper Abstract

Despite the widespread application of recurrent neural networks (RNNs) across a variety of tasks, a unified understanding of how RNNs solve these tasks remains elusive. In particular, it is unclear what dynamical patterns arise in trained RNNs, and how those patterns depend on the training dataset or task. This work addresses these questions in the context of a specific natural language processing task: text classification. Using tools from dynamical systems analysis, we study recurrent networks trained on a battery of both natural and synthetic text classification tasks. We find the dynamics of these trained RNNs to be both interpretable and low-dimensional. Specifically, across architectures and datasets, RNNs accumulate evidence for each class as they process the text, using a low-dimensional attractor manifold as the underlying mechanism. Moreover, the dimensionality and geometry of the attractor manifold are determined by the structure of the training dataset; in particular, we describe how simple word-count statistics computed on the training dataset can be used to predict these properties. Our observations span multiple architectures and datasets, reflecting a common mechanism RNNs employ to perform text classification. To the degree that integration of evidence towards a decision is a common computational primitive, this work lays the foundation for using dynamical systems techniques to study the inner workings of RNNs.
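
To make the evidence-integration picture concrete, the minimal Python sketch below (not the authors' implementation; the word counts, smoothing constant, and variable names are all hypothetical) accumulates smoothed log-odds word scores, the kind of simple word-count statistic the abstract refers to, over a short document, then applies PCA to the resulting state trajectory. For a two-class integrator, variance concentrates in roughly one dimension, mirroring the low-dimensional attractor manifolds reported in the paper.

import numpy as np

# Hypothetical per-class training word counts (e.g., positive vs. negative reviews).
counts = {
    "great":    np.array([90, 10]),
    "terrible": np.array([5, 80]),
    "movie":    np.array([50, 50]),
    "boring":   np.array([8, 60]),
}
alpha = 1.0  # additive smoothing for unseen or rare words

def word_evidence(word):
    """Smoothed per-class log-probabilities of a word: a word-count statistic."""
    c = counts.get(word, np.zeros(2)) + alpha
    return np.log(c / c.sum())

# Integrate evidence word by word, analogous to an RNN accumulating
# per-class evidence as it processes the text.
doc = "great movie but boring".split()
state = np.zeros(2)                    # accumulated evidence per class
trajectory = [state.copy()]
for w in doc:
    state = state + word_evidence(w)   # pure integration: state += input
    trajectory.append(state.copy())

traj = np.array(trajectory)
print("predicted class:", traj[-1].argmax())

# PCA (via SVD) on the mean-centered trajectory: for two classes the
# decision-relevant evidence lives along essentially one axis.
centered = traj - traj.mean(axis=0)
_, s, _ = np.linalg.svd(centered, full_matrices=False)
var = s**2 / (s**2).sum()
print("variance explained by top component: %.2f" % var[0])

For an N-way task the same construction yields an evidence space of roughly N-1 dimensions, consistent with the paper's observation that the dimensionality and geometry of the attractor manifold track the structure of the training dataset.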
