Paper Title

Hierarchically Structured Task-Agnostic Continual Learning

Authors

Heinke Hihn, Daniel A. Braun

Abstract

One notable weakness of current machine learning algorithms is the poor ability of models to solve new problems without forgetting previously acquired knowledge. The Continual Learning paradigm has emerged as a protocol to systematically investigate settings where the model sequentially observes samples generated by a series of tasks. In this work, we take a task-agnostic view of continual learning and develop a hierarchical information-theoretic optimality principle that facilitates a trade-off between learning and forgetting. We derive this principle from a Bayesian perspective and show its connections to previous approaches to continual learning. Based on this principle, we propose a neural network layer, called the Mixture-of-Variational-Experts layer, that alleviates forgetting by creating a set of information-processing paths through the network, governed by a gating policy. Equipped with a diverse and specialized set of parameters, each path can be regarded as a distinct sub-network that learns to solve tasks. To improve expert allocation, we introduce diversity objectives, which we evaluate in additional ablation studies. Importantly, our approach can operate in a task-agnostic way, i.e., it does not require task-specific knowledge, unlike many existing continual learning algorithms. Due to the general formulation based on generic utility functions, we can apply this optimality principle to a large variety of learning problems, including supervised learning, reinforcement learning, and generative modeling. We demonstrate the competitive performance of our method on continual reinforcement learning and variants of the MNIST, CIFAR-10, and CIFAR-100 datasets.
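For readers who want a concrete picture of the layer the abstract describes, below is a minimal PyTorch sketch of the general gating-over-experts pattern: a gating network scores a set of expert sub-networks per input, and the layer combines their outputs. All names here (`MoELayer`, `n_experts`) are hypothetical, and the soft mixing shown is a simplification; it omits the variational experts, the information-theoretic regularizers, and the diversity objectives that define the paper's actual Mixture-of-Variational-Experts layer.

```python
# Minimal sketch of a mixture-of-experts layer with a learned gating policy.
# Illustrative only -- NOT the authors' Mixture-of-Variational-Experts code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MoELayer(nn.Module):
    def __init__(self, in_dim: int, out_dim: int, n_experts: int = 4):
        super().__init__()
        # Each expert is an independent sub-network with its own parameters,
        # i.e., a distinct information-processing path through the layer.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU())
            for _ in range(n_experts)
        )
        # The gating policy scores the experts for each input sample.
        self.gate = nn.Linear(in_dim, n_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gate_probs = F.softmax(self.gate(x), dim=-1)  # (batch, n_experts)
        expert_outs = torch.stack(
            [expert(x) for expert in self.experts], dim=1
        )  # (batch, n_experts, out_dim)
        # Soft mixture: weight each expert's output by its gating probability.
        return (gate_probs.unsqueeze(-1) * expert_outs).sum(dim=1)


if __name__ == "__main__":
    layer = MoELayer(in_dim=8, out_dim=16, n_experts=4)
    y = layer(torch.randn(32, 8))
    print(y.shape)  # torch.Size([32, 16])
```

Hard top-1 routing (sending each input to only its argmax expert) would make the sub-network separation explicit at the cost of a non-differentiable choice; the soft mixture above keeps the sketch end-to-end trainable.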
