Paper Title
Disentangling Shape and Pose for Object-Centric Deep Active Inference Models
Authors
Abstract
Active inference is a first-principles approach to understanding the brain in particular, and sentient agents in general, with the single imperative of minimizing free energy. As such, it provides a computational account for modelling artificial intelligent agents by defining the agent's generative model and inferring the model parameters, actions and hidden state beliefs. However, the exact specification of the generative model and the hidden state space structure is left to the experimenter, whose design choices influence the resulting behaviour of the agent. Recently, deep learning methods have been proposed to learn a hidden state space structure purely from data, relieving the experimenter of this tedious design task but resulting in an entangled, non-interpretable state space. In this paper, we hypothesize that such a learnt, entangled state space does not necessarily yield the best model in terms of free energy, and that enforcing different factors in the state space can yield a lower model complexity. In particular, we consider the problem of 3D object representation and focus on different instances of the ShapeNet dataset. We propose a model that factorizes object shape, pose and category, while still learning a representation for each factor using a deep neural network. We show that the models with the best disentanglement properties perform best when adopted by an active agent to reach preferred observations.