Paper Title
NIFT: Neural Interaction Field and Template for Object Manipulation
Paper Authors
Paper Abstract
We introduce NIFT, Neural Interaction Field and Template, a descriptive and robust interaction representation of object manipulations to facilitate imitation learning. Given a few object manipulation demos, NIFT guides the generation of the interaction imitation for a new object instance by matching the Neural Interaction Template (NIT) extracted from the demos in the target Neural Interaction Field (NIF) defined for the new object. Specifically, the NIF is a neural field that encodes the relationship between each spatial point and a given object, where the relative position is defined by a spherical distance function rather than occupancies or signed distances, which are commonly adopted by conventional neural fields but less informative. For a given demo interaction, the corresponding NIT is defined by a set of spatial points sampled in the demo NIF with associated neural features. To better capture the interaction, the points are sampled on the Interaction Bisector Surface (IBS), which consists of points that are equidistant to the two interacting objects and has been used extensively for interaction representation. With both point selection and pointwise features defined for better interaction encoding, NIT effectively guides the feature matching in the NIFs of the new object instances such that the relative poses are optimized to realize the manipulation while imitating the demo interactions. Experiments show that our NIFT solution outperforms state-of-the-art imitation learning methods for object manipulation and generalizes better to objects from new categories.
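The abstract defines the IBS as the set of points equidistant to the two interacting objects. A minimal sketch of that definition alone, assuming each object is given as a list of 3D surface points, might look like the following. It uses plain rejection sampling inside a padded bounding box, which is an illustrative stand-in and not necessarily how the paper computes the IBS; the names `nearest_distance` and `sample_ibs_points` and the parameters `tol` and `pad` are hypothetical.

```python
import math
import random


def nearest_distance(point, cloud):
    """Distance from `point` to its nearest neighbour in `cloud` (brute force)."""
    return min(math.dist(point, q) for q in cloud)


def sample_ibs_points(cloud_a, cloud_b, num_candidates=5000, tol=0.01, pad=0.5, seed=0):
    """Keep random candidates that are approximately equidistant to both clouds.

    Assumption: objects are approximated by point clouds, and "equidistant"
    is relaxed to |d_a - d_b| < tol. This is a sketch, not the paper's method.
    """
    rng = random.Random(seed)
    both = list(cloud_a) + list(cloud_b)
    # Axis-aligned bounding box around both objects, enlarged by `pad`.
    lo = [min(p[i] for p in both) - pad for i in range(3)]
    hi = [max(p[i] for p in both) + pad for i in range(3)]

    kept = []
    for _ in range(num_candidates):
        candidate = tuple(rng.uniform(lo[i], hi[i]) for i in range(3))
        d_a = nearest_distance(candidate, cloud_a)
        d_b = nearest_distance(candidate, cloud_b)
        if abs(d_a - d_b) < tol:
            kept.append(candidate)
    return kept


if __name__ == "__main__":
    # Two single-point "objects": their bisector is the plane x = 0.
    object_a = [(-1.0, 0.0, 0.0)]
    object_b = [(1.0, 0.0, 0.0)]
    ibs = sample_ibs_points(object_a, object_b)
    print(len(ibs), "approximate IBS samples")
```

In the pipeline the abstract describes, points like these would then be paired with neural features queried from the demo NIF to form the NIT; that step depends on the trained network and is not sketched here.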