Paper Title

A Learning Approach to Robot-Agnostic Force-Guided High Precision Assembly

Paper Authors

Jieliang Luo, Hui Li

Abstract

In this work, we propose a learning approach to high-precision robotic assembly problems. We focus on the contact-rich phase, where the assembly pieces are in close contact with each other. Unlike many learning-based approaches that rely heavily on vision or spatial tracking, our approach takes force/torque in task space as the only observation. Our training environment is robot-less, as the end-effector is not attached to any specific robot; trained policies can then be applied to different robotic arms without re-training. This approach can greatly reduce the complexity of performing contact-rich robotic assembly in the real world, especially in unstructured settings such as architectural construction. To achieve this, we have developed a new distributed RL agent, named Recurrent Distributed DDPG (RD2), which extends Ape-X DDPG with recurrency and makes two structural improvements to prioritized experience replay. Our results show that RD2 is able to solve two fundamental high-precision assembly tasks, lap-joint and peg-in-hole, and outperforms two state-of-the-art algorithms, Ape-X DDPG and PPO with LSTM. We have successfully evaluated our robot-agnostic policies on three robotic arms, Kuka KR60, Franka Panda, and UR10, in simulation. A video presenting our experiments is available at https://sites.google.com/view/rd2-rl.
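The abstract states that RD2 extends Ape-X DDPG with recurrency and improves prioritized experience replay, while the policy observes only task-space force/torque. As a rough illustration only, the sketch below shows a generic sequence-level prioritized replay buffer over windows of 6-D force/torque observations; the class, parameters, and window length here are assumptions for illustration, not the paper's actual RD2 design.

```python
import numpy as np

class SequenceReplay:
    """Illustrative prioritized replay over fixed-length transition
    sequences (hypothetical sketch, not the paper's exact buffer).
    Sequences are sampled with probability proportional to
    priority**alpha, as in standard prioritized experience replay."""

    def __init__(self, capacity, alpha=0.6, seed=0):
        self.capacity = capacity
        self.alpha = alpha
        self.sequences = []   # each entry: a short window of transitions
        self.priorities = []  # one scalar priority per stored sequence
        self.rng = np.random.default_rng(seed)

    def add(self, sequence, priority):
        # Drop the oldest sequence once capacity is reached (FIFO).
        if len(self.sequences) >= self.capacity:
            self.sequences.pop(0)
            self.priorities.pop(0)
        self.sequences.append(sequence)
        self.priorities.append(float(priority))

    def sample(self, batch_size):
        # Sample sequence indices proportional to priority**alpha.
        p = np.array(self.priorities) ** self.alpha
        p /= p.sum()
        idx = self.rng.choice(len(self.sequences), size=batch_size, p=p)
        return [self.sequences[i] for i in idx], idx

# Each "sequence" is a window of (force/torque observation, action) pairs;
# the observation is a 6-D wrench, since force/torque is the only input.
buf = SequenceReplay(capacity=100)
for i in range(10):
    window = [(np.zeros(6), np.zeros(6)) for _ in range(8)]
    buf.add(window, priority=1.0 + i)
batch, indices = buf.sample(4)
```

A recurrent learner would unroll its LSTM over each sampled window rather than over single transitions, which is why whole sequences, not individual steps, carry the priorities here.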
