Causal Inference for De-biasing Motion Estimation from Robotic Observational Data

Paper title

Causal Inference for De-biasing Motion Estimation from Robotic Observational Data

Authors

Junhong Xu, Kai Yin, Jason M. Gregory, Lantao Liu

Abstract

Robot data collected in complex real-world scenarios are often biased due to safety concerns, human preferences, and mission or platform constraints. Consequently, robot learning from such observational data poses great challenges for accurate parameter estimation. We propose a principled causal inference framework for robots to learn the parameters of a stochastic motion model from observational data. Specifically, we leverage the de-biasing functionality of the potential-outcome causal inference framework, namely the Inverse Propensity Weighting (IPW) and Doubly Robust (DR) methods, to obtain better parameter estimates for the robot's stochastic motion model. IPW is a re-weighting approach that ensures unbiased estimation, and the DR approach further combines two estimators so that the result remains unbiased even if one of them is biased. We then develop an approximate policy iteration algorithm that uses the de-biased estimate of the state transition function. We validate our framework in both simulation and real-world experiments, and the results show that the proposed causal inference-based navigation and control framework can correctly and efficiently learn the parameters from biased observational data.
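To make the re-weighting idea concrete, here is a minimal sketch of the textbook IPW and DR estimators on a toy problem. It is not the authors' implementation. The binary context `x` (say, rough versus smooth terrain), the binary action `a`, the logging policy in `propensity`, the outcome model passed to `dr_estimate`, and all numeric constants are assumptions made up for illustration. The logged data favor action 1 on smooth terrain, so a plain average over the logged samples overestimates the mean outcome of action 1, whose true value in this toy setup is 1.0 - 0.6 * 0.5 = 0.7.

```python
import random


def propensity(x):
    """Assumed-known logging policy P(a = 1 | x); hypothetical values."""
    return 0.9 if x == 0 else 0.1


def simulate(n, seed=0):
    """Generate biased logs (x, a, y): action 1 is rarely tried when x == 1."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = 1 if rng.random() < 0.5 else 0
        a = 1 if rng.random() < propensity(x) else 0
        noise = rng.gauss(0.0, 0.1)
        # Outcome under action 1 depends on the context; action 0 is a fixed baseline.
        y = (1.0 - 0.6 * x + noise) if a == 1 else (0.2 + noise)
        data.append((x, a, y))
    return data


def naive_estimate(data):
    """Average outcome over samples where action 1 was logged (biased)."""
    ys = [y for _, a, y in data if a == 1]
    return sum(ys) / len(ys)


def ipw_estimate(data):
    """Inverse propensity weighting: weight each action-1 sample by 1 / P(a = 1 | x)."""
    return sum(y / propensity(x) for x, a, y in data if a == 1) / len(data)


def dr_estimate(data, outcome_model):
    """Doubly robust: outcome-model prediction plus an IPW-weighted residual correction."""
    total = 0.0
    for x, a, y in data:
        m = outcome_model(x)
        correction = (y - m) / propensity(x) if a == 1 else 0.0
        total += m + correction
    return total / len(data)


if __name__ == "__main__":
    logs = simulate(20000)
    crude_model = lambda x: 0.5  # deliberately misspecified outcome model
    print("naive:", naive_estimate(logs))
    print("IPW:  ", ipw_estimate(logs))
    print("DR:   ", dr_estimate(logs, crude_model))
```

In this sketch the DR estimate stays close to 0.7 even though `crude_model` is wrong, because the propensities are correct; that is the sense in which the estimator tolerates one of its two components being biased.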
