Paper Title
CARNet: A Dynamic Autoencoder for Learning Latent Dynamics in Autonomous Driving Tasks
Paper Authors
Paper Abstract
Autonomous driving has received a lot of attention in the automotive industry and is often seen as the future of transportation. Passenger vehicles equipped with a wide array of sensors (e.g., cameras, front-facing radars, LiDARs, and IMUs) capable of continuous perception of the environment are becoming increasingly prevalent. These sensors provide a stream of high-dimensional, temporally correlated data that is essential for reliable autonomous driving. An autonomous driving system should effectively use the information collected from the various sensors in order to form an abstract description of the world and maintain situational awareness. Deep learning models, such as autoencoders, can be used for that purpose, as they can learn compact latent representations from a stream of incoming data. However, most autoencoder models process the data independently, without assuming any temporal interdependencies. Thus, there is a need for deep learning models that explicitly consider the temporal dependence of the data in their architecture. This work proposes CARNet, a Combined dynAmic autoencodeR NETwork architecture that utilizes an autoencoder combined with a recurrent neural network to learn the current latent representation and, in addition, also predict future latent representations in the context of autonomous driving. We demonstrate the efficacy of the proposed model in both imitation and reinforcement learning settings using both simulated and real datasets. Our results show that the proposed model outperforms the baseline state-of-the-art model, while having significantly fewer trainable parameters.
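The abstract describes an architecture that pairs an autoencoder (for compact per-frame latents) with a recurrent network (for predicting future latents). A minimal sketch of this idea is shown below; all names and dimensions are illustrative assumptions, not the paper's actual implementation, and linear maps stand in for the convolutional encoder/decoder a real vision model would use.

```python
import numpy as np

# Hypothetical sketch: encode each observation to a latent vector,
# roll the latents through a simple recurrent cell, and read out a
# prediction of the NEXT latent at every step. Dimensions are assumed.
rng = np.random.default_rng(0)
obs_dim, latent_dim, hidden_dim = 64, 8, 16

# Linear autoencoder weights (a real model would use conv layers for images).
W_enc = rng.normal(0, 0.1, (latent_dim, obs_dim))
W_dec = rng.normal(0, 0.1, (obs_dim, latent_dim))

# Elman-style RNN cell operating on latent vectors.
W_xh = rng.normal(0, 0.1, (hidden_dim, latent_dim))
W_hh = rng.normal(0, 0.1, (hidden_dim, hidden_dim))
W_hz = rng.normal(0, 0.1, (latent_dim, hidden_dim))

def encode(x):
    return W_enc @ x

def decode(z):
    return W_dec @ z

def rnn_step(h, z):
    return np.tanh(W_xh @ z + W_hh @ h)

# Roll a short sequence of observations through encoder + RNN.
T = 5
observations = rng.normal(size=(T, obs_dim))
h = np.zeros(hidden_dim)
predicted_next_latents = []
for t in range(T):
    z_t = encode(observations[t])            # current latent representation
    h = rnn_step(h, z_t)                     # update recurrent state
    predicted_next_latents.append(W_hz @ h)  # predicted latent for step t+1

recon = decode(encode(observations[0]))      # reconstruction branch
print(recon.shape, len(predicted_next_latents), predicted_next_latents[0].shape)
```

Training such a model would combine a reconstruction loss on the decoder output with a prediction loss between each predicted latent and the encoder's latent at the next time step; those loss terms are omitted here for brevity.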