Paper Title

Designing Interpretable Approximations to Deep Reinforcement Learning

Authors

Dahlin, Nathan, Kalagarla, Krishna Chaitanya, Naik, Nikhil, Jain, Rahul, Nuzzo, Pierluigi

Abstract

In an ever-expanding set of research and application areas, deep neural networks (DNNs) set the bar for algorithm performance. However, depending upon additional constraints such as processing power and execution time limits, or requirements such as verifiable safety guarantees, it may not be feasible to actually use such high-performing DNNs in practice. Many techniques have been developed in recent years to compress or distill complex DNNs into smaller, faster, or more understandable models and controllers. This work seeks to identify reduced models that not only preserve a desired performance level, but also, for example, succinctly explain the latent knowledge represented by a DNN. We illustrate the effectiveness of the proposed approach on the evaluation of decision tree variants and kernel machines in the context of benchmark reinforcement learning tasks.
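To make the distillation idea concrete, the sketch below shows a generic way to approximate a trained policy with an interpretable decision tree via imitation: collect (state, action) pairs from a teacher policy and fit a shallow tree to them. This is a minimal illustration, not the paper's exact method; the `teacher_policy` function is a hypothetical hand-coded stand-in for a trained DNN policy, and the state layout loosely mimics a CartPole-like task.

```python
# Minimal sketch of distilling a policy into a decision tree (imitation).
# Assumption: `teacher_policy` stands in for a trained DNN policy; in
# practice it would be a network queried on states from rollouts.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

def teacher_policy(state):
    # Hypothetical stand-in: pick action 1 when a linear combination of
    # "pole angle" (state[2]) and "angular velocity" (state[3]) is positive.
    return int(state[2] + 0.5 * state[3] > 0)

# 1. Collect a dataset of (state, action) pairs from the teacher.
states = rng.uniform(-1.0, 1.0, size=(5000, 4))
actions = np.array([teacher_policy(s) for s in states])

# 2. Fit a shallow, human-readable decision tree to imitate the teacher.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(states, actions)

# 3. Measure fidelity: how often the tree matches the teacher's decisions.
fidelity = tree.score(states, actions)
print(f"fidelity to teacher: {fidelity:.3f}")
```

A depth-3 tree cannot exactly reproduce the teacher's oblique decision boundary with axis-aligned splits, so fidelity falls short of 1.0; trading tree depth against fidelity is exactly the kind of performance-versus-interpretability trade-off the abstract describes.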
