Paper Title

How to Backpropagate through Hungarian in Your DETR?

Authors

Chen, Lingji; Sharma, Alok; Shirore, Chinmay; Zhang, Chengjie; Buddharaju, Balarama Raju

Abstract

The DEtection TRansformer (DETR) approach, which uses a transformer encoder-decoder architecture and a set-based global loss, has become a building block in many transformer-based applications. However, as originally presented, the assignment cost and the global loss are not aligned, i.e., reducing the former is likely but not guaranteed to reduce the latter. Moreover, the issue of gradients is ignored when a combinatorial solver such as the Hungarian algorithm is used. In this paper we show that the global loss can be expressed as the sum of an assignment-independent term and an assignment-dependent term, the latter of which can be used to define the assignment cost matrix. Recent results on generalized gradients of the optimal assignment cost with respect to the parameters of an assignment problem are then used to define generalized gradients of the loss with respect to the network parameters, so that backpropagation is carried out properly. Our experiments using the same loss weights show interesting convergence properties and a potential for further performance improvements.
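
To make the idea of differentiating an optimal assignment cost concrete, here is a minimal sketch. It is not taken from the paper. It relies on a standard fact: when the optimal assignment is unique, the optimal cost is locally linear in the cost matrix, so its gradient with respect to that matrix is the 0/1 permutation matrix of the optimal assignment. The function names and the 3x3 cost matrix are hypothetical. Brute-force enumeration stands in for the Hungarian algorithm so that the example needs no dependencies; a real implementation would more likely use a solver such as `scipy.optimize.linear_sum_assignment`.

```python
from itertools import permutations


def optimal_assignment(cost):
    """Return (min_cost, perm), where perm[i] is the column matched to row i.

    Brute-force enumeration is used in place of the Hungarian algorithm.
    It finds the same optimum but is only practical for tiny matrices.
    """
    n = len(cost)
    best_perm = min(
        permutations(range(n)),
        key=lambda p: sum(cost[i][p[i]] for i in range(n)),
    )
    best_cost = sum(cost[i][best_perm[i]] for i in range(n))
    return best_cost, best_perm


def assignment_gradient(cost):
    """Gradient of the optimal cost with respect to the cost matrix.

    Entry (i, j) is 1.0 if row i is matched to column j, otherwise 0.0.
    This is a true gradient when the optimum is unique; at ties it is
    one element of the generalized gradient.
    """
    n = len(cost)
    _, perm = optimal_assignment(cost)
    return [[1.0 if perm[i] == j else 0.0 for j in range(n)] for i in range(n)]


# Hypothetical cost matrix: rows are predictions, columns are targets.
C = [
    [4.0, 1.0, 3.0],
    [2.0, 0.0, 5.0],
    [3.0, 2.0, 2.0],
]
value, perm = optimal_assignment(C)
grad = assignment_gradient(C)

# Finite-difference check: bump each entry and watch the optimal cost.
eps = 1e-4
fd = [[0.0] * 3 for _ in range(3)]
for i in range(3):
    for j in range(3):
        bumped = [row[:] for row in C]
        bumped[i][j] += eps
        fd[i][j] = (optimal_assignment(bumped)[0] - value) / eps
```

In a DETR-style setup the entries of the cost matrix are themselves functions of the network outputs, so this matrix would be chained with the gradient of each cost entry with respect to the network parameters. How the paper handles ties and splits the loss into assignment-independent and assignment-dependent terms is not reproduced here.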
