Not Just Pretty Pictures: Toward Interventional Data Augmentation Using Text-to-Image Generators

Paper title

Not Just Pretty Pictures: Toward Interventional Data Augmentation Using Text-to-Image Generators

Authors

Jianhao Yuan, Francesco Pinto, Adam Davies, Philip Torr

Abstract

Neural image classifiers are known to undergo severe performance degradation when exposed to inputs that are sampled from environmental conditions that differ from their training data. Given the recent progress in Text-to-Image (T2I) generation, a natural question is how modern T2I generators can be used to simulate arbitrary interventions over such environmental factors in order to augment training data and improve the robustness of downstream classifiers. We experiment across a diverse collection of benchmarks in single domain generalization (SDG) and reducing reliance on spurious features (RRSF), ablating across key dimensions of T2I generation, including interventional prompting strategies, conditioning mechanisms, and post-hoc filtering. Our extensive empirical findings demonstrate that modern T2I generators like Stable Diffusion can indeed be used as a powerful interventional data augmentation mechanism, outperforming previously state-of-the-art data augmentation techniques regardless of how each dimension is configured.

Paper title