Zero-shot Visual Commonsense Immorality Prediction

Paper Title

Zero-shot Visual Commonsense Immorality Prediction

Authors

Yujin Jeong, Seongbeom Park, Suhong Moon, Jinkyu Kim

Abstract

Artificial intelligence is currently powering diverse real-world applications. These applications have shown promising performance, but they raise complicated ethical issues, namely how to embed ethics so that AI applications behave morally. One approach to moral AI systems is to imitate human prosocial behavior and encourage some form of good behavior in systems. However, learning such normative ethics (especially from images) is challenging, mainly due to a lack of data and labeling complexity. Here, we propose a model that predicts visual commonsense immorality in a zero-shot manner. We train our model on the ETHICS dataset (pairs of text and morality annotations) via a CLIP-based image-text joint embedding. In the testing phase, the immorality of an unseen image is predicted. We evaluate our model on existing moral/immoral image datasets and show fair prediction performance consistent with human intuitions. Further, we create a visual commonsense immorality benchmark with more general and extensive immoral visual content. Code and the dataset are available at https://github.com/ku-vai/Zero-shot-Visual-Commonsense-Immorality-Prediction. Note that this paper might contain images and descriptions that are offensive in nature.
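The zero-shot idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional vectors below are hypothetical stand-ins for the outputs of frozen text and image encoders that share one embedding space (as CLIP's do), and the logistic-regression head and the names `train_head` and `predict_immorality` are assumptions made for illustration. The head is fitted on text embeddings only and then applied, unchanged, to image embeddings.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm, as joint-embedding vectors often are."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_head(embeddings, labels, lr=0.5, epochs=500):
    """Fit a logistic-regression head on frozen embeddings with plain SGD."""
    dim = len(embeddings[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in zip(embeddings, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = p - y  # gradient of the log loss with respect to the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def predict_immorality(w, b, embedding):
    """Return the head's probability that the embedded input is immoral."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, embedding)) + b)

# Hypothetical TEXT embeddings (made-up numbers, not real encoder output),
# labelled 1 = immoral, 0 = moral, standing in for text-morality pairs.
text_embeddings = [normalize(v) for v in [
    [0.9, 0.1, 0.2],
    [0.8, 0.3, 0.1],
    [0.1, 0.9, 0.2],
    [0.2, 0.8, 0.3],
]]
text_labels = [1, 1, 0, 0]

# Training sees text only.
w, b = train_head(text_embeddings, text_labels)

# Hypothetical IMAGE embeddings, assumed to live in the same space.
image_immoral = normalize([0.85, 0.20, 0.15])
image_benign = normalize([0.15, 0.85, 0.25])

print(predict_immorality(w, b, image_immoral))
print(predict_immorality(w, b, image_benign))
```

The transfer works in this toy setting only because text and image vectors are assumed to be aligned in one space; with a real model, that alignment would come from the pretrained joint embedding rather than from hand-picked numbers.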
