Paper Title

Can Giraffes Become Birds? An Evaluation of Image-to-image Translation for Data Generation

Authors

Ruiz, Daniel V., Salomon, Gabriel, Todt, Eduardo

Abstract

There is increasing interest in image-to-image translation, with applications ranging from generating maps from satellite images to creating images of entire garments from contours alone. In the present work, we investigate image-to-image translation using Generative Adversarial Networks (GANs) for generating new data, taking as a case study the morphing of giraffe images into bird images. Morphing a giraffe into a bird is a challenging task, as the two differ in scale, texture, and morphology. An unsupervised cross-domain translator named InstaGAN was trained on giraffes and birds, along with their respective masks, to learn the translation between both domains. A dataset of synthetic bird images was generated by translating the original giraffe images while preserving their spatial arrangement and background. It is important to stress that the generated birds do not exist, being only the result of a latent representation learned by InstaGAN. Subsets of two datasets common in the literature were used for training the GAN and generating the translated images: COCO and Caltech-UCSD Birds 200-2011. To evaluate the realism and quality of the generated images and masks, qualitative and quantitative analyses were performed. For the quantitative analysis, a pre-trained Mask R-CNN was used for the detection and segmentation of birds on Pascal VOC, Caltech-UCSD Birds 200-2011, and our new dataset, entitled FakeSet. The generated dataset achieved detection and segmentation results close to those of the real datasets, suggesting that the generated images are realistic enough to be detected and segmented by a state-of-the-art deep neural network.
