
Image-to-Image Translation with Conditional Adversarial Networks

Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, Alexei A. Efros. Berkeley AI Research (BAIR) Laboratory, University of California, Berkeley (2016)

Paper Information

  • arXiv ID: 1611.07004
  • Venue: Computer Vision and Pattern Recognition (CVPR)
  • Domain: computer vision, machine learning
  • SOTA Claim: Yes
  • Code: not available
  • Reproducibility: 8/10

Abstract

Abstract not available.

Summary

This paper investigates image-to-image translation with conditional adversarial networks (cGANs). The authors observe that convolutional networks trained to minimize Euclidean (L2) loss tend to produce blurry images, and they propose letting a GAN learn the loss function instead, so that outputs are penalized for looking unrealistic. Their contributions are to demonstrate that cGANs produce reasonable results on a broad range of translation tasks and to present a simple, general framework for them, together with an analysis of the key architectural choices. The method is validated on datasets including Cityscapes and ImageNet, generating realistic images from inputs such as semantic label maps and sketches. The authors also argue that generated images should be evaluated perceptually, report qualitative and quantitative results accordingly, and conclude that cGANs are a versatile solution for many image-to-image translation tasks.
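
The learned loss described above is made concrete in the paper's objective: the generator G is trained against a conditional discriminator D, with an L1 reconstruction term (weighted by λ, set to 100 in the paper) added to keep outputs close to the target:

```latex
% Conditional GAN loss: D sees (input, output) pairs.
\mathcal{L}_{cGAN}(G,D) = \mathbb{E}_{x,y}[\log D(x,y)]
                        + \mathbb{E}_{x,z}[\log(1 - D(x, G(x,z)))]
% L1 term keeps generated images close to ground truth.
\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}[\lVert y - G(x,z) \rVert_1]
% Full objective.
G^{*} = \arg\min_G \max_D \; \mathcal{L}_{cGAN}(G,D) + \lambda\,\mathcal{L}_{L1}(G)
```

Here x is the input image, y the target image, and z a noise source, realized in practice as dropout in the generator.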

Methods

This paper employs the following methods:

  • Convolutional Neural Networks
  • Generative Adversarial Networks
  • Conditional Generative Adversarial Networks (training-step sketch below)
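
As a concrete illustration of the conditional-GAN method, here is a minimal PyTorch sketch of one training step under the objective above. It assumes hypothetical generator `G` and discriminator `D` modules and their optimizers; it is not the authors' released code.

```python
import torch
import torch.nn.functional as F

def cgan_step(G, D, opt_G, opt_D, x, y, lam=100.0):
    """One pix2pix-style update. x: input batch, y: target batch."""
    # Discriminator: real pairs (x, y) vs. fake pairs (x, G(x)).
    fake = G(x).detach()  # detach so D's loss does not update G
    d_real, d_fake = D(x, y), D(x, fake)
    loss_D = F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real)) \
           + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake))
    opt_D.zero_grad(); loss_D.backward(); opt_D.step()

    # Generator: fool D while staying close to the target under L1.
    fake = G(x)
    d_fake = D(x, fake)
    loss_G = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake)) \
           + lam * F.l1_loss(fake, y)
    opt_G.zero_grad(); loss_G.backward(); opt_G.step()
    return loss_D.item(), loss_G.item()
```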

Models Used

  • Conditional GANs
  • U-Net
  • PatchGAN (architecture sketch below)
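
The paper describes the PatchGAN as a discriminator that classifies each N×N patch of an image as real or fake (70×70 in the default configuration). A minimal PyTorch sketch of that architecture follows; the layer pattern (C64-C128-C256-C512 plus a one-channel output) matches the paper, but the original release was written in Torch/Lua, so treat this as an illustrative reimplementation.

```python
import torch
import torch.nn as nn

def block(c_in, c_out, stride, norm=True):
    """4x4 conv -> (BatchNorm) -> LeakyReLU, as in the paper's Ck notation."""
    layers = [nn.Conv2d(c_in, c_out, kernel_size=4, stride=stride, padding=1)]
    if norm:
        layers.append(nn.BatchNorm2d(c_out))
    layers.append(nn.LeakyReLU(0.2, inplace=True))
    return layers

class PatchGAN(nn.Module):
    def __init__(self, in_channels=6):  # input + output image, 3 + 3 channels
        super().__init__()
        self.net = nn.Sequential(
            *block(in_channels, 64, stride=2, norm=False),          # C64
            *block(64, 128, stride=2),                              # C128
            *block(128, 256, stride=2),                             # C256
            *block(256, 512, stride=1),                             # C512
            nn.Conv2d(512, 1, kernel_size=4, stride=1, padding=1),  # logit map
        )

    def forward(self, x, y):
        # Each spatial logit judges one ~70x70 receptive-field patch;
        # the loss averages over all patch responses.
        return self.net(torch.cat([x, y], dim=1))
```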

Datasets

The following datasets were used in this research:

  • Cityscapes
  • ImageNet
  • CMP Facades
  • Google Maps
  • UT Zappos50K

Evaluation Metrics

  • FCN-score (sketched after this list)
  • AMT perceptual studies
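
The FCN-score evaluates synthesized photos by running a pretrained semantic segmentation network (FCN-8s in the paper) on them and scoring its predictions against the label maps the images were generated from. Below is a hedged sketch of the per-pixel accuracy variant, with `pretrained_fcn` as a stand-in for the actual model, not the paper's exact pipeline.

```python
import torch

def fcn_score(pretrained_fcn, generated_images, true_label_maps):
    """Per-pixel accuracy of a segmentation net on generated images.

    A readable sketch of the idea only; the paper also reports
    per-class accuracy and class IoU.
    """
    with torch.no_grad():
        logits = pretrained_fcn(generated_images)  # (N, C, H, W) class scores
        pred = logits.argmax(dim=1)                # (N, H, W) predicted labels
    return (pred == true_label_maps).float().mean().item()
```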

Results

  • Conditional GANs produce reasonable results on a wide variety of image-to-image translation problems.
  • U-Net architecture with skip connections improves image generation quality compared to standard encoder-decoder models.
  • The PatchGAN discriminator, which penalizes structure only at the scale of local image patches, produces sharp local detail while remaining small and fast.

Limitations

The authors identified the following limitations:

  • The presented cGANs can produce visual artifacts in some outputs.
  • For some vision tasks, such as mapping photos to semantic labels, conditional GANs do not outperform simpler losses such as L1 regression.

Technical Requirements

  • Number of GPUs: 1
  • GPU Type: Pascal Titan X

Keywords

image-to-image translation, conditional GANs, Pix2Pix, U-Net, PatchGAN
