Introduction
- Present a general learning framework that combines a variational auto-encoder (VAE) with a generative adversarial network (GAN).
- Propose a new objective for the generator. Instead of using the same cross-entropy loss as the discriminator network, the new objective requires the generator to generate data that minimize the l2 distance between the average feature of the synthesized data and that of the real data.
The results of this work are presented in the Experiment section below.
Related Work
Variational Auto-encoder (VAE)
Here E represents the encoder and G represents the generator (decoder). z is a latent vector inferred from the input x, and x' is the image generated from z.
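A compact sketch of the standard VAE objective implied by this setup (the squared-error reconstruction term is one common instantiation; the paper's notation may differ):

$$
\mathcal{L}_{VAE} = \mathbb{E}_{z \sim q(z \mid x)}\big[\lVert x - G(z) \rVert_2^2\big] + KL\big(q(z \mid x)\,\Vert\,p(z)\big)
$$

where q(z|x) is the Gaussian posterior produced by E and p(z) is a standard normal prior. The first term rewards faithful reconstruction, and the KL term keeps the encoder's posterior close to the prior.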
Generative Adversarial Network (GAN)
A GAN adds a discriminative network, represented by D. The discriminator D tries to distinguish real training data from synthesized data, while the generator G tries to fool the discriminator.
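As a reminder, the standard (unconditional) GAN objective is the minimax game:

$$
\min_G \max_D \; \mathbb{E}_{x \sim p_{data}}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
$$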
CVAE and CGAN
VAEs and GANs can also be trained to conduct conditional generation, e.g., CVAE and CGAN. By introducing additional conditionality, they can handle probabilistic one-to-many mapping problems.
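In the standard conditional formulations (not specific to this paper), the condition c is simply fed to the networks, e.g. for CGAN:

$$
\min_G \max_D \; \mathbb{E}_{x, c}\big[\log D(x, c)\big] + \mathbb{E}_{z, c}\big[\log\big(1 - D(G(z, c), c)\big)\big]
$$

and for CVAE the encoder and decoder become q(z | x, c) and G(z, c).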
Performance
In the comparison experiment, the results generated by CVAE are relatively blurry but the overall structure is maintained, while the results generated by CGAN lose the structure of the faces.
How can these two models be improved to generate better images? One idea is to combine them, so that each model's strengths offset the other's weaknesses.
CVAE-GAN
The formulation of CVAE-GAN
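A sketch of the data flow through the four networks, reconstructed from the definitions that follow (the exact formulation in the paper may differ):

$$
z = E(x, c), \qquad x' = G(z, c), \qquad y = D(x)\ \text{or}\ D(x') \in [0, 1], \qquad C(x) \approx p(c \mid x)
$$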
Here, x and x′ are the input and the generated image; E, G, C, and D are the encoder, generative, classification, and discriminative networks, respectively; z is the latent vector; y is a binary output representing real vs. synthesized; and c is the condition, such as an attribute or class label.
The naive combination of VAE and GAN is insufficient. Recent work shows that if the original KL-divergence loss is adopted, the training of the GAN suffers from a vanishing-gradient problem in the network G.
So this work keeps the training process of the networks E, D, and C the same as in the original VAE and GAN, and proposes a new mean feature matching objective for the generative network G to improve the stability of the original GAN.
Loss function
E, D, and C are trained the same as in the original VAE and GAN, so their loss functions follow the standard VAE and GAN forms.
Mean feature matching based GAN
To improve the stability of the original GAN, this work proposes a new mean feature matching objective for the generative network G.
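A sketch of this objective, written in terms of an intermediate feature layer f_D of the discriminator (the specific layer is an assumption here):

$$
\mathcal{L}_{G_D} = \tfrac{1}{2}\,\Big\lVert\, \mathbb{E}_{x \sim p_{data}}\big[f_D(x)\big] - \mathbb{E}_{z \sim p_z}\big[f_D(G(z))\big] \Big\rVert_2^2
$$

In practice, both expectations are estimated with minibatch averages.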
The objective requires the center of the features of the synthesized samples to match the center of the features of the real samples.
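A minimal PyTorch-style sketch of this loss; the feature extractor f_D (the discriminator truncated at an intermediate layer) and the helper name are assumptions, not the authors' code:

```python
import torch
import torch.nn as nn

def mean_feature_matching_loss(f_D: nn.Module,
                               real: torch.Tensor,
                               fake: torch.Tensor) -> torch.Tensor:
    """L2 distance between the average discriminator features of real and
    synthesized minibatches (hypothetical helper used to train G)."""
    feat_real = f_D(real).mean(dim=0)   # center of real-sample features
    feat_fake = f_D(fake).mean(dim=0)   # center of synthesized-sample features
    # Only the generator should receive gradients from this loss,
    # so the real-feature center is detached.
    return 0.5 * (feat_real.detach() - feat_fake).pow(2).sum()
```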
Mean Feature Matching for Conditional Image Generation
For conditional image generation, this work proposes using the mean feature matching objective for the generative network G as well.
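A plausible form of this conditional variant matches per-class feature means on an intermediate layer f_C of the classification network (the choice of layer and the per-class averaging are assumptions based on the description above):

$$
\mathcal{L}_{G_C} = \tfrac{1}{2} \sum_{c} \Big\lVert\, \mathbb{E}_{x \sim p_{data}}\big[f_C(x) \mid c\big] - \mathbb{E}_{z \sim p_z}\big[f_C(G(z, c))\big] \Big\rVert_2^2
$$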
Pairwise Feature Matching
The VAE part of this work can force the GAN to generate diverse samples, since the encoder network E provides a mapping from a real image x to the latent space z. Therefore, the model explicitly sets up the relationship between the latent space and the real image space. The loss of G in this part is as follows.
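A sketch consistent with this description, matching each reconstruction x′ = G(E(x, c), c) to its input x in pixel space and in the feature spaces of D and C (the exact combination is an assumption):

$$
\mathcal{L}_{G} = \tfrac{1}{2}\Big( \lVert x - x' \rVert_2^2 + \lVert f_D(x) - f_D(x') \rVert_2^2 + \lVert f_C(x) - f_C(x') \rVert_2^2 \Big)
$$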
Objective of CVAE-GAN
The goal of this approach is to minimize an overall loss function that is a weighted combination of the terms defined above. In the experiments, λ1 = 3, λ2 = 1, λ3 = 1e-3, and λ4 = 1e-3.
Algorithm
The whole training procedure, which alternately updates the networks with their respective losses, is clear and easy to follow.
Experiment
Visualization comparison with other models
The results generated by CVAE-GAN are difficult to distinguish from the real samples.
Quantitative comparison
The higher the realism score, the better.
Attributes morphing
Given two latent vectors z1 and z2, new images can be generated from an interpolated vector z between them (e.g., z = α·z1 + (1 − α)·z2), so the attributes morph gradually from one image to the other.
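A tiny sketch of this interpolation; the linear scheme and the helper are assumptions consistent with the description above:

```python
import torch

def morph(G, z1: torch.Tensor, z2: torch.Tensor, c, steps: int = 8):
    """Generate images whose attributes morph from z1 to z2 by feeding the
    conditional generator G(z, c) linearly interpolated latent vectors."""
    alphas = torch.linspace(0.0, 1.0, steps)
    return [G((1 - a) * z1 + a * z2, c) for a in alphas]
```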
Image inpainting
CVAE-GAN for data augmentation
Two data augmentation strategies: generating more images for existing identities in the training datasets, and generating new identities by mixing different identities.
Conclusion and Discussion
- Present a general learning framework that combines a variational auto-encoder with a generative adversarial network.
- Propose a mean discrepancy objective for the generative network to make the training of the GAN more stable.
- Currently, the model can only generate images of known categories.