Related work
Unsupervised representation learning: clustering (e.g., K-means), auto-encoders
Generating natural images: parametric and non-parametric approaches
- Non-parametric: texture synthesis, super-resolution, in-painting
- Parametric: VAE, a Laplacian pyramid extension to GANs (LAPGAN), a recurrent network approach, a deconvolution network approach
Visualizing CNNs: visualizing what different layers have learned (Zeiler & Fergus, 2014)
Architecture guidelines for stable Deep Convolutional GANs
- Replace any pooling layers with strided convolutions (discriminator) and fractional-strided convolutions (generator)
- Use BN in both the generator and the discriminator
- Remove fully connected hidden layers for deeper architectures
- Use ReLU activation in generator for all layers except for the output, which uses Tanh
- Use LeakyReLU activation in the discriminator for all layers
- Use the Adam optimizer rather than SGD with momentum (a minimal code sketch of these guidelines follows below)
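The guidelines above can be collected into a concrete 64×64 DCGAN. The PyTorch sketch below is not the authors' original implementation; the latent size nz, the feature widths ngf/ndf, and the 64×64 output resolution are illustrative assumptions.

```python
# Minimal 64x64 DCGAN sketch following the architecture guidelines above.
import torch
import torch.nn as nn

nz, ngf, ndf, nc = 100, 64, 64, 3  # latent dim, feature widths, image channels (assumed)

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            # fractional-strided (transposed) convolutions instead of pooling/upsampling
            nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),       # 1x1  -> 4x4
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),                                              # ReLU in the generator
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),  # 4x4  -> 8x8
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),  # 8x8  -> 16x16
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            nn.ConvTranspose2d(ngf * 2, ngf, 4, 2, 1, bias=False),      # 16x16 -> 32x32
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),           # 32x32 -> 64x64
            nn.Tanh(),                                                  # Tanh output, no BN here
        )

    def forward(self, z):              # z: (N, nz, 1, 1)
        return self.net(z)

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            # strided convolutions instead of pooling; no BN on the input layer
            nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),                    # 64x64 -> 32x32
            nn.LeakyReLU(0.2, inplace=True),                            # LeakyReLU everywhere
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),               # 32x32 -> 16x16
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),           # 16x16 -> 8x8
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),           # 8x8  -> 4x4
            nn.BatchNorm2d(ndf * 8),
            nn.LeakyReLU(0.2, inplace=True),
            # no fully connected hidden layers: a single conv reduces to one logit
            nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),                 # 4x4  -> 1x1
            nn.Sigmoid(),
        )

    def forward(self, x):
        return self.net(x).view(-1)

G, D = Generator(), Discriminator()
# Adam rather than SGD with momentum (lr/beta1 as suggested in the paper)
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4, betas=(0.5, 0.999))
```

Following the paper's note, batch normalization is skipped on the generator output layer and on the discriminator input layer to avoid sample oscillation.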
Network architecture
figure 1: generator
figure 2: discriminator
Feature extractor: using the trained discriminator as a feature extractor for classification
figure 3: LSUN
figure 4: SVHN (Street View House Numbers dataset)
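The paper's classification recipe for the trained discriminator is to take the convolutional activations of every layer, max-pool each to a 4×4 spatial grid, flatten and concatenate them, and train a regularized linear L2-SVM on top. A rough sketch, assuming the Discriminator class from the earlier code and placeholder arrays train_images/train_labels/test_images/test_labels:

```python
# Sketch of the discriminator-as-feature-extractor protocol described in the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F
from sklearn.svm import LinearSVC

@torch.no_grad()
def dcgan_features(D, images):
    """Concatenate max-pooled activations of every LeakyReLU layer in D."""
    feats, hooks = [], []
    def grab(module, inputs, output):
        # max-pool each feature map down to a 4x4 spatial grid, then flatten
        feats.append(F.adaptive_max_pool2d(output, 4).flatten(1))
    for m in D.modules():
        if isinstance(m, nn.LeakyReLU):
            hooks.append(m.register_forward_hook(grab))
    D.eval()
    D(images)                       # forward pass fills `feats` via the hooks
    for h in hooks:
        h.remove()
    return torch.cat(feats, dim=1)  # (N, concatenated feature dim)

# train a regularized linear SVM on the frozen discriminator features
X_train = dcgan_features(D, train_images).numpy()
X_test = dcgan_features(D, test_images).numpy()
clf = LinearSVC(C=1.0).fit(X_train, train_labels)
print("accuracy:", clf.score(X_test, test_labels))
```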
Walking in the latent space
figure 5: interpolation in the latent space
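A minimal sketch of the latent-space walk in figure 5: pick two random codes, interpolate linearly between them, and decode each intermediate point with a trained generator G (assumed to be the 64×64 sketch above).

```python
# Walk along a straight line between two latent codes and decode each point.
import torch

steps, nz = 10, 100
z0 = torch.randn(1, nz, 1, 1)
z1 = torch.randn(1, nz, 1, 1)
with torch.no_grad():
    for t in torch.linspace(0.0, 1.0, steps):
        z = (1 - t) * z0 + t * z1   # linear interpolation in z space
        img = G(z)                  # (1, 3, 64, 64) image in [-1, 1]
        # save or display `img`; smooth semantic transitions suggest the
        # generator has not simply memorized the training set
```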
Visualizing the discriminator features
figure 6: visualizing the discriminator features
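The paper visualizes discriminator features with guided backpropagation; the sketch below uses plain gradient saliency instead, which is simpler but illustrates the same question of which input pixels most excite a unit in a late convolutional layer. D and real_images are assumed to come from the earlier sketch and a data loader.

```python
# Gradient-saliency sketch: gradient of one feature-map activation w.r.t. the input.
import torch
import torch.nn as nn

activations = {}
def keep(name):
    def hook(module, inputs, output):
        activations[name] = output
    return hook

D.eval()
# hook the last "feature" conv layer of the sketched discriminator (index assumed)
last_conv = [m for m in D.modules() if isinstance(m, nn.Conv2d)][-2]
handle = last_conv.register_forward_hook(keep("last_conv"))

x = real_images.clone().requires_grad_(True)     # (N, 3, 64, 64) batch of real images
D(x)                                             # forward pass records the activations
unit = 0                                         # which feature map to inspect
score = activations["last_conv"][:, unit].sum()  # total activation of that unit
score.backward()                                 # gradients w.r.t. the input pixels
saliency = x.grad.abs().max(dim=1).values        # (N, 64, 64) per-pixel importance map
handle.remove()
```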
Forgetting to draw certain objects
figure 7: forgetting to generate windows
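In the paper, the "window" filters are identified with a logistic regression fit on labeled bounding boxes; the sketch below shows only the ablation step, zeroing out a hypothetical set of feature maps in one generator layer via a forward hook (G is the generator from the earlier sketch, window_filters is a placeholder list of indices).

```python
# Ablate chosen generator feature maps and compare samples with/without them.
import torch
import torch.nn as nn

window_filters = [3, 17, 42]            # hypothetical indices found by a separate analysis

def drop_filters(module, inputs, output):
    output = output.clone()
    output[:, window_filters] = 0.0     # zero out the selected feature maps
    return output                       # returned value replaces the layer's output

# hook the second-to-last transposed-conv layer of the sketched generator (index assumed)
target = [m for m in G.modules() if isinstance(m, nn.ConvTranspose2d)][-2]
handle = target.register_forward_hook(drop_filters)

with torch.no_grad():
    z = torch.randn(8, 100, 1, 1)
    ablated = G(z)                      # samples with the filters removed
handle.remove()
with torch.no_grad():
    baseline = G(z)                     # same codes, filters intact
```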
Vector arithmetic on face samples
figure 8: arithmetic of z vectors
figure 9: interpolation in the direction of pose
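A sketch of the z-space arithmetic behind figures 8 and 9, assuming placeholder tensors of codes (z_smiling_woman, z_neutral_woman, z_neutral_man, z_facing_left, z_facing_right) whose samples were grouped by concept by hand, and the generator G from the earlier sketch.

```python
# Average the codes of a few exemplars per concept, combine them, and decode.
import torch

mean_smiling_woman = z_smiling_woman.mean(dim=0, keepdim=True)  # each: (k, 100, 1, 1)
mean_neutral_woman = z_neutral_woman.mean(dim=0, keepdim=True)
mean_neutral_man = z_neutral_man.mean(dim=0, keepdim=True)

# "smiling woman" - "neutral woman" + "neutral man" ~= "smiling man"
z_result = mean_smiling_woman - mean_neutral_woman + mean_neutral_man
with torch.no_grad():
    smiling_man = G(z_result)           # decode the arithmetic result

# figure 9's pose walk: add a "turn" direction (difference of averaged
# right- and left-facing codes) to a face code in small increments
turn = z_facing_right.mean(0, keepdim=True) - z_facing_left.mean(0, keepdim=True)
with torch.no_grad():
    frames = [G(mean_neutral_man + a * turn) for a in torch.linspace(-1.0, 1.0, 7)]
```

In the paper, averaging the codes of three exemplars per concept was needed for stable results; arithmetic on single samples was unstable.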
Conclusion
- propose and evaluate a set of constraints on the architectural topology of Convolutional GANs that make them stable to train in most settings
- use the trained discriminators for image classification tasks, showing competitive performance with other unsupervised algorithms
- visualize the filters learnt by GANs and empirically show that specific filters have learned to draw specific objects
- show that the generators have interesting vector arithmetic properties allowing for easy manipulation of many semantic qualities of generated samples