MachineX: Generative Adversarial Networks (GAN)

Reading Time: 6 minutes

In this blog, we are going to talk about Generative Adversarial Network (GAN) basics and how they actually work.

GAN is about creating, like drawing a portrait or composing a symphony. This is hard compared to other deep learning tasks: it is much easier, for computers or for people, to identify a Monet painting than to paint one. But it brings us closer to understanding intelligence, and that promise has produced thousands of GAN research papers in recent years. In game development, studios hire many production artists to create animation, and some of that work is routine. By automating it with GANs, we may one day focus on the creative side rather than repeating routine tasks daily.

To understand how they work, imagine a blind forger trying to create copies of paintings by great masters. To start with, he has no idea what a painting should look like.

But he happens to have a friend who has a photographic memory of every masterpiece that’s ever been painted.

The principle behind GANs was first proposed by Ian Goodfellow and his colleagues in 2014. It describes a system that pits two AI systems (neural networks) against each other to improve the quality of their results.

Introduction

Generative Adversarial Networks (GANs) are an approach to generative modeling using deep learning methods, such as convolutional neural networks.

Generative modeling is an unsupervised learning task in machine learning. It involves automatically discovering and learning the regularities or patterns in input data, so that the model can be used to generate new examples that plausibly could have been drawn from the original dataset.

GANs are a clever way of training a generative model by framing the problem as a supervised learning problem with two sub-models: a generator model that we train to generate new examples, and a discriminator model that tries to classify examples as either real (from the domain) or fake (generated). The two models are trained together in an adversarial, zero-sum game until the discriminator model is fooled about half the time, meaning the generator model is generating plausible examples.

Before diving into Generative Adversarial Networks, let's talk about generative models first.

What Are Generative Models?

We will review the idea of generative models, touching on the supervised vs. unsupervised learning paradigms and discriminative vs. generative modeling.

A typical machine learning problem involves using a model to make a prediction.

This requires a training dataset comprised of multiple examples, called samples, each with input variables (X) and output class labels (y). A model is trained by showing it examples of inputs, having it predict outputs, and correcting the model to make its outputs more like the expected outputs.

Figure: Example of supervised learning.
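To make the supervised setup concrete, here is a minimal sketch in Python, assuming scikit-learn is available; the synthetic dataset and the logistic regression model are illustrative choices, not part of the original discussion.

```python
# Minimal supervised-learning sketch: inputs X, class labels y, a model
# trained on (input, label) pairs and then used to make predictions.
# The dataset and model choice here are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)                      # learn from labeled examples
print("accuracy:", model.score(X_test, y_test))  # evaluate its predictions
```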

What Are Generative Adversarial Networks?

The GAN model architecture involves two sub-models: a generator model for generating new examples, and a discriminator model for classifying whether examples are real (from the domain) or fake (generated by the generator model).

  • Generator. Model that is used to generate new plausible examples from the problem domain.
  • Discriminator. Model that is used to classify examples as real (from the domain) or fake (generated).

The Generator Model

The generator model takes a fixed-length random vector as input and generates a sample in the domain.

The vector is drawn randomly from a Gaussian distribution and is used to seed the generative process. After training, points in this multidimensional vector space will correspond to points in the problem domain, forming a compressed representation of the data distribution.

This vector space is referred to as a latent space, or a vector space comprised of latent variables. Latent variables, or hidden variables, are those variables that are important for a domain but are not directly observable.

After training, the generator model is kept and used to generate new samples.

Figure: The generator model.
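As a rough illustration, below is a hedged Keras sketch of a generator that maps a 100-dimensional Gaussian latent vector to a 28x28 grayscale image; the layer sizes and image shape are assumptions made for the example, not a prescribed architecture.

```python
# Hypothetical generator sketch: latent vector -> 28x28 "fake" image.
import numpy as np
from tensorflow.keras import layers, models

latent_dim = 100  # length of the fixed-size random input vector

def build_generator(latent_dim):
    return models.Sequential([
        layers.Dense(7 * 7 * 128, input_dim=latent_dim),
        layers.LeakyReLU(0.2),
        layers.Reshape((7, 7, 128)),
        layers.Conv2DTranspose(128, 4, strides=2, padding="same"),  # 14x14
        layers.LeakyReLU(0.2),
        layers.Conv2DTranspose(128, 4, strides=2, padding="same"),  # 28x28
        layers.LeakyReLU(0.2),
        layers.Conv2D(1, 7, activation="sigmoid", padding="same"),  # fake image
    ])

generator = build_generator(latent_dim)
z = np.random.randn(16, latent_dim)            # 16 points from the latent space
fake_images = generator.predict(z, verbose=0)  # shape (16, 28, 28, 1)
```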

The Discriminator Model

The discriminator model takes an example from the domain as input (real or generated) and predicts a binary class label of real or fake (generated).

The real examples come from the training dataset. The generated examples are output by the generator model.

The discriminator is a normal (and well understood) classification model.

After the training process, the discriminator model is discarded as we are interested in the generator.

Sometimes, the discriminator can be repurposed, as it has learned to effectively extract features from examples in the problem domain. Some or all of its feature extraction layers can be used in transfer learning applications with the same or similar input data.

Figure: The discriminator model.
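Continuing the sketch above, a matching discriminator is just an ordinary binary image classifier; again, the layer sizes are illustrative assumptions.

```python
# Hypothetical discriminator sketch: 28x28 image -> probability it is real.
from tensorflow.keras import layers, models

def build_discriminator(input_shape=(28, 28, 1)):
    model = models.Sequential([
        layers.Conv2D(64, 3, strides=2, padding="same", input_shape=input_shape),
        layers.LeakyReLU(0.2),
        layers.Conv2D(64, 3, strides=2, padding="same"),
        layers.LeakyReLU(0.2),
        layers.Flatten(),
        layers.Dropout(0.4),
        layers.Dense(1, activation="sigmoid"),  # P(real)
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model

discriminator = build_discriminator()
```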

GANs as a Two-Player Game

The two models, the generator and discriminator, are trained together. The generator generates a batch of samples, and these, along with real examples from the domain, are provided to the discriminator and classified as real or fake.

The discriminator is then updated to get better at discriminating real and fake samples in the next round, and, importantly, the generator is updated based on how well, or not, the generated samples fooled the discriminator.

In this case, zero-sum means that when the discriminator successfully identifies real and fake samples, it is rewarded, or no change is needed to its model parameters, whereas the generator is penalized with large updates to its model parameters.

Alternately, when the generator fools the discriminator, it is rewarded, or no change is needed to the model parameters, but the discriminator is penalized and its model parameters are updated.

In the limit, the generator generates perfect replicas from the input domain every time, and the discriminator cannot tell the difference, predicting “unsure” (e.g. 50% for real and fake) in every case. This is just an idealized case; we do not need to get to this point to arrive at a useful generator model.

Figure: Example of the Generative Adversarial Network model architecture.
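Putting the two sketches together, the following is a hedged outline of that alternating training loop, reusing the `build_generator` and `build_discriminator` helpers from the sketches above; the real images are a random placeholder array here, and the step count and batch size are arbitrary assumptions.

```python
# Hedged sketch of the two-player training loop.
import numpy as np
from tensorflow.keras import models

latent_dim = 100
generator = build_generator(latent_dim)
discriminator = build_discriminator()

# Combined model: generator feeding the (frozen) discriminator, used to
# update the generator based on how well its fakes fool the discriminator.
discriminator.trainable = False
gan = models.Sequential([generator, discriminator])
gan.compile(optimizer="adam", loss="binary_crossentropy")

real_images = np.random.rand(1000, 28, 28, 1)  # placeholder for real data
batch_size = 64

for step in range(1000):
    # 1) Update the discriminator on a half-batch of real and fake samples.
    discriminator.trainable = True
    idx = np.random.randint(0, real_images.shape[0], batch_size // 2)
    z = np.random.randn(batch_size // 2, latent_dim)
    fakes = generator.predict(z, verbose=0)
    discriminator.train_on_batch(real_images[idx], np.ones((batch_size // 2, 1)))
    discriminator.train_on_batch(fakes, np.zeros((batch_size // 2, 1)))

    # 2) Update the generator: label its fakes as "real" so the combined
    #    model's loss rewards fooling the frozen discriminator.
    discriminator.trainable = False
    z = np.random.randn(batch_size, latent_dim)
    gan.train_on_batch(z, np.ones((batch_size, 1)))
```

Note how the generator never sees real images directly: its only learning signal comes through the discriminator's judgment of the fakes.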

Why Generative Adversarial Networks?

One of the many major advancements in the use of deep learning methods in domains such as computer vision is a technique called data augmentation.

Data augmentation results in better performing models, both increasing model skill and providing a regularizing effect, reducing generalization error. It works by creating new, artificial but plausible examples from the input problem domain on which the model is trained.

The techniques are primitive in the case of image data, involving crops, flips, zooms, and other simple transforms of existing images in the training dataset.
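For comparison, here is a minimal sketch of such classic augmentation using Keras' ImageDataGenerator; the particular shift, flip, and zoom settings, and the placeholder image batch, are illustrative assumptions.

```python
# Classic image augmentation sketch: shifts, flips, and zooms applied to
# existing training images to create new, plausible variants.
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

augmenter = ImageDataGenerator(
    width_shift_range=0.1,    # horizontal shifts
    height_shift_range=0.1,   # vertical shifts
    horizontal_flip=True,     # mirror images
    zoom_range=0.2,           # random zooms
)

images = np.random.rand(32, 28, 28, 1)   # placeholder image batch
batches = augmenter.flow(images, batch_size=32)
augmented = next(batches)                # one batch of augmented images
```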

Successful generative modeling provides an alternative and potentially more domain-specific approach for data augmentation. In fact, data augmentation is a simplified version of generative modeling, although it is rarely described this way.

GANs can successfully model high-dimensional data, handle missing data, and provide multi-modal outputs, that is, multiple plausible answers.

Perhaps the most compelling application of GANs is in conditional GANs for tasks that require the generation of new examples. Here, Goodfellow indicates three main examples:

  • Image Super-Resolution. The ability to generate high-resolution versions of input images.
  • Creating Art. The ability to create new and artistic images, sketches, paintings, and more.
  • Image-to-Image Translation. The ability to translate photographs across domains, such as day to night, summer to winter, and more.

GAN Applications

  • Create Anime characters
  • CycleGAN
  • Pose Guided Person Image Generation

We will talk about these applications in more depth in the next part, so stay tuned.

Happy learning!

Written by 

Shubham Goyal is a Data Scientist at Knoldus Inc. He is also an artificial intelligence researcher, interested in research on problems across different domains, and a regular contributor to the community through blogs and webinars on machine learning and artificial intelligence. He has also written a few research papers on machine learning, and he is a conference speaker and an official author at Towards Data Science.
