By Athreya Daniel
As the field of artificial intelligence continues to grow at a rapid rate, machines are able to perform increasingly complicated tasks. One rapidly expanding area of artificial intelligence involves creating synthetic data based on an existing dataset. To do this, a special type of neural network was invented -- the GAN.
The GAN, or generative adversarial network, was introduced in 2014 by a team of researchers led by Ian Goodfellow. To generate synthetic data, they proposed a “contest” between two neural networks: one network would create synthetic data, and the other would try to distinguish the synthetic data from the real training set. This idea formed the GAN and its two main components -- the generator and the discriminator.
The generator is the part of the GAN that actually creates the synthetic data. Typically a deconvolutional neural network, the generator starts from random noise (usually sampled from a Gaussian distribution). Where a normal convolutional neural network downsamples an image into features, the generator does the reverse, upsampling the noise into an image. A simple generator could be a sequential model with Dense (feedforward) and normalizing layers and a ReLU activation function.
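To make the upsampling idea concrete, here is a toy sketch in plain Python. The noise sampling and the nearest-neighbor upsampling are real operations, but a trained generator would use learned transposed-convolution layers (for example, Keras’ Conv2DTranspose) rather than simple repetition; the sizes here are arbitrary.

```python
import random

def generate_noise(dim, seed=None):
    """Sample a noise vector from a standard Gaussian distribution."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

def upsample(vector, factor):
    """Nearest-neighbor upsampling: repeat each value `factor` times.
    A real generator learns transposed-convolution filters instead of
    repeating values, but the input-grows-into-an-image idea is the same."""
    return [v for v in vector for _ in range(factor)]

noise = generate_noise(4, seed=42)   # 4 random values
image_row = upsample(noise, 4)       # upsampled to 16 "pixels"
```

Stacking several upsampling steps (with learned weights and normalization in between) is what turns a short noise vector into a full-sized image.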
On the other hand, the discriminator’s main task is to distinguish the training dataset from the newly created synthetic images. The discriminator relies on a convolutional neural network to do this, trained to scan images for the distinct features that reveal whether an image is real or synthetic (if you want to learn more about convolutional neural nets, refer to Yash’s article).
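At its core, the discriminator is just a binary classifier that outputs a probability of “real.” The sketch below shows that final step in plain Python; the weighted sum is a stand-in for the convolutional feature extraction a real discriminator would learn, and the specific weights are made up for illustration.

```python
import math

def sigmoid(x):
    """Squash any score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def discriminate(image, weights, bias):
    """Score an image: values near 1 mean "real", near 0 mean "fake".
    A real discriminator extracts features with convolutional layers;
    here a plain weighted sum stands in for that feature extraction."""
    score = sum(w * p for w, p in zip(weights, image)) + bias
    return sigmoid(score)

# Hypothetical 3-"pixel" image and weights, purely for illustration
p = discriminate([0.2, 0.8, 0.5], [1.0, -0.5, 0.3], 0.0)
```

During training, the discriminator’s weights are adjusted so that this probability rises for training-set images and falls for generated ones.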
Now that we understand how the individual parts of the network function, it is important to discuss how they work together. The network can be summarized as two main feedback loops: one between the discriminator and the training images, and another between the generator and the discriminator. During training, both networks improve through this feedback. As the generator receives output from the discriminator, it adjusts its weights through backpropagation to produce a more convincing distribution of data; as it sees more and more data, the discriminator also becomes more accurate at filtering the images, in turn demanding more realistic data from the generator. While one network trains, the other’s weights are held constant, so each is able to benefit from the feedback loop.
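The alternating feedback loop can be sketched end to end with the smallest possible "networks": a one-parameter-pair generator and a logistic discriminator, both invented for this toy example. The "training set" is just the number 3.0, and the gradient updates follow the standard GAN losses; note how each half of the loop updates one network while the other’s weights stay fixed.

```python
import math, random

rng = random.Random(0)

def sigmoid(s):
    # Clamp the score to avoid math.exp overflow in this toy loop
    s = max(-60.0, min(60.0, s))
    return 1.0 / (1.0 + math.exp(-s))

# Toy stand-ins for the two networks (hypothetical, not real architectures):
# generator g(z) = a*z + b, discriminator d(x) = sigmoid(w*x + c)
a, b = 1.0, 0.0          # generator weights
w, c = 0.1, 0.0          # discriminator weights
lr = 0.05

for step in range(200):
    x_real = 3.0                      # "training set": the value 3.0
    z = rng.gauss(0.0, 1.0)           # random noise input
    x_fake = a * z + b                # generator output

    # --- Train the discriminator (generator weights held constant) ---
    d_real, d_fake = sigmoid(w * x_real + c), sigmoid(w * x_fake + c)
    # Gradient step on -log d(real) - log(1 - d(fake))
    w += lr * ((1 - d_real) * x_real - d_fake * x_fake)
    c += lr * ((1 - d_real) - d_fake)

    # --- Train the generator (discriminator weights held constant) ---
    d_fake = sigmoid(w * (a * z + b) + c)
    # Gradient step on -log d(fake): nudge the generator to fool d
    a += lr * (1 - d_fake) * w * z
    b += lr * (1 - d_fake) * w
```

Real GANs replace the two one-line models with deep networks and the hand-written gradients with backpropagation, but the alternating freeze-and-train structure is the same.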
Although GANs are a popular tool for generative tasks, they do have certain flaws. GANs can take a long time to train, and it can be difficult to maintain the balance between the generator and the discriminator so that one does not overpower the other. As a result, other generative networks have been developed for similar tasks. One such network is the VAE, or variational autoencoder. These networks do not start from random noise; instead, they take in input vectors and compress them using an encoding layer. The compressed representation is then decoded back into a reconstructed version of the input vector.
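The VAE’s compress-then-reconstruct flow can be sketched as below. The encoder and decoder here are hand-written toys (averaging pairs, then repeating values) rather than learned networks, but the three stages -- encode to a mean and variance, sample a latent code, decode -- mirror a real variational autoencoder.

```python
import math, random

rng = random.Random(0)

def encode(x):
    """Toy "encoder": compress a 4-value input to a 2-value latent code
    by averaging adjacent pairs. A real VAE learns this mapping, and its
    encoder outputs a mean and a variance for each latent dimension."""
    mean = [(x[0] + x[1]) / 2, (x[2] + x[3]) / 2]
    log_var = [0.0, 0.0]             # fixed here; learned in a real VAE
    return mean, log_var

def sample(mean, log_var):
    """Reparameterization trick: z = mean + sigma * epsilon."""
    return [m + math.exp(lv / 2) * rng.gauss(0, 1)
            for m, lv in zip(mean, log_var)]

def decode(z):
    """Toy "decoder": expand the 2-value latent code back to 4 values."""
    return [z[0], z[0], z[1], z[1]]

x = [1.0, 1.0, 4.0, 4.0]
mean, log_var = encode(x)
reconstruction = decode(sample(mean, log_var))
```

Training a real VAE adjusts the encoder and decoder so the reconstruction matches the input while the latent codes stay close to a standard Gaussian; sampling fresh latent codes then yields new synthetic data.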
There are numerous applications of GANs in the real world, ranging from image generation to protein modeling. GANs can be used to generate new layouts for bedrooms, and recently Nvidia’s GameGAN was able to recreate the game Pac-Man just by watching it being played. In biology, research suggests that GANs have the potential to create new inhibitors for proteins by learning from existing inhibitor structures. Additionally, because GANs utilize convolutional neural networks, they can be used for sound generation, as long as the sound has been processed properly. Playing around with these networks in TensorFlow or another machine learning platform is a great way to create something interesting and deepen your understanding of GANs at the same time.
What are the two parts of a GAN and what do they do?
The two parts of a GAN are the generator and the discriminator. The generator upsamples random noise into images, while the discriminator attempts to distinguish the images created by the generator from those in the training set.
What type of network is the generator, and what layers does it include?
The generator is typically a deconvolutional neural network and generally includes Dense, normalization, and ReLU activation layers.