{% extends "layout.html" %} {% block content %} Study Guide: Generative Models

🌌 Study Guide: Generative Models

🔹 Core Concepts

Story-style intuition: The Artist vs. The Art Critic

Imagine two types of AI that both study thousands of labeled cat and dog photos.
• The Discriminative Model is like an art critic. Its only job is to learn the difference between a cat photo and a dog photo. If you show it a new picture, it can tell you, "That's a cat," but it can't create a cat picture of its own. It learns a decision boundary.
• The Generative Model is like an artist. It studies the cat photos so deeply that it understands the "essence" of what makes a cat a cat: the patterns, the textures, the shapes. It learns the underlying distribution of "cat-ness." Because it has this deep understanding, it can then be asked to create a brand new, never-before-seen picture of a cat from scratch.

Generative Models are a class of statistical models that learn the underlying probability distribution of a dataset. Their primary goal is to understand the data so well that they can "generate" new data samples that are similar to the ones they were trained on.
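To make "learn the distribution, then sample from it" concrete, here is a minimal sketch using only the Python standard library. It assumes the simplest possible generative model, a single 1-D Gaussian, and a made-up dataset of cat weights; the function names `fit_gaussian` and `generate` are illustrative, not part of any library.

```python
import random
import statistics

def fit_gaussian(samples):
    """'Train' the model: estimate the mean and standard deviation of the data."""
    return statistics.fmean(samples), statistics.stdev(samples)

def generate(mean, std, n, rng):
    """Draw n new samples from the learned distribution."""
    return [rng.gauss(mean, std) for _ in range(n)]

rng = random.Random(0)

# Hypothetical training data: cat weights in kg, drawn from N(4.5, 0.8).
training_data = [rng.gauss(4.5, 0.8) for _ in range(5000)]

# Learning the distribution here means estimating just two parameters.
mean, std = fit_gaussian(training_data)

# Generation is sampling from the fitted distribution.
new_samples = generate(mean, std, 5, rng)

print(f"learned mean={mean:.2f}, std={std:.2f}")
print("generated:", [round(x, 2) for x in new_samples])
```

Real generative models replace the two-parameter Gaussian with a neural network that has millions of parameters, but the two phases (fit, then sample) stay the same.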

🔹 Types of Generative Models

Generative models come in several powerful flavors, each with a different approach to learning and creating.

🔹 Mathematical Foundations

🔹 Workflow of Generative Models

  1. Collect Data: Gather a large, high-quality dataset of the thing you want to generate (e.g., thousands of celebrity faces).
  2. Choose a Model: Select the right type of generative model for your task (e.g., a GAN or Diffusion Model for realistic images).
  3. Train the Model: This is the most computationally expensive step, where the model learns the underlying patterns and distribution of the training data.
  4. Generate New Samples: After training, you can use the model to generate new, synthetic data by sampling from its learned distribution.
  5. Evaluate Quality: Assess the quality of the generated samples using both quantitative metrics (such as the Fréchet Inception Distance, FID) and human evaluation.
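The five steps above can be walked through end to end on a toy problem. The sketch below is an assumption-laden stand-in: the "dataset" is synthetic 1-D data, the "model" is a single Gaussian, and the evaluation uses the one-dimensional form of the squared Fréchet distance between two Gaussians. FID applies the multivariate form of the same distance to features from an Inception network, which is not reproduced here. The helper name `frechet_distance_1d` is hypothetical.

```python
import random
import statistics

rng = random.Random(42)

# 1. Collect data: a stand-in dataset (hypothetical), drawn from N(10, 2).
real = [rng.gauss(10.0, 2.0) for _ in range(4000)]
train, held_out = real[:3000], real[3000:]

# 2. Choose a model: a single Gaussian with two parameters (mean, std).
# 3. Train the model: fit those parameters to the training split.
mu, sigma = statistics.fmean(train), statistics.stdev(train)

# 4. Generate new samples by sampling from the learned distribution.
fake = [rng.gauss(mu, sigma) for _ in range(1000)]

# 5. Evaluate quality with the squared Frechet distance between two
#    1-D Gaussians: d^2 = (mu_a - mu_b)^2 + (sigma_a - sigma_b)^2.
def frechet_distance_1d(a, b):
    mu_a, mu_b = statistics.fmean(a), statistics.fmean(b)
    sd_a, sd_b = statistics.stdev(a), statistics.stdev(b)
    return (mu_a - mu_b) ** 2 + (sd_a - sd_b) ** 2

score_good = frechet_distance_1d(held_out, fake)

# A deliberately bad "model" for comparison: wrong mean and wrong spread.
bad = [rng.gauss(13.0, 0.5) for _ in range(1000)]
score_bad = frechet_distance_1d(held_out, bad)

print(f"fitted model: {score_good:.4f}, bad model: {score_bad:.4f}")
```

The fitted model should score far lower than the deliberately mismatched one, which mirrors how a lower FID indicates generated data that is statistically closer to the real data.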

🔹 Applications

🔹 Advantages & Disadvantages

Advantages:

Disadvantages:

🔹 Key Evaluation Metrics

🔹 Python Implementation (Conceptual Sketches)

Training large generative models from scratch is a major undertaking. Here are conceptual sketches of what the code looks like using popular frameworks.

Simple GAN Generator in PyTorch


import torch.nn as nn
import numpy as np

class Generator(nn.Module):
    def __init__(self, latent_dim, img_shape):
        super().__init__()
        self.img_shape = img_shape
        self.model = nn.Sequential(
            # Takes a random noise vector (latent_dim) and upsamples it
            nn.Linear(latent_dim, 128),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(128, 256),
            nn.BatchNorm1d(256),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(256, 512),
            nn.BatchNorm1d(512),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(512, int(np.prod(self.img_shape))),
            nn.Tanh() # Scales output to be between -1 and 1
        )

    def forward(self, z):
        img = self.model(z)
        img = img.view(img.size(0), *self.img_shape)
        return img
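
The generator above is only half of a GAN: it is trained against a discriminator that scores inputs as real or fake. The following framework-free sketch shows how the two losses are typically computed from the discriminator's outputs. It assumes the common binary cross-entropy formulation with the non-saturating generator loss; the probability values are made up for illustration, and the function names are hypothetical.

```python
import math

def bce(prediction, target):
    """Binary cross-entropy for a single probability in (0, 1)."""
    eps = 1e-12  # guards against log(0)
    prediction = min(max(prediction, eps), 1 - eps)
    return -(target * math.log(prediction)
             + (1 - target) * math.log(1 - prediction))

def discriminator_loss(d_real, d_fake):
    """The discriminator wants D(real) close to 1 and D(fake) close to 0."""
    real_term = sum(bce(p, 1.0) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0.0) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """The generator wants the discriminator to output 1 on its fakes
    (the non-saturating form of the GAN objective)."""
    return sum(bce(p, 1.0) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs: probability that the input is real.
d_real = [0.9, 0.8, 0.95]
d_fake_early = [0.1, 0.05, 0.2]   # early training: fakes are easy to spot
d_fake_later = [0.5, 0.45, 0.55]  # later: fakes are hard to tell apart

print("generator loss:", generator_loss(d_fake_early), "->", generator_loss(d_fake_later))
print("discriminator loss:", discriminator_loss(d_real, d_fake_early),
      "->", discriminator_loss(d_real, d_fake_later))
```

As the fakes become more convincing, the generator's loss falls while the discriminator's loss rises, which is the tug-of-war that drives GAN training.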
            

Simple GAN Generator in TensorFlow/Keras


import tensorflow as tf
from tensorflow.keras import layers
import numpy as np

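def build_generator(latent_dim, img_shape):
    model = tf.keras.Sequential()

    # Declare the input explicitly; newer Keras versions warn against
    # passing input_dim to a layer.
    model.add(tf.keras.Input(shape=(latent_dim,)))
    model.add(layers.Dense(256))
    # The slope is passed positionally because its keyword name differs
    # between Keras 2 (alpha) and Keras 3 (negative_slope).
    model.add(layers.LeakyReLU(0.2))
    model.add(layers.BatchNormalization(momentum=0.8))

    model.add(layers.Dense(512))
    model.add(layers.LeakyReLU(0.2))
    model.add(layers.BatchNormalization(momentum=0.8))

    model.add(layers.Dense(1024))
    model.add(layers.LeakyReLU(0.2))
    model.add(layers.BatchNormalization(momentum=0.8))

    # Project to the flattened image size; tanh keeps outputs in [-1, 1].
    # int(...) converts the NumPy integer to a plain Python int.
    model.add(layers.Dense(int(np.prod(img_shape)), activation='tanh'))
    model.add(layers.Reshape(img_shape))

    return model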
            

Using a Pre-trained Model from Hugging Face


# Easiest way to get started with powerful generative models!
from transformers import pipeline

# Initialize a text generation pipeline with a pre-trained model
generator = pipeline('text-generation', model='gpt2')

# Generate text (max_length counts the prompt tokens plus the newly generated tokens)
prompt = "In a world where AI could dream,"
generated_text = generator(prompt, max_length=50, num_return_sequences=1)

print(generated_text[0]['generated_text'])
            

๐Ÿ“ Quick Quiz: Test Your Knowledge

  1. What is the key difference between a generative model and a discriminative model?
  2. In a GAN, what are the roles of the Generator and the Discriminator?
  3. What is the core idea behind Diffusion Models?
  4. You have trained a GAN to generate images of cats. You calculate the FID score and get a value of 5. Your colleague trains another model and gets an FID score of 45. Which model is better, and why?

Answers

1. A generative model (the artist) learns the underlying distribution of the data, P(X), and can create new samples. A discriminative model (the critic) learns the decision boundary between classes, P(Y|X), and can only classify existing data.

2. The Generator tries to create fake data that looks real. The Discriminator tries to distinguish between real data and the Generator's fake data.

3. The core idea is to learn to reverse a process of gradually adding noise to an image. By mastering this "denoising" process, the model can start with pure noise and denoise it step-by-step into a coherent new image.

4. Your model with an FID score of 5 is much better. For Fréchet Inception Distance (FID), a lower score is better, as it indicates that the statistical distribution of your generated images is closer to the distribution of the real images.
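Answer 3 describes diffusion as learning to reverse a gradual noising process. The sketch below illustrates only the noising side and its algebraic inverse, under stated assumptions: a linear noise schedule from 1e-4 to 0.02 over 1000 steps (values commonly used in the literature, chosen here for illustration), a made-up 4-pixel "image", and no neural network. In a real diffusion model a trained network predicts the noise; here the true noise is handed to the denoising function so the inversion can be checked exactly. All function names are hypothetical.

```python
import math
import random

def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear noise schedule."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(x0, alpha_bar, rng):
    """Forward process: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [math.sqrt(alpha_bar) * x + math.sqrt(1 - alpha_bar) * e
           for x, e in zip(x0, eps)]
    return x_t, eps

def denoise_with_known_noise(x_t, eps, alpha_bar):
    """Invert the forward step. A trained network would predict eps; here
    the true eps is passed in, so the recovery is exact up to rounding."""
    return [(x - math.sqrt(1 - alpha_bar) * e) / math.sqrt(alpha_bar)
            for x, e in zip(x_t, eps)]

rng = random.Random(0)
alpha_bars = make_alpha_bars(1000)
x0 = [0.5, -0.25, 1.0, 0.0]  # hypothetical 4-pixel "image"

# Noise the image halfway through the schedule, then undo it.
x_mid, eps_mid = add_noise(x0, alpha_bars[499], rng)
recovered = denoise_with_known_noise(x_mid, eps_mid, alpha_bars[499])

print("fraction of signal left at the final step:", math.sqrt(alpha_bars[-1]))
print("recovered:", [round(v, 6) for v in recovered])
```

By the final step almost none of the original signal remains, which is why generation can start from pure noise: the model's job is to supply, step by step, the noise estimate that this sketch was given for free.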

🔹 Key Terminology Explained

The Story: Decoding the AI Artist's Toolkit

{% endblock %}