Deep Learning (Intermediate)

Learn about autoencoders - neural networks that learn compressed representations by reconstructing their input.

Tags: autoencoders, vae, unsupervised-learning, dimensionality-reduction, generative

Autoencoders

Autoencoders are neural networks that learn to compress data into a lower-dimensional representation and then reconstruct the original input. They're powerful tools for dimensionality reduction, denoising, and generative modeling.

Architecture

Input → [Encoder] → Latent Code → [Decoder] → Reconstruction
  x   →           →      z      →           →       x̂

Components

  • Encoder: Compresses input to latent representation
  • Latent Space: Compressed representation (bottleneck)
  • Decoder: Reconstructs input from latent code

Training Objective

Loss = ||x - x̂||²

Minimize reconstruction error.

Basic Autoencoder

import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, input_dim, latent_dim):
        super().__init__()
        # Encoder: compress input_dim → latent_dim through one hidden layer
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256),
            nn.ReLU(),
            nn.Linear(256, latent_dim)
        )
        # Decoder mirrors the encoder; Sigmoid assumes inputs scaled to [0, 1]
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256),
            nn.ReLU(),
            nn.Linear(256, input_dim),
            nn.Sigmoid()
        )

    def forward(self, x):
        z = self.encoder(x)      # compress
        x_hat = self.decoder(z)  # reconstruct
        return x_hat

    def encode(self, x):
        return self.encoder(x)
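A minimal training-loop sketch for the class above. It assumes each batch from `loader` is a tensor of inputs scaled to [0, 1] (to match the Sigmoid output); the loop name and hyperparameters are illustrative, not a fixed API:

```python
import torch
import torch.nn as nn

# Minimal training sketch: minimize reconstruction error ||x - x̂||².
# Training is unsupervised, so the input is also the target.
def train_autoencoder(model, loader, epochs=10, lr=1e-3):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.MSELoss()
    for _ in range(epochs):
        for x in loader:
            x = x.view(x.size(0), -1)   # flatten each sample
            x_hat = model(x)
            loss = criterion(x_hat, x)  # reconstruction error
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```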

Types of Autoencoders

Undercomplete Autoencoder

Latent dimension < input dimension:

784 → 64 → 784

Forces compression, learns important features.

Overcomplete Autoencoder

Latent dimension ≥ input dimension:

784 → 1000 → 784

Needs regularization to avoid identity mapping.

Sparse Autoencoder

Add sparsity penalty:

Loss = ||x - x̂||² + λ × sparsity(z)

Most latent units inactive for any input.
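The penalty above can be sketched with an L1 term on the latent activations (one common choice; a KL penalty on mean activations is another). The weighting hyperparameter `lam` is illustrative:

```python
import torch
import torch.nn.functional as F

# Sparse-autoencoder loss sketch: reconstruction + λ × sparsity(z),
# with sparsity measured as the mean absolute latent activation (L1).
def sparse_loss(x, x_hat, z, lam=1e-3):
    recon = F.mse_loss(x_hat, x)   # ||x - x̂||²
    sparsity = z.abs().mean()      # L1 drives most latent units toward 0
    return recon + lam * sparsity
```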

Denoising Autoencoder (DAE)

Train to reconstruct from corrupted input:

x̃ = corrupt(x)       # Add noise
z = encoder(x̃)
x̂ = decoder(z)
Loss = ||x - x̂||²    # Reconstruct clean!

Learns robust features.
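One training step of the recipe above, sketched with additive Gaussian corruption (the corruption type and `noise_std` are assumptions; masking noise is equally common). The key detail: the corrupted x̃ goes into the network, but the loss targets the clean x:

```python
import torch

# One denoising-autoencoder training step with Gaussian corruption.
def dae_step(model, x, optimizer, noise_std=0.3):
    x_tilde = x + noise_std * torch.randn_like(x)  # x̃ = corrupt(x)
    x_hat = model(x_tilde)                          # reconstruct from noisy input
    loss = ((x - x_hat) ** 2).mean()                # target is the CLEAN x
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```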

Contractive Autoencoder (CAE)

Penalize sensitivity to input:

Loss = ||x - x̂||² + λ × ||∂z/∂x||²

Learns smooth, stable representations.
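The penalty ||∂z/∂x||² can be computed with autograd as sketched below: one backward pass per latent unit, summing squared gradients over the batch. This is fine for small latent dimensions; a one-hidden-layer encoder admits a cheaper closed form:

```python
import torch

# Contractive penalty: squared Frobenius norm of the encoder Jacobian,
# summed over the batch.
def contractive_penalty(encoder, x):
    x = x.clone().requires_grad_(True)
    z = encoder(x)
    penalty = x.new_zeros(())
    for j in range(z.size(1)):  # one backward pass per latent unit
        grads = torch.autograd.grad(z[:, j].sum(), x,
                                    create_graph=True, retain_graph=True)[0]
        penalty = penalty + (grads ** 2).sum()
    return penalty
```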

Variational Autoencoder (VAE)

Key Difference

Encode to distribution, not point:

Encoder → μ, σ (mean and std of Gaussian)
z ~ N(μ, σ²)  (sample from distribution)
Decoder(z) → x̂

Loss Function

Loss = Reconstruction + KL Divergence
     = ||x - x̂||² + KL(q(z|x) || p(z))

KL term regularizes latent space to be Gaussian.

Reparameterization Trick

To backpropagate through sampling:

z = μ + σ × ε,  where ε ~ N(0, 1)

Implementation

import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    def __init__(self, input_dim, latent_dim):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 256),
            nn.ReLU()
        )
        # Two heads predict the parameters of q(z|x) = N(μ, σ²)
        self.fc_mu = nn.Linear(256, latent_dim)
        self.fc_logvar = nn.Linear(256, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 256),
            nn.ReLU(),
            nn.Linear(256, input_dim),
            nn.Sigmoid()
        )

    def encode(self, x):
        h = self.encoder(x)
        return self.fc_mu(h), self.fc_logvar(h)

    def reparameterize(self, mu, logvar):
        # z = μ + σ × ε keeps sampling differentiable w.r.t. μ and σ
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        return mu + eps * std

    def decode(self, z):
        return self.decoder(z)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decode(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    # Reconstruction term (BCE is also common for binary data) + closed-form
    # KL divergence between N(μ, σ²) and the standard normal prior
    recon = F.mse_loss(x_hat, x, reduction='sum')
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
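A training-step sketch for a model with the (x̂, μ, log σ²) interface above. Because the loss terms use `reduction='sum'`, the loss scales with batch size; dividing by the batch keeps learning rates comparable across batch sizes (a common convention, not the only one):

```python
import torch
import torch.nn.functional as F

# One VAE training step: reconstruction + KL, averaged per sample.
def vae_step(model, x, optimizer):
    x_hat, mu, logvar = model(x)
    recon = F.mse_loss(x_hat, x, reduction='sum')
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    loss = (recon + kl) / x.size(0)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```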

Applications

Dimensionality Reduction

latent_codes = model.encode(data)
# Use for visualization, clustering

Non-linear alternative to PCA.

Denoising

Noisy image → Trained denoiser → Clean image

Anomaly Detection

def detect_anomaly(x):
    x_hat = model(x)
    error = ((x - x_hat) ** 2).mean()
    return error > threshold  # High error = anomaly
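The `threshold` above has to come from somewhere. One common sketch: fit it from reconstruction errors on held-out normal data, e.g. the 99th percentile, so anything reconstructed worse than almost all normal samples is flagged. The function name and quantile are illustrative:

```python
import torch

# Fit the anomaly threshold from per-sample reconstruction errors
# on data known to be normal.
def fit_threshold(model, normal_data, quantile=0.99):
    model.eval()
    with torch.no_grad():
        x_hat = model(normal_data)
        errors = ((normal_data - x_hat) ** 2).mean(dim=1)  # per-sample error
    return torch.quantile(errors, quantile).item()
```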

Generation (VAE)

# Sample from prior
z = torch.randn(1, latent_dim)
generated = model.decode(z)

Data Augmentation

Generate variations of training data.

Convolutional Autoencoders

For images, use conv layers:

import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Each stride-2 conv halves the spatial resolution
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1),
            nn.ReLU()
        )
        # Transposed convs double it back; output_padding restores exact sizes
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 3, stride=2, padding=1, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 3, stride=2, padding=1, output_padding=1),
            nn.Sigmoid()
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))
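A quick shape walk-through of those layers on a 28×28 single-channel input (e.g. MNIST): each stride-2 conv halves the spatial size (28 → 14 → 7), and each transposed conv doubles it back:

```python
import torch
import torch.nn as nn

# Same layer stack as above, applied to a dummy input to trace shapes.
encoder = nn.Sequential(
    nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
)
decoder = nn.Sequential(
    nn.ConvTranspose2d(64, 32, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
    nn.ConvTranspose2d(32, 1, 3, stride=2, padding=1, output_padding=1), nn.Sigmoid(),
)
x = torch.randn(1, 1, 28, 28)
z = encoder(x)       # latent feature map: (1, 64, 7, 7)
x_hat = decoder(z)   # reconstruction: (1, 1, 28, 28)
```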

Autoencoder vs PCA

Aspect            | PCA               | Autoencoder
Transformation    | Linear            | Non-linear
Training          | Closed-form       | Gradient descent
Interpretability  | Eigenvectors      | Black box
Scalability       | Memory intensive  | Mini-batch friendly

With linear activations and MSE loss, a trained autoencoder learns the same subspace as PCA (though not necessarily the individual principal components).

Tips for Training

Architecture

  • Symmetry helps (encoder/decoder mirror)
  • Start simple, add complexity
  • Bottleneck size: experiment

Loss Functions

  • MSE for real-valued data
  • Binary cross-entropy for binary data
  • Perceptual loss for images

Regularization

  • Dropout in encoder
  • Noise injection (denoising)
  • Weight decay

Key Takeaways

  1. Autoencoders learn compressed representations
  2. Encoder → Latent → Decoder architecture
  3. Training minimizes reconstruction error
  4. VAEs enable generation by modeling latent distribution
  5. Applications: compression, denoising, anomaly detection
  6. For images, use convolutional architectures