Intermediate · Foundations

Master eigenvalues and eigenvectors - essential linear algebra concepts used in PCA, spectral clustering, and understanding neural networks.

linear-algebra · pca · dimensionality-reduction · mathematics

Eigenvalues and Eigenvectors

Eigenvalues and eigenvectors are fundamental concepts in linear algebra that appear throughout machine learning, from PCA to understanding how neural networks transform data.

The Basic Idea

For a square matrix A, an eigenvector v is a non-zero vector that, when multiplied by A, only gets scaled (not rotated):

A × v = λ × v

Where:

  • v is the eigenvector (direction that stays the same)
  • λ (lambda) is the eigenvalue (the scaling factor)
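As a quick numerical check of this definition, here is a minimal NumPy sketch (the 2×2 matrix is a made-up toy example):

```python
import numpy as np

# Toy diagonal matrix: stretches x by 2 and y by 3.
A = np.array([[2.0, 0.0],
              [0.0, 3.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)

# Each column of `eigenvectors` satisfies A @ v == lam * v.
for lam, v in zip(eigenvalues, eigenvectors.T):
    assert np.allclose(A @ v, lam * v)
```

Here the eigenvectors are the coordinate axes, and the eigenvalues 2 and 3 are the stretch factors.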

Intuitive Understanding

Think of a matrix as a transformation. Most vectors change direction when transformed. But eigenvectors are special - they only stretch or shrink, staying on the same line.

Example: Consider a 2D stretch transformation:

  • Vectors along the stretch direction are eigenvectors
  • The stretch factor is the eigenvalue

Computing Eigenvalues and Eigenvectors

To find eigenvalues, solve:

det(A - λI) = 0

This gives you eigenvalues. Then for each λ, solve:

(A - λI)v = 0

to find the corresponding eigenvector.
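For a 2×2 matrix, this two-step recipe can be carried out directly; a sketch in NumPy (toy matrix assumed):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Step 1: for a 2x2 matrix, det(A - λI) = λ² - trace(A)·λ + det(A).
lams = np.roots([1.0, -np.trace(A), np.linalg.det(A)])   # eigenvalues 3 and 1

# Step 2: for each λ, an eigenvector spans the null space of (A - λI).
for lam in lams:
    M = A - lam * np.eye(2)
    v = np.linalg.svd(M)[2][-1]   # last right singular vector ≈ null-space direction
    assert np.allclose(A @ v, lam * v)
```

For anything larger than 3×3 the polynomial has no closed form, so libraries use iterative methods instead of this recipe.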

Properties

Symmetric Matrices (A = Aᵀ)

  • All eigenvalues are real (not complex)
  • Eigenvectors for distinct eigenvalues are orthogonal (and an orthonormal eigenbasis always exists)
  • Can always be diagonalized, by an orthogonal matrix

This is why covariance matrices (symmetric!) are so nice for PCA.
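These properties are easy to verify numerically with NumPy's symmetric-specialized routine (random matrix, arbitrary seed):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
S = B + B.T                                   # symmetric by construction

w, V = np.linalg.eigh(S)                      # eigh: for symmetric matrices
assert np.isrealobj(w)                        # eigenvalues are real
assert np.allclose(V.T @ V, np.eye(4))        # eigenvectors are orthonormal
assert np.allclose(V @ np.diag(w) @ V.T, S)   # diagonalization S = V Λ Vᵀ
```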

Eigendecomposition

A matrix with a full set of linearly independent eigenvectors (a diagonalizable matrix) can be decomposed as:

A = VΛV⁻¹

Where V contains eigenvectors and Λ is diagonal with eigenvalues.
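A minimal reconstruction check (toy matrix with distinct eigenvalues, hence diagonalizable):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])              # eigenvalues 5 and 2

w, V = np.linalg.eig(A)
Lam = np.diag(w)

# A = V Λ V⁻¹
A_rebuilt = V @ Lam @ np.linalg.inv(V)
assert np.allclose(A_rebuilt, A)
```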

Trace and Determinant

  • Sum of eigenvalues = trace(A)
  • Product of eigenvalues = det(A)
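Both identities can be checked in a couple of lines (same kind of toy matrix):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])                      # eigenvalues 5 and 2
w = np.linalg.eigvals(A)

assert np.isclose(w.sum(), np.trace(A))         # 5 + 2 == 4 + 3
assert np.isclose(w.prod(), np.linalg.det(A))   # 5 * 2 == 4*3 - 1*2
```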

Applications in Machine Learning

Principal Component Analysis (PCA)

PCA finds directions of maximum variance:

  1. Compute covariance matrix of data
  2. Find its eigenvectors and eigenvalues
  3. Eigenvectors are principal components
  4. Eigenvalues indicate variance explained

Larger eigenvalue = more important direction.
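The four steps above can be sketched in NumPy (synthetic data stretched along an assumed direction; this mirrors, rather than replaces, a library PCA):

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic data: large variance along (1, 1), small variance across it.
X = rng.standard_normal((200, 2)) * np.array([3.0, 0.5])      # stretch
X = X @ (np.array([[1.0, 1.0], [-1.0, 1.0]]) / np.sqrt(2))    # rotate 45°

Xc = X - X.mean(axis=0)                # center the data
C = np.cov(Xc, rowvar=False)           # 1. covariance matrix
w, V = np.linalg.eigh(C)               # 2. eigenvalues/eigenvectors (ascending)
components = V[:, ::-1].T              # 3. principal components, largest first
explained = w[::-1] / w.sum()          # 4. fraction of variance explained

# The first component recovers the (1, 1)/√2 direction (up to sign).
```

A library implementation such as scikit-learn's PCA computes the same quantities (via an SVD of the centered data) and exposes them as `components_` and `explained_variance_ratio_`.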

Spectral Clustering

  1. Build similarity/Laplacian matrix
  2. Compute eigenvectors
  3. Cluster in the eigenvector space

Eigenvectors reveal the cluster structure.
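A toy version of these three steps, assuming a Gaussian similarity kernel with unit bandwidth and the unnormalized Laplacian (one of several common variants):

```python
import numpy as np

# Two well-separated 1-D blobs of 10 points each.
rng = np.random.default_rng(2)
x = np.concatenate([rng.normal(0.0, 0.1, 10), rng.normal(5.0, 0.1, 10)])

W = np.exp(-(x[:, None] - x[None, :]) ** 2)    # 1. similarity matrix (Gaussian kernel)
L = np.diag(W.sum(axis=1)) - W                 # 2. unnormalized graph Laplacian
w, V = np.linalg.eigh(L)                       #    eigenvectors of L
labels = (V[:, 1] > 0).astype(int)             # 3. split on the sign of the 2nd eigenvector
# Each blob ends up in its own cluster.
```

The eigenvector for the second-smallest eigenvalue (the Fiedler vector) is nearly constant within each well-connected group, which is why its sign separates them.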

Google's PageRank

The importance scores of web pages form the eigenvector of the link matrix corresponding to eigenvalue 1.
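A sketch on an invented three-page web (the link structure is made up for illustration):

```python
import numpy as np

# Column j holds where page j's links point, with probability split evenly.
# Page 1 links to pages 2 and 3; page 2 links to page 3; page 3 links to page 1.
M = np.array([[0.0, 0.0, 1.0],
              [0.5, 0.0, 0.0],
              [0.5, 1.0, 0.0]])

w, V = np.linalg.eig(M)
rank = np.real(V[:, np.argmax(np.real(w))])    # eigenvector for eigenvalue 1
rank = rank / rank.sum()                       # normalize to sum to 1
# rank ≈ [0.4, 0.2, 0.4]: pages 1 and 3 are equally important.
```

The full algorithm also mixes in a damping factor so the matrix is strictly positive and this eigenvector is unique.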

Neural Network Analysis

  • Hessian eigenvalues: Indicate loss landscape curvature
  • Large eigenvalues → sharp minima
  • Eigenspectrum helps understand trainability

Markov Chains

The stationary distribution of a Markov chain is a left eigenvector of the transition matrix with eigenvalue 1.
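For example, with an assumed two-state weather chain:

```python
import numpy as np

# Toy transition matrix: rows are "from", columns are "to".
P = np.array([[0.9, 0.1],    # sunny -> sunny, sunny -> rainy
              [0.5, 0.5]])   # rainy -> sunny, rainy -> rainy

# πP = π means π is an eigenvector of Pᵀ with eigenvalue 1.
w, V = np.linalg.eig(P.T)
pi = np.real(V[:, np.argmin(np.abs(w - 1.0))])
pi = pi / pi.sum()                 # normalize to a probability distribution
assert np.allclose(pi @ P, pi)     # stationary: ≈ [5/6, 1/6]
```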

Singular Value Decomposition (SVD)

SVD generalizes eigendecomposition to non-square matrices:

A = UΣVᵀ
  • U: left singular vectors (eigenvectors of AAᵀ)
  • Σ: singular values (square roots of the eigenvalues of AᵀA)
  • V: right singular vectors (eigenvectors of AᵀA)
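A quick check of these relationships on a toy rectangular matrix:

```python
import numpy as np

A = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 0.0]])                 # 2x3 rectangular matrix

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Singular values squared == eigenvalues of A·Aᵀ (largest first).
eig_AAt = np.sort(np.linalg.eigvalsh(A @ A.T))[::-1]
assert np.allclose(s ** 2, eig_AAt)

# The factors reconstruct A = U Σ Vᵀ.
assert np.allclose(U @ np.diag(s) @ Vt, A)
```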

SVD is used for:

  • Matrix factorization in recommendations
  • Latent semantic analysis in NLP
  • Image compression

Numerical Considerations

Power Iteration

Simple algorithm to find the largest-magnitude eigenvalue and its eigenvector:

  1. Start with a random vector v
  2. Repeat: v = Av / ||Av||
  3. v converges to the dominant eigenvector (provided one eigenvalue strictly dominates in magnitude)
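A direct implementation of these steps (toy symmetric matrix; the Rayleigh quotient reads off the eigenvalue estimate):

```python
import numpy as np

def power_iteration(A, steps=100, seed=3):
    """Estimate the dominant eigenpair of A by repeated multiplication."""
    v = np.random.default_rng(seed).standard_normal(A.shape[0])
    for _ in range(steps):
        v = A @ v
        v = v / np.linalg.norm(v)     # renormalize to avoid overflow
    lam = v @ A @ v                   # Rayleigh quotient: eigenvalue estimate
    return lam, v

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])            # eigenvalues 3 and 1
lam, v = power_iteration(A)
assert np.isclose(lam, 3.0)           # converged to the largest eigenvalue
```

The convergence rate depends on the ratio |λ₂/λ₁|: the smaller it is, the faster the non-dominant components die out.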

Computational Cost

  • Full eigendecomposition: O(n³)
  • Top k eigenvectors only: much faster with iterative methods (Lanczos, randomized algorithms)

Key Takeaways

  1. Eigenvectors are directions preserved by a transformation
  2. Eigenvalues are the corresponding scaling factors
  3. PCA uses eigenvectors of covariance matrix
  4. Symmetric matrices have real eigenvalues and orthogonal eigenvectors
  5. SVD extends these ideas to rectangular matrices
