SVD

Overview

Singular Value Decomposition (SVD) is a fundamental matrix factorization technique in linear algebra with a vast range of applications across science and engineering. It decomposes any rectangular matrix into a product of three distinct matrices: an orthogonal matrix, a diagonal matrix, and another orthogonal matrix.

SVD is particularly powerful because it generalizes concepts like eigendecomposition to non-square matrices and provides deep insights into a matrix's structure, including its rank, null space, and column space. It's a cornerstone for techniques like Principal Component Analysis (PCA), recommendation systems, image compression, and noise reduction.

Core Concepts

  • Definition

    For any real m x n matrix A, the Singular Value Decomposition is given by:

    A = UΣVᵀ

    Where:

    • U is an m x m orthogonal matrix. Its columns are the left-singular vectors of A; they are the eigenvectors of AAᵀ.
    • Σ (Sigma) is an m x n rectangular diagonal matrix. Its diagonal entries are the singular values of A, which are non-negative and typically arranged in descending order (σ1 ≥ σ2 ≥ ... ≥ 0). The non-zero singular values are the square roots of the non-zero eigenvalues of both AᵀA and AAᵀ.
    • Vᵀ is the transpose of an n x n orthogonal matrix V. The columns of V (which are the rows of Vᵀ) are the right-singular vectors of A; they are the eigenvectors of AᵀA.

    The number of non-zero singular values is equal to the rank of the matrix A.

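    These properties are easy to verify numerically. The sketch below (using NumPy, with an arbitrarily chosen rank-deficient example matrix) checks the orthogonality of U and V and the rank claim:

    import numpy as np

    # A 3x3 example matrix whose second row is twice the first, so rank(A) = 2.
    A = np.array([[1.0, 2.0, 3.0],
                  [2.0, 4.0, 6.0],
                  [1.0, 0.0, 1.0]])

    U, s, Vh = np.linalg.svd(A)

    # U and V are orthogonal: their transposes act as their inverses.
    print(np.allclose(U.T @ U, np.eye(3)))    # True
    print(np.allclose(Vh @ Vh.T, np.eye(3)))  # True

    # Singular values are non-negative and sorted in descending order;
    # the number of (numerically) non-zero ones equals rank(A) = 2.
    print(s)
    print(np.sum(s > 1e-10), np.linalg.matrix_rank(A))  # 2 2
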
  • Geometric Interpretation

    Geometrically, SVD states that any linear transformation represented by matrix A can be decomposed into three fundamental operations:

    1. Rotation/Reflection (Vᵀ): An initial rotation (or reflection) of the input space. Since V is orthogonal, Vᵀ preserves lengths and angles.
    2. Scaling (Σ): A scaling operation along the new axes. The singular values in Σ dictate how much each dimension is stretched or shrunk. Some dimensions might be scaled to zero if corresponding singular values are zero.
    3. Rotation/Reflection (U): A final rotation (or reflection) in the output space. Since U is orthogonal, it also preserves lengths and angles.

    Essentially, SVD describes how a linear transformation maps a unit sphere in the input space to an ellipsoid (possibly in a lower-dimensional subspace) in the output space. The singular values are the lengths of the semi-axes of this ellipsoid.
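
    As a quick numerical illustration (a minimal sketch with an arbitrary 2x2 example matrix), the longest and shortest vectors on the image of the unit circle have lengths equal to the largest and smallest singular values:

    import numpy as np

    # Arbitrary 2x2 linear transformation used only for illustration.
    A = np.array([[3.0, 1.0],
                  [1.0, 2.0]])

    s = np.linalg.svd(A, compute_uv=False)   # singular values, descending

    # Push points on the unit circle through A and measure their lengths.
    theta = np.linspace(0.0, 2.0 * np.pi, 10000)
    circle = np.vstack([np.cos(theta), np.sin(theta)])   # shape (2, N)
    lengths = np.linalg.norm(A @ circle, axis=0)

    # Semi-axes of the resulting ellipse match the singular values.
    print(lengths.max(), s[0])    # approximately equal
    print(lengths.min(), s[-1])   # approximately equal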

  • Reduced SVD (Economy SVD)

    In practice, especially when m > n (tall matrix) or m < n (wide matrix), a more compact form called the Reduced SVD (or Economy SVD) is often used.

    • If m ≥ n (rank r ≤ n ≤ m):
      A = UᵣΣᵣVᵣᵀ, where Uᵣ is m x r, Σᵣ is r x r (square diagonal with r non-zero singular values), and Vᵣᵀ is r x n (i.e. Vᵣ is n x r).
    • If m < n (rank r ≤ m < n):
      A similar reduction applies.

    Many numerical libraries compute the economy SVD by default to save computation and storage when U or V would be unnecessarily large.
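
    In NumPy this corresponds to the full_matrices flag of np.linalg.svd; a minimal sketch of the resulting shapes for a tall example matrix:

    import numpy as np

    # Tall example matrix: m = 5 rows, n = 3 columns.
    A = np.random.default_rng(0).normal(size=(5, 3))

    # Full SVD: U is m x m, Vh is n x n, s holds the min(m, n) singular values.
    U_full, s, Vh_full = np.linalg.svd(A, full_matrices=True)
    print(U_full.shape, s.shape, Vh_full.shape)   # (5, 5) (3,) (3, 3)

    # Economy SVD: U is m x k and Vh is k x n with k = min(m, n); the extra
    # columns of U (which would be multiplied by zeros) are simply dropped.
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    print(U.shape, s.shape, Vh.shape)             # (5, 3) (3,) (3, 3)

    # Both variants reconstruct A.
    print(np.allclose(U @ np.diag(s) @ Vh, A))    # True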

  • Connection to Eigendecomposition

    SVD is closely related to eigendecomposition:

    • The left-singular vectors of A (columns of U) are the eigenvectors of AAᵀ.
    • The right-singular vectors of A (columns of V) are the eigenvectors of AᵀA.
    • The non-zero singular values of A are the square roots of the non-zero eigenvalues of both AAᵀ and AᵀA.

    However, SVD is more general as it applies to any m x n matrix, whereas eigendecomposition is typically defined for square matrices.
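
    A short numerical check of this relationship (a sketch with an arbitrary random example matrix):

    import numpy as np

    A = np.random.default_rng(1).normal(size=(4, 3))   # arbitrary example

    # Singular values of A, largest first.
    s = np.linalg.svd(A, compute_uv=False)

    # Eigenvalues of the symmetric matrix AᵀA, sorted to match.
    eigvals = np.linalg.eigvalsh(A.T @ A)[::-1]

    # sigma_i squared equals lambda_i up to floating-point error.
    print(np.allclose(s**2, eigvals))   # True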

  • Principal Component Analysis (PCA)

    SVD is a common way to perform PCA. If X is a mean-centered data matrix (samples in rows), the SVD X = UΣVᵀ gives the principal directions as the columns of V (the rows of Vᵀ), and the variance explained by each component is proportional to the square of the corresponding singular value.
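
    A minimal sketch of PCA via SVD on synthetic data (the toy data and variable names below are illustrative, not a fixed API):

    import numpy as np

    rng = np.random.default_rng(42)
    X = rng.normal(size=(200, 3))       # toy data: 200 samples, 3 features
    X = X - X.mean(axis=0)              # mean-center each feature

    U, s, Vh = np.linalg.svd(X, full_matrices=False)

    components = Vh                     # rows are the principal directions
    explained_variance = s**2 / (X.shape[0] - 1)
    projected = X @ components.T        # data in principal-component coordinates

    print(explained_variance)
    print(np.allclose(projected, U * s))  # equivalently, U scaled column-wise by s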

  • Low-Rank Matrix Approximation

    SVD gives the best low-rank approximation of a matrix: by the Eckart-Young theorem, keeping the k largest singular values and their corresponding singular vectors yields the rank-k matrix Aₖ = UₖΣₖVₖᵀ that is closest to A in both the Frobenius and spectral norms. This is fundamental for image compression and noise reduction.
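
    A minimal sketch of a rank-k truncation (using a random matrix as a stand-in for an image or data matrix):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.normal(size=(20, 15))       # stand-in for an image/data matrix

    U, s, Vh = np.linalg.svd(A, full_matrices=False)

    k = 5                               # keep the 5 largest singular values
    A_k = U[:, :k] @ np.diag(s[:k]) @ Vh[:k, :]

    # The Frobenius-norm error equals the root-sum-square of the discarded
    # singular values, and A_k has rank k.
    error = np.linalg.norm(A - A_k, ord="fro")
    print(error, np.sqrt(np.sum(s[k:]**2)))   # approximately equal
    print(np.linalg.matrix_rank(A_k))         # 5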

  • Recommendation Systems

    In collaborative filtering, SVD can be used to factorize user-item rating matrices to discover latent factors and predict missing ratings.
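
    A toy sketch of the idea (a small dense rating matrix with made-up values; a real system would also have to handle the missing entries, e.g. with iterative or regularized factorization):

    import numpy as np

    # Toy user-item rating matrix: 4 users x 5 items.
    R = np.array([[5, 4, 0, 1, 1],
                  [4, 5, 1, 1, 0],
                  [1, 1, 0, 5, 4],
                  [0, 1, 5, 4, 4]], dtype=float)

    U, s, Vh = np.linalg.svd(R, full_matrices=False)

    k = 2                                   # two latent "taste" factors
    user_factors = U[:, :k] * s[:k]         # one row per user
    item_factors = Vh[:k, :].T              # one row per item

    # Predicted ratings are dot products of user and item factor vectors.
    R_pred = user_factors @ item_factors.T
    print(np.round(R_pred, 1))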

  • Natural Language Processing (NLP)

    Latent Semantic Analysis (LSA) uses SVD on term-document matrices to uncover latent semantic relationships between words and documents.

  • Solving Linear Systems & Pseudo-Inverse

    SVD can be used to compute the Moore-Penrose pseudo-inverse of a matrix, which is useful for solving linear least squares problems or finding solutions to systems of linear equations that do not have a unique solution.
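
    A sketch of building the pseudo-inverse from the SVD and using it for least squares (the matrix and vector below are arbitrary examples):

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [3.0, 4.0],
                  [5.0, 6.0]])          # overdetermined system: 3 equations, 2 unknowns
    b = np.array([1.0, 2.0, 2.0])

    # A⁺ = V Σ⁺ Uᵀ, where Σ⁺ inverts only the non-zero singular values.
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    s_inv = np.zeros_like(s)
    s_inv[s > 1e-12] = 1.0 / s[s > 1e-12]
    A_pinv = Vh.T @ np.diag(s_inv) @ U.T

    # Matches NumPy's built-in pseudo-inverse and the least-squares solution.
    print(np.allclose(A_pinv, np.linalg.pinv(A)))     # True
    x_ls = np.linalg.lstsq(A, b, rcond=None)[0]
    print(np.allclose(A_pinv @ b, x_ls))              # True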

Implementation

  • SVD with NumPy

    NumPy's `linalg.svd` function computes the SVD.
    
    import numpy as np
    
    # Define a matrix (can be non-square)
    A = np.array([[1, 2, 3],
                  [4, 5, 6]])
    
    print("Matrix A:\n", A)
    
    # Compute SVD
    # full_matrices=False for economy SVD
    U, s_diag, Vh = np.linalg.svd(A, full_matrices=False)
    
    print("\nU (left-singular vectors):\n", U)
    print("\ns_diag (singular values as a 1D array):\n", s_diag)
    print("\nVh (V transpose, right-singular vectors as rows):\n", Vh)
    
    # To reconstruct A, the 1D array of singular values must be placed back on
    # the diagonal of a matrix. With full_matrices=False, U is m x k, s_diag has
    # length k, and Vh is k x n, where k = min(m, n), so a k x k diagonal matrix
    # makes the shapes line up: (m x k) @ (k x k) @ (k x n) = m x n.
    Sigma = np.diag(s_diag)
    
    A_reconstructed = U @ Sigma @ Vh
    
    print("\nReconstructed A from U @ diag(s) @ Vh:\n", A_reconstructed)
    print("Original A and Reconstructed A are close:", np.allclose(A, A_reconstructed))
                            

Interview Examples

What is the difference between SVD and Eigendecomposition?

Highlight the key distinctions: SVD applies to any m x n matrix while eigendecomposition is defined only for square (diagonalizable) matrices; the SVD factors U and V are always orthogonal; and singular values are always non-negative reals, whereas eigenvalues can be negative or complex.

How can SVD be used for dimensionality reduction?

Explain the process: compute the SVD, keep only the k largest singular values and their singular vectors, then either project the data onto the top-k right-singular vectors (as in PCA) or work with the rank-k approximation UₖΣₖVₖᵀ.

Practice Questions

1. How would you implement SVD in a production environment? (Hard)

Hint: Consider scalability and efficiency

2. Explain the core concepts of SVD. (Easy)

Hint: Think about the fundamental principles

3. What are the practical applications of SVD? (Medium)

Hint: Consider both academic and industry use cases