SVD

Overview

Singular Value Decomposition (SVD) is a fundamental matrix factorization technique in linear algebra with a vast range of applications across science and engineering. It decomposes any rectangular matrix into a product of three distinct matrices: an orthogonal matrix, a diagonal matrix, and another orthogonal matrix.

SVD is particularly powerful because it generalizes concepts like eigendecomposition to non-square matrices and provides deep insights into a matrix's structure, including its rank, null space, and column space. It's a cornerstone for techniques like Principal Component Analysis (PCA), recommendation systems, image compression, and noise reduction.

Core Concepts

  • Definition

    For any real m x n matrix A, the Singular Value Decomposition is given by:

    A = UΣVᵀ

    Where:

    • U is an m x m orthogonal matrix. Its columns are the left-singular vectors of A; they are the eigenvectors of AAᵀ.
    • Σ (Sigma) is an m x n rectangular diagonal matrix. Its diagonal entries are the singular values of A, which are non-negative and typically arranged in descending order (σ1 ≥ σ2 ≥ ... ≥ 0). The non-zero singular values are the square roots of the non-zero eigenvalues of both AᵀA and AAᵀ.
    • Vᵀ is the transpose of an n x n orthogonal matrix V. The columns of V (which are the rows of Vᵀ) are the right-singular vectors of A; they are the eigenvectors of AᵀA.

    The number of non-zero singular values is equal to the rank of the matrix A.

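    These properties are easy to verify numerically. The sketch below (using NumPy, with an arbitrarily chosen rank-deficient example matrix) checks the orthogonality of U and V and the rank claim:

    import numpy as np

    # A 3x3 example matrix whose second row is twice the first, so rank(A) = 2.
    A = np.array([[1.0, 2.0, 3.0],
                  [2.0, 4.0, 6.0],
                  [1.0, 0.0, 1.0]])

    U, s, Vh = np.linalg.svd(A)

    # U and V are orthogonal: their transposes act as their inverses.
    print(np.allclose(U.T @ U, np.eye(3)))    # True
    print(np.allclose(Vh @ Vh.T, np.eye(3)))  # True

    # Singular values are non-negative and sorted in descending order;
    # the number of (numerically) non-zero ones equals rank(A) = 2.
    print(s)
    print(np.sum(s > 1e-10), np.linalg.matrix_rank(A))  # 2 2
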
  • Geometric Interpretation

    Geometrically, SVD states that any linear transformation represented by matrix A can be decomposed into three fundamental operations:

    1. Rotation/Reflection (Vᵀ): An initial rotation (or reflection) of the input space. Since V is orthogonal, Vᵀ preserves lengths and angles.
    2. Scaling (Σ): A scaling operation along the new axes. The singular values in Σ dictate how much each dimension is stretched or shrunk. Some dimensions might be scaled to zero if corresponding singular values are zero.
    3. Rotation/Reflection (U): A final rotation (or reflection) in the output space. Since U is orthogonal, it also preserves lengths and angles.

    Essentially, SVD describes how a linear transformation maps a unit sphere in the input space to an ellipsoid (possibly in a lower-dimensional subspace) in the output space. The singular values are the lengths of the semi-axes of this ellipsoid.
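
    As a quick numerical illustration (a minimal sketch with an arbitrary 2x2 example matrix), the longest and shortest vectors on the image of the unit circle have lengths equal to the largest and smallest singular values:

    import numpy as np

    # Arbitrary 2x2 linear transformation used only for illustration.
    A = np.array([[3.0, 1.0],
                  [1.0, 2.0]])

    s = np.linalg.svd(A, compute_uv=False)   # singular values, descending

    # Push points on the unit circle through A and measure their lengths.
    theta = np.linspace(0.0, 2.0 * np.pi, 10000)
    circle = np.vstack([np.cos(theta), np.sin(theta)])   # shape (2, N)
    lengths = np.linalg.norm(A @ circle, axis=0)

    # Semi-axes of the resulting ellipse match the singular values.
    print(lengths.max(), s[0])    # approximately equal
    print(lengths.min(), s[-1])   # approximately equal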

  • Reduced SVD (Economy SVD)

    In practice, especially when m > n (tall matrix) or m < n (wide matrix), a more compact form called the Reduced SVD (or Economy SVD) is often used.

    • If m ≥ n (rank r ≤ n ≤ m):
      A = UᵣΣᵣVᵣᵀ, where Uᵣ is m x r, Σᵣ is r x r (square diagonal with r non-zero singular values), and Vᵣᵀ is r x n (i.e. Vᵣ is n x r).
    • If m < n (rank r ≤ m < n):
      A similar reduction applies.

    Many numerical libraries compute the economy SVD by default to save computation and storage when U or V would be unnecessarily large.
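
    In NumPy this corresponds to the full_matrices flag of np.linalg.svd; a minimal sketch of the resulting shapes for a tall example matrix:

    import numpy as np

    # Tall example matrix: m = 5 rows, n = 3 columns.
    A = np.random.default_rng(0).normal(size=(5, 3))

    # Full SVD: U is m x m, Vh is n x n, s holds the min(m, n) singular values.
    U_full, s, Vh_full = np.linalg.svd(A, full_matrices=True)
    print(U_full.shape, s.shape, Vh_full.shape)   # (5, 5) (3,) (3, 3)

    # Economy SVD: U is m x k and Vh is k x n with k = min(m, n); the extra
    # columns of U (which would be multiplied by zeros) are simply dropped.
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    print(U.shape, s.shape, Vh.shape)             # (5, 3) (3,) (3, 3)

    # Both variants reconstruct A.
    print(np.allclose(U @ np.diag(s) @ Vh, A))    # True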

  • Connection to Eigendecomposition

    SVD is closely related to eigendecomposition:

    • The left-singular vectors of A (columns of U) are the eigenvectors of AAᵀ.
    • The right-singular vectors of A (columns of V) are the eigenvectors of AᵀA.
    • The non-zero singular values of A are the square roots of the non-zero eigenvalues of both AAᵀ and AᵀA.

    However, SVD is more general as it applies to any m x n matrix, whereas eigendecomposition is typically defined for square matrices.
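
    A short numerical check of this relationship (a sketch with an arbitrary random example matrix):

    import numpy as np

    A = np.random.default_rng(1).normal(size=(4, 3))   # arbitrary example

    # Singular values of A, largest first.
    s = np.linalg.svd(A, compute_uv=False)

    # Eigenvalues of the symmetric matrix AᵀA, sorted to match.
    eigvals = np.linalg.eigvalsh(A.T @ A)[::-1]

    # sigma_i squared equals lambda_i up to floating-point error.
    print(np.allclose(s**2, eigvals))   # True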

  • Principal Component Analysis (PCA)

    SVD is a common way to perform PCA. If X is a mean-centered data matrix (samples in rows), the SVD X = UΣVᵀ gives the principal directions as the columns of V (the rows of Vᵀ), and the variance explained by each component is proportional to the square of the corresponding singular value.
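
    A minimal sketch of PCA via SVD on synthetic data (the toy data and variable names below are illustrative, not a fixed API):

    import numpy as np

    rng = np.random.default_rng(42)
    X = rng.normal(size=(200, 3))       # toy data: 200 samples, 3 features
    X = X - X.mean(axis=0)              # mean-center each feature

    U, s, Vh = np.linalg.svd(X, full_matrices=False)

    components = Vh                     # rows are the principal directions
    explained_variance = s**2 / (X.shape[0] - 1)
    projected = X @ components.T        # data in principal-component coordinates

    print(explained_variance)
    print(np.allclose(projected, U * s))  # equivalently, U scaled column-wise by s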

  • Low-Rank Matrix Approximation

    SVD gives the best low-rank approximation of a matrix: by the Eckart-Young theorem, keeping the k largest singular values and their corresponding singular vectors yields the rank-k matrix Aₖ = UₖΣₖVₖᵀ that is closest to A in both the Frobenius and spectral norms. This is fundamental for image compression and noise reduction.
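
    A minimal sketch of a rank-k truncation (using a random matrix as a stand-in for an image or data matrix):

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.normal(size=(20, 15))       # stand-in for an image/data matrix

    U, s, Vh = np.linalg.svd(A, full_matrices=False)

    k = 5                               # keep the 5 largest singular values
    A_k = U[:, :k] @ np.diag(s[:k]) @ Vh[:k, :]

    # The Frobenius-norm error equals the root-sum-square of the discarded
    # singular values, and A_k has rank k.
    error = np.linalg.norm(A - A_k, ord="fro")
    print(error, np.sqrt(np.sum(s[k:]**2)))   # approximately equal
    print(np.linalg.matrix_rank(A_k))         # 5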

  • Recommendation Systems

    In collaborative filtering, SVD can be used to factorize user-item rating matrices to discover latent factors and predict missing ratings.
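
    A toy sketch of the idea (a small dense rating matrix with made-up values; a real system would also have to handle the missing entries, e.g. with iterative or regularized factorization):

    import numpy as np

    # Toy user-item rating matrix: 4 users x 5 items.
    R = np.array([[5, 4, 0, 1, 1],
                  [4, 5, 1, 1, 0],
                  [1, 1, 0, 5, 4],
                  [0, 1, 5, 4, 4]], dtype=float)

    U, s, Vh = np.linalg.svd(R, full_matrices=False)

    k = 2                                   # two latent "taste" factors
    user_factors = U[:, :k] * s[:k]         # one row per user
    item_factors = Vh[:k, :].T              # one row per item

    # Predicted ratings are dot products of user and item factor vectors.
    R_pred = user_factors @ item_factors.T
    print(np.round(R_pred, 1))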

  • Natural Language Processing (NLP)

    Latent Semantic Analysis (LSA) uses SVD on term-document matrices to uncover latent semantic relationships between words and documents.

  • Solving Linear Systems & Pseudo-Inverse

    SVD can be used to compute the Moore-Penrose pseudo-inverse of a matrix, which is useful for solving linear least squares problems or finding solutions to systems of linear equations that do not have a unique solution.
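
    A sketch of building the pseudo-inverse from the SVD and using it for least squares (the matrix and vector below are arbitrary examples):

    import numpy as np

    A = np.array([[1.0, 2.0],
                  [3.0, 4.0],
                  [5.0, 6.0]])          # overdetermined system: 3 equations, 2 unknowns
    b = np.array([1.0, 2.0, 2.0])

    # A⁺ = V Σ⁺ Uᵀ, where Σ⁺ inverts only the non-zero singular values.
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    s_inv = np.zeros_like(s)
    s_inv[s > 1e-12] = 1.0 / s[s > 1e-12]
    A_pinv = Vh.T @ np.diag(s_inv) @ U.T

    # Matches NumPy's built-in pseudo-inverse and the least-squares solution.
    print(np.allclose(A_pinv, np.linalg.pinv(A)))     # True
    x_ls = np.linalg.lstsq(A, b, rcond=None)[0]
    print(np.allclose(A_pinv @ b, x_ls))              # True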

Implementation

  • SVD with NumPy

    NumPy's `linalg.svd` function computes the SVD.
    
    import numpy as np
    
    # Define a matrix (can be non-square)
    A = np.array([[1, 2, 3],
                  [4, 5, 6]])
    
    print("Matrix A:\n", A)
    
    # Compute SVD
    # full_matrices=False for economy SVD
    U, s_diag, Vh = np.linalg.svd(A, full_matrices=False)
    
    print("\nU (left-singular vectors):\n", U)
    print("\ns_diag (singular values as a 1D array):\n", s_diag)
    print("\nVh (V transpose, right-singular vectors as rows):\n", Vh)
    
    # To reconstruct A, the 1D array of singular values must be placed back on
    # the diagonal of a matrix. With full_matrices=False, U is m x k, s_diag has
    # length k, and Vh is k x n, where k = min(m, n), so a k x k diagonal matrix
    # makes the shapes line up: (m x k) @ (k x k) @ (k x n) = m x n.
    Sigma = np.diag(s_diag)
    
    A_reconstructed = U @ Sigma @ Vh
    
    print("\nReconstructed A from U @ diag(s) @ Vh:\n", A_reconstructed)
    print("Original A and Reconstructed A are close:", np.allclose(A, A_reconstructed))
                            

Interview Examples

What is the difference between SVD and Eigendecomposition?

Highlight the key distinctions: SVD applies to any m x n matrix while eigendecomposition is defined only for square (diagonalizable) matrices; the SVD factors U and V are always orthogonal; and singular values are always non-negative reals, whereas eigenvalues can be negative or complex.

How can SVD be used for dimensionality reduction?

Explain the process: compute the SVD, keep only the k largest singular values and their singular vectors, then either project the data onto the top-k right-singular vectors (as in PCA) or work with the rank-k approximation UₖΣₖVₖᵀ.

Practice Questions

1. How would you implement SVD in a production environment? (Hard)

Hint: Consider scalability and efficiency

2. Explain the core concepts of SVD. (Easy)

Hint: Think about the fundamental principles

3. What are the practical applications of SVD? (Medium)

Hint: Consider both academic and industry use cases