An introduction to vectors, their properties, operations, and significance in linear algebra and machine learning.
Vectors
Overview
Core Concepts
-
What is a Vector?
In mathematics and physics, a vector is an object that has both magnitude (or length) and direction. Vectors can be represented geometrically as directed line segments or arrows in a coordinate system (e.g., 2D or 3D space).
Algebraically, a vector is often represented as an ordered list of numbers, called components, enclosed in parentheses or square brackets. For example, a vector v in n-dimensional space (ℝⁿ) can be written as:
$$ \mathbf{v} = (v_1, v_2, \dots, v_n) \quad \text{or} \quad \mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} $$
Vectors are fundamental in linear algebra and are used to represent various quantities such as displacement, velocity, force, and, importantly in machine learning, feature vectors or data points.
-
Vector Representation
Geometric Representation: An arrow pointing from an origin to a point in space. The length of the arrow represents the magnitude, and its orientation represents the direction.
Algebraic Representation (Components):
- Row Vector: Components are arranged in a single row: $$ \mathbf{v} = [v_1, v_2, \dots, v_n] $$
- Column Vector: Components are arranged in a single column (this is more common in many linear algebra contexts, especially when dealing with matrices): $$ \mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} $$
The number of components (n) is the dimension of the vector.
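As a quick illustration, here is a minimal NumPy sketch (the values are illustrative). Note that a plain 1-D NumPy array carries no row/column distinction until it is reshaped:

```python
import numpy as np

v = np.array([1, 2, 3])      # 1-D array: neither row nor column yet
row = v.reshape(1, -1)       # shape (1, 3): a row vector
col = v.reshape(-1, 1)       # shape (3, 1): a column vector

print(v.shape, row.shape, col.shape)  # (3,) (1, 3) (3, 1)
# The vector's dimension n is its number of components, len(v) == 3
# (not v.ndim, which counts array axes).
```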
-
Vector Addition and Subtraction
Vectors of the same dimension can be added or subtracted by adding or subtracting their corresponding components.
If $ \mathbf{u} = (u_1, \dots, u_n) $ and $ \mathbf{v} = (v_1, \dots, v_n) $, then:
$$ \mathbf{u} + \mathbf{v} = (u_1+v_1, u_2+v_2, \dots, u_n+v_n) $$
$$ \mathbf{u} - \mathbf{v} = (u_1-v_1, u_2-v_2, \dots, u_n-v_n) $$
Geometrically, vector addition corresponds to the parallelogram law or head-to-tail rule.
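A minimal NumPy sketch of componentwise addition and subtraction (the values are illustrative):

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

print(u + v)  # [5. 7. 9.]   componentwise sum
print(u - v)  # [-3. -3. -3.] componentwise difference
```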
-
Scalar Multiplication
Multiplying a vector v by a scalar c (a real number) scales the magnitude of the vector by |c|. If c is positive, the direction remains the same; if c is negative, the direction is reversed; if c = 0, the result is the zero vector.
If $ \mathbf{v} = (v_1, \dots, v_n) $, then $ c\mathbf{v} = (cv_1, cv_2, \dots, cv_n) $.
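A short NumPy sketch (illustrative values):

```python
import numpy as np

v = np.array([1.0, -2.0, 3.0])
print(2 * v)   # [ 2. -4.  6.]  same direction, doubled magnitude
print(-1 * v)  # [-1.  2. -3.]  direction reversed
```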
-
Dot Product (Scalar Product)
The dot product of two vectors u and v (of the same dimension) is a scalar value.
Algebraic definition: $$ \mathbf{u} \cdot \mathbf{v} = \sum_{i=1}^{n} u_i v_i = u_1v_1 + u_2v_2 + \dots + u_nv_n $$
Geometric definition: $$ \mathbf{u} \cdot \mathbf{v} = ||\mathbf{u}|| \, ||\mathbf{v}|| \cos(\theta) $$, where $ ||\mathbf{u}|| $ and $ ||\mathbf{v}|| $ are the magnitudes of the vectors, and $ \theta $ is the angle between them.
Properties:
- Commutative: $ \mathbf{u} \cdot \mathbf{v} = \mathbf{v} \cdot \mathbf{u} $
- Distributive over vector addition: $ \mathbf{u} \cdot (\mathbf{v} + \mathbf{w}) = \mathbf{u} \cdot \mathbf{v} + \mathbf{u} \cdot \mathbf{w} $
- If $ \mathbf{u} \cdot \mathbf{v} = 0 $ and u, v are non-zero, then u and v are orthogonal (perpendicular).
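A minimal sketch connecting the two definitions, computing the angle between two illustrative vectors:

```python
import numpy as np

u = np.array([1.0, 0.0])
v = np.array([1.0, 1.0])

dot = np.dot(u, v)  # algebraic definition: sum of u_i * v_i
cos_theta = dot / (np.linalg.norm(u) * np.linalg.norm(v))
theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))  # clip guards against rounding

print(dot)                # 1.0
print(np.degrees(theta))  # 45.0 (up to floating-point rounding)
```

The `np.clip` call is a small safeguard: floating-point error can push the cosine slightly outside [-1, 1], where `arccos` is undefined.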
-
Cross Product (Vector Product) - Primarily in ℝ³
The cross product of two vectors u and v in ℝ³ is another vector that is perpendicular to both u and v.
If $ \mathbf{u} = (u_1, u_2, u_3) $ and $ \mathbf{v} = (v_1, v_2, v_3) $, then:
$$ \mathbf{u} \times \mathbf{v} = (u_2v_3 - u_3v_2, u_3v_1 - u_1v_3, u_1v_2 - u_2v_1) $$
The magnitude of the cross product is $ ||\mathbf{u} \times \mathbf{v}|| = ||\mathbf{u}|| \, ||\mathbf{v}|| \sin(\theta) $, which is the area of the parallelogram spanned by u and v.
Properties:
- Anti-commutative: $ \mathbf{u} \times \mathbf{v} = -(\mathbf{v} \times \mathbf{u}) $
- $ \mathbf{u} \times \mathbf{u} = \mathbf{0} $ (the zero vector)
- Not associative in general.
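A minimal sketch using the standard basis vectors of ℝ³ as an illustrative example:

```python
import numpy as np

u = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 1.0, 0.0])

w = np.cross(u, v)
print(w)                           # [0. 0. 1.], perpendicular to both u and v
print(np.dot(w, u), np.dot(w, v))  # 0.0 0.0, confirming orthogonality
print(np.linalg.norm(w))           # 1.0, the area of the unit square spanned by u and v
```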
-
Vector Magnitude (Norm)
The magnitude (also called the length or norm) of a vector v, denoted $ ||\mathbf{v}|| $, is a non-negative scalar that measures the vector's length.
For a vector $ \mathbf{v} = (v_1, v_2, \dots, v_n) $, the Euclidean norm (L2 norm) is:
$$ ||\mathbf{v}||_2 = \sqrt{v_1^2 + v_2^2 + \dots + v_n^2} = \sqrt{\mathbf{v} \cdot \mathbf{v}} $$
Other norms exist, such as the L1 norm (Manhattan norm): $ ||\mathbf{v}||_1 = \sum_{i=1}^{n} |v_i| $.
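A short sketch computing both norms for an illustrative vector:

```python
import numpy as np

v = np.array([3.0, -4.0])
print(np.linalg.norm(v))         # 5.0, Euclidean (L2) norm: sqrt(9 + 16)
print(np.linalg.norm(v, ord=1))  # 7.0, L1 (Manhattan) norm: |3| + |-4|
```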
-
Unit Vector
A unit vector is a vector with a magnitude of 1. It is often used to represent direction.
A unit vector $ \hat{\mathbf{u}} $ in the direction of a non-zero vector v can be obtained by dividing v by its magnitude:
$$ \hat{\mathbf{u}} = \frac{\mathbf{v}}{||\mathbf{v}||} $$
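A minimal normalization sketch (the input vector is illustrative and must be non-zero):

```python
import numpy as np

v = np.array([3.0, 4.0])
u_hat = v / np.linalg.norm(v)  # divide by the magnitude; undefined for the zero vector
print(u_hat)                   # [0.6 0.8]
print(np.linalg.norm(u_hat))   # 1.0, a unit vector by construction
```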
-
Zero Vector
The zero vector (or null vector) is a vector where all components are zero: $ \mathbf{0} = (0, 0, \dots, 0) $. It has zero magnitude and no specific direction. It is the additive identity for vector addition: $ \mathbf{v} + \mathbf{0} = \mathbf{v} $.
-
Orthogonal Vectors
Two non-zero vectors u and v are orthogonal (perpendicular) if their dot product is zero: $ \mathbf{u} \cdot \mathbf{v} = 0 $. This implies the angle between them is 90 degrees ($ \cos(90^{\circ}) = 0 $).
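A quick check with an illustrative pair of vectors; `np.isclose` is used because floating-point dot products are rarely exactly zero in general:

```python
import numpy as np

u = np.array([1.0, 2.0])
v = np.array([-2.0, 1.0])

print(np.dot(u, v))                   # 0.0
print(np.isclose(np.dot(u, v), 0.0))  # True: u and v are orthogonal
```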
-
Orthonormal Vectors
A set of vectors is orthonormal if all vectors in the set are unit vectors and they are mutually orthogonal.
-
Feature Vectors
In machine learning, data instances are often represented as feature vectors. Each component of the vector corresponds to a specific feature (attribute or characteristic) of the instance. For example, a house might be represented by a vector [size (sq ft), number of bedrooms, age (years)].
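A minimal sketch of the house example (all feature values are hypothetical):

```python
import numpy as np

# Hypothetical house: [size (sq ft), number of bedrooms, age (years)]
house = np.array([1500.0, 3.0, 20.0])
print(house.shape)  # (3,): one instance described by three features
```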
-
Data Points in Space
Feature vectors can be thought of as points in a high-dimensional space. This geometric interpretation allows us to use concepts like distance (e.g., Euclidean distance between vectors) and similarity (e.g., cosine similarity from dot product) in algorithms like k-Nearest Neighbors, SVMs, and clustering.
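A short sketch of both measures on two illustrative points:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])

euclidean = np.linalg.norm(a - b)  # straight-line distance between the points
cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

print(euclidean)  # ~3.742
print(cosine)     # 1.0: b = 2a, so the two vectors point in exactly the same direction
```

Note how the two measures disagree here: a and b are far apart as points, yet maximally similar in direction, which is why cosine similarity is often preferred when only orientation matters.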
-
Weights and Biases in Neural Networks
The weights connecting neurons in a neural network can be organized into vectors (or matrices). Bias terms are also vectors. Vector operations are fundamental to how neural networks process information (e.g., calculating weighted sums).
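A minimal sketch of the weighted sum a single neuron computes; the weights, input, and bias below are all hypothetical:

```python
import numpy as np

x = np.array([0.5, -1.0, 2.0])  # hypothetical input activations
w = np.array([0.1, 0.4, -0.2])  # hypothetical weight vector for one neuron
b = 0.3                         # hypothetical bias term

z = np.dot(w, x) + b  # the weighted sum, before any activation function
print(z)              # -0.45
```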
-
Embeddings
In Natural Language Processing and other areas, entities like words, sentences, or items are often mapped to dense vector representations called embeddings (e.g., Word2Vec, GloVe). These embeddings capture semantic relationships, where similar items have vectors that are close in the vector space.
Implementation
-
Vector Operations with NumPy
Illustrating basic vector operations using the Python library NumPy.
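A minimal, self-contained sketch covering the operations from the Core Concepts above (the vectors are illustrative):

```python
import numpy as np

u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])

print(u + v)                  # addition: [5. 7. 9.]
print(u - v)                  # subtraction: [-3. -3. -3.]
print(3 * u)                  # scalar multiplication: [3. 6. 9.]
print(np.dot(u, v))           # dot product: 32.0
print(np.cross(u, v))         # cross product: [-3.  6. -3.]
print(np.linalg.norm(v))      # L2 norm: ~8.775
print(v / np.linalg.norm(v))  # unit vector in the direction of v
```

All of these rely only on core NumPy (`np.array`, `np.dot`, `np.cross`, `np.linalg.norm`) and broadcasting for the arithmetic operators.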
Interview Examples
What is the difference between a scalar and a vector?
Explain the fundamental distinction.
Explain the geometric interpretation of the dot product.
What does the dot product of two vectors tell us geometrically?
What does it mean for a set of vectors to be linearly independent?
Define linear independence and its significance.