Essence of Linear Algebra

Tim included in Data Science & Machine Learning

2023-04-12 2024-07-25 About 1100 words 5 minutes


Vector

The introduction of numbers as coordinates is an act of violence.

And on the flip side, it gives people a language to describe space and the manipulation of space using numbers that can be crunched and run through a computer. ~~Heresy: Linear algebra lets programmers manipulate space.~~

$i$ and $j$ are basis vectors; any vector can be written as a linear combination of them.

Collinear vectors are linearly dependent, and the space they span is just a line (or the origin). Two non-collinear vectors are linearly independent, and the space they span is the entire plane.
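A minimal sketch of these two ideas in plain Python, restricted to 2D. The helper names `combine` and `are_collinear` are made up for illustration; collinearity is checked with the 2×2 determinant of the two vectors.

```python
def combine(a, v, b, w):
    """Return the linear combination a*v + b*w of two 2D vectors."""
    return (a * v[0] + b * w[0], a * v[1] + b * w[1])


def are_collinear(v, w):
    """Two 2D vectors are collinear exactly when v[0]*w[1] - v[1]*w[0] is zero."""
    return v[0] * w[1] - v[1] * w[0] == 0


i_hat, j_hat = (1, 0), (0, 1)

# The vector (3, -2) is 3*i_hat + (-2)*j_hat.
v = combine(3, i_hat, -2, j_hat)

# (1, 2) and (2, 4) lie on the same line, so they span only that line.
dependent = are_collinear((1, 2), (2, 4))

# (1, 2) and (3, 1) point in different directions, so they span the plane.
independent = not are_collinear((1, 2), (3, 1))
```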

Matrix

Kind of linear transformation

~~Fortunately, linear algebra only involves linear transformations;~~

A matrix can be understood as a transformation. In $\begin{bmatrix}a & b \\ c & d\end{bmatrix}$, the column $\begin{bmatrix}a \\ c\end{bmatrix}$ records where the first basis vector lands after the transformation, and $\begin{bmatrix}b \\ d\end{bmatrix}$ records where the second one lands; therefore $\begin{bmatrix}1 & 0 \\ 0 & 1\end{bmatrix}$ is equivalent to no transformation. Knowing how the basis is transformed, we know how all vectors are transformed. An orthogonal transformation is one where the basis vectors keep their length and remain perpendicular to each other (rigid body motion).
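As a small illustration, here is a sketch that applies a 2×2 matrix by scaling its columns, which is exactly the "follow the basis vectors" reading. The function name `apply` and the sample matrices are assumptions chosen for the example.

```python
def apply(matrix, vector):
    """Apply a 2x2 matrix ((a, b), (c, d)) to a vector (x, y).

    The result is x * (first column) + y * (second column).
    """
    (a, b), (c, d) = matrix
    x, y = vector
    return (x * a + y * b, x * c + y * d)


identity = ((1, 0), (0, 1))     # both basis vectors stay where they are
rotate_90 = ((0, -1), (1, 0))   # i lands on (0, 1), j lands on (-1, 0)

unchanged = apply(identity, (3, 2))
rotated = apply(rotate_90, (3, 2))   # 3*(0, 1) + 2*(-1, 0)
```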

Matrix Multiply

A single matrix is a linear transformation, so matrix multiplication represents a composite transformation. The geometric meaning of matrix multiplication is applying two linear transformations in succession (one transformation, then another). The product is read from right to left: in $M_2 M_1$, $M_1$ is applied first. Non-square matrices represent transformations between dimensions.
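The sketch below composes a rotation and a shear (both picked only as examples) and shows that the order of the factors matters. `matmul` and `apply` are illustrative helpers, not part of any library.

```python
def matmul(m2, m1):
    """Product m2 * m1 of two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(m2[r][k] * m1[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )


def apply(matrix, vector):
    """Apply a 2x2 matrix to a 2D vector."""
    (a, b), (c, d) = matrix
    x, y = vector
    return (x * a + y * b, x * c + y * d)


rotate_90 = ((0, -1), (1, 0))
shear = ((1, 1), (0, 1))

# Read right to left: the rotation is applied first, then the shear.
shear_after_rotate = matmul(shear, rotate_90)
# Swapping the factors gives a different transformation.
rotate_after_shear = matmul(rotate_90, shear)

v = (3, 2)
step_by_step = apply(shear, apply(rotate_90, v))
in_one_go = apply(shear_after_rotate, v)
```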

Determinant

This is the determinant! 👆 The area of a 1x1 square, after a matrix transformation, equals the value of the corresponding determinant (up to sign).

If the determinant is 0, everything is flattened onto a line or a point, so the transformation cannot be undone and the matrix is non-invertible. The value of the determinant can be negative; can an area be negative too? The area equals the absolute value, and a negative sign signifies a change in spatial orientation (like flipping a piece of paper).

However, area doesn't explain everything: in three dimensions the determinant measures how volume is scaled instead.
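A short sketch of the three cases (scaling, flipping, flattening) plus one 3D example. The matrices are made-up examples; `det2` and `det3` are illustrative helpers, with `det3` using cofactor expansion along the first row.

```python
def det2(m):
    """Determinant of a 2x2 matrix ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return a * d - b * c


def det3(m):
    """Determinant of a 3x3 matrix via cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


scale = ((3, 0), (0, 2))    # unit square becomes a 3x2 rectangle
flip = ((0, 1), (1, 0))     # swaps the basis vectors, so orientation flips
squash = ((2, 4), (1, 2))   # both columns lie on one line

area_scale = abs(det2(scale))
orientation_flipped = det2(flip) < 0
invertible = det2(squash) != 0

stretch_3d = ((2, 0, 0), (0, 3, 0), (0, 0, 4))
volume_scale = abs(det3(stretch_3d))   # unit cube becomes a 2x3x4 box
```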

Inverse matrices & Column space & Rank

Matrices are not only for manipulating space but also for solving systems of equations.

Once a system of equations is written as a matrix multiplication $A\vec{x} = \vec{v}$, we naturally return to the traditional art of manipulating space: $\vec{x}$, under the action of matrix $A$, transforms into $\vec{v}$, so using the inverse transformation $A^{-1}$ to recover the original $\vec{x}$ is the process of solving the system. When the determinant is not 0, the system can be solved by finding the inverse matrix: $\vec{x} = A^{-1}\vec{v}$.

When the determinant is 0, the system of equations might still have solutions, provided that $\vec{v}$ lies in the compressed space (the column space).
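A minimal sketch of solving a 2×2 system through the inverse matrix, using exact fractions. The system itself ($2x + 2y = -4$, $x + 3y = -1$) is an assumed example, and `inverse2` and `apply` are illustrative helpers.

```python
from fractions import Fraction


def inverse2(m):
    """Inverse of a 2x2 matrix ((a, b), (c, d)); fails when the determinant is 0."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("determinant is 0, so the matrix has no inverse")
    return (
        (Fraction(d, det), Fraction(-b, det)),
        (Fraction(-c, det), Fraction(a, det)),
    )


def apply(matrix, vector):
    """Apply a 2x2 matrix to a 2D vector."""
    (a, b), (c, d) = matrix
    x, y = vector
    return (x * a + y * b, x * c + y * d)


# 2x + 2y = -4
# 1x + 3y = -1
A = ((2, 2), (1, 3))
v = (-4, -1)

# Undo the transformation A to recover the unknown vector.
solution = apply(inverse2(A), v)
```

Applying `A` to `solution` gives back `v`, which is a quick way to check the result.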

The explanation of rank in this video, around 8 minutes in, is really brilliant. Rank is the dimension of the transformed space (the column space); in a system of equations, the rank of the matrix is the number of independent constraints. The set of all possible outputs $A\vec{v}$ is the column space. The set of vectors that land on the origin after the transformation is the null space, also called the kernel. Kernel methods in SVM?
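For 2×2 matrices, rank, column space and kernel can be sketched without any library. Everything below (the helper names and the sample matrix `squash`) is an assumption made for illustration and only covers the 2D case.

```python
def apply(matrix, vector):
    """Apply a 2x2 matrix to a 2D vector."""
    (a, b), (c, d) = matrix
    x, y = vector
    return (x * a + y * b, x * c + y * d)


def rank2(matrix):
    """Rank of a 2x2 matrix: the dimension of the space its outputs fill."""
    (a, b), (c, d) = matrix
    if a * d - b * c != 0:
        return 2   # the plane stays a plane
    if any((a, b, c, d)):
        return 1   # everything is squashed onto a line
    return 0       # everything is squashed onto the origin


def in_column_space(matrix, v):
    """Check whether A x = v has at least one solution."""
    (a, b), (c, d) = matrix
    r = rank2(matrix)
    if r == 2:
        return True
    if r == 0:
        return v == (0, 0)
    # Rank 1: v must lie on the line through a nonzero column.
    col = (a, c) if (a, c) != (0, 0) else (b, d)
    return col[0] * v[1] - col[1] * v[0] == 0


squash = ((2, 4), (1, 2))   # columns (2, 1) and (4, 2) are collinear

rank = rank2(squash)
reachable = in_column_space(squash, (6, 3))      # on the line through (2, 1)
unreachable = in_column_space(squash, (1, 0))    # off that line
lands_on_origin = apply(squash, (2, -1)) == (0, 0)   # (2, -1) is in the kernel
```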

Duality of Dot Product

Traditionally, the understanding of vector dot product is projection, but forget about that for now in order to grasp duality. Duality refers to natural yet unexpected corresponding relationships. A vector is the physical embodiment of a linear transformation. The understanding of duality is crucial for grasping Hilbert spaces, PCA, IDA.

There’s a fascinating relationship between vectors and their corresponding 1×n matrices: the transformation represented by a 1×n matrix is equivalent to taking a dot product with an n×1 vector. Each vector is the embodiment of some 1×n matrix, and each 1×n matrix corresponds to a certain vector.
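One way to see this correspondence in code: take a linear map from 2D vectors to numbers, record where the basis vectors land to get its 1×2 matrix, then read that matrix as a vector and use a dot product instead. The map `some_linear_map` is a hypothetical example.

```python
def dot(u, v):
    """Dot product of two vectors of the same length."""
    return sum(a * b for a, b in zip(u, v))


def some_linear_map(v):
    """A hypothetical linear map from 2D vectors to numbers."""
    return 2 * v[0] - v[1]


# Where the basis vectors land gives the 1x2 matrix of the map.
row_matrix = (some_linear_map((1, 0)), some_linear_map((0, 1)))

# Tip the 1x2 matrix on its side and read it as an ordinary vector.
dual_vector = row_matrix

v = (3, 4)
via_map = some_linear_map(v)
via_dot = dot(dual_vector, v)
```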

Cross Product

The traditional explanation is shown in the above diagram.

Change of Basis

How should two people in different coordinate systems communicate? By translating the other person’s basis vectors into one’s own coordinate system to obtain the transformation matrix.

For a vector $\vec{v}$ in another coordinate system, first use a transformation to convert it to a vector in our own coordinate system, then transform it within our own coordinate system, and finally convert the result back into the other person's coordinates. The expression $A^{-1}MA$ represents this kind of translation: the matrix product is still a transformation, but seen from the other person's perspective.
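A sketch of $A^{-1}MA$ with concrete numbers. The other person's basis vectors, written in our coordinates as $(2, 1)$ and $(-1, 1)$, and the choice of $M$ as a 90° rotation are assumptions picked for the example; the helpers are illustrative.

```python
from fractions import Fraction


def matmul(m2, m1):
    """Product m2 * m1 of two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(m2[r][k] * m1[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )


def inverse2(m):
    """Inverse of a 2x2 matrix with nonzero determinant, as exact fractions."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return (
        (Fraction(d, det), Fraction(-b, det)),
        (Fraction(-c, det), Fraction(a, det)),
    )


# Columns of A are the other person's basis vectors in our coordinates.
A = ((2, -1), (1, 1))
# M is a 90-degree rotation, written in our coordinates.
M = ((0, -1), (1, 0))

# Translate into our language (A), transform (M), translate back (A inverse).
M_in_their_coords = matmul(inverse2(A), matmul(M, A))
```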

Eigenvectors & Eigenvalues

Purpose: for a 3D rotation, an eigenvector with eigenvalue 1 is the axis of rotation. Calculation: $A\vec{v}=\lambda \vec{v}$ rearranges to $(A-\lambda I)\vec{v}=\vec{0}$, which has a nonzero solution $\vec{v}$ only when $\det(A-\lambda I)=0$, that is, when $A-\lambda I$ squishes space into a lower dimension. For a rotation of the plane, the eigenvalues are complex, so there are no real eigenvectors; a shear has only one eigenvector direction. There are also cases with a single eigenvalue and non-collinear eigenvectors (such as scaling all vectors by a factor of two).
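For a 2×2 matrix, $\det(A-\lambda I)=0$ expands to $\lambda^2 - \operatorname{tr}(A)\,\lambda + \det(A) = 0$, so the eigenvalues follow from the quadratic formula. The sketch below uses that shortcut; `eigenvalues2` is an illustrative helper, and the sample matrices are assumptions matching the three cases mentioned above.

```python
import cmath


def eigenvalues2(m):
    """Eigenvalues of a 2x2 matrix from lambda^2 - trace*lambda + det = 0."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    # cmath.sqrt handles a negative discriminant by returning a complex root.
    root = cmath.sqrt(trace * trace - 4 * det)
    return ((trace + root) / 2, (trace - root) / 2)


rotation = ((0, -1), (1, 0))   # 90-degree rotation: eigenvalues are complex
shear = ((1, 1), (0, 1))       # repeated eigenvalue 1
double = ((2, 0), (0, 2))      # repeated eigenvalue 2, every vector is an eigenvector

rotation_values = eigenvalues2(rotation)
shear_values = eigenvalues2(shear)
double_values = eigenvalues2(double)
```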

In diagonal matrices, all the basis vectors are eigenvectors, and the values on the diagonal are the corresponding eigenvalues.

If one day you wanted to take two non-collinear eigenvectors as the basis vectors [1 0] and [0 1] of a new coordinate system, this change of basis is known as diagonalization: the resulting matrix will necessarily be diagonal, with the eigenvalues on the diagonal. A basis made up of eigenvectors is called an eigenbasis. Why go to such lengths to change to an eigenbasis? Take the matrix $\begin{bmatrix}3 & 1 \\ 0 & 2\end{bmatrix}$: applying it 100 times directly could be very tedious, but in the eigenbasis it becomes $\begin{bmatrix}3 & 0 \\ 0 & 2\end{bmatrix}$, whose 100th power is simply $\begin{bmatrix}3^{100} & 0 \\ 0 & 2^{100}\end{bmatrix}$; then just convert the result back.
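The 100th-power shortcut can be checked with exact integers. The eigenvectors $(1, 0)$ and $(-1, 1)$ used below are worked out here for this matrix rather than taken from the text, and `matmul` and `power_by_repeated_multiplication` are illustrative helpers.

```python
def matmul(m2, m1):
    """Product m2 * m1 of two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(m2[r][k] * m1[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )


def power_by_repeated_multiplication(m, n):
    """Multiply m by itself n times, the slow way, for comparison."""
    result = ((1, 0), (0, 1))
    for _ in range(n):
        result = matmul(result, m)
    return result


A = ((3, 1), (0, 2))
P = ((1, -1), (0, 1))      # columns are the eigenvectors (1, 0) and (-1, 1)
P_inv = ((1, 1), (0, 1))   # inverse of P (its determinant is 1)

# In the eigenbasis, A is diagonal with the eigenvalues 3 and 2.
D = matmul(P_inv, matmul(A, P))

# Powers of a diagonal matrix are just powers of its diagonal entries.
n = 100
D_n = ((3 ** n, 0), (0, 2 ** n))

# Convert back to the original coordinates.
A_n = matmul(P, matmul(D_n, P_inv))
```

Python integers have arbitrary precision, so the comparison against the slow method is exact.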

Vector Spaces

Determinants, eigenvectors, etc. are independent of the chosen coordinate system… Differentiation of functions can also be done with matrices… So, what exactly is a vector?

~~Vectors are nothing in particular.~~ Axioms are not natural laws but rules defined by mathematicians, an interface between mathematicians and those who use mathematical tools. Vectors can be anything (points, arrows, functions, odd creatures…) as long as they satisfy the rules defined by the axioms. Asking what a vector is, is as meaningless as asking what ‘1’ is.
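The earlier remark that differentiation can be done with matrices can be sketched for polynomials of degree at most 3. The choice of basis $1, x, x^2, x^3$, the coordinate order and the helper `apply` are assumptions made for this example.

```python
def apply(matrix, vector):
    """Apply a matrix (tuple of rows) to a vector of matching length."""
    return tuple(sum(m * x for m, x in zip(row, vector)) for row in matrix)


# Column k holds the derivative of x^k, written in the basis 1, x, x^2, x^3.
D = (
    (0, 1, 0, 0),
    (0, 0, 2, 0),
    (0, 0, 0, 3),
    (0, 0, 0, 0),
)

# The polynomial 5 + 3x + 4x^3 as coordinates (constant term first).
p = (5, 3, 0, 4)

# Its derivative 3 + 12x^2, obtained by a matrix-vector product.
dp = apply(D, p)
```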

Cramer’s Rule

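When solving systems of linear equations, computers use Gaussian elimination rather than Cramer’s rule, because elimination is faster; but Cramer’s rule is far more interesting.

An unusual way to represent coordinates: $y$ is the signed area of the parallelogram spanned by the first basis vector and the vector itself, divided by its base of length 1, and $x$ is likewise an area divided by 1: y = area / 1, x = area / 1.

After the transformation, the parallelogram for $y$ still rests on the green base, which is now the first column of the matrix, while its other side becomes the transformed vector $\begin{bmatrix}4 \\ 2\end{bmatrix}$. Its area is therefore itself a determinant, which perfectly matches the geometric meaning of determinants. Since every area is scaled by $\det(A)$, $y$ equals that determinant divided by $\det(A)$. This is the geometric meaning of Cramer’s rule.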
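A minimal sketch of Cramer's rule for a 2×2 system. The right-hand side $(4, 2)$ comes from the text above; the matrix $\begin{bmatrix}2 & -1 \\ 0 & 1\end{bmatrix}$ is an assumption chosen for the example, and `det2` and `cramer2` are illustrative helpers.

```python
from fractions import Fraction


def det2(m):
    """Determinant of a 2x2 matrix ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return a * d - b * c


def cramer2(A, v):
    """Solve A (x, y) = v for a 2x2 matrix A with nonzero determinant."""
    (a, b), (c, d) = A
    det_A = det2(A)
    if det_A == 0:
        raise ValueError("determinant is 0, so Cramer's rule does not apply")
    # x: replace the first column of A with v, then divide by det(A).
    x = Fraction(det2(((v[0], b), (v[1], d))), det_A)
    # y: replace the second column of A with v, then divide by det(A).
    y = Fraction(det2(((a, v[0]), (c, v[1]))), det_A)
    return (x, y)


A = ((2, -1), (0, 1))
v = (4, 2)
solution = cramer2(A, v)
```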
