Chapter 2: Linear Algebra
Most of these concepts are discussed in an linear algebra class. Thus, I only included concepts that I felt were helpful for me to remember (admittedly, this is most of the nontrivial topics in this chapter). This is not a comprehensive review of all the linear algebra explained in the book.
2.4 Linear Dependence and Span
A square matrix with linearly dependent columns is known as singular.
2.5 Norms
The norm is given by
for .
The term norm is not limited to norms, but rather describes any function mapping vectors to non-negative real numbers () such that
- .
- . (triangle inequality)
- .
The norm is the Euclidean norm. The norm is the max norm, and is equivalent to . Finally, the Frobenius norm is a non- norm described by
which measures the size of a matrix . It is analogous to the norm or matrices instead of vectors.
2.6 Special Kinds of Matrices and Vectors
An orthogonal matrix is a square matrix whose rows are mutually orthonormal and whose columns are mutually orthonormal.
This follows from the definition of orthonormality of vectors .
- .
- .
Notably, matrix orthogonality implies that .
2.7 Eigendecomposition
An eigenvector of a square matrix is a non-zero vector such that multiplication by alters only the scale of , and not the angle.
The scalar is the eigenvalue corresponding to this eigenvector . By convention, we typically concern ourselves with only unit eigenvectors (since is still an eigenvector for any ).
Suppose a matrix has linearly independent eigenvectors with eigenvalues . Let and . Then, the eigendecomposition of is
Where is the diagonal matrix constructed by placing the elements of on the diagonals. Frequently, is represented by .
Notably, every real symmetric matrix can be eigendecomposed with only real-valued eigenvectors and eigenvalues.
Where is an orthogonal matrix composed of eigenvectors of and is a diagonal matrix. Since is orthogonal, can be interpreted as scaling space by in direction .
Also, note that eigendecompositions are not necessarily unique; if two or more eigenvectors share the same eigenvalue, any set of orthogonal vectors lying in their span are also eigenvectors. Additionally, a matrix is also singular at least one eigenvalue is zero.
The eigendecomposition of a real symmetric matrix can also optimize quadratic expressions of the form subject to the constraint . The maximum value of is the maximum eigenvalue, and the minimum is the minimum eigenvalue.
Finally, we can characterize some matrices based on their eigenvalues.
- Positive Definite: .
- Positive Semidefinite: .
- Negative Definite: .
- Negative Semidefinite: .
Positive semidefinite matrices guarantee that . Positive definite matrices additionally guarantee .
2.8 Singular Value decomposition
Singular value decomposition (SVD) is another method of factorizing a matrix, this time into singular vectors and singular values. SVD is more general than eigendecomposition because it applies to all matrices, while eigendecomposition applies to only some square matrices. SVD has the following form
Let be an matrix. Then is an matrix, is an matrix, and is an matrix. Additionally, and are orthogonal matrices, and is a diagonal matrix.
The elements along the diagonal of are the singular values of the matrix . The columns of are the left-singular vectors, and the columns of are known as the right-singular vectors.
We can further interpret/compute the SVD of via the eigendecomposition of related matrices. The left-singular vectors of are the eigenvectors of , and the right-singular vectors are the eigenvectors of . The non-zero singular values of are the square roots of the eigenvalues of , which are also equivalent to the eigenvalues of .
2.9 The Moore-Penrose Pseudoinverse
Matrix inversion is undefined for non-square matrices. The Moore-Penrose pseudoinverse is designed to compute an inverse for a matrix in the equation that:
- For wide matrices, produces a solution with minimal Euclidean norm among all possible solutions. (Since wide matrices can have more than one solution)
- For tall matrices, produces an such that it minimizes the Euclidean norm . (Since tall matrices can have zero solutions).
The pseudoinverse is defined formally as
and practically calculated using the matrices in SVD
Note that the psuedoinverse of any diagonal matrix, including , can be computed by taking the reciprocal of its nonzero elements and then transposing the matrix.
The matrix may not be square!
2.10 The Trace Operator
The trace operator represents the summation of a matrix's diagonal.
It is a useful and interesting operator for several reasons. For instance, the Frobenius norm of a matrix can be expressed as
It is invariant under the transpose operator.
And under "cyclic permutation" of a product.
Also, note that a scalar is its own trace.
2.11 The Determinant
The determinant is a function mapping a square matrix to a real scalar. It is equivalent to the product of all the eigenvalues of the matrix, and can be interpreted as a measure of how much multiplication by the matrix dilates space. If , space is contracted completely along at least one dimension. If , the transformation preserves volume.