Matrices as transformations
September 2018
Suppose an equation Lx = y where L is some n x m matrix. Then L can be viewed as a function that maps m x 1 vectors to n x 1 vectors; more generally, if x is an m x q matrix containing q such vectors as columns, L maps it to an n x q matrix whose columns are the images of those vectors.

There are a few alternative ways to represent this mapping. If L = [1 2; 2 3] and x is some 2 x 1 vector:

Lx = y

[1 2; 2 3][x_1; x_2] = [y_1; y_2]

x_1 + 2x_2 = y_1
2x_1 + 3x_2 = y_2

f(x_1, x_2) = (x_1 + 2x_2, 2x_1 + 3x_2)
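
As a quick sanity check, here is a small sketch (assuming NumPy is available) that evaluates the same mapping both as a matrix product and as the component-wise function f:

import numpy as np

L = np.array([[1, 2],
              [2, 3]])
x = np.array([3, 1])   # an arbitrary 2 x 1 input

def f(x_1, x_2):
    # the same mapping written out component-wise
    return np.array([x_1 + 2 * x_2, 2 * x_1 + 3 * x_2])

print(L @ x)    # [5 9]
print(f(*x))    # [5 9], identical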


Let's partition L into columns such that v1 = (1 2)', v2 = (2 3)', and L = [v1 v2]. Now, because the column vector v1 is weighted by the first (x-axis) element of x and v2 by the second (y-axis) element, we can think of L as a transformation of the two-dimensional vector space in question that maps the canonical basis vectors (1, 0) and (0, 1) to v1 and v2, respectively.
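
A small sketch of this view (again assuming NumPy): multiplying L by the canonical basis vectors simply returns the columns v1 and v2.

import numpy as np

L = np.array([[1, 2],
              [2, 3]])
e1 = np.array([1, 0])
e2 = np.array([0, 1])

print(L @ e1)   # [1 2], i.e. v1, the first column of L
print(L @ e2)   # [2 3], i.e. v2, the second column of L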

This function operates by multiplying the matrix with its input. So what does matrix multiplication actually mean?

Assume there are vectors v = (1, 2, 3) and u = (4, 5, 6). We can form two matrices from these: one with the vectors as rows and one with them as columns. Let A = [v; u] and B = [v' u'], where the prime denotes the transpose. Thus, A is a 2 x 3 matrix and B is a 3 x 2 matrix. Now we can multiply A and B:

A * B = [v; u] * [v' u'] = [<v, v> <v, u>; <u, v> <u, u>]


where <i, j> denotes the inner product. Thus, matrix multiplication builds the elements of the product from the inner products of the row vectors of the first matrix with the column vectors of the second.
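
The same construction in NumPy (a sketch; the arrays and names are just for illustration), building the product entry by entry from inner products:

import numpy as np

v = np.array([1, 2, 3])
u = np.array([4, 5, 6])

A = np.vstack([v, u])          # 2 x 3: v and u as rows
B = np.column_stack([v, u])    # 3 x 2: v and u as columns

# Each entry of A @ B is an inner product of a row of A with a column of B.
by_hand = np.array([[v @ v, v @ u],
                    [u @ v, u @ u]])

print(A @ B)       # [[14 32], [32 77]]
print(by_hand)     # identical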

This inner product interpretation of matrix multiplication becomes very interesting when we use it to think about key concepts such as linear dependence or the null space. The key thing to remember is that two vectors are orthogonal exactly when their inner product is zero (and orthogonal nonzero vectors are, in particular, linearly independent):

<v, u> = 0 ↔ u and v are orthogonal
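
A quick numerical check (NumPy assumed):

import numpy as np

print(np.dot([1, 2], [2, -1]))   # 0 -> orthogonal
print(np.dot([1, 2], [2, 3]))    # 8 -> not orthogonal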


This means that x lies in the null space of L, Lx = 0, if and only if the rows of L are orthogonal to the vector(s) contained in x. Thus, when the rows of L span the whole input space (for a square matrix, when they are linearly independent), the null space contains only the origin. Think about it: if L is a 2 x 2 matrix whose rows are the canonical basis vectors (1, 0) and (0, 1), and x is a 2 x 1 vector, then for Lx = 0 to hold with a nonzero x, there would have to exist values for the elements of x that make it orthogonal to two orthogonal vectors in two-dimensional space.
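
One way to check this numerically is through the SVD: directions with a zero singular value span the null space, and for the 2 x 2 matrix built from the canonical basis vectors there are none. The helper below is only an illustrative sketch, not a library function.

import numpy as np

def null_space_basis(M, tol=1e-12):
    # Rows of Vt whose singular values are (numerically) zero span the null space of M.
    _, s, Vt = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return Vt[rank:].T

L = np.array([[1, 0],
              [0, 1]])
print(null_space_basis(L).shape)   # (2, 0): nothing but the origin maps to zero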

After thinking about the 2 x 2 case for a while, it becomes obvious that no such x exists: you can make the vector orthogonal to one basis vector, but that by definition makes it a scalar multiple of the other.

Two dimensions, in other words, do not offer enough room for three nonzero vectors that are each orthogonal to the other two.
What if you add one more dimension without changing the basis? Then the basis vectors simply won't span all of the now three-dimensional space. Instead, they will span only a horizontal plane that passes through the origin and is perpendicular to the newly added z-axis.

So how could we construct a matrix, or a mapping, that has something other than the origin in its null space? This is straightforward: we simply need to construct the rows of the matrix such that their inner products with some chosen nonzero x are zero; that is, x is orthogonal to every row.

If we add one more two-dimensional vector to our basis, it is easy to see that the resulting set of vectors is linearly dependent:

E = [1 0 1; 0 1 0]


That is, we have three two-dimensional column vectors, but they only span a plane; equivalently, we have two three-dimensional row vectors, which also span only a plane. What vector could be orthogonal to both of these row vectors? Notice that if we now limit the input to single vectors, our vector has to have dimensions 3 x 1. After a moment's thought one concludes that for x = [x_1; x_2; x_3], x_2 has to be zero and x_3 has to equal -x_1, while x_1 itself can be any value. As a result, our null space is spanned by t[1 0 -1]' where t is any real number. It turns out that our null space is a line, which is natural when thought about from the point of view of orthogonality: if one vector is orthogonal to some set of vectors, then any scalar multiple of that vector is also orthogonal to the same set.
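
Checking this numerically (NumPy assumed): any multiple of (1, 0, -1) is sent to the zero vector by E.

import numpy as np

E = np.array([[1, 0, 1],
              [0, 1, 0]])

# Every scalar multiple of (1, 0, -1) is orthogonal to both rows of E.
for t in (1.0, -2.5, 7.0):
    print(E @ (t * np.array([1, 0, -1])))   # [0. 0.] every time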

So why is the null space interesting? For one thing, it points out the directions that our basis vectors are not yet spanning.