Linear algebra part 8 - Matrices
If a vector represents multiple dimensions, a matrix is a set of many vectors. Let us start by defining three vectors, , and :
We can represent these three vectors as a single matrix, , where each vector becomes a column of the matrix:
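As a rough sketch of the same idea in code, the snippet below stacks three vectors side by side so that each one becomes a column of a matrix. The names `u`, `v`, `w` and `A`, and all the numbers, are made up for illustration and are not the values used in this post.

```python
# Hypothetical example vectors; the names and values are illustrative only.
u = [1, 2, 3]
v = [4, 5, 6]
w = [7, 8, 9]

# Each vector becomes a column of the matrix, so row i of the matrix
# collects the i-th component of every vector.
A = [list(row) for row in zip(u, v, w)]

for row in A:
    print(row)
```

The matrix is stored here as a list of rows, which is why `zip` is used to regroup the column vectors row by row.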
Matrix notation
Matrices are denoted using an upper case bold letter, . Matrices are two dimensional objects. They have a height and a width. The first matrix we saw in this post, , is a 3 x 3 matrix. It is three rows high and three columns wide. Matrices can have any width and height and do not need to be square. For example, is a 2 x 4 matrix and is a 5 x 3 matrix. Saying a matrix is 2 x 4 means it has a height of 2 and a width of 4.
The position of a component in a matrix is described by two subscript numbers, the row first and then the column. The component in the second row and third column of matrix would be denoted as . The number of rows in a matrix is typically called m and the number of columns is called n.
Mathematics normally uses 1-based indexing, meaning the first row is row 1, the second row is row 2, etc. Many programming languages use 0-based indexing, where the first row is row 0, the second row is row 1, etc. In this blog I will use 1-based indexing because I believe it is more intuitive for people meeting linear algebra for the first time. Be aware that you are likely to encounter 0-based indexing if you use linear algebra in programming.
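To make the indexing difference concrete, here is a small sketch in Python, which uses 0-based indexing. The matrix `A` and the helper `entry` are hypothetical and exist only for this illustration.

```python
# Hypothetical 3 x 3 matrix stored as a list of rows.
A = [
    [1, 4, 7],
    [2, 5, 8],
    [3, 6, 9],
]

def entry(matrix, i, j):
    """Return the component in row i, column j using 1-based indices."""
    return matrix[i - 1][j - 1]

rows = len(A)     # height of the matrix
cols = len(A[0])  # width of the matrix
print(rows, "x", cols)

# Row 2, column 3 in the 1-based notation used in this post...
print(entry(A, 2, 3))
# ...is row 1, column 2 in Python's 0-based indexing.
print(A[1][2])
```

Both lookups return the same component. Only the numbering of the rows and columns changes.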
Multiplying a matrix and a vector
You have already learnt how to multiply a matrix and a vector, without realising it, when you learnt about linear combinations of vectors. In the introduction to this section we built up matrix from vectors , and . Let us say we want to multiply matrix by vector , where has a length of three.
You can think about as a matrix that acts upon the vector to create an output vector, . The matrix can be known as a difference matrix because it is the difference between and .
You can see the way was calculated was by multiplying by each component in the first column of , then adding multiplied by the second column of , then adding multiplied by the third column of .
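The by-column calculation can be sketched in a few lines of Python. The function name `matvec_by_columns` and the example values are assumptions made for illustration. The matrix used here is a simple difference matrix, chosen so that each output component is the difference between neighbouring input components, and it is not necessarily the matrix from this post.

```python
def matvec_by_columns(matrix, x):
    """Multiply a matrix (list of rows) by a vector as a linear
    combination of the matrix's columns."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    if len(x) != n_cols:
        raise ValueError("vector length must equal the number of columns")

    result = [0] * n_rows
    for j in range(n_cols):        # take each column in turn
        for i in range(n_rows):    # add x[j] times that column to the result
            result[i] += x[j] * matrix[i][j]
    return result

# Hypothetical difference matrix and input vector.
A = [
    [1, 0, 0],
    [-1, 1, 0],
    [0, -1, 1],
]
x = [1, 4, 9]

b = matvec_by_columns(A, x)
print(b)  # [1, 3, 5]
```

With this particular matrix the output is the first input component, followed by 4 - 1 and 9 - 4.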
Multiplication by row
The method of multiplication we have just learnt is multiplication by column. You can also multiply a matrix and a vector by row using the dot product. I would generally recommend using the by column method as it is simpler to understand.
To multiply matrix by vector using the by row method we take each row of matrix in turn and calculate its dot product with vector .
If you keep going you will see that it multiplies out to the same result as the by columns method. If you need a refresher on calculating the dot product of vectors revisit this post.
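For comparison, here is a minimal sketch of the by-row method, where each output component is the dot product of one row of the matrix with the vector. The helper names `dot` and `matvec_by_rows` and the example values are hypothetical and chosen for illustration only.

```python
def dot(a, b):
    """Dot product of two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(p * q for p, q in zip(a, b))

def matvec_by_rows(matrix, x):
    """Multiply a matrix (list of rows) by a vector, one row at a time."""
    return [dot(row, x) for row in matrix]

# Hypothetical matrix and vector, not taken from this post.
A = [
    [1, 0, 0],
    [-1, 1, 0],
    [0, -1, 1],
]
x = [1, 4, 9]

b = matvec_by_rows(A, x)
print(b)  # [1, 3, 5]
```

Working through the columns of the same matrix by hand gives 1 times (1, -1, 0) plus 4 times (0, 1, -1) plus 9 times (0, 0, 1), which is also (1, 3, 5).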
Practical uses for matrix multiplication
The idea that you can transform one vector into another vector by multiplying it by a matrix is central to large language models. In a simplified picture, the matrix is the weights learned by the model during training, the input vector is a numerical representation of the prompt you give the model, and the output vector is what the model turns into the response it gives you back.
Problems
If you can solve this problem you have understood how to multiply matrices and vectors.
- Calculate vector , where is the product of multiplying vector by matrix .
Solutions
- Using the columns method, we multiply the first element in by the first column of and so on: