Transforming abstract vectors: A first look at matrices

Vectors can be used to obtain other vectors by repeated addition and multiplication by scalars. More generally, however, one can imagine complicated, arbitrary transformations that take one vector and change it into another. We are interested in studying such transformations of vectors. Specifically, we want to study transformations that are well defined, meaning that the transform of a given vector is unique, and that are invertible. In addition, we want such transformations to preserve the superposition property of vectors. That is, if $T$ is a transformation that takes a vector $\ket{v_i}$ to $\ket{w_i}$ (i.e. $T\ket{v_i} = \ket{w_i}$), we want

$$ T(\ket{v_1} + \ket{v_2}) \stackrel{def}{=} T\ket{v_1} + T\ket{v_2} = \ket{w_1} + \ket{w_2}\;. $$

Such transformations are known as linear operators, and our goal in this section is to study them in detail. Before we do, note that just as abstract vectors are easier to understand in component form, abstract transformations are also easier to understand if we work with their component forms, which are known as matrices.
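As a quick numerical illustration of the superposition property, here is a minimal sketch (in Python with NumPy, which these notes do not otherwise use) for the concrete transformation $T(x, y) = (-y, x)$, defined component-wise:

```python
import numpy as np

# A concrete transformation of 2D vectors, defined component-wise:
# T(x, y) = (-y, x).  (This happens to be a rotation by 90 degrees.)
def T(v):
    return np.array([-v[1], v[0]])

v1 = np.array([1.0, 2.0])
v2 = np.array([3.0, -1.0])

# Superposition: transforming the sum gives the sum of the transforms.
lhs = T(v1 + v2)
rhs = T(v1) + T(v2)
print(lhs, rhs)               # [-1.  4.] [-1.  4.]
assert np.allclose(lhs, rhs)
```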

Let us take rotation in 2D as an example of such a transformation. Take a vector from a 2D vector space with components denoted $x$ and $y$ (for simplicity, let the vector have unit length). In terms of an angle $\phi$, the components can be written as $x = \cos \phi,\; y = \sin \phi$. Now suppose we rotate the vector by an angle $\theta$. The components of the rotated vector are $x' = \cos(\phi+\theta),\; y' = \sin (\phi+\theta)$. Using some trigonometry, we can express the components of the rotated vector in terms of the original components and the angle $\theta$. The answer is:

Figure: rotation of a vector in 2D by an angle $\theta$.

$$ \begin{pmatrix} x \\ y \end{pmatrix} \xrightarrow{rotation} \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} x\cos \theta - y\sin \theta\\ x\sin \theta + y\cos \theta\end{pmatrix} $$

<aside> 💡 We used the following trigonometric formulae:

  1. $\cos(A+B) = \cos A \cos B - \sin A \sin B$
  2. $\sin(A+B) = \sin A\cos B + \cos A\sin B$

</aside>
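Written out, the substitution that produces this result is:

$$ x' = \cos(\phi+\theta) = \cos\phi\cos\theta - \sin\phi\sin\theta = x\cos\theta - y\sin\theta $$

$$ y' = \sin(\phi+\theta) = \sin\phi\cos\theta + \cos\phi\sin\theta = x\sin\theta + y\cos\theta $$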

We want to express this as

$$ \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} x\cos \theta - y\sin \theta\\ x\sin \theta + y\cos \theta\end{pmatrix} = M \begin{pmatrix} x \\ y \end{pmatrix} $$

Clearly, this new object $M$, representing the act of rotation, cannot be another vector. It cannot be a scalar either, since multiplying by a scalar only changes the length of a vector without rotating it. So it must be a new kind of object. Indeed it is: it must be expressed as a square (more generally, rectangular) array of numbers with specific rules for addition, multiplication, and so on. We now describe these properties explicitly, and then see how such arrays achieve rotations and other transformations that change one vector into another vector in the same vector space.
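To make this concrete before defining matrices formally, here is a minimal sketch (again in Python with NumPy, my choice rather than anything in the notes) of the rotation written as a $2 \times 2$ array of numbers acting on the components of a unit vector:

```python
import numpy as np

phi, theta = 0.3, 0.7                      # arbitrary angles in radians
v = np.array([np.cos(phi), np.sin(phi)])   # unit vector with components (x, y)

# The act of rotation packaged as a 2x2 array of numbers
M = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

v_rot = M @ v                              # gives (x', y')

# Agrees with x' = cos(phi + theta), y' = sin(phi + theta)
expected = np.array([np.cos(phi + theta), np.sin(phi + theta)])
assert np.allclose(v_rot, expected)
print(v_rot)
```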

Matrices: Properties and examples

Matrices

We define a matrix as a rectangular array of numbers (real or complex):

$$ M = \begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{pmatrix} $$

The size of a matrix is denoted by two numbers, the number of rows and the number of columns, written as $m \times n$. If the number of rows equals the number of columns, we call it a square matrix of size $n \times n$. For example:

$$ M = \begin{pmatrix} 1 & 3 & 5 \\ 2 & 4 & 9 \end{pmatrix}_{2 \times 3} $$

$$ M = \begin{pmatrix} A & B \\ C & D \end{pmatrix}_{2 \times 2} $$

The smallest case is a $1 \times 1$ matrix, $M = \begin{pmatrix} x \end{pmatrix}_{1 \times 1}\,$, which is basically just a real or complex number $x$.
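The following sketch (again Python with NumPy, with illustrative entries) shows how these sizes appear in practice: a $2 \times 3$ rectangular matrix, a $2 \times 2$ square matrix, and a $1 \times 1$ matrix that is effectively just a number:

```python
import numpy as np

A = np.array([[1, 3, 5],
              [2, 4, 9]])           # 2 x 3 rectangular matrix
B = np.array([[1, 2],
              [3, 4]])              # 2 x 2 square matrix
C = np.array([[7]])                 # 1 x 1 matrix: effectively the number 7

print(A.shape, B.shape, C.shape)    # (2, 3) (2, 2) (1, 1)
```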

Notation