Previously, we have explored methods to compute the essence of power functions $Ȣx^n$ which involves solving a large system of linear equations. This method is equivalent to solving for the inverse of a large $n\times n$ matrix where entries are values of pascal's triangle. Though the matrix method allow us to solve for large number of essences at once, it does not extend easily to solve for next iterations of essence coefficients. Rather than reusing the values we have already solved for, we will have to solve for inverse of a separate larger matrix again. Here we will introduce an iterative method for solving for these coefficients. Chapter 0: Recap Let us first remind ourselves of the definition of essence. For a function $f(x)$, we want to find the transformation $Ȣf(x)$ such that we are able to 'smooth out' its series: $$\sum_{i=a}^b f(i) = \int_{a-1}^b Ȣf(x) dx$$ For example, we can solve for the following functions: $$\begin{align*}Ȣ1 &= 1 \\ Ȣx &= x +...
In Linear Algebra, Vector Transformations are analogous to Functions of Real Numbers: both take in an input and yields a specific output.
Main difference appears in the fact that inputs of Transformations, Vectors, is multiple-dimensional construct, rather than being a single real number.
Real Function:$$ f(x) = y, \\
\text{where } x\in \mathbb R, \\
y \in \mathbb R
$$ Vector Transformation/Matrices$$ T(\vec{x}) = A\vec{x} = \vec{y}, \\
\text{where } \vec{x} \in \mathbb R ^m, \\
\vec{y} \in \mathbb R ^ n, \\
\text{and $A$ is of order $n \times m$} $$ Similar to how algebra students learn to solve for the Zeroes and Range of a Function, linear algebra students learn to solve for the linear-algebraic counterparts of the zeroes and range known as Kernel and Image.
Unlike in a real number functions where zeroes and range can be described by real number intervals, Kernels and Images exists as multidimensional Sub-Space of $\mathbb{R}^n$ and $\mathbb{R}^m$ respectively.
Solving for Kernel and images means to solve for a way to describe these Sub-Spaces.
Let's consider the Matrix $A$ with an order $n \times m$. $$T(\vec x) = A \vec x, \vec x \in \mathbb R ^m$$
The Matrix can be written out in terms of its element, as so:
$$ A = \begin{bmatrix}
a_{1\,1} & a_{1\,2} & \dots & a_{1\,m} \\
a_{2\,1} & a_{2\,2} & \dots & a_{2\,m} \\
\vdots \\ a_{n\,1} & a_{n\,2} & \dots & a_{n\,m}
\end{bmatrix}
$$ but we can equally represent the same $A$ in terms of its Column Vectors:
$$A = \begin{bmatrix}
\vec{a}_1 & \vec{a}_2 & \dots & \vec{a}_m
\end{bmatrix}, \vec{a}_i = \begin{bmatrix}
a_{1\,i} \\
a_{2\,i} \\
\vdots \\
a_{n\,i}
\end{bmatrix}
$$
With the Column representation of a Matrix, the Transform can be written as:
$$
\vec{x} = \begin{bmatrix}
x_1 \\
x_2 \\
\vdots \\
x_m
\end{bmatrix} \\
\begin{align*}
A\vec{x} &= \begin{bmatrix}
\vec{a}_1 & \vec{a}_2 & \dots & \vec{a}_m
\end{bmatrix}\begin{bmatrix}
x_1 \\
x_2 \\
\vdots \\
x_m
\end{bmatrix} \\
&= x_1 \vec{a}_1 + x_2 \vec{a}_2 + \dots + x_m \vec{a}_m
\end{align*}
$$
You can verify that this indeed works by trying out an example:
$$
\begin{align*}
\begin{bmatrix}
2 & 1 \\
0 & 3 \\
2 & 4
\end{bmatrix}\begin{bmatrix}
-1 \\
2
\end{bmatrix} &= -\begin{bmatrix}
2 \\
0 \\
2
\end{bmatrix} + 2\begin{bmatrix}
1\\
3\\
4
\end{bmatrix} \\
&= \begin{bmatrix}
-2 \\
0 \\
-2
\end{bmatrix} + \begin{bmatrix}
2\\
6\\
8
\end{bmatrix} \\
&= \begin{bmatrix}
0\\
6\\
6
\end{bmatrix} \\
&\text{which is same the result of}\\
&\text{traditional method of multiplying matrices:} \\
&= \begin{bmatrix}
2*-1 + 1*2 \\
0*-1 + 3*2 \\
2*-1 + 4*2
\end{bmatrix} \\
&= \begin{bmatrix}
0\\
6\\
6
\end{bmatrix} \\
&\text{and by trying it out, it is easy to see why this is the case.}
\end{align*} $$
Let's look again at the general Matrix Product:
$$A\vec x = x_1 \vec{a}_1 + x_2 \vec{a}_2 + \dots + x_m \vec{a}_m = \vec y$$
What we notice from this is that the outcome $\vec y$ is a Linear Combinations of the Matrix's Column Vectors $\vec a _i$.
This must be true for all input $\vec x$ which means we can describe the Image of $A$ as the Sub-Space defined by its Column Vectors which we call the Span of the Vectors $\vec a_i$:
$$\text{Im}(A) = \text{span}(\vec a_1, \vec a_2, \dots,\vec a_m)$$
So this is all we need right?
Not necessarily.
Let's try to use this method to find the Image of this Matrix:
$$
A= \begin{bmatrix}
1&1&1 \\
1&2&3 \\
1&3&5
\end{bmatrix} \\
\therefore \text{Im}(A) = \text{span}(\begin{bmatrix}
1\\
1\\
1
\end{bmatrix},\begin{bmatrix}
1\\
2\\
3
\end{bmatrix},\begin{bmatrix}
1\\
3\\
5
\end{bmatrix})
$$ This means that any output vector $\vec y$ must be in the form
$$\vec y = x_1\begin{bmatrix}
1\\
1\\
1
\end{bmatrix}+x_2\begin{bmatrix}
1\\
2\\
3
\end{bmatrix}+x_3\begin{bmatrix}
1\\
3\\
5
\end{bmatrix}
$$ But notice that the third column is actually a Linear Combination of the first two columns:
$$ \begin{bmatrix}
1\\
3\\
5
\end{bmatrix}= - \begin{bmatrix}
1\\
1\\
1
\end{bmatrix}+2\begin{bmatrix}
1\\
2\\
3
\end{bmatrix}$$ which means that $\vec y$ only really needed the first two columns:
$$\begin{align*}
\vec y &=& (x_1-x_3)\begin{bmatrix}
1\\
1\\
1
\end{bmatrix}&+&(x_2+2x_3)\begin{bmatrix}
1\\
2\\
3
\end{bmatrix} \\
&=& c_1\begin{bmatrix}
1\\
1\\
1
\end{bmatrix}&+&c_2\begin{bmatrix}
1\\
2\\
3
\end{bmatrix}
\end{align*}$$ Because the second column cannot be written as a Linear Combination of the first and vise versa, we can't reduce $\vec y$ into terms of only one vector.
This also means that the Image of $A$ only needed the first two column vectors:
$$\text{Im}(A) = \text{span}(\begin{bmatrix}
1\\
1\\
1
\end{bmatrix},\begin{bmatrix}
1\\
2\\
3
\end{bmatrix})$$ which is a more concise and accurate description of the Image.
This way, each outcome $\vec y$ can be uniquely written as Linear Combination of the two Columns.
When we can write a Vector as Linear Combination of other Vectors, like with the third Column $\begin{bmatrix}1\\3\\5\end{bmatrix}$, we call them Redundant Vectors.
When we cannot write a Vector as a Linear Combination, like with third two Columns $\begin{bmatrix}
1\\
1\\
1
\end{bmatrix}$ and $\begin{bmatrix}
1\\
2\\
3
\end{bmatrix}$, we call them Independent Vectors.
When solving for the Span of the Image, we only want to use the Independent Vectors and not the Redundant Column Vectors.
When we have found the Independent Vectors that span the Sub-Space, we call them the Basis of the Sub-Space. We always want to be thinking in terms of these Basis because they reveal more about the Sub-Space than when the span is described with Redundant Vectors.
If the Sub-Space is defined with 2 Basis, it is obvious that the Sub-Space is a 2D plane, but if we were to add a third Redundant Vector to describe the span, like we did in the example, then it is not immediately obvious that the span describes a 2D Plane rather than a 3D Space.
So how do we eliminate all the Redundant Linear Combinations?
Well, obviously we can try doing it by hand, checking if each combination of Columns result in a Linear Combination of the other, but as we saw even in the example above, it is not immediately obvious that one of the Columns may be a Linear Combination of the others. And as we start dealing with bigger and bigger Matrices with more and more columns, checking every possible combinations will be difficult and time-consuming.
How else can we systematically and easily eliminate all Redundant Columns, leaving only the essential Independent Vectors?
What does it mean to be in Reduced Row Echelon Form?
In RREF, a row can carry a Leading One, or be a Zero-Row, and there can be only one Leading One in a single column.
The matrix $\begin{bmatrix}
1&0&7 \\
0&1&0 \\
0&0&0
\end{bmatrix}$ is in RREF,
while $\begin{bmatrix}
1&0&0 \\
0&2&3 \\
0&0&0
\end{bmatrix}$ is not, because the Second Row neither is a Zero-Row nor carries a Leading One.
We will not worry so much about the orders of these rows for now.
These simple restrictions alone result in an intriguing consequences that the Non-Zero Rows, by definition of RREF, must be linearly independent.
This is simple to see in our example:
Since first column of first row is 1, but all other rows carry 0 in their first column, there is no way that other rows can linearly combine to result in first row:
$$\not \exists c_1,c_2, \begin{bmatrix}
1&0&7
\end{bmatrix} = c_1\begin{bmatrix}
0&1&0
\end{bmatrix} + c_2 \begin{bmatrix}
0&0&0
\end{bmatrix}$$
In other words, by performing Gauss-Jordan Elimination and putting a matrix in its RREF, we can effectively eliminate Linear Combinants within the rows, thus eliminating all Redundant Rows.
Let's see why this will work by looking deeper into how Gauss-Jordan Elimination operates.
$$
A = \begin{bmatrix}
\vec r_1\\
\vec r_2\\
\vdots \\
\vec r_n
\end{bmatrix} \\
\text{we can scale the first row up by $c$} \\
\begin{bmatrix}
c \vec r_1\\
\vec r_2\\
\vdots \\
\vec r_n
\end{bmatrix} \\
\text{and then we can add the scaled first row to the last} \\
\begin{bmatrix}
c \vec r_1\\
\vec r_2\\
\vdots \\
c \vec r _1 + \vec r_n
\end{bmatrix} $$
Interestingly, these Elementary Operations can actually be represented by Matrix Product:
$$
\begin{bmatrix}
c & 0 & \dots & 0\\
0 & 1 & \dots & 0 \\
\vdots \\
0 & 0 & \dots & 1
\end{bmatrix}A = \begin{bmatrix}
c \vec r_1\\
\vec r_2\\
\vdots \\
\vec r_n
\end{bmatrix} $$$$
\begin{bmatrix}
c & 0 & \dots & 0\\
0 & 1 & \dots & 0 \\
\vdots \\
c & 0 & \dots & 1
\end{bmatrix}A = \begin{bmatrix}
c \vec r_1\\
\vec r_2\\
\vdots \\
c\vec r_1 + \vec r_n
\end{bmatrix} \\
$$
These Transformation Matrixes can be achieved by Augmenting the Matrix with an Identity Matrix and performing Elementary Operations on the whole Augmented Matrix:
$$[ A \vert \mathbf 1_n ] = \left[ \begin{array}{c|cccc}
\vec r_1 & 1 & 0 & \dots & 0 \\
\vec r_2 & 0 & 1 & \dots & 0 \\
\vdots & \vdots \\
\vec r _ n & 0 & 0 & \dots & 1
\end{array}
\right ]$$
Now when we scale the first row by $c$ and add that scaled first row to the last, we will get:
$$\left[ \begin{array}{c|cccc}
c\vec r_1 & c & 0 & \dots & 0 \\
\vec r_2 & 0 & 1 & \dots & 0 \\
\vdots & \vdots \\
c\vec r_1 + \vec r _ n & c & 0 & \dots & 1
\end{array}
\right ]$$ Where lefthand side of the Augmented Matrix is what we expected from our elementary operations, and the righthand side now gives the Matrix Transformation representation of the operations.
$$
\text{rref}\left[ A \, \vert \, \mathbf 1 _n \right] = \left[ \text{rref}(A) \, \vert B \, \right]$$
In short, we can interpret the Transformation Matrix as telling us the combinations of Rows from original matrix that will yield the row in the RREF.
$
B = \begin{bmatrix}
\vdots & \vdots && \vdots\\
x_1 & x_2 & \dots & x_n\\
\vdots & \vdots && \vdots\\
\end{bmatrix},
$ where $B$ describes the Elementary Operations of Gauss-Jordan Elimination, it means that the $i$th row of RREF is composed of $x_1 \vec r_1 + x_2 \vec r_2 + \dots + x_n \vec r_n$.
$$\text{rref}(A) = BA = \begin{bmatrix}
\vdots \\
x_1 \vec r_1 + x_2 \vec r_2 + \dots + x_n \vec r_n \\
\vdots
\end{bmatrix}$$
Let's see this at work with an example:
$$
A = \begin{bmatrix}
1&1 \\
1&2 \\
-1&3
\end{bmatrix}, \\
\begin{align*}
\text{rref}(A \, \vert \, \mathbf 1_3 ) &= \text{rref}\left[ \begin{array}{cc|ccc}
1&1&1&0&0\\
1&2&0&1&0\\
-1&3&0&0&1
\end{array}
\right] \\
&= \left[ \begin{array}{cc|ccc}
1&0&0&\frac{3}{5}&-\frac{2}{5}\\
0&1&0&\frac{1}{5}&\frac{1}{5}\\
0&0&1&-\frac{4}{5}&\frac{1}{5}
\end{array}
\right]\\
\end{align*}
$$ We can interpret this as meaning
$$\begin{align*}
&\begin{bmatrix}
1&0
\end{bmatrix} &=& 0\begin{bmatrix}
1&1
\end{bmatrix} &+\frac{3}{5}\begin{bmatrix}
1&2
\end{bmatrix} &&- \frac{2}{5}\begin{bmatrix}
-1&3
\end{bmatrix}& \\
&\begin{bmatrix}
0&1
\end{bmatrix} &=& 0\begin{bmatrix}
1&1
\end{bmatrix} &+\frac{1}{5}\begin{bmatrix}
1&2
\end{bmatrix} &&+\frac{1}{5}\begin{bmatrix}
-1&3
\end{bmatrix} &\\
&\begin{bmatrix}
0&0
\end{bmatrix}& =& \begin{bmatrix}
1&1
\end{bmatrix} &-\frac{4}{5}\begin{bmatrix}
1&2
\end{bmatrix} &&+ \frac{2}{5}\begin{bmatrix}
-1&3
\end{bmatrix}&
\end{align*}$$ Which are all true!
But focus on the last equality of the Zero-Row, the third row of RREF:
$$
\vec 0 = \begin{bmatrix}
1&1
\end{bmatrix} -\frac{4}{5}\begin{bmatrix}
1&2
\end{bmatrix} + \frac{2}{5}\begin{bmatrix}
-1&3
\end{bmatrix} \\
\therefore \frac{4}{5}\begin{bmatrix}
1&2
\end{bmatrix} - \frac{2}{5}\begin{bmatrix}
-1&3
\end{bmatrix}=\begin{bmatrix}
1&1
\end{bmatrix} \\
\text{or in symbolic notations,} \\
\frac{4}{5}\vec r _2 - \frac{2}{5}\vec r_3=\vec r_1
$$ Existence of a Zero-Row in RREF suggests that one of the Rows in the original Matrix was a Linear Combinant of other rows!
Also notice that putting the matrix in RREF form eliminated the usage of that whole Redundant Row (the first column in the Augmented Matrix which represents the First Row is always Zero except in that Zero-Row, meaning that the first Row is never used to define RREF on any other rows)!
$$\begin{bmatrix}
0&\frac{3}{5} & -\frac{2}{5} \\
0&\frac{1}{5} & \frac{1}{5} \\
1&-\frac{4}{5} & \frac{2}{5}
\end{bmatrix}
$$ If the $i$th column of the Augmented Matrix carry a Leading one, then the $i$th row in the Original Matrix is Redundant!
We will come back to the specific usage of Augmented Matrix in a bit.
This is exactly what we wanted for our Image, for eliminating all Redundant Linear Combinants. Remaining Rows of RREF are guaranteed to be Independent!
However, for our Images we need to deal with Columns rather than Rows. How can we eliminate Columns rather than Rows?
For those unfamiliar with Matrix Transpose, it simply "flips" the Matrix along its main diagonal.
$$\begin{bmatrix}
1&1\\
2&3\\
3&5
\end{bmatrix} ^\intercal =
\begin{bmatrix}
1&2&3\\
1&3&5
\end{bmatrix} \\
\begin{bmatrix}
5\\
6
\end{bmatrix}^\intercal = \begin{bmatrix}
5&6
\end{bmatrix} \\
\begin{bmatrix}
8&9
\end{bmatrix}^\intercal = \begin{bmatrix}
8\\
9
\end{bmatrix}
$$ and so on.
In general, we can write Transpose as
$$
A^\intercal = \begin{bmatrix}
\vec a_1 & \vec a_2 & \dots & \vec a_m
\end{bmatrix}^\intercal = \begin{bmatrix}
{\vec a_1}^\intercal \\
{\vec a_2}^\intercal \\
\vdots \\
{\vec a_m}^\intercal
\end{bmatrix}
$$ If $A$ had order of $n \times m$, the Transposed $A^\intercal$ will have order of $m \times n$.
Notice that through Transpose, the Column Vectors became the new Row Vectors.
Now when Gauss-Jordan Elimination is performed to this Transposed Matrix, we are effectively eliminating Linear Combinations of Columns of original Matrix:
$$
\text{rref}(A^\intercal) = \text{rref}(\begin{bmatrix}
{\vec a_1}^\intercal \\
{\vec a_2}^\intercal \\
\vdots \\
{\vec a_m}^\intercal
\end{bmatrix}) = \begin{bmatrix}
\vec b_1\\
\vec b_2 \\
\vdots \\
\vec b_m
\end{bmatrix}$$ where each $\vec b_i$ are either Linear Independent or is a Zero-Vector. These are the Vectors we were looking for when solving for the Image, only as Row-Vectors. We can turn them back to Column-Vectors with another Transpose:
$$\begin{bmatrix}
\vec b_1\\
\vec b_2 \\
\vdots \\
\vec b_m
\end{bmatrix}^\intercal = \begin{bmatrix}
{\vec b _1}^\intercal & {\vec b_2}^\intercal & \dots & {\vec b_m}^\intercal
\end{bmatrix}$$ where ${\vec b_i}^\intercal$ is still guaranteed to be Linearly Independent or be a Zero-Column.
$$\text{rref}(A^\intercal)^\intercal = \text{rcef}(A)$$
Our RCEF shares many similarities with rules of RREF with little differences:
$\begin{bmatrix} 1 & 0 \\ 0 & 1\\ 2 & 3 \end{bmatrix}$is in RCEF,
$\begin{bmatrix} 0&0 \\ 1&0 \\ 0&1 \end{bmatrix}$ is also in RCEF, but
$\begin{bmatrix} 1&0&0\\ 0&1&2\\ 0&0&0 \end{bmatrix}$ is not in RCEF because the second row carries a Peaking One, but there is a non-zero element also in the second row. It is, however, in RREF.
Like we discussed above, RCEF eliminates all Redundant Columns and the remaining Non-Zero Columns are guaranteed to be Independent Vectors.
$$
\text{rcef}(A) = \begin{bmatrix}
\vec b_1 & \vec b_2 & \dots & \vec b_m
\end{bmatrix}
$$ This is what we needed to define our Image. So we can write
$$\text{Im}(A) = \text{span}(\vec b_1, \vec b _2, \dots, \vec b_m), \vec b_i \neq \vec 0$$
So we can systematically define the Image using RCEF, but what about the Kernel?
$$A\vec x = \vec 0, \\ \vec x \in \mathbb R ^m, \vec 0 \in \mathbb R^n$$ Solving for the Kernel is a little bit trickier than solving for the Image, just like find the Zeroes is little more difficult than finding the range of a function.
Let's look again at what it means to be in the Kernel.
$$\begin{align*}
\vec 0 &=
A \vec x \\
\vec 0 =&= \begin{bmatrix}
\vec a_1 & \dots & \vec a_m
\end{bmatrix}\begin{bmatrix}
x_1 \\
\vdots \\
x_m
\end{bmatrix} \\
\vec 0 & =
x_1\vec a_1 + x_2\vec a_2 + \dots + x_m\vec a_m
\end{align*}$$ From here, we notice that this equality implies that one of the Columns $\vec a_i$ must be a Linear Combination of the other:
$$\begin{align*}
\vec 0 &= x_1\vec a_1 + x_2\vec a_2 + \dots + x_m\vec a_m \\
\therefore -x_1\vec a_1 &= x_2\vec a_2 + \dots + x_m\vec a_m \\
\therefore \vec a_1 &= -\frac{x_2}{x_1}\vec a_2 - \dots - \frac{x_m}{x_1}\vec a_m
\end{align*}$$ So again, we find that the problem we want to solve is linked to Linear Combinations of the Columns and the Redundant Vectors, but in contrast to solving for Image by eliminating these Redundant Columns, we instead need to find these Columns and see how other Columns combine to
form that vector.
$$\text{rref}\left [ A\, \vert \mathbf 1_m\right ] = \left[ \text{rref}(A) \, \vert B \right]$$ We found that $\text{rref}(A)$ will carry a Zero-Row if and only if a row of $A$ is a Linear Combination of the other rows. Also, by looking at the augmented $B$, we can find which row that Linear Combinant was: if $j$th Column in $B$ carries a Leading One, then $j$th row of $A$ is a Linear Combination of the others.
We explored this before, but let's see it again with a different example:
$$\text{rref}\left[
\begin{array}{ccc|ccc}
2 & 3& 5& 1& 0& 0 \\
1 & 2& 7& 0& 1& 0 \\
4 & 7& 19& 0& 0& 1 \\\end{array}
\right] =
\left [
\begin{array}{ccc|ccc}
1 & 0& -11& 0& -7& 2 \\
0 & 1& 9& 0& 4& -1 \\
0 & 0& 0& 1& 2& -1 \\
\end{array}
\right ]
$$ Notice the third row is a Zero-Row, and $B$ carries a Leading One in its first column. This means that first row of $A$ is a Linear Combination of the other rows, namely
$$\begin{bmatrix}2 & 3& 5
\end{bmatrix} = -2\begin{bmatrix}1 & 2& 7
\end{bmatrix}+\begin{bmatrix}4 & 7& 19
\end{bmatrix}$$ However, more important than finding which Row is a Redundant Row, $B$ allows us to find which Combination of Rows sum up to a Zero-Vector!
This is exactly what we were looking for when trying to find Kernel.
$$\text{rref}\left[ A^\intercal \, \vert \mathbf 1_m \right]^\intercal = \left[ \frac{\text{rcef}(A)}{B}\right]$$ In this context, $\left[ \frac{A}{B}\right]$ does not represent a fraction, but a Vertically Augmented Matrix which is also called Column-Augmented Matrix. This occurs because of the Transpose at the end.
How would we use this to find Kernel?
In $\left[ \frac{\text{rcef}(A)}{B}\right]$, for all Zero-Columns of $\text{rcef}(A)$, the corresponding Column of $B$ will be in the Kernel of $A$.
For example:
$$A = \begin{bmatrix}
1&1&1 \\
1&2&3 \\
3&4&5
\end{bmatrix} \\
\begin{align*}
\text{rref}\left[ A^\intercal \vert \mathbf 1_3\right]^\intercal &=
\text{rref}\left[ \begin{bmatrix}
1&1&1 \\
1&2&3 \\
3&4&5
\end{bmatrix}^\intercal \vert \mathbf 1_3
\right] ^\intercal\\
&= \text{rref}\left[ \begin{bmatrix}
1&1&3 \\
1&2&4 \\
1&3&5
\end{bmatrix} \vert \begin{bmatrix}
1&0&0 \\
0&1&0\\
0&0&1
\end{bmatrix}
\right] ^\intercal \\
&\text{Transposing the Matrix switches Columns with Rows} \\
&= \text{rref} \left[ \begin{array}{ccc|ccc}
1&1&3&1&0&0 \\
1&2&4&0&1&0 \\
1&3&5&0&0&1
\end{array} \right]^\intercal \\
&\text{Augmenting Identity allow us to "store" orders of operations of RREF} \\
&= \left[
\begin{array}{ccc|ccc}
1&0&2&0&3&-2\\
0&1&1&0&-1&1\\
0&0&0&1&-2&1
\end{array}
\right]^\intercal \\
&\text{Notice the Zero Row on Third Row.} \\
&\text{The Corresponding Row in the Augmented Matrix tells us the} \\
&\text{combinations of Columns of $A$ that yields $\vec 0$.} \\
&=
\begin{bmatrix}
1&0&0\\
0&1&0\\
2&1&0\\
\hline
0&0&1 \\
3&-1&-2\\
-2&1&1
\end{bmatrix} \\
&\text{Transpose returns Columns to its original places}
\end{align*} \\
$$The corresponding Column to the Zero-Column of $\text{rcef}(A)$ describes the Span of the Kernel.
$$\begin{bmatrix}
1&1&1 \\
1&2&3 \\
3&4&5
\end{bmatrix}\begin{bmatrix}
1\\
-2\\
1
\end{bmatrix}= \vec 0
$$
In this case, only the third column describes the Span.
$$\ker(A) = \text{span}(\begin{bmatrix}
1\\
-2\\
1
\end{bmatrix})$$
Conveniently, in trying to find the Kernel with RCEF, we also can simultaneously solve for the Image by looking at Non-Zero Columns of RCEF:
$$\text{Im}(A) = \text{span}(\begin{bmatrix}
1\\0\\2
\end{bmatrix}, \begin{bmatrix}
0\\1\\1
\end{bmatrix}
)$$
Why does this work?
Let's look at what each component of the equation actually does:
We can thus conclude $$ \text{rref}(A^\intercal \vert \mathbf 1_m)^\intercal = \left[ \frac{\text{rcef}(A)}{B}\right] = \left[ \frac{ \begin{bmatrix} \vec r_1 & \dots & \vec r_m \end{bmatrix} }{ \begin{bmatrix} \vec b_1 & \dots & \vec b_m \end{bmatrix} } \right] \\
\Rightarrow \\
\ker(A) = \text{span}( \dots \vec b_i, \dots), \, \vec r_i = \vec 0$$
They are useful because they tell us at one glance what the Sub-Space is like:
To span a Line, you will need only one Independent Vector.
To span a 2D Plane, you will need 2 Independent Vectors.
To span a 3D Space, you need 3 Independent Vectors.
And so on.
Now that we have found a easy way of solving for Kernel and Image, solving for the Dimension of those Sub-Spaces become extremely easy.
We will again use RCEF to make this problem quick and simple.
Because we know that the Non-Zero Columns of $\text{rcef}(A)$ will span the Image and because we know those columns are guaranteed to be Independent, the Dimension of Image will just be the number of Non-Zero Columns of RCEF:
$$\dim(\text{Im}(A)) = \text{Number of Non-Zero Columns of $\text{rcef}(A)$} = \text{rank}(\text{rcef}(A))$$
The Zero-Columns of RCEF implies that there is a Redundant Column in $A$. To solve for the Basis of Kernel, we have to use Augmented RCEF; however, since we don't need to know the actual basis themselves to find the Dimension, we can conclude that Dimension of Kernel is simple number of Zero-Columns of RCEF.
$$\dim(\ker(A)) = \text{Number of Zero-Columns of $\text{rcef}(A)$} = m-\text{rank}(\text{rcef}(A))$$
These two facts naturally follow to show the Rank-Nullity Theorem
However, using our new method with RCEF, Rank-Nullity Theorem merely becomes an obvious consequence of solving for the Basis of Kernel and Images simultaneously.
menu->Matrix->Transpose for Transpose
menu->Matrix->Create->Augment for Augmenting
menu->Matrix->Create->Identity for easily creating an Identity Matrix of given size.
You command will thus look like $$ \text{rref}(\text{augment}(A^\intercal , \text{identity}(m)))^\intercal$$where you will input the Matrix $A$ and its Column $m$ yourself.
But if you find yourself too lazy to type the whole line, it is fine to Transpose the Matrix yourself and only use RREF command.
For example, if you want to analyze the Image, Kernel, and their Dimensions of $$\begin{bmatrix}
1&2&3&4\\
0&1&2&3\\
1&1&1&1
\end{bmatrix}$$ you can simply create a $3\times 6$ Matrix and apply RREF:
$$
\text{rref}(\begin{bmatrix}
1&0&1&1&0&0&0\\
2&1&1&0&1&0&0\\
3&2&1&0&0&1&0\\
4&2&1&0&0&0&1
\end{bmatrix}) = \begin{bmatrix}
1&0&1&0&0&3&-2\\
0&1&-1&0&0&-4&3\\
0&0&0&1&0&-3&2\\
0&0&0&0&1&-2&1
\end{bmatrix}
$$ When checking for Zero/Non-Zero Columns, we need to look at the first 3 Columns (because original Matrix was in $3\times 4$).
The first two Columns are Non-Zero, so we know they form the Image. We write manually Transpose them back:
$$\text{Im}(A) = \text{span}(\begin{bmatrix}
1\\0\\1
\end{bmatrix}, \begin{bmatrix}
0\\1\\-1
\end{bmatrix})$$ Then we read that the third and fourth are all Zeroes, thus we know their Corresponding Rows form the Kernels. Again we manually turn them to Column Vectors:
$$\ker(A) = \text{span}(\begin{bmatrix}1\\0\\-3\\2
\end{bmatrix},\begin{bmatrix}
0\\1\\-2\\1
\end{bmatrix})$$
Try these following exercises either using the proper operations or the short-cut:
$$
\begin{align*}
1.& \begin{bmatrix}
1&2\\
3&4
\end{bmatrix} \\
2.& \begin{bmatrix}
0&0\\
0&0
\end{bmatrix} \\
3.& \begin{bmatrix}
1&1&1\\
1&2&3\\
1&3&5
\end{bmatrix} \\
3. & \begin{bmatrix}
1&2&2&-5&6\\
-1&-2&-1&1&-1\\
4&8&5&-8&9\\
3&6&1&5&-7
\end{bmatrix}
\end{align*}$$Try using RCEF and also without using RCEF.
Notice that the Basis you find using RCEF may differ from ones you find from Basis you find using other methods, but this is only because RCEF will always contain a Peaking One.
Let's try the exercise 1.
We can easily find that
$\text{rcef}(A) = \begin{bmatrix}
1&0\\
0&1
\end{bmatrix}$ and so conclude that
$\text{Im}(A) = \text{span}(\begin{bmatrix}
1\\0
\end{bmatrix}, \begin{bmatrix}
0\\1
\end{bmatrix})$.
But without using RCEF, you could have noticed that $\begin{bmatrix}
1\\3
\end{bmatrix}$ and $\begin{bmatrix}
2\\4
\end{bmatrix}$ are already Independent from each other and so concluded that
$\text{Im}(A) = \text{span}(\begin{bmatrix}
1\\3
\end{bmatrix},\begin{bmatrix}
2\\4
\end{bmatrix}
)$
Both of them are correct representation of same Sub-Space of 2 Dimensions, only written differently.
$$\text{Im}(A) = \text{span}(\begin{bmatrix}
1\\0
\end{bmatrix}, \begin{bmatrix}
0\\1
\end{bmatrix}) = \text{span}(\begin{bmatrix}
1\\3
\end{bmatrix},\begin{bmatrix}
2\\4
\end{bmatrix}
) = \mathbb R^2 $$ You will notice similar minor differences when solving for Kernels as well, but again they are merely different representations of same Sub-Space.
Main difference appears in the fact that inputs of Transformations, Vectors, is multiple-dimensional construct, rather than being a single real number.
Real Function:$$ f(x) = y, \\
\text{where } x\in \mathbb R, \\
y \in \mathbb R
$$ Vector Transformation/Matrices$$ T(\vec{x}) = A\vec{x} = \vec{y}, \\
\text{where } \vec{x} \in \mathbb R ^m, \\
\vec{y} \in \mathbb R ^ n, \\
\text{and $A$ is of order $n \times m$} $$ Similar to how algebra students learn to solve for the Zeroes and Range of a Function, linear algebra students learn to solve for the linear-algebraic counterparts of the zeroes and range known as Kernel and Image.
Unlike in a real number functions where zeroes and range can be described by real number intervals, Kernels and Images exists as multidimensional Sub-Space of $\mathbb{R}^n$ and $\mathbb{R}^m$ respectively.
Solving for Kernel and images means to solve for a way to describe these Sub-Spaces.
Image
Image of a Matrix, in simple words, refer to every possible outcome vector of a Matrix Product. It is analogous to the Range of a function.Let's consider the Matrix $A$ with an order $n \times m$. $$T(\vec x) = A \vec x, \vec x \in \mathbb R ^m$$
The Matrix can be written out in terms of its element, as so:
$$ A = \begin{bmatrix}
a_{1\,1} & a_{1\,2} & \dots & a_{1\,m} \\
a_{2\,1} & a_{2\,2} & \dots & a_{2\,m} \\
\vdots \\ a_{n\,1} & a_{n\,2} & \dots & a_{n\,m}
\end{bmatrix}
$$ but we can equally represent the same $A$ in terms of its Column Vectors:
$$A = \begin{bmatrix}
\vec{a}_1 & \vec{a}_2 & \dots & \vec{a}_m
\end{bmatrix}, \vec{a}_i = \begin{bmatrix}
a_{1\,i} \\
a_{2\,i} \\
\vdots \\
a_{n\,i}
\end{bmatrix}
$$
With the Column representation of a Matrix, the Transform can be written as:
$$
\vec{x} = \begin{bmatrix}
x_1 \\
x_2 \\
\vdots \\
x_m
\end{bmatrix} \\
\begin{align*}
A\vec{x} &= \begin{bmatrix}
\vec{a}_1 & \vec{a}_2 & \dots & \vec{a}_m
\end{bmatrix}\begin{bmatrix}
x_1 \\
x_2 \\
\vdots \\
x_m
\end{bmatrix} \\
&= x_1 \vec{a}_1 + x_2 \vec{a}_2 + \dots + x_m \vec{a}_m
\end{align*}
$$
You can verify that this indeed works by trying out an example:
$$
\begin{align*}
\begin{bmatrix}
2 & 1 \\
0 & 3 \\
2 & 4
\end{bmatrix}\begin{bmatrix}
-1 \\
2
\end{bmatrix} &= -\begin{bmatrix}
2 \\
0 \\
2
\end{bmatrix} + 2\begin{bmatrix}
1\\
3\\
4
\end{bmatrix} \\
&= \begin{bmatrix}
-2 \\
0 \\
-2
\end{bmatrix} + \begin{bmatrix}
2\\
6\\
8
\end{bmatrix} \\
&= \begin{bmatrix}
0\\
6\\
6
\end{bmatrix} \\
&\text{which is same the result of}\\
&\text{traditional method of multiplying matrices:} \\
&= \begin{bmatrix}
2*-1 + 1*2 \\
0*-1 + 3*2 \\
2*-1 + 4*2
\end{bmatrix} \\
&= \begin{bmatrix}
0\\
6\\
6
\end{bmatrix} \\
&\text{and by trying it out, it is easy to see why this is the case.}
\end{align*} $$
Let's look again at the general Matrix Product:
$$A\vec x = x_1 \vec{a}_1 + x_2 \vec{a}_2 + \dots + x_m \vec{a}_m = \vec y$$
What we notice from this is that the outcome $\vec y$ is a Linear Combinations of the Matrix's Column Vectors $\vec a _i$.
This must be true for all input $\vec x$ which means we can describe the Image of $A$ as the Sub-Space defined by its Column Vectors which we call the Span of the Vectors $\vec a_i$:
$$\text{Im}(A) = \text{span}(\vec a_1, \vec a_2, \dots,\vec a_m)$$
So this is all we need right?
Not necessarily.
Let's try to use this method to find the Image of this Matrix:
$$
A= \begin{bmatrix}
1&1&1 \\
1&2&3 \\
1&3&5
\end{bmatrix} \\
\therefore \text{Im}(A) = \text{span}(\begin{bmatrix}
1\\
1\\
1
\end{bmatrix},\begin{bmatrix}
1\\
2\\
3
\end{bmatrix},\begin{bmatrix}
1\\
3\\
5
\end{bmatrix})
$$ This means that any output vector $\vec y$ must be in the form
$$\vec y = x_1\begin{bmatrix}
1\\
1\\
1
\end{bmatrix}+x_2\begin{bmatrix}
1\\
2\\
3
\end{bmatrix}+x_3\begin{bmatrix}
1\\
3\\
5
\end{bmatrix}
$$ But notice that the third column is actually a Linear Combination of the first two columns:
$$ \begin{bmatrix}
1\\
3\\
5
\end{bmatrix}= - \begin{bmatrix}
1\\
1\\
1
\end{bmatrix}+2\begin{bmatrix}
1\\
2\\
3
\end{bmatrix}$$ which means that $\vec y$ only really needed the first two columns:
$$\begin{align*}
\vec y &=& (x_1-x_3)\begin{bmatrix}
1\\
1\\
1
\end{bmatrix}&+&(x_2+2x_3)\begin{bmatrix}
1\\
2\\
3
\end{bmatrix} \\
&=& c_1\begin{bmatrix}
1\\
1\\
1
\end{bmatrix}&+&c_2\begin{bmatrix}
1\\
2\\
3
\end{bmatrix}
\end{align*}$$ Because the second column cannot be written as a Linear Combination of the first and vise versa, we can't reduce $\vec y$ into terms of only one vector.
This also means that the Image of $A$ only needed the first two column vectors:
$$\text{Im}(A) = \text{span}(\begin{bmatrix}
1\\
1\\
1
\end{bmatrix},\begin{bmatrix}
1\\
2\\
3
\end{bmatrix})$$ which is a more concise and accurate description of the Image.
This way, each outcome $\vec y$ can be uniquely written as Linear Combination of the two Columns.
When we can write a Vector as Linear Combination of other Vectors, like with the third Column $\begin{bmatrix}1\\3\\5\end{bmatrix}$, we call them Redundant Vectors.
When we cannot write a Vector as a Linear Combination, like with third two Columns $\begin{bmatrix}
1\\
1\\
1
\end{bmatrix}$ and $\begin{bmatrix}
1\\
2\\
3
\end{bmatrix}$, we call them Independent Vectors.
When solving for the Span of the Image, we only want to use the Independent Vectors and not the Redundant Column Vectors.
When we have found the Independent Vectors that span the Sub-Space, we call them the Basis of the Sub-Space. We always want to be thinking in terms of these Basis because they reveal more about the Sub-Space than when the span is described with Redundant Vectors.
If the Sub-Space is defined with 2 Basis, it is obvious that the Sub-Space is a 2D plane, but if we were to add a third Redundant Vector to describe the span, like we did in the example, then it is not immediately obvious that the span describes a 2D Plane rather than a 3D Space.
So how do we eliminate all the Redundant Linear Combinations?
Well, obviously we can try doing it by hand, checking if each combination of Columns result in a Linear Combination of the other, but as we saw even in the example above, it is not immediately obvious that one of the Columns may be a Linear Combination of the others. And as we start dealing with bigger and bigger Matrices with more and more columns, checking every possible combinations will be difficult and time-consuming.
How else can we systematically and easily eliminate all Redundant Columns, leaving only the essential Independent Vectors?
Gauss-Jordan Elimination and Reduced Row Echelon Form
We briefly mentioned a similar concept in my previous post Inverses of Non-Square Matrices, but we will explore it deeper here.What does it mean to be in Reduced Row Echelon Form?
In RREF, a row can carry a Leading One, or be a Zero-Row, and there can be only one Leading One in a single column.
The matrix $\begin{bmatrix}
1&0&7 \\
0&1&0 \\
0&0&0
\end{bmatrix}$ is in RREF,
while $\begin{bmatrix}
1&0&0 \\
0&2&3 \\
0&0&0
\end{bmatrix}$ is not, because the Second Row neither is a Zero-Row nor carries a Leading One.
We will not worry so much about the orders of these rows for now.
These simple restrictions alone result in an intriguing consequences that the Non-Zero Rows, by definition of RREF, must be linearly independent.
This is simple to see in our example:
Since first column of first row is 1, but all other rows carry 0 in their first column, there is no way that other rows can linearly combine to result in first row:
$$\not \exists c_1,c_2, \begin{bmatrix}
1&0&7
\end{bmatrix} = c_1\begin{bmatrix}
0&1&0
\end{bmatrix} + c_2 \begin{bmatrix}
0&0&0
\end{bmatrix}$$
In other words, by performing Gauss-Jordan Elimination and putting a matrix in its RREF, we can effectively eliminate Linear Combinants within the rows, thus eliminating all Redundant Rows.
Let's see why this will work by looking deeper into how Gauss-Jordan Elimination operates.
Elementary Operations
Gauss-Jordan Elimination transforms a Matrix into its RREF by scaling and adding rows. These scaling and adding of the rows are known as Elementary Row Operations.$$
A = \begin{bmatrix}
\vec r_1\\
\vec r_2\\
\vdots \\
\vec r_n
\end{bmatrix} \\
\text{we can scale the first row up by $c$} \\
\begin{bmatrix}
c \vec r_1\\
\vec r_2\\
\vdots \\
\vec r_n
\end{bmatrix} \\
\text{and then we can add the scaled first row to the last} \\
\begin{bmatrix}
c \vec r_1\\
\vec r_2\\
\vdots \\
c \vec r _1 + \vec r_n
\end{bmatrix} $$
Interestingly, these Elementary Operations can actually be represented by Matrix Product:
$$
\begin{bmatrix}
c & 0 & \dots & 0\\
0 & 1 & \dots & 0 \\
\vdots \\
0 & 0 & \dots & 1
\end{bmatrix}A = \begin{bmatrix}
c \vec r_1\\
\vec r_2\\
\vdots \\
\vec r_n
\end{bmatrix} $$$$
\begin{bmatrix}
c & 0 & \dots & 0\\
0 & 1 & \dots & 0 \\
\vdots \\
c & 0 & \dots & 1
\end{bmatrix}A = \begin{bmatrix}
c \vec r_1\\
\vec r_2\\
\vdots \\
c\vec r_1 + \vec r_n
\end{bmatrix} \\
$$
These Transformation Matrixes can be achieved by Augmenting the Matrix with an Identity Matrix and performing Elementary Operations on the whole Augmented Matrix:
$$[ A \vert \mathbf 1_n ] = \left[ \begin{array}{c|cccc}
\vec r_1 & 1 & 0 & \dots & 0 \\
\vec r_2 & 0 & 1 & \dots & 0 \\
\vdots & \vdots \\
\vec r _ n & 0 & 0 & \dots & 1
\end{array}
\right ]$$
Now when we scale the first row by $c$ and add that scaled first row to the last, we will get:
$$\left[ \begin{array}{c|cccc}
c\vec r_1 & c & 0 & \dots & 0 \\
\vec r_2 & 0 & 1 & \dots & 0 \\
\vdots & \vdots \\
c\vec r_1 + \vec r _ n & c & 0 & \dots & 1
\end{array}
\right ]$$ Where lefthand side of the Augmented Matrix is what we expected from our elementary operations, and the righthand side now gives the Matrix Transformation representation of the operations.
$$
\text{rref}\left[ A \, \vert \, \mathbf 1 _n \right] = \left[ \text{rref}(A) \, \vert B \, \right]$$
In short, we can interpret the Transformation Matrix as telling us the combinations of Rows from original matrix that will yield the row in the RREF.
$
B = \begin{bmatrix}
\vdots & \vdots && \vdots\\
x_1 & x_2 & \dots & x_n\\
\vdots & \vdots && \vdots\\
\end{bmatrix},
$ where $B$ describes the Elementary Operations of Gauss-Jordan Elimination, it means that the $i$th row of RREF is composed of $x_1 \vec r_1 + x_2 \vec r_2 + \dots + x_n \vec r_n$.
$$\text{rref}(A) = BA = \begin{bmatrix}
\vdots \\
x_1 \vec r_1 + x_2 \vec r_2 + \dots + x_n \vec r_n \\
\vdots
\end{bmatrix}$$
Let's see this at work with an example:
$$
A = \begin{bmatrix}
1&1 \\
1&2 \\
-1&3
\end{bmatrix}, \\
\begin{align*}
\text{rref}(A \, \vert \, \mathbf 1_3 ) &= \text{rref}\left[ \begin{array}{cc|ccc}
1&1&1&0&0\\
1&2&0&1&0\\
-1&3&0&0&1
\end{array}
\right] \\
&= \left[ \begin{array}{cc|ccc}
1&0&0&\frac{3}{5}&-\frac{2}{5}\\
0&1&0&\frac{1}{5}&\frac{1}{5}\\
0&0&1&-\frac{4}{5}&\frac{1}{5}
\end{array}
\right]\\
\end{align*}
$$ We can interpret this as meaning
$$\begin{align*}
&\begin{bmatrix}
1&0
\end{bmatrix} &=& 0\begin{bmatrix}
1&1
\end{bmatrix} &+\frac{3}{5}\begin{bmatrix}
1&2
\end{bmatrix} &&- \frac{2}{5}\begin{bmatrix}
-1&3
\end{bmatrix}& \\
&\begin{bmatrix}
0&1
\end{bmatrix} &=& 0\begin{bmatrix}
1&1
\end{bmatrix} &+\frac{1}{5}\begin{bmatrix}
1&2
\end{bmatrix} &&+\frac{1}{5}\begin{bmatrix}
-1&3
\end{bmatrix} &\\
&\begin{bmatrix}
0&0
\end{bmatrix}& =& \begin{bmatrix}
1&1
\end{bmatrix} &-\frac{4}{5}\begin{bmatrix}
1&2
\end{bmatrix} &&+ \frac{2}{5}\begin{bmatrix}
-1&3
\end{bmatrix}&
\end{align*}$$ Which are all true!
But focus on the last equality of the Zero-Row, the third row of RREF:
$$
\vec 0 = \begin{bmatrix}
1&1
\end{bmatrix} -\frac{4}{5}\begin{bmatrix}
1&2
\end{bmatrix} + \frac{2}{5}\begin{bmatrix}
-1&3
\end{bmatrix} \\
\therefore \frac{4}{5}\begin{bmatrix}
1&2
\end{bmatrix} - \frac{2}{5}\begin{bmatrix}
-1&3
\end{bmatrix}=\begin{bmatrix}
1&1
\end{bmatrix} \\
\text{or in symbolic notations,} \\
\frac{4}{5}\vec r _2 - \frac{2}{5}\vec r_3=\vec r_1
$$ Existence of a Zero-Row in RREF suggests that one of the Rows in the original Matrix was a Linear Combinant of other rows!
Also notice that putting the matrix in RREF form eliminated the usage of that whole Redundant Row (the first column in the Augmented Matrix which represents the First Row is always Zero except in that Zero-Row, meaning that the first Row is never used to define RREF on any other rows)!
$$\begin{bmatrix}
0&\frac{3}{5} & -\frac{2}{5} \\
0&\frac{1}{5} & \frac{1}{5} \\
1&-\frac{4}{5} & \frac{2}{5}
\end{bmatrix}
$$ If the $i$th column of the Augmented Matrix carry a Leading one, then the $i$th row in the Original Matrix is Redundant!
We will come back to the specific usage of Augmented Matrix in a bit.
This is exactly what we wanted for our Image, for eliminating all Redundant Linear Combinants. Remaining Rows of RREF are guaranteed to be Independent!
However, for our Images we need to deal with Columns rather than Rows. How can we eliminate Columns rather than Rows?
Transpose
We can use Matrix Transpose to switch the roles of Columns and Rows.For those unfamiliar with Matrix Transpose, it simply "flips" the Matrix along its main diagonal.
$$\begin{bmatrix}
1&1\\
2&3\\
3&5
\end{bmatrix} ^\intercal =
\begin{bmatrix}
1&2&3\\
1&3&5
\end{bmatrix} \\
\begin{bmatrix}
5\\
6
\end{bmatrix}^\intercal = \begin{bmatrix}
5&6
\end{bmatrix} \\
\begin{bmatrix}
8&9
\end{bmatrix}^\intercal = \begin{bmatrix}
8\\
9
\end{bmatrix}
$$ and so on.
In general, we can write Transpose as
$$
A^\intercal = \begin{bmatrix}
\vec a_1 & \vec a_2 & \dots & \vec a_m
\end{bmatrix}^\intercal = \begin{bmatrix}
{\vec a_1}^\intercal \\
{\vec a_2}^\intercal \\
\vdots \\
{\vec a_m}^\intercal
\end{bmatrix}
$$ If $A$ had order of $n \times m$, the Transposed $A^\intercal$ will have order of $m \times n$.
Notice that through Transpose, the Column Vectors became the new Row Vectors.
Now when Gauss-Jordan Elimination is performed to this Transposed Matrix, we are effectively eliminating Linear Combinations of Columns of original Matrix:
$$
\text{rref}(A^\intercal) = \text{rref}(\begin{bmatrix}
{\vec a_1}^\intercal \\
{\vec a_2}^\intercal \\
\vdots \\
{\vec a_m}^\intercal
\end{bmatrix}) = \begin{bmatrix}
\vec b_1\\
\vec b_2 \\
\vdots \\
\vec b_m
\end{bmatrix}$$ where each $\vec b_i$ are either Linear Independent or is a Zero-Vector. These are the Vectors we were looking for when solving for the Image, only as Row-Vectors. We can turn them back to Column-Vectors with another Transpose:
$$\begin{bmatrix}
\vec b_1\\
\vec b_2 \\
\vdots \\
\vec b_m
\end{bmatrix}^\intercal = \begin{bmatrix}
{\vec b _1}^\intercal & {\vec b_2}^\intercal & \dots & {\vec b_m}^\intercal
\end{bmatrix}$$ where ${\vec b_i}^\intercal$ is still guaranteed to be Linearly Independent or be a Zero-Column.
Reduced Column Echelon Form
Thus, we can define the operations to be:$$\text{rref}(A^\intercal)^\intercal = \text{rcef}(A)$$
Our RCEF shares many similarities with rules of RREF with little differences:
- A Column can only have at most one Peaking Ones (analogous to the Leading One of RREF).
- A Row containing a Peaking One must carry all zeroes at every other columns.
- A Column with higher Peaking One must be to the right of Columns with lower Peaking One.
- All Zero-Columns must be at the left of the RCEF
$\begin{bmatrix} 1 & 0 \\ 0 & 1\\ 2 & 3 \end{bmatrix}$is in RCEF,
$\begin{bmatrix} 0&0 \\ 1&0 \\ 0&1 \end{bmatrix}$ is also in RCEF, but
$\begin{bmatrix} 1&0&0\\ 0&1&2\\ 0&0&0 \end{bmatrix}$ is not in RCEF because the second row carries a Peaking One, but there is a non-zero element also in the second row. It is, however, in RREF.
Like we discussed above, RCEF eliminates all Redundant Columns and the remaining Non-Zero Columns are guaranteed to be Independent Vectors.
$$
\text{rcef}(A) = \begin{bmatrix}
\vec b_1 & \vec b_2 & \dots & \vec b_m
\end{bmatrix}
$$ This is what we needed to define our Image. So we can write
$$\text{Im}(A) = \text{span}(\vec b_1, \vec b _2, \dots, \vec b_m), \vec b_i \neq \vec 0$$
So we can systematically define the Image using RCEF, but what about the Kernel?
Kernel
Kernel of a Matrix refers to every Matrix that maps the $\vec 0$. It is analogous to the Zeroes of a function.$$A\vec x = \vec 0, \\ \vec x \in \mathbb R ^m, \vec 0 \in \mathbb R^n$$ Solving for the Kernel is a little bit trickier than solving for the Image, just like find the Zeroes is little more difficult than finding the range of a function.
Let's look again at what it means to be in the Kernel.
$$\begin{align*}
\vec 0 &=
A \vec x \\
\vec 0 =&= \begin{bmatrix}
\vec a_1 & \dots & \vec a_m
\end{bmatrix}\begin{bmatrix}
x_1 \\
\vdots \\
x_m
\end{bmatrix} \\
\vec 0 & =
x_1\vec a_1 + x_2\vec a_2 + \dots + x_m\vec a_m
\end{align*}$$ From here, we notice that this equality implies that one of the Columns $\vec a_i$ must be a Linear Combination of the other:
$$\begin{align*}
\vec 0 &= x_1\vec a_1 + x_2\vec a_2 + \dots + x_m\vec a_m \\
\therefore -x_1\vec a_1 &= x_2\vec a_2 + \dots + x_m\vec a_m \\
\therefore \vec a_1 &= -\frac{x_2}{x_1}\vec a_2 - \dots - \frac{x_m}{x_1}\vec a_m
\end{align*}$$ So again, we find that the problem we want to solve is linked to Linear Combinations of the Columns and the Redundant Vectors, but in contrast to solving for Image by eliminating these Redundant Columns, we instead need to find these Columns and see how other Columns combine to
form that vector.
Augmented Identity Matrix
Let's recall back to our findings with Gauss-Jordan Elimination and what happens when we find RREF of matrix augmented with a Identity Matrix. RREF deals with rows rather than columns so think in terms of rows for this section:$$\text{rref}\left [ A\, \vert \mathbf 1_m\right ] = \left[ \text{rref}(A) \, \vert B \right]$$ We found that $\text{rref}(A)$ will carry a Zero-Row if and only if a row of $A$ is a Linear Combination of the other rows. Also, by looking at the augmented $B$, we can find which row that Linear Combinant was: if $j$th Column in $B$ carries a Leading One, then $j$th row of $A$ is a Linear Combination of the others.
We explored this before, but let's see it again with a different example:
$$\text{rref}\left[
\begin{array}{ccc|ccc}
2 & 3& 5& 1& 0& 0 \\
1 & 2& 7& 0& 1& 0 \\
4 & 7& 19& 0& 0& 1 \\\end{array}
\right] =
\left [
\begin{array}{ccc|ccc}
1 & 0& -11& 0& -7& 2 \\
0 & 1& 9& 0& 4& -1 \\
0 & 0& 0& 1& 2& -1 \\
\end{array}
\right ]
$$ Notice the third row is a Zero-Row, and $B$ carries a Leading One in its first column. This means that first row of $A$ is a Linear Combination of the other rows, namely
$$\begin{bmatrix}2 & 3& 5
\end{bmatrix} = -2\begin{bmatrix}1 & 2& 7
\end{bmatrix}+\begin{bmatrix}4 & 7& 19
\end{bmatrix}$$ However, more important than finding which Row is a Redundant Row, $B$ allows us to find which Combination of Rows sum up to a Zero-Vector!
This is exactly what we were looking for when trying to find Kernel.
Augmented Reduced Column Echelon Form
Like before, let's try to combine this with Transpose to make it all work with columns rather than rows.$$\text{rref}\left[ A^\intercal \, \vert \mathbf 1_m \right]^\intercal = \left[ \frac{\text{rcef}(A)}{B}\right]$$ In this context, $\left[ \frac{A}{B}\right]$ does not represent a fraction, but a Vertically Augmented Matrix which is also called Column-Augmented Matrix. This occurs because of the Transpose at the end.
How would we use this to find Kernel?
In $\left[ \frac{\text{rcef}(A)}{B}\right]$, for all Zero-Columns of $\text{rcef}(A)$, the corresponding Column of $B$ will be in the Kernel of $A$.
For example:
$$A = \begin{bmatrix}
1&1&1 \\
1&2&3 \\
3&4&5
\end{bmatrix} \\
\begin{align*}
\text{rref}\left[ A^\intercal \vert \mathbf 1_3\right]^\intercal &=
\text{rref}\left[ \begin{bmatrix}
1&1&1 \\
1&2&3 \\
3&4&5
\end{bmatrix}^\intercal \vert \mathbf 1_3
\right] ^\intercal\\
&= \text{rref}\left[ \begin{bmatrix}
1&1&3 \\
1&2&4 \\
1&3&5
\end{bmatrix} \vert \begin{bmatrix}
1&0&0 \\
0&1&0\\
0&0&1
\end{bmatrix}
\right] ^\intercal \\
&\text{Transposing the Matrix switches Columns with Rows} \\
&= \text{rref} \left[ \begin{array}{ccc|ccc}
1&1&3&1&0&0 \\
1&2&4&0&1&0 \\
1&3&5&0&0&1
\end{array} \right]^\intercal \\
&\text{Augmenting Identity allow us to "store" orders of operations of RREF} \\
&= \left[
\begin{array}{ccc|ccc}
1&0&2&0&3&-2\\
0&1&1&0&-1&1\\
0&0&0&1&-2&1
\end{array}
\right]^\intercal \\
&\text{Notice the Zero Row on Third Row.} \\
&\text{The Corresponding Row in the Augmented Matrix tells us the} \\
&\text{combinations of Columns of $A$ that yields $\vec 0$.} \\
&=
\begin{bmatrix}
1&0&0\\
0&1&0\\
2&1&0\\
\hline
0&0&1 \\
3&-1&-2\\
-2&1&1
\end{bmatrix} \\
&\text{Transpose returns Columns to its original places}
\end{align*} \\
$$The corresponding Column to the Zero-Column of $\text{rcef}(A)$ describes the Span of the Kernel.
$$\begin{bmatrix}
1&1&1 \\
1&2&3 \\
3&4&5
\end{bmatrix}\begin{bmatrix}
1\\
-2\\
1
\end{bmatrix}= \vec 0
$$
In this case, only the third column describes the Span.
$$\ker(A) = \text{span}(\begin{bmatrix}
1\\
-2\\
1
\end{bmatrix})$$
Conveniently, in trying to find the Kernel with RCEF, we also can simultaneously solve for the Image by looking at Non-Zero Columns of RCEF:
$$\text{Im}(A) = \text{span}(\begin{bmatrix}
1\\0\\2
\end{bmatrix}, \begin{bmatrix}
0\\1\\1
\end{bmatrix}
)$$
Why does this work?
Let's look at what each component of the equation actually does:
- $A^\intercal$ transposes the Columns with Rows, allowing Gaussian Elimination to eliminate Redundant Columns.
- $\mathbf 1 _m$ will remember the combinations and steps taken by the Gaussian Elimination, allowing find the Basis for Kernels.
- $\text{rref}$ operates on the Columns and eliminates any Redundant Vectors. It makes sure all resulting Columns are Independent. Combinations that lead to $\vec 0$ are stored in the Identity Matrix.
- $^\intercal$ transposes the Columns back to its original orientation.
We can thus conclude $$ \text{rref}(A^\intercal \vert \mathbf 1_m)^\intercal = \left[ \frac{\text{rcef}(A)}{B}\right] = \left[ \frac{ \begin{bmatrix} \vec r_1 & \dots & \vec r_m \end{bmatrix} }{ \begin{bmatrix} \vec b_1 & \dots & \vec b_m \end{bmatrix} } \right] \\
\Rightarrow \\
\ker(A) = \text{span}( \dots \vec b_i, \dots), \, \vec r_i = \vec 0$$
Dimension
Dimension can be thought of as minimum number of vectors needed to span the entire Sub-Space.They are useful because they tell us at one glance what the Sub-Space is like:
To span a Line, you will need only one Independent Vector.
To span a 2D Plane, you will need 2 Independent Vectors.
To span a 3D Space, you need 3 Independent Vectors.
And so on.
Now that we have found a easy way of solving for Kernel and Image, solving for the Dimension of those Sub-Spaces become extremely easy.
We will again use RCEF to make this problem quick and simple.
Because we know that the Non-Zero Columns of $\text{rcef}(A)$ will span the Image and because we know those columns are guaranteed to be Independent, the Dimension of Image will just be the number of Non-Zero Columns of RCEF:
$$\dim(\text{Im}(A)) = \text{Number of Non-Zero Columns of $\text{rcef}(A)$} = \text{rank}(\text{rcef}(A))$$
The Zero-Columns of RCEF implies that there is a Redundant Column in $A$. To solve for the Basis of Kernel, we have to use Augmented RCEF; however, since we don't need to know the actual basis themselves to find the Dimension, we can conclude that Dimension of Kernel is simple number of Zero-Columns of RCEF.
$$\dim(\ker(A)) = \text{Number of Zero-Columns of $\text{rcef}(A)$} = m-\text{rank}(\text{rcef}(A))$$
These two facts naturally follow to show the Rank-Nullity Theorem
Rank-Nullity Theorem
$$\dim(\text{Im}(A)) + \dim(\ker(A)) = \text{Number of Columns of $A$}=m$$ This Theorem is often used to find the dimension of either Kernel of the Image when solving for one of them seems more difficult.However, using our new method with RCEF, Rank-Nullity Theorem merely becomes an obvious consequence of solving for the Basis of Kernel and Images simultaneously.
Practical Usages
Though I highly recommend you follow all the operation properly, your calculators may not have operations for Transposing or Augmenting. In case you are using TI N-Spire CX CAS like me, you will find both of these operations atmenu->Matrix->Transpose for Transpose
menu->Matrix->Create->Augment for Augmenting
menu->Matrix->Create->Identity for easily creating an Identity Matrix of given size.
You command will thus look like $$ \text{rref}(\text{augment}(A^\intercal , \text{identity}(m)))^\intercal$$where you will input the Matrix $A$ and its Column $m$ yourself.
But if you find yourself too lazy to type the whole line, it is fine to Transpose the Matrix yourself and only use RREF command.
For example, if you want to analyze the Image, Kernel, and their Dimensions of $$\begin{bmatrix}
1&2&3&4\\
0&1&2&3\\
1&1&1&1
\end{bmatrix}$$ you can simply create a $3\times 6$ Matrix and apply RREF:
$$
\text{rref}(\begin{bmatrix}
1&0&1&1&0&0&0\\
2&1&1&0&1&0&0\\
3&2&1&0&0&1&0\\
4&2&1&0&0&0&1
\end{bmatrix}) = \begin{bmatrix}
1&0&1&0&0&3&-2\\
0&1&-1&0&0&-4&3\\
0&0&0&1&0&-3&2\\
0&0&0&0&1&-2&1
\end{bmatrix}
$$ When checking for Zero/Non-Zero Columns, we need to look at the first 3 Columns (because original Matrix was in $3\times 4$).
The first two Columns are Non-Zero, so we know they form the Image. We write manually Transpose them back:
$$\text{Im}(A) = \text{span}(\begin{bmatrix}
1\\0\\1
\end{bmatrix}, \begin{bmatrix}
0\\1\\-1
\end{bmatrix})$$ Then we read that the third and fourth are all Zeroes, thus we know their Corresponding Rows form the Kernels. Again we manually turn them to Column Vectors:
$$\ker(A) = \text{span}(\begin{bmatrix}1\\0\\-3\\2
\end{bmatrix},\begin{bmatrix}
0\\1\\-2\\1
\end{bmatrix})$$
Try these following exercises either using the proper operations or the short-cut:
Exercises
Using the concepts discussed in this article, try to find the Image, Kernel, and their Dimensions of the following Matrices:$$
\begin{align*}
1.& \begin{bmatrix}
1&2\\
3&4
\end{bmatrix} \\
2.& \begin{bmatrix}
0&0\\
0&0
\end{bmatrix} \\
3.& \begin{bmatrix}
1&1&1\\
1&2&3\\
1&3&5
\end{bmatrix} \\
3. & \begin{bmatrix}
1&2&2&-5&6\\
-1&-2&-1&1&-1\\
4&8&5&-8&9\\
3&6&1&5&-7
\end{bmatrix}
\end{align*}$$Try using RCEF and also without using RCEF.
Notice that the Basis you find using RCEF may differ from ones you find from Basis you find using other methods, but this is only because RCEF will always contain a Peaking One.
Let's try the exercise 1.
We can easily find that
$\text{rcef}(A) = \begin{bmatrix}
1&0\\
0&1
\end{bmatrix}$ and so conclude that
$\text{Im}(A) = \text{span}(\begin{bmatrix}
1\\0
\end{bmatrix}, \begin{bmatrix}
0\\1
\end{bmatrix})$.
But without using RCEF, you could have noticed that $\begin{bmatrix}
1\\3
\end{bmatrix}$ and $\begin{bmatrix}
2\\4
\end{bmatrix}$ are already Independent from each other and so concluded that
$\text{Im}(A) = \text{span}(\begin{bmatrix}
1\\3
\end{bmatrix},\begin{bmatrix}
2\\4
\end{bmatrix}
)$
Both of them are correct representation of same Sub-Space of 2 Dimensions, only written differently.
$$\text{Im}(A) = \text{span}(\begin{bmatrix}
1\\0
\end{bmatrix}, \begin{bmatrix}
0\\1
\end{bmatrix}) = \text{span}(\begin{bmatrix}
1\\3
\end{bmatrix},\begin{bmatrix}
2\\4
\end{bmatrix}
) = \mathbb R^2 $$ You will notice similar minor differences when solving for Kernels as well, but again they are merely different representations of same Sub-Space.
Comments
Post a Comment