01 Row Operations and Gauss's Method
There are a lot of new ideas in the first reading: linear system, row operations, Guass's method, echelon form, matrix, augmented matrix, vector, leading variable, free variable, solution set. The detail can be overwhelming if you let it. The point of this reading guide is to pull out the most important ideas, express them as simply as possible, and give basic examples. Sometimes, I will add my own perspective or additional information. The reading guides for the course are not a substitute for the full reading assignments, but they should help you focus your reading on what is most important for the course.
The first reading for the course is Chapter 1, Sections I.1 and I.2. These sections cover a general technique, Gauss's method, for solving systems of linear equations. Although Guass's method can be applied directly to equations, it is usually applied to an augmented matrix that holds the coefficients and constant terms from the equations. For example, here is a linear system of equations: $$\begin{matrix} x &+ 3y &- 2z &= &1\\ & y &- z &= &-1\\ x &- y &+ z &= &2 \end{matrix}$$ and here is the augmented matrix that corresponds to this system: $$\left(\begin{array}{ccc|c} 1 & 3 & -2 & 1\\ 0 & 1 & -1 & -1\\ 1 & -1 & 1 & 2 \end{array}\right)$$ Note that to get the matrix, the variables in the equations must be listed in each equation in the same order, and when a variable is missing from a given equation, the corresponding value in the matrix is zero. The constant terms are separated from the coefficients by a vertical line. (Later, we will see that when all of the constant terms are zero, they are often left out of the matrix.)
Guass's method uses three types of row operation: swapping two rows, multiplying a row by a non-zero constant, and adding a multiple of one row to another row. (When applied to a linear system, these row operations are manipulating the equations that make up the system.) The textbook uses specific notations for these operations, which you should also use:
- $\rho_i \leftrightarrow \rho_j$ — swap the $i$-th row with the $j$-th row;
- $k\rho_i$ — multiply row $i$ by the constant $k$, where $k$ cannot be zero;
- $\rho_i+k\rho_j$ — add $k$ times the $j$-th row to row $i$, where $k$ is a constant.
The point is that applying any sequence of row operations to a linear system gives a new linear system that has exactly the same set of solutions as the original system. Gauss's method uses the row operations to transform a matrix (or system of equations) into echelon form. Here is Gauss's method applied to the above system: $$\begin{align*} \begin{matrix} x &+ 3y &- 2z &= &1\\ & y &- z &= &-1\\ x &- y &+ z &= &2 \end{matrix} &\rowop{-\rho_1 + \rho_3} \begin{matrix} x &+ 3y &- 2z &= &1\\ & y &- z &= &-1\\ &- 4y &+ 3z &= &1 \end{matrix} \\[20pt] &\rowop{4\rho_2 + \rho_3} \begin{matrix} x &+ 3y &- 2z &= &1\\ & y &- z &= &-1\\ & &- z &= &-3 \end{matrix} \end{align*}$$ and here it is applied to the matrix representing the same system of equations: $$\begin{align*} \left(\begin{array}{ccc|c} 1 & 3 &-2 & 1\\ 0 & 1 &-1 &-1\\ 1 &-1 & 1 & 2 \end{array}\right) &\rowop{-\rho_1 + \rho_3} \left(\begin{array}{ccc|c} 1 & 3 &-2& 1\\ 0 & 1 &-1& -1\\ 0 &-4 & 3& 1 \end{array}\right) \\[20pt] &\rowop{4\rho_2 + \rho_3} \left(\begin{array}{ccc|c} 1 & 3 &-2 & 1\\ 0 & 1 &-1 &-1\\ 0 & 0 &-1 &-3 \end{array}\right) \end{align*}$$ In echelon form, each row that contains non-zero coefficients contains at least one more leading zero than the preceding row, and any rows that contain only zeros are on the bottom. In a system that has been reduced to echelon form, the variables from the equations are divided into leading variables and free variables. Any row in which non all of the coefficients are zero has a leading variable, namely the variable whose coefficient is the first non-zero coefficient in that row. A free variable is a variable that is not the leading coefficient in any row. In the above example, all of the variables are leading variables, and there are no free variables. Here is a system in echelon form in which $x_1$, $x_3$, and $x_4$ are leading variables, leaving $x_2$ and $x_5$ as the free variables: $$\begin{matrix} x_1 & +3x_2 & -2x_3 & +x_4 & -x_5 & = & -2\\ & & x_3 & -x_4 & +4x_5 & = & 1\\ & & & 2x_4 & +4x_5 & = & 4 \end{matrix}$$
It is easy to solve a system that has been reduced to echelon form. First, any row that contains only zeros can be ignored. If there is a row in which all of the coefficients are zero but the constant term is non-zero, then there is no solution. Otherwise, if there are no free variables, then there is exactly on solution, and if there is at least one free free variable, there are infinitely many solutions. The solutions can be found by working backwards from the last (non-zero) equation to the first.
For the first example, the final equation, $-z=-3$, means that $z=3$. Substituting this into the equation $y-z=-1$ gives $y-3=-1$, which means that $y=2$. And substituting the values for $y$ and $z$ into $x+3y-2z=1$ gives $x+3\cdot 2-2\cdot 3=1$, which reduces to $x=1$. The solution can be written as a column vector $$\begin{pmatrix} x\\ y\\ z\end{pmatrix}= \begin{pmatrix} 1\\ 2\\ 3 \end{pmatrix}$$
For the second example, the free variables can be assigned arbitrary values, which I will write here as $a$ and $b$, and we can solve for the other variables in terms of the values of the free variables: $$\begin{align*} x_5 &= a\\ x_4 &= \mbox{$\frac12$}(4-4x_5)= 2-2a\\ x_3 &= 1+x_4-4x_5 = 1 + (2-2a)-4a = 3-6a\\ x_2 &= b\\ x_1 &= -2-3x_2+2x_3-x_4+x_5 = -2 -3b +2(3-6a) -(2-2a) + a = 2-9a-3b \end{align*}$$ Note that the book might use the variables names $x_5$ and $x_2$, instead of $a$ and $b$, in these formulas. I prefer to use $a$ and $b$ to make it clear that in the solution, they represent arbitrary values rather than unknown values. We can then write the set of solutions as $$\{ \begin{pmatrix} 2-9a-3b\\ b\\ 3-6a\\ 2-2a\\ a \end{pmatrix} \;\; | \;\; a\in\R, b\in\R \}$$ or, using vector operations, as $$\{ \begin{pmatrix} 2\\ 0 \\ 3\\ 2 \\0\end{pmatrix} + \begin{pmatrix} -9\\ 0 \\-6 \\-2\\ 0\end{pmatrix}a + \begin{pmatrix} -3\\ 1\\ 0\\ 0\\ 0\end{pmatrix}b \;\; | \;\; a\in\R, b\in\R \}$$ You will learn more about vectors and vector operations in a later section.
Here is a 25-minute video that discusses how to prove that applying a row operation to a system of linear equations does not change the set of solutions of the system. It also spends some time talking about proof in general.
Video: Row Operations preserve solution set