Ken Levasseur, Al Doerr, Michiel Smid, Oscar Levin, Charles M. Grinstead, J. Laurie Snell, Eric Lehman, F. Thomson Leighton, Albert R Meyer, Jeff Erickson, Kenneth P. Bogart, Carol Chritchlow, David Eck, OpenDSA Project, L.J. Miller
In this section we will learn how to solve multiple linear equations with multiple unknowns using matrices.
Subsection11.1.1Elementary Operations
Consider the following example.
Example11.1.1.Solution Set.
Find \(x\) and \(y\) such that
\begin{equation*}
x + y = 7 \text{ and } 2x - y = 8.
\end{equation*}
The set of ordered pairs, (x, y) which solve both equations is called the solution set.
You can verify that \((x, y) = (5, 2)\) is a solution to the above system. The interesting question is this: If you were not given this information to verify, how could you determine the solution? You can do this by using the following basic operations on the equations, none of which change the set of solutions of the system of equations.
Definition11.1.2.Elementary Operations.
Elementary operations are those operations consisting of the following.
Interchange the order in which the equations are listed.
Multiply any equation by a nonzero number.
Replace any equation with itself added to a multiple of another equation.
Example11.1.3.
To illustrate the third of these operations on this particular system, consider the following.
\begin{equation*}
x + y = 7
\end{equation*}
\begin{equation*}
2x - y = 8
\end{equation*}
The system has the same solution set as the system
\begin{equation*}
x + y = 7
\end{equation*}
\begin{equation*}
-3y = -6.
\end{equation*}
To obtain the second system, take the second equation of the first system and add \(-2\) times the first equation to obtain \(-3y = -6\text{.}\) Now, this clearly shows that \(y = 2\) and so it follows from the other equation that \(x + 2 = 7\) and so \(x = 5\text{.}\)
Of course a linear system may involve many equations and many variables. The solution set is still the collection of solutions to the equations. In every case, the above operations of Definition 11.1.2 do not change the set of solutions to the system of linear equations.
Theorem11.1.4.
Suppose you have two equations, involving the variables,
where \(E_1\) and \(E_2\) are expressions involving the variables and \(f_1\) and \(f_2\) are constants. (In the above example there are only two variables, \(x\) and \(y\) and \(E_1 = x
+ y\) while \(E_2
= 2x - y\text{.}\)) Then the system \(E_1 = f_1, E_2 = f_2\) has the same solution set as
Also the system \(E_1
=
f_1 , E_2 = f_2\) has the same solutions as the system, \(E_2 = f_2 , E_1
=
f_1\text{.}\) The system \(E_1 = f_1 , E_2 = f_2\) has the same solution as the system \(E_1 = f_1, aE_2 = a f_2\) provided \(a \neq 0\text{.}\)
Proof.
If \((x_1, · · · , x_n)\) solves \(E_1 = f_1, E_2 = f_2\) then it solves the first equation in \(E_1 =
f_1, E_2 + aE_1 = f_2 + a f_1 \text{.}\) Also, it satisfies \(aE_1 = a f_1\) and so, since it also solves \(E_2 = f_2\) it must solve \(E_2 + aE_1 = f_2
+ a
f_1 \text{.}\) Therefore, if \((x_1, · · · , x_n )\) solves \(E_1 = f_1, E_2
= f_2\) it must also solve\(E_2 + aE_1 = f_2 + a f_1\text{.}\) On the other hand, if it solves the system \(E_1 = f_1\) and \(E_2 + aE_1 = f_2 + a f_1\) , then \(aE_1 = a
f_1\) and so you can subtract these equal quantities from both sides of \(E_2
+aE_1 = f_2 +a f_1\) to obtain \(E_2 = f_2\) showing that it satisfies \(E_1
=
f_1 , E_2 = f_2\) .
The second assertion of the theorem which says that the system \(E_1 = f_1 , E_2
= f_2\) has the same solution as the system, \(E_2
= f_2 , E_1 = f_1\) is seen to be true because it involves nothing more than listing the two equations in a different order. They are the same equations.
The third assertion of the theorem which says \(E_1 = f_1 , E_2 = f_2\) has the same solution as the system \(E_1 = f_1, aE_2 = a f_2\) provided \(a\neq
0\) is verified as follows: If \((x_1 , · · · , x_n )\) is a solution of \(E_1
= f_1 ,
E_2 = f_2\) , then it is a solution to \(E_1 = f_1 , aE_2 = a f_2\) because the second system only involves multiplying the equation, \(E_2 = f_2\) by \(a\text{.}\) If \((x_1, · · · , x_n)\) is a solution of \(E_1 = f_1 , aE_2 =
a f_2\text{,}\) then upon multiplying \(aE_2 = a f_2\) by the number \(1/a\text{,}\) you find that \(E_2
= f_2. \Box\)
Stated simply, the above theorem shows that the elementary operations do not change the solution set of a system of equations.
Here is an example in which there are three equations and three variables. You want to find values for \(x, y, z\) such that each of the given equations are satisfied when these values are plugged in to the equations.
Example11.1.5.
Find the solutions to the system,
\begin{equation*}
x + 3y + 6z = 25
\end{equation*}
Now take \((-2)\) times the second and add to the third. More precisely, replace the third equation with \((-2)\) times the second added to the third. This yields the system
\begin{align*}
x + 3y + 6z \amp = 25 \\
y + 2z \amp = 8 \\
z \amp = 3
\end{align*}
At this point, you can tell what the solution is. This system has the same solution as the original system and in the above, \(z = 3\text{.}\) Then using this in the second equation, it follows \(y + 6 = 8\) and so \(y = 2\text{.}\) Now using this in the top equation yields \(x + 6 + 18 = 25\) and so \(x = 1\text{.}\) This process is called back substitution.
Alternatively, you could have continued as follows. Add \((-2)\) times the bottom equation to the middle and then add \((-6)\) times the bottom to the top. This yields
\begin{equation*}
x + 3y = 7, y = 2, z = 3
\end{equation*}
Now add \((-3)\) times the second to the top. This yields
\begin{equation*}
x = 1, y = 2, z = 3,
\end{equation*}
a system which has the same solution set as the original system. This avoided back substitution and led to the same solution set.
Subsection11.1.2Gauss Elimination
A less cumbersome way to represent a linear system is to write it as an augmented matrix. For example the linear system in Example 11.1.5 can be written as
It has exactly the same information as the original system but here it is understood there is an \(x\) column \(\left( \begin{array}{c}
1 \\
2 \\
0 \\
\end{array} \right)
\text{,}\) a \(y\) column, \(\left( \begin{array}{c}
3 \\
7 \\
2 \\
\end{array} \right)
\text{,}\) and a \(z\) column, \(\left( \begin{array}{c}
6 \\
14 \\
5 \\
\end{array} \right)
\text{.}\) The rows correspond to the equations in the system. Thus the top row in the augmented matrix corresponds to the equation,
\begin{equation*}
x + 3y + 6z = 25\text{.}
\end{equation*}
Now when you replace an equation with a multiple of another equation added to itself, you are just taking a row of this augmented matrix and replacing it with a multiple of another row added to it. Thus the first step in solving Example 11.1.5 would be to take \((-2)\) times the first row of the augmented matrix above and add it to the second row,
\begin{align*}
x + 3y + 6z \amp = 25 \\
y + 2z \amp = 8 \\
z \amp = 3
\end{align*}
which is the same as the second step in solving Example 11.1.5. By back substitution you obtain the solution \(x = 1\text{,}\)\(y = 6\text{,}\) and \(z = 3\text{.}\)
In general a linear system is of the form
\begin{equation*}
\begin{array}{c c c c c c}
a_{11}x_1 + &\dots & + a_{1n}x_n & = b_1 \\
&\vdots & & \vdots \\
a_{m1}x_1 + &\dots & + a_{mn}X_n & = b_m \\
\end{array}
\end{equation*}
where the \(x_i\) are variables and the \(a_{ij}\) and \(b_i\) are constants. This system can be represented by the augmented matrix
Changes to a system of equations as a result of an elementary operation translate into changes of an augmented matrix resulting from a row operation. Note that Theorem 11.1.4 implies that the row operations deliver an augmented matrix for a system of equations which has the same solution set as the original system.
Definition11.1.6.Row Operations.
The row operations consist of the following
Switch two rows.
Multiply a row by a nonzero number.
Replace a row by a multiple of another row added to it.
Gauss elimination is a systematic procedure to simplify an augmented matrix to a reduced form. In the following definition, the term “leading entry” refers to the first nonzero entry of a row when scanning the row from left to right.
Definition11.1.7.Echelon Form.
An augmented matrix is in echelon form if
All nonzero rows are above any rows of zeros.
Each leading entry of a row is in a column to the right of the leading entries of any rows above it.
How do you know when to stop doing row operations? You might stop when you have obtained an echelon form as described above, but you certainly should stop doing row operations if you have gotten a matrix in row reduced echelon form described next.
Definition11.1.8.Row Reduced Echelon Form.
An augmented matrix is in row reduced echelon form if
All nonzero rows are above any rows of zeros.
Each leading entry of a row is in a column to the right of the leading entries of any rows above it.
All entries in a column above and below a leading entry are zero.
Each leading entry is a 1, the only nonzero entry in its column.
Example11.1.9.Matrices in Row Reduced Echelon Form.
Here are some matrices which are in row reduced echelon form.
Definition11.1.12.Pivot Positions and Pivot Columns.
A pivot position in a matrix is the location of a leading entry in an echelon form resulting from the application of row operations to the matrix. A pivot column is a column that contains a pivot position.
For example consider the following.
Example11.1.13.Putting a Matrix into Row Reduced Echelon Form.
which is in echelon form, although not in reduced echelon form. Therefore, the pivot positions in the original matrix are the locations corresponding to the first row and first column and the second row and second columns as shown in the following:
Thus the pivot columns in the matrix are the first two columns.
The following is the algorithm for obtaining a matrix which is in row reduced echelon form.
Algorithm11.1.14.
This algorithm tells how to start with a matrix and do row operations on it in such a way as to end up with a matrix in row reduced echelon form.
Find the first nonzero column from the left. This is the first pivot column. The position at the top of the first pivot column is the first pivot position. Switch rows if necessary to place a nonzero number in the first pivot position.
Use row operations to zero out the entries below the first pivot position.
Ignore the row containing the most recent pivot position identified and the rows above it. Repeat steps 1 and 2 to the remaining sub-matrix, the rectangular array of numbers obtained from the original matrix by deleting the rows you just ignored. Repeat the process until there are no more rows to modify. The matrix will then be in echelon form.
Moving from right to left, use the nonzero elements in the pivot positions to zero out the elements in the pivot columns which are above the pivots.
Divide each nonzero row by the value of the leading entry. The result will be a matrix in row reduced echelon form.
This row reduction procedure applies to both augmented matrices and non augmented matrices. There is nothing special about the augmented column with respect to the row reduction procedure.
Do row reductions till you obtain a matrix in echelon form. Then complete the process by producing one in row reduced echelon form.
Solution.
The pivot column is the second. Hence the pivot position is the one in the first row and second column. Switch the first two rows to obtain a nonzero entry in this pivot position.
Step two is not necessary because all the entries below the first pivot position in the resulting matrix are zero. Now ignore the top row and the columns to the left of this first pivot position. Thus you apply the same operations to the smaller matrix
The next pivot column is the third corresponding to the first in this smaller matrix and the second pivot position is therefore, the one which is in the second row and third column.
In this case it is not necessary to switch any rows to place a nonzero entry in this position because there is already a nonzero entry there. Multiply the third row of the original matrix by -2 and then add the second row to it. This yields
The first pivot column is the first column in this case and no switching of rows is necessary because there is a nonzero entry in the first pivot position. Therefore, the algorithm yields for the next step
There is only one column and it is nonzero so this single column is the pivot column. Therefore, the algorithm yields the following matrix for the echelon form.
The above algorithm is the way a computer would obtain a reduced echelon form for a given matrix. It is not necessary for you to pretend you are a computer but if you like to do so, the algorithm described above will work. The main idea is to do row operations in such a way as to end up with a matrix in echelon form or row reduced echelon form because when this has been done, the resulting augmented matrix will allow you to describe the solutions to the linear system of equations in a meaningful way. When you do row operations until you obtain row reduced echelon form, the process is called the Gauss Jordan method. Otherwise, it is called Gauss elimination.
Example11.1.16.
Give the complete solution to the system of equations, \(5x+10y-7z = -2\text{,}\)\(2x
+ 4y - 3z = -1\text{,}\) and \(3x + 6y + 5z = 9\text{.}\)
Multiply the second row by 2, the first row by 5, and then take (-1) times the first row and add to the second. Then multiply the first row by 1/5. This yields
a row of zeros with -2 at the right end, representing the equation 0x + 0y + 0z = -2 which has no solution so there is no solution to this system of equations. When this happens, the system is called inconsistent. In this case it is very easy to describe the solution set. The system has no solution.
Here is another example based on the use of row operations.
Example11.1.17.
Give the complete solution to the system of equations, \(3x - y - 5z = 9\text{,}\)\(y
- 10z = 0\text{,}\) and \(-2x + y = -6\text{.}\)
This is in reduced echelon form. The equations corresponding to this reduced echelon form are \(y = 10z\) and \(x = 3 + 5z\text{.}\) Apparently \(z\) can equal any number. Lets call this number \(t\text{.}\) Therefore, the solution set of this system is \(x = 3 + 5t\text{,}\)\(y = 10t\text{,}\) and \(z = t\) where \(t\) is completely arbitrary. The system has an infinite set of solutions which are given in the above simple way. This is what it is all about, finding the solutions to the system.
There is some terminology connected to this which is useful. Recall how each column corresponds to a variable in the original system of equations. The variables corresponding to a pivot column are called basic variables. The other variables are called free variables. In Example 11.1.17 there was one free variable, \(z\text{,}\) and two basic variables, \(x\) and \(y\text{.}\) In de- scribing the solution to the system of equations, the free variables are assigned a parameter. In Example 11.1.17 this parameter was \(t\text{.}\) Sometimes there are many free variables and in these cases, you need to use many parameters. Here is another example.
Example11.1.18.
Find the solution to the system
\begin{equation*}
x + 2y - z + w = 3
\end{equation*}
\begin{equation*}
x + y - z + w = 1
\end{equation*}
\begin{equation*}
x + 3y - z + w = 5
\end{equation*}
This matrix is in echelon form and you see the basic variables are \(x\) and \(y\) while the free variables are \(z\) and \(w\text{.}\) Assign \(s\) to \(z\) and \(t\) to \(w\text{.}\) Then the second row yields the equation, \(y = 2\) while the top equation yields the equation, \(x + 2y
- s + t = 3\) and so since \(y = 2\text{,}\) this gives \(x + 4 - s + t = 3\) showing that \(x = -1 + s - t\text{,}\)\(y = 2\text{,}\)\(z = s\text{,}\) and \(w = t\text{.}\) One can write this in the form
This is another example of a system which has an infinite solution set but this time the solution set depends on two parameters, not one.
Most people find it less confusing in the case of an infinite solution set to first place the augmented matrix in row reduced echelon form rather than just echelon form before seeking to write down the description of the solution. In the above, this means we don’t stop with the echelon form above. Instead we first place it in reduced echelon form as follows.
Then the solution is \(y = 2\) from the second row and \(x
= -1 + z - w\) from the first. Thus letting \(z = s\) and \(w = t\text{,}\) the solution as is given previously.
The number of free variables is always equal to the number of different parameters used to describe the solution. If there are no free variables, then either there is no solution as in the case where row operations yield an echelon form like
There are a lot of cases to consider but it is not necessary to make a major production of this. Do row operations till you obtain a matrix in echelon form or reduced echelon form and determine whether there is a solution. If there is, see if there are free variables. In this case, there will be infinitely many solutions. Find them by assigning different parameters to the free variables and obtain the solution. If there are no free variables, then there will be a unique solution which is easily determined once the augmented matrix is in echelon or row reduced echelon form. In every case, the process yields a straightforward way to describe the solutions to the linear system. As indicated above, you are probably less likely to become confused if you place the augmented matrix in row reduced echelon form rather than just echelon form.
In summary,
Definition11.1.19.System of Linear Equations.
A system of linear equations is a list of equations,
where \(a_{ij} \) are numbers, and \(b_j\) is a number. The above is a system of \(m\) equations in the \(n\) variables, \(x_1 , x_2, \dots, x_n\) . Nothing is said about the relative size of \(m\) and \(n\text{.}\) Written more simply in terms of summation notation, the above can be written in the form
\begin{equation*}
\sum_{j=1}^n a_{ij}x_j = f_i, i = 1,2,3,\dots,m
\end{equation*}
It is desired to find\((x_1 ,\dots , x_n )\) solving each of the equations listed.
As illustrated above, such a system of linear equations may have a unique solution, no solution, or infinitely many solutions and these are the only three cases which can occur for any linear system. Furthermore, you do exactly the same things to solve any linear system. You write the augmented matrix and do row operations until you get a simpler system in which it is possible to see the solution, usually obtaining a matrix in echelon or reduced echelon form. All is based on the observation that the row operations do not change the solution set. You can have more equations than variables, fewer equations than variables, etc. It doesn’t matter. You always set up the augmented matrix and go to work on it.
Definition11.1.20.
A system of linear equations is called consistent if there exists a solution. It is called inconsistent if there is no solution.
These are reasonable words to describe the situations of having or not having a solution. If you think of each equation as a condition which must be satisfied by the variables, con- sistent would mean there is some choice of variables which can satisfy all the conditions. Inconsistent would mean there is no choice of the variables which can satisfy each of the conditions.
Exercises11.1.3Exercises
1.
Find the point \((x, y )\) which lies on both lines, \(x + 3y = 1\) and \(4x
- y = 3\text{.}\)
So \(y = 1/13\text{.}\) Backsubstituting: \(x + 3(1/13) = 1\text{,}\) so \(x = 10/13\text{.}\) Therefore \((x, y) \) is the point \((10/13,1/13)\text{.}\)
2.
Solve Exercise 1 graphically. That is, graph each line and see where they intersect.
Answer.
Figure11.1.21.Graph of Exercise 11.1.1
3.
Find the point of intersection of the two lines \(3x + y = 3\) and \(x + 2y
= 1\text{.}\)
4.
Solve Exercise 3 graphically. That is, graph each line and see where they intersect.
5.
Do the three lines, \(x + 2y = 1\text{,}\)\(2x - y = 1\text{,}\) and \(4x + 3y = 3\) have a common point of intersection? If so, find the point and if not, tell why they don’t have such a common point of intersection
So \(y = 1/5\text{.}\) Backsubstituting to line one: \(x + 2(1/5)
= 1\text{,}\) so \(x = 3/5\text{.}\) The common intersecting point is \((3/5, 1/5)\text{.}\)
6.
Do the three planes, \(x + y -3z = 2\text{,}\)\(2x + y + z = 1\text{,}\) and \(3x + 2y
- 2z = 0\) have a common point of intersection? If so, find one and if not, tell why there is no such point.
7.
You have a system of \(k\) equations in two variables, \(k \geq 2\text{.}\) Explain the geometric significance of
No solution.
A unique solution.
An infinite number of solutions.
Solution.
No solution. The lines are parallel.
A unique solution. The lines must intersect.
An infinite number of solutions. The lines must be the same line.
8.
If a system of equations has more equations than variables, can it have a solution? If so, give an example and if not, tell why not.
is the augmented matrix of an inconsistent matrix.
Solution.
What we want is for one of the rows to be all zeros except for the last term. If we multiply the top row by \(-3/2\) and add that to the bottom row, we get
Therefore we see that we have a problem because rows 3 and 4 are saying \(-3z=
-2\) and \(-3z = -1\text{.}\) Or we could keep going with the row operations and add \(-1 * \)row 3 to row 4: Add \(-2 * \) row 2 to row 4:
This shows that \(y\) is going to be a free variable. Assign \(s\) to \(y\text{.}\)The top row yields the equation \(7x + 14s + 15z =
22\text{,}\) since \(z = 1\text{,}\) this gives \(7x + 14s + 15 = 22\text{.}\) Solving for \(x\) we get \(x = 1 - 2s \text{,}\) so the final solution can be written: