# UNDER CONSTRUCTION

## CHAPTER ONE

Many of the problems concerning properties of matrices depend on properties of the entries, which are real numbers. The following list of properties will give you a point of reference, so that you know what can be assumed.

## Properties of Real Numbers

This is a list of some of the properties of the set of real numbers that we need in order to work with vectors and matrices. Actually, we can work with matrices whose entries come from any set that satisfies these properties, such as the set of all rational numbers or the set of all complex numbers.

1. Closure: For all real numbers a and b, the sum a + b and the product a · b are real numbers.

2. Associative laws: For all real numbers a, b, c,

a + (b + c) = (a + b) + c and a · (b · c) = (a · b) · c.

3. Commutative laws: For all real numbers a and b,

a + b = b + a and a · b = b · a.

4. Distributive laws: For all real numbers a, b, c,

a · (b + c) = a · b + a · c and (a + b) · c = a · c + b · c.

5. Identity elements: There are real numbers 0 and 1 such that for all real numbers a,

a + 0 = a and 0 + a = a, and

a · 1 = a and 1 · a = a.

6. Inverse elements: For each real number a, the equations

a + x = 0 and x + a = 0

have a solution x in the set of real numbers, called the additive inverse of a, and denoted by -a.
For each nonzero real number a, the equations

a · x = 1 and x · a = 1

have a solution x in the set of real numbers, called the multiplicative inverse of a, and denoted by a⁻¹.

Here are some additional properties of real numbers a, b, c, which can be proved from the properties listed above.

• If a + c = b + c, then a = b.
• If a · c = b · c and c is nonzero, then a = b.
• a · 0 = 0
• -(-a) = a
• (-a) · (-b) = a · b

## Solving Systems of Linear Equations

The key ideas are best introduced through an example. We want to develop a systematic method for solving systems of linear equations like the one below.
```
 3x + 2y - 5z      =  3
-2x -  y + 3z +  w =  0
 -x +  y      + 6w = 11
  x +  y - 2z +  w =  3
```
We can perform any of these operations on the system:
(1) Interchange two equations;

(2) Multiply each term of an equation by a nonzero constant;

(3) Replace an equation by adding to it a multiple of another equation.

To use the Gauss-Jordan technique (a refinement of Gaussian elimination that clears each pivot column completely), choose an equation with a coefficient of 1 in the first column. (It may be necessary to first create one, by dividing each term of one of the equations by its coefficient of x, or by adding a multiple of one of the equations to another to get the 1.) This equation is called the pivot equation, and it should be moved to the top position. Use it to eliminate the x term from the other equations.

Repeat this procedure for each of the columns. The solution given below illustrates Gauss-Jordan elimination.

```
 3x + 2y - 5z      =  3
 -x +  y      + 6w = 11
-2x -  y + 3z +  w =  0
  x +  y - 2z +  w =  3
```
~>
```
  x +  y - 2z +  w =  3
 -x +  y      + 6w = 11
-2x -  y + 3z +  w =  0
 3x + 2y - 5z      =  3
```
~>
```
  x +  y - 2z +  w =  3
      2y - 2z + 7w = 14
       y -  z + 3w =  6
      -y +  z - 3w = -6
```
~>
```
  x +  y - 2z +  w =  3
       y -  z + 3w =  6
      2y - 2z + 7w = 14
      -y +  z - 3w = -6
```
~>
```
  x      -  z - 2w = -3
       y -  z + 3w =  6
                 w =  2
```
~>
```
  x      -  z      =  1
       y -  z      =  0
                 w =  2
```

This gives us the final solution: x = z + 1, y = z, w = 2.

We do not have to write down the variables each time, provided we keep careful track of their positions. The solution using matrices to represent the system looks like this.

```
[  3   2  -5   0 |  3 ]
[ -1   1   0   6 | 11 ]
[ -2  -1   3   1 |  0 ]
[  1   1  -2   1 |  3 ]
```
~>
```
[  1   1  -2   1 |  3 ]
[ -1   1   0   6 | 11 ]
[ -2  -1   3   1 |  0 ]
[  3   2  -5   0 |  3 ]
```
~>
```
[  1   1  -2   1 |  3 ]
[  0   2  -2   7 | 14 ]
[  0   1  -1   3 |  6 ]
[  0  -1   1  -3 | -6 ]
```
~>
```
[  1   1  -2   1 |  3 ]
[  0   1  -1   3 |  6 ]
[  0   2  -2   7 | 14 ]
[  0  -1   1  -3 | -6 ]
```
~>
```
[  1   0  -1  -2 | -3 ]
[  0   1  -1   3 |  6 ]
[  0   0   0   1 |  2 ]
[  0   0   0   0 |  0 ]
```
~>
```
[  1   0  -1   0 |  1 ]
[  0   1  -1   0 |  0 ]
[  0   0   0   1 |  2 ]
[  0   0   0   0 |  0 ]
```

Finally, we put the variables back in, to get the solution: x - z = 1, y - z = 0, w = 2. This can be rewritten in the form x = z + 1, y = z, w = 2.

The answer shows that there are infinitely many solutions. Any value can be chosen for z, and then using the corresponding values for x, y, and w gives a solution.
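
The row reduction above can be carried out mechanically. Here is a minimal Gauss-Jordan sketch in Python (my own code, not from the text), using exact fractions so that no roundoff creeps in; applied to the augmented matrix of this system, it reproduces the final matrix of the hand computation.

```python
from fractions import Fraction

def rref(rows):
    """Row reduce an augmented matrix (list of lists) to reduced row echelon form."""
    rows = [[Fraction(x) for x in row] for row in rows]
    pivot_row = 0
    for col in range(len(rows[0]) - 1):        # the last column is the right-hand side
        # Find a row at or below pivot_row with a nonzero entry in this column.
        for r in range(pivot_row, len(rows)):
            if rows[r][col] != 0:
                rows[pivot_row], rows[r] = rows[r], rows[pivot_row]
                break
        else:
            continue                           # no pivot in this column
        # Scale the pivot row so the leading entry is 1.
        pivot = rows[pivot_row][col]
        rows[pivot_row] = [x / pivot for x in rows[pivot_row]]
        # Eliminate this column from every other row.
        for r in range(len(rows)):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return rows

# The augmented matrix of the example system.
system = [[ 3,  2, -5,  0,  3],
          [-2, -1,  3,  1,  0],
          [-1,  1,  0,  6, 11],
          [ 1,  1, -2,  1,  3]]
result = rref(system)
```

Running this gives the rows [1, 0, -1, 0, 1], [0, 1, -1, 0, 0], [0, 0, 0, 1, 2], [0, 0, 0, 0, 0], matching the final matrix above; since the reduced row echelon form of a matrix is unique, any correct order of row operations ends here.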

## Definition of a Real Vector Space

Definition 2.4 (page 96): A vector space is a set V on which two operations + and · are defined, called addition and scalar multiplication.

The operation + (vector addition) must satisfy the following conditions:

Closure: For all vectors u and v in V, the sum   u + v   belongs to V.

(1) Commutative law: For all vectors u and v in V,     u + v = v + u

(2) Associative law: For all vectors u, v, w in V,     u + (v + w) = (u + v) + w

(3) Additive identity: The set V contains an additive identity element, denoted by 0, such that for all vectors v in V,     0 + v = v   and   v + 0 = v.

(4) Additive inverses: For each vector v in V, the equations     v + x = 0   and   x + v = 0     have a solution x in V, called an additive inverse of v, and denoted by - v.

The operation · (scalar multiplication) must satisfy the following conditions:

Closure: For all real numbers c and all vectors v in V, the product   c · v   belongs to V.

(5) Distributive law: For all real numbers c and all vectors u, v in V,     c · (u + v) = c · u + c · v

(6) Distributive law: For all real numbers c, d and all vectors v in V,     (c+d) · v = c · v + d · v

(7) Associative law: For all real numbers c,d and all vectors v in V,     c · (d · v) = (cd) · v

(8) Unitary law: For all vectors v in V,     1 · v = v

## Subspaces

Definition 2.5 (page 103): Let V be a vector space, and let W be a subset of V. If W is a vector space with respect to the operations in V, then W is called a subspace of V.

Theorem 2.3 (page 103): Let V be a vector space, with operations + and   ·, and let W be a subset of V. Then W is a subspace of V if and only if the following conditions hold.

Sub0 W is nonempty: The zero vector belongs to W.

Sub1 Closure under +: If u and v are any vectors in W, then   u + v   is in W.

Sub2 Closure under ·: If v is any vector in W, and c is any real number, then   c · v   is in W.

## Summary of Definitions and Theorems

The summary that I have provided below is probably something that you should do on your own, to help make sense of the rather large number of facts that you must know. The definitions, theorems, and numerical algorithms are the tools you will need to solve problems.

Definition 2.6 (page 105). Let v1, v2, ..., vk be vectors in a vector space V. A vector v in V is called a linear combination of v1, v2, ..., vk if there are real numbers a1, a2, ..., ak with

v = a1 v1 + a2 v2 + ... + ak vk

Definition 2.7 (page 106). span { v1, v2, ..., vk } is the set of all linear combinations of v1, v2, ..., vk.

Theorem 2.4 (page 107). span { v1, v2, ..., vk } is a subspace.

Definition 2.8 (page 114). Vectors v1, v2, ..., vk span V if span { v1, v2, ..., vk } = V.

Definition 2.9 (page 116). The vectors v1, v2, ..., vk are linearly independent if the equation

x1 v1 + x2 v2 + ... + xk vk = 0

has only the trivial solution (all zeros).
The vectors are linearly dependent if the equation has a nontrivial solution (not all zeros).

I believe that the next theorem is the best way to think about linear dependence. I would probably use it as the definition. The definition that the author uses is the usual one, and it is the best way to check whether or not a given set of vectors is linearly independent.

Theorem 2.6 (page 121). A set of vectors is linearly dependent if and only if one of them is a linear combination of the others.

Definition 2.10 (page 125). A set of vectors is a basis for V if it is a linearly independent spanning set.

Theorem 2.7 (page 124). A set of vectors is a basis for V if and only if every vector in V can be expressed uniquely as a linear combination of the vectors in the set.

Theorem 2.8 (page 125). Any spanning set contains a basis.

Theorem 2.9 (page 129). If a vector space has a basis with n elements, then it cannot contain more than n linearly independent vectors.

Corollary 2.1 (page 129). Any two bases have the same number of elements.

Definition 2.11 (page 129). If a vector space V has a finite basis, then the number of vectors in the basis is called the dimension of V.

Corollaries 2.2 - 2.5 (page 130). dim (V) is the maximum number of linearly independent vectors in V, and it is also the minimum number of vectors in any spanning set.

Theorem 2.10 (page 131). Any linearly independent set can be expanded to a basis.

Theorem 2.11 (page 132). If dim (V) = n, and you have a set of n vectors, then to check that it forms a basis you only need to check one of the two conditions (spanning and linear independence).

Theorem 2.16 (page 160). Row equivalent matrices have the same row space.

Definition 2.15 (page 163). The row rank of a matrix is the dimension of its row space. The column rank of a matrix is the dimension of its column space.

Theorem 2.17 (page 165). For any matrix (of any size), the row rank and column rank are equal.

Theorem 2.18 (page 166). If A is any m by n matrix, then rank(A) + nullity(A) = n .
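
Theorem 2.18 is easy to check on the coefficient matrix of the worked example from the first section. This is a sketch assuming the sympy library is available; it is not part of the text.

```python
from sympy import Matrix

# Coefficient matrix of the 4-equation example system from Chapter One.
A = Matrix([[ 1,  1, -2,  1],
            [-1,  1,  0,  6],
            [-2, -1,  3,  1],
            [ 3,  2, -5,  0]])

rank = A.rank()               # dimension of the row (or column) space
nullity = len(A.nullspace())  # dimension of the solution space of Ax = 0
```

Here rank is 3 and nullity is 1, and their sum is n = 4, the number of columns.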

Theorem 2.20 (page 168). The equation Ax = b has a solution if and only if the augmented matrix has the same rank as A.

Summary of results on rank and nonsingularity (page 169)

The following conditions are equivalent for any n by n matrix A:

1. A is nonsingular
2. Ax = 0 has only the trivial solution
3. A is row equivalent to the identity
4. Ax = b always has a solution
5. A is a product of elementary matrices
6. rank(A) = n
7. nullity(A) = 0
8. the rows of A are linearly independent
9. the columns of A are linearly independent

## Summary of numerical algorithms

Procedure 1. (page 112). To test whether the vectors v1, v2, ..., vk are linearly independent or linearly dependent:

• Solve the equation x1 v1 + x2 v2 + ... + xk vk = 0.
Note: the vectors end up as columns in a matrix.
If the only solution is all zeros, then the vectors are linearly independent.
If there is a nonzero solution, then the vectors are linearly dependent.
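
Procedure 1 can be sketched in a few lines using sympy (the vectors below are my own example, not from the text):

```python
from sympy import Matrix

# Three example vectors in R^3; note that v3 = v1 + v2.
v1, v2, v3 = [1, 0, 1], [0, 1, 1], [1, 1, 2]

# As the note says, the vectors end up as the COLUMNS of a matrix.
A = Matrix.hstack(Matrix(v1), Matrix(v2), Matrix(v3))

# x1 v1 + x2 v2 + x3 v3 = 0 has only the zero solution exactly when
# the rank equals the number of vectors.
independent = (A.rank() == A.cols)
```

Here independent comes out False, and A.nullspace() would exhibit a nontrivial solution of the equation.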

Procedure 2. (page 110). To check that the vectors v1, v2, ..., vk span the subspace W:

• Show that for every vector b in W there is a solution to x1 v1 + x2 v2 + ... + xk vk = b.

Procedure 3. (page 128). To find a basis for the subspace span { v1, v2, ..., vk } by deleting vectors:

1. Construct the matrix whose columns are the coordinate vectors for the v's
2. Row reduce
3. Keep the vectors whose column contains a leading 1
The advantage of this procedure is that the answer consists of some of the vectors in the original set.
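
A sketch of Procedure 3 with sympy (the example vectors are my own): rref() returns both the reduced matrix and the indices of the columns that contain a leading 1.

```python
from sympy import Matrix

# Four example vectors in R^3: v2 = 2 v1 and v4 = v1 + v3, so we
# expect the procedure to keep v1 and v3.
vectors = [[1, 2, 1], [2, 4, 2], [0, 1, 1], [1, 3, 2]]

# Step 1: the vectors become the COLUMNS of a matrix.
A = Matrix.hstack(*[Matrix(v) for v in vectors])

# Steps 2 and 3: row reduce, then keep the vectors whose column has a leading 1.
_, pivot_columns = A.rref()
basis = [vectors[i] for i in pivot_columns]
```

The answer, [v1, v3], consists of vectors from the original set, which is exactly the advantage mentioned above.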

Procedure 4. (page 145). To find the transition matrix PS<-T:

1. Construct the matrix A whose columns are the coordinate vectors for the basis S
2. Construct the matrix B whose columns are the coordinate vectors for the basis T
3. Row reduce the matrix [ A | B ] to the form [ I | P ]
4. The matrix P is the transition matrix
The purpose of the procedure is to allow a change of coordinates [ v ]S = PS<-T [ v ]T .
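
Since row reducing [ A | B ] to [ I | P ] amounts to solving A P = B, the transition matrix can also be sketched with a single numpy call (the two bases below are my own example, not from the text):

```python
import numpy as np

# Example bases for R^2: S = {(1, 0), (1, 1)} and T = {(2, 1), (0, 1)}.
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])  # columns are the S basis vectors
B = np.array([[2.0, 0.0],
              [1.0, 1.0]])  # columns are the T basis vectors

# Row reducing [A | B] to [I | P] is the same as solving A P = B.
P = np.linalg.solve(A, B)   # the transition matrix P_{S<-T}

# Change of coordinates: [v]_S = P [v]_T.
v_T = np.array([1.0, 1.0])  # coordinates of some vector v relative to T
v_S = P @ v_T               # coordinates of the same v relative to S
```

Here v = (2, 2) in standard coordinates, and v_S comes out as (0, 2), since (2, 2) = 0·(1, 0) + 2·(1, 1).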

Procedure 5. (page 153). To find a basis for the solution space of the system A x = 0 :

1. Row reduce A
2. Identify the independent variables in the solution
3. In turn, let one of these variables be 1, and all others be 0
4. The corresponding solution vectors form a basis
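
sympy's nullspace() carries out exactly these steps. Applied to the coefficient matrix of the homogeneous version of the example system from the first section (sympy assumed available):

```python
from sympy import Matrix

# Coefficient matrix of the example system, now with right-hand sides 0.
A = Matrix([[ 1,  1, -2,  1],
            [-1,  1,  0,  6],
            [-2, -1,  3,  1],
            [ 3,  2, -5,  0]])

# Row reduce, set each free variable to 1 in turn, collect the solutions.
basis = A.nullspace()
```

The single free variable is z, and setting z = 1 gives the one basis vector (1, 1, 1, 0): that is, x = z, y = z, w = 0.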

Procedure 6. To find a simplified basis for the subspace span { v1, v2, ..., vk } :

1. Construct the matrix whose rows are the coordinate vectors for the v's
2. Row reduce
3. The nonzero rows form a basis
The advantage of this procedure is that the vectors in the basis have lots of zeros, so they are in a useful form.
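
A sketch of Procedure 6, again with sympy and my own example vectors, this time placed as rows:

```python
from sympy import Matrix

# Four example vectors spanning a 2-dimensional subspace of R^3,
# placed as the ROWS of a matrix.
A = Matrix([[1, 2, 1],
            [2, 4, 2],
            [0, 1, 1],
            [1, 3, 2]])

# Row reduce; the nonzero rows of the result form the simplified basis.
rref_matrix, _ = A.rref()
basis = [row for row in rref_matrix.tolist() if any(x != 0 for x in row)]
```

The basis comes out as (1, 0, -1) and (0, 1, 1); as promised, these vectors have lots of zeros, but they are generally not vectors from the original set.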

## CHAPTER THREE

Theorem 3.3 (Cauchy-Schwarz Inequality) If u and v are any two vectors in an inner product space V, then

(u,v)² ≤ ||u||² ||v||²

Discussion of the ideas behind the proof: This is an alternate proof that I hope you will think is better motivated than the one in the text.

The only inequality that shows up in the definition of an inner product space appears in the condition that

0 ≤ (x,x) for any vector x.

Since this seems to be the only possible tool, we need to rewrite the Cauchy-Schwarz inequality until it looks like the inner product of a vector with itself. The first thing to do is to rewrite the lengths in terms of the inner product, using the fact that (u,u) = ||u||².

(u,v)² ≤ ||u||² ||v||²

(u,v)² ≤ (u,u) (v,v)

Next, we can subtract (u,v)² from both sides of the inequality.

0 ≤ (u,u) (v,v) - (u,v) (u,v)

Now divide through by (u,u) (this assumes u is nonzero; if u = 0 the inequality is trivially true), and to simplify things let c be the quotient (u,v) / (u,u). This gives us the next inequality.

0 ≤ (v,v) - c (u,v)

Now we can factor out v, using the properties of the inner product, to get an inequality that almost looks like the inner product of a vector with itself.

0 ≤ (v-c u , v)

We can complete the proof if we show that (v-c u , v) = (v-c u , v-c u). To show this, we only need to check that

(v-c u , u) = 0,

since (v-c u , v-c u) = (v-c u , v) + (v-c u , -c u) = (v-c u , v) - c (v-c u , u).

Expanding (v-c u , u) gives (v,u) - c (u,u), and this is equal to zero because c = (u,v)/(u,u). In summary, we finally have the inequality

0 ≤ (v-c u , v-c u) = (v-c u , v) = (v,v) - c (u,v).

As we have already shown, this inequality is the same as the Cauchy-Schwarz inequality

(u,v)² ≤ ||u||² ||v||².

End of discussion

Formal proof of the Cauchy-Schwarz inequality:

If u = 0, then the inequality certainly holds, so we can assume that u is nonzero. Then (u,u) is nonzero, and so we can define c = (u,v) / (u,u). It follows from the definition of an inner product that for the vector v-c u we have

0 ≤ (v-c u , v-c u).

Computing this inner product gives

(v-c u , v-c u) = (v , v-c u) - c (u , v-c u) = (v , v-c u)

because c = (u,v) / (u,u), and therefore (u , v-c u) = (u,v) - c (u,u) = 0.

Thus we have 0 ≤ (v-c u , v) = (v,v) - c (u,v),

and adding c (u,v) to both sides gives c (u,v) ≤ (v,v).

Finally, multiplying both sides of the inequality by the positive number (u,u) gives

(u,v)² ≤ (u,u) (v,v), which is the same as the Cauchy-Schwarz inequality.

End of proof
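
A quick numeric sanity check of the key step in the proof, using the ordinary dot product on R^3 as the inner product (the example vectors are mine, not from the text):

```python
# The dot product serves as the inner product (x,y).
def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

u = [1.0, 2.0, -1.0]
v = [3.0, 0.0, 4.0]

c = inner(u, v) / inner(u, u)                     # c = (u,v) / (u,u)
residual = [vi - c * ui for vi, ui in zip(v, u)]  # the vector v - c u

orthogonality = inner(residual, u)                # (v - c u, u), should be 0

lhs = inner(u, v) ** 2                            # (u,v)^2
rhs = inner(u, u) * inner(v, v)                   # (u,u)(v,v)
```

Subtracting c u removes the component of v along u, which is why (v - c u, u) = 0; the inequality lhs ≤ rhs then follows as in the proof.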

## CHAPTER FOUR

To add to the author's notation for transition matrices, some new notation may be worthwhile.

Let L: V -> W be a linear transformation. Suppose that S and S' are bases for V, while T and T' are bases for W. We will use the notation MT<-S(L) for the matrix for L, relative to the basis S for V and T for W. If we use PS<-S' for the transition matrix which converts coordinates relative to S' into coordinates relative to S, then we have the following relationship:

MT'<-S'(L) = PT'<-T MT<-S(L) PS<-S'

Note that the above equation must be read from right to left because it involves composition of operations.

In case W=V, T=S, and T'=S', we have

MS'<-S'(L) = PS'<-S MS<-S(L) PS<-S'.

Since the transition from S to S' is the inverse of the transition from S' to S, the transition matrices are inverses of each other. With MS'<-S'(L) = B, MS<-S(L) = A, and PS<-S' = P, the equation is the one which defines similarity of matrices:

B = P⁻¹AP.
B = P-1AP.