\(\newcommand{\bvec}{{\mathbf b}} \newcommand{\cvec}{{\mathbf c}} \newcommand{\dvec}{{\mathbf d}} \newcommand{\evec}{{\mathbf e}} \newcommand{\fvec}{{\mathbf f}} \newcommand{\qvec}{{\mathbf q}} \newcommand{\uvec}{{\mathbf u}} \newcommand{\vvec}{{\mathbf v}} \newcommand{\wvec}{{\mathbf w}} \newcommand{\xvec}{{\mathbf x}} \newcommand{\yvec}{{\mathbf y}} \newcommand{\zvec}{{\mathbf y}} \newcommand{\zerovec}{{\mathbf 0}} \newcommand{\real}{{\mathbb R}} \newcommand{\twovec}[2]{\left[\begin{array}{r}#1 \\ #2 \end{array}\right]} \newcommand{\ctwovec}[2]{\left[\begin{array}{c}#1 \\ #2 \end{array}\right]} \newcommand{\threevec}[3]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \end{array}\right]} \newcommand{\cthreevec}[3]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \end{array}\right]} \newcommand{\fourvec}[4]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]} \newcommand{\cfourvec}[4]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]} \newcommand{\fivevec}[5]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]} \newcommand{\cfivevec}[5]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]} \newcommand{\mattwo}[4]{\left[\begin{array}{rr}#1 \amp #2 \\ #3 \amp #4 \\ \end{array}\right]} \renewcommand{\span}[1]{\text{Span}\{#1\}} \newcommand{\bcal}{{\cal B}} \newcommand{\ccal}{{\cal C}} \newcommand{\scal}{{\cal S}} \newcommand{\wcal}{{\cal W}} \newcommand{\ecal}{{\cal E}} \newcommand{\coords}[2]{\left\{#1\right\}_{#2}} \newcommand{\gray}[1]{\color{gray}{#1}} \newcommand{\lgray}[1]{\color{lightgray}{#1}} \newcommand{\rank}{\text{rank}} \newcommand{\col}{\text{Col}} \newcommand{\nul}{\text{Nul}} \newcommand{\lt}{<} \newcommand{\gt}{>} \newcommand{\amp}{&} \)

Section 4.3 Diagonalization, similarity, and powers of a matrix

The first example we considered in this chapter was the matrix \(A=\left[\begin{array}{rr} 1 \amp 2 \\ 2 \amp 1 \\ \end{array}\right] \text{,}\) which has eigenvectors \(\vvec_1=\twovec{1}{1}\) and \(\vvec_2 = \twovec{-1}{1}\) and associated eigenvalues \(\lambda_1=3\) and \(\lambda_2=-1\text{.}\) In Subsection 4.1.2, we described how \(A\) is, in some sense, equivalent to the diagonal matrix \(D = \left[\begin{array}{rr} 3 \amp 0 \\ 0 \amp -1\\ \end{array}\right] \text{.}\)

This equivalence is summarized by Figure 4.3.1. The diagonal matrix \(D\) has the geometric effect of stretching vectors horizontally by a factor of \(3\) and flipping vectors vertically. The matrix \(A\) has the geometric effect of stretching vectors by a factor of \(3\) in the direction \(\vvec_1\) and flipping them in the direction of \(\vvec_2\text{.}\) The geometric effect of \(A\) is the same as that of \(D\) when viewed in a basis of eigenvectors of \(A\text{.}\)


Figure 4.3.1 The matrix \(A\) has the same geometric effect as the diagonal matrix \(D\) when expressed in the coordinate system defined by the basis of eigenvectors.

Now that we have developed some algebraic techniques for finding eigenvalues and eigenvectors, we will explore this observation more deeply. In particular, we will make precise the sense in which \(A\) and \(D\) are equivalent by using the coordinate system defined by the basis of eigenvectors \(\vvec_1\) and \(\vvec_2\text{.}\)

Preview Activity 4.3.1

Let's recall how a vector in \(\real^2\) can be represented in a coordinate system defined by a basis \(\bcal=\{\vvec_1, \vvec_2\}\text{.}\)

  1. Suppose that we consider the basis \(\bcal\) defined by

    \begin{equation*} \vvec_1 = \twovec{1}{1},\qquad \vvec_2 = \twovec{-1}{0} \text{.} \end{equation*}

    Find the vector \(\xvec\) whose representation in the coordinate system defined by \(\bcal\) is \(\coords{\xvec}{\bcal} = \twovec{-3}{2}\text{.}\)

  2. Consider the vector \(\xvec=\twovec{4}{5}\) and find its representation \(\coords{\xvec}{\bcal}\) in the coordinate system defined by \(\bcal\text{.}\)

  3. How do we use the matrix \(C_{\bcal} = \left[\begin{array}{rr} \vvec_1 \amp \vvec_2 \end{array}\right]\) to convert a vector's representation \(\coords{\xvec}{\bcal}\) in the coordinate system defined by \(\bcal\) into its standard representation \(\xvec\text{?}\) How do we use this matrix to convert \(\xvec\) into \(\coords{\xvec}{\bcal}\text{?}\)

  4. Suppose that we have a matrix \(A\) whose eigenvectors are \(\vvec_1\) and \(\vvec_2\) and associated eigenvalues are \(\lambda_1=4\) and \(\lambda_2 = 2\text{.}\) Express the vector \(A(-3\vvec_1 +5\vvec_2)\) as a linear combination of \(\vvec_1\) and \(\vvec_2\text{.}\)

  5. If \(\coords{\xvec}{\bcal} = \twovec{-3}{5}\text{,}\) find \(\coords{A\xvec}{\bcal}\text{.}\)

Subsection 4.3.1 Diagonalization of matrices

As we have investigated eigenvalues and eigenvectors of matrices in this chapter, we have frequently asked whether we can find a basis of eigenvectors, as in Question 4.1.7. In fact, Proposition 4.2.3 tells us that if \(A\) is an \(n\times n\) matrix having \(n\) distinct real eigenvalues, then there is a basis for \(\real^n\) consisting of eigenvectors of \(A\text{.}\) Other conditions on \(A\) also guarantee such a basis, as we will see in subsequent chapters; for now, it is enough to know that for many matrices we can find a basis of eigenvectors. We will now see how such a matrix \(A\) is equivalent to a diagonal matrix \(D\text{.}\)

Remember also that we have seen how to use a basis \(\bcal=\{\vvec_1,\vvec_2,\ldots,\vvec_n\}\) of \(\real^n\) to construct a coordinate system for \(\real^n\text{.}\) In particular, \(\coords{\xvec}{\bcal} = \fourvec{c_1}{c_2}{\vdots}{c_n}\) if \(\xvec = c_1\vvec_1 + c_2\vvec_2 + \ldots + c_n\vvec_n\text{.}\) We also used matrix multiplication to express this fact: if \(C_{\bcal} = \left[\begin{array}{rrrr} \vvec_1 \amp \vvec_2 \amp \ldots \amp \vvec_n \end{array}\right]\text{,}\) then

\begin{equation*} \xvec = C_{\bcal}\coords{\xvec}{\bcal}, \qquad \coords{\xvec}{\bcal} = C_{\bcal}^{-1}\xvec \text{.} \end{equation*}
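
As a small illustration of these two conversions, here is a minimal sketch in plain Python with exact fractions. It assumes the basis \(\vvec_1=\twovec{1}{1}\text{,}\) \(\vvec_2=\twovec{-1}{1}\) and an arbitrarily chosen vector; the helper names inverse_2x2 and mat_vec are ours and are not part of the text.

```python
from fractions import Fraction

# Helper names are illustrative; matrices are stored as lists of rows.
def inverse_2x2(m):
    """Invert a 2x2 matrix exactly using the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def mat_vec(m, x):
    """Multiply a matrix by a vector."""
    return [sum(entry * xi for entry, xi in zip(row, x)) for row in m]

# Columns of C_B are the assumed basis vectors v1 = (1, 1) and v2 = (-1, 1).
C_B = [[1, -1],
       [1,  1]]

x = [4, 2]                                # arbitrary vector in standard coordinates
coords = mat_vec(inverse_2x2(C_B), x)     # {x}_B = C_B^{-1} x
x_again = mat_vec(C_B, coords)            # x = C_B {x}_B

print(coords == [3, -1])   # x = 3 v1 - v2
print(x_again == x)
```

Solving a linear system would usually be preferable to forming an inverse for large matrices; the explicit inverse is used here only because it mirrors the formula above.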
Activity 4.3.2

Once again, we will consider the matrices

\begin{equation*} A = \left[\begin{array}{rr} 1 \amp 2 \\ 2 \amp 1 \\ \end{array}\right],\qquad D = \left[\begin{array}{rr} 3 \amp 0 \\ 0 \amp -1 \\ \end{array}\right] \text{.} \end{equation*}

The matrix \(A\) has eigenvectors \(\vvec_1=\twovec{1}{1}\) and \(\vvec_2=\twovec{-1}{1}\) and eigenvalues \(\lambda_1=3\) and \(\lambda_2=-1\text{.}\) We will consider the basis of \(\real^2\) consisting of eigenvectors \(\bcal= \{\vvec_1, \vvec_2\}\text{.}\)

  1. If \(\xvec= 2\vvec_1 - 3\vvec_2\text{,}\) write \(A\xvec\) as a linear combination of \(\vvec_1\) and \(\vvec_2\text{.}\)

  2. If \(\coords{\xvec}{\bcal}=\twovec{2}{-3}\text{,}\) find \(\coords{A\xvec}{\bcal}\text{,}\) the representation of \(A\xvec\) in the coordinate system defined by \(\bcal\text{.}\)

  3. If \(\coords{\xvec}{\bcal}=\twovec{c_1}{c_2}\text{,}\) find \(\coords{A\xvec}{\bcal}\text{,}\) the representation of \(A\xvec\) in the coordinate system defined by \(\bcal\text{.}\)

  4. Explain why \(\coords{A\xvec}{\bcal} = D\coords{\xvec}{\bcal}\text{.}\)

  5. Explain why \(C_{\bcal}^{-1}A\xvec = DC_{\bcal}^{-1}\xvec\) for all vectors \(\xvec\) and hence

    \begin{equation*} C_{\bcal}^{-1}A = DC_{\bcal}^{-1} \text{.} \end{equation*}
  6. Explain why \(A = C_{\bcal}DC_{\bcal}^{-1}\) and verify this relationship by computing the product \(C_{\bcal}DC_{\bcal}^{-1}\text{.}\)

The key to understanding the equivalence of a matrix \(A\) and a diagonal matrix \(D\) is through the coordinate system defined by a basis consisting of eigenvectors of \(A\text{.}\) We will assume that \(A\) is an \(n\times n\) matrix and that there is a basis \(\bcal=\{\vvec_1,\vvec_2,\ldots,\vvec_n\}\) consisting of eigenvectors of \(A\) with associated eigenvalues \(\lambda_1, \lambda_2,\ldots,\lambda_n\text{.}\)

We know that if

\begin{equation*} \xvec = c_1\vvec_1 + c_2\vvec_2 + \ldots + c_n\vvec_n \text{,} \end{equation*}

then

\begin{equation*} A\xvec = \lambda_1c_1\vvec_1 + \lambda_2c_2\vvec_2 + \ldots + \lambda_nc_n\vvec_n \text{.} \end{equation*}

This fact is conveniently expressed using the coordinate system defined by \(\bcal\text{;}\) in particular,

\begin{equation*} \coords{\xvec}{\bcal} = \fourvec{c_1}{c_2}{\vdots}{c_n},\qquad \coords{A\xvec}{\bcal} = \fourvec{\lambda_1c_1}{\lambda_2c_2}{\vdots}{\lambda_nc_n} \text{.} \end{equation*}

Forming the diagonal matrix

\begin{equation*} D = \left[\begin{array}{cccc} \lambda_1 \amp 0 \amp \ldots \amp 0 \\ 0 \amp \lambda_2 \amp \ldots \amp 0 \\ \vdots \amp \vdots \amp \ddots \amp 0 \\ 0 \amp 0 \amp \ldots \amp \lambda_n \\ \end{array}\right] \text{,} \end{equation*}

we see that

\begin{equation*} \coords{A\xvec}{\bcal} = D\coords{\xvec}{\bcal} \text{.} \end{equation*}

We now use the fact that the matrix \(C_{\bcal} = \left[\begin{array}{cccc} \vvec_1 \amp \vvec_2 \amp \ldots \amp \vvec_n \end{array}\right]\) performs the change of coordinates; that is, \(\coords{A\xvec}{\bcal} = C_{\bcal}^{-1}A\xvec\) and \(\coords{\xvec}{\bcal} = C_{\bcal}^{-1}\xvec\text{.}\) This says that

\begin{equation*} C_{\bcal}^{-1}A\xvec = DC_{\bcal}^{-1}\xvec \text{,} \end{equation*}

for all vectors \(\xvec\text{,}\) which means that \(C_{\bcal}^{-1}A = DC_{\bcal}^{-1}\) or

\begin{equation*} A = C_{\bcal}DC_{\bcal}^{-1} \text{.} \end{equation*}

So that the form of this expression stands out more clearly, it is customary to denote the matrix \(C_{\bcal}\) as \(P\) so that we have \(P = C_{\bcal}\) and hence

\begin{equation*} A = PDP^{-1} \text{.} \end{equation*}
Definition 4.3.2

We say that the matrix \(A\) is diagonalizable if there is a diagonal matrix \(D\) and invertible matrix \(P\) such that

\begin{equation*} A = PDP^{-1} \text{.} \end{equation*}

This is the sense in which we mean that \(A\) is equivalent to a diagonal matrix \(D\text{.}\) The expression \(A=PDP^{-1}\) says that \(A\text{,}\) expressed in the basis defined by the columns of \(P\text{,}\) has the same geometric effect as \(D\text{,}\) expressed in the standard basis \(\evec_1, \evec_2,\ldots,\evec_n\text{.}\)
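
The coordinate statement \(\coords{A\xvec}{\bcal} = D\coords{\xvec}{\bcal}\) can be checked numerically. The following is a minimal sketch for the matrix \(A=\left[\begin{array}{rr} 1 \amp 2 \\ 2 \amp 1 \\ \end{array}\right]\) from the start of this section, using exact fractions; the helper names and the sample coordinates are our own choices.

```python
from fractions import Fraction

# Helper names are illustrative; matrices are stored as lists of rows.
def inverse_2x2(m):
    """Invert a 2x2 matrix exactly using the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def mat_vec(m, x):
    """Multiply a matrix by a vector."""
    return [sum(entry * xi for entry, xi in zip(row, x)) for row in m]

A = [[1, 2], [2, 1]]
D = [[3, 0], [0, -1]]
P = [[1, -1], [1, 1]]      # columns are the eigenvectors v1 = (1, 1), v2 = (-1, 1)
P_inv = inverse_2x2(P)

coords_x = [2, -3]                          # {x}_B, so x = 2 v1 - 3 v2
x = mat_vec(P, coords_x)                    # x in standard coordinates
coords_Ax = mat_vec(P_inv, mat_vec(A, x))   # {Ax}_B

# Multiplying by A in eigenvector coordinates agrees with multiplying by D.
print(coords_Ax == mat_vec(D, coords_x))
```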

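We have now seen the following proposition.

Proposition 4.3.3

An \(n\times n\) matrix \(A\) is diagonalizable if and only if there is a basis of \(\real^n\) consisting of eigenvectors of \(A\text{.}\)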

In fact, if we only know that \(A = PDP^{-1}\) where \(P = \left[\begin{array}{cccc} \vvec_1 \amp \vvec_2 \amp \ldots \amp \vvec_n \end{array}\right]\text{,}\) we can say that the vectors \(\vvec_j\) are eigenvectors of \(A\) and that the associated eigenvalue is the \(j^{th}\) diagonal entry of \(D\text{.}\)

Example 4.3.4

We will try to find a diagonalization of \(A = \left[\begin{array}{rr} -5 \amp 6 \\ -3 \amp 4 \\ \end{array}\right] \text{.}\)

First, we find the eigenvalues of \(A\) by solving the characteristic equation

\begin{equation*} \det(A-\lambda I) = (-5-\lambda)(4-\lambda)+18 = (-2-\lambda)(1-\lambda) = 0 \text{.} \end{equation*}

This shows that the eigenvalues of \(A\) are \(\lambda_1 = -2\) and \(\lambda_2 = 1\text{.}\)

By constructing \(\nul(A-(-2)I)\text{,}\) we find a basis for \(E_{-2}\) consisting of the vector \(\vvec_1 = \twovec{2}{1}\text{.}\) Similarly, a basis for \(E_1\) consists of the vector \(\vvec_2 = \twovec{1}{1}\text{.}\) This shows that we can construct a basis \(\bcal=\{\vvec_1,\vvec_2\}\) of \(\real^2\) consisting of eigenvectors of \(A\text{.}\)

We now form the matrices

\begin{equation*} D = \left[\begin{array}{rr} -2 \amp 0 \\ 0 \amp 1 \\ \end{array}\right],\qquad P = \left[\begin{array}{cc} \vvec_1 \amp \vvec_2 \end{array}\right] = \left[\begin{array}{rr} 2 \amp 1 \\ 1 \amp 1 \\ \end{array}\right] \end{equation*}

and verify that

\begin{equation*} PDP^{-1} = \left[\begin{array}{rr} 2 \amp 1 \\ 1 \amp 1 \\ \end{array}\right] \left[\begin{array}{rr} -2 \amp 0 \\ 0 \amp 1 \\ \end{array}\right] \left[\begin{array}{rr} 1 \amp -1 \\ -1 \amp 2 \\ \end{array}\right] = \left[\begin{array}{rr} -5 \amp 6 \\ -3 \amp 4 \\ \end{array}\right] = A \text{.} \end{equation*}

There are, of course, many ways to diagonalize \(A\text{.}\) For instance, we could change the order of the eigenvalues and eigenvectors and write

\begin{equation*} D = \left[\begin{array}{rr} 1 \amp 0 \\ 0 \amp -2 \\ \end{array}\right],\qquad P = \left[\begin{array}{cc} \vvec_2 \amp \vvec_1 \end{array}\right] = \left[\begin{array}{rr} 1 \amp 2 \\ 1 \amp 1 \\ \end{array}\right] \text{.} \end{equation*}

If we choose a different basis for the eigenspaces, we will also find a different matrix \(P\) that diagonalizes \(A\text{.}\) The point is that there are many ways in which \(A\) can be written in the form \(A=PDP^{-1}\text{.}\)
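
To illustrate that several choices of \(P\) and \(D\) work, the sketch below checks the equivalent condition \(AP = PD\) with \(P\) invertible for the two orderings above and for a rescaled basis of eigenvectors. The function name is_diagonalization and the rescaled matrix are our own illustrative choices.

```python
# Helper names are illustrative; matrices are stored as lists of rows.
def mat_mul(m, n):
    """Multiply two matrices."""
    return [[sum(m[i][k] * n[k][j] for k in range(len(n)))
             for j in range(len(n[0]))]
            for i in range(len(m))]

def is_diagonalization(A, P, D):
    """Check that P is invertible and A P == P D, i.e. A = P D P^{-1}."""
    det_P = P[0][0] * P[1][1] - P[0][1] * P[1][0]
    return det_P != 0 and mat_mul(A, P) == mat_mul(P, D)

A = [[-5, 6], [-3, 4]]

# Original ordering: eigenvalues -2, 1 with eigenvectors (2, 1), (1, 1).
print(is_diagonalization(A, [[2, 1], [1, 1]], [[-2, 0], [0, 1]]))

# Swapped ordering: eigenvalues and eigenvectors must be swapped together.
print(is_diagonalization(A, [[1, 2], [1, 1]], [[1, 0], [0, -2]]))

# Rescaled eigenvectors -2*(2, 1) and 3*(1, 1) also work.
print(is_diagonalization(A, [[-4, 3], [-2, 3]], [[-2, 0], [0, 1]]))

# Mismatched order of columns and diagonal entries fails.
print(is_diagonalization(A, [[2, 1], [1, 1]], [[1, 0], [0, -2]]))
```

The first three checks succeed and the last fails, which reflects that the order of the columns of \(P\) must match the order of the diagonal entries of \(D\text{.}\)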

Example 4.3.5

We will try to find a diagonalization of \(A = \left[\begin{array}{rr} 0 \amp 4 \\ -1 \amp 4 \\ \end{array}\right] \text{.}\)

Once again, we find the eigenvalues by solving the characteristic equation:

\begin{equation*} \det(A-\lambda I) = -\lambda(4-\lambda) + 4 = (2-\lambda)^2 = 0 \text{.} \end{equation*}

In this case, there is a single eigenvalue \(\lambda=2\text{.}\)

We find a basis for the eigenspace \(E_2\) by describing \(\nul(A-2I)\text{:}\)

\begin{equation*} A-2I = \left[\begin{array}{rr} -2 \amp 4 \\ -1 \amp 2 \\ \end{array}\right] \sim \left[\begin{array}{rr} 1 \amp -2 \\ 0 \amp 0 \\ \end{array}\right] \text{.} \end{equation*}

This shows that the eigenspace \(E_2\) is one-dimensional with \(\vvec_1=\twovec{2}{1}\) forming a basis.

In this case, there is not a basis of \(\real^2\) consisting of eigenvectors of \(A\text{,}\) which tells us that \(A\) is not diagonalizable.
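
To spell out this last step, here is a short worked check that uses only the eigenspace found above; the scalars \(s\) and \(t\) are introduced here for illustration. Every eigenvector of \(A\) is a multiple of \(\vvec_1\text{,}\) so any matrix \(P\) whose columns are eigenvectors of \(A\) has the form

\begin{equation*} P = \left[\begin{array}{rr} 2s \amp 2t \\ s \amp t \\ \end{array}\right], \qquad \det P = 2st - 2ts = 0 \text{.} \end{equation*}

Such a matrix is never invertible, so no factorization \(A = PDP^{-1}\) is possible.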

Example 4.3.6

Suppose we know that \(A=PDP^{-1}\) where

\begin{equation*} D = \left[\begin{array}{rr} 2 \amp 0 \\ 0 \amp -2 \\ \end{array}\right],\qquad P = \left[\begin{array}{cc} \vvec_1 \amp \vvec_2 \end{array}\right] = \left[\begin{array}{rr} 1 \amp 1 \\ 1 \amp 2 \\ \end{array}\right] \text{.} \end{equation*}

In this case, we know that the columns of \(P\) are eigenvectors of \(A\text{.}\) For instance, \(\vvec_1 = \twovec{1}{1}\) is an eigenvector of \(A\) with eigenvalue \(\lambda_1 = 2\text{.}\) Also, \(\vvec_2 = \twovec{1}{2}\) is an eigenvector with eigenvalue \(\lambda_2=-2\text{.}\)

We can verify this by computing

\begin{equation*} A = PDP^{-1} = \left[\begin{array}{rr} 6 \amp -4 \\ 8 \amp -6 \\ \end{array}\right] \text{.} \end{equation*}

Then, we can compute that \(A\vvec_1 = \twovec{2}{2}=2\vvec_1\) and \(A\vvec_2 = \twovec{-2}{-4} = -2\vvec_2\text{.}\)
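
The arithmetic in this example can be reproduced with a few lines of plain Python. This is a sketch using exact fractions, with helper names of our own choosing: it rebuilds \(A\) from \(P\) and \(D\) and then applies \(A\) to each column of \(P\text{.}\)

```python
from fractions import Fraction

# Helper names are illustrative; matrices are stored as lists of rows.
def mat_mul(m, n):
    """Multiply two matrices."""
    return [[sum(m[i][k] * n[k][j] for k in range(len(n)))
             for j in range(len(n[0]))]
            for i in range(len(m))]

def mat_vec(m, x):
    """Multiply a matrix by a vector."""
    return [sum(entry * xi for entry, xi in zip(row, x)) for row in m]

def inverse_2x2(m):
    """Invert a 2x2 matrix exactly using the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

D = [[2, 0], [0, -2]]
P = [[1, 1], [1, 2]]
v1, v2 = [1, 1], [1, 2]          # the columns of P

A = mat_mul(mat_mul(P, D), inverse_2x2(P))
print(A == [[6, -4], [8, -6]])

# Each column of P is scaled by the matching diagonal entry of D.
print(mat_vec(A, v1) == [2 * c for c in v1])
print(mat_vec(A, v2) == [-2 * c for c in v2])
```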

Activity 4.3.3

  1. Find a diagonalization of \(A\text{,}\) if one exists, when

    \begin{equation*} A = \left[\begin{array}{rr} 3 \amp -2 \\ 6 \amp -5 \\ \end{array}\right] \text{.} \end{equation*}
  2. Can the diagonal matrix

    \begin{equation*} A = \left[\begin{array}{rr} 2 \amp 0 \\ 0 \amp -5 \\ \end{array}\right] \end{equation*}

    be diagonalized? If so, explain how to find the matrices \(P\) and \(D\text{.}\)

  3. Find a diagonalization of \(A\text{,}\) if one exists, when

    \begin{equation*} A = \left[\begin{array}{rrr} -2 \amp 0 \amp 0 \\ 1 \amp -3\amp 0 \\ 2 \amp 0 \amp -3 \\ \end{array}\right] \text{.} \end{equation*}
  4. Find a diagonalization of \(A\text{,}\) if one exists, when

    \begin{equation*} A = \left[\begin{array}{rrr} -2 \amp 0 \amp 0 \\ 1 \amp -3\amp 0 \\ 2 \amp 1 \amp -3 \\ \end{array}\right] \text{.} \end{equation*}
  5. Suppose that \(A=PDP^{-1}\) where

    \begin{equation*} D = \left[\begin{array}{rr} 3 \amp 0 \\ 0 \amp -1 \\ \end{array}\right],\qquad P = \left[\begin{array}{cc} \vvec_2 \amp \vvec_1 \end{array}\right] = \left[\begin{array}{rr} 2 \amp 2 \\ 1 \amp -1 \\ \end{array}\right] \text{.} \end{equation*}
    1. Explain why \(A\) is invertible.

    2. Find a diagonalization of \(A^{-1}\text{.}\)

    3. Find a diagonalization of \(A^3\text{.}\)

Subsection 4.3.2 Powers of a diagonalizable matrix

In several earlier examples, we have been interested in computing powers of a given matrix. For instance, in Activity 4.1.3, we were given the matrix \(A = \left[\begin{array}{rr} 0.8 \amp 0.6 \\ 0.2 \amp 0.4 \\ \end{array}\right]\) and an initial vector \(\xvec_0=\twovec{1000}{0}\text{,}\) and we wanted to compute

\begin{equation*} \begin{aligned} \xvec_1 \amp {}={} A\xvec_0 \\ \xvec_2 \amp {}={} A\xvec_1 = A^2\xvec_0 \\ \xvec_3 \amp {}={} A\xvec_2 = A^3\xvec_0\text{.} \\ \end{aligned} \end{equation*}

More generally, we would like to find \(\xvec_k=A^k\xvec_0\) and determine what happens as \(k\) becomes very large. If a matrix \(A\) is diagonalizable, writing \(A=PDP^{-1}\) can help us understand powers of \(A\) easily.

Activity 4.3.4

  1. Let's begin with the diagonal matrix

    \begin{equation*} D = \left[\begin{array}{rr} 2 \amp 0 \\ 0 \amp -1 \\ \end{array}\right] \text{.} \end{equation*}

    Find the powers \(D^2\text{,}\) \(D^3\text{,}\) and \(D^4\text{.}\) What is \(D^k\) for a general value of \(k\text{?}\)

  2. Suppose that \(A\) is a matrix with eigenvector \(\vvec\) and associated eigenvalue \(\lambda\text{;}\) that is, \(A\vvec = \lambda\vvec\text{.}\) By considering \(A^2\vvec\text{,}\) explain why \(\vvec\) is also an eigenvector of \(A^2\) with eigenvalue \(\lambda^2\text{.}\)

  3. Suppose that \(A= PDP^{-1}\) where

    \begin{equation*} D = \left[\begin{array}{rr} 2 \amp 0 \\ 0 \amp -1 \\ \end{array}\right] \text{.} \end{equation*}

    Remembering that the columns of \(P\) are eigenvectors of \(A\text{,}\) explain why \(A^2\) is diagonalizable and find a diagonalization of it.

  4. Give another explanation of the diagonalizability of \(A^2\) by writing

    \begin{equation*} A^2 = (PDP^{-1})(PDP^{-1}) = PD(P^{-1}P)DP^{-1} \text{.} \end{equation*}
  5. In the same way, find a diagonalization of \(A^3\text{,}\) \(A^4\text{,}\) and \(A^k\text{.}\)

  6. Suppose that \(A\) is a diagonalizable \(2\times2\) matrix with eigenvalues \(\lambda_1 = 0.5\) and \(\lambda_2=0.1\text{.}\) What happens to \(A^k\) as \(k\) becomes very large?

We begin by noting that the eigenvectors of a matrix \(A\) are also eigenvectors of the powers of \(A\text{.}\) For instance, if \(A\vvec = \lambda\vvec\text{,}\) then

\begin{equation*} A^2\vvec = A(A\vvec) = A(\lambda\vvec) = \lambda A\vvec = \lambda^2\vvec \text{.} \end{equation*}

In this way, we see that \(\vvec\) is an eigenvector of \(A^2\) with eigenvalue \(\lambda^2\text{.}\) Furthermore, for any \(k\text{,}\) \(\vvec\) is an eigenvector of \(A^k\) with eigenvalue \(\lambda^k\text{.}\)
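
The general claim follows by repeating the same step. As a sketch, suppose we already know that \(A^{k-1}\vvec = \lambda^{k-1}\vvec\text{.}\) One more multiplication by \(A\) then gives

\begin{equation*} A^k\vvec = A(A^{k-1}\vvec) = A(\lambda^{k-1}\vvec) = \lambda^{k-1}A\vvec = \lambda^k\vvec \text{.} \end{equation*}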

Now if \(A\) is diagonalizable, we can write \(A=PDP^{-1}\) where the columns of \(P\) are eigenvectors of \(A\) and the diagonal entries of \(D\) are the eigenvalues. If \(D = \left[\begin{array}{rr} \lambda_1 \amp 0 \\ 0 \amp \lambda_2 \\ \end{array}\right] \text{,}\) then

\begin{equation*} A^2 = P\left[\begin{array}{rr} \lambda_1^2 \amp 0 \\ 0 \amp \lambda_2^2 \\ \end{array}\right] P^{-1} = PD^2P^{-1} \text{.} \end{equation*}

We have the same matrix \(P\) in this expression since the eigenvectors of \(A^2\) are also the eigenvectors of \(A\text{.}\)

Another way to see this is to note that

\begin{equation*} \begin{aligned} A^2 \amp {}={} (PDP^{-1})(PDP^{-1}) \\ \amp {}={} PD(P^{-1}P)DP^{-1} \\ \amp {}={} PDIDP^{-1} \\ \amp {}={} PDDP^{-1} \\ \amp {}={} PD^2P^{-1}\text{.} \end{aligned} \end{equation*}

Similarly, any power of \(A\) is diagonalizable; in particular, \(A^k = PD^kP^{-1}\text{.}\)

In the next section, we will see some important uses of our ability to deal with powers in this way. Until then, consider the case where \(D = \left[\begin{array}{rr} 0.5 \amp 0 \\ 0 \amp 0.1 \\ \end{array}\right]\) so that \(D^k = \left[\begin{array}{rr} 0.5^k \amp 0 \\ 0 \amp 0.1^k \\ \end{array}\right] \text{.}\) As \(k\) becomes very large, the diagonal entries become increasingly close to zero. This means that \(D^k\) becomes increasingly close to the zero matrix \(\left[\begin{array}{rr} 0 \amp 0 \\ 0 \amp 0 \\ \end{array}\right]\) as does \(A^k = PD^kP^{-1}\text{.}\) In other words, no matter what vector \(\xvec_0\) we begin with, the vectors \(A^k\xvec_0\) become increasingly close to \(\zerovec\text{.}\)
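
The formula \(A^k = PD^kP^{-1}\) can be compared against repeated multiplication. The sketch below does this for the matrix of Example 4.3.4 and then looks at a decaying case with eigenvalues \(0.5\) and \(0.1\text{.}\) The matrix \(P\) used in the decaying case is an assumption made purely for illustration, and the helper names are ours.

```python
from fractions import Fraction

# Helper names are illustrative; matrices are stored as lists of rows.
def mat_mul(m, n):
    """Multiply two matrices."""
    return [[sum(m[i][k] * n[k][j] for k in range(len(n)))
             for j in range(len(n[0]))]
            for i in range(len(m))]

def inverse_2x2(m):
    """Invert a 2x2 matrix exactly using the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def power_by_diagonalization(P, eigenvalues, k):
    """Return P D^k P^{-1}, where D is diagonal with the given eigenvalues."""
    D_k = [[eigenvalues[0] ** k, 0],
           [0, eigenvalues[1] ** k]]
    return mat_mul(mat_mul(P, D_k), inverse_2x2(P))

# Matrix, eigenvector matrix, and eigenvalues from Example 4.3.4.
A = [[-5, 6], [-3, 4]]
P = [[2, 1], [1, 1]]
eigenvalues = [-2, 1]

# Compute A^5 directly by repeated multiplication.
direct = [[1, 0], [0, 1]]
for _ in range(5):
    direct = mat_mul(direct, A)

via_diagonalization = power_by_diagonalization(P, eigenvalues, 5)
print(via_diagonalization == direct)

# Assumed decaying case: same P, eigenvalues 1/2 and 1/10.
decaying = power_by_diagonalization(P, [Fraction(1, 2), Fraction(1, 10)], 20)
largest_entry = max(abs(entry) for row in decaying for entry in row)
print(float(largest_entry))   # already very small at k = 20
```

Only the two diagonal entries are raised to the \(k\)th power, so the cost of the diagonalized computation does not grow with the number of matrix multiplications.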

Subsection 4.3.3 Similarity and complex eigenvalues

We have been interested in diagonalizing a matrix \(A\) because doing so relates a matrix \(A\) to a simpler diagonal matrix \(D\text{.}\) If we write \(A=PDP^{-1}\text{,}\) we see that multiplying a vector by \(A\) in the coordinates defined by the columns of \(P\) is the same as multiplying by \(D\) in standard coordinates. Under this change of coordinates, \(A\) and \(D\) have the same effect on vectors.

More generally, if we have two matrices \(A\) and \(B\) such that \(A=PBP^{-1}\text{,}\) we may regard multiplication by \(A\) and \(B\) as having the same effect on vectors under the change of coordinates defined by the columns of \(P\text{.}\) That is, if \(\bcal\) is the basis formed by the columns of \(P\text{,}\) then \(\coords{A\xvec}{\bcal} = B\coords{\xvec}{\bcal}\text{.}\) This leads to the following definition.

Definition 4.3.7

We say that \(A\) is similar to \(B\) if there is an invertible matrix \(P\) such that \(A = PBP^{-1}\text{.}\)
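
The coordinate statement that motivates this definition can be checked in a few lines. Writing \(\bcal\) for the basis formed by the columns of \(P\text{,}\) so that \(\xvec = P\coords{\xvec}{\bcal}\) and \(\coords{\yvec}{\bcal} = P^{-1}\yvec\) for any vector \(\yvec\text{,}\) the relation \(A = PBP^{-1}\) gives

\begin{equation*} \begin{aligned} \coords{A\xvec}{\bcal} \amp {}={} P^{-1}A\xvec \\ \amp {}={} P^{-1}(PBP^{-1})P\coords{\xvec}{\bcal} \\ \amp {}={} B\coords{\xvec}{\bcal}\text{.} \end{aligned} \end{equation*}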

Notice that a matrix is diagonalizable if and only if it is similar to a diagonal matrix. We have, however, seen several examples of a matrix \(A\) that is not diagonalizable. In this case, it is natural to ask if there is some simpler matrix that is similar to \(A\text{.}\)

Example 4.3.8

Let's consider the matrix \(A = \left[\begin{array}{rr} -2 \amp 2 \\ -5 \amp 4 \\ \end{array}\right]\) whose characteristic equation is

\begin{equation*} \det(A-\lambda I) = (-2-\lambda)(4-\lambda)+10 = 2 - 2\lambda + \lambda^2 = 0 \text{.} \end{equation*}

Applying the quadratic formula to find the eigenvalues, we obtain

\begin{equation*} \lambda = \frac{2\pm\sqrt{(-2)^2-4\cdot1\cdot2}}{2}=1\pm i \text{.} \end{equation*}

Here we see that the matrix \(A\) has two complex eigenvalues and is therefore not diagonalizable.
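
The same computation can be sketched in Python. For a \(2\times2\) matrix, the characteristic equation can be written as \(\lambda^2 - (a+d)\lambda + (ad-bc) = 0\text{,}\) a standard identity that we assume here rather than take from the text, and the standard-library cmath module handles the square root of a negative number. The function name is our own.

```python
import cmath

def eigenvalues_2x2(m):
    """Roots of lambda^2 - (trace) lambda + (determinant) = 0 for a 2x2 matrix."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    # cmath.sqrt returns a complex result when the discriminant is negative.
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

A = [[-2, 2], [-5, 4]]
lam1, lam2 = eigenvalues_2x2(A)
print(lam1, lam2)

# A nonzero imaginary part signals that the eigenvalues are not real.
print(lam1.imag != 0)
```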

In case a matrix \(A\) has complex eigenvalues, we will find a simpler matrix \(C\) that is similar to \(A\text{.}\) In particular, if \(A\) has an eigenvalue \(\lambda = a+bi\text{,}\) then \(A\) is similar to \(C=\left[\begin{array}{rr} a \amp -b \\ b \amp a \\ \end{array}\right] \text{.}\)

The next activity shows that \(C\) has a simple geometric effect on \(\real^2\text{.}\) First, however, we will rewrite \(C\) in polar coordinates, as shown in the figure. We form the point \((a,b)\text{,}\) which defines \(r\text{,}\) the distance from the origin, and \(\theta\text{,}\) the angle formed with the positive horizontal axis. We then have

\begin{equation*} \begin{aligned} a \amp {}={} r\cos\theta \\ b \amp {}={} r\sin\theta\text{.} \\ \end{aligned} \end{equation*}

Notice that \(r=\sqrt{a^2+b^2}\text{.}\)
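
As a small illustration, the conversion between \((a,b)\) and \((r,\theta)\) can be sketched with Python's standard math module. The sample point \((1,1)\) and the function name to_polar are our own choices.

```python
import math

def to_polar(a, b):
    """Return (r, theta) for the point (a, b), with theta in radians."""
    r = math.hypot(a, b)        # sqrt(a^2 + b^2)
    theta = math.atan2(b, a)    # angle from the positive horizontal axis
    return r, theta

a, b = 1, 1
r, theta = to_polar(a, b)
print(r, math.degrees(theta))

# Converting back recovers a = r cos(theta) and b = r sin(theta),
# up to floating-point rounding.
a_back = r * math.cos(theta)
b_back = r * math.sin(theta)
print(a_back, b_back)
```

Using atan2 rather than dividing \(b\) by \(a\) keeps the angle in the correct quadrant and avoids a division by zero when \(a=0\text{.}\)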

[Figure: the point \((a,b)\) in the plane, at distance \(r\) from the origin and at angle \(\theta\) from the positive horizontal axis.]

Activity 4.3.5

  1. We will rewrite \(C\) in terms of \(r\) and \(\theta\text{.}\) Explain why

    \begin{equation*} \left[\begin{array}{rr} a \amp -b \\ b \amp a \\ \end{array}\right] = \left[\begin{array}{rr} r\cos\theta \amp -r\sin\theta \\ r\sin\theta \amp r\cos\theta \\ \end{array}\right] = \left[\begin{array}{rr} r \amp 0 \\ 0 \amp r \\ \end{array}\right] \left[\begin{array}{rr} \cos\theta \amp -\sin\theta \\ \sin\theta \amp \cos\theta \\ \end{array}\right] \text{.} \end{equation*}
  2. Explain why \(C\) has the geometric effect of rotating vectors by \(\theta\) and stretching them by a factor of \(r\text{.}\)

  3. Let's now consider the matrix \(A\) from Example 4.3.8:

    \begin{equation*} A = \left[\begin{array}{rr} -2 \amp 2 \\ -5 \amp 4 \\ \end{array}\right] \end{equation*}

    whose eigenvalues are \(\lambda_1 = 1+i\) and \(\lambda_2 = 1-i\text{.}\) We will choose to focus on one of the eigenvalues \(\lambda_1 = a+bi= 1+i\text{.}\)

    Form the matrix \(C\) using these values of \(a\) and \(b\text{.}\) Then rewrite the point \((a,b)\) in polar coordinates by identifying the values of \(r\) and \(\theta\text{.}\) Explain the geometric effect of multiplying vectors by \(C\text{.}\)

  4. Suppose that \(P=\left[\begin{array}{rr} 1 \amp 1 \\ 2 \amp 1 \\ \end{array}\right] \text{.}\) Verify that \(A = PCP^{-1}\text{.}\)

  5. Explain why \(A^k = PC^kP^{-1}\text{.}\)

  6. We formed the matrix \(C\) by choosing the eigenvalue \(\lambda_1=1+i\text{.}\) Suppose we had instead chosen \(\lambda_2 = 1-i\text{.}\) Form the matrix \(C'\) and use polar coordinates to describe the geometric effect of \(C'\text{.}\)

  7. Using the matrix \(P' = \left[\begin{array}{rr} 1 \amp -1 \\ 2 \amp -1 \\ \end{array}\right] \text{,}\) show that \(A = P'C'P'^{-1}\text{.}\)

If the \(2\times2\) matrix \(A\) has a complex eigenvalue \(\lambda = a + bi\text{,}\) this activity demonstrates the fact that \(A\) is similar to the matrix \(C = \left[\begin{array}{rr} a \amp -b \\ b \amp a \\ \end{array}\right] \text{.}\) When we consider the matrix \(A = \left[\begin{array}{rr} -2 \amp 2 \\ -5 \amp 4 \\ \end{array}\right] \text{,}\) we find the complex eigenvalue \(\lambda=1+i\text{,}\) which leads to the matrix

\begin{equation*} C = \left[\begin{array}{rr} 1 \amp -1 \\ 1 \amp 1 \\ \end{array}\right] = \left[\begin{array}{rr} \sqrt{2} \amp 0 \\ 0 \amp \sqrt{2} \\ \end{array}\right] \left[\begin{array}{rr} \cos(45^\circ) \amp -\sin(45^\circ) \\ \sin(45^\circ) \amp \cos(45^\circ) \\ \end{array}\right] \text{.} \end{equation*}

The matrix has the geometric effect of rotating vectors by \(45^\circ\) and stretching them by a factor of \(\sqrt{2}\text{,}\) as shown in the figure.

[Figure: the matrix \(C\) rotates vectors by \(45^\circ\) and stretches them by a factor of \(\sqrt{2}\text{.}\)]

As we saw in the activity, our original matrix \(A\) is similar to \(C\text{.}\) That is, we saw that there is a matrix \(P\) such that \(A=PCP^{-1}\text{.}\) This means that, when expressed in the coordinates defined by the columns of \(P\text{,}\) multiplying a vector by \(A\) is equivalent to multiplying by \(C\text{;}\) that is, if \(\bcal\) is the basis formed by the columns of \(P\text{,}\) then \(\coords{A\xvec}{\bcal} = C\coords{\xvec}{\bcal}\text{.}\)
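
The similarity can be confirmed with exact arithmetic. The sketch below, with helper names of our own, uses the matrix \(P=\left[\begin{array}{rr} 1 \amp 1 \\ 2 \amp 1 \\ \end{array}\right]\) from the activity, checks that \(PCP^{-1}\) reproduces \(A\text{,}\) and then compares fourth powers. Four rotations by \(45^\circ\) make a half turn and \((\sqrt{2})^4 = 4\text{,}\) so both \(C^4\) and \(A^4\) should equal \(-4\) times the identity.

```python
from fractions import Fraction

# Helper names are illustrative; matrices are stored as lists of rows.
def mat_mul(m, n):
    """Multiply two matrices."""
    return [[sum(m[i][k] * n[k][j] for k in range(len(n)))
             for j in range(len(n[0]))]
            for i in range(len(m))]

def mat_pow(m, k):
    """Raise a 2x2 matrix to a nonnegative integer power."""
    result = [[1, 0], [0, 1]]
    for _ in range(k):
        result = mat_mul(result, m)
    return result

def inverse_2x2(m):
    """Invert a 2x2 matrix exactly using the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

A = [[-2, 2], [-5, 4]]
C = [[1, -1], [1, 1]]
P = [[1, 1], [2, 1]]

similar = mat_mul(mat_mul(P, C), inverse_2x2(P)) == A
print(similar)

# Powers are related in the same way: A^4 = P C^4 P^{-1}.
C4 = mat_pow(C, 4)
A4 = mat_pow(A, 4)
print(C4, A4)
print(mat_mul(mat_mul(P, C4), inverse_2x2(P)) == A4)
```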

Had we chosen the other eigenvalue \(\lambda_2 = 1-i\text{,}\) we would have formed the matrix

\begin{equation*} C' = \left[\begin{array}{rr} 1 \amp 1 \\ -1 \amp 1 \\ \end{array}\right] = \left[\begin{array}{rr} \sqrt{2} \amp 0 \\ 0 \amp \sqrt{2} \\ \end{array}\right] \left[\begin{array}{rr} \cos(-45^\circ) \amp -\sin(-45^\circ) \\ \sin(-45^\circ) \amp \cos(-45^\circ) \\ \end{array}\right] \text{.} \end{equation*}

In other words, this matrix \(C'\) rotates vectors by \(-45^\circ\) and stretches them by a factor of \(\sqrt{2}\text{.}\) The original matrix \(A\) is also similar to \(C'\text{.}\)
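
A quick numerical sketch makes the two rotations concrete: applying \(C\) and \(C'\) to the vector \(\twovec{1}{0}\) and measuring the length and angle of each result. The test vector and the helper names are our own choices.

```python
import math

def apply(m, v):
    """Multiply a 2x2 matrix by a vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def length_and_angle(v):
    """Return the length of v and its angle in degrees from the positive horizontal axis."""
    return math.hypot(v[0], v[1]), math.degrees(math.atan2(v[1], v[0]))

C = [[1, -1], [1, 1]]          # built from the eigenvalue 1 + i
C_prime = [[1, 1], [-1, 1]]    # built from the eigenvalue 1 - i
x = [1, 0]                     # unit vector along the horizontal axis

length, angle = length_and_angle(apply(C, x))
length_p, angle_p = length_and_angle(apply(C_prime, x))

# Same stretching factor, opposite directions of rotation.
print(length, angle)
print(length_p, angle_p)
```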

Depending on which complex eigenvalue we choose, we find a matrix \(C\) that performs either a counterclockwise or a clockwise rotation. In our future uses, we will focus on \(r\text{,}\) the stretching factor, and not be concerned about the direction of the rotation.

Subsection 4.3.4 Summary

The ideas in this section demonstrate how the eigenvalues and eigenvectors of a matrix \(A\) can provide us with a new coordinate system in which multiplying by \(A\) reduces to a simpler operation.

  • We said that \(A\) is diagonalizable if we can write \(A = PDP^{-1}\) where \(D\) is a diagonal matrix. The columns of \(P\) consist of eigenvectors of \(A\) and the diagonal entries of \(D\) are the associated eigenvalues.

  • An \(n\times n\) matrix \(A\) is diagonalizable if and only if there is a basis of \(\real^n\) consisting of eigenvectors of \(A\text{.}\)

  • We said that \(A\) and \(B\) are similar if there is an invertible matrix \(P\) such that \(A=PBP^{-1}\text{.}\) In this case, \(A^k = PB^kP^{-1}\text{.}\)

  • If \(A\) is a \(2\times2\) matrix with complex eigenvalue \(\lambda = a+bi\text{,}\) then \(A\) is similar to \(C = \left[\begin{array}{rr} a \amp -b \\ b \amp a \\ \end{array} \right] \text{.}\) Writing the point \((a,b)\) in polar coordinates \(r\) and \(\theta\text{,}\) we see that \(C\) rotates vectors through an angle \(\theta\) and stretches them by a factor of \(r=\sqrt{a^2+b^2}\text{.}\)

Subsection 4.3.5 Exercises

1

Determine whether the following matrices are diagonalizable. If so, find matrices \(D\) and \(P\) such that \(A=PDP^{-1}\text{.}\)

  1. \(A = \left[\begin{array}{rr} -2 \amp -2 \\ -2 \amp 1 \\ \end{array}\right] \text{.}\)

  2. \(A = \left[\begin{array}{rr} -1 \amp 1 \\ -1 \amp -3 \\ \end{array}\right] \text{.}\)

  3. \(A = \left[\begin{array}{rr} 3 \amp -4 \\ 2 \amp -1 \\ \end{array}\right] \text{.}\)

  4. \(A = \left[\begin{array}{rrr} 1 \amp 0 \amp 0 \\ 2 \amp -2 \amp 0 \\ 0 \amp 1 \amp 4 \\ \end{array}\right] \text{.}\)

  5. \(A = \left[\begin{array}{rrr} 1 \amp 2 \amp 2 \\ 2 \amp 1 \amp 2 \\ 2 \amp 2 \amp 1 \\ \end{array}\right] \text{.}\)

2

Determine whether the following matrices have complex eigenvalues. If so, find the matrix \(C\) such that \(A = PCP^{-1}\text{.}\)

  1. \(A = \left[\begin{array}{rr} -2 \amp -2 \\ -2 \amp 1 \\ \end{array}\right] \text{.}\)

  2. \(A = \left[\begin{array}{rr} -1 \amp 1 \\ -1 \amp -3 \\ \end{array}\right] \text{.}\)

  3. \(A = \left[\begin{array}{rr} 3 \amp -4 \\ 2 \amp -1 \\ \end{array}\right] \text{.}\)

3

Determine whether the following statements are true or false and provide a justification for your response.

  1. If \(A\) is invertible, then \(A\) is diagonalizable.

  2. If \(A\) and \(B\) are similar and \(A\) is invertible, then \(B\) is also invertible.

  3. If \(A\) is a diagonalizable \(n\times n\) matrix, then there is a basis of \(\real^n\) consisting of eigenvectors of \(A\text{.}\)

  4. If \(A\) is diagonalizable, then \(A^{10}\) is also diagonalizable.

  5. If \(A\) is diagonalizable, then \(A\) is invertible.

4

Provide a justification for your response to the following questions.

  1. If \(A\) is a \(3\times3\) matrix having eigenvalues \(\lambda = 2, 3, -4\text{,}\) can you guarantee that \(A\) is diagonalizable?

  2. If \(A\) is a \(2\times 2\) matrix with a complex eigenvalue, can you guarantee that \(A\) is diagonalizable?

  3. If \(A\) is similar to the matrix \(B = \left[\begin{array}{rrr} -5 \amp 0 \amp 0 \\ 0 \amp -5 \amp 0 \\ 0 \amp 0 \amp 3 \\ \end{array}\right] \text{,}\) is \(A\) diagonalizable?

  4. What matrices are similar to the identity matrix?

  5. If \(A\) is a diagonalizable \(2\times2\) matrix with a single eigenvalue \(\lambda = 4\text{,}\) what is \(A\text{?}\)

5

Describe the geometric effect that the following matrices have on \(\real^2\text{:}\)

  1. \(A = \left[\begin{array}{rr} 2 \amp 0 \\ 0 \amp 2 \\ \end{array}\right]\)

  2. \(A = \left[\begin{array}{rr} 4 \amp 2 \\ 0 \amp 4 \\ \end{array}\right]\)

  3. \(A = \left[\begin{array}{rr} 3 \amp -6 \\ 6 \amp 3 \\ \end{array}\right]\)

  4. \(A = \left[\begin{array}{rr} 4 \amp 0 \\ 0 \amp -2 \\ \end{array}\right]\)

  5. \(A = \left[\begin{array}{rr} 1 \amp 3 \\ 3 \amp 1 \\ \end{array}\right]\)

6

We say that \(A\) is similar to \(B\) if there is an invertible matrix \(P\) such that \(A = PBP^{-1}\text{.}\)

  1. If \(A\) is similar to \(B\text{,}\) explain why \(B\) is similar to \(A\text{.}\)

  2. If \(A\) is similar to \(B\) and \(B\) is similar to \(C\text{,}\) explain why \(A\) is similar to \(C\text{.}\)

  3. If \(A\) is similar to \(B\) and \(B\) is diagonalizable, explain why \(A\) is diagonalizable.

  4. If \(A\) and \(B\) are similar, explain why \(A\) and \(B\) have the same characteristic polynomial; that is, explain why \(\det(A-\lambda I) = \det(B-\lambda I)\text{.}\)

  5. If \(A\) and \(B\) are similar, explain why \(A\) and \(B\) have the same eigenvalues.

7

Suppose that \(A = PDP^{-1}\) where

\begin{equation*} D = \left[\begin{array}{rr} 1 \amp 0 \\ 0 \amp 0 \\ \end{array}\right],\qquad P = \left[\begin{array}{rr} 1 \amp -2 \\ 2 \amp 1 \\ \end{array}\right] \text{.} \end{equation*}
  1. Explain the geometric effect that \(D\) has on vectors in \(\real^2\text{.}\)

  2. Explain the geometric effect that \(A\) has on vectors in \(\real^2\text{.}\)

  3. What can you say about \(A^2\) and other powers of \(A\text{?}\)

  4. Is \(A\) invertible?

8

When \(A\) is a \(2\times2\) matrix with a complex eigenvalue \(\lambda = a+bi\text{,}\) we have said that there is a matrix \(P\) such that \(A=PCP^{-1}\) where \(C=\left[\begin{array}{rr} a \amp -b \\ b \amp a \\ \end{array}\right] \text{.}\) In this exercise, we will learn how to find the matrix \(P\text{.}\) As an example, we will consider the matrix \(A = \left[\begin{array}{rr} 2 \amp 2 \\ -1 \amp 4 \\ \end{array}\right] \text{.}\)

  1. Show that the eigenvalues of \(A\) are complex.

  2. Choose one of the complex eigenvalues \(\lambda=a+bi\) and construct the usual matrix \(C\text{.}\)

  3. Using the same eigenvalue, we will find an eigenvector \(\vvec\) where the entries of \(\vvec\) are complex numbers. As always, we will describe \(\nul(A-\lambda I)\) by constructing the matrix \(A-\lambda I\) and finding its reduced row echelon form. In doing so, we will necessarily need to use complex arithmetic.

  4. We have now found a complex eigenvector \(\vvec\text{.}\) Write \(\vvec = \vvec_1 - i \vvec_2\) to identify vectors \(\vvec_1\) and \(\vvec_2\) having real entries.

  5. Construct the matrix \(P = \left[\begin{array}{rr} \vvec_1 \amp \vvec_2 \end{array}\right]\) and verify that \(A=PCP^{-1}\text{.}\)

9

For each of the following matrices, sketch the vector \(\xvec = \twovec{1}{0}\) and powers \(A^k\xvec\) for \(k=1,2,3,4\text{.}\)

  1. \(A = \left[\begin{array}{rr} 0 \amp -1.4 \\ 1.4 \amp 0 \\ \end{array}\right] \text{.}\)


  2. \(A = \left[\begin{array}{rr} 0 \amp -0.8 \\ 0.8 \amp 0 \\ \end{array}\right] \text{.}\)


  3. \(A = \left[\begin{array}{rr} 0 \amp -1 \\ 1 \amp 0 \\ \end{array}\right] \text{.}\)


  4. Consider a matrix of the form \(C=\left[\begin{array}{rr} a \amp -b \\ b \amp a \\ \end{array}\right]\) with \(r=\sqrt{a^2+b^2}\text{.}\) What happens to \(C^k\xvec\) as \(k\) becomes very large when

    1. \(r \lt 1\text{.}\)

    2. \(r = 1\text{.}\)

    3. \(r \gt 1\text{.}\)

10

For each of the following matrices and vectors, sketch the vector \(\xvec\) along with \(A^k\xvec\) for \(k=1,2,3,4\text{.}\)

  1. \begin{equation*} \begin{aligned} A \amp {}={} \left[\begin{array}{rr} 1.4 \amp 0 \\ 0 \amp 0.7 \\ \end{array}\right] \\ \\ \xvec \amp {}={} \twovec{1}{2}\text{.} \end{aligned} \end{equation*}


  2. \begin{equation*} \begin{aligned} A \amp {}={} \left[\begin{array}{rr} 0.6 \amp 0 \\ 0 \amp 0.9 \\ \end{array}\right] \\ \\ \xvec \amp {}={} \twovec{4}{3}\text{.} \end{aligned} \end{equation*}


  3. \begin{equation*} \begin{aligned} A \amp {}={} \left[\begin{array}{rr} 1.2 \amp 0 \\ 0 \amp 1.4 \\ \end{array}\right] \\ \\ \xvec\amp{}={}\twovec{2}{1}\text{.} \end{aligned} \end{equation*}


  4. \begin{equation*} \begin{aligned} A \amp {}={} \left[\begin{array}{rr} 0.95 \amp 0.25 \\ 0.25 \amp 0.95 \\ \end{array}\right] \\ \\ \xvec\amp{}={}\twovec{3}{0}\text{.} \end{aligned} \end{equation*}

    Find the eigenvalues and eigenvectors of \(A\) to create your sketch.


  5. If \(A\) is a \(2\times2\) matrix with eigenvalues \(\lambda_1=0.7\) and \(\lambda_2=0.5\) and \(\xvec\) is any vector, what happens to \(A^k\xvec\) when \(k\) becomes very large?