Why eigenvectors and eigenvalues?

I took linear algebra in high school, and one thing that really confused me was eigenvectors and eigenvalues. Why are they so important? I’m taking linear algebra again, and my professor made it very clear why.

Let’s say we have a linear transformation $T$, and we want to find $T(\vecb{x})$. But what if transforming $\vecb{x}$ directly is difficult? Then we’ll write $T(\vecb{x}) = A\vecb{x}$, where $A$ is the associated matrix, and we’ll find its eigenvectors and eigenvalues: the $\vecb{v}$’s and $\lambda$’s such that $A\vecb{v} = \lambda\vecb{v}$.
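To make this concrete, here’s a quick NumPy sketch. The $2 \times 2$ matrix `A` is just a made-up example of mine; `np.linalg.eig` returns the $\lambda$’s and $\vecb{v}$’s we’re after.

```python
import numpy as np

# A made-up 2x2 matrix standing in for the transformation T
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# eig returns the eigenvalues and a matrix whose columns are the eigenvectors
lams, V = np.linalg.eig(A)

# Each pair should satisfy A v = lambda v
for lam, v in zip(lams, V.T):
    assert np.allclose(A @ v, lam * v)

print(lams)  # the eigenvalues of this A are 5 and 2 (possibly in a different order)
```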

If these eigenvectors happen to form a basis, then $\vecb{x}$ can be written as a linear combination of the eigenvectors, $c_1\vecb{v}_1 + \ldots + c_n\vecb{v}_n$, and then we can plug it back into our transformation.

$$\begin{align*}
T(\vecb{x}) &= A\vecb{x} \\
&= A(c_1\vecb{v}_1 + \ldots + c_n\vecb{v}_n) \\
&= c_1A\vecb{v}_1 + \ldots + c_nA\vecb{v}_n \\
&= c_1\lambda_1\vecb{v}_1 + \ldots + c_n\lambda_n\vecb{v}_n
\end{align*}$$
Depending on the transformation we’re working with, this new calculation could turn out to be a lot easier!
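As a sanity check (same made-up $A$ as above, and I’m just picking the $c_k$’s by hand), the two sides really do agree:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
lams, V = np.linalg.eig(A)        # columns of V are the eigenvectors v_1, v_2

# Pick some coordinates c_k and build x as a combination of eigenvectors
c = np.array([2.0, -1.0])
x = c[0] * V[:, 0] + c[1] * V[:, 1]

# T(x) = A x should match c_1*lambda_1*v_1 + c_2*lambda_2*v_2
assert np.allclose(A @ x, c[0] * lams[0] * V[:, 0] + c[1] * lams[1] * V[:, 1])
```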

Update: But that’s kind of clunky. Let’s try to find a simpler expression for the sum
$$ T(\vecb{x}) = c_1\lambda_1\vecb{v}_1 + \ldots + c_n\lambda_n\vecb{v}_n$$

Matrices were made to deal with large sets of data, so let’s define $P$ to be a matrix of eigenvectors (as column vectors) and $D$ to be a matrix of eigenvalues along the main diagonal (and zeros elsewhere):
$$
P = \begin{bmatrix}
| & & | \\
\vecb{v}_1 & \ldots & \vecb{v}_n \\
| & & | \\
\end{bmatrix} \qquad
D = \begin{bmatrix}
\lambda_1 & \\
& \ddots \\
&& \lambda_n
\end{bmatrix}$$
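Handily, `np.linalg.eig` already gives back $P$ in exactly this column form, and `np.diag` builds $D$ (same made-up $A$ as before):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

lams, P = np.linalg.eig(A)   # P already has the eigenvectors as its columns
D = np.diag(lams)            # eigenvalues on the main diagonal, zeros elsewhere
```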

If we multiply the matrix $P$, the matrix $D$, and the vector $\vecb{c}$ whose entries are the coordinates $c_1, \ldots, c_n$, notice that we get
$$
\begin{align*}
PD\vecb{c} &= \begin{bmatrix}
| & & | \\
\vecb{v}_1 & \ldots & \vecb{v}_n \\
| & & | \\
\end{bmatrix}
\begin{bmatrix}
\lambda_1 & \\
& \ddots \\
&& \lambda_n
\end{bmatrix} \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} \\
&= \begin{bmatrix}
| & & | \\
\vecb{v}_1 & \ldots & \vecb{v}_n \\
| & & | \\
\end{bmatrix}\begin{bmatrix} \lambda_1 c_1 \\ \vdots \\ \lambda_n c_n \end{bmatrix} \\
&=c_1\lambda_1\vecb{v}_1 + \ldots + c_n\lambda_n\vecb{v}_n \\
\end{align*}
$$
which is the original sum.
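Numerically (same sketch as before), $PD\vecb{c}$ and the written-out sum agree:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
lams, P = np.linalg.eig(A)
D = np.diag(lams)

c = np.array([2.0, -1.0])
explicit_sum = sum(c[k] * lams[k] * P[:, k] for k in range(len(lams)))

# P D c collapses the whole sum into one matrix product
assert np.allclose(P @ D @ c, explicit_sum)
```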

But we still have those $c_k$’s in there. We know that
$$
\begin{align*}
\vecb{x} &= \begin{bmatrix}
| & & | \\
\vecb{v}_1 & \ldots & \vecb{v}_n \\
| & & | \\
\end{bmatrix} \begin{bmatrix} c_1 \\ \vdots \\ c_n \end{bmatrix} \\
&= P\vecb{c}
\end{align*}
$$
so $\vecb{c} = P^{-1}\vecb{x}$ ($P$ is invertible here precisely because its columns, the eigenvectors, form a basis).
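In code I’d solve the system $P\vecb{c} = \vecb{x}$ rather than form $P^{-1}$ explicitly (it’s cheaper and more numerically stable), but it’s the same $\vecb{c} = P^{-1}\vecb{x}$; this still assumes the eigenvectors of our made-up $A$ form a basis, which they do here:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
lams, P = np.linalg.eig(A)

x = np.array([1.0, 2.0])          # any vector we want to transform

# c = P^{-1} x, computed by solving P c = x
c = np.linalg.solve(P, x)
assert np.allclose(P @ c, x)      # x really is c_1 v_1 + ... + c_n v_n
```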

Cool! Since we know that $T(\vecb{x}) = PD\vecb{c}$, we have our final expression:
$$ T(\vecb{x}) = A\vecb{x} = PDP^{-1} \vecb{x} $$

This is the diagonalization of $A$, where $D$ is the diagonal matrix and $P$ is the change-of-basis matrix. Why would we do this? Working with diagonal matrices tends to be a lot easier than working with non-diagonal matrices. Again, depending on the transformation, this new calculation could be a lot easier than simply applying the original transformation.
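And just to close the loop on the same made-up example: the factorization really does reproduce $A$, and $PDP^{-1}\vecb{x}$ matches $A\vecb{x}$.

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])
lams, P = np.linalg.eig(A)
D = np.diag(lams)

# A factors as P D P^{-1}
assert np.allclose(A, P @ D @ np.linalg.inv(P))

# and T(x) = A x agrees with P D (P^{-1} x) for any x
x = np.array([1.0, 2.0])
assert np.allclose(A @ x, P @ D @ np.linalg.solve(P, x))
```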

March 5, 2013, 11:08am by Casey