Proof: we prove Theorem 1 directly. We need to show null T^m ⊂ null T^{m+1} for any nonnegative integer m. Let v ∈ null T^m; then T^{m+1}v = T(T^m v) = T(0) = 0, so v ∈ null T^{m+1}. Hence null T^m ⊂ null T^{m+1}, as desired. □

The subset symbol "⊂" allows equality. The following theorem says that if equality holds between two adjacent null spaces, it propagates to all consecutive powers.
Theorem 2. Suppose T ∈ L(V) and m is a nonnegative integer such that null T^m = null T^{m+1}. Then

null T^m = null T^{m+1} = null T^{m+2} = ⋯.
Proof: we show null T^{m+k} = null T^{m+k+1} for every nonnegative integer k. Theorem 1 already gives null T^{m+k} ⊂ null T^{m+k+1}, so it remains to show null T^{m+k+1} ⊂ null T^{m+k}. Let v ∈ null T^{m+k+1}. Then T^{m+1}(T^k v) = T^{m+k+1}v = 0, hence T^k v ∈ null T^{m+1} = null T^m. Thus T^{m+k}v = T^m(T^k v) = 0, i.e., v ∈ null T^{m+k}. So null T^{m+k+1} ⊂ null T^{m+k}, as desired. □

In the proof above, one might be tempted to deduce T^{m+k+1}v = T^{m+1}(T^k v) = T^m(T^k v) = 0 directly from v ∈ null T^{m+k+1}, but that contains a minor logical leap: null T^m = null T^{m+1} does not necessarily imply T^m = T^{m+1}. With Theorem 2 we can take a step further and show that the consecutive equalities always hold once m = dim V.
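Before stating that result, here is a quick numerical illustration of the stabilizing chain of null spaces. This is a minimal NumPy sketch; the matrix T is our own illustrative choice, not from the original text.

```python
import numpy as np

# Hypothetical example: T kills e1, shifts e2 -> e1, and scales e3,
# so its null spaces grow for a while and then stabilize.
T = np.array([[0., 1., 0.],
              [0., 0., 0.],
              [0., 0., 5.]])
n = T.shape[0]

def nullity(M):
    # dim null M = dim V - dim range M (fundamental theorem of linear maps)
    return M.shape[0] - np.linalg.matrix_rank(M)

for k in range(n + 3):
    print(k, nullity(np.linalg.matrix_power(T, k)))
# Output pairs: (0, 0), (1, 1), (2, 2), (3, 2), (4, 2), (5, 2) -- once two
# consecutive nullities agree, all later ones do, as Theorems 1 and 2 predict.
```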
Theorem 3. Suppose T ∈ L(V) and let n = dim V. Then

null T^n = null T^{n+1} = null T^{n+2} = ⋯.
Proof: Suppose null T^n ≠ null T^{n+1}. Then by Theorem 2 we cannot have null T^m = null T^{m+1} for any m < n either, since such an equality would propagate to all higher powers. Hence, combining with Theorem 1,

{0} = null T^0 ⊊ null T^1 ⊊ ⋯ ⊊ null T^n ⊊ null T^{n+1}.

Each strict inclusion forces dim null T^{k+1} ≥ dim null T^k + 1, which implies dim null T^{n+1} ≥ n + 1 > dim V. This is impossible, since dim V = dim null T^{n+1} + dim range T^{n+1} gives dim null T^{n+1} ≤ dim V. The contradiction shows null T^n = null T^{n+1}, and Theorem 2 finishes the proof. □

The proof above used the fundamental theorem of linear maps, dim V = dim null T + dim range T, which does NOT always yield V = null T ⊕ range T. To see this, let T ∈ L(R^3) with T(x_1, x_2, x_3) = (x_2, x_3, 0). Then null T = {(x_1, 0, 0) : x_1 ∈ R} and range T = {(x_1, x_2, 0) : x_1, x_2 ∈ R}. While dim null T + dim range T = 1 + 2 = dim V, we have null T ∩ range T = {(x_1, 0, 0) : x_1 ∈ R} ≠ {0}. Thus R^3 ≠ null T ⊕ range T; as a matter of fact, R^3 ≠ null T + range T. Fortunately, the following theorem is a useful complement to the fundamental theorem of linear maps.
Theorem 4. Suppose T ∈ L(V) and let n = dim V. Then

V = null T^n ⊕ range T^n.
Proof: we prove Theorem 4 in two steps: first show null T^n ∩ range T^n = {0}, and then show dim null T^n + dim range T^n = n, using dim(A ⊕ B) = dim A + dim B. Let v ∈ null T^n ∩ range T^n. Then T^n v = 0 and there exists u ∈ V such that T^n u = v. So T^{2n}u = T^n v = 0, i.e., u ∈ null T^{2n} = null T^n by Theorem 3, and hence v = T^n u = 0. Now that null T^n + range T^n = null T^n ⊕ range T^n, we have

dim(null T^n ⊕ range T^n) = dim null T^n + dim range T^n = dim V = n,

where we used the fundamental theorem of linear maps again. The equation above implies V = null T^n ⊕ range T^n, as desired. □

1.1. Generalized eigenvectors

The null-range decomposition is arguably the simplest one produced by applying an operator and its powers. Another nice decomposition of a vector space is a direct sum of one-dimensional subspaces spanned by eigenvectors. Unfortunately, some operators do not have enough eigenvectors for such a decomposition. To see how rare the decomposition is, let T v_i = 𝜆_i v_i for distinct 𝜆_i, i = 1, …, m, and let U_i = {z v_i : z ∈ C}. Then

V = U_1 ⊕ U_2 ⊕ ⋯ ⊕ U_m

if and only if V has a basis of eigenvectors of T. According to the diagonalizability conditions, this happens if and only if V = E(𝜆_1, T) ⊕ ⋯ ⊕ E(𝜆_m, T). By the complex spectral theorem, the equation above always holds for normal operators on complex inner product spaces, but it does NOT hold for general operators. Fortunately, by defining the concept of generalized eigenvectors, we will see that every complex vector space is a direct sum of generalized eigenspaces.
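As a concrete illustration of this scarcity (a minimal NumPy sketch; the matrix is our own choice, not from the text), consider the simplest operator without enough eigenvectors:

```python
import numpy as np

N = np.array([[0., 1.],
              [0., 0.]])   # the classic 2x2 "shift": its only eigenvalue is 0
n = N.shape[0]

def nullity(M):
    return M.shape[0] - np.linalg.matrix_rank(M)

print(np.linalg.eigvals(N))   # [0. 0.]: a repeated eigenvalue 0
print(nullity(N))             # 1: the eigenspace E(0, N) is only 1-dimensional,
                              #    so C^2 has no basis of eigenvectors of N
print(nullity(N @ N))         # 2: but null N^2 = C^2, so every nonzero vector
                              #    turns out to be a *generalized* eigenvector

# Note also null N = range N = span{e1}, so this N is another example where
# V is not null N (+) range N, while Theorem 4 holds via null N^2 = C^2.
```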
Definition 1. Suppose T ∈ L(V) and 𝜆 is an eigenvalue of T. A vector v ∈ V is called a generalized eigenvector (of rank j) of T corresponding to 𝜆 if v ≠ 0 and

(T − 𝜆I)^j v = 0

for some positive integer j, while (T − 𝜆I)^{j−1} v ≠ 0.
Definition 2. Suppose T ∈ L(V) and 𝜆 ∈ F. The generalized eigenspace of T corresponding to 𝜆, denoted G(𝜆, T), is defined to be the set of all generalized eigenvectors of T corresponding to 𝜆, along with the 0 vector.
We DO NOT define a "generalized eigenvalue" here. From Definition 1, (T − 𝜆I)^j is not injective, and neither is T − 𝜆I, since the nonzero vector (T − 𝜆I)^{j−1}v lies in null(T − 𝜆I). Because V is finite-dimensional, this non-injectivity means that 𝜆 is just an ordinary eigenvalue of T. From Definitions 1 and 2, we know that G(𝜆, T) = null(T − 𝜆I)^j, where different 𝜆 might correspond to different j. However, the next result shows that we can unify the exponent and define the generalized eigenspaces of all possible eigenvalues with a single power.
Theorem 5. Suppose T ∈ L(V) and 𝜆 ∈ F. Then G(𝜆, T) = null(T − 𝜆I)^{dim V}.
Proof: Suppose v ∈ null(T − 𝜆I)^{dim V}. Then v ∈ G(𝜆, T) as indicated by Definition 2, so null(T − 𝜆I)^{dim V} ⊂ G(𝜆, T). Conversely, suppose v ∈ G(𝜆, T); then there exists a positive integer j such that (T − 𝜆I)^j v = 0. If j ≤ dim V, then null(T − 𝜆I)^j ⊂ null(T − 𝜆I)^{dim V} by Theorem 1. If j > dim V, then null(T − 𝜆I)^j = null(T − 𝜆I)^{dim V} by Theorem 3. Hence G(𝜆, T) ⊂ null(T − 𝜆I)^{dim V} in either case. □

Eigenvectors corresponding to distinct eigenvalues are linearly independent; the same is true for generalized eigenvectors.
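Theorem 5 is easy to check numerically. The sketch below (our own example matrix, not from the text) computes dim G(𝜆, A) as the nullity of (A − 𝜆I)^{dim V}:

```python
import numpy as np

A = np.array([[2., 1., 0.],
              [0., 2., 0.],
              [0., 0., 7.]])   # eigenvalues: 2 (defective) and 7
n = A.shape[0]

def gen_eigenspace_dim(A, lam):
    # dim G(lam, A) = dim null (A - lam*I)^(dim V), per Theorem 5
    M = np.linalg.matrix_power(A - lam * np.eye(n), n)
    return n - np.linalg.matrix_rank(M)

print(gen_eigenspace_dim(A, 2.0))   # 2 (even though E(2, A) is 1-dimensional)
print(gen_eigenspace_dim(A, 7.0))   # 1
print(gen_eigenspace_dim(A, 3.0))   # 0: 3 is not an eigenvalue, so G(3, A) = {0}
```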
Theorem 6. Let T ∈ L(V). Suppose 𝜆_1, …, 𝜆_m are distinct eigenvalues of T and v_1, …, v_m are corresponding generalized eigenvectors. Then v_1, …, v_m is linearly independent.
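Continuing the sketch above (same hypothetical matrix), we can check Theorem 6 by stacking one generalized eigenvector per distinct eigenvalue and confirming full rank:

```python
import numpy as np

A = np.array([[2., 1., 0.],
              [0., 2., 0.],
              [0., 0., 7.]])

v1 = np.array([0., 1., 0.])   # (A - 2I)v1 != 0 but (A - 2I)^2 v1 = 0:
                              # a rank-2 generalized eigenvector for lambda = 2
v2 = np.array([0., 0., 1.])   # an ordinary eigenvector for lambda = 7

# Full column rank means v1, v2 are linearly independent (Theorem 6).
print(np.linalg.matrix_rank(np.column_stack([v1, v2])))   # 2
```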
1.2. Nilpotent operators

We end this section by introducing nilpotent operators. The Latin word nil means "nothing" or "zero", and potent means "power"; thus nilpotent literally means "zero when raised to some power".
Definition 3. An operator is called nilpotent if some power of it equals 0.
For a nilpotent operator, we never need to raise its power higher than dim V to turn it into zero, as the theorem below indicates.
Theorem 7. Suppose N ∈ L(V) is nilpotent. Then N^{dim V} = 0.
Proof: since N is nilpotent, N^j = 0 for some positive integer j, so every vector of V belongs to G(0, N); that is, G(0, N) = V. From Theorem 5, G(0, N) = null N^{dim V} = V, indicating that N^{dim V} = 0. □

With respect to a suitable basis, every nilpotent operator has a matrix whose entries on and below the diagonal are all zero, as we now prove.
Theorem 8. Suppose N is a nilpotent operator on V. Then there is a basis of V with respect to which the matrix of N has the form

$$\begin{pmatrix} 0 & & * \\ & \ddots & \\ 0 & & 0 \end{pmatrix},$$

where all entries on and below the diagonal are 0's.
Proof: because of Theorem 7, we can build a desired basis as follows. First, find a basis of null N. By Theorem 1, extend it to a basis of null N^2, and keep extending until we reach a basis of null N^{dim V} = V. Now write M(N) with respect to this basis. The first several columns, which correspond to the basis vectors of null N, consist entirely of zeros. After these "zero" columns, we reach the second set of columns, corresponding to the vectors that extend the basis of null N to a basis of null N^2. If v is one of these vectors, then N^2 v = N(Nv) = 0, so Nv ∈ null N. Thus applying N to v gives a linear combination of the basis vectors of null N, which means the second set of columns can have nonzero entries only in the rows corresponding to basis vectors of null N, and those rows all lie above the diagonal. Continuing this process, we eventually obtain a matrix whose nonzero entries all lie above the diagonal. □

2. Decomposition of an Operator

As we discussed in the previous section, a general operator on a complex vector space might not have enough ordinary eigenvectors to dissect the space into one-dimensional invariant subspaces. But it is promised that complex vector spaces can always be represented as a direct sum of generalized eigenspaces. The following results show that those generalized eigenspaces are invariant under the associated operator as well.
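This invariance can be previewed numerically. In the hypothetical NumPy sketch below (same example matrix as before), we extract a basis of a generalized eigenspace from an SVD and check that A maps its span into itself:

```python
import numpy as np

A = np.array([[2., 1., 0.],
              [0., 2., 0.],
              [0., 0., 7.]])
n = A.shape[0]

M = np.linalg.matrix_power(A - 2.0 * np.eye(n), n)   # (A - 2I)^n
rank = np.linalg.matrix_rank(M)
B = np.linalg.svd(M)[2][rank:].conj().T   # columns: basis of null M = G(2, A)

# A @ B stays inside span(B) iff appending it does not increase the rank.
print(np.linalg.matrix_rank(B))                        # 2
print(np.linalg.matrix_rank(np.hstack([B, A @ B])))    # still 2: G(2, A) is invariant
```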
Theorem 9. Suppose V is a complex vector space and T ∈ L(V). Let 𝜆_1, …, 𝜆_m be the distinct eigenvalues of T. Then
(a) V = G(𝜆_1, T) ⊕ ⋯ ⊕ G(𝜆_m, T);
(b) each G(𝜆_j, T) is invariant under T;
(c) each (T − 𝜆_j I)|_{G(𝜆_j, T)} is nilpotent.
Proof: we are not going to prove (a) here, as the proof given in Axler's book was not very clear to the author. One might use the Lemme des Noyaux to prove it; because we have not introduced the concept of a kernel so far, that proof is skipped. To prove (b), note from Theorem 5 that G(𝜆_j, T) = null(T − 𝜆_j I)^{dim V}, and that (T − 𝜆_j I)^{dim V} is a polynomial in the operator T. So to prove (b) we first prove the following lemma:
Lemma 1. Suppose T ∈ L(V) and p ∈ P(F). Then null p(T) and range p(T) are invariant under T.
Proof: suppose v ∈ null p(T), so that p(T)v = 0. Then p(T)(Tv) = T(p(T)v) = 0, indicating that Tv ∈ null p(T) as well. Next, if v ∈ range p(T), there exists u ∈ V such that v = p(T)u. Similarly, Tv = T(p(T)u) = p(T)(Tu), so Tv ∈ range p(T), being the image of Tu ∈ V under p(T). In summary, null p(T) and range p(T) are invariant under T. □

Now, taking p(z) = (z − 𝜆_j)^{dim V} in Lemma 1 shows that null(T − 𝜆_j I)^{dim V} = G(𝜆_j, T) is invariant under T, which proves (b). Finally, (c) must be true because the operator (T − 𝜆_j I)|_{G(𝜆_j, T)} lives on G(𝜆_j, T), and (T − 𝜆_j I)^{dim V} v = 0 for every v ∈ null(T − 𝜆_j I)^{dim V} = G(𝜆_j, T). □

Now that we have Theorem 9(a), we can put together bases of the generalized eigenspaces to form a basis of the whole complex vector space, i.e.,
Theorem 10. Suppose V is a complex vector space and T ∈ L(V). Then there is a basis of V consisting of generalized eigenvectors of T.
2.1. Multiplicity of an eigenvalue

Because of Theorem 9(a), we can also define the multiplicity of an eigenvalue; the sum of the multiplicities of all the eigenvalues of an operator T ∈ L(V) equals dim V.
Definition 4.
- Suppose T ∈ L(V). The multiplicity of an eigenvalue 𝜆 of T is defined to be the dimension of the corresponding generalized eigenspace G(𝜆, T).
- In other words, the multiplicity of an eigenvalue 𝜆 of T equals dim null(T − 𝜆I)^{dim V}.
Theorem 11. Suppose V is a complex vector space and T ∈ L(V). Then the sum of the multiplicities of all the eigenvalues of T equals dim V.
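A hedged numerical check of Theorem 11 (the example matrix is our own choice); it also computes dim null(T − 𝜆I) for comparison, anticipating the terminology introduced next:

```python
import numpy as np

A = np.array([[2., 1., 0.],
              [0., 2., 0.],
              [0., 0., 7.]])
n = A.shape[0]

def nullity(M):
    return M.shape[0] - np.linalg.matrix_rank(M)

total = 0
for lam in (2.0, 7.0):                          # the distinct eigenvalues
    geometric = nullity(A - lam * np.eye(n))                             # dim E(lam, A)
    algebraic = nullity(np.linalg.matrix_power(A - lam * np.eye(n), n))  # dim G(lam, A)
    print(lam, geometric, algebraic)            # 2.0 1 2  and  7.0 1 1
    total += algebraic
print(total == n)                               # True: multiplicities sum to dim V
```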
The multiplicity defined above is also called the algebraic multiplicity in some books. The term geometric multiplicity is also used, but it refers to the dimension of the corresponding (ordinary) eigenspace. In other words,

geometric multiplicity of 𝜆 = dim null(T − 𝜆I),
algebraic multiplicity of 𝜆 = dim null(T − 𝜆I)^{dim V}.

2.2. Block diagonal matrix

What do the matrices of operators look like with respect to bases of generalized eigenvectors? To answer this question, we first introduce the concept of a block diagonal matrix:
Definition 5. A block diagonal matrix is a square matrix of the form

$$\begin{pmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_m \end{pmatrix},$$

where A_1, …, A_m are square matrices lying along the diagonal and all the other entries of the matrix equal 0.
As an example, the matrix below is a block diagonal matrix:

$$A = \begin{pmatrix} 4 & 0 & 0 & 0 & 0 \\ 0 & 2 & -3 & 0 & 0 \\ 0 & 0 & 2 & 0 & 0 \\ 0 & 0 & 0 & 1 & 7 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} A_1 & & 0 \\ & A_2 & \\ 0 & & A_3 \end{pmatrix}, \tag{1}$$

where $A_1 = \begin{pmatrix} 4 \end{pmatrix}$, $A_2 = \begin{pmatrix} 2 & -3 \\ 0 & 2 \end{pmatrix}$, and $A_3 = \begin{pmatrix} 1 & 7 \\ 0 & 1 \end{pmatrix}$. The following result shows how a block diagonal matrix can be related to the multiplicities of distinct eigenvalues through upper-triangular blocks.
Theorem 12. Suppose V is a complex vector space and T ∈ L(V). Let 𝜆_1, …, 𝜆_m be the distinct eigenvalues of T, with multiplicities d_1, …, d_m. Then there is a basis of V with respect to which T has a block diagonal matrix of the form

$$\begin{pmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_m \end{pmatrix},$$

where each A_j is a d_j-by-d_j upper-triangular matrix of the form

$$A_j = \begin{pmatrix} \lambda_j & & * \\ & \ddots & \\ 0 & & \lambda_j \end{pmatrix}.$$
Proof: we first use Theorem 9(a) to dissect the given complex vector space into generalized eigenspaces. For each eigenvalue 𝜆_j, the equality G(𝜆_j, T) = null(T − 𝜆_j I)^{dim V} indicates that (T − 𝜆_j I)|_{G(𝜆_j, T)} is nilpotent. By Theorem 8, it therefore has a matrix of the zero-diagonal upper-triangular form with respect to a suitable basis of G(𝜆_j, T). Furthermore, the matrix of T|_{G(𝜆_j, T)} = (T − 𝜆_j I)|_{G(𝜆_j, T)} + 𝜆_j I|_{G(𝜆_j, T)} has 𝜆_j down its diagonal with respect to the same basis. We can write similar matrices for the other eigenvalues with respect to bases of their generalized eigenspaces. Because putting the bases of the G(𝜆_j, T) together gives a basis of V, the matrix of T with respect to this basis has the blocks M(T|_{G(𝜆_j, T)}) along its diagonal, as desired. □

The matrix in Eqn. (1) has upper-triangular diagonal blocks, which indicates that the eigenvalues of the corresponding operator are 4, 2, 1, with multiplicities 1, 2, 2.

2.3. Square roots

Not every operator on a complex vector space has a square root, but we will see that I + N always has a square root when N is nilpotent. Notice that the following lemma applies to both complex and real vector spaces.
Lemma 2. Suppose N ∈ L(V) is nilpotent. Then I + N has a square root.
Proof: we first consider the Taylor expansion √(1 + x) = 1 + a_1 x + a_2 x^2 + ⋯. We do not care about the exact values of the coefficients a_j, but this suggests that the square root of I + N has the form I + a_1 N + a_2 N^2 + ⋯. Because N^m = 0 for some positive integer m, the series terminates at a_{m−1} N^{m−1}. With this guess, we should have

$$\begin{aligned} I + N &= (I + a_1 N + a_2 N^2 + \cdots + a_{m-1} N^{m-1})^2 \\ &= I + 2a_1 N + (2a_2 + a_1^2) N^2 + (2a_3 + 2a_1 a_2) N^3 + \cdots \\ &\quad + (2a_{m-1} + \text{terms involving } a_1, \ldots, a_{m-2}) N^{m-1}. \end{aligned}$$
By matching the terms on the two sides above, we get a_1 = 1/2, 2a_2 + a_1^2 = 0, and so on. Again, the specific values of the coefficients do not matter; what matters is that we can always find a set of a_j satisfying I + N = (I + a_1 N + a_2 N^2 + ⋯ + a_{m−1} N^{m−1})^2. □

Lemma 2 applies to both real and complex vector spaces. The following lemma about square roots, however, only applies to complex vector spaces.
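Before stating it, here is a sketch of the coefficient matching just described (our own code, with an illustrative nilpotent N); it solves for the a_j and verifies that the resulting operator squares to I + N:

```python
import numpy as np

def sqrt_coeffs(m):
    """Coefficients a_0, ..., a_{m-1} with (sum_j a_j x^j)^2 = 1 + x mod x^m."""
    a = [1.0]                                   # a_0 = 1
    for k in range(1, m):
        target = 1.0 if k == 1 else 0.0         # coefficients of 1 + x
        cross = sum(a[i] * a[k - i] for i in range(1, k))
        a.append((target - cross) / (2.0 * a[0]))
    return a                                    # a_1 = 1/2, a_2 = -1/8, ...

N = np.array([[0., 3., 1.],
              [0., 0., 2.],
              [0., 0., 0.]])                    # nilpotent: N^3 = 0
m = 3
a = sqrt_coeffs(m)
R = sum(a[j] * np.linalg.matrix_power(N, j) for j in range(m))
print(np.allclose(R @ R, np.eye(3) + N))        # True: R is a square root of I + N
```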
Lemma 3. Suppose V is a complex vector space and T ∈ L(V) is invertible. Then T has a square root.
Proof: let 𝜆_1, …, 𝜆_m be the distinct eigenvalues of T. On each generalized eigenspace G(𝜆_j, T), the operator N_j = (T − 𝜆_j I)|_{G(𝜆_j, T)} is nilpotent by Theorem 9(c), and T|_{G(𝜆_j, T)} = 𝜆_j I + N_j = 𝜆_j(I + N_j/𝜆_j). Here 𝜆_j ≠ 0 because T is invertible, and N_j/𝜆_j is clearly nilpotent, so I + N_j/𝜆_j has a square root by Lemma 2. Since every nonzero complex number has a square root, T|_{G(𝜆_j, T)} has a square root R_j = √𝜆_j (I + N_j/𝜆_j)^{1/2}, where √𝜆_j is any complex square root of 𝜆_j. Now every v ∈ V can be written as

v = u_1 + u_2 + ⋯ + u_m, with u_j ∈ G(𝜆_j, T).

If we define R by

Rv = R_1 u_1 + R_2 u_2 + ⋯ + R_m u_m,

then, since each R_j u_j stays in G(𝜆_j, T),

R^2 v = R_1^2 u_1 + R_2^2 u_2 + ⋯ + R_m^2 u_m = Σ_{j=1}^m T|_{G(𝜆_j, T)} u_j = Tv,

as desired. □

By imitating the techniques in this section, you should be able to prove that if V is a complex vector space and T ∈ L(V) is invertible, then T has a kth root for every positive integer k.

3. Characteristic and Minimal Polynomial

We will prove the Cayley-Hamilton theorem in this section, and from it derive important results about the minimal polynomial of an operator. The theorem does not have much application in quantum theory, but it is a big deal in control theory (check this video about reachability and controllability by Prof. Steve Brunton). Mathematically, the theorem provides a way to simplify the calculation of the matrix exponential, as we will see later.

3.1. The Cayley-Hamilton theorem

Before we prove the Cayley-Hamilton theorem, we need to define the concept of the characteristic polynomial.
Definition 6. Suppose V is a complex vector space and T ∈ L(V). Let 𝜆_1, …, 𝜆_m denote the distinct eigenvalues of T, with multiplicities d_1, …, d_m. The polynomial

(z − 𝜆_1)^{d_1} ⋯ (z − 𝜆_m)^{d_m}

is called the characteristic polynomial of T.
With this, we can state the theorem as follows.
Theorem 13. Suppose V is a complex vector space and T ∈ L(V). Let q denote the characteristic polynomial of T. Then q(T) = 0.
Proof: first, dissect V into a direct sum of the generalized eigenspaces associated with the distinct eigenvalues 𝜆_1, …, 𝜆_m, with multiplicities d_1, …, d_m. The characteristic polynomial of T then gives

$$q(T) = (T - \lambda_1 I)^{d_1} \cdots (T - \lambda_m I)^{d_m}. \tag{2}$$

From Theorem 9(c) we know (T − 𝜆_j I)|_{G(𝜆_j, T)} is nilpotent, and a nilpotent operator N on a vector space of dimension d satisfies N^d = 0 by Theorem 7. Since dim G(𝜆_j, T) = d_j, we have (T − 𝜆_j I)^{d_j}|_{G(𝜆_j, T)} = 0. To show that q(T) in Eqn. (2) equals zero, it suffices to show q(T)|_{G(𝜆_j, T)} = 0 for each j. Let v ∈ G(𝜆_j, T); then

$$q(T)v = (T - \lambda_1 I)^{d_1} \cdots (T - \lambda_j I)^{d_j} \cdots (T - \lambda_m I)^{d_m} v = \Big[\prod_{i \neq j} (T - \lambda_i I)^{d_i}\Big] (T - \lambda_j I)^{d_j} v. \tag{3}$$

The second equality in Eqn. (3) holds because the factors all commute, i.e., (T − 𝜆_i I)(T − 𝜆_j I) = (T − 𝜆_j I)(T − 𝜆_i I), so we can move (T − 𝜆_j I)^{d_j} to the far right, where it gives (T − 𝜆_j I)^{d_j} v = 0. That is, q(T)|_{G(𝜆_j, T)} = 0, as desired. □

As an example of applying Theorem 13, consider the calculation of exp(At), with A an operator or a matrix. Expanding exp(At) gives

$$\exp(At) = I + At + \frac{1}{2!}A^2 t^2 + \frac{1}{3!}A^3 t^3 + \cdots. \tag{4}$$

Let n = dim V, so that the characteristic polynomial q(A) = (A − 𝜆_1 I)^{d_1} ⋯ (A − 𝜆_m I)^{d_m} has degree d_1 + ⋯ + d_m = n. Theorem 13 then lets us write

$$A^n = c_0 I + c_1 A + c_2 A^2 + \cdots + c_{n-1} A^{n-1} \tag{5}$$

for appropriate scalars c_0, …, c_{n−1}; recursively, every higher power of A is also a linear combination of I, A, …, A^{n−1}. Substituting Eqn. (5) into Eqn. (4) turns the infinite series into a finite one,

$$\exp(At) = p_0(t) I + p_1(t) A + p_2(t) A^2 + \cdots + p_{n-1}(t) A^{n-1}, \tag{6}$$

for appropriate functions p_0, …, p_{n−1}. This conversion may allow a simpler calculation.
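The reduction behind Eqns. (5) and (6) is easy to demonstrate numerically. In this hypothetical NumPy sketch (our own example matrix), np.poly supplies the characteristic polynomial's coefficients, and polynomial division reduces a high power of A to degree below n:

```python
import numpy as np

A = np.array([[2., 1., 0.],
              [0., 2., 0.],
              [0., 0., 7.]])
n = A.shape[0]

q = np.poly(A)   # characteristic polynomial coefficients, highest degree first

def polyvalm(coeffs, A):
    """Evaluate a polynomial (highest-degree coefficient first) at a matrix."""
    out = np.zeros_like(A)
    for c in coeffs:                 # Horner's rule with matrix arguments
        out = out @ A + c * np.eye(len(A))
    return out

print(np.allclose(polyvalm(q, A), 0.0))   # True: q(A) = 0, i.e. Theorem 13

# Reduce A^9 to a polynomial in A of degree < n: the remainder of x^9
# divided by q(x) evaluates to the same matrix as A^9 itself.
x9 = np.zeros(10); x9[0] = 1.0            # coefficients of x^9
r = np.polydiv(x9, q)[1]                  # remainder, degree < n
print(np.allclose(polyvalm(r, A), np.linalg.matrix_power(A, 9)))   # True
```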
3.2. Minimal Polynomial

The definition of the minimal polynomial of an operator depends on the concept of a monic polynomial, given below.
Definition 7. A monic polynomial is a polynomial whose highest-degree coefficient equals 1.
As an example, z^5 + 7z^3 + z + 1 is a monic polynomial of degree 5. We now give the definition of the minimal polynomial and then prove that such a monic polynomial exists and is unique for a given operator.
Definition 8. Suppose T ∈ L(V). Then the minimal polynomial of T is the unique monic polynomial p of smallest degree such that p(T) = 0.
As promised, we now prove:
Lemma 4. Suppose T ∈ L(V). Then there is a unique monic polynomial p of smallest degree such that p(T) = 0.
Proof: we first prove that such a monic polynomial exists. Let n = dim V; then dim L(V) = n^2, so the list

I, T, T^2, …, T^{n^2}

is not linearly independent, since it has length n^2 + 1. Let m be the smallest integer such that I, T, …, T^m is linearly dependent. By the linear dependence lemma, T^m can be expressed as a linear combination of I, T, …, T^{m−1}, i.e.,

a_0 I + a_1 T + ⋯ + a_{m−1} T^{m−1} + T^m = 0.

Let p(z) = a_0 + a_1 z + ⋯ + a_{m−1} z^{m−1} + z^m; then p(T) = 0. To prove uniqueness, suppose q is another monic polynomial of smallest degree with q(T) = 0. Its degree must also be m, since a polynomial of degree lower than m cannot vanish at T. Then (p − q)(T) = 0 with deg(p − q) < m; because I, T, …, T^{m−1} is linearly independent, all coefficients of p − q vanish, so p = q. □

The following results tell us more about the inner structure of the minimal polynomial, but we will omit their proofs here.
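The existence proof is constructive enough to run. The sketch below (our own code; the matrix is chosen so the minimal and characteristic polynomials differ) finds the smallest m with I, A, …, A^m linearly dependent and reads off the monic coefficients:

```python
import numpy as np

A = np.array([[2., 1., 0.],
              [0., 2., 0.],
              [0., 0., 2.]])   # characteristic poly (x-2)^3, minimal poly (x-2)^2
n = A.shape[0]

# vec(A^k) for k = 0, 1, ..., n^2: columns of a growing matrix.
powers = [np.linalg.matrix_power(A, k).ravel() for k in range(n * n + 1)]

for m in range(1, n * n + 1):
    prev = np.column_stack(powers[:m])          # vec(I), ..., vec(A^{m-1})
    curr = np.column_stack(powers[:m + 1])
    if np.linalg.matrix_rank(curr) == np.linalg.matrix_rank(prev):
        # A^m is a linear combination of lower powers: solve for it.
        a, *_ = np.linalg.lstsq(prev, powers[m], rcond=None)
        break

p = np.concatenate([[1.0], -a[::-1]])   # monic coefficients, highest degree first
print(p)              # [ 1. -4.  4.]  i.e.  p(x) = x^2 - 4x + 4 = (x - 2)^2
print(np.roots(p))    # [2. 2.]: its zeros are eigenvalues of A, as Lemma 7 states
```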
Lemma 5. Suppose T ∈ L(V) and q ∈ P(F). Then q(T) = 0 if and only if q is a polynomial multiple of the minimal polynomial p of T; in other words, there exists s ∈ P(F) such that q = ps.
Lemma 6. Suppose F = C and T ∈ L(V). Then the characteristic polynomial of T is a polynomial multiple of the minimal polynomial of T.
From Definition 6, the zeros of the characteristic polynomial of an operator are precisely its eigenvalues. The minimal polynomial turns out to have the same zeros, though possibly with different multiplicities.
Lemma 7. Let T ∈ L(V). Then the zeros of the minimal polynomial of T are precisely the eigenvalues of T.
4. Jordan Form

In Section 2.2 we showed that there is a basis of a complex vector space V with respect to which the matrix of an operator T is a nice block upper-triangular matrix. Fortunately, we can do even better by writing M(T) with respect to a Jordan basis.
Definition 9. Suppose T ∈ L(V). A basis of V is called a Jordan basis for T if, with respect to this basis, T has a block diagonal matrix

$$\begin{pmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_p \end{pmatrix},$$

where each A_j is an upper-triangular matrix of the form

$$A_j = \begin{pmatrix} \lambda_j & 1 & & 0 \\ & \lambda_j & \ddots & \\ & & \ddots & 1 \\ 0 & & & \lambda_j \end{pmatrix}.$$
The following result indicates that a Jordan basis always exists for operators on complex vector spaces.
Theorem 14. Suppose V is a complex vector space. If T ∈ L(V), then there is a basis of V that is a Jordan basis for T.
We will skip the proof here. Interested readers might find the lemma below useful for proving Theorem 14.
Lemma 8. Suppose N ∈ L(V) is nilpotent. Then there exist vectors v_1, …, v_n ∈ V and nonnegative integers m_1, …, m_n such that
(a) N^{m_1} v_1, …, N v_1, v_1, …, N^{m_n} v_n, …, N v_n, v_n is a basis of V;
(b) N^{m_1 + 1} v_1 = ⋯ = N^{m_n + 1} v_n = 0.
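For experimentation, SymPy can produce a Jordan basis directly. This is a hedged illustration with an outside tool (not part of the original text): jordan_form returns P and J with A = P J P^{-1}, and the columns of P form a Jordan basis in the sense of Definition 9.

```python
from sympy import Matrix

A = Matrix([[2, 1, 0],
            [0, 2, 0],
            [0, 0, 7]])

P, J = A.jordan_form()        # columns of P: a Jordan basis for A
print(J)                      # Matrix([[2, 1, 0], [0, 2, 0], [0, 0, 7]])
print(A == P * J * P.inv())   # True
```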