It would be great if the author could find at least 2 hours a day to write and read, but life happens. The grand goal here is to finish the notes on what is covered in Linear Algebra Done Right. Then we will move into topics of understanding various quantum algorithms through linear algebra.

1. Operator Revisited
2. From Invariant Subspace to Eigenspace
2.1. But, does T have eigenvalues?
2.2. Polynomial of operator
3. Eigen- in Matrix representation
3.1. Why are diagonal matrices important?

1. Operator Revisited

We defined the concept of operator in our previous note as a linear map from a vector space to itself, i.e., T ∈ L(V). To better understand and prove theorems related to eigenvalues and eigenvectors, we introduce the following theorem:
Theorem 1. Suppose V is finite-dimensional and T ∈ L(V). Then the following are equivalent: a. T is invertible, b. T is injective, and c. T is surjective.
Proof: The implications from a to b and c are immediate, as invertibility of a map gives injectivity and surjectivity. Suppose T ∈ L(V) is injective; then null T = {0} and dim null T = 0, so dim V = dim range T by the fundamental theorem of linear maps. Since range T ⊆ V, this forces range T = V, i.e., T is surjective; a map that is both injective and surjective is invertible, so b gives c and a. Finally, if T is surjective, we have range T = V, so dim range T = dim V; the fundamental theorem then gives dim null T = 0, so T is injective and hence invertible.

With Theorem 1, we also know that for T ∈ L(V), T not being invertible implies T being neither injective nor surjective, and vice versa. Now that we have refreshed our knowledge of operators, we shall focus on the invariant subspace in the following sections, from which the concepts of eigenvalue and eigenvector are derived.

2. From Invariant Subspace to Eigenspace

We first give the definition of invariant subspace, and we will see that the subspace spanned by a single eigenvector is a special case of an invariant subspace.
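The equivalence in Theorem 1 can also be checked numerically. The following sketch (not part of the original notes; it uses NumPy and a hypothetical 2-by-2 matrix standing in for T) verifies that injectivity, surjectivity, and invertibility coincide for an operator on a finite-dimensional space:

```python
import numpy as np

# Hypothetical operator on R^2, represented as a matrix w.r.t. the standard basis.
T = np.array([[2.0, 1.0],
              [0.0, 3.0]])

rank = np.linalg.matrix_rank(T)
dim_V = T.shape[0]

injective = (dim_V - rank) == 0            # dim null T = 0
surjective = rank == dim_V                 # dim range T = dim V
invertible = abs(np.linalg.det(T)) > 1e-12

# Theorem 1: the three properties stand or fall together.
assert injective == surjective == invertible
```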
Definition 1. Let U ⊆ V be a subspace of V and T ∈ L(V). We call U an invariant subspace (under T) when for every u ∈ U, Tu ∈ U.
If we let v ∈ V be nonzero, then we can obtain a one-dimensional subspace U = span(v) = {w : w = 𝜆v, 𝜆 ∈ F}. Suppose U is invariant under T ∈ L(V); then Tv ∈ U, which is only possible if Tv = 𝜆v for some 𝜆 ∈ F. As a result, we have Tw = 𝜆w for every w ∈ U, and we call 𝜆 and v here an eigenvalue and an eigenvector, respectively. In the following, we give the formal definitions of these two concepts.
Definition 2. Suppose T ∈ L(V). A scalar 𝜆 ∈ F is called an eigenvalue of T if there exists v ∈ V such that v ≠ 0 and Tv = 𝜆v.
Definition 3. Suppose T ∈ L(V) and 𝜆 ∈ F. A vector v ∈ V is called an eigenvector of T (corresponding to 𝜆) if v ≠ 0 and Tv = 𝜆v.
There are two things worth noting here: a) there is no restriction forbidding 𝜆 = 0. However, because we demand v ≠ 0 in Definition 2, T ∈ L(V) does NOT always have 0 as one of its eigenvalues. In particular, 𝜆 = 0 is not an eigenvalue if T is injective (i.e., null T = {0}). b) The zero vector can NOT be an eigenvector.

It is not unusual to have linearly independent vectors corresponding to the same eigenvalue. Suppose we have v1, ..., vm being eigenvectors of T that all correspond to the eigenvalue 𝜆; then we call the subspace U = span(v1, ..., vm) the eigenspace of T corresponding to 𝜆, denoted E(𝜆,T). This is an empirical description of the eigenspace. To make it formal, i.e., written in the terminology of linear algebra, let us first introduce a special class of operator, T - 𝜆I ∈ L(V), where I is the identity operator on V. We consider it here because for an eigenvector v of T, we have

(T - 𝜆I)v = Tv - 𝜆v = 0

Thus, T ∈ L(V) having an eigenvalue and eigenvector is equivalent to null(T - 𝜆I) ≠ {0} for some 𝜆 ∈ F. Furthermore, from Theorem 1 we have the following
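The relation Tv = 𝜆v and the singularity of T - 𝜆I can be observed directly in code. This sketch (a hypothetical symmetric 2-by-2 matrix, chosen only for illustration) uses NumPy's `eig` to produce eigenpairs and confirms that each eigenvector lies in null(T - 𝜆I):

```python
import numpy as np

# Hypothetical 2x2 operator used only for illustration.
T = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, eigvecs = np.linalg.eig(T)

# Each eigenpair satisfies T v = lam v, i.e. v lies in null(T - lam I).
for lam, v in zip(eigvals, eigvecs.T):
    assert np.allclose(T @ v, lam * v)
    # T - lam I is singular, so its null space is nontrivial (Theorem 2).
    assert np.linalg.matrix_rank(T - lam * np.eye(2)) < 2
```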
Theorem 2. For a finite-dimensional vector space V, T ∈ L(V), and 𝜆 ∈ F, the following statements are equivalent: a) 𝜆 is an eigenvalue of T, b) T - 𝜆I is not invertible, c) T - 𝜆I is not injective, and d) T - 𝜆I is not surjective.
With the introduction above, we are ready to give the definition of eigenspace as
Definition 4. Suppose T ∈ L(V) and 𝜆 ∈ F. The eigenspace of T corresponding to 𝜆 is defined by

E(𝜆,T) = null(T - 𝜆I)
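Definition 4 suggests a direct way to compute an eigenspace: find the null space of T - 𝜆I. The helper below is a sketch (the function name `eigenspace_basis` and the example matrix are hypothetical, not from the text), using the SVD to extract an orthonormal null-space basis:

```python
import numpy as np

def eigenspace_basis(T, lam, tol=1e-10):
    """Orthonormal basis of E(lam, T) = null(T - lam*I), via the SVD.
    A sketch, not a library routine."""
    n = T.shape[0]
    _, s, vh = np.linalg.svd(T - lam * np.eye(n))
    # Right singular vectors with (near-)zero singular value span the null space.
    return vh[s <= tol].conj().T

# Hypothetical example: eigenvalue 2 has a two-dimensional eigenspace.
T = np.diag([2.0, 2.0, 5.0])
E2 = eigenspace_basis(T, 2.0)
assert E2.shape[1] == 2                  # dim E(2, T) = 2
assert np.allclose(T @ E2, 2.0 * E2)     # every column is an eigenvector
```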
Now that we have eigen-value/vector/space defined, we need to answer the question of their existence, as indicated in the title of the following subsection.

2.1. But, does T have eigenvalues?

The following theorem guarantees the existence of eigenvalues on complex vector spaces.
Theorem 3. Every operator T ∈ L(Cⁿ) with n < ∞ has an eigenvalue (and corresponding eigenvectors).
Proof: Suppose V is a complex vector space with dim V = n, and let v ∈ V be nonzero. Then the list

v, Tv, T²v, ..., Tⁿv

cannot be linearly independent, as its length, n + 1, exceeds dim V. Therefore, there exist complex coefficients a0, ..., an, not all zero, such that

0 = a0v + a1Tv + a2T²v + ⋯ + anTⁿv

According to the fundamental theorem of algebra, the corresponding polynomial factors over C, so

0 = (a0I + a1T + a2T² + ⋯ + anTⁿ)v = c(T - 𝜆1I)(T - 𝜆2I)⋯(T - 𝜆mI)v

where m ≤ n as some of the complex coefficients might be zero. Since v ≠ 0, T - 𝜆jI is not injective for at least one 𝜆j, which is equivalent to saying that one of the 𝜆j is an eigenvalue of T.

The proof above uses the fundamental theorem of algebra to show the existence of eigenvalues, indicating that polynomials of an operator, p(T), could be very useful. So we introduce properties of p(T) in the following section.

2.2. Polynomial of operator

The polynomial in the proof above is a polynomial of an operator. You will find such a representation pervasive in the literature of quantum computing/algorithms, so it's worth our time to learn their interesting properties. Let us first discuss a p(T) of degree two. For T ∈ L(V), we call the operator idempotent if T² = T. Idempotent operators on Hilbert spaces are a vibrant topic in the quantum computing community, but we will save the excitement until we hit that topic. From the perspective of linear algebra, arguably the most interesting property of idempotent operators is
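The existence argument is constructive enough to run numerically. The sketch below (a random hypothetical 3-by-3 complex operator; all names are illustrative) builds the dependent list v, Tv, ..., Tⁿv, finds a nonzero coefficient vector from its null space, and checks that the roots of the resulting polynomial contain an eigenvalue of T:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
T = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
v = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# Columns: v, Tv, T^2 v, ..., T^n v  (n+1 vectors in C^n, hence dependent).
cols = [v]
for _ in range(n):
    cols.append(T @ cols[-1])
K = np.column_stack(cols)

# A nonzero coefficient vector a with K a = 0: the right singular vector
# belonging to the smallest singular value of K.
a = np.linalg.svd(K)[2][-1].conj()

# Roots of a0 + a1 z + ... + an z^n ; np.roots expects highest degree first.
roots = np.roots(a[::-1])
eigs = np.linalg.eigvals(T)

# As in the proof, at least one root of the polynomial is an eigenvalue of T.
assert any(np.min(np.abs(eigs - r)) < 1e-4 for r in roots)
```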
Theorem 4. Suppose P ∈ L(V) and P² = P. Then V = null P ⊕ range P.
Proof: To see this, we first notice that for v ∈ V, we can rewrite it as

v = (I - P)v + Pv

Clearly Pv ∈ range P, and P(I - P)v = Pv - P²v = 0, so (I - P)v ∈ null P. Thus, we have V = null P + range P. To show the sum is direct, suppose u ∈ null P ∩ range P; then there is v ∈ V such that Pv = u and Pu = 0. Noticing that u = Pv = P²v = Pu = 0, we have null P ∩ range P = {0}, which results in V = null P ⊕ range P.

The proof of Theorem 3 also indicates that a polynomial of an operator can help to find eigenvalues of the operator. Indeed,
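The decomposition v = (I - P)v + Pv from the proof can be traced in code. Here is a minimal sketch with a hypothetical idempotent (an oblique projection on R², not from the text):

```python
import numpy as np

# A hypothetical idempotent on R^2: P^2 = P.
P = np.array([[1.0, 1.0],
              [0.0, 0.0]])
assert np.allclose(P @ P, P)

v = np.array([3.0, 4.0])
# The decomposition from the proof: v = (I - P)v + Pv.
u_null = (np.eye(2) - P) @ v   # lies in null P
u_range = P @ v                # lies in range P

assert np.allclose(P @ u_null, 0)          # indeed in null P
assert np.allclose(u_null + u_range, v)    # the two pieces recover v
```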
Theorem 5. Let v ∈ V be nonzero, and let p be a polynomial of smallest degree such that p(T)v = 0. Then every zero of p is an eigenvalue of T.
Proof: Let 𝜆 be one of the zeros of p. Then we can write p(T) = (T - 𝜆I)q(T), where q is a polynomial with deg q = deg p - 1. By the minimality of deg p, we have q(T)v ≠ 0. But p(T)v = (T - 𝜆I)q(T)v = 0, so the nonzero vector q(T)v lies in null(T - 𝜆I). Thus T - 𝜆I is not injective, and by Theorem 2, 𝜆 must be an eigenvalue of T.

In quantum computing, we sometimes need to consider polynomials of composite operators, e.g., p(ABC). Suppose S is an invertible operator on V; then we can show that
p(STS⁻¹) = S p(T) S⁻¹ for T, S ∈ L(V) with S invertible.
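This identity follows because (STS⁻¹)ᵏ = STᵏS⁻¹ (the inner S⁻¹S pairs cancel), and it is easy to confirm numerically. A sketch with a hypothetical polynomial p(x) = x² + 2x + 3 and random matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
T = rng.standard_normal((3, 3))
S = rng.standard_normal((3, 3))   # a random matrix is generically invertible
S_inv = np.linalg.inv(S)

def p(A):
    # Hypothetical polynomial p(x) = x^2 + 2x + 3, applied to an operator:
    # p(A) = A^2 + 2A + 3I.
    return A @ A + 2 * A + 3 * np.eye(3)

lhs = p(S @ T @ S_inv)
rhs = S @ p(T) @ S_inv
assert np.allclose(lhs, rhs)      # p(S T S^-1) = S p(T) S^-1
```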
3. Eigen- in Matrix representation

From the previous discussion of the representation of linear maps, if dim V = n and T ∈ L(V), then the matrix of the linear map, M(T), is an n-by-n matrix with respect to a basis of V. In introductory classes of linear algebra, it is emphasized that the eigenvalues of a linear map are the diagonal values of its corresponding upper-triangular matrix. Using the language of bases and dimensions, a vector space has various distinct bases, and an operator can only be represented by an upper-triangular matrix, if there is one, when we choose the correct basis. Since every operator on a complex vector space has eigenvalues, it should not be a surprise that every such operator can be represented by an upper-triangular matrix.
Theorem 6. Suppose V is a finite-dimensional complex vector space and T ∈ L(V). Then T has an upper-triangular matrix with respect to some basis of V.
It is noteworthy that Theorem 6 does not imply that the diagonal values of the upper-triangular matrix are always nonzero. They surely can be zero, as 0 is a valid eigenvalue. On the other hand, if there are zero diagonal values, then the upper-triangular matrix is no longer invertible. Based on our discussion of the representation of linear maps, we can deduce that for a linear map with an upper-triangular matrix, its application to the basis vector vj results in a linear combination of v1, v2, ..., vj. This means that Tvj ∈ span(v1, v2, ..., vj). Because this holds for every vi with i ≤ j, we should have that span(v1, v2, ..., vj) is invariant under T. With this, we can write out the conditions for an operator to have an upper-triangular matrix.
Theorem 7. Suppose T ∈ L(V) and v1, ..., vn is a basis of V. Then the following conditions are equivalent: a) the matrix of T w.r.t. v1, ..., vn is upper triangular, b) Tvj ∈ span(v1, v2, ..., vj) for each j = 1, ..., n, and c) span(v1, v2, ..., vj) is invariant under T for each j = 1, ..., n.
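For the standard basis e1, ..., en, condition b) is simply the statement that every entry below the diagonal of column j is zero. The following sketch (a hypothetical upper-triangular matrix, chosen for illustration) checks this, together with the fact from the section opening that the diagonal entries are the eigenvalues:

```python
import numpy as np

# Hypothetical upper-triangular M(T) w.r.t. the standard basis e1, e2, e3.
M = np.array([[1.0, 4.0, 6.0],
              [0.0, 2.0, 5.0],
              [0.0, 0.0, 3.0]])

n = M.shape[0]
for j in range(n):
    Tej = M[:, j]                 # T applied to the j-th basis vector
    # Condition b): T e_j lies in span(e_1, ..., e_j),
    # i.e. no components below the diagonal.
    assert np.allclose(Tej[j + 1:], 0)

# The diagonal entries of the upper-triangular matrix are the eigenvalues.
assert np.allclose(np.sort(np.linalg.eigvals(M)), [1.0, 2.0, 3.0])
```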
While the author admits that Theorem 7 is logically obvious based on our previous discussion, it might not be useful for solving real-world problems. In quantum-mechanical calculations, especially, people care more about finding out whether an operator on a Hilbert space can be diagonalized. That is, can we find some basis of the Hilbert space so that the operator's matrix contains nonzero entries only on its diagonal? To know whether an operator is diagonalizable, we check it with the following conditions.
Theorem 8. Suppose V is finite-dimensional and T ∈ L(V). Let its distinct eigenvalues be 𝜆1, ..., 𝜆m. Then T being diagonalizable is equivalent to each of the following statements: a) V has a basis consisting only of eigenvectors of T, b) with dim V = n, there exist n one-dimensional subspaces Ui of V, each invariant under T, such that V = U1 ⊕ U2 ⊕ ⋯ ⊕ Un, c) V = E(𝜆1,T) ⊕ ⋯ ⊕ E(𝜆m,T), and d) dim V = dim E(𝜆1,T) + dim E(𝜆2,T) + ⋯ + dim E(𝜆m,T).
Theorem 8 indicates that the diagonalizability of an operator is related to the relationship between the vector space and its eigenspaces. As a matter of fact, an operator is guaranteed to be diagonalizable if it has the maximum possible number of distinct eigenvalues.
Theorem 9. If T ∈ L(V) has dim V distinct eigenvalues, then T is diagonalizable.
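Theorem 9 is easy to exercise numerically: with dim V distinct eigenvalues, the eigenvectors form a basis, so stacking them as columns gives an invertible matrix that diagonalizes T. A sketch with a hypothetical 2-by-2 operator:

```python
import numpy as np

# Hypothetical operator with distinct eigenvalues 2 and 3.
T = np.array([[2.0, 1.0],
              [0.0, 3.0]])

eigvals, V = np.linalg.eig(T)
# dim V distinct eigenvalues, as Theorem 9 requires.
assert len(set(np.round(eigvals, 8))) == T.shape[0]

# Columns of V are a basis of eigenvectors; then V^-1 T V is diagonal.
D = np.linalg.inv(V) @ T @ V
assert np.allclose(D, np.diag(np.diag(D)))   # D is diagonal
assert np.allclose(np.diag(D), eigvals)      # with the eigenvalues on it
```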
3.1. Why are diagonal matrices important?

Now that we have discussed the relationship between diagonal matrices and eigenvalues of operators, we can try to summarize the importance of diagonal matrices in quantum-mechanical calculations. As a starter, we first notice that people use eigenvectors to represent states of quantum systems after measurements. Measurements here are physical realizations of the linear operators we have discussed so far. Similarly, specific physical properties of a quantum system (e.g., energies) take the values of eigenvalues of such operators. Thus, for a diagonalizable operator, upon reading out the result (eigenvalue) of a measurement, we know the quantum system of interest must be in one of the states represented by the eigenvectors. In some literature, states corresponding to eigenvectors of a diagonal matrix are referred to as "pure states". Imagine instead an operator that only has an upper-triangular matrix; upon measurement the resulting state is a linear combination of multiple basis vectors, and thus has no "well-defined" physical properties. In quantum computing, we demand that the logic units of computation (i.e., qubits) have well-defined physical properties. Most of the time, this means the system treated as a qubit should have two energy levels, and/or distinct spin states. The reason is obvious: we do not want fuzzy results coming out of our computation!

As an example, suppose we have a qubit made of a two-state system (e.g., a superconducting qubit), and we measure the system state after applying the Z-gate to it. The Z-gate is an operator that has the following diagonal matrix

Z = [ 1   0
      0  -1 ]

The matrix above is with respect to two basis vectors, v1 and v2, that span the vector space containing all possible quantum states of the system. The two states corresponding to the two basis vectors will be used in calculation (think of these two states as the binary numbers 0 and 1). From the matrix, it is transparent that the Z-gate has two eigenvalues, 1 and -1.
Let Zv1 = v1 and Zv2 = -v2; then we know that the Z-gate has no effect on the state v1, while it turns v2 into -v2 (or adds a global phase, in the language of wavefunctions). Thus, being diagonalizable here means that the effect of the Z-gate is totally known to us, and we can safely use the gate in our quantum circuit to do complicated tasks.
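The action of the Z-gate described above takes only a few lines to verify. A sketch with v1 = (1, 0) and v2 = (0, 1) as the basis vectors (the superposition at the end is an added illustration, not from the text):

```python
import numpy as np

# The Z-gate matrix from the text, acting on basis states v1 and v2.
Z = np.array([[1, 0],
              [0, -1]])
v1 = np.array([1, 0])
v2 = np.array([0, 1])

assert np.allclose(Z @ v1, v1)     # Z leaves v1 untouched
assert np.allclose(Z @ v2, -v2)    # Z flips the sign of v2

# On a superposition the -1 acts as a relative phase:
# Z(a v1 + b v2) = a v1 - b v2.
a, b = 0.6, 0.8
psi = a * v1 + b * v2
assert np.allclose(Z @ psi, a * v1 - b * v2)
```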