This note aims to provide shallow references to important aspects of linear maps, which relate to the juicy part of linear algebra: matrices. Proofs are mostly omitted, as the author was in a hurry (and lazy). Like the previous notes, we use V to denote a vector space over a field F.

1. Linear Map and Its Properties
  1.1. Injectivity and surjectivity of linear maps
  1.2. Invertibility and Isomorphism of linear maps
2. Matrix, A Representation of Linear Map
  2.1. Rank of a matrix
  2.2. Notations for matrix manipulations
3. Duality
  3.1. Null space and range of the dual of a linear map
4. Things missed out in this note

1. Linear Map and Its Properties

In mathematics, a mapping builds a correspondence from one set to another. In linear algebra, the sets are vector spaces. A map is called linear if the following conditions are satisfied:
Definition 1. A map T from V to W is called linear if
  T(v + u) = Tv + Tu for all v, u ∈ V, and
  T(λv) = λTv for all λ ∈ F and v ∈ V.
For simplicity, we write T : V → W to denote a linear map from vector space V to W. The symbol T stands for transformation, as you will soon see why. Similar to the definition of a vector space, the definition of a linear map is accompanied by a definition of "zero" or "null":
Definition 2. The null space of a linear map T : V → W is defined as
  null T = {v ∈ V : Tv = 0}.
To give an example of a null space, let V = R³, and for arbitrary (x1, x2, x3) ∈ V let T(x1, x2, x3) = (x1, x2), i.e. T : R³ → R². In this case, null T = {(0, 0, x3) : x3 ∈ R}. We also have a name for the set of vectors in W that are images of vectors in V under T:
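As a sanity check on this example, the null space of a matrix can be computed numerically from its singular value decomposition. A minimal sketch using numpy; the helper `null_space_basis` is an illustrative name, not a standard numpy function:

```python
import numpy as np

# Matrix of T(x1, x2, x3) = (x1, x2) with respect to the standard bases
T_mat = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])

def null_space_basis(A, tol=1e-10):
    """Return an orthonormal basis of null A, as columns, via the SVD."""
    _, s, vh = np.linalg.svd(A)
    rank = int((s > tol).sum())
    return vh[rank:].T  # rows of V^t beyond the rank span the null space

N = null_space_basis(T_mat)
print(N.shape)  # a single basis vector: null T = {(0, 0, x3)}
```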
Definition 3. The range of a linear map T : V → W, denoted range T, is the subset of W containing the vectors that satisfy the following condition:
  range T = {w ∈ W : Tv = w for some v ∈ V}.
With these definitions introduced, we are ready to give the theorem that relates T, range T, and null T. Because it is used so frequently in proofs of linear algebra, it gets a dramatic name:
Theorem 1 (Fundamental theorem of linear maps). Suppose V is finite-dimensional and T is a linear map from V to W. Then
  dim V = dim null T + dim range T.
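Theorem 1 can be checked numerically for maps on R^n: the rank of a matrix counts dim range T, and the remaining rows of Vᵗ in its SVD span null T. A quick sketch (the tolerance 1e-10 is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6))   # stands in for some T : R^6 -> R^4

_, s, vh = np.linalg.svd(A)
dim_range = int((s > 1e-10).sum())   # dim range T (the rank)
null_basis = vh[dim_range:]          # rows of V^t spanning null T
dim_null = null_basis.shape[0]

# dim V = dim null T + dim range T
print(dim_range + dim_null == A.shape[1])
```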
Definition 1 indicates that linear maps are closed under addition and scalar multiplication, so it should not be a surprise that all the linear maps T : V → W form a vector space, denoted L(V, W). Before we discuss special properties of linear maps, we need to clarify two important concepts related to linear maps: injectivity and surjectivity.

1.1. Injectivity and surjectivity of linear maps

A linear map is called injective if either of the following equivalent conditions holds:
Theorem 2. Injectivity of a linear map T : V → W means:
  a. for v, u ∈ V, Tv = Tu if and only if v = u; and
  b. null T = {0}, i.e. dim null T = 0.
With Theorem 2, it should not be hard to imagine that a map cannot be injective when the cardinality of V is larger than that of W. For vector spaces, cardinal numbers are usually infinite; thus, a formal way of saying this is that a linear map is not injective when dim V > dim W. To see this, notice that range T ⊆ W. According to Theorem 1, we have dim null T = dim V − dim range T ≥ dim V − dim W > 0, contradicting Theorem 2b. Thus, there is at least one pair v and u satisfying Tv = Tu with u ≠ v. As for surjectivity, the following theorem holds:
Theorem 3. Surjectivity of a linear map T : V → W means range T = W.
Theorem 3 indicates that a linear map is not surjective if dim V < dim W, as can be proved using Theorem 1. In other words, when dim V < dim W there is at least one w ∈ W that no v ∈ V maps to.

So what is so special about injectivity and surjectivity? How can we solve problems by exploiting these two properties of a linear map? Arguably the easiest application is to check whether an (in)homogeneous system of linear equations has a solution. A homogeneous system of m linear equations in n unknowns can be written in the following form:

  a_{1,1}x_1 + a_{1,2}x_2 + ⋯ + a_{1,n}x_n = 0
  a_{2,1}x_1 + a_{2,2}x_2 + ⋯ + a_{2,n}x_n = 0
  ⋮
  a_{m,1}x_1 + a_{m,2}x_2 + ⋯ + a_{m,n}x_n = 0        (1)

If we let V = F^n, then we can define a linear map T : F^n → F^m as

  T(x_1, x_2, ..., x_n) = (Σ_{j=1}^n a_{1,j}x_j, Σ_{j=1}^n a_{2,j}x_j, ..., Σ_{j=1}^n a_{m,j}x_j)        (2)

Having a nonzero solution to the system above is equivalent to having a nonzero element in null T, i.e., dim null T > 0. According to Theorem 1, this condition can only be satisfied when dim F^n > dim range T. Because dim range T ≤ dim F^m, we must have nonzero solutions whenever n > m. Thus, a homogeneous system of linear equations has nonzero solutions when its corresponding map T : F^n → F^m is not injective.

On the other hand, for an inhomogeneous system of linear equations, having a solution means there is at least one vector v = (x_1, x_2, ..., x_n) ∈ F^n satisfying Tv = (c_1, c_2, ..., c_m), where some c_i ≠ 0. Given the linear map defined in Eqn. (2), we claim that a solution to the inhomogeneous system does not exist for some choices of c_i when m > n, as the linear map T is then not surjective.

Other than this simple application, surjectivity and injectivity also dictate two properties of linear maps, invertibility and isomorphism, as we shall now see.

1.2. Invertibility and Isomorphism of linear maps

We first define the invertibility of a linear map as follows:
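The n > m case can be illustrated numerically: with more unknowns than equations, a random coefficient matrix still has a nonzero null vector. A sketch (random data, so this only demonstrates the generic case):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 5                      # more unknowns than equations
A = rng.standard_normal((m, n))  # coefficients a_{i,j} of the system

# rank(A) <= m < n, so Theorem 1 forces dim null T >= n - m > 0:
# a nonzero solution x with A @ x = 0 must exist.
_, s, vh = np.linalg.svd(A)
rank = int((s > 1e-10).sum())
x = vh[-1]                       # a unit vector from the null space

print(rank < n, np.allclose(A @ x, 0))
```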
Definition 4. A linear map T ∈ L(V, W) is invertible if there exists another linear map S ∈ L(W, V) such that S(Tv) = v holds for all v ∈ V and T(Sw) = w holds for all w ∈ W.
As a matter of fact, the inverse map S in Definition 4 is usually written T⁻¹ (the inverse of T). If a linear map has an inverse, we have
Proposition 1. The inverse of a linear map is unique.
as can be shown by assuming two inverses, S1 and S2, of a linear map T:

  S1 = S1 I = S1 (T S2) = (S1 T) S2 = I S2 = S2

According to Definition 4, S(Tv) = v for all v ∈ V and T(Sw) = w for all w ∈ W, which indicates
Proposition 2. A linear map is invertible if and only if it is injective and surjective.
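For square matrices, Definition 4 can be checked with numpy. A random square matrix is invertible with probability 1, so the sketch below assumes the draw is indeed invertible:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))   # matrix of an operator T on R^4

S = np.linalg.inv(A)              # matrix of the inverse map S = T^{-1}

# Both composition orders give the identity map, per Definition 4.
print(np.allclose(S @ A, np.eye(4)), np.allclose(A @ S, np.eye(4)))
```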
Being invertible is very important for quantum-mechanical calculations, as an invertible map, if it is also unitary, is equivalent to a reversible operator whose physical realization is a logic gate on a quantum circuit. Unlike quantum gates, classical logic gates such as AND, OR, and XOR are irreversible, as the number of their inputs is larger than the number of their outputs. From the perspective of information theory, being reversible means that we can recover the input information simply by knowing the outputs, which can be achieved by gates that follow an invertible linear map from inputs to outputs. To streamline possible later discussion of quantum operators (gates), we give here the definition of an operator in linear algebra:
Definition 5. An operator is a linear map from a vector space to itself, i.e. T : V → V. The set of operators on a vector space V forms a vector space as well, denoted L(V).
The concept of isomorphism is a natural extension of the invertibility of linear maps, as we shall introduce now:
Definition 6. An isomorphism is an invertible linear map, and thus two vector spaces are called isomorphic if an invertible linear map exists from one to the other.
Isomorphism gives an equivalence of two vector spaces. This is very powerful, as a problem that is hard in one vector space can become much easier after reformulating it in another vector space, provided we know the two spaces are isomorphic. In linear algebra, arguably the most important isomorphism is the one between the vector space of linear maps, L(V, W), and the vector space of matrices, F^{m,n}. We will come back to the proof of this proposition once we introduce the concept of a matrix.

2. Matrix, A Representation of Linear Map

The title of this section indicates no intention of explaining the representation theory used in quantum theories, even though the author would much love to. Instead, we will introduce the form of a matrix and its interpretation. However, we will not focus on standard contents such as matrix addition, multiplication, etc., for which readers can refer to introductory textbooks or Wikipedia pages. The primary purpose of this section is to reveal how one can understand a matrix in terms of a linear map, and the mechanistic calculation of matrices is never the fun part.

To begin with, suppose we have a vector space V with basis v_1, v_2, ..., v_n, and another vector space W with basis w_1, w_2, ..., w_m. Therefore, dim V = n and dim W = m. Now we define a linear map T : V → W as

  Tv_i = a_{1,i}w_1 + a_{2,i}w_2 + ⋯ + a_{m,i}w_m   for i = 1, 2, ..., n        (3)

where a_{i,j} ∈ F. Then the linear map can be represented by an m-row, n-column matrix, with columns labeled by v_1, ..., v_n and rows labeled by w_1, ..., w_m:

          v_1      ⋯   v_i      ⋯   v_n
  w_1  ( a_{1,1}   ⋯   a_{1,i}  ⋯   a_{1,n} )
  w_2  ( a_{2,1}   ⋯   a_{2,i}  ⋯   a_{2,n} )
   ⋮   (    ⋮           ⋮            ⋮     )
  w_m  ( a_{m,1}   ⋯   a_{m,i}  ⋯   a_{m,n} )        (4)

By multiplying the elements in each column with the associated w_j labels outside the matrix, we recover the representation (3) of Tv_i in W. To see how an arbitrary vector v ∈ V is mapped to W using the matrix in (4), let us first represent v as a column vector,

  M(v) = (c_1, c_2, ..., c_n)ᵗ  (a column vector)        (5)

which indicates v = c_1v_1 + c_2v_2 + ⋯ + c_nv_n with c_i ∈ F for i = 1, 2, ..., n. M here maps v ∈ V to a column vector, and it is itself a linear map; we will discuss this more later.
Given Eqn. (3), we can calculate Tv as

  Tv = c_1Tv_1 + c_2Tv_2 + ⋯ + c_nTv_n
     = c_1 Σ_{j=1}^m a_{j,1}w_j + c_2 Σ_{j=1}^m a_{j,2}w_j + ⋯ + c_n Σ_{j=1}^m a_{j,n}w_j
     = (Σ_{i=1}^n c_i a_{1,i}) w_1 + (Σ_{i=1}^n c_i a_{2,i}) w_2 + ⋯ + (Σ_{i=1}^n c_i a_{m,i}) w_m        (6)

If we use the rule of matrix multiplication to multiply the matrix in (4) with the vector in (5), we obtain an m-row column vector

  w = (Σ_{i=1}^n c_i a_{1,i},  Σ_{i=1}^n c_i a_{2,i},  ...,  Σ_{i=1}^n c_i a_{m,i})ᵗ        (7)

Comparing (7) with (6), it should be obvious that the resulting vector w is indeed the image of v in W. In other words, v is mapped to w through the linear map T represented by the matrix in (4). To be formal, we can define the matrix of a linear map as follows:
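The agreement between (6) and (7) is exactly what numpy's matrix-vector product implements. A small sketch, with random entries standing in for the a_{j,i} and c_i:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 3, 4
M_T = rng.standard_normal((m, n))   # entries a_{j,i} of M(T), as in (4)
c = rng.standard_normal(n)          # coordinates c_i of v, as in (5)

# Column i of M_T lists the coefficients of T v_i in the basis w_1..w_m,
# so summing the columns weighted by c_i computes Tv as in (6)...
w_manual = np.zeros(m)
for i in range(n):
    w_manual += c[i] * M_T[:, i]

# ...which agrees with the matrix-vector product of (7).
print(np.allclose(w_manual, M_T @ c))
```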
Definition 7. Suppose T ∈ L(V, W), v_1, v_2, ..., v_n is a basis of V, and w_1, ..., w_m is a basis of W. The matrix of T with respect to these bases is the m-by-n matrix M(T) whose entries a_{i,j} are defined by Eqn. (3).
Definition 7 emphasizes the use of bases: the matrix of a map always depends on the choice of bases. If we treat M as a map that sends T to a matrix, then it is easy to show that M itself is a linear map, as it respects addition and scalar multiplication. Also, for simplicity in the later discussion, we use F^{m,n} to denote the set of all m-by-n matrices over F.

So far, we have taken for granted that a linear map can always be represented by a matrix. But the validity of such a claim needs to be examined by proving the following proposition:
Proposition 3. Suppose v_1, v_2, ..., v_n is a basis of V and w_1, ..., w_m is a basis of W. Then M is an isomorphism between L(V, W) and F^{m,n}.
Proof: M is a linear map, since for any T, S ∈ L(V, W) we have M(T + S) = M(T) + M(S) and M(λT) = λM(T) for λ ∈ F. So we only need to prove that it is injective and surjective, because of Proposition 2. According to Theorem 2, if M(T) = 0 then Tv_i = 0 for every basis vector of V. This is only possible when T = 0, i.e., null M = {0}; thus M is injective. To prove that M is surjective, let A ∈ F^{m,n} be an arbitrary matrix. We can always define a linear map T from V to W by

  Tv_i = Σ_{j=1}^m A_{j,i}w_j   for i = 1, 2, ..., n,

from which M(T) = A holds. In other words, range M = F^{m,n}, and M is surjective. ∎

2.1. Rank of a matrix

So far we have only considered matrices in which each row and each column is associated with a basis vector of a vector space. What if we add more rows or columns to such a matrix? Will the expanded matrix still correspond isomorphically to a linear map? This may seem a weird question, but there is no limitation on expanding the size of a matrix at all. Let us add some rows to a matrix A corresponding to a linear map T ∈ L(V, W); adding columns follows the same logic. Suppose v_1, v_2, v_3 is a basis of V and w_1, w_2, w_3 is a basis of W. If we define the map as

  Tv_i = A_{1,i}w_1 + A_{2,i}w_2 + A_{3,i}w_3

then we can write down A as a 3-by-3 matrix. Now, if we rewrite the definition above as

  Tv_i = A′_{1,i}w_1 + A′_{2,i}w_2 + A′_{3,i}w_3 + A′_{4,i}(2w_2 + w_3)   with A′_{4,i} ≠ 0,

then it is obvious that A_{1,i} = A′_{1,i}, A_{2,i} = A′_{2,i} + 2A′_{4,i}, and A_{3,i} = A′_{3,i} + A′_{4,i}. The two definitions map an arbitrary vector v = c_1v_1 + c_2v_2 + c_3v_3 ∈ V to the same vector w ∈ W, but the correspondence between T and A′ is NOT an isomorphism. To see this, simply note that we can give different values to A′_{2,i}, A′_{3,i}, and A′_{4,i} while still making the relations A_{2,i} = A′_{2,i} + 2A′_{4,i} and A_{3,i} = A′_{3,i} + A′_{4,i} hold. In other words, T ∈ L(V, W) corresponds to multiple A′ ∈ F^{4,3}, so the map sending A′ to T is not injective. Such matrices have rows or columns that are linearly dependent. For those cases, we define the row rank and the column rank of a matrix as follows:
Definition 8. Suppose A ∈ F^{m,n}. The row (column) rank of A is the dimension of the span of the rows (columns) of A.
The ranks of a matrix can be related to a linear map through the following proposition:
Proposition 4. The dimension of range T equals the column rank of M(T).
and
Proposition 5. The row rank of a matrix equals its column rank.
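Proposition 5 can be illustrated with `numpy.linalg.matrix_rank`, which computes the column rank; applying it to the transpose gives the row rank. Here a rank-deficient matrix is built deliberately by repeating rows:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 4))
A = np.vstack([A, A[0], A[1]])   # a 5x4 matrix with two repeated rows

col_rank = np.linalg.matrix_rank(A)     # dim span of the columns
row_rank = np.linalg.matrix_rank(A.T)   # dim span of the rows

print(row_rank == col_rank)
```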
Thus, the rank of a matrix A ∈ F^{m,n} is defined as the column rank of A.

2.2. Notations for matrix manipulations

We end this section by introducing algebraic notations for matrix manipulations that will be used in the following sections. Following the notation in Axler's book, the transpose of a matrix is defined as:
Definition 9. The transpose of a matrix A, denoted Aᵗ, has entries satisfying the relation
  (Aᵗ)_{i,j} = A_{j,i}.
For people who have encountered linear algebra before, the following rule should not be alien:
Proposition 6. For two matrices A and B (with AB defined), we have (AB)ᵗ = BᵗAᵗ.
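Before the proof, a quick numerical check of the rule with random rectangular matrices:

```python
import numpy as np

rng = np.random.default_rng(5)
A = rng.standard_normal((3, 4))   # m-by-n
B = rng.standard_normal((4, 2))   # n-by-q, so AB is defined

# Transposing the product swaps the factors.
print(np.allclose((A @ B).T, B.T @ A.T))
```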
Proof: Suppose A is an m-by-n matrix and B is n-by-q. The elements of AB can be represented as

  (AB)_{i,j} = Σ_{k=1}^n A_{i,k}B_{k,j}

where A_{i,k} and B_{k,j} are elements of A and B, respectively. Definition 9 then gives

  (AB)ᵗ_{i,j} = (AB)_{j,i} = Σ_{k=1}^n A_{j,k}B_{k,i} = Σ_{k=1}^n Bᵗ_{i,k}Aᵗ_{k,j} = (BᵗAᵗ)_{i,j}. ∎

Because of Proposition 3, the invertibility of matrices is well defined. Let A⁻¹ be the inverse of an n-by-n matrix A; then A⁻¹A = I, with I the n-by-n identity matrix. We can also use row and column indices to denote the ith row of a matrix A as A_{i,·} and the jth column as A_{·,j}. With this, the following identities hold:

  (AB)_{k,·} = A_{k,·}B
  (AB)_{·,r} = AB_{·,r}

3. Duality

For quantum computing and quantum information theories, linear maps that map vectors to the scalar field F play critical roles, as those maps are used to collapse wavefunctions into states with real physical properties. Because it is so special, a linear map T ∈ L(V, F) is called a linear functional. For an arbitrary vector space V, we call L(V, F) its dual space, denoted V′. With this, we can find that
Proposition 7. dim V′ = dim V.
Proof: To see this, first note that the isomorphism between L(V, W) and F^{m,n} implies dim L(V, W) = dim F^{m,n}. Computing dim F^{m,n} is easy: we can take as basis vectors the m-by-n matrices with a single nonzero entry, at the ith row and jth column. Thus dim F^{m,n} = m × n = dim L(V, W), and dim V′ = dim L(V, F) = n × 1 = n = dim V. ∎

If v_1, ..., v_n is a basis of V, then we can define the dual basis of v_1, ..., v_n:
Definition 10. The dual basis of a basis v_1, ..., v_n of V is the list φ_1, ..., φ_n of elements of the dual space V′, where each φ_j is the linear functional on V such that

  φ_j(v_k) = 1 if k = j,  and  φ_j(v_k) = 0 if k ≠ j.
With this definition, we can prove the following:
Proposition 8. The dual basis is a basis of the dual space.
Proof: We just need to prove that φ_1, ..., φ_n is linearly independent, because every linearly independent list of length dim V′ = n is a basis of V′. Suppose a_1, ..., a_n ∈ F are such that

  a_1φ_1 + ⋯ + a_nφ_n = 0.

Applying both sides to v_j gives (a_1φ_1 + ⋯ + a_nφ_n)(v_j) = a_j for j = 1, ..., n, so the equation above implies a_1 = a_2 = ⋯ = a_n = 0. Hence the list φ_1, ..., φ_n is linearly independent. ∎

We can take a step further and define the dual map:
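Concretely, in R^n a functional can be represented as a row vector acting by the dot product. Under that identification (an assumption of this sketch, not something established above), the dual basis of the columns of an invertible matrix B is given by the rows of B⁻¹, since φ_j(v_k) = (B⁻¹B)_{j,k} = δ_{j,k}:

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4
B = rng.standard_normal((n, n))   # columns v_1..v_n: a basis of R^n (almost surely)

Phi = np.linalg.inv(B)            # row j represents the functional phi_j

# phi_j(v_k) = Phi[j] @ B[:, k] = (Phi @ B)[j, k] = delta_{j,k}
print(np.allclose(Phi @ B, np.eye(n)))
```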
Definition 11. If T ∈ L(V, W), then the dual map of T is the linear map T′ ∈ L(W′, V′) such that T′(φ) = φ ∘ T for φ ∈ W′.
Here φ ∘ T means the composition of the two maps: for an arbitrary vector v ∈ V, φ ∘ T first maps it to the vector Tv ∈ W, and then applies φ to Tv to map it to a scalar. The algebraic properties of dual maps listed below might be useful:

  (S + T)′ = S′ + T′   for all S, T ∈ L(V, W)
  (λT)′ = λT′   for all λ ∈ F
  (ST)′ = T′S′   where T ∈ L(U, V), S ∈ L(V, W)

3.1. Null space and range of the dual of a linear map

We can describe range T′ and null T′ in terms of range T and null T by first defining the annihilator of a subspace.
Definition 12. For a subspace U ⊆ V, the annihilator of U, denoted U⁰, is defined by

  U⁰ = {φ ∈ V′ : φ(u) = 0 for all u ∈ U}.
The dimension of the annihilator is related to dim V through the following equation:
Proposition 9. Suppose V is finite-dimensional and U is a subspace of V. Then dim U + dim U⁰ = dim V.
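Proposition 9 can be checked in R^n, again representing functionals as row vectors: the annihilator of U is then the null space of Uᵗ, where the columns of the (illustratively named) array `U` span the subspace. A sketch:

```python
import numpy as np

rng = np.random.default_rng(7)
n, k = 5, 2
U = rng.standard_normal((n, k))   # columns span a subspace U of R^5

dim_U = np.linalg.matrix_rank(U)  # dim U (= k almost surely)

# A row vector f annihilates U iff f @ U = 0, i.e. f lies in null(U^t).
_, s, vh = np.linalg.svd(U.T)
dim_U0 = vh.shape[0] - int((s > 1e-10).sum())

print(dim_U + dim_U0 == n)
```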
Suppose V and W are finite-dimensional and T ∈ L(V, W). Then the following results hold. The author lists them here while having no clue how they are used in quantum-mechanical calculations:
  null T′ = (range T)⁰
  dim null T′ = dim null T + dim W − dim V
  dim range T′ = dim range T
  range T′ = (null T)⁰
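Two of these dimension results can be checked numerically, using the standard fact (proved in Axler's book, not above) that with respect to dual bases the matrix of T′ is M(T)ᵗ:

```python
import numpy as np

rng = np.random.default_rng(8)
m, n = 3, 5
A = rng.standard_normal((m, n))   # M(T) for some T : R^5 -> R^3
At = A.T                          # M(T') with respect to the dual bases

def nullity(M, tol=1e-10):
    """Dimension of the null space of the map represented by M."""
    _, s, _ = np.linalg.svd(M)
    return M.shape[1] - int((s > tol).sum())

# dim range T' = dim range T
print(np.linalg.matrix_rank(At) == np.linalg.matrix_rank(A))
# dim null T' = dim null T + dim W - dim V
print(nullity(At) == nullity(A) + m - n)
```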
4. Things missed out in this note

Linear maps form a rich branch of linear algebra. There are many interesting things left out here, among which products and quotients of vector spaces are definitely worth diving into. While quotients may not be pervasive in the quantum computing literature, they seem to be a handy tool in proofs of quantum information theories.