This is the last note on preliminary linear algebra. Readers who have been following along may notice that much of the "normal" material about matrices has not been stressed, or even mentioned, so far in this series. To finish strong, we will focus on the determinant and trace of matrices in this last note. This first-operator-then-matrix approach is what Axler's book is best known for. To the author, such a method helps cultivate your intuition about linear maps and operators. If you can prove things without referring to matrices, you are doing the "hard part" of the work to understand linear algebra. Once you survive the "hard" work, manipulation of matrices might be more transparent. As in previous notes, we still assume the vector space V is nonzero and finite-dimensional over the field F=C (with remarks on F=R handled via complexification where needed).

Contents:
1. Trace
1.1. Change of Basis
1.2. Trace: A connection between matrices and operators
2. Determinant
2.1. Determinant of an operator
2.2. Determinant of a matrix
2.3. Volume
2.3.1. Polar coordinates
2.3.2. Spherical coordinates
3. Where to go from here?

1. Trace

As a starter, we need to revisit the way a matrix is defined. For any linear map T∈L(V,W), we need one basis of V and another basis of W to define the matrix that represents T. Let v1,...,vn and w1,...,wm be bases of V and W, respectively. Then the matrix of T with respect to the two bases is M(T,(v1,...,vn),(w1,...,wm)), which has shape m-by-n. We will make the bases explicit in the notation for matrices, as it helps us prove some results in the subsection below.

1.1. Change of Basis

Suppose v1,...,vn is a basis of V. The identity operator I∈L(V) satisfies Ivi=vi, so M(I,(v1,...,vn),(v1,...,vn)) is the n-by-n matrix with all diagonal entries equal to unity. Now let u1,...,un be another basis of V. Then M(I,(v1,...,vn),(u1,...,un)) is the matrix that converts the coordinates of every vector v∈V from the basis v1,...,vn to the basis u1,...,un. Change of basis has extensive use in physics, as it is often a way to simplify a problem. Here we are interested in using M(I,(v1,...,vn),(u1,...,un)) to relate the trace of operators to the trace of the corresponding matrices. For an operator T∈L(V), we have discussed its matrix M(T,(v1,...,vn)), i.e., we use only one basis to represent the matrix. If we use two different bases to represent matrices of operators, we can rewrite the rule of matrix multiplication as the following
Lemma 1. Suppose u1,…,un and v1,…,vn and w1,…,wn are all bases of V. Suppose S,T∈L(V). Then
M(ST,(u1,…,un),(w1,…,wn)) = M(S,(v1,…,vn),(w1,…,wn)) M(T,(u1,…,un),(v1,…,vn)).   (1)
Read from right to left, the right-hand side of eqn. (1) applies T while passing from the basis {ui} to {vi}, and then applies S while passing from {vi} to {wi}; the composite is exactly ST expressed from {ui} to {wi}. With Lemma 1, we can also prove that M(I,(v1,...,vn),(u1,...,un)) is invertible.
Lemma 2. Suppose u1,…,un and v1,…,vn are bases of V. Then the matrices M(I,(u1,…,un),(v1,…,vn)) and M(I,(v1,…,vn),(u1,…,un)) are invertible, and each is the inverse of the other.
Proof: We replace wi with ui and S,T with I in eqn. (1) to get
I = M(I,(u1,…,un)) = M(I,(v1,…,vn),(u1,…,un)) M(I,(u1,…,un),(v1,…,vn)) = AB.   (2)
Interchanging vi with ui in eqn. (2) gives
I = M(I,(v1,…,vn)) = M(I,(u1,…,un),(v1,…,vn)) M(I,(v1,…,vn),(u1,…,un)) = BA,   (3)
i.e., I = AB = BA, so M(I,(u1,…,un),(v1,…,vn)) is the inverse of M(I,(v1,…,vn),(u1,…,un)).□
Let A = M(I,(v1,…,vn),(u1,…,un)); the proof above shows that A⁻¹ = M(I,(u1,…,un),(v1,…,vn)). This inverse relationship is easy to verify numerically, as in the sketch below.
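As a quick illustration (my own numpy sketch, not from the notes), take the columns of two random matrices as the bases u1,…,un and v1,…,vn of R³, build both change-of-basis matrices by solving linear systems, and check that they are inverse to each other:

```python
import numpy as np

rng = np.random.default_rng(0)
U = rng.standard_normal((3, 3))  # columns: the basis u1, u2, u3 (generically independent)
V = rng.standard_normal((3, 3))  # columns: the basis v1, v2, v3

# Column j of M(I,(v1,v2,v3),(u1,u2,u3)) holds the coordinates of v_j in the u-basis,
# i.e., it solves U @ col = v_j.
A = np.linalg.solve(U, V)  # M(I,(v),(u))
B = np.linalg.solve(V, U)  # M(I,(u),(v))

print(np.round(A @ B, 12))  # identity matrix: each is the inverse of the other
```

The following formula shows how the matrix of an operator changes when we change bases of the vector space.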
Lemma 3. Suppose T∈L(V). Let u1,…,un and v1,…,vn be bases of V. Let A = M(I,(u1,…,un),(v1,…,vn)). Then
M(T,(u1,…,un)) = A⁻¹ M(T,(v1,…,vn)) A.   (4)
Proof: Based on Lemma 1, if we replace wj with uj and replace S with I, we get
M(T,(u1,…,un)) = A⁻¹ M(T,(u1,…,un),(v1,…,vn)).   (5)
Use Lemma 1 again, but replace wj with vj this time; also replace T with I and S with T, getting
M(T,(u1,…,un),(v1,…,vn)) = M(T,(v1,…,vn)) A.   (6)
Substituting (6) into (5) gives the desired result.□

1.2. Trace: A connection between matrices and operators

We first give the definitions of the trace of an operator and the trace of a matrix separately. While the two definitions look different, we will prove that they are essentially the same thing. Along the way, we also show that the trace of a matrix is invariant under a change of basis.
Definition 1. Suppose T∈L(V).
- If F=C, then the trace of T is the sum of the eigenvalues of T, with each eigenvalue repeated according to its multiplicity.
- If F=R, then the trace of T is the sum of the eigenvalues of TC, with each eigenvalue repeated according to its multiplicity.
The trace of T is denoted by trace T.
The TC above is the complexification of the operator T acting on a real vector space. For the sake of completeness, we define the complexification of a real vector space and of an operator as follows:
Definition 2. Suppose V is a real vector space.
- The complexification of V, denoted VC, equals V×V. An element of VC is an ordered pair (u,v), where u,v∈V, but we will write this as u+iv.
- Addition on VC is defined by (u1+iv1)+(u2+iv2) = (u1+u2)+i(v1+v2) for u1,v1,u2,v2∈V.
- Complex scalar multiplication on VC is defined by (a+bi)(u+iv) = (au-bv)+i(av+bu) for a,b∈R and u,v∈V.
Definition 3. Suppose V is a real vector space and T∈L(V). The complexification of T, denoted TC, is the operator TC∈L(VC) defined by TC(u+iv) = Tu+iTv for u,v∈V.
The matrix of TC (with respect to a basis of V, viewed as a basis of VC) equals the matrix of T. When 𝜆 is a real eigenvalue of T, it is also an eigenvalue of TC. Moreover, suppose V is a real vector space, T∈L(V), and 𝜆∈C; then 𝜆 is an eigenvalue of TC if and only if its complex conjugate 𝜆̄ is an eigenvalue of TC. Recall that the characteristic polynomial of an operator is (z-𝜆1)^{d1}(z-𝜆2)^{d2}⋯(z-𝜆m)^{dm}, where the 𝜆i are the distinct eigenvalues and the di their multiplicities (the dimensions of the generalized eigenspaces). Expanding this polynomial gives
z^n - (d1𝜆1 + ⋯ + dm𝜆m) z^{n-1} + ⋯ + (-1)^n 𝜆1^{d1}⋯𝜆m^{dm},   (7)
which results in the following lemma:
Lemma 4. Suppose T∈L(V). Let n = dim V. Then trace T equals the negative of the coefficient of z^{n-1} in the characteristic polynomial of T.
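Lemma 4 is easy to see numerically with numpy (a sketch of mine, not part of the original notes): np.poly returns the coefficients of the monic characteristic polynomial of a matrix, and its z^{n-1} coefficient is the negative of the eigenvalue sum.

```python
import numpy as np

T = np.random.default_rng(1).standard_normal((5, 5))
eigs = np.linalg.eigvals(T)   # eigenvalues, repeated according to multiplicity
coeffs = np.poly(T)           # coefficients of the monic characteristic polynomial

print(eigs.sum())             # trace T per Definition 1 (imaginary parts cancel)
print(-coeffs[1])             # negative of the z^{n-1} coefficient (Lemma 4)
```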
For the trace of a matrix, we define it as follows.
Definition 4. The trace of a square matrix A, denoted trace A, is defined to be the sum of the diagonal entries of A.
Lemma 5. If A and B are square matrices of the same size, then trace(AB) = trace(BA).
Proof: Let A and B be two n-by-n matrices. The jth diagonal entry of AB is ∑_{k=1}^{n} A_{j,k} B_{k,j}, so the trace of the product is
trace(AB) = ∑_{j=1}^{n} ∑_{k=1}^{n} A_{j,k} B_{k,j} = ∑_{k=1}^{n} ∑_{j=1}^{n} B_{k,j} A_{j,k} = trace(BA).□
To prove that the trace of an operator equals the trace of its matrix, we first need to show that the trace of a matrix does not depend on the choice of basis, i.e.,
Lemma 6. Let T∈L(V). Suppose u1,…,un and v1,…,vn are bases of V. Then
trace M(T,(u1,…,un)) = trace M(T,(v1,…,vn)).
Proof: From Lemma 3, M(T,(u1,…,un)) = A⁻¹ M(T,(v1,…,vn)) A with A = M(I,(u1,…,un),(v1,…,vn)). Applying trace(AB) = trace(BA) from Lemma 5,
trace(A⁻¹ M(T,(v1,…,vn)) A) = trace(M(T,(v1,…,vn)) A A⁻¹) = trace M(T,(v1,…,vn)),
which gives trace M(T,(u1,…,un)) = trace M(T,(v1,…,vn)) as desired.□
Lemmas 5 and 6 are easy to check numerically, as in the sketch below. With them in hand, we are ready to prove
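Here is a minimal numpy sketch (mine, not from the notes) that conjugates a matrix by a random invertible change-of-basis matrix and compares traces:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
M_v = rng.standard_normal((n, n))  # matrix of T in the basis v1,...,vn
A = rng.standard_normal((n, n))    # a change-of-basis matrix; generically invertible

# Lemma 5: trace(AB) = trace(BA), even though AB != BA in general
B = rng.standard_normal((n, n))
print(np.trace(A @ B), np.trace(B @ A))

# Lemma 6: the trace survives the change of basis of Lemma 3
M_u = np.linalg.inv(A) @ M_v @ A   # matrix of the same operator in the basis u1,...,un
print(np.trace(M_u), np.trace(M_v))
```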
Theorem 1. Suppose T∈L(V). Then trace T = trace M(T).
Proof: Because the trace of a matrix is invariant under the choice of basis (Lemma 6), we only need to show that trace T equals trace M(T) for some basis. Recall that an operator on a complex vector space has a block diagonal matrix with respect to a basis of generalized eigenvectors, and this matrix carries the eigenvalues of T on its diagonal, each repeated according to its multiplicity; hence trace T = trace M(T). If V is a real vector space, then applying the complex case to the complexification TC gives the desired result.□
Based on Theorem 1 we can show that trace is additive:
Lemma 7. Suppose S,T∈L(V). Then trace(S+T) = trace S + trace T.
Proof:
trace(S+T)
=traceM(S+T)
=trace(M(S)+M(T))
=traceM(S)+traceM(T)
=traceS+traceT
where the first and last equalities come from Theorem 1, and the third comes from Definition 4. □ The following result has found extensive use in quantum theory, especially in its infinite-dimensional generalization.
Lemma 8. There do not exist operators S,T∈L(V) such that ST-TS=I.
Proof: Suppose such S and T existed. Taking traces and using Lemmas 5 and 7, trace(ST-TS) = trace(ST) - trace(TS) = 0, while trace I = dim V ≥ 1, a contradiction.□
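The trace argument behind Lemma 8 is easy to watch in action (a numpy sketch of mine): the commutator of any two matrices has trace zero, while the identity on R⁴ has trace 4.

```python
import numpy as np

rng = np.random.default_rng(3)
S = rng.standard_normal((4, 4))
T = rng.standard_normal((4, 4))

comm = S @ T - T @ S
print(np.trace(comm))        # ~0 up to rounding, for any S and T
print(np.trace(np.eye(4)))   # 4.0, so ST - TS = I is impossible
```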
2. Determinant

2.1. Determinant of an operator

The definition of the determinant below mimics the definition of the trace; the only difference is that the sum of eigenvalues is replaced with the product.
Definition 5. Suppose T∈L(V).
- If F=C, then the determinant of T is the product of the eigenvalues of T, with each eigenvalue repeated according to its multiplicity.
- If F=R, then the determinant of T is the product of the eigenvalues of TC, with each eigenvalue repeated according to its multiplicity.
The determinant of T is denoted by det T.
If 𝜆1,…,𝜆m are the distinct eigenvalues of T (or of TC if V is a real vector space) with multiplicities d1,…,dm, then the definition above implies
det T = 𝜆1^{d1}⋯𝜆m^{dm}.
From the expansion (7) of the characteristic polynomial, we have
Lemma 9. Suppose T∈L(V). Then the characteristic polynomial of T can be written as
z^n - (trace T) z^{n-1} + ⋯ + (-1)^n (det T).
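The coefficients in Lemma 9 can be checked with numpy (my sketch, not from the notes): the second coefficient returned by np.poly is -trace T and the constant term is (-1)^n det T.

```python
import numpy as np

T = np.random.default_rng(4).standard_normal((4, 4))
n = T.shape[0]
coeffs = np.poly(T)   # [1, -trace T, ..., (-1)^n det T]

print(coeffs[1], -np.trace(T))
print(coeffs[-1], (-1) ** n * np.linalg.det(T))
```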
We now show the following lemma can be proved easily using Definition 5.
Lemma 10. An operator on V is invertible if and only if its determinant is nonzero.
Proof: Suppose V is a complex vector space. An operator T∈L(V) is invertible if and only if 0 is not an eigenvalue of T, which by Definition 5 happens if and only if det T ≠ 0. If V is a real vector space, T is again invertible if and only if 0 is not an eigenvalue of T, which happens if and only if 0 is not an eigenvalue of TC, i.e., if and only if det T ≠ 0.□
The definition of the determinant also gives a new way to express the characteristic polynomial of an operator, as shown below.
Lemma 11. Suppose T∈L(V). Then the characteristic polynomial of T equals det(zI-T).
Proof: First we notice that
-(T-𝜆I) = (zI-T) - (z-𝜆)I.
So 𝜆 is an eigenvalue of T exactly when z-𝜆 is an eigenvalue of zI-T. The equality also implies that
dim null (T-𝜆I)^{dim V} = dim null ((zI-T)-(z-𝜆)I)^{dim V},
i.e., the multiplicity of 𝜆 as an eigenvalue of T equals the multiplicity of z-𝜆 as an eigenvalue of zI-T. Let 𝜆1,...,𝜆m be the eigenvalues of T with corresponding multiplicities d1,...,dm. Then det(zI-T) = (z-𝜆1)^{d1}⋯(z-𝜆m)^{dm}, where the right-hand side is exactly the characteristic polynomial of T. If V is a real vector space, applying the complex case to TC gives the same result.□
Finally, as a special kind of operator, we have the following result regarding the determinant of isometries.
Lemma 12. Suppose V is an inner product space and S∈L(V) is an isometry. Then |det S| = 1.
This result is immediate on complex vector spaces, because every eigenvalue of an isometry has absolute value one. On real vector spaces, the definition of the determinant gives det S = det SC, and thus |det S| = 1, completing the proof.

2.2. Determinant of a matrix

To define the determinant of an arbitrary square matrix, we first introduce the concept of a permutation and its sign.
Definition 6.
- A permutation of (1,…,n) is a list (m1,…,mn) that contains each of the numbers 1,…,n exactly once.
- The set of all permutations of (1,…,n) is denoted perm n.
Definition 7.
- The sign of a permutation (m1,…,mn) is defined to be 1 if the number of pairs of integers (j,k) with 1≤j<k≤n such that j appears after k in the list (m1,…,mn) is even, and -1 if the number of such pairs is odd.
- In other words, the sign of a permutation equals 1 if the natural order has been changed an even number of times and equals -1 if the natural order has been changed an odd number of times.
Lemma 13. Interchanging two entries in a permutation multiplies the sign of the permutation by -1.
The determinant of a matrix is now defined as follows.
Definition 8. Suppose A is an n-by-n matrix

A = ( A_{1,1} … A_{1,n}
         ⋮           ⋮
      A_{n,1} … A_{n,n} ).

The determinant of A, denoted det A, is defined by
det A = ∑_{(m1,…,mn)∈perm n} (sign(m1,…,mn)) A_{m1,1}⋯A_{mn,n}.   (8)
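Definition 8 can be implemented directly, which makes a good (if wildly inefficient) sanity check against a library determinant. The sketch below is mine; the helper names are hypothetical.

```python
import itertools
import math
import numpy as np

def sign(perm):
    """Sign of a permutation, via the inversion count of Definition 7."""
    inversions = sum(
        1
        for j in range(len(perm))
        for k in range(j + 1, len(perm))
        if perm[j] > perm[k]
    )
    return -1 if inversions % 2 else 1

def det_by_permutations(A):
    """Determinant straight from Definition 8: a signed sum over all n! permutations."""
    n = A.shape[0]
    return sum(
        sign(m) * math.prod(A[m[j], j] for j in range(n))
        for m in itertools.permutations(range(n))
    )

A = np.random.default_rng(5).standard_normal((5, 5))
print(det_by_permutations(A), np.linalg.det(A))  # agree up to rounding
```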
Notice that each product A_{m1,1}⋯A_{mn,n} in the definition has its second subscripts ranging from 1 to n, i.e., each term is a product of n entries, one from each column. Also, perm n contains n! permutations, which is why the brute-force sum above is impractical beyond small n. Based on Definition 7 we can prove that
Lemma 14. Suppose A is a square matrix and B is the matrix obtained from A by interchanging two columns. Then det A = -det B.
Proof: In eqn. (8), the set of products A_{m1,1}⋯A_{mn,n} remains the same after interchanging two columns, but two numbers interchange their places in each list (m1,...,mn). Thus each term in the sum (8) is multiplied by -1 by Lemma 13, giving det A = -det B.□
If the two interchanged columns are identical, then we get the following.
Lemma 15. If A is a square matrix that has two equal columns, then det A = 0.
Proof: From the proof of Lemma 14, we have det A = -det A, which forces det A = 0.□
The following lemma shows what happens if we permute multiple columns of a matrix. Here A_{⋅,i} denotes the ith column of A.
Lemma 16. Suppose A = ( A_{⋅,1} … A_{⋅,n} ) is an n-by-n matrix and (m1,…,mn) is a permutation. Then
det ( A_{⋅,m1} … A_{⋅,mn} ) = (sign(m1,…,mn)) det A.   (9)
Proof: Converting A to ( A_{⋅,m1} … A_{⋅,mn} ) can be accomplished by a series of two-column interchanges, each of which multiplies det A by -1. Because sign(m1,…,mn) is 1 for an even number of interchanges and -1 for an odd number, eqn. (9) must hold.□
Unlike the trace, the determinant of a matrix is multiplicative.
Lemma 17. Suppose A and B are square matrices of the same size. Then
det(AB) = det(BA) = (det A)(det B).
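A one-line numerical check of Lemma 17 (my sketch):

```python
import numpy as np

rng = np.random.default_rng(6)
A, B = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
print(np.linalg.det(A @ B), np.linalg.det(A) * np.linalg.det(B))  # equal up to rounding
```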
But just like the trace, the determinant of the matrix of an operator does not depend on the choice of basis either, i.e.,
Lemma 18. Let T∈L(V). Suppose u1,…,un and v1,…,vn are bases of V. Then
det M(T,(u1,…,un)) = det M(T,(v1,…,vn)),
which can be proved through the same steps used for proving Lemma 6. Now that we have Lemma 18, we repeat the steps in the proof of Theorem 1 to prove
Theorem 2. det T = det M(T) for T∈L(V).
2.3. Volume

The determinant has one important application in undergrad-level mathematics: computing volumes and integrals. In this section we first lay out the critical tools and then investigate volumes and integration through linear algebra. Recall that if V is an inner product space and T∈L(V), then T*T is a positive operator and hence has a unique positive square root √(T*T). By the polar decomposition, there is an isometry S∈L(V) such that T = S√(T*T), thus
|det T| = |det S| det √(T*T),   (10)
where det √(T*T) needs no absolute value because the eigenvalues of the positive square root are nonnegative. Eqn. (10) and Lemma 12 lead to the following
Lemma 19. Suppose V is an inner product space and T∈L(V). Then |det T| = det √(T*T).
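On Rⁿ with the standard inner product, T* is the transpose, and det √(T*T) = √(det(T*T)) because the determinant of the positive square root is the square root of the determinant. That gives a quick numpy check of Lemma 19 (my sketch, not from the notes):

```python
import numpy as np

T = np.random.default_rng(7).standard_normal((4, 4))
print(abs(np.linalg.det(T)))             # |det T|
print(np.sqrt(np.linalg.det(T.T @ T)))   # det of the positive square root of T*T
```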
We now study the relation between the determinant and volume by calculating volumes of arbitrary subsets of real vector spaces. An arbitrary subset can take any geometric shape, which makes the calculation hard. We relieve this issue by covering the set with small "boxes" whose volume is well defined and easy to calculate. A box is defined as the following
Definition 9. A box in Rn is a set of the form
{(y1,…,yn)∈Rn : xj < yj < xj+rj for j=1,…,n},
where r1,…,rn are positive numbers and (x1,…,xn)∈Rn. The numbers r1,…,rn are called the side lengths of the box.
and its volume is defined as
Definition 10. The volume of a box B in Rn with side lengths r1,…,rn is defined to be r1⋯rn and is denoted by volume B.
For an arbitrary set 𝛺, we can cover it by a union of many small boxes, as indicated in the definition below:
Definition 11. Suppose Ω⊂Rn. Then the volume of Ω, denoted volume Ω, is defined to be the infimum of volume B1 + volume B2 + ⋯, where the infimum is taken over all sequences B1,B2,… of boxes in Rn whose union contains Ω.
Since we seek a relation between volume 𝛺 and an operator T acting on it, we write T(𝛺) for the set {Tx : x∈𝛺}. Let T first be a positive operator; then volume 𝛺 and volume T(𝛺) are related through the result below:
Theorem 3. Suppose T∈L(Rn) is a positive operator and Ω⊂Rn. Then volume T(Ω) = (det T)(volume Ω).
Proof: By the Real Spectral Theorem, there exist an orthonormal basis e1,…,en of Rn and nonnegative numbers 𝜆1,…,𝜆n such that Tej = 𝜆jej for j=1,…,n. That is, T stretches the jth vector of an orthonormal basis by a factor of 𝜆j. Because volume behaves the same with respect to every orthonormal basis, volume T(Ω) equals volume Ω multiplied by the factor 𝜆1⋯𝜆n = det T. □ For T an isometry, we have instead
Lemma 20. Suppose T∈L(Rn) is an isometry and Ω⊂Rn. Then volume T(Ω) = volume Ω.
Now we can prove that an arbitrary operator T∈L(Rn) changes volume by a factor of |det T|:
Theorem 4. Suppose T∈L(Rn) and Ω⊂Rn. Then volume T(Ω) = |det T|(volume Ω).
Proof: By the polar decomposition, there is an isometry S∈L(Rn) such that T = S√(T*T). If Ω⊂Rn, then T(Ω) = S(√(T*T)(Ω)). Thus
volume T(Ω) = volume S(√(T*T)(Ω))
            = volume √(T*T)(Ω)
            = (det √(T*T))(volume Ω)
            = |det T|(volume Ω),
where the second equality uses Lemma 20, the third uses Theorem 3, and the last uses Lemma 19.□
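Theorem 4 can be watched numerically with a Monte Carlo estimate (a sketch of mine, assuming T is invertible): the image of the unit square under T has volume |det T|, and a point z lies in T(Ω) exactly when T⁻¹z lies in Ω.

```python
import numpy as np

rng = np.random.default_rng(8)
T = rng.standard_normal((2, 2))          # a generic, hence invertible, operator on R^2
T_inv = np.linalg.inv(T)

# Bound T(unit square) by the box spanned by the images of the four corners
corners = T @ np.array([[0.0, 1.0, 0.0, 1.0], [0.0, 0.0, 1.0, 1.0]])
lo, hi = corners.min(axis=1), corners.max(axis=1)

N = 200_000
z = rng.uniform(lo, hi, size=(N, 2))     # uniform samples in the bounding box
x = T_inv @ z.T                          # preimages of the samples
inside = np.all((x >= 0.0) & (x <= 1.0), axis=0)

estimate = inside.mean() * np.prod(hi - lo)  # Monte Carlo volume of T(unit square)
print(estimate, abs(np.linalg.det(T)))       # close for large N
```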
We finish this section by discussing how the determinant appears in multivariable integration. The notions of differentiability and the derivative are given as follows:
Definition 12. Suppose Ω is an open subset of Rn and 𝜎 is a function from Ω to Rn. For x∈Ω, the function 𝜎 is called differentiable at x if there exists an operator T∈L(Rn) such that
lim_{y→0} ‖𝜎(x+y) - 𝜎(x) - Ty‖ / ‖y‖ = 0.   (11)
If 𝜎 is differentiable at x, then the unique operator T∈L(Rn) satisfying the equation above is called the derivative of 𝜎 at x and is denoted by 𝜎′(x).
If the notation 𝜎′(x) of Definition 12 is used, then eqn. (11) implies that the derivative at x∈Ω satisfies
𝜎(x+y) ≈ 𝜎(x) + (𝜎′(x))(y),   (12)
where y∈Rn; some textbooks write 𝛥x in place of y. Definition 12 also determines the form of M(𝜎′(x)), as we shall see now. Suppose Ω is an open subset of Rn and 𝜎 is a function from Ω to Rn. We can write
𝜎(x) = (𝜎1(x),…,𝜎n(x)),
where each 𝜎j is a function from 𝛺 to R. Let ei = (0,...,1,...,0) denote the ith standard basis vector, so that y = y1e1 + ⋯ + ynen. Because 𝜎′(x)∈L(Rn), eqn. (12) gives
(𝜎′(x))(yiei) = yi(𝜎′(x))(ei) ≈ (𝜎1(x+yiei)-𝜎1(x), …, 𝜎n(x+yiei)-𝜎n(x)),
and hence
(𝜎′(x))(ei) ≈ ( (𝜎1(x+yiei)-𝜎1(x))/yi, …, (𝜎n(x+yiei)-𝜎n(x))/yi ).   (13)
In the limit yi→0, Di𝜎j = lim_{yi→0} (𝜎j(x+yiei)-𝜎j(x))/yi is the partial derivative of 𝜎j with respect to the ith coordinate, and eqn. (13) gives M(𝜎′(x)) as

M(𝜎′(x)) = ( D1𝜎1(x) … Dn𝜎1(x)
                ⋮              ⋮
             D1𝜎n(x) … Dn𝜎n(x) ),

whose ith column holds the coordinates of (𝜎′(x))(ei), just as discussed in our earlier note on matrix representations. Now we can state the change of variables integration formula. Some additional mild hypotheses are needed for f and 𝜎′ (such as continuity or measurability), so we will not worry about a rigorous proof of the following theorem.
Theorem 5. Suppose Ω is an open subset of Rn and 𝜎:Ω→Rn is differentiable at every point of Ω. If f is a real-valued function defined on 𝜎(Ω), then
∫_{𝜎(Ω)} f(y) dy = ∫_Ω f(𝜎(x)) |det 𝜎′(x)| dx.
Theorem 5 is called a change of variables formula because you can think of y = 𝜎(x) as a change of variables. The theorem should not be a surprise either, since we already showed in Theorem 4 that volume T(Ω) = |det T|(volume Ω). The key point when making a change of variables is that the factor |det 𝜎′(x)| must be included when making the substitution y = 𝜎(x). We finish by illustrating this point with two important examples.

2.3.1. Polar coordinates

Define 𝜎:R2→R2 by
𝜎(r,𝜃) = (r cos𝜃, r sin𝜃),
where 𝜎1 = r cos𝜃, 𝜎2 = r sin𝜃, and y = (𝛥r,𝛥𝜃) in the terminology introduced above. Then

M(𝜎′(r,𝜃)) = ( cos𝜃   -r sin𝜃
               sin𝜃    r cos𝜃 ).

The determinant of this matrix equals r, explaining why a factor of r is needed when computing an integral in polar coordinates. For example, the extra factor of r appears when converting the integral of f over the unit disk in R2 to polar coordinates:
∫_{-1}^{1} ∫_{-√(1-x²)}^{√(1-x²)} f(x,y) dy dx = ∫_{0}^{2𝜋} ∫_{0}^{1} f(r cos𝜃, r sin𝜃) r dr d𝜃.

2.3.2. Spherical coordinates

Define 𝜎:R3→R3 by
𝜎(𝜌,𝜑,𝜃) = (𝜌 sin𝜑 cos𝜃, 𝜌 sin𝜑 sin𝜃, 𝜌 cos𝜑).
For this choice of 𝜎, the matrix of partial derivatives is

M(𝜎′(𝜌,𝜑,𝜃)) = ( sin𝜑 cos𝜃   𝜌 cos𝜑 cos𝜃   -𝜌 sin𝜑 sin𝜃
                 sin𝜑 sin𝜃   𝜌 cos𝜑 sin𝜃    𝜌 sin𝜑 cos𝜃
                 cos𝜑        -𝜌 sin𝜑         0            ).

The determinant of this matrix equals 𝜌² sin𝜑, explaining why a factor of 𝜌² sin𝜑 is needed when computing an integral in spherical coordinates.
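Both Jacobian determinants can be verified symbolically; here is a minimal sympy sketch (my addition, not part of the original notes):

```python
import sympy as sp

r, rho, phi, theta = sp.symbols('r rho phi theta', positive=True)

# Polar: sigma(r, theta) = (r cos(theta), r sin(theta))
polar = sp.Matrix([r * sp.cos(theta), r * sp.sin(theta)])
print(sp.simplify(polar.jacobian([r, theta]).det()))             # r

# Spherical: sigma(rho, phi, theta)
spherical = sp.Matrix([
    rho * sp.sin(phi) * sp.cos(theta),
    rho * sp.sin(phi) * sp.sin(theta),
    rho * sp.cos(phi),
])
print(sp.simplify(spherical.jacobian([rho, phi, theta]).det()))  # rho**2*sin(phi)
```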
3. Where to go from here?

This ends my notes on preliminary linear algebra. Anything beyond what has been shown in this series can be termed an "advanced" aspect of linear algebra. For readers more interested in applied mathematics, Hubbard's "Vector Calculus, Linear Algebra, and Differential Forms: A Unified Approach" could be your next step. People love the book because it discusses interesting physics problems (e.g., Maxwell's equations) by setting up tools from linear algebra. For self-learners, the book is also a good choice, as it is self-contained and accompanied by a solution manual. If one demands a broad scope of applications of linear algebra, then Brunton's "Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control" will likely satisfy one's curiosity. The author is not a pure-math person, so any recommendation from me for a graduate-level linear algebra book is not well justified, but the Springer graduate textbook seems to be a decent one for smart minds seeking to understand the elements introduced here in greater detail.

As preluded in my first note on linear algebra, this series serves as a solid prequel to my journey into quantum computing and quantum information theory. My current life has nothing to do with these two subjects, but the impetus of nudging myself toward the frontier of humanity has sustained over the years. Needless to say, I believe (maybe religiously) that many problems, whether scientifically challenging or socially controversial, can be solved by running quantum algorithms on quantum computers. When such achievements happen, I want to be there among the people who dream big and move quickly. Sooner or later, I will write new notes with interesting stories to tell about quantum algorithms and quantum information theory. So please be patient, my dear readers, and thank you for following along with me so far.