This document was ed by and they confirmed that they have the permission to share it. If you are author or own the copyright of this book, please report to us by using this report form. Report 3i3n4
i and consider fJ=Li bi'YJ(i). Then (fJ, IX)= (Li bi'YJ(i), oc) =(IX, Li bi'YJ(i))=Li bi(ot, 'Y)(i))=Li bi('YJ(i), at)= Li bii(ot)=(IX). Thus 'Y)()= fJ=Li bi'YJ(i) and 'Y) is linear. D PROOF.
i we consider y=Li bi1J(i). (y, IX)= (Li hi'YJ(i), IX)=Li bi('YJ(i), oc)=Li bii(ot) =(a).
=
=
=
Thus,
L. b;;
i=l i� bi c�/ij,p;) ;�J ;� b;P;;) ,P;.
(3.1)
B' = BP.
(3.2)
We are looking at linear functionals from two different points of view. Considered as a linear transformation, the effect of a change of coordinates
(4.5)
is given by formula
(3.2)
of Chapter II, which is identical with
above.
Considered as a vector, the effect of a change of coordinates is given by formula
(4.3)
of Chapter II. In this case we would represent
vectors are represented by column matrices.
Then, since
,P by BT, since (P-1)T is the
matrix of transition, we would have
or
B which is equivalent to point of view.
=
(3.2).
(3.3)
B'P-1,
Thus the end result is the same from either
It is this two-sided aspect of linear functionals which has
made them so important and their study so fruitful.
Example
I.
In analytic geometry, a hyperplane ing through the
origin is the set of all points with coordinates equation of the form
[b1b2
·
•
•
bn]
b1x1 + b2x2 +
·
·
·
+
(x1, x2, bnxn 0. •
=
•
•
, xn)
satisfying an
Thus the n-tuple
can be considered as representing the hyperplane.
Of course,
a given hyperplane can be represented by a family of equations, so that there is not a one-to-one correspondence between the hyperplanes through the origin and the n-tuples.
However, we can still profitably consider the
space of hyperplanes as dual to the space of points. Suppose the coordinate system is changed so that points now have the coordinates
(y1 ,
.
•
.
,
Yn)
where
x;
=
hyperplane becomes
=
L7�i a;;Y;·
n = 0. 2,ciyi j�
l
Then the equation of the
(3.4)
Linear Functionals, Bilinear Forms, Quadratic Forms I IV
136
Thus the equation of the hyperplane is transformed by the rule
c;
=
.27� 1 b;a;;·
Notice that while we have expressed the old coordinates in of the new coordinates we have expressed the new coefficients in of the old coefficients. This is typical of related transformations in dual spaces.
Example
2.
A much more illuminating example occurs in the calculus of
functions of several variables.
x1, x2,
•
•
•
, xn, w
f(x1, x2,
=
Suppose that w is a function of the variables •
•
, xn).
•
Then it is customary to write down
formulas of the following form:
dw
=
aw ax2
aw axl
- dx1 + - dx2 +
and
\i'w
=
aw ··· +dxn,
(3.5)
axn
(aw , aw '
)
aw axl ax2 ... axn . ,
(3.6)
dll' is usually called the differential of w, and \i'w is usually called the gradient w. It is also customary to call \i'w a vector and to regard dw as a scalar, approximately a small increment in the value of w. The difficulty in regarding \i'w as a vector is that its coordinates do not
of
follow the rules for a change of coordinates of a vector. us consider
(x1' x2,
•
•
•
,
xn
space. This implies the existence of a basis combination � is the vector with coordinates
{ �1,
.
.
•
, �n}
such that the linear
n .2 X;�;
(3.7)
i=l
•
•
.
, xn).
Let
{{J1,
.
•
•
, fJn}
be a new
[p;;] where
=
{J; =
=
(x1, x2,
basis with matrix of transition P
Then, if�
For example, let
) as the coordinates of a vector in a linear vector
= i�Il P;;�i· n
(3.8)
.2;�1 Y;fJ1 is the representation of� in the new coordinate system,
we would have
X; or
X;
=
=
n .2 P;1Y i• ;�1
n ax. .2 -· ;�1 ayj
Y;·
(3.9) (3.10)
Let us contrast this with the formulas for changing the coordinates of \i'w. From the calculus of functions of several variables we know that
(3.11)
3
I Change of Basis
137
This formula corresponds to (3.2). Thus V'w changes coordinates as if it were in the dual space. In vector analysis it is customary to call a vector whose coordinates change according to formula (3.10) a contravariant vector, and a vector whose
[
[ ·]
coordinates change according to formula (3.11) a covariant vector.
The
(PT)-1•
Thus
reader should that if
P
=
iJx. oy
:
=
J
Pi1 , then
Thus (3.11) is equivalent to the formula
ow oxi
=
oy
ox
:
=
oy 1 ow . 1=1 ox i oy1
i
(3.12)
•
From the point of view of linear vector spaces it is a mistake to regard both types of vectors as being in the same vector space. As a matter of fact, their sum is not defined. It is clearer and more fruitful to consider the co variant and contravariant vectors to be taken from a pair of dual spaces. This point of view is now taken in modern treatments of advanced calculus and vector analysis. Further details in developing this point of view are given in Chapter VI, Section 4. In traditional discussions of these topics, all quantities that are represented by n-tuples are called vectors. In fact, the n-tuples themselves are called vectors. Also, it is customary to restrict the discussion to coordinate changes in which both covariant and contravariant vectors transform according to the same formulas. This
amounts to having P, the matrix of transition, satisfy the condition
P.
(P-1) T
=
While this does simplify the discussion it makes it almost impossible to
understand the foundations of the subject. Let A basis in
=
V.
{ 0(1,
•
Let B
the dual basis
.
•
=
p in
, O(n} be a basis of V and let A {cfo 1, ••• , n} be the dual {{31, ••• , f3n} be any new basis of V. We are asked to find V. This problem is ordinarily posed by giving the repre =
sentation of the {31 with respect to the basis A and expecting the representations
of the elements of the dual basis with respect of with respect to A in the form
{3j
Let the {31 be represented
n
=
and let
1Pi
A.
=
L PiiO(i, i
(3.13)
n ; !q;; =
(3 . 14)
=l
i l
A
be the representations of the elements of the dual bases B
=
{1fJ1, ••. , 'I/in}.
138
Linear Functionals, Bilinear Forms, Quadratic Forms
Then
bki
=
'1Pkf31
IV
( i=li qkii)( 3i.Pn1X; =1 )
=
n
=
I
n
i=l j=lqkiPn;1X; i_L=l qkiPi!· _L _L n
=
(3.15)
In matrix form, (3.15) is equivalent to I= QP.
(3.16)
Q is the inverse of P. Because of (3.15), the 1P; are represented by the rows of Q. Thus, to find the dual basis, we write the representation of the basis B
in the columns of P, find the inverse matrix ,.,
sentations of the basis B in the rows of
p-1,
p-1.
and read out the repre-
EXERCISES
1. Let A= {(1, 0, ... , 0), (0, 1,..., 0), ..., (0, 0,..., 1)} be a basis of Rn. The basis of Rn dual to A has the same coordinates. It is of interest to see if there
are other bases of Rn for which the dual basis has excatly the same coordinates. Let A' be another basis of Rn with matrix of transition P. What condition should
P satisfy in order that the elements of the basis dual to as the corresponding elements of the basis
A' have the same coordinates
A' ?
2. Let A= { oc1, oc2, oc3} be a basis of a 3-dimensional vector space V, and let A= 4> { 1, 4>2, 4>3}be thebasis of V dual to A. Then let A' = {(1, 1, 1), (1, 0, 1),(0, 1, -1)}
be another basis of V (where the coordinates are given in of the basis Use the matrix of transition to find the basis
A'
3. Use the matrix of transition to find the basis dual to
(1, 1, 1)}.
4. Use the matrix of transition to find the basis dual to
(0, 1, 1)} .
5. Let B represent a linear functional bases, so that BX is the value
�
,
A).
dual to A'.
{(1, 0, -1), ( -1, 1, 0),
and X a vector
of the linear functional.
{(1, 0,O), (1, 1,O),
�
with respect to dual
Let P be the matrix of
transition to a new basis so that if X' is the new representation of�. then X = PX'. By substituting PX' for X in the expression for the value of 4>� obtain another proof that BP is the representation of
in the new dual coordinate system.
4 I Annihilators A
Definition. an IX E
Let V be an n-dimensional vector space and V its dual. If, for
V and a
E
V,
we have
1X
=
0, we say that
and
IX are
orthogonal.
4
I
Annihilators
139
and ex are from different vector spaces, it should and ex are at "right angles.''
Since
be clear that we do
not intend to say that the
Definition.
Let W be a subset (not necessarily a subspace) of V.
all linear functionals such that
rpcx
The set of
0 for all ex E W is called the annihilator of W, and we denote it by w_j_. Any E w_j_ is called an annihilator of w. =
Theorem 4.1. The annihilator W_j_ of W is a subspace of V. If W is a subspace of dimension p, then w_j_ is of dimension n - p. a
w. Hence, w_j_ is a subspace of v. Suppose W is a subspace of V of dimension
all
IX
=
E
{cx1, , cxn} {cx1, ... , exp} is a basis of W. Let A {1, ,
be a basis of V such that
=
=
•
•
•
.
.
•
=
=
=
=
=
and W_j_ is of dimension n dimension of W. D
=
The dimension of W_j_ is called the co
p.
It should also be clear from this argument that W is exactly the set of all IX
E v annihilated by all
E w_j_.
Thus we have A
Theorem 4.2. If S is any subset of V, the set of all ex E V annihilated by all E 5 is a subspace of V , denoted by S_!_. If 5 is a subspace of dimension r, then S_j_ is a subspace of dimension n - r. D a
Theorem 4.2 is really Theorem 1.16 of Chapter II in a different form. If linear transformation of V into another vector space W is represented by
a matrix A, then each row of A can be considered as representing a linear functional on V.
The number
r
dimension of the subspace s of
of linearly independent rows of A is the
v
spanned by these linear functionals. s_j_
is the kernel of the linear transformation and its dimension is n - r. The symmetry in this discussion should be apparent.
ex
=
0 for all ex
E
w. On the other hand, for
IX
E
W,
ex
=
E w_j_' then 0 for all E w_j_.
If
If W is a subspace, (W_j_ )_]_ W. W_j__j_ is the set of ex E V such that cx 0 for all E W_j_. Clearly, W c W_j__j_. Since dim W_j__j_ n dim W_j_ dim W, W_j__j_ W. D Theorem 4.3.
=
By definition, (W_j_)_j_
PROOF.
=
=
=
=
=
This also leads to .a reinterpretation of the discussion in Section IJ-8. A subspace W of V of dimension p can be characterized by giving its annihilator w_j_
c
v of dimension r
=
n
-
p.
Linear Functionals, Bilinear Forms, Quadratic Forms I
140
IV
Theorem 4.4. If W1 and W2 are two subspaces of V, and Wt and Wf are their respective annihilators in V, the annihilator of W1 + W2 is Wt II W2J_ and the annihilator of W1 II W2 is Wt+ Wf. PROOF. If rp is an annihilator of W1 + W1, then rp annihilates all <x. E W1 and all {3 E W2 so that rp E Wt II Wf. If rp E Wt II Wf, then for all <x. E W1 and f3 E W2 we have rp<x. 0 and r/>/3 0. Hence, rp(a<x. + b/3) arp<x. + b/3 0 so that annihilates W1 + W2• This shows that (W1 + A
=
=
W2)J_
Wt
=
11
=
=
Wf.
The symmetry between the annihilator and the annihilated means that the second part of the theorem follows immediately from the first. Namely,
(W1 + W2)J_ Wt II Wf, we have by substituting Wt W1 and W2, (Wt + Wf )J_ (Wt )J_ II (Wf)J_ W1 II W2• Wt+ Wt. D (W1 II W2)J_
since
=
for
=
and
Wf
Hence,
=
=
Now the mechanics for finding the sum of two subspaces is somewhat simpler than that for finding the intersection.
To find the sum we merely
combine the two bases for the two subspaces and then discard dependent vectors until an independent spanning set for the sum remains.
It happens
W2 it is easier to find Wt and W2L W1 II W2 as (Wt + Wt)J_, than it is to
that to find the intersection W1 II
Wt + Wt
and then
and obtain
find the intersection directly. The example in Chapter II-8, page 71, is exactly this process carried out in detail.
In the notation of this discussion E1
=
A
Wt and E2
=
Wt.
Let V be a vector space, V the corresponding dual vector space, and let be a subspace of V. A
Since
W
c
V, is there any simple relation between
W W A
and V? There is a relation but it is fairly sophisticated. Any function defined A
on all of V is certainly defined on any subset. therefore, defines a function on
W.
to
W, A
This does not mean that V A
c
W. A
a mapping of V into
{>
by
linear.
<x. n
on
dim
W is
Since
E V,
A
W;
it means that the restriction defines
W by {>,
and denote the mapping of
R. We call R the restriction mapping. It is easily seen that A
The kernel of
W.
E -
which we have called the restriction of
Let us denote the restriction of to onto
A linear functional
R is the set of all
E V such that
rp(<x.) n
=
R is
0 for all
Thus K(R) WJ_. Since dim W dim W dim WJ_ K(R), the restriction map is an epimorphism. Every linear functional =
=
=
-
=
the restriction of a linear functional on V.
K(R)
=
WJ_, we have also shown that W and V/WJ_
are isomorphic.
But two vector spaces of the same dimension are isomorphic in many ways. We have done more than show that
W and V/WJ_ are isomorphic.
We have
shown that there is a canonical isomorphism that can be specified in a natural way independent of any coordinate system.
If
{> is a residue class in V/ W_!_,
4 I Annihilators
141
and is any element of this residue class, then this natural isomorphism.
V
V/WJ_,
onto
then R
=
and
T'YJ, and
-r
f, and
R() correspond under
If 'YJ denotes the natural homomorphism of
denotes the mapping of
f,
onto R() defined above,
is uniquely determined by Rand 'YJ and this relation.
-r
Theorem 4.5. Let W be a subspace of V and let W1- be the annihilator of W in V. Then W is isomorphic to Vf Wl.. Furthermore, if R is the restriction map ofV onto W, if 'YJ is the natural homomorphism of Vonto V/W1-, and -r is the unique isomorphism of V/Wl. onto W characterized by the condition R T'YJ,
then
=
-r
V/W1-.
(/,)
=
R()
where is any linear functional in the residue class f,
E
D
EXERCISES
((1, 0, -1), (1, -1, O), (0, 1, -1)). 1. (a) Find a basis for the annihilator of W (b) Find a basis for the annihilator of W ((1, 1, 1, 1, 1), (1, 0, 1, 0, 1), (0, 1, 1, 1, 0), (2, 0, 0, 1, 1), (2, 1, 1, 2, 1), (1, -1, -1, -2, 2), (1, 2, 3, 4, -1)). What are the dimensions of W and W1- ? =
=
2. Find a non-zero linear functional which takes on the same non-zero value (2, 1, 1), and ;3 (1, 0, 1). (1, 2, 3), ;2
for ;1
oc
=
=
=
3. Use an argument based on the dimension of the annihilator to show that if ¥ 0, there is a E
4. Show that if 5 5. Show that (5)
V such that
c =
T, then 51-
¥ 0.
::::i
T1-.
51-1-.
6. Show that if 5 and T are subsets of Veach containing 0, then
(5 + T)1-
c
51-
51- + T1-
c
(5
T1-,
n
and n
T)1-.
7. Show that if 5 and T are subspaces of V, then
(5 + T)1and
5 1- + T 1-
=
=
51-
(5
n
n
T 1- ,
T)1-.
8. Show that if 5 and T are subspaces of V such that the sum 5 + T is direct ,
then 51- + P
=
V.
9. Show that if 5 and T are subspaces of V such that 5 + T p
=
{O}.
=
V, then 51-
V, then 10. Show that if 5 and T are subspaces of V such that 5 EE> T 51- EE> T 1-. Show that 51- is isomorphic to T and that T1- is isomorphic to S. =
V
n
=
11. Let V be vector space over the real numbers , and let be a non-zero linear functional on V. We refer to the subspace 5 of Vannihilated by as a hyperplane of V. Let 5+
=
{oc I
>
O}, and 5-
=
{ oc I
<
O}.
We call 5+ and 5- the two
142
Linear Functionals, Bilinear Forms, Quadratic Forms
sides of the hyperplane S.
-
I
IV
If oc and {J are two vectors, the line segment ing
{ toc
and {J is defined to be the set
+
(1
t)/3
I
0 :.::;; t :.::;;
1},
oc
which we denote by
ocf3. Show that if oc and f3 are both in the same side of S, then each vector in oc/3 is also in the same side. And show that if oc and /3 are in opposite sides of S, then
oc
{J
contains a vector in S.
5 I The Dual of a Linear Transformation Let U and V be vector spaces and let
a
be a linear transformation mapping
U into V. Let V be the dual space of V and let be a linear functional on V. For each IX EU, a(1X) E V so that can be applied to a(1X). Thus [a(1X)] E F and
=
Theorem 5.1. For a a linear transformation of U into V, and E V, the mapping
Theorem 5.2. For a given linear transformation a mapping U into V, the mapping of V into U defined by making E V correspond to a EU is a linear A A transformation of V into U. a
the mapping defined is linear. D The mapping of
Definition.
is denoted by
Thus
8-.
8-()
=
A
E V onto
A
A
is called the
a
of
a
and
with respect to the bases A in U and B A
A
Let A and B be the dual bases in U and V, respectively.
now arises:
dual
Let A be the matrix representing in V.
A
"How is the matrix representing
8-
The question
with respect to the bases
B and A related to the matrix representing a with respect to the bases A and B ?'' , /3n} we have a(1X1) !;=1 a;1{3;. {{31, {1X1, , ixm} and B For A A Let {1, , m} be the basis of U dual to A and let {'1/'1, . . . , 11'n} be the basis A A =
•
•
.
•
=
•
•
•
=
•
•
of V dual to B. Then for tp; E V we have
[8-( '1/';)](1X 1)
=
=
=
( '1/';<1)(1X;)
(
=
)
'1/'i<1(1X;)
ak1f3k ki =l n L aki1Pif3k k =l V'i
(5.1)
5
I
The Dual of a Linear Transformation
143
[6(1J!;)](e<;) a;; is a(tp;) a then a( ) b If a ) LZ'=1 ;kk· 1P Li'=1 ;'!j!;. 1J' L7=1 MIZ'=1 ;kk I;:1 k· Thus the representation of a('IJ!) is BA. To follow absolutely the The linear functional on U which has the effect
=
=
=
=
=
notational conventions for representing a liner transformation as given in
(2.2), a
AT. However, because we 1J! by the row matrix B, and because a(1J!) is repre sented by BA, we also use A to represent a. We say that A represents a Chapter II,
should be represented by
have chosen to represent A
A
A
A
with respect to B in V and A in U.
AT is chosen. The reason A in this: in Chapter V we define a closely related linear transformation a* , the adt of a. The adt is not repre AT, the conjugate complex of the sented by AT; it is represented by A* transpose. If we chose to represent a by AT, we would have a represented by A, a by AT in both the real and complex case, and a* represented by AT in the real case and AT in the complex case. Thus, the fact that the adt is represented by AT in the real case does not, in itself, provide a compelling reason for representing the dual by AT. There seems to be less confusion if both a and a are represented by A, and a* is represented by A* (which reduces to AT in the real case). In a number of other respects our choice In most texts the convention to represent a by
we have chosen to represent a by
=
results in simplified notation.
a(1J!)(fl, by definition of a(1J!). If� is represented 1J!(<1(�)) X, then 1J!(a(�)) B(AX ) (BA)X a(tp)(fl. Thus the representation
If� EU, then by
=
=
=
=
convention we are using allows us to interpret taking the dual of a linear transformation as equivalent to the associative law. The interpretation could be made to look better if we considered operator on V.
1J!(a�)
=
a as a left operator on U and a right a(fl as a� and a( 'IJ!) as 1J!<1. Then
(1J!a)� would correspond to ing to the dual.
Theorem 5.3. PROOF.
In other words, write
K(a)J_
If 1J! E K(a)
1J! E lm(a)_!_.
If
1J!
=
Im( a).
A
V, then for all
c
e< EU, 1J!(<1)e<)) 0. Thus a(1J!)(e<) 0. 1J!( (e<)) EU, a(1J!)(e<) e< <1 =
=
E lm (a) J_ , then for all
Thus 1P E K(a) and K(a)
=
=
=
Im(a)J_. o
Corollary 5.4. A necessary and sufficient condition for the solvability of the linear problem a(fl {3 is that {3 E K(a)J_. D =
The ideas of this section provide a simple way of proving a very useful theorem concerning the solvability of systems of linear equations.
The
theorem we prove, worded in of linear functionals and duals, may not at first appear to have much to do· with with linear equations. worded in of matrices, it is identical to Theorem
7.2
But, when
of Chapter II.
144
Linear Functionals, Bilinear Forms, Quadratic Forms I IV Let
Theorem 5.5.
a
be a linear transformation of U into V and let
vector in V. Either there is a
�
fJ be any
E U such that
(I) a(fl = fJ, A
E V such that
f
or there is a
(2) a(f) =
0 and
f{3 (I)
fJ rt K(a)J_.
1.
=
Condition
PROOF.
means that
fJ E
I m(r) and condition
(2)
means that
Thus the assertion of the Theorem follows directly from Theorem
5.3. D
Theorem 5.5 is also equivalent to Theorem x I matrix. Either there is an n (I) AX= B, or there is a I x m matrix C such that (2) CA 0 and CB = I.
B an m
7.2
of Chapter
2.
Let A be an m x n matrix and
In matrix notation Theorem 5.5 reads:
x I matrix
X such
that
=
Theorem 5.6. a and a have the same rank. By Theorems 5.3 and 4.l, v(a) = n
PROOF.
-
p(a) = v(a).
Let W be a subspace of V invariant under
Theorem 5.7.
a.
o
Then WJ_ is a
A
subspace of V invariant under Let
PROOF.
a(ix)
E W.
f E Wj_. For af E WJ_. D
8'. any
ix
a
E W we have
Thus
Theorem 5.8.
=
.
0, since
The dual of a scalar transformation is also a scalar trans
formation generated by the same scalar. If
PROOF.
faix =
a(ix)
=
aix for all (1. E V, then for each
afix. D
f
E
V, (af)(ix) =fa(ix) =
Theorem 5.9. If A is an eigenvalue for a, then A is also an eigenvalue for a. If A is an eigenvalue for a, then a A is singular. The dual of
PROOF.
a
-
A is
-
a
-
eigenvalue of
A and it must also be singular by Theorem 5.6. Thus A is an
a.
Theorem 5.10.
D
Let V have a basis consisting of eigenvectors of
v has a basis consisting of eigenvectors of a.
a.
Then
V, and assume that ix, is an , fn} be the corresponding { f1, f2, dual basis. For all IX;, af;(ix;) = f;a(tX;) f;A;tX; = A;fitX; A;b;; = A;(\;. Thus af; = },if; and f; is an eigenvector of a with eigenvalue A i. D PROOF.
Let { ix1,
ix2,
•
•
•
, ixn} be a basis of
eigenvector of a with eigenvalue A i. Let
•
=
•
•
=
6 I Duality of Linear Transformations
145
EXERCISES
1. Show that
,,..__
<JT
= M.
[: =:1J
2. Let a be a linear transformation of R2 into R3 represented by
A=
2
2
Find a basis for ( a(R2))J_. Find a linear functional that does not annihilate (I, 2, 1 ) . Show that (1, 2, 1) rf= a(R2).
3. The following system of linear equations has no solution. functional whose existence is asserted in Theorem 5.5. 3x 1 Xl
-x
1
+
x2 =
+ 2x +
2
=
3x2 =
Find the linear
2 1 1.
* 6 I Duality o f Linear Transformations
In Section 5 we have defined the dual of a linear transformation.
What is
the dual of the dual? In considering this question we restrict our attention to finite dimensional vector spaces. In this case, the mapping defined in Section 2, is an isomorphism. Since of
§,
J of V into
V,
the dual of a, is a mapping
V into itself, the isomorphism J allows us to define a corresponding linear
transformation on V. For convenience, we also denote this linear transforma tion by
&.
where the
Thus,
a
&(oc)
=
1-1[8(J(oc))J.
(6.1)
on the left is the mapping of v into itself defined by the ex
pression on the right. Theorem 6.1.
The relation between
the dual of a. PROOF.
a
and
a
is symmetric;
that is,
a
is
By definition,
a(J(oc))( )
=
J(oc)a()
=
a ()(oc)
=
a(oc)
=
J(a(oc))().
Thus a(J(oc)) J(a(oc)). By (6.1) this means a(oc) a(oc). Hence, a is the dual of a. D J-1[J(a(oc))] =
=
J-1[§(J(oc))]
=
=
The reciprocal nature of duality allows us to establish dual forms of theorems without a new proof. that
K(a)_j_
=
For example, the dual form of Theorem 5.3 asserts
Im (a ) . We exploit this principle systematically in this section.
Theorem 6.2.
The dual of a monomorphism is an epimorphism.
dual of an epimorphism is a monomorphism.
The
146
Linear Functionals, Bilinear
Forms, Quadratic Forms I
IV
PROOF. By Theorem 5.3, lma ( ) = K(cJ).l. If a is an epimorphism, Im(a) = V so that K(a) = v.i = {O}. Dually, lmO( ) = K(a).l. If a is a monomorphism, K(a) = {O} and Im(O-) =U. D ALTERNATE PROOF. By Theorem 1.15 and 1.16 of Chapter II, a is an epimorphism if and only if TG = 0 implies T 0. Thus M = 0 implies f 0 if and only ifa is an epimorphism. Thusa is an epimorphism if and only ifa is a monomorphism. Dually, T is a monomorphism if and only if f is an epimorphism. D =
=
Actually, a much more precise form of this theorem can be established. If W is a subspace of V, the mapping t of W into V that maps IX E W onto IX E V is called the injection of W into V. Theorem 6.3.
Let W be a subspace of
V
of W into V. Let R be the restriction map of mappings.
and let t be the injection mapping A
V
A
onto W. Then t and R are dual
PROOF. Let E v. For any IX E W, R ( )1X ( ) =t(1X) R() = t() for each. Hence, R =t. D
Theorem 6.4.
=
t()(1X).
Thus
If 7T is a projection of U onto S along T, the dual 7r is a
projection of D onto T .l along s.i. PROOF.
A projection is characterized by the property TT2 =TT. ......
By
Theorem 5.7, 7r2 =TT2 =7r so that7r is also a projection. By Theorem ( ) = Im(TT).l = S.i and Im(7r) = K(TT).l = T .i. D K.fr
5.3,
A careful comparison of Theorems 6.2 and 6.4 should reveal the perils of being careless about the domain and codomain of a linear transformation. A projection 7T of U onto the proper subspace S is not an epimorphism because the codomain of 7T is U, not S. Since7r is a projection with the same rank as 7T,7r cannot be a monomorphism, which it would be if 7T were an epimorphism. Theorem 6.5. Leta be a linear transformation of U into V and letT be a linear transformation of V into W. Let a and f be the corresponding dual transformations. Iflm(a) = K(T), then Imf ( ) = Ka ( ). PROOF. Since Im(a) c KT ( ), TG(IX) 0 for all IX EU; that is, TG = 0. Since M = ia = 0, lm(f) c Ka ( ). Now dim lm(f) = dim Im(T) sinceT and f have the same rank. Thus dim Im(f) = dim V dim K(T) = dim V dim Jm(a) =dim V dim Im(a) = dim K(a). Thus K(a) Im(f). o =
-
-
-
=
Definition. Experience has shown that the condition lma ( ) = K(T) is very useful because it is preserved under a variety of conditions, such as the taking of duals in Theorem 6.5. Accordingly, this property is given a special name. We say the sequence of mappings
u�v�w
(6.1)
7
Direct Sums
I
147
is exact at V if Im(a)
=
K(r). A sequence of mappings of any length is said
to be exact if it is exact at every place where the above condition can apply. In these , Theorem
sequence
6.5
says that if the sequence A
:
A
cJ
(6.1)
is exact at V, the
A
(6.2)
U+--V+--W
V.
is exact at
We say that
(6.1)
and
(6.2)
are dual sequences of mappings.
Consider the linear transformation a of U into V. Associated with a is the
following sequence of mappings
G
0----+ K(a)----+ U----+ V----+ V/Im(a)----+ 0, I
where
t
(6.3)
T/
is the injection mapping of K(a) into U, and
ri is the natural homo
morphism of V onto V/Im(a). The two mappings at the ends are the only ones they could be, zero mappings.
It is easily seen that this sequence is
exact.
Associated with a is the exact sequence A
o By Theorem R an
ri
6.3
+---
u /Im(a)
A
u
a +---
U/Im(a) is isomoprhic to
(6.3)
A v
i
+---
K(a)
+---
o.
(6.4)
the restriction map R is the dual oft, and by Theorem
differ by a natural isomorphism.
sequences
*7
1] +---
and
(6.4)
Kca),
4.5
With the understanding that
and V/Im(a) is isomorphic to
are dual to each other.
Kca),
the
I Direct Sums
If A and 8 are any two sets, the set of pairs, ( b), where EA E 8, is called the product set of A and 8, and is denoted by A x 8. If {A; I i 1, 2, ... , ri} is a finite indexed collection of sets, the product set of the {A;} is the set of all n-tuples, ( , ) where EA;. This product set is denoted by X�1 A;. If the index set is not ordered, the description of
Definition.
a,
and b
a
=
a1, a2,
•
•
•
an ,
a;
the product set is a little more complicated. To see the appropriate generali
zation, notice that an n-tuple in
x;=1 A;, in effect, selects one element from A;. Generally, if {Aµ J µEM} is an indexed collection of sets, an element of the product set XµeM Aµ selects for each index µ an element of Aw Thus, an element of XµeM Aµ is a function defined on M which associates with each µEM an element µ EAw Let {V; I i 1, 2, ... , n} be a collection of vector spaces, all defined over each of the
a
=
the same field of scalars F.
With appropriate definitions of addition and
Linear Functionals, Bilinear Forms, Quadratic Forms I
148
IV
scalar multiplication it is possible to make a vector space over F out of the product set
x;=1 Vi.
(<X1,
•
•
•
'
We define addition and scalar multiplication as follows:
<Xn) + (/31, a(<X1,
•
.
•
•
•
•
•
, <Xn)
f3n) =
=
(<X1 + /31,
(a<X1,
•
•
·
·
·
'
<Xn + f3n)
, a<Xn).
•
(7.1) (7.2)
It is not difficult to show that the axioms of a vector space over Fare satisfied, and we leave this to the reader.
Definition.
The vector space constructed from the product set
definitions given above is called the denoted by If D
=
V1
EB
V2
EB7=1 Vi is
EB
·
·
EB
·
Vn
=
external direct EB7=1 Vi.
the external direct sum of the
Vi,
sum
the
X�1 Vi by the Vi and is
of the
Vi are not subspaces
of D (for n > 1 ). The elements of D are n-tuples of vectors while the elements of any
Vi are vectors.
For the direct sum defined in Chapter I, Section 4, the
summand spaces were subspaces of the direct sum. If it is necessary to distinguish between these two direct sums, the direct sum defined in Chapter I will be called the
interna l direct sum.
Vi
Even though the
are not subspaces of D it is possible to map the
Vi
monomorphically into D in such a way that D is an internal direct sum of these
<Xk E vk the element (0, ... , 0, <Xk, 0, ... , 0) ED, <Xk appears in the kth position. Let us denote this mapping by '"· tk is a monomorphism of Vk into D, and it is called an injection. It provides an embedding of Vk in D. If V� = Im (tk) it is easily seen that D is an internal images.
Associate with
in which
direct sum of the
V�.
It should be emphasized that the embedding of
Vk in D provided by the
injection map tk is entirely arbitrary even though it looks quite natural. There are actually infinitely many ways to embed
Vk in D. For example, let a be any Vk into V1 (we assume k ¥- 1). Then define a new mapping t� of Vk into D in which <Xk E Vk is mapped onto (a(<Xk), 0, ... , 0, <Xn, 0, ... , 0) ED. It is easily seen that t� is also a monomorphism of V" linear transformation of
into D.
Theorem 7.1. PROOF.
be a basis
(0, {3,.)}
=
A of V. (A, 8) Let
If dim
n, then dim U EB V = m + n. and dim V , <Xm} be a basis of U and let 8 = {/31> ... , f3n} Then consider the set {(<X1, 0), ... , (<Xm, 0), (0, {31), , in U EB V. If <X = L�i ai<X; and f3 = L�=1 b1{31, then n m (<X, {3 ) L a;(<Xi, 0) + L b;(O, /31)
=
{<X1,
U
.
.
= m
=
•
.
i=l V. If we
j=l
=
and hence
(A, 8)
spans U EB
have a relation of the form
m
n
i=l
j=l
L ai(<Xi, 0) + L b;(O, {31)
=
0,
•
•
7
I Direct
Sums
149
then
( i� a;1x;, ;�ib;/3;)
and hence
2I:1 a; rx ; a;
dependent, all
=
ffi Vis of dimension
U
0 and
=
0 and all
0.
Since
Thus
(A, B)
,27=1 b;p; b;
=
0.
=
A
and
B
are linearly in
is a basis of U ffi V and
D
m + n.
EB?=i
It is easily seen that the external direct sum of dimension
0,
=
27=1 m;.
V;, where dim V;
m;,
=
is
We have already noted that we can consider the field F to be a 1-dimen
sional vector space over itself.
With this starting point we can construct
the external direct sum F ffi F, which is easily seen to be equivalent to the 2-dimensional coordinate space f2.
Similarly, we can extend the external
direct sum to include more summands, and consider P to be equivalent to F ffi
·
·
·
ffi F, where this direct sum includes
We can define a mapping rrk
is called a
projection of
rrk
n
summands.
of D onto Vk by the rule
rr
D onto the kth component.
i rx1,
•
•
•
Actually,
)
,
rxn
rxk.
=
is not a
rrk
projection in the sense of the definition given in Section 11-1, because here the domain and codomain of
( tkrrk)
2
=
kernel of
tkrrktkrrk
=
tk l rrk
=
rrk
are different and
tkrrk
so that
It is easily seen that
rrk.
Wk
V1 ffi
=
·
·
i
ffi Vk- ffi
·
tkrrk
2 rrk
is not defined. However,
is a projection.
ffi Vk+1 ffi
{O}
·
·
ffi Vn
·
Let
Wk denote (7.3)
·
The injections and projections defined are related in simple but important ways. It is readily established that rrktk rr;tk
=
t17T1
The mappings
tkrri
for
i ¥- k
include the codomain of
+
•
i ¥-
for •
·
+
ln1Tn
=
(7.5)
k,
(7.6)
lo.
are not defined since the domain of
(7.4), (7.5),
(7.6),
and
does not
are sufficient to define the
Starting with the Vk, the monomorphisms
Im ( tk). Let D'
tk
rri.
Conversely, the relation direct sum.
0
(7.4)
I vk'
=
tk
embed the Vk in D.
+ V�. Conditions (7.4) and (7.5) imply that D' is a direct sum of the V�. For if 0 rx � + + rx � , with rx � E V�, there exist rxk E Vk such that tk ( rxk) rx � . Then rrk (O) rrk ( rx �) + + rrk ( rx �) rrkt1 ( rx1 ) + + rrktn ( rxn) rxk 0. Thus rx � 0 and the sum is Let V�
=
=
V�
+
·
·
·
=
·
·
·
=
=
·
direct. Condition Theorem 7.2. PROOF.
m+n
·
·
(7.6) implies
First of all, if dim A
and dim U ffi V
=
=
that D'
The dual space of U A
=
=
U
m + n.
=
=
·
A
naturally isomorphic to
and dim V
Since
·
D.
ffi V is
m
·
=
�
U
=
n, A
A
U ffi V.
./'--._
then dim U ffi V A
=
ffi V and U ffi V have the same
·
Linear Functionals, Bilinear Forms, Quadratic Forms I
150
dimension, there exists an isomorphism between them.
IV
The real content
of this theorem, however, is that this isomorphism can be specified in a natural way independent of any coordinate system. For (c/>,
1P) E 0
ffi
V
and (.x, {3) EU ffi V, define
(c/>, 1/1)(.x, /3)
=
c/>.x +
(7.7)
1Pf3·
It is easy to that this mapping of (.x, {3) EU ffi V onto c/>.x + 11'/3 E Fis linear and, therefore, corresponds to a linear functional, an element of /'-..._
U
A
/'-..._
A
V. It is also easy to that the mapping of U ffi V into U ffi V that this defines is a linear mapping. Finally, if (c/>, 1/1) corresponds to the zero linear functional, then (c/>, 1/1)(.x, 0) c/>.x 0 for all .x EU. This implies that c/> 0. ffi
=
=
In a similar way we can conclude that 1P A
U
A
ffi
/'-..._
V into U
ffi
V has kernel
Corollary 7.3. to
V1
ffi ... ffi
vn"
{(O, O)}.
=
=
0.
This shows that the mapping of
Thus the mapping is an isoinorphism. o
The dual space to V1
ffi
·
·
·
ffi
V
D
n is naturally isomorphic
The direct sum of an infinite number of spaces is somewhat more com
XµeMV is a function µ .x(µ) denote the value of this function in Vµ. Then we can define .x + {3 and a.x (for a E f) by the rules plicated. In this case an element of the product set P on the index set M.
For <X E xµEMV , let <Xµ µ
(.x + /3)(µ) (a.x)(µ)
=
=
=
=
(7.8) (7.9)
<Xµ + /3µ, a.xw
It is easily seen that these definitions convert the product set into a vector space. As before, we can define injective mappings tµ of Vµ into P. However, P is not the direct sum of these image spaces because, in algebra, we permit
sums of only finitely many summands. Let D be the subset of P consisting of those functions that vanish on all but a finite number of elements of M. With the operations of vector addition and scalar multiplication defined in P, D is a subspace. concepts.
Both D and P are useful
To distinguish them we call D the external direct sum and P the
direct product. These are not universal and the reader of any mathe matical literature should be careful about the intended meaning of these or related .
To indicate the summands in P and D, we will denote P by
xµEMVµ and D by EBµEM vµ" In a certain sense, the external direct sum and the direct product are dual concepts. Let t denote the injection of V into P and let
µ
µ
jection of P onto V . It is easily seen that we have
µ
and
TTµtµ
=
lvµ•
for
v-:;eµ.
7T
µ
denote the pro
.. A� F131r I\_,,
NUCLEAR
I
7
I Direct
. :;;;:,·· C\ J1 A
"""'" ,,.::_,,." "
Sums
� i'W'
151
These mappings also have meaning in reference to same notation,
TTµ
D.
Though we use the
tµ D the analog of (7.6) is correct,
requires a restriction of the domain and
restriction of the codomain. For
requires a
(7.6)' (7.6)' oc ED,
Even though the left side of when applied to an element
involves an infinite number of ,
(7.10) involves only a finite number of .
An analog of
product is not available.
(7.6)
for the direct
Consider the diagram of mappings
(7. 1 1 ) and consider the dual diagram
(7. 12) For
v
I = 1. phism. onto
� µ,
v µ"
TTvlµ = 0.
By Theorem Thus
frµ
Thus
6.2, tµ
�fr. = -:;:;:t; = 0.
For
v =
is an epimorphism and
is an injection of
vµ
into
6,
and
"'
µ,
frµ
�frµ =:;;;;, =
is a monomor
is a projection of
D
If D is the external direct sum of the indexed collection {Vµ Iµ EM}, D is isomorphic to the direct product of the indexed collection {Vµ Iµ EM}. PROOF. Let ED. For eachµ EM, tµ is a linear functional defined on Vµ; that is, tµ corresponds to an element in Vµ. In this way we define a function on M which has atµ EM the value tµ E vµ" By definition, this is an element in XµeM Yw It is easy to check that this mapping of D into the direct product x µEM Vµ is linear. If� 0, there is an oc ED such that oc � 0. Since oc= [(LµeM tµTT µ)(oc)] = Lµ eM tµTTµ(oc) � 0, there is a µEM such that tµTTµ{oc) � 0. Since TTµ(oc) E Vµ, tµ � 0. Thus, the kernel of the mapping of D into xµEM vµ is Theorem 7.4.
A
zero.
Let tp E xµEM vµ" 'l/Jµ = tp(µ) E Vµ be the value of tp at µ. For oc ED, define oc = LµeM"Pµ (TTµoc). This sum is defined since TTµ°'= 0 for all but finitely manyµ. Finally , we show that this mapping is an epimorphism.
Let
Linear Functionals, Bilinear Forms, Quadratic Forms
152
tv(oc,)
J IV
(tvocv) = L "Pµ( 7Tµlvocv) µEM =
(7.13) This shows that "P is the image of . While Theorem
Hence,
6
and xµEM
vµ are ismorphic.
D
7.4 shows that the direct product D is the dual of the exter
nal direct sum D, the external direct sum is generally not the dual of the direct product. This conclusion follows from a fact (not proven in this book) that infinite dimensional vector spaces are not reflexive.
However, there is more
symmetry in this relationship than this negative assertion seems to indicate. This is brought out in the next two theorems. Let {Vµ Iµ EM} be an indexed collection of vector spaces {aµ I µ EM} be an indexed collection of linear transformations, where aµ has domain Vµ and codomain Ufor allµ. Then there is a unique linear transformation a of (f)µ EMVµ into U such that <1µ <Jtµ for eachµ. Theorem 7.5.
over F and let
=
PROOF.
Define
(7.14) oc E ([)µEM Vµ, a(oc)
For each
=
LµeM aµ7Tµ(oc) is well defined since only a finite oc, E V.,
number of on the right are non-zero. Then, for
<Jtv(av)
=
=
L <1 (tvocv) µEM µ7Tµ L <1 (7T t,)(ocv) µEM µ µ
ai. a,. If <11 is another linear transformation of (BµEM Vµ into LJ such that <Jµ
Thus
(7.15)
=
then
a'
a'ln = a' I lµ7Tµ µEM = .2 a'tµ7Tµ µEM
=
= <J.
Thus, the
a with the desired
property is unique. o
=
<11 tµ
,
7
I
Direct Sums
153
Let {Vµ Iµ E M} be an indexed collection of vector s paces {Tµ Iµ E M} be an indexed collection of linear transformations where Tµ has domain wand codomain vµ for all µ. Then there is a linear transformation T Of winto xµEM vµ SUCh that Tµ =7T µT for each µ. PROOF. Let oc E W be given. Since T(oc) is supposed to be in XµEMVµ, T(oc) is a function on M which forµ E M has a value in vµ" Define Theorem 7.6.
over F and let
T(oc)(µ) =Tµ(oc).
(7.16)
Then
(7.17) so that
7T µT =Tw
D
The distinction between the external direct sum and the direct product is that the external direct sum is too small to replace the direct product in Theorem
7.6.
This replacement could be done only if the indexed collection
of linear transformations were restricted so that for each many mappings have non-zero values
oc E
W only finitely
Tµ(oc).
The properties of the external direct sum and the direct product established in Theorems
7.5
and
7.6
are known as "universal factoring" properties.
In
Theorem 7.5 we have shown that any collection of mappings of Vµ into a space U can be factored through D.
In Theorem
collection of mappings of W into the
7.7 and 7.8 show that D and Theorem 7.7.
7.6
we have shown that any
Vµ can be factored
through P. Theorems
P are the smallest spaces with these properties.
Let W be a vector space over F with an indexed collection
of linear transformations
{ A.µ Iµ EM}
where each
Aµ
has domain
V1,
and co
domain W. Suppose that,for any indexed collection of linear transformations
{aµ Iµ EM} tion
A. of
with domain
Vµ and codomain U, there exists a linear transforma aµ =A.A.w Then there exists a monomorphism of
W into U such that
D into W. PROOF. D such that a
A. of W into 7.5 there is a unique linear transformation
By assumption, there exists a linear transformation
tµ =A.A.w
By Theorem
of D into W such that
Aµ = atw 1
Then
=µLlµ7Tµ EM
=µ! A.atµ7Tµ EM
=A.a! tµ7Tµ µ EM
=A.a. This means that
a is a monomorphism
(7.18) and A. is an epimorphism. D
Linear Functionals, Bilinear Forms, Quadratic Forms I IV
154
Theorem 7.8. Let Y be a vector space over F with an indexed collection of linear transformations {Oµ IµEM} where each()µ has domain Y and codomain Vw Suppose that, for any indexed collection of linear transformations {rµ IµEM} with domain w and codomain vµ, there exists a linear transfor mation () of W into Y such that Tµ ()µ()· Then P is isomorphic to a subspace of Y . =
With P in place of W and TTµ in place of r, µ the assumptions
PROOF.
0f the theorem say there is a linear transformation 0 of P into Y such that TT
µ
=
()µ() for eachµ. By Theorem
into P such that()µ
in
Recall that
vµ" Thus
Thus
r 0 ( 1X)
IX E P
IX
=
=
TTµT
7.6 there is a linear transformation r of Y
for eachµ. Then
is a function defined on M that has at µ EM a value IXµ
is uniquely defined by its values. ForµEM
IX
and
rO
=
This means that 0 is a monomorphism and
Ip.
r
is an epimorphism and P is isomorphic to Im(O). D
Theorem 7.9. Suppose a space D' is given with an indexed collection of monomorphisms {t; IµEM} of Vµ into D' and an indexed collection of epi morphisms {TT; IµEM} of D' onto Vµ such that
v
Then
D
�µ.
and D' are isomorphic.
This theorem says, in effect, that conditions
characterize the external direct sum. PROOF. IXE D'
For
only
IXE D'
finitely
let
IXµ
many
=
;(1X) .
TT
and
(7.6)'
We wish to show first that for a given
non-zero. By (7.6)' IX 10 (IX) LµEM i;TT;( 1X) L µEM i;IXµ- Thus, only finitely many of the i;IXµ are non-zero. Since i; is a monomorphism, only finitely many of the IXµ are non-zero. IXµ
are
(7.4), (7.5),
=
.
=
=
Now suppose that {aµ IµEM} is an indexed collection of linear transforma
tions with domain
Define A LµEM O'µTT;. For IXE f)'' LµEM
A(1X)
=
LµEM
vµ ;(ix)
and codomain U.
=
=
=
.
=
=
7I
Direct Sums
155
we also have 10'
=
! i;7T;
µEM
=
! m,,7T�
µEM
=
! aA.i;7T�
µEM
=
=
Since
a is both
aA. ! i;7T; aA..
a monomorphism and an epimorphism, D and D' are iso
morphic. D The direct product cannot be characterized quite so neatly. the direct product has a collection of mappings satisfying
(7.6)'
(7.4)
Although and
(7.5),
is not satisfied for this collection if M is an infinite set. The universal
factoring property established for direct products in Theorem
is inde
pendent of
but not
(7.4)
and
since direct sums satisfy
(7.5),
the universal factoring property of Theorem
7.6.
7.6 (7.4) and (7.5)
We can combine these three
conditions and state the following theorem. Theorem 7.10. Let P' be a vector space over F with an indexed collection of monomorphisms {i; Iµ EM} of Vµ into P' and an indexed collection of epi morphisms {7T; Iµ E M} of P' onto Vµ such that
for
and such that if {p µ I µ E M} is any indexed collection of linear transformations with domain wand codomain vµ, there is a linear transformation p of winto P' such that Pµ 1T�p for each µ. If P' is minimal with respect to these three properties, then P and P' are isomorphic. =
When we say that mean: Let
P"
P'
is minimal with respect to these three properties we
be a subspace of
P'
and let
;
7T
7T; to P". {i; Iµ EM} with
be the restriction of
If there exists an indexed collection of monomorphisms
domain Vµ and codomain P" such that (7.4), (7.5) and the universal factoring properties are satisfied with in place oft and in place of then P" P' ..
i;
PROOF.
By Theorem
isomorphism and let
(P'
in place of Y and
the relations
�
7T;
7T�,
7.8, P is isomorphic to a subspace of P'. P" Im(O). With appropriate changes =
7T�
=
Let() be t he in notation
in place of()µ), the proof of Theorem
7.8
yields
156
where
Linear Functionals, Bilinear Forms, Quadratic Forms I IV T
is an epimorphism of P' onto P. Thus, if
TT
;
is the restriction of TT�
to P", we have
TT; is an epimorphism. i; = ()iw
This shows that Now let and
for
�µ.
v
Since P has the universal factoring property, let T be a linear transformation of W into P such that Pµ =
11 T
for each µ, where
TTµT
OT.
=
property of Theorem 7.6.
for eachµ. Then
This shows that P" has universal factoring
Since we have assumed P' is minimal, we have
P" = P' so that P and P' are isomorphic. D 8 I Bilinear Forms Definition.
Let U and V be two vector spaces with the same field of scalars
Let f be a mapping of pairs of vectors,
F.
the field "of scalars such that function of
f(a1(J.1
+
(J.
and
+
(J. EU
and
/3 E
V, is a linear
b2/32) = aif((J.1, b1/31 + b2/32) + aJ((J.2, b1/31 = a1bif((J.1, /31) + a1bJ((J.1, /32) + a2bif((J.2, /31) + a2bJ((J.2, /32).
Such a mapping is called a
(1)
one from U and one from V, into
where
separately. Thus,
/3
02(J.2, b1/31
f((J., /3),
Take U= V =
Rn
bilinear form.
and F =
R.
+
b2/32)
In most cases we shall have U = V.
Let A=
{(J.1,
•
.
.
, (J.n}
be a basis in
For�= I:=l X;(J.; and 'YJ = I:=1 Y;(J.; we may definef� ( , 'YJ) = I:=1 This is a bilinear form and it is known as the inner, or dot, product.
Rn.
(2)
We can take F=
functions on the interval
R
(8.1)
X;Yi·
and U = V= space of continuous real-valued
[O, l].
We may then definef((J.,
/3)= n (J.(x){J(x) dx.
This is an infinite dimensional form of an inner product. It is a bilinear form.
As usual, we proceed to define the matrices representing bilinear forms with respect to bases in U and V and to see how these matrices are transformed when the bases are changed. Let A=
{/31, ... , /3n} be a basis {(J.1, ... , (J.m} be a basis in U and let B (J. E U, /3 E V, we have (J.= L!1 xi(J.i and {3= I7=1 Y;/3;
in V. Then, for any
=
8
I
Bilinear Forms
157
where x;, Y; E F. Then f(ex, {J)
=
=
=
c� X;ex;, P) i� x;f(ex i,;� Y;fJ;) i� xi ct Y;f(exi, fJ;))
f
i
m
=
n
(8.2)
L .! X;Y; f(ex;, {J;).
i=l j=l
Thus we see that the value of the bilinear form is known and determined for any ex EU, {J E V, as soon as we specify the mn values f(ex;, {31). Con versely, values can be assigned to f(ex;, {31) in an arbitrary way andf(ex, {J) can be defined uniquely for all ex EU, {J E V, because A and B are bases in U and V, respectively. We denotef(ex;, {31) by b;; and define B [b;;] to be the matrix represent ing the bilinear form with respect to the bases A and B. We can use the m-tuple X= (x1, xm) to represent ex and the n-tuple Y (Yi. ..., Yn) to represent {J. Then =
.
.
•
,
=
f(ex, fJ)
m
=
n
L ,Lx ibi;Y; i=l j=l
,
(8.3)
(, our convention is to use an m-tuple X (x1, xm) to represent an m x 1 matrix. Thus X and Y are one-column matrices.) Suppose, now, that A'= {ex�, ..., ex;,.} is a new basis of U with matrix of transition P, and that B' {{J�, ..., {J�} is a new basis of V with matrix of transition Q. The matrix B' [b;11 representingf with respect to these new bases is determined as follows: =
•
.
•
=
=
m =
n
L L Pribrsqs;•
r=ls=l
(8.4)
158
Linear Functionals, Bilinear Forms, Quadratic Forms I
IV
Thus, B' =
PTBQ .
From now on we assume that U = V.
(8.5)
Then when we change from one
basis to another, there is but one matrix of transition and discussion above.
P=
Q in the
Hence a change of basis leads to a new representation
of f in the form B'
Definition. be
The matrices
B
and
= PTBP. pTBP,
(8.6)
where
P is
non-singular, are said to
congruent.
Congruence is another equivalence relation among matrices.
Notice
that the particular kind of equivalence relation that is appropriate and meaningful depends on the underlying concept which the matrices are used to represent.
Still other equivalence relations appear later.
This
occurs, for example, when we place restrictions on the types of bases we allow.
Definition.
lff(IX,
f is symmetric.
/3) = j(/3, IX) for all IX, f3
E V, we say that the bilinear form
Notice that for this definition to have meaning it is necessary
that the bilinear form be defined on pairs of vectors from the same vector space, not from different vector spaces. Iff(a, that the bilinear form/ is
a) = 0
for all
IX
E
V, we say
skew-symmetric.
Theorem 8.1. A bilinear form f is symmetric if and only if any matrix B representing f has the property BT = B. PROOF. The matrix B = [bi; ] is determined by f(1Xi, IX;). But b1; = B. that b , = (a so f = , ; BT f(a1 IX;} = ; ; IX;) If BT= B, we say the matrix B is symmetric. We shall soon see that symmetric bilinear forms and symmetric matrices are particularly important. If BT= B, then f(1Xi, IX;)= b;; = b;; = f(a1, a; ). Thus f(a, /3) = f("i,�1 ailX;, !i�i b1a1) = !f�i !i�i aib;f(1X;, a1) = !�1 !i�i b1a;f(a1, IX;)= f(/3, IX). It then follows that any other matrix representing/will be symmetric; that is, if B is symmetric, then
pTBP is also
symmetric. D
Theorem 8.2. If a bilinear form f is skew-symmetric, then any matrix B representing f has the property BT= -B. PROOF. For any IX, /3 E V, 0 =/(IX + {3, IX + /3) = /(IX, IX) + /(IX, /3) + f(/3, a) + f({3, {3) = f(a, {3) +f({3, IX). From this it follows that f(a, {3) = - f(/3, IX) and hence BT= -B. D Theorem 8.3. If 1 + 1 � 0 and the matrix B representing f has the property BT = -B, thenf is skew-symmetric.
159
8 I Bilinear Forms
Suppose that BT -B, or f(rx., f.J) -f(f.J, rx.) for all rx., f.J E v. Then f(rx., rx.) - f(rx., rx.), from which we have f(rx., rx.) + f(rx., rx.) (1 + I)f(rx., rx.) 0. Thus, if 1 + 1 ¥=- 0, we can conclude thatf(rx., rx.) 0 so that f is skew-symmetric. D PROOF.
=
=
=
=
=
=
If BT -B, we say the matrix B is skew-symmetric. The importance of symmetric and skew-symmetric bilinear forms is implicit in =
Theorem 8.4.
If
l +
1
¥=- 0, every bilinear form can be represented
uniquely as a sum of a symmetric bilinear form and a skew-symmetric bilinear form.
PROOF. Let f be the given bilinear form. Define f,( rx., {.J) Hf(rx., f.J) + f(f.J, rx.)] andf•• (rx., f.J) t[f(rx., f.J) - f(f.J, rx.)]. (The assumption that 1 + 1 ¥=0 is required to assure that the coefficient 'T' has meaning.) It is clear that f,(rx., f.J) f,((.J, rx.) and f,,(rx., rx.) 0 so that f. is symmetric and ..fs, is skew symmetric. We must yet show that this representation is unique. Thus, suppose that f1(a, f.J) + f2(rx., f.J) where f 1 is symmetric and f2 is skew-symmetric. f(rx., f.J) fi(rx., f.J) + f2(rx., f.J) + fi (f.J, rx.) + f2((.J, rx.) Then f(oc, f.J) + f(f.J, rx.) 2f1(rx., f.J) . Hence f1(rx., f.J) t[f(oc, f.J) + f(f.J, rx.)]. If follows immediately that f2(oc, f.J) Hf(rx., f.J) - f(f.J, rx.)]. D =
=
=
=
=
=
=
=
=
We shall, for the rest of this book, assume that 1 + 1 ¥=- 0 even where such an assumption is not explicitly mentioned. EXERCISES 1. Let oc = (xi. x2) E
R2
and let f3
form /(oc, /3) = Determine the 2
x
X1Y1
+
=
(Yi. y2, y3) E R3.
2x1Y2 - X2Y1 - X2Y2
Then consider the bilinear + 6x1y3.
3 matrix representing this bilinear form.
2. Express the matrix
as the sum of a symmetric matrix and a skew-symmetric matrix. 3. Show that if B is symmetric, thenPTBP is symmetric for each P, singular or
non-singular.
Show that if B is skew-symmetric, then pi' BP is skew-symmetric
for eachP. 4. Show that if A is any
m
x n
matrix, then ATA and AAT are symmetric.
5. Show that a skew-symmetric matrix of odd order must be singular.
160
Linear Functionals, Bilinear Forms, Quadratic Forms I IV
6. Let
/(ix,
f be
a bilinear form defined on
U
and V.
Show that, for each
/3) defines a linear functional
With this fixed f show that the mapping of formation of U into
ix EU
V.
onto
7. (Continuation) Let the linear transformation of 6 be denoted by
a1.
Show that there is an
all f3 if and only if the nullity of
a1
ix EU, ix
U into
A
EV
ix EU,
is a linear trans-
A
V defined in Exercise
7i6 0, such that /(ix,
{J)
=0 for
is positive.
{J E V, f ( ix, /3) defines a linear function 'Pp Ea is a linear transformation Tf of v into D.
8. (Continuation) Show that for each
on u. The mapping of /3
E v onto
9. (Continuation) Show that
a1
'Pp
and T1 have the same rank.
U and V are of different dimensions, there must be either an ix EU, ix 7i6 0, such that /(ix, {J) =0 for all {J E V or a f3 E V, {J 7i6 0, such that f(ix, {J) =0 for all ix EU. Show that the same conclusion follows 10. (Continuation) Show that, if
if the matrix representing/ is square but singular. 11. Let U0 be the set of all ix EU such that f(ix, /3) = 0 for all {J E V. Similarly, let V0 be the set of all {J E V such that f(ix, {J) = 0 for all ix EU. Show that U0 is a subspace of U and that V0 is a subspace of V.. 12. (Continuation) Show that
m
- dim U0
= n
- dim V0•
13. Show that if f is a skew-symmetric bilinear form, then /(ix,
for all
ix, {J E V.
{J)
= -f({J,
ix)
14. Show by an example that, if A and Bare symmetric, it is not necessarily true
that AB is symmetric. AB =BA?
What can be concluded if A and B are symmetric and
15. Under what conditions on B does it follow that XTBX =0 for all X? 16. Show the following: If A is skew-symmetric, then A2 is symmetric.
If A is
skew-symmetric and B is symmetric, then AB - BA is symmetric. If A is skew symmetric and Bis symmetric, then AB is skew-symmetric if and only if AB =BA.
9 I Quadratic Forms Definition. setting
q(oc)
A =
quadratic form is a function q on a vector space defined f(oc, oc), where f is a bilinear form on that vector space.
by
Iff is represented as a sum of a symmetric and a skew-symmetric bilinear form,
f(oc, {3)
=
symmetric, then
J.(oc, {3) + J••(oc, {3) where f. is symmetric and f•• is skew q(oc) J.(oc, oc) + J••(oc, oc) J.(oc, oc). Thus q is completely =
=
determined by the symmetric part off alone.
In addition, two different
bilinear forms with the same symmetric part must generate the same quadratic form. We see, therefore, that if a quadratic form is given we should not expect
9 I
Quadratic Forms
161
to be able to specify the bilinear form from which it is obtained.
At best
we can expect to specify the symmetric part of the underlying bilinear form. This symmetric part is itself a bilinear form from which
q
can be obtained.
Each other possible underlying bilinear form will differ from this symmetric bilinear form by a skew-symmetric term. What is the symmetric part of the underlying bilinear from expressed in of the given quadratic form? We can obtain a hint of what it should
x2 as obtained from the bilinear (x + y)2 x2 + xy + yx + y2• Thus if xy yx (sym t[(x + y2) - x2 - y2 ] . express xy as a sum of squares, xy
be by regarding the simple quadratic function function
xy.
Now
metry), we can
=
=
=
In general, we see that the symmetric part of the underlying bilinear form can be recovered from the quadratic form by means of the formula
t[q(a + {3) - q (a) - q({3) ] Hf(a+ {3, a + {3) - f(a, a) f({J, {3)] Hf(a, a)+ f(a, fJ) + f({3, a)+ f({3, fJ) - f(a, a) - f ({3, {3)] Hf(a, fJ) + f({3, a)] J.(a, fJ). =
=
=
=
=
(9.1)
f. is the symmetric part off Thus it is readily seen that Theorem 9.1. Every symmetric bilinear form fs determines a unique quadratic form by the rule q(a) J.(a, a ), and if 1 + I 7"f 0, every quadratic form determines a unique symmetric bilinear form J.(a, {J) t[q(a+ {3) q (a) - q(fJ)]from which it is in turn determined by the given rule. There is a one-to-one correspondence between symmetric bilinear forms and quadratic forms. D =
=
The significance of Theorem
9.1
is that, to treat quadratic forms ade
quately, it is sufficient to consider symmetric bilinear forms.
It is fortunate
that symmetric bilinear forms and symmetric matrices are very easy to handle.
Among many possible bilinear forms corresponding to a given
quadratic form a symmetric bilinear form can always be selected.
Hence,
among many possible matrices that could be chosen to represent a given quadratic form, a symmetric matrix can always be selected. The unique symmetric bilinear formf. obtainable from a given quadratic form
q
is called the
polar form
of
q.
It is desirable at this point to give a geometric interpretation of quadratic forms and their corresponding polar forms.
This application of quadratic
forms is by no means the most important, but it the source of much of the terminology.
(x)
=
In a Euclidean plane with Cartesian coordinate system, let
(x1, x2 ) be
the coordinates of a general point. Then
q((x))
=
X12 - 4X1X2 + 2x22
162
Linear Functionals, Bilinear Forms, Quadratic Forms
I
IV
is a quadratic function of the coordinates and it is a particular quadratic form.
The set of all points (x) for which q((x))
=
1 is a conic section (in
this case a hyperbola). Now, let (y)
=
(y1, y2) be the coordinates of another point. Then
f.((x), (y))
=
X1Y1 - 2X1Y 2 - 2X2Y1 + 2X2Y2
is a function of both (x) and (y) and it is linear in the coordinates of each point separately.
It is a bilinear form, the polar form of q.
(x), the set of all (y) for whichf.((x), (y))
=
For a fixed
1 is a straight line. This straight
line is called the polar of (x) and (x) is called the pole of the straight line. The relations between poles and polars are quite interesting and are ex plored in great depth in projective geometry.
One of the simplest relations
is that if (x) is on the conic section defined by q((x))
=
1, then the polar of
(x) is tangent to the conic at (x). This is often shown in courses in analytic geometry and it is an elementary exercise in calculus. We see that the matrix representingf.((x), (y)), and therefore also q((x)), is
[
]
1
-2
2 .
-2 EXERCISES
1. Find the symmetric matrix representing each of the following quadratic
forms:
(a) (b) (c)
(d)
(e) ([) (g)
2x2 + 3xy + 6y2 Sxy + 4y2 x2 + 2Ty + 4xz + 3y2 + yz + 7z2 4xy x2 + 4xy + 4y2 + 2xz + z2 + 4yz x2 + 4xy - 2y2 x2 + 6xy - 2y2 - 2yz + z2•
2. Write down the polar form for each of the quadratic forms of Exercise 1. 3. Show that the polar form[, of the quadratic form q can be recovered from the quadratic form by the formula
fh·, {3) 10
=
t{q(o:
+
{3) - q(o: - /3)}.
I The Normal Form
Since the symmetry of the polar form
f.
is independent of any coordinate
system, the matrix representing f. with respect to any coordinate system will be symmetric.
The simplest of all symmetric matrices are those for
which the elements not on the main diagonal are all zeros, the diagonal matrices.
A great deal of the usefulness and importance of symmetric
10
I
The
163
Normal Form
bilinear forms lies in the fact that for each symmetric bilinear form, over a field in which 1 + 1 =;e. 0, there exists a coordinate system in which the matrix representing the symmetric bilinear form is a diagonal matrix. Neither the coordinate system nor the diagonal matrix is unique. Theorem JO.I. For a given symmetric matrix B over a field F (in which 1 + 1 =;C. 0), there is a non-singular matrix P such that pTBP is a diagonal
matrix. In other words, if f. is the underlying symmetric bilinear (polar) form, there is a basis A'= {a�, ..., oc�} ofV such that f.(a;, a;)= 0 whenever i ¥-j. PROOF. The proof is by induction on n, the order of B. If n = 1, the
theorem is obviously true (every 1 x 1 matrix is diagonal). Suppose the assertion of the theorem has already been established for a symmetric bilinear form in a space of dimension n - 1. If B = 0, then it is already diagonal. Thus we may as well assume that B =;e. 0. Let f. and q be the corresponding symmetric bilinear and quadratic forms. We have already shown that (10.1) /,(oc, (3) = Hq(oc + (3) - q(oc) - q((3)]. The significance of this equation at this point is that if q(oc) = 0 for all oc, then/,(oc, (3) = 0 for all oc and (3. Hence, there is an oc� EV such that q(oc�) = di¥- 0. With this oc� held fixed, the bilinear formf.(oc�, oc) defines a linear functional
0 0
0 0
0
0
dr
0
0
0
0
0
di
0
0
, ocn
}
B'=
In this display of B' the first r elements of the main diagonal are non-zero
164
Linear Functionals, Bilinear Forms, Quadratic Forms I
IV
and all other elements of B' are zero. r is the rank of B' and B, and it is
also called the ra,nk of the corresponding bilinear or quadratic form. The d/s along the main diagonal are not uniquely determined. introduce a third basis A"
=
{<X�, ... , <X:} such that <X�
=
We can
xi<X; where x; ":/= 0.
Then the matrix of transition Q from the basis A' to the basis A" is a diagonal
matrix with x1,
.
.
•
,
xn
down the main diagonal. The matrix B" representing
the symmetric bilinear form with respect to the basis A" is -
B"
=
Q TB Q '
d1X12
0
0
d2x22
0
0
0
0
0 0
0 0
0
O_
=
Thus the elements in the roam diagonal may be multiplied by arbitrary non-zero squares from F.
[
By 3
0
�]king -3
B'
=
[� �J -
and P
=
[� �]
we get B"
=
pTB'P
=
. Thus, it is possible to change the elements in the main diagonal
.
by factors which are not squares.
However, IB"I
=
IB'I IPl2 so that it ·
is not possible to change just one element of the main diagonal by a non square factor. The question of just what changes in the quadratic form can be effected by P with rational elements is a question which opens the door to the arithmetic theory of quadratic forms, a branch of number theory. Little more can be said without knowledge of which numbers in the field of scalars can be squares. is a square;
In the field of complex numbers every number
that is, every complex number has at least one square root.
Therefore, for each d; ":/= 0 we can choose
xi
1
=
1- so that dixi2 v di
=
l.
In this case the non-zero numbers appearing in the main diagonal of B" are all 1 's. Thus we have proved Theorem
10.2.
If F
is the field of complex numbers, then every symmetric
matrix B is congruent to a diagonal matrix in which all the non-zero elements are 1 's.
The number of 1 's appearing in the main diagonal is equal to the
rank of B. D The proof of Theorem 10.1 provides a thoroughly practical method for find
ing a non-singular P such that pTBP is a diagonal matrix.
The first problem
10
I
The Normal Form
165
is to find an oc� such that q( oc�) ¥- 0. The range of choices for such an oc� is generally so great that there is no difficulty in finding a suitable choice by trial and error. For the same reason, any systematic method for finding an oc� must be a matter of personal preference. Among other possibilities, an efficient system for finding an oc� is the following: First try oc� = oc1• If q(oc1) = b11 = 0, try oc� = oc2 • If q( oc2 ) = h22 = 0, then q(oc1 + oc2) = q(oc1) + 2f.(oc1, oc2) + q(oc2) = 2f.(oc1 oc2) = 2b12 so that it is convenient to try oc� oc1 + oc2 • The point of making this sequence of trials is that the outcome of each is determined by the value of a single element of B. If all three of these fail, then we can our attention to oc3, oc1 + oc3, and oc2 + oc3 with similar ease and proceed in this fashion. Now, with the chosen oc�,f.(oc�, oc) defines a linear functional � on V. If oc� is represented by (p11, , Pni) and oc by (x1, , xn), then =
•
f,(oc'i_,oc) =
•
•
•
•
•
i�J1 P;1h;1x1 = ,�C� pilbi1) x1.
(10.2)
This means that the linear functional � is represented by [p11 • PnilB. The next step described in the proof is to determine the subspace W1 annihilated by �. However, it is not necessary to find all of W1• It is sufficient to find an oc� E W1 such that q(oc�) ¥- 0. With this oc�, f.(oc�, oc) defines a linear functional � on V. If oc� is represented by (p12, , Pn2), then � is represented by [P12 • • • Pn2JB. The next subspace we need is the subspace W2 of W1 annihilated by �. Thus W2 is the subspace annihilated by both � and �. We then select an oc� from W2 and proceed as before. Let us illustrate the entire procedure with an example. Consider •
•
.
B=l� � �]
.
•
0 .
2
Since b11 = b22 = 0, we take oc� = oc1 + oc2 functional � is represented by [1 1 O]B = [l
=
(1, 1, 0). Then the linear
3].
A possible choice for an oc� annihilated by this linear functional is (1, The linear functional � determined by (1, -1, 0) is represented by [1 -1 O]B [-1 1 1].
-1,
0).
=
We should have checked to see that q(oc�) ¥- 0, but it is easier to make that check after determining the linear functional � since q(oc�) = ef>�oc� = -2 ¥- 0 and the arithmetic of evaluating the quadratic form includes all the steps involved in determining �.
166
Linear Functionals, Bilinear Forms, Quadratic Forms I
We must now find an
IX
� annihilated by� and �.
IV
This amounts to solving
the system of homogeneous linear equations represented by
�
A possible choice is tional
�
IX
(-1, -2,
=
I).
The corresponding linear func
is represented by
[-I
l]B
-2
0 -4].
[O
=
The desired matrix of transition is
p
=
r:
-1
lo
=�]
0
l .
Since the linear functionals we have calculated along the way are the rows of prB, the calculation of P7'BP is half completed. Thus,
pTBP
=
[-:O : �] [: _: =�] [� -� �] =
0
-4
0
0
0
1
0 -4 .
It is possible to modify the diagonal form by multiplying the elements in the main diagonal by squares from F.
Thus, if F is the field of rational
{2, -2, -I}. numbers we can get the diagonal {I, -1, -I}. If plex numbers we can get the diagonal {l, I, 1 }. numbers we can obtain the diagonal
If F is the field of real F is the field of com
Since the matrix of transition P is a product of elementary matrices the diagonal from pTBP can also be obtained by a sequence of elementary row and column operations, provided the sequence of column operations is exactly the same as the sequence of row operations.
This method is
commonly used to obtain the diagonal form under the congruence. element
If an
bii in the main diagonal is non-zero, it can be used to reduce all other
elements in row i and column i to zero. If every element in the main diagonal is zero and
b;; -:F 0,
then adding row j to row i and column j to column i
will yield a matrix with 2b;; in the ith place of the main diagonal. The method
is a little fussy because the same row and column operations must be used, and in the same order. Another good method for quadratic forms of low order is called
pleting the square. If
xrBX
X TBX
=
L�;�1 X;b;;X; and h;; -:F 0,
1 (b;1X1 + -b;;
·
·
·
+
b;nxn)2
com
then
(10.3)
10 I The Normal Form
167
is a quadratic form in which
xi
does not appear. Make the substitution
(10.4) Continue in this manner if possible.
The steps must be modified if at any
stage every element in the main diagonal is zero. x
;
If b;; �
0,
then the sub
and x; xi x1 will yield a quadratic form repre sented by a matrix with 2b;1 in the ith place of the main diagonal and -2bii in the jth place. Then we can proceed as before. In the end we will have stitution
=
xi
+
x1
=
-
(10.5) expressed as a sum of squares; that is, the quadratic form will be in diagonal form. The method of elementary row and column operations and the method of completing the square have the advantage of being based on concepts much less sophisticated than the linear functional.
However, the com
putational method based on the proof of the theorem is shorter, faster, and more compact.
It has the additional advantage of giving the matrix
of transition without special effort.
EXERCISES 1. Reduce each of the following symmetric matrices to diagonal form.
Use the
method of linear functionals, the method of elementary row and column operations,
[� : -;]
and the method of completing the square,
(a)
(c)
(b)
_
[� : -:J _
�[ - ] (d) [� �i -� �
2
0
-1
0
2
-1
-1
1
2
0
3
0
2
1
0
2. Using the methods of this section, reduce the quadratic forms of Exercise 1,
Section 9, to diagonal form. 3. Each of the quadratic forms considered in Exercise 2 has integral coefficients.
Obtain for each a diagonal form in which each coefficient in the main diagonal is a square-free integer.
168
Linear Functionals, Bilinear Forms, Quadratic Forms I IV
11 I Real Quadratic Forms A quadratic form over the complex numbers is not really very interesting. From Theorem 10.2 we see that two different quadratic forms would be distinguishable if and only if they had different ranks. Two quadratic forms of the same rank each have coordinate systems (very likely a different coordinate system for each) in which their representations are the same. Hence, any properties they might have which would be independent of the coordinate system would be indistinguishable. In this section let us restrict our attention to quadratic forms over the field of real numbers. In this case, not every number is a square; for example, -1 is not a square. Therefore, having obtained a diagonalized representation of a quadratic form, we cannot effect a further transformation, as we did in the proof of Theorem 10.2 to obtain all 1's for the non-zero elements of the main diagonal. ·The best we can do is to change the positive elements to + 1 's and the negative elements to -1 's. There are mariy choices for a basis with respect to which the representation of the quadratic form has only + 1 's and -1's along the main diagonal. We wish to show that the number of+ 1's and the number of -1 's are independent of the choice of the basis; that is, these numbers are basic properties of the underlying quadratic form and not peculiarities of the representing matrix.
Theorem 11.1. Let q be a quadratic form over the real numbers. Let P be the number of positive in a diagonalized representation of q and let N be the number of negative . In any other diagonalized representation of q the number of positive is P and the number of negative is N. PROOF. Let A { a:1, ... , a:n} be a basis which yields a diagonalized representation of q with P positive and N negative in the main diagonal. Without loss of generality we can assume that the first P elements of the main diagonal are positive. Let B { {31 , , fln} be another basis yielding a diagonalized representation of q with the first P' elements of the =
=
•
.
•
main diagonal positive. Let U (ct1, , ctp ) and let W ({JP'+i• ... , fln>· Because of the form of the representation using the basis A, for any non-zero a: E U we have q(a:) > 0. Similarly, for any f3 E W we have q({J) :::;; 0. This shows that {O}. Now dim U P, dim W n - P', and dim (U + W):::;; n. Un W Thus P + n - P' dim U + dim W dim (U + W) + dim (U n W) dim (U + W) :::;; n. Hence, P - P' :::;; 0. In the same way it can be shown that P' - P :::;; 0. Thus P P' and N r -P - P' N'. D =
.
•
•
=
=
=
=
=
=
=
=
=
=
=
Definition. The number S P - N is called the signature of the quadratic form q. Theorem 11.1 shows that S is well defined. A quadratic form is called non-negative semi-de.finite if S r. It is called positive de.finite if S n. =
=
=
11 I Real Quadratic Forms
169
It is clear that a quadratic form is non-negative semi-definite if and only if q(rt.) � 0 for all rt. E V. non-zero rt. E V.
It is positive definite if and only if q(rt.) > 0 for
These are the properties of non-negative semi-definite
and positive definite forms that make them of interest.
We use them ex
tensively in Chapter V. If the field of constants is a subfield of the real numbers, but not the real numbers, we may not always be able to obtain + I's and - I's along the main diagonal of a diagonalized representation of a quadratic form. However, the statement of Theorem I I. I and its proof referred only to the diagonal as being positive or negative, not necessarily +I or
-
1
.
Thus the theorem is equally valid in a subfield of the real numbers, and the definitions of the signature, non-negative semi-definiteness, and positive definiteness have meaning. In calculus it is shown that
oo
J
e_.,
-oo
2
dx
=
u 7r'2•
It happens that analogous integrals of the form
,L x;a;;X;
appear in a number of applications. The term
=
XT AX appearing
in the exponent is a quadratic form, and we can assume it to be symmetric. In order that the integrals converge it is necessary and sufficient that the There is a non-singular matrix P such
quadratic form be positive definite. that pTAP of L.
(y1,
•
=
If X •
•
Lis a diagonal matrix. Let
=
, Yn)
(x1,
•
•
•
,
xn}
{}.1,
.
•
•
,
An} be the
main diagonal
are the old coordinates of a point, then Y
are the new coordinates where
x;
=
_L1p;1y1.
Since
the Jacobian of the coordinate transformation is det P. Thus,
y,
y,
-
det P� v.
A1
·
·
·
:'.:.._ 4
An
OX·
�1 Y
=
=
p;1,
170
Linear Functionals, Bilinear Forms, Quadratic Forms
Since Ai
·
·
An
·
=
det L
=
det P det A det P I-
=
I IV
det P2 det A, we have
7T1l/2 -
det A'-'i .
EXERCISES 1. Determine the rank and signature of each of the quadratic forms of Exercise 1, Section 9.
2. Show that the quadratic form Q(x, y) = ax2 + bxy + cy2(a, b, c real) is positive definite if and only if a > 0 and b2 - 4ac < 0.
3. Show that if A is a real symmetric positive definite matrix, then there exists a real non-singular matrix P such that A = pTP.
4. Show that if A is a real non-singular matrix, then ATA is positive definite. 5. Show that if A is a real symmetric non-negative semi-definite matrix-that is, A represents a non-negative semi-definite quadratic form-then there exists a real matrix R such that A = RT R.
6. Show that if A is real, then ATA is non-negative semi-definite. 7. Show that if A is real and ATA = 0, then A = 0. 2 8. Show that if A is real symmetric and A 0, then A = 0. =
9. If Ai, ... , Ar are real symmetric matrices, show that
implies Ai =A2 =
12 I
·
·
·
=Ar = 0.
Hermitian Forms
For the applications of forms to many problems, it turns out that a quadratic form obtained from a bilinear form over the complex numbers is not the most useful generalization of the concept of a quadratic form over the real numbers.
As we see later, the property that a quadratic form
over the real numbers be positive-definite is a very useful property.
While
x2 is positive-definite for real x, it is not positive-definite for complex x. When dealing with complex numbers we need a function like lxl2 ix, where x is the conjugate complex of x. xx is non-negative for all complex (and real) x, and it is zero only when x 0. Thus xx is a form which has =
=
the property of being positive definite.
In the spirit of these considerations,
the following definition is appropriate. Definition.
Let
F be the field of complex numbers, or a subfield of the F. A scalar valued
complex numbers, and let V be a vector space over
I Hermitian Forms
12
171
functionf of two vectors,
f(IX,
(1)
IX,
fJ
E
fJ)=f ({J,
IX
V is called a Hermitian form if
(12.1)
). f(1X, b1f31 + b2f32)=bif(1X, f31) + bd(IX, f32).
(2)
A Hermitian form differs from a symmetric bilinear form in the taking of the conjugate complex when the roles of the vectors
and fJ are inter
IX
changed. But the appearance of the conjugate complex also affects the bilinearity of the form. Namely,
f(a11X1 + a21X2,
fJ)
= f ({J, a11X1 + a21X2)
= aif({J, 1X1) + ad(fJ, IX2) = aif({J, 1X1) + ad(fJ, 1X2) = iiif(1X1, {J) + iid(1X2, {J). We describe this situation by saying that a Hermitian form is linear in the second variable and conjugate linear in the first variable. Accordingly, it is also convenient to define a more appropriate general ization to vector spaces over the complex numbers of the concept of a bilinear form on vector spaces over the real numbers. A function of two vectors on a vector space over the complex numbers is said to be conjugate bilinear if it is conjugate linear in the first variable and linear in the second. We say that a function of two vectors is Hermitian symmetric if f(1X, {J) =
f({J, IX) .
This is the most useful generalization to vector spaces over the
complex numbers of the concept of symmetry for vector spaces over the real numbers.
In this terminology a Hermitian form is a Hermitian sym
metric conjugate bilinear form. For a given Hermitian formf, we define we call a Hermitian quadratic form.
q(1X)=f(1X, IX)
and obtain what
In dealing with vector spaces over the
field of complex numbers we almost never meet a quadratic form obtained from a bilinear form. The useful quadratic forms are the Hermitian quadratic forms. Let
A= {1X1,
. . ., 1Xn} be any basis of V. Then we can let
f(1Xi, IX;)=hi;
and obtain the matrix H= [hii] representing the Hermitian form f with respect to
A.
H has the property that h;;=f(1X;,
IX;
) = f(1X1,
IX;
)=h1;,
and any matrix which has this property can be used to define a Hermitian form. Any matrix with this property is called a Hermitian matrix. If A is any matrix, we denote by
A the
matrix obtained by taking the
conjugate complex of every element of A; that is, if A= [aii] then A=
[iii1].
We denote ,.fT =AT by A*. In this notation a matrix His Hermitian if and only if H*=H.
If a new basis B = {{Ji. .. . , fJn} is selected, we obtain the representation
172 H' is,
=
(31
Linear Functionals, Bilinear Forms, Quadratic Forms I IV
[h;11
where
h;1
L�=I p;/X;.
=
=
f((J;, (31).
Let
P be the matrix of transition;
h;j
=
f ((J;, f31)
=
f
=
8
=
L PsiL PkJ(rxk, rx,) S=l k=l
=
L L AihksPsi• S=lk=l
Ct Pkirxk, � Ps1rxs) � Psd (J1 Pkirxk, rxs) 8
n
n
n
n
In matrix form this equation becomes H'
Definition.
=
(12.3)
P*HP.
If a non-singular matrix P exists such that
that Hand H' are
that
Then
H'
=
P* HP, we say
Hermitian congruent.
Theorem 12.1. For a given Hermitian matrix H there is a non-singular matrix P such that H' P*HP is a diagonal matrix. In other words, iff is the underlying Hermitian form, there is basis A' {rx�, ... , rx�} such that f(rt::. rx;) 0 whenever i � j. =
=
=
PROOF.
The proof is almost identical with the proof of Theorem 10.1, the
corresponding theorem for bilinear forms.
There is but one place where
a modification must be made. In the proof of Theorem 10.1 we made use of a formula for recovering the symmetric part of a bilinear form from the associated quadratic form. For Hermitian forms the corresponding formula is
![q(rx + (J) - q(rx - (J) - iq (rx + i(J) + iq(rx - i(J) ] f(rx, (J). (12.4) Hence, if f is not identically zero, there is an rx1 E V such that q (rx1) � 0. =
The rest of the proof of Theorem 10.1 then applies without change. D Again, the elements of the diagonal matrix thus obtained are not unique. We can transform H' into still another diagonal matrix by means of a diagonal matrix Q with fashion we obtain
H"
=
Q*HQ '
x1, .•• , x n ,
-d1 lx1l2
X;
� 0, along the main diagonal. In this 0
0
0
d2 lx 12 2
0
0
(12.5)
=
0
0
dr lxrl2
0
0
0
0
0
12 I Hermitian Forms
173
We see that, even though we are dealing with complex numbers, this trans formation multiplies the elements along the main diagonal of H' by positive real numbers. Since q(r.t.)
=
f (r.t., r.t.)
=
f (r.t., r.t.), q((f.) is always real.
We can, in fact, apply
without change the discussion we gave for the real quadratic forms.
Let
and let N denote the number of negative in the main diagonal.
The
P denote the number of positive in the diagonal representation of q, number S form
q.
=
P
Again,
-
signature rank of q.
N is called the
P+
N
=
r,
the
of the Hermitian quadratic
The proof that the signature of a Hermitian quadratic form is independent of the particular diagonalized representation is identical with the proof given for real quadratic forms.
non-negative semi-definite if S r. definite if S = n. Iffis a Hermitian form whose associated Hermitian quadratic form q is positive-definite (non-negative semi-definite), we say that the Hermitian form f is positive-definite (non-negative semi definite). A Hermitian quadratic form is called
=
It is called positiV!e
A Hermitian matrix can be reduced to diagonal form by a method analo gous to the method described in Section 10, as is shown by the proof of Theorem 12.1.
A modification must be made because the associated Her
mitian form is not bilinear, but complex bilinear. Let
a� be a vector for which q(a�) -:;t: 0. With this fixed r.t.�, f(a�, a) defines
a linear functional
•
•
•
=
•
then
f(r.t.�, a)
=
=
This means the linear functional
•
•
=
n n L L pilh;;X; i=lj=l
i c� pilhii) X;.
; l
represented by
(12.6) P*H.
EXERCISES 1.
[: :] [ ]
Reduce the following Hermitian matrices to diagonal form. (a)
(b)
-
1
1 + i
1
-
i
1
2. Let f be an arbitrary complex bilinear form. Definef* by the rule, f*( ex, {3) /({3, oc) . Show that/* is complex bilinear.
=
174
Linear Functionals, Bilinear Forms, Quadratic Forms
J IV
3. Show that if His a positive definite Hermitian matrix--that is, H represents
a positive definite Hermitian form-then there exists a non-singular matrix P such that H
=
P*P.
4. Show that if A is a complex non-singular matrix, then A* A is a positive
definite Hermitian matrix. 5. Show that if H is a Hermitian non-negative semi-definite matrix-that is, H
represents a non-negative semi-definite Hermitian quadratic form-then there exists a complex matrix R such that H
=
R* R.
6. Show that if A is complex, then A*A is Hermitian non-negative semi-definite. 7. Show that if A is complex and A*A 8. Show that if A is hermitian and A2
=
=
0, then A
0, then A
=
=
0.
0.
9. If Ai, ... , Ar are Hermitian matrices, show that Ai2 +
implies Ai
=
·
·
·
=
Ar
=
·
·
·
+ Ar2
=
0
0.
10. Show by an example that, if A and B are Hermitian, it is not necessarily
true that AB is Hermitian. What is true if A and B are Hermitian and AB
=
BA?
chapter
v Orthogonal and unitary transformations, normal matrices
In this chapter we introduce an inner product based on an arbitrary positive definite symmetric bilinear form, or Hermitian form.
On this basis the
length of a vector and the concept of orthogonality can be defined.
From
this point on, we concentrate our attention on bases in which the vectors are mutually orthogonal and each is of length 1, the orthonormal bases.
The
Gram-Schmidt process for obtaining an orthonormal basis from an arbitrary basis is described. Isometries are linear transformations which preserve length.
They also
preserve the inner product and therefore map orthonormal bases onto orthonormal bases.
It is shown that a matrix representing an isometry has
exactly the same properties as a matrix of transition representing a change of bases from one orthonormal basis to another.
If the field of scalars is
real, these matrices are said to be orthogonal; and if the field of scalars is complex, they are said to be unitary. If A is an orthogonal matrix, we show that AT= A-1; and if A is unitary, we show that A* = A-1.
Because of this fact a matrix representing a linear
transformation and a matrix representing a bilinear form are transformed by exactly the same formula under a change of coordinates provided that the change is from one orthonormal basis to another.
This observation
unifies the discussions of Chapter III and IV. The penalty for restricting our attention to orthonormal bases is that there is a corresponding restriction in the linear transformations and bilinear forms that can be represented by diagonal matrices.
The necessary and
sufficient condition that this be possible, expressed in of matrices, is that A*A = AA*.
Matrices with this property are called normal matrices.
Fortunately, the normal matrices constitute a large class of matrices and 175
Orthogonal and Unitary Transformations, Normal Matrices I V
176
they happen to include as special cases most of the types that arise in physical problems. Up to a certain point we can consider matrices with real coefficients to be special cases of matrices with complex coefficients.
However, if we wish
to restrict our attention to real vector spaces, then the matrices of transition
must also be real.
This restriction means that the situation for real vector
spaces is not a special case of the situation for complex vector spaces.
In
particular, there are real normal matrices that are unitary similar to diagonal matrices but not orthogonal similar to diagonal matrices.
The necessary
and sufficient condition that a real matrix be orthogonal similar to a diagonal matrix is that it be symmetric. The techniques for finding the diagonal normal form of a normal matrix and the unitary or orthogonal matrix of transition are, for the most part, not new. The eigenvalues and eigenvectors are found as in Chapter III. We show that eigenvectors corresponding to different eigenvalues are automati cally orthogonal so all that nee'1s to be done is to make sure that they are of length 1.
However, something more must be done in the case of multiple
eigenvalues.
We are assured that there are enough eigenvectors, but we
must make sure they are orthogonal.
The Gram-Schmidt process provides
the method for finding the necessary orthonormal eigenvectors.
1 I Inner Products and Orthogonal Bases Even when speaking in abstract we have tried to draw an analogy between vector spaces and the geometric spaces we have encountered in
2- and 3-dimensional analytic geometry.
For example, we have referred to
lines and planes through the origin as subspaces; however, we have nowhere used the concept of distance.
Some of the most interesting properties of
vector spaces and matrices deal with the concept of distance.
So in this
chapter we introduce the concept of distance and explore the related proper ties. For aesthetic reasons, and to show as clearly as possible that we need not have an a priori concept of distance, we use an approach which will emphasize the arbitrary nature of the concept of distance. It is customary to restrict attention to the field of real numbers or the field of complex numbers when discussing vector space concepts related to dis tance.
However, we need not be quite that restrictive.
The scalar field F
must be a subfield of the complex numbers with the property that, if a E F, the conjugate complex ii is also in F. Such a field is said to be normal over its real subfield.
The real field and the complex field have this property, but
so do many other fields. For most of the important applications of the mate rial to follow the field of scalars is taken to be the real numbers or the field
1
I Inner Products and Orthogonal Bases
of complex numbers.
177
Although most of the proofs given will be valid for
any field normal over its real subfield, it will suffice to think in of the two most important cases. In a vector space V of dimension n over the complex numbers (or a subfield of the complex numbers normal over its real subfield), let f be any fixed positive definite Hermitian form. For the purpose of the following develop ment it does not matter which positive definite Hermitian form is chosen, but it will remain fixed for all the remaining discussion. Since this particular Hermitian form is now fixed, we write
(ix, /3)
instead off(ix,
/3). (ix, /3)
called the inner product, or scalar product, of ix and /3.
Since we have chosen a positive definite Hermitian form, and
(ix, oc) > 0
unless
ix=
0. Thus
.J (ix, ix)
=
ll ixll
llaixll
=
.J(aix, aix)
=
.Jaa(ix, ix)
=
la l
·
11ixll ,
(ix, ix) � 0
is a well-defined non
negative real number which we call the length or norm of ix. by a scalar a multiplies its length by l a l.
is
Observe that
so that multiplying a vector
We say that the distance between
two vectors is the norm of their difference;
that is,
d(ix, /3)
=
11/3 - ixll .
We should like to show that this distance function has the properties we might reasonably expect a distance function to have.
But first we have to
prove a theorem that has interest of its own and many applications.
Theorem 1.1.
For any vectors
ix, /3 E
l(ix, /3)1 � llix ll
V,
equality is known as Schwarz's inequality. PROOF.
llixll
=
=
This in
=
l(ix, /3)12 llixll2 t2
- 2t
l(ix, /3)12 + 1111112•
(1.1)
0, the fact that this inequality must hold for arbitrarily large t l(ix, /3)1 0 so that Schwarz's inequality is satisfied. If llixll ¥: 0, l/llixll2• Then (1.1) is equivalent to Schwarz's inequality,
implies that take t
11/1 11.
For ta real number consider the inequality
0 � ll(ix, /3)tix - /1112 If
·
=
l(ix, /1)1 � llixl[
·
[[/3[1.
(1.2)
D
This proof of Schwarz's inequality does not make use of the assumption that the inner product is positive definite and would remain valid if the inner product were merely semi-definite.
Using the assumption that the
inner product is positive definite, however, an examination of this proof of Schwarz's inequality would reveal that equality can hold if and only if
/3 -
(ix /3) , IX= ( ix, ix)
that is, if and only if f3 is a multiple of If
ix
O;
( 1.3)
ix.
¥: 0 and f3 ¥: 0, Schwarz's inequality can be written in the form [ ( ix, /3)1
llixll
·
< 1. llPll -
(1.4)
Orthogonal and Unitary Transformations, Normal Matrices I V
178
In vector analysis the scalar product of two vectors is equal to the product of the lengths of the vectors times the cosine of the angle between them. The inequality
(1.4)
numbers the ratio
says, in effect, that in a vector space over the real
�� �� ·
II
.
II
can be considered to be a cosine. It would be
a diversion for us to push this point much further.
to show that d(oc,
{3)
We do, however, wish
behaves like a distance function.
Theorem 1.2. For d(oc, /3) = 11/3 - ocll , we have, (1) d(oc, {3) = d(/3, oc),
(2) d(oc, /3) � 0 and d(oc, /3) 0 if and only if oc = {3, (3) d(oc, /3) � d(oc, y) + d(y, {3). PROOF. (1) and (2) are obvious. (3) follows from Schwarz's =
inequality.
To see this, observe that
lloc + /3112 = (oc + {3, oc + /3) = (oc, oc) + (oc, /3) + (/3, oc) + ({3, /3) = llocll2 + (oc, /3) + (oc, /3) + 11/3112
� lloc ll2 + 2 I (oc, /3)1 + 11/3112 � lloc ll2 + 2 llocll 11/311 + 11/3112 = ( llocll + 11/311 )2. •
Replacing oc by
y - oc and
f3 by f3
- y,
we have
11/3 - ocll � lly - ocll + 11/3 - yll. (3)
(l.5)
(1.6)
D
is the familiar triangular inequality. It implies that the sum of two small
vectors is also small.
Schwarz's inequality tells us that the inner product
of two small vectors is small. Both of these inequalities are very useful for these reasons. According to Theorem
12.1
of Chapter IV and the definition of a positive
definite Hermitian form, there exists a basis A=
{oc1,
•
•
•
, ocn}
with respect
to which the representing matrix is the unit matrix. Thus,
(1.7) Relative to this fixed positive definite Hermitian form, the inner product, every set of vectors that has this property is called an
orthonormal set.
The word "orthonormal" is a combination of the words "orthogonal"
oc and f3 are said to be orthogonal if (oc, /3) = (/3, oc) = 0. A vector oc is normalized if it is of length 1; that is, if (oc, oc) = 1. Thus the vectors of an orthonormal set are mutually orthogonal and nor malized. The basis A chosen above is an orthonormal basis. We shall see
and "normal." Two vectors
that orthonormal bases possess particular advantages for dealing with the properties of a vector space with an inner product. A vector space over the complex numbers with an inner product such as we have defined is called
1 I Inner Products and Orthogonal Bases
179
a unitary space. A vector space over the real numbers with an inner product is called a Euclidean space. For
ex ,
{J E V, let
ex
=
_k�1 xicxi and {J = L�=l Y;<X;. Then
c� xicxi•Jl yjcxj) = � x{ � y ;(cx;, cxj)J i j 1
(ex, {3) =
n
= L X;Y;·
(1.8)
i=l
If we represent
(y1,
•
•
•
,
Yn) =
ex
by the n-tuple
(x1,
•
•
.
x11) =
,
X, and f3 by the n-tuple
Y, the inner product can be written in the form n
( ex, {J) = L X;Y; = i=l
(1.9)
X*Y.
This is a familiar formula in vector analysis where it is also known as the inner product, scalar, or dot product. Theorem 1.3. PROOF.
0.
An orthonormal set is linearly independent.
$2, ••• } is an orthonormal set and that Li x;$; = , $ = , 0) ( ; L; x; $;) = L; x;($;, $;) = X;. Thus the set is ; ; (
Suppose that {$1,
Then 0 =
linearly independent. D
It is an immediate consequence of Theorem 1.3 that an orthonormal set cannot contain more than n elements. Since V has at least one orthonormal basis and orthonormal sets are linearly independent, some questions naturally arise.
Are there other
orthonormal bases? Can an orthonormal set be extended to an orthonormal basis? Can a linearly independent set be modified to form an orthonormal set?
For infinite dimensional vector spaces the question of the existence
of even one orthonormal basis is a non-trivial question. For finite dimen sional vector spaces all these questions have nice answers, and the technique employed in giving these answers is of importance in infinite dimensional vector spaces as well. Theorem 1.4.
{cx1,
If A=
•
•
•
, cx8}
is any linearly independent set whatever
$8} such that $k = L�=l a;kcxi. Since cx1 is an element of a linearly independent set cx1 ¥:- 0, and therefore II cx1ll > 0.
in V, there exists an orthonormal set X = PROOF.
g1,
•
•
•
,
(The Gram-Schmidt orthonormalization process). 1
cx1. CX1
Clearly, 11 $111 = 1. II II Suppose, then, {$1, , $r} has been found so that it is an orthonormal set and such that each ;k is a linear combination of {cx1, , cxk}. Let Let
•
$1 = .
•
•
•
•
(1.10)
Orthogonal and Unitary Transformations, Normal Matrices I V
180
Then for any � i•
1
::;; i ::;; r, we have
(�i•
; )=(�i• 1Xr+l)- (�i• 1Xr+l)= 0 .
(1 .11)
ix +1
Furthermore, since each �k is a linear combination of the {ix1,
�1
ix
is a linear combination of the {ix1,
since {ix1, ixr +i
•
•
}
. . . , ixr+i ·
•
•
•
, ixk } ,
Also, ix;+i is not zero
} is a linearly independent set and the coefficient of
, ixr +i
•
in the representation of
;
ix +i
is l. Thus we can define
(1.12) Clearly, {�i .
. . . ,
�r-r1}
is an orthonormal set with the desired properties.
We can continue in this fashion until we exhaust the elements of A. set X
=
g1,
•
.
.
The
,�.} has the required properties. D
The Gram-Schmidt process is completely effective and the computations can be carried out exactly as they are given in the proof of Theorem For example, let A= { ix1
=
(1,
l)}. Then
I, 0 , 1) , ix2=
(3,
1,
1, -1),
ix3= (0,
1.4. 1, -1,
1 �1= /3 (1,1, 0,1), IX�= (3, 1,1,-1)
- J3 )3
(1,1,0,1)= (2, 0,1, -2),
�2= !(2, 0,1, -2),
, -3 1 2 1 IX3=(0,1,-1, 1)- ;- ;-(1,1,0,1)- --(2,0,1, -2) =
y3 y3 l(O, 1, -2, -1),
3 3
1 �3= (0,1, -2,-1). .J6 It is easily verified that {�1, �2, �3} is an orthonormal set. Corollary 1.5. =
{�1,
If A= {ix1,
.
•
•
, ix n} is a basis of V, the orthonormal set
, �n}, obtained from A by the application of the Gram-Schmidt process, is an orthonormal basis of V. X
.
.
•
PROOF. Since X is orthonormal it is linearly independent. contains n vectors it also spans V and is a basis. D
Theorem
Since it
1.4 and its corollary are used in much the same fashion in which 3.6 of Chapter I to obtain a basis (in this case an ortho
we used Theorem
normal basis) such that a subset spans a given subspace.
1 I Inner Products and Orthogonal Bases Theorem
1.6.
181
Given any vector oc1 of length I, there is an orthonormal
basis with oc1 as the first element. PROOF.
Since the set
{oc1}
is linearly independent it can be extended to a
basis with oc1 as the first element.
Now, when the Gram-Schmidt process
is applied, the first vector, being of length 1, is unchanged and becomes the first vector of an orthonormal basis. o
EXERCISES
In the following problems we assume that all n-tuples are representations of their vectors with respect to orthonormal bases. 1. Let A {ix1, •••, ix4} be an orthonormal basis of R4 and let ix, f3 E V be represented by (1, 2, 3, -1) and (2, 4, -1, 1), respectively. Compute (ix, {J). =
2. Let ix= (1, i, 1 + i) and f3= (i, 1, i - 1) be vectors in ea, where e is the field of complex numbers. Compute (ix, {J). 3. Show that the set {(1, i, 2), (1, i, -1), (1, 4. Show that
(ix, O)= (0, ix)= 0 for all ix E
-
i
,
O)} is orthogonal in
ea.
V.
5. Show that llix + fJll2 + llix - fJll2= 2 llixll2 + 2 llfJll2• 6. Show that if the field of scalars is real and II ix II = II fJ 11, then ix - f3 and ix + f3 are orthogonal, and conversely. 7. Show that if the field of scalars is real and
ix and f3 are orthogonal, and conversely.
8. Schwarz's inequality for the vectors
llix + f3112= llix ll2 + llfJll2, then
ix and f3 in Exercises 1 and 2.
9. The set {(1, -1, 1), (2, 0, 1), (0, 1, 1)} is linearly independent, and hence a basis for F3. Apply the Gram-Schmidt process to obtain an orthonormal basis.
10. Given the basis {(1, 0, 1, O), (1, 1, 0, O), (O, 1, 1, 1,), (0, 1, 1, O)} apply the Gram-Schmidt process to obtain an orthonormal basis.
11. Let W be a subspace of V spanned by {(O, 1, 1, O), (0, 5, -3, -2), (-3, 7 ) } Find an orthonormal basis for W. -3, 5, -
.
12. In the space of real integrable functions let the inner product be defined by
l�
/(x)g(x) dx.
Find a polynomial of degree 2 orthogonal to 1 and x. Find a polynomial of degree 3 orthogonal to 1, x, and x2• Are these two polynomials orthogonal? 13. Let X {.;1, ..., �,,,} be a set 'of vectors in the n-dimensional space V. Consider the matrix G= [g;1] where =
gij= ( ;;, ;j)· Show that if X is linearly dependent, then the columns of G are also linearly dependent. Show that if X is linearly independent, then the columns of G are also
182
Orthogonal and Unitary Transformations, Normal Matrices I V
linearly independent. Det G is known as the Gramian of the set X. Show that X is linearly dependent if and only if det G
=
0. Choose an orthonormal basis in V and
represent the vectors in X with respect to that basis. Show that G can be represented as the product of an m
x
n matrix and an n
x
m matrix.
Show that det G � 0.
*2 I Complete Orthonormal Sets We now develop some properties of orthonormal sets that hold in both finite and infinite dimensional vector spaces.
These properties are deep
and important in infinite dimensional vector spaces, but in finite dimensional vector spaces they could easily be developed in ing and without special terminology.
It is of some interest, however, to borrow the terminology of
infinite dimensional vector spaces and to give proofs, where possible, which are valid in infinite as well as finite dimensional vector spaces.
X= g1, � , ••• }
be an orthonormal set and let et. be any vector in V. 2 {et.;= (�;.et.)} are called the Fourier coefficients of et.. There is, first, the question of whether an expression like Li xi�i has any meaning in cases where infinitely many of the X ; are non-zero. This is a Let
The numbers
question of the convergence of an infinite series and the problem varies from case to case so that we cannot hope to deal with it in all generality. We have to assume for this discussion that all expressions like
Li xi�i that
we write down have meaning. Theorem 2.1.
The minimum of llet. - Li xi�i ll is attained if and only if all xi
=
(�i• x)
=
a i.
PROOF.
llet. - L X;� ; ll2= (et. - L xi�i• et. - L xi�i) i i i =(et., et.) - L xJii - L x ;ai + L xixi i i i = L c'i;a; - L x;a; - L xiai + L X;X; + (et., et.) - L c'iiai i i i i i = L (c'i; - xi)(a; - x;) + llet.112 - L c'iiai i i =
Only the term
L la; - x;l2 + llet.112 - L lail2• i
(2.1)
Li la; - xil2 depends on the xi and, being a sum of real a i. D xi
squares, it takes on its minimum value of zero if and only if all Theorem not.
2.1
is valid for any orthonormal set
X,
=
whether it is a basis or
If the norm is used as a criterion of smallness, then the theorem says
that the best approximation of is obtained if and only if all
X;
et.
in the form
Li xi�i (using only the �i EX)
are the Fourier coefficients.
2
I Complete Orthonormal Sets
183
Theorem 2.2 L• lail2 � JlocJl2• This inequality is known as Bessel's in equality. PROOF. Setting xi = a, in equation (2.1) we have
Jlocl12 - L lail2 = lloc - L a.;,112 � 0. i
D
(2.2)
It is desirable to know conditions under which the Fourier coefficients oc. This means we would like to have oc Li ai;i.
will represent the vector
=
In a finite dimensional vector space the most convenient sufficient con
dition is that X be an orthonormal basis.
In the theory of Fourier series
and other orthogonal functions it is generally not possible to establish the validity of an equation like
oc = L; a;;i
without some modification of
what is meant by convergence or a restriction on the set of functions under consideration.
Instead, we usually establish a condition known as com
pleteness. An orthonormal set is said to be complete if and only if it is not a subset of a larger orthonormal set.
Theorem 2.3. Let X = {;i} be an orthonormal set. conditions are equivalent:
(3
The following three
(2.3)
L (;i, oc)(;;, (3). ' (2) For each oc E V, llocll2 =LI(;., oc)l2. (1) For each
(3)
oc,
E V,
(oc, (3)
=
(2.4)
X is complete.
Equations (2.3) and (2.4) are both known as Parseval's identities. PROOF.
Assume
Assume (1). Then
(2).
JlocJl2 = (oc, oc) =Li(;., oc)(;i, oc) =Li I<;., oc)J2•
If X were not complete, it would be contained in a larger
orthonormal set Y. But for any
1
=
oc0
llixoll 2
E Y,
=
ix0 ef= X,
we would have
L l (;i, oco)l2
=
0
i
because of (2) and the assumption that Y is orthonormal. Thus X is complete. Now, assume
f3);i.
Then
(3).
Let
(3 be any
vector in V and consider
(; ., (3') = ;i,
(3' = /3 - L; (;;,
(;;, (3);; (3 = c;i, (3) L (; , (3)(; ., ; ) ; ; ;
(
that is,
(3'
is orthogonal
-
t
)
= (;., (3) - (;., (3) = O; to all ;i EX. If llf3'11 ::;!= 0,
then X
U
� }
{ ll 'll P'
Orthogonal and Unitary Transformations, Normal Matrices I
184
would be a larger orthonormal set. Hence,
1/,8'11 = 0.
V
Using the assumption
that the inner product is positive definite we can now conclude that
,B' = 0.
However, it is not necessary to use this assumption and we prefer to avoid using it. What we really need to conclude is that if
(cx , ,B') = 0,
rr
is any vector in V then
and this follows from Schwarz's inequality. Thus we have 0
= ( cx, ,B') = (oc, ,B - L ($;, ,8)$i) i = (oc, ,8) - L ($i, ,B)(oc, $i) i = ( cx, ,8) - L ($i, oc) ($i, ,8), i
or
(oc, ,8) =L ($;, oc)($;, ,8). i This completes the cycle of implications and proves that conditions
(2),
(3)
and
(1),
are equivalent. D
Theorem 2.4.
The following two conditions are equivalent:
(4) The only vector orthogonal to all vectors in X is the zero vector. (5) For each cx E V, oc = Lai, oc) $i. (2.5) i PROOF. Assume ( 4). Let oc be any vector in V and consider oc' = oc L i ($i, cx)$;. Then
($;, oc')
=
($;, oc - L ($;, cx)$;) ;
= ($;, oc) - L ($;, oc)($;, $;) ; = ($;, oc) - ($;, cx) O; =
that is,
cx'
is orthogonal to all
Now, assume
Li ($;, oc)$i = 0. Theorem 2.5.
(3).
PROOF.
(5)
and let
oc
;i Ex.
Thus
cx' = 0
cx =Li ai, oc)$;. EX. Then oc = ; $
and
be orthogonal to all
D
The conditions (4) or (5) imply the conditions (1), (2), and
Assume
(5).
Then
(oc, ,8) = (L ($;, oc)$;, L ($;, ,8)$;) i i =L ($;, cx) L ($;, ,8)($;, $;) i j =L ($;, oc)($;, ,8). D i Theorem 2.6.
If the inner product is positive definite, the conditions (1),
(2), or (3) imply the conditions (4) and (5).
2I
185
Complete Orthonormal Sets
In the proof that (3) implies (I) we showed that if oc' oc Li(�;, oc)�;, then lloc'll 0. If the inner product is positive definite, then ' oc 0 and, hence, PROOF.
=
=
=
oc
=
L (�;, oc)�;·
D
i
The proofs of Theorems 2.3, 2.4, and 2.5 did not make use of the positive definiteness of the inner product and they remain valid if the inner product is merely non-negative semi-definite. Theorem 2.6 depends critically on the fact that the inner product is positive definite. For finite dimensional vector spaces we always assume that the inner product is positive definite so that the three conditions of Theorem 2.3 and the two conditions of Theorem 2.4 are equivalent. The point of our making a distinction between these two sets of conditions is that there are a number of important inner products in infinite dimensional vector spaces that are not positive definite. For example, the inner product that occurs in the theory of Fourier series is of the form
( oc, (3)
1 =
-
27T
" oc(x){3(x) dx.
J
-"
(2.6)
This inner product is non-negative semi-definite, but not positive definite if V is the set of integrable functions. Hence, we cannot from the com pleteness of the set of orthogonal functions to a theorem about the con vergence of a Fourier series to the function from which the Fourier coefficients were obtained. In using theorems of this type in infinite dimensional vector spaces in general and Fourier series in particular, we proceed in the following manner. We show that any oc E V can be approximated arbitrarily closely by finite sums of the form L; x;�;· For the theory of Fourier series this theorem is known as the Weierstrass approximation theorem. A similar theorem must be proved for other sets of orthogonal functions. This implies that the minimum mentioned in Theorem 2.1 must be zero. This in turn implies that condition (2) of Theorem 2.3 holds. Thus Parseval's equation, which is equivalent to the completeness of an orthonormal set, is one of the principal theorems of any theory of orthogonal functions. Condition (5), which is the convergence of a Fourier series to the function which it represents, would follow if the inner product were positive definite. Unfortunately, this is usually not the case. To get the validity of condition (5) we must either add further conditions or introduce a different type of convergence. EXERCISES 1. Show that if X is an orthonormal basis of a finite dimensional vector space, then condition
(5) holds.
186
Orthogonal and Unitary Transformations, Normal Matrices I V
2. Let X be a finite set of mutually orthogonal vectors in V.
Suppose that the
only vector orthogonal to each vector in X is the zero vector.
Show that X is a
basis of V.
3 I The Representation of a Linear Functional by an Inner Product
For a fixed vector linear functional
fJE V, (fJ, a)
is a linear function of
at.
Thus there is a
V such that (a)= (fJ, a) for all at. We denote the linear
functional defined in this way by p·
of this observation.
The following theorem is a converse
Theorem 3.1. Given a linear functional E V, there exists a unique 'Y) E V such that (ot)=('Y), a) for all IX E V. PROOF. Let X= { �1, ... , � n} be an orthonormal basis of V, and let X = {1, ,
.
.
unique. D
Call the mapping defined by this theorem 'YJ;
'YJ() E V has
the property that
Theorem 3.2.
(IX)= ('Y)(), at)
that is, for each
for all
at E V.
V,
A
The correspondence betweenE Vand 'YJ()
E
Vis one-to-one
and onto V.
In Theorem 3.1 we have already shown that 'YJ() is well defined. fJ be any vector in V and let p be the linear functional in V such that p ( 1X)= (fJ, a) for all IX. Then fJ= 11(p) and the mapping is onto. Since (fJ, oc), as a function of oc, determines a unique linear functional p the PROOF.
Let
correspondence is one-to-one. D Theorem 3.3.
A
If the inner product is symmetric , 'YJ is an isomorphism of V
onto V.
'YJ is one-to-one and
onto.
Let
We have already shown in Theorem 3.2 that
Notice that
'YJ
is not linear if the scalar field is complex and the inner
Then for
product is Hermitian. We see that
187
3 I The Representation of a Linear Functional by an Inner Product Thus
r;(rp)
=
y
=
that even when
r;
Li b i r;(i) and r; is conjugate linear.
It should be observed
is conjugate linear it maps subspaces of
V onto subspaces
of V. We describe this situation by saying that we can "represent a linear func tional by an inner product." Notice that although we made use of a particular basis to specify the
r;
corresponding to
rp,
the uniqueness shows that this
choice is independent of the basis used. If V is a vector space over the real numbers,
and r; happen to have the same coordinates.
V in
cidence allows us to represent
This happy coin
V and make V do double duty. This fact is
exploited in courses in vector analysis.
In fact, it is customary to start
immediately with inner products in real vector spaces with orthonormal bases and not to mention
V at all.
All is well as long as things remain simple.
As soon as things get a little more complicated, it is necessary to separate the structure of
V
superimposed on V.
The vectors representing themselves
in V are said to be contravariant and the vectors representing linear functionals
V
in
are said to be covariant.
We can see from the proof of Theorem 3.1 that, if V is a vector space over the complex numbers,
have the same coordinates.
In fact, there is no choice of a basis for which
each
and the corresponding
and its corresponding r;
r;
will not necessarily
will have the same coordinates.
Let us examine the situation when the basis chosen in V is not orthonormal. Let A
=
{ot1, ... ,
otn } be any basis of V, and let
corresponding dual basis of Hermitian, b;1
=
b1;, or
V.
[bu]
=
Let bii B
=
B*.
=
(oti, ot1).
A
=
{1J11, ... ,
1Pn } be the
Since the inner product is
Since the inner product is positive
definite, B has rank n. That is, B is non-singular. Let
=
L�=l Ci1Jli
be an
arbitrary linear functional in V. What are the coordinates of the correspond ing
r;?
Let
r;
=
L�=l Yioti.
Then
(r;, ot;)
=
c� Yioti, ot1) n
=
I :Yioti, ot1)
i=l n
=
I Yibi i
i=l n
=
=
L C1c1J1iot;) lc=l
(3.1)
C;.
Thus, we have to solve the equations n
L Yibii i=l
n
=
L biiyi i=l
=
c1,J
=
1, . . .
, n.
(3.2)
Orthogonal and Unitary Transformations, Normal Matrices I
188
V
In matrix form this becomes where or
=B-1C* =(CB-1)*.
Y
(3.3 )
Of course this means that it is rather complicated to obtain the coordinate representation of
'f/
from the coordinate representation of
.
But that is
not the cause for all the fuss about covarient and contravariant vectors. After all, we have shown that used and the coordinates of
'f/ corresponds to
apply to any other vector in V. The real difficulty stems from the insistence upon using
(1.9)
as the definition of the inner product, instead of using a
definition not based upon coordinates. If
'f/= L7=1 Yioci,
and
e= L7=1 X;OC;,
n
we see that
n
=L 2, Yibiixi
i=l j=l
= Y*BX. Thus, if
'f/ represents the
linear functional
(3.4) ,
we have
('f/, fl = Y*BX = (CB-1)BX =CX (3.5)
= (C*)*X. Elementary treatments of vector analysis prefer to use sentation of
'f/·
C*
as the repre
This preference is based on the desire to use
definition of the inner product so that
(3.5)
(1.9)
rather than to use a coordinate-free definition which would lead being represented by components of
'f/·
(3.4).
The elements of
We obtained
C
as the
('f/, ;), to ('f/, e)
is the representation of
C*
are called the covariant
by representing
V.
Since the dual space
is not available in such an elementary treatment, some kind of artifice must be used. Itis then customary to introduceareciproca/ basis A*= where
ocj'
has the property
of the dual basis
A
in V.
of the dual basis.
(ocj''
But
{oci, . .. , oc�}.
OC;
) =oij =i(oc;). A*
C
was the original representation of
m
=
I C;b. i�l
The array within the rectangle is the augmented matrix of the system of equations AX B. The first column in front of the rectangle gives the identity of the basis element in the feasible subset of A, and the second column contains the corresponding values of c;. These are used to compute the values of d;(j 1, . .. , n) and
. If ix f/'- W, by Exercise 12 there is a such that ix ( ) 1 and (P) o. =
=
IV-2 1. 2.
4.
This is the dual of Exercise 5 of Section 1. Dual of Exercise 6 of Section 1. 3. Dual of Exercise 12 of Section 1. Dual of Exercise 14 of Section 1. 5. Dual of Exercise 15 of Section 1.
Answers to Selected Exercises
333
IV-3
(P-l)T = (PT)-1.
1. p= 2. P
[; ; J.
�
(�')T
�
.A'= {[-1 1 1], [2 -1 -1], [1 O -ll}. {[1 -1 O], [O 1 -1], [O 0 ll}. Ut. t -tl. c-t t -n et t m.
Thus 3. 4. 5.
-[ : =: �1 -
BX= B(PX')= (BP)X'= B'X'.
IV-4
1. (a) {[1 2.
1 ll} (b) {[-1 -1 [1 -1 l].
0
1 ll}.
3. Let W be the space spanned by {oc}. Since dim W = 1, dim W.l = dim V 1
=
=
=
=
11. f(t)=
IV-5 1. Let -r be a mapping of U into Vanda a mapping of V into W. Then, if E W, we have for all
2.
{T[a(
3. [1
-2
1).
g EU, ( ,;;( ))a) =
334
Answers to Selected Exercises
IV-8
[ _�
1.
-1
2
2.
[� � �] [� -� =�] +
57 9 2 det AT = det (-A) = ( -lr det A.
1 O · 5. det AT = det A and Thus det A = -det A. 7. a1( oc ) =0 if and only if a1(oc)(/3) =f ( oc, /3) = 0 for all f3 E V. 9. Let dim U = m; dim V = n. /(oc, /3) =0 for all f3 E V means oc E (T,(V)]_l_ or, oc E Of1(0). equivalently, Thus p(T1) =dim T1(V) = m - dim (T,(V)]_l_ = m - dim a/1(0) = m - v(a1) = p(a1). 10. If m � n, then either p(a1) < m or p(T1) < n. 1 1. U0 is the kernel of a1 and V0 is the kernel of T1. 12. m - dim U0 = m - v(a1) = p(a1) = p(T1) = n - v(T1) = n - dim V0. 13. 0 =/(oc + /3, oc + /3) =/ ( oc , oc ) + /(oc, /3) + /(/3, oc ) + /(/3, {3) =/(oc, {3) + /(/3, oc ) . 14. If AB =BA, then (AB)T =(BA)T = ATBT = AB. 15. Bis skew-symmetric. 16. (a) (A2)T = ATAT = ( -A)(-A) = A2; (b) (AB - BA)T = (AB)T - (BA)T =B(-A) - ( -A)B = AB - BA; (c) (AB)T = (BA)T = ATBT = t-A)B = -AB. If (AB)T = -AB, then AB = -(AB)T = -BTAT = -B(-A) =BA.
1.
IV-9
(a)
[2t _;!_�] '
(c)
[� � �]
(e)
[� ! �]
1 2 1 · 2!7 · %y1x2 + 6y1y2 (if (x1' y1) and (x2, y2) are the coordinates of the two points), (c) x1x2 + x1y2 + x2y1 + 2x1z2 + 2x2z1 + 3Y1Y2 + !Y1z2 + 7z1z2, (e) X1X2 + 2x1Y2 + 2x2Y1 + 4Y1Y2 + X1Z2 + X2Z1 + Z1Z2 + 2Y1Z2 + !Y2Z1 + 2y2z1.
2. (a) 2x1x2
+
%x1y2
+
IV-10
1. (In this and the following exercises the matrix of transition P, the order of the elements in the main diagonal 'of PTBP, and their values, which may be
� � :�l r : ; : � ::q : :: : : � [ :�'::::,: ::=:::: e.
,
{1, Ii : : -�l{l,4,
The diagonal of PTBP is
(e)
h
llo
[� -� -�], 0
-1, -4)
ns
s can only
(
0
-3, 9}; (b)
n
0
4
0
{1, -4,
1 .
68};
Answers to Selected Exercises
2. (a) p
�
[� -: �J -
(<)
IV-11 r
If P=
1
r: -� } -I
(o)
(c)
3, 3; (d)2, O; (e) a O
then pTnp=
a
1. 2, 30);
{I, 0, O}
(b) 2, O; , b/2 [0 - ]
=2, S=2;
1. (a) 2.
[� -�]. {2, 78)
335
[0
(a/4)(
-
1,
1; (/) 2, O; (g) 3, 1.
J
b
2+ 4ac) .
There is a non-singular Q such that QTAQ=I. Take P=Q-1. 1 4. Let P=A- • 5. There is a non-singular Q such that QTAQ=B has r 1 's along the main 1 diagonal. Take R=BQ- • Thus XTATAX= 6. For real Y=(y1, ..., Yn), yT Y=!f=1 y;2 � 0. (AX)T(AX) yTy � 0 for all real X (x1, ... , xn). 7. If Y=(y1, •••, Yn) � 0, then YT Y > 0. If A � 0, there is an X=(x1, •••, xn) such that AX= Y � 0 (why?). Then we would have 0=XTATAX=yTy >
3.
=
0.
=
0 for any i, then 0=XTC�i=i A;2)X=!I=1 XT A;TA;X 0 for all X and A;=0.
If A;X �
9.
A;X
=
>
0.
IV-12
1. (a) 3-9. 10.
P
=
[� :J.
diagonal={l,
O}
(b) [� -\ i]. {1, -1}.
Proofs are similar to those for Exercises Similar to Exercise 14 of Section 8.
+
3-9 of
Section
11.
V-1
2. 2i. 1. 6. 6. (ex - {J, ex+ {J)=(ex, ex) - ({J, ex)+ (ex, {J) - ({J, {J)= [[exJ12 - Jl{J[[2=0. 7. [[ex+ {J[[2= [[ex[[2+ 2(ex, {J) + [[{J[[2• 9.
11.
{v'31
o.
}
1 1 -1, o. vi (1, 1, o), v6 <-1, 1, 2) .
{� (0, 1, 1, 0),
l(O,
2, -2, -1), t(-3, -2, 2,
-
8)
}
.
Thus
336
Answers to Selected Exercises
12. x2 - !, x3 - 3x/5.
13. (a) If Σ_j a_jξ_j = 0, then Σ_j a_j(ξ_i, ξ_j) = (ξ_i, Σ_j a_jξ_j) = (ξ_i, 0) = 0 for each i. Thus Σ_j g_ij a_j = 0 and the columns of G are dependent. (b) If Σ_j g_ij a_j = 0 for each i, then 0 = Σ_j a_j(ξ_i, ξ_j) = (ξ_i, Σ_j a_jξ_j) for each i. Hence Σ_i ā_i(ξ_i, Σ_j a_jξ_j) = (Σ_i a_iξ_i, Σ_j a_jξ_j) = 0. Thus Σ_j a_jξ_j = 0. (c) Let A = {α1, …, αn} be orthonormal and ξ_i = Σ_k a_ki α_k. Then g_ij = (ξ_i, ξ_j) = Σ_k ā_ki a_kj. Thus G = A*A, where A = [a_ij].
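For a concrete instance of (a)–(b) (an illustration, not one of the printed answers), the dependent pair ξ1 = (1, 2), ξ2 = (2, 4) has a singular Gram matrix:
\[
G=\begin{pmatrix}(\xi_1,\xi_1)&(\xi_1,\xi_2)\\(\xi_2,\xi_1)&(\xi_2,\xi_2)\end{pmatrix}
 =\begin{pmatrix}5&10\\10&20\end{pmatrix},\qquad \det G = 0 .
\]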
V-2
1. If α = Σ_i a_iξ_i, then (ξ_i, α) = Σ_j a_j(ξ_i, ξ_j) = a_i.
2. X is linearly independent. Let α ∈ V and consider β = α − Σ_i ((ξ_i, α)/‖ξ_i‖²)ξ_i. Since (ξ_i, β) = 0, β = 0.
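As a numerical illustration of the construction in Exercise 2 (with a two-element orthogonal set, so β need not vanish, only be orthogonal to each ξ_i): take ξ1 = (1, 1, 0), ξ2 = (1, −1, 0), α = (2, 3, 4). Then
\[
\beta=\alpha-\frac{(\xi_1,\alpha)}{\lVert\xi_1\rVert^{2}}\xi_1-\frac{(\xi_2,\alpha)}{\lVert\xi_2\rVert^{2}}\xi_2
 =(2,3,4)-\tfrac{5}{2}(1,1,0)+\tfrac{1}{2}(1,-1,0)=(0,0,4),
\]
which is orthogonal to both ξ1 and ξ2.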
V-4
1. ((στ)*(α), β) = (α, στ(β)) = (σ*(α), τ(β)) = (τ*σ*(α), β).
2. (σ(α), σ(α)) = (α, σ*σ(α)) = 0 for all α.
3. (σ*(α), β) = (α, σ(β)) = f(α, β) = −f(β, α) = −(β, σ(α)) = −(σ(α), β) = (−σ(α), β).
5. Let ξ be an eigenvector corresponding to λ. Then λ(ξ, ξ) = (ξ, σ(ξ)) = (σ*(ξ), ξ) = (−λξ, ξ) = −λ̄(ξ, ξ). Thus λ + λ̄ = 0.
6. σ is skew-symmetric.
7. σ is skew-symmetric.
8. Let ξ ∈ W⊥. Then for all η ∈ W, (σ*(ξ), η) = (ξ, σ(η)) = 0.
10. Since (π*)² = (π²)* = π*, π* is a projection. ξ ∈ K(π*) if and only if (π*(ξ), η) = (ξ, π(η)) = 0 for all η; that is, if and only if ξ ∈ S⊥. Finally, (π*(ξ), η) = (ξ, π(η)) vanishes for all ξ if and only if π(η) = 0; that is, if and only if η ∈ T. Then π*(V) ⊂ T⊥. Since π*(V) and T⊥ have the same dimension, π*(V) = T⊥.
11. (ξ, σ(η)) = (σ*(ξ), η) = 0 for all η if and only if σ*(ξ) = 0, or ξ ∈ W⊥.
13. By Theorem 4.3, V = W ⊕ W⊥. By Exercise 11, σ*(V) = σ*(W).
14. σ*(V) = σ*(W) = σ*σ(V). σ(V) = σσ*(V) is the dual statement.
15. σ*(V) = σ*σ(V) = σσ*(V) = σ(V).
16. By Exercises 15 and 11, W⊥ is the kernel of σ* and σ.
21. By Exercise 15, σ(V) = σ*(V). Then σ²(V) = σσ*(V) = σ(V) by Exercise 14.
V-5
1. Let ξ be the corresponding eigenvector. Then (ξ, ξ) = (σ(ξ), σ(ξ)) = (λξ, λξ) = λ̄λ(ξ, ξ).
3. It also maps ξ2 onto ±(1/√2)(…).
4. For example, ξ2 onto …(ξ1 − ξ2 + ξ3) and ξ3 onto …(ξ1 + ξ2 − ξ3).
V-6
1. (a) and (c) are orthogonal.
2. (a).
5. (a) Reflection in a plane (x1, x2-plane). (b) 180° rotation about an axis (x3-axis). (c) Inversion with respect to the origin. (d) Rotation through θ about an axis (x3-axis). (e) Rotation through θ about an axis (x3-axis) and reflection in the perpendicular plane (x1, x2-plane). The characteristic equation of a third-order orthogonal matrix either has three real roots (the identity and (a), (b), and (c) represent all possibilities) or two complex roots and one real root ((d) and (e) represent these possibilities).
V-7
1. Change basis in V as in obtaining the Hermite normal form. Apply the Gram-Schmidt process to this basis.
2. If σ(η_j) = Σ_i a_ij η_i, then σ*(η_k) = Σ_j (η_j, σ*(η_k))η_j = Σ_j (σ(η_j), η_k)η_j = Σ_j ā_kj η_j.
3. Choose an orthogonal basis such that the matrix representing σ* is in superdiagonal form.
V-8
1. (a) normal; (b) normal; (c) normal; (d) symmetric, orthogonal; (e) orthogonal, skew-symmetric; (f) Hermitian; (g) orthogonal; (h) symmetric, orthogonal; (i) skew-symmetric, normal; (j) non-normal; (k) skew-symmetric, normal.
3. AᵀA = (−A)A = −A² = AAᵀ.
5. ….
6. Exercise 1(c).
V-9
4. (σ*(α), β) = (α, σ(β)) = f(α, β) = f(β, α) = (β, σ(α)) = (σ(α), β).
5. f(α, β) = (α, σ(β)) = Σ_i a_i b_i λ_i.
6. q(α) = f(α, α) ≤ max {λ_i} for α ∈ S, and both equalities occur. If α ≠ 0, there is a real positive scalar a such that aα ∈ S. Then q(aα) ≥ min {λ_i} > 0, if all eigenvalues are > 0.
V-10
1. (a) unitary, diagonal is {1, i}. (b) Hermitian, {2, 0}. (c) orthogonal, {cos θ + i sin θ, cos θ − i sin θ}, where θ = arccos 0.6. (d) Hermitian, {1, 4}. (e) Hermitian, {1, 1 + √2, 1 − √2}.
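As a reminder of what these diagonals are (an illustration, not one of the exercise matrices): a Hermitian matrix is unitarily similar to the diagonal matrix of its eigenvalues, for example
\[
B=\begin{pmatrix}3&1\\1&3\end{pmatrix}\quad\text{has eigenvalues }\{4,\,2\},
\]
so the diagonal obtained for this B would be {4, 2}.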
V-11
1. Diagonal is {15, −5}. (d) {9, −9, −9}. (e) {18, 9, 9}. (f) {−9, 3, 6}. (g) {−9, 0, 0}. (h) {1, 2, 0}. (i) {1, −1, −1}. (j) {3, 3, −3}. (k) {−3, 6, 6}.
2. (d), (h).
3. Since PᵀBP = B′ is symmetric, there is an orthogonal matrix R such that RᵀB′R = B″ is a diagonal matrix. Let Q = PR. Then (PR)ᵀA(PR) = RᵀPᵀAPR = RᵀR = I, and (PR)ᵀB(PR) = RᵀPᵀBPR = RᵀB′R = B″ is diagonal.
4. AᵀA is symmetric. Thus there is an orthogonal matrix Q such that Qᵀ(AᵀA)Q = D is diagonal. Let B = QᵀAQ.
5. Let P be given as in Exercise 3. Then det (PᵀBP − xI) = det (Pᵀ(B − xA)P) = (det P)² · det (B − xA). Since PᵀBP is symmetric, the solutions of det (B − xA) = 0 are also real.
7. Let A = [[a, b], [c, d]]. Then A is normal if and only if b² = c² and ab + cd = ac + bd. If b or c is zero, then both are zero and A is symmetric. If a = 0, then b² = c² and cd = bd. If d ≠ 0, c = b and A is symmetric. If d = 0, either b = c and A is symmetric, or b = −c and A is skew-symmetric.
8. If b = c, A is symmetric. If b = −c, then a = d.
9. The first part is the same as Exercise 5 of Section 3. Since the eigenvalues of a linear transformation must be in the field of scalars, σ has only real eigenvalues.
10. σ² = −σ*σ is symmetric. Hence the solutions of |A² − xI| = 0 are all real. Let λ be an eigenvalue of σ² corresponding to ξ. Then (σ(ξ), σ(ξ)) = (ξ, σ*σ(ξ)) = −(ξ, σ²(ξ)) = −λ(ξ, ξ). Thus λ ≤ 0. Let λ = −µ².
Let η = (1/µ)σ(ξ). Then σ(η) = (1/µ)σ²(ξ) = (1/µ)(−µ²ξ) = −µξ, and σ²(η) = −µσ(ξ) = −µ²η. Also (ξ, η) = (1/µ)(ξ, σ(ξ)) = (1/µ)(σ*(ξ), ξ) = (1/µ)(−σ(ξ), ξ) = −(ξ, η), so (ξ, η) = 0.
11. σ(ξ) = µη, σ(η) = −µξ.
12. The eigenvalues of an isometry are of absolute value 1. If σ(ξ) = λξ with λ real, then σ*(ξ) = λξ, so that (σ + σ*)(ξ) = 2λξ.
13. If (σ + σ*)(ξ) = 2µξ and σ(ξ) = λξ, then λ = ±1 and 2µ = 2λ = ±2. Since (ξ, σ(ξ)) = (σ*(ξ), ξ) = (ξ, σ*(ξ)), 2µ(ξ, ξ) = (ξ, (σ + σ*)(ξ)) = 2(ξ, σ(ξ)). Thus |µ| · ‖ξ‖² = |(ξ, σ(ξ))| ≤ ‖ξ‖ · ‖σ(ξ)‖ = ‖ξ‖², and hence |µ| ≤ 1. If |µ| = 1, equality holds in Schwarz's inequality, and this can occur if and only if σ(ξ) is a multiple of ξ. Since ξ is not an eigenvector, this is not possible.
14. (ξ, η) = (1/√(1 − µ²)){(ξ, σ(ξ)) − µ(ξ, ξ)} = 0. Since σ(ξ) + σ*(ξ) = 2µξ, σ²(ξ) + ξ = 2µσ(ξ). Thus σ(η) = (σ²(ξ) − µσ(ξ))/√(1 − µ²) = (µσ(ξ) − ξ)/√(1 − µ²) = (µ²ξ + µ√(1 − µ²)η − ξ)/√(1 − µ²) = −√(1 − µ²)ξ + µη.
15. Let ξ1, η1 be associated with µ1, and ξ2, η2 be associated with µ2, where µ1 ≠ µ2. Then (ξ1, (σ + σ*)(ξ2)) = (ξ1, 2µ2ξ2) = ((σ + σ*)(ξ1), ξ2) = (2µ1ξ1, ξ2). Thus (ξ1, ξ2) = 0.
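One way to see the normality condition in Exercise 7 is to compare the two products entry by entry (a short check in the notation of that exercise):
\[
AA^{\mathsf T}=\begin{pmatrix}a^2+b^2 & ac+bd\\ ac+bd & c^2+d^2\end{pmatrix},\qquad
A^{\mathsf T}A=\begin{pmatrix}a^2+c^2 & ab+cd\\ ab+cd & b^2+d^2\end{pmatrix},
\]
so AAᵀ = AᵀA exactly when b² = c² and ab + cd = ac + bd.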
V-12
2. A + A* = …. The eigenvalues of A + A* are {−1, −1, 2}. Thus µ = −1/2 and 1. To µ = 1 corresponds an eigenvector of A which is (1/√2)(1, −1, 0). An eigenvector of A + A* corresponding to −1 is (0, 0, 1). If this represents ξ, the triple representing η is (1/√2)(1, 1, 0). The matrix representing the transformation with respect to the basis {(0, 0, 1), (1/√2)(1, 1, 0), (1/√2)(1, −1, 0)} is ….
VI-1
1. (1) (1, 0, 1) + t1(1, 1, 1) + t2(2, 1, 0); [1 −2 1](x1, x2, x3) = 2. (2) (1, 2, 2) + t1(2, 1, −2) + t2(2, −2, 1); [1 2 2](x1, x2, x3) = 9. (3) (1, 1, 1, 2) + t1(0, 1, 0, −1) + t2(2, 1, −2, 3); [1 0 1 0](x1, x2, x3, x4) = 2, [−2 1 0 1](x1, x2, x3, x4) = 1.
2. (7, 2, −1) + t(−6, −1, 4); [1 −2 1](x1, x2, x3) = 2, [1 2 2](x1, x2, x3) = 9.
3. L = (2, 1, 2) + ⟨(0, 1, −1), (−3, 0, …)⟩.
4. …(1, 1) + …(−6, 7) + …(−8, −6) = (0, 0).
6. Let L1 and L2 be linear manifolds. If L1 ∩ L2 ≠ ∅, let α0 ∈ L1 ∩ L2. Then L1 = α0 + S1 and L2 = α0 + S2, where S1 and S2 are subspaces. Then L1 ∩ L2 = α0 + (S1 ∩ S2).
7. Clearly, α1 + S1 ⊂ α1 + (α2 − α1) + S1 + S2 and α2 + S2 = α1 + (α2 − α1) + S2 ⊂ α1 + (α2 − α1) + S1 + S2. On the other hand, let α1 + S be the join of L1 and L2. Then L1 = α1 + S1 ⊂ α1 + S implies S1 ⊂ S, and L2 = α2 + S2 ⊂ α1 + S implies α2 − α1 + S2 ⊂ S. Since S is a subspace, (α2 − α1) + S1 + S2 ⊂ S. Since α1 + S is the smallest linear manifold containing L1 and L2, α1 + S = α1 + (α2 − α1) + S1 + S2.
8. If α0 ∈ L1 ∩ L2, then L1 = α0 + S1 and L2 = α0 + S2. Thus L1 J L2 = α0 + (α0 − α0) + S1 + S2 = α0 + S1 + S2. Since α1 ∈ L1 J L2, L1 J L2 = α1 + S1 + S2.
9. If α2 − α1 ∈ S1 + S2, then α2 − α1 = β1 + β2, where β1 ∈ S1 and β2 ∈ S2. Hence α2 − β2 = α1 + β1. Since α1 + β1 ∈ α1 + S1 = L1 and α2 − β2 ∈ α2 + S2 = L2, L1 ∩ L2 ≠ ∅.
10. If L1 ∩ L2 ≠ ∅, then L1 J L2 = α1 + S1 + S2. Thus dim L1 J L2 = dim (S1 + S2). If L1 ∩ L2 = ∅, then L1 J L2 = α1 + (α2 − α1) + S1 + S2 and L1 J L2 ≠ α1 + S1 + S2. Thus dim L1 J L2 = dim (S1 + S2) + 1.
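A quick way to verify answers of this kind (illustrative): for 1(1), the functional [1 −2 1] annihilates both direction vectors and takes the value 2 at the given point,
\[
[1\ {-2}\ 1](1,1,1)^{\mathsf T}=0,\qquad [1\ {-2}\ 1](2,1,0)^{\mathsf T}=0,\qquad [1\ {-2}\ 1](1,0,1)^{\mathsf T}=2,
\]
so the plane is exactly the solution set of x1 − 2x2 + x3 = 2.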
VI-2
1. If Y = [y1 y2 y3], then Y must satisfy the conditions Y(1, 1, 0) ≥ 0, Y(1, 0, −1) ≥ 0, Y(0, −1, 1) ≥ 0.
2. {[1 −1 1], [1 −1 −1], [1 1 1]}.
3. {(1, 1, 0), (1, 0, −1), (0, −1, 1), (0, 1, 1), (1, −1, 0), (1, 1, 1)}.
4. {(1, 0, −1), (0, −1, 1), (0, 1, 1)}. Express the omitted generators in terms of the elements of this set.
5. {[−1 −1 2], [1 1 −1], [1 −1 −1]}.
6. {(1, 0, 1), (3, 1, 2), (1, −1, 0)}.
7. Let Y = [−1 −1 2]. Since YA ≥ 0 and YB = −2 < 0, (1, 1, 0) ∉ C2.
8. Let Y = [−2 −2 −2].
9. Let Y = [1 1].
10. This is the dual of Theorem 2.14.
11. Let Y = [2 2 1].
12. Let Â = {φ1, …, φn} be the dual basis to A. Let φ0 = Σ_i φi. Then ξ is semi-positive if and only if ξ ≥ 0 and φ0ξ > 0. In Theorem 2.11, take β = 0 and g = 1. Then ψβ = 0 < g = 1 for all ψ ∈ V̂, and the last condition in (2) of Theorem 2.11 need not be stated. Then the stated theorem follows immediately from Theorem 2.11.
14. Using the notation of Exercise 13, either (1) there is a semi-positive ξ such that σ(ξ) = 0, that is, ξ ∈ W, or (2) there is a ψ ∈ V̂ such that σ̂(ψ) > 0. Let φ = σ̂(ψ). For ξ ∈ W, φξ = σ̂(ψ)ξ = ψσ(ξ) = 0. Thus φ ∈ W⊥.
15. Take β = 0, g = 1, and φ = Σ_i φi, where {φ1, ….
1. Given A, B, C, the primal problem is to find X ≥ 0 which maximizes CX subject to AX ≤ B. The dual problem is to find Y ≥ 0 which minimizes YB subject to YA ≥ C.
2. Given A, B, C, the primal problem is to find X ≥ 0 which maximizes CX subject to AX = B. The dual problem is to find Y which minimizes YB subject to YA ≥ C.
6. The pivot operation uses only the arithmetic operations permitted by the field axioms. Thus no tableau can contain any numbers not in any field containing the numbers in the original tableau.
7. Examining Equation (3.7) we see that φξ′ will be smaller than φξ if ck − dk < 0. This requires a change in the first selection rule. The second selection rule is imposed so that the new ξ′ will be feasible, so this rule should not be changed. The remaining steps constitute the pivot operation and merely carry out the decisions made in the first and second steps.
8. Start with the equations
… + x3 = 6
4x1 + x2 + x4 = 10
−x1 + x2 + x5 = 3.
The first feasible solution is (0, 0, 6, 10, 3). The optimal solution is (2, 2, 0, 0, 3). The numbers in the indicator row of the last tableau are (0, 0, −…, −…, 0).
9. The last three elements of the indicator row of the previous exercise give y1 = …, y2 = …, y3 = 0.
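For the pairing described in Exercises 1 and 2, a small self-contained example (not one of the exercises) may help. Primal: maximize 3x1 + 2x2 subject to x1 + x2 ≤ 4, x1 ≤ 3, X ≥ 0. Dual: minimize 4y1 + 3y2 subject to y1 + y2 ≥ 3, y1 ≥ 2, Y ≥ 0. Both have optimal value 11, attained at X = (3, 1) and Y = (2, 1):
\[
3(3)+2(1)\;=\;11\;=\;4(2)+3(1).
\]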
10. The problem is to minimize 6y1 + 10y2 + 3y3 + My4 + My5, where M is very large, subject to … . When the last tableau is obtained, the row of {d_i} will be ….
11. ….
12. ….
15. X and Y meet the test for optimality given in Exercise 14, and both are optimal.
16. B has a non-negative solution if and only if min FZ = 0.
17. ….
VI-6
1. (a) A = (−1)(…) + …; (b) …; (e) ….
2. ….
4. AE_i = 2E_i + N_i, where E_i = … and N_i = ….
6. e^A = e²(…) + ….
VI-8
1. V = … {(x2 − x1)² + [1/2(x2 − x3) − (√3/2)(y2 − y3)]² + [1/2(x3 − x1) + (√3/2)(y3 − y1)]²}.
2. ….
4. These displacements represent translations of the molecule in the plane containing it. They do not distort the molecule, do not store potential energy, and do not lead to vibrations of the system.
VI-9
2. π = (124), σ = (234), σπ = (134), ρ = σπ⁻¹ = (12)(34).
3. Since the subgroup is always one of its cosets, the alternating group has only two cosets in the full symmetric group, itself and the remaining elements. Since this is true for both right and left cosets, its right and left cosets are equal.
5. D(e) = D((123)) = D((132)) = [1], D((12)) = D((13)) = D((23)) = [−1].
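To see how products like these are computed (illustrative, with the convention that σπ means π is applied first):
\[
1 \xrightarrow{\pi} 2 \xrightarrow{\sigma} 3,\quad
2 \xrightarrow{\pi} 4 \xrightarrow{\sigma} 2,\quad
3 \xrightarrow{\pi} 3 \xrightarrow{\sigma} 4,\quad
4 \xrightarrow{\pi} 1 \xrightarrow{\sigma} 1,
\]
so σπ sends 1 → 3, 3 → 4, 4 → 1 and fixes 2, that is, σπ = (134).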
7. The matrix appearing in (9.7) is H = …. The matrix of transition is then P = ….
8. G is commutative if and only if every element is conjugate only to itself. By Theorem 9.11 and Equation 9.30, each n_r = 1.
10. Let ζ = e^{2πi/n} be a primitive nth root of unity. If a is a generator of the cyclic group, let D_k(a) = [ζ^k], k = 0, …, n − 1.
11. ….
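For Exercise 11, the one-dimensional representations of the cyclic group C4 = {e, a, a², a³} are the standard ones built from ζ = i as in Exercise 10; for reference, their character table is
\[
\begin{array}{c|cccc}
 & e & a & a^{2} & a^{3}\\\hline
D_1 & 1 & 1 & 1 & 1\\
D_2 & 1 & i & -1 & -i\\
D_3 & 1 & -1 & 1 & -1\\
D_4 & 1 & -i & -1 & i
\end{array}
\]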
12. By Theorem 9.12 each n_r | p². But n_r = p or p² is impossible because of (9.30) and the fact that there is at least one representation of dimension 1. Thus each n_r = 1, and the group is commutative.
16. Since ab must be of order 1 or 2, we have (ab)² = e, or ab = b⁻¹a⁻¹. Since a and b are of order 1 or 2, a⁻¹ = a and b⁻¹ = b.
17. If G is cyclic, let a be a generator of G, let ζ = e^{πi/4}, and define D_k(a) = [ζ^k]. If G contains an element a of order 4 and no element of higher order, then G contains an element b which is not a power of a. b is of order 2 or 4. If b is of order 4, then b² is of order 2. If b² is a power of a, then b² = a². Then c = ab is of order 2 and not a power of a. In any event there is an element c of order 2 which is not a power of a. Then G is generated by a and c. If G contains elements of order 2 and no higher, let a, b, c be three distinct elements of order 2. They generate the group. Hints for obtaining the character tables for these last two groups are given in Exercises 21, 25, and 26.
18. The character tables for these two non-isomorphic groups are identical.
29. 1² + 1² + 2² + 3² + 3².
30. U4 contains C1 (the conjugate class containing only the identity), C3 (the class containing the eight 3-cycles), and C5 (the class containing the three pairs of interchanges).
VI-10
2. The permutation (123) is represented by …; the representation of (12) is ….
3. C1 = 1, C2 = 1, C3 = 2.
4. ξ1 = (−√3/2, −1/2, √3/2, −1/2, 0, 1). The displacement is a uniform expansion of the molecule.
5. ξ2 = (−1/2, …, −1/2, −1/2, 0, …). This displacement is a rotation of the molecule without storing potential energy.
6. {ξ3 = (1, 0, 1, 0, 1, 0), ξ4 = (0, 1, 0, 1, 0, 1)}. This subspace consists of translations without distortion in the plane containing the molecule.
7. This subspace is spanned by the vectors ξ5 and ξ6 given in Exercise 8.
Notation
S_µ, µ ∈ M; {α, β}; {α | P}; {α_µ | µ ∈ M}; … (for sets); σ⁻¹(α); Im(σ); Hom(U, V); ρ(σ); K(σ); ν(σ); Aᵀ; U/K; R(σ); sgn π; det A; |a_ij|; adj A; C(λ); S(λ); Tr(A); V̂ (space); Â (basis); W⊥; A × B; ⨯_i A_i; ⨯_{µ∈M} A_µ; ⊕_i V_i; ⊕_{µ∈M} V_µ; Â; A*; (α, β); ‖α‖; d(α, β); σ*; W1 ⊥ W2; A(S); H(S); L1 J L2; W⁺; P (positive orthant); ≥ (for vectors); > (for vectors); f′(ξ, η); df(ξ); e^A; × (for n-tuples); D(G) (representation); χ
Index
Abelian group, 8, 293
Characteristic, equation, 100
Addition, of linear transformations, 29
matrix, 99 polynomial, of a matrix, 100
of matrices, 39 of vectors, 7
of a linear transformation, 107
Adjoint, of a linear transformation, 189 of a system of differential equations, 283
value, 106 Characterizing equations of a subspace, 69 Codimension, 139
Adjunct, 95 Affine, closure, 225
Codomain, 28
combination, 224
Cofactor, 93
n-space, 9
Column rank, 41
Affinely dependent, 224
Commutative group, 8, 293
Algebraically closed, 106
Companion matrix, 103
Algebraic multiplicity, 107
Complement, of a set, 5
Alternating group, 308
of a subspace, 23
Annihilator, 139, 191
Complementary subspace, 23
Associate, 76
Complete inverse image, 27
Associated, homogeneous problem, 64
Completely reducible representation, 295
linear transformation, 192 Associative algebra, 30
Complete orthonormal set, 183 Completing the square, 166
Augmented matrix, 64
Component of a vector, 17
Automorphism, 46, 293
Cone, convex, 230
inner, 293
dual, 230 finite, 230
Basic feasible vector, 243 Basis, 15
polar, 230 polyhedral, 231 reflexive, 231
dual, 130
Congruent matrices, 158
standard, 69
Conjugate, bilinear form, 171
Bessel's inequality, 183
class, 294
Betweenness, 227
elements in a group, 294
Bilinear form, 156 Bounded linear transformation, 260
linear, 171 space, 129 Continuously differentiable, 262, 265
Cancellation, 34
Continuous vector function, 260
Canonical, dual linear programming
Contravariant vector, 137, 187
problem, 243
Convex, cone, 230
linear programming problem, 242 mapping, 79
hull, 228 linear combination, 227
Change of basis, 50
set, 227
Character, of a group, 298 table, 306
Coordinate, function, 129 space, 9
Coordinates of a vector, 17 Coset, 79 Covariant vector, 137, 187 Cramer's rule, 97
Equation, characteristic, 100 minimum, 100 Equations, linear, 63 linear differential, 278 standard system, 70
Degenerate linear programming problem,
246 Derivative, of a matrix, 280 of a vector function, 266 Determinant, 89 Vandermonde, 93 Diagonal, main, 38
Equivalence, class, 75 relation, 74 Equivalent representations, 296 Euclidean space, 179 Even permutation, 87 Exact sequence, 147 Extreme vector, 252
matrix, 38, 113 Differentiable, 261, 262, 265 Differential of a vector function, 263 Dimension, of a representation, 294 of a vector space, 15 Direct product, 150 Direct sum, external, 148, 150 internal, 148 of representations, 296 of subspaces, 23, 24 Directional derivative, 264
Factor, group, 293 of a mapping, 81 space, 80 Faithful representation, 294 Feasible, linear programming problem,
241, 243 subset of a basis, 243 vector, 241, 243 Field, 5 Finite, cone, 230
Direct summand, 24
dimensional space, 15
Discriminant of a quadratic form, 199
sampling theorem, 212
Distance, 177
Flat, 220
Divergence, 267
Form, bilinear, 156
Domain, 28
conjugate bilinear, 171
Dual, bases, 142
Hermitian, 171
basis, 134
linear, 129
canonical linear programming problem,
quadratic, 160
243 cone, 230
Four-group, 309 Fourier coefficients, 182
space, 129
Functional, linear, 129
spaces, 134
Fundamental solution, 280
standard linear programming problem,
240 Duality, 133
General solution, 64 Generators of a cone, 230 Geometric multiplicity, 107
Eigenspace, 107
Gradient, 136
Eigenvalue, 104, 192
Gramian, 182
problem, 104 Eigenvector, 104, 192 Elementary, column operations, 57 matrices, 58 operations, 57
Gram-Schmidt orthonormalization process, 179 Group, 8, 292 abelian, 8, 293 alternating, 308
Elements of a matrix, 38
commutative, 8, 293
Empty set, 5
factor, 293
Endomorphism, 45
order of, 293
Epimorphism, 28
symmetric, 308
Half-line, 230
Lagrangian, 287
Hamilton-Cayley theorem, 100
Length of a vector, 177
Hermite normal form, 55
Line, 220
Hermitian, congruent, 172 form, 171
segment, 227 Linear, 1
matrix, 171
algebra, 30
quadratic form, 171
combination, 11
symmetric, 171 Homogeneous, associated problem, 64
non-negative, 230 conditions, 221
Homomorphism, 27, 293
constraints, 239
Hyperplane, 141, 220
dependence, 11
Idempotent, 270
functional, 129
Identity, matrix, 46
independence, 11
form, 129
permutation, 87
manifold, 220
representation, 308
problem, 63
transformation, 29 Image, 27, 28 inverse, 27 Independence, linearly, 11
relation, 11 transformation, 27 Linearly, dependent, 11 independent, 11
Index set, 5
Linear programming problem, 239
Indicators, 249
Linear transformation, 27
Induced operation, 79
addition of, 29
Injection, 146, 148
matrix representing, 38
Inner, automorphism, 293
multiplication of, 30
product, 177 Invariant, subgroup, 293 subspace, 104
normal, 203 scalar multiple of, 30 symmetric, 192
under a group, 294 Inverse, image, 27
Main diagonal, 38
matrix, 46
Manifold, linear, 220
transformation, 43
Mapping, canonical, 29
Inversion, of a permutation, 87 with respect to the origin, 37 Invertible, matrix, 46 transformation, 46 Irreducible representation, 271, 295 Isometry, 194
into, 27 natural, 29 onto, 28 Matrix polynomial, 99 Matrix, 37 addition, 39
Isomorphic, 18
characteristic, 99
Isomorphism, 28, 293
companion, 103 congruent, 158
Jacobian matrix, 266
diagonal, 38
Join, 229
Hermitian, 171
Jordan normal form, 118
congruent, 172 identity, 46
Kernel, 31
normal, 201
Kronecker delta, 15
of transition, 50
Kronecker product, 310
product, 40 representing, 38
Lagrange interpolation formula, 132
scalar, 46
Matrix (continued)
Parallel, 221
sum, 39
Parametric representation, 221
symmetric, 158
Parity of a permutation, 88
unit, 46
Parseval's identities, 183
unitary, 194
Particular solution, 63
Maximal independent set, 14
Partitioned matrix, 250
Mechanical quadrature, 256
Permutation, 86
Minimum, equation, 100
even, 87
polynomial, 100
identity, 87
Monomorphism, 27
group, 308
Multiplicity, algebraic, 107 geometric, 107
odd, 87 Phase space, 285 Pivot, element, 249
n-dimensional coordinate space, 9
operation, 249
Nilpotent, 274
Plane, 220
Non-negative, linear combination, 230
Point, 220
semi-definite, Hermitian form, 168 quadratic form, 173 Non-singular, linear transformation, matrix, 46 Non-trivial linear relation, 11
Pointed cone, 230 Polar, 162 cone, 230 form, 161 Pole, 162
Norm of a vector, 177
Polyhedral cone, 231
Normal, coordinates, 287
Polynomial, characteristic, 100
form, 76
matrix, 99
Hermite form, 55
minimum, 100
Jordan form, 118 linear transformation, 203 matrix, 201 over the real field, 176 subgroup, 293
Positive, orthant, 234 vector, 238 Positive-definit, Hermitian form, 173 quadratic form, 168 Primal linear programming problem, 240
Normalized vector, 178
Principal axes, 287
Normalizer, 294
Problem, associated homogeneous, 64
Nullity, of a linear transformation, 31 of a matrix, 41
eigenvalue, 104 linear, 63 Product set, 147
Objective function, 239
Projection, 35, 44, 149
Odd permutation, 87
Proper subspace, 20
One-to-one mapping, 27 Onto mapping, 28 Optimal vector, 241 Order, of a determinant, 89
Quadratic form, 160 Hermitian, 171 Quotient space, 80
of a group, 293 of a matrix, 37
Rank, column, 41
Orthant, positive, 234
of a bilinear form, 164
Orthogonal, linear transformation, 270
of a Hermitian form, 173
matrix, 196
of a linear transformation, 31
similar, 197
of a matrix, 41
transformation, 194
row, 41
vectors, 138, 178 Orthonormal, basis, 178
Real coordinate space, 9 Reciprocal basis, 188
Reducible representation, 295 Reflection, 43 Reflexive, cone, 231
orthogonal, 197 unitary, 197 Simplex method, 248
law, 74
Singular, 46
space, 133
Skew-Hermitian, 193
Regular representation, 301 Relation, of equivalence, 74 linear, 11 Representation, identity, 308
Skew-symmetric, bilinear form, 158 linear transformation, 192 matrix, 159 Solution, fundamental, 280
irreducible, 271, 295
general, 64
of a bilinear form, 157
particular, 63
of a change of basis, 50 of a group, 294 of a Hermitian form, 171
Space, Euclidean, 179 unitary, 179 vector, 7
of a linear functional, 130
Span, 12
of a linear transformation, 38
Spectral decomposition, 271
of a quadratic form, 161
Spectrum, 270
of a vector, 18
Standard, basis, 69
parametric, 221 reducible, 295
dual linear programming problem, 240 primal linear programming problem, 239
Representative of a class, 75
Steinitz replacement theorem, 13
Resolution of the identity, 271
Straight line, 220
Restriction, mapping, 84
Subgroup, invariant, 293
of a mapping, 84 Rotation, 44 Row-echelon, form, 55
Subspace, 20 invariant under a linear transformation,
104 invariant under a representation. 295
Sampling, function, 254 theorem, 253 Scalar, 7 matrix, 46 multiplication, of linear transformations,
30
Sum of sets, 39 Superdiagonal form, 199 Sylvester's law of nullity, 37 Symmetric, bilinear form. 158 group, 308 Hermitian form. 192
of matrices, 39
law, 74
of vectors, 7
linear transformation, 192
product, 177
matrix, 158
transformation, 29
part of a bilinear form, 159
Schur's lemma, 297 Schwarz's inequality, 177 Self-adjoint, linear transformation, 192 system of differential equations, 283
Symmetrization of a linear transformation,
295 Symmetry, of a geometric figure, 307 of a system, 312
Semi-definite, Hermitian form, 173 quadratic form, 168
Tableau, 248
Semi-positive vector, 238
Trace, 115, 298
Sgn, 87
Transformation, identity, 29
Shear, 44
inverse, 43
Signature, of a Hermitian form, 173
linear, 27
of a quadratic form, 168 Similar, linear transformations, 78 matrices, 52, 76
orthogonal, 194 scalar, 29 unit, 29
Transformation (continued) unitary, 194
Vandermonde determinant, 93 Vector, 7
Transition matrix, 50
feasible, 241, 243
Transitive law, 74
normalized, 178
Transpose of a matrix, 55
optimal, 241
Trivial linear relation, 11
positive, 238 semi-positive, 238 space, 7
Unitary, matrix, 196
Vierergruppe (see Four-group), 309
similar, 197 space, 179
Weierstrass approximation theorem, 185
transformation, 194 Unit matrix, 46
Zero mapping, 28