Section A.6 Problem Set 5

Instructions.

Prior to beginning this problem set, consider reading the Problem Set Success Guide Section A.1 for advice and clarity around expectations for Problem Sets in this course. Upload your solutions to all problems on this page to Gradescope as a single .pdf file, remembering to assign pages appropriately for each question. Complete instructions for Problem Sets are available on Canvas.

Problem A.6.1. (Problem 1).

In class, we defined the inverse of a matrix in two steps. First, if \(A\) is a matrix, we said that \(A\) was invertible if the corresponding linear map \(T\colon\IR^n\to\IR^n\) was an invertible function. Given this, the inverse of \(A\text{,}\) denoted \(A^{-1}\text{,}\) was defined to be the standard matrix of the inverse linear map \(T^{-1}\colon\IR^n\to\IR^n\text{.}\)
An alternative approach would be: an \(n\times n\) matrix \(A\) is invertible if we can find a matrix \(B\) for which \(AB=BA=I_n\text{;}\) in this case, \(B\) is unique and we define \(A^{-1}\) to be this matrix. For this problem (and future ones) you may decide to use either characterization you want.
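For instance, under the second characterization, \(\left[\begin{array}{cc}1&1\\0&1\end{array}\right]\) is invertible with inverse \(\left[\begin{array}{cc}1&-1\\0&1\end{array}\right]\text{,}\) since multiplying these two matrices in either order yields \(I_2\text{.}\)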

(a)

If \(A,B\) are both invertible matrices, explain why \(AB\) is also invertible and that \((AB)^{-1}=B^{-1}A^{-1}\text{.}\)

(b)

If \(A,B\) are \(n\times n\) matrices and \(A,B\) are invertible, explain and demonstrate how to solve the following matrix equations for \(X\text{:}\)
  1. \(\displaystyle A^{-1}XA=B\)
  2. \(\displaystyle AXA^{-1}=B\)
  3. \(\displaystyle ABX=I\)

(c)

If \(H,G\) are invertible matrices, is it necessarily the case that \((H+G)\) is invertible? If yes, prove it; if not, provide a counterexample.
Solution.

(a)

Since \(A,B\) are both invertible, it follows that both \(A^{-1}\) and \(B^{-1}\) exist. We have:
\begin{equation*} (AB)(B^{-1}A^{-1})=A(BB^{-1})A^{-1}=AI_nA^{-1}=AA^{-1}=I_n \end{equation*}
and
\begin{equation*} (B^{-1}A^{-1})(AB)=B^{-1}(A^{-1}A)B=B^{-1}I_nB=B^{-1}B=I_n, \end{equation*}
which shows that \(AB\) is invertible and that its inverse is \(B^{-1}A^{-1}\text{.}\)
Alternatively: if \(S,T\) denote the transformations corresponding to \(B,A\) respectively, then the composition \(T\circ S\text{,}\) whose standard matrix is \(AB\text{,}\) is invertible and its inverse is \(S^{-1}\circ T^{-1}\text{;}\) the result now follows by definition of invertible matrix.
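As a concrete sanity check, with matrices chosen purely for illustration, take
\begin{equation*} A=\left[\begin{array}{cc}1&1\\0&1\end{array}\right],\ B=\left[\begin{array}{cc}1&0\\1&1\end{array}\right],\ \text{so that}\ AB=\left[\begin{array}{cc}2&1\\1&1\end{array}\right]\ \text{and}\ B^{-1}A^{-1}=\left[\begin{array}{cc}1&0\\-1&1\end{array}\right]\left[\begin{array}{cc}1&-1\\0&1\end{array}\right]=\left[\begin{array}{cc}1&-1\\-1&2\end{array}\right]. \end{equation*}
Multiplying directly confirms that \(\left[\begin{array}{cc}2&1\\1&1\end{array}\right]\left[\begin{array}{cc}1&-1\\-1&2\end{array}\right]=I_2\text{.}\)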

(b)

  1. Consider the following steps to solve for \(X\text{:}\)
    \begin{align*} A^{-1}XA \amp= B\\ XA \amp= AB\\ X \amp= ABA^{-1} \end{align*}
  2. Consider the following steps to solve for \(X\text{:}\)
    \begin{align*} AXA^{-1} \amp= B\\ XA^{-1} \amp= A^{-1}B\\ X \amp= A^{-1}BA \end{align*}
  3. Consider the following steps to solve for \(X\text{:}\)
    \begin{align*} ABX \amp= I\\ BX \amp= A^{-1}\\ X \amp= B^{-1}A^{-1} \end{align*}
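As a check on the first of these (the other two are similar), substituting \(X=ABA^{-1}\) back into the original equation gives
\begin{equation*} A^{-1}(ABA^{-1})A=(A^{-1}A)B(A^{-1}A)=I_nBI_n=B, \end{equation*}
as desired.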

(c)

No. Both \(I_n\) and \(-I_n\) are invertible, but their sum, the zero matrix, is not.

Problem A.6.2. (Problem 2).

Let \(A\) and \(B\) be two matrices for which the products \(AB\) and \(BA\) are both defined. If \(AB=BA\text{,}\) we say that \(A,B\) commute; if \(AB=-BA\text{,}\) we say that \(A\) and \(B\) anti-commute.

(a)

Let \(A=\left[\begin{array}{cc}1& 3\\-1& 1\end{array}\right]\) and \(S=\{X\in M_{2,2}|\ AX=XA\}\) be the subset of \(2\times 2\) matrices that commute with \(A\text{.}\) Explain why \(S\) is a subspace of \(M_{2,2}\) and explain and demonstrate how to find a basis for \(S\text{.}\)

(b)

Let \(B=\left[\begin{array}{cc}0& 1\\-1& 0\end{array}\right]\) and \(T=\{Y\in M_{2,2}|\ BY=-YB\}\) be the subset of \(2\times 2\) matrices that anti-commute with \(B\text{.}\) Explain why \(T\) is a subspace of \(M_{2,2}\) and explain and demonstrate how to find a basis for \(T\text{.}\)
Solution.

(a)

Let’s begin by showing that \(S\) is indeed a subspace. Firstly, if \(\vec{0}\) denotes the zero matrix, then \(\vec{0}A=A\vec{0}=\vec{0}\text{,}\) so \(\vec{0}\in S\text{.}\)
Next, if \(X,Y\in S\text{,}\) then we know that \(AX=XA\) and \(AY=YA\text{.}\) When we add them, we find
\begin{equation*} A(X+Y)=AX+AY=XA+YA=(X+Y)A, \end{equation*}
so \(S\) is closed under taking sums. Likewise, if \(k\) is some scalar, then
\begin{equation*} A(kX)=kAX=kXA=(kX)A \end{equation*}
shows closure under scalar multiplication.
Let \(X=\left[\begin{array}{cc}a&b\\c&d\end{array}\right]\in M_{2,2}\text{.}\) Then \(X\in S\) if and only if \(AX=XA\text{;}\) that is, if and only if
\begin{equation*} \left[\begin{array}{cc}1&3\\-1&1\end{array}\right]\left[\begin{array}{cc}a&b\\c&d\end{array}\right]=\left[\begin{array}{cc}a&b\\c&d\end{array}\right]\left[\begin{array}{cc}1&3\\-1&1\end{array}\right] \end{equation*}
Performing the multiplication, we get:
\begin{equation*} \left[\begin{array}{cc}a+3c&b+3d\\-a+c&-b+d\end{array}\right]=\left[\begin{array}{cc}a-b&3a+b\\c-d&3c+d\end{array}\right] \end{equation*}
By equating corresponding entries, we obtain a linear system of equations in the variables \(a,b,c,d\text{:}\) namely \(a+3c=a-b\text{,}\) \(b+3d=3a+b\text{,}\) \(-a+c=c-d\text{,}\) and \(-b+d=3c+d\text{.}\) The corresponding equivalent system obtained by taking an appropriate RREF is:
\begin{equation*} a=d,\ \ b=-3c \end{equation*}
It follows that any element of \(S\) can be written as:
\begin{equation*} \left[\begin{array}{cc}a&b\\c&d\end{array}\right]=\left[\begin{array}{cc}d&-3c\\c&d\end{array}\right]=d\left[\begin{array}{cc}1&0\\0&1\end{array}\right]+c\left[\begin{array}{cc}0&-3\\1&0\end{array}\right]. \end{equation*}
This demonstrates that \(\left\{\left[\begin{array}{cc}1&0\\0&1\end{array}\right],\left[\begin{array}{cc}0&-3\\1&0\end{array}\right]\right\}\) spans \(S\text{.}\) Since these matrices are not multiples of each other, the set is also linearly independent and therefore a basis.
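As a quick check on the second matrix (the first is \(I_2\text{,}\) which commutes with everything):
\begin{equation*} \left[\begin{array}{cc}1&3\\-1&1\end{array}\right]\left[\begin{array}{cc}0&-3\\1&0\end{array}\right]=\left[\begin{array}{cc}3&-3\\1&3\end{array}\right]=\left[\begin{array}{cc}0&-3\\1&0\end{array}\right]\left[\begin{array}{cc}1&3\\-1&1\end{array}\right]. \end{equation*}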

(b)

An analogous argument to that used in (a) shows that \(T\) is a subspace.
By starting with an arbitrary \(Y=\left[\begin{array}{cc}x&y\\z&w\end{array}\right]\in M_{2,2}\) and considering the equation \(BY=-YB\text{,}\) one similarly obtains a system of equations in the variables \(x,y,z,w\) whose solutions describe elements of \(T\text{.}\) After row-reducing this system, we get:
\begin{equation*} x=-w,\ \ y=z. \end{equation*}
It follows that if \(Y\in T\text{,}\) then
\begin{equation*} Y=\left[\begin{array}{cc}x&y\\z&w\end{array}\right]=\left[\begin{array}{cc}-w&z\\z&w\end{array}\right]=w\left[\begin{array}{cc}-1&0\\0&1\end{array}\right]+z\left[\begin{array}{cc}0&1\\1&0\end{array}\right]. \end{equation*}
It follows, by the same reasoning in (a), that the set \(\left\{\left[\begin{array}{cc}-1&0\\0&1\end{array}\right],\left[\begin{array}{cc}0&1\\1&0\end{array}\right]\right\}\) is a basis for \(T\text{.}\)
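For instance, one can verify directly that the second basis element anti-commutes with \(B\text{:}\)
\begin{equation*} \left[\begin{array}{cc}0&1\\-1&0\end{array}\right]\left[\begin{array}{cc}0&1\\1&0\end{array}\right]=\left[\begin{array}{cc}1&0\\0&-1\end{array}\right]=-\left[\begin{array}{cc}0&1\\1&0\end{array}\right]\left[\begin{array}{cc}0&1\\-1&0\end{array}\right]. \end{equation*}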

Problem A.6.3. (Problem 3).

Suppose that \(S\colon V\to W\) and \(T\colon W\to U\) are two linear transformations of vector spaces.

(a)

If \(S\) and \(T\) are both surjective, explain why \(T\circ S\) is also surjective.

(b)

Let \(A\) be an \(m\times n\) matrix and \(B\) an \(n\times k\) matrix. Suppose further that we know that \(\RREF(A)\) and \(\RREF(B)\) have pivot positions in each row. Explain why each row of \(\RREF(AB)\) also has a pivot.

(c)

Again, let \(A\) be an \(m\times n\) matrix and \(B\) an \(n\times k\) matrix. Suppose we know that \(\RREF(AB)\) has a pivot in each row. Explain why \(\RREF(A)\) must also have a pivot position in each of its rows, but show, by providing an example, that it is possible for some row of \(\RREF(B)\) to be missing a pivot.
Remark: If you’d like to explore further: formulate and answer a related sequence of activities involving injective transformations and products of matrices whose RREFs have pivots in each column.
Solution.

(a)

To show that \(T\circ S\) is surjective, we need to start with an arbitrary \(\vec{u}\in U\) and explain why we can find some \(\vec{v}\in V\) that maps to it. Since \(\vec{u}\in U\) and \(T\) is surjective, we know we can find some \(\vec{w}\in W\) for which \(T(\vec{w})=\vec{u}\text{.}\) Likewise, since we know that \(S\) is surjective and \(\vec{w}\in W\text{,}\) it follows that there is some \(\vec{v}\in V\) for which \(S(\vec{v})=\vec{w}\text{.}\)
But then \((T\circ S)(\vec{v})=T(S(\vec{v}))=T(\vec{w})=\vec{u}\text{.}\) This shows that the composition is surjective.

(b)

With notation as given in the problem, let \(S\colon \IR^k\to\IR^n\) and \(T\colon\IR^n\to\IR^m\) denote the linear transformations corresponding to \(B\) and \(A\) respectively. Since \(\RREF(A)\) and \(\RREF(B)\) have pivots in each row, both transformations are surjective. By part (a), it follows that \(T\circ S\) is surjective. Since \(T\circ S\) is surjective, its standard matrix, which is \(AB\text{,}\) has a pivot in each of its rows.

(c)

In part (b), we were able to prove a statement about pivot positions by first translating the statement into one about transformations. Let’s see if this works in this new context.
I claim that if \(S\colon V\to W\) and \(T\colon W\to U\) are linear maps for which the composition \(T\circ S\) is surjective, then it follows that \(T\) must itself be surjective.
To see this, let \(\vec{u}\in U\text{.}\) Since the composition is surjective, we are entitled to some \(\vec{v}\in V\) for which \(T(S(\vec{v}))=\vec{u}\text{.}\) But \(S(\vec{v})\in W\text{,}\) which means some element of \(W\) maps to \(\vec{u}\text{,}\) showing surjectivity of \(T\text{.}\)
Translating this into statements about matrices having pivots in each row, the result follows just as it did above in part (b).
One example that shows that \(B\) need not have a pivot in each row is given by the following:
\begin{equation*} A=\left[\begin{array}{ccc}1&2&0\\0&1&0\end{array}\right], B=\left[\begin{array}{cc}1&0\\0&1\\0&0\end{array}\right] \end{equation*}
Then, we have
\begin{equation*} AB=\left[\begin{array}{cc}1&2\\0&1\end{array}\right] \end{equation*}
The matrix \(A\) has a pivot in each row, as does \(AB\text{,}\) but \(B\) is missing a pivot in its third row.
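Concretely, \(B\) is already in RREF, with pivots in its first two rows only, and a single row operation reduces \(AB\) to the identity:
\begin{equation*} \RREF(AB)=\left[\begin{array}{cc}1&0\\0&1\end{array}\right]. \end{equation*}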

Observation A.6.4. A Different Take on Matrix Multiplication.

In class, we defined the product of two matrices to be the standard matrix of the composition of the two corresponding linear transformations. Here is an alternate formula/definition for the matrix product that builds on the work you’ve been doing with dot-products.
Suppose that \(A\) is an \(m\times n\) matrix and that \(B\) is an \(n\times k\) matrix. Let \(\vec{r}_1,\dots,\vec{r}_m\) be the rows of \(A\) and let \(\vec{c}_1,\dots,\vec{c}_k\) be the columns of \(B\text{,}\) so that we can write \(A=\left[\begin{array}{c}\vec{r}_1\\\vdots\\\vec{r}_m\end{array}\right]\) and \(B=[\vec{c}_1\ \cdots\ \vec{c}_k]\text{.}\) Since the rows of \(A\) and the columns of \(B\) are all vectors in \(\IR^n\text{,}\) it makes sense to take the dot-product between them. The matrix product can then be defined by
\begin{equation*} AB=\left[\begin{array}{ccc}\vec{r}_1\bullet\vec{c}_1&\cdots&\vec{r}_1\bullet\vec{c}_k\\\vdots&\ddots&\vdots\\\vec{r}_m\bullet\vec{c}_1&\cdots&\vec{r}_m\bullet\vec{c}_k\end{array}\right]. \end{equation*}
In other words, the \(ij\) entry of the product \(AB\) is the dot-product of the \(i\)-th row of \(A\) and the \(j\)-th column of \(B\text{.}\)
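Written entrywise: if \(a_{i\ell}\) and \(b_{\ell j}\) denote the entries of \(A\) and \(B\) (notation introduced here just for this formula), then
\begin{equation*} (AB)_{ij}=\vec{r}_i\bullet\vec{c}_j=\sum_{\ell=1}^{n}a_{i\ell}b_{\ell j}. \end{equation*}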
As an example, in class, we computed that:
\begin{equation*} \left[\begin{array}{cc}1&2\\0&1\\3&5\\-1&-2\end{array}\right]\left[\begin{array}{ccc}2&1&-3\\5&-3&4\end{array}\right]=\left[\begin{array}{ccc}12&-5&5\\5&-3&4\\31&-12&11\\-12&5&-5\end{array}\right]. \end{equation*}
Now, we can check the above as follows: we see that the \((3,2)\) entry of this product is \(-12\text{.}\) This is the same as taking the dot-product of the third row of \(A\) and the second column of \(B\text{:}\)
\begin{equation*} \left[\begin{array}{c}3\\5\end{array}\right]\bullet\left[\begin{array}{c}1\\-3\end{array}\right]=-12. \end{equation*}

Problem A.6.5. (Problem 4).

Use the above conceptualization of the matrix product to revisit some old friends with a new perspective.

(a)

Let \(T\colon\IR^n\to\IR^m\) be a linear transformation with \(m\times n\) standard matrix \(A\text{.}\) Explain why the kernel of \(T\) is equal to the orthogonal complement of the row space of \(A\text{.}\) That is, explain why:
\begin{equation*} \ker(T)=\textrm{Row}(A)^\perp. \end{equation*}

(b)

Using part (a), explain and demonstrate how to calculate a basis for \(W^\perp\) where:
\begin{equation*} W=\vspan\left\{\left[\begin{array}{c}1\\2\\-3\\2\end{array}\right],\left[\begin{array}{c}2\\7\\-1\\5\end{array}\right]\right\}. \end{equation*}

(c)

Let \(W\) be a subspace of \(\IR^n\text{.}\) Explain, using results covered in class or in previous problem sets, why
\begin{equation*} \dim(W)+\dim(W^\perp)=n. \end{equation*}
Hint.
For part (c): choose a spanning set for \(W\text{.}\) That is, suppose that \(W=\vspan\{\vec{v}_1,\dots, \vec{v}_r\}\) for some finite set of vectors.
Solution.

(a)

If \(\vec{x}=\left[\begin{array}{c}x_1\\\vdots\\x_n\end{array}\right]\in\IR^n\text{,}\) then
\begin{equation*} T(\vec{x})=A\vec{x}=\left[\begin{array}{c}\vec{r}_1\bullet\vec{x}\\\vdots\\\vec{r}_m\bullet\vec{x}\end{array}\right], \end{equation*}
using the dot-product description of the matrix product, where \(\vec{r}_i\) denotes the \(i\)-th row of \(A\text{.}\)
With this description, \(T(\vec{x})=\vec{0}\) if and only if \(\vec{x}\) is orthogonal to every row of \(A\text{.}\) Using our result from Problem Set 3 about spanning sets and orthogonal complements, this is equivalent to saying that \(\vec{x}\in\textrm{Row}(A)^\perp\text{.}\)

(b)

By what we showed in part (a), \(W^\perp\) is equal to the kernel of the linear transformation represented by the following matrix:
\begin{equation*} A=\left[\begin{array}{cccc}1&2&-3& 2\\2&7&-1& 5\end{array}\right]. \end{equation*}
Using our usual methods from EV7, a basis for the solution space to the corresponding homogeneous equation is:
\begin{equation*} \left\{\left[\begin{array}{c}\frac{19}{3}\\-\frac{5}{3}\\1\\0\end{array}\right],\left[\begin{array}{c}-\frac{4}{3}\\-\frac{1}{3}\\0\\1\end{array}\right]\right\} \end{equation*}
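For reference, the row reduction behind this computation is
\begin{equation*} \RREF\left(\left[\begin{array}{cccc}1&2&-3&2\\2&7&-1&5\end{array}\right]\right)=\left[\begin{array}{cccc}1&0&-\frac{19}{3}&\frac{4}{3}\\0&1&\frac{5}{3}&\frac{1}{3}\end{array}\right], \end{equation*}
so the last two variables are free, and the two basis vectors above come from setting each free variable equal to \(1\) in turn.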

(c)

Suppose that we have some spanning set of \(W\text{;}\) that is, \(W=\vspan\{\vec{v}_1,\dots, \vec{v}_r\}\) for some vectors \(\vec{v}_i\text{.}\) As we did above, let \(A\) denote the matrix whose rows are given by the \(\vec{v}_i\text{.}\) Then, \(A\) represents some linear transformation \(T\colon\IR^n\to\IR^r\text{.}\)
The rank of \(A\) is equal to \(\dim(W)\text{,}\) since the rows of \(A\) span \(W\text{.}\) The nullity, by part (a), is equal to \(\dim(W^\perp)\text{.}\) Their sum, by the rank-nullity theorem, is \(n\text{.}\)
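In symbols, combining these observations:
\begin{equation*} \dim(W)+\dim(W^\perp)=\textrm{rank}(A)+\textrm{nullity}(A)=n. \end{equation*}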