Some Notes on Matrices and Linear Operators

The Matrix As a Linear Operator

Let $A$ be an $m\times n$ matrix. The function

$$T_A:\mathbb{R}^n\to\mathbb{R}^m, \quad T_A(\underline{x}) = A\underline{x},$$

is linear, that is

$$T_A (a\underline{x} + b\underline{y}) = aT_A(\underline{x}) + bT_A(\underline{y})$$

if $\underline{x}, \underline{y} \in \mathbb{R}^n$ and $a, b \in \mathbb{R}$.

Examples

Example

If

$$A = \begin{bmatrix} 1 & 2 \end{bmatrix}$$

then $T_A(\underline{x}) = x + 2y$ where $\underline{x} = \displaystyle{x \choose y}\in \mathbb{R}^2$.

Example

If

$$A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$

then

$$T_A\displaystyle{x \choose y} = \begin{bmatrix} y \\ x \end{bmatrix}$$

Example

If

$$A = \begin{bmatrix} 0 & 2 & 3 \\ 1 & 0 & 1 \end{bmatrix}$$

then

$$T_A \left( \begin{array}{c} x \\ y \\ z \end{array} \right) = \begin{bmatrix} 2y + 3z \\ x + z \end{bmatrix}$$

Example

If

$$T \displaystyle{x \choose y} = \left( \begin{array}{c} x + y \\ 2x - 3y \end{array} \right)$$

then T(x)=AxT (\underline{x}) = A \underline{x} if we set

$$A = \begin{bmatrix} 1 & 1 \\ 2 & -3 \end{bmatrix}$$
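
As a quick numerical check of this last example (a minimal sketch using NumPy; the test vectors are arbitrary), multiplying by $A$ reproduces the map $T$ and satisfies the linearity property from the definition above:

```python
import numpy as np

# Matrix from the last example: T(x, y) = (x + y, 2x - 3y)
A = np.array([[1, 1],
              [2, -3]])

x = np.array([5, 7])
print(A @ x)    # [ 12 -11]  =  (5 + 7, 2*5 - 3*7)

# Linearity: T_A(a*x + b*y) == a*T_A(x) + b*T_A(y)
y = np.array([-2, 3])
a, b = 2.0, -1.5
print(np.allclose(A @ (a * x + b * y), a * (A @ x) + b * (A @ y)))   # True
```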

Inner Products and Norms

If $x$ and $y$ are vectors, we define their inner product by

$$x \cdot y = x_1y_1 + x_2y_2 + \cdots + x_ny_n$$

where

$$x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$$

and

$$y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}$$

Details

If $x, y \in \mathbb{R}^n$ are arbitrary (column) vectors, then we define their inner product by

$$x \cdot y = x_1y_1 + x_2y_2 + \cdots + x_ny_n$$

where

$$x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$$

and

$$y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix}$$

Note

Note that we can also view $x$ and $y$ as $n \times 1$ matrices, and we see that $x \cdot y = x^\prime y$.

Definition

The norm, or length, of a vector $x$ is defined by $\left\| x \right\|^2 = x \cdot x$. It may also be expressed as $\left\| x \right\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$.

It is easy to see that for vectors $a, b$ and $c$ we have $(a+b)\cdot c = a\cdot c + b\cdot c$ and $a\cdot b = b\cdot a$.

Examples

Two vectors $x$ and $y$ are said to be orthogonal if $x \cdot y = 0$.

Example

If

$$x = \begin{pmatrix} 3 \\ 4 \end{pmatrix}$$

and

$$y = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$$

then

$$x \cdot y = 3 \cdot 2 + 4 \cdot 1 = 10$$

and

$$\left\| x \right\|^2 = 3^2 + 4^2 = 25$$

so

$$\left\| x \right\| = 5$$
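
The same computation in NumPy (a small sketch; `np.dot` and `np.linalg.norm` compute exactly the sum and square-root formulas above):

```python
import numpy as np

x = np.array([3, 4])
y = np.array([2, 1])

print(np.dot(x, y))         # 10   = 3*2 + 4*1
print(np.linalg.norm(x))    # 5.0  = sqrt(3^2 + 4^2)
```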

Orthogonal Vectors

Two vectors $x$ and $y$ are said to be orthogonal if $x\cdot y=0$, denoted $x \perp y$.

Details

Definition

Two vectors $x$ and $y$ are said to be orthogonal if $x\cdot y=0$, denoted $x \perp y$.

If $a,b \in \mathbb{R}^n$ then

$$\left\|a+b\right\|^2 = a\cdot a + 2\,a\cdot b + b\cdot b$$

so

$$\left\|a+b\right\|^2 = \left\|a\right\|^2 + \left\|b\right\|^2 + 2\,a\cdot b$$

Note

Note that if $a \perp b$ then $\left\|a+b\right\|^2 = \left\|a\right\|^2 + \left\|b\right\|^2$, which is Pythagoras' theorem in $n$ dimensions.
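
A quick numerical illustration (a sketch; the orthogonal pair below was chosen only for the example):

```python
import numpy as np

a = np.array([1.0, 2.0, 2.0])
b = np.array([2.0, -1.0, 0.0])       # a . b = 2 - 2 + 0 = 0, so a is orthogonal to b

lhs = np.linalg.norm(a + b) ** 2
rhs = np.linalg.norm(a) ** 2 + np.linalg.norm(b) ** 2
print(np.isclose(lhs, rhs))          # True: Pythagoras' theorem in n dimensions
```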

Linear Combinations of Independent Identically Distributed Variables

Suppose $X_1,\dots,X_n$ are independent random variables, where $X_i$ has mean $\mu_i$ and variance $\sigma_i^2$, and let $a_1,\dots,a_n$ be real constants. Consider the linear combination

$$Y=\sum a_i X_i$$

Then the mean of $Y$ is

$$\mu_Y = \sum a_i \mu_i$$

and the variance is

$$\sigma_Y^2 = \sum a^2_i \sigma^2_i$$

Examples

Example

Consider two independent identically distributed random variables, $Y_1, Y_2$, such that $E[Y_1]=E[Y_2]=2$ and $Var[Y_1]=Var[Y_2]=4$, and a specific linear combination of the two, $W=Y_1+3Y_2$. We first obtain

$$E[W]=E[Y_1+3Y_2]=E[Y_1]+3E[Y_2]=2+3\cdot 2=2+6=8$$

Similarly, we can first use independence to obtain

$$Var[W]=Var[Y_1+3Y_2]=Var[Y_1]+Var[3Y_2]$$

and then (recall that $Var[aY]=a^2Var[Y]$)

$$Var[Y_1]+Var[3Y_2]=Var[Y_1]+3^2Var[Y_2]=1^2 \cdot 4+3^2\cdot 4= 1 \cdot 4 + 9 \cdot 4= 40$$

Normally, we just write this up in a simple sequence

$$Var[W]=Var[Y_1+3Y_2]=Var[Y_1]+3^2Var[Y_2]=1^2 \cdot 4+3^2\cdot 4 = 1 \cdot 4 + 9 \cdot 4= 40$$
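
These two numbers are easy to check by simulation (a sketch; the $Y_i$ are taken to be normal with mean 2 and variance 4 purely for illustration, since only the mean and variance matter here):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

# Y1, Y2 i.i.d. with mean 2 and variance 4 (standard deviation 2)
Y1 = rng.normal(loc=2, scale=2, size=n)
Y2 = rng.normal(loc=2, scale=2, size=n)

W = Y1 + 3 * Y2
print(W.mean())   # close to E[W]   = 8
print(W.var())    # close to Var[W] = 40
```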

Covariance Between Linear Combinations of Independent Identically Distributed Random Variables

Suppose $Y_1,\ldots,Y_n$ are independent identically distributed, each with mean $\mu$ and variance $\sigma^2$, and $a,b\in \mathbb{R}^n$. Writing

$$Y = \left( \begin{array}{c} Y_1 \\ \vdots \\ Y_n \end{array} \right)$$

consider the linear combinations $a'Y$ and $b'Y$.

Details

The covariance between random variables $U$ and $W$ is defined by

$$Cov(U,W)= E[(U-\mu_u)(W-\mu_w)]$$

where $\mu_u=E[U]$ and $\mu_w=E[W]$. Now, let $U=a'Y=\sum Y_ia_i$ and $W=b'Y=\sum Y_ib_i$, where $Y_1,\ldots,Y_n$ are independent identically distributed with mean $\mu$ and variance $\sigma^2$; then we get

$$Cov(U,W)= E[(a'Y-\Sigma a_i\mu)(b'Y-\Sigma b_i\mu)]$$

$$= E[(\Sigma a_iY_i -\Sigma a_i\mu)(\Sigma b_jY_j -\Sigma b_j\mu )]$$

and after some tedious (but basic) calculations we obtain

$$Cov(U,W)=\sigma^2 a\cdot b$$

Examples

Example

If $Y_1$ and $Y_2$ are independent identically distributed, then

$$Cov(Y_1+Y_2, Y_1-Y_2) = Cov \left( (1,1) \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix}, (1,-1) \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} \right) = (1,1) \begin{pmatrix} 1 \\ -1 \end{pmatrix} \sigma^2 = 0$$

and in general, $Cov(\underline{a}'\underline{Y}, \underline{b}'\underline{Y})=0$ if $\underline{a}\bot \underline{b}$ and $Y_1,\ldots,Y_n$ are independent.
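
This can also be verified by simulation (a sketch; the $Y_i$ are drawn as independent normals with an arbitrary variance $\sigma^2 = 4$):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000
sigma2 = 4.0

# Y1, Y2 i.i.d. with variance sigma2
Y = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=(2, n))

U = Y[0] + Y[1]    # a = (1, 1)
W = Y[0] - Y[1]    # b = (1, -1), so a . b = 0

print(np.cov(U, W)[0, 1])   # close to sigma2 * (a . b) = 0
```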

Random Vectors

$Y= (Y_1, \ldots, Y_n)$ is a random vector if $Y_1, \ldots, Y_n$ are random variables.

Details

Definition

If $E[Y_i] = \mu_i$ then we typically write

$$E[Y] = \left( \begin{array}{c} \mu_1 \\ \vdots \\ \mu_n \end{array} \right) = \mu$$

If $Cov(Y_i, Y_j) = \sigma_{ij}$ and $Var[Y_i]=\sigma_{ii} = \sigma_i^2$, then we define the matrix

$$\boldsymbol{\Sigma} = (\sigma_{ij})$$

containing the variances and covariances. We call this matrix the covariance matrix of $Y$, typically denoted $Var[Y] = \boldsymbol{\Sigma}$ or $CoVar[Y] = \boldsymbol{\Sigma}$.

Examples

Example

If $Y_1, \ldots, Y_n$ are independent identically distributed, $EY_i = \mu$, $VY_i = \sigma^2$, $a,b\in\mathbb{R}^n$, $U=a'Y$, $W=b'Y$, and

$$T = \begin{bmatrix} U \\ W \end{bmatrix}$$

then

$$ET = \begin{bmatrix} \Sigma a_i \mu \\ \Sigma b_i \mu \end{bmatrix}$$

$$VT = \boldsymbol{\Sigma} = \sigma^2 \begin{bmatrix} \Sigma a_i^2 & \Sigma a_i b_i \\ \Sigma a_ib_i & \Sigma b_i^2 \end{bmatrix}$$

Example

If $\underline{Y}$ is a random vector with mean $\boldsymbol{\mu}$ and variance-covariance matrix $\boldsymbol{\Sigma}$, then

$$E[a'Y] = a'\mu$$

and

$$Var[a'Y] = a' \boldsymbol{\Sigma} a$$
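
In matrix form these are one-line computations (a sketch; the values of $\mu$, $\boldsymbol{\Sigma}$ and $a$ below are arbitrary, with $\boldsymbol{\Sigma}$ a valid covariance matrix):

```python
import numpy as np

mu = np.array([1.0, 2.0])
Sigma = np.array([[4.0, 1.0],
                  [1.0, 3.0]])   # symmetric and positive definite
a = np.array([2.0, -1.0])

print(a @ mu)           # E[a'Y]   = a' mu      = 0.0
print(a @ Sigma @ a)    # Var[a'Y] = a' Sigma a = 15.0
```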

Transforming Random Vectors

Suppose

$$\mathbf{Y} = \left( \begin{array}{c} Y_1 \\ \vdots \\ Y_n \end{array} \right)$$

is a random vector with $E[\mathbf{Y}] = \mu$ and $Var[\mathbf{Y}] = \boldsymbol{\Sigma}$, where the variance-covariance matrix is

$$\boldsymbol{\Sigma} = \sigma^2 I$$

Details

Note that if $Y_1, \ldots, Y_n$ are independent with common variance $\sigma^2$ then

$$\boldsymbol{\Sigma} = \left[ \begin{array}{ccccc} \sigma_{1}^{2} & \sigma_{12} & \sigma_{13} & \ldots & \sigma_{1n} \\ \sigma_{21} & \sigma_2^{2} & \sigma_{23} & \ldots & \sigma_{2n} \\ \sigma_{31} & \sigma_{32} & \sigma_3^{2} & \ldots & \sigma_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \sigma_{n1} & \sigma_{n2} & \sigma_{n3} & \ldots & \sigma_n^{2} \end{array} \right] = \left[ \begin{array}{ccccc} \sigma_{1}^{2} & 0 & \ldots & \ldots & 0 \\ 0 & \sigma_2^{2} & \ddots & & \vdots \\ \vdots & \ddots & \sigma_3^{2} & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ 0 & \ldots & \ldots & 0 & \sigma_n^{2} \end{array} \right] = \sigma^2 \left[ \begin{array}{ccccc} 1 & 0 & \ldots & \ldots & 0 \\ 0 & 1 & \ddots & & \vdots \\ \vdots & \ddots & 1 & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ 0 & \ldots & \ldots & 0 & 1 \end{array} \right] = \sigma^2 I$$

If $A$ is an $m \times n$ matrix, then

$$E[A\mathbf{Y}] = A \boldsymbol{\mu}$$

and

$$Var[A\mathbf{Y}] = A \boldsymbol{\Sigma} A'$$
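
A sketch of these two rules in NumPy, with $\boldsymbol{\Sigma} = \sigma^2 I$ as above (the matrix $A$ and the values of $\mu$ and $\sigma^2$ are only placeholders):

```python
import numpy as np

sigma2 = 2.0
mu = np.array([1.0, 0.0, -1.0])
Sigma = sigma2 * np.eye(3)          # Sigma = sigma^2 I

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 1.0, 1.0]])     # an m x n matrix (m = 2, n = 3)

print(A @ mu)                       # E[AY]   = A mu
print(A @ Sigma @ A.T)              # Var[AY] = A Sigma A' = sigma^2 A A'
```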