
Page 1

Machine Learning for Signal Processing

Fundamentals of Linear Algebra

Class 2. 6 Sep 2016

Instructor: Bhiksha Raj

Page 2

Overview

• Vectors and matrices

• Basic vector/matrix operations

• Various matrix types

• Projections

Page 3

Book

• Fundamentals of Linear Algebra, Gilbert Strang

• Important to be very comfortable with linear algebra

– Appears repeatedly in the form of eigen analysis, SVD, and factor analysis

– Appears through various properties of matrices that are used in machine learning

– Often used in the processing of data of various kinds

– Will use sound and images as examples

• Today’s lecture: definitions

– A very small subset of all that’s used

– Important subset, intended to help you recollect

Page 4

Incentive to use linear algebra

• Simplified notation!

• Easier intuition

– Really convenient geometric interpretations

• Easy code translation!


    % Loop version: accumulate C = sum over i,j of x(i)*a(i,j)*y(j)
    C = 0;
    for i = 1:n
        for j = 1:m
            C = C + x(i)*a(i,j)*y(j);
        end
    end

    % One-line equivalent (x a 1xn row vector, y an mx1 column vector):
    C = x*A*y;

In summation form: C = Σᵢ Σⱼ yⱼ xᵢ aᵢⱼ, i.e. C = x A y, with x a row vector and y a column vector

Page 5

And other things you can do

• Manipulate Data

• Extract information from data

• Represent data..

• Etc.

[Figures: rotation + projection + scaling + perspective transforms of an image; a time-frequency spectrogram from Bach’s Fugue in Gm; its decomposition by NMF]

Page 6

Scalars, vectors, matrices, …

• A scalar a is a number – a = 2, a = 3.14, a = -1000, etc.

• A vector a is a linear arrangement of a collection of scalars

• A matrix A is a rectangular arrangement of a collection of scalars

[Examples: a vector a = [1 2 3], a scalar a = 3.14, and a matrix A]

Page 7

Vectors in the abstract

• Ordered collection of numbers

– Examples: [3 4 5], [a b c d], ..

– [3 4 5] != [4 3 5]: order is important

• Typically viewed as identifying (the path from origin to) a location in an N-dimensional space

[Figure: (3,4,5) and (4,3,5) as distinct points in x-y-z space]

Page 8

Vectors in reality

• Vectors usually hold sets of numerical attributes

– X, Y, Z coordinates

• [1, 2, 0]

– [height(cm) weight(kg)]

– [175 72]

– A location in Manhattan

• [3av 33st]

• A series of daily temperatures

• Samples in an audio signal

• Etc.

[Figure: Manhattan locations as [avenue street] vectors: [2av 4st], [1av 8st], [-2.5av 6st]]

Page 9

Vector norm

• Measure of how long a vector is

– Represented as ||x||

• Geometrically the shortest distance to travel from the origin to the destination

– As the crow flies

– Assuming Euclidean Geometry

||[a b ...]|| = sqrt(a² + b² + ...)

[Figure: the Manhattan vectors [-2av 17st] and [-6av 10st]; the vector (3,4,5), with length = sqrt(3² + 4² + 5²)]

Page 10

Vector Operations: Multiplication by scalar

• Vector multiplication by scalar: each component multiplied by scalar

– 2.5 x (3,4,5) = (7.5, 10, 12.5)

• Note: as a result, vector norm is also multiplied by the scalar

– ||2.5 x (3,4,5)|| = 2.5 x ||(3,4,5)||

[Figure: (3,4,5) stretched to (7.5, 10, 12.5)]

Multiplication by scalar “stretches” the vector

Page 11

Vector Operations: Addition

• Vector addition: individual components add

– (3,4,5) + (3,-2,-3) = (6,2,2)

– Only applicable if both vectors are the same size

[Figure: (3,4,5) + (3,-2,-3) = (6,2,2), shown geometrically]

Page 12

An introduction to spaces

• Conventional notion of “space”: a geometric construct of a certain number of “dimensions”

– E.g. the 3-D space that this room and every object in it lives in

Page 13

A vector space

• A vector space is an infinitely large set of vectors with the following properties

– The set includes the zero vector (of all zeros)

– The set is “closed” under addition: if X and Y are in the set, aX + bY is also in the set for any two scalars a and b

– For every X in the set, the set also includes the additive inverse Y = -X, such that X + Y = 0

Page 14

Additional Properties

• Additional requirements:

– Scalar multiplicative identity element exists: 1X = X

– Addition is commutative: X + Y = Y + X

– Addition is associative: (X+Y)+Z = X+(Y+Z)

– Scalar multiplication is associative: a(bX) = (ab)X

– Scalar multiplication is distributive: (a+b)X = aX + bX and a(X+Y) = aX + aY

Page 15

Example of vector space

• Set of all three-component column vectors

– Note we used the term three-component, rather than three-dimensional

• The set includes the zero vector

• For every X in the set and every α ∈ ℝ, αX is in the set

• For every X, Y in the set, aX + bY is in the set

• -X is in the set

• Etc.

Page 16

Example: a function space

Page 17

Dimension of a space

• Every element in the space can be composed of linear combinations of some other elements in the space

– For any X in S we can write X = aY1 + bY2 + cY3 ... for some other Y1, Y2, Y3 .. in S

• Trivial to prove..

Page 18

Dimension of a space

• What is the smallest subset of elements that can compose the entire set?

– There may be multiple such sets

• The elements in this set are called “bases”

– The set is a “basis” set

• The number of elements in the set is the “dimensionality” of the space

Page 19

Dimensions: Example

• What is the dimensionality of this vector space?

Page 20

Dimensions: Example

• What is the dimensionality of this vector space?

– First confirm this is a proper vector space

• Note: all elements in Z are also in S (slide 19)

– Z is a subspace of S

Page 21

Dimensions: Example

• What is the dimensionality of this space?

Page 22

Moving on….

Page 23

Interpreting Matrices

• Two interpretations of a matrix

• As a transform that modifies vectors and vector spaces

• As a container for data (vectors)

• In the next two classes we’ll focus on the first view

• But we will mostly consider the second view for ML algorithms in our discussions of signal representations

[Examples: a 2x2 matrix [a b; c d] and a 3x5 matrix [a b c d e; f g h i j; k l m n o]]

Page 24

Interpreting Matrices as collections of vectors

• A matrix can be vertical stacking of row vectors

– The space of all vectors that can be composed from the rows of the matrix is the row space of the matrix

• Or a horizontal arrangement of column vectors

– The space of all vectors that can be composed from the columns of the matrix is the column space of the matrix

[Example: R = [a b c; d e f], viewed as two stacked row vectors or as three side-by-side column vectors]

Page 25

Dimensions of a matrix

• The matrix size is specified by the number of rows and columns

– c = 3x1 matrix: 3 rows and 1 column

– r = 1x3 matrix: 1 row and 3 columns

– S = 2 x 2 matrix

– R = 2 x 3 matrix

– Pacman = 321 x 399 matrix

[Examples: c = [a; b; c], r = [a b c], S = [a b; c d], R = [a b c; d e f]]

Page 26

Representing an image as a matrix

• 3 pacmen

• A 321 x 399 matrix

– Row and Column = position

• A 3 x 128079 matrix

– Triples of x,y and value

• A 1 x 128079 vector

– “Unraveling” the matrix

• Note: All of these can be recast as the matrix that forms the image

– Representations 2 and 4 are equivalent

• The position is not represented

[Figure: the pacman image as a matrix of pixel values; as a 3 x 128079 matrix whose rows are Y, X, and value v; and as a 1 x 128079 vector of values only, with X and Y implicit]

Page 27

Basic arithmetic operations

• Addition and subtraction (vectors and matrices)

– Element-wise operations

a + b = [a1; a2; a3] + [b1; b2; b3] = [a1+b1; a2+b2; a3+b3]

a - b = [a1; a2; a3] - [b1; b2; b3] = [a1-b1; a2-b2; a3-b3]

A + B = [a11 a12; a21 a22] + [b11 b12; b21 b22] = [a11+b11 a12+b12; a21+b21 a22+b22]

Page 28

Vector/Matrix Transposition

• A transposed row vector becomes a column (and vice versa)

• A transposed matrix gets all its row (or column) vectors transposed in order

x = [a; b; c], xᵀ = [a b c]

X = [a b c; d e f], Xᵀ = [a d; b e; c f]

y = [a b c], yᵀ = [a; b; c]

[Figure: a pictorial matrix M and its transpose Mᵀ]

Page 29

Vector multiplication

• Multiplication by scalar

• Dot product, or inner product

– Vectors must have the same number of elements

– Row vector times column vector = scalar

• Outer product or vector direct product

– Column vector times row vector = matrix

Multiplication by scalar: [a; b; c] d = [ad; bd; cd]

Inner product: [a b c] [d; e; f] = ad + be + cf

Outer product: [a; b; c] [d e f] = [ad ae af; bd be bf; cd ce cf]
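The three products side by side, as a minimal Octave/MATLAB sketch (the numeric values are invented for illustration):

    a = [1; 2; 3];       % column vector
    d = [4 5 6];         % row vector
    s = 2*a;             % scalar multiplication: [2; 4; 6]
    inner = d*a;         % row times column -> scalar: 4*1 + 5*2 + 6*3 = 32
    outer = a*d;         % column times row -> 3x3 matrix of all pairwise products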

Page 30

Vector dot product

• Example:

– Coordinates are yards, not ave/st

– a = [200 1600],

b = [770 300]

• The dot product of the two vectors relates to the length of a projection

– How much of the first vector have we covered by following the second one?

– Must normalize by the length of the “target” vector

a = [200yd 1600yd], norm ≈ 1612; b = [770yd 300yd], norm ≈ 826

a bᵀ / ||a|| = (200 x 770 + 1600 x 300) / 1612 ≈ 393yd
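The same computation as an Octave/MATLAB sketch:

    a = [200 1600];
    b = [770 300];
    proj = dot(a, b) / norm(a);   % (200*770 + 1600*300) / 1612.45 ≈ 393.2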

Page 31

Vector dot product

• Vectors are spectra

– Energy at a discrete set of frequencies

– Actually 1 x 4096

– X axis is the index of the number in the vector • Represents frequency

– Y axis is the value of the number in the vector • Represents magnitude

[Figure: three note spectra C, E, and C2; x-axis is frequency, y-axis is sqrt(energy)]

Page 32

Vector dot product

• How much of C is also in E?

– How much can you fake a C by playing an E?

– C·E / (|C||E|) = 0.1

– Not very much

• How much of C is in C2?

– C·C2 / (|C||C2|) = 0.5

– Not bad, you can fake it

• To do this, C, E, and C2 must be the same size
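This normalized dot product is the cosine similarity; a hedged Octave/MATLAB sketch with random stand-ins for the actual 1 x 4096 note spectra:

    C = rand(4096,1); E = rand(4096,1);           % stand-ins for note spectra
    similarity = dot(C, E) / (norm(C)*norm(E));   % 1 = same direction, 0 = orthogonal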

Page 33

Vector outer product

• The column vector is the spectrum

• The row vector is an amplitude modulation

• The outer product is a spectrogram

– Shows how the energy in each frequency varies with time

– The pattern in each column is a scaled version of the spectrum

– Each row is a scaled version of the modulation
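A minimal sketch of the outer-product spectrogram (the spectrum and modulation values are invented):

    spectrum = [0; 3; 1; 0; 9; 4; 2; 0];     % column: energy at each frequency
    modulation = [0 .5 1 .75 .5 .25 0];      % row: amplitude over time
    spectrogram = spectrum * modulation;     % 8x7: column k = spectrum scaled by modulation(k)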

Page 34

Multiplying a matrix by a scalar

• Multiplying a matrix by a scalar multiplies every element of the matrix

• Note: bA = Ab

bA = b [a11 a12 a13 a14; a21 a22 a23 a24; a31 a32 a33 a34] = [ba11 ba12 ba13 ba14; ba21 ba22 ba23 ba24; ba31 ba32 ba33 ba34]

Page 35

Multiplying a vector by a matrix

• Multiplying a vector by a matrix transforms the vector

– Dimensions must match!!

• No. of columns of matrix = size of vector

• Result inherits the number of rows from the matrix

A B = [a11 a12 a13 a14; a21 a22 a23 a24; a31 a32 a33 a34] [b1; b2; b3; b4] = [a11b1 + a12b2 + a13b3 + a14b4; a21b1 + a22b2 + a23b3 + a24b4; a31b1 + a32b2 + a33b3 + a34b4]

Page 36

Multiplying a vector by a matrix

• Generalization of vector scaling

– Left multiplication: dot product of each vector pair

– Dimensions must match!!

• No. of columns of matrix = size of vector

• Result inherits the number of rows from the matrix

[Example: the product B A written element by element, each entry a dot product of a vector pair; cf. [a; b; c] d = [ad; bd; cd]]

Page 37

Multiplying a vector by a matrix

• Generalization of vector multiplication

– Right multiplication: Dot product of each vector pair

– Dimensions must match!!

• No. of columns of matrix = size of vector

• Result inherits the number of rows from the matrix

[Example: B A = [b1; b2] a = [b1·a; b2·a]; cf. [a; b; c] d = [ad; bd; cd]]

Page 38

Matrix Multiplication: Column space

• Right multiplication by a matrix transforms a row-space vector to a column-space vector

• It mixes the column vectors of the matrix using the numbers in the vector

• The column space of the matrix is the complete set of all vectors that can be formed by mixing its columns

[a b c; d e f] [x; y; z] = x [a; d] + y [b; e] + z [c; f]
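A small sketch of mixing columns this way (values invented):

    A = [1 2 3; 4 5 6];                       % columns are the vectors being mixed
    v = [10; 20; 30];
    y1 = A*v;                                 % right multiplication
    y2 = 10*A(:,1) + 20*A(:,2) + 30*A(:,3);   % the same mix, column by column
    % y1 and y2 are both [140; 320]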

Page 39

Matrix Multiplication: Row space

• Left multiplication mixes the row vectors of the matrix

– Converts a vector in the column space to one in the row space

• The row space of the matrix is the complete set of all vectors that can be formed by mixing its rows

[x y] [a b c; d e f] = x [a b c] + y [d e f]

Page 40

Multiplication of vector space by matrix

• The matrix rotates and scales the space

– Including its own vectors

Y = [0.3 0.7; 1.3 1.6]

[Figure: the row space and column space transformed by Y]

Page 41

Multiplication of vector space by matrix

• The normals to the row vectors in the matrix become the new axes

– X axis = normal to the second row vector

• Scaled by the inverse of the length of the first row vector

Y = [0.3 0.7; 1.3 1.6]

Page 42

Matrix Multiplication

• The k-th axis corresponds to the normal to the hyperplane represented by the 1..k-1,k+1..N-th row vectors in the matrix

– Any set of K-1 vectors represent a hyperplane of dimension K-1 or less

• The distance along the new axis equals the length of the projection on the k-th row vector

– Expressed in inverse-lengths of the vector

[Example: a 3x3 matrix [a d g; b e h; c f i]]

Page 43

Matrix multiplication: Mixing vectors

• A physical example

– The three column vectors of the matrix X are the spectra of three notes

– The multiplying column vector Y is just a mixing vector

– The result is a sound that is the mixture of the three notes

[Example: X (columns = the spectra of three notes) multiplied by the mixing vector Y = [1; 2; 1] gives the mixed spectrum]

Page 44

Matrix multiplication: Mixing vectors

• Mixing two images

– The images are arranged as columns (position value not included)

– The result of the multiplication is rearranged as an image

[Example: two 200 x 200 images unraveled into a 40000 x 2 matrix, multiplied by a 2 x 1 mixing vector (weights 0.25 and 0.75), giving a 40000 x 1 vector that is rearranged into a 200 x 200 image]
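A sketch of that image mix in Octave/MATLAB (random stand-in images; the weights follow the garbled slide and are assumed to be 0.25 and 0.75):

    I1 = rand(200,200);  I2 = rand(200,200);
    V = [I1(:) I2(:)];                 % 40000 x 2: each image unraveled into a column
    w = [0.25; 0.75];                  % 2 x 1 mixing vector
    mixed = reshape(V*w, 200, 200);    % back to a 200 x 200 image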

Page 45

Multiplying matrices

• Simple vector multiplication: Vector outer product

a b = [a1; a2] [b1 b2] = [a1b1 a1b2; a2b1 a2b2]

Page 46

Multiplying matrices

• Generalization of vector multiplication

– Outer product of dot products!!

– Dimensions must match!!

• Columns of first matrix = rows of second

• Result inherits the number of rows from the first matrix and the number of columns from the second matrix

A B = [a1; a2] [b1 b2] = [a1·b1 a1·b2; a2·b1 a2·b2]  (aᵢ = rows of A, bⱼ = columns of B)

Page 47

Multiplying matrices: Another view

• Simple vector multiplication: Vector inner product

a b = [a1 a2] [b1; b2] = a1b1 + a2b2

Page 48

Matrix multiplication: another view

For an M x N matrix A and an N x K matrix B:

A B = (column 1 of A)(row 1 of B) + (column 2 of A)(row 2 of B) + ... + (column N of A)(row N of B)

The outer product of the first column of A and the first row of B + outer product of the second column of A and the second row of B + ….

Sum of outer products

A B = [a1 a2] [b1; b2] = a1·b1 + a2·b2  (columns of A times rows of B)
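A quick numerical check of the sum-of-outer-products view (values invented):

    A = [1 2; 3 4];  B = [5 6; 7 8];
    P1 = A(:,1)*B(1,:);    % outer product: column 1 of A with row 1 of B
    P2 = A(:,2)*B(2,:);    % outer product: column 2 of A with row 2 of B
    % P1 + P2 equals A*B, i.e. [19 22; 43 50]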

Page 49

Why is that useful?

• Sounds: Three notes modulated independently

[Example: X = the spectra of three notes (columns); Y = three rows of time-varying gains, e.g. (.5 .6 .7 .8 .9 .95 1 ...), (1 .9 .7 .5 0 .5 ...), (0 .5 .75 1 .75 .5 0 ...)]

Page 50

Matrix multiplication: Mixing modulated spectra

• Sounds: Three notes modulated independently

Page 51

Matrix multiplication: Mixing modulated spectra

• Sounds: Three notes modulated independently

Page 52

Matrix multiplication: Mixing modulated spectra

• Sounds: Three notes modulated independently

Page 53

Matrix multiplication: Mixing modulated spectra

• Sounds: Three notes modulated independently

Page 54

Matrix multiplication: Mixing modulated spectra

• Sounds: Three notes modulated independently

Page 55

Matrix multiplication: Image transition

• Image 1 fades out linearly

• Image 2 fades in linearly

[i1 j1; i2 j2; ...] [1 .9 .8 .7 .6 .5 .4 .3 .2 .1 0; 0 .1 .2 .3 .4 .5 .6 .7 .8 .9 1]

(i = image 1 unraveled into a column, j = image 2)

Page 56

Matrix multiplication: Image transition

• Each column is one image

– The columns represent a sequence of images of decreasing intensity

• Image 1 fades out linearly

[i1; i2; ...; iN] [1 .9 .8 ... 0] = [i1 .9i1 .8i1 ... 0; i2 .9i2 .8i2 ... 0; ...; iN .9iN .8iN ... 0]

Page 57

Matrix multiplication: Image transition

• Image 2 fades in linearly

Page 58

Matrix multiplication: Image transition

• Image 1 fades out linearly

• Image 2 fades in linearly

Page 59

Matrix Operations: Properties

• A+B = B+A

– Actual interpretation: for any vector x

• (A+B)x = (B+A)x (column vector x of the right size)

• x(A+B) = x(B+A) (row vector x of the appropriate size)

• A + (B + C) = (A + B) + C

• AB != BA

Page 60

The Space of Matrices

• The set of all matrices of a given size (e.g. all 3x4 matrices) is a space!

– Addition is closed

– Scalar multiplication is closed

– Zero matrix exists

– Matrices have additive inverses

– Associativity and commutativity rules apply!

Page 61

The Identity Matrix

• An identity matrix is a square matrix where

– All diagonal elements are 1.0

– All off-diagonal elements are 0.0

• Multiplication by an identity matrix does not change vectors

Y = [1 0; 0 1]

Page 62

Diagonal Matrix

• All off-diagonal elements are zero

• Diagonal elements are non-zero

• Scales the axes

– May flip axes

Y = [2 0; 0 1]

Page 63

Diagonal matrix to transform images

• How?

Page 64

Stretching

[Example: the 3-row (x; y; value) image matrix multiplied by D = [2 0 0; 0 1 0; 0 0 1]]

• Location-based representation

• Scaling matrix – only scales the X axis

– The Y axis and pixel value are scaled by identity

• Not a good way of scaling.

Page 65

Stretching

Newpic (2x stretch) = A D

• Better way

• Interpolate

D is an N x 2N interpolation matrix with entries like [1.5 0 0 ...; 0.5 1.5 0 ...; 0 0.5 ...; ...], where N is the width of the original image

Page 66

Modifying color

• Scale only Green

Newpic = [1 0 0; 0 2 0; 0 0 1] P, where P = [R; G; B]
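A sketch of the green-only scaling (random stand-in image; the rows of P are the unraveled R, G, B channels):

    img = rand(100, 100, 3);           % an RGB image
    P = reshape(img, [], 3)';          % 3 x 10000: rows are R, G, B
    Newpic = diag([1 2 1]) * P;        % scale only the green row
    out = reshape(Newpic', 100, 100, 3);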

Page 67

Permutation Matrix

• A permutation matrix simply rearranges the axes

– The row entries are axis vectors in a different order

– The result is a combination of rotations and reflections

• The permutation matrix effectively permutes the arrangement of the elements in a vector

[0 1 0; 0 0 1; 1 0 0] [x; y; z] = [y; z; x]

[Figure: the point (3,4,5) maps to (4,5,3); the new X is the old Y, the new Y is the old Z, the new Z is the old X]
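The same permutation as a one-line check:

    P = [0 1 0; 0 0 1; 1 0 0];
    P*[3; 4; 5]      % gives [4; 5; 3]: new x = old y, new y = old z, new z = old x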

Page 68

Permutation Matrix

• Reflections and 90 degree rotations of images and objects

P = [0 1 0; 1 0 0; 0 0 1]   or   P = [0 1 0; 0 0 1; 1 0 0]

(applied to the 3-row (x; y; value) matrix of pixel positions and values)

Page 69

Permutation Matrix

• Reflections and 90 degree rotations of images and objects

– Object represented as a matrix of 3-Dimensional “position” vectors

– Positions identify each point on the surface

P = [0 1 0; 1 0 0; 0 0 1]   or   P = [0 1 0; 0 0 1; 1 0 0]

applied to the position matrix [x1 x2 ... xN; y1 y2 ... yN; z1 z2 ... zN]

Page 70

Rotation Matrix

• A rotation matrix rotates the vector by some angle θ

• Alternately viewed, it rotates the axes

– The new axes are at an angle θ to the old ones

R = [cos θ  -sin θ; sin θ  cos θ]

X = [x; y],  X_new = R X

x' = x cos θ - y sin θ
y' = x sin θ + y cos θ

[Figure: the point (x,y) rotated by θ to (x', y')]
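A 45-degree instance of R (matching the rotated pacman on the next slide):

    theta = 45*pi/180;
    R = [cos(theta) -sin(theta); sin(theta) cos(theta)];
    R*[1; 0]     % rotates (1,0) to (0.7071, 0.7071)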

Page 71

Rotating a picture

• Note the representation: 3-row matrix

– Rotation only applies to the “coordinate” rows

– The value does not change

– Why is pacman grainy?

R = [cos 45° -sin 45° 0; sin 45° cos 45° 0; 0 0 1], applied to the 3-row (x; y; value) image matrix

Page 72

3-D Rotation

• 2 degrees of freedom

– 2 separate angles

• What will the rotation matrix be?

[Figure: axes X, Y, Z rotated by angles θ and α to Xnew, Ynew, Znew]

Page 73

Projections

• What would we see if the cone to the left were transparent and we looked at it from above the plane shown by the grid?

– Normal to the plane

– Answer: the figure to the right

• How do we get this? Projection

Page 74

Projections

• Actual problem: for each vector

– What is the corresponding vector on the plane that is “closest approximation” to it?

– What is the transform that converts the vector to its approximation on the plane?

[Figure: each vector maps to its closest point on the plane; the projection is at 90 degrees]

Page 75

Projections

• Arithmetically: Find the matrix P such that

– For every vector X, PX lies on the plane

• The plane is the column space of P

– ||X – PX||² is the smallest possible

[Figure: the projection PX and the error X – PX, which is at 90 degrees to the plane]

Page 76

Projection Matrix

• Consider any set of independent vectors W1, W2 … on the plane

– Arranged as a matrix [W1 W2 ..]

• The plane is the column space of the matrix

– Any vector can be projected onto this plane

– The matrix P that rotates and scales the vector so that it becomes its projection is a projection matrix

[Figure: the plane spanned by W1 and W2, with a 90-degree projection onto it]

Page 77

Projection Matrix

• Given a set of vectors W1, W2 … which form a matrix W = [W1 W2.. ]

• The projection matrix to transform a vector X to its projection on the plane is

– P = W (WTW)-1 WT

• We will visit matrix inversion shortly

• Magic – any set of independent vectors from the same plane that are expressed as a matrix will give you the same projection matrix

– P = V (VTV)-1 VT
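A minimal numerical sketch of the formula (the plane and the vectors here are invented):

    W = [1 0; 0 1; 0 0];        % W1, W2: two independent vectors on the z = 0 plane
    P = W*inv(W'*W)*W';         % P = W (W'W)^-1 W'
    X = [3; 4; 5];
    PX = P*X;                   % [3; 4; 0]: the closest point on the plane
    % P*P equals P, so P*PX equals PX (see the idempotence property below)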

Page 78

Projections

• HOW?

Page 79

Projections

• Draw any two vectors W1 and W2 that lie on the plane

– ANY two, so long as they have different angles

• Compose a matrix W = [W1 W2.. ]

• Compose the projection matrix P = W (WTW)-1 WT

• Multiply every point on the cone by P to get its projection

Page 80

Projections

• The projection actually projects it onto the plane, but you’re still seeing the plane in 3D

– The result of the projection is a 3-D vector

– P = W (WTW)-1 WT = 3x3, PX = 3x1

– The image must be rotated till the plane is in the plane of the paper

• The Z axis in this case will always be zero and can be ignored

• How will you rotate it? (remember you know W1 and W2 )

Page 81

Projection matrix properties

• The projection of any vector that is already on the plane is the vector itself

– PX = X if X is on the plane

– If the object is already on the plane, there is no further projection to be performed

• The projection of a projection is the projection

– P(PX) = PX

• Projection matrices are idempotent

– P² = P

Page 82

Projections: A more physical meaning

• Let W1, W2 .. Wk be “bases”

• We want to explain our data in terms of these “bases”

– We often cannot do so

– But we can explain a significant portion of it

• The portion of the data that can be expressed in terms of our vectors W1, W2, .. Wk, is the projection of the data on the W1 .. Wk (hyper) plane

– In our previous example, the “data” were all the points on a cone, and the bases were vectors on the plane

Page 83

Projection : an example with sounds

• The spectrogram (matrix) of a piece of music

How much of the above music was composed of the above notes, i.e. how much of it can be explained by the notes?

Page 84

Projection: one note

• The spectrogram (matrix) of a piece of music

M = spectrogram; W = note

P = W (WTW)-1 WT

Projected spectrogram = P * M

[Figure: the spectrogram M and the single-note spectrum W]

Page 85

Projection: one note – cleaned up

• The spectrogram (matrix) of a piece of music

Floored all matrix values below a threshold to zero

[Figure: the cleaned-up one-note projection, with M and W as before]

Page 86

Projection: multiple notes

• The spectrogram (matrix) of a piece of music

P = W (WTW)-1 WT

Projected spectrogram = P * M

[Figure: the spectrogram M and the multi-note matrix W]

Page 87

Projection: multiple notes, cleaned up

• The spectrogram (matrix) of a piece of music

P = W (WTW)-1 WT

Projected spectrogram = P * M

[Figure: the multi-note projection, cleaned up]

Page 88

Projection and Least Squares

• Projection actually computes a least squared error estimate

• For each vector V in the music spectrogram matrix

– Approximation: Vapprox = a*note1 + b*note2 + c*note3 + ...

– Error vector E = V – Vapprox

– Squared error energy for V: e(V) = norm(E)²

– Total error = sum over all V of e(V) = Σ_V e(V)

• Projection computes Vapprox for all vectors such that the total error is minimized

– It does not give you “a”, “b”, “c” .. though

• That needs a different operation – the inverse / pseudo-inverse

Vapprox = [note1 note2 note3] [a; b; c]
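A hedged sketch of recovering the weights with the pseudo-inverse (W and V here are random stand-ins for the note and spectrogram matrices):

    W = rand(4096, 3);      % columns: note1, note2, note3
    V = rand(4096, 1);      % one spectrogram column
    abc = pinv(W)*V;        % least-squares weights [a; b; c]
    Vapprox = W*abc;        % the projection of V onto the column space of W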