structure from motion

31
Structure from Motion Structure from Motion

Upload: ursa-deleon

Post on 31-Dec-2015

40 views

Category:

Documents


4 download

DESCRIPTION

Structure from Motion. Structure from Motion. For now, static scene and moving camera Equivalently, rigidly moving scene and static camera Limiting case of stereo with many cameras Limiting case of multiview camera calibration with unknown target - PowerPoint PPT Presentation

TRANSCRIPT

Structure from MotionStructure from Motion

Structure from MotionStructure from Motion

• For now, static scene and moving cameraFor now, static scene and moving camera– Equivalently, rigidly moving scene andEquivalently, rigidly moving scene and

static camerastatic camera

• Limiting case of stereo with many Limiting case of stereo with many camerascameras

• Limiting case of multiview camera Limiting case of multiview camera calibration with unknown targetcalibration with unknown target

• Given Given nn points and points and NN camera positions, camera positions, have 2have 2nNnN equations and 3 equations and 3nn+6+6NN unknowns unknowns

ApproachesApproaches

• Obtaining point correspondencesObtaining point correspondences– Optical flowOptical flow

– Stereo methods: correlation, feature Stereo methods: correlation, feature matchingmatching

• Solving for points and camera motionSolving for points and camera motion– Nonlinear minimization (bundle adjustment)Nonlinear minimization (bundle adjustment)

– Various approximations…Various approximations…

OrthographicOrthographic ApproximationApproximation

• Simplest SFM case: camera Simplest SFM case: camera approximated by orthographic approximated by orthographic projectionprojection

PerspectivePerspective OrthographicOrthographic

Weak PerspectiveWeak Perspective

• An orthographic assumption is An orthographic assumption is sometimes well approximated by a sometimes well approximated by a telephoto lenstelephoto lens

Weak PerspectiveWeak Perspective

Consequences ofConsequences ofOrthographic ProjectionOrthographic Projection

• Scene can be recovered up to scaleScene can be recovered up to scale

• Translation perpendicular to image Translation perpendicular to image planeplanecan never be recoveredcan never be recovered

Orthographic Structure from Orthographic Structure from MotionMotion

• Method due to Tomasi & Kanade, 1992Method due to Tomasi & Kanade, 1992

• Assume Assume nn points in space points in space pp11 … … ppnn

• Observed at Observed at NN points in time at image points in time at image coordinates (coordinates (xxijij, , yyijij), ), ii = 1: = 1:NN, , jj=1:=1:nn

– Feature tracking, optical flow, etc.Feature tracking, optical flow, etc.

Orthographic Structure from Orthographic Structure from MotionMotion

• Write down matrix of dataWrite down matrix of data

NnN

n

NnN

n

yy

yy

xx

xx

1

111

1

111

D

NnN

n

NnN

n

yy

yy

xx

xx

1

111

1

111

D

Points Points Fra

mes

Fram

es

Orthographic Structure from Orthographic Structure from MotionMotion

• Step 1: find translationStep 1: find translation

• Translation parallel to viewingTranslation parallel to viewingdirection can not be obtaineddirection can not be obtained

• Translation perpendicular to viewing Translation perpendicular to viewing direction equals motion of average direction equals motion of average position of all pointsposition of all points

Orthographic Structure from Orthographic Structure from MotionMotion

• Subtract average of each rowSubtract average of each row

NNnNN

n

NNnNN

n

yyyy

yyyy

xxxx

xxxx

1

11111

1

11111

~D

NNnNN

n

NNnNN

n

yyyy

yyyy

xxxx

xxxx

1

11111

1

11111

~D

Orthographic Structure from Orthographic Structure from MotionMotion

• Step 2: try to find rotationStep 2: try to find rotation

• Rotation at each frame defines local Rotation at each frame defines local coordinate axes , , andcoordinate axes , , and

• ThenThen

ii jj kk

jiijjiij yx pjpi ~ˆ~,~ˆ~ jiijjiij yx pjpi ~ˆ~,~ˆ~

Orthographic Structure from Orthographic Structure from MotionMotion

• So, can write where R is a So, can write where R is a “rotation” matrix and S is a “shape” “rotation” matrix and S is a “shape” matrixmatrix

RSD ~RSD ~

n

N

N ppS

j

j

i

i

R ~~

ˆ

ˆ

ˆ

ˆ

1

T

T1

T

T1

n

N

N ppS

j

j

i

i

R ~~

ˆ

ˆ

ˆ

ˆ

1

T

T1

T

T1

Orthographic Structure from Orthographic Structure from MotionMotion

• Goal is to factor Goal is to factor

• Before we do, observe that Before we do, observe that rankrank( ) = 3( ) = 3(in ideal case with no noise)(in ideal case with no noise)

• Proof:Proof:– Rank of Rank of RR is 3 unless no rotation is 3 unless no rotation

– Rank of Rank of SS is 3 iff have noncoplanar points is 3 iff have noncoplanar points

– Product of 2 matrices of rank 3 has rank 3Product of 2 matrices of rank 3 has rank 3

• With noise, With noise, rankrank( ) might be > 3( ) might be > 3

D~D~

D~D~

D~D~

SVDSVD

• Goal is to factor into Goal is to factor into RR and and SS

• Apply SVD:Apply SVD:

• But should have rank 3 But should have rank 3 all but 3 of the all but 3 of the wwii should be 0 should be 0

• Extract the top 3 Extract the top 3 wwii, together with the , together with the

corresponding columns of corresponding columns of UU and and VV

D~D~

T~UWVD T~UWVD

D~D~

Factoring for Factoring for Orthographic Structure from Orthographic Structure from

MotionMotion• After extracting columns, After extracting columns, UU33 has has

dimensions 2dimensions 2NN3 (just what we wanted 3 (just what we wanted for for RR))

• WW33VV33TT has dimensions 3 has dimensions 3nn (just what (just what

we wanted for we wanted for SS))

• So, let So, let RR**==UU33, , SS**==WW33VV33TT

Affine Structure from MotionAffine Structure from Motion

• The The ii and and jj entries of entries of RR** are not, in general, are not, in general,

unit length and perpendicularunit length and perpendicular

• We have found motion (and therefore We have found motion (and therefore shape)shape)up to an affine transformationup to an affine transformation

• This is the best we could do if we didn’tThis is the best we could do if we didn’tassume orthographic cameraassume orthographic camera

Ensuring OrthogonalityEnsuring Orthogonality

• Since can be factored as Since can be factored as RR** SS**, it can , it can also be factored as (also be factored as (RR**QQ)()(QQ-1-1SS**), for any ), for any QQ

• So, search for So, search for QQ such that such that RR == RR** QQ has has the properties we wantthe properties we want

D~D~

Ensuring OrthogonalityEnsuring Orthogonality

• Want orWant or

• Let Let TT = = QQQQTT

• Equations for elements of Equations for elements of TT – solve by – solve byleast squaresleast squares

• Ambiguity – add constraints Ambiguity – add constraints

1ˆˆ T*T* QiQi ii 1ˆˆ T*T* QiQi ii

0ˆˆ

1ˆˆ

1ˆˆ

*TT*

*TT*

*TT*

ii

ii

ii

jQQi

jQQj

iQQi

0ˆˆ

1ˆˆ

1ˆˆ

*TT*

*TT*

*TT*

ii

ii

ii

jQQi

jQQj

iQQi

0

1

0ˆ,

0

0

1ˆ *

1T*

1T jQiQ

0

1

0ˆ,

0

0

1ˆ *

1T*

1T jQiQ

Ensuring OrthogonalityEnsuring Orthogonality

• Have found Have found TT = = QQQQTT

• Find Find Q Q by taking “square root” of by taking “square root” of TT– Cholesky decomposition if Cholesky decomposition if TT is positive is positive

definitedefinite

– General algorithms (e.g. General algorithms (e.g. sqrtmsqrtm in Matlab) in Matlab)

Orthogonal Structure from MotionOrthogonal Structure from Motion

• Let’s recap:Let’s recap:– Write down matrix of observationsWrite down matrix of observations

– Find translation from avg. positionFind translation from avg. position

– Subtract translationSubtract translation

– Factor matrix using SVDFactor matrix using SVD

– Write down equations for orthogonalizationWrite down equations for orthogonalization

– Solve using least squares, square rootSolve using least squares, square root

• At end, get matrix At end, get matrix RR == RR** QQ of camera positions of camera positionsand matrix and matrix SS == QQ-1-1SS** of 3D points of 3D points

ResultsResults

• Image sequenceImage sequence

[Tomasi & Kanade][Tomasi & Kanade]

ResultsResults

• Tracked featuresTracked features

[Tomasi & Kanade][Tomasi & Kanade]

ResultsResults

• Reconstructed shapeReconstructed shape

[Tomasi & Kanade][Tomasi & Kanade]

Front viewFront viewTop viewTop view

Orthographic Orthographic Perspective Perspective

• With orthographic or “weak With orthographic or “weak perspective” can’t recover all perspective” can’t recover all informationinformation

• With full perspective, can recover more With full perspective, can recover more information (translation along optical information (translation along optical axis)axis)

• Result: can recover geometry and full Result: can recover geometry and full motion up to global scale factormotion up to global scale factor

Perspective SFM MethodsPerspective SFM Methods

• Bundle adjustment (full nonlinear Bundle adjustment (full nonlinear minimization)minimization)

• Methods based on factorizationMethods based on factorization

• Methods based on fundamental Methods based on fundamental matricesmatrices

• Methods based on vanishing pointsMethods based on vanishing points

Motion Field for Camera MotionMotion Field for Camera Motion

• Translation:Translation:

• Motion field lines converge (possibly at Motion field lines converge (possibly at ))

Motion Field for Camera MotionMotion Field for Camera Motion

• Rotation:Rotation:

• Motion field lines do not convergeMotion field lines do not converge

Motion Field for Camera MotionMotion Field for Camera Motion

• Combined rotation and translation:Combined rotation and translation:motion field lines have component that motion field lines have component that converges, and component that does notconverges, and component that does not

• Algorithms can look for vanishing point,Algorithms can look for vanishing point,then determine component of motion then determine component of motion around this pointaround this point

• ““Focus of expansion / contraction”Focus of expansion / contraction”

• ““Instantaneous epipole”Instantaneous epipole”

Finding Instantaneous EpipoleFinding Instantaneous Epipole

• Observation: motion field due to Observation: motion field due to translation depends on depth of pointstranslation depends on depth of points

• Motion field due to rotation does notMotion field due to rotation does not

• Idea: compute Idea: compute differencedifference between between motion of a point, motion of neighborsmotion of a point, motion of neighbors

• Differences should point towards Differences should point towards instantaneous epipoleinstantaneous epipole

SVD (Again!)SVD (Again!)

• Want to fit direction to all Want to fit direction to all v (differences in v (differences in optical flow) within some neighborhoodoptical flow) within some neighborhood

• PCA on matrix of PCA on matrix of vv

• Equivalently, take eigenvector of Equivalently, take eigenvector of AA = = ((v)v)((v)v)T T corresponding to largest eigenvaluecorresponding to largest eigenvalue

• Gives direction of parallax Gives direction of parallax llii in that patch, in that patch,

together with estimate of reliabilitytogether with estimate of reliability

SFM AlgorithmSFM Algorithm

• Compute optical flowCompute optical flow

• Find vanishing point (least squares solution)Find vanishing point (least squares solution)

• Find direction of translation from epipoleFind direction of translation from epipole

• Find perpendicular component of motionFind perpendicular component of motion

• Find velocity, axis of rotationFind velocity, axis of rotation

• Find depths of points (up to global scale)Find depths of points (up to global scale)