ECE-C453 Image Processing Architecture

Image Processing Architecture, © 2001-2004 Oleh Tretiak Page 1 Lecture 6 ECE-C453 Image Processing Architecture Lecture 6, 2/3/04 Lossy Video Coding Ideas Technology of DCT and Motion Estimation Oleh Tretiak Drexel University


TRANSCRIPT

Page 1: ECE-C453 Image Processing Architecture

ECE-C453 Image Processing Architecture

Lecture 6, 2/3/04

Lossy Video Coding Ideas
Technology of DCT and Motion Estimation

Oleh Tretiak
Drexel University

Page 2: ECE-C453 Image Processing Architecture

Decorrelation Ideas

Orthogonal transforms (KLT, DCT)
  Main method for intra-frame coding
Wavelet
  New stuff (JPEG 2000)
Predictive coding
  Simple
  Used for inter-frame coding (video)

Page 3: ECE-C453 Image Processing Architecture

Lossy Predictive Coding

How to decorrelate?
  Predict values
  Block coding (DFT)
  Wavelet
Predictive (sample-based, feedback) encoder: Differential Pulse Code Modulation (DPCM)

[Figure: DPCM encoder and decoder block diagrams. Encoder: quantizer and predictor in a feedback loop, with input $x_i$, prediction $p_i$, quantized error $q_i$, and reconstruction $\hat{x}_i$. Decoder: adder and predictor producing $\hat{x}_i$ from $q_i$ and $p_i$.]
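A minimal sketch of the DPCM loop in the diagram, assuming the simplest possible predictor (the previous reconstructed sample) and a uniform quantizer with step `step`. Both choices, the zero initial state, and the function names are illustrative assumptions, not taken from the lecture.

```python
def dpcm_encode(samples, step):
    """Return quantizer indices for the prediction errors.

    Assumed predictor: p_i = previous reconstructed sample (0 at start).
    """
    codes = []
    prediction = 0  # assumed initial predictor state, shared with the decoder
    for x in samples:
        error = x - prediction            # x_i - p_i
        index = round(error / step)       # uniform quantizer (index of q_i)
        codes.append(index)
        prediction += index * step        # reconstruct exactly as the decoder will
    return codes


def dpcm_decode(codes, step):
    """Rebuild the samples: x_hat_i = p_i + q_i, with p_i = x_hat_(i-1)."""
    reconstruction = []
    prediction = 0
    for index in codes:
        x_hat = prediction + index * step
        reconstruction.append(x_hat)
        prediction = x_hat
    return reconstruction


samples = [100, 102, 105, 103, 110]
codes = dpcm_encode(samples, step=4)
decoded = dpcm_decode(codes, step=4)
```

Because the encoder predicts from reconstructed values rather than from the original samples, the quantization error does not accumulate: each decoded sample stays within half a step of the original.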

Page 4: ECE-C453 Image Processing Architecture

Review: Image Decorrelation

$x = (x_1, x_2, \ldots, x_n)$, a sequence of image gray values
Preprocess: convert to $y = (y_1, y_2, \ldots, y_n)$, $y = Ax$, where $A$ is an orthogonal matrix ($A^{-1} = A^T$)
Theoretical best (for a Gaussian process): $A$ is the Karhunen-Loeve transformation matrix
  Images are not Gaussian processes
  The Karhunen-Loeve matrix is image-dependent and computationally expensive to find
  Evaluating $y = Ax$ with the K-L transformation is computationally expensive
In practice, we use the DCT (discrete cosine transform) for decorrelation
  Computationally efficient
  Almost as good as the K-L transformation

Page 5: ECE-C453 Image Processing Architecture

Review: Block-Based Coding

Full-image DCT: one set of decorrelated coefficients for the whole image
Block-based coding:
  Image divided into 'small' blocks
  Each block is decorrelated separately
Block decorrelation performs almost as well as (better than?) full-image decorrelation
Current standards (JPEG, MPEG) use 8x8 DCT blocks

Page 6: ECE-C453 Image Processing Architecture

Rate-Distortion: 1D vs. 2D coding

Theory on the tradeoff between distortion and the least number of bits
Interesting tradeoff only if samples are correlated
"Water-filling" construction to compute R(d)

Page 7: ECE-C453 Image Processing Architecture

Wavelet Transform

Filterbank and wavelets
2-D wavelets
Wavelet pyramid

Page 8: ECE-C453 Image Processing Architecture

Filterbank Pyramid

[Figure: three cascaded low-pass/high-pass (L/H) filter stages applied to the input x(i); the labeled sample counts are 1000, 500, 250, 125, 125.]

Page 9: ECE-C453 Image Processing Architecture

Lena: Top Level, next level

[Figure: Lena decomposition, top level and next level, annotated with the values 1.01, 0.37, 2.52, 48.81, 9.23, 15.45, 6.48.]

Page 10: ECE-C453 Image Processing Architecture

This Lecture

Idea
  Video coding by pixel prediction
  Motion estimation
Technology: DCT, and how much it costs
Technology: motion estimation algorithms

Page 11: ECE-C453 Image Processing Architecture

Video Coding

Video: sequence of images
Reasons for changes between successive images
  Edits
  Camera pan, zoom
  Intra-frame motion
  Intra-frame texture
  Noise
Model: successive images are similar
Video coding uses inter-frame redundancy to achieve lossy compression

Page 12: ECE-C453 Image Processing Architecture

Predicting sequential images

[Figure: frames f(t-1) and f(t), and the difference image f(t) - f(t-1).]

Page 13: ECE-C453 Image Processing Architecture

Motion Compensation

Macroblock size
  M x N
Matching criterion
  MAE (mean absolute error)
Search window
  ±p pixel locations
Search algorithm
  Full search
  Logarithmic search
  Parallel Hierarchical One-Dimensional Search
  Pixel subsampling and projection
  Hierarchical downsampling

Page 14: ECE-C453 Image Processing Architecture

Motion Estimation Methods

[Figure: four panels labeled "No compensation", "Full search", "Logarithmic search", and "3-level hierarchical".]

Page 15: ECE-C453 Image Processing Architecture

DCT Technology

DCT formula
How it works
  DCT plus quantization
DCT implementations and cost
  Direct
  Separable
  Fast
  Refinements

Page 16: ECE-C453 Image Processing Architecture

What is the DCT?

One-dimensional 8-point DCT: input $x_0, \ldots, x_7$, output $y_0, \ldots, y_7$

$$y_k = \frac{c(k)}{2} \sum_{i=0}^{7} x_i \cos\left(\frac{(2i+1)k\pi}{16}\right), \quad k = 0, 1, \ldots, 7, \qquad c(k) = \begin{cases} 1/\sqrt{2} & k = 0 \\ 1 & \text{otherwise} \end{cases}$$

One-dimensional inverse DCT: input $y_0, \ldots, y_7$, output $x_0, \ldots, x_7$

$$x_k = \sum_{i=0}^{7} y_i \, \frac{c(i)}{2} \cos\left(\frac{(2k+1)i\pi}{16}\right), \quad k = 0, 1, \ldots, 7$$

Matrix form of equations: x, y are one-column matrices

$$y = Tx, \qquad x = T^T y, \qquad t_{ki} = \frac{c(k)}{2} \cos\left(\frac{(2i+1)k\pi}{16}\right)$$
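The matrix form can be checked numerically. The sketch below builds the 8x8 matrix T from the formula for $t_{ki}$ and confirms that the transpose inverts it; the helper names and the sample gray values are illustrative assumptions.

```python
import math


def dct8_matrix():
    """T[k][i] = (c(k)/2) * cos((2i+1) k pi / 16), with c(0) = 1/sqrt(2)."""
    rows = []
    for k in range(8):
        c = 1 / math.sqrt(2) if k == 0 else 1.0
        rows.append([0.5 * c * math.cos((2 * i + 1) * k * math.pi / 16)
                     for i in range(8)])
    return rows


def matvec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


T = dct8_matrix()
x = [136, 142, 148, 161, 178, 191, 191, 188]  # sample gray values
y = matvec(T, x)                    # forward DCT, y = T x
x_back = matvec(transpose(T), y)    # inverse DCT, x = T^T y
```

The first output `y[0]` is the sum of the inputs divided by $\sqrt{8}$, i.e. a scaled average; the remaining outputs measure progressively faster variation across the eight samples.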

Page 17: ECE-C453 Image Processing Architecture

Two-Dimensional DCT

Forward 2-D DCT: input $x_{ij}$, $i = 0, \ldots, 7$, $j = 0, \ldots, 7$; output $y_{kl}$, $k = 0, \ldots, 7$, $l = 0, \ldots, 7$

$$y_{kl} = \frac{c(k)\,c(l)}{4} \sum_{i=0}^{7} \sum_{j=0}^{7} x_{ij} \cos\left(\frac{(2i+1)k\pi}{16}\right) \cos\left(\frac{(2j+1)l\pi}{16}\right), \qquad c(k) = \begin{cases} 1/\sqrt{2} & k = 0 \\ 1 & \text{otherwise} \end{cases}$$

Matrix form: X, Y are 8x8 matrices with coefficients $x_{ij}$, $y_{kl}$

$$Y = T X T^T, \qquad X = T^T Y T, \qquad t_{ki} = \frac{c(k)}{2} \cos\left(\frac{(2i+1)k\pi}{16}\right)$$

The 2-D DCT is separable!
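A sketch of the separable form $Y = TXT^T$ in plain Python, using as test data the 8x8 'smooth' Lena block from the lecture's DCT example. The helper names are assumptions made for illustration.

```python
import math


def dct8_matrix():
    """T[k][i] = (c(k)/2) * cos((2i+1) k pi / 16), with c(0) = 1/sqrt(2)."""
    rows = []
    for k in range(8):
        c = 1 / math.sqrt(2) if k == 0 else 1.0
        rows.append([0.5 * c * math.cos((2 * i + 1) * k * math.pi / 16)
                     for i in range(8)])
    return rows


def matmul(A, B):
    return [[sum(A[r][m] * B[m][c] for m in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


def dct2(X):
    """Forward 2-D DCT: Z = T X (1-D DCT of every column), then Y = Z T^T."""
    T = dct8_matrix()
    return matmul(matmul(T, X), transpose(T))


def idct2(Y):
    """Inverse 2-D DCT: X = T^T Y T."""
    T = dct8_matrix()
    return matmul(matmul(transpose(T), Y), T)


SMOOTH_BLOCK = [
    [136, 142, 148, 161, 178, 191, 191, 188],
    [138, 145, 149, 164, 179, 190, 191, 187],
    [131, 143, 150, 158, 175, 191, 185, 188],
    [135, 139, 149, 162, 179, 190, 185, 184],
    [135, 147, 148, 164, 188, 194, 193, 192],
    [138, 143, 155, 167, 188, 195, 190, 190],
    [143, 144, 155, 169, 185, 189, 186, 189],
    [139, 150, 157, 175, 189, 190, 188, 192],
]

Y = dct2(SMOOTH_BLOCK)
X_back = idct2(Y)
```

The DC term `Y[0][0]` equals the pixel sum divided by 8, which for this block rounds to the 1349 shown in the lecture's example.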

Page 18: ECE-C453 Image Processing Architecture

General DCT

One dimension

$$y(k) = \sum_{i=0}^{N-1} t(k,i)\, x(i), \quad k = 0, 1, \ldots, N-1$$

$$t(k,i) = \begin{cases} \sqrt{1/N} & k = 0 \\ \sqrt{2/N}\, \cos\left(\dfrac{(2i+1)k\pi}{2N}\right) & k \neq 0 \end{cases}$$

Two dimensions

$$y(k,l) = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} x(i,j)\, t(k,i)\, t(l,j)$$

Page 19: ECE-C453 Image Processing Architecture


See 06IPA.xls

Page 20: ECE-C453 Image Processing Architecture

Computational Complexity

1-D DCT
  N input and output samples: ~ $N^2 = 64$ operations (additions + multiplications) for N = 8
2-D DCT, direct implementation
  $M = N^2$ input values, M output values -> $M^2 = N^4$ operations
2-D DCT, separable implementation
  $Y = TXT^T = ZT^T$, where $Z = TX$; all matrices are N x N -> $2N^3$ operations
For N = 8
  2-D DCT direct: 4096 operations, 64 operations per pixel
  2-D DCT separable: 1024 operations, 16 operations per pixel
Big savings due to the separable transform
Inverse DCT: same story

Page 21: ECE-C453 Image Processing Architecture

DCT: Encoding in JPEG, MPEG

Take 8x8 blocks of pixels
Subtract range mean value
Compute 8x8 DCT
Quantize the DCT coefficients
  Typically, many of the samples are equal to zero
  Lossless entropy coding of the quantized samples
  A different quantization step is used for different DCT coefficients
$y_{kl}$: DCT coefficients, $q_{kl}$: quantizer steps, $z_{kl}$: quantized values

$$z_{kl} = \mathrm{round}\left(\frac{y_{kl}}{q_{kl}}\right)$$
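A small sketch of the quantization rule, applied to the first row of DCT coefficients from the lecture's 'smooth' Lena example with the first row of the JPEG quantization step table. Rounding halves away from zero is an assumption (the slide only says "round"), and the function name is illustrative.

```python
import math


def quantize(y, q):
    """z = round(y / q); halves round away from zero (assumed rounding rule)."""
    return int(math.copysign(math.floor(abs(y) / q + 0.5), y))


coeffs = [1349, -160, -34, 20, -2, -11, 1, -2]   # first row of DCT coefficients
steps = [16, 11, 10, 16, 24, 40, 51, 61]         # first row of quantizer steps

z = [quantize(y, q) for y, q in zip(coeffs, steps)]
dequantized = [zi * qi for zi, qi in zip(z, steps)]   # what the decoder sees
```

Here `z` is `[84, -15, -3, 1, 0, 0, 0, 0]`, and multiplying back by the steps gives `[1344, -165, -30, 16, 0, 0, 0, 0]`, which matches the first row of the "DCT, quantized" matrix in the lecture's example. Half of the coefficients in the row have become zero.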

Page 22: ECE-C453 Image Processing Architecture

DCT: Example

Data from Lena, 'smooth' area. RMS error = 3.5

Original
  136  142  148  161  178  191  191  188
  138  145  149  164  179  190  191  187
  131  143  150  158  175  191  185  188
  135  139  149  162  179  190  185  184
  135  147  148  164  188  194  193  192
  138  143  155  167  188  195  190  190
  143  144  155  169  185  189  186  189
  139  150  157  175  189  190  188  192

DCT
 1349 -160  -34   20   -2  -11    1   -2
  -16   -5    8    4   -8    2    0   -2
    6    5    1   -4    3    3   -2   -1
    6   -4    0    5    1    1   -1    3
    1   -2   -1    1    1   -1   -3   -1
   -9    3    1    0    1    3    5    1
   -1   -3   -1   -2   -3   -3    2    0
    0   -2    3   -1   -1   -4   -2   -1

DCT, quantized
 1344 -165  -30   16    0    0    0    0
  -12    0   14    0    0    0    0    0
    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0

Reconstructed
  138  140  148  160  175  186  190  190
  138  141  148  161  176  186  190  190
  137  141  149  163  177  187  190  190
  137  141  150  165  179  188  191  190
  137  141  152  167  181  190  191  189
  136  142  153  169  183  191  191  189
  136  142  154  170  185  192  191  188
  136  142  154  171  185  192  192  188

Page 23: ECE-C453 Image Processing Architecture

DCT example

Data from Lena, 'busy' area. RMS error = 7.3

Original
  188  181  155  149  179  117   86   96
  168  179  168  174  180  111   86   95
  150  166  175  189  165  100   88   97
  163  165  179  184  134   90   91   96
  170  180  178  144  102   86   91   98
  175  174  141  104   85   83   88   96
  153  134  105   83   84   87   92   96
  117  104   86   80   85   91   92  103

DCT
 1016  216   -7  -27   29  -21  -11    8
  136   53  -93   -7   34  -19  -11   11
  -45  -49   14   54   11  -25    0    8
    9   38   48   16  -18  -11    4    4
   -1   -6   -1   -5    0    7    5    0
   -4   -1    3    8    7    6    0    1
   -3   -2    1   -1    0   -3   -1   -1
   -1   -3   -1   -2   -4   -1    2    2

DCT, quantized
 1024  220  -10  -32   24  -40    0    0
  132   48  -98    0   26    0    0    0
  -42  -52   16   48    0    0    0    0
   14   34   44   29    0    0    0    0
    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0
    0    0    0    0    0    0    0    0

Reconstructed
  192  176  156  162  169  131   91   91
  165  171  170  176  168  121   86   94
  144  171  186  185  161  107   80   98
  153  179  186  172  142   94   77  100
  177  185  168  140  117   87   78  101
  178  173  139  106   97   85   83  102
  147  144  111   84   86   86   87  105
  114  118   95   75   83   88   90  107

Page 24: ECE-C453 Image Processing Architecture

Overview: DCT coding

Transformation decorrelates samples
Transformed samples are quantized; the quantization step depends on the coefficient
  Degree of compression and loss can be changed by scaling the quantization steps
Many quantized samples are zero -> run-length coding
At the receiver, perform inverse DCT
Many calculations!

JPEG standard quantization steps
   16   11   10   16   24   40   51   61
   12   12   14   19   26   58   60   55
   14   13   16   24   40   57   69   56
   14   17   22   29   51   87   80   62
   18   22   37   56   68  109  103   77
   24   35   55   64   81  104  113   92
   49   64   78   87  103  121  120  101
   72   92   95   98  112  100  103   99

Page 25: ECE-C453 Image Processing Architecture

Speeding up the DCT

Separable transform: basic speedup
Fast DCT transform: like FFT
Further speedup through scaled DCT

Page 26: ECE-C453 Image Processing Architecture

Optimized (fast) DCT

[Figure: 1-D Chen DCT flow diagram with inputs x0 ... x7 and outputs y0 ... y7. Dashed lines indicate subtraction; separate symbols mark multiplication by a constant and multiplication by 0.5 (shift).]

Characteristics of optimized DCT algorithms

DCT or IDCT method           Multiplications     Additions
                             1-D     2-D         1-D     2-D
1-D Chen                     16      256         26      416
1-D Lee                      12      192         29      464
1-D Loeffler, Ligtenberg     11      176         29      464
2-D Kamangar, Rao             -      128          -      430
2-D Cho, Lee                  -       96          -      466

Page 27: ECE-C453 Image Processing Architecture

DCT Complexity

Direct DCT computation:
  64 DCT values, each requires 64 multiplications and additions -> 4096 multiply-accumulate (MA) operations per block
Separable algorithm (operate on rows, then on columns) -> 16 one-dimensional 8-point DCT operations -> 1024 MA operations
Fast implementation: ~ $N \log_2 N$ operations per 1-D transform, ~ 16 x 24 = 384 MA operations
Special methods: many operations involve multiplication by 1 or -1; take advantage of this!

Page 28: ECE-C453 Image Processing Architecture

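Fast Scaled DCT

Butterfly at the last stage of the DCT, plus the following quantizer:

$$q_1 = \frac{1}{D_1}(b\,t_2 + a\,t_1), \qquad q_2 = \frac{1}{D_2}(b\,t_1 - a\,t_2)$$

[Figure: butterfly with inputs t1, t2, multipliers a and b, and outputs y1, y2, followed by quantizer scalings 1/D1 and 1/D2 giving q1, q2.]

Folding the quantizer scaling into the butterfly:

$$q_1 = \frac{a}{D_1}\left(\frac{b}{a}\,t_2 + t_1\right) = d_1 (b_1 t_2 + t_1)$$

$$q_2 = \frac{a}{D_2}\left(\frac{b}{a}\,t_1 - t_2\right) = d_2 (b_2 t_2 + t_1)$$

Matching terms gives $d_1 = a/D_1$, $b_1 = b/a$, $d_2 = b/D_2$, $b_2 = -a/b$.

[Figure: scaled butterfly with inputs t1, t2, multipliers b1 and b2, and output scalings d1 and d2 giving q1, q2.]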
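A numeric check of the scaled butterfly, as a sketch. The constants $d_1 = a/D_1$, $b_1 = b/a$, $d_2 = b/D_2$, $b_2 = -a/b$ are obtained by matching terms in the two forms of the equations; the numeric values of a, b, D1, D2 and the inputs are arbitrary test values, and rounding of q1, q2 is left out.

```python
def butterfly_then_scale(t1, t2, a, b, D1, D2):
    """Unscaled form: four multiplications in the butterfly, then two scalings."""
    q1 = (b * t2 + a * t1) / D1
    q2 = (b * t1 - a * t2) / D2
    return q1, q2


def scaled_butterfly(t1, t2, a, b, D1, D2):
    """Scaled form: two multiplications in the butterfly, then two scalings.

    The folded constants would be precomputed once per coefficient position;
    they are derived here only to keep the example self-contained.
    """
    d1, b1 = a / D1, b / a
    d2, b2 = b / D2, -a / b
    q1 = d1 * (b1 * t2 + t1)
    q2 = d2 * (b2 * t2 + t1)
    return q1, q2


# Arbitrary test values (assumptions for illustration only)
args = dict(t1=100.0, t2=-40.0, a=0.9239, b=0.3827, D1=16.0, D2=11.0)
reference = butterfly_then_scale(**args)
folded = scaled_butterfly(**args)
```

Both functions return the same pair up to floating-point error. The saving comes from the butterfly itself: the scalings $d_1$, $d_2$ replace the quantizer divisions that were needed anyway.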

Page 29: ECE-C453 Image Processing Architecture

DCT refinements

Multiply-accumulate architectures
  Basic operation is a = bc + d, well suited for DCT
Super-scalar architectures
  Multi-register, multi-ALU processors
  Perform several operations in parallel

Complexity of scaled DCT algorithms, excluding quantization

DCT or IDCT method           Multiplications     Additions
                             1-D     2-D         1-D     2-D
1-D Chen                      8      128         26      416
1-D Winograd                  5       80         29      464
1-D Lee                      11      176         29      464
2-D Kamangar, Rao             -       92          -      430
2-D Feig, Winograd            -       54          -      462

Page 30: ECE-C453 Image Processing Architecture

Motion Estimation

Architecture of motion estimation
Algorithms and costs
  Full search
  Logarithmic search
  PHODS
  Downsample, projection
  Hierarchical motion estimation
  Other criteria
  Multi-image estimation

Page 31: ECE-C453 Image Processing Architecture

Baseline Models

Previous frame predicts current frame
  $I(x, y, t) = I(x, y, t-1) + e(x, y, t)$
  Not effective in the presence of motion (zoom, pan, etc.)
Prediction to account for motion:
  $I(x, y, t) = I(x+u, y+v, t-1) + e(x, y, t)$
  (u, v): motion (displacement) vector
Model works (somewhat) for pan, not for other motion
Compromise: compute independent motion estimates for rectangular image regions (macroblocks)
Macroblocks are, in general, bigger than DCT blocks

Page 32: ECE-C453 Image Processing Architecture

Generic Encoder - simplified

[Figure: encoder block diagram with Motion Estimation, Motion Compensation, DCT coding, and Transmit blocks. Inputs are I(x, y, t) and I(x, y, t-1); motion estimation yields the motion vector (u, v); the coded error is e(x, y, t) = I(x, y, t) - I(x-u, y-v, t-1).]

Page 33: ECE-C453 Image Processing Architecture

Generic Decoder

[Figure: decoder block diagram with Receive, DCT decoding, Motion Compensation, and Frame Memory blocks. The DCT codes are decoded to e(x, y, t); the motion vector (u, v) and the frame memory give I(x-u, y-v, t-1); their sum is I(x, y, t).]

Page 34: ECE-C453 Image Processing Architecture

Motion Compensation

Macroblock size
  M x N
Matching criterion
  MAE (mean absolute error)
Search window
  ±p pixel locations
Search algorithm
  Full search
  Logarithmic search
  Parallel Hierarchical One-Dimensional Search
  Pixel subsampling and projection
  Hierarchical downsampling

Page 35: ECE-C453 Image Processing Architecture

Motion Estimation Terminology

Issues:
  Size of macroblock
  Size of search region
In video coding standards, M = N = 16

[Figure: an M x N macroblock in the current picture and its best match inside the search region of the reference (previous) picture, offset by the motion vector (u, v); the search region extends ±p around the macroblock position.]

Page 36: ECE-C453 Image Processing Architecture

Matching Criterion

Matching criterion: what produces the fewest coded bits for the error image
  Coding for each value of the motion vector (u, v) is too time-consuming (expensive)
In practice, mean absolute error (MAE) is most popular
C: current image, R: reference image, (x, y): macroblock origin

$$\mathrm{MAE}(i, j) = \frac{1}{MN} \sum_{k=0}^{M-1} \sum_{l=0}^{N-1} \left| C(x+k,\, y+l) - R(x+i+k,\, y+j+l) \right|, \qquad -p \le i \le p, \quad -p \le j \le p$$
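A sketch of the MAE criterion used in an exhaustive search over the ±p window. Images are assumed to be lists of rows indexed as `image[x][y]`, candidates that fall outside the reference picture are skipped, and the function names and the synthetic ramp test image are illustrative assumptions.

```python
def mae(cur, ref, x, y, i, j, M, N):
    """Mean absolute error between the macroblock at (x, y) in the current
    image and the block displaced by (i, j) in the reference image."""
    total = 0
    for k in range(M):
        for l in range(N):
            total += abs(cur[x + k][y + l] - ref[x + i + k][y + j + l])
    return total / (M * N)


def full_search(cur, ref, x, y, M, N, p):
    """Try every displacement in [-p, p] x [-p, p]; return (error, i, j)."""
    rows, cols = len(ref), len(ref[0])
    best = None
    for i in range(-p, p + 1):
        for j in range(-p, p + 1):
            inside = (0 <= x + i and x + i + M <= rows and
                      0 <= y + j and y + j + N <= cols)
            if not inside:
                continue  # candidate block leaves the reference picture
            err = mae(cur, ref, x, y, i, j, M, N)
            if best is None or err < best[0]:
                best = (err, i, j)
    return best


# Synthetic test: the reference is a ramp, the current image is the
# reference shifted so that the true displacement is (2, -1).
SIZE = 16
ref = [[r * SIZE + c for c in range(SIZE)] for r in range(SIZE)]
cur = [[ref[r + 2][c - 1] if r + 2 < SIZE and c >= 1 else 0
        for c in range(SIZE)] for r in range(SIZE)]

result = full_search(cur, ref, x=4, y=4, M=4, N=4, p=3)
```

For this synthetic pair `result` is `(0.0, 2, -1)`: the search recovers the shift exactly because the ramp makes every other displacement strictly worse.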

Page 37: ECE-C453 Image Processing Architecture

Full-Search Method

Compute MAE for $(2p+1)^2$ values of (i, j)
Each location requires 3MN operations
Picture dimensions I x J, F pictures per second
  $3IJF(2p+1)^2$ operations per second
  I = 720, J = 480, F = 30, p = 15 -> 30 GOPS
Guaranteed to find the best (MAE) displacement
How to do it?
  Special computers
  Smaller p
  Faster (suboptimal) algorithm
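The 30 GOPS figure can be reproduced step by step, assuming the picture is tiled by non-overlapping M x N macroblocks so that there are IJ/(MN) of them per picture:

```latex
\begin{align*}
\text{operations per second}
  &= \underbrace{\frac{IJ}{MN}}_{\text{macroblocks per picture}}
     \cdot \underbrace{3MN\,(2p+1)^2}_{\text{operations per macroblock}}
     \cdot F
   = 3\,IJF\,(2p+1)^2 \\
  &= 3 \cdot 720 \cdot 480 \cdot 30 \cdot 31^2
   = 31\,104\,000 \cdot 961
   \approx 2.99 \times 10^{10}
   \approx 30~\text{GOPS}
\end{align*}
```

The macroblock size cancels, so the rate depends only on the picture size, the frame rate, and the search range p.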

Page 38: ECE-C453 Image Processing Architecture

Logarithmic Search (1D)

Goal: find minimum over u in [-p, p]
First step: evaluate at -p/2, 0, p/2 (interval ~ p)
Next step: choose interval of length p/2 around the minimum (2 more evaluations)
Continue until the interval length is equal to 2. This takes $k = \lceil \log_2 p \rceil$ iterations
Example: p = 7 (search range from -7 to 7)
  Evaluate at -4, 0, 4 -> minimum at -4
  Evaluate at -6, (-4), -2 -> minimum at -2
  Evaluate at -3, (-2), -1 -> minimum at -3. Done!
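A sketch of the 1-D logarithmic search. The initial step is taken as $2^{k-1}$ with $k = \lceil \log_2 p \rceil$, which gives 4 for p = 7 as in the example; this choice, the clamping to [-p, p], and the toy cost function are assumptions made for illustration.

```python
import math


def log_search_1d(cost, p):
    """Return (best_u, number_of_cost_evaluations) for u in [-p, p]."""
    k = max(1, math.ceil(math.log2(p)))   # number of iterations
    step = 2 ** (k - 1)                   # assumed initial step
    center = 0
    cache = {}

    def evaluate(u):
        if u not in cache:                # each position is costed only once
            cache[u] = cost(u)
        return cache[u]

    while step >= 1:
        candidates = [u for u in (center - step, center, center + step)
                      if -p <= u <= p]
        center = min(candidates, key=evaluate)
        step //= 2                        # halve the interval each iteration
    return center, len(cache)


# Toy cost whose minimum over the integers is at u = -3
best_u, evaluations = log_search_1d(lambda u: abs(u + 2.6), p=7)
```

With this cost the search visits -4, 0, 4, then -6 and -2, then -3 and -1, reproducing the sequence in the example: `best_u` is -3 after 7 evaluations, compared with 15 for an exhaustive scan of [-7, 7].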

Page 39: ECE-C453 Image Processing Architecture

Logarithmic Search - 2D

First stage requires 3x3 = 9 evaluations
Subsequent stages require 8 evaluations
$k = \lceil \log_2 p \rceil$ stages (iterations)
Rate = 3IJF(8k+1)
  p = 15, I = 720, J = 480, F = 30 -> 1 GOPS
Can fail to find the minimum
Bottom line: faster method, more error than full search
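A sketch of the 2-D version, under the same assumptions as a 1-D search with initial step $2^{k-1}$: each stage examines the 3x3 grid around the current best point and halves the step. The function name and the toy cost are illustrative.

```python
import math


def log_search_2d(cost, p):
    """Return ((u, v), number_of_cost_evaluations) within [-p, p] x [-p, p]."""
    k = max(1, math.ceil(math.log2(p)))   # number of stages
    step = 2 ** (k - 1)                   # assumed initial step
    center = (0, 0)
    cache = {}

    def evaluate(d):
        if d not in cache:                # the stage center is never re-costed
            cache[d] = cost(*d)
        return cache[d]

    while step >= 1:
        cu, cv = center
        candidates = [(cu + du, cv + dv)
                      for du in (-step, 0, step)
                      for dv in (-step, 0, step)
                      if abs(cu + du) <= p and abs(cv + dv) <= p]
        center = min(candidates, key=evaluate)
        step //= 2
    return center, len(cache)


# Toy convex cost with its minimum at (5, -3)
best, evaluations = log_search_2d(lambda u, v: (u - 5) ** 2 + (v + 3) ** 2, p=7)
```

For p = 7 this costs 9 + 8 + 8 = 25 = 8k + 1 evaluations, against 225 for full search. The toy cost is convex, so the search succeeds; on real error surfaces with several local minima it can settle on the wrong one, which is the failure mode the slide warns about.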

Page 40: ECE-C453 Image Processing Architecture

PHODS

Parallel Hierarchical One-Dimensional Search
~Twice as fast as logarithmic
Less reliable

[Figure: search region with corners (-7, -7), (-7, 7), (7, -7), (7, 7) and origin (0, 0); search steps are color-coded 1st blue, 2nd green, 3rd red, with the horizontal minimum (Min H) and vertical minimum (Min V) marked.]

Page 41: ECE-C453 Image Processing Architecture

Other Fast Methods

Subsample (do not use all points in macroblock)
Projection: row and column projection of pixels, followed by 1-D search
Hierarchical motion estimation
  Downsample reference image and current image
  Perform low-resolution search
  Refine

Page 42: ECE-C453 Image Processing Architecture

Hierarchical Search

Prepare downsampled versions of current and reference images
  Full: macroblock 16x16
  Down 2: macroblock 8x8
  Down 4: macroblock 4x4
Full search in Down 4 reference image
  16x speedup, smaller macroblock
  16x speedup, fewer displacement vectors
  p = ±16, p' = ±4
Around the point of best match, do a local search in the Down 2 reference image (3x3 search zone)
Repeat for the Full reference image (3x3 search zone)

[Figure: image pyramid with levels Full, Down 2, Down 4.]

Page 43: ECE-C453 Image Processing Architecture

Motion Estimation Methods

[Figure: four panels labeled "No compensation", "Full search", "Logarithmic search", and "3-level hierarchical".]

Page 44: ECE-C453 Image Processing Architecture

Comparison

Search method    Operations per macroblock                        Operations for video, 720x480 at 30 fps (GOPS)
                                                                  p = 15      p = 7
Full search      $3(2p+1)^2 NM$                                   29.89       6.99
Logarithmic      $3(8\lceil \log_2 p \rceil + 1) NM$              1.02        0.78
PHODS            $3(4\lceil \log_2 p \rceil + 1) NM$              0.53        0.40
Hierarchical     $3[(2\lceil p/4 \rceil + 1)^2 + 180] NM / 16$    0.51        0.40
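The GOPS figures follow from the per-macroblock formulas once they are scaled by the pixel rate. The sketch below recomputes them for p = 15 and p = 7, assigning the $8\lceil \log_2 p \rceil + 1$ formula to logarithmic search (as on the 2-D logarithmic search slide) and the $4\lceil \log_2 p \rceil + 1$ formula to PHODS; the function names are assumptions.

```python
import math

PIXEL_RATE = 720 * 480 * 30   # pixels per second, 720x480 video at 30 fps


def positions(method, p):
    """Effective number of search positions per macroblock
    (each position costs 3NM operations)."""
    k = math.ceil(math.log2(p))
    if method == "full":
        return (2 * p + 1) ** 2
    if method == "logarithmic":
        return 8 * k + 1
    if method == "phods":
        return 4 * k + 1
    if method == "hierarchical":
        return ((2 * math.ceil(p / 4) + 1) ** 2 + 180) / 16
    raise ValueError(f"unknown method: {method}")


def gops(method, p):
    """Operations per second in units of 10^9; NM cancels against the
    number of macroblocks per picture."""
    return 3 * positions(method, p) * PIXEL_RATE / 1e9


table = {method: (gops(method, 15), gops(method, 7))
         for method in ("full", "logarithmic", "phods", "hierarchical")}
```

The computed values agree with the table to within about 0.01 GOPS; the small differences look like rounding versus truncation in the original figures.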

Page 45: ECE-C453 Image Processing Architecture

More Speedup

Simpler comparison criteria
  Binarize difference, count pixels that do not match
    PDC (Pixel Difference Classification)
  Binarize current and reference
    BPROP (count matching pixels)
    DPC (count different pixels)
    BMP (operations done on bitplanes)
Produce 3-25 fold speedup

Page 46: ECE-C453 Image Processing Architecture

Big Picture on Speedup

Speedup methods are less accurate
  Same bit rate, lower SNR
  Same SNR, higher bit rate
  Binary criteria lose about 0.5 dB
Suppose we have adequate computing power? Can we do better?
  Sub-pixel motion estimation
    First find best match with pixel accuracy in displacement vectors
    Interpolate images for half-pixel shifts

Page 47: ECE-C453 Image Processing Architecture

Multipicture Motion Estimation

Estimate on basis of past and future
  Non-sequential image transmission
  More chances to find a good match
  More calculations

[Figure: pictures at times t-n, t, and t+m on x, y, t axes, with blocks labeled A, B, and X.]

Page 48: ECE-C453 Image Processing Architecture

Video Compression - Summary

Video: sequence of images
Can use intraframe compression
  Motion JPEG
Interframe compression offers great potential for savings
  No motion compensation: lower compression
  Motion compensation: greater compression
All video standards provide for motion compensation
  Compensation done on macroblocks, multiple motion vectors per image
Tradeoff between computing requirement and image quality