Introduction to VP8
Description: A VP8 introduction without much content
郭至軒 (KuoE0)
Latest update: Jun 13, 2013
Attribution-ShareAlike 3.0 Unported (CC BY-SA 3.0)
http://creativecommons.org/licenses/by-sa/3.0/
Situation
WebM
Video Codec
VP8
An Open Source Codec
Developed by On2 Technologies
Acquired by Google in February 2010
Patent
WebM
Royalty-free terms, March 2013
Successor
VP9, May 15, 2013
Feature
Focus on Internet and web-based applications
Low Bandwidth Requirement
Image Quality:
watchable (PSNR: ~30dB)
visually lossless (PSNR: ~45dB)
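The PSNR figures above can be made concrete with a small calculation. This is a minimal sketch, assuming 8-bit samples (peak value 255) and flat lists of pixel values; the function name `psnr` is chosen here for illustration and does not come from the slides.

```python
import math

def psnr(original, decoded, peak=255):
    """Return the PSNR in dB between two equal-length lists of samples."""
    if len(original) != len(decoded):
        raise ValueError("inputs must have the same length")
    # Mean squared error between the two signals.
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)

# An error of 8 levels on every sample lands near the "watchable" range.
example = psnr([100, 100, 100, 100], [108, 108, 108, 108])
```

With this definition, a uniform error of 8 levels gives roughly 30 dB, and smaller errors push the value toward the "visually lossless" range.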
Heterogeneous Client Hardware
Efficient Implementations
Web Video Format
YUV 4:2:0 color sampling
8 bits per channel
Up to 16383 × 16383 pixels
Processing Flow
Coding: Predict → Transform + Quantize → Entropy Code → Loop Filter
Decoding: Entropy Decode → Predict → Dequantize + Inverse Transform → Loop Filter
Reference Frame
Golden Frame, Last Frame, Alternate Frame
At most 3 reference frames in VP8.
Last Frame
[Figure: the Last Frame, followed by the Current Frame]
Golden Frame
Choose an arbitrary frame in the past.
Define a number of flags to notify the decoder when and how to update this buffer.
[Figure: a frame is set as the golden frame; labels: Reconstruct, moving object, background]
Alternate Frame
Alternate frame: decoded but not shown.
Other frames: decoded and shown.
Store beneficial information.
Construct from multiple frames.
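The three reference buffers and the decode/show distinction can be sketched as a tiny data structure. This is only an illustration: the class name, method name, and flag names (`show`, `refresh_last`, `refresh_golden`, `refresh_alternate`) are hypothetical and are not the actual VP8 bitstream syntax.

```python
class ReferenceBuffers:
    """Three reference slots: last, golden, and alternate."""

    def __init__(self):
        self.last = None
        self.golden = None
        self.alternate = None

    def store(self, frame, show=True, refresh_last=True,
              refresh_golden=False, refresh_alternate=False):
        """Update the selected buffers; return the frame only if it is shown."""
        if refresh_last:
            self.last = frame
        if refresh_golden:
            self.golden = frame
        if refresh_alternate:
            self.alternate = frame
        return frame if show else None

refs = ReferenceBuffers()
shown = [
    # Shown and also kept as the golden frame.
    refs.store("frame0", refresh_golden=True),
    # Decoded but not shown: only fills the alternate buffer.
    refs.store("altref", show=False, refresh_last=False,
               refresh_alternate=True),
    # Ordinary frame: shown, replaces the last frame.
    refs.store("frame1"),
]
```

The per-frame flags are what let the encoder keep a useful older frame around (golden) or ship a hidden helper frame (alternate) without displaying it.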
Typical Frame
Typical codec: I B B P B B P B B I B B P
VP8: [Figure: a frame sequence whose frames are labeled L (last), G (golden), and A (alternate)]
Prediction
Intra Prediction: uses data within a single video frame
Inter Prediction: uses data from previously encoded frames
Intra Prediction
Luma: 16×16 and 4×4 blocks
Chroma: 8×8 blocks
Four Prediction Modes:
H_PRED (horizontal prediction)
V_PRED (vertical prediction)
DC_PRED (DC prediction)
TM_PRED (TrueMotion prediction)
Horizontal Prediction
Fills each column of the block with a copy of the left column.
Left neighboring block:
a b c d e
f g h i j
k l m n o
p q r s t
u v w x y
Above neighboring block:
A B C D E
F G H I J
K L M N O
P Q R S T
U V W X Y
Left column (rightmost column of the left neighboring block): e j o t y
Predicted block:
e e e e e
j j j j j
o o o o o
t t t t t
y y y y y
Vertical Prediction
Fills each row of the block with a copy of the above row.
Above row (bottom row of the above neighboring block): U V W X Y
Predicted block:
U V W X Y
U V W X Y
U V W X Y
U V W X Y
U V W X Y
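The horizontal and vertical modes can be sketched in a few lines. This is an illustrative sketch, not decoder code: it works on the letter labels from the slides (left column e j o t y, above row U V W X Y), and the function names `h_pred` and `v_pred` are chosen here.

```python
def h_pred(left_column):
    """Copy the left column across: row r is filled with left_column[r]."""
    n = len(left_column)
    return [[pixel] * n for pixel in left_column]

def v_pred(above_row):
    """Copy the above row downward: every row equals above_row."""
    n = len(above_row)
    return [list(above_row) for _ in range(n)]

left = list("ejoty")    # rightmost column of the left neighboring block
above = list("UVWXY")   # bottom row of the above neighboring block

horizontal = h_pred(left)
vertical = v_pred(above)
```

Both modes only read pixels that the decoder has already reconstructed, which is why the encoder and decoder can form the same prediction.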
DC Prediction
Fills the block with a single value using the average of the pixels in the above row and the left column.
Above row: U V W X Y
Left column: e j o t y
Z = (U + V + W + X + Y + e + j + o + t + y) ÷ 10
Predicted block:
Z Z Z Z Z
Z Z Z Z Z
Z Z Z Z Z
Z Z Z Z Z
Z Z Z Z Z
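The DC formula above translates directly into code. This is a minimal sketch following the 5×5 example on the slide; the round-to-nearest integer division is an assumption made here, and real VP8 blocks have power-of-two sizes, so an actual decoder would not divide by 10.

```python
def dc_pred(above_row, left_column):
    """Fill an n×n block with the rounded average of the 2n neighbor pixels."""
    n = len(above_row)
    if len(left_column) != n:
        raise ValueError("above row and left column must have the same length")
    total = sum(above_row) + sum(left_column)
    # Integer division with rounding to nearest (an assumption for this sketch).
    z = (total + n) // (2 * n)
    return [[z] * n for _ in range(n)]

# Above row of 10s and left column of 20s average to 15.
block = dc_pred([10] * 5, [20] * 5)
```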
TrueMotion Prediction
Horizontal differences between pixels in above row and vertical differences between pixels in left column are propagated (starting from C).
C is the pixel above and to the left of the block, A0 to A4 are the pixels in the above row, and L0 to L4 are the pixels in the left column:
C   A0  A1  A2  A3  A4
L0  *   *   *   *   *
L1  *   *   *   *   *
L2  *   *   *   *   *
L3  *   *   *   *   *
L4  *   *   *   *   *
Xij = Ai + Lj - C
Each pixel Xij of the predicted block (column i, row j) is computed with this formula.
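The TrueMotion formula Xij = Ai + Lj - C can be sketched as follows. The clamp to the 8-bit range is not on the slide; it is added here because the sum can leave [0, 255], and the VP8 format specification (RFC 6386) describes the same clamping. The function name `tm_pred` is chosen for illustration.

```python
def tm_pred(above_row, left_column, corner):
    """Return the block with X[j][i] = A[i] + L[j] - C, clamped to 0..255."""
    def clamp(value):
        return max(0, min(255, value))
    return [[clamp(a + l - corner) for a in above_row] for l in left_column]

above = [100, 110, 120, 130]   # A0..A3
left = [90, 95, 100, 105]      # L0..L3
corner = 100                   # C, the pixel above and to the left

block = tm_pred(above, left, corner)
```

Each row repeats the horizontal gradient of the above row, shifted by how far that row's left pixel is from the corner pixel.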
Inter Prediction
As mentioned above, the reference frames are the Golden Frame, the Last Frame, and the Alternate Frame.
Motion Vector
Reusing vectors from neighboring macroblocks.
Flexible partitioning of a macroblock into sub-blocks.
Sub-pixel Interpolation
Quarter pixel accurate motion vectors for luma pixels.
High performance six-tap interpolation filters:
[3, -16, 77, 77, -16, 3]/128 for 1⁄2 pixel positions
[2, -11, 108, 36, -8, 1]/128 for 1⁄4 pixel positions
[1, -8, 36, 108, -11, 2]/128 for 3⁄4 pixel positions
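The six-tap filters can be applied to one row of pixels as sketched below. The tap values are the ones listed above; the window placement (two pixels to the left, three to the right of the sample position), the +64 rounding, and the clamp are assumptions of this sketch, and the names `FILTERS` and `interpolate` are illustrative.

```python
FILTERS = {
    "1/2": (3, -16, 77, 77, -16, 3),
    "1/4": (2, -11, 108, 36, -8, 1),
    "3/4": (1, -8, 36, 108, -11, 2),
}

def interpolate(pixels, pos, phase):
    """Estimate the sample between pixels[pos] and pixels[pos + 1]."""
    if pos < 2 or pos + 3 >= len(pixels):
        raise ValueError("need two pixels before pos and three after it")
    taps = FILTERS[phase]
    window = pixels[pos - 2:pos + 4]
    acc = sum(t * p for t, p in zip(taps, window))
    # Taps sum to 128, so add half of 128 and shift by 7 to round and normalize.
    return max(0, min(255, (acc + 64) >> 7))

ramp = [0, 10, 20, 30, 40, 50]
half = interpolate(ramp, 2, "1/2")   # between 20 and 30
flat = interpolate([50] * 6, 2, "1/4")
```

On a linear ramp the half-pixel result lands at the midpoint of the two neighbors, and a flat row is left unchanged because the taps sum to 128.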
Hybrid Transform & Quantization
Divide into Macroblocks (Typical Method)
One 16×16 block of luma pixels (Y)
Two 8×8 blocks of chroma pixels (U, V)
Divide into Blocks (VP8 Method)
All blocks of luma and chroma are 4×4 blocks
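Splitting a macroblock into 4×4 blocks can be sketched as below. This is only an illustration of the block counts (16 luma blocks plus 4 blocks for each chroma plane); the function name `split_into_4x4` and the raster ordering are assumptions made here.

```python
def split_into_4x4(plane):
    """Split a square plane (list of rows) into 4×4 blocks in raster order."""
    size = len(plane)
    blocks = []
    for top in range(0, size, 4):
        for left in range(0, size, 4):
            blocks.append([row[left:left + 4] for row in plane[top:top + 4]])
    return blocks

# Fill each plane with its column index so blocks are easy to tell apart.
luma = [[x for x in range(16)] for _ in range(16)]
chroma_u = [[x for x in range(8)] for _ in range(8)]
chroma_v = [[x for x in range(8)] for _ in range(8)]

luma_blocks = split_into_4x4(luma)
total_blocks = (len(luma_blocks)
                + len(split_into_4x4(chroma_u))
                + len(split_into_4x4(chroma_v)))
```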
Discrete Cosine Transform
Fast implementation
Slightly worse in energy compaction than KLT
Content-independent
Coding: 2-D DCT
Decoding: 4×4 variant of LLM implementation
LLM: practical fast 1-D DCT algorithms with 11 multiplications
[Figure: Inverse DCT graph in VP8, with inputs I1 to I4 and outputs O1 to O4]
Butterfly with inputs x0, x1 and outputs y0, y1:
y0 = √2(x0×sin(π/8) - x1×cos(π/8))
y1 = √2(x0×cos(π/8) + x1×sin(π/8))
H.264/AVC uses a multiplication-less integer transform.
Energy compaction is slightly better than H.264/AVC.
It is efficient on processors with SIMD capability.
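Energy compaction and the butterfly equations can be checked numerically. The sketch below uses a textbook floating-point orthonormal 4×4 DCT-II, not VP8's integer implementation or the LLM factorization; all names (`C`, `dct2d`, `idct2d`, `rotate`) are chosen here for illustration.

```python
import math

N = 4

# Orthonormal DCT-II matrix: row k holds the k-th cosine basis vector.
C = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
      * math.cos((2 * n + 1) * k * math.pi / (2 * N))
      for n in range(N)]
     for k in range(N)]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def dct2d(block):
    """Forward 2-D DCT: transform the columns, then the rows (C X C^T)."""
    return matmul(matmul(C, block), transpose(C))

def idct2d(coeffs):
    """Inverse 2-D DCT (C^T Y C); exact up to rounding since C is orthonormal."""
    return matmul(matmul(transpose(C), coeffs), C)

def rotate(x0, x1):
    """The butterfly from the slide: a rotation scaled by sqrt(2)."""
    s, c = math.sin(math.pi / 8), math.cos(math.pi / 8)
    return (math.sqrt(2) * (x0 * s - x1 * c),
            math.sqrt(2) * (x0 * c + x1 * s))

flat = [[10] * N for _ in range(N)]
coeffs = dct2d(flat)   # all energy ends up in coeffs[0][0]
```

A flat block produces a single nonzero (DC) coefficient, which is the energy compaction that makes the later quantization and entropy coding stages effective.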
Walsh-Hadamard Transform
Y = H X Hᵀ
H =
[ 1  1  1  1 ]
[ 1  1 -1 -1 ]
[ 1 -1  1 -1 ]
[ 1 -1 -1  1 ]
Hᵀ is the transpose of H.
Take advantage of the correlation to reduce redundancy.
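The Walsh-Hadamard transform Y = H X Hᵀ can be tried on a small example. This is a plain sketch of the matrix form given above, with no scaling or rounding; the helper names and the inverse (which relies on H·H = 4I for this H) are written here for illustration. As far as I understand the VP8 format, the input X is the 4×4 array of DC coefficients from the luma blocks of one macroblock.

```python
H = [
    [1,  1,  1,  1],
    [1,  1, -1, -1],
    [1, -1,  1, -1],
    [1, -1, -1,  1],
]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def wht(x):
    """Forward transform Y = H X H^T."""
    return matmul(matmul(H, x), transpose(H))

def inverse_wht(y):
    """Inverse transform X = H^T Y H / 16, valid because H H^T = 4 I."""
    product = matmul(matmul(transpose(H), y), H)
    return [[value // 16 for value in row] for row in product]

dc_values = [[3] * 4 for _ in range(4)]   # highly correlated DC coefficients
y = wht(dc_values)
```

When the 16 inputs are similar, almost everything collapses into y[0][0], which is the redundancy reduction the slide refers to.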
Adaptive Quantization
128 quantization levels.
Different quantization levels within a single frame.
Separate quantizers for: 1st order luma DC, 1st order luma AC, 2nd order luma DC, 2nd order luma AC, chroma DC, chroma AC
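Using separate DC and AC quantizers can be sketched as follows. The step sizes below are made-up values for illustration, not entries from VP8's quantizer tables, and the truncating division is a simplification assumed here.

```python
def quantize(coeffs, dc_step, ac_step):
    """Divide the first (DC) coefficient by dc_step and the rest by ac_step."""
    levels = []
    for index, value in enumerate(coeffs):
        step = dc_step if index == 0 else ac_step
        levels.append(int(value / step))  # truncates toward zero
    return levels

def dequantize(levels, dc_step, ac_step):
    """Multiply each level back by the step size used for its position."""
    return [level * (dc_step if index == 0 else ac_step)
            for index, level in enumerate(levels)]

coeffs = [100, -37, 5, 0]                       # hypothetical coefficients
levels = quantize(coeffs, dc_step=8, ac_step=10)
restored = dequantize(levels, dc_step=8, ac_step=10)
```

The difference between `coeffs` and `restored` is the quantization loss; larger steps discard more detail but leave more zeros for the entropy coder.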
Entropy Coding
Supports distribution updates on a per-frame basis
Boolean arithmetic coder
Stable probability distributions within one frame
Keyframes reset the probability values to the defaults
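Why the probability distributions matter can be seen from the ideal cost of coding one boolean. This is not a bool coder implementation; it is a sketch of the information-theoretic cost, assuming an 8-bit probability where P(0) = prob_zero / 256. The function name is chosen for illustration.

```python
import math

def bool_cost_bits(value, prob_zero):
    """Ideal cost in bits of coding one boolean when P(0) = prob_zero / 256."""
    if not 1 <= prob_zero <= 255:
        raise ValueError("prob_zero must be in 1..255")
    p_zero = prob_zero / 256
    probability = p_zero if value == 0 else 1 - p_zero
    return -math.log2(probability)

even_cost = bool_cost_bits(0, 128)      # no skew: one full bit
likely_cost = bool_cost_bits(0, 240)    # the expected value is cheap
unlikely_cost = bool_cost_bits(1, 240)  # the surprising value is expensive
```

When the probabilities match the data, most booleans cost well under one bit, which is the motivation for updating the distributions per frame.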
Adaptive Loop Filter
Removing blocking artifacts introduced by quantization and transformation.
Slight Filtering
Strong Filtering
No Filtering
Parallel Processing
Data Partition
Compressed data is split into two parts: macroblock coding modes and motion vectors, and transform coefficients.
More Transform Coefficient Partition
Transform coefficients support up to 8 token partitions.
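How token partitions enable parallel decoding can be sketched as below. The round-robin assignment of macroblock rows to partitions and the allowed counts (1, 2, 4, or 8) reflect my understanding of the VP8 format specification (RFC 6386), not the slides; the function names are illustrative.

```python
def token_partition(mb_row, num_partitions):
    """Return the token partition holding the coefficients of a macroblock row."""
    if num_partitions not in (1, 2, 4, 8):
        raise ValueError("expected 1, 2, 4, or 8 token partitions")
    return mb_row % num_partitions

def rows_per_partition(frame_height, num_partitions):
    """Group the macroblock rows of a frame by their token partition."""
    mb_rows = (frame_height + 15) // 16   # macroblocks are 16 pixels tall
    groups = [[] for _ in range(num_partitions)]
    for row in range(mb_rows):
        groups[token_partition(row, num_partitions)].append(row)
    return groups

groups = rows_per_partition(720, 4)   # a 720p frame has 45 macroblock rows
```

Because each partition can be entropy-decoded independently, a decoder could hand each group of rows to a different thread.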
Compare to H.264
[Figure: decoding speed in frames per second, VP8 vs. H.264 High Profile, on an Intel Core i7 3.2GHz; clips: Night, Sheriff, and Tulip, each 720p at 2000kbps; axis range 100 to 300]
[Figure: decoding speed in frames per second, VP8 vs. H.264 High Profile, on an Intel Atom N270 1.66GHz; clips: Night, Sheriff, and Tulip, each 720p at 2000kbps; axis range 20 to 45]
Any Questions?
Thanks for listening :)