Adaptive Rate-Distortion Based Wyner-Ziv Video Coding
Lina Karam
Image, Video, and Usability (IVU) Lab
Department of Electrical Engineering
Arizona State University
Tempe, AZ 85287
[email protected]
ivulab.asu.edu
1
Outline
• Motivation
• Existing DVC Approaches
• BLAST-DVC: Rate-distortion based BitpLane SelecTive decoding for pixel-domain Distributed Video Coding
• AQT-DVC: Rate-distortion based Adaptive QuanTization for transform-domain Distributed Video Coding
• Enhanced AQT-DVC
• Conclusion and future directions
2
Motivation
Frame 60 and Frame 61 (consecutive in time)
Mother and Daughter, CIF (352 x 288)
Spatial and Temporal Redundancy
3
Motion Estimation and Compensation
Reference Frame (Frame 197) Current Frame (Frame 198)
CIF Mother & Daughter
4
Residual Error (No Motion Compensation)
Difference (Residual) Frame = Frame 198 – Frame 197
5
Motion Estimation and Compensation
Reference Frame (Frame 197) Current Frame (Frame 198) = Reference Frame + Error
CIF Mother & Daughter
6
Full Search Motion Estimation
[8x8] block motion vectors superimposed on Reference Frame (Frame 197)
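As a rough illustration of what full-search motion estimation involves, the sketch below tests every integer displacement in a small window and keeps the one with the lowest sum of absolute differences (SAD). The function names, the SAD criterion, the search range of 4 pixels, and the synthetic frames are assumptions made for this example; they are not taken from the slides.

```python
def sad(ref, cur, rx, ry, cx, cy, bs):
    """Sum of absolute differences between the bs x bs block of `cur` at
    (cx, cy) and the bs x bs block of `ref` at (rx, ry)."""
    return sum(abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
               for j in range(bs) for i in range(bs))


def full_search(ref, cur, cx, cy, bs=8, search=4):
    """Exhaustively test every integer displacement within +/- `search`
    pixels and return (dx, dy, sad) of the best match in `ref`."""
    height, width = len(ref), len(ref[0])
    best = (0, 0, sad(ref, cur, cx, cy, cx, cy, bs))
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference frame.
            if 0 <= rx <= width - bs and 0 <= ry <= height - bs:
                cost = sad(ref, cur, rx, ry, cx, cy, bs)
                if cost < best[2]:
                    best = (dx, dy, cost)
    return best


# Synthetic 24x24 reference frame, and a current frame whose content is the
# reference shifted by 2 pixels horizontally and 1 pixel vertically.
SIZE = 24
ref = [[(7 * x + 13 * y + x * y) % 256 for x in range(SIZE)] for y in range(SIZE)]
cur = [[ref[min(y + 1, SIZE - 1)][min(x + 2, SIZE - 1)] for x in range(SIZE)]
       for y in range(SIZE)]

print(full_search(ref, cur, 8, 8))  # (dx, dy, SAD) for the 8x8 block at (8, 8)
```

The two nested displacement loops around a per-pixel block comparison are what make this search expensive, which is the cost the later slides attribute to the encoder.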
7
Motion Compensation
8
Motion Compensated Reference (Frame 197)
PSNR = 40.8 dB, MSE = 5.4
9
Residual Error (No Motion Compensation)
Residual Error (16x16 blocks, full pixel)
PSNR = 39.4 dB, MSE = 7.5
10
Residual Error (4x4 blocks, quarter pixel)
PSNR = 45 dB, MSE = 2.1
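The PSNR and MSE pairs quoted on these slides are tied together by the standard definition PSNR = 10 log10(peak² / MSE). The short sketch below applies it, assuming 8-bit video (peak value 255); the function name is illustrative.

```python
import math


def psnr_from_mse(mse, peak=255.0):
    """PSNR in dB for a given mean squared error, assuming 8-bit samples."""
    return 10.0 * math.log10(peak * peak / mse)


# MSE values quoted on the motion-compensation slides.
for mse in (5.4, 7.5, 2.1):
    print(f"MSE = {mse}: PSNR = {psnr_from_mse(mse):.1f} dB")
```

Under this assumption the first two values reproduce the quoted 40.8 dB and 39.4 dB, and MSE = 2.1 gives about 44.9 dB, consistent with the 45 dB shown for 4x4 quarter-pixel compensation.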
11
Variable block size (16x16 – 4x4) + quarter-pel + multi-frame motion compensation + R-D optimization (H.264, 2004)
85%
12
So, what is the problem?
13
Power profile of H.264 decoder with QCIF@15fps
• Deblocking filter: 34%
• Motion compensation: 29%
• Misc.: 18%
• Intra predictor: 10%
• Syntax parser: 5%
• CAVLC+IQ+IZZ+IDCT: 4%
From: T.-A. Liu, T.-M. Lin, S.-Z. Wang, et al., “A low-power dual-mode video decoder for mobile applications,” IEEE Communications Magazine, volume 44, issue 8, pp. 119-126, Aug. 2006.
• Encoder performs both Motion Estimation and Compensation
• Motion Estimation is much more computationally complex and consumes much more power than Motion Compensation
14
Distributed Video Coding: Motivation
Conventional video coding
• MPEGx or H.26x
• High complexity video encoder due to motion estimation.
Emerging applications
• Video compression with mobile devices
 ‒ Low complexity video encoder is preferred to reduce the hardware cost and to extend battery life.
• Video compression for sensor networks
 ‒ Low complexity video encoder is also preferred to reduce the hardware cost and to extend battery life.
 ‒ Inter-sensor communication may not be allowed or needs to be minimized.
Two main frameworks
• Multi-View/Multi-Cameras
• Single-View/Single Camera (Wyner-Ziv Video Coding)
15
Distributed Video Coding: Objectives
Intraframe encoding and interframe decoding
• Move complexity (motion estimation) from encoder to decoder
• Achieve interframe compression rate-distortion performance
Distributed source coding
• Compress consecutive frames separately
• Decode the frames jointly at the decoder
• Motivated by the work of Slepian-Wolf (1973) and Wyner-Ziv (1976)
 ‒ Slepian-Wolf: possible to compress losslessly two statistically dependent sources in a distributed fashion at a rate equal to their joint entropy
 ‒ Wyner-Ziv: possible to compress in a distributed fashion and achieve the same rate-distortion performance as when coding in a non-distributed fashion (Gaussian memoryless sources and mean-square error distortion).
16
How can we do this?
17
Reference Frame (Frame 197) Current Frame (Frame 198) = Reference Frame + “Error”
Back to Mother & Daughter…
Distributed Video Coding (DVC): How?
DVC problem becomes: Correct or Reduce “Error” without using Motion Estimation at the encoder and without knowing what the “Error” is!
Similar to a channel coding problem => can make use of channel codes
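To make the channel-coding analogy concrete, here is a toy sketch that uses the (7,4) Hamming code as a syndrome (Slepian-Wolf style) code. The encoder sends only the 3-bit syndrome of a 7-bit word; the decoder recovers the word from that syndrome plus side information assumed to differ from the word in at most one bit. The choice of code, the function names, and the one-bit "Error" limit are assumptions for illustration, not the codes used in the systems discussed here.

```python
# Parity-check matrix of the (7,4) Hamming code: column j (1-based) holds the
# binary representation of j, least significant bit in row 0.
H = [[(j >> r) & 1 for j in range(1, 8)] for r in range(3)]


def syndrome(bits):
    """3-bit syndrome H * bits (mod 2) of a 7-bit word."""
    return [sum(h * b for h, b in zip(row, bits)) % 2 for row in H]


def decode_with_side_info(synd, side):
    """Recover the source word from its syndrome and from side information
    that differs from the source in at most one bit position."""
    # H * (x xor y) = H*x xor H*y, which identifies the flipped position.
    diff = [a ^ b for a, b in zip(synd, syndrome(side))]
    pos = diff[0] + 2 * diff[1] + 4 * diff[2]  # 0 means "no error"
    out = list(side)
    if pos:
        out[pos - 1] ^= 1
    return out


x = [1, 0, 1, 1, 0, 0, 1]            # source bits (known to the encoder only)
y = list(x)
y[4] ^= 1                            # side information = source + "Error"
print(decode_with_side_info(syndrome(x), y) == x)
```

The encoder never sees the side information and never computes the "Error"; it transmits 3 bits instead of 7, and the decoder does the correction.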
18
Distributed Video Coding (DVC): Example
QCIF (176x144) Foreman
Frame 1 Frame 2 Frame 3 Frame 4 Frame 5
• Encoder:
 - Odd-numbered frames (Frames 1, 3, 5) are intra-coded
 - Parity bits or syndrome bits are sent for even-numbered frames (Frames 2, 4)
• Decoder:
 - Recovers even-numbered frames from the intra-coded odd-numbered frames
 - Odd-numbered frames are considered to be a distorted version of even-numbered frames; i.e., Frame_{2n} = Frame_{2n-1} + “Error”
 - “Error” corrected using parity bits or syndrome bits
19
Distributed Video Coding (DVC): Example
• Issue 1: “Error” can be large => need to send a lot of parity bits => large bitrate
Frame 55 Frame 56
• Strategy: at the decoder, try to reconstruct even frames using received odd frames (e.g., bi-directional motion-compensated interpolation).
Distributed Video Coding (DVC): Example
QCIF (176x144) Foreman
Frame 1 Frame 2 Frame 3 Frame 4 Frame 5
• Decoder: Side Information Generation
 - Frames 2 and 4 are interpolated from the neighboring decoded odd-numbered frames
 - Interpolated frames are called “side information”
• Issue 2: How to generate high-quality side information?
• Issue 3: How do we determine the number of needed parity or syndrome bits?
 - Sending too many will waste bits
 - Sending too few might leave large distortions uncorrected
Existing Approaches
PRISM (Puri et al., IEEE Trans. IP, Oct 2007)
• Syndrome-based Wyner-Ziv coding by dividing the codeword space into cosets
• After quantization, a bitplane representation is used
• Most significant bits can be inferred from side information
• Least significant bits (syndrome bits) need to be encoded and sent to decoder
• Issues:
 - Syndrome coding rate is fixed in advance
 - Coding can stop if CRC check fails => correctness not guaranteed
 - Coding performance decreases significantly if the source statistics are unknown. In practice, the source correlation is not known in advance and is hard to estimate
22
Existing Approaches
Feedback-channel-based DVC by Aaron et al., 2004, Girod et al., 2005
• Bitplane coding
• Rate-Compatible Punctured Turbo (RCPT) codes used to generate parity bits (Slepian-Wolf coding) for each bitplane
• Feedback channel used to request parity bits based on need
• No need to determine number of parity bits to send in advance
• Hybrid FEC/ARQ-like scheme
 ‒ Feedback channel is used to acknowledge the decoding correctness (e.g., CRC can be used to check correctness)
 ‒ Bitrate is determined on the fly.
 ‒ Decoding success can be guaranteed.
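The request loop of a feedback-channel scheme can be sketched with a toy code. Below, the encoder buffers the three syndrome bits of a (7,4) Hamming code plus a CRC; the decoder asks for one more syndrome bit each time its current best guess fails the CRC. This is only an illustration of "bitrate determined on the fly": the Hamming code stands in for RCPT or LDPCA codes, `zlib.crc32` stands in for whatever CRC a real system uses, and the side information is assumed to differ from the source in at most one bit.

```python
import zlib

# Toy stand-in for a rate-adaptive code: rows of a (7,4) Hamming parity-check
# matrix are released one at a time on request.
H = [[(j >> r) & 1 for j in range(1, 8)] for r in range(3)]


def crc(bits):
    """Checksum used by the decoder to test a candidate word."""
    return zlib.crc32(bytes(bits))


def best_guess(side, synd_prefix):
    """First candidate within one bit flip of `side` that agrees with all
    syndrome bits received so far (the unmodified `side` is tried first)."""
    candidates = [list(side)]
    candidates += [side[:i] + [side[i] ^ 1] + side[i + 1:] for i in range(len(side))]
    for cand in candidates:
        if all(sum(h * b for h, b in zip(H[r], cand)) % 2 == s
               for r, s in enumerate(synd_prefix)):
            return cand
    return None


def feedback_decode(x, side):
    """Simulate the request loop; return (decoded word, syndrome bits used)."""
    buffered = [sum(h * b for h, b in zip(row, x)) % 2 for row in H]  # encoder side
    target = crc(x)                                                   # sent CRC
    for n_bits in range(len(H) + 1):
        cand = best_guess(side, buffered[:n_bits])
        if cand is not None and crc(cand) == target:
            return cand, n_bits      # decoder acknowledges success
    return None, len(H)              # budget exhausted


x = [1, 0, 1, 1, 0, 0, 1]
print(feedback_decode(x, x))                       # side info already correct
print(feedback_decode(x, [0, 0, 1, 1, 0, 0, 1]))   # first bit differs
```

When the side information is already correct, no syndrome bits are requested at all, which is the rate saving the feedback channel buys at the price of decoder-to-encoder traffic.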
23
Existing Approaches: Feedback-based DVC (Girod’s Group)
[Block diagram]
• Wyner-Ziv Encoder (intraframe encoder): Wyner-Ziv frames S → DCT (Xk) → quantizer with 2^Mk levels (qk) → extract bitplanes (bitplane 1, bitplane 2, …, bitplane Mk) → Slepian-Wolf encoder (RCPT) → buffer. Key frames K → conventional intraframe encoder.
• Wyner-Ziv Decoder (interframe decoder): conventional intraframe decoder → decoded Key frames K’ → side information generation (MCTI) → DCT → side information; Slepian-Wolf decoder (sends “request bits” back to the encoder buffer) → reconstruction → IDCT → decoded Wyner-Ziv frames S’.
• For pixel-domain, no DCT, IDCT
24
Existing Approaches: DISCOVER (Artigas et al., PCS 2007)
[Block diagram: same Wyner-Ziv encoder/decoder architecture as the feedback-based DVC diagram (DCT, 2^Mk-level quantizer, bitplane extraction, Slepian-Wolf encoder and buffer, feedback “request bits”, Slepian-Wolf decoder, reconstruction, IDCT, conventional intraframe coding of Key frames), with the following blocks highlighted:]
• Slepian-Wolf codec: LDPCA*
• Side information generation: hierarchical subpixel ME with smoothing filter
Significant R-D performance improvement
* LDPCA provided by Girod’s Group – Varodayan et al., 2006
25
Issues with Existing Approaches
• Issue 1: Existing DVC schemes do not adapt the Slepian-Wolf decoding to the local characteristics of the video => every bitplane is Slepian-Wolf decoded based on bit budget, starting from MSB to LSB.
 - Decoding stops when no error is detected or when the bit budget is exhausted. Some important locations and bitplanes might not be decoded!
Question:
Can we skip some less important regions and bitplanes without decoding them?
How do we measure the significance of a bitplane?
Issues with Existing Approaches
• Issue 2: Existing DVC schemes do not adapt the quantization to the local characteristics of the video => during the encoding, a single quantizer matrix (one fixed quantizer for each subband) is selected for the whole video.
Question:
Can we adapt the quantization matrix to the local characteristics of the video so as to minimize the needed bits for LDPCA decoding while maximizing the quality?
Proposed Strategy
• Divide each video frame into partitions in order to exploit local characteristics
• Allocate bits to a partition only if they result in sufficient distortion reduction
 - Determined using Distortion-Rate (D-R) ratios: D-R = D/R, where D = Distortion Reduction resulting from allocating R bits.
• Minimum allowed distortion reduction per bit is specified in terms of a target Distortion-Rate (D-R) ratio = TD-R
 - Allocate bits only if D/R of partition is > TD-R
• D/R is an indication of how much distortion reduction (quality) a bit can buy us on average for the considered partition
• Bits can be allocated to a partition via Slepian-Wolf (LDPCA) decoding and/or by selecting the quantization matrix
• Target TD-R used to control bit-rate: set low for high bit-rate coding, and high for low bit-rate coding
28
Challenge: How to Measure Distortion-Rate Ratio?
• The original source information is not available at the decoder, so the distortion D cannot be exactly measured.
• The bitrate R cannot be known without decoding.
• Proposed Approach: Distortion-Rate Ratio estimation performed at the decoder using the side information frames and the source correlation model
 ‒ The complexity of the encoder is not increased
 ‒ More flexibility, as the decoder can selectively decode the bitplanes based on a target distortion-rate ratio. The target distortion-rate ratio can be changed so that different R-D operating points can be achieved.
 ‒ Error probability needs to be estimated at decoder
29
BLAST-DVC: Pixel-Domain BiTpLAne SelecTive Decoding
[Block diagram]
• Wyner-Ziv Frame Encoder: Wyner-Ziv frame Xi → divide into sub-images (Xi,1, Xi,2, …, Xi,M) → extract bitplanes (xi,m,1, xi,m,2, …, xi,m,k) → LDPCA encoder and CRC generator → buffer; block indices decoding determines which sub-image bitplanes are sent.
• Feedback: the decoder requests bits by block indices; the encoder returns parity bits + CRC bits.
• Wyner-Ziv Frame Decoder: decoded Key frames X̂i-1 and X̂i+1 → motion compensated interpolation → side information → divide into sub-images → rate-distortion ratio estimation → block indices encoding; LDPCA decoders (one per sub-image) → minimum distance symbol reconstruction / minimum-distortion pixel reconstruction (X’i,1, X’i,2, …, X’i,M) → merge sub-images → decoded Wyner-Ziv frame X’i.
30
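BLAST-DVC: Distortion-Rate Ratio Estimation
Source Correlation Model
• Let D be the difference between the source information X and its side information Xside.
• D can be modeled as a random variable with a Laplacian distribution:
\[
P(D = d) =
\begin{cases}
\int_{-\infty}^{-254.5} 0.5\,\alpha\, e^{-\alpha |x|}\, dx, & \text{if } d = -255 \\
\int_{d-0.5}^{d+0.5} 0.5\,\alpha\, e^{-\alpha |x|}\, dx, & \text{if } -255 < d < 255 \\
\int_{254.5}^{\infty} 0.5\,\alpha\, e^{-\alpha |x|}\, dx, & \text{if } d = 255
\end{cases}
\]
• α can be estimated from the co-located blocks of the two motion-compensated Key frames \hat{X}_{i-1} and \hat{X}_{i+1} (Brites et al., 2006):
\[
\alpha_m^2 = \frac{2}{\hat{\sigma}_m^2}, \qquad
\hat{\sigma}_m^2 = \frac{1}{N} \sum_{n=1}^{N} \left( \hat{X}_{i-1,m,n} - \hat{X}_{i+1,m,n} \right)^2
\]
where m is the partition index and n is the pixel location in the partition.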
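The Laplacian model has a simple closed form once the integrals are evaluated, so the decoder can tabulate P(D = d) cheaply. The sketch below assumes the relation α² = 2/σ², with σ² taken as the mean squared difference between co-located pixels of the two motion-compensated Key frame blocks; the function names and the small floor on σ² are illustrative assumptions.

```python
import math


def estimate_alpha(mc_prev, mc_next):
    """Laplacian parameter for one partition, from co-located pixels of the
    two motion-compensated Key frame blocks (flat lists of pixel values)."""
    n = len(mc_prev)
    var = sum((a - b) ** 2 for a, b in zip(mc_prev, mc_next)) / n
    var = max(var, 1e-6)  # assumption: floor to avoid division by zero
    return math.sqrt(2.0 / var)


def laplacian_pmf(d, alpha):
    """P(D = d) for an integer difference d in [-255, 255]: the Laplacian
    density integrated over [d - 0.5, d + 0.5], with the tails folded into
    d = -255 and d = 255."""
    a = abs(d)
    if a > 255:
        return 0.0
    if a == 255:
        return 0.5 * math.exp(-alpha * 254.5)
    if a == 0:
        return 1.0 - math.exp(-0.5 * alpha)
    return 0.5 * (math.exp(-alpha * (a - 0.5)) - math.exp(-alpha * (a + 0.5)))


alpha = estimate_alpha([10, 12, 9, 11], [12, 10, 11, 9])
total = sum(laplacian_pmf(d, alpha) for d in range(-255, 256))
print(alpha, total)
```

Folding the tails into the end points keeps the probabilities summing to one over the 511 possible 8-bit differences.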
31
BLAST-DVC: Rate Estimation
• Let P_{n,k} be the error probability at a pixel n in bitplane k in the considered partition.
• Average of the error probabilities over the sub-image:
\[
P_k = \frac{1}{N} \sum_{n=1}^{N} P_{n,k}
\]
• The needed bits for the considered kth bitplane can be computed as:
\[
R_k = -N \left( P_k \log_2 P_k + (1 - P_k) \log_2 (1 - P_k) \right) + R_{CRC}
\]
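The rate estimate is N times the binary entropy of the average crossover probability, plus the CRC overhead. A minimal sketch follows; the function names and the 8-bit CRC length are assumptions made for this example, since the slides do not state the CRC size.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a binary source with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def estimate_bitplane_rate(error_probs, crc_bits=8):
    """Estimated bits needed to Slepian-Wolf decode one bitplane of a
    sub-image, given the per-pixel error probabilities P_{n,k}."""
    n = len(error_probs)
    p_avg = sum(error_probs) / n
    return n * binary_entropy(p_avg) + crc_bits


# 22x18 sub-image (396 pixels) with hypothetical per-pixel error probabilities.
probs = [0.02] * 300 + [0.20] * 96
print(round(estimate_bitplane_rate(probs), 1))
```

Because the entropy is taken of the averaged probability, a few unreliable pixels raise the estimate for the whole sub-image bitplane.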
32
BLAST-DVC: Error Probability Estimation
The probability of bit error can be expressed as:
\[
\begin{aligned}
P_{n,k} ={} & P\left(b_{n,k} = 1,\ b'_{n,k} = 0 \mid b_{n,r},\ b'_{n,r},\ r \in \text{DBPs},\ 1 \le r \le k-1\right) \\
& + P\left(b_{n,k} = 0,\ b'_{n,k} = 1 \mid b_{n,r},\ b'_{n,r},\ r \in \text{DBPs},\ 1 \le r \le k-1\right)
\end{aligned}
\]
where b_{n,k} and b'_{n,k} denote a bit in the kth bitplane corresponding to the nth pixels in the original sub-image and in the side information (generated through motion compensated interpolation), respectively. DBP stands for Decoded Bit Planes.
33
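BLAST-DVC: Error Probability Estimation
\[
P\left(b_{n,k} = 1 \mid b_{n,r},\ b'_{n,r},\ r \in \text{DBPs},\ 1 \le r \le k-1\right)
= \frac{\sum_{s=1}^{S} \ \sum_{d = B_{n,k,s} - X'_{n,k-1} + 0.5}^{U_{n,k,s} - X'_{n,k-1} - 0.5} P(D = d)}
       {\sum_{s=1}^{S} \ \sum_{d = L_{n,k,s} - X'_{n,k-1} + 0.5}^{U_{n,k,s} - X'_{n,k-1} - 0.5} P(D = d)}
\]
and
\[
P\left(b_{n,k} = 0 \mid b_{n,r},\ b'_{n,r},\ r \in \text{DBPs},\ 1 \le r \le k-1\right)
= \frac{\sum_{s=1}^{S} \ \sum_{d = L_{n,k,s} - X'_{n,k-1} + 0.5}^{B_{n,k,s} - X'_{n,k-1} - 0.5} P(D = d)}
       {\sum_{s=1}^{S} \ \sum_{d = L_{n,k,s} - X'_{n,k-1} + 0.5}^{U_{n,k,s} - X'_{n,k-1} - 0.5} P(D = d)}
\]
where L_{n,k,s}, B_{n,k,s}, and U_{n,k,s} are the lower, middle, and upper boundaries of the s-th candidate bin for pixel n in bitplane k, and S is the number of candidate bins.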
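One way to read these ratios: the probability that the next bit is 1 is the Laplacian mass in the upper half of the candidate bins, divided by the mass in the whole of those bins. The sketch below follows that reading under simplifying assumptions: bins are given as inclusive integer pixel ranges `(low, mid, high)` with the bit equal to 1 for values `mid..high`, the reference value is an integer, and all function names are hypothetical.

```python
import math


def laplacian_pmf(d, alpha):
    """P(D = d) for integer d in [-255, 255] (tails folded into the ends)."""
    a = abs(d)
    if a > 255:
        return 0.0
    if a == 255:
        return 0.5 * math.exp(-alpha * 254.5)
    if a == 0:
        return 1.0 - math.exp(-0.5 * alpha)
    return 0.5 * (math.exp(-alpha * (a - 0.5)) - math.exp(-alpha * (a + 0.5)))


def prob_bit_is_one(x_ref, bins, alpha):
    """Probability that the current bit of the original pixel is 1, given the
    reference value x_ref and the candidate bins left open by earlier
    bitplanes. Each bin is (low, mid, high): bit 0 for low..mid-1, bit 1 for
    mid..high."""
    upper = total = 0.0
    for low, mid, high in bins:
        for y in range(low, high + 1):
            p = laplacian_pmf(y - x_ref, alpha)
            total += p
            if y >= mid:
                upper += p
    return upper / total


def crossover_prob(x_ref, bins, alpha, side_bit):
    """Probability that the original bit differs from the side-information bit."""
    p1 = prob_bit_is_one(x_ref, bins, alpha)
    return p1 if side_bit == 0 else 1.0 - p1


# MSB already decoded as 0, so the pixel lies in [0, 127]; the next bit
# splits that range into [0, 63] (bit 0) and [64, 127] (bit 1).
bins = [(0, 64, 127)]
print(crossover_prob(66, bins, 0.2, side_bit=1))   # reference near the boundary
print(crossover_prob(100, bins, 0.2, side_bit=1))  # reference deep inside bin 1
```

A reference value close to the bin boundary gives a much larger crossover probability than one deep inside a bin, which is what drives the rate estimate up for such pixels.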
34
BLAST-DVC: Distortion Estimation
• Estimate distortion reduction if the target bitplane is decoded.
• Average distortion estimation for a sub-image
Distortion reduction:
\[
\Delta D_k = D_k - \hat{D}_k
\]
Average distortion if the target bitplane is not LDPCA decoded:
\[
D_k = \frac{1}{N} \sum_{n=1}^{N} E\left[ \left( X_n - X'_{n,k-1} \right)^2 \right]
\]
where X'_{n,k-1} is the partially reconstructed pixel value based on the previously determined k-1 bitplanes and side info.
Average distortion if the target bitplane is LDPCA decoded:
\[
\hat{D}_k = \frac{1}{N} \sum_{n=1}^{N} E\left[ \left( X_n - \text{Recon}(X_n, X'_{n,k-1}) \right)^2 \right]
\]
where Recon(X_n, X'_{n,k-1}) is the partially reconstructed pixel value when the target bitplane is LDPCA-decoded => minimum distance symbol reconstruction is used.
35
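Minimum Distortion Reconstruction
\[
X'_{n,k} = \text{Recon}(X_n, X'_{n,k-1}) =
\begin{cases}
\left\lfloor \frac{X_n}{\Delta_k} \right\rfloor \Delta_k, & \text{if } \left\lfloor \frac{X_n}{\Delta_k} \right\rfloor > \left\lfloor \frac{X'_{n,k-1}}{\Delta_k} \right\rfloor \\
X'_{n,k-1}, & \text{if } \left\lfloor \frac{X_n}{\Delta_k} \right\rfloor = \left\lfloor \frac{X'_{n,k-1}}{\Delta_k} \right\rfloor \\
\left\lfloor \frac{X_n}{\Delta_k} \right\rfloor \Delta_k + \Delta_k - 1, & \text{if } \left\lfloor \frac{X_n}{\Delta_k} \right\rfloor < \left\lfloor \frac{X'_{n,k-1}}{\Delta_k} \right\rfloor
\end{cases}
\]
[Figure: three panels, one per case, each marking the bin width Δ_k and the positions of X_n, X'_{n,k-1}, and X'_{n,k}. Legend: Side Info; Original; Laplacian RV.]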
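A minimal sketch of this kind of reconstruction, assuming that bins are aligned to multiples of the bin width `delta` and that pixel values are integers: once the decoded bitplanes identify the bin of the original pixel, the previous estimate is kept if it already lies in that bin and is otherwise moved to the nearest value inside it. The function name and the integer bin convention are assumptions.

```python
def recon(x, x_prev, delta):
    """Clamp the previous estimate x_prev into the bin of width `delta`
    that contains the original value x (only the bin index of x is used,
    which is what the decoded bitplanes reveal)."""
    low = (x // delta) * delta       # first value of the bin
    high = low + delta - 1           # last value of the bin
    return min(max(x_prev, low), high)


# Bins of width 64: the original value 70 lies in [64, 127].
print(recon(70, 90, 64))    # estimate already in the bin: kept
print(recon(70, 30, 64))    # estimate below the bin: moved up to the bin start
print(recon(70, 200, 64))   # estimate above the bin: moved down to the bin end
```

The decoder never needs the original value itself, only its bin index, so the same function can be driven by decoded bits in a real system.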
36
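Distortion Estimation – Bitplane not decoded
\[
\begin{aligned}
D_k &= \frac{1}{N} \sum_{n=1}^{N} E\left[ \left( X_n - X'_{n,k-1} \right)^2 \mid b_{n,r},\ b'_{n,r},\ r \in \text{DBPs},\ 1 \le r \le k-1 \right] \\
&= \frac{1}{N} \sum_{n=1}^{N}
\frac{\sum_{s=1}^{S} \ \sum_{y = L_{n,k,s} + 0.5}^{U_{n,k,s} - 0.5} \left( y - X'_{n,k-1} \right)^2 P(X_n = y)}
     {\sum_{s=1}^{S} \ \sum_{y = L_{n,k,s} + 0.5}^{U_{n,k,s} - 0.5} P(X_n = y)}
\end{aligned}
\]
S = 2^p; p = no. of NDBPs
[Figure: P(X | Xside) plotted over pixel values 0 to 255, with the bin boundaries L1, B1, U1 and the side information Xside marked; bins 00 and 01 are shown, and the error for a value y is y − Xside.]
Consider that the MSB is 0 and we want to determine next bit
Estimated value => Next bit is 1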
37
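Distortion Estimation – Bitplane LDPCA-decoded
\[
\begin{aligned}
\hat{D}_k &= \frac{1}{N} \sum_{n=1}^{N} E\left[ \left( X_n - \text{Recon}(X_n, X'_{n,k-1}) \right)^2 \mid b_{n,r},\ b'_{n,r},\ r \in \text{DBPs},\ 1 \le r \le k-1 \right] \\
&= \frac{1}{N} \sum_{n=1}^{N}
\frac{\sum_{s=1}^{S} \ \sum_{y = L_{n,k,s} + 0.5}^{U_{n,k,s} - 0.5} \left( y - \text{Recon}(y, X'_{n,k-1}) \right)^2 P(X_n = y)}
     {\sum_{s=1}^{S} \ \sum_{y = L_{n,k,s} + 0.5}^{U_{n,k,s} - 0.5} P(X_n = y)}
\end{aligned}
\]
[Figure: P(X | Xside) plotted over pixel values 0 to 255, with the bin boundaries L1, B1, U1 and the side information Xside marked; bins 00, 01, 10, 11 are shown, and the error for a value y is y − Recon(y, Xside).]
Consider that the MSB is 0 and we want to determine next bit
If y in Bin 00, Recon(y, Xside) is the value in Bin 00 closest to Xside
If y in Bin 01, Xside is Recon(y, Xside)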
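The two distortion estimates can be compared numerically for a single pixel. The sketch below is written under several assumptions: the source value is modeled as Laplacian around an integer side-information value, candidate bins are inclusive integer ranges, the decoded-bitplane reconstruction clamps the previous estimate into the bin of width `delta`, and all names are illustrative.

```python
import math


def laplacian_pmf(d, alpha):
    """P(D = d) for integer d in [-255, 255] (tails folded into the ends)."""
    a = abs(d)
    if a > 255:
        return 0.0
    if a == 255:
        return 0.5 * math.exp(-alpha * 254.5)
    if a == 0:
        return 1.0 - math.exp(-0.5 * alpha)
    return 0.5 * (math.exp(-alpha * (a - 0.5)) - math.exp(-alpha * (a + 0.5)))


def recon(y, x_prev, delta):
    """Clamp x_prev into the width-`delta` bin that contains y."""
    low = (y // delta) * delta
    return min(max(x_prev, low), low + delta - 1)


def estimate_distortions(x_side, x_prev, bins, delta, alpha):
    """Expected squared error for one pixel if the target bitplane is not
    decoded (keep x_prev) and if it is decoded (clamp x_prev into the bin
    revealed by the bitplane). `bins` lists the inclusive (low, high) ranges
    still possible for the original value."""
    not_decoded = decoded = mass = 0.0
    for low, high in bins:
        for y in range(low, high + 1):
            p = laplacian_pmf(y - x_side, alpha)
            mass += p
            not_decoded += p * (y - x_prev) ** 2
            decoded += p * (y - recon(y, x_prev, delta)) ** 2
    return not_decoded / mass, decoded / mass


# MSB decoded as 0 (value in [0, 127]); the next bitplane has bins of width 64.
d_not, d_dec = estimate_distortions(70, 70, [(0, 127)], 64, 0.1)
print(d_not, d_dec, d_not - d_dec)   # last value is the distortion reduction
```

Averaging the per-pixel difference over a sub-image gives the distortion reduction that is later compared against the estimated rate.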
38
Bitplane Decoding Selection
Once the rate Rk and the distortion reduction ΔDk are obtained, a targeted distortion-rate ratio t can be chosen to determine whether bitplane decoding should be performed.
If ΔDk / Rk < t , the current bitplane is not decoded (NDBP case)
If ΔDk / Rk ≥ t , CRC bits are requested followed progressively by parity/syndrome bits, one parity/syndrome bit at a time, so that error correction can be applied to the current sub-image bitplane by means of LDPCA until no errors are detected (DBP case).
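The selection rule itself is a one-line comparison per sub-image bitplane. The sketch below applies it to a list of hypothetical estimates; the tuple layout, function name, and numbers are made up for illustration.

```python
def bitplanes_to_decode(estimates, t):
    """Return the (partition, bitplane) pairs whose estimated distortion
    reduction per bit meets the target ratio t. Each estimate is a tuple
    (partition, bitplane, delta_d, rate)."""
    selected = []
    for partition, bitplane, delta_d, rate in estimates:
        if rate > 0 and delta_d / rate >= t:
            selected.append((partition, bitplane))   # DBP case
        # otherwise the bitplane is skipped (NDBP case)
    return selected


# Hypothetical (partition, bitplane, estimated delta D, estimated R) values.
estimates = [(0, 1, 900.0, 150.0), (0, 2, 200.0, 180.0), (1, 1, 40.0, 90.0)]
print(bitplanes_to_decode(estimates, t=1.0))
print(bitplanes_to_decode(estimates, t=2.0))
```

Raising t removes the bitplanes that buy the least quality per bit, which is how the target ratio acts as a bit-rate control.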
39
Proposed BLAST-DVC
[Block diagram, shown on two consecutive slides: Wyner-Ziv frame encoder (divide into sub-images, extract bitplanes, LDPCA encoder, CRC generator, buffer, block indices decoding) and Wyner-Ziv frame decoder (motion compensated interpolation of Key frames, divide into sub-images, rate-distortion ratio estimation, block indices encoding, LDPCA decoders, minimum distance symbol reconstruction / minimum-distortion pixel reconstruction, merge sub-images); the decoder requests bits by block indices and receives parity bits + CRC bits.]
41
Simulation Setup
QCIF Video Sequences (176x144)
Frame rate: 15 frames per second.
Number of partitions per frame = 64 (22x18 each)
Comparison with following systems:
• H.264 Inter: I-B-I-B
• H.264 Intra only
• DISCOVER by X. Artigas et al.
 ‒ Transform domain DVC, GOP = 2.
• PDDVC (non-adaptive best pixel-domain system)
 ‒ Pixel domain DVC, GOP = 2.
 ‒ Special case of the proposed system but no partitions (1 partition per frame)
42
Simulation Results
2.0 dB
22% reduction 18% reduction
1.6 dB
43
Simulation Results
1.4 dB
18% reduction
18% reduction
0.8 dB
44
Visual Testing Setup
9 subjects took the test.
Two video sequences are randomly placed side by side on a 19” Dell Ultrasharp screen.
Score
• 1: DISCOVER is much better than BLAST DVC
• 2: DISCOVER is better than BLAST DVC
• 3: same quality
• 4: DISCOVER is worse than BLAST DVC
• 5: DISCOVER is much worse than BLAST DVC
45
Visual testing
Hall Monitor, Foreman

Operating Point                A        B        C        D
Average Bitrate (kbps)
  DISCOVER                 73.60    97.64   167.01   293.73
  BLAST                    71.43    97.62   166.48   291.63
Average PSNR (dB)
  DISCOVER                 28.71    29.93    32.38    35.51
  BLAST                    28.19    29.34    31.68    34.59

Operating Point                A        B        C        D
Average Bitrate (kbps)
  DISCOVER                 87.62   100.28   140.38   208.25
  BLAST                    83.53    89.69   121.57   185.45
Average PSNR (dB)
  DISCOVER                 31.48    32.07    34.31    37.27
  BLAST                    31.49    32.02    34.29    37.25
46
Proposed System
Frame bits: 3.36 kbits
Frame PSNR: 32.89 dB
DISCOVER
Frame bits: 5.34 kbits
Frame PSNR: 33.21 dB
Sequence average bitrate is 140.38 kbps and average PSNR is 34.31 dB for DISCOVER.
Sequence average bitrate is 121.57 kbps and average PSNR is 34.29 dB for the proposed system.
47
DISCOVER
Frame bits: 8.61 kbits
Frame PSNR: 33.16 dB
Proposed System
Frame bits: 5.83 kbits
Frame PSNR: 31.84 dB
Sequence average bitrate is 167.01 kbps and average PSNR is 32.38 dB for DISCOVER.
Sequence average bitrate is 166.48 kbps and average PSNR is 31.68 dB for the proposed system.
48
DISCOVER BLAST-DVC
Compressed at 15fps, 167.01 kbps Compressed at 15fps, 166.48 kbps
49
AQT-DVC: Transform-Domain Distributed Video Coding with Rate-Distortion Based Adaptive Quantization
Motivation
• Transform domain DVC performance is better than pixel domain DVC performance, especially for high motion sequences.
• Rate-distortion based adaptive quantization provides a better quantization scheme in terms of rate-distortion performance.
Considerations:
• Feedback channel
 ‒ Minimize the traffic on the feedback channel.
 ‒ Bitplane selective scheme is not applicable because the number of bitplanes might be too large.
 -> One quantization matrix for each partition (M 4x4 DCT blocks)
• Partition size versus LDPCA block size
 ‒ Smaller partition size keeps the flexibility of the quantization scheme.
 ‒ Larger LDPCA block size provides a better error correction ability and reduces the feedback channel traffic.
 -> One LDPCA code for a bitplane of a subband.
 -> Due to different adaptive quantizers, resulting bitplanes are not rectangular (irregular shape) and have undefined values => need to modify LDPCA
50
Sample Quantizer Matrices
Each matrix describes the number of quantization levels used for each of the 16 DCT subbands
51
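Adaptive Quantization
[Figure: a frame divided into 4x4 DCT blocks, each labeled with the index Q of the quantizer matrix selected for its partition; blocks in the same partition share the same index (indices 1, 3, 4, and 5 appear in the example).]
Q: Quantizer matrix index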
52
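LDPCA Adaptation
[Figure, first graph: LDPCA encoder Tanner graph corresponding to the transmission of only the even-indexed subset of the accumulated syndrome. Source nodes x1 … x8; syndrome nodes s1 … s8 merged pairwise into s1+s2, s3+s4, s5+s6, s7+s8; transmitted accumulated syndrome nodes a2, a4, a6, a8.]
[Figure, second graph: Tanner graph after eliminating redundant nodes. Source nodes x1, x2, x3; syndrome nodes s1+s2, s3+s4, s5+s6; accumulated syndrome nodes a2, a4, a6.]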
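The merged check nodes in the first graph follow directly from how an accumulated syndrome is built. The sketch below assumes the usual definition a_i = s_1 + … + s_i (mod 2) and shows that sending only the even-indexed accumulated bits gives the decoder the pairwise sums s1+s2, s3+s4, and so on. The function names and the sample syndrome are illustrative.

```python
from itertools import accumulate
from operator import xor


def accumulate_syndrome(s):
    """Accumulated syndrome: a_i = s_1 xor s_2 xor ... xor s_i."""
    return list(accumulate(s, xor))


def merged_checks(transmitted):
    """Check values the decoder can form from a transmitted subset of the
    accumulated syndrome: each one is the xor of two consecutive
    transmitted bits (the first is taken against 0)."""
    checks, prev = [], 0
    for a in transmitted:
        checks.append(a ^ prev)
        prev = a
    return checks


s = [1, 0, 1, 1, 0, 1, 0, 0]             # syndrome bits s1..s8
a = accumulate_syndrome(s)               # accumulated bits a1..a8
even = a[1::2]                           # transmit a2, a4, a6, a8 only
print(merged_checks(even))               # s1+s2, s3+s4, s5+s6, s7+s8 (mod 2)
```

Sending further accumulated bits later splits the merged checks back into the original ones, which is what makes the code rate-adaptive.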
53
AQT-DVC
[Block diagram]
• Wyner-Ziv Frame Encoder: Wyner-Ziv frame Xi → DCT → adaptive quantization (using the received quantizer set index) → extract bitplanes → LDPCA encoder and CRC generator → buffer → parity bits + CRC bits sent on request.
• Wyner-Ziv Frame Decoder: Key frames Xi-1 and Xi+1 → side information generation → DCT → distortion-rate estimation → quantizer set selection → quantizer set index (sent to the encoder); LDPCA decoder → reconstruction → inverse DCT → decoded Wyner-Ziv frame X’i.
Same D-R concept but different equations for D and R
54
Quantizer Matrix Selection
• Each RD point corresponds to a quantizer matrix
• Two criteria for quantizer selection
 - D/R is larger than the target D-R threshold TD-R = t
 - The quantizer matrix results in the largest distortion reduction
[Figure: average distortion D versus average bitrate R. The side information is the point at R = 0; the quantizer matrices M0 to M7 are the remaining R-D points; a line of slope t marks the threshold, and the selected quantizer set is indicated.]
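The two selection criteria can be combined in a few lines. In the sketch below each candidate quantizer matrix has an estimated rate and distortion, the distortion reduction is measured against the side information alone (the R = 0 point), and the candidate with the largest reduction is chosen among those whose reduction per bit exceeds t. The tuple layout, the function name, and the numbers are hypothetical.

```python
def select_quantizer(candidates, d_side, t):
    """Pick a quantizer matrix index for one partition.

    candidates: list of (index, estimated_rate, estimated_distortion)
    d_side:     estimated distortion when only side information is used
    t:          target distortion-rate ratio
    Returns the chosen index, or None to keep the side information as is."""
    best_index, best_reduction = None, 0.0
    for index, rate, distortion in candidates:
        reduction = d_side - distortion
        if rate > 0 and reduction / rate > t and reduction > best_reduction:
            best_index, best_reduction = index, reduction
    return best_index


# Hypothetical R-D points for four quantizer matrices of one partition.
candidates = [(1, 20.0, 80.0), (2, 50.0, 55.0), (3, 120.0, 40.0), (4, 300.0, 35.0)]
print(select_quantizer(candidates, d_side=100.0, t=0.4))
print(select_quantizer(candidates, d_side=100.0, t=0.95))
print(select_quantizer(candidates, d_side=100.0, t=2.0))
```

A low target ratio admits fine quantizers with many bits, while a high one falls back to coarse quantizers or to the side information alone.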
55
Simulation Setup
QCIF Video Sequences (176x144)
Frame rate: 15 frames per second.
Partition size = 16x16 pixels (four 4x4 DCT blocks)
Four LDPCA codes to accommodate variable-size bitplanes, with lengths 396, 792, 1188, and 1984
Comparison with following systems (GOP = 2):
 ‒ H.264 Inter: I-B-I-B
 ‒ H.264 Intra only
 ‒ DISCOVER by X. Artigas et al. (LDPCA length: 1584)
56
Simulation Results
Up to 1.4 dB compared to DISCOVER
57
Visual testing
Hall Monitor, Foreman

Operating Point                A        B        C        D
Average Bitrate (kbps)
  DISCOVER                 73.60    97.64   167.01   293.73
  AQT-DVC                  71.99   103.02   168.31   290.52
Average PSNR (dB)
  DISCOVER                 28.71    29.93    32.38    35.51
  AQT-DVC                  28.70    29.93    32.37    35.49

Operating Point                A        B        C        D
Average Bitrate (kbps)
  DISCOVER                 87.62   100.28   140.38   208.25
  AQT-DVC                  85.60    98.51   141.56   207.78
Average PSNR (dB)
  DISCOVER                 31.48    32.07    34.31    37.27
  AQT-DVC                  32.02    32.78    35.17    38.60
58
DISCOVER AQT-DVC
Compressed at 15fps,167.01 kbps Compressed at 15fps, 168.31 kbps
59
AQT-DVC
An inaccurate estimate of the source correlation model might result in inappropriate quantization matrix selection and might cause significant RD performance loss in AQT-DVC.
• The model estimation solely depends on two neighboring motion-compensated Key frames.
[Figure panels: Previous Key frame; Original WZ frame; Next Key frame; Motion-compensated previous Key frame; Side information; Motion-compensated next Key frame]
60
eAQT-DVC Procedure
• Encoder: Coarsely quantize and encode all DC coefficients (syndromes sent to the decoder)
• Decoder: Decode and reconstruct all DC coefficients
• Decoder: Estimate the Laplacian model parameters of all DCT coefficients by using the motion-compensated Key frames and the reconstructed coarsely quantized DC coefficients
• Decoder: Use the obtained Laplacian models to estimate rate-distortion ratios for all coefficients with respect to all available quantization matrices
• Decoder: Select the best quantization matrices (in the R-D sense), one for each partition (matrix index sent to the encoder)
• Encoder: Receive quantization matrix index
• Encoder: Quantize and encode all DCT coefficients (syndromes sent to the decoder)
• Decoder: Decode and reconstruct all DCT coefficients
61
Simulation Results (High-motion Sequences)
62
Visual testing
Hall Monitor, Foreman

Operating Point                A        B        C        D
Average Bitrate (kbps)
  DISCOVER                 73.60    97.64   167.01   293.73
  eAQT                     71.43    97.62   166.48   291.63
Average PSNR (dB)
  DISCOVER                 28.71    29.93    32.38    35.51
  eAQT                     28.19    29.34    31.68    34.59

Operating Point                A        B        C        D
Average Bitrate (kbps)
  DISCOVER                 87.62   100.28   140.38   208.25
  eAQT                     89.58   100.84   142.51   209.17
Average PSNR (dB)
  DISCOVER                 31.48    32.07    34.31    37.27
  eAQT                     32.02    32.74    35.65    38.61
63
DISCOVER eAQT-DVC
Compressed at 15fps,167.01 kbps Compressed at 15fps,166.48 kbps
64
Conclusion
Adaptive distributed video coding
Distortion-Rate estimation for distributed video coding
• Allows allocation of more bits to significant regions
• A bitplane selective decoding scheme for pixel-domain DVC
• An adaptive quantization for transform-domain DVC
• PSNR improvement as much as 2.0 dB on the decoded video.
• Superior visual quality on the decoded video.
65
Future Research Directions
• Explore more accurate source probability model.
• Variable block size locally-adaptive DVC scheme
• Improved DVC without feedback channel
• Real-time decoding
• Multi-View compression/3D TV
• Perceptual-based DVC
66
Related Publications
Wei-Jung Chien and Lina J. Karam, “Transform-Domain Distributed Video Coding with Rate-Distortion Based Adaptive Quantization,” to appear in the IET Journal of Image Processing, Special Issue on Distributed Video Coding.
Wei-Jung Chien and Lina J. Karam, “BLAST-DVC: BitpLAne SelecTive Distributed Video Coding,” Springer Journal of Multimedia Tools and Applications, Special Issue on Distributed Video Coding, July 2009.
Wei-Jung Chien and Lina J. Karam, “AQT-DVC: Transform-Domain Distributed Video Coding with Rate-Distortion Based Adaptive Quantization,” accepted to IEEE International Conference on Image Processing, 2009.
Wei-Jung Chien and Lina J. Karam, “Bitplane Selective Distributed Video Coding,” Asilomar Conference on Signals, Systems and Computers, 2008.
67
Related Publications
Wei-Jung Chien, Lina J. Karam, and Glen P. Abousleman, “Rate-Distortion Based Selective Decoding for Pixel-Domain Distributed Video Coding,” IEEE International Conference on Image Processing, pp. 1132-1135, 2008.
Wei-Jung Chien, Lina J. Karam, and Glen P. Abousleman, “Block Adaptive Wyner-Ziv Coding for Transform-Domain Distributed Video Coding,” IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. I-525-8, 2007.
Wei-Jung Chien, Lina J. Karam, and Glen P. Abousleman, “Distributed Video Coding with lossy side information,” IEEE International Conference on Acoustics, Speech, and Signal Processing, pp. II-69-72, 2006.
Wei-Jung Chien, Lina J. Karam, and Glen P. Abousleman, “Distributed Video Coding with 3-D Recursive Search Block Matching,” IEEE International Symposium on Circuits and Systems, pp. 5415-5418, 2006.
68
Wei-Jung Chien and Lina Karam
Wei-Jung Chien and President Obama
Thank you