ACM Multimedia, October 20, 2009
![Page 1: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/1.jpg)
Manipulating Lossless Video in the Compressed Domain
William Thies¹, Steven Hall², Saman Amarasinghe²
¹ Microsoft Research India, ² Massachusetts Institute of Technology
ACM Multimedia
October 20, 2009
![Page 2: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/2.jpg)
Processing in the Compressed Domain
• Multimedia archives are growing rapidly
– Monsters vs. Aliens production: 100 TB
– Facebook photos: 400 TB
– YouTube: 600 TB
• How to analyze or modify the data?
– Typical practice: Compressed Input → Uncompress → Process → Recompress → Compressed Output
– Compressed-domain transformation: Compressed Input → Process → Compressed Output
(Slide annotation: lossless prior to distribution)
![Page 3: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/3.jpg)
Prior Work: Focus on Lossy Formats
• DCT-based spatial compression (JPEG, MPEG stills)
– Resizing [Dugad & Ahuja 2001] [Mukherjee & Mitra 2002]
– Edge detection [Shen & Sethi 1996]
– Image segmentation [Feng & Jiang 2003]
– Shearing and rotating inner blocks [Shen & Sethi 1998]
– Linear combinations of pixels [Smith & Rowe 1996]
• DCT-based temporal compression (MPEG video)
– Captioning [Nang, Kwon, & Hong 2000]
– Reversal [Vasudev 1998]
– Distortion detection [Dorai, Ratha, & Bolle 2000]
– Transcoding [Acharya & Smith 1998]
• Almost no work on lossless formats
– Transpose and rotation of black/white images [Shoji 1995; Misra et al. 1999]
– Pattern matching in compressed text [Farach & Thorup 1998; Navarro 2003]
– Modifying pitch and playback of audio [Levine 1998]
Our Focus: Regular Processing of LZ77-Compressed Data Streams
![Page 5: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/5.jpg)
Example: to lowercase
Input: O O O O L A L A L A
Output: o o o o l a l a l a
![Page 6: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/6.jpg)
Example
Input: O O O O L A L A L A
Compressed input: O O O O L A ⟨2,4⟩
Output: o o o o l a l a l a
“Repeat token” ⟨distance, count⟩: ⟨2,4⟩ has distance 2 and count 4, so it copies 4 items starting 2 items back, reproducing the trailing L A L A.
![Page 9: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/9.jpg)
Example
Input: O O O O L A L A L A
Compressed input: O ⟨1,3⟩ L A ⟨2,4⟩
Output: o o o o l a l a l a
“Repeat token” ⟨distance, count⟩: ⟨1,3⟩ copies 3 items from distance 1 (O O O); ⟨2,4⟩ copies 4 items from distance 2 (L A L A).
![Page 11: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/11.jpg)
Example
Input: O O O O L A L A L A
Compressed input: O ⟨1,3⟩ L A ⟨2,4⟩ (repeat tokens written ⟨distance, count⟩)
Compressed output: o ⟨1,3⟩ l a ⟨2,4⟩
Output: o o o o l a l a l a
![Page 12: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/12.jpg)
Example
Input: O O O O L A L A L A
Compressed input: O ⟨1,3⟩ L A ⟨2,4⟩ (repeat tokens written ⟨distance, count⟩)
↓ Compressed-domain transformation
Compressed output: o ⟨1,3⟩ l a ⟨2,4⟩
Output: o o o o l a l a l a
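The lowercase example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the compiler's implementation: the token representation (a string for a literal, a `(distance, count)` tuple for a repeat) and the helper names `decode` and `map_compressed` are hypothetical.

```python
from typing import Callable, List, Tuple, Union

# Assumed representation: a literal item is a str, a repeat token is (distance, count).
Token = Union[str, Tuple[int, int]]

def decode(tokens: List[Token]) -> List[str]:
    """Expand literals and (distance, count) repeats into uncompressed items."""
    out: List[str] = []
    for tok in tokens:
        if isinstance(tok, tuple):
            distance, count = tok
            for _ in range(count):
                # Copy one item at a time so that overlapping repeats work.
                out.append(out[-distance])
        else:
            out.append(tok)
    return out

def map_compressed(f: Callable[[str], str], tokens: List[Token]) -> List[Token]:
    """Apply a 1-to-1 function to literals only; repeat tokens pass through unchanged."""
    return [tok if isinstance(tok, tuple) else f(tok) for tok in tokens]

compressed_input: List[Token] = ["O", (1, 3), "L", "A", (2, 4)]
compressed_output = map_compressed(str.lower, compressed_input)
```

Only the three literals are touched, while the ten uncompressed items are never materialized; decoding `compressed_output` gives the same result as lowercasing the decoded input.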
![Page 14: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/14.jpg)
Our Contributions
• Handle the general case
– Produce and consume more than one data item
– Split and join data streams
• Implement in a compiler
– Programmer thinks in terms of uncompressed data
– Compiler translates to work on compressed data
– Relies on StreamIt programming language
• Evaluate on video processing tasks
– 12 videos in Apple Animation format
– Adjust colors or overlay two videos
– Speedups proportional to compression ratio (median 15x)
[Figure: compressed-domain transformation from compressed input O ⟨1,3⟩ L A ⟨2,4⟩ to compressed output o ⟨1,3⟩ l a ⟨2,4⟩, with repeat tokens written ⟨distance, count⟩]
![Page 15: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/15.jpg)
In This Talk
• StreamIt Language
• Compressed Domain Transformation
• Experimental Evaluation
![Page 16: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/16.jpg)
The StreamIt Language

void->void pipeline FMRadio(float freq1, float freq2, int N) {
    add AtoD();
    add FMDemod();
    // N parallel band-pass branches, each a low-pass followed by a high-pass filter
    add splitjoin {
        split duplicate;
        for (int i=0; i<N; i++) {
            add pipeline {
                add LowPassFilter(freq1 + i*(freq2-freq1)/N);
                add HighPassFilter(freq2 + i*(freq2-freq1)/N);
            }
        }
        join roundrobin();
    }
    add Adder();
    add Speaker();
}

[Stream graph: AtoD → FMDemod → Duplicate → (LPF1 → HPF1 | LPF2 → HPF2 | LPF3 → HPF3) → RoundRobin → Adder → Speaker]
![Page 17: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/17.jpg)
The StreamIt Language
[Stream graph: the FMRadio example, AtoD → FMDemod → Duplicate → (LPF1 → HPF1 | LPF2 → HPF2 | LPF3 → HPF3) → RoundRobin → Adder → Speaker]
• Applications
– DES and Serpent [PLDI 05]
– MPEG-2 [IPDPS 06]
– SAR, DSP benchmarks, JPEG, …
• Programmability
– StreamIt Language (CC 02)
– Teleport Messaging (PPOPP 05)
– Programming Environment in Eclipse (P-PHEC 05)
• Domain Specific Optimizations
– Linear Analysis and Optimization (PLDI 03)
– Optimizations for bit streaming (PLDI 05)
– Linear State Space Analysis (CASES 05)
• Architecture Specific Optimizations
– Compiling for Communication-Exposed Architectures (ASPLOS 02 & 06, dasCMP 07)
– Phased Scheduling (LCTES 03)
– Cache Aware Optimization (LCTES 05)
– Load-Balanced Rendering (Graphics Hardware 05)
• Migrating Legacy Code to a Stream Representation
– Using a Dynamic Analysis (MICRO 07)
![Page 18: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/18.jpg)
Language Primitives
• Filter: pop N, push M (example: pop 2, push 1)
• Splitter: roundrobin(N,M) (example: roundrobin(1,1))
• Joiner: roundrobin(N,M) (example: roundrobin(2,2))
Model of computation also known as cyclo-static dataflow
![Page 19: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/19.jpg)
Example: Video Compositing
[Stream graph: Source 1 and Source 2 → roundrobin(1,1) joiner → MultiplyPixels (pop 2, push 1) → Output]
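The language primitives and the compositing graph can be mimicked in plain Python to make the data flow concrete. This is only a sketch on uncompressed lists: `run_filter`, `split_roundrobin` and `join_roundrobin` are hypothetical helper names, not StreamIt APIs, and all weights are assumed to be positive.

```python
from typing import Callable, List, Sequence

def run_filter(work: Callable[[Sequence], Sequence], pop: int, items: Sequence) -> List:
    """Fire `work` on successive windows of `pop` items; each firing returns the pushed items."""
    out: List = []
    for start in range(0, len(items) - pop + 1, pop):
        out.extend(work(items[start:start + pop]))
    return out

def split_roundrobin(weights: Sequence[int], items: Sequence) -> List[List]:
    """Deal items to the branches, weights[b] items to branch b per round."""
    outs: List[List] = [[] for _ in weights]
    pos = 0
    while pos < len(items):
        for branch, weight in enumerate(weights):
            outs[branch].extend(items[pos:pos + weight])
            pos += weight
    return outs

def join_roundrobin(weights: Sequence[int], streams: Sequence[Sequence]) -> List:
    """Interleave the streams, weights[b] items from stream b per round (whole rounds only)."""
    out: List = []
    pos = [0] * len(streams)
    while all(pos[b] + w <= len(streams[b]) for b, w in enumerate(weights)):
        for b, w in enumerate(weights):
            out.extend(streams[b][pos[b]:pos[b] + w])
            pos[b] += w
    return out

# Compositing: join two pixel streams with roundrobin(1,1), then multiply each pair
# with a filter that pops 2 items and pushes 1.
source1 = [1, 2, 3]
source2 = [10, 10, 0]
joined = join_roundrobin((1, 1), [source1, source2])
composite = run_filter(lambda window: [window[0] * window[1]], 2, joined)
```

Here `joined` interleaves the two sources pixel by pixel, and `composite` holds one product per pixel pair.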
![Page 20: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/20.jpg)
In This Talk
• StreamIt Language
• Compressed Domain Transformation
• Experimental Evaluation
![Page 21: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/21.jpg)
Transforming Windows of Data
Filter: HyphenatePairs
Input: O O O O L A L A L A
Output: O O – O O – L A – L A – L A –
![Page 23: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/23.jpg)
Transforming Windows of Data
Input: O O O O L A L A L A
Compressed input: O ⟨1,3⟩ L A ⟨2,4⟩ (repeat tokens written ⟨distance, count⟩)
Compressed output: … L A – ⟨3,6⟩
Output: O O – O O – L A – L A – L A –
![Page 25: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/25.jpg)
Transforming Windows of Data
Input: O O O O L A L A L A
Compressed input: O ⟨1,3⟩ L A ⟨2,4⟩ (repeat tokens written ⟨distance, count⟩)
Coarsened, expanded input: O O ⟨2,2⟩ L A ⟨2,4⟩
Compressed output: O O – ⟨3,3⟩ L A – ⟨3,6⟩
Output: O O – O O – L A – L A – L A –
![Page 26: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/26.jpg)
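General Case: Filters
The filter pops I items and pushes O items per execution; a repeat token is written ⟨D, N⟩ (distance D, count N).
Input: … ⟨D, N⟩ … → Filter (I → O)
Coarsen: … ⟨D', N'⟩ … → Filter (I → O)
D' = LCM(D, I)
N' = N - (D' - D)
Translate: Filter (I → O) → … ⟨D' · O/I, N'' · O/I⟩ …, followed by the remaining N' % I items
N'' = N' - N' % I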
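The coarsen and translate formulas can be checked against the HyphenatePairs example with a short Python sketch. The function names `coarsen` and `translate` are hypothetical, and two assumptions are made beyond the slide: the repeat is long enough to absorb the expansion (N ≥ D' − D), and after expansion it starts on a boundary between filter executions.

```python
from math import gcd
from typing import Tuple

def coarsen(distance: int, count: int, pop: int) -> Tuple[int, int, int]:
    """Return (items to expand, D', N') so that D' is a multiple of the pop rate."""
    d_coarse = distance * pop // gcd(distance, pop)   # D' = LCM(D, I)
    expand = d_coarse - distance                      # items copied out explicitly first
    if count < expand:
        # Assumption: a repeat this short would simply be decompressed instead.
        raise ValueError("repeat too short to coarsen")
    return expand, d_coarse, count - expand           # N' = N - (D' - D)

def translate(d_coarse: int, n_coarse: int, pop: int, push: int) -> Tuple[Tuple[int, int], int]:
    """Return the output repeat (distance, count) and the number of leftover input items."""
    leftover = n_coarse % pop                         # N' % I items handled separately
    n_whole = n_coarse - leftover                     # N'' = N' - N' % I
    return (d_coarse * push // pop, n_whole * push // pop), leftover

# HyphenatePairs pops 2 items and pushes 3.
first = coarsen(1, 3, 2)     # token <1,3> following the literal O
second = coarsen(2, 4, 2)    # token <2,4> following the literals L A
```

For `<1,3>` the sketch expands one extra `O` and coarsens to `<2,2>`, which translates to `<3,3>`; `<2,4>` is already aligned and translates to `<3,6>`, matching the compressed output O O – ⟨3,3⟩ L A – ⟨3,6⟩.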
![Page 27: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/27.jpg)
Splitting Streams
Splitter: roundrobin(1,1)
Input: L A L A L A L A L A
Compressed input: L A ⟨2,8⟩ (repeat tokens written ⟨distance, count⟩)
Outputs: L L L L L and A A A A A
Compressed outputs: L ⟨1,4⟩ and A ⟨1,4⟩
![Page 28: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/28.jpg)
Splitting Streams
Splitter: roundrobin(2,2)
Input: L A L A L A L A L A
Compressed input: L A ⟨2,8⟩ (repeat tokens written ⟨distance, count⟩)
Coarsened, expanded input: L A L A ⟨4,6⟩
Compressed outputs: L A ⟨2,4⟩ and L A ⟨2,2⟩
![Page 30: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/30.jpg)
Splitting and Joining: Transpose
[Animation: a grid of O and X items is transposed by a splitter and a joiner with round-robin weights 1,1 and 4,4; the repeat tokens in the compressed streams are rewritten at each stage.]
![Page 35: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/35.jpg)
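General Case: Joiners
Joiner: roundrobin(W1, W2); repeat tokens are written ⟨D, N⟩ (distance D, count N)
Input 1: … ⟨D1, N1⟩ …
Input 2: … ⟨D2, N2⟩ …
If D1 % W1 = 0 and D2 % W2 = 0 and D1/W1 = D2/W2:
Output: … ⟨D1 · (W1+W2)/W1, N'⟩ …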
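A minimal Python sketch of the joiner rule follows. The slide does not define N', so the count computed here is an assumption: the number of whole joiner rounds covered by both repeats, times W1+W2. The function name `join_repeats` is hypothetical, and both repeats are assumed to start at the same round boundary.

```python
from typing import Optional, Tuple

def join_repeats(d1: int, n1: int, w1: int,
                 d2: int, n2: int, w2: int) -> Optional[Tuple[int, int]]:
    """Combine repeats (d1, n1) and (d2, n2) arriving at a roundrobin(w1, w2) joiner.

    Returns the joined (distance, count), or None when the rule does not apply.
    """
    if d1 % w1 != 0 or d2 % w2 != 0 or d1 // w1 != d2 // w2:
        return None
    # Assumption (not stated on the slide): N' covers the whole rounds
    # for which both inputs are still inside their repeats.
    rounds = min(n1 // w1, n2 // w2)
    return (d1 * (w1 + w2) // w1, rounds * (w1 + w2))

# Joining L <1,4> and A <1,4> with roundrobin(1,1) should rebuild L A <2,8>.
joined_token = join_repeats(1, 4, 1, 1, 4, 1)
```

Both inputs reach back the same number of joiner rounds (D1/W1 = D2/W2), so the interleaved output reaches back that many rounds of W1+W2 items each.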
![Page 36: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/36.jpg)
In This Talk
• StreamIt Language
• Compressed Domain Transformation
• Experimental Evaluation
![Page 37: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/37.jpg)
Implementation
• Implemented subset of transformations in StreamIt
– User can change graph connectivity + filter functions
• Supported file format: Apple Animation (part of .MOV)
– Standard format for interchange of lossless video
– Compression: run-length encoding within a line + difference encoding between frames
• Emit executable plugins for MEncoder and Blender
– Allows integration with standard video editing workflow
[Figure: supported stream graphs: a 1-to-1 filter; a 1-to-1 joiner with a 2-to-1 filter]
![Page 38: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/38.jpg)
Experimental Methodology
• Evaluated on 12 videos drawn from Internet video, computer animation, and stock digital television content
• Two classes of transformations:
1. Color adjustment: inverse, brightness, contrast
2. Composite transformations: alpha-under, multiply
[Figure: example frames combined by alpha-under (+) and multiply (x), shown with the 1-to-1 filter and 2-to-1 compositing stream graphs]
![Page 39: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/39.jpg)
Results: Execution Time
[Plot: speedup (1x to 1000x) versus compression factor (1x to 1000x), logarithmic axes, for Brightness, Contrast, Inverse, and Compositing. Axis labels: Speedup; Compression Factor; Compression Factor Following Re-compression]
• Color adjustment: 2.5x to 471x (median 17x)
• Compositing: 1.1x to 32x (median 6.6x)
– Compression factor was low (≤1.1x) for one of the source videos
![Page 40: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/40.jpg)
Results: File Bloat
[Plot: file bloat relative to re-compression (0x to 6x) versus compression factor following re-compression (1x to 1000x, logarithmic axis), for Brightness, Contrast, Inverse, and Compositing]
• Masked-out areas not re-compressed
• Saturated colors not re-compressed
![Page 41: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/41.jpg)
Opportunity: Ignoring “Dead” Data
• Some pixels in composite frames do not depend on both input frames
– Example: digital television mask (a low-performance case)
• If two data streams are multiplied, and one of them is repeatedly zero, then the repeat can be copied to the output (regardless of the values in the other stream)
– We expect this would fix performance of our outlier cases
– Requires pattern matching on stream graph
[Figure: two frames multiplied (x) by a 1-to-1 joiner with a 2-to-1 filter]
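The zero-run idea can be illustrated on run-length encoded streams. This Python sketch is a simplification, not the proposed stream-graph pattern matching: streams are assumed to be lists of `(value, length)` runs with positive lengths, and `multiply_runs` is a hypothetical helper.

```python
from typing import List, Tuple

Run = Tuple[int, int]  # assumed representation: (pixel value, run length)

def multiply_runs(a: List[Run], b: List[Run]) -> List[Run]:
    """Multiply two run-length encoded streams without expanding them to pixels."""
    out: List[Run] = []
    i = j = 0
    left_a = left_b = 0          # items still unread in the current run of a / b
    val_a = val_b = 0
    while True:
        if left_a == 0:
            if i == len(a):
                break
            val_a, left_a = a[i]
            i += 1
        if left_b == 0:
            if j == len(b):
                break
            val_b, left_b = b[j]
            j += 1
        step = min(left_a, left_b)
        # A zero on either side decides the product, whatever the other side holds.
        product = 0 if val_a == 0 or val_b == 0 else val_a * val_b
        if out and out[-1][0] == product:
            out[-1] = (product, out[-1][1] + step)   # extend the previous run
        else:
            out.append((product, step))
        left_a -= step
        left_b -= step
    return out

mask = [(0, 6), (2, 2)]                      # long masked-out (zero) run, then a visible part
frame = [(5, 1), (7, 2), (9, 3), (3, 2)]     # poorly compressed frame data
result = multiply_runs(mask, frame)
```

The six-pixel zero run of the mask survives as a single run in the output even though the other stream changes value three times underneath it.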
![Page 42: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/42.jpg)
Extension to Other File Formats
• High-efficiency mappings
– Flic Video
– Microsoft RLE
– Targa (with run-length encoding)
• Medium-efficiency mappings (re-arrange data by color or by byte)
– Open EXR
– Planar RGB
• Low-efficiency mappings (Huffman coding is layered on top of LZ77 and must be decoded before the LZ77 tokens can be reached)
– ZIP
– GZIP
– PNG
![Page 43: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/43.jpg)
Conclusions
• New method for direct processing of lossless-encoded data streams
– Relies on LZ77 compression and stream programming model
– Supports operations on windows of data
– Supports splitting, joining, and reordering data
• Preliminary implementation in an automatic compiler
– Write program on uncompressed data, run on compressed data
• Good speedups in the context of video processing
– 15x speedup (median) on color adjustment and compositing
– Across 12 videos in Apple Animation format
– May prove useful as more content is authored in lossless formats
• Scope for extending technique, finding new applications
![Page 44: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/44.jpg)
Extra Slides
![Page 45: ACM Multimedia October 20, 2009](https://reader035.vdocuments.mx/reader035/viewer/2022062309/568134b8550346895d9bd747/html5/thumbnails/45.jpg)
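General Case: Splitters
Splitter: roundrobin(U, V); a repeat token is written ⟨D, N⟩ (distance D, count N).
Input: … ⟨D, N⟩ … → Split
Coarsen: … ⟨D', N'⟩ … → Split
D' = LCM(D, U+V)
N' = N - (D' - D)
Translate: Split → … ⟨D' · V/(U+V), N'' · V/(U+V)⟩ … on the branch with weight V (and likewise with U on the other branch), followed by the remaining N' % (U+V) items
N'' = N' - N' % (U+V)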
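The splitter formulas can be checked against the Splitting Streams examples with a small Python sketch. `split_repeat` is a hypothetical name; as with filters, the sketch assumes that the repeat is long enough to absorb the expansion and that it starts on a splitter round boundary after expansion.

```python
from math import gcd
from typing import Tuple

def split_repeat(distance: int, count: int, u: int, v: int
                 ) -> Tuple[int, Tuple[int, int], Tuple[int, int], int]:
    """Translate a repeat (distance, count) through a roundrobin(u, v) splitter.

    Returns (items to expand, repeat on the U branch, repeat on the V branch,
    leftover items that are split as ordinary data).
    """
    period = u + v
    d_coarse = distance * period // gcd(distance, period)   # D' = LCM(D, U+V)
    expand = d_coarse - distance
    if count < expand:
        # Assumption: a repeat this short would simply be decompressed instead.
        raise ValueError("repeat too short to coarsen")
    n_coarse = count - expand                                # N' = N - (D' - D)
    leftover = n_coarse % period                             # N' % (U+V)
    n_whole = n_coarse - leftover                            # N'' = N' - N' % (U+V)
    u_branch = (d_coarse * u // period, n_whole * u // period)
    v_branch = (d_coarse * v // period, n_whole * v // period)
    return expand, u_branch, v_branch, leftover

one_one = split_repeat(2, 8, 1, 1)   # L A <2,8> through roundrobin(1,1)
two_two = split_repeat(2, 8, 2, 2)   # L A <2,8> through roundrobin(2,2)
```

For roundrobin(1,1) each branch receives `<1,4>` directly. For roundrobin(2,2) two items are expanded first (giving L A L A ⟨4,6⟩), each branch receives `<2,2>`, and the two leftover items go to the first branch, where they can extend its repeat to `<2,4>`.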