video communication 2009


Enrico Magli

Dept. of Electronics - Politecnico di Torino (Italy)


http://www1.tlc.polito.it/sas-ipl

Video communication



Credits

Parts of the material used in this course have been inspired by or taken from the work of the following people:
- Yao Wang (Polytechnic University, New York, USA)
- Bernd Girod (Stanford University, USA)
- Dapeng Wu (Univ. of Florida, USA)
- Y. Guo, C. Neumann (Thomson Inc.)
- D. Purandare (Univ. of Central Florida, USA)



Information theory and compression



General transmission scheme

[Block diagram: Encoder/Modulator → Channel → Decoder/Demodulator. Encoder side: source coding + channel coding. Decoder side: channel decoding + source decoding.]



Source coding

Typical signals contain redundancy:
- Coding (representation) redundancy
- Correlation between adjacent samples
- Psychovisual redundancy

Goal: to reduce the intrinsic redundancy of the source signal → to find a more compact signal representation



Notation

Let n1 and n2 be the numbers of information-carrying units in two data sets that represent the same information

Compression ratio:

$$ CR = \frac{n_1}{n_2} $$

A compression algorithm searches for a representation with CR > 1



Coding redundancy

Data is not equal to information

Data is the means by which information is conveyed

The same story can be told with a different number of words if the teller is long-winded or short and to the point!

This is the coding redundancy



Coding redundancy

White is the most likely value in this picture

Encoding each pixel with the same number of bits leads to coding redundancy

Variable length coding is the solution to coding redundancy



Coding redundancy

Let p(r_k) = n_k / n be the probability of occurrence of gray level r_k, k = 0, 1, 2, …, L-1

Let r_k be represented by l(r_k) bits; the average number of bits to represent each pixel is

$$ L_{ave} = \sum_{k=0}^{L-1} l(r_k)\, p(r_k) $$

If l(r_k) = m for every k, then L_ave = m



Coding redundancy

It makes sense to assign fewer bits to those r_k for which p(r_k) is larger

This lowers L_ave, and therefore achieves data compression

Variable length codes are used
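The effect of the code lengths on the average number of bits per pixel can be checked numerically. The sketch below is only an illustration: the 4-level probabilities and both sets of codeword lengths are made-up values (not taken from the slides), chosen so that the most likely level gets the shortest codeword.

```python
def average_length(probs, lengths):
    """L_ave = sum over k of l(r_k) * p(r_k), in bits per pixel."""
    return sum(p * l for p, l in zip(probs, lengths))

# Hypothetical 4-level image in which the last level (e.g. white) dominates.
probs = [0.05, 0.05, 0.10, 0.80]
fixed = [2, 2, 2, 2]       # fixed-length code: l(r_k) = m = 2
variable = [3, 3, 2, 1]    # variable-length code: likelier levels get fewer bits

print(average_length(probs, fixed))     # equals m for the fixed-length code
print(average_length(probs, variable))  # lower, because short codewords are used often
```

With these assumed numbers the variable-length code needs fewer bits per pixel on average than the fixed-length one, which is the coding redundancy being removed.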



Inter-pixel redundancy

Large areas of the image are uniform

This means correlation among pixels (adjacent pixels are almost the same)

This is not solved using variable length coding (which works on each single pixel)

Psychovisual redundancy

Human perception of the information of an image does not involve quantitative analysis of every pixel

Pixel values can be modified up to a given extent without significant subjective degradation

Modifications must involve psychovisually redundant information (not easy to define)

The image is irreversibly altered

Psychovisual redundancy

The brain searches for distinguishing features and mentally combines them into objects (recognizable groups of pixels)

Use of prior knowledge for interpretation (face, wall, poster)

If the wall were slightly different this could not be perceived



Data compression

Data compression algorithms can be divided into two categories:
- Lossless coding: compressed signal is equal to the original (no coding errors)
- Lossy coding: a controlled amount of errors is tolerated, according to human subjective sensing capability

Lossless source coding

Mainly for data (e.g. PC files)

Sensitive applications (biomedical images, remote sensing data)

Output = input (no losses)

Represents the signal samples (or data codewords) in a more efficient way

Achievable compression ratios in the case of natural images: 2.5-3

Lossy source coding

A certain degree of distortion is accepted between output and input

Distortion should not be apparent

Lost data cannot be recovered

Much larger achievable compression ratios: 50 and more



Introduction

Information Theory was invented by Claude Shannon in the 1940s

It has set the mathematical framework for digital communications

Information Theory teaches us
- how to "measure" information
- how to represent information "efficiently"
- how to reliably transmit information across communication channels



Measuring information

A discrete memoryless source generates symbols from a set X of M elements (alphabet); each symbol is characterized by its probability of occurrence p_i

$$ X = \{x_i\}_{i=1}^{M}, \qquad \{p_i\}_{i=1}^{M} $$

How do we measure the amount of information carried by message x_i?



Properties of information

The amount of information carried by a message decreases as its probability increases:

$$ I(x_i) > I(x_j) \quad \text{if } p_i < p_j $$

Statistically independent messages:

$$ P(x_i, x_j) = P(x_i)\,P(x_j) \;\Rightarrow\; I(x_i, x_j) = I(x_i) + I(x_j) $$



Definition of "information"

The information carried by message x_i is

$$ I(x_i) = \log_2 \frac{1}{p_i} $$

and is measured in bits

Average amount of information carried by the memoryless source: entropy (bits/symbol)

$$ H(X) = \sum_{i=1}^{M} p_i\, I(x_i) = \sum_{i=1}^{M} p_i \log_2 \frac{1}{p_i} $$



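Examples

$$ X = \{x_1, x_2, x_3, x_4\}, \quad p_1 = \tfrac{1}{2},\ p_2 = \tfrac{1}{4},\ p_3 = \tfrac{1}{8},\ p_4 = \tfrac{1}{8} \quad\Rightarrow\quad H(X) = 1.75 \text{ bits/symbol} $$

$$ X = \{x_1, x_2, x_3, x_4\}, \quad p_1 = p_2 = p_3 = p_4 = \tfrac{1}{4} \quad\Rightarrow\quad H(X) = 2 \text{ bits/symbol} $$

Equiprobable sources carry more information, and are more difficult to compress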
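The two entropy values quoted in the examples can be reproduced with a few lines of code. This is a minimal sketch of the first-order entropy formula; the function name `entropy` is just an illustrative choice.

```python
import math

def entropy(probs):
    """First-order entropy H(X) = sum_i p_i * log2(1 / p_i), in bits/symbol."""
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

skewed = [1 / 2, 1 / 4, 1 / 8, 1 / 8]
uniform = [1 / 4, 1 / 4, 1 / 4, 1 / 4]

print(entropy(skewed))   # 1.75
print(entropy(uniform))  # 2.0
```

The uniform source reaches the maximum possible value for four symbols, log2(4) = 2 bits/symbol.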



Bounds on Entropy

Theorem: the first order entropy of a memoryless M-symbol alphabet is limited by

$$ H(X) \le \log_2 M $$

with equality if the symbols are equiprobable

Example: 8 bit quantizer (M = 2^8)

$$ H(X) \le 8 \text{ bit/symbol} $$



Noiseless coding theorem (Shannon)

It is possible to code without any loss of information a memoryless source with entropy H(X) bits/symbol, using H(X) + ε bits/symbol

ε is a quantity that can be made arbitrarily small by considering increasingly larger blocks of symbols to be coded



Bounds on lossless coding

The average codeword length of a lossless coder cannot be less than the entropy

Entropy represents the target average number of bits/symbol of a lossless encoder

Coding Efficiency:

η = H(X) / n

where n is the average codeword length



…if not memoryless…

If the source is correlated, the first order entropy does not represent a bound to the average codeword length

All the previous results still hold, replacing first order entropy with the entropy rate H_∞, which takes into account correlation amongst symbols

$$ H_\infty(X) = \lim_{N \to \infty} \frac{1}{N} H_N(X) $$

$$ H_N(X) = -\sum_{i_1=1}^{m} \sum_{i_2=1}^{m} \cdots \sum_{i_N=1}^{m} P(x_1 = i_1, x_2 = i_2, \ldots, x_N = i_N)\, \log_2 P(x_1 = i_1, x_2 = i_2, \ldots, x_N = i_N) $$



Lossless - Introduction

The goal of lossless compression is to minimize the average length of the compressed symbols, exploiting statistical properties of the data:
- probability distribution
- correlation (redundancy) of the data

No distortion is accepted → minimize the rate for zero distortion

Introduction to lossy compression

Advantages of lossy compression:
- Higher compression ratios
- Low distortion of the decoded data
- Possibility to "shape" the error between original and decoded data

Disadvantages:
- The decoded signal does not exactly match the original

Examples of lossy compression

Speech/Audio:
- GSM speech compression
- MP3 audio

Image compression:
- JPEG
- JPEG2000

Video compression:
- MPEG-2
- H.263



Lossy compression

Consider an i.i.d. discrete-time random process X

Main difference with respect to lossless compression: we accept some distortion → we reconstruct X* ≠ X

A single letter distortion measure for a length-m data vector is defined as

$$ \rho_m(x, x^*) = \frac{1}{m} \sum_{j=0}^{m-1} \rho(x_j, x_j^*) $$

with ρ(.) a nonnegative "component by component" distortion measure



Examples of distortion measures

Hamming metric

$$ \rho(x_i, x_j^*) = \begin{cases} 0 & \text{if } x_i = x_j^* \\ 1 & \text{if } x_i \ne x_j^* \end{cases} $$

Euclidean metric

$$ \rho(x_i, x_j^*) = (x_i - x_j^*)^2 $$
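As a small numerical illustration, the sketch below averages a per-component distortion over a data vector, once with the Hamming measure and once with the squared-error measure. The function names and the sample vectors are illustrative assumptions, not part of the original slides.

```python
def hamming(x, x_star):
    """Per-component Hamming distortion: 0 if equal, 1 otherwise."""
    return 0 if x == x_star else 1

def squared_error(x, x_star):
    """Per-component squared-error (Euclidean) distortion."""
    return (x - x_star) ** 2

def vector_distortion(xs, xs_star, rho):
    """Average of the per-component distortion rho over a length-m vector."""
    if len(xs) != len(xs_star) or not xs:
        raise ValueError("vectors must be non-empty and of equal length")
    return sum(rho(a, b) for a, b in zip(xs, xs_star)) / len(xs)

original = [10, 12, 15, 20]   # hypothetical source samples
decoded = [10, 13, 15, 18]    # hypothetical reconstruction

print(vector_distortion(original, decoded, hamming))        # fraction of wrong samples
print(vector_distortion(original, decoded, squared_error))  # mean squared error
```

The Hamming measure only counts how many samples changed, whereas the squared error also weighs how large each change is.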



Source code

A source code Q describes a source X by an approximation X*, such that:
- the distortion between X and X* is equal to D
- the rate necessary to transmit X* losslessly is R

[Block diagram. Encoder: X → Quantizer → I* → Entropy coder → S. Decoder: S → Entropy decoder → I* → Dequantizer → X*. Distortion D, rate R bit/sample.]



Rate-distortion function

Given a desired expected distortion D, the rate-distortion function R(D) is the minimum rate at which we can guarantee the existence of a source code that represents X with X*, so that X* is encoded with a rate of R bit/sample, and

$$ E\left[\rho_m(X, X^*)\right] \le D $$



Example of R-D function

[Plot: rate R versus distortion D, with the points "lossless coding" and "maximum distortion" marked]

Operational R-D function

In practice the R-D function is very difficult to compute for realistic sources.

Usually one employs the operational R-D function, which is the set of practically achievable R-D points for a given sample realization of the source and a specific code

Operational R-D function

Example: consider compressing an image with JPEG at different quality factors

[Plot: operational R-D curve, rate R versus distortion D]



Huffman Coding

Problem statement
- Given a source X emitting symbols a_i with probability P(a_i)
- Find a compact representation c(a_i) → c_i

Objective
- If l_i represents the length of codeword c_i
- we want to minimize the average length

$$ \bar{l} = \sum_i l_i\, P(a_i) $$



Huffman Coding

The average length

$$ \bar{l} = \sum_i l_i\, P(a_i) $$

will be minimized if
- $\forall\, a_i, a_j \mid P(a_i) \ge P(a_j) \Rightarrow l_i \le l_j$
- Variable length coding (VLC): the shortest codewords are allocated to the most probable symbols



Huffman Coding

We must guarantee that the codewords c(a_i) → c_i are uniquely decodable

Huffman coding is based on the idea of prefix free coding
- No codeword can be a prefix of another codeword

$$ \forall\, c_i, c_j,\ i \ne j \mid l_i \le l_j \;\Rightarrow\; c_i \ne c_j\big|_1^{l_i} $$

where $c_j|_1^{l_i}$ denotes the first l_i bits of c_j



Huffman Coding

- Example: cod. 1 is not prefix free, cod. 2 is prefix free

symbol  cod. 1  cod. 2
a       0       0
b       01      10
c       11      11

[Figure: binary trees of cod. 1 and cod. 2, with branches labeled 0 and 1]

- Prefix free codes are represented with a binary tree where internal nodes do not represent codewords (the codewords are only the leaves of the tree)



Huffman Coding

Huffman code construction was proposed in D.A. Huffman, "A method for the construction of minimum redundancy codes", Proc. of the IRE, 1952



Code construction example

Memoryless source X = {a, b, c, d, e}

P(a_i) = {0.4, 0.2, 0.2, 0.1, 0.1}

The two least probable symbols are grouped to form a tree node

The sum of the probabilities of the two symbols is attributed to the tree node: P({d,e}) = P(d) + P(e) = 0.2

Code construction example

The procedure is iterated considering both the remaining symbols and the created trees

[Tree construction steps: node {d,e} (0.2) is merged with c (0.2) into a node of probability 0.4; that node is merged with b (0.2) into a node of probability 0.6; that node is merged with a (0.4) into the root, with probability 1.0. Each pair of branches is labeled 0 and 1.]

symbol  probability  codeword
a       0.4          1
b       0.2          01
c       0.2          000
d       0.1          0010
e       0.1          0011

$$ \bar{l} = \sum_i l_i\, P(a_i) = 2.2 \text{ bps} $$

$$ H(X) = -\sum_i P(a_i) \log_2 P(a_i) = 2.122 \text{ bps} $$
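The construction above can be sketched in a few lines using a priority queue. This is a minimal illustration, not the implementation used in the course: the function name `huffman_code` and the tie-breaking counter are choices made here. When several nodes have equal probability, a different but equally optimal set of codewords may come out; the average length is the same.

```python
import heapq
import math

def huffman_code(probs):
    """Return {symbol: codeword} for a dict {symbol: probability}."""
    # Heap entries: (probability, tie_breaker, {symbol: partial codeword}).
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)   # least probable node
        p1, _, c1 = heapq.heappop(heap)   # second least probable node
        # Group the two nodes: prepend one more bit to every codeword below them.
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (p0 + p1, counter, merged))
        counter += 1
    return heap[0][2]

source = {"a": 0.4, "b": 0.2, "c": 0.2, "d": 0.1, "e": 0.1}
code = huffman_code(source)
avg = sum(source[s] * len(code[s]) for s in source)           # average length
h = -sum(p * math.log2(p) for p in source.values())           # entropy

print(code)
print(round(avg, 3), round(h, 3))   # average length and entropy, in bits/symbol
```

The ratio `h / avg` gives the coding efficiency η of the resulting code.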



Huffman Coding

Coding efficiency

$$ H(X) \le \bar{l} < H(X) + 1 $$

Tighter upper bounds are available
- Given the maximum probability value p_M

$$ \bar{l} < H(X) + p_M, \quad \text{if } p_M \ge 0.5 $$

$$ \bar{l} < H(X) + p_M + 0.086, \quad \text{if } p_M < 0.5 $$



Huffman Coding

Coding efficiency can be poor with a small alphabet with unbalanced probabilities (p_M > 0.5)

symbol  probability  codeword
a       0.8          0
b       0.18         11
c       0.02         10

H(X) = 0.816 bps

average length = 1.2 bps



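Extended Huffman Coding

Extended Huffman Coding is obtained by coding n-tuples of symbols
- Example

Single symbols:

symbol  probability  codeword
a       0.8          0
b       0.18         11
c       0.02         10

Pairs of symbols:

pair  probability  codeword
aa    0.64         0
ab    0.144        11
ac    0.016        10101
ba    0.144        100
bb    0.0324       1011
bc    0.0036       10100100
ca    0.016        101000
cb    0.0036       1010011
cc    0.0004       10100101

H(X) = 0.816 bps

average length, single symbols = 1.2 bps

average length, pairs = 0.8614 bps (per original symbol)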



Huffman Coding

Extended Huffman coding efficiency is

$$ H(X) \le \bar{l} < H(X) + \frac{1}{n} $$

(where n is the number of grouped symbols)
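The gain from coding n-tuples can be checked numerically for the three-symbol source with probabilities 0.8, 0.18 and 0.02. The sketch below is an illustration under the assumption of i.i.d. symbols; it does not build the codewords, but uses the property that the average length of a Huffman code equals the sum of the probabilities of all merged (internal) tree nodes. The function names are illustrative.

```python
import heapq
from itertools import product

def huffman_average_length(probs):
    """Average codeword length of a Huffman code, in bits per coded unit.

    Equals the sum of the probabilities of all internal nodes created
    while repeatedly merging the two least probable nodes.
    """
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged
        heapq.heappush(heap, merged)
    return total

def extended_average_length(probs, n):
    """Bits per original symbol when Huffman-coding n-tuples of i.i.d. symbols."""
    block_probs = []
    for block in product(probs, repeat=n):
        p = 1.0
        for q in block:
            p *= q
        block_probs.append(p)
    return huffman_average_length(block_probs) / n

source = [0.8, 0.18, 0.02]
print(extended_average_length(source, 1))  # single symbols
print(extended_average_length(source, 2))  # pairs of symbols
print(extended_average_length(source, 3))  # triples: closer still to H(X)
```

The alphabet grows as 3^n, so the improvement in efficiency is paid for with larger code tables.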



Simplified VLC

An easy and sub-optimal VLC coding technique is known as Run-Length coding

It is based on the assumption that a given symbol is repeated in long runs
- Fax, B/W images

The symbol and the length of its run are coded

Example
- X = 000000100000000010000001
- Code: 6,9,6
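The run-length example can be sketched as follows. This is a minimal illustration for binary strings in which runs of zeros are terminated by a one, as in the example above; the function names are hypothetical, and a trailing run of zeros without a closing one is deliberately not handled.

```python
def zero_run_lengths(bits):
    """Lengths of the runs of '0' that precede each '1' in a binary string."""
    runs, count = [], 0
    for b in bits:
        if b == "0":
            count += 1
        elif b == "1":
            runs.append(count)
            count = 0
        else:
            raise ValueError("expected a string of '0' and '1'")
    # A trailing run of zeros (with no closing '1') is not emitted in this sketch.
    return runs

def decode_zero_runs(runs):
    """Inverse mapping: rebuild the binary string from the run lengths."""
    return "".join("0" * r + "1" for r in runs)

x = "000000100000000010000001"
runs = zero_run_lengths(x)
print(runs)
print(decode_zero_runs(runs) == x)
```

Only the run lengths need to be stored, which pays off when the runs are long compared with the number of bits needed to write each length.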



Basics of images

Light is part of the EM wave

Illuminating and reflecting light

Human Eye

[Figure: the human eye, with cones and rods labeled]



Human Eye

Human eye: some features
- The range of intensity that we can perceive is impressive (on the order of 10^10)
- The human visual system (HVS) cannot operate over such a range simultaneously
- Brightness adaptation is used
- Brightness discrimination is poor at low levels of illumination (Weber law)
- Sensitive to edges (high contrast zones)



Colors

Sensing colors
- The 7 million cones in the human eye can be divided into 3 categories, able to sense red (R), green (G), blue (B)
- RGB color model

Trichromatic color mixing

RGB vs. CMY

Color representation models

YCbCr color space

An important color space for video applications is the so-called YCbCr
- Luminance: Y = 0.299 R + 0.587 G + 0.114 B
- Chrominance: Cb = B - Y, Cr = R - Y
- Y corresponds to the black and white TV signal
- Cb/Cr can be used by color TV to generate R, G, B
- The HVS is much less sensitive to Cb, Cr (they can be compressed to a large extent without impairing the perceived quality)
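The luminance and color-difference definitions can be turned into a small conversion routine. The sketch below uses the luma weights 0.299, 0.587 and 0.114 (which sum to 1) and the unscaled differences Cb = B - Y and Cr = R - Y; practical digital video formats additionally scale and offset Cb and Cr, which is not modeled here. The function names are illustrative.

```python
def rgb_to_ycbcr(r, g, b):
    """Luminance plus unscaled color differences (Cb = B - Y, Cr = R - Y)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, b - y, r - y

def ycbcr_to_rgb(y, cb, cr):
    """Invert the mapping: R and B follow directly, G from the Y equation."""
    r = cr + y
    b = cb + y
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b

# A gray pixel has (numerically) zero chrominance: all the information is in Y.
print(rgb_to_ycbcr(128, 128, 128))

# A saturated red pixel and its round trip back to RGB.
y, cb, cr = rgb_to_ycbcr(255, 0, 0)
print(y, cb, cr)
print(ycbcr_to_rgb(y, cb, cr))
```

Because the receiver can rebuild R, G and B from Y, Cb and Cr, a black and white set can simply ignore the two chrominance signals.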



Image Transforms – part I

Outline

- Introduction
- Fourier Transform
- DFT

Introduction

An image can be described in space or frequency

Spatial frequency: the rate of change of an image

Representation in space domain: picture = collection of brightness levels

Representation in frequency domain: picture = collection of spatial frequency components

Space vs. frequency

[Figure: four example images, labeled "Dark, low frequency", "Dark, high frequency", "Bright, low frequency", "Bright, high frequency"]

Fourier Transform

The Fourier Transform is used to decompose an image into sine and cosine components

Used in a wide range of applications: image analysis, filtering, reconstruction and compression

As we are only concerned with digital images, we will only consider the (2D) Discrete Fourier Transform (DFT)

Example of image frequency representation

Images that are pure cosines have particularly simple FTs

Pure horizontal cosine of 8 cycles and pure vertical cosine of 32 cycles.

The FT has just a single component, represented by 2 bright spots symmetrically placed about the center of the FT image

The center of the image is the origin of the frequency coordinate system.

Example of image frequency representation

Images of 2D cosines with both horizontal and vertical components.

(left) 4 cycles horizontally and 16 cycles vertically.

(right) 32 cycles horizontally and 2 cycles vertically

For real images, the FT is symmetrical about the origin, so the 1st and 3rd (2nd and 4th) quadrants are the same

If the image is symmetrical about the x-axis, 4-fold symmetry results.

Discrete Fourier Transform

The DFT is the sampled Fourier Transform and therefore does not contain all frequencies forming an image, but only a set of samples which is large enough to fully describe the spatial domain image.

The number of frequencies corresponds to the number of pixels in the spatial domain image, i.e. the images in the spatial and Fourier domains are of the same size

Two-dimensional DFT



A square image x(n,m) of size N×N has the two-dimensional DFT (2-D DFT):

$$ F(k,l) = \frac{1}{N^2} \sum_{n=0}^{N-1} \sum_{m=0}^{N-1} x(n,m)\, \exp\left\{ -j 2\pi \left( \frac{kn}{N} + \frac{lm}{N} \right) \right\} $$

F(k,l) is obtained by multiplying the image with the corresponding base function and summing the result.

Two-dimensional DFT



The base functions are sine and cosine waves with increasing frequencies

F(0,0) represents the DC-component, which corresponds to the average brightness, and F(N-1,N-1) represents the highest frequency.

Separability of 2-D DFT



A double sum has to be calculated for each image point. However, because the DFT is separable, it can be written as

$$ F(k,l) = \frac{1}{N} \sum_{m=0}^{N-1} P(k,m)\, \exp\left\{ -j 2\pi \frac{lm}{N} \right\} $$

$$ P(k,m) = \frac{1}{N} \sum_{n=0}^{N-1} x(n,m)\, \exp\left\{ -j 2\pi \frac{kn}{N} \right\} $$

Separability of 2-D DFT



The spatial domain image is first transformed into an intermediate image using 1-D DFT applied to the rows

This intermediate image is then transformed into the final image, again using 1-D DFT applied to columns

This procedure decreases the number of required computations

Complexity of 2-D DFT: $O(N^2 \log_2 N)$
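Separability can be verified numerically on a small image. The sketch below is an illustration in pure Python: it compares the direct double sum with two passes of 1-D transforms, using a 1/N factor in each pass (so 1/N² overall). It uses a plain O(N²) 1-D DFT rather than an FFT, so it shows the equivalence of the two forms, not the O(N² log₂ N) complexity, which is obtained when the 1-D transforms are computed with a fast algorithm. The function names and the test image are assumptions.

```python
import cmath

def dft_1d(seq):
    """1-D DFT with a 1/N factor: (1/N) * sum_t seq[t] * exp(-j 2 pi k t / N)."""
    n = len(seq)
    return [sum(seq[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)) / n
            for k in range(n)]

def dft_2d_direct(img):
    """Direct double sum over n and m for every (k, l); img[n][m] = x(n, m)."""
    size = len(img)
    return [[sum(img[n][m] * cmath.exp(-2j * cmath.pi * (k * n + l * m) / size)
                 for n in range(size) for m in range(size)) / size ** 2
             for l in range(size)]
            for k in range(size)]

def dft_2d_separable(img):
    """Two passes of 1-D DFTs: first over n (giving P(k, m)), then over m."""
    size = len(img)
    # cols[m][k] = P(k, m): 1-D DFT over n, for fixed m.
    cols = [dft_1d([img[n][m] for n in range(size)]) for m in range(size)]
    p = [[cols[m][k] for m in range(size)] for k in range(size)]
    # F(k, l): 1-D DFT over m of P(k, .), for fixed k.
    return [dft_1d(p[k]) for k in range(size)]

image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 8, 7, 6],
         [5, 4, 3, 2]]

direct = dft_2d_direct(image)
separable = dft_2d_separable(image)
max_diff = max(abs(direct[k][l] - separable[k][l]) for k in range(4) for l in range(4))
print(max_diff)             # tiny: the two forms agree up to rounding
print(separable[0][0].real) # DC component = average brightness of the image
```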

Properties of 2-D DFT



The DFT produces a complex valued image

It is displayed with two images, typically magnitude and phase.

Only the magnitude is usually displayed

The Fourier domain image has a much greater range than the image in the spatial domain. Hence, its values are usually calculated and stored in float values and represented in log-scale

Magnitude and phase spectra

The images are horizontal cosines of 8 cycles, differing only by a 1/2 cycle lateral shift

Both have the same magnitude spectrum.

The phase spectrum would be different, of course.

Inversion of 2-D DFT



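The Fourier image can be re-transformed to the spatial domain:

$$ x(n,m) = \sum_{k=0}^{N-1} \sum_{l=0}^{N-1} F(k,l)\, \exp\left\{ j 2\pi \left( \frac{kn}{N} + \frac{lm}{N} \right) \right\} $$

(since the 1/N² factor is already applied in the forward transform, no further normalization is needed here)

Both amplitude and phase information are relevant for the reconstruction of the image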

Effect of phase on reconstruction



This image is reconstructed from the frequency domain using amplitude information from (b) and phase information from (a)

[Figure: images (a) and (b)]

2-D DFT: example 1



(a) image

(b) section A-B

(c) 1-D FFT of section A-B

(d) 2-D FFT of image

2-D DFT: example 2

(a) Chest radiograph (b) 2-D Fourier spectrum of (a)

Broad range of spatial frequencies; significant vertical and horizontal features, due to ribs and vertebral column

2-D DFT: example 3

The DFTs tend to have bright lines perpendicular to lines in the original letter.

If the letter has circular segments, then so does the FT.

2-D DFT: example 4

The concentric ring structure in the DFT of the white pellets image is due to each individual pellet. If we took the DFT of just one pellet, we would still get this pattern. The fact that there are many pellets, and the information about exactly where each one is, are contained mostly in the phase

The coffee beans have less symmetry and are more variably colored, so they do not show the same ring structure.

You may be able to detect a faint "halo" in the coffee DFT. What do you think this is from?

2-D DFT: example 5

The girl looks very similar to the ape… except for the hat…

Effect of edge between hat and hair

2-D DFT: example 6

The first image is all black except for a single pixel wide stripe from the top left to the bottom right

The second image is totally random

General transform coding scheme



[Block diagram: pixel values X → "Reversible" transform → Y → Quantization (with bit allocation) → Entropy coding]

Why do we need to introduce a transform domain?

The objective is to represent the original data X in a new domain Y, more suitable for quantization and coding

General transform coding scheme

Quantization (lossy coding only) depends on
- desired bit rate
- statistics of the various transformed coefficients
- distortion of the reconstructed signal

Entropy coding
- Any binary encoding technique (Run length, Huffman, Arithmetic …)

Transform Coding



Transforms are able to decorrelate data

The coefficients in the transformed domain are more suitable for the subsequent quantization operation
- In the transformed domain few coefficients concentrate most of the signal energy
- Coefficients are decorrelated, therefore scalar quantization is nearly optimum

The Karhunen-Loeve Transform



Also called the Hotelling Transform

The KLT is a data dependent transform

Let X denote a random data vector of length N, m be its (vector) mean value and C be its N x N covariance matrix:

$$ C = E\left[ (X - m)(X - m)^T \right] $$

The matrix C is real and symmetric, and hence can be diagonalized using its eigenvectors

The eigenvectors e_i of C are given by

$$ C e_i = \lambda_i e_i $$

where λ_i are the corresponding eigenvalues

Let us consider a matrix A whose columns correspond to the eigenvectors of C, arranged in decreasing eigenvalue order

Let us consider the transformation

$$ Y = A^T (X - m) $$

Y is zero mean and has covariance matrix:

$$ C_y = E(Y Y^T) = E\left[ A^T (X - m)(X - m)^T A \right] = A^T C A = \Lambda $$

where Λ is the diagonal eigenvalue matrix

The elements in the transformed domain are uncorrelated

If only the top K coefficients are kept, corresponding to the K largest eigenvalues, the mean square error between the original vector X and its reconstruction from the truncated Y is theoretically minimum

The KLT is a bound for compression efficiency, but it is computationally very demanding

Discrete Cosine Transform



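The 1-D discrete cosine transform (DCT) is defined as

$$ C(u) = \alpha(u) \sum_{x=0}^{N-1} f(x) \cos\left[ \frac{(2x+1)\, u \pi}{2N} \right], \qquad u = 0, 1, \ldots, N-1 $$

$$ \alpha(0) = \sqrt{\frac{1}{N}}, \qquad \alpha(u) = \sqrt{\frac{2}{N}}, \quad u = 1, \ldots, N-1 $$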

Inverse DCT



Similarly, the Inverse DCT (IDCT) is defined as

$$ f(x) = \sum_{u=0}^{N-1} \alpha(u)\, C(u) \cos\left[ \frac{(2x+1)\, u \pi}{2N} \right], \qquad x = 0, 1, \ldots, N-1 $$

with α(u) defined as before
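The 1-D DCT and its inverse can be sketched directly from their definitions. The code below is a plain O(N²) illustration, assuming the usual orthonormal scaling α(0) = sqrt(1/N) and α(u) = sqrt(2/N) for u > 0; the function names and the sample signal are illustrative.

```python
import math

def alpha(u, n):
    """Normalization factor of the orthonormal DCT."""
    return math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)

def dct_1d(f):
    """C(u) = alpha(u) * sum_x f(x) * cos((2x + 1) u pi / (2N))."""
    n = len(f)
    return [alpha(u, n) * sum(f[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              for x in range(n))
            for u in range(n)]

def idct_1d(c):
    """f(x) = sum_u alpha(u) * C(u) * cos((2x + 1) u pi / (2N))."""
    n = len(c)
    return [sum(alpha(u, n) * c[u] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                for u in range(n))
            for x in range(n)]

signal = [52, 55, 61, 66, 70, 61, 64, 73]   # hypothetical row of pixel values
coeffs = dct_1d(signal)
print([round(c, 2) for c in coeffs])             # energy concentrates in the first coefficients
print([round(v, 6) for v in idct_1d(coeffs)])    # reconstruction of the input, up to rounding

# A constant signal only has a DC coefficient.
print([round(c, 6) for c in dct_1d([5] * 8)])
```

The 2-D transform of a block can be obtained by applying `dct_1d` to every row and then to every column of the result.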

2-D DCT



The two-dimensional DCT is obtained by applying the 1-D transform to the rows and columns independently

The corresponding transform is

$$ C(u,v) = \alpha(u)\, \alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y) \cos\left[ \frac{(2x+1)\, u \pi}{2N} \right] \cos\left[ \frac{(2y+1)\, v \pi}{2N} \right], \qquad u, v = 0, 1, \ldots, N-1 $$

Inverse 2-D DCT



Analogously, the inverse 2-D transform is

$$ f(x,y) = \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} \alpha(u)\, \alpha(v)\, C(u,v) \cos\left[ \frac{(2x+1)\, u \pi}{2N} \right] \cos\left[ \frac{(2y+1)\, v \pi}{2N} \right], \qquad x, y = 0, 1, \ldots, N-1 $$

DCT basis functions



Basis functions of 8x8 DCT

When it is applied to an 8x8 image, it yields an 8x8 matrix of weighted values corresponding to how much of each basis function is present in the image

An 8x8 image that just contains one shade of gray will yield only a weighted value for the upper left hand DCT basis function (which has no frequencies in the x or y direction).

2-D DCT

Transform Coding: DCT

For N → ∞, the DCT tends to diagonalize the covariance matrix (i.e. it approaches the KLT)

The input data stream must be divided into blocks before applying the transform

The correlation across the block boundaries is not removed

Example of 2-D DCT



[Figure: image and its DCT]

Test image: Lenna

Interpretation of DCT basis functions



The top-left basis function represents zero spatial frequency (DC coefficient)

Along the top row the basis functions have increasing horizontal spatial frequency content.

Down the left column the functions have increasing vertical spatial frequency content.

DFT vs. DCT periodicity

[Figure: DFT periodicity (period n, with a discontinuity at the boundary) versus DCT periodicity (period 2n)]

Why DCT not FFT?

DCT can approximate lines well with fewer coefficients

Blocking artifacts less pronounced

Better approximation to the KLT

Used in the JPEG standard

DFT (example)

[Figure: DFT (25% samples retained); Absolute Error (MSE = 5.1345)]

Quantization



Goal of quantization: to represent a real number in (-∞, +∞) as an integer number, i.e. an element of a discrete and finite set of 2^N possible values (N bit quantizer).

Bit rate: B = N f_s (with f_s the sampling frequency)
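A uniform N-bit quantizer can be sketched as follows. The design choices here are assumptions made for illustration: the input range [x_min, x_max) is split into 2^N equal intervals, values outside the range are clipped to the first or last interval, and reconstruction uses the interval midpoint.

```python
def uniform_quantize(x, n_bits, x_min, x_max):
    """Map a real x to an integer index in [0, 2**n_bits - 1]."""
    levels = 2 ** n_bits
    step = (x_max - x_min) / levels
    index = int((x - x_min) // step)
    # Values outside [x_min, x_max) are clipped to the first or last interval.
    return max(0, min(levels - 1, index))

def dequantize(index, n_bits, x_min, x_max):
    """Reconstruct at the midpoint of the selected quantization interval."""
    step = (x_max - x_min) / 2 ** n_bits
    return x_min + (index + 0.5) * step

def bit_rate(n_bits, f_s):
    """B = N * f_s, in bit/s when f_s is in samples/s."""
    return n_bits * f_s

q = uniform_quantize(0.3, 3, -1.0, 1.0)
print(q, dequantize(q, 3, -1.0, 1.0))     # index and reconstructed value
print(uniform_quantize(5.0, 3, -1.0, 1.0))  # out-of-range input is clipped
print(bit_rate(8, 8000))                    # 8 bit/sample at 8000 samples/s
```

Inside the range the reconstruction error never exceeds half a quantization step; outside it, the error grows with the distance from the range.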

Uniform Quantization



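Errors: granularity and overload

[Figure: original signal and quantized signal; decision thresholds x_i, x_{i+1}, reconstruction levels y_i, y_{i+1}, step size Δ]

Uniform Quantization

[Figure: quantized signal versus time t]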

Quantization techniques



Uniform quantization is (almost) optimal when the input signal is memoryless

Quantization techniques:
- Scalar quantization
- Non-uniform quantization
- Robust quantization
- Pdf-optimized quantization (Lloyd-Max)
- Entropy-constrained quantization
- Vector quantization

Alternative to transforms: linear prediction



Linear prediction:
- estimate the value of the current pixel x[n] as the linear combination of past pixels: x*[n] = a1 x[n-1] + a2 x[n-2] + …
- instead of x[n], encode the prediction error e[n] = x[n] - x*[n]
- the decoder recovers x[n] = e[n] + x*[n]
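The three steps of linear prediction can be sketched as a lossless encoder/decoder pair. This is a minimal illustration without quantization; the function names, the choice of treating missing past samples as zero, and the sample row of pixels are assumptions.

```python
def predict(history, coeffs):
    """x*[n] = a1*x[n-1] + a2*x[n-2] + ...; missing past samples count as 0."""
    return sum(a * history[-k] for k, a in enumerate(coeffs, start=1)
               if k <= len(history))

def lp_encode(samples, coeffs):
    """Replace every sample by its prediction error e[n] = x[n] - x*[n]."""
    errors, history = [], []
    for x in samples:
        errors.append(x - predict(history, coeffs))
        history.append(x)
    return errors

def lp_decode(errors, coeffs):
    """Rebuild x[n] = e[n] + x*[n], predicting from already decoded samples."""
    history = []
    for e in errors:
        history.append(e + predict(history, coeffs))
    return history

row = [100, 102, 104, 103, 103, 101]   # hypothetical, slowly varying pixels
errors = lp_encode(row, [1])           # first order predictor: x*[n] = x[n-1]
print(errors)                          # small values after the first sample
print(lp_decode(errors, [1]) == row)
```

Apart from the first sample, the prediction errors cluster around zero, so they can be represented with fewer bits than the original pixels.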

Linear prediction (DPCM)



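e[n] can be quantized more efficiently than x[n]

[Block diagram. Encoder: the prediction P(x[n]) is subtracted from x[n] to give e[n]. Decoder: the prediction P(x[n]) is added to e[n] to recover x[n].]

Example of third order LP

[Figure: neighboring pixels A, B, C and current pixel X]

P(x) = a×A + b×B + c×C

E = X - P(x)

Practical DPCM scheme

[Block diagram. Encoder: the prediction error e[n] is quantized by Q, giving e[n] + q[n]; the predictor H(z) operates on the reconstructed signal xs[n], obtained through Q^-1. Decoder: Q^-1 followed by the same predictor H(z) gives xs[n].]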



The JPEG coding standard

International standards

Organizations that define standards:
- ISO (JTC 1 SC 29 WG 01/11)
  - JPEG, MPEG, JPEG 2000
- ITU
  - H.261, H.263, H.264

Why standards?
- Interoperability

International standards (cont’d)

Who defines standards:
- Companies
- Academia

Advantages of using a standard
- provides interoperability

Disadvantages:
- technology in the standards is some years old

Carrying out the technical work

A few weekly meetings per year

A few intermediate meetings
- Call for proposals
- Working draft
- Final committee draft
- Final Draft International Standard
- Final Publication Draft

The copyright issue

Some technologies used in JPEG are covered by patents:
- IBM, AT&T, and Mitsubishi for arithmetic coder
- Forgent for Huffman tables (?)

Goal:
- baseline algorithm, royalty-free
- advanced algorithm, with license fees

Participants in JPEG are required to agree to provide royalty-free licenses for technology that they bring into the standard, for the baseline version of the algorithm.

International standards (cont’d)

What is standardized?

[Block diagram: Source data → Multimedia encoder → Syntax → Multimedia decoder, with the label "Defined by standard"]

Roadmap to international image coding standards

JPEG
- baseline lossy compression
- extension (hierarchical, progressive)
- lossless compression

JPEG-LS
- lossless compression
- near-lossless compression

JPEG 2000
- lossy compression
- lossless compression
- extensions

JPEG

This standardized image compression scheme is designed to work on full-color or gray-scale digital images

JPEG defines a baseline algorithm, plus extensions for Progressive and Hierarchical Coding

It foresees a separate lossless mode (Huffman or Arithmetic coding)

JPEG block scheme

Color space decomposition
- RGB
- YUV (subsampled)

Application of the algorithm to each component

JPEG

The coding steps:
- transformation of the image into a suitable color space
- application of the DCT to 8x8 blocks
- quantization
- zig-zag reading
- entropy (lossless) coding

JPEG compression

- A weighted scalar quantization is applied to each transformed coefficient in every block
- Quantized DC values are coded by DPCM from block to block
- Zig-zag reordering
- Encoding of zero-runs
- Entropy coding

JPEG Quantization Matrices

Divide each entry of the (transformed) image matrix by the corresponding entry in the quantization matrix:

Fq(u,v) = round[F(u,v)/Q(u,v)]

Quality factor to control quality

The quantization tables are contained in the JPEG file, with image information

Flexibility with quantization tables (?)
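The rule Fq(u,v) = round[F(u,v)/Q(u,v)] can be sketched on a small block. The coefficient values and the 2x2 table excerpt below are made-up numbers for illustration, and the decoder-side multiplication is the standard inverse step. Note that Python's `round` rounds exact halves to the nearest even integer, which may differ from the rounding used by a particular JPEG implementation.

```python
def quantize_block(coeffs, q_table):
    """Fq(u,v) = round(F(u,v) / Q(u,v)) for every coefficient of the block."""
    return [[int(round(f / q)) for f, q in zip(f_row, q_row)]
            for f_row, q_row in zip(coeffs, q_table)]

def dequantize_block(levels, q_table):
    """Decoder side: reconstructed coefficient = Fq(u,v) * Q(u,v)."""
    return [[level * q for level, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, q_table)]

# Hypothetical 2x2 excerpt of a DCT block and of a quantization table.
dct_excerpt = [[-415.4, 30.2],
               [4.5, -1.9]]
q_excerpt = [[16, 11],
             [12, 12]]

levels = quantize_block(dct_excerpt, q_excerpt)
print(levels)                               # small integers, many of them zero
print(dequantize_block(levels, q_excerpt))  # approximation of the original coefficients
```

Larger table entries give coarser quantization: more coefficients become zero, which lowers the bit rate at the cost of a larger reconstruction error.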



[Figures: worked example on one block, with panels "Original Block", "DCT (rows)", "DCT (columns)", "Quantized DCT", "Reconstructed block", "Abs error vs. original"]

JPEG entropy coding

The zig-zag scanned coefficients are encoded as a sequence of "couples" of symbols:
- Runlength: number of zero samples preceding the current sample (0-15 or EOB)
- Size: number of quantization bits for the current sample
- Amplitude: quantized sample value

Symbol 1: (RUNLENGTH, SIZE)

Symbol 2: (AMPLITUDE)
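The zig-zag scan and the (RUNLENGTH, SIZE) / AMPLITUDE symbols can be sketched as follows. This is an illustration only: it produces the symbol pairs but not the Huffman codewords, it represents the end-of-block symbol as ((0, 0), None) and a run of 16 zeros as ((15, 0), None), and the function names and the sample coefficients are assumptions.

```python
def zigzag_order(n=8):
    """(row, col) positions of an n x n block in zig-zag scan order."""
    def key(pos):
        r, c = pos
        d = r + c                      # index of the anti-diagonal
        return (d, r if d % 2 else c)  # scan direction alternates per diagonal
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)

def run_size_symbols(ac_coeffs):
    """Turn zig-zag ordered AC coefficients into ((run, size), amplitude) pairs."""
    symbols, run = [], 0
    last_nonzero = max((i for i, v in enumerate(ac_coeffs) if v != 0), default=-1)
    for v in ac_coeffs[:last_nonzero + 1]:
        if v == 0:
            run += 1
            if run == 16:                      # a run of 16 zeros gets its own symbol
                symbols.append(((15, 0), None))
                run = 0
        else:
            size = abs(v).bit_length()         # bits needed for the amplitude
            symbols.append(((run, size), v))
            run = 0
    if last_nonzero < len(ac_coeffs) - 1:
        symbols.append(((0, 0), None))         # EOB: all remaining coefficients are zero
    return symbols

print(zigzag_order()[:10])

ac = [3, 0, 0, -2, 1] + [0] * 58               # hypothetical 63 AC coefficients
print(run_size_symbols(ac))
```

Because most high-frequency coefficients are zero after quantization, the zig-zag order groups them at the end of the scan, where a single EOB symbol replaces them.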

Codestream syntax

The codestream consists of
- Markers and marker segments (to carry auxiliary information)
- Data

Marker structure:
- Code
- Length
- Marker data

JPEG syntax

- FFD8 (Start Of Image)
- FFE0 (JFIF marker)
- FFDB (Define Quantization Table)
- FFC4 (Define Huffman Table)
- FFC0 (Start Of Frame)
- FFDA (Start of Scan)
- FFD0-FFD7 (Restart Markers)
- FFFE (Comment)
- FFD9 (End Of Image)

JPEG – lossless mode



JPEG performance

Quality max - Size: 61k; Quality med - Size: 14k; Quality low - Size: 4k

JPEG performance

Original image: encoded @ 24 bits per pixel

Quality 95/100: 3.926 bits per pixel (bpp), CR = 24/3.926 = 6.1

Quality 50/100: CR = 22.5

Quality 25/100: CR = 34.0

Quality 5/100 (min. useful): 0.291 bits per pixel (bpp), CR = 82.5

JPEG

Disadvantages:
- blocking effect for non smooth images
- image correlation is not removed across block boundaries
- only possible dynamic range is 8 or 12 bpp
- non unified version for lossless and lossy compression
- Fourier-like basis functions
- Poor performance at low bit rate

The use of low bit rate coding algorithms becomes necessary (JPEG 2000)

Video coding



Analog video

Progressive and interlaced scans

Color TV broadcasting and receiving

Why not use RGB directly?

Digitizing a raster video

RGB ↔ YCbCr

Chrominance subsampling formats

Digital video formats

2D motion estimation



Notation

Motion representation

Block based motion estimation

Block matching algorithm

Exhaustive block matching algorithm

Complexity of integer-pel EBMA

Sample Matlab script for integer-pel EBMA

Fractional accuracy EBMA

Half-pel accuracy EBMA

Bilinear interpolation

Pros and cons with EBMA

Fast algorithms for BMA

Video coding using motion compensation

Characteristics of typical videos

Key ideas in video compression: hybrid video coding

Different coding modes

Temporal prediction

Block matching algorithm for motion estimation

Multiple reference frame temporal prediction

Spatial prediction

Motion compensated video

Macroblocks in 4:2:0 color format

MB coding in I-mode (assuming no intra prediction)

MB coding in P-mode

MB coding in B-mode

Coding mode selection

Rate control

Loop filtering

Video coding standards



Scalable coding

Bitstream scalability

Illustration of scalable coding

Quality (SNR) scalability by multistage quantization

Spatial/temporal scalability through down/upsampling

Scalability in MPEG-2

Fine granularity scalability (FGS) in MPEG-4

Drift problem in scalable codecs

How to solve the drift problem?

Trade-off between coding efficiency and drift

Video coding standards and applications

H.261 video coding standard

DCT coefficient quantization

Motion estimation/compensation

Variable length coding

Parameter selection and rate control

H.263 video coding standard

Improvements over H.261

PB-picture mode

Performance of H.261 and H.263

MPEG-1 overview

MPEG-1 vs. H.261

Group of pictures in MPEG

MPEG-2 overview

MPEG-2 vs. MPEG-1

DCT modes

MPEG-2 scalability

SNR-scalable encoder

Spatially-scalable encoder

Temporally scalable encoder

Profiles and levels in MPEG-2

MPEG-4 overview

Object-based coding

Object description hierarchy in MPEG-4

Example of scene composition

Coding of texture with arbitrary shape

Shape-adaptive DCT

MPEG-4 shape coding

Mesh animation

Body and face animation

MPEG-4 video coding efficiency tools

H.264/AVC



Introduction

Started as ITU recommendation

Now joint ISO and ITU effort (JVT)

ITU H.264/AVC, MPEG-4 Part 10

Targets bit rate reduction by a factor of 2, at the same quality, with respect to other standards
- at the expense of much higher complexity

Comparison of video coders (QCIF, 30 fps, 100 kbit/s)

[Figure: decoded frames. Original; H.263 baseline (33 dB); H.263+ (33.5 dB); MPEG-4 core (33.5 dB); H.264 (42 dB)]

H.264/AVC applications



224

Relationship to other standards



225

H.264/AVC structure



226

H.264/AVC profiles

Baseline: core compression capabilities, plus error resilience. Suitable for videoconferencing, mobile video, …

Main: high compression and quality (e.g., broadcasting)

Extended: added features for efficient streaming

H.264 video coding layer



228

Partitioning of a frame



229

Flexible Macroblock Ordering (FMO)



230

Common elements with other standards



231

H.264 motion compensation accuracy



232

Macroblock partitioning



233

Multiple reference frames



234

Macroblock type

Each MB can be encoded in one of the following modes:

- INTRA
  - Intra 4x4: 9 prediction modes for Y
  - Intra 16x16: 4 prediction modes for Y

- INTER
  - prediction with square blocks (16x16, 8x8, 4x4)
  - prediction with rectangular blocks (8x16, 16x8, 4x8, 8x4)

RATE DISTORTION OPTIMIZATION

Intra prediction

MBs to be coded in Intra mode can be predicted from the already coded MBs in the same slice

(Intra 16x16)



236

DCT and inverse transform



237

H.264 4x4 transform



238

4x4 DCT

4x4 DCT



239

4x4 DCT

Deblocking filter

In-loop filter improves visual quality and PSNR. The filter in H.264/AVC is very elaborate:
- slice level
- edge level (filtering strength depends on coding residuals)
- sample level (thresholds allow the filter to be turned off for given pixels)
- strong filter for very flat MBs

Deblocking filter



241



Deblocking filter: subjective results (Inter)



243

Entropy Coding



244

Entropy Coding

CAVLC (Context-adaptive Variable Length Coding)
- uses exp-Golomb codes for all symbols except transform coefficients
- uses Huffman-like tables for transform coefficients

CABAC (Context-based Adaptive Binary Arithmetic Coding)
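The exp-Golomb codes used by CAVLC for non-coefficient symbols can be illustrated with a minimal sketch. It covers only the unsigned mapping (a non-negative integer to a codeword); the function names are illustrative and not taken from any reference software.

```python
def exp_golomb_encode(code_num: int) -> str:
    """Return the unsigned exp-Golomb codeword of a non-negative integer as a bit string."""
    if code_num < 0:
        raise ValueError("code_num must be non-negative")
    info = bin(code_num + 1)[2:]          # binary representation of code_num + 1
    return "0" * (len(info) - 1) + info   # as many leading zeros as info has bits, minus one


def exp_golomb_decode(bits: str) -> int:
    """Decode the single unsigned exp-Golomb codeword at the start of a bit string."""
    zeros = 0
    while bits[zeros] == "0":             # count the leading zeros
        zeros += 1
    return int(bits[zeros:2 * zeros + 1], 2) - 1


codewords = [exp_golomb_encode(k) for k in range(4)]
```

Small values get short codewords, which suits symbols whose small values are the most likely ones.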

CABAC



246

S-pictures



247

Comparison of H.264 to MPEG-4



248

Rate Allocation

How does one select the optimal coding mode for each MB? Lagrangian optimization.

For each MB and for each coding mode a cost function is computed. The mode minimizing the cost function is used for that MB.

This maximizes PSNR at the given bit-rate, at the expense of very high complexity

Lagrangian R-D optimization

Cost function:

$J = D + \lambda R$

where:

D = distortion using the current options (using SAD)

R = bit-rate using the current options

λ = Lagrange parameter (used to set the bit-rate)

Lagrangian R-D optimization

Given QP (i.e., the bit-rate), for every possible set of coding parameters (coded block pattern, intra and inter coding modes, reference frame, motion vectors), compute
- the distortion D associated with that set of parameters
- the rate R associated with that set of parameters
- the cost J = D + λR associated with that set of parameters

Select the set of parameters that minimizes J
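The selection rule above can be sketched in a few lines. The candidate modes and their D and R values below are invented for illustration; a real encoder would measure them by actually coding the macroblock in each mode.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str          # coding option under test (hypothetical labels)
    distortion: float  # D, e.g. SAD between original and reconstruction
    rate_bits: float   # R, bits needed to code the MB with this option


def rd_cost(candidate: Candidate, lam: float) -> float:
    """Lagrangian cost J = D + lambda * R."""
    return candidate.distortion + lam * candidate.rate_bits


def select_mode(candidates, lam: float) -> Candidate:
    """Return the candidate with the smallest Lagrangian cost."""
    return min(candidates, key=lambda c: rd_cost(c, lam))


# Illustrative numbers only.
candidates = [
    Candidate("INTRA_4x4", distortion=120.0, rate_bits=300.0),
    Candidate("INTER_16x16", distortion=200.0, rate_bits=60.0),
    Candidate("INTER_8x8", distortion=150.0, rate_bits=140.0),
]
best_low_lambda = select_mode(candidates, lam=0.1)
best_high_lambda = select_mode(candidates, lam=2.0)
```

A small λ favours low distortion (the expensive intra mode wins here), while a large λ penalises rate and pushes the choice towards the cheapest mode.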

Performance: H.264 vs. MPEG-4



252

Network Adaptation Layer



253

Data partitioning

The symbols contained in a slice are partitioned into different types:

0  TYPE_HEADER    Picture or Slice Headers
1  TYPE_MBHEADER  Macroblock header information
2  TYPE_MVD       Motion Vector Data
3  TYPE_CBP       Coded Block Pattern
4  TYPE_2x2DC     2x2 DC Coefficients
5  TYPE_COEFF_Y   Luma AC Coefficients
6  TYPE_COEFF_C   Chroma AC Coefficients
7  TYPE_EOS       End-of-Stream Symbol

NAL for IP networks

1 slice → 2 (or 3) packets

First packet (high priority):
- TYPE_HEADER
- TYPE_MBHEADER
- TYPE_MVD
- TYPE_EOS

Second packet (low priority):
- TYPE_CBP
- TYPE_2x2DC
- TYPE_COEFF_Y
- TYPE_COEFF_C
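As a rough sketch of this two-packet split, the code below sorts the symbols of one slice into a high-priority and a low-priority packet by partition type. The symbol representation (a `(type, payload)` pair) and the function name are assumptions made for the example; the actual NAL packet syntax is not modelled.

```python
from enum import IntEnum


class SymbolType(IntEnum):
    TYPE_HEADER = 0
    TYPE_MBHEADER = 1
    TYPE_MVD = 2
    TYPE_CBP = 3
    TYPE_2x2DC = 4
    TYPE_COEFF_Y = 5
    TYPE_COEFF_C = 6
    TYPE_EOS = 7


# Partitions carried in the first (high-priority) packet.
HIGH_PRIORITY = frozenset({
    SymbolType.TYPE_HEADER,
    SymbolType.TYPE_MBHEADER,
    SymbolType.TYPE_MVD,
    SymbolType.TYPE_EOS,
})


def packetize_slice(symbols):
    """Split (type, payload) pairs into (first_packet, second_packet), keeping order."""
    first = [s for s in symbols if s[0] in HIGH_PRIORITY]
    second = [s for s in symbols if s[0] not in HIGH_PRIORITY]
    return first, second


slice_symbols = [
    (SymbolType.TYPE_HEADER, b"hdr"),
    (SymbolType.TYPE_MBHEADER, b"mb0"),
    (SymbolType.TYPE_MVD, b"mv0"),
    (SymbolType.TYPE_CBP, b"cbp0"),
    (SymbolType.TYPE_COEFF_Y, b"y0"),
    (SymbolType.TYPE_EOS, b""),
]
first_packet, second_packet = packetize_slice(slice_symbols)
```

If the second packet is lost, the decoder still has headers and motion vectors, so it can at least produce a motion-compensated approximation of the slice.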

Error concealment

It is not normative



256

Works on single MBs

INTRA Concealment



257

$\text{Pixel value} = \dfrac{15\,(16-3) + 21\,(16-12) + 32\,(16-7) + 7\,(16-8)}{13+4+9+8} \approx 18$
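The worked example above is a distance-weighted average of the four nearest boundary pixels. A minimal sketch, assuming each neighbour is given as a `(value, distance)` pair and the weight is the block size minus the distance, as in the numbers on the slide:

```python
def conceal_pixel(neighbors, block_size=16):
    """Interpolate a lost pixel from boundary pixels given as (value, distance) pairs."""
    weighted_sum = sum(value * (block_size - dist) for value, dist in neighbors)
    total_weight = sum(block_size - dist for _, dist in neighbors)
    return round(weighted_sum / total_weight)


# Values and distances from the example: 623 / 34 is about 18.3.
pixel = conceal_pixel([(15, 3), (21, 12), (32, 7), (7, 8)])
```

Closer boundary pixels receive larger weights, so the concealed macroblock blends smoothly into its correctly received neighbours.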

INTER Concealment



258

Error control



Steps involved in a communication session



260

End-to-end delay



261

Challenges for video communications



262

Conventional source coding is not good enough



263

Spatial/temporal error propagation



264

Drift



265

Effect of transmission errors



266

QoS requirements of typical video applications



267

Interactive two-way visual communications



268

One-way video streaming



269

Major types of communication networks



270

Characteristics of major video communications applications



271

Error control techniques for video



272

Transport level error control



273

Channel coding basics



274

FEC for video transmission



275

Delay-constrained ARQ



276

Error resilient encoding



277

Reversible variable length coding



278

Coding mode selection based on network conditions



279

Layered coding with unequal error protection



280

Multiple description coding



281

Generic two description coder



282

Challenges for multiple description video coding



283

Video redundancy coding in H.263+



284

Decoder error concealment



285

Error concealment techniques



286

Sample error concealment results



287

Encoder-decoder interactive error control



288

Video transport using path diversity



289

Why use multiple paths



290

Video streaming



A brief history of streaming media



292

Internet media streaming



293

What is streaming video?



294

Outline



295

Time-varying available bandwidth



296

Time-varying delay



297

Effect of packet loss



298

Unicast vs. multicast



299

Heterogeneity for multicast



300

Architecture for video streaming



301

Video compression



302

Application of layered video



303

Application-layer QoS control



304

Source-based rate control



305

Receiver-based rate control



306

Continuous Media Distribution Services



307

Continuous media distribution services

The aim is to provide QoS and achieve efficiency for streaming video/audio over the best-effort Internet.

Continuous media distribution services include:

1. network filtering
2. application-level multicast
3. content replication



308

1) Network Filtering

Network filtering aims to maximize video quality during network congestion.

The filter receives the client’s requests and adapts the stream sent by the server accordingly.

[Figure: filters on the data plane and on the control plane]



309

1) Network Filtering (cont’d)

Typically, frame-dropping filters are used as network filters.

The receiver can change the bandwidth of the media stream:
- by sending requests to the filter to increase or decrease the frame dropping rate
- the receiver continuously measures the packet loss ratio
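One way to picture the receiver's side of this control loop is the sketch below. The thresholds and step size are arbitrary illustrative values, not parameters taken from any published filter design.

```python
def next_drop_rate(current, loss_ratio, high=0.05, low=0.01, step=0.1):
    """Return the frame dropping rate the receiver asks the filter to apply next.

    current and the result are fractions of frames dropped, in [0, 1].
    high, low and step are assumed tuning values.
    """
    if loss_ratio > high:        # congestion: ask the filter to drop more frames
        target = current + step
    elif loss_ratio < low:       # spare capacity: ask the filter to drop fewer frames
        target = current - step
    else:                        # loss is acceptable: keep the current rate
        target = current
    return round(min(1.0, max(0.0, target)), 3)


rate_after_congestion = next_drop_rate(0.2, loss_ratio=0.10)
rate_after_recovery = next_drop_rate(0.2, loss_ratio=0.0)
```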



310

2) Application-Level Multicast

Application-level multicast aims at building a multicast service on top of the Internet.

Media multicast networks can be built from an interconnection of content-distribution networks.

Media multicast networks could support “peering relationships” at the application level or the streaming-media/content layer.

3) Content Replication

1) Mirror
  - Mirroring places copies of the original multimedia files on other machines scattered around the Internet.
  - In this way, clients can retrieve multimedia data from the nearest duplicate server.
  - Disadvantages: expensive, ad hoc, and slow.

2) Cache
  - Caching makes local copies of the contents that clients retrieve.
  - Based on the belief that different clients will load many of the same contents.

Receiver-driven layered multicast



313

Streaming Servers



314

Streaming server



315

Streaming Servers

Streaming servers are required to process multimedia data under timing constraints.

A streaming server typically consists of the following three subsystems:
- Communicator
- Operating system
- Storage system



316

Real-Time Operating System

1) Process Management
  - The operating system must use real-time scheduling techniques.
  - There are two basic algorithms:
    - Earliest deadline first (EDF)
      - each task is assigned a deadline, and
      - the tasks are processed in order of increasing deadline.
    - Rate-monotonic scheduling
      - each task is assigned a static priority according to its request rate
      - the higher the rate, the higher the priority
      - the tasks are processed in order of priority.
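The two orderings can be compared on a toy task set. This sketch only computes the order in which ready tasks would be served; it does not simulate preemption or schedulability. The task names and timings are invented.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    deadline_ms: float  # absolute deadline of the current request
    period_ms: float    # request period; request rate = 1 / period


def edf_order(tasks):
    """Earliest deadline first: serve tasks by increasing deadline."""
    return [t.name for t in sorted(tasks, key=lambda t: t.deadline_ms)]


def rate_monotonic_order(tasks):
    """Rate-monotonic: static priority, shorter period (higher rate) first."""
    return [t.name for t in sorted(tasks, key=lambda t: t.period_ms)]


# Hypothetical streams handled by a media server.
tasks = [
    Task("audio", deadline_ms=25.0, period_ms=20.0),
    Task("video", deadline_ms=15.0, period_ms=40.0),
    Task("subtitles", deadline_ms=90.0, period_ms=500.0),
]
edf = edf_order(tasks)
rm = rate_monotonic_order(tasks)
```

EDF reacts to the deadlines of the moment, whereas the rate-monotonic order is fixed once the periods are known.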

Real-Time Operating System (cont’d)

2) Resource Management
  - Resources in a multimedia server include CPUs, memories, and storage devices.
  - Resource management involves admission control and resource allocation.
    - deterministic & statistical



318

Real-Time Operating System (cont’d)

3) File Management
  - The file system provides access and control functions for file storage and retrieval.
  - There are two basic approaches:
    - Keep each file on a single disk (a file is not scattered across several disks)
    - Organize files on distributed storage such as disk arrays



319

Storage System

1) Increase throughput with data striping
  - Under data striping schemes, a multimedia file is scattered across multiple disks and the disk array can be accessed in parallel.
  - An important issue is to balance the load of the most heavily loaded disks, to avoid overload situations while keeping latency small.



320

Storage System (cont’d)

2) Increase capacity with tertiary and hierarchical storage
  - To keep the storage cost down, tertiary storage must be added.
    - tape, CD-ROM
  - Under the hierarchical storage architecture, only a fraction of the total storage is kept on disks while the major remaining portion is kept on a tertiary tape system.



321

Hierarchical Storage



322

Storage System (cont’d)

3) Fault tolerance
  - Goal: ensure uninterrupted service even in the presence of disk failures.
  - There are two techniques:
    - Error-correcting (parity-encoding)
      - Adds a small storage overhead
    - Mirroring
      - Requires at least twice as much storage volume
  - Tradeoff between reliability and complexity.
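The parity-encoding option can be illustrated with a single XOR parity block, in the style of RAID-like arrays. This is a minimal sketch assuming equal-sized blocks and at most one failed disk; the function names are illustrative.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_block(data_blocks):
    """Parity stored on the extra disk: XOR of all data blocks."""
    return xor_blocks(data_blocks)


def recover_block(surviving_blocks, parity):
    """Rebuild the single missing data block from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


disks = [b"abcd", b"efgh", b"ijkl"]          # one stripe over three data disks
parity = parity_block(disks)
rebuilt = recover_block([disks[0], disks[2]], parity)  # disk 1 has failed
```

Here one parity block protects three data blocks (about 33% overhead), whereas mirroring the same stripe would double the storage.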



323

Dynamic stream switching: SureStreams



324

Dynamic stream switching: SP-frames



325

SP-frames (cont’d)



326

SP-frames: performance gain



327

Media Synchronization



328

Media Synchronization

Media synchronization refers to maintaining the temporal relationships within one data stream and between various media streams.

Each component on the transport path affects the data in a different way.
- They all inevitably introduce delays and delay variations.



329

Media Synchronization (cont’d)

There are three levels of synchronization:
- Intra-stream synchronization
  - the media layer
- Inter-stream synchronization
  - the stream layer
- Inter-object synchronization
  - the object layer



330


The method most widely used to specify temporal relations is time-stamping:
- At the source, a stream is time-stamped to keep temporal information
- At the destination, the application presents the streams according to their temporal relations.



331


Preventive
- Designed to minimize synchronization errors as data is transported from the server to the user.
- Aim to minimize latency and jitter

Corrective
- Compensate when synchronization errors occur.
- Stream Synchronization Protocol (SSP)



332

Protocols for Streaming Video



333

Protocol stack for Internet streaming media



334

Protocols for Streaming Video

Network-layer protocol
- network addressing
- IP

Transport protocol
- end-to-end network transport functions
- UDP, TCP, real-time transport protocol (RTP), and real-time control protocol (RTCP)

Session control protocol
- defines the messages and procedures to control the delivery of the multimedia data during an established session
- RTSP, and the session initiation protocol (SIP)



335

Protocol Stacks for Media Streaming



336

Transport Protocols

UDP and TCP protocols support such functions as multiplexing, error control, congestion control, or flow control.

Since TCP retransmission introduces unacceptable delays, UDP is typically employed for streaming applications.



337

Transport Protocols (cont’d)

RTP is a data transfer protocol while RTCP is a control protocol.

In an RTP session, participants periodically send RTCP packets to convey feedback on the quality of data delivery and membership information.



338


RTP provides the following functions:
- Time-stamping
- Sequence numbering
- Payload type identification
- Source identification
  - SSRC (Synchronization SouRCe identifier)
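These four functions map directly onto fields of the 12-byte fixed RTP header. The sketch below packs and parses that fixed header only (version 2, no padding, no header extension, no CSRC entries); the helper names and the example values are illustrative.

```python
import struct

RTP_VERSION = 2


def pack_rtp_header(payload_type, seq, timestamp, ssrc, marker=False):
    """Build the 12-byte fixed RTP header (no padding, extension or CSRC list)."""
    byte0 = RTP_VERSION << 6                           # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)  # marker bit + payload type
    return struct.pack("!BBHII", byte0, byte1,
                       seq & 0xFFFF, timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def unpack_rtp_header(data):
    """Parse the fixed header fields from the first 12 bytes of a packet."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,    # payload type identification
        "sequence_number": seq,          # sequence numbering
        "timestamp": timestamp,          # time-stamping
        "ssrc": ssrc,                    # source identification
    }


header = pack_rtp_header(payload_type=96, seq=1, timestamp=90000, ssrc=0x1234, marker=True)
fields = unpack_rtp_header(header)
```

The receiver uses the sequence number to detect loss and reordering, and the timestamp to schedule playout and synchronize streams.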



339


Basically, RTCP provides the following services:
- QoS feedback
- Participant identification
  - RTCP SDES
- Control packets scaling
- Inter-media synchronization
- Minimal session control information



340

Session Control Protocols

Main functions of RTSP are:
- Supporting VCR-like control operations.
- Providing means for choosing delivery channels and delivery mechanisms.
- Establishing and controlling streams of continuous audio and video media.
  - Media retrieval
  - Adding media to an existing session



341

Session Control Protocols (cont’d)

Session Initiation Protocol

- SIP can also create and terminate sessions with one or more participants.
- SIP supports user mobility by proxying and redirecting requests to the user’s current location.



342

Peer-to-peer networking



Outline

Introduction and Overview

Popular P2P Applications

P2P Video-on-Demand

Conclusions and Future of P2P



344

P2P Introduction and Overview



P2P Introduction and Overview - Outline

Part I:

History, motivation and evolution

- History: Napster and beyond
- What is Peer-to-peer?
- Why Peer-to-peer?

Brief P2P technologies overview
- Unstructured p2p-overlays
- Structured p2p-overlays



346

History, motivation and evolution

P2P represented ~65% of Internet traffic at the end of 2006

1999: Napster, first widely used p2p-application

Napster, first widely used p2p-application

The application:

A p2p application for the distribution of mp3 files
- Each user can contribute its own content

How it works:

Central index server
- Maintains list of all active peers and their available content

Distributed storage and download
- Client nodes also act as file servers
- All downloaded content is shared

History, motivation and evolution - Napster (cont’d)

Initial join
- Peers connect to the Napster server
- Transmit current listing of shared files to the server

[Figure: peers join the central index server]

Content search
- A peer sends a song request to the Napster server
- The Napster server checks its song database and returns a list of matched peers

[Figure: 1) query from a peer to the central index server, 2) answer]

File retrieval
- The requesting peer contacts the peer having the file directly and downloads it

[Figure: 1) request, 2) download, exchanged directly between peers]
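The join and search steps of a Napster-style central index can be modelled with a small in-memory class. This is a toy sketch: the class, its methods, and the peer and file names are invented for illustration, and the direct peer-to-peer download itself is not modelled.

```python
class CentralIndex:
    """Toy central index server: knows which active peer shares which files."""

    def __init__(self):
        self._files_by_peer = {}

    def join(self, peer, shared_files):
        """Initial join: a peer registers and uploads its list of shared files."""
        self._files_by_peer[peer] = set(shared_files)

    def leave(self, peer):
        """Remove a peer that is no longer active."""
        self._files_by_peer.pop(peer, None)

    def search(self, title):
        """Content search: return the peers currently sharing the requested file."""
        return sorted(p for p, files in self._files_by_peer.items() if title in files)


index = CentralIndex()
index.join("peer_a", ["song1.mp3", "song2.mp3"])
index.join("peer_b", ["song2.mp3"])
matches = index.search("song2.mp3")   # the requester then contacts one of these peers directly
index.leave("peer_a")
matches_after_leave = index.search("song2.mp3")
```

Only the lookup is centralized; storage and transfer stay with the peers, which is also why the index server is a single point of failure.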

History, motivation and evolution - File Download

Napster was the first simple but successful P2P-application. Many others followed…

P2P File Download Protocols:
- 1999: Napster
- 2000: Gnutella, eDonkey
- 2001: Kazaa
- 2002: eMule, BitTorrent



352

Definition of Peer-to-peer (or P2P)

A peer-to-peer (or P2P) computer network is a network that relies primarily on the computing power and bandwidth of the participants in the network rather than concentrating it in a relatively small number of servers.

A pure peer-to-peer network does not have the notion of clients or servers, but only equal peer nodes that simultaneously function as both "clients" and "servers" to the other nodes on the network.

This model of network arrangement differs from the client-server model where communication is usually to and from a central server.

Taken from the wikipedia free encyclopedia - www.wikipedia.org

It is a broad definition with lots of applications

P2P-File download
- Napster, Gnutella, KaZaa, eDonkey, …

P2P-Communication
- VoIP, Skype, Messaging, …

P2P-Video-on-Demand

P2P-Computation
- seti@home

P2P-Streaming
- PPLive, ESM, …

P2P-Gaming

…



354

P2P is not restricted to file download!

History, motivation and evolution - Applications

P2P Protocols:
- 1999: Napster, End System Multicast (ESM)
- 2000: Gnutella, eDonkey
- 2001: Kazaa
- 2002: eMule, BitTorrent
- 2003: Skype
- 2004: PPLive
- Today: TVKoo, TVAnts, PPStream, SopCast…
- Next: Video-on-Demand, Gaming

Application type: File Download, Streaming, Telephony, Video-on-Demand, Gaming

Why is P2P so successful?

Scalable – It’s all about sharing resources
- No need to provision servers or bandwidth
- Each user brings its own resources
- E.g. resistant to flash crowds
  - flash crowd = a crowd of users all arriving at the same time

Resources could be:
- Files to share;
- Upload bandwidth;
- Disk storage; …



356

Why is P2P so successful? (cont’d)

Cheap - No infrastructure needed

Everybody can bring its own content (at no cost)
- Homemade content
- Ethnic content
- Illegal content
- But also legal content
- …

High availability – Content accessible most of the time

P2P-Overlay

Build graph at application layer, and forward packets at the application layer

It is a virtual graph
- Underlying physical graph is transparent to the user
- Edges are TCP connections or simply an entry holding a neighboring node’s IP address

The graph has to be continuously maintained (e.g. check if nodes are still alive)



358

P2P-Overlay (cont’d)

Underlay

Overlay

Source

Source



359

The P2P enabling technologies

Unstructured p2p-overlays
- Generally random overlay
- Used for content download, telephony, streaming

Structured p2p-overlays
- Distributed Hash Tables (DHTs)
- Used for node localization, content download, streaming



360

Unstructured p2p-overlays

Unstructured p2p-overlays do not really care how the overlay is constructed
- Peers are organized in a random graph topology
  - E.g., new node randomly chooses three existing nodes as neighbors
  - Flat or hierarchical
- Build your p2p-service based on this graph

Several proposals
- Gnutella
- KaZaA/FastTrack
- BitTorrent



361

Unstructured p2p-overlays (cont’d)

Unstructured p2p-overlays are just a framework; you can build many applications on top of them

Unstructured p2p-overlays pros & cons
- Pros
  - Very flexible: copes with node churn
  - Supports complex queries (unlike structured overlays)
- Cons
  - Content search is difficult: there is a tradeoff between generated traffic (overhead) and the horizon of the partial view

In the following we detail these applications
- Skype
- BitTorrent



362

One Example of usage of unstructured overlays

Typical problem in unstructured overlays: how to do content search and query?
- Flooding
- Limited scope: send only to a subset of your neighbors
- Time-To-Live: limit the number of hops per message

[Figure: example of flooding (similar to Gnutella). A search for “Britney Spears” is flooded through the overlay; the node that finds the entry notifies the requester and uploads the file]
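Flooding with a Time-To-Live can be sketched as a breadth-first traversal of the overlay graph with a hop budget. The overlay, the node names, and the message accounting below are simplifications made for the example (for instance, a node also "sends" the query back to the neighbor it came from); this is not the Gnutella wire protocol.

```python
from collections import deque


def flood_search(overlay, holders, start, ttl):
    """Flood a query from start; return (peers that hold the content, messages sent).

    overlay maps each node to its list of neighbors, holders is the set of
    nodes that have the requested content, ttl is the maximum number of hops.
    """
    visited = {start}
    queue = deque([(start, ttl)])
    hits, messages = [], 0
    while queue:
        node, remaining = queue.popleft()
        if node in holders and node != start:
            hits.append(node)                # "Found entry!"
        if remaining == 0:
            continue                         # TTL expired: do not forward further
        for neighbor in overlay[node]:
            messages += 1                    # one query message per overlay edge used
            if neighbor not in visited:      # duplicate queries are dropped
                visited.add(neighbor)
                queue.append((neighbor, remaining - 1))
    return hits, messages


# Hypothetical overlay: the content sits three hops away from A.
overlay = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C", "E"], "E": ["D"]}
hits_ttl2, msgs_ttl2 = flood_search(overlay, holders={"E"}, start="A", ttl=2)
hits_ttl3, msgs_ttl3 = flood_search(overlay, holders={"E"}, start="A", ttl=3)
```

The example shows the tradeoff between overhead and search horizon: a TTL of 2 misses the content, while a TTL of 3 finds it at the cost of more messages.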

Survey of popular P2P applications



BitTorrent - Components

In the initial version of BitTorrent, a torrent is composed of:

A single content
- The content is cut down into pieces
- Pieces are cut down into blocks, which are the transmission units between peers
- The protocol only accounts for transferred pieces: partially received pieces cannot be served by a peer

A single Central Tracker
- The central tracker has
  - the list of all peers accessing or serving the file
  - the list of all pieces of the file, and their respective hash values

One or more Seeds
- Seeds have the entire file

Many Leechers
- Leechers download the file



365

BitTorrent – Peer-set

Peer-set
- The list of neighbors a peer is allowed to communicate with

Peer-set construction
- Each peer (seed or leecher) contacts the tracker and gets a list of peers participating in the same session
- Typically 50 peers are chosen at random by the tracker for each peer
- The peer-set is augmented by peers connecting directly to you
- The peer-set size is limited to 80 peers



366

BitTorrent - Algorithms

Two components in the BitTorrent downloading algorithm:

Peer Selection – determines from whom to download a piece

Piece Selection – determines which piece to download



367

Tit for Tat

Based on the English saying meaning "equivalent retaliation" ("tip for tap"), an agent using this strategy will respond in kind to a previous opponent's action.

If the opponent previously was cooperative, the agent is cooperative. If not, the agent is not.

This strategy depends on the following conditions, which have allowed it to become the most prevalent strategy for the Prisoner's Dilemma:
- 1. Unless provoked, the agent will always cooperate
- 2. If provoked, the agent will retaliate
- 3. The agent is quick to forgive

Taken from the wikipedia free encyclopedia - www.wikipedia.org



368

BitTorrent - Peer selection

Choke Algorithm
- Choking is a temporary refusal to upload
- Each peer unchokes a fixed number of peers (default = 4)
  - 3 peers on tit-for-tat basis
  - 1 peer on optimistic unchoke basis

BitTorrent - Peer selection (cont’d)

Tit-for-tat peer selection
- Select the 3 peers from which you downloaded most and that are interested in your chunks
- Peer selection is done every 10 seconds, based on the download rates of the last 30 seconds.


Optimistic unchoke peer selection
- Select one peer at random that is interested in your chunks, regardless of the current download rate from it
- Rotates every 30 seconds.

Reason:
- To discover currently unused connections that are better than the ones being used
- Corresponds to always cooperating on the first move in the prisoner's dilemma
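Both selection rules can be combined into one unchoke decision. The sketch below only picks the peers; the 10-second and 30-second timers, the rate measurement, and anti-snubbing are left out, and the function and variable names are assumptions made for the example.

```python
import random


def select_unchoked(download_rates, interested, rng, regular_slots=3):
    """Return (tit_for_tat_peers, optimistic_peer) among the interested peers.

    download_rates maps a peer to the rate recently downloaded from it;
    interested is the list of peers that want our pieces.
    """
    # Tit-for-tat: reward the interested peers that gave us the most data.
    by_rate = sorted(interested, key=lambda p: download_rates.get(p, 0.0), reverse=True)
    regular = by_rate[:regular_slots]
    # Optimistic unchoke: one random interested peer, whatever its rate.
    remaining = [p for p in interested if p not in regular]
    optimistic = rng.choice(remaining) if remaining else None
    return regular, optimistic


# Hypothetical download rates in kB/s.
rates = {"a": 50.0, "b": 10.0, "c": 80.0, "d": 30.0, "e": 0.0}
regular, optimistic = select_unchoked(rates, ["a", "b", "c", "d", "e"], random.Random(0))
```

The optimistic slot is what lets a newcomer with nothing to offer yet (peer "e" above) receive its first pieces.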





Anti-Snubbing
- When a remote peer has uploaded no data for 60 s, the local peer assumes that it has been snubbed
- In that case the local peer refuses to upload to it, except for the optimistic unchoking

BitTorrent - Piece selection

Random first piece
- Only applies if the leecher has downloaded fewer than 4 pieces (chunks)
- Choose the next piece to download at random
- Allows you to quickly download your first pieces, so that you have pieces to reciprocate with in the choke algorithm

BitTorrent - Piece selection (cont’d)

Local rarest first policy
- Determine the pieces that are rarest among your peers and download those first
- Ensures that the most common pieces are left until the end of the download
- Rarest first also ensures that a large variety of pieces are downloaded from the seed
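A local rarest first choice amounts to counting, for each piece still needed, how many peers in the peer-set hold it. In this sketch ties are broken by the lowest piece index to keep the example deterministic, which is an assumption; the names are illustrative.

```python
from collections import Counter


def rarest_first(needed, peer_pieces):
    """Return the needed piece held by the fewest peers, or None if none is available.

    needed is the set of piece indices still missing locally;
    peer_pieces maps each peer to the set of pieces it holds.
    """
    availability = Counter()
    for pieces in peer_pieces.values():
        availability.update(pieces & needed)   # count only pieces we still need
    if not availability:
        return None
    return min(availability, key=lambda piece: (availability[piece], piece))


peer_pieces = {"p1": {0, 1, 2}, "p2": {0, 1}, "p3": {0, 3}}
next_piece = rarest_first({0, 1, 2, 3}, peer_pieces)
```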




BitTorrent - Summary

Efficient file download thanks to simple incentive mechanisms
- Local rarest first
  - High piece entropy
- Tit-for-tat
  - Avoids free-riding
  - Optimizes resource utilization

Space for improvement?
- Steady state very stable and efficient
- Startup phase still unstable with some inefficiencies
- Is there an advantage in deploying BitTorrent on Set-Top-Boxes?
- Is BitTorrent adapted to mobile terminals/DTN networks?
- Possible usage of network coding?

Skype Overlay

Protocol not fully understood today
- Proprietary protocol
- Content and control messages are encrypted

Protocol reuses concepts of the FastTrack overlay used by KaZaA

Builds upon an unstructured overlay
- Combines
  - distributed index servers
  - a flat unstructured network among index servers
- Two-tier hierarchy
  - Super Nodes (SN)
  - Ordinary Nodes (ON)



377

Skype Overlay (cont’d)

Super Nodes (SN)
- Connect to each other, building a flat unstructured overlay (similar to the Gnutella overlay)

Ordinary Nodes (ON)
- Connect to Super Nodes that act as a directory server (similar to the index server in Napster)

Skype login server
- Only central component
- Stores and verifies usernames and passwords
- Stores the buddy list



378

Skype Overlay (cont’d)

[Figure: Skype overlay. Super Nodes (SN) and Ordinary Nodes (ON) linked by neighbor relationships; messages are exchanged with the Skype login server during login for authentication]



379

How is the overlay constructed? - Super Node Lists

Each node keeps a host cache with a list of Super Node IP addresses
- Up to 200 entries

Some Super Node IP addresses are hard-coded
- Super Nodes provided by Skype

These lists are used to locate a node's Super Node at login



380

How is the overlay constructed? - Login

Contact login server and authenticate

Advertise your presence to other peers
- Contact a Super Node
- Contact your buddies (through the Super Node), and notify them of your presence



381

Super Nodes – Index servers

Super Nodes are index servers
- I.e. index of locally connected Skype users (and their IP addresses)

If a buddy is not found in the local index of a Super Node
- Spread node search to neighboring Super Nodes
- Not clear how this is implemented
  - Possibly flood the request, similar to Gnutella



382

Super Nodes – Relay nodes

Super nodes also act as relay nodes
- Enable NAT traversal
- Avoid congested or faulty paths



383


Alice would like to call Bob (or vice versa)

[Figure: Alice and Bob each contact a Skype relay node; the call is then carried through the relay node]



385

Super Node election

When does an ordinary node become a super node?
- High bandwidth, public IP address, but details not clear
- Highly dynamic
  - Super Node churn, short Super Node session time

[Figure: Super Node churn and session time]



386

Super Node election

A world map of Skype Super Nodes



387

Skype - Summary

VoIP has other requirements than file download
- Delay
- Jitter

The Skype network seems to handle these constraints in spite of
- High node churn

Protocol not fully understood



Conclusion and future of p2p



389

P2P Attracting Attention from the Commercial World

NBC Universal goes peer-to-peer – worldmedia.com

BitTorrent raised $8.75 million in venture capital
- Teamed with CacheLogic to work for BT

Startups providing P2P live programs: pplive, coolstreaming

BBC Legal Download Platforms: iMP / Kontiki
- Allow users in the UK to download BBC TV and radio programs via a program guide for up to 7 days after broadcast



390

P2P Attracting Attention from the Commercial World

Microsoft is active
- Peer-to-Peer library
- Acquisition of Groove
- Avalanche
- RedCarpet
- P2P Windows update

Google and Apple are not using P2P... Yet (?)
- they face mounting costs with video

Google
- Google video is online
- Bought YouTube
- Bought Chinese p2p-company Xunlei Network Technology

Apple
- iTunes changed the world of music
- Will it change the world of video?
  - iTV will be a digital media adapter with HDD



391

Will P2P Go Beyond Desktop?

Current device requirement
- CPU, memory, and disk space requirement
- Platforms supported
- Internet connection requirement

Three categories of p2p application
- File downloading
  - BitTorrent already on some Set-Top-Boxes and DSL-routers
- Voice
  - Skype mobile phones
- Video
  - Not yet



392

Will P2P Go Beyond Desktop? (Discussion)

Mobile P2P?
- What benefits does p2p offer on mobile devices?
  - ???
- What are potential issues?
  - Power
  - Connection speed
  - ???

P2P on set-top box?
- ???

Other consumer electronic devices?
- ???



393

Future of P2P - Ad-hoc P2P

Opportunistically use all available technologies!

Access knowledge and resources of devices you cross in the street

Local P2P content search
- What is currently the best place to find a cab?
- What are the results of yesterday’s soccer match?



394

Future of P2P - Ad-hoc P2P (cont’d)

Your requests or messages are stored and forwarded
- Enable p2p communication even if there is no direct path between two peers at a given moment in time



395

Conclusions and Future of P2P

More commercial P2P applications
- Battles between legal and illegal content sharing will continue
- More p2p used in commercial environments
  - Reduce distribution cost and compete with illegal content

Secure P2P

Better performance
- More intelligent sharing
- More scalable
- Handle churn better
- Competing with other technology

Supporting diversity – long tail content
- YouTube

Supporting community

Relationship with ISPs

Become ubiquitous application ??



Peer-to-peer media streaming



Growth of Internet traffic

[Figure: Cisco's global consumer Internet traffic forecast (2007). Traffic in PB/month (axis from 0 to 8000) for 2007 to 2011, broken down into IPTV, Video streaming, P2P, VoIP, Web/Data, Gaming, Other video]



398

What is IPTV?

IPTV (Internet Protocol Television) is a system where a digital television service is delivered by using Internet Protocol over a network infrastructure.

IPTV is typically supplied by a service provider using a closed network infrastructure. This closed network approach is in competition with the delivery of TV content over the public Internet, called Internet Television. In businesses, IPTV may be used to deliver television content over corporate LANs.




399

What is peer-to-peer TV?

The term P2PTV refers to peer-to-peer (P2P) software applications designed to redistribute video streams in real time on a P2P network.

The distributed video streams are typically TV channels from all over the world but may also come from other sources.




400

Joost

Joost is a system for distributing TV shows and other forms of video using P2PTV technology

Created by the founders of Skype and Kazaa.

Has signed up more than a million beta testers and is on track for an end-of-year launch.

Uses H.264 video coding



401

Introduction

The advent of multimedia technology and the broadband surge lead to:
- Heavy usage of P2P applications, including:
  - Sharing of large videos over the Internet
- Video-on-Demand (VoD) applications
- P2P media streaming applications

BitTorrent-like P2P models are suitable for bulk file transfer

P2P file sharing has no QoS issues:
- No need to play back the media in real time
- Downloading takes a long time; many users do it overnight



402

Introduction (cont’d)

P2P media streaming is non-trivial:
- Need to play back the media in real time
  - Quality of Service
- Procure future media stream packets
  - Needs reliable neighbors and effective management
- High “churn” rate: users join and leave at any time
  - Needs robust network topology to overcome churn
- Internet dynamics and congestion in the interior of the network
  - Degrades QoS
- Fairness policies like tit-for-tat are extremely difficult to apply
  - High bandwidth users have no incentive to contribute



403

P2P Media Streaming

Media streaming is extremely expensive
- 1 hour of video encoded at 300 kbps = 128.7 MB
- Serving 1000 users would require 125.68 GB

A media server cannot serve everybody in the swarm
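The figures above can be checked with a few lines of arithmetic. The sketch assumes 1 kbit = 1000 bits, 1 MB = 2^20 bytes and 1 GB = 1024 MB, which is the convention that reproduces the 128.7 MB value.

```python
def video_size_mb(bitrate_kbps, duration_s):
    """Size in MB (2**20 bytes) of a stream at bitrate_kbps (1 kbit = 1000 bits)."""
    return bitrate_kbps * 1000 * duration_s / 8 / 2**20


size_mb = video_size_mb(300, 3600)            # one hour at 300 kbit/s
total_gb = size_mb * 1000 / 1024              # volume sent to 1000 users, in GB
server_uplink_mbps = 1000 * 300 / 1000        # simultaneous uplink for 1000 users, in Mbit/s
```

This gives about 128.7 MB per user and about 125.7 GB in total; the 125.68 GB quoted above results from rounding the per-user size to 128.7 MB before multiplying. Serving all 1000 users at once also requires a 300 Mbit/s server uplink, which is the load that P2P streaming spreads over the peers.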

In P2P Streaming:
- Peers form an overlay of nodes on top of the Internet
- Nodes in the overlay are connected by direct paths (virtual or logical links); in reality, they are connected by many physical links in the underlying network
- Nodes offer their uplink bandwidth while downloading and viewing the media content
- Takes load off the server
- Scalable



404

P2P Sharing

[Figure: P2P sharing as a content distribution tool. The file is chopped into pieces at the server and the pieces are exchanged among peers 1 to 5]



405

Major Approaches

Major approaches
- Content Distribution Networks like Akamai
  - Expensive → only large infrastructures can afford them
- Client-Server Model
  - Not scalable
- Application Layer Multicast
  - Alternative to IP Multicast
- Peer-to-Peer Based
  - Most viable and simple to use and deploy
  - No setup cost
  - Scalable



406

Content Distribution Networks (CDNs)

CDN nodes are deployed in multiple locations, often over multiple backbones

These nodes cooperate with each other to satisfy an end user’s request

A user request is sent to the nearest CDN node, which has a cached copy

QoS improves as the end user receives the best possible connection

Yahoo mail uses Akamai



407

Roadmap to Internet media streaming

[Figure: taxonomy of media streaming. Application Layer Multicast: tree based [NICE, ZigZag, SpreadIT] and mesh based [ESM, Narada]. Peer-to-Peer: [CoolStreaming, PPLive, SOPCast, TV Ants, Feidian]]



408

Application Layer Multicast (ALM)

Very sparse deployment of IP Multicast due to technical and administrative reasons

In ALM:

Î Multicasting implemented at end hosts instead of network routers

Î Nodes form unicast channels or tunnels between them

Î Overlay construction algorithms at end hosts can be more easily applied

Î End hosts need a lot of bandwidth

Most ALM approaches form a tree-based topology:

Î Simple to use

Î Ineffective in case of churn and node failures, as it incurs high recovery time



409

ALM Methodologies

Tree Based

Î Content flows from server to nodes in a tree-like fashion: every node forwards the content to its children, which in turn forward it to their children

Î One point of failure for a complete subtree

Î High recovery time

Î Notable tree-based approaches: NICE, SpreadIT, ZigZag

Mesh Based

Î Overcomes tree based flaws

Î Nodes maintain state information of many nodes

Î High control overhead

Î Notable mesh-based approaches include Narada and ESM from CMU
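
The "one point of failure for a complete subtree" property of tree-based ALM can be illustrated with a small sketch. The tree, the node names and the `descendants` helper below are hypothetical and are not taken from NICE, SpreadIT or ZigZag; the code only lists which nodes stop receiving content when a single interior node leaves.

```python
from __future__ import annotations

from collections import deque


def descendants(children: dict[str, list[str]], node: str) -> set[str]:
    """Return every node below `node` in the distribution tree (breadth-first)."""
    seen: set[str] = set()
    queue = deque(children.get(node, []))
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.add(current)
            queue.extend(children.get(current, []))
    return seen


# Hypothetical distribution tree: parent -> list of children.
tree = {
    "server": ["a", "b"],
    "a": ["c", "d"],
    "c": ["e"],
    "b": ["f"],
}

# If node "a" departs, its whole subtree loses the stream until it is re-attached.
cut_off = descendants(tree, "a")
print(sorted(cut_off))
```

In a mesh, each of these nodes would typically have other neighbors to fall back on, at the price of the extra state and control traffic noted above.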



410

Tree Based ALM



411

Mesh Based ALM



412

Peer-to-Peer Streaming Models

Design flaws in ALM led to current-day P2P streaming models based on chunk-driven technology

Media content is broken down into small pieces and disseminated in the swarm

Neighboring nodes use a gossip protocol to exchange buffer information

Nodes trade the pieces they are missing

Robust and scalable

Most noted approach in recent years: CoolStreaming

Î PPLive, SOPCast, Feidian, TV Ants are derivatives of CoolStreaming

Î Proprietary; working philosophy not published

Î Reverse engineered; measurement studies released



413

CoolStreaming

File is chopped by the server and disseminated in the swarm

Upon arrival, a node obtains a peer list of 40 nodes from the server

The node contacts these nodes for media content

In steady state, every node typically has 4-8 neighbors; it periodically shares its buffer content map with its neighbors

Nodes exchange the content they are missing

Deployed in the real world and highly successful
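
The buffer map exchange described above can be sketched as follows. This is a minimal illustration, not CoolStreaming's actual scheduler: the `plan_requests` function, the representation of a buffer map as a set of chunk numbers, and the rule "ask the first neighbor (by name) that holds the chunk" are all assumptions made for the example.

```python
from __future__ import annotations


def plan_requests(
    my_map: set[int],
    neighbor_maps: dict[str, set[int]],
    window: range,
) -> dict[int, str]:
    """For each chunk in the playback window that is missing locally,
    choose a neighbor whose buffer map advertises that chunk."""
    plan: dict[int, str] = {}
    for chunk in window:
        if chunk in my_map:
            continue  # already buffered
        holders = sorted(n for n, m in neighbor_maps.items() if chunk in m)
        if holders:
            plan[chunk] = holders[0]  # assumption: simplest possible choice
    return plan


# Hypothetical state: chunk numbers held locally and advertised by two neighbors.
my_map = {0, 1, 4}
neighbor_maps = {"n1": {0, 1, 2, 3}, "n2": {2, 4, 5}}

plan = plan_requests(my_map, neighbor_maps, range(0, 7))
print(plan)
```

Chunk 6 is held by no neighbor, so it is left out of the plan; a real system would have to obtain it from other peers or the server before its playback deadline.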



414

Metrics

Quality of Service

Î Jitter-free transmission

Î Low end-to-end latency

Uplink utilization

Î High uplink throughput leads to scalable P2P systems

Robustness and Reliability

Î Churn, node failure or departure should not affect QoS

Scalability

Fairness

Î Determined in terms of content served (Share Ratio)

Î No user should be forced to upload much more than what it has downloaded

Security

Î Implicitly affects the above metrics



415

Quality of Service

Most important metric

Jitter: unavailability of stream content at play time causes jitter

Jitter-free transmission ensures good media playback

Continuous supply of stream content ensures no jitter

Latency: difference in time between playback at server and user

Lower latency keeps users interested

Î A live event such as a soccer match loses importance in crucial moments if the transmission is delayed

Reducing hop count reduces latency



416

Uplink Utilization

Uplink is the scarcest and most important resource in the swarm

Summation of uplinks of all nodes is the load taken off the server

Utilization = Uplink used / Uplink available

Needs effective node organization and topology to maximize uplink utilization

High uplink throughput means more bandwidth in the swarm and hence leads to scalable P2P systems
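
The utilization formula above can be evaluated over a whole swarm as in the sketch below. The per-node numbers are made up for illustration, and summing used and available uplink over all nodes before dividing is an assumption about how the ratio is aggregated.

```python
from __future__ import annotations


def uplink_utilization(used_kbps: list[float], available_kbps: list[float]) -> float:
    """Swarm-wide utilization = total uplink used / total uplink available."""
    total_available = sum(available_kbps)
    if total_available == 0:
        return 0.0  # no uplink offered at all
    return sum(used_kbps) / total_available


# Hypothetical swarm of three nodes (kbit/s).
used = [300.0, 150.0, 0.0]
available = [512.0, 256.0, 256.0]

utilization = uplink_utilization(used, available)
print(f"utilization: {utilization:.2%}")
```

The third node contributes nothing here, which lowers the overall ratio even though the other two are well used; this is the kind of imbalance a good peering topology tries to avoid.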



417

Robustness and Reliability

A robust and reliable P2P system should be able to provide an acceptable level of QoS under the following conditions:

Î High churn

Î Node failure

Î Congestion in the interior of the network

Affects QoS

Efficient peering techniques and node topology ensure robust and reliable P2P networks



418

Scalability

Serve as many users as possible with an acceptable level of QoS

Increasing number of nodes should not degrade QoS

An effective overlay node topology and high uplink throughput ensure scalable systems



419

Fairness

Measured in terms of content served to the swarm

Î Share Ratio = Uploaded Volume / Downloaded Volume

Randomness in swarm causes severe disparity

Î Many nodes upload huge volumes of content

Î Many nodes get a free ride with little or no contribution

Must have an incentive for an end user to contribute

P2P file sharing systems like BitTorrent use a tit-for-tat policy to stop free riding

Not easy to use in streaming, as nodes procure pieces in real time and applying tit-for-tat can cause delays
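
The share ratio defined above can be computed per node as in the following sketch. The peer names, byte counts and the 0.1 threshold used to flag free riders are hypothetical; the slides define only the ratio itself.

```python
from __future__ import annotations


def share_ratio(uploaded_bytes: int, downloaded_bytes: int) -> float:
    """Share Ratio = Uploaded Volume / Downloaded Volume."""
    if downloaded_bytes == 0:
        # Nothing downloaded yet: infinite if the node uploaded, else zero.
        return float("inf") if uploaded_bytes > 0 else 0.0
    return uploaded_bytes / downloaded_bytes


# Hypothetical per-node totals: (uploaded, downloaded) in bytes.
peers = {
    "p1": (900_000_000, 300_000_000),  # uploads three times what it downloads
    "p2": (150_000_000, 300_000_000),
    "p3": (0, 300_000_000),            # downloads only
}

ratios = {name: share_ratio(up, down) for name, (up, down) in peers.items()}

FREE_RIDER_THRESHOLD = 0.1  # assumption chosen for this example
free_riders = [name for name, r in ratios.items() if r < FREE_RIDER_THRESHOLD]
print(ratios, free_riders)
```

A spread like this one, from 3.0 down to 0.0, is the disparity the slide refers to.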



420

Security

Implicitly affects other P2P Streaming metrics

Mainly 4 types of attacks:

Î Malicious garbled payload insertion

Î Free rider: selfish user only downloads, with no uploads

Î Whitewasher: after being kicked out, comes back with a new identity. Such nodes use IP spoofing

Î DDoS attack: one or more nodes collectively launch a DoS attack on the media server to bring the system down

Many attacks on P2P file sharing systems but very few on streaming

Î Possibility cannot be denied



421

Current Issues

High buffering time

Î Half a minute for popular streaming channels and around 2 minutes for less popular ones

Some nodes lag behind their peers by more than 2 minutes in playback time

Î Better peering strategy needed

Uneven distribution of uplink bandwidths (unfairness)

Huge volumes of cross-ISP traffic

Î ISPs use bandwidth throttling to limit bandwidth usage

Î Degrades QoS perceived at the user end

Sub Optimal uplink utilization
