EE465: Introduction to Digital Image Processing

One-Minute Survey Result

Thank you for your responses: Kristen, Anusha, Ian, Christofer, Bernard, Greg, Michael, Shalini, Brian and Justin.

Valentine's challenge: Min: 30-45 minutes, Max: 5 hours, Ave: 2-3 hours.

Muddiest points: regular tree grammar (CS410: Compilers or CS422: Automata); fractal geometry ("The Fractal Geometry of Nature" by Mandelbrot).

Seeing the connection: remember the first story in Steve Jobs' speech "Staying Hungry, Staying Foolish"? In addition to Jobs and Shannon, I have two more examples: Charles Darwin and Bruce Lee.


Data Compression Basics

Discrete source: Information = uncertainty; Quantification of uncertainty; Source entropy

Variable length codes: Motivation; Prefix condition; Huffman coding algorithm


Information

What do we mean by information? "A numerical measure of the uncertainty of an experimental outcome" – Webster Dictionary

How to quantitatively measure and represent information? Shannon proposes a statistical-mechanics inspired approach.

Let us first look at how we assess the amount of information in our daily lives using common sense.


Information = Uncertainty

Zero information:
• Pittsburgh Steelers won Super Bowl XL (past news, no uncertainty)
• Yao Ming plays for the Houston Rockets (celebrity fact, no uncertainty)

Little information:
• It will be very cold in Chicago tomorrow (not much uncertainty since this is winter time)
• It is going to rain in Seattle next week (not much uncertainty since it rains nine months a year in the Northwest)

Large information:
• An earthquake is going to hit CA in July 2006 (are you sure? an unlikely event)
• Someone has shown P=NP (Wow! Really? Who did it?)


Shannon's Picture on Communication (1948)

Block diagram: source → source encoder → channel encoder → channel → channel decoder → source decoder → destination
(the channel encoder, the channel, and the channel decoder together form the "super-channel")

Examples of source: human speeches, photos, text messages, computer programs …
Examples of channel: storage media, telephone lines, wireless transmission …

The goal of communication is to move information from here to there and from now to then.


Source-Channel Separation Principle*

The role of source coding (data compression): facilitate storage and transmission by eliminating source redundancy. Our goal is to maximally remove the source redundancy by intelligently designing the source encoder/decoder.

The role of channel coding: fight against channel errors for reliable transmission of information (the design of the channel encoder/decoder is considered in EE461). We simply assume that the super-channel achieves error-free transmission.


Discrete Source

A discrete source is characterized by a discrete random variable X

Examples:
• Coin flipping: P(X=H) = P(X=T) = 1/2
• Dice tossing: P(X=k) = 1/6, k = 1, …, 6
• Playing-card drawing: P(X=S) = P(X=H) = P(X=D) = P(X=C) = 1/4

What is the redundancy with a discrete source?


Two Extreme Cases

Case 1: tossing a fair coin (Head or Tail?), passed through source encoder → channel → source decoder.
P(X=H) = P(X=T) = 1/2: maximum uncertainty, minimum (zero) redundancy; compression is impossible.

Case 2: tossing a coin with two identical sides, so the output is HHHH… or TTTT… and the "coding" reduces to pure duplication.
P(X=H) = 1, P(X=T) = 0: minimum uncertainty, maximum redundancy; compression is trivial (1 bit is enough).

Redundancy is the opposite of uncertainty.


Quantifying Uncertainty of an Event

Self-information:

I(p) = -\log_2 p

where p is the probability of the event x (e.g., x can be X=H or X=T).

p = 1: I(p) = 0 (the event must happen, no uncertainty)
p → 0: I(p) → ∞ (the event is unlikely to happen, infinite amount of uncertainty)

Intuitively, I(p) measures the amount of uncertainty associated with event x.


Weighted Self-information

I_w(p) = p \cdot I(p) = -p \log_2 p

p = 0: I_w(p) = 0
p = 1/2: I_w(p) = 1/2
p = 1: I_w(p) = 0

As p evolves from 0 to 1, the weighted self-information first increases and then decreases.

Question: Which value of p maximizes I_w(p)?


Maximum of Weighted Self-information*

I_w(p) is maximized at p = 1/e, where

I_w(1/e) = \frac{1}{e \ln 2} \approx 0.53
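A quick numerical sanity check of these two definitions and of the maximizing point p = 1/e (a minimal Python sketch, not part of the original slides; the function names are mine):

```python
import math

def self_information(p: float) -> float:
    """I(p) = -log2(p), in bits; p is the probability of an event."""
    return -math.log2(p)

def weighted_self_information(p: float) -> float:
    """Iw(p) = p * I(p) = -p * log2(p), with the convention Iw(0) = 0."""
    return 0.0 if p == 0 else -p * math.log2(p)

print(self_information(0.25), self_information(1.0))  # 2.0 -0.0 (a certain event carries no information)

# Iw(p) rises from 0, peaks, then falls back to 0 as p sweeps from 0 to 1.
grid = [k / 10000 for k in range(1, 10000)]
p_star = max(grid, key=weighted_self_information)
print(p_star, 1 / math.e)                                                  # both close to 0.3679
print(weighted_self_information(1 / math.e), 1 / (math.e * math.log(2)))   # both ~0.5307
```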


Quantification of Uncertainty of a Discrete Source

A discrete source (random variable) is a collection (set) of individual events whose probabilities sum to 1.

X is a discrete random variable:

x \in \{1, 2, \dots, N\}, \quad p_i = prob(x = i), \ i = 1, 2, \dots, N, \quad \sum_{i=1}^{N} p_i = 1

To quantify the uncertainty of a discrete source, we simply take the summation of weighted self-information over the whole set.


Shannon’s Source Entropy Formula

H(X) = \sum_{i=1}^{N} I_w(p_i) = -\sum_{i=1}^{N} p_i \log_2 p_i   (bits/sample, or bps)

The probabilities p_i serve as the weighting coefficients.
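The formula translates directly into a few lines of Python (a minimal sketch, not part of the slides; entropy is my own helper name):

```python
import math

def entropy(probs) -> float:
    """H(X) = -sum_i p_i * log2(p_i), in bits per sample (bps)."""
    assert abs(sum(probs) - 1.0) < 1e-9, "probabilities must sum to 1"
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))    # 1.0 bps: a fair coin, maximum uncertainty
print(entropy([0.99, 0.01]))  # ~0.081 bps: a heavily skewed source, almost no uncertainty
```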


Source Entropy Examples

Example 1 (binary Bernoulli source): flipping a coin with the probability of head being p (0 < p < 1)

p = prob(x = 0), \quad q = 1 - p = prob(x = 1)

H(X) = -(p \log_2 p + q \log_2 q)

Check the two extreme cases:
• As p goes to zero, H(X) goes to 0 bps: compression gains the most
• As p goes to one half, H(X) goes to 1 bps: no compression can help


Entropy of Binary Bernoulli Source (figure: plot of H(X) versus p, rising from 0 at p = 0 to a peak of 1 bps at p = 1/2 and falling back to 0 at p = 1)


Source Entropy Examples

Example 2 (4-way random walk): a step is taken in one of the four directions N, E, S, W with

prob(x = S) = \frac{1}{2}, \quad prob(x = N) = \frac{1}{4}, \quad prob(x = E) = prob(x = W) = \frac{1}{8}

H(X) = -\left( \frac{1}{2}\log_2\frac{1}{2} + \frac{1}{4}\log_2\frac{1}{4} + \frac{1}{8}\log_2\frac{1}{8} + \frac{1}{8}\log_2\frac{1}{8} \right) = 1.75 bps
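The same number drops out of the entropy() sketch given earlier, or, inlined (assuming the probabilities above):

```python
import math

probs = {"S": 1/2, "N": 1/4, "E": 1/8, "W": 1/8}
H = -sum(p * math.log2(p) for p in probs.values())
print(H)  # 1.75 bps, matching the hand calculation above
```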


Source Entropy Examples (Con’t)

Example 3 (source with geometric distribution): a jar contains the same number of balls of two different colors, blue and red. Each time, a ball is randomly picked out of the jar and then put back. Consider the event that the k-th pick is the first time a red ball is seen. What is the probability of this event?

p = prob(x = red) = \frac{1}{2}, \quad 1 - p = prob(x = blue) = \frac{1}{2}

Prob(event) = Prob(blue in the first k-1 picks) × Prob(red in the k-th pick) = \left(\frac{1}{2}\right)^{k-1} \cdot \frac{1}{2} = \left(\frac{1}{2}\right)^k


Source Entropy Calculation

If we consider all possible events, the sum of their probabilities will be one. Then we can define a discrete random variable X with

P(x = k) = \left(\frac{1}{2}\right)^k, \quad k = 1, 2, \dots

Check: \sum_{k=1}^{\infty} \left(\frac{1}{2}\right)^k = 1

Entropy: H(X) = -\sum_{k=1}^{\infty} p_k \log_2 p_k = \sum_{k=1}^{\infty} \frac{k}{2^k} = 2 bps

Problem 1 in HW3 is slightly more complex than this example.
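A quick numerical check of the infinite sum (a small sketch; truncating at k = 60 is my choice, since the terms k/2^k decay so fast that the tail is negligible):

```python
# H(X) = sum_{k>=1} k * (1/2)^k; truncate the series at k = 60.
H = sum(k * (1/2) ** k for k in range(1, 61))
print(H)  # ~2.0 bps, as claimed
```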


Properties of Source Entropy

Nonnegative and concave.
Achieves the maximum when the source observes a uniform distribution (i.e., P(x=k) = 1/N, k = 1, …, N).
Goes to zero (the minimum) as the source becomes more and more skewed (i.e., P(x=k) → 1, P(x≠k) → 0).


History of Entropy

Origin: Greek root for "transformation content".
First introduced by Rudolf Clausius to study thermodynamical systems in 1862.
Developed by Ludwig Eduard Boltzmann in the 1870s-1880s (the first serious attempt to understand nature in a statistical language).
Borrowed by Shannon in his landmark work "A Mathematical Theory of Communication" in 1948.



A Little Bit of Mathematics*

Entropy S is proportional to log P (P is the relative probability of a state).

Consider an ideal gas of N identical particles, of which N_i are in the i-th microscopic condition (range) of position and momentum.

Using Stirling's formula \log N! \approx N \log N - N and noting that p_i = N_i / N, one obtains S \sim -\sum_i p_i \log p_i.
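Spelling out the step the slide alludes to (a sketch only; the Boltzmann constant k_B, which the slide omits, is included for completeness):

```latex
W = \frac{N!}{\prod_i N_i!}, \qquad
S = k_B \ln W = k_B\Bigl(\ln N! - \sum_i \ln N_i!\Bigr)
  \approx k_B\Bigl(N\ln N - N - \sum_i \bigl(N_i\ln N_i - N_i\bigr)\Bigr)
  = -k_B N \sum_i p_i \ln p_i ,
\qquad p_i = \frac{N_i}{N}, \quad \sum_i N_i = N .
```

Up to the constants k_B and N (and the choice of logarithm base), this is exactly Shannon's entropy formula.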



Entropy-related Quotes

“My greatest concern was what to call it. I thought of calling it ‘information’, but the word was overly used, so I decided to call it ‘uncertainty’. When I discussed it with John von Neumann, he had a better idea. Von Neumann told me, ‘You should call it entropy, for two reasons. In the first place your uncertainty function has been used in statistical mechanics under that name, so it already has a name. In the second place, and more important, nobody knows what entropy really is, so in a debate you will always have the advantage.’”

– Conversation between Claude Shannon and John von Neumann regarding what name to give to the “measure of uncertainty”, or attenuation in phone-line signals (1949)



Other Uses of Entropy

In biology: "the order produced within cells as they grow and divide is more than compensated for by the disorder they create in their surroundings in the course of growth and division." – A. Lehninger. Ecological entropy is a measure of biodiversity in the study of biological ecology.

In cosmology: "black holes have the maximum possible entropy of any object of equal size" – Stephen Hawking



What is the use of H(X)?

Shannon's first theorem (noiseless coding theorem): for a memoryless discrete source X, its entropy H(X) defines the minimum average code length required to noiselessly code the source.

Notes:
1. Memoryless means that the events are independently generated (e.g., the outcomes of flipping a coin N times are independent events).
2. Source redundancy can then be understood as the difference between the raw data rate and the source entropy.


Code Redundancy*

Code redundancy:

r = \bar{l} - H(X) \ge 0

Theoretical bound (source entropy):

H(X) = \sum_{i=1}^{N} p_i \log_2 \frac{1}{p_i}

Practical performance (average code length):

\bar{l} = \sum_{i=1}^{N} p_i l_i

where l_i is the length of the codeword assigned to the i-th symbol.

Note: if we represent each symbol by q bits (fixed-length codes), then the redundancy is simply q - H(X) bps.
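As an illustration (a hedged sketch using the 4-way random walk source from Example 2 and the variable-length code that appears a few slides later; the variable names are mine):

```python
import math

probs   = [1/2, 1/4, 1/8, 1/8]   # P(S), P(N), P(E), P(W)
lengths = [1, 2, 3, 3]           # codeword lengths of the VLC: 0, 10, 110, 111

H    = -sum(p * math.log2(p) for p in probs)        # theoretical bound
lbar = sum(p * l for p, l in zip(probs, lengths))    # average code length
print(H, lbar, lbar - H)   # 1.75 1.75 0.0 -> this VLC has zero redundancy

# With fixed-length (q = 2 bits) codewords the redundancy is q - H(X):
print(2 - H)               # 0.25 bps
```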


How to achieve source entropy?

Block diagram: discrete source X → entropy coding (driven by P(X)) → binary bit stream

Note: the entropy coding problem above is based on the simplifying assumptions that the discrete source X is memoryless and that P(X) is completely known. These assumptions often do not hold for real-world data such as images, and we will revisit them later.


Data Compression Basics

Discrete source: Information = uncertainty; Quantification of uncertainty; Source entropy

Variable length codes: Motivation; Prefix condition; Huffman coding algorithm


Recall: Variable Length Codes (VLC)

Self-information: I(p) = -\log_2 p

Assign a long codeword to an event with small probability, and a short codeword to an event with large probability.

It follows from the formula above that a small-probability event contains much information and is therefore worth many bits to represent. Conversely, if some event occurs frequently, it is probably a good idea to use as few bits as possible to represent it. This observation leads to the idea of varying the code lengths based on the events' probabilities:

l(x) = -\log_2 p(x)


4-way Random Walk Example

symbol k   p_k     fixed-length codeword   variable-length codeword
S          0.5     00                      0
N          0.25    01                      10
E          0.125   10                      110
W          0.125   11                      111

symbol stream:    S S N W S E N N W S S S N E S S
fixed length:     00 00 01 11 00 10 01 01 11 00 00 00 01 10 00 00   (32 bits)
variable length:  0 0 10 111 0 110 10 10 111 0 0 0 10 110 0 0       (28 bits)

4 bits of savings achieved by VLC (redundancy eliminated)
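Replaying the encoding in code (a hedged sketch; the dictionaries simply restate the table above):

```python
fixed = {"S": "00", "N": "01", "E": "10", "W": "11"}
vlc   = {"S": "0",  "N": "10", "E": "110", "W": "111"}

stream = "S S N W S E N N W S S S N E S S".split()

fixed_bits = "".join(fixed[s] for s in stream)
vlc_bits   = "".join(vlc[s] for s in stream)
print(len(fixed_bits), len(vlc_bits))  # 32 28 -> 4 bits saved by the VLC
```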


Toy Example (Con't)

Source entropy:

H(X) = -\sum_{k=1}^{4} p_k \log_2 p_k = 1.75 bps

Average code length:

\bar{l} = \frac{N_b}{N_s}  (bps), where N_b is the total number of bits and N_s is the total number of symbols

Fixed length: \bar{l} = 2 bps > H(X)
Variable length: \bar{l} = 0.5×1 + 0.25×2 + 0.125×3 + 0.125×3 = 1.75 bps = H(X)


Problems with VLC

When codewords have fixed lengths, the boundary between codewords is always identifiable. For codewords with variable lengths, the boundary can become ambiguous.

Example VLC: S = 0, N = 1, E = 10, W = 11

encode:  S S N W S E …   →   0 0 1 11 0 10 …

decode:  0 0 1 11 0 10 …   →   S S N W S E …   (intended parsing)
decode:  0 0 11 1 0 10 …   →   S S W N S E …   (the same bit stream, parsed differently)


Uniquely Decodable Codes

To avoid ambiguity in decoding, we need to enforce certain conditions on VLC to make them uniquely decodable.

Since ambiguity arises when some codeword becomes the prefix of another, it is natural to consider a prefix condition.

Example: p, pr, pre, pref, prefi are all prefixes of "prefix" (a ≺ b: a is the prefix of b).


Prefix condition

No codeword is allowed to be the prefix of any other codeword.

We will graphically illustrate this condition with the aid of a binary codeword tree.


Binary Codeword Tree

Starting from the root, each node splits into a child labeled 1 and a child labeled 0:

Level 1: 1, 0              (2 codewords)
Level 2: 11, 10, 01, 00    (2^2 codewords)
…
Level k:                   (2^k codewords)


Prefix Condition Examples

symbol x   codeword 1   codeword 2
S          0            0
N          1            10
E          10           110
W          11           111

On the binary codeword tree, codeword set 1 occupies the nodes 0, 1, 10, 11: the codeword 1 is the prefix of 10 and 11, so the prefix condition is violated. Codeword set 2 occupies the nodes 0, 10, 110, 111: no codeword is the prefix of another, so the prefix condition is satisfied.
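Checking the condition mechanically (a small sketch; satisfies_prefix_condition is my own helper name):

```python
def satisfies_prefix_condition(codewords) -> bool:
    """True if no codeword is a prefix of any other codeword."""
    return not any(a != b and b.startswith(a) for a in codewords for b in codewords)

print(satisfies_prefix_condition(["0", "1", "10", "11"]))     # False: "1" is a prefix of "10" and "11"
print(satisfies_prefix_condition(["0", "10", "110", "111"]))  # True
```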


How to satisfy prefix condition?

Basic rule: if a node is used as a codeword, then none of its descendants can be used as codewords.

Example: the codewords 0, 10, 110, 111 on the binary tree. Once 0 is chosen as a codeword, its descendants (00, 01, …) are excluded; once 10 is chosen, its descendants are excluded; and so on.


Property of Prefix Codes

Kraft's inequality:

\sum_{i=1}^{N} 2^{-l_i} \le 1

where l_i is the length of the i-th codeword (proof skipped).

Example:

symbol x   VLC-1   VLC-2
S          0       0
N          1       10
E          10      110
W          11      111

VLC-1: \sum_{i=1}^{4} 2^{-l_i} = \frac{1}{2} + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} = \frac{3}{2} > 1  (violates the inequality)
VLC-2: \sum_{i=1}^{4} 2^{-l_i} = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{8} = 1 \le 1  (satisfies the inequality)
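The two Kraft sums can be verified in one line each (a minimal sketch; kraft_sum is my own helper name):

```python
def kraft_sum(lengths) -> float:
    """Sum of 2^(-l_i) over the codeword lengths l_i."""
    return sum(2.0 ** -l for l in lengths)

print(kraft_sum([1, 1, 2, 2]))  # 1.5  > 1: VLC-1 cannot be a prefix code
print(kraft_sum([1, 2, 3, 3]))  # 1.0 <= 1: VLC-2 is consistent with Kraft's inequality
```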


Two Goals of VLC design

• achieve optimal code length (i.e., minimal redundancy)
• satisfy the prefix condition

For an event x with probability p(x), the optimal code length is \lceil -\log_2 p(x) \rceil, where \lceil x \rceil denotes the smallest integer larger than or equal to x (e.g., \lceil 3.4 \rceil = 4).

Code redundancy: r = \bar{l} - H(X) \ge 0. Unless the probabilities of the events are all powers of 2, we often have r > 0.


Solution:

Huffman coding (Huffman, 1952): we will cover it later while studying JPEG.

Arithmetic coding (1980s): not covered in EE465, but in EE565 (F2008).


Golomb Codes for Geometric Distribution

Optimal VLC for a geometric source: P(X=k) = (1/2)^k, k = 1, 2, …

k          1   2    3     4      5       6        7         8          …
codeword   0   10   110   1110   11110   111110   1111110   11111110   …

(Each codeword is obtained by walking down the binary codeword tree: k-1 ones followed by a terminating zero.)
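A tiny generator for these codewords (a sketch; the closing remark about the expected length is a standard observation, not stated on the slide):

```python
def golomb_codeword(k: int) -> str:
    """Codeword for X = k under P(X=k) = (1/2)^k: (k-1) ones followed by a zero."""
    return "1" * (k - 1) + "0"

for k in range(1, 9):
    print(k, golomb_codeword(k))

# The codeword length equals k = -log2 P(X=k), so the expected code length is
# sum_k k*(1/2)^k = 2 bps, which matches the source entropy: zero redundancy.
```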


Summary of Data Compression Basics

Shannon's source entropy formula (theory): entropy (uncertainty) is quantified by weighted self-information

H(X) = -\sum_{i=1}^{N} p_i \log_2 p_i   (bps)

VLC thumb rule (practice): long codeword for a small-probability event, short codeword for a large-probability event

l(x) = -\log_2 p(x)