
L01 - Basics of Information 1 | Comp120 – Spring 2005 | 1/13/04

Welcome to Comp 120!

Handouts: Lecture Slides, Syllabus

"I thought this course was called 'Computer Organization'." - David Macaulay

1) Course Mechanics
2) Course Objectives
3) Information

L01 - Basics of Information 2 | Comp120 – Spring 2005 | 1/13/04

Meet the Crew…

Lectures: Marc Pollefeys (SN 205), Office Hours Wed. 14:00-16:00
TA: Bud Vasile (SN 034), Office Hours Mon. 14:00-15:30
Book: Patterson & Hennessy, Computer Organization & Design, 3rd Edition, ISBN: 1-55860-604-1
(However, you won't need it for the next couple of weeks.)


L01 - Basics of Information 3 | Comp120 – Spring 2005 | 1/13/04

Course Mechanics

Grading:
- Best 8 of 10 problem sets: 40%
- 2 Quizzes: 30%
- Final Exam: 30%

Problem sets will be distributed on Thursdays and are due back at the next Thursday class meeting (before the beginning of lecture). Sometimes you will have one week to complete a set, at other times two. Late problem sets will not be accepted, but the lowest two problem-set scores will be dropped.

I will attempt to make Lecture Notes, Problem Sets, and other course materials available on the web by 10am on the day they are given. TODAY IS THE LAST DAY I WILL BRING HANDOUTS.

L01 - Basics of Information 4 | Comp120 – Spring 2005 | 1/13/04

Comp 120: Course Clickables

Announcements, corrections, etc.
On-line copies of all handouts

http://www.unc.edu/courses/2005spring/comp/120/001


L01 - Basics of Information 5 | Comp120 – Spring 2005 | 1/13/04

What is this Course About?

The THREE main goals of Comp 120:

1) Demystify Computers
2) Power of Abstraction
3) Learn the Structures of Computation

L01 - Basics of Information 6 | Comp120 – Spring 2005 | 1/13/04

Goal 1: Demystify Computers

Strangely, most people (even some computer scientists) are afraid of computers. We are only afraid of things we do not understand!

"I do not fear computers. I fear the lack of them." - Isaac Asimov (1920 - 1992)

"Fear is the main source of superstition, and one of the main sources of cruelty. To conquer fear is the beginning of wisdom." - Bertrand Russell (1872 - 1970)


L01 - Basics of Information 7 | Comp120 – Spring 2005 | 1/13/04

Analogy: The Electric Motor

Faraday's and Henry's invention, the electric motor, seems to magically convert electrical energy to mechanical motion. From the outside it looks mysterious. Opening it up might not shed much light either!

L01 - Basics of Information 8 | Comp120 – Spring 2005 | 1/13/04

Goal 2: Power of Abstraction

Define a function, develop a robust implementation, and then put a box around it.

Abstraction enables us to create unfathomable machines called computers.

Why do we need ABSTRACTION? Imagine a billion --- 1,000,000,000


L01 - Basics of Information 9 | Comp120 – Spring 2005 | 1/13/04

The key to building systems with >1G components

The design hierarchy, from the whole machine down to the device level:

- Personal Computer: Hardware & Software
- Circuit Board: ≈8 / system; 1-2G devices
- Integrated Circuit: ≈8-16 / PCB; 0.25M-16M devices
- Module: ≈8-16 / IC; 100K devices
- Cell: ≈1K-10K / Module; 16-64 devices
- Gate: ≈2-16 / Cell; 8 devices
- MOSFET: the underlying scheme for representing information

L01 - Basics of Information 10 | Comp120 – Spring 2005 | 1/13/04

What do we See in a Computer?

• Structure
  – hierarchical design
  – limited complexity at each level
  – reusable building blocks
• What makes a good system design?
  – "Bang for the buck": minimal mechanism, maximal function
  – reliable in a wide range of environments
  – accommodates future technical improvements
• Interfaces
  – key elements of system engineering; typically outlive the technologies they interface
  – isolate technologies, allow evolution
  – major abstraction mechanism

("Wait! I think I see a bug in the DIV logic." "Got that one off the web. Sure hope it works.")


L01 - Basics of Information 11 | Comp120 – Spring 2005 | 1/13/04

Goal 3: Computational Structures

What are the fundamental elements of computation? Can we define computation independent of the substrate on which it is implemented?

[Image: Babbage's difference engine]

L01 - Basics of Information 12 | Comp120 – Spring 2005 | 1/13/04

Our Plan of Attack…

- Understand how things work, bottom-up
- Encapsulate our understanding using appropriate abstractions
- Study organizational principles: abstractions, interfaces, APIs
- Roll up our sleeves and design at each level of the hierarchy
- Learn the big picture and understand trends


L01 - Basics of Information 13 | Comp120 – Spring 2005 | 1/13/04

What is "Computation"?

Computation is about "processing information":
- Transforming information from one form to another
- Deriving new information from old
- Finding information associated with a given input
- Computation describes the motion of information through time
- Communication describes the motion of information through space

L01 - Basics of Information 14 | Comp120 – Spring 2005 | 1/13/04

What is "Information"?

information, n. Knowledge communicated or received concerning a particular fact or circumstance.

Information resolves uncertainty. Information is simply that which cannot be predicted. The less predictable a message is, the more information it conveys!

("UNC won again." "Tell me something new…" "10 problem sets, 2 quizzes, and a final!")


L01 - Basics of Information 15 | Comp120 – Spring 2005 | 1/13/04

Real-World Information

Why do unexpected messages get allocated the biggest headlines? …because they carry the most information.

L01 - Basics of Information 16 | Comp120 – Spring 2005 | 1/13/04

Quantifying Information (Claude Shannon, 1948)

Suppose you're faced with N equally probable choices, and I give you a fact that narrows it down to M choices. Then I've given you

log2(N/M) bits of information

Examples:
- information in one coin flip: log2(2/1) = 1 bit
- roll of a die: log2(6/1) ≈ 2.6 bits
- outcome of a football game: 1 bit
  (well, actually, "they won" may convey more information than "they lost"…)

Information is measured in bits (binary digits) = the number of 0/1's required to encode choice(s).
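To make the definition concrete, here is a minimal Python sketch (my addition, not from the original slides) that evaluates log2(N/M) for the examples above:

```python
from math import log2

def info_bits(n: int, m: int) -> float:
    """Bits of information in a fact that narrows N equally
    probable choices down to M remaining choices."""
    return log2(n / m)

print(info_bits(2, 1))  # one coin flip: 1.0 bit
print(info_bits(6, 1))  # roll of a die: ~2.585 bits
print(info_bits(2, 1))  # outcome of a football game: 1.0 bit
```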


L01 - Basics of Information 17 | Comp120 – Spring 2005 | 1/13/04

Example: Sum of 2 dice (sums 2 through 12)

i2  = log2(36/1) = 5.170 bits
i3  = log2(36/2) = 4.170 bits
i4  = log2(36/3) = 3.585 bits
i5  = log2(36/4) = 3.170 bits
i6  = log2(36/5) = 2.848 bits
i7  = log2(36/6) = 2.585 bits
i8  = log2(36/5) = 2.848 bits
i9  = log2(36/4) = 3.170 bits
i10 = log2(36/3) = 3.585 bits
i11 = log2(36/2) = 4.170 bits
i12 = log2(36/1) = 5.170 bits

The average information provided by the sum of 2 dice:

i_ave = Σ(i=2..12) p_i · log2(N/M_i) = Σ(i=2..12) p_i · log2(1/p_i) = 3.274 bits

This average is called the Entropy.
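As a check on the arithmetic, this short Python sketch (mine, not the slides') recomputes the entropy of the sum of two dice:

```python
from math import log2

# Number of ways to roll each sum s with two dice: 1,2,3,4,5,6,5,4,3,2,1
ways = {s: 6 - abs(s - 7) for s in range(2, 13)}

# Entropy: sum over outcomes of p_i * log2(1/p_i), with p_i = ways/36
entropy = sum((w / 36) * log2(36 / w) for w in ways.values())
print(f"{entropy:.3f} bits")  # 3.274 bits
```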

L01 - Basics of Information 18 | Comp120 – Spring 2005 | 1/13/04

Show Me the Bits!

Can the sum of two dice REALLY be represented using 3.274 bits? If so, how?

The fact is, the average information content is a strict lower bound on how small a representation we can achieve. In practice, it is difficult to reach this bound. But we can come very close.


L01 - Basics of Information 19 | Comp120 – Spring 2005 | 1/13/04

Variable-Length Encoding

2 - 10011    3 - 0101    4 - 011     5 - 001
6 - 111      7 - 101     8 - 110     9 - 000
10 - 1000    11 - 0100   12 - 10010

[Figure: Huffman decoding tree for this code, with each leaf labeled by its sum and probability (1/36 up to 6/36) and each branch labeled 0 or 1.]

Example Stream: 1001100101011110011100101 decodes to 2 5 3 6 5 8 3
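The code above is prefix-free (no codeword is a prefix of another), so a stream can be decoded greedily, bit by bit. A small Python sketch (illustrative, not part of the original slides):

```python
# Variable-length code for the sum of two dice (from the table above).
code = {"10011": 2, "0101": 3, "011": 4, "001": 5, "111": 6,
        "101": 7, "110": 8, "000": 9, "1000": 10, "0100": 11,
        "10010": 12}

def decode(stream: str) -> list[int]:
    out, buf = [], ""
    for bit in stream:
        buf += bit
        if buf in code:       # prefix-free: the first match is the symbol
            out.append(code[buf])
            buf = ""
    return out

print(decode("1001100101011110011100101"))  # [2, 5, 3, 6, 5, 8, 3]
```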

L01 - Basics of Information 20 | Comp120 – Spring 2005 | 1/13/04

Encoding Efficiency

How does our encoding strategy compare to the information content of the roll?

b_ave = (1/36)·5 + (2/36)·4 + (3/36)·3 + (4/36)·3 + (5/36)·3 + (6/36)·3
        + (5/36)·3 + (4/36)·3 + (3/36)·4 + (2/36)·4 + (1/36)·5
      = 119/36 ≈ 3.306 bits

Pretty close. However, an efficient encoding (as defined by having an average code size close to the information content) is not always what we want!
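For completeness, here is a matching Python check (my addition, not from the slides) that recomputes the average code length from the table on the previous slide:

```python
# Bits in each codeword, and ways/36 as the probability of each sum.
lengths = {2: 5, 3: 4, 4: 3, 5: 3, 6: 3, 7: 3, 8: 3, 9: 3,
           10: 4, 11: 4, 12: 5}
ways = {s: 6 - abs(s - 7) for s in range(2, 13)}

b_ave = sum(ways[s] * lengths[s] for s in lengths) / 36
print(f"{b_ave:.3f} bits")  # 3.306 bits, vs. the 3.274-bit entropy bound
```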


L01 - Basics of Information 21 | Comp120 – Spring 2005 | 1/13/04

Encoding Considerations

- Encoding schemes that attempt to match the information content of a data stream are essentially removing redundancy. They are data compression techniques.
- However, sometimes our goal in encoding information is to increase redundancy, rather than remove it. Why?
  - To make the information easy to manipulate (fixed-sized encodings)
  - To make the data stream resilient to noise (error detecting and correcting codes)

L01 - Basics of Information 22 | Comp120 – Spring 2005 | 1/13/04

Error detection using parity

Sometimes we add extra redundancy so that we can detect errors. For instance, this encoding detects any single-bit error:

2 - 1111000    3 - 1111101    4 - 0011    5 - 0101
6 - 0110       7 - 0000       8 - 1001    9 - 1010
10 - 1100      11 - 1111110   12 - 1111011

ex 1: 001111100000101 → 4 6* 7
ex 2: 001111100000101 → 4 9* 7
ex 3: 001111100000101 → 4 10* 7
ex 4: 001111100000101 → 4 2* 5

There's something peculiar about those examples.


L01 - Basics of Information 23 | Comp120 – Spring 2005 | 1/13/04

Property 1: Parity

The sum of the bits in each symbol is even. (This is how errors are detected.)

2 - 1111000  = 1+1+1+1+0+0+0 = 4
3 - 1111101  = 1+1+1+1+1+0+1 = 6
4 - 0011     = 0+0+1+1 = 2
5 - 0101     = 0+1+0+1 = 2
6 - 0110     = 0+1+1+0 = 2
7 - 0000     = 0+0+0+0 = 0
8 - 1001     = 1+0+0+1 = 2
9 - 1010     = 1+0+1+0 = 2
10 - 1100    = 1+1+0+0 = 2
11 - 1111110 = 1+1+1+1+1+1+0 = 6
12 - 1111011 = 1+1+1+1+0+1+1 = 6

How much information is in the last bit?
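Checking a received symbol is then a one-line test; a minimal Python sketch (not from the slides):

```python
def has_even_parity(symbol: str) -> bool:
    """A valid codeword in this scheme has an even number of 1s;
    any single flipped bit makes the count odd."""
    return symbol.count("1") % 2 == 0

print(has_even_parity("1111000"))  # True  (valid encoding of 2)
print(has_even_parity("1110000"))  # False (one bit flipped -> detected)
```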

L01 - Basics of Information 24 | Comp120 – Spring 2005 | 1/13/04

Property 2: Separation

Each encoding differs from all others by at least two bits in their overlapping parts. (Below, a matching position shows the shared bit value and an x marks a position where the two encodings differ.)

             3        4     5     6     7     8     9     10    11       12
             1111101  0011  0101  0110  0000  1001  1010  1100  1111110  1111011
2  1111000   1111x0x  xx11  x1x1  x11x  xxxx  1xx1  1x1x  11xx  1111xx0  11110xx
3  1111101            xx11  x1x1  x11x  xxxx  1xx1  1x1x  11xx  11111xx  1111xx1
4  0011                     0xx1  0x1x  00xx  x0x1  x01x  xxxx  xx11     xx11
5  0101                           01xx  0x0x  xx01  xxxx  x10x  x1x1     x1x1
6  0110                                 0xx0  xxxx  xx10  x1x0  x11x     x11x
7  0000                                       x00x  x0x0  xx00  xxxx     xxxx
8  1001                                             10xx  1x0x  1xx1     1xx1
9  1010                                                   1xx0  1x1x     1x1x
10 1100                                                         11xx     11xx
11 1111110                                                               1111x1x

This difference is called the "Hamming distance".

"A Hamming distance of 1 is needed to uniquely identify an encoding."
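Hamming distance itself is easy to compute; a small sketch (mine, not the slides'):

```python
def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length bit strings differ."""
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b))

print(hamming_distance("0011", "0101"))  # 2 (the codes for 4 and 5)
```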


L01 - Basics of Information 25 | Comp120 – Spring 2005 | 1/13/04

A Short Detour

It is illuminating to consider Hamming distances geometrically. Given 2 bits, the largest Hamming distance that we can achieve between 2 encodings is 2. This allows us to detect 1-bit errors if we encode 1 bit of information using 2 bits.

With 3 bits we can find 4 encodings with a Hamming distance of 2, allowing the detection of 1-bit errors when 3 bits are used to encode 2 bits of information.

We can also identify 2 encodings with Hamming distance 3. This extra distance allows us to detect 2-bit errors. However, we could use this extra separation differently.

[Figures: the 2-bit patterns 00-11 drawn as a square and the 3-bit patterns 000-111 drawn as a cube, showing encodings separated by a Hamming distance of 2 and encodings separated by a Hamming distance of 3.]

L01 - Basics of Information 26 | Comp120 – Spring 2005 | 1/13/04

Error correcting codes

We can actually correct 1-bit errors in encodings separated by a Hamming distance of 3. This is possible because the sets of bit patterns located at a Hamming distance of 1 from our encodings are distinct.

However, attempting error correction with such a small separation is dangerous. Suppose we have a 2-bit error. Our error correction scheme will then misinterpret the encoding. Misinterpretations also occurred when we had 2-bit errors in our 1-bit-error-detection (parity) schemes.

A safe 1-bit error correction scheme would correct all 1-bit errors and detect all 2-bit errors. What Hamming distance is needed between encodings to accomplish this?

[Figure: the 3-bit cube with codewords 000 and 111; every other vertex sits at distance 1 from exactly one codeword. "This is just a voting scheme."]
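The voting scheme in the figure is the 3-bit repetition code: 0 is sent as 000, 1 as 111, and the decoder takes the majority. A minimal sketch (not from the slides):

```python
def encode(bit: int) -> str:
    return str(bit) * 3               # 0 -> "000", 1 -> "111"

def decode(word: str) -> int:
    return int(word.count("1") >= 2)  # majority vote

print(decode(encode(1)))  # 1 (no errors)
print(decode("101"))      # 1 -- corrects a single flipped bit of "111"
```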


L01 - Basics of Information 27 | Comp120 – Spring 2005 | 1/13/04

An alternate error correcting code

We can generalize the notion of parity in order to construct error correcting codes. Instead of computing a single parity bit for an entire encoding, we can allow multiple parity bits over different subsets of bits. Consider the following technique for encoding 25 bits (each row carries a parity bit on the right, the bottom row holds the column parity bits, and the corner bit is the parity of the parity bits):

0 1 1 1 1 | 0
1 1 1 0 0 | 1
0 1 0 1 0 | 0
1 0 1 0 0 | 0
0 1 0 0 0 | 1
----------+--
0 0 1 0 1 | 0

1-bit errors will cause both a row and a column parity error, uniquely identifying the errant bit for correction. An extra parity bit allows us to detect 1-bit errors in the parity bits.

Many 2-bit errors can also be corrected. However, 2-bit errors involving the same row or column are undetectable.

This approach is easy to implement, but it is not optimal in terms of the number of bits used.
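A sketch of the row/column trick in Python (my illustration; the function and variable names are not from the slides). A single flipped data bit is located at the intersection of the one bad row parity and the one bad column parity:

```python
def parities(block):
    """Row and column even-parity bits for a 2-D block of 0/1 data."""
    rows = [sum(r) % 2 for r in block]
    cols = [sum(c) % 2 for c in zip(*block)]
    return rows, cols

def locate_error(received, row_p, col_p):
    """Return (row, col) of a single-bit error, or None if none is found."""
    bad_r = [i for i, r in enumerate(received) if sum(r) % 2 != row_p[i]]
    bad_c = [j for j, c in enumerate(zip(*received)) if sum(c) % 2 != col_p[j]]
    if len(bad_r) == 1 and len(bad_c) == 1:
        return bad_r[0], bad_c[0]
    return None

data = [[0,1,1,1,1], [1,1,1,0,0], [0,1,0,1,0], [1,0,1,0,0], [0,1,0,0,0]]
row_p, col_p = parities(data)        # the parity bits shown in the grid

data[2][3] ^= 1                      # flip one data bit
print(locate_error(data, row_p, col_p))  # (2, 3) -- the errant bit
```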

L01 - Basics of Information 28 | Comp120 – Spring 2005 | 1/13/04

Summary

Information resolves uncertainty.
• Choices equally probable:
  • N choices narrowed down to M → log2(N/M) bits of information
• Choices not equally probable:
  • choice_i with probability p_i → log2(1/p_i) bits of information
  • average number of bits = Σ p_i log2(1/p_i)
  • use variable-length encodings

Next time:
• Encoding information with bits
• Representing letters & numbers