
Informatics and computing

Information and uncertainty

Informatics and computing

Last class: manipulating symbols

• Typology of signs
• Sign systems
• Symbols

Tremendously important distinctions for informatics and the computational sciences:
• Computation = symbol manipulation
• Symbols can be manipulated without reference to content (syntactically), due to the arbitrary nature of convention
• This is what allows computers to operate!
• All signs rely on a certain amount of convention, as all signs have a pragmatic (social) dimension, but symbols are the only signs which require exclusively a social convention, or code, to be understood.

Informatics and computing

Symbol manipulation

Some symbol strings have meaning (in some language). The relation between symbols and meaning is arbitrary. Example: the cut-up method for generating poetry, pioneered by Brion Gysin and William Burroughs and often used by artists such as David Bowie, or the use of samples in electronic music.

Permutations of “aedl”:

adel adle aedl aeld alde aled
dael dale deal dela dlae dlea
eadl eald edal edla elad elda
lade laed ldae ldea lead leda

4! permutations: 4 × 3 × 2 × 1 = 24
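This purely syntactic shuffling is easy to mechanize. A minimal Python sketch, using only the standard library (the variable names are mine, not from the slides):

from itertools import permutations
from math import factorial

letters = "aedl"
# Rearrange the symbols without any reference to their meaning.
perms = sorted("".join(p) for p in permutations(letters))
print(perms)       # ['adel', 'adle', ..., 'lead', 'leda']
print(len(perms))  # 24
assert len(perms) == factorial(len(letters))  # 4! = 4 x 3 x 2 x 1

Only a few of the 24 strings (“deal”, “dale”, “lade”, “lead”) happen to be English words; the manipulation itself never consults meaning.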

Informatics and computing

Information theory

“A Mathematical Theory of Communication”, Claude Shannon (1948)

Efficiency of information transmission in electronic channels

Key concept: information is a quantity that can be measured unequivocally (objectively)

Does not deal at all with the subjective aspects of information: semantics and pragmatics. Information is defined as a quantity that depends on symbol manipulation alone.

Informatics and computing

What’s an information quantity? How to quantify a relation?

Information is a relation between an agent, a sign and a thing, rather than simply a thing.

The most palpable element in the information relation is the sign: symbols.

But which symbols do we use to quantify the information contained in messages?
• Several symbol systems can be used to convey the same message
• We must agree on the same symbol system for all messages!

Informatics and computing

What’s an information quantity?

Both sender and receiver must use the same code, or convention, to encode and decode symbols from and to messages.
• We need to fix the language used for communication:
• Set of symbols allowed (an alphabet)
• The rules to manipulate symbols (syntax)
• The meaning of the symbols (semantics)

A language specifies the universe of all possible messages = Set of all possible symbol strings of a given size.

Shannon Information is thus defined as “a measure of the freedom from choice with which a message is selected from the set of all possible messages”

DEAL DELA DLAE DLEA DAEL DALE
EALD EADL ELAD ELDA EDLA EDAL
ALDE ALED ADLE ADEL AELD AEDL
LDEA LDAE LEAD LEDA LADE LAED

DEAL is 1 out of 4! = 4 × 3 × 2 × 1 = 24 choices.

Informatics and computing

What’s an information quantity?

Information is defined as “a measure of the freedom from choice with which a message is selected from the set of all possible messages”.

A bit (short for binary digit) is the most elementary choice one can make between two items: “0” and “1”, “heads” or “tails”, “true” or “false”, etc.

A bit is equivalent to a choice between two equally likely alternatives.

Example: if we know that a coin is to be tossed, but are unable to see it as it falls, a message telling us whether the coin came up heads or tails gives us one bit of information.
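A quick numerical check of that one-bit claim, sketched in Python (standard library only):

from math import log2

# Fair coin: two equally likely alternatives.
p_heads, p_tails = 0.5, 0.5
bits = -(p_heads * log2(p_heads) + p_tails * log2(p_tails))
print(bits)  # 1.0: learning the outcome removes exactly one bit of uncertainty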

Informatics and computing

Decision-making

• Perhaps the most fundamental capability of human beings
• Decision always implies uncertainty
• Choice
• Lack of information, randomness, noise, error

“The highest manifestation of life consists in this: that a being governs its own actions. A thing which is always subject to the direction of another is somewhat of a dead thing.” “A man has free choice to the extent that he is rational.” (St. Thomas Aquinas)

“In a predestinate world, decision would be illusory; in a world of perfect foreknowledge, empty; in a world without natural order, powerless. Our intuitive attitude to life implies non-illusory, non-empty, non-powerless decision… Since decision in this sense excludes both perfect foresight and anarchy in nature, it must be defined as choice in face of bounded uncertainty” (George Shackle)

Informatics and computing

Uncertainty-based information: original contributions

Information is transmitted through noisy communication channels: Ralph Hartley and Claude Shannon (at Bell Labs), the fathers of Information Theory, worked on the problem of efficiently transmitting information; i.e. decreasing the uncertainty in the transmission of information.

Hartley, R. V. L. (1928). “Transmission of Information”. Bell System Technical Journal, July 1928, p. 535.

Shannon, C. E. (1948). “A mathematical theory of communication”. Bell System Technical Journal, vol. 27, pp. 379-423 and 623-656, July and October 1948.

Informatics and computing

Choices: the multiplication principle

• “If some choice can be made in M different ways, and some subsequent choice can be made in N different ways, then there are M x N different ways these choices can be made in succession” [Paulos]
• 3 shirts and 4 pants = 3 × 4 = 12 outfit choices
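A small sketch of the multiplication principle with Python's itertools.product (the item names are hypothetical, for illustration only):

from itertools import product

shirts = ["shirt-1", "shirt-2", "shirt-3"]
pants = ["pants-1", "pants-2", "pants-3", "pants-4"]

# Every (shirt, pants) pair is one possible outfit.
outfits = list(product(shirts, pants))
print(len(outfits))  # 3 * 4 = 12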

Informatics and computing

Hartley uncertainty

• Nonspecificity: the Hartley measure
• The amount of uncertainty associated with a set of alternatives (e.g. messages) is measured by the amount of information needed to remove the uncertainty
• A type of ambiguity

Quantifies how many yes-no questions need to be asked to establish what the correct alternative is.

H(A) = log2(|A|)

|A| = number of choices; measured in bits. A = set of alternatives.

[Figure: a set A of alternatives x1, x2, x3, …, xn, and a set B]

Informatics and computing

Hartley uncertainty

H(A) = log2(|A|)

|A| = number of choices; measured in bits. Quantifies how many yes-no questions need to be asked to establish what the correct alternative is.

Menu choices: A = 16 entrees, B = 4 desserts.

How many dinner combinations? |A × B| = 16 × 4 = 64

H(A × B) = log2(16 × 4) = log2(16) + log2(4) = 4 + 2 = 6 bits
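A sketch of the Hartley measure and its additivity over independent choices, in Python (standard library; the function name is my own):

from math import log2

def hartley(n_alternatives):
    """Hartley uncertainty, in bits, of n equally available alternatives."""
    return log2(n_alternatives)

entrees, desserts = 16, 4
print(hartley(entrees))             # 4.0 bits
print(hartley(desserts))            # 2.0 bits
print(hartley(entrees * desserts))  # 6.0 bits

Because log2(M × N) = log2(M) + log2(N), the uncertainty of a combined choice is the sum of the uncertainties of its parts: the multiplication principle becomes addition in bits.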

Informatics and computing

Hartley uncertainty: decision trees

H(A) = log2(|A|)

|A| = number of choices; measured in bits.

[Figure: decision trees of successive yes-no choices]
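The decision-tree reading can be sketched in Python: each yes-no question at best halves the remaining alternatives, so pinning down one alternative takes about log2(|A|) questions, rounded up when |A| is not a power of 2:

from math import ceil, log2

def questions_needed(n_alternatives):
    # Yes-no questions needed when each question halves the candidates.
    return ceil(log2(n_alternatives))

for n in (2, 4, 16, 24):
    print(n, questions_needed(n))  # 2 -> 1, 4 -> 2, 16 -> 4, 24 -> 5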

Informatics and computing

What about probability?

Some alternatives may be more probable than others!

A different type of ambiguity. Higher-frequency alternatives require less information.

Measured by Shannon’s entropy: the amount of uncertainty associated with a set of alternatives (e.g. messages) is measured by the average amount of information needed to remove the uncertainty.

Probability distribution of letters in English text (Orwell’s 1984, in fact). [Figure: letter frequency distribution]

Informatics and computing

Shannon’s entropy

H(A) = -Σi p(xi) · log2 p(xi)

p(xi) = probability of alternative xi; measured in bits.

A = set of weighted alternatives {x1, x2, x3, …, xn}

Shannon’s measure: the average amount of uncertainty associated with a set of weighted alternatives (e.g. messages) is measured by the average amount of information needed to remove the uncertainty.
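A sketch of Shannon's measure as a function, in Python (standard library; the function name is mine):

from math import log2

def shannon_entropy(probs):
    """Average uncertainty, in bits, of a set of weighted alternatives."""
    return -sum(p * log2(p) for p in probs if p > 0)  # p = 0 terms contribute nothing

print(shannon_entropy([0.25] * 4))  # 2.0 bits: the uniform case reduces to Hartley's log2(4)
print(shannon_entropy([0.9, 0.1]))  # 0.469 bits: a skewed distribution carries less uncertainty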

Informatics and computing

Entropy of a message

A message is encoded in an alphabet of n symbols, for example:
• English = 26 characters + space
• Morse code = dots, dashes and spaces
• DNA: A, T, G, C

Informatics and computing

What it measures:
• missing information: how much information is needed to establish what the symbol is, or
• uncertainty about what the symbol is, or
• on average, how many yes-no questions need to be asked to establish what the symbol is.

[Figure: entropy is zero for a single certain alternative and maximal for a uniform distribution]

Informatics and computing

Example: Morse code

1) All dots: p1 = 1, p2 = p3 = 0. Take any symbol: it’s a dot; no uncertainty, no question needed, no missing information. HS = -1 · log2(1) = 0.

2) 50-50 dots and dashes: p1 = p2 = 1/2, p3 = 0. Given the probabilities, we need to ask one question: one piece of missing information. HS = -(1/2 · log2(1/2) + 1/2 · log2(1/2)) = -log2(1/2) = -(log2(1) - log2(2)) = log2(2) = 1 bit.

3) Uniform: all symbols equally likely, p1 = p2 = p3 = 1/3. Given the probabilities, we need to ask as many as 2 questions: 2 pieces of missing information. HS = -log2(1/3) = -(log2(1) - log2(3)) = log2(3) = 1.59 bits.
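These three cases can be checked numerically with the same formula (a self-contained Python sketch):

from math import log2

def shannon_entropy(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

print(shannon_entropy([1.0, 0.0, 0.0]))  # 0.0 bits: no missing information
print(shannon_entropy([0.5, 0.5, 0.0]))  # 1.0 bit: one yes-no question
print(shannon_entropy([1/3, 1/3, 1/3]))  # 1.585 bits: log2(3), the 1.59 above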

Informatics and computing

Bits, entropy and Huffman codes

Given a symbol set {A, B, C, D, E} and occurrence probabilities PA, PB, PC, PD, PE, the Shannon entropy corresponds to the average minimum number of bits needed to represent a symbol.

Huffman coding: a variable-length code for messages whose symbols have variable frequencies, which minimizes the number of bits per symbol.

Coding:

H = -(0.250 · log2(0.250) + 0.375 · log2(0.375) + 0.167 · log2(0.167) + 0.125 · log2(0.125) + 0.083 · log2(0.083)) = 2.135 bits

Huffman code, bits per symbol:

0.375 · 1 + 0.250 · 2 + 0.167 · 3 + 0.125 · 4 + 0.083 · 4 = 2.208
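A minimal Huffman-coding sketch built on Python's standard-library heapq (an illustrative implementation, not the code behind the slide; ties between equal weights can yield different but equally optimal code lengths):

import heapq

def huffman_code(freqs):
    # Build a prefix code in which frequent symbols get shorter bit strings.
    # Heap entries: (weight, tie-breaker, {symbol: code-so-far}).
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)  # the two least frequent subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

freqs = {"A": 0.250, "B": 0.375, "C": 0.167, "D": 0.125, "E": 0.083}
code = huffman_code(freqs)
avg_bits = sum(freqs[s] * len(code[s]) for s in freqs)
print(code)
print(round(avg_bits, 3))  # 2.208 bits per symbol, just above the entropy of 2.135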

Informatics and computing

Critique of Shannon’s communication theory

• The entropy formula as a measure of information is arbitrary
• Shannon’s theory measures quantities of information, but it does not consider information content
• In Shannon’s theory, the semantic aspects of information are irrelevant to the engineering problem

Informatics and computing

Other forms of uncertainty

• Vagueness or fuzziness
• Simultaneously being “True” and “False”
• Fuzzy Logic and Fuzzy Set Theory

Informatics and computing

From crisp to fuzzy sets

• Fuzziness: being and not being
• The Laws of Contradiction and Excluded Middle are broken

[Figure: the set of all people with a crisp subset “Tall People”, whose characteristic function is either 0 or 1, next to the same set with a fuzzy subset “Tall People”, whose membership function takes any degree between 0 and 1]
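A sketch of the contrast in Python (the height thresholds are hypothetical, chosen only for illustration):

def crisp_tall(height_cm):
    # Classical set: each person either is or is not tall (membership 0 or 1).
    return 1 if height_cm >= 180 else 0

def fuzzy_tall(height_cm):
    # Fuzzy set: membership is a degree in [0, 1], rising from 160 cm to 200 cm.
    return min(1.0, max(0.0, (height_cm - 160) / 40))

for h in (150, 170, 180, 190, 210):
    print(h, crisp_tall(h), round(fuzzy_tall(h), 2))

A 170 cm person is crisply “not tall” (0) yet tall to degree 0.25: partly a member and partly not, which is exactly how the laws of contradiction and excluded middle are broken.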

Informatics and computing

Papers:

1) boyd, danah and Crawford, Kate (2011). “Six Provocations for Big Data”. A Decade in Internet Time: Symposium on the Dynamics of the Internet and Society, September 2011.

2) Suzuki, Ryuji, Buck, John R. and Tyack, Peter L. (2006). “Information entropy of humpback whale songs”. J. Acoust. Soc. Am., 119(3), March.

3) Huffman, David A. (1952). “A Method for the Construction of Minimum-Redundancy Codes”. Proceedings of the I.R.E., September.

This week’s discussion