Floating Point

Upload: jordan-elliott

Post on 12-Jan-2016


Page 1: Floating Point. Agenda  History  Basic Terms  General representation of floating point  Constructing a simple floating point representation  Floating

Floating Point


Agenda

- History
- Basic Terms
- General representation of floating point
- Constructing a simple floating point representation
- Floating Point Arithmetic
- The IEEE-754 Floating-Point Standard
- Range, Precision, and Accuracy


History

- The first floating point representation was used in Konrad Zuse's “V1” machine (later renamed the Z1, completed in 1938). It had a 7-bit exponent, a 16-bit mantissa, and a sign bit.
- In 1954, IBM adopted floating point representation in its computing systems.
- In 1962, the UNIVAC 1100/2200 series was introduced. It supported both single precision and double precision.


Basic Terms

- Scientific notation: A notation that renders numbers with a single digit to the left of the decimal point.
- Normalized: A number in floating-point notation that has no leading 0s.
- Floating point: Computer arithmetic that represents numbers in which the binary point is not fixed.
- Fraction: The value, between 0 and 1, placed in the fraction field of the floating point number.
- Exponent: In the numerical representation system of floating-point arithmetic, the value that is placed in the exponent field.


General representation of floating point


Constructing a simple floating point representation

We will use a 14-bit model: 1 sign bit, a 5-bit exponent, and an 8-bit significand.

For example, consider storing the decimal number 17 in this model. In decimal we can write 17 = 0.17 x 10^2, but in order to construct a floating point representation we have to convert it into binary.


17 (decimal) = 10001 (binary)

10001 = 0.10001 x 2^5

We can now construct its representation:

Sign    Exponent   Significand
0       00101      10001000
1 bit   5 bits     8 bits

Sign field:

0 : positive value
1 : negative value
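
The construction above can be sketched in Python. This is only an illustration of the 14-bit model, not a standard format; the function name `encode_unbiased` is hypothetical, and the sketch assumes a magnitude of at least 0.5 that fits in the 5-bit exponent, since this version of the model has no negative exponents.

```python
def encode_unbiased(value):
    """Return (sign, exponent, significand) bit strings for the 14-bit model."""
    sign = "1" if value < 0 else "0"
    magnitude = abs(value)
    exponent = 0
    # Halve until the magnitude has the form 0.1xxx (binary), i.e. is below 1.
    while magnitude >= 1:
        magnitude /= 2
        exponent += 1
    # Keep the first 8 fraction bits, truncating anything beyond them.
    significand = int(magnitude * 2**8)
    return sign, format(exponent, "05b"), format(significand, "08b")


print(encode_unbiased(17))  # ('0', '00101', '10001000')
```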


What if we want to store a negative exponent value?

The previous example can’t handle negative exponents. We can fix that by using a biased exponent.

For example, if we want to store 0.25 (decimal) = 0.01 (binary), we have 0.1 x 2^-1.

With excess-16 representation, we add 16 to the exponent before storing it, so the exponent -1 is stored as -1 + 16 = 15 (01111):

0 01111 10000000
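
A minimal sketch of the excess-16 encoding, assuming the same 14-bit model. The name `encode_excess16` is hypothetical, and the sketch does no range checking on the exponent. It relies on `math.frexp`, which happens to return a mantissa in the same 0.1xxx (binary) form used here.

```python
import math

BIAS = 16  # excess-16


def encode_excess16(value):
    """Return (sign, biased exponent, significand) bit strings."""
    sign = "1" if value < 0 else "0"
    # math.frexp returns (m, e) with abs(value) == m * 2**e and 0.5 <= m < 1.
    mantissa, exponent = math.frexp(abs(value))
    significand = int(mantissa * 2**8)  # keep 8 fraction bits, truncating
    return sign, format(exponent + BIAS, "05b"), format(significand, "08b")


print(encode_excess16(0.25))  # ('0', '01111', '10000000')
print(encode_excess16(17))    # ('0', '10101', '10001000')
```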


Another problem using this method

We don’t have a unique representation for each number. For example, all of the following bit patterns represent 17:

0 11000 00010001
0 10111 00100010
0 10110 01000100
0 10101 10001000
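
The four bit patterns listed for 17 can be checked by decoding them. The sketch below assumes the excess-16, 8-bit significand model from the earlier slides; `decode_excess16` is a hypothetical helper name.

```python
def decode_excess16(bits):
    """Decode 'S EEEEE MMMMMMMM' (excess-16 exponent, 0.MMMMMMMM significand)."""
    sign, exponent, significand = bits.split()
    fraction = int(significand, 2) / 2**8          # value of 0.MMMMMMMM
    value = fraction * 2 ** (int(exponent, 2) - 16)
    return -value if sign == "1" else value


patterns = [
    "0 11000 00010001",
    "0 10111 00100010",
    "0 10110 01000100",
    "0 10101 10001000",
]
for pattern in patterns:
    print(pattern, "->", decode_excess16(pattern))  # each line ends with 17.0
```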


Remedy

This problem can be fixed by normalization.

Normalization is a convention that the leftmost bit of the significand must always be 1. With this convention we have only one representation for the decimal value 17:

0 10101 10001000
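
Normalization can be sketched as a shift loop: move the significand left until its leading bit is 1, and decrease the exponent by one for each shift so the value stays the same. The function name `normalize` is hypothetical, and the sketch does not guard against the exponent dropping below zero.

```python
def normalize(bits):
    """Normalize 'S EEEEE MMMMMMMM' so the leftmost significand bit is 1."""
    sign, exponent, significand = bits.split()
    exp = int(exponent, 2)
    sig = int(significand, 2)
    # Shift left until bit 7 is set; a zero significand is left unchanged.
    while sig != 0 and sig < 0b10000000:
        sig <<= 1
        exp -= 1
    return f"{sign} {exp:05b} {sig:08b}"


print(normalize("0 11000 00010001"))  # 0 10101 10001000
```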


Floating Point Arithmetic

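Addition

To add, align the binary points of the two operands, add the significands, and renormalize the result:

  11.001000
+  0.10011010
-------------
  11.10111010

The sum 11.10111010 = 0.1110111010 x 2^2 needs ten significand bits, so the two lowest bits are dropped to fit the 8-bit significand:

  0 10010 11001000
+ 0 10000 10011010
------------------
  0 10010 11101110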
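
The addition can be reproduced with integer arithmetic. This is a sketch under stated assumptions: both operands are positive and normalized, the model is the 14-bit excess-16 format from the earlier slides, and extra bits are truncated rather than rounded. The names `unpack` and `add_positive` are hypothetical.

```python
BIAS = 16  # excess-16 exponent


def unpack(bits):
    """Split 'S EEEEE MMMMMMMM' into (true exponent, significand as an integer)."""
    _sign, exponent, significand = bits.split()
    return int(exponent, 2) - BIAS, int(significand, 2)


def add_positive(a, b):
    """Add two positive numbers in the 14-bit model, truncating to 8 bits."""
    ea, sa = unpack(a)
    eb, sb = unpack(b)
    # Align both significands to the smaller exponent so they share a binary point.
    e = min(ea, eb)
    total = (sa << (ea - e)) + (sb << (eb - e))
    # Renormalize: shift right (dropping low bits) until the sum fits in 8 bits.
    while total >= 1 << 8:
        total >>= 1
        e += 1
    return f"0 {e + BIAS:05b} {total:08b}"


print(add_positive("0 10010 11001000", "0 10000 10011010"))  # 0 10010 11101110
```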


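Multiplication

To multiply, multiply the significands and add the exponents, then renormalize the result. Here 0.11001000 x 0.10011010 = 0.0111100001010000, so the product 0.0111100001010000 x 2^2 becomes 0.11110000 x 2^1 after normalizing and truncating to 8 bits:

  0 10010 11001000
x 0 10000 10011010
------------------
  0 10001 11110000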
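
The result 0 10001 11110000 is what multiplying the two operands gives in this model, which can be checked with a short sketch. It assumes the 14-bit excess-16 format, nonzero normalized operands, and truncation of the low product bits; `multiply` is a hypothetical helper name.

```python
BIAS = 16  # excess-16 exponent


def multiply(a, b):
    """Multiply two numbers given as 'S EEEEE MMMMMMMM', truncating to 8 bits."""
    sign_a, exp_a, sig_a = a.split()
    sign_b, exp_b, sig_b = b.split()
    sign = int(sign_a) ^ int(sign_b)                  # signs combine by XOR
    exponent = (int(exp_a, 2) - BIAS) + (int(exp_b, 2) - BIAS)
    product = int(sig_a, 2) * int(sig_b, 2)           # 16-bit significand product
    # Normalize so the top bit of the 16-bit product is 1.
    while product != 0 and product < 1 << 15:
        product <<= 1
        exponent -= 1
    significand = product >> 8                        # keep the top 8 bits
    return f"{sign} {exponent + BIAS:05b} {significand:08b}"


print(multiply("0 10010 11001000", "0 10000 10011010"))  # 0 10001 11110000
```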


Some other problems in floating point arithmetic

- Division by zero.
- Overflow, if the result is larger in magnitude than the largest value the format can store.
- Underflow, if the result is nonzero but smaller in magnitude than the smallest value the format can store.
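
These three conditions can be observed directly in Python, assuming its `float` is an IEEE-754 double, which is the case on common platforms. Note that Python chooses to raise an exception for float division by zero; other languages may return an infinity instead.

```python
import math

overflowed = 1e308 * 10      # too large in magnitude for a double
print(overflowed)            # inf

underflowed = 5e-324 / 2     # half of the smallest positive double
print(underflowed)           # 0.0

try:
    1.0 / 0.0
    divided = True
except ZeroDivisionError:
    divided = False
print(divided)               # False
```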


The IEEE-754 Floating-Point Standard

The standard was first published in 1985. It includes two basic formats: single precision and double precision.


The standard defines:

- arithmetic formats: sets of binary and decimal floating-point data, which consist of finite numbers (including negative zero and subnormal numbers), infinities, and special 'not a number' values
- interchange formats: encodings (bit strings) that may be used to exchange floating-point data in an efficient and compact form
- rounding algorithms: methods to be used for rounding numbers during arithmetic and conversions
- operations: arithmetic and other operations on arithmetic formats
- exception handling: indications of exceptional conditions (such as division by zero, overflow, underflow, etc.)


Single Precision IEEE-754

This representation uses an excess-127 exponent.

This representation assumes an implied 1 to the left of the radix point. For example, 1 = 1.0 x 2^0 is stored with exponent field 0 + 127 = 127 and an all-zero fraction.

Sign    Exponent   Fraction
1 bit   8 bits     23 bits

Floating Point Number   Single Precision Representation
1.0                     0 01111111 00000000000000000000000
0.5                     0 01111110 00000000000000000000000
19.5                    0 10000011 00111000000000000000000
-3.75                   1 10000000 11100000000000000000000
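
The single precision bit patterns can be inspected with Python's standard `struct` module. The helper name `float32_bits` is hypothetical; the sketch packs the value as a big-endian 32-bit float and splits the bits into sign, exponent, and fraction fields.

```python
import struct


def float32_bits(x):
    """Return the IEEE-754 single precision fields of x as 'S EEEEEEEE F...F'."""
    (raw,) = struct.unpack(">I", struct.pack(">f", x))  # reinterpret as a 32-bit integer
    bits = format(raw, "032b")
    return f"{bits[0]} {bits[1:9]} {bits[9:]}"


for number in (1.0, 0.5, 19.5, -3.75):
    print(number, float32_bits(number))
```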


Double Precision IEEE-754

This representation uses an excess-1023 exponent.

This representation assumes an implied 1 to the left of the radix point, the same as single precision. For example, 1 = 1.0 x 2^0 is stored with exponent field 0 + 1023 = 1023.

Sign    Exponent   Fraction
1 bit   11 bits    52 bits
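
The double precision layout can be inspected the same way with the standard `struct` module. The helper name `float64_bits` is hypothetical.

```python
import struct


def float64_bits(x):
    """Return the IEEE-754 double precision fields of x as 'S E(11 bits) F(52 bits)'."""
    (raw,) = struct.unpack(">Q", struct.pack(">d", x))  # reinterpret as a 64-bit integer
    bits = format(raw, "064b")
    return f"{bits[0]} {bits[1:12]} {bits[12:]}"


print(float64_bits(1.0))    # exponent field is 01111111111 (1023)
print(float64_bits(-3.75))  # sign 1, exponent field 10000000000 (1024)
```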


Range, Precision, and Accuracy

Range

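In double precision, for example, the number line divides into the following regions (bounds are approximate):

Negative overflow:             below -1.0 x 10^308
Expressible negative numbers:  -1.0 x 10^308 to -1.0 x 10^-308
Negative underflow:            -1.0 x 10^-308 to 0
Positive underflow:            0 to 1.0 x 10^-308
Expressible positive numbers:  1.0 x 10^-308 to 1.0 x 10^308
Positive overflow:             above 1.0 x 10^308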
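
The exact double precision limits behind these rounded bounds are exposed by Python's `sys.float_info`, assuming the platform `float` is an IEEE-754 double. Here `min` is the smallest positive normalized value; subnormal numbers extend a little below it.

```python
import sys

largest = sys.float_info.max    # largest finite double
smallest = sys.float_info.min   # smallest positive normalized double

print(largest)   # 1.7976931348623157e+308
print(smallest)  # 2.2250738585072014e-308
```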


Accuracy: how close a number is to its true value. For example, we can’t represent 0.1 exactly in binary floating point, but we can still find a representable number that is relatively close to 0.1.

Precision: how much information we have about a value and the amount of information used to represent the value. For example, 1.666 has 4 decimal digits of precision and 1.6660 has 5 decimal digits of precision. The second number is more precise, but it is not more accurate than the first one.
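
The 0.1 example can be made concrete with Python's `decimal` module, which can display the exact value of the double nearest to 0.1. This is a sketch assuming Python's `float` is an IEEE-754 double.

```python
from decimal import Decimal

stored = Decimal(0.1)  # exact value of the double closest to 0.1
print(stored)          # 0.1000000000000000055511151231257827021181583404541015625

# The stored value is close to 0.1 (accurate to about 17 digits) but not equal to it,
# so small errors can surface in comparisons.
print(0.1 + 0.2 == 0.3)  # False
```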


Thank You

