3
Why Don’t Computers Use Base 10?
Base 10 Number Representation
– That’s why fingers are known as “digits”
– Natural representation for financial transactions
   Binary floating-point numbers cannot exactly represent $1.20
– Even carries through in scientific notation: 1.5213 × 10^4
Implementing Electronically
– Hard to store
   ENIAC (first electronic computer) used 10 vacuum tubes per digit
– Hard to transmit
   Need high precision to encode 10 signal levels on a single wire
– Messy to implement digital logic functions
   Addition, multiplication, etc.
4
How do we represent data in a computer?
At the lowest level, a computer is an electronic machine.
– works by controlling the flow of electrons
Easy to recognize two conditions:
1. presence of a voltage – we’ll call this state “1”
2. absence of a voltage – we’ll call this state “0”
Could base state on value of voltage, but control and detection circuits are more complex.
– compare turning on a light switch to measuring or regulating voltage
5
Computer is a binary digital system.
Basic unit of information is the binary digit, or bit.
Values with more than two states require multiple bits.
– A collection of two bits has four possible states: 00, 01, 10, 11
– A collection of three bits has eight possible states: 000, 001, 010, 011, 100, 101, 110, 111
– A collection of n bits has 2^n possible states.
Binary (base two) system:
• has two states: 0 and 1
Digital system:
• finite number of symbols
6
What kinds of data do we need to represent?
– Numbers – signed, unsigned, integers, floating point, complex, rational, irrational, …
– Text – characters, strings, …
– Images – pixels, colors, shapes, …
– Sound
– Logical – true, false
– Instructions
– …
Data type:
– representation and operations within the computer
7
Machine Words
Machine Has “Word Size”
– Nominal size of integer-valued data
   Including addresses
– Most current machines are 32 bits (4 bytes)
   Limits addresses to 4 GB
   Becoming too small for memory-intensive applications
– High-end systems are 64 bits (8 bytes)
   Potentially address about 1.8 × 10^19 bytes
– Machines support multiple data formats
   Fractions or multiples of word size
   Always an integral number of bytes
8
Word-Oriented Memory Organization
Addresses Specify Byte Locations
– Address of first byte in word
– Addresses of successive words differ by 4 (32-bit) or 8 (64-bit)

Byte addr.    32-bit word addr.   64-bit word addr.
0000–0003     0000                0000
0004–0007     0004                0000
0008–0011     0008                0008
0012–0015     0012                0008
9
Data Representations
Sizes of C Objects (in Bytes)

C Data Type    Sparc/Unix   Typical 32-bit   Intel IA32
int            4            4                4
long int       8            4                4
char           1            1                1
short          2            2                2
float          4            4                4
double         8            8                8
long double    8            8                10/12
char *         8            4                4
– Or any other “pointer”
10
Pointers and Arrays
Pointer
– Address of a variable in memory
– Allows us to indirectly access variables
   in other words, we can talk about its address rather than its value
Array
– A list of values arranged sequentially in memory
– Example: a list of telephone numbers
– Expression a[4] refers to the 5th element of the array a
11
Address vs. Value
Sometimes we want to deal with the address of a memory location, rather than the value it contains.
Adding a column of numbers.
– R2 contains address of first location.
– Read value, add to sum, and increment R2 until all numbers have been processed.
R2 is a pointer -- it contains the address of data we’re interested in.

address   value
x3100     x3107
x3101     x2819
x3102     x0110
x3103     x0310
x3104     x0100
x3105     x1110
x3106     x11B1
x3107     x0019

R2 = x3100
12
Byte Ordering
How should bytes within a multi-byte word be ordered in memory?
Conventions
– Suns, Macs are “Big Endian” machines
   Least significant byte has highest address
– Alphas, PCs are “Little Endian” machines
   Least significant byte has lowest address
13
Byte Ordering Example
Big Endian
– Least significant byte has highest address
Little Endian
– Least significant byte has lowest address
Example
– Variable x has 4-byte representation 0x01234567
– Address given by &x is 0x100

Address         0x100  0x101  0x102  0x103
Big Endian      01     23     45     67
Little Endian   67     45     23     01
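The byte order of the machine at hand can be observed directly from C. The following is a minimal sketch; the helper names bytes_in_memory and is_little_endian are made up for illustration, and the code assumes the machine is either purely big endian or purely little endian.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the four bytes of x into out[] in the order
   this machine stores them, lowest address first. */
void bytes_in_memory(uint32_t x, unsigned char out[4])
{
    assert(out != NULL);
    memcpy(out, &x, sizeof x);
}

/* Returns 1 if the least significant byte sits at the lowest address. */
int is_little_endian(void)
{
    unsigned char b[4];
    bytes_in_memory(0x01234567u, b);
    return b[0] == 0x67;
}
```

On a little-endian PC, bytes_in_memory(0x01234567u, b) should fill b with 67 45 23 01, matching the Little Endian row of the example.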
14
Machine-Level Code Representation
Encode Program as Sequence of Instructions
– Each simple operation
   Arithmetic operation
   Read or write memory
   Conditional branch
– Instructions encoded as bytes
   Alphas, Suns, Macs use 4-byte instructions
   – Reduced Instruction Set Computer (RISC)
   PCs use variable-length instructions
   – Complex Instruction Set Computer (CISC)
– Different instruction types and encodings for different machines
   Most code not binary compatible
Programs are Byte Sequences Too!
15Chapter 1
Big Idea: Information is Bits + Context
Computer stores the bits
You decide how to interpret them
Example (binary)
00000000 00000000 00000000 00001101
Decimal = 13
Float = 1.82169E-44
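The same 32 bits can be read back under a different context in C. This sketch assumes float is a 32-bit IEEE 754 single-precision value (true on common machines, but not guaranteed by C); bits_as_float is a hypothetical helper name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float without changing any bits. */
float bits_as_float(uint32_t bits)
{
    float f;
    assert(sizeof f == sizeof bits);   /* assumption: 32-bit float */
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Passing the pattern for the integer 13 should give a tiny denormalized value near 1.82E-44, as on the slide.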
17
Unsigned Integers
Non-positional notation
– could represent a number (“5”) with a string of ones (“11111”)
– problems?
Weighted positional notation
– like decimal numbers: “329”
– “3” is worth 300, because of its position, while “9” is only worth 9

   3     2     9              1    0    1
  10^2  10^1  10^0           2^2  2^1  2^0
  (leftmost digit is most significant, rightmost is least significant)

3x100 + 2x10 + 9x1 = 329      1x4 + 0x2 + 1x1 = 5
18
Unsigned Integers (cont.)
An n-bit unsigned integer represents 2^n values: from 0 to 2^n – 1.
2^2 2^1 2^0 value
0 0 0 0
0 0 1 1
0 1 0 2
0 1 1 3
1 0 0 4
1 0 1 5
1 1 0 6
1 1 1 7
19
Unsigned Binary Arithmetic
Base-2 addition – just like base-10!
– add from right to left, propagating carry

   10010      10010       1111
 +  1001    +  1011    +     1
   11011      11101      10000

   10111
 +   111
20
Signed Integers
With n bits, we have 2^n distinct values.
– assign about half to positive integers (1 through 2^(n-1) – 1) and about half to negative (–(2^(n-1) – 1) through –1)
– that leaves two values: one for 0, and one extra
Positive integers
– just like unsigned – zero in most significant bit
   00101 = 5
Negative integers
– sign-magnitude – set top bit to show negative, other bits are the same as unsigned
   10101 = -5
– one’s complement – flip every bit to represent negative
   11010 = -5
– in either case, MS bit indicates sign: 0=positive, 1=negative
21
Two’s Complement
Problems with sign-magnitude and 1’s complement
– two representations of zero (+0 and –0)
– arithmetic circuits are complex
   How to add two sign-magnitude numbers?
   – e.g., try 2 + (-3)
   How to add two one’s complement numbers?
   – e.g., try 4 + (-3)
Two’s complement representation developed to make circuits easy for arithmetic.
– for each positive number (X), assign value to its negative (-X), such that X + (-X) = 0 with “normal” addition, ignoring carry out

   00101  (5)        01001  (9)
 + 11011  (-5)     + _____  (-9)
   00000  (0)        00000  (0)
22
Two’s Complement Representation
If number is positive or zero,
– normal binary representation, zeroes in upper bit(s)
If number is negative,
– start with positive number
– flip every bit (i.e., take the one’s complement)
– then add one

   00101  (5)             01001  (9)
   11010  (1’s comp)      _____  (1’s comp)
 +     1                +     1
   11011  (-5)            _____  (-9)
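The flip-and-add-one rule can be sketched in C on an n-bit pattern held in an unsigned variable. The function name negate_nbit and the restriction to fewer than 16 bits are assumptions made for this illustration.

```c
#include <assert.h>

/* Negate an n-bit two's complement pattern: flip every bit, add one,
   then keep only the low n bits (which discards any carry out). */
unsigned negate_nbit(unsigned x, int n)
{
    assert(n > 0 && n < 16);
    unsigned mask = (1u << n) - 1u;
    return (~x + 1u) & mask;
}
```

With n = 5, negate_nbit(0x05, 5) yields 0x1B, the pattern 11011 shown for -5.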
23
Two’s Complement Signed Integers
MS bit is sign bit – it has weight –2^(n-1).
Range of an n-bit number: –2^(n-1) through 2^(n-1) – 1.
– The most negative number (–2^(n-1)) has no positive counterpart.
-2^3 2^2 2^1 2^0 value
0 0 0 0 0
0 0 0 1 1
0 0 1 0 2
0 0 1 1 3
0 1 0 0 4
0 1 0 1 5
0 1 1 0 6
0 1 1 1 7
-2^3 2^2 2^1 2^0 value
1 0 0 0 -8
1 0 0 1 -7
1 0 1 0 -6
1 0 1 1 -5
1 1 0 0 -4
1 1 0 1 -3
1 1 1 0 -2
1 1 1 1 -1
25
Text: ASCII Characters
ASCII: Maps 128 characters to 7-bit code. (type man ascii)
– both printable and non-printable (ESC, DEL, …) characters

00 nul  10 dle  20 sp  30 0  40 @  50 P  60 `  70 p
01 soh  11 dc1  21 !   31 1  41 A  51 Q  61 a  71 q
02 stx  12 dc2  22 "   32 2  42 B  52 R  62 b  72 r
03 etx  13 dc3  23 #   33 3  43 C  53 S  63 c  73 s
04 eot  14 dc4  24 $   34 4  44 D  54 T  64 d  74 t
05 enq  15 nak  25 %   35 5  45 E  55 U  65 e  75 u
06 ack  16 syn  26 &   36 6  46 F  56 V  66 f  76 v
07 bel  17 etb  27 '   37 7  47 G  57 W  67 g  77 w
08 bs   18 can  28 (   38 8  48 H  58 X  68 h  78 x
09 ht   19 em   29 )   39 9  49 I  59 Y  69 i  79 y
0a nl   1a sub  2a *   3a :  4a J  5a Z  6a j  7a z
0b vt   1b esc  2b +   3b ;  4b K  5b [  6b k  7b {
0c np   1c fs   2c ,   3c <  4c L  5c \  6c l  7c |
0d cr   1d gs   2d -   3d =  4d M  5d ]  6d m  7d }
0e so   1e rs   2e .   3e >  4e N  5e ^  6e n  7e ~
0f si   1f us   2f /   3f ?  4f O  5f _  6f o  7f del
26
Interesting Properties of ASCII Code
What is the relationship between a decimal digit ('0', '1', …) and its ASCII code?
What is the difference between an upper-case letter ('A', 'B', …) and its lower-case equivalent ('a', 'b', …)?
Given two ASCII characters, how do we tell which comes first in alphabetical order?
Are 128 characters enough? (http://www.unicode.org/)
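One way to explore the first three questions is to write the relationships down in C. This is a sketch that assumes the execution character set is ASCII; the function names are invented for the example.

```c
#include <assert.h>

/* A digit's code is 0x30 plus its value, so subtracting '0' recovers it. */
int digit_value(char c)
{
    assert(c >= '0' && c <= '9');
    return c - '0';
}

/* Upper and lower case differ only in bit 0x20 ('A' = 0x41, 'a' = 0x61). */
char to_lower_ascii(char c)
{
    return (c >= 'A' && c <= 'Z') ? (char)(c | 0x20) : c;
}

/* Within one case, alphabetical order is numeric order of the codes. */
int comes_first(char a, char b)
{
    return a < b;
}
```

Note that comes_first('a', 'B') is 0, because every upper-case code is smaller than every lower-case code; a case-insensitive comparison would normalize both characters first.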
27
Other Data Types
Text strings
– sequence of characters, terminated with the null character (0)
– typically, no special hardware support
Image
– array of pixels
   monochrome: one bit (1/0 = black/white)
   color: red, green, blue (RGB) components (e.g., 8 bits each)
   other properties: transparency
– hardware support: typically none, in general-purpose processors
   MMX -- multiple 8-bit operations packed into one 64-bit register
Sound
– sequence of fixed-point numbers
28
Pointers in C
C lets us talk about and manipulate “pointers” (addresses) as variables and in expressions.
Declaration
   int *p; /* p is a pointer to an int */
A pointer in C is always a pointer to a particular data type: int*, double*, char*, etc.
Operators
   *p -- returns the value pointed to by p (“dereference”)
   &z -- returns the address of variable z (“address of”)
29
Example
   int i;
   int *ptr;
   i = 4;              /* store the value 4 into the memory location associated with i */
   ptr = &i;           /* store the address of i into the memory location associated with ptr */
   *ptr = *ptr + 1;    /* read the contents of memory at the address stored in ptr, add 1,
                          and store the result into memory at the address stored in ptr */
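Wrapped in a function, the example can be compiled and checked. The function name pointer_example is an assumption added for this sketch; the statements are the ones from the slide.

```c
#include <assert.h>

/* Runs the slide's statements and returns the final value of i. */
int pointer_example(void)
{
    int i;
    int *ptr;

    i = 4;              /* i holds 4 */
    ptr = &i;           /* ptr holds the address of i */
    assert(ptr == &i);
    *ptr = *ptr + 1;    /* read i through ptr, add 1, write it back */
    return i;
}
```

The update goes through ptr, yet the value returned is i itself, which is now 5.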
31
Converting Binary (2’s C) to Decimal
1. If leading bit is one, take two’s complement to get a positive number.
2. Add powers of 2 that have “1” in the corresponding bit positions.
3. If original number was negative, add a minus sign.
n 2^n
0 1
1 2
2 4
3 8
4 16
5 32
6 64
7 128
8 256
9 512
10 1024
X = 01101000_two
  = 2^6 + 2^5 + 2^3 = 64 + 32 + 8
  = 104_ten
Assuming 8-bit 2’s complement numbers.
32
More Examples
n 2^n
0 1
1 2
2 4
3 8
4 16
5 32
6 64
7 128
8 256
9 512
10 1024
Assuming 8-bit 2’s complement numbers.
X = 00100111_two
  = 2^5 + 2^2 + 2^1 + 2^0 = 32 + 4 + 2 + 1
  = 39_ten

X = 11100110_two
-X = 00011010_two
   = 2^4 + 2^3 + 2^1 = 16 + 8 + 2
   = 26_ten
X = -26_ten
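The three-step conversion can be sketched in C for 8-bit strings of '0' and '1'. The name twos_comp8_to_int is hypothetical; the steps follow the procedure given above.

```c
#include <assert.h>
#include <string.h>

/* Convert an 8-character bit string in two's complement to an int. */
int twos_comp8_to_int(const char *bits)
{
    unsigned pattern = 0;
    assert(strlen(bits) == 8);
    for (int i = 0; i < 8; i++) {
        assert(bits[i] == '0' || bits[i] == '1');
        pattern = (pattern << 1) | (unsigned)(bits[i] - '0');  /* step 2: weight by position */
    }
    if (pattern & 0x80u) {
        /* step 1: leading bit is one, so take the two's complement */
        unsigned magnitude = (~pattern + 1u) & 0xFFu;
        return -(int)magnitude;                                /* step 3: minus sign */
    }
    return (int)pattern;
}
```

For the worked examples, "01101000" gives 104 and "11100110" gives -26.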
33
Converting Decimal to Binary (2’s C)
First Method: Division
1. Divide the magnitude by two – remainder is least significant bit.
2. Keep dividing by two until answer is zero, writing remainders from right to left.
3. Append a zero as the MS bit; if original number negative, take two’s complement.

X = 104_ten
104/2 = 52 r0   bit 0
 52/2 = 26 r0   bit 1
 26/2 = 13 r0   bit 2
 13/2 =  6 r1   bit 3
  6/2 =  3 r0   bit 4
  3/2 =  1 r1   bit 5
  1/2 =  0 r1   bit 6
X = 01101000_two
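The division method translates almost line for line into C. This sketch is limited to 8-bit results, and the name int_to_twos_comp8 and its output convention (a 9-byte string buffer plus the pattern as return value) are choices made for the example.

```c
#include <assert.h>

/* Write the 8-bit two's complement form of x into out ("01101000" style)
   and also return the bit pattern as an unsigned value. */
unsigned int_to_twos_comp8(int x, char out[9])
{
    assert(x >= -128 && x <= 127);
    unsigned magnitude = (unsigned)(x < 0 ? -x : x);
    unsigned pattern = 0;

    /* Steps 1-2: repeated division by two; each remainder is the next bit. */
    for (int bit = 0; magnitude != 0; bit++) {
        pattern |= (magnitude % 2u) << bit;
        magnitude /= 2u;
    }
    /* Step 3: negative numbers take the two's complement. */
    if (x < 0)
        pattern = (~pattern + 1u) & 0xFFu;

    for (int i = 0; i < 8; i++)
        out[i] = (pattern & (0x80u >> i)) ? '1' : '0';
    out[8] = '\0';
    return pattern;
}
```

Calling int_to_twos_comp8(104, buf) leaves "01101000" in buf and returns 0x68.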
34
Converting Decimal to Binary (2’s C)
Second Method: Subtract Powers of Two
1. Change to positive decimal number.
2. Subtract largest power of two less than or equal to number.
3. Put a one in the corresponding bit position.
4. Keep subtracting until result is zero.
5. Append a zero as MS bit; if original was negative, take two’s complement.

X = 104_ten
104 - 64 = 40   bit 6
 40 - 32 =  8   bit 5
  8 -  8 =  0   bit 3
X = 01101000_two
n 2^n
0 1
1 2
2 4
3 8
4 16
5 32
6 64
7 128
8 256
9 512
10 1024
35
Hexadecimal Notation
It is often convenient to write binary (base-2) numbers as hexadecimal (base-16) numbers instead.
– fewer digits -- four bits per hex digit
– less error prone -- easy to corrupt long string of 1’s and 0’s
Binary Hex Decimal
0000 0 0
0001 1 1
0010 2 2
0011 3 3
0100 4 4
0101 5 5
0110 6 6
0111 7 7
1000 8 8
1001 9 9
1010 A 10
1011 B 11
1100 C 12
1101 D 13
1110 E 14
1111 F 15
36
Converting from Binary to Hexadecimal
Every four bits is a hex digit.
– start grouping from right-hand side

011 1010 1000 1111 0100 1101 0111
  3    A    8    F    4    D    7

This is not a new machine representation, just a convenient way to write the number.
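Grouping from the right can be sketched in C on a string of '0' and '1' characters. The function binary_to_hex is hypothetical; the caller is assumed to supply an output buffer large enough for one hex digit per four input bits plus a terminator.

```c
#include <assert.h>
#include <string.h>

/* Convert a bit string to upper-case hex, grouping four bits at a time
   starting from the right-hand side. */
void binary_to_hex(const char *bits, char *out)
{
    static const char digits[] = "0123456789ABCDEF";
    size_t n = strlen(bits);
    size_t ndigits = (n + 3) / 4;      /* leftmost group may be short */

    assert(n > 0);
    for (size_t d = 0; d < ndigits; d++) {
        /* Group d (counted from the left) ends at index 'end', exclusive. */
        size_t end = n - 4 * (ndigits - 1 - d);
        size_t start = (end >= 4) ? end - 4 : 0;
        unsigned v = 0;
        for (size_t i = start; i < end; i++)
            v = (v << 1) | (unsigned)(bits[i] == '1');
        out[d] = digits[v];
    }
    out[ndigits] = '\0';
}
```

For the 27-bit string above, the leftmost group has only three bits (011), which is why the first hex digit is 3.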
38
Overview of some C operators
! Logical NOT or “bang”
~ Bitwise NOT (“flips” bits)
& Bitwise AND
^ Bitwise XOR
| Bitwise OR
+ Addition
<< Bitwise left shift (shifts bits to left)
>> Bitwise right shift (shifts bits to right)
39
Addition
2’s comp. addition is just binary addition.
– assume all integers have the same number of bits
– ignore carry out
– for now, assume that sum fits in n-bit 2’s comp. representation

   01101000  (104)       11110110  (-10)
 + 11110000  (-16)     + ________  (-9)
   01011000  (88)        ________  (-19)
Assuming 8-bit 2’s complement numbers.
40
Subtraction
Negate subtrahend (2nd no.) and add.
– assume all integers have the same number of bits
– ignore carry out
– for now, assume that difference fits in n-bit 2’s comp. representation

   01101000  (104)       11110110  (-10)
 - 00010000  (16)      - ________  (-9)

becomes

   01101000  (104)       11110110  (-10)
 + 11110000  (-16)     + ________  (9)
   01011000  (88)        ________  (-1)
Assuming 8-bit 2’s complement numbers.
41
Sign Extension
To add two numbers, we must represent them with the same number of bits.
If we just pad with zeroes on the left:
   4-bit         8-bit
   0100 (4)      00000100 (still 4)
   1100 (-4)     00001100 (12, not -4)
Instead, replicate the MS bit -- the sign bit:
   4-bit         8-bit
   0100 (4)      00000100 (still 4)
   1100 (-4)     11111100 (still -4)
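Replicating the sign bit by hand looks like this in C. The sketch widens a 4-bit pattern to 8 bits; the name sign_extend_4_to_8 is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Widen a 4-bit two's complement pattern to 8 bits by copying
   bit 3 (the sign bit) into bits 4 through 7. */
uint8_t sign_extend_4_to_8(uint8_t nibble)
{
    assert(nibble <= 0x0F);
    return (nibble & 0x08u) ? (uint8_t)(nibble | 0xF0u) : nibble;
}
```

The pattern 1100 (0xC) becomes 11111100 (0xFC), while 0100 stays 00000100.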
42
Overflow
If operands are too big, then sum cannot be represented as an n-bit 2’s comp number.
We have overflow if:
– signs of both operands are the same, and
– sign of sum is different.
Another test -- easy for hardware:
– carry into MS bit does not equal carry out

   01000  (8)        11000  (-8)
 + 01001  (9)      + 10111  (-9)
   10001  (-15)      01111  (+15)
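The sign-based test can be written down for n-bit patterns stored in unsigned variables. This is a sketch; add_overflows is a hypothetical name, and n is limited to fewer than 16 bits to keep the shifts portable.

```c
#include <assert.h>

/* Returns 1 if adding n-bit two's complement patterns a and b overflows:
   both operands have the same sign and the n-bit sum has the other sign. */
int add_overflows(unsigned a, unsigned b, int n)
{
    assert(n > 1 && n < 16);
    unsigned mask = (1u << n) - 1u;
    unsigned sign = 1u << (n - 1);
    assert((a & ~mask) == 0 && (b & ~mask) == 0);

    unsigned sum = (a + b) & mask;          /* ignore carry out */
    return ((a & sign) == (b & sign)) && ((sum & sign) != (a & sign));
}
```

Both 5-bit examples above are flagged: 8 + 9 yields a negative pattern, and (-8) + (-9) yields a positive one.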
43
Logical Operations
Operations on logical TRUE or FALSE
– two states -- takes one bit to represent: TRUE=1, FALSE=0
View n-bit number as a collection of n logical values
– operation applied to each bit independently

A  B  A AND B        A  B  A OR B        A  NOT A
0  0  0              0  0  0             0  1
0  1  0              0  1  1             1  0
1  0  0              1  0  1
1  1  1              1  1  1
44
Examples of Logical Operations
AND
– useful for clearing bits
   AND with zero = 0
   AND with one = no change
OR
– useful for setting bits
   OR with zero = no change
   OR with one = 1
NOT
– unary operation -- one argument
– flips every bit

      11000101           11000101
 AND  00001111       OR  00001111       NOT  11000101
      00000101           11001111            00111010
45
Bit-Level Operations in C
Operations &, |, ~, ^ Available in C
– Apply to any “integral” data type
   long, int, short, char
– View arguments as bit vectors
– Arguments applied bit-wise
Examples (char data type)
– ~0x41 --> 0xBE
   ~01000001_2 --> 10111110_2
– ~0x00 --> 0xFF
   ~00000000_2 --> 11111111_2
– 0x69 & 0x55 --> 0x41
   01101001_2 & 01010101_2 --> 01000001_2
– 0x69 | 0x55 --> 0x7D
   01101001_2 | 01010101_2 --> 01111101_2
46
Relations Between Operations
DeMorgan’s Laws
– Express & in terms of |, and vice-versa
   A & B = ~(~A | ~B)
   – A and B are true if and only if neither A nor B is false
   A | B = ~(~A & ~B)
   – A or B is true if and only if A and B are not both false
Exclusive-Or using Inclusive Or
   A ^ B = (~A & B) | (A & ~B)
   – Exactly one of A and B is true
   A ^ B = (A | B) & ~(A & B)
   – Either A is true, or B is true, but not both
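Because the identities are bitwise, they can be checked exhaustively on 8-bit values. The following sketch does that with assertions; check_boolean_identities is a hypothetical name, and results are cast back to uint8_t because C promotes the operands to int before applying ~.

```c
#include <assert.h>
#include <stdint.h>

/* Asserts all four identities for every pair of bytes and returns
   the number of pairs checked. */
long check_boolean_identities(void)
{
    long checked = 0;
    for (unsigned a = 0; a < 256; a++) {
        for (unsigned b = 0; b < 256; b++) {
            uint8_t A = (uint8_t)a, B = (uint8_t)b;
            assert((uint8_t)(A & B) == (uint8_t)~(~A | ~B));
            assert((uint8_t)(A | B) == (uint8_t)~(~A & ~B));
            assert((uint8_t)(A ^ B) == (uint8_t)((~A & B) | (A & ~B)));
            assert((uint8_t)(A ^ B) == (uint8_t)((A | B) & ~(A & B)));
            checked++;
        }
    }
    return checked;
}
```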
47
General Boolean Algebras
Operate on Bit Vectors
– Operations applied bitwise
All of the Properties of Boolean Algebra Apply

   01101001       01101001       01101001
 & 01010101     | 01010101     ^ 01010101     ~ 01010101
   01000001       01111101       00111100       10101010
48
Contrast: Logic Operations in C
Contrast to Logical Operators
– &&, ||, !
   View 0 as “False”
   Anything nonzero as “True”
   Always return 0 or 1
   Early termination
Examples (char data type)
– !0x41 --> 0x00
– !0x00 --> 0x01
– !!0x41 --> 0x01
– 0x69 && 0x55 --> 0x01
– 0x69 || 0x55 --> 0x01
– p && *p (avoids null pointer access)
49
Shift Operations
Left Shift: x << y
– Shift bit-vector x left y positions
   Throw away extra bits on left
   Fill with 0’s on right
Right Shift: x >> y
– Shift bit-vector x right y positions
   Throw away extra bits on right
– Logical shift
   Fill with 0’s on left
– Arithmetic shift
   Replicate most significant bit on left
   Useful with two’s complement integer representation

Argument x    01100010        Argument x    10100010
<< 3          00010000        << 3          00010000
Log. >> 2     00011000        Log. >> 2     00101000
Arith. >> 2   00011000        Arith. >> 2   11101000
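The three shifts can be reproduced on 8-bit values in C. In this sketch the arithmetic shift is built by hand from a logical shift, since C leaves >> on negative signed values implementation-defined; the function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Left shift: bits pushed past bit 7 are thrown away. */
uint8_t shl8(uint8_t x, int k)
{
    assert(k >= 0 && k < 8);
    return (uint8_t)(x << k);
}

/* Logical right shift: vacated positions on the left are filled with 0. */
uint8_t logical_shr8(uint8_t x, int k)
{
    assert(k >= 0 && k < 8);
    return (uint8_t)(x >> k);
}

/* Arithmetic right shift: vacated positions copy the original sign bit. */
uint8_t arith_shr8(uint8_t x, int k)
{
    assert(k >= 0 && k < 8);
    uint8_t shifted = (uint8_t)(x >> k);
    if (x & 0x80u)
        shifted |= (uint8_t)(0xFFu << (8 - k));
    return shifted;
}
```

For x = 10100010 (0xA2), logical_shr8(x, 2) gives 00101000 and arith_shr8(x, 2) gives 11101000, as in the table.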
51
Numeric Ranges
Unsigned Values
– UMin = 0
   000…0
– UMax = 2^w – 1
   111…1
Two’s Complement Values
– TMin = –2^(w–1)
   100…0
– TMax = 2^(w–1) – 1
   011…1
Other Values
– Minus 1
   111…1

Values for W = 16
        Decimal   Hex     Binary
UMax    65535     FF FF   11111111 11111111
TMax    32767     7F FF   01111111 11111111
TMin    -32768    80 00   10000000 00000000
-1      -1        FF FF   11111111 11111111
0       0         00 00   00000000 00000000
52
Values for Different Word Sizes
Observations
– |TMin| = TMax + 1
   Asymmetric range
– UMax = 2 * TMax + 1
C Programming
– #include <limits.h>
   K&R App. B11
– Declares constants, e.g.,
   ULONG_MAX
   LONG_MAX
   LONG_MIN
– Values platform-specific

W       8      16        32               64
UMax    255    65,535    4,294,967,295    18,446,744,073,709,551,615
TMax    127    32,767    2,147,483,647    9,223,372,036,854,775,807
TMin    -128   -32,768   -2,147,483,648   -9,223,372,036,854,775,808
53
Casting Signed to Unsigned
C Allows Conversions from Signed to Unsigned
   short int x = 15213;
   unsigned short int ux = (unsigned short) x;
   short int y = -15213;
   unsigned short int uy = (unsigned short) y;
Resulting Value
– No change in bit representation
– Nonnegative values unchanged
   ux = 15213
– Negative values change into (large) positive values
   uy = 50323
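The two conversions can be checked with a small function. This sketch assumes short is 16 bits wide, which is common but not required by C; to_unsigned_short is a hypothetical name.

```c
#include <assert.h>
#include <limits.h>

/* Convert a short to unsigned short. Negative inputs have 2^16 added,
   which corresponds to reusing the same 16-bit pattern. */
unsigned short to_unsigned_short(short v)
{
    assert(USHRT_MAX == 65535);    /* assumption: 16-bit short */
    return (unsigned short)v;
}
```

The result for -15213 is 65536 - 15213 = 50323, matching uy above.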
54
Signed vs. Unsigned in C
Constants
– By default are considered to be signed integers
– Unsigned if have “U” as suffix
   0U, 4294967259U
Casting
– Explicit casting between signed & unsigned same as U2T and T2U
   int tx, ty;
   unsigned ux, uy;
   tx = (int) ux;
   uy = (unsigned) ty;
– Implicit casting also occurs via assignments and procedure calls
   tx = ux;
   uy = ty;
55
Why Should I Use Unsigned?
Don’t Use Just Because Number Is Nonnegative
– C compilers on some machines generate less efficient code
   unsigned i;
   for (i = 1; i < cnt; i++)
     a[i] += a[i-1];
– Easy to make mistakes
   for (i = cnt-2; i >= 0; i--)   /* never terminates: i >= 0 is always true for unsigned i */
     a[i] += a[i+1];
Do Use When Performing Modular Arithmetic
– Multiprecision arithmetic
– Other esoteric stuff
Do Use When Need Extra Bit’s Worth of Range
– Working right up to limit of word size
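One common way to write the downward loop safely with an unsigned index is to test before decrementing. This is a sketch of that idiom, not the only fix; suffix_sums is a hypothetical name and size_t stands in for the unsigned index type.

```c
#include <assert.h>
#include <stddef.h>

/* Replace each a[i] with a[i] + a[i+1] + ... + a[cnt-1], walking
   from the second-to-last element down to index 0. */
void suffix_sums(int *a, size_t cnt)
{
    assert(a != NULL || cnt == 0);
    if (cnt < 2)
        return;
    /* "i-- > 0" compares first, then decrements, so the body sees
       cnt-2, cnt-3, ..., 0 and the loop stops without wrapping in the body. */
    for (size_t i = cnt - 1; i-- > 0; )
        a[i] += a[i + 1];
}
```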
56
Multiplication
Computing Exact Product of w-bit numbers x, y
– Either signed or unsigned
Ranges
– Unsigned:
   Result requires up to 2w bits
– Two’s complement:
   Result requires up to 2w–1 bits (2w bits only when both operands are TMin)
Maintaining Exact Results
– Would need to keep expanding word size with each product computed
– Done in software by “arbitrary precision” arithmetic packages
57
Unsigned Multiplication in C
Standard Multiplication Function
– Ignores high order w bits
Implements Modular Arithmetic
   UMult_w(u, v) = u · v mod 2^w

Operands: u, v                      w bits each
True Product: u · v                 2w bits
Discard high-order w bits: UMult_w(u, v)   w bits
58
Unsigned vs. Signed Multiplication
Unsigned Multiplication
   unsigned ux = (unsigned) x;
   unsigned uy = (unsigned) y;
   unsigned up = ux * uy;
– Truncates product to w-bit number up = UMult_w(ux, uy)
– Modular arithmetic: up = ux · uy mod 2^w
Two’s Complement Multiplication
   int x, y;
   int p = x * y;
– Compute exact product of two w-bit numbers x, y
– Truncate result to w-bit number p = TMult_w(x, y)
59
Power-of-2 Multiply with Shift
Operation
– u << k gives u * 2^k
– Both signed and unsigned
Examples
– u << 3 == u * 8
– (u << 5) - (u << 3) == u * 24   (parentheses needed: in C, - binds tighter than <<)
– Most machines shift and add much faster than multiply
   Compiler generates this code automatically

Operands: u and 2^k                 w bits each
True Product: u · 2^k               w+k bits
Discard high-order k bits: UMult_w(u, 2^k) or TMult_w(u, 2^k)   w bits, low k bits are 0
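The shift identities can be tried directly on fixed-width unsigned values. This is a sketch using uint32_t so that results wrap modulo 2^32; the function names are made up, and the internal assert simply documents that shifting agrees with ordinary multiplication.

```c
#include <assert.h>
#include <stdint.h>

/* u * 8 as a single shift. */
uint32_t times8(uint32_t u)
{
    return u << 3;
}

/* u * 24 as (u * 32) - (u * 8); the parentheses are required. */
uint32_t times24(uint32_t u)
{
    uint32_t r = (u << 5) - (u << 3);
    assert(r == u * 24u);   /* both sides wrap modulo 2^32 */
    return r;
}
```

Without the parentheses, u << 5 - u << 3 would parse as (u << (5 - u)) << 3, which is a different computation.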
60
Unsigned Power-of-2 Divide with Shift
Quotient of Unsigned by Power of 2
– u >> k gives ⌊u / 2^k⌋
– Uses logical shift

          Division      Computed   Hex     Binary
x         15213         15213      3B 6D   00111011 01101101
x >> 1    7606.5        7606       1D B6   00011101 10110110
x >> 4    950.8125      950        03 B6   00000011 10110110
x >> 8    59.4257813    59         00 3B   00000000 00111011

[Figure: dividing u by 2^k moves the binary point k places left; the k bits to its right are discarded, giving ⌊u / 2^k⌋.]
61
Signed Power-of-2 Divide with Shift
Quotient of Signed by Power of 2
– x >> k gives ⌊x / 2^k⌋
– Uses arithmetic shift
– Rounds wrong direction when x < 0

          Division       Computed   Hex     Binary
y         -15213         -15213     C4 93   11000100 10010011
y >> 1    -7606.5        -7607      E2 49   11100010 01001001
y >> 4    -950.8125      -951       FC 49   11111100 01001001
y >> 8    -59.4257813    -60        FF C4   11111111 11000100

[Figure: dividing x by 2^k moves the binary point k places left; discarding the k bits to its right gives RoundDown(x / 2^k).]
62
Correct Power-of-2 Divide
Quotient of Negative Number by Power of 2
– Want ⌈x / 2^k⌉ (Round Toward 0)
– Compute as ⌊(x + 2^k – 1) / 2^k⌋
   In C: (x + (1<<k)-1) >> k
   Biases dividend toward 0
Case 1: No rounding
– The low k bits of the dividend are all 0.
– Adding the bias 2^k – 1 only sets those low k bits, which the shift then discards.
– Biasing has no effect
63
Correct Power-of-2 Divide (Cont.)
Case 2: Rounding
– The low k bits of the dividend are not all 0.
– Adding the bias 2^k – 1 carries into the upper bits, so the part left of the binary point is incremented by 1.
– Biasing adds 1 to final result
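The biased shift can be sketched as a C function. It relies on >> performing an arithmetic shift for negative int values, which C leaves implementation-defined but which holds on common compilers; div_pow2 is a hypothetical name, and the bias is applied only when x is negative.

```c
#include <assert.h>

/* Divide x by 2^k, rounding toward zero like C's / operator. */
int div_pow2(int x, int k)
{
    assert(k >= 0 && k < 16);
    int bias = (x < 0) ? (1 << k) - 1 : 0;   /* 2^k - 1 for negative x */
    return (x + bias) >> k;                  /* assumes arithmetic shift */
}
```

For y = -15213 and k = 4 this gives -950, where the plain shift y >> 4 gave -951.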
65
Fractions: Fixed-Point
How can we represent fractions?
– Use a “binary point” to separate positive from negative powers of two -- just like “decimal point.”
– 2’s comp addition and subtraction still work, if binary points are aligned

   00101000.101  (40.625)        2^-1 = 0.5
 + 11111110.110  (-1.25)         2^-2 = 0.25
   00100111.011  (39.375)        2^-3 = 0.125
[note: -1.25 is in 2’s complement]
No new operations -- same as integer arithmetic.
66
Very Large and Very Small: Floating-Point
Large values: 6.023 × 10^23 -- requires 79 bits
Small values: 6.626 × 10^-34 -- requires >110 bits
Use equivalent of “scientific notation”: F × 2^E
Need to represent F (fraction), E (exponent), and sign.
IEEE 754 Floating-Point Standard (32-bits):

   S     Exponent   Fraction
   1b    8b         23b

N = (–1)^S × 1.fraction × 2^(exponent – 127),   1 ≤ exponent ≤ 254
N = (–1)^S × 0.fraction × 2^(–126),             exponent = 0
67
Floating Point Example
Single-precision IEEE floating point number:
   1 01111110 10000000000000000000000
– Sign is 1 – number is negative.
– Exponent field is 01111110 = 126 (decimal).
– Fraction is 0.100000000000… = 0.5 (decimal).
Value = -1.5 × 2^(126-127) = -1.5 × 2^-1 = -0.75.
(fields, left to right: sign, exponent, fraction)
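The three fields can be pulled out of a float with shifts and masks. This sketch assumes float is a 32-bit IEEE 754 value; the struct float_fields and the function split_float are names invented for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    unsigned sign;       /* 1 bit   */
    unsigned exponent;   /* 8 bits  */
    uint32_t fraction;   /* 23 bits */
} float_fields;

/* Split a single-precision value into its sign, exponent and fraction fields. */
float_fields split_float(float f)
{
    uint32_t bits;
    float_fields out;

    assert(sizeof f == sizeof bits);   /* assumption: 32-bit float */
    memcpy(&bits, &f, sizeof bits);
    out.sign = (unsigned)(bits >> 31);
    out.exponent = (unsigned)((bits >> 23) & 0xFFu);
    out.fraction = bits & 0x7FFFFFu;
    return out;
}
```

For -0.75f the fields come out as sign 1, exponent 126 and fraction 0x400000 (a one followed by 22 zeros), matching the bit string above.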
68
Floating Point in C
C Guarantees Two Levels
   float    single precision
   double   double precision
Conversions
– Casting between int, float, and double changes numeric values
– double or float to int
   Truncates fractional part
   Like rounding toward zero
   Not defined when out of range
   – Generally saturates to TMin or TMax
– int to double
   Exact conversion, as long as int has ≤ 53 bit word size
– int to float
   Will round according to rounding mode
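A few of these conversions can be observed with small wrappers. This sketch assumes a 32-bit int, IEEE 754 float and double, and the default round-to-nearest mode; the function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* double to int: the fractional part is dropped (rounds toward zero).
   The assert guards the out-of-range case, which C leaves undefined. */
int truncate_to_int(double d)
{
    assert(d > (double)INT_MIN - 1.0 && d < (double)INT_MAX + 1.0);
    return (int)d;
}

/* int to float: only 24 significant bits are kept, so large values round. */
float int_to_float(int i)
{
    return (float)i;
}

/* int to double: exact for a 32-bit int, since double keeps 53 significant bits. */
double int_to_double(int i)
{
    return (double)i;
}
```

For instance, 16777217 (2^24 + 1) is expected to become 16777216.0f as a float but to survive unchanged as a double.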