introduction to computer and program designocw.nctu.edu.tw/course/icp011/ch04.pdfthe basic data...

Lesson 4: Data Types

Introduction to Computer and Program Design

James C.C. Cheng

Department of Computer Science

National Chiao Tung University

The basic data types

There are 13 basic data types in C. The sizes below are typical of 32-bit compilers such as Dev-C++ and MSC.

Name                  Size (bytes)   Range
char                  1              -128 to 127
unsigned char         1              0 to 255
short                 2              -32768 to 32767
unsigned short        2              0 to 65535
int                   4              -2^31 to 2^31 - 1
unsigned int          4              0 to 2^32 - 1
long                  4              -2^31 to 2^31 - 1
unsigned long         4              0 to 2^32 - 1
__int64, long long    8              -2^63 to 2^63 - 1
unsigned __int64      8              0 to 2^64 - 1
float                 4              ±(1.175494351e-38 to 3.402823466e38)
double                8              ±(2.2250738585072014e-308 to 1.7976931348623158e308)
long double           12 in Dev-C++ (x86 80-bit extended precision format), 8 in MSC

In C++, bool represents a Boolean value, true or false.

The size of bool depends on the compiler; it is 1 byte in most cases.

The basic data types: the sizeof operator

sizeof returns the number of bytes occupied by a variable or a data type.

    size_t sizeof( name );

size_t is an unsigned integer type whose definition depends on the environment:
    In a 32-bit environment: unsigned long
    In a 64-bit environment: unsigned long long

sizeof(type name)
    ex: sizeof(int); sizeof(double);

sizeof(variable name)
    ex: int x; sizeof(x); double d; sizeof(d);

Integers

The integer data types are:
    bool, char, short, int, long, __int64
    unsigned char, unsigned short, unsigned int, unsigned long, unsigned __int64

bool in C99: #include <stdbool.h>

Integers: Decimal system

Each digit is one of {0, 1, …, 9}.
    3472_10 = (3 x 10^3) + (4 x 10^2) + (7 x 10^1) + (2 x 10^0)

Integers: Binary system

Each digit is one of {0, 1}.

Binary → Decimal:
    1011_2 = (1 x 2^3) + (0 x 2^2) + (1 x 2^1) + (1 x 2^0) = 11_10

Decimal → Binary: divide by 2 repeatedly and collect the remainders; the last remainder is the most significant bit.
    13 mod 2 = 1    13 / 2 = 6
     6 mod 2 = 0     6 / 2 = 3
     3 mod 2 = 1     3 / 2 = 1
     1 mod 2 = 1     1 / 2 = 0
    13_10 = 1101_2

Addition:    1 + 0 = 1_2;   1 + 1 = 10_2
Subtraction: 1 - 0 = 1_2;  10 - 1 = 1_2

Integers: Hexadecimal system

Each digit is one of {0-9, A, B, C, D, E, F}.

Hex → Decimal:
    3A7C_16 = (3 x 16^3) + (10 x 16^2) + (7 x 16^1) + (12 x 16^0) = 14972_10

Decimal → Hex:
    Method 1: the same as decimal → binary, dividing by 16 instead of 2
    Method 2: decimal → binary → hex

Addition:
    (2 + 7)_16 = 9_16
    (1 + 9)_16 = A_16
    (2 + B)_16 = D_16
    (9 + 9)_16 = 12_16

Subtraction:
    (5 - 2)_16 = 3_16
    (F - 2)_16 = D_16
    (10 - 1)_16 = F_16

Integers: Binary → Hex

Group the bits in fours, starting from the least significant bit, and pad the leftmost group with zeros:
    11001110101101_2 = 0011,0011,1010,1101

Look up each group in the hex ↔ binary table:
    0011,0011,1010,1101 = 33AD_16

Hex  Binary      Hex  Binary
0    0000        8    1000
1    0001        9    1001
2    0010        A    1010
3    0011        B    1011
4    0100        C    1100
5    0101        D    1101
6    0110        E    1110
7    0111        F    1111

Integers: The limits of unsigned integers

    1 bit:   0, 1
    4 bits:  0 ~ 2^4 - 1  = 0 ~ 15
    1 byte:  0 ~ 2^8 - 1  = 0 ~ 255
    2 bytes: 0 ~ 2^16 - 1 = 0 ~ 65,535
    4 bytes: 0 ~ 2^32 - 1 = 0 ~ 4,294,967,295
    8 bytes: 0 ~ 2^64 - 1 = 0 ~ 18,446,744,073,709,551,615

    #include <limits.h>
    …
    unsigned char uc = UCHAR_MAX;
    unsigned short us = USHRT_MAX;
    unsigned int u = UINT_MAX;
    unsigned long ul = ULONG_MAX;
    unsigned long long ull = ULLONG_MAX;

Integers: The limits of signed integers

    #include <limits.h>
    …
    char c = CHAR_MIN;
    c = CHAR_MAX;
    short s = SHRT_MIN;
    s = SHRT_MAX;
    int i = INT_MIN;
    i = INT_MAX;
    long l = LONG_MIN;
    l = LONG_MAX;
    long long ll = LLONG_MIN;
    ll = LLONG_MAX;

Integers: Signed integers

How is a signed integer stored?

Method 1: Sign bit
    The most significant bit is the sign bit; the remaining 7 bits hold the magnitude.
        1 0 0 1 1 0 1 1 = -27
        0 0 0 1 1 0 1 1 =  27
    The magnitude range is 0 ~ 2^7 - 1 = 0 ~ 127.
    Pros: simple
    Cons:
        Negative zero
        Addition and subtraction problems:
            1001,1011_2 + 0001,1011_2 = 1011,0110_2, so -27 + 27 = -54 …!?


Method 2: One's complement
    27 = 0001,1011_2
    -27 = -(0001,1011_2) → 0/1 inversion → 1110,0100_2
    The magnitude range is 0 ~ 2^7 - 1 = 0 ~ 127.
    Pros:
        Simple
        No addition and subtraction problems:
            1110,0100_2 + 0001,1011_2 = 1111,1111_2, so -27 + 27 = -0
            1110,0100_2 + 0000,0001_2 = 1110,0101_2, so -27 + 1 = -26
    Cons:
        Negative zero


Method 3: Two's complement
    27 = 0001,1011_2
    -27 = 1,0000,0000_2 - 0001,1011_2 = 1110,0101_2
    The range is -2^7 ~ -1, 0, 1 ~ (2^7 - 1) = -128 ~ 127.
    Pros:
        No negative zero
        No addition and subtraction problems:
            1110,0101_2 + 0001,1011_2 = 1,0000,0000_2 → (max 8 bits) → 0000,0000_2, so -27 + 27 = 0
            1110,0101_2 + 0000,0001_2 = 1110,0110_2 = -(1,0000,0000_2 - 1110,0110_2) = -(0001,1010_2), so -27 + 1 = -26
    Cons:
        It has a higher conversion cost

Integers: Constants

    bool bT = true, bF = false, bTrue = 5438, bFalse = 0;
Zero is false; any non-zero value is true.

    char cA = 'A', cB = 66, cC = 65603, cD = 0x43, cE = 0105, cEnter = '\n';
Hex constant: use the prefix "0x".
Octal constant: use the prefix "0".

    short i = 10, j = 32767, k = -32768, s = 32768, t = -32769;

    int x = 10, y = -100, z = 0x02AB2F;
    int nMax = 2147483647;
    int nMinA = -2147483648;        // It causes a warning message
    int nMinB = (-2147483647 - 1);  // OK!

The warning arises because -2147483648 is read as -(2147483648): the minus sign is applied to the constant 2147483648, which does not fit in an int.

Integers: Constants

    long n = 120, m = 12345L, a = 0xFFFFL;
long integer constant: use the suffix "L".

    __int64 nn = 10, mm = 9876543210LL;
long long constant: use the suffix "LL".

64-bit integers in scanf and printf: use the length modifier "ll" or "I64".

    __int64 nn = 10;
    long long mm = 9876543210LL;
    scanf("%I64d%lld", &nn, &mm);
    printf("%lld, %I64d \n", nn, mm);  // OK, using "ll" or "I64"
    // Notice that "l" is lowercase. "L" means "long double", %Ld ==> %d

    nn = 7; mm = 1;
    printf("%d, %I64d \n", nn, mm);    // In x86, the second output is wrong!?

The arguments as laid out in memory (bytes, lowest address first):
    nn: 07, 00, 00, 00  00, 00, 00, 00
    mm: 01, 00, 00, 00  00, 00, 00, 00
%d consumes only the first 4 bytes of nn, so %I64d then reads the last 4 bytes of nn followed by the first 4 bytes of mm.

Integers: Constants

unsigned char, unsigned short and unsigned int:

    unsigned char uc = -1;
    unsigned short us = -1;
    unsigned int ui = -1;
    printf("%u, %u, %u\n", uc, us, ui);

unsigned long: the suffix is "UL".

    unsigned long un0 = 123UL, un1 = -1UL;
    printf("%lu, %lu\n", un0, un1);

unsigned __int64:
    In Dev C++, the suffix is "LLU"
    In VC++, the suffix is "ULL"

    unsigned long long unn0 = 9876543210000001234LLU;
    unsigned long long unn1 = -1LLU;
    printf("%llu, %I64u\n", unn0, unn1);

Integers: Integers in printf and scanf

In printf, an integer argument of 4 bytes or fewer is passed as 4 bytes; a long long is passed as 8 bytes.

    %d: signed decimal integer
    %i: signed decimal integer (in scanf, hex and octal input is also accepted)
    %u: unsigned decimal integer
    %o: unsigned octal integer
    %x: unsigned hex integer, lowercase (in scanf, both lowercase and uppercase are accepted)
    %X: unsigned hex integer, uppercase (only in printf)
    %c: character
    %p: address in hex digits; 8 digits on 32-bit, 16 digits on 64-bit (only in printf)
    %n: stores the number of characters written/read so far; the argument shall be a pointer to an integer. In VC++, this feature is disabled.

Integers: Integers in printf and scanf

Data length modifiers:
    hh: char
    h:  short
    l:  long
    ll: long long

    char c = -1;          printf("%02hhX\n", c);   // Failed in DevC++
    short s = -1;         printf("%04hX\n", s);
    long l = -1L;         printf("%08lX\n", l);
    long long ll = -1LL;  printf("%016llX\n", ll);

Integers: Implicit typecast

    small → large
    signed → unsigned

    short s1 = 10; int n1 = 0xFF0000;
    if (n1 > s1)    // short → int
        printf("Hello\n");

    long long nn = 0x7FFFFFFF00000000ULL;
    if (nn > n1)    // int → long long
        printf("World!\n");

    unsigned int u = 0;
    int i = -1;
    if (i > u)      // int → unsigned int
        printf("-1 > 0\n");

Floating-point numbers

float: 32-bit IEEE-754 floating-point number

double: 64-bit IEEE-754 floating-point number

long double:
    In VC++, long double = double.
    In GCC 4.3 or above, sizeof(long double) is 12 bytes.
        In an x86 environment, only 10 of those bytes are used (x86 80-bit extended precision format).
        Why does it need 12 bytes? In a 32-bit environment, the data access unit is 4 bytes.

Floating-point numbers: Decimal fraction → Binary?

    0.5_10   = 2^-1 = 0.1_2
    0.25_10  = 2^-2 = 0.01_2
    0.125_10 = 2^-3 = 0.001_2
    …

Binary → Decimal:
    0.1011_2 = (1 x 2^-1) + (0 x 2^-2) + (1 x 2^-3) + (1 x 2^-4)
             = 0.5_10 + 0.125_10 + 0.0625_10
             = 0.6875_10

Decimal → Binary: multiply by 2 repeatedly; the integer part of each product is the next bit.
    0.375 x 2 = 0.75  → 0
    0.75  x 2 = 1.5   → 1
    0.5   x 2 = 1.0   → 1
    0.375_10 = 0.011_2

Some decimal fractions cannot be represented exactly in binary:
    0.4_10 = 0.011001100110…_2 (the block 0110 repeats forever)

Floating-point numbers: Decimal fraction → Binary?

Convert the integer part and the fraction part to binary separately:
    13.5625_10 = 1101.1001_2
        13_10     = 1101_2
        0.5625_10 = 0.1001_2

Floating-point numbers: IEEE 754 (The IEEE Standard for Floating-Point Arithmetic)

IEEE: Institute of Electrical and Electronics Engineers

32-bit IEEE-754:
    Sign: 1 bit
    Exponent: 8 bits (offset 127)
    Fraction (mantissa): 23 bits

64-bit IEEE-754:
    Sign: 1 bit
    Exponent: 11 bits (offset 1023)
    Fraction (mantissa): 52 bits

Floating-point numbers: IEEE 754 (The IEEE Standard for Floating-Point Arithmetic)

Example:
    13.5625_10 = 1101.1001_2 = 1.1011001_2 x 2^3

For a 32-bit float:
    = (-1)^0 x 1.1011001_2 x 2^(130-127)
    Sign = 0
    Exponent = 130 = 10000010_2
    Fraction = 10110010…0_2
    Bits: 0 10000010 10110010000000000000000 = 0x41590000

For a 64-bit float:
    = (-1)^0 x 1.1011001_2 x 2^(1026-1023)
    Sign = 0
    Exponent = 1026 = 100,0000,0010_2
    Fraction = 10110010…0_2
    Bits: 0 10000000010 101100100000…0 = 0x402B20...0

Floating-point numbers: x86 80-bit extended precision format (for gcc's long double)

Layout: Sign | Exponent | Integer | Fraction
    Sign: 1 bit
    Exponent: 15 bits (offset 16383)
    Integer part: 1 bit
        In the 80387 or above, this bit is always 1.
    Fraction: 63 bits

Value = (-1)^sign x 2^(exponent - 16383) x (1.fraction)_2

Reference: http://en.wikipedia.org/wiki/Extended_precision

Floating-point numbers: Addition and subtraction

Addition:
    13.5625_10 + 0.15625_10
    = (-1)^0 x 2^3 x 1.1011001_2 + (-1)^0 x 2^-3 x 1.01_2
    = 2^3 x 1.1011001_2 + 2^3 x 2^-6 x 1.01_2
    = 2^3 x 1.1011001_2 + 2^3 x 0.00000101_2
    = 2^3 x 1.10110111_2
    = 13.71875_10

Subtraction:
    13.5625_10 - 0.15625_10
    = (-1)^0 x 2^3 x 1.1011001_2 + (-1)^1 x 2^-3 x 1.01_2
    = 2^3 x 1.1011001_2 + (-1) x 2^3 x 2^-6 x 1.01_2
    = 2^3 x 1.1011001_2 + (-1) x 2^3 x 0.00000101_2
    = 2^3 x 1.1011001_2 + 2^3 x 1.11111011_2        (2's complement of 0.00000101_2)
    = 2^3 x 11.10101101_2 → (overflow; the carry is discarded) → 2^3 x 1.10101101_2
    = 13.40625_10

Floating-point numbers: Truncation and rounding

Example 1:
    1234567.2_10 = 1,0010,1101,0110,1000,0111.0011,0011,0011..._2
                 = 1.0010,1101,0110,1000,0111 0011,0011,0011..._2 x 2^20

    For a 32-bit float:
    = (-1)^0 x 1.0010,1101,0110,1000,0111 0011,0011,0011..._2 x 2^(147-127)
    Sign = 0
    Exponent = 147 = 10010011_2
    Fraction = 0010,1101,0110,1000,0111 0011,0011,0011..._2
    Fraction (23 bits) after truncation and rounding = 0010,1101,0110,1000,0111 010_2 (the stored value is 1234567.25)

Example 2:
    1234567.15_10 = 1,0010,1101,0110,1000,0111.0010,0110,0110..._2
                  = 1.0010,1101,0110,1000,0111 0010,0110,0110..._2 x 2^20

    For a 32-bit float:
    = (-1)^0 x 1.0010,1101,0110,1000,0111 0010,0110,0110..._2 x 2^(147-127)
    Sign = 0
    Exponent = 147 = 10010011_2
    Fraction = 0010,1101,0110,1000,0111 0010,0110,0110..._2
    Fraction (23 bits) after truncation = 0010,1101,0110,1000,0111 001_2 (the stored value is 1234567.125)

Floating-point numbers: Limits

Float values (b = bias)

Sign  Exponent (e)     Fraction (f)     Value
0     00..00           00..00           0
0     00..00           00..01 ~ 11..11  Positive denormalized real: 0.f x 2^(-b+1)
0     00..01 ~ 11..10  XX..XX           Positive normalized real: 1.f x 2^(e-b)
0     11..11           00..00           +Infinity
0     11..11           00..01 ~ 11..11  NaN

Sign  Exponent (e)     Fraction (f)     Value
1     00..00           00..00           -0
1     00..00           00..01 ~ 11..11  Negative denormalized real: -0.f x 2^(-b+1)
1     00..01 ~ 11..10  XX..XX           Negative normalized real: -1.f x 2^(e-b)
1     11..11           00..00           -Infinity
1     11..11           00..01 ~ 11..11  NaN

    #include <float.h>
    …
    float f = FLT_MIN;
    f = FLT_MAX;
    double d = DBL_MIN;
    d = DBL_MAX;
    long double ld = LDBL_MIN;
    ld = LDBL_MAX;
    printf("%f\n", f);
    printf("%f\n", d);
    printf("%Lf\n", ld);  // only in GCC 4.3 or above

Floating-point numbers: printf()

A float argument is converted to double.

%f, %e, %E, %g, and %G each need an 8-byte argument.

Format: %[width][.precision]{f|e|E|g|G}
    width: the minimum number of characters printed; if width is not given, all characters of the value are printed.
    precision:
        for f/e/E: the number of digits after the decimal point
        for g/G: the maximum number of significant digits printed
        the default is six

    float r = 0.00000123;
    printf("%10.8f, %4.2e, %4.2g\n", r, r, r);

Output:
    0.00000123, 1.23e-006, 1.2e-006

Floating-point numbers: scanf()

    %f:  float, 4-byte data
    %lf: double, 8-byte data
    %Lf: long double
        Only in GCC 4.3 or above, in a *.c file
        G++ does not accept %Lf

Floating-point numbers: Constants

    float f = 1.1234f;
Use the suffix "f" for a float constant.

    double r0 = 1.12345, r1 = 123, r2 = 0x64, r3 = 123LLU;
A double constant needs no prefix or suffix.

Typecast:
    integer → float
    float → double

    float fx = 33.3f;
    printf("%f\n", fx * 2 / 3);

    #include <float.h>
    …
    float fx = 0.0f;
    double dx = fx + DBL_MIN;
    printf("%f\n", dx);

Floating-point numbers: Never test a floating-point number for equality with a value

    float A = 1.2f;
    float B = 12.0f;
    scanf("%f%f", &A, &B);    // type 1.2 and 12
    float D = A - B / 10.0f;  // D = 1.2 - 12/10.0 should be zero
    if (D == 0.0)             // Oops!
        printf("!\n");
    else
        printf("%f\n", D);    // This line may show up, depending on how the compiler evaluates the expression

Floating-point numbers: Be careful when using a floating-point number as a loop iterator

    float i;
    for (i = 0.0f; i <= 1.0f; i += 0.1f)
        printf("%f\n", i);
    // How many lines will be displayed?

Union

A union lets one memory space be accessed as different data types.

The size of the union is at least the size of its largest member.

    union U {
        unsigned long i;
        float f;
    };

    int main() {
        union U u;
        u.f = 13.5625f;
        printf("%08X\n", u.i);      // 41590000
        printf("%08X\n", u.f);      // In x86: 00000000
        printf("%08X%08X\n", u.f);  // In x86: 00000000402B2000
        // Three ways of printing a float. Which one is the best?
    }

Characters: ASCII (American Standard Code for Information Interchange)

    '0'~'9': 48~57
    'A'~'Z': 65~90
    'a'~'z': 97~122

http://en.wikipedia.org/wiki/ASCII

Constants: use single quotation marks ' '

    char a = 'A';
    printf("%c: %d\n", a, a);
    a = '9';
    printf("%c: %d\n", a, a);
    a = 'xyz';
    printf("%c: %d\n", a, a);
    a = "uvw";  // Compiling error!

    char a = 50, b = 70, c = 100;
    printf("%c%c%c\n", a, b, c);

Characters: Escape sequences

    \n: (10) newline
    \t: (9) tab
    \b: (8) backspace
    \r: (13) carriage return
        In MS Windows, the end of a line consists of two characters: \r\n (13, 10)
    \': single quotation mark
    \": double quotation mark
    \\: backslash
    \0: null
    \OOO: OOO is a 3-digit octal ASCII code, e.g. char e = '\050'
    \xOO: OO is a 2-digit hex ASCII code, e.g. char i = '\x41'

    char a = '\n', b = '\t', c = '\b';
    char d = '\r', i = '\'', j = '\"';
    char f = '\\', g = '\0';

Strings

A string is a character array.

The last character of a string must be \0.

Constant string: use double quotation marks " "

String declaration and initialization:

    char s1[5] = "abcd";                   // a writable char array
    char s2[5] = {'A', 'B', 'C', 'D', 0};  // a writable char array
    char s3[] = "xyz";                     // a writable char array
    char s4[] = {'X', 'Y', 'Z', 0};        // a writable char array
    char* s5 = "uvw";                      // a char pointer that points to read-only data
    printf("%s\n%s\n%s\n%s\n%s\n", s1, s2, s3, s4, s5);
    s1[0] = 'T';                           // OK!
    s5[0] = 'T';                           // Runtime error
    char* s6 = {'U', 'V', 'W', 0};         // Compiler error

Strings: Misuse

    char *s1 = "abcd";
    char s2[] = "ABCD";
    printf("%s\n%s\n", s1, s2);
    s1 = s2;                     // The pointer can be an l-value
    printf("%s\n%s\n", s1, s2);  // ABCD, ABCD
    s2[0] = 'T';                 // Watch the side effect!
    printf("%s\n%s\n", s1, s2);  // TBCD, TBCD

    char s1[] = "abcd";
    char s2[] = "ABCD";
    s1 = s2;  // Compiling error! The array name cannot be an l-value

Side effect: besides returning a value, an expression does something additional; that is, it modifies some observable state.

Strings: scanf()

%[ ]: reads only the specified characters. Input ends when a non-matching character is reached or the field width is reached.

    char A[20] = {0};  // Declare a string of 20 zero characters
    scanf("%s", A);    // type "ABC EFG XYZ"
    printf("%s\n", A);

    char A[20] = {0}, B[20] = {0}, C[20] = {0};
    scanf("%s%s%s", A, B, C);  // type "ABC EFG XYZ"
    printf("%s %s %s\n", A, B, C);

    char A[20] = {0}, B[20] = {0}, C[20] = {0};
    scanf("%[0-9]%[A-Z]%[a-z]", A, B, C);
    printf("%s%s%s\n", A, B, C);

    scanf("%[ -~, '\t']", A);  // Read ASCII 9 and 32 to 126
    printf("%s\n", A);
    scanf("%[ -~, '\t', '\n' ]", A);  // Oops….

Strings: scanf()

%[^ ]: reads all characters except the specified ones.

    char A[20] = {0}, B[20] = {0}, C[20] = {0};
    scanf("%[^0-9]%[^A-Z]%[^a-z]", A, B, C);
    printf("%s %s %s\n", A, B, C);

    scanf("%[^'\n']", A);  // Read all characters except '\n'
    printf("%s\n", A);

Strings: gets()

    char* gets( char *s );

It reads characters from stdin and stores them into s until a newline character or the end-of-file is reached.

On success, it returns s. Otherwise, it returns NULL.

    #include <stdio.h>
    …
    char A[100] = {0};
    while (gets(A) != NULL) {
        printf("%s\n", A);
    }

Typecast

A typecast changes the data type of a value.

    x = (x's type name) y;

Ex:

    int n = 1234;
    float f = (float) n;
    char c = (char) f;
    double d = (double) c;

#define: Text replacement

Usage:
    #define name replacement_text

Ex:
    #define PI 3.14159
    #define EXP 2.71828
    #define NULL_STR

    double p = PI;
    printf("%f\n", p);            // 3.14159
    printf("%f\n", EXP);          // 2.71828
    printf("%f\n", NULL_STR);     // Compile Error
    printf("%f\n", NULL_STREXP);  // Compile Error
    printf("%f\n", NULL_STR EXP); // 2.71828

#define: Macros

Ex:
    #define MIN_INT(x) (-2147483647 - 1)
    #define INC(x) (++x)
    #define ADD(x, y) (x+y)
    #define MIN(x, y) (x<y?x:y)
    #define _MIN(x, y) x<y?x:y

    float x = PI * MIN(2.0f, 3.0f);   // x = 6.28318
    float y = PI * _MIN(2.0f, 3.0f);  // y = 3.0 !?

Mind the parentheses!

#define: Do not use #define to define a data type

Ex:
    #define uint unsigned int
    uint a, b;  // OK! a and b are unsigned int

    char *s1 = " Hello ", *s2 = " World ";
    #define CSTR char *
    CSTR s3 = " OK ", s4 = " oops ";
    // s4 is not a char *

typedef: Defining a data type

Usage:
    typedef original_typename new_typename;

Note: the semicolon is required.

Ex:
    typedef unsigned int uint;
    uint a, b;  // OK! a and b are unsigned int

    typedef char * cstr;
    cstr sA = "Hello", sB = "world!";
    printf("%s, %s\n", sA, sB);  // OK!

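Exercises

Design a function named inverseN that reverses the decimal digits of an unsigned long.
    For example: 12345 returns 54321; 12321 returns 12321; 0 returns 0.
    Watch for overflow: for the number 2223334445, return 0 and display "overflow".
    Use as little memory as possible; 64-bit integers, strings, and arrays are not allowed.

Given a double array A that stores the results of some mathematical function, i.e. A[i] = f(ix), write a function that computes the derivative of f: f′ ≅ …
    void dev(const double* A, double* B, int n);
    A is the input array, B is the derivative result, and n is the number of array elements. A[i] = 0 if i < 0 or i >= n.

Design generic min and max functions that can handle all of the basic data types of C and return the maximum and minimum values.
    void GMin(const void *pa, const void *pb, void *pOut, size_t n);
    void GMax(const void *pa, const void *pb, void *pOut, size_t n);
    pa and pb point to inputs of the same data type, pOut is the output, and n is the data size in bytes.

Using the previous exercise, implement a generic bubble sort.
    void bbsort(void *p, int n, size_t m, char dir);
    p points to the data array to be sorted, n is the number of array elements, and m is the size of each element in bytes. If dir is 0, sort in ascending order; otherwise, sort in descending order.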
