1 chapter 6 data types what is a data type? a set of values versus a set of values + set of...

47
1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those value

Upload: cooper-havey

Post on 15-Dec-2015

219 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

1

Chapter 6Data Types

What is a data type?

A set of values

versus

A set of values + set of operations on those values

Page 2: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

2

Why data types?• Data abstraction

– Programming style – incorrect design decisions show up at translation time

– Modifiability – enhance readability

• Type checking (semantic analysis) can be done at compile time - the process a translator goes through to determine whether the type information is consistent

• Compiler uses type information to allocate space for variables

• Translation efficiency: Type conversion (coercion) can be done at compile time

Page 3: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

3

Overview cont…

• Declarations– Explicit type information

• Type declarations– Give new names to types in type declararion

• Type checking– Type Inference Rules for determining the types of

constructs from available type information– Type equivalence determines if two types are the

same• Type system

– Type construction methods + type inference rules + type equivalence algorithm

Page 4: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

4

Simple Types• Predefined types• Enumerated types

Haskell

data Color  = Red | Green | Blue | Indigo | Violet

deriving (Show,Eq,Ord)Pascal type fruit = (apple, orange, banana);C/C++ enum fruit { apple, orange, banana };

Page 5: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

5

Simple Types• Subrange typesHaskell [2..5]Pascal type byte = 0..255; minors = 0..19; teens = 13..19;ADA subtype teens is INTEGER range

13..19;

Page 6: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

6

Data Aggregates and Type Constructors

• Aggregate (compound) objects and types are constructed from simple types

• Recursive – can also construct aggregate objects and types from aggregate types

• Predefined – records, arrays, strings……

Page 7: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

7

Constructors - Cartesian Product

• U X V = { (u, v) | u is in U and v is in V}

• Projection functions– p1: U X V -> U and P2: U X V -> V

• Pascal family of languages - recordtype polygon = record edgeCt: integer; edgeLen: real end;

var a : polygon;

Cartesian product typeINTEGER X REAL

Projectionsa.edgeCt

a.edgeLen

Page 8: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

8

Constructors – Mapping (arrays)

• The array constructor defines mappings as data aggregates

• Mapping from an array index to the value stored in that position in the array

• The domain is the values of the index• The range is the values stored in the

array

Page 9: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

9

Constructors Mapping (arrays)

• C/C++

typedef int little_people_num[3];

little_people_num gollum_city = {0, 0, 0}

gollum_city[3] = 5

typedef int matrix[10][20];

Page 10: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

10

Arrays

• Design Issues:1. What types are legal for subscripts?2. Are subscripting expressions in element references range checked?3. When are subscript ranges bound?4. When does allocation take place?5. What is the maximum number of

subscripts?6. Can array objects be initialized?7. Are any kind of slices allowed?

Page 11: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

11

Arrays

• Indexing is a mapping from indices to elements

map(array_name, index_value_list) an element

Page 12: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

12

Arrays

• Subscript Types:– FORTRAN, C - integer only– Pascal - any ordinal type (integer, boolean,

char, enum)– Ada - integer or enum (includes boolean and

char)– Java - integer types only

Page 13: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

13

Arrays

• Categories of arrays (based on subscript binding and binding to storage)1. Static - range of subscripts and

storage bindings are static e.g. FORTRAN 77, some arrays in Ada– Advantage: execution efficiency (no

allocation or deallocation)

Page 14: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

14

Arrays

2. Fixed stack dynamic - range of subscripts is statically bound, but storage is bound at elaboration time (at function call time)– e.g. Most Java locals, and C locals that are

not static– Advantage: space efficiency

Page 15: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

15

Arrays

3. Stack-dynamic - range and storage are dynamic (decided at run time), but fixed after initial creation on for the variable’s lifetime– e.g. Ada declare blocks

declare STUFF : array (1..N) of FLOAT; begin ... end;

– Advantage: flexibility - size need not be known until the array is about to be used

Page 16: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

16

Arrays

4. Heap-dynamic – stored on heap and sizes decided at run time.

e.g. (FORTRAN 90) INTEGER, ALLOCATABLE, ARRAY (:,:) :: MAT (Declares MAT to be a dynamic 2-dim array) ALLOCATE (MAT (10,NUMBER_OF_COLS)) (Allocates MAT to have 10 rows and NUMBER_OF_COLS columns) DEALLOCATE MAT (Deallocates MAT’s storage)

Page 17: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

17

Arrays

4. Heap-dynamic (continued)– Truly dynamic: In APL, Perl, and JavaScript,

arrays grow and shrink as needed– Fixed dynamic: Java, once you declare the

size, it doesn’t change

Page 18: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

18

Arrays

• Number of subscripts– FORTRAN I allowed up to three– FORTRAN 77 allows up to seven– Others - no limit

• Array Initialization – Usually just a list of values that are put in

the array in the order in which the array elements are stored in memory

Page 19: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

19

Arrays

• Array Operations1. APL - many, see book (p. 240-241)2. Ada– Assignment; RHS can be an aggregate

constant or an array name– Catenation; for all single-dimensioned

arrays– Relational operators (= and /= only)3. FORTRAN 90– Intrinsics (subprograms) for a wide variety

of array operations (e.g., matrix multiplication, vector dot product)

Page 20: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

20

Arrays

• Slices– A slice is some substructure of an array;

nothing more than a referencing mechanism– Slices are only useful in languages that

have array operations

Page 21: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

21

Arrays

• Slice Examples:1. FORTRAN 90

INTEGER MAT (1:4, 1:4) MAT(1:4, 1) - the first column MAT(2, 1:4) - the second row

Page 22: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

22

Example Slices in FORTRAN 90

Page 23: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

23

Arrays

• Implementation of Arrays– Access function maps subscript expressions

to an address in the array – Row major (by rows) or column major order

(by columns)

Page 24: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

24

Locating an Element

Page 25: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

25

Compile-Time Descriptors

Single-dimensioned array Multi-dimensional array

Page 26: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

26

Accessing Formulas – 1D

• Address(A[i]) = StartAddress + (i-lb)*size

= VirtualOrigin +i*sizelb: lower bound size: number of bytes of one elementVirtual origin allows us to do math once,

so don’t have to repeat each time.You must check for valid subscript before

you use this formula, as obviously, it doesn’t care what subscript you use.

Page 27: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

27

Accessing FormulasMultiple Dimensions In row-major order

• ubi: upper bound in ith dimension

• lbi: lower bound in ith dimension

• lengthi = ubi –lbi +1• In row-major orderAddress(A[i,j]) = StartAddress + size((i-lbi)*lengthj + j-lbj)

=Our goal is to perform as many computations as possible

before run time:=StartAddress + size*i*lengthj –size(lbi * lengthj) + size*j -

size*lbj

= VO + i*multi + j*multj

Page 28: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

28

Address(A[i,j]) = StartAddress + size((i-lbi)*lengthj + j-lbj)

= VO + i*multi + j*multj = 40 +28i+20j

For Example: array of floats A[0..6, 3..7] beginning at location 100

StartAddress = 100size = 4 (if floats take 4 bytes)lbi = 0 ubi = 6 lengthi = 7

lbj = 3 ubj = 7 lengthj = 5

VO = 100 + 4*(-3)*5 = 40multi = 28

multj = 20

VO 40

lbi 0

ubi 6

multi 28

lbj 3

ubj 7

multj 20

repeated for each dimension

Page 29: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

29

Accessing FormulasMultiple Dimensions

• In column-major orderAddress(A[i,j]) = StartAddress + size((i-lbi) + (j-lbj)*lengthi)• In 3D in row major:Addr(A[I,j,k]) = StartAddress + size*((i-lbi)*lengthj*lengthk) + (j-

lbj)lengthk + k-lbk)

Page 30: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

30

Accessing Formulas Slices

• Suppose we want only the second row of our previous example. We would need array descriptor to look like a normal 1D array (as when you pass the slice, the receiving function can’t be expected to treat it any differently)

Page 31: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

31

Address(A[i,j]) = StartAddress + size((i-lbi)*lengthj + j-lbj)

= VO + i*multi + j*multj = 40 +28i+20j

For Example: array of floats A[0..6, 3..7] beginning at location 100

If we want only the second row, it is like we have hardcoded the j=2 in the accessing formula,

A[0..6,2]The accessing formula is simpleJust replace j with 2, adjust the VO, and removethe ub,lb, and length associated with j

so 40 +28i+20j = 80+28iand the table is changedaccordingly (and looks justlike the descriptor fora regular 1D array)

VO 80

lbi 0

ubi 6

multi 28

Page 32: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

32

Constructors Union

• Cartesian products – conjunction of fields

• Union – disjunction of fields• Discriminated or undiscriminated

Page 33: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

33

• Pascal Variant Record – discriminated union

type address_range = 0..maxint; address_type = (absolute, offset); safe_address = record case kind: address_type of absolute: (abs_addr:address_range); offset: (off_addr: integer); end

Page 34: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

34

References

• A reference is the address of an object under the control of the system – which cannot be used as a value or operated on

• C++ is perhaps the only language where pointers and references exist together.

• References in C++ are constant pointers that are dereferenced everytime they are used.

Page 35: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

35

Constructors Pointer and Recursive Types

• Some languages (Pascal, Ada) require pointers to be typed

• PL/1 treated pointers as untyped data objectsWhat is the significance of this for a type checker?

• C pointers are typed but C allows arithmetic operations on them unlike Pascal and Ada

Page 36: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

36

Type Equivalence

• When are two types the same• Structural equivalence• Declaration equivalence• Name equivalence

Page 37: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

37

Structural Equivalence

• Two types are the same if they have the same structure

• i.e. they are constructed in exactly the same way using the same type constructors from the same simple types

• May look alike even when we wanted them to be treated as different.

Page 38: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

38

Structural Type Equivalence

typedef int anarray[10];typedef struct { anarray x; int y;}struct1;

typedef struct { int x[10]; int y;}struct2;

typedef int anarray[10];typedef struct { anarray a; int b;}struct3;

typedef int anarray[10];typedef struct { int b;

anarray a;}struct4;

(Note we are just using the syntax of C as an example. C does NOT use structural equivalence for structs

Page 39: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

39

Structural Equivalence• Consider…

• Dynamic arrays

Type array1 = array[-1..9] of integer; array2 = array[0..10] of integer;

Array (INTEGER range <>) of INTEGER

Page 40: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

40

Name Equivalence

• Two name types are equivalent only if they have the exact same type name

• Name equivalence in Ada and C• ar1 and ar2 are not considered name equivalent

typedef int ar1[10];typedef ar1 ar2;typedef int age;

type ar1 is array (INTEGER range1..10) of INTEGER;type ar2 is new ar1;type age is new INTEGER;

Page 41: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

41

Name equivalence…

v1: ar1;v2: ar1;v3: ar2;

v4: array (INTEGER range 1..100) of INTEGER;v5: array (INTEGER range 1..100) of INTEGER;v4 and v4 cannot be name equivalent, as there is no name.

v6,v7: array (INTEGER range 1..100) of INTEGER;

Page 42: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

42

Declaration Equivalent

• Lead back to the same original structure declaration via a series of redeclarations

type t1 = array [1..10] of integer; t2 = t1; t3 = t2;These are the same typetype t4 = array [1..10] of integer; t5 = array [1..10] of integer;These are different types.

Page 43: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

43

Type Checking• Involves the application of a type equivalence

algorithm to expressions and statements to determine if they make sense

• Any attempt to manipulate a data object with an illegal operation is a type error

• Program is said to be type safe (or type secure) if guaranteed to have no type errors

• Static versus dynamic type checking• Run time errors

Page 44: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

44

Type Checking…• Strong typing and type checking

– Strong guarantees type safety• A language is strongly typed if its type system

guarantees statically (as far as possible) that no data-corrupting errors can occur during execution.

• Statically typed versus dynamically typed– Static (type of every program expression be known

at compile time)• All variables are declared with an associated type• All operations are specified by stating the types of

the required operands and the type of the result• Java is strongly typed• Python is dynamic typed and strong typed

Page 45: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

45

Strongly typed• /* Ruby code */

a="this"• b = a+1• p b• : can't convert Fixnum into String (TypeError)• a is of str type. • In the second line, we're attempting to add 2 to a

variable of str type. • A TypeError is returned, indicating that a str object

cannot be concatenated with an int object. This is what characterizes strong typed languages: variables are bound to a particular data type.

Page 46: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

46

Weakly Typed

• /* PHP code */ <?php $foo = "x"; $foo = $foo + 2; // not an error echo $foo; ?>

• In this example, foo is initially a string type. In the second line, we add this string variable to 2, an integer. This is permitted in PHP, and is characteristic of all weak typed languages.

Page 47: 1 Chapter 6 Data Types What is a data type? A set of values versus A set of values + set of operations on those values

47

Type Conversions

• Def: A mixed-mode expression is one that has operands of different types

• Def: A coercion is an implicit type conversion• The disadvantage of coercions:

– They decrease in the type error detection ability of the compiler

• In most languages, all numeric types are coerced in expressions, using widening conversions

• In Ada, there are virtually no coercions in expressions