structure of programming language data types. 2 a data type defines a collection of data objects and...

45
Structure of Programming Language Data Types

Upload: janis-roberts

Post on 17-Jan-2016

263 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

Structure of Programming Language

Data Types

Page 2: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

2

Data Types

• A data type defines a collection of data objects

and a set of predefined operations on those

objects

• An object represents an instance of a user-

defined (abstract data) type

Page 3: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

3

Why have Types?

• Types provide context for operations

– a+b what kind of addition?

– pointer p = new object how much space?

• Limit valid set of operations

– int a = “string” know error before run time

Page 4: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

4

Defining/Declaring a Type

• Fortran: variable’s type known from name

– Letters I-N = integer type

– Otherwise = real (floating-point) type

• Scheme, Smalltalk: run-time typing

• Most languages (such as C++): user explicitly

declares type

– int a, char b

Page 5: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

5

Primitive Data Types

• Almost all programming languages provide a set

of primitive data types

• Primitive data types: Those not defined in terms

of other data types

• Some primitive data types are merely reflections

of the hardware

• Others require little non-hardware support

Page 6: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

Pre-defined types

• Built-in types: – E.g. int, boolean, float, double in C/Java

• Types of literals– 0,1,… (integer)– true:boolean, false:boolean– 0.0:float, 1.745: float (or double??)– 0.0d : double, 1.745d: double– “Hello Mars!” : string

Page 7: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

Pre-defined types• Types of built-in operators

– +, -, *, /, % : integer,integer integer– +, - : integer integer– <, <=, ==, !=, >, >= : integer,integer bool– &, |, ^ : bool,bool bool– ! : bool bool– [ ] : array(int), int int – [ ] : array(float), int float– …– [ ]= : array(float), int, float array(float)

Page 8: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

8

Primitive Data Types: Integer

• Almost always an exact reflection of the hardware so the mapping is trivial

• There may be as many as eight different integer types in a language

• Java’s signed integer sizes: byte, short, int, long

Page 9: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

9

Primitive Data Types: Floating Point

• Model real numbers, but only as approximations• Languages for scientific use support at least two

floating-point types (e.g., float and double; sometimes more

• Usually exactly like the hardware, but not always• IEEE Floating-Point

Standard 754

Page 10: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

Range of numbers• Normalized (positive range; negative is symmetric)

• Unnormalized

00000000100000000000000000000000 +2-126× (1+0) = 2-126

01111111011111111111111111111111 +2127× (2-2-23)

smallest

largest

smallest

largest

00000000000000000000000000000001 +2-126× (2-23) = 2-149

00000000011111111111111111111111 +2-126× (1-2-23)

0 2-149 2-126(1-2-23)

2-126 2127(2-2-23)

Positive underflow Positive overflow

10

Page 11: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

11

Primitive Data Types: Decimal

• For business applications (money)

– Essential to COBOL

– C# offers a decimal data type

• Store a fixed number of decimal digits

• Advantage: accuracy

Page 12: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

12

Primitive Data Types: Boolean

• Simplest of all

• Range of values: two elements, one for “true”

and one for “false”

• Could be implemented as bits, but often as bytes

Page 13: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

13

Primitive Data Types: Character

• Stored as numeric codings

• Most commonly used coding: ASCII

• An alternative, 16-bit coding: Unicode

– Includes characters from most natural languages

– Originally used in Java

– C# and JavaScript also support Unicode

Page 14: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

14

Character String Types

• Values are sequences of characters

• Design issues:

– Is it a primitive type or just a special kind of array?

– Should the length of strings be static or dynamic?

Page 15: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

15

Character String Types Operations

• Typical operations:– Assignment and copying– Comparison (=, >, etc.) – Catenation– Substring reference– Pattern matching

Page 16: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

16

Character String Type in Certain Languages

• C and C++– Not primitive– Use char arrays and a library of functions that

provide operations

• SNOBOL4 (a string manipulation language)– Primitive– Many operations, including elaborate pattern

matching

• Java– Primitive via the String class

Page 17: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

17

Character String Length Options

• Static: COBOL, Java’s String class• Limited Dynamic Length: C and C++

– In C-based language, a special character is used to indicate the end of a string’s characters, rather than maintaining the length

• Dynamic (no maximum): SNOBOL4, Perl, JavaScript

• Ada supports all three string length options

Page 18: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

18

Records

• Also known as ‘structs’ and ‘types’.– C

struct resident {

char initials[2];

int ss_number;

bool married;

};

• fields – the components of a record, usually referred to using dot notation.

Page 19: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

19

Nesting Records

• Most languages allow records to be nested within each other.– Pascal

type two_chars = array [1..2] of char;

type married_resident = record

initials: two_chars;

ss_number: integer;

incomes: record

husband_income: integer;

wife_income: integer;

end;

end;

Page 20: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

20

Definition of Records

• COBOL uses level numbers to show nested records; others use recursive definition

• Record Field References1. COBOL

field_name OF record_name_1 OF ... OF record_name_n

2. Others (dot notation)

record_name_1.record_name_2. ... record_name_n.field_name

Page 21: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

21

Definition of Records in COBOL

• COBOL uses level numbers to show nested records; others use recursive definition01 EMP-REC.

02 EMP-NAME.

05 FIRST PIC X(20).

05 MID PIC X(10).

05 LAST PIC X(20).

02 HOURLY-RATE PIC 99V99.

Page 22: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

22

Definition of Records in Ada• Record structures are indicated in an orthogonal

way

type Emp_Rec_Type is record

First: String (1..20);

Mid: String (1..10);

Last: String (1..20);

Hourly_Rate: Float;

end record;

Emp_Rec: Emp_Rec_Type;

Page 23: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

23

Operations on Records

• Assignment is very common if the types are identical

• Ada allows record comparison• Ada records can be initialized with aggregate

literals• COBOL provides MOVE CORRESPONDING

– Copies a field of the source record to the corresponding field in the target record

Page 24: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

24

Sets

• Introduced by Pascal, found in most recent languages as well.

• Common implementation uses a bit vector to denote “is a member of”.– Example:

U = {‘a’, ‘b’, …, ‘g’}A = {‘a’, ‘c’, ‘e’, ‘g’} = 1010101

• Hash tables needed for larger implementations. – Set of integers = (232 values) / 8 = 536,870,912 bytes

Page 25: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

25

Enumerations

• enumeration – set of named elements

– Values are usually ordered, can compare

enum weekday {sun,mon,tue,wed,thu,fri,sat}

if (myVarToday > mon) { . . . }

• Can choose ordering in C

enum weekday {mon=0,tue=1,wed=2…}

Page 26: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

26

Enumeration Types

• All possible values, which are named constants, are provided in the definition

• C# exampleenum days {mon, tue, wed, thu, fri, sat, sun};

• Design issues– Is an enumeration constant allowed to appear in more

than one type definition, and if so, how is the type of an occurrence of that constant checked?

– Are enumeration values coerced to integer?– Any other type coerced to an enumeration type?

Page 27: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

27

Array Types

• An array is an aggregate of homogeneous data

elements in which an individual element is

identified by its position in the aggregate,

relative to the first element.

Page 28: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

28

Array Indexing• Indexing (or subscripting) is a mapping from

indices to elements

array_name (index_value_list) an element

• Index Syntax

– FORTRAN, PL/I, Ada use parentheses

• Ada explicitly uses parentheses to show uniformity between

array references and function calls because both are mappings

– Most other languages use brackets

Page 29: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

29

Arrays Index (Subscript) Types

• FORTRAN, C: integer only• Pascal: any ordinal type (integer, Boolean, char,

enumeration)• Ada: integer or enumeration (includes Boolean

and char)• Java: integer types only• C, C++, Perl, and Fortran do not specify range

checking• Java, ML, C# specify range checking

Page 30: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

30

Array Dimensions

• C uses 0 -> (n-1) as the array bounds.– float values[10]; // ‘values’ goes from 0 -> 9

• Fortran uses 1 -> n as the array bounds.– real(10) values ! ‘values’ goes from 1 -> 10

• Some languages let the programmer define the array bounds.– var values: array [3..12] of real;

(* ‘values’ goes from 3 -> 12 *)

Page 31: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

31

Array Initialization

• Some language allow initialization at the time of storage allocation– C, C++, Java, C# example

int list [] = {4, 5, 7, 83} – Character strings in C and C++

char name [] = “freddie”;– Arrays of strings in C and C++

char *names [] = {“Bob”, “Jake”, “Joe”];– Java initialization of String objects

String[] names = {“Bob”, “Jake”, “Joe”};

Page 32: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

32

Arrays Operations• APL provides the most powerful array

processing operations for vectors and matrixes as well as unary operators (for example, to reverse column elements)

• Ada allows array assignment but also catenation

• Fortran provides elemental operations because they are between pairs of array elements– For example, + operator between two arrays results

in an array of the sums of the element pairs of the two arrays

Page 33: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

33

Rectangular and Jagged Arrays

• A rectangular array is a multi-dimensioned array in which all of the rows have the same number of elements and all columns have the same number of elements

• A jagged matrix has rows with varying number of elements– Possible when multi-dimensioned arrays actually

appear as arrays of arrays

Page 34: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

34

Accessing Multi-dimensioned Arrays

• Two common ways:– Row major order (by rows) – used in most languages– column major order (by columns) – used in Fortran

Page 35: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

35

Memory Layout Options

• Ordering of array elements can be accomplished in two ways:– row-major order – Elements travel across rows, then

across columns.– column-major order – Elements travel across

columns, then across rows.

Row-major Column-major

Page 36: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

36

Pointer and Reference Types

• A pointer type variable has a range of values that consists of memory addresses and a special value, nil

• Provide the power of indirect addressing• Provide a way to manage dynamic memory• A pointer can be used to access a location in the

area where storage is dynamically created (usually called a heap)

Page 37: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

37

Pointer Operations

• Two fundamental operations: assignment and dereferencing

• Assignment is used to set a pointer variable’s value to some useful address

• Dereferencing yields the value stored at the location represented by the pointer’s value– Dereferencing can be explicit or implicit– C++ uses an explicit operation via *j = *ptrsets j to the value located at ptr

Page 38: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

38

Pointer Assignment Illustrated

The assignment operation j = *ptr

Page 39: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

39

Pointer Arithmetic in C and C++

float stuff[100];float *p;p = stuff;

*(p+5) is equivalent to stuff[5] and p[5]*(p+i) is equivalent to stuff[i] and p[i]

Page 40: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

40

Pointers in Fortran 95

• Pointers can only point to variables that have the TARGET attribute

• The TARGET attribute is assigned in the declaration:

INTEGER, TARGET :: NODE

Page 41: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

41

Reference Types

• C++ includes a special kind of pointer type called a reference type that is used primarily for formal parameters– Advantages of both pass-by-reference and pass-by-

value

• Java extends C++’s reference variables and allows them to replace pointers entirely– References refer to call instances

• C# includes both the references of Java and the pointers of C++

Page 42: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

42

Garbage Collection

• Language implementation notices when objects are no longer useful and reclaims them automatically– essential for functional languages– trend for imperative languages

• When is object no longer useful?– Reference counts– Mark and sweep– “Conservative” collection

Page 43: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

43

Mark-and-Sweep

• Idea… when space low1. Mark every block “useless”2. Beginning with pointers outside the heap,

recursively explore all linked data structures and mark each traversed as useful

3. Return still marked blocks to freelist

• Must identify pointers– in every block– use type descriptors

Page 44: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

• Consider the grammar:• <nested> ”(” <nested> ”)” • | ”[” <nested> ”]”• | " "• (a) Is the sentence ([]) syntactically correct?

Using shift reduce method • (b) Is the sentence ([)] syntactically correct?

Using LL method • (c) Can this grammar is used for top down

parser? Why? How it can be modified to use if not?

44

Page 45: Structure of Programming Language Data Types. 2 A data type defines a collection of data objects and a set of predefined operations on those objects An

Given the following grammar:

 

<assign> ::= id = <expression>

<expression> ::= <term> | <expression>+<term> | <expression>-<term>

<term> ::= <factor> | <term> * <factor> | <term> / <factor>

<factor> ::= id | integer | (<expression>)

 

Using LL(1) draw the parse tree for:

B = A / (B+G) – D

45