c prog linux.pdf


  • Systems Programming
    00. Introduction

    Alexander Holupirek

    Database and Information Systems Group
    Department of Computer & Information Science

    University of Konstanz

    Summer Term 2008

    Welcome All!

    © 1995 United Feature Syndicate, Inc. (NYC), [email protected]

    Visiting Card

    Alexander Holupirek

    [email protected]

    http://www.inf.uni-konstanz.de/~holupire

    88 4440 E 217

    - E-mail is the best way to reach me.
    - You are welcome in my office whenever you have a question
      (no need to make an appointment first).

    Your Tutors

    Jochen Oekonomopulos

    [email protected]

    Enrolled in master studies Information Engineering

    V 504

    Tuesday, 18:00-19:30, Room C 252/Computer Pool

    Thomas Zink

    [email protected]

    Enrolled in master studies Information Engineering

    V 504

    Friday, 12:00-13:30, Room D 406/Computer Pool

    4

  • Tutorial Groups

    Subversion Repository for the Tutorial

    - We have set up a version control system for the tutorials.
    - Please use it to commit your solutions to the assignments.
    - Source code from the lecture is available in the /pub directory.
    - Once registered for the tutorial, you will receive your credentials.

    Command line approach to check out the repository

    $ svn --username holu --password XXXX co \
          svn://phobos29.inf.uni-konstanz.de/sys_S08

    How You Will Benefit

    Assignments & Tutorials

    - Work on the weekly assignments.
    - Hand them in on time.
    - Jochen and Thomas will review them.
    - Attend the tutorials and the discussion of solutions.

    How You Will Benefit (cont.)

    Lecture Material

    - Use the material provided on the course website to prepare for the lectures.
    - Don't hesitate to ask questions.
    - Let me know if I can improve the lecture material and/or its presentation.

    How You Will Benefit (cont.)

    Account & Mailinglist

    - Use the Account Tool to register for the course.
    - You will automagically become a member of the mailing list
      sys [email protected].
    - Feel free to post and discuss problems, questions, and comments on that list.
    - Make sure to receive the e-mails. [1]
    - Any information about changes etc. will be posted there.

    [1] These are sent to @inf.uni-konstanz.de

  • How You Will Benefit (cont.)

    Examination and Credits

    - Register for the course (via StudIS) within 4 weeks.
    - Pass the examination at the end of the semester.
    - Examination dates:
      - July 18th, 12:00 - 13:30, D 406
      - October 17th, 12:00 - 13:30, D 406
    - 6 ECTS, Informatik der Systeme

    Have fun!

    Organizational Matters

    Website for this course

    - Please check this site regularly for the latest information.

    http://www.inf.uni-konstanz.de/dbis/teaching/ss08/sys/

    Schedule (OK for everybody?)

    - Monday, 18:00-19:30, Room C 252
    - Tuesday, 18:00-19:30, Room D 247/Computer Pool

    Literature

    The IEEE and The Open Group.
    Single UNIX Specification, Version 3, 2004 Edition.
    http://www.unix.org/single_unix_specification/

    Brian W. Kernighan, Dennis M. Ritchie.
    The C Programming Language.
    ISBN 0-13-110370-9, 1988, 41st Printing.
    Prentice Hall Software Series

    W. Richard Stevens, Stephen A. Rago.
    Advanced Programming in the UNIX Environment.
    ISBN 978-0201433074
    Addison-Wesley Professional; 2nd edition (June 27, 2005)

  • What Is This Course About?

    Systems Programming

    - By "systems" we mean operating systems.
    - By "programming" we mean using the interface an operating system (OS) provides.
    - By "OS" we mean UNIX-like OSs.

    Operating System

    - Layer of software on top of bare hardware.
    - Shields programmers from the complexity of the hardware.
    - Presents an interface (of a virtual machine) that is easier to understand and program.

    The UNIX System Interface

    The UNIX operating system provides its services through a set of system calls, which are in effect functions within the operating system that may be called by user programs.

    - Syscalls provide a direct interface to the kernel.
      - Employed for maximum efficiency.
      - Access some facility that is not in the libraries.
    - The service calls available in the interface vary from OS to OS; however, the underlying concepts tend to be similar.
    - The ISO C library is (in many cases) modeled on UNIX facilities.

    Standardization Of The UNIX System Interface

    During the 1980s the proliferation of UNIX versions and the differences between them led many large users (such as the U.S. government) to call for standardization.

    - Among others, ANSI [2] C and IEEE [3] POSIX emerged
    - POSIX stands for Portable Operating System Interface
    - POSIX refers to a family of related standards [4]
    - POSIX was originally used as a synonym for IEEE Std 1003.1-1988
    - POSIX.1 emerged as the preferred term
    - The latest version of POSIX.1 was published on April 30th, 2004.
    - It is called IEEE Std 1003.1, 2004 Edition (POSIX.1)

    [2] American National Standards Institute
    [3] Institute of Electrical and Electronics Engineers
    [4] IEEE Std 1003.n (where n is a number) and the parts of ISO/IEC 9945

    Systems Programming With POSIX.1

    [Figure: POSIX.1 as interface to UNIX OSs. Layers: application using the API; POSIX.1 system call interface; OS as black box]

  • Systems vs. Kernel Programming.

    - The black-box model is suitable for systems programming.
    - Knowledge about the system's internals, however, is beneficial to use the system properly and to not work against it.
    - Providing the system services is (mostly) kernel programming.

    [Figure: Black vs. White Box View of a UNIX System. Black box view: application using the API; POSIX.1 system call interface; OS as black box. White box view: application using the API; POSIX.1 system call interface; OS kernel]

    The Joint Standard

    The latest version of POSIX.1 has been jointly developed by the IEEE and The Open Group [5]. As such it is both an IEEE Standard and an Open Group Technical Standard:

    - IEEE Std 1003.1, 2004 Edition
    - The Open Group Technical Standard Base Specifications, Issue 6
    - It is also an international standard, ISO/IEC 9945:2003

    [5] http://www.opengroup.org/overview/members/membership_list.htm

    The Single UNIX Specification, Version 3

    The standard is published free of charge on the web [6] as

    The Single UNIX Specification, Version 3, 2004 Edition

    "Conceptually, this standard describes a set of fundamental services needed for the efficient construction of application programs. Access to these services has been provided by defining an interface, using the C programming language, a command interpreter, and common utility programs that establish standard semantics and syntax."

    [IEEE/The Open Group, 2004, Preface]

    [6] http://www.unix.org/single_unix_specification/

    The Single UNIX Specification (SUSv3)

    The document is broken into four parts:

    - Part 1: Base Definitions (XBD)
    - Part 2: System Interfaces (XSH)
    - Part 3: Shell and Utilities (XCU)
    - Part 4: Rationale

    The System Interfaces volume (XSH) [7] describes a set of system interfaces offered to application programs by systems conformant to this part of the Single UNIX Specification. Readers are expected to be experienced C language programmers.

    http://www.opengroup.org/onlinepubs/009695399/functions/contents.html

    [7] http://www.unix.org/version3/xsh_contents.html

  • Part 2: System Interfaces Volume (XSH)

    Because POSIX.1 specifies an interface and not an implementation, no distinction is made between system calls and library functions.

    Example

    System Interface Table. Lists 1123 interfaces.

    http://www.opengroup.org/onlinepubs/009695399/functions/atoi.html

    http://www.opengroup.org/onlinepubs/009695399/functions/read.html

    UNIX Architecture

    [Figure: UNIX architecture, showing the kernel, system calls, shell, library routines, and applications]

    System Calls - Section 2

    The system call interface has traditionally been documented in Section 2 of the UNIX Programmer's Manual.

    1      General commands (tools and utilities).
    2      System calls and error numbers.
    3      Libraries.
    3p     perl(1) programmers reference guide.
    4      Device drivers.
    5      File formats.
    6      Games.
    7      Miscellaneous.
    8      System maintenance and operation commands.
    9      Kernel internals.
    X11    An alias for X11R6.
    X11R6  X Window System.
    local  Pages located in /usr/local.

    man(1) on OpenBSD

    System Calls - Section 2

    The system call interface has traditionally been documented in Section 2 of the UNIX Programmer's Manual.

    0  Header files (usually found in /usr/include)
    1  Executable programs or shell commands
    2  System calls (functions provided by the kernel)
    3  Library calls (functions within program libraries)
    4  Special files (usually found in /dev)
    5  File formats and conventions, e.g. /etc/passwd
    6  Games
    7  Miscellaneous (including macro packages and conventions), e.g. man(7), groff(7)
    8  System administration commands (usually only for root)
    9  Kernel routines [Non standard]

    man(1) on Linux

  • System Call Definition & C Library Functions

    - The system call interface is defined in the C language [8].
    - A standard technique on UNIX systems is for each system call to have a function of the same name in the Standard C Library.
    - Those functions invoke the appropriate kernel service, using whatever technique is required on the system.
      - The function may put one or more of the C arguments into general registers and then execute some machine instruction that generates a software interrupt in the kernel.
    - We can consider the system calls as being C functions.

    [8] Regardless of the actual implementation technique used to invoke a call

    Library Calls - Section 3

    - Section 3 of the UNIX Programmer's Manual defines the general-purpose functions available to programmers.
    - These functions are not entry points into the kernel.
      - They may use the kernel's system calls, however.
      - printf(3): may invoke write(2) to perform output.
      - atoi(3) (convert ASCII string to integer): does not involve the OS at all.
    - Implementor's view (kernel programming): the distinction between system call and library function is fundamental.
    - User's perspective (systems programming): not as critical; both exist to provide services for application programs, but . . .

    System Calls vs. Library Calls

    Example to illustrate the difference: current time and date

    - Some OSs have one syscall to return the time and another to return the date. Special handling (switching to or from daylight saving time) is done by the kernel or requires human intervention.
    - UNIX provides one syscall (gettimeofday(2)) that returns the number of seconds (and microseconds) since the Epoch. [9]
    - Any interpretation (local time zone, converting to human-readable time) is left to the user process.
    - Syscalls usually provide a minimal interface, while library functions often provide more elaborate functionality.

    [9] midnight, January 1, 1970, Coordinated Universal Time

    Essentials

    - Good knowledge of C.
    - Knowledge about the services an OS provides:
      - system calls.
      - C libraries.
    - Some knowledge about kernel internals.
    - Some knowledge about operating system concepts.
    - Some knowledge about the underlying hardware.

  • Systems Programming
    01. The C Programming Language

    A Tutorial Introduction

    Variables and Arithmetic Expressions

    Character Input and Output

    Arrays

    Functions

    Call by Value, Call by Reference

    Character Arrays

    Variables, Declarations and Scope

    30

    Schedule For Today: A First Glance At C

    - Quick introduction
    - Show essential elements of the language
    - No details, rules, and exceptions
    - Provide examples
    - Show the basics, such as
      - variables and constants
      - arithmetic
      - control flow
      - functions
      - rudiments of input and output
    - Leave out anything else, such as
      - pointers
      - structures
      - standard library

    The First Program Is Always The Same

    Print the words: Hello, world

    Not that easy, because you have to:

    - Create the program text
    - Compile it successfully
    - Run it
    - Get the output

    #include <stdio.h>

    int
    main(void)
    {
        printf("Hello, world\n");
        return (0);
    }

  • Compilation On A UNIX-like OS

    $ cc -Wall hello.c
    $ ls
    hello.c a.out
    $ ./a.out
    Hello, world
    $

    engine        filename  description
                  hello.c   source code
    preprocessor  hello.i   source w/ preproc. directives expanded
    compiler      hello.s   assembler code
    assembler     hello.o   object code ready to be linked
    linker        a.out     executable

    C Programs

    Basic building blocks

    - functions
    - statements
    - variables
    - arguments

    - functions contain statements
    - statements specify computing operations to be done
    - variables store values used during computation
    - arguments are (one way) to communicate data between functions

    Building Blocks Of Our Example

    - A function called main
    - Liberty to name functions whatever you like, but . . .
    - main is special: a program begins execution at the beginning of main
    - Every program must have a main somewhere
    - main will usually call other functions to help perform its job
      - Functions that you wrote
      - Functions that are provided for you, e.g. printf

    Some Explanations About The Program Itself

    1 #include <stdio.h>
    2
    3 int
    4 main(void)
    5 {
    6     printf("Hello, world\n");
    7     return (0);
    8 }

    - line 1: tell the compiler to include information about the standard input/output library
    - line 3/4: define a function named main, which receives no argument values. Parentheses after the function name surround the argument list (here empty). Returns an int.
    - line 5/8: the statements of main are enclosed in braces
    - line 6: main calls the library function printf, which prints this sequence of characters; \n represents the newline character.

  • Line 6: Print A String

    - A function is called by naming it, followed by a parenthesized list of arguments:

          printf("Hello world\n");

      calls the function printf with the argument

          "Hello world\n"

    - printf is a library function that prints output (in this case the string of characters between the quotes)

    Character String/String Constant

    - A sequence of characters in double quotes is called a character string or string constant
    - The sequence \n stands for the newline character, which when printed advances the output to the left margin of the next line
    - We have to use \n to include a newline character with printf

    printf("Hello, world
    ");

    $ cc hello.c
    hello.c:6:16: missing terminating " character
    hello.c:7:9: missing terminating " character
    hello.c: In function main:
    hello.c:8: error: syntax error before "return"

    Printing Hello, world

    - printf never supplies a newline automatically
    - so several calls can build up an output line in stages
    - our first program could just as well have been written like below to produce identical output

    #include <stdio.h>

    int
    main(void)
    {
        printf("Hello, ");
        printf("world");
        printf("\n");
        return (0);
    }

    Escape Sequences

    - Notice that \n represents only a single character
    - An escape sequence like \n provides a general and extensible mechanism for hard-to-type or invisible characters.

    \a  alert (bell) character    \\    backslash
    \b  backspace                 \?    question mark
    \f  formfeed                  \'    single quote
    \n  newline                   \"    double quote
    \r  carriage return           \ooo  octal number
    \t  horizontal tab            \xhh  hexadecimal number
    \v  vertical tab

    Table: The complete set of escape sequences


    Fahrenheit-Celsius: C = (5/9)(F - 32)

    #include <stdio.h>
    /* print fahrenheit-celsius table
       for fahrenheit = 0, 20, ..., 300 */
    int
    main(void)
    {
        int fahr, celsius;
        int lower, upper, step;

        lower = 0;      /* lower limit */
        upper = 300;    /* upper limit */
        step = 20;      /* step size */

        fahr = lower;
        while (fahr <= upper) {
            celsius = 5 * (fahr - 32) / 9;
            printf("%d\t%d\n", fahr, celsius);
            fahr = fahr + step;
        }
        return (0);
    }

  • Data Types And Sizes

    Sizes are machine-dependent

    - Each compiler is free to choose appropriate sizes for its own hardware. ISO C defines compile-time limits.
    - short and int are at least 16 bits
    - long is at least 32 bits
    - short is no longer than int, int is no longer than long
    - Numerical limits [10] are documented in <limits.h> and <float.h>. Additional limits are specified in <stdint.h> [11]

    Assignment

    [10] ISO C99 : 7.10/5.2.4.2 : Numerical limits
    [11] ISO C99 : 7.18 : Integer Types

    The while Loop

    Each line in the result table is computed the same way:

    while (fahr <= upper) {
        celsius = 5 * (fahr - 32) / 9;
        printf("%d\t%d\n", fahr, celsius);
        fahr = fahr + step;
    }

  • Fahrenheit-Celsius Converter Bug List

    Fixing problems

    - Pretty printing: right-justified output
    - Switch from integer to floating-point arithmetic

    Construct a patch for the changes using diff(1)

    NAME
        diff - compare files line by line
    SYNOPSIS
        diff [OPTION]... FILES
    DESCRIPTION
        Compare files line by line.

        -u  -U NUM  --unified[=NUM]
            Output NUM (default 3) lines of unified context.

        -p  --show-c-function
            Show which C function each change is in.

    $ diff -up fahrenheit_v1.c fahrenheit_v2.c
    --- fahrenheit_v1.c Sat Apr 19 08:58:48 2008
    +++ fahrenheit_v2.c Sat Apr 19 08:58:05 2008
    @@ -4,7 +4,7 @@
     int
     main(void)
     {
    -    int fahr, celsius;
    +    float fahr, celsius;
         int lower, upper, step;

         lower = 0; /* lower limit */
    @@ -13,8 +13,8 @@ main(void)

         fahr = lower;
         while (fahr <= upper) {

  • Printing With printf(3)

    specifier  print as . . .
    %d         decimal integer
    %6d        decimal integer, at least 6 characters wide
    %f         floating point
    %6f        floating point, at least 6 characters wide
    %.2f       floating point, 2 characters after decimal point
    %6.2f      floating point, at least 6 wide and 2 after decimal point

    - Further, printf(3) recognizes %o for octal, %x for hexadecimal, %c for character, %s for string, %p for address (pointer)
    - ISO C : 7.19.6 : Formatted input/output functions

    The for Loop, Fahrenheit-Celsius v3

    #include <stdio.h>
    /* print fahrenheit-celsius table
       for fahrenheit = 0, 20, ..., 300 */
    int
    main(void)
    {
        int fahr;

        for (fahr = 0; fahr <= 300; fahr = fahr + 20)
            printf("%3d %6.1f\n", fahr, (5.0 / 9.0) * (fahr - 32));
        return (0);
    }

  • Character Input And Output

    Processing character data

    - Text I/O is dealt with as streams of characters
    - A text stream is a sequence of characters divided into lines
    - Each line consists of zero or more characters followed by a newline character (regardless of where the stream originates or where it goes to). The library makes each input or output stream conform to this model
    - The standard library provides several functions for reading and writing one character at a time, of which getchar(3) and putchar(3) are the simplest.

    getchar(3) and putchar(3)

    #include <stdio.h>

    int getchar(void);
    int putchar(int c);

    - getchar(3) reads the next input character from a text stream
    - Why does getchar(3) return an int?
      - getchar(3) returns a distinctive value when there is no more input: a value called EOF (end of file) that cannot be confused with any real data. EOF is defined in <stdio.h>
      - The return type must be big enough to hold EOF in addition to any possible char.
    - putchar(3) prints a character each time it is called

    File Copying

    Given getchar(3) and putchar(3) . . .

    . . . we can write a surprising amount of useful code without knowing anything more about input and output

    Copying input to output one character at a time

    read a character
    while (character is not end-of-file indicator)
        output the character just read
        read a character

    File Copying, v1

    read a character
    while (character is not end-of-file indicator)
        output the character just read
        read a character

    #include <stdio.h>

    /* copy input to output, v1 */
    int
    main(void)
    {
        int c;

        c = getchar();
        while (c != EOF) {
            putchar(c);
            c = getchar();
        }

        return (0);
    }

  • File Copying, v2

    - An assignment, such as c = getchar(), is an expression and has a value (the value of the left-hand side after the assignment)
    - An assignment can appear as part of a larger expression

    #include <stdio.h>

    /* copy input to output, v2 */
    int
    main(void)
    {
        int c;

        while ((c = getchar()) != EOF)
            putchar(c);

        return (0);
    }

    Character Counting, v1

    #include <stdio.h>

    /* count characters in input, v1 */
    int
    main(void)
    {
        long nc;

        nc = 0;
        while (getchar() != EOF)
            ++nc;
        printf("%ld\n", nc);

        return (0);
    }

    Character Counting, v2

    #include <stdio.h>

    /* count characters in input, v2 */
    int
    main(void)
    {
        double nc;

        for (nc = 0; getchar() != EOF; ++nc)
            ; /* nothing */
        printf("%.0f\n", nc);

        return (0);
    }

    Line Counting

    - The standard library ensures that an input text stream appears as a sequence of lines, each terminated by a newline

    #include <stdio.h>

    /* count lines in input */
    int
    main(void)
    {
        int c, nl;

        nl = 0;
        while ((c = getchar()) != EOF)
            if (c == '\n')
                ++nl;
        printf("%d\n", nl);

        return (0);
    }

  • Word Counting

    NAME
        wc - word, line, and byte or character count
    SYNOPSIS
        wc [-c | -m] [-hlw] [file ...]
    DESCRIPTION
        The wc utility reads one or more input text files, and,
        by default, writes the number of lines, words, and bytes
        contained in each input file to the standard output

    $ wc /etc/services
    285 1398 9732 /etc/services
    $ cc count_words.c
    $ cat /etc/services | ./a.out
    285 1398 9732

    #include <stdio.h>

    #define IN  1 /* inside a word */
    #define OUT 0 /* outside a word */

    /* count lines, words, and characters in input */
    int
    main(void)
    {
        int c, nl, nw, nc, state;

        state = OUT;
        nl = nw = nc = 0;
        while ((c = getchar()) != EOF) {
            ++nc;
            if (c == '\n')
                ++nl;
            if (c == ' ' || c == '\n' || c == '\t')
                state = OUT;
            else if (state == OUT) {
                state = IN;
                ++nw;
            }
        }
        printf("%d %d %d\n", nl, nw, nc);
        return (0);
    }


    Counting Digits, White Spaces, And The Rest

    Next is an artificial program, which counts the number of occurrences of each digit, of white space characters (blank, tab, newline), and of all other characters.

    It will help us to . . .

    - introduce arrays
    - talk about initialization
    - see that chars are, by definition, just small integers
    - speak about coding conventions

    The output of the program on itself is:

    $ cat count_digits.c | ./a.out
    digits = 10 3 0 0 0 0 0 0 0 1, white space =122, other =361
    $ wc -m count_digits.c
    497 count_digits.c

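    #include <stdio.h>

    /* count digits, white space, others */
    int
    main(void)
    {
        int c, i, nwhite, nother;
        int ndigit[10];

        nwhite = nother = 0;
        for (i = 0; i < 10; ++i)
            ndigit[i] = 0;

        while ((c = getchar()) != EOF)
            if (c >= '0' && c <= '9')
                ++ndigit[c - '0'];
            else if (c == ' ' || c == '\n' || c == '\t')
                ++nwhite;
            else
                ++nother;

        printf("digits =");
        for (i = 0; i < 10; ++i)
            printf(" %d", ndigit[i]);
        printf(", white space =%d, other =%d\n", nwhite, nother);
        return (0);
    }

     1 #include <stdio.h>
     2
     3 int power(int m, int n);
     4
     5 /* test power function */
     6 int
     7 main(void)
     8 {
     9     int i;
    10
    11     for (i = 0; i < 10; ++i)
    12         printf("%d %d %d\n", i, power(2, i), power(-3, i));
    13     return (0);
    14 }
    15
    16 /* power: raise base to n-th power; n >= 0 */
    17 int
    18 power(int base, int n)
    19 {
    20     int i, p;
    21
    22     p = 1;
    23     for (i = 1; i <= n; ++i)
    24         p = p * base;
    25     return p;
    26 }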

  • Function Terminology

    - line 3: function declaration (function prototype); says that power is a function that expects two int arguments and returns an int
    - line 17: function definition; starts with the declaration of the parameter types and names, and the type of the result that the function returns (has to match the prototype)
    - parameter: a variable named in the parenthesized list in a function definition
    - argument: a value used in a call of the function
    - parameter and argument are sometimes referred to as formal and actual argument


    Arguments: Call by Value/Reference

    In C, all function arguments are passed by value

    - The called function is given the values of its arguments in temporary variables rather than the originals
    - The callee can't directly alter a variable in the calling function

    Call by reference is possible

    - The caller must provide the address of the variable to be set (technically a pointer to the variable), and the called function must declare the parameter to be a pointer and access the variable indirectly through it
    - We will discuss pointers in more detail at a later point

    Passing An Array As Argument

    When the name of an array is used as an argument,

    - the value passed to the function is the location or address of the beginning of the array
    - there is no copying of array elements
    - the function can access and alter any element of the array


    Character Arrays

    The most common type of array in C is the array of characters

    longline.c reads a set of text lines and prints the longest

    Program outline:

    while (there is another line)
        if (it's longer than the previous longest)
            save it
            save its length
    print longest line

    Splitting The Program

    The program divides naturally into pieces

    - Function getline fetches the next line of input
      - It has to return a signal about end-of-file
      - We let it return the length of the line, or zero on EOF
      - Zero is appropriate because it is never a valid line length
    - Function copy copies a line to a safe place
    - Function main controls getline and copy

    #include <stdio.h>
    #define MAXLINE 1000 /* maximum input line size */

    int getline(char line[], int maxline);
    void copy(char to[], char from[]);

    The Controlling Function

    /* print longest input line */
    int
    main(void)
    {
        int len;                /* current line length */
        int max;                /* maximum length seen so far */
        char line[MAXLINE];     /* current input line */
        char longest[MAXLINE];  /* longest line saved here */

        max = 0;
        while ((len = getline(line, MAXLINE)) > 0)
            if (len > max) {
                max = len;
                copy(longest, line);
            }
        if (max > 0) /* there was a line */
            printf("%s", longest);
        return (0);
    }

  • Getting A Line

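    /* getline: read a line into s, return length */
    int
    getline(char s[], int lim)
    {
        int c, i;

        for (i = 0; i < lim - 1
            && (c = getchar()) != EOF && c != '\n'; ++i)
            s[i] = c;
        if (c == '\n') {
            s[i] = c;
            ++i;
        }
        s[i] = '\0';
        return i;
    }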

  • External Variables

    Global, external variables of a program

    - As an alternative to automatic variables, it is possible to define variables that are external to all functions.
    - Those can be accessed by name by any function
    - Because external variables are globally accessible, they can be used instead of argument lists to communicate data between functions (but, beware!)
    - External variables remain in existence permanently
    - They retain their values even after the functions that set them have returned

    External Variables (cont.)

    Definition and declaration of external variables

    - An external variable must be defined, exactly once, outside a function; this sets aside storage for it.
    - The variable must also be declared in each function that wants to access it; this states the type of the variable.
    - In general: all variables (automatic or extern) must be declared, either explicitly or implicitly from context
    - Definition of a variable refers to the place where the variable is created and assigned storage
    - Declaration of a variable refers to places where the nature of the variable is stated but no storage is allocated

    #include <stdio.h>
    #define MAXLINE 1000 /* maximum input line size */

    int max;                /* maximum length seen so far */
    char line[MAXLINE];     /* current input line */
    char longest[MAXLINE];  /* longest line saved here */

    int getline(void);
    void copy(void);

    /* print longest line; external objects, weak solution */
    int
    main(void)
    {
        int len; /* current line length */
        extern int max;
        extern char longest[];

        max = 0;
        while ((len = getline()) > 0)
            if (len > max) {
                max = len;
                copy();
            }
        if (max > 0) /* there was a line */
            printf("%s", longest);
        return (0);
    }

    int
    getline(void)
    {
        int c, i;
        extern char line[];

        for (i = 0; i < MAXLINE - 1
            && (c = getchar()) != EOF && c != '\n'; ++i)
            line[i] = c;
        if (c == '\n') {
            line[i] = c;
            ++i;
        }
        line[i] = '\0';
        return i;
    }

    void
    copy(void)
    {
        int i;
        extern char line[], longest[];

        i = 0;
        while ((longest[i] = line[i]) != '\0')
            ++i;
    }


  • Terminology: External vs. Internal

    - A C program consists of a set of external objects, which are either variables or functions.

    - Functions are always external, because C does not allow functions to be defined inside other functions.

    - External is used in contrast to internal, which describes the arguments and variables used inside functions.

    - By default, external variables and functions have the property that all references to them by the same name, even from functions compiled separately, are references to the same thing (the standard calls this external linkage). [15]

    [15] We will see later how to define external variables and functions that are visible only within a single source file; once again, the keyword is static.

    Static Internal Variables

    The static declaration can be applied to internal variables

    - Internal static variables are local to a particular function (just like automatic variables), but unlike automatics, they remain in existence across different invocations of the function.

    - This means that internal static variables provide private, permanent storage within a single function.

    void

    f(unsigned int m, long n)

    {

    static int i;

    ...

    }
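The private, permanent storage described above can be sketched with a small counter. The function name next_ticket is hypothetical and not part of the lecture material; the sketch only assumes that a static int without an initializer starts at zero.

```c
#include <assert.h>

/*
 * next_ticket: hypothetical helper, not from the lecture.
 * The counter is invisible outside the function, yet it keeps
 * its value from one call to the next.
 */
int
next_ticket(void)
{
    static int count;   /* static storage: starts at 0, initialized once */

    count++;
    assert(count > 0);  /* holds until the int would overflow */
    return count;
}
```

Calling next_ticket() repeatedly from main() yields increasing values, because count is not re-created on each call the way an automatic variable would be.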


    Static External Variables And Functions

    The static declaration can be applied to external objects

    - Applied to an external variable or function, static limits the scope of that object to the rest of the source file being compiled.

    - It provides a way to hide names that would otherwise be globally visible.

    static char buf[BUFSIZE ];

    static int bufp = 0;

    static void

    f(register unsigned int m, register long n)

    {

    ...

    }
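As a rough illustration of name hiding, the following sketch keeps a small stack private to one source file. The names stack, sp, push, pop, and STACKSIZE are assumptions made for this example; another source file could define its own stack or sp without a clash, because these two have internal linkage.

```c
#include <assert.h>

#define STACKSIZE 8     /* hypothetical capacity, chosen for the example */

static int stack[STACKSIZE];    /* internal linkage: invisible to other files */
static int sp = 0;              /* next free slot in stack */

/* push: store v; returns 0 on success, -1 if the stack is full */
int
push(int v)
{
    if (sp >= STACKSIZE)
        return -1;
    stack[sp++] = v;
    return 0;
}

/* pop: remove and return the most recently pushed value */
int
pop(void)
{
    assert(sp > 0);     /* caller must not pop an empty stack */
    return stack[--sp];
}
```

Only push and pop are visible to the linker; the data they manage can be reached solely through them.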


    Register Variables

    The register declaration

    - advises the compiler that the variable in question will be heavily used; the idea is to place it in a machine register.

    - Compilers are free to ignore the advice.

    - It can only be used with automatic variables and formal arguments.

    - It is not possible to take the address of a register variable.

    void

    f(register unsigned int m, register long n)

    {

    register int i;

    ...

    }


  • Initialization

    - In the absence of explicit initialization, external and static variables are guaranteed to be initialized to zero.

    - Scalar variables may be initialized when they are defined, by following the name with an equals sign and an expression:

    int x = 1;

    char squote = '\'';

    long day = 1000L * 60L * 60L * 24L; /* milliseconds/day */

    - For external and static variables, the initializer must be a constant expression; the initialization is done once, conceptually before the program begins execution.

    - For automatic and register variables, it is done each time the function or block is entered, and the initializer is not restricted to being a constant.
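The initialization rules above can be put side by side in one short sketch. All names (uninitialized, ms_per_day, twice_plus_calls) are made up for illustration.

```c
#include <assert.h>

int  uninitialized;                        /* external: guaranteed to start at 0 */
long ms_per_day = 1000L * 60L * 60L * 24L; /* constant expression, set before main() runs */

/* twice_plus_calls: returns 2*n plus the number of calls made so far */
int
twice_plus_calls(int n)
{
    static int calls;       /* static: zero-initialized once, keeps its value */
    int doubled = 2 * n;    /* automatic: re-initialized on every call,
                               initializer need not be constant */

    calls++;
    return doubled + calls;
}
```

Two calls with the same argument return different results, which shows that calls survives between invocations while doubled is computed afresh each time.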


    Block Structure And Scope

    - Declarations of variables (including initializations) may follow the left brace that introduces any compound statement, not just the one that begins a function.

    - They hide any identically named variables in outer blocks.

    - They remain in existence until the matching right brace.

    - What is the scope of i?

    if (n > 0) {

    int i; /* declare a new int */

    for (i = 0; i < n; i++)

    ...

    }


    Block Structure And Scope

    - An automatic variable declared and initialized in a block is initialized each time the block is entered.

    - A static variable is initialized only the first time the block is entered.

    - Automatic variables, including formal parameters, also hide external variables and functions of the same name.

    int x;

    int y;

    void

    f(double x)

    {

    double y;

    ...

    }
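A minimal sketch of the hiding rules, with function names (inner_value, outer_value) invented for this example: the parameter x hides the external x, and a block-local x hides the parameter in turn.

```c
#include <assert.h>

int x = 1;      /* external x */

/* inner_value: the parameter x hides the external x inside this function */
double
inner_value(double x)
{
    x = x + 0.5;            /* modifies the parameter only */
    {
        int x = 42;         /* block-local x hides the parameter */
        assert(x == 42);
    }
    return x;               /* the parameter again */
}

/* outer_value: no local x in scope, so this reads the external x */
int
outer_value(void)
{
    return x;
}
```

The external x is untouched by inner_value, since every x used there refers to the parameter or to the inner block's variable.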


    Systems Programming
    02. C Programs in Space and Time

    Alexander Holupirek

    Database and Information Systems Group
    Department of Computer & Information Science

    University of Konstanz

    Summer Term 2008

  • C Programs In (Address) Space And (Run-)time

    Where is my data and why do I have to know?

    - C is closely related to the machine. Before talking about pointers, storage allocation, etc., some background knowledge about address space, (virtual) memory, and its allocation during program execution comes in handy.

    - Knowledge about the memory layout of a program is quite helpful when debugging.

    - Knowledge about what happens inside the machine during program execution is fundamental both to debugging programs and, in the first place, to writing clean code.

    Recap: Computer Architecture

    Storage Classes

    From Source Code To Executable Code

    Construction of an Executable

    Relocation Process

    C, Assembler, And Machine Code

    C source code:

    int a, b;

    a = b * b;

    Intel IA-32 assembler source code (machine instructions, i.e., processor instructions):

    mov  0x403030,%eax
    imul 0x403030,%eax
    mov  %eax,0x403020

    Executable binary code (shown in hexadecimal, one byte per address):

    Address   Content                 Instruction
    0x4012ee  a1 30 30 40 00          mov  0x403030,%eax
    0x4012f3  0f af 05 30 30 40 00    imul 0x403030,%eax
    0x4012fa  a3 20 30 40 00          mov  %eax,0x403020

    C, Assembler, And Machine Code

    C source code:

    int a=4, b;

    int main(void) {
        if (a>5)
            b=1;
        else
            b=0;
    }

    Executable binary code and assembler source code (memory address: memory content = machine instruction):

    8048344: 83 3d 94 94 04 08 05           cmpl $0x5,0x8049494
    804834b: 7e 0c                          jle  8048359
    804834d: c7 05 8c 95 04 08 01 00 00 00  movl $0x1,0x804958c
    8048357: eb 0a                          jmp  8048363
    8048359: c7 05 8c 95 04 08 00 00 00 00  movl $0x0,0x804958c
    8048363: c9                             ...

    a is located at address 0x8049494, b at address 0x804958c.

    All numeric values in the binary and assembler code are hexadecimal.

  • Address Space

    [Figure: the address space drawn as a column of byte addresses with their contents, from the lowest possible address 0 (start of memory) to the highest possible address max. (end of memory). Individual bytes are shown at addresses 0x50000000 and 0x50000001, with example contents 0x56 and 0xfc. A data block of 16 bytes has start address 0x10000000 and last byte address 0x1000000f; 0x10000010 is the address of the first byte after the data block.]

    Byte Ordering

    Data (4 bytes) accessed in the program through address n: d3 d2 d1 d0, where d3 is the MSB and d0 the LSB.

    Address   Big-endian system   Little-endian system
    n         d3 (MSB)            d0 (LSB)
    n+1       d2                  d1
    n+2       d1                  d2
    n+3       d0 (LSB)            d3 (MSB)

    MSB = Most Significant Byte, LSB = Least Significant Byte
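Which of the two byte orders a machine uses can be probed at run time by looking at the byte stored at the lowest address of a multi-byte value. The function name is_little_endian is an assumption for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* is_little_endian: 1 if the least significant byte is stored at the lowest address */
int
is_little_endian(void)
{
    uint32_t value = 0x0a0b0c0d;    /* d3 = 0x0a (MSB) ... d0 = 0x0d (LSB) */
    unsigned char bytes[sizeof value];

    memcpy(bytes, &value, sizeof value);    /* bytes[0] is the byte at address n */
    return bytes[0] == 0x0d;
}
```

On an IA-32 machine, as used in the lecture's examples, the function is expected to return 1, since that architecture is little-endian.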


    Alignment Rules

    Goal: Optimal Performance

    - Determine the address locations for variables and instructions.

    - Great impact on compiler, assembler, and linker tools.

    [Figure: a misaligned data long word occupying addresses 0x35 to 0x38 in the address space. Long word boundaries in the address space are addresses divisible by 4 without remainder. The data bus transfers one aligned long word per access (byte offsets +0 to +3), so the misaligned long word needs two accesses: the first for 0x34 to 0x37, the second for 0x38 to 0x3b.]

    Alignment Rules (cont.)

    For derived types [16] (constructed from the basic types), alignment rules apply to each single component:

    struct artikel {
        char name[5];   /* alignment(1) */
        int anzahl;     /* alignment(4) */
        double preis;
    };

    Alignment rules may be influenced through compiler directives (-malign-int aligns variables on 32-bit boundaries, producing code that runs somewhat faster on processors with 32-bit busses at the expense of memory).

    [16] Arrays, functions, pointers, structures, unions (we will discuss them later).
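The effect of per-member alignment can be observed with offsetof. The sketch below reuses the struct from the slide; the helper padding_after_name is hypothetical, and the exact amount of padding is ABI-dependent, so no particular value is claimed here.

```c
#include <assert.h>
#include <stddef.h>

struct artikel {
    char name[5];
    int anzahl;
    double preis;
};

/* padding_after_name: bytes the compiler inserted between name and anzahl */
size_t
padding_after_name(void)
{
    return offsetof(struct artikel, anzahl)
         - sizeof(((struct artikel *)0)->name);  /* sizeof does not evaluate its operand */
}
```

If int requires 4-byte alignment, anzahl cannot start at offset 5, so the compiler leaves unused bytes after name; sizeof(struct artikel) is then larger than the sum of the member sizes.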


    Storage Classes

    Placement of data in memory depends on storage class

    - An object, such as a variable, is a location in storage, and its interpretation depends on two main attributes: its storage class and its type.

    - The storage class determines the lifetime of the storage associated with the identified object.

    - The type determines the meaning of the values found in the identified object.

    - In C we have two storage classes: automatic and static.

    - Storage class specifiers (auto, extern, register, static), together with the context of an object's declaration, specify its storage class.

    Automatic Storage Class

    Automatic Objects

    - auto and register give the declared objects automatic storage class, and may be used only within functions.

    - They are local to a block [17] and discarded on exit from the block.

    - Declarations within a block create automatic objects if no storage class specification is mentioned or auto is used.

    - Initialization of automatic objects is performed each time the block is entered at the top (if a jump into the block is executed, the initializations are not performed).

    - Objects declared register are automatic, and are (if possible) stored in fast registers of the machine.

    - For register objects, the address operator & is not allowed.

    [17] Also known as a compound statement, such as the body of a function.

    Static Storage Class

    Static Objects

    - May be local to a block or external to all blocks.

    - In both cases, they retain their values across exit from and reentry to functions and blocks.

    - Within a block, static objects are declared with static.

    - Objects declared outside of all blocks (at the same level as function definitions) are always static.

    - On the outer level, the keyword static makes them local to a particular translation unit (internal linkage).

    - They become global to an entire program by omitting an explicit storage class, or by using extern (external linkage).

  • Storage Class And Sections

    Intermediate Summary

    - A program in execution does not only use storage for its instructions, but additionally needs space for, e.g., variables.

    - Variables may be temporary, dynamically allocated, or static (i.e., permanent in terms of storage allocation), initialized or uninitialized, and declared as constant (const) and thus read-only.

    - Placement of data in memory depends on its storage class.

    - During the translation process the compiler uses sections to divide the address space into logical units.

    - Details vary with the operating system and compiler used.

    Typical Program Organisation

    A typical program divides naturally in sections

    Code: machine instructions; should be unmodifiable; size is known after compilation and does not change (.text)

    Data:
    - static data
      - initialized (.data) / uninitialized (.bss)
      - constant address in memory
      - permanent lifetime
    - dynamic data
      - stack or heap
      - storage space not known in advance
      - volatile lifetime

    Program Sections

    Address space:

    Section   Memory         Remark
    .text     PROM or RAM    write-protected
    .data     RAM
    .bss      RAM

    PROM: Programmable Read Only Memory (a memory chip that cannot be written during operation)

    RAM: Random Access Memory

    Virtual Memory And Segments

    Virtual Memory

    - Whenever a process is created, the kernel provides a chunk of physical memory, which can be located anywhere.

    - Through the magic of virtual memory (VM), the process believes it has all the memory on the computer.

    Typically the VM space is laid out in a similar manner:

    - Text Segment (.text)
    - Initialized Data Segment (.data)
    - Uninitialized Data Segment (.bss)
    - The Stack
    - The Heap

  • A Program In Memory

    [Figure: layout of a program in memory, from address 0 upwards]

    - Code, constants: loaded from the executable file
    - Initialized data: loaded from the executable file
    - Uninitialized data: provided at process start and initialized with 0 (cleared)
    - Heap: provided at process start, used for dynamic memory allocation, grows towards the stack
    - Stack: provided at process start, grows towards lower addresses (or towards higher addresses, depending on the processor)

    The regions loaded or cleared at start hold static data; heap and stack hold dynamic data.

    Different Memory Layouts

    [Figure: two memory layouts, each starting at address 0 with code and constants (at the program start address), followed by initialized data, uninitialized data, heap, and stack. (A) Solution on the PC (IA-32). (B) Variant with the stack growing in the opposite direction.]

    Memory Segments

    Text Segment. The text segment contains the actual code (including constants) to be executed. It's usually sharable, so multiple instances of a program can share the text segment to lower memory requirements. This segment is usually marked read-only so a program can't modify its own instructions.

    Initialized Data Segment. This segment contains global variables which are initialized by the programmer.

    Uninitialized Data Segment. Also named .bss (block started by symbol), after an operator used by an old assembler. This segment contains uninitialized global variables. All variables in this segment are initialized to 0 or NULL pointers before the program begins to execute.

    Memory Segments (cont.)

    The Stack. The stack is a collection of stack frames, which we will discuss later. When a new frame needs to be added (as a result of a newly called function), the stack grows downward.

    The Heap. Dynamic memory, where storage can be allocated and deallocated via C's malloc(3)/free(3). The C library also gets dynamic memory for its own workspace from the heap. As more memory is requested on the fly, the heap grows upward.

  • Variable Placement And Life Time (Code)


    #include <stdlib.h>

    int a;              /* permanent lifetime */
    static int b;       /* ditto, but reduced scope */

    void
    func(void)
    {
        char c;         /* only for the lifetime of func(), */
                        /* but created twice (two calls); visible only in func() */
        static int d;   /* unique: exists once at a stable */
                        /* address, visible only in func() */
    }

    int
    main(void)
    {
        int e;                                  /* lifetime of main() */
        int *pi = (int *) malloc(sizeof(int));  /* newborn */

        func();
        func();
        free(pi);       /* RIP, pi now points to an invalid address */
        return (0);
    }

    Variable Placement And Life Time (Diagram)

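    [Figure: the address space from address 0 to max. at two points in time. t=0: program execution is started, i.e., the execution environment is already initialized. t=x: an arbitrary point in time during program execution. The code region holds the instructions (1st, 2nd, 3rd, 4th instruction, ...), with the program counter at PC(t=0) and PC(t=x). The data region holds a, b, and d. The heap holds the int that pi points to. The stack holds c, pi, and e, with the stack pointer at SP(t=0) and SP(t=x).]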

    Variable Placement

    Variables (outside a function). Globally declared variables go to the Uninitialized Data Segment if they are not initialized, and to the Initialized Data Segment otherwise. The distinction is necessary for the OS to decide whether storage has to be loaded with initialization data from the executable binary.

    Variables (inside a function). Implicitly assumed to be auto, they go on The Stack. If declared static, see above.

    Constants (const). Text Segment.

    Function Parameters. Are pushed on The Stack or stored in registers. If pointers are passed, the data itself is elsewhere.
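The placement rules can be made tangible with a sketch that touches each kind of storage. The names global_init, global_zero, static_slot, and heap_roundtrip are invented for this example, and the section names in the comments describe the typical case rather than a guarantee.

```c
#include <assert.h>
#include <stdlib.h>

int global_init = 1;    /* initialized: typically placed in .data */
int global_zero;        /* uninitialized: typically placed in .bss */

/* static_slot: address of a function-local static object */
int *
static_slot(void)
{
    static int d;       /* exists once, outside the stack */
    return &d;
}

/* heap_roundtrip: allocate an int on the heap, use it, release it */
int
heap_roundtrip(int value)
{
    int *pi = malloc(sizeof *pi);   /* heap storage, valid until free() */
    int copy;                       /* automatic: lives in this call's stack frame */

    if (pi == NULL)
        return -1;
    *pi = value;
    copy = *pi;
    free(pi);                       /* pi must not be dereferenced after this */
    return copy;
}
```

Every call to static_slot() returns the same address, whereas the address of an automatic variable is only meaningful while its function is active.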



    From Source Code To Executable Code

    Translation Steps (multi-phase compilation)

    Compilation: HLL source code to assembler source code

    Assembly: assembler source code to object code

    Linking: object code to executable code

    Compilers and assemblers create object files containing the generated binary code and data for a source file. Linkers combine multiple object files into one; loaders take object files and load them into memory.

    Goal: An executable binary file (a.out)

    From high-level language (HLL) source code to executable code, i.e., concrete processor instructions in combination with data.

    Translation Steps Using gcc(1)

    Stage          Input files                                        Output files
    Preprocessor   *.c/*.cc/*.cpp (C/C++ source code)                 *.i/*.ii (preprocessed C/C++ source code)
    Compiler       *.i/*.ii (preprocessed C/C++ source code)          *.s (assembler source code)
    Assembler      *.s (assembler source code)                        *.o (object file, unlinked)
    Linker         *.o/*.a (unlinked object files, library files)     a.out (executable file = object file, loadable)

    File Suffixes And Their Meaning

    For any given input file, the file name suffix determines what kind of compilation is done (see gcc(1) for more details and suffixes):

    Suffix   Compilation step
    .c       C source code which must be preprocessed
    .i       C source code which should not be preprocessed
    .h       Header file to be turned into a precompiled header
    .s       Assembler code
    .o       An object file to be fed straight into linking

  • Creation Of An Executable File

    [Figure: flow chart of operations, commands, and their inputs and outputs]

    (Filename).c  -- compile (gcc) -->  (Filename).s  -- assemble (gas) -->  (Filename).o  -- link (ld, together with object/library files) -->  a.out

    The C Preprocessor

    The C preprocessor performs . . .

    - Inclusion of named files
    - Macro substitution
    - Conditional compilation

    File Inclusion

    A control line of the form

    #include <filename>

    causes the replacement of that line by the entire contents of the file filename.

    Note: The characters in the name filename must not include > or newline, and the effect is undefined if it contains any of ", ', \, or /*.

    Location: The named file is searched for in a sequence of implementation-dependent places (often starting in /usr/include).

    Macro Substitution

    A control line of the form

    #define identifier token-sequence

    causes the preprocessor to replace subsequent instances of the identifier with the given sequence of tokens.

    Example

    #define EXIT_FAILURE 1

    #define EXIT_SUCCESS 0

    #define S_IRWXU 0000700 /* RWX mask for owner */

    #define S_IRUSR 0000400 /* R for owner */

    #define S_IWUSR 0000200 /* W for owner */

    #define S_IXUSR 0000100 /* X for owner */


  • Macro Substitution (cont.)

    A control line of the form

    #define identifier(identifier-list) token-sequence

    where there is no space between the first identifier and the (, is a macro definition with parameters given by the identifier list.

    Example

    #define S_ISDIR(m) ((m & 0170000) == 0040000) /* directory */

    #define S_ISCHR(m) ((m & 0170000) == 0020000) /* char sp. */

    #define S_ISBLK(m) ((m & 0170000) == 0060000) /* block sp.*/

    #define S_ISREG(m) ((m & 0170000) == 0100000) /* regular */

    #define S_ISFIFO(m) ((m & 0170000) == 0010000) /* fifo */
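Because macro parameters are substituted textually, it is common practice to parenthesize each use of a parameter. The sketch below contrasts an unprotected and a protected macro; SQUARE_UNSAFE, SQUARE, and the MY_ names are illustrative stand-ins, not definitions from any system header.

```c
#include <assert.h>

#define SQUARE_UNSAFE(x) (x * x)        /* argument pasted in as-is */
#define SQUARE(x)        ((x) * (x))    /* argument protected by parentheses */

/* Same idea applied to a file-type test; MY_ISDIR and the masks are
 * illustrative stand-ins for the macros shown on the slide. */
#define MY_IFMT   0170000
#define MY_IFDIR  0040000
#define MY_ISDIR(m) (((m) & MY_IFMT) == MY_IFDIR)
```

SQUARE_UNSAFE(1 + 2) expands to (1 + 2 * 1 + 2), which evaluates to 5 rather than 9; SQUARE(1 + 2) expands to ((1 + 2) * (1 + 2)) and yields 9.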


    Macro Substitution (cont.)

    A control line of the form

    #undef identifier

    causes the identifier's preprocessor definition to be forgotten. It is not erroneous to apply #undef to an unknown identifier.

    Example

    /*
     * Some header files may define an abs macro.
     * If defined, undef it to prevent a syntax error
     * and issue a warning.
     * #warning is an implementation-dependent directive
     */
    #ifdef abs
    #undef abs
    #warning abs macro collides with abs() prototype, undefining
    #endif

    Conditional Inclusion

    Parts of a program may be compiled conditionally

    Example

    #ifndef NULL

    #ifdef __GNUG__

    #define NULL __null

    #else

    #define NULL 0L

    #endif

    #endif
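The same #ifndef pattern is often used to give a setting a default that a build can override. In this sketch LOG_LEVEL and debug_enabled are hypothetical names; the override via a -D compiler option is the usual mechanism, but how a given project passes it is an assumption.

```c
#include <assert.h>

/* Default that a build can override, e.g. with -DLOG_LEVEL=2 (hypothetical name) */
#ifndef LOG_LEVEL
#define LOG_LEVEL 0
#endif

/* debug_enabled: compiled differently depending on LOG_LEVEL */
int
debug_enabled(void)
{
#if LOG_LEVEL > 0
    return 1;
#else
    return 0;
#endif
}
```

Only one of the two return statements reaches the compiler proper; the preprocessor discards the other branch before compilation starts.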


    Predefined Names

    Several identifiers are predefined, and expand to produce special information. They, and also the preprocessor expression operator defined, may not be undefined or redefined.

    __LINE__  A decimal constant containing the current source line number
    __FILE__  A string literal containing the name of the file being compiled
    __DATE__  A string literal containing the date of compilation, in the form "Mmm dd yyyy"
    __TIME__  A string literal containing the time of compilation, in the form "hh:mm:ss"
    __STDC__  The constant 1. It is intended that this identifier be defined to be 1 only in standard-conforming implementations
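A typical use of the predefined names is to tag diagnostics with their source position or to embed a build stamp. The macro WHERE and the function build_stamp are made-up names for this sketch; snprintf assumes a C99 library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* WHERE: expands at the point of use, so it reports the caller's position */
#define WHERE(buf, size) snprintf((buf), (size), "%s:%d", __FILE__, __LINE__)

/* build_stamp: "Mmm dd yyyy hh:mm:ss", fixed when this file was compiled */
const char *
build_stamp(void)
{
    return __DATE__ " " __TIME__;   /* adjacent string literals are concatenated */
}
```

Because __LINE__ is expanded where the macro is used, wrapping it in a function instead of a macro would always report the line inside that function.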


  • Compilation

    [Figure: Compilation. Input: HLL source code (text). The compiler, possibly using temporary files, produces assembler source code (text) and a translation listing with error messages (text).]

    Assembly

    [Figure: Assembly. Input: assembler source code (text). The assembler, possibly using temporary files, produces machine code and additional information in object format, plus a translation listing with error messages and a symbol table (text).]

    Linking

    [Figure: Linking. Inputs: several files in object format (machine code and additional information) and, via library search, files in library object format. The linker, possibly using temporary files, produces absolute code or relocatable code with additional information (binary code or object format), plus a link map (address space usage) and a symbol list (text).]


  • Program Section In Virtual Memory

    [Figure: After compilation, each section (.text for code, .data for initialized data) begins at address 0 and extends to its own end address (xx, yy); sections are logical address spaces of the compiler. After linking, all sections are placed at absolute positions in the address space between 0 and 0xffffffff, in the example at 0x08048244 and 0x08049370.]

    Linking An Executable Binary

    [Figure: Linking. Input data: the unlinked object files OBJ1 (.text1, .data1, .bss1), OBJ2 (.text2, .bss2), and OBJ3 (.text3, .data3, .bss3). Result: the executable file OBJtotal (linked, relocated), in which sections of the same kind are grouped: .text1 .text2 .text3 .data1 .data3 .bss1 .bss2 .bss3.]

    .text: code
    .data: initialized variables
    .bss: uninitialized variables

    - Each object file (compiled separately) starts at address 0.
    - Linking them together involves
      - centralization of sections
      - relocation of addresses

    Relocation Records

    - Once the sections are placed one after another, relocation can start.
    - Executable code contains embedded addresses:
      - static data, function calls, jump targets
    - On relocation, those have to be changed inside the code.
    - Without a relocation table this is not possible.
    - A relocation record holds the relative address (offset within the section) at which a symbol (name of a variable, a function, etc.) is referenced.

    RELOCATION RECORDS FOR [.text]:

    OFFSET TYPE VALUE

    0000001a R_386_32 b

    00000023 R_386_32 a

    00000029 R_386_32 b


    Source File: compile.c

    int a = 1;          /* Global variable, initialized -> .data */
    int b;              /* Global variable, uninitialized -> .bss */

    int
    main(void)
    {
        static int c;   /* Local, static variable -> .bss */

        b = 5;
        c = b + a + 16;
        return c;
    }

    - Compile a relocatable object file:

      cc -c compile.c    (creates compile.o)

    - Link an executable binary (one-step compilation):

      cc compile.c -o compile

  • Analysis of Object Files: compile.o

    $ file compile.o
    ELF 32-bit LSB relocatable, Intel 80386, version 1, not stripped

    $ objdump -x compile.o
    compile.o: file format elf32-i386
    compile.o
    architecture: i386, flags 0x00000011:
    HAS_RELOC, HAS_SYMS
    start address 0x00000000

    Sections:
    Idx Name     Size      VMA       LMA       File off  Algn
      0 .text    0000005a  00000000  00000000  00000034  2**2
                 CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
      1 .data    00000004  00000000  00000000  00000090  2**2
                 CONTENTS, ALLOC, LOAD, DATA
      2 .bss     00000004  00000000  00000000  00000094  2**2
                 ALLOC
      3 .rodata  00000005  00000000  00000000  00000094  2**0
                 CONTENTS, ALLOC, LOAD, READONLY, DATA

    Object File: compile.o (cont.)

    SYMBOL TABLE:

    00000000 l df *ABS* 00000000 compile.c

    00000000 l d .text 00000000

    00000000 l d .data 00000000

    00000000 l d .bss 00000000

    00000000 l O .bss 00000004 c.0

    00000000 l d .rodata 00000000

    00000000 g O .data 00000004 a

    00000000 g F .text 0000005a main

    00000004 O *COM* 00000004 b

    RELOCATION RECORDS FOR [.text]:

    OFFSET TYPE VALUE

    0000001a R_386_32 b

    00000023 R_386_32 a

    00000029 R_386_32 b

    00000031 R_386_32 .bss

    00000036 R_386_32 .bss

    0000004c R_386_32 .rodata


    compile.o: file format elf32-i386

    Disassembly of section .text:

    00000000 <main>:
     0: 55                    push %ebp
     1: 89 e5                 mov %esp,%ebp
     3: 83 ec 18              sub $0x18,%esp
     6: 83 e4 f0              and $0xfffffff0,%esp
     9: b8 00 00 00 00        mov $0x0,%eax
     e: 29 c4                 sub %eax,%esp
    10: a1 00 00 00 00        mov 0x0,%eax
    15: 89 45 e8              mov %eax,0xffffffe8(%ebp)
    18: c7 05 00 00 00 00 05  movl $0x5,0x0
    1f: 00 00 00
    22: a1 00 00 00 00        mov 0x0,%eax
    27: 03 05 00 00 00 00     add 0x0,%eax
    2d: 83 c0 10              add $0x10,%eax
    30: a3 00 00 00 00        mov %eax,0x0
    35: a1 00 00 00 00        mov 0x0,%eax
    3a: 8b 55 e8              mov 0xffffffe8(%ebp),%edx
    3d: 3b 15 00 00 00 00     cmp 0x0,%edx
    43: 74 13                 je 58
    45: 83 ec 08              sub $0x8,%esp
    48: ff 75 e8              pushl 0xffffffe8(%ebp)
    4b: 68 00 00 00 00        push $0x0
    50: e8 fc ff ff ff        call 51
    55: 83 c4 10              add $0x10,%esp
    58: c9                    leave
    59: c3                    ret

    compile.o: file format elf32-i386

    Disassembly of section .text:

    00000000 <main>:
    int b; /* Global variable, uninitialized -> .bss */

    int
    main(void)
    {
     0: 55                    push %ebp
    ... 6 more lines ...
    15: 89 45 e8              mov %eax,0xffffffe8(%ebp)
    static int c; /* Local, static variable -> .bss */

    b = 5;
    18: c7 05 00 00 00 00 05  movl $0x5,0x0
    1f: 00 00 00
    c = b + a + 16;
    22: a1 00 00 00 00        mov 0x0,%eax
    27: 03 05 00 00 00 00     add 0x0,%eax
    2d: 83 c0 10              add $0x10,%eax
    30: a3 00 00 00 00        mov %eax,0x0
    return c;
    35: a1 00 00 00 00        mov 0x0,%eax
    }
    ... 10 more lines ...

  • Executable Binary File: compile

    compile: file format elf32-i386
    compile
    architecture: i386, flags 0x00000112:
    EXEC_P, HAS_SYMS, D_PAGED
    start address 0x1c000408

    Sections:
    Idx Name   Size      VMA       LMA       File off  Algn
    ...
      9 .text  00000214  1c000408  1c000408  00000408  2**2
               CONTENTS, ALLOC, LOAD, READONLY, CODE
    ...
     12 .data  00000014  3c001008  3c001008  00001008  2**2
               CONTENTS, ALLOC, LOAD, DATA
    ...
     20 .bss   00000184  3c003100  3c003100  00001100  2**5
               ALLOC

    SYMBOL TABLE:
    3c003140 l O .bss  00000004 c.0
    3c003280 g O .bss  00000004 b
    1c0005c0 g F .text 0000005a main
    3c001018 g O .data 00000004 a

    1c0005c0 <main>:
    int b; /* Global variable, uninitialized -> .bss */

    int
    main(void)
    {
    1c0005c0: 55                    push %ebp
    1c0005c1: 89 e5                 mov %esp,%ebp
    1c0005c3: 83 ec 18              sub $0x18,%esp
    1c0005c6: 83 e4 f0              and $0xfffffff0,%esp
    1c0005c9: b8 00 00 00 00        mov $0x0,%eax
    1c0005ce: 29 c4                 sub %eax,%esp
    1c0005d0: a1 00 31 00 3c        mov 0x3c003100,%eax
    1c0005d5: 89 45 e8              mov %eax,0xffffffe8(%ebp)
    static int c; /* Local, static variable -> .bss */

    b = 5;
    1c0005d8: c7 05 80 32 00 3c 05  movl $0x5,0x3c003280
    1c0005df: 00 00 00
    c = b + a + 16;
    1c0005e2: a1 18 10 00 3c        mov 0x3c001018,%eax
    1c0005e7: 03 05 80 32 00 3c     add 0x3c003280,%eax
    1c0005ed: 83 c0 10              add $0x10,%eax
    1c0005f0: a3 40 31 00 3c        mov %eax,0x3c003140
    return c;
    1c0005f5: a1 40 31 00 3c        mov 0x3c003140,%eax
    }


    Relocation Of An Assembler Instruction

    During the linking process, relocated addresses are injected into the code, for example for the assignment b = 5;

    Before relocation (relocatable compile.o):

    18: c7 05 00 00 00 00 05        movl $0x5,0x0

    After relocation (executable compile):

    1c0005d8: c7 05 80 32 00 3c 05  movl $0x5,0x3c003280

    The proper address for b can be found in the symbol table.

    SYMBOL TABLE: (compile)
    3c003280 g O .bss 00000004 b

    - The symbol table for compile yields 3c003280 for variable b.

  • Relocation Of An Assembler Instruction (cont.)

    Question: How does the linker find the right places in the machine code to perform the substitutions?

    - The linker has the relocation record (relative address) of b:

    RELOCATION RECORDS FOR [.text]: (compile.o)
    0000001a R_386_32 b

    - The linker has the absolute address of main from the symbol table:

    SYMBOL TABLE: (compile)
    3c003280 g O .bss 00000004 b
    1c0005c0 g F .text 0000005a main

    Relocation Of An Assembler Instruction (cont.)

    Putting it all together:

    RELOCATION RECORDS FOR [.text]: (compile.o)
    0000001a R_386_32 b (relative offset)

    SYMBOL TABLE: (compile)
    3c003280 g O .bss 00000004 b (abs. address of b)
    1c0005c0 g F .text 0000005a main (abs. address of main)

    Computing the address where the substitution must be performed:

    1c0005c0 + 0000001a = 1c0005da

    This is two bytes into the instruction at 1c0005d8, i.e., its address operand. The four bytes there receive the address of b in little-endian byte order (80 32 00 3c):

    Before: 18: c7 05 00 00 00 00 05        movl $0x5,0x0
    After:  1c0005d8: c7 05 80 32 00 3c 05  movl $0x5,0x3c003280
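The address computation and the substitution can be replayed in a few lines of C. This is only a sketch of what a linker does for an absolute 32-bit relocation such as R_386_32 (field value = symbol address plus the addend already stored in the field); the function names apply_abs32 and patch_address are invented here, and a real linker handles many more relocation types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * apply_abs32: patch a 32-bit absolute address into code at the given offset.
 * Models an R_386_32-style fixup: the new field value is the symbol address
 * plus the addend already stored in the field, written in little-endian order.
 */
void
apply_abs32(unsigned char *code, size_t offset, uint32_t symbol_addr)
{
    uint32_t addend, value;

    addend = (uint32_t)code[offset]
           | (uint32_t)code[offset + 1] << 8
           | (uint32_t)code[offset + 2] << 16
           | (uint32_t)code[offset + 3] << 24;
    value = symbol_addr + addend;

    code[offset]     = value & 0xff;
    code[offset + 1] = (value >> 8) & 0xff;
    code[offset + 2] = (value >> 16) & 0xff;
    code[offset + 3] = (value >> 24) & 0xff;
}

/* patch_address: where the fixup lands once the section is placed */
uint32_t
patch_address(uint32_t section_base, uint32_t record_offset)
{
    return section_base + record_offset;
}
```

Applied to the bytes c7 05 00 00 00 00 05 00 00 00 with offset 2 and symbol address 0x3c003280, the operand bytes become 80 32 00 3c, matching the relocated instruction. Adding the record offset to the address of main works in this example because main sits at offset 0 of the .text section of compile.o.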


    Systems Programming
    03. Functions and Program Structure

    Alexander Holupirek

    Database and Information Systems Group
    Department of Computer & Information Science

    University of Konstanz

    Summer Term 2008

    Schedule For Today

    Please make sure to register for the course via StudIS. Otherwise you cannot attend the examination.

    So far: Static view of the program (before run-time)

    - Compilation in different steps
    - Program files (e.g., in ELF) contain sections
    - Sections are mapped to VM segments
    - Observed correlation between the static storage class specifier, sections in the ELF file, and location in virtual memory

    Today: Dynamic view of the program (during run-time)

    - A closer look at functions
    - Automatic allocation of memory on the stack/the heap

  • Basics of Functions

    Functions Returning Non-integers

    External Variables

    Scope Rules

    Header Files

    Static Variables

    A Program in Execution - Unix Run-time


    Basics Of Functions

    - Break large computing tasks into smaller ones
    - Enable people to build on what others have done
    - No starting over from scratch
    - Hide details of operation from parts of the program that don't need to know about them
    - Structure the program
    - Ease the pain of making changes

    A Simple Version Of The Unix Tool grep(1)

    Basic task for simple grep:

    Print each line of input that contains a particular pattern

    Example:

    Input: text in /etc/services
    Pattern: http

    $ ./a.out < /etc/services
    # See also http://www.iana.org/assignments/port-numbers
    www 80/tcp http # WorldWideWeb HTTP
    https 443/tcp # secure http (SSL)

    Program Layout Of Simple grep

    Simple grep falls neatly into three pieces:

    while (there is another line)
        if (the line contains the pattern)
            print it

    - As said, small pieces are easier to deal with than one big one
    - Irrelevant details can be buried in the functions
    - Chance of unwanted interactions is minimized
    - Pieces may even be useful in other programs


  • A Function For Each Problem

    Simple grep falls neatly into three pieces:

    while (there is another line)              /* getline() */
        if (the line contains the pattern)
            print it                           /* printf(3) */

    Decide whether the line contains an occurrence of the pattern:

    We write strindex(s, t), which returns the position (index) in the string s where the string t begins, or -1 if s does not contain t.

    If we later want to switch to more sophisticated pattern matching, we only have to replace strindex; the rest of the code remains the same. [18]

    [18] The standard library provides strstr(3), which is similar to strindex, except that it returns a pointer instead of an index.
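    The footnote's remark about strstr(3) can be made concrete. The following is a minimal sketch, not part of the original slides: strindex_strstr is a hypothetical name, and the early return for an empty pattern is an assumption made to mirror the strindex of this lecture, which never reports a match for an empty pattern.

```c
#include <stddef.h>
#include <string.h>

/* strindex_strstr: hypothetical variant of strindex built on strstr(3).
 * Returns the index of the first occurrence of t in s, or -1 if none. */
int
strindex_strstr(const char *s, const char *t)
{
    const char *hit;

    if (*t == '\0')         /* assumption: an empty pattern never matches */
        return -1;
    hit = strstr(s, t);     /* pointer to the match, or NULL */
    return (hit != NULL) ? (int)(hit - s) : -1;
}
```

    Subtracting the start of the string from the pointer returned by strstr yields the index, which is the only conversion needed to keep the rest of the grep code unchanged.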


    Source Code grep:main

    #include <stdio.h>

    #define MAXLINE 1000 /* maximum input line length */

    /* Note: on current POSIX systems <stdio.h> declares its own getline(3);
       compile with -ansi or rename this function if the names clash. */
    int getline(char line[], int max);
    int strindex(char source[], char searchfor[]);

    char pattern[] = "http"; /* pattern to search for */

    /* find all lines matching pattern */
    int
    main(void)
    {
        char line[MAXLINE];
        int found = 0;

        while (getline(line, MAXLINE) > 0)
            if (strindex(line, pattern) >= 0) {
                printf("%s", line);
                found++;
            }
        return found;
    }


    Source Code grep:strindex

    /* strindex: return index of t in s, -1 if none */
    int
    strindex(char *s, char *t)
    {
        int i, j, k;

        for (i = 0; s[i] != '\0'; i++) {
            for (j = 0, k = i; t[j] == s[k] && t[j] != '\0'; j++, k++)
                ;
            if (j > 0 && t[j] == '\0')
                return i;
        }
        return (-1);
    }
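    The grep program declares and calls getline, but its definition is not shown in this part of the lecture. The following is a minimal sketch of what such a function might look like. The name get_line and the FILE * parameter are assumptions (the slides' getline(line, max) reads standard input); the different name also avoids a clash with the POSIX getline(3).

```c
#include <stdio.h>

/* get_line: read one line from fp into s (at most lim-1 characters),
 * keep the newline if there is room, and return the length (0 at EOF).
 * Hypothetical stand-in for the slides' getline(); lim must be >= 1. */
int
get_line(FILE *fp, char s[], int lim)
{
    int c = EOF, i = 0;

    while (i < lim - 1 && (c = getc(fp)) != EOF && c != '\n')
        s[i++] = (char)c;
    if (c == '\n')
        s[i++] = (char)c;   /* room is guaranteed: i < lim - 1 held above */
    s[i] = '\0';
    return i;
}
```

    Calling get_line(stdin, line, MAXLINE) in the main loop would give the behaviour the grep program relies on: a positive length for every line read and 0 at end of input.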


    Function Definition

    A function definition has the form:

    return-type
    function-name(parameter declarations, if any)
    {
        declarations
        statements
    }

    - Various parts may be absent; a minimal function is

          void dummy(void) { }

      which does nothing, accepts nothing, and returns nothing [19]
    - If the return-type is omitted, int is assumed (C89 only; C99 and later require an explicit return type)

    [19] ... but it may be used as a placeholder during program development

  • A C Program Seen As Set Of External Objects

    A C program is just a set of definitions of variables and functions.

    - Communication between the functions is
      - by arguments
      - by values returned by the functions
      - through external variables
    - The functions can occur in any order in the source file
    - The source program can be split into multiple files, so long as no function is split.


    Returning From Functions

    The return statement is the mechanism for returning a value from the called function to its caller.

    - Any expression can follow return
      - The expression will be converted to the return type of the function
    - The calling function is free to ignore the returned value
    - There need be no expression after return
      - In that case, no value is returned to the caller (garbage)
    - Control also returns with no value when execution falls off the end of the function by reaching the closing right brace
    - It is not illegal, but probably a sign of trouble, if a function returns a value from one place and no value from another
    - If a function fails to return a value, its value is likely garbage
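    Two of the points above can be illustrated with a short sketch. The function names square and sign_of are hypothetical and chosen only for this example.

```c
/* square: the expression n * n has type int; it is converted to
 * double, the declared return type, when the return is taken. */
double
square(int n)
{
    return n * n;
}

/* sign_of: every path ends in a return with a value, so a caller
 * never receives garbage. */
int
sign_of(double x)
{
    if (x > 0)
        return 1;
    if (x < 0)
        return -1;
    return 0;
}
```

    A caller may still ignore either result, for example by writing sign_of(x); as a statement on its own.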


    Functions Returning Non-integers

    Functions returning non-integer values

    - So far we have only returned either no value (void) or an int
    - What if a function must return some other type?
    - To illustrate how to deal with this, we write and use atof(s).

    The function atof(s)
    Converts the string s to its double-precision floating-point equivalent. It handles an optional sign and decimal point, and the presence or absence of either integer part or fractional part. [20]

    [20] In real life, use atof(3), declared in <stdlib.h>.

  • Source Code: atof

    #include <ctype.h> /* isspace, isdigit ... */

    double /* atof: convert string s to double */
    atof(char s[])
    {
        double val, power;
        int i, sign;

        for (i = 0; isspace(s[i]); i++)
            ; /* skip white space */
        sign = (s[i] == '-') ? -1 : 1;
        if (s[i] == '+' || s[i] == '-')
            i++;
        for (val = 0.0; isdigit(s[i]); i++)
            val = 10.0 * val + (s[i] - '0');
        if (s[i] == '.')
            i++;
        for (power = 1.0; isdigit(s[i]); i++) {
            val = 10.0 * val + (s[i] - '0');
            power *= 10.0;
        }
        return sign * val / power;
    }


    Declare To Use A Function

    - The calling function must know that atof(s) returns a non-int value
    - One way to ensure this:
      - Declare atof() explicitly in the calling function
    - This kind of declaration is shown in a primitive calculator:

    #include <stdio.h>

    #define MAXLINE 100

    int /* rudimentary calculator */
    main(void)
    {
        double sum, atof(char []);
        char line[MAXLINE];
        int getline(char line[], int max);

        sum = 0;
        while (getline(line, MAXLINE) > 0)
            printf("\t%g\n", sum += atof(line));
        return (0);
    }

    Sample session (each input line is followed by the running sum):

    +123.2
            123.2
    -0.2
            123
    +0.7
            123.7
    -123.1
            0.6


    Inconsistent Return Types

    The declaration

    double sum, atof(char []);

    says that sum is a double variable, and that atof is a function that takes one char[] argument and returns double.

    - The function atof must be declared and defined consistently
    - If atof itself and the call to it have inconsistent types in the same source file, the error will be detected by the compiler
    - But if (as is more likely) atof were compiled separately, the mismatch would not be detected, atof would return a double that main would treat as an int, and meaningless answers would result


    Function Declaration By Context

    A mismatch can happen

    - if there is no function prototype,
    - and a function is implicitly declared by its first appearance in an expression, such as sum += atof(line) in our calculator

    Function Declaration By Context

    - If a name that has not been previously declared occurs in an expression and is followed by a left parenthesis, it is declared by context to be a function name
    - The function is assumed to return an int
    - Nothing is assumed about its arguments
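    The usual way to avoid declaration by context is to bring a prototype into scope before the first call, typically through a header. The sketch below assumes the library atof from <stdlib.h> rather than the hand-written one; add_line is a hypothetical helper. Note also that declaration by context is a C89 rule: C99 removed implicit function declarations, so current compilers diagnose such calls.

```c
#include <stdlib.h>   /* brings the prototype of atof into scope */

/* add_line: add the number contained in line to the running sum.
 * Because the prototype is visible, the compiler knows that atof
 * returns double and does not treat the result as an int. */
double
add_line(double sum, const char *line)
{
    return sum + atof(line);
}
```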


  • Missing Function Arguments

    If a function declaration does not include arguments, as in

    double atof();

    this is taken to mean that nothing is to be assumed about the arguments of atof; all parameter checking is turned off.

    - This special meaning of the empty argument list is intended to permit older C programs to compile with ANSI/ISO compilers
    - If the function takes arguments, declare them
    - If the function takes no arguments, use void
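    The two recommendations can be sketched as follows. The functions scale and answer are hypothetical; the old-style form appears only in a comment so that the example compiles under every C standard (in C23 an empty parameter list means the same as void, so the special meaning described above applies to earlier standards).

```c
/* double scale();  old style: arguments unchecked before C23, avoid */

double scale(double x, double factor);   /* prototype: calls are checked */
int    answer(void);                     /* explicitly takes no arguments */

double
scale(double x, double factor)
{
    return x * factor;
}

int
answer(void)
{
    return 42;
}
```

    With these prototypes in scope, a call such as scale(2.0) or answer(1) is rejected at compile time instead of misbehaving at run time.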


    Explicit Cast Of The Return Type

    Given atof, properly declared, we could write atoi in terms of it:

    /* atoi: convert string s to integer using atof */
    int
    atoi(char s[])
    {
        double atof(char s[]);

        return (int) atof(s);
    }

    - The value of the return expression is converted to the type of the function before the return is taken.
    - Therefore, the value of atof, a double, is converted automatically to int when it appears in this return
    - This operation can discard information, so some compilers warn about it
    - The cast states explicitly that the operation is intended
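    What "discarding information" means in practice can be shown with a small sketch. Both function names are hypothetical; the rounding variant is an assumption added for contrast and is not part of the slides.

```c
/* truncate_to_int: the cast drops the fractional part (toward zero). */
int
truncate_to_int(double x)
{
    return (int) x;
}

/* round_to_int: rounds half away from zero before the cast, for
 * values that fit into an int. */
int
round_to_int(double x)
{
    return (int) (x + (x >= 0 ? 0.5 : -0.5));
}
```

    In both functions the cast documents that the narrowing conversion is deliberate; only the value handed to the cast differs.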


    External Objects

    As mentioned, a program is just a set of definitions of variables and functions. These can be considered as external objects.

    - Functions are always external [21]
    - External is used in contrast to internal, which describes the arguments and variables used inside functions
    - By default, external variables and functions have the property that all references to them by the same name, even from functions compiled separately, are references to the same thing (this is called external linkage in the standard)

    [21] C does not allow functions to be defined inside other functions

  • A Reverse Polish Notation Calculator

    We will build a reverse Polish notation calculator to discuss

    - Function evaluation
    - Splitting up a program into several source files
    - Scope rules

    Infix Notation vs. Reverse Polish Notation

    Infix:           ( 1 - 2 ) * ( 4 + 5 )
    Reverse Polish:  1 2 - 4 5 + *

    Parentheses are not needed; the notation is unambiguous as long as we know how many operands each operator expects.


    Calculator Design Using A Stack

    input:   1    2    -    4    5    +    *

    stack:   1    1   -1   -1   -1   -1   -9
                  2         4    4    9
                                 5

    (Each column shows the stack after the token above it has been processed; the bottom of the stack is in the first row.)

    Program description

    - Each operand arriving is pushed on the stack
    - Once an operator arrives
      - Pop the appropriate number of operands (e.g., two for binary operators)
      - Apply the operator to them
      - Push the result back onto the stack
    - The value on the top of the stack is popped and printed when the end of the input line is encountered.
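    The program description above can be condensed into one function. This is a minimal sketch under stated assumptions, not the calculator developed in the lecture: rpn_eval and RPN_MAX are hypothetical names, the expression is passed as a space-separated string, and errors are reported simply by returning 0.0.

```c
#include <stdlib.h>
#include <string.h>

#define RPN_MAX 32   /* hypothetical limit on the stack depth */

/* rpn_eval: evaluate a space-separated RPN expression such as
 * "1 2 - 4 5 + *". Returns 0.0 on stack underflow/overflow or on an
 * unreadable token; division by zero is not checked in this sketch. */
double
rpn_eval(const char *expr)
{
    double stack[RPN_MAX];
    int sp = 0;
    const char *p = expr;

    while (*p != '\0') {
        char *end;
        double v;

        if (*p == ' ') {            /* skip separators */
            p++;
            continue;
        }
        /* an operator is a single +-* or / followed by a blank or the end */
        if (strchr("+-*/", *p) != NULL && (p[1] == ' ' || p[1] == '\0')) {
            double op1, op2;

            if (sp < 2)
                return 0.0;
            op2 = stack[--sp];      /* right operand was pushed last */
            op1 = stack[--sp];
            switch (*p) {
            case '+': stack[sp++] = op1 + op2; break;
            case '-': stack[sp++] = op1 - op2; break;
            case '*': stack[sp++] = op1 * op2; break;
            case '/': stack[sp++] = op1 / op2; break;
            }
            p++;
            continue;
        }
        v = strtod(p, &end);        /* operand: push it */
        if (end == p || sp >= RPN_MAX)
            return 0.0;
        stack[sp++] = v;
        p = end;
    }
    return (sp > 0) ? stack[sp - 1] : 0.0;
}
```

    For the input from the figure, rpn_eval("1 2 - 4 5 + *") walks through exactly the stack states shown and yields -9.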


    Calculator Program Layout

    Basic structure of our calculator (controlling main function):

    while (next operator or operand is not EOF)
        if (number)
            push it
        else if (operator)
            pop operands
            do operation
            push result
        else if (newline)
            pop and print top of stack
        else
            error


    Program Design Considerations

    - Pushing and popping a stack are trivial, but with error handling they are long enough to be put each in a separate function
    - A function for fetching the next input operator or operand

    Where to put the stack? Who should access it directly?

    - Keep it in main and pass the stack to the routines that push and pop it?
      - But main doesn't need to know about the stack
      - main only does push and pop operations
    - Store the stack and its pointer in external variables
      - Accessible to the push and pop functions but not to main


  • Program Layout In One Source File

    Let's think of the program as existing in one source file:

    #includes
    #defines

    function declarations for main

    int main(void) { ... }

    external variables for push and pop

    void push(double f) { ... }
    double pop(void) { ... }

    int getop(char s[]) { ... }

    routines called by getop


    Main loop switches on the type of operator or operand:

    #include <stdio.h>
    #include <stdlib.h> /* for atof() */

    #define MAXOP 100
    /* signal number found */
    #define NUMBER '0'

    int getop(char []);
    void push(double);
    double pop(void);

    /* reverse polish calc */
    int
    main(void)
    {
        int type;
        char s[MAXOP];

        while ((type = getop(s)) != EOF) {
            switch (type) {
            case NUMBER:
                break;
            case '+':
                break;
            case '*':
                break;
            case '-':
                break;
            case '/':
                break;
            case '\n':
                break;
            default:
                printf("error: unknown command %s\n", s);
            }
        }
        return (0);
    }


    Order Of Function Evaluation

    What about the following implementation of the switch?

    switch (type) {
    case NUMBER:
        push(atof(s));
        break;
    case '+':
        push(pop() + pop());
        break;
    case '*':
        push(pop() * pop());
        break;
    case '-':
        push(pop() - pop()); /* operand order not guaranteed */
        break;
    case '/':
        push(pop() / pop()); /* operand order not guaranteed, no zero check */
        break;
    case '\n':
        printf("\t%.8g\n", pop());
        break;
    default:
        printf("error: unknown command %s\n", s);
    }

    - + and * are commutative, so the order in which the popped operands are combined is irrelevant
    - For - and /, the left and right operands must be distinguished
    - / can fail with a zero divisor
    - The order in which the two calls to pop() within one expression are evaluated is unspecified

    The implementation is erroneous.


    Steering The Order Of Function Evaluation

    switch (type) {
    case NUMBER:
        push(atof(s));
        break;
    case '+':
        push(pop() + pop());
        break;
    case '*':
        push(pop() * pop());
        break;
    case '-':
        op2 = pop();
        push(pop() - op2);
        break;
    case '/':
        op2 = pop();
        if (op2 != 0.0)
            push(pop() / op2);
        else
            printf("error: zero divisor\n");
        break;
    case '\n':
        printf("\t%.8g\n", pop());
        break;
    default:
        printf("error: unknown command %s\n", s);
    }

    To guarantee the right order, it is necessary to pop the first value into a temporary variable.
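    The effect of the temporary variable can be isolated in a few lines. This is a sketch with hypothetical names: take stands in for pop as a function with a side effect, and the two diff functions contrast an expression whose operand order is left to the compiler with one that fixes the order by sequencing the calls in separate statements.

```c
static int next_val = 0;

/* take: returns 1, 2, 3, ... on successive calls (side effect on next_val) */
static int
take(void)
{
    return ++next_val;
}

/* diff_unspecified: the compiler may call either take() first,
 * so the result is -1 or 1. */
int
diff_unspecified(void)
{
    next_val = 0;
    return take() - take();
}

/* diff_ordered: separate statements fix the order, result is always -1. */
int
diff_ordered(void)
{
    int first, second;

    next_val = 0;
    first = take();
    second = take();
    return first - second;
}
```

    The corrected calculator uses the same idea: op2 = pop(); is a complete statement, so the remaining pop() in the expression is guaranteed to deliver the left operand.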


  • Source Code Stack

    - The stack itself and its fill level (the stack pointer) are shared by push and pop
    - Since they are defined outside any function, they are external

    #define MAXVAL 100 /* maximum depth of val stack */

    int sp = 0; /* next free stack position */
    double val[MAXVAL]; /* value stack */

    /* push: push f onto value stack */
    void
    push(double f)
    {
        if (sp < MAXVAL)
            val[sp++] = f;
        else
            printf("error: stack full, can't push %g\n", f);
    }


    Source Code Stack

    /* pop: pop and return top value from stack */
    double
    pop(void)
    {
        if (sp > 0)
            return val[--sp];
        else {
            printf("error: stack empty\n");
            return (0.0);
        }
    }


    Source Code To Get Operands And Operators

    /* getop: get next operator or numeric operand */
    int
    getop(char s[])
    {
        int i, c;

        while ((s[0] = c = getch()) == ' ' || c == '\t')
            ;
        s[1] = '\0';
        if (!isdigit(c) && c != '.')
            return c; /* not a number */
        i = 0;
        if (isdigit(c)) /* collect integer part */
            while (isdigit(s[++i] = c = getch()))
                ;
        if (c == '.') /* collect fraction part */
            while (isdigit(s[++i] = c = getch()))
                ;
        s[i] = '\0';
        if (c != EOF)
            ungetch(c);
        return NUMBER;
    }


    What Are getch And ungetch?

    What are getch and ungetch?

    It is often the case that a program cannot determine that it has enough input until it has read too much.

    Example: Collecting the characters that make up a number

    Problem: Until the first non-digit is seen, the number is not complete. But then the program has read one character too far.

    Solution: It would be nice if it were possible to un-read the unwanted character.


  • The Functions getch And ungetch

    getch    delivers the next input character to be considered

    ungetch  remembers the characters put back on the input. Subsequent calls to getch will return them before reading new input. [22]

    - They work together via a shared buffer and an index into the buffer.
    - Because of that, and because they must retain their values between calls, the buffer and the index must be external to both functions.

    [22] ungetc(3), declared in <stdio.h>, un-gets a character from an input stream

    Source Code: (un-)getch

    #define BUFSIZE 100

    char buf[BUFSIZE]; /* buffer for ungetch */
    int bufp = 0; /* next free position in buf */

    /* getch: get a (possibly pushed back) character */
    int
    getch(void)
    {
        return (bufp > 0) ? buf[--bufp] : getchar();
    }

    /* ungetch: push character back on input */
    void
    ungetch(int c)
    {
        if (bufp >= BUFSIZE)
            printf("ungetch: too many characters\n");
        else
            buf[bufp++] = c;
    }
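    The standard library offers the same push-back idea through ungetc(3), which is guaranteed to accept at least one character per stream. The sketch below shows the "read one character too far, then put it back" pattern with it; read_uint is a hypothetical helper, and overflow of the result is not handled.

```c
#include <ctype.h>
#include <stdio.h>

/* read_uint: read an unsigned decimal number from fp.
 * The first non-digit is pushed back with ungetc(3) so that the next
 * read sees it again. Returns -1 if no digit was found. */
long
read_uint(FILE *fp)
{
    long n = 0;
    int c, ndigits = 0;

    while ((c = getc(fp)) != EOF && isdigit(c)) {
        n = 10 * n + (c - '0');
        ndigits++;
    }
    if (c != EOF)
        ungetc(c, fp);      /* one character too far: un-read it */
    return (ndigits > 0) ? n : -1;
}
```

    After read_uint has consumed "123" from the input "123+4", the next getc on the same stream returns '+', just as getch would after ungetch.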


    A Program In Several Files

    - As seen in assignments, the functions and variables that make up a C program need not all be compiled at the same time.
    - The source text may be kept in several files, and previously compiled routines may be loaded from libraries.

    This raises some questions:

    - How are declarations written so that variables are properly declared during compilation?
    - How are declarations arranged so that all the pieces will be properly connected when the program is loaded?
    - How are declarations organized so there is only one copy?
    - How are external variables initialized?


  • Visibility Scope

    Visibility Scope

    The scope of a name is the part of the program within which the name can be used.

    - For an automatic variable declared at the beginning of a function, the scope is the function in which the name is declared
    - Local variables of the same name in different functions are unrelated
    - The same is true of the parameters of the function, which are in effect local variables
    - The scope of an external variable or a function lasts from the point at which it is declared to the end of the file being compiled


    Scope In Natural Order Of Appearance

    main, sp, val, push, and pop are defined in one file, in the order shown:

    int
    main(void)
    { ... }

    int sp = 0;
    double val[MAXVAL];

    void
    push(double f)
    { ... }

    double
    pop(void)
    { ... }

    Variables sp and val may be used in push and pop simply by naming them; no further declarations are needed. But these names are not visible in main, nor are push and pop themselves.
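    For the functions, the usual remedy is to place prototypes ahead of the first use. The following sketch keeps the order of definitions from the slide but adds such prototypes; demo is a hypothetical caller standing in for main, and the error messages of push and pop are left out to keep the example short.

```c
#define MAXVAL 100

/* Prototypes ahead of the first use: push and pop are now visible in demo. */
void   push(double f);
double pop(void);

/* demo: hypothetical caller defined before push and pop. */
double
demo(void)
{
    push(1.5);
    push(2.5);
    return pop() + pop();   /* + is commutative, so the order is harmless */
}

int    sp = 0;              /* visible from here to the end of the file */
double val[MAXVAL];

void
push(double f)
{
    if (sp < MAXVAL)
        val[sp++] = f;
}

double
pop(void)
{
    return (sp > 0) ? val[--sp] : 0.0;
}
```

    The variables sp and val remain invisible in demo, which is exactly the encapsulation the calculator design asks for.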


    Definition & Declaration Of External Variables

    Definition and Declaration of External Variables

    - If an external variable is to be referred to before it is defined,
    - or if it is defined in a different source file from the one where it is being used,
    - then an extern declaration is mandatory.

    It is important to distinguish between the declaration of an external variable and its definition.

    definition:   causes storage to be set aside (sets storage class)
    declaration:  announces the properties of a variable (its type)


    Definition And Declaration Of External Variables

    If the following lines appear outside of any function:

    int sp;
    double val[MAXVAL];

    - They define the external variables sp and val,
    - cause storage to be set aside,
    - and serve as the declaration for the rest of that source file.

    On the other hand, consider the lines:

    extern int sp;
    extern double val[];

    - They declare for the rest of the source file that sp is an int and val is a double array (whose size is determined elsewhere)
    - They do not create the variables or reserve storage for them.


  • Wrap Up Definition And Declaration

    Wrap Up Definition and Declaration

    - There must be only one definition of an external variable among all files that make up the program.
    - Initialization of an external variable is possible only within the definition
    - Other files may contain extern declarations to access it [23]
    - Array sizes must be specified with the definition, but are optional with an extern declaration.

    [23] There may also be extern declarations in the file containing the definition

    Definition/Declaration Of Externals

    Although it is not a likely organization for this program,

    - functions push and pop could be defined in one file
    - variables val and sp defined and initialized in another.

    These definitions and declarations tie them together:

    In file 1:

        extern int sp;
        extern double val[];

        void push(double f) { ... }
        double pop(void) { ... }

    In file 2:

        #define MAXVALUE 100

        int sp = 0;
        double val[MAXVALUE];

    - Because the extern declarations lie ahead of and outside the function definitions, they apply to all functions
    - One set of declarations suffices for all of file 1
    - The same organization would also be needed if the definitions of sp and val followed their use in one file
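    The single-file case mentioned in the last point can be sketched in a few lines. The names counter and bump are hypothetical; the pattern is the same as for sp and val.

```c
extern int counter;     /* declaration: announces the type, no storage */

/* bump: uses counter before its definition appears in the file */
int
bump(void)
{
    return ++counter;
}

int counter = 10;       /* definition: storage is set aside and initialized */
```

    Without the extern line, the use of counter inside bump would not compile, because the scope of the definition only starts at the point where it appears.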


    Program Organisation In Different Files

    Let us now divide the calculator program into several source files (as a simulation for substantially bigger programs)

    - main -> main.c
    - push and pop, and their variables -> stack.c
    - getop -> getop.c
    - getch and ungetch -> getch.c [24]

    [24] We separate them from the others because they would come from a separately compiled library in a realistic program


  • Header File

    What about the definitions & declarations shared among files?

    - As much as possible, we want to centralize this
    - As a consequence, there would be only one copy to get right and keep right as the program evolves
    - We will place this common material in a header file calc.h
    - It will be included by the others as necessary
    - There is a tradeoff between the desire that each file have access only to the information it needs for its job and the practical reality that it is harder to maintain more header files
    - Up to some moderate program size, it is probably best to have one header file that contains everything that is to be shared between any two parts of the program


    Program Structure

    calc.h:

        #define NUMBER '0'
        void push(double);
        double pop(void);
        int getop(char []);
        int getch(void);
        void ungetch(int);

    main.c:

        #include <stdio.h>
        #include <stdlib.h>
        #include "calc.h"
        #define MAXOP 100

        int main(void) { ... }

    getch.c:

        #include <stdio.h>
        #define BUFSIZE 100

        char buf[BUFSIZE];
        int bufp = 0;

        int getch(void) { ... }
        void ungetch(int) { ... }

    stack.c:

        #include <stdio.h>
        #include "calc.h"
        #define MAXVAL 100

        int sp = 0;
        double val[MAXVAL];

        void push(double) { ... }
        double pop(void) { ... }

    getop.c:

        #include <stdio.h>
        #include <ctype.h>
        #include "calc.h"

        int getop(char []) { ... }
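    Written out as a file, calc.h could look like the sketch below. The include guard (CALC_H) is an assumption added here as common practice; it is not shown on the slide. It keeps the declarations from being processed twice if the header is included more than once in the same translation unit.

```c
/* calc.h: declarations shared by main.c, stack.c, getop.c and getch.c */
#ifndef CALC_H              /* include guard: an addition, not on the slide */
#define CALC_H

#define NUMBER '0'          /* signal that a number was found */

void   push(double);
double pop(void);
int    getop(char []);
int    getch(void);
void   ungetch(int);

#endif /* CALC_H */
```

    Each source file then needs only #include "calc.h" to see consistent prototypes, so a mismatch between a definition and its uses is caught by the compiler.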


    Static Variables

    The variables

    - sp and val in stack.c and buf and bufp in getch.c
    - are for private use of functions in their source files
    - are not meant to be accessed by anything else

    The static declaration

    - applied to an external variable or function
    - limits the scope of that object to the rest of the source file

    External static thus provides a way to hide names like buf and bufp in the getch-ungetch combination, which must be external so they can be shared, yet which should not be visible to users of getch and ungetch.


  • External Static Example

    If the two functions and the two variables are compiled in one file:

    static char buf[BUFSIZE]; /* buffer for ungetch */
    static int bufp = 0; /* next free position in buf */

    int getch(void) { ... }
    void ungetch(int c) { ... }

    - No other function will be able to access buf and bufp
    - The names will not conflict with the names in other files of the same program
    - The same goes for sp and val in stack.c
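    The same pattern works for any state that one group of functions shares but nobody else should touch. A minimal sketch with hypothetical names (next_id, new_id):

```c
static int next_id = 1;     /* file-private: invisible to other source files */

/* new_id: hand out consecutive identifiers, starting with 1 */
int
new_id(void)
{
    return next_id++;
}
```

    Other files can call new_id, but an extern int next_id; declaration elsewhere would not link against this variable, because static gives it internal linkage.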


    Specifier static For Functions

    - The external static is most often used for variables
    - But it can be applied to functions as well
    - Normally, function names are global; if a function is declared static, however, its name is invisible outside of the file in which it is compiled.

    $ readelf -s global.o

    Symbol table .symtab contains 7 entries:

    Num: Value Size Type Bind Vis Ndx Name

    0: 00000000 0 NOTYPE LOCAL DEFAULT UND

    1: 00000000 0 FILE LOCAL DEFAULT ABS global.c

    2: 00000000 0 SECTION LOCAL DEFAULT 1

    3: