Page 1: HKUST Summer Programming Course 2008

Department of Computer Science and Engineering, HKUST

HKUST Summer Programming Course 2008

C API

~ Interfacing C programming

Overview

Introduction
Streams
Output Functions
Input Functions
File-Related Functions
Memory Management Functions
Other aspects of C Programming

Introduction

Motivation

The system offers quite a lot of functionality to programmers, for example:
Creating and destroying processes.
Reading and writing files.
Opening network connections.
…

Historically, this functionality was implemented as C functions. Some library authors provide C++ wrappers, but many programmers still prefer the C versions.

C functions may be faster than their C++ wrappers, because the wrappers add some overhead.

C functions to cover in this lecture

Formatted I/O
    Output: printf, fprintf, sprintf, snprintf
    Input: scanf, fscanf, sscanf

File I/O
    fopen, feof, fgetc, fputc, fread, fwrite, fclose, fseek, rewind, fflush

Memory Allocation
    malloc, realloc, calloc, free, memcpy, memset

References

ALWAYS look up the manual page before using a function for the first time. Important: learn how to read a manual page.

Most of this material is extracted from: http://www.cplusplus.com/ref/

If anything is unclear, you can always refer to the aforementioned website.

Streams

Streams in C

[Figure: files file1 … fileN are managed by the OS; streams Stream1 … StreamM attach to them; the File I/O code of our C program works with the streams, next to the program's other code.]

Screen and keyboard are emulated as files in the OS.

Streams in C

There are three standard I/O streams:
    stdin – standard input (i.e. keyboard) (C++: cin)
    stdout – standard output (i.e. screen) (C++: cout)
    stderr – standard error (i.e. screen) (C++: cerr)

You have been using stdout whenever you use cout! You can use cerr << "ERROR" << endl; to print to the standard error stream.

I/O operations in C are performed using streams.

Files are also accessed through streams; these streams can be created or destroyed whenever necessary.

In C, a stream is identified by a pointer of type FILE*. (Underneath, the operating system identifies each open file by a number, called the file descriptor.) E.g. printf outputs to stdout, while fprintf can output to any stream. (We will cover the fprintf function soon.)

Output Streams in C

Besides standard output (stdout), there is another way to output data to the screen: the standard error stream (stderr).

With two different output streams (both directed to the screen by default), we can manage the display better:
    Normal output (input prompts, messages, information) -> stdout
    Error output (error messages) -> stderr

We can even separate the two types of output by redirection (next slide).

Capture Output in OS

It is possible to capture the output generated by a program using the redirection operators of the shell (e.g. the C shell or Bash on Linux, or cmd.exe on Windows). Log files are very useful in debugging!

Example (Linux, C shell):
    ./a.out > output.txt

In the above, instead of being printed to the screen, the normal output is saved to a file named output.txt.

Data printed to the error stream is not redirected (it is still displayed on the screen).

    (./a.out > output.txt) >& error.txt

This redirects the normal output to output.txt and the error output to error.txt. (In Bash, the equivalent is ./a.out > output.txt 2> error.txt.)

Redirect Input Stream

Similar to output streams, we can redirect the content of a file to the standard input stream.

Example:

    // program
    #include <iostream>
    using namespace std;

    int main( ) {
        int x, y;
        double z;
        cin >> x >> y >> z;
        return 0;
    }

    // input.txt
    100 90
    80.2

    // command line
    ./a.out < input.txt

Output Functions

Formatted I/O

The printf function provides a convenient way to output formatted data to stdout.
    int printf(const char * format [ , argument , ...]);

Print formatted data: prints the arguments formatted as the format argument specifies.

The format is a C-string that specifies what the output looks like.

The arguments are the data to substitute into the format template.

Recall that printf accepts any number of parameters, which is implemented with an ellipsis (variable argument list) in C.

It returns the number of characters written, or a negative number when an error occurs.

Formatted I/O

Example:
    printf("Hello\n");
    printf("Hello %d\n", 100);
    printf("Hello %f %s\n", 2.5, "abc");
    …

Each % conversion marks a place in the output where the next argument is substituted:
    %d -> integer
    %f -> floating point number
    %s -> C-string
    %% -> the % sign
    … and many others (check the documentation).

Formatted I/O

Recall that an ellipsis parameter list cannot check the types of the arguments.

A common pitfall in using printf is passing an argument whose type does not match the format. For example:
    printf("Hello %f\n", 100);
This is undefined behavior and typically prints a garbage value, since printf tries to interpret the argument as a floating point number although an integer was passed.

Formatting strings

The same formatting can be used to build a string with sprintf.
    int sprintf(char * buffer, const char * format [ , argument , ...]);

Print formatted data to a string: writes the arguments, formatted as the format argument specifies, to the given buffer instead of stdout.

The first argument is a character array (the buffer) into which the function writes the formatted string.

fprintf is a similar function that can print to any stream, including file streams.

sprintf Example

    #include <cstdio>

    int main() {
        char* buffer = new char[30];    // buffer on the heap
        char buffer2[40];               // buffer on the stack
        sprintf(buffer, "Hello %s", "World");
        sprintf(buffer2, "Mario %s", "World");
        delete[] buffer;
        return 0;
    }

The buffer – and the buffer overflow problem

The first argument of sprintf() must be a buffer in which the formatted output is stored.

The buffer should be allocated, either on the stack (fixed-size array) or on the heap (dynamic array).

What if it points to somewhere that cannot be written?
A runtime error may occur, because the write corrupts a memory location that may be in use by another variable.
It is just like throwing rubbish on the street. If there is a policeman, you are caught; otherwise you get away with it. DON'T take the chance.

What if it points to allocated memory that is not large enough to store the formatted output?
    char buffer[5];
    sprintf( buffer, "%s", "Hello World" );
Heap -> may corrupt other objects.
Stack -> may corrupt the stack -> serious problem.

Buffer Overflow Attack

A buffer overflow attack is a typical method for crackers to break a program. Usually, they feed the program an input string that is too long, so that it eventually corrupts the stack and makes the program do something "strange". Don't try this if you are not an expert.

In the extreme case, crackers can execute arbitrary code on another machine.

Dangerous? Yes! You must be careful about every buffer and whether it can overflow. The function snprintf helps to avoid buffer overflows.

snprintf

int snprintf(char *s, size_t size, const char *template, ...)

The snprintf function is similar to sprintf, except that the size argument specifies the maximum number of characters to produce. The trailing null character is counted towards this limit, so you should allocate at least size characters for the string s.

As a reminder, size_t is an unsigned integer type.

Input Functions

scanf

Now we turn from formatted output to formatted input.
    int scanf(const char * format [ , argument , ...]);

Read formatted data from stdin: reads data from the standard input (stdin) and stores it in the locations given by the argument(s). The location pointed to by each argument is filled with a value of the type requested in the format string.

There is NO pass-by-reference in C; passing a pointer to non-const data is the only way to let a function modify a variable of its caller.

The number of conversion specifiers in the format string must match the number of arguments passed.

scanf - pointers

Notice that to use scanf, you must pass the address of the variable that stores the input.

Examples:
    int x;
    scanf("%d", &x);
    Input: 22
    Result: x = 22

    float f; int y;
    scanf("%f=%d", &f, &y);
    Input: 0.2=3
    Result: f = 0.2, y = 3

    int x; char remain[1024];
    scanf( "%d", &x );
    scanf( "%s", remain );
    Input: 2.2
    Result: x = 2, remain = .2

fscanf, sscanf

fscanf – accepts an extra parameter that specifies which stream to read from.

sscanf – accepts input from a C-string. You can use this to "parse" a string, i.e. to separate it into components according to a format.

scanf returns …

Return value: the number of items successfully read. This count doesn't include any ignored fields marked with an asterisk (*).

Although seldom used, an asterisk in a conversion specifier reads a value of that type but discards it:
    int a, b;
    scanf("%d %*d %d", &a, &b);
    Input: 1 2 3
    Result: a = 1, b = 3, returns 2

If EOF is returned, an error or the end of the input occurred before the first assignment could be done.

Tips:
Always check the return value.
Output an understandable error message and ask the user to re-enter the input or quit the program.
For more robust parsing, use a more sophisticated parsing method instead of scanf.

Summary

                             Input     Output
Standard I/O                 scanf     printf
File                         fscanf    fprintf
String (character buffer)    sscanf    sprintf, snprintf

File-Related Functions

File I/O: Overview

File I/O is a service provided by the operating system through a set of functions.

These functions must know which file the user is reading or writing.

That's why every file-related function (except fopen) requires a parameter of type FILE*. This argument identifies the open file. The FILE structure typically holds the operating system's identifier for the open file (the file descriptor) together with other properties (such as the access mode).

Various functions for File I/O

fopen – open a file, with various access modes.
feof – check whether the end of the file has been reached.
fgetc – read a character from the file.
fputc – write a character to the file.
fread – read a block of data from the file.
fwrite – write a block of data to the file.
fclose – close the file.
fseek – move the current read/write position.
rewind – move the current read/write position back to the beginning of the file.
fflush – flush the buffer and write to the file immediately.

Binary File / Text File

A text file is simply a human-readable file. A binary file is not human-readable.

Comparison between text files and binary files:

Advantages of text files:
    Human-readable.
    Compressible (since the format inherently wastes a lot of space).

Advantages of binary files:
    No precision problem when storing floating point numbers.
    Usually smaller in size.
    Hide some information (you can't understand the file without knowing the file format).
    Easier to read an arbitrary element of an array (e.g. if each integer is stored as 4 bytes, the 101st element starts at the 401st byte, i.e. at byte offset 400).

Typical File I/O Scenarios (1)

The first thing to consider is whether the file is a text file or a binary file.

The next thing to decide is whether you want to read, write, or append to the file.

Now you can open the file using the fopen function. Remember to check whether the file was opened successfully (fopen returns NULL when it fails).

Then decide whether you want to process the file by byte, by block, or by token:
    For byte-level access, you may want to use fgetc or fputc (text file).
    For blocks, you may want to use fread/fwrite (binary file).
    For tokens, you can use fscanf (text file).

Typical File I/O Scenarios (2)

Whenever you read or write a file, the file position associated with the stream moves to the end of the location you last accessed, and your next operation starts there. When you want to move the file position around in the file, use fseek/rewind.

You can also check whether you have reached the end of the file, using the function feof.

Last, but still very important, is to fclose the file. Otherwise, the content might not be saved to the file (or sometimes others cannot delete or open the file).

More about fgetc()

Recall that the default behavior of cin is to skip whitespace (spaces, newlines).

fgetc (or getchar, which reads from standard input) does not skip whitespace. You can hence use fgetc to count how many spaces there are in an input.

More about feof()

The concept of "end-of-file" is a little bit strange; it can be interpreted in this way:
Imagine that the file is followed by one extra character, called the "eof" character.
Only after an attempt to read this character does feof() return true.
Before that, feof() returns false, even if the last character of the original file has already been read.

For example:
    while (!feof(file)) {
        cout << fgetc(file) << endl;
    }
This is NOT correct! It outputs the end-of-file value once, because after the last character has been read feof() still returns false.

The corrected version:
    while (true) {
        int v = fgetc(file);
        if (feof(file))
            break;
        cout << v << endl;
    }

More about fflush()

By default, C I/O is buffered (that's true for C++ I/O as well).

Buffered I/O means that the data to be output is stored in a piece of memory before it is actually written out. This takes advantage of block I/O performance:
    Writing to memory (RAM) is much more efficient than writing to disk.
    E.g. the DMA technique can be used to write a block of data to disk efficiently.

It is up to the library when to actually write out the data. To force the library to write out the data, use fflush().

When the file is closed using fclose(), the data is flushed.

Memory Management Functions

Memory Management

Memory is NOT an infinite asset, and it is shared across all programs. Recall: deallocate memory once it is no longer needed.

The operating system has a module called the memory manager, which ensures that each process accesses its own memory and does not corrupt the memory of others.

Memory management is a computationally expensive activity. If memory allocation/deallocation happens too often, it leads to:
    Slower system performance.
    Memory fragmentation, where a big piece of contiguous memory is hard to find.

Memory Fragmentation Illustrated

Suppose there are 100 bytes of memory. Each cell stands for 10 bytes; a cell marked "A" is allocated.

[Figure: a row of ten cells, seven of them marked "A"; the free cells are not next to each other.]

When another 20 bytes of contiguous memory is requested, the request cannot be fulfilled, although 30 bytes of memory are actually not allocated.

Memory Related Functions

malloc – allocate a contiguous piece of memory of the specified size.
realloc – resize an allocated piece of memory; the contents are retained.
calloc – like malloc, except that it takes an element count and an element size and initializes the memory to zero.
free – release the memory back to the memory manager.
    The same function is used for both dynamic variables and dynamic arrays (this is different from delete and delete[] in C++).
memcpy – copy bytes from one memory location to another.
memset – quickly set each byte of a piece of memory to a value.

Operator sizeof

Many memory management functions require the size of a type. It is bad practice to hard-code the size (say, hard-coding 4 for an integer).

The sizeof operator returns the size of a variable:
    int x;    sizeof(x) -> 4 bytes (on a Win32 machine)
    double x; sizeof(x) -> 8 bytes (on a Win32 machine)

You can also apply sizeof to a type:
    sizeof(int)

Always use sizeof, even if you know the size of the data type on your machine. This produces more portable code.

More about malloc / free

To allocate a dynamic object, you can use:
    int* pInt = (int*) malloc( sizeof(int) );
    Obj* pObj = (Obj*) malloc( sizeof(Obj) );

To allocate a dynamic array of length ELEMENT, you can use:
    int* pIntArray = (int*) malloc( sizeof(int) * ELEMENT );
    Obj* pObjArray = (Obj*) malloc( sizeof(Obj) * ELEMENT );

If there is not enough space to allocate, malloc returns NULL. Always check for this in your programs.

To deallocate memory, you can use:
    free(pInt);
    free(pObj);
    free(pIntArray);
    free(pObjArray);

More about realloc

Why can realloc be faster than allocating a new block and then copying?
Because it tries to expand the original allocation in place to the specified size, if there is sufficient free space after the original allocation.

This may improve the efficiency of the solution for the last question in the Midterm.

What happens if I use realloc but the current address does NOT allow expansion?
It falls back to allocating a new block and copying; the returned pointer is then different from the original one.

More about memset / memcpy

You can initialize an array to 0 by:
    memset( pIntArray, 0, sizeof(int) * ELEMENT );

You can copy an array to another by:
    int* pIntArray2 = (int*) malloc( sizeof(int) * ELEMENT );
    memcpy( pIntArray2, pIntArray, sizeof(int) * ELEMENT );

Again, this is usually more efficient than the solution for the last question in the Midterm, which uses a for-loop to copy the elements.

Other aspects of C Programming

Other aspects of C Programming

There are some more unusual conventions when writing a C program:
    All variables must be declared at the beginning of a scope (in C89; C99 and later relax this).
    Use a C compiler (gcc) instead of g++.

Use extern "C" in a function prototype to link to functions compiled with a C compiler (more about this on the next slide).

You can also use a C++ compiler to compile many C programs (C++ is nearly a superset of C).

Other aspects of C Programming

Assume func_c.obj, which defines a function add, was compiled with a C compiler, and assume you cannot access the source code of that object file.

Now, if we want to use the function add in a C++ program:

    // A C++ program, using a C function
    #include <iostream>
    using namespace std;

    extern "C" { int add( int, int ); }

    int main( ) {
        int x = 10, y = 12;
        cout << add(x,y) << endl;
        return 0;
    }