c course - strings

Upload: karmamen

Post on 01-Jun-2018


  • 8/9/2019 C Course - Strings


    Strings


    C Style Strings

    There is no such thing as a string type in C (there are strings in C++, Delphi, C#, Java …).

    In C, a string is a bounded array of char with a terminating zero, i.e. '\0'.

    Neither the compiler nor the run-time environment performs automatic bounds checking or overwrite detection on the string array (or on any other array).

    Improper string handling is a frequent source of programming errors.

    Example:

    char buffer[21];

    This definition allocates a buffer for a string of up to 20 characters.

    Why is it 21 bytes long, then? Because the 21st element of the array is for the terminating zero.


    A Simple String

    char buf[10] = "Hello!\n";

    The variable buf is an array; in most expressions it behaves like a "pointer" to where, in memory, the string is located.

    A string ends with a null ('\0') character. Some string operations, like the string declaration above, automatically append the null terminator, but sometimes a string must be explicitly null-terminated.


    String declarations

    Often you will see strings declared in one of these ways:

    1. char *no_space_buffer;
    2. char *class_name = "C course";
    3. char just_fit_buffer[] = "enough space for str";
    4. char buffer_with_space[100];

    1. Only a pointer is declared; storage must come from somewhere else.
    2. The pointer gets the address of a string in the string constant table, created at compile time in the ".text" section (or another read-only section, depending on the toolchain).
    3. Allocates a buffer whose size (in bytes) is exactly the size of the initialization string plus the null character.
    4. Explicitly tells the compiler to allocate 100 bytes of storage.


    Assigning a string

    BAD 1. char buffer[20];
            buffer = "test string 1"; /* error: an array cannot be assigned to */

    BAD 2. char buffer[]; /* error: the array size is missing */
            buffer = "test string 2";

    GOOD 3. char buffer[20] = "test string 3";

    GOOD 4. char buffer[] = "test string 4";

    GOOD 5. char * buffer;
            buffer = "test string 5";

    GOOD 6. char * buffer = "test string 6";


    Multiple strings in a single memory chunk

    Consider the following code:

    char buf[20] = "Hello, World!\n";

    char *buf2 = buf + 7;

    printf("buf: %s\n", buf);

    printf("buf2: %s\n", buf2);

    buf2[0] = 'M';

    printf("buf: %s\n", buf);

    What is the output?


    Character Literals

    Character literals are enclosed in single quotes (i.e., ', not ").

    char buf[10];
    buf[0] = 'A'; /* correct */
    buf[0] = "A"; /* incorrect */
    buf[1] = 0;   /* null terminator */


    Buffer Overflow

    Strings in C don't "grow" automatically. Instead, the programmer must always be aware of how much space is available. Consider this example:

    char s1[10];

    char s2[10];

    strcpy( s1,"This string is to long!\n" );

    We are copying 25 bytes into a 10-byte buffer!

    We have most likely overwritten string s2!

    Since the string needs 25 bytes (24 characters plus the terminating zero) and s1 and s2 are 20 bytes in total, we have written off the end of our own memory and have potentially corrupted the program's call stack!

    The compiler and run-time environment will not detect this!


    String constants

    Suppose we have the following code:

    Example 1

    char *str;
    str = "hello";
    printf("%s\n", str);

    Example 2

    char str[100];
    strcpy(str, "hello");
    printf("%s\n", str);

    These two fragments produce the same output, but their internal behavior is quite different.


    String constants

    Example 1

    The statement

    str = "hello";

    causes str to point to the address of the string "hello" in the string constant table. The table is typically stored in read-only memory alongside the executable code, so the string can be used only in a read-only manner.

    Example 2

    The string "hello" is a part of the string constant table, so you can copy it into the array of characters named str, which can be used in a read-write manner.

    Since str is not a pointer, the statement

    str = "hello";

    will not work in Example 2. It will not even compile.


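    Strings and malloc

    INCORRECT

    #include <stdlib.h>

    int main()
    {
        char *str;
        str = (char *) malloc (100);
        str = "hello"; /* BUG: str no longer points to the allocated block */
        free(str);     /* BUG: this address was never returned by malloc() */
        return 0;
    }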

    It compiles properly, but typically crashes (for example with a segmentation fault) at the free() line when you run it. Why?

    The malloc() line allocates a block 100 bytes long and points str at it.

    The line str = "hello" is syntactically correct because str is a pointer, but when it is executed, str points to the string in the string constant table and the allocated block is orphaned. Since str is pointing into the string constant table, the string cannot be changed.

    free() fails because str no longer holds an address returned by malloc(); a string in the string constant table cannot be deallocated.

    CORRECT

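    #include <stdlib.h>
    #include <string.h>

    int main()
    {
        char *str;
        str = (char *) malloc (100);
        if (str == NULL) /* malloc() can fail */
            return 1;
        strcpy(str, "hello"); /* copies into the allocated block */
        free(str);
        return 0;
    }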


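    Passing a string as an argument to a function

    Arrays are never copied when passed to a function: the array name is converted to a pointer to its first element, so the array is effectively passed by reference.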

    So, Print() can be written in two ways:

    void Print(char the_string[])

    {

     printf("String: %s\n", the_string);

    }

    or

    void Print(char *the_string)

    {

     printf("String: %s\n", the_string);

    }


    strcat, strncat

    char * strcat(char *s1, const char *s2);

    char * strncat(char *s1, const char *s2, size_t n);

    The strcat() and strncat() functions append a copy of the null-terminated string s2 to the end of the null-terminated string s1, then add a terminating '\0'. The string s1 must have sufficient space to hold the result.

    The strncat() function appends no more than n characters from s2, and then adds a terminating '\0'.

    The strcat() function is easily misused and can easily overflow the destination buffer. It is advised to use strncat() and ensure that no more characters are copied to the destination buffer than it can hold. Note that n limits the number of appended characters, not the total size of the buffer, so it should be at most the buffer size minus the current length of s1 minus one for the terminator.


    strcmp, strncmp

    int strcmp(const char *s1, const char *s2);

    int strncmp(const char *s1, const char *s2, size_t n);

    The strcmp() and strncmp() functions compare the null-terminated strings s1 and s2 character by character.

    The strncmp() function compares not more than n characters.

    Characters that appear after a '\0' character are not compared.

    The return value is less than, equal to, or greater than zero if s1 (or the first n bytes thereof) is found, respectively, to be less than, to match, or to be greater than s2.


    strtok

    char *strtok(char *str, const char *delim);

    The strtok() function parses a string into a sequence of tokens.

    On the first call to strtok() the string to be parsed should be specified in str. In each subsequent call that should parse the same string, str should be NULL.

    The delim argument specifies a set of characters that delimit the tokens in the parsed string. The caller may specify different strings in delim in successive calls that parse the same string.

    Each call to strtok() returns a pointer to a null-terminated string containing the next token. This string does not include the delimiting character. If no more tokens are found, strtok() returns NULL.

    WARNING

    This function modifies its first argument, so it cannot be used on constant strings.