introduction to regex(2)

Upload: tota-binothman

Post on 07-Apr-2018

  • 8/4/2019 Introduction to Regex(2)

    REGULAR EXPRESSIONS & LEX

    Arwa Basabrain

    What Are Regular Expressions?

    Regular expressions are patterns of characters that match, or fail to match, sequences of characters in text. To allow developers to create regular expression patterns, certain characters and combinations of characters have special meanings and uses.

    What Can Regular Expressions Be Used For?

    Finding Doubled Words

    Checking Input from Web Forms

    Changing Date Formats

    Finding Incorrect Case

    Search and Replace in Word Processors

    Directory Listings

    Online Searching

    Regular Expression Basics

    . : matches any single character except \n

    * : matches 0 or more instances of the preceding regular expression

    + : matches 1 or more instances of the preceding regular expression

    ? : matches 0 or 1 of the preceding regular expression

    | : matches the preceding or following regular expression

    [ ] : defines a character class

    ( ) : groups the enclosed regular expression into a new regular expression

    "..." : matches everything within the quotes literally
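The operators above can be tried out with a short sketch. It uses Python's `re` module rather than Lex, so this is an illustration under that assumption; the sample patterns, the `checks` table and the `full_match` helper are made up for the example.

```python
import re

# Each entry: (pattern, text, expected result of a whole-string match)
checks = [
    (r"c.t", "cat", True),        # . matches any single character
    (r"c.t", "c\nt", False),      # ... except a newline
    (r"ab*", "a", True),          # * allows zero occurrences of b
    (r"ab+", "a", False),         # + needs at least one b
    (r"colou?r", "color", True),  # ? makes the u optional
    (r"cat|dog", "dog", True),    # | picks either alternative
    (r"[abc]x", "bx", True),      # [ ] matches one character from the class
    (r"(ab)+", "ababab", True),   # ( ) groups "ab" so + repeats the pair
]


def full_match(pattern, text):
    """Return True when the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None


results = [full_match(pattern, text) for pattern, text, _ in checks]
```

Comparing `results` with the third column of `checks` confirms each operator behaves as described in the list.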

    Regular Expression Basics

    . Any character (may or may not match line terminators)

    \d A digit: [0-9]

    \D A non-digit: [^0-9]

    \s A whitespace character: [ \t\n\x0B\f\r]

    \S A non-whitespace character: [^\s]

    \w A word character: [a-zA-Z_0-9]

    \W A non-word character: [^\w]
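A minimal sketch of these shorthand classes, assuming Python's `re` module. The `sample` string is invented for the example, and `re.ASCII` is passed so that `\d`, `\w` and `\s` are limited to the ASCII sets listed above (without it, Python also accepts Unicode digits and letters).

```python
import re

sample = "Room 42,\tfloor_3 is open"

# re.ASCII keeps the classes equal to the ASCII definitions in the table.
digits = re.findall(r"\d+", sample, re.ASCII)     # runs of digits
words = re.findall(r"\w+", sample, re.ASCII)      # runs of word characters
spaces = re.findall(r"\s", sample, re.ASCII)      # single whitespace characters
non_words = re.findall(r"\W+", sample, re.ASCII)  # runs of non-word characters
```

Note that the underscore counts as a word character, so `floor_3` stays in one piece.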

    Meta-characters

    meta-characters (do not match themselves, because they are used in the preceding regular expressions):

    ( ) [ ] { } < > + / , ^ * | . \ " $ ? - %

    to match a meta-character, prefix with "\"

    to match a backslash, tab or newline, use \\, \t, or \n
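The escaping rule can be sketched as follows, again assuming Python's `re` module rather than Lex. Python's set of meta-characters is smaller than the list above, and `price_text` is a made-up sample.

```python
import re

price_text = "Total: $3.50 (approx.)"

# Unescaped, "$" anchors at the end of the text and "." matches any
# character, so this pattern never finds the literal price.
unescaped = re.search(r"$3.50", price_text)

# Prefixing each meta-character with a backslash matches it literally.
escaped = re.search(r"\$3\.50", price_text)

# re.escape adds the backslashes automatically.
auto_match = re.search(re.escape("$3.50"), price_text)

# A literal backslash is written as \\ in the pattern.
backslash = re.search(r"\\", r"C:\temp")
```

Here `unescaped` is `None`, while `escaped` and `auto_match` both find `$3.50`.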

    Lex Regular Expressions

    Lex uses an extended form of regular expression:

    (c: character; x, y: regular expressions; s: string; m, n: integers; i: identifier)

    c any character except meta-characters (see above)

    [...] the list of enclosed chars (may be a range)

    [^...] the list of chars not enclosed

    . any ASCII char except newline

    xy concatenation of x and y

    x* zero or more occurrences of x

    x+ one or more occurrences of x (i.e. x* but not the empty string)

    x? an optional x (i.e. x or the empty string)

    Lex Reg Exp (cont)

    x|y x or y

    {i} definition of i

    x/y x, only if followed by y (y not removed from input)

    x{m,n} m to n occurrences of x

    ^x x, but only at beginning of line

    x$ x, but only at end of line

    "s" exactly what is in the quotes (except for "\" and the following character)

    A regular expression finishes with a space, tab or newline
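Several of these Lex operators have close counterparts in Python's `re` module, which makes them easy to experiment with. The sketch below is an approximation, not Lex itself: trailing context `x/y` is imitated with a lookahead `x(?=y)`, quoted strings with `re.escape`, and the sample `text` is invented.

```python
import re

text = "foo bar\nbar foo\nfoobar"

# x/y (trailing context): "foo" only when "bar" follows; "bar" is not consumed.
trailing = [m.span() for m in re.finditer(r"foo(?=bar)", text)]

# x{m,n}: between two and three occurrences of "a".
counts = [re.fullmatch(r"a{2,3}", s) is not None
          for s in ("a", "aa", "aaa", "aaaa")]

# ^x and x$: with re.MULTILINE the anchors apply to every line.
line_starts = [m.start() for m in re.finditer(r"^foo", text, re.MULTILINE)]
line_ends = [m.start() for m in re.finditer(r"foo$", text, re.MULTILINE)]

# "s": Lex quotes switch off the operators; re.escape plays that role here.
literal = re.search(re.escape("a+b"), "solve a+b now").group(0)
```

Only the `foo` in `foobar` is reported by the lookahead, and the match spans just the three characters of `foo`.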

    Regular Expression Examples

    Matching Floating Point Numbers

    [-+]?[0-9]*\.?[0-9]+

    Match numbers with exponents

    [-+]?[0-9]*\.?[0-9]+([eE][-+]?[0-9]+)?

    To validate that a whole string holds a floating point number, rather than finding a floating point number within longer text, anchor the pattern:

    ^[-+]?[0-9]*\.?[0-9]+$
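The difference between finding and validating can be sketched with Python's `re` module. The patterns are the ones from this slide; the names `FLOAT`, `FLOAT_EXP`, `is_float` and the sample `text` are illustrative assumptions.

```python
import re

FLOAT = r"[-+]?[0-9]*\.?[0-9]+"
FLOAT_EXP = FLOAT + r"([eE][-+]?[0-9]+)?"

text = "Readings: 3.14, -0.5, +7 and 6.02e23."

# Finding: the basic pattern splits "6.02e23" into two separate numbers.
found = re.findall(FLOAT, text)

# The exponent-aware pattern keeps it whole. group(0) is used because
# findall would return only the contents of the capturing group.
found_exp = [m.group(0) for m in re.finditer(FLOAT_EXP, text)]


def is_float(candidate):
    """Validate that the whole string is a number, using ^ and $ anchors."""
    return re.match("^" + FLOAT + "$", candidate) is not None
```

For example, `is_float(".5")` is true, while `is_float("5.")` is false because the pattern requires at least one digit after the optional decimal point.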

    Regular Expression Examples (cont)

    Matching a Valid Date

    (19|20)\d\d[- /.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])

    Match Email Address

    ^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$

    Match a complete line of text that contains any of the words "one", "two" or "three"

    ^.*\b(one|two|three)\b.*$
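A brief sketch of the three patterns on this slide, assuming Python's `re` module; the test strings are invented. It also shows two limits worth knowing: the date pattern checks digit ranges only, and the email pattern lists uppercase letters only, so it needs case-insensitive matching for typical addresses.

```python
import re

DATE = r"(19|20)\d\d[- /.](0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])"
EMAIL = r"^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$"
WORDS = r"^.*\b(one|two|three)\b.*$"

date_ok = re.fullmatch(DATE, "2018-04-07") is not None
date_bad_month = re.fullmatch(DATE, "2018-13-07") is not None
# Ranges are checked, calendar validity is not: 31 February still matches.
date_feb_31 = re.fullmatch(DATE, "2018-02-31") is not None

email_upper = re.match(EMAIL, "USER@EXAMPLE.COM") is not None
email_lower_strict = re.match(EMAIL, "user@example.com") is not None
email_lower = re.match(EMAIL, "user@example.com", re.IGNORECASE) is not None

lines = "no match here\nchapter two begins\nstone cold"
# re.MULTILINE makes ^ and $ apply per line; \b rejects "one" inside "stone".
matching_lines = [m.group(0) for m in re.finditer(WORDS, lines, re.MULTILINE)]
```

Only `chapter two begins` is returned, since the word boundaries prevent a match on the `one` inside `stone`.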

    Regular Expressions Designer Program

    Language Element section

    Input, Regular Expression & Result sections

    Lex program Example

    definitions
    %%
    rules
    %%
    subroutines
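The three-part layout can be mirrored in a small Python scanner to show what each section contributes. This is a rough analogy under stated assumptions, not generated Lex output: the names `DIGIT`, `LETTER`, `RULES`, `emit` and `scan` are hypothetical, and the token kinds are chosen only for illustration.

```python
import re

# --- definitions: named sub-patterns reused by the rules ---
DIGIT = r"[0-9]"
LETTER = r"[a-zA-Z_]"

# --- rules: (pattern, action) pairs tried at the current input position ---
RULES = [
    (re.compile(rf"{DIGIT}+"), lambda s: emit("NUMBER", s)),
    (re.compile(rf"{LETTER}({LETTER}|{DIGIT})*"), lambda s: emit("IDENT", s)),
    (re.compile(r"[-+*/=]"), lambda s: emit("OP", s)),
    (re.compile(r"[ \t\n]+"), lambda s: None),  # whitespace: no action
]

# --- subroutines: supporting code called by the actions ---
tokens = []


def emit(kind, text):
    tokens.append((kind, text))


def scan(source):
    """Apply the rules repeatedly; longest match wins, ties go to the
    earliest rule (the disambiguation Lex uses)."""
    tokens.clear()
    pos = 0
    while pos < len(source):
        best_len, best_action = 0, None
        for pattern, action in RULES:
            m = pattern.match(source, pos)
            if m and len(m.group(0)) > best_len:
                best_len, best_action = len(m.group(0)), action
        if best_action is None:
            # Lex would echo the character by default; this sketch stops.
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        best_action(source[pos:pos + best_len])
        pos += best_len
    return list(tokens)
```

Calling `scan("x1 = 42 + y")` yields an identifier, an operator, a number, an operator and an identifier, with whitespace skipped.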