regular expressions


Upload: raj-gupta

Post on 22-May-2015


Page 1: Regular expressions

Regular Expressions

Page 2: Regular expressions

Agenda

What are regular expressions
Need for regular expressions
Basic rules
Practical examples
Regular expression groups
search()
match()
replace()

Page 3: Regular expressions

What are Regular expressions

A regular expression is an object that describes a pattern of

text characters.

There are two ways of defining a regular expression :

var regex=new RegExp(pattern,modifiers); or

var regex=/pattern/modifiers;

Example of pattern : ^[a-zA-Z0-9]+$

Modifiers :

g - global, i - ignore case, m - multiline
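As a small illustration of the two notations and the modifiers listed above, the sketch below builds the same pattern both ways. The variable names and sample strings are made up for this example.

```javascript
// Two equivalent ways to define the same case-insensitive pattern.
const literal = /^[a-z0-9]+$/i;
const constructed = new RegExp("^[a-z0-9]+$", "i");

console.log(literal.test("Abc123"));                 // true
console.log(constructed.test("Abc123"));             // true
console.log(literal.source === constructed.source);  // true

// g finds every match; m lets ^ and $ match at each line break.
const lines = "one\ntwo\nthree";
const tWords = lines.match(/^t\w+$/gm);
console.log(tWords);                                 // [ 'two', 'three' ]
```

The constructor form is mainly useful when the pattern text is only known at run time; the literal form is shorter when the pattern is fixed.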

Page 4: Regular expressions

Without regular expressions ...

The following JavaScript function tests whether a string contains only

alphanumeric characters:

function bTestOnlyAlphaNum(strToTest) {
    if (strToTest.length == 0) return false;
    for (var i = 0; i < strToTest.length; i++) {
        var testChar = strToTest.charCodeAt(i);
        // Reject anything outside 0-9 (48-57), A-Z (65-90) and a-z (97-122).
        if ((testChar < 48 || testChar > 57) &&
            (testChar < 65 || testChar > 90) &&
            (testChar < 97 || testChar > 122)) return false;
    }
    return true;
}

Page 5: Regular expressions

Magic of regular expressions

By using regular expressions, the same functionality as shown

in the previous slide can be achieved as :

function bTestOnlyAlphaNum(strToTest) {

return (strToTest.match(/^[a-zA-Z0-9]+$/) != null);

}

We will get into details of this regular expression later. Let us

first walk through the basic rules of regular expressions.
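To check that the short version really behaves like the loop, a quick comparison can be run over a few inputs. This is only a sketch: the function names isAlphaNumLoop and isAlphaNumRegex and the sample strings are invented for the comparison, and test() is used here in place of match() simply because it returns a boolean directly.

```javascript
// Character-code version, equivalent to the loop on the earlier slide.
function isAlphaNumLoop(str) {
  if (str.length === 0) return false;
  for (let i = 0; i < str.length; i++) {
    const c = str.charCodeAt(i);
    const isDigit = c >= 48 && c <= 57;
    const isUpper = c >= 65 && c <= 90;
    const isLower = c >= 97 && c <= 122;
    if (!isDigit && !isUpper && !isLower) return false;
  }
  return true;
}

// Regular expression version.
function isAlphaNumRegex(str) {
  return /^[a-zA-Z0-9]+$/.test(str);
}

const samples = ["abc123", "ABC", "", "hello world", "under_score", "42"];
const agree = samples.every((s) => isAlphaNumLoop(s) === isAlphaNumRegex(s));
console.log(agree); // true
```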

Page 6: Regular expressions

Basic Rules

.   Matches any one character, except for line breaks.
*   Matches 0 or more of the preceding character.
+   Matches 1 or more of the preceding character.
?   Makes the preceding character optional: matches 0 or 1 occurrence.
\d  Matches any single digit (opposite: \D).
\w  Matches any alphanumeric character or underscore (opposite: \W).
\s  Matches a whitespace character (opposite: \S).
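The rules above can be tried out one at a time. The following sketch uses made-up sample strings; each result is stored in a variable so it can be inspected.

```javascript
const dotMatches   = /a.c/.test("abc");        // true:  . matches "b"
const dotNewline   = /a.c/.test("a\nc");       // false: . does not match a line break
const starZero     = /ab*c/.test("ac");        // true:  * allows zero "b"
const plusZero     = /ab+c/.test("ac");        // false: + needs at least one "b"
const optionalU    = /colou?r/.test("color");  // true:  ? makes "u" optional
const digits       = "a1b22".match(/\d/g);     // [ '1', '2', '2' ]
const wordChars    = "a_b c".match(/\w/g);     // [ 'a', '_', 'b', 'c' ]
const splitOnSpace = "a b\tc".split(/\s/);     // [ 'a', 'b', 'c' ]

console.log(dotMatches, dotNewline, starZero, plusZero, optionalU);
console.log(digits, wordChars, splitOnSpace);
```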

Page 7: Regular expressions

Basic Rules

[XYZ]   Matches any single character from the character class.
[XYZ]+  Matches one or more of any of the characters in the set.
$       Matches the end of the string.
^       Matches the beginning of the string.
[^a-z]  Inside a character class, a leading ^ means NOT; in this case, it matches any character that is NOT a lowercase letter.
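A short sketch of character classes, anchors and negation in action, using sample strings chosen only for illustration:

```javascript
const vowels          = "regular".match(/[aeiou]/g);  // [ 'e', 'u', 'a' ]
const runs            = "aa-b-ccc".match(/[abc]+/g);  // [ 'aa', 'b', 'ccc' ]
const startsWithDigit = /^\d/.test("9 lives");        // true
const endsWithDigit   = /\d$/.test("9 lives");        // false
const nonLower        = "abC1d".match(/[^a-z]/g);     // [ 'C', '1' ]

console.log(vowels, runs, startsWithDigit, endsWithDigit, nonLower);
```

Note that ^ has two unrelated meanings: outside brackets it anchors the match to the start of the string, while as the first character inside brackets it negates the class.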

Page 8: Regular expressions

Practical Examples

1. In many cases, we want the user to enter only alphanumeric

characters. The following function checks for exactly that.

function bTestOnlyAlphaNum(strToTest) {

return (strToTest.match(/^[a-zA-Z0-9]+$/) != null);

}

Here we call JavaScript's match() function on strToTest,

which is a string.

Page 9: Regular expressions

Practical Examples

match() returns a non-null value if the string matches

the regular expression pattern; otherwise it returns null. If it

returns a non-null value, our function returns true, meaning that

the input string contains only alphanumeric characters.

/^[a-zA-Z0-9]+$/

is the regular expression pattern.

^ specifies – from the beginning of the input string.

[a-zA-Z0-9] specifies – any one character which may be

any lowercase letter, uppercase letter or digit.

Page 10: Regular expressions

Practical Examples

+ specifies – one or more occurrences of the preceding item, here the character class [a-zA-Z0-9].

$ specifies – the end of the input string.

Taken together, the pattern describes a string that, from

beginning to end, consists of one or more characters, each of

which is a lowercase letter, an uppercase letter or a digit.

If the user enters a string that satisfies this regular expression

pattern, match() returns a non-null value and our function

returns true.
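The role of each piece becomes clearer when it is removed. The sketch below compares the full pattern with two variants, one without the anchors and one with * instead of +. The variable names and inputs are illustrative only.

```javascript
const anchored   = /^[a-zA-Z0-9]+$/;
const unanchored = /[a-zA-Z0-9]+/;    // no ^ and $
const starred    = /^[a-zA-Z0-9]*$/;  // * instead of +

const okInput         = anchored.test("abc123");     // true
const spaceAnchored   = anchored.test("abc 123");    // false: the space breaks the run
const spaceUnanchored = unanchored.test("abc 123");  // true: "abc" alone is enough
const emptyPlus       = anchored.test("");           // false: + needs at least one character
const emptyStar       = starred.test("");            // true: * accepts zero characters

console.log(okInput, spaceAnchored, spaceUnanchored, emptyPlus, emptyStar);
```

Without ^ and $ the pattern only asks whether some alphanumeric run exists somewhere in the string, which is not a validation of the whole input.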

Page 11: Regular expressions

Practical Examples

2. The following function matches a postal code which

contains only digits and may be in format xxxxx or

xxxxx-xxxx.

function bTestPostalCode(strToTest) {

if (strToTest == null || strToTest.length == 0) return false;

return (strToTest.match(/^\d{5}(-\d{4})?$/) != null);

}

Matches :

12345

12345-6789

Page 12: Regular expressions

Practical Examples

\d{5} specifies five digits

(-\d{4})? means "-" followed by four digits. Parentheses are

used for grouping. The "?" at the end means that this entire group

is optional.

Thus the entire regular expression specifies that, from the beginning

of the string to the end, there must be five digits, optionally followed

by a group consisting of a "-" character and another four digits.
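A quick way to see the optional group at work is to run the pattern over a few candidate strings. The inputs below are invented for this sketch.

```javascript
const postal = /^\d{5}(-\d{4})?$/;

const candidates = ["12345", "12345-6789", "1234", "12345-678", "12345 6789", "123456"];
const results = candidates.map((s) => postal.test(s));
console.log(results); // [ true, true, false, false, false, false ]

// The optional group is captured only when it is present.
const shortForm = "12345".match(postal);
const longForm = "12345-6789".match(postal);
console.log(shortForm[1]); // undefined
console.log(longForm[1]);  // -6789
```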

Page 13: Regular expressions

Groups

To match a pattern like 1234-567, we can write the regex as : /\d{4}-\d{3}/

In order to extract the individual portions, we can group them as follows :

/(\d{4})-(\d{3})/

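Now the first four digits are available as group 1 and the last three digits as group 2. Inside the pattern itself they are referred to as \1 and \2; in the array returned by match() they are at indexes 1 and 2; and in a replace() replacement string they are written $1 and $2.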
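A minimal sketch of reading the two groups from JavaScript, using the sample value from the slide. The variable names are illustrative.

```javascript
const pattern = /(\d{4})-(\d{3})/;

// match() without the g flag returns the full match followed by the groups.
const parts = "1234-567".match(pattern);
console.log(parts[0]); // 1234-567
console.log(parts[1]); // 1234
console.log(parts[2]); // 567

// In a replacement string the groups are written $1 and $2.
const swapped = "1234-567".replace(pattern, "$2-$1");
console.log(swapped);  // 567-1234
```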

Page 14: Regular expressions

Groups

Example : let's say we want to match a quoted string such as "howdy123".

RegEx : /["'][^"']*["']/

But this regex does not require the opening quote to be the same as the closing quote. It will also match patterns like "howdy123', which is not permitted.

To prevent this we can write : /(["'])[^"']*\1/

Now it will match only if the opening and closing quotes are the same.
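The difference between the two quote patterns can be checked directly. In this sketch the names loose and strict are made up, and the quotes are plain ASCII quote characters.

```javascript
const loose  = /["'][^"']*["']/;   // any quote, then any quote
const strict = /(["'])[^"']*\1/;   // \1 must repeat the quote captured by group 1

const mismatched = "\"howdy123'";  // opens with " but closes with '

const looseMismatch  = loose.test(mismatched);      // true
const strictMismatch = strict.test(mismatched);     // false
const strictDouble   = strict.test('"howdy123"');   // true
const strictSingle   = strict.test("'howdy123'");   // true

console.log(looseMismatch, strictMismatch, strictDouble, strictSingle);
```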

Page 15: Regular expressions

search()

Finds the position of the first occurrence of the pattern in the string.
Does not support global search (the g modifier is ignored).
Returns the character position of the matched pattern, or -1 if no match is found.

Example : "abc 123 def 345 ghi".search(/\d{3}/)
Output : 4
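The behaviour described above can be confirmed with the slide's sample string; only the variable names are added here.

```javascript
const text = "abc 123 def 345 ghi";

const firstPos  = text.search(/\d{3}/);   // 4: index where "123" starts
const withGFlag = text.search(/\d{3}/g);  // 4: the g modifier makes no difference
const notFound  = text.search(/xyz/);     // -1: no match

console.log(firstPos, withGFlag, notFound);
```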

Page 16: Regular expressions

match()

String.match(RegExp) can perform global search and returns an array of results, or null if nothing matches.

For global search, the returned array contains all the matching parts of the source string.

For non-global search, the returned array contains the full match along with any parenthesized sub-patterns.

"abc 123 def 345 ghi".match(/\d{3}/g)

It will return the array ["123", "345"].
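The sketch below contrasts the global and non-global forms on the same sample string. The grouped pattern in the second call is an assumption added to show sub-patterns in the result.

```javascript
const text = "abc 123 def 345 ghi";

// Global: every matching part, no group details.
const all = text.match(/\d{3}/g);
console.log(all);          // [ '123', '345' ]

// Non-global: the first match plus its parenthesized sub-patterns.
const first = text.match(/(\d)(\d{2})/);
console.log(first[0]);     // 123
console.log(first[1]);     // 1
console.log(first[2]);     // 23
console.log(first.index);  // 4

// No match at all gives null, not an empty array.
const none = text.match(/xyz/);
console.log(none);         // null
```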

Page 17: Regular expressions

replace()

String.replace(RegExp,replacement)

RegExp is the expression which defines the pattern to be searched for.

Replacement is the text to replace the match found, or a function that generates the replacement text.

With the global modifier, every match is replaced; without it, only the first match is replaced.

When the replacement is a function, it is called for every match that gets replaced. The matched value is passed to the function as the first argument. If there are groups inside the pattern, they are passed as the following arguments, in order. Each matched value is replaced with the return value of its function call.
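A compact sketch of the three cases described above: a replacement string, a replacement function, and a call without the global modifier. The source text, the masking with asterisks and the parameter names area and number are invented for illustration.

```javascript
const src = "call 011-33233334 or 032-83993333";
const phone = /(\d{3})-(\d{8})/g;

// Replacement string: $1 stands for the first group.
const masked = src.replace(phone, "$1-********");
console.log(masked);     // call 011-******** or 032-********

// Replacement function: receives the match, then each group.
const viaFn = src.replace(phone, (found, area, number) => `(${area}) ${number}`);
console.log(viaFn);      // call (011) 33233334 or (032) 83993333

// Without g, only the first match is replaced.
const firstOnly = src.replace(/(\d{3})-(\d{8})/, "X");
console.log(firstOnly);  // call X or 032-83993333
```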

Page 18: Regular expressions

replace()

function fnReplaceWithFunction() {
    var srcText = "Number one is 011-33233334, number two is 032-83993333 and finally number three is 033-37443343. Site is http://www.abc.com/index.html";
    // For each match, keep only the second group (the eight-digit part).
    var result = srcText.replace(/(\d{3})-(\d{8})/g, function(found, a, b) { return b; });
    alert(result);
}

Here for every string matching the pattern, function receives

three arguments. For example for first matched string

011-33233334, found= 011-33233334, a=011 and b=33233334.

Page 19: Regular expressions

replace()

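The function returns b, so the matched text 011-33233334

is replaced with 33233334. The same happens for all the other matched

values. The final output is :

Number one is 33233334, number two is 83993333 and finally number three is 37443343. Site is http://www.abc.com/index.html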

Page 20: Regular expressions

Thank You