wildcard expansion

Mayank Gupta and Rajpal Singh Wildcard Match 0in FE Noida March, 2012

Upload: mayanknsit

Post on 05-Dec-2014

Page 1: Wildcard expansion

Mayank Gupta and Rajpal Singh

Wildcard Match

0in FE Noida

March, 2012

Page 2: Wildcard expansion

© 2011 Mentor Graphics Corp. Company Confidential www.mentor.com

Agenda

- Introduction
- Motivation
- New Flow
- Class Hierarchy

Mayank, Wildcard Match, March 2012

Page 3: Wildcard expansion

Motivation

- Efficiently match a regular expression against an RTL design.

- Use NELT to do the matching.
  - The previous flow creates a separate data structure altogether for matching.
  - Using the NELT hierarchy would reduce memory usage.

- Enhance functionality.

Page 4: Wildcard expansion

New Flow

1. Tokenize
   - Tokenize the pattern.
   - Store it in an appropriate data structure.
2. Match on NELT
   - Start matching on NELT.
3. Match on UTG
   - Do matching on UTG.
   - For records/arrays.

Page 5: Wildcard expansion

New Flow

Step 1: Tokenize the wildcard

E.g., the wildcard “a*.b*.*.*c*” is split into four tokens:

a* | b* | * | *c*
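The tokenizing step can be sketched as a split on the hierarchy separator. This is only an illustration, not the code in the source tree: the function name `tokenizeWildcard` is hypothetical, and it assumes that '.' is the only separator and is never escaped.

```cpp
#include <string>
#include <vector>

// Split a hierarchical wildcard on '.' so that each token corresponds to one
// level of the design hierarchy.
// Assumption: '.' is the only separator and cannot be escaped.
std::vector<std::string> tokenizeWildcard(const std::string& pattern) {
    std::vector<std::string> tokens;
    std::string::size_type start = 0;
    while (true) {
        const std::string::size_type dot = pattern.find('.', start);
        if (dot == std::string::npos) {
            // Last token: everything after the final separator.
            tokens.push_back(pattern.substr(start));
            break;
        }
        tokens.push_back(pattern.substr(start, dot - start));
        start = dot + 1;
    }
    return tokens;
}
```

Calling `tokenizeWildcard("a*.b*.*.*c*")` yields the four tokens shown on the slide.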

Page 6: Wildcard expansion

New Flow

Step 2: Start matching nodes in NELT

- Match the current token with the children of top.

[Figure: NELT tree rooted at top, containing the nodes a1, a, aa, b2, b, c1, c and b1; alongside it, the token list a*, b*, *, *c*.]

Page 7: Wildcard expansion

New Flow

- Match “b*” with the children of a1.

[Figure: the same NELT tree and token list.]

Page 8: Wildcard expansion

New Flow

- Match “*” with the children of b2.

[Figure: the same NELT tree and token list.]

Page 9: Wildcard expansion

New Flow

- Match “*c*” and “*” with the children of c1.

[Figure: the same NELT tree and token list, with the final match marked.]
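The walk over the tree in Step 2 can be sketched as a recursive descent that consumes one token per hierarchy level. Everything here is an assumption made for illustration: `Node` is a hypothetical stand-in for a NELT node, the tree built by `exampleTree` is one plausible reading of the slide figure, and the sketch deliberately ignores the hierarchical behaviour of ‘*’ (it matches exactly one level per token).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a NELT node: a name plus its child nodes.
struct Node {
    std::string name;
    std::vector<Node> children;
};

// Simple glob: '*' matches any run of characters (possibly empty);
// every other character must match exactly (case-sensitive).
bool globMatch(const std::string& p, const std::string& s,
               std::size_t pi = 0, std::size_t si = 0) {
    if (pi == p.size()) return si == s.size();
    if (p[pi] == '*') {
        for (std::size_t k = si; k <= s.size(); ++k)
            if (globMatch(p, s, pi + 1, k)) return true;
        return false;
    }
    return si < s.size() && p[pi] == s[si] && globMatch(p, s, pi + 1, si + 1);
}

// Match tokens[depth] against the children of `parent`. On a hit, either
// record the path (last token) or descend with the next token.
void matchLevel(const Node& parent, const std::vector<std::string>& tokens,
                std::size_t depth, const std::string& path,
                std::vector<std::string>& results) {
    if (depth == tokens.size()) return;
    for (const Node& child : parent.children) {
        if (!globMatch(tokens[depth], child.name)) continue;
        const std::string childPath =
            path.empty() ? child.name : path + "." + child.name;
        if (depth + 1 == tokens.size())
            results.push_back(childPath);
        else
            matchLevel(child, tokens, depth + 1, childPath, results);
    }
}

std::vector<std::string> matchWildcard(const Node& top,
                                       const std::vector<std::string>& tokens) {
    std::vector<std::string> results;
    matchLevel(top, tokens, 0, "", results);
    return results;
}

// One plausible reading of the tree in the figure (an assumption).
Node exampleTree() {
    Node c1{"c1", {Node{"c", {}}}};
    Node b2{"b2", {Node{"b", {}}, c1}};
    Node a1{"a1", {Node{"a", {}}, Node{"aa", {}}, b2}};
    Node b1{"b1", {Node{"b", {}}, c1}};
    return Node{"top", {a1, b1}};
}
```

With this tree, `matchWildcard(exampleTree(), {"a*", "b*", "*", "*c*"})` returns the single path `a1.b2.c1.c`: “a*” selects a1, “b*” selects b2, “*” selects both b and c1, and only c1 has a child matching “*c*”.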

Page 10: Wildcard expansion

New Flow

Step 3: Match on the UTG hierarchy

- If we hit a record, array, or subtype, we match using the UTG hierarchy.

Page 11: Wildcard expansion

Why match using UTG?

Because we do not create NELT for record symbols, we use UTG for matching inside records.

[Figure: NELT tree with nodes top, a1, a, b2, b, b1 and b; the record portion (Record1, Record2 and the fields f1, f2, f21, f22) is labeled “No NELT for this portion”.]

Page 12: Wildcard expansion

Tokenizing a wildcard

Class Hierarchy

[Figure: class hierarchy with TokenBase as the base class and StringToken and StarToken derived from it.]

A token can be of two types:
- String Token
- Star Token

A star token is simply a ‘*’. A string token is anything other than ‘*’.

E.g., for “a*.b*.*.*c*”:
- The string tokens are a*, b* and *c*.
- There is only one star token here: *.
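A minimal sketch of the two token types follows. The class names come from the slide; the `matches` method, the `makeToken` factory, and the glob semantics (‘*’ inside a string token matches any run of characters, case-sensitively) are assumptions added for illustration and may differ from the real classes.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Base class for all tokens; `matches` is a hypothetical interface.
class TokenBase {
public:
    virtual ~TokenBase() = default;
    virtual bool matches(const std::string& name) const = 0;
};

// A bare '*': matches any node name.
class StarToken : public TokenBase {
public:
    bool matches(const std::string&) const override { return true; }
};

// Anything other than a bare '*', e.g. "a*" or "*c*".
class StringToken : public TokenBase {
public:
    explicit StringToken(std::string text) : text_(std::move(text)) {}

    bool matches(const std::string& name) const override {
        return glob(0, name, 0);
    }

private:
    // Recursive glob: '*' absorbs zero or more characters.
    bool glob(std::size_t pi, const std::string& s, std::size_t si) const {
        if (pi == text_.size()) return si == s.size();
        if (text_[pi] == '*') {
            for (std::size_t k = si; k <= s.size(); ++k)
                if (glob(pi + 1, s, k)) return true;
            return false;
        }
        return si < s.size() && text_[pi] == s[si] && glob(pi + 1, s, si + 1);
    }

    std::string text_;
};

// Hypothetical factory: choose the token type from the token text.
std::unique_ptr<TokenBase> makeToken(const std::string& text) {
    if (text == "*") return std::make_unique<StarToken>();
    return std::make_unique<StringToken>(text);
}
```

For the pattern on the slide, `makeToken` would produce a StringToken for a*, b* and *c*, and a StarToken for the lone *.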

Page 13: Wildcard expansion

How to do Hierarchical Match?

[Figure: Star Token, of two kinds: Hierarchical Match Star and Local Match Star. Caption: Two types of ‘*’ in regex.]

How do we ensure that ‘*’ matches across the hierarchy?

There are two types of ‘*’:
- Local Match Star
- Hierarchical Match Star

A Local Match Star matches only the nodes at the current level.

A Hierarchical Match Star matches all the nodes at the current level and at lower levels.
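The difference between the two stars can be illustrated on a small tree. This is a sketch under assumptions: `Node` is a hypothetical stand-in for a NELT node, and the function names `localStar` and `hierarchicalStar` are invented here.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a NELT node.
struct Node {
    std::string name;
    std::vector<Node> children;
};

// Local Match Star: only the nodes directly below `parent`.
std::vector<std::string> localStar(const Node& parent) {
    std::vector<std::string> out;
    for (const Node& child : parent.children) out.push_back(child.name);
    return out;
}

// Helper: collect every descendant of `parent` in pre-order, as dotted paths.
void collectDescendants(const Node& parent, const std::string& prefix,
                        std::vector<std::string>& out) {
    for (const Node& child : parent.children) {
        const std::string path =
            prefix.empty() ? child.name : prefix + "." + child.name;
        out.push_back(path);
        collectDescendants(child, path, out);
    }
}

// Hierarchical Match Star: nodes at the current level and all lower levels.
std::vector<std::string> hierarchicalStar(const Node& parent) {
    std::vector<std::string> out;
    collectDescendants(parent, "", out);
    return out;
}
```

For a node b2 with children b and c1, where c1 has a child c, `localStar` returns b and c1, while `hierarchicalStar` returns b, c1 and c1.c.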

Page 14: Wildcard expansion

How to do Hierarchical Match?

Example: if the pattern is “a*.b*.*.*c*”, it is converted to the token sequence

a*, H*, b*, H*, L*, H*, *c*, H*

where H* is a Hierarchical Match Star and L* is a Local Match Star.
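The conversion in this example can be sketched as follows. The rule used here (replace a bare ‘*’ with L* and append an H* after every token) is inferred from this one example only; the real expansion may apply further rules, and the names `TokenKind`, `ExpandedToken` and `expandTokens` are hypothetical.

```cpp
#include <string>
#include <vector>

enum class TokenKind { String, LocalStar, HierarchicalStar };

struct ExpandedToken {
    TokenKind kind;
    std::string text;  // pattern text for String tokens; "L*" or "H*" otherwise
};

// Expand the per-level tokens: a bare "*" becomes a Local Match Star, and a
// Hierarchical Match Star is appended after every token (rule inferred from
// the slide's example, not from the source code).
std::vector<ExpandedToken> expandTokens(const std::vector<std::string>& tokens) {
    std::vector<ExpandedToken> out;
    for (const std::string& tok : tokens) {
        if (tok == "*")
            out.push_back({TokenKind::LocalStar, "L*"});
        else
            out.push_back({TokenKind::String, tok});
        out.push_back({TokenKind::HierarchicalStar, "H*"});
    }
    return out;
}
```

Applied to the tokens a*, b*, *, *c*, this produces the eight-element sequence listed above.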

Page 15: Wildcard expansion

Organizing Tokens

- Class NeltTokenArray contains
  - vector<TokenBase*>
- Class NeltTokenIndex contains
  - NeltTokenArray*
  - Index (current token)
- Class NeltRegexExpr contains
  - vector<NeltTokenIndex*>

[Figure: several copies of the token array a*, b*, *, *c*, each with an Index marker pointing at its current token.]
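The containment described above can be sketched like this. Only the three class names and their listed contents come from the slide; the member names, the `current` and `advanced` helpers, and the use of `std::unique_ptr` and value-type cursors instead of raw pointers are assumptions made to keep the example self-contained.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Placeholder token type; the real TokenBase has StringToken/StarToken subclasses.
struct TokenBase {
    explicit TokenBase(std::string t) : text(std::move(t)) {}
    virtual ~TokenBase() = default;
    std::string text;
};

// Owns the tokens of one tokenized pattern.
struct NeltTokenArray {
    std::vector<std::unique_ptr<TokenBase>> tokens;
};

// A cursor into a token array: which array, and which token is current.
struct NeltTokenIndex {
    const NeltTokenArray* array;  // not owned; may be shared by many cursors
    std::size_t index;            // position of the current token

    // Current token, or nullptr once the cursor has run past the end.
    const TokenBase* current() const {
        return index < array->tokens.size() ? array->tokens[index].get()
                                            : nullptr;
    }

    // A new cursor pointing at the next token; this one is left unchanged.
    NeltTokenIndex advanced() const { return NeltTokenIndex{array, index + 1}; }
};

// The set of cursors that are live at some point of the match.
struct NeltRegexExpr {
    std::vector<NeltTokenIndex> cursors;
};
```

Keeping the index separate from the array lets several partial matches share one tokenized pattern while each tracks its own position, which is one way to read the figure.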

Page 16: Wildcard expansion

Manager Classes

- Class NeltRegexMgr is used to match on NELT.
- Class NeltUtgRegexMgr is used to match on UTG.
- It is the responsibility of NeltRegexMgr to invoke NeltUtgRegexMgr.
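The hand-off between the two managers might look like the sketch below. Only the two class names and the direction of the call come from the slide. `DesignNode`, `TypeNode`, the `match` signatures, and the simplified token rule (a token is either “*” or an exact name) are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a UTG type node (a record and its fields).
struct TypeNode {
    std::string name;
    std::vector<TypeNode> fields;
};

// Hypothetical stand-in for a NELT node. A non-null `type` marks a
// record-typed symbol whose fields exist only in the type graph.
struct DesignNode {
    std::string name;
    std::vector<DesignNode> children;
    const TypeNode* type;
};

// Simplified token rule for this sketch: "*" or an exact name.
bool tokenMatches(const std::string& token, const std::string& name) {
    return token == "*" || token == name;
}

// Matches the remaining tokens against record fields.
class NeltUtgRegexMgr {
public:
    void match(const TypeNode& type, const std::vector<std::string>& tokens,
               std::size_t depth, const std::string& path,
               std::vector<std::string>& results) const {
        for (const TypeNode& field : type.fields) {
            if (!tokenMatches(tokens[depth], field.name)) continue;
            const std::string fieldPath = path + "." + field.name;
            if (depth + 1 == tokens.size()) results.push_back(fieldPath);
            else match(field, tokens, depth + 1, fieldPath, results);
        }
    }
};

// Matches on the design tree and delegates to the UTG manager at records.
class NeltRegexMgr {
public:
    std::vector<std::string> match(const DesignNode& top,
                                   const std::vector<std::string>& tokens) const {
        std::vector<std::string> results;
        if (!tokens.empty()) matchLevel(top, tokens, 0, "", results);
        return results;
    }

private:
    void matchLevel(const DesignNode& parent, const std::vector<std::string>& tokens,
                    std::size_t depth, const std::string& path,
                    std::vector<std::string>& results) const {
        for (const DesignNode& child : parent.children) {
            if (!tokenMatches(tokens[depth], child.name)) continue;
            const std::string childPath =
                path.empty() ? child.name : path + "." + child.name;
            if (depth + 1 == tokens.size()) {
                results.push_back(childPath);
            } else if (child.type != nullptr) {
                // Record symbol: no NELT below this point, so switch to UTG.
                utgMgr_.match(*child.type, tokens, depth + 1, childPath, results);
            } else {
                matchLevel(child, tokens, depth + 1, childPath, results);
            }
        }
    }

    NeltUtgRegexMgr utgMgr_;
};
```

In this sketch the caller only ever talks to NeltRegexMgr; the switch to the type graph happens internally when a record-typed node is reached with tokens still left to match.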

Page 17: Wildcard expansion

C++ classes

- NeltRegexMgr
  - NeltTraverse
- NeltUtgRegexMgr
  - NeltTypeTraverse
- NeltRegexExpr
- NeltTokenIndex
  - NeltUtgTokenIndex
- NeltTokenArray
- TokenBase
  - StringToken
  - StarToken

[Figure: class diagram relating NeltTokenIndex and NeltUtgTokenIndex; NeltRegexMgr and NeltTraverse; NeltUtgRegexMgr and NeltUtgTypeTraverse; TokenBase, StringToken and StarToken.]

Page 18: Wildcard expansion

Source Code

Source files
- src/commonpp/nelt
  - neltRegexMgr.cxx
  - neltRegexMgr.hxx
  - neltUtgRegexMgr.cxx
  - neltUtgRegexMgr.hxx
  - neltRegexUtils.cxx
  - neltRegexUtils.hxx

Page 19: Wildcard expansion

Performance data

S.No  Test Case  Old Flow Time (s)  New Flow Time (s)
1     Parme      161                484
2     Oracle     1814               1658

Page 20: Wildcard expansion

Future Work

- Add a new class SliceToken, deriving from TokenBase, to store tokens of the form tok[slice].

- Avoid duplicate matching.
  - E.g., “a*.a*b” is expanded into two patterns:
    1) a*.H*.a*b
    2) a*.H*.a*.H*.*b
  - Both patterns begin with “a*.H*”, so that prefix gets matched twice.
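One possible first step toward avoiding the duplicate work is to detect how many leading tokens two expanded patterns share, so that the shared prefix could be matched once and its result reused. This is only an illustration of the idea, not a design taken from the slides; `commonPrefixLength` is a hypothetical helper.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Number of leading tokens that two expanded patterns have in common.
std::size_t commonPrefixLength(const std::vector<std::string>& a,
                               const std::vector<std::string>& b) {
    std::size_t n = 0;
    while (n < a.size() && n < b.size() && a[n] == b[n]) ++n;
    return n;
}
```

For the two expansions above, written as token lists {a*, H*, a*b} and {a*, H*, a*, H*, *b}, the shared prefix has length 2, which corresponds to the “a*.H*” that is currently matched twice.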

Page 21: Wildcard expansion

www.mentor.com