Parsing for Fun and Profit (but mainly fun)
Ash Moran, PatchSpace Ltd
Saturday, 23 February 13

Post on 17-Oct-2014

DESCRIPTION

Slides from my talk Parsing for Fun and Profit. Code is available here: https://github.com/patchspace/parsing_for_fun_and_profit

TRANSCRIPT

Page 1: Parsing for Fun and Profit

Parsing for Fun and Profit (but mainly fun)

Ash Moran

PatchSpace Ltd

Saturday, 23 February 13

Page 2: Parsing for Fun and Profit

What?

Page 3: Parsing for Fun and Profit

Parsing

Adding structure and meaning to text

Page 4: Parsing for Fun and Profit

Parsing Human Languages

Jake stretched his legs
“Jake”, “stretched”, “his”, “legs”
“Jake”<noun>, “stretched”<verb, past>, “his”<possessive pronoun>, “legs”<noun>
“Jake”<noun, subject>, “stretched”, (“his”, “legs”)<noun phrase, object>

Page 5: Parsing for Fun and Profit

Parsing Computer Languages

“foo = bar + 123”
“foo”, “=”, “bar”, “+”, “123”
“foo”<var>, “=”<assignment_op>, “bar”<var>, “+”<op_plus>, “123”<int_literal>

Page 6: Parsing for Fun and Profit

Why?

Page 7: Parsing for Fun and Profit

Not just compiling!

Compilers breathe fire.

Page 8: Parsing for Fun and Profit

Pretty Printing

Page 9: Parsing for Fun and Profit

Pretty Printing

gofmt

http://gofmt.com/

Page 10: Parsing for Fun and Profit

Code Smell Detectors

https://rubygems.org/gems/reek

Page 11: Parsing for Fun and Profit

Code Smell Detectors

Page 12: Parsing for Fun and Profit

Other ideas

Code metrics
Bug detectors
Domain-specific languages
Language translators (e.g. Ruby -> PHP)
Code obfuscators
Alternative syntaxes (e.g. CoffeeScript)
Refactoring tools

Page 13: Parsing for Fun and Profit

How?

Page 14: Parsing for Fun and Profit

Step 1

3 year computer science degree

Page 15: Parsing for Fun and Profit

Lexing/Tokenising

if x > 100 then return “big” else return “small”
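
To make the lexing step concrete, here is a minimal sketch of a tokeniser for the statement on this slide. It is not the code from the talk: the token type names, the keyword list and the `tokenise` method are all assumptions made for illustration, and it uses straight quotes where the slide shows curly ones.

```ruby
require 'strscan'

# Hypothetical keyword list for the toy language on the slide.
KEYWORDS = %w[if then else return].freeze

# Split source text into [type, text] pairs, skipping whitespace.
# The type names are invented for this sketch.
def tokenise(source)
  scanner = StringScanner.new(source)
  tokens = []
  until scanner.eos?
    if scanner.skip(/\s+/)
      next
    elsif (text = scanner.scan(/\d+/))
      tokens << [:int_literal, text]
    elsif (text = scanner.scan(/"[^"]*"/))
      tokens << [:string_literal, text]
    elsif (text = scanner.scan(/[A-Za-z_]\w*/))
      tokens << [KEYWORDS.include?(text) ? :keyword : :var, text]
    elsif (text = scanner.scan(/[><=+\-]/))
      tokens << [:op, text]
    else
      raise ArgumentError, "unexpected character at offset #{scanner.pos}"
    end
  end
  tokens
end

tokens = tokenise('if x > 100 then return "big" else return "small"')
tokens.each { |type, text| puts "#{text.ljust(8)} <#{type}>" }
```

The lexer only chops the text up and labels the pieces. It has no idea that `then` belongs to `if`; that is the job of the next step.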

Page 16: Parsing for Fun and Profit

Tree Building

if x > 100 then return “big” else return a + b

if
  >
    x
    100
  then
    return
      “big”
  else
    return
      +
        a
        b
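
A rough sketch of how a tree like this might be built by hand, assuming the tokens are simply separated by whitespace. The `Node` struct, the three `parse_*` methods and the tiny grammar in the comments are made up for this example and do no error recovery.

```ruby
# A tree node: a label plus zero or more child nodes.
Node = Struct.new(:label, :children) do
  # Render the node and its descendants as an indented outline (array of lines).
  def outline(depth = 0)
    lines = ["#{'  ' * depth}#{label}"]
    children.each { |child| lines.concat(child.outline(depth + 1)) }
    lines
  end
end

def leaf(text)
  Node.new(text, [])
end

# expression := operand (operator operand)?
def parse_expression(tokens)
  left = leaf(tokens.shift)
  return left unless %w[> < +].include?(tokens.first)
  operator = tokens.shift
  Node.new(operator, [left, leaf(tokens.shift)])
end

# statement := "return" expression
def parse_return(tokens)
  raise ArgumentError, 'expected return' unless tokens.shift == 'return'
  Node.new('return', [parse_expression(tokens)])
end

# conditional := "if" expression "then" statement "else" statement
def parse_if(tokens)
  raise ArgumentError, 'expected if' unless tokens.shift == 'if'
  condition = parse_expression(tokens)
  raise ArgumentError, 'expected then' unless tokens.shift == 'then'
  consequent = parse_return(tokens)
  raise ArgumentError, 'expected else' unless tokens.shift == 'else'
  alternative = parse_return(tokens)
  Node.new('if', [condition, Node.new('then', [consequent]), Node.new('else', [alternative])])
end

tokens = 'if x > 100 then return "big" else return a + b'.split
tree = parse_if(tokens)
puts tree.outline
```

Each method consumes the tokens it understands and hands back a node, so the shape of the code mirrors the shape of the grammar.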

Page 17: Parsing for Fun and Profit

Parsing Expression Grammars

Like regular expressions, but can handle recursion, e.g. HTML

Not actually that much harder to use
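
To show what "can handle recursion" buys you, here is a hand-written sketch of a PEG-style rule that calls itself. It is plain Ruby rather than Treetop, and the grammar in the comment is an assumption: a tiny HTML-like subset with lowercase tag names, no attributes and no self-closing tags.

```ruby
require 'strscan'

# Hypothetical grammar, written PEG-style:
#   element <- '<' name '>' (element / text)* '</' name '>'
#   text    <- [^<]+
# Returns [name, children] on a match, or nil (with the scan position
# restored) when the rule fails.
def parse_element(scanner)
  start = scanner.pos
  return nil unless scanner.scan(/<([a-z]+)>/)
  name = scanner[1]
  children = []
  loop do
    if (child = parse_element(scanner)) # the rule refers to itself here
      children << child
    elsif (text = scanner.scan(/[^<]+/))
      children << text
    else
      break
    end
  end
  if scanner.scan(/<\/#{name}>/)
    [name, children]
  else
    scanner.pos = start # backtrack, as a PEG does when a sequence fails
    nil
  end
end

# Succeeds only if one element covers the whole input.
def parse_document(source)
  scanner = StringScanner.new(source)
  tree = parse_element(scanner)
  tree if scanner.eos?
end

p parse_document('<div><p>hi</p><p><b>deep</b></p></div>')
p parse_document('<div><p>hi</div>')
```

The closing tag has to match whichever opening tag is currently being parsed, at any depth. That matching is what a classical regular expression cannot do.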

Page 18: Parsing for Fun and Profit

Regexes and HTML

Page 19: Parsing for Fun and Profit

Treetop PEG grammar

Page 20: Parsing for Fun and Profit

Doing Sums
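
The live demo used Treetop. As a stand-in that needs no gems, here is a sketch of the same idea as a hand-written recursive descent parser that evaluates sums as it goes. The `SumParser` class and the grammar in the comment are assumptions for illustration, not the code from the repository.

```ruby
require 'strscan'

# Hypothetical grammar, PEG-style:
#   sum     <- product (('+' / '-') product)*
#   product <- primary (('*' / '/') primary)*
#   primary <- integer / '(' sum ')'
class SumParser
  def initialize(source)
    @scanner = StringScanner.new(source)
  end

  # Evaluate the whole input, or raise if anything is left over.
  def parse
    value = sum
    raise ArgumentError, "unexpected input at offset #{@scanner.pos}" unless @scanner.eos?
    value
  end

  private

  # Skip leading whitespace, then try to match the pattern at the current position.
  def token(pattern)
    @scanner.skip(/\s*/)
    @scanner.scan(pattern)
  end

  def sum
    value = product
    while (operator = token(/[+-]/))
      right = product
      value = operator == '+' ? value + right : value - right
    end
    value
  end

  def product
    value = primary
    while (operator = token(/[*\/]/))
      right = primary
      value = operator == '*' ? value * right : value / right # integer division
    end
    value
  end

  def primary
    if (digits = token(/\d+/))
      digits.to_i
    elsif token(/\(/)
      value = sum
      token(/\)/) or raise ArgumentError, 'expected )'
      value
    else
      raise ArgumentError, "expected a number at offset #{@scanner.pos}"
    end
  end
end

puts SumParser.new('1 + 2 * 3').parse
puts SumParser.new('(1 + 2) * 3').parse
```

Putting `product` below `sum` in the grammar is what gives multiplication its higher precedence; no special-case code is needed.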

Page 21: Parsing for Fun and Profit

Switch to Sublime Text, idiot

Code is now available: https://github.com/patchspace/parsing_for_fun_and_profit/

Page 22: Parsing for Fun and Profit

A Ruby Syntax Highlighter

Page 23: Parsing for Fun and Profit

What

A tool to read in simple Ruby source and output syntax highlighted HTML

Page 24: Parsing for Fun and Profit

Why

Because I thought it would be fun
It was
Because I thought it would be easy…

Page 25: Parsing for Fun and Profit

Why

Page 26: Parsing for Fun and Profit

How

Build a parse tree of the Ruby source
Walk the tree and spit out a <span> element for each bit of text
Oh yes, make sure each line goes in <div> and <pre> tags
Wrap it in <html>
And for bonus points, do some fancy method highlighting
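
The tree-walking part of that plan can be sketched as below. Everything here is hypothetical: the `Branch` and `Leaf` structs, the node type names used as CSS classes, and the assumption that the root has one branch per source line. The tree is built by hand, so the parsing step is left out, and so is the fancy method highlighting.

```ruby
# Hypothetical parse tree: a branch has child nodes, a leaf has text.
Branch = Struct.new(:type, :children)
Leaf   = Struct.new(:type, :text)

ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

def escape_html(text)
  text.gsub(/[&<>"]/, ESCAPES)
end

# Walk the tree depth-first and emit one <span> per leaf.
def spans(node)
  case node
  when Leaf
    %(<span class="#{node.type}">#{escape_html(node.text)}</span>)
  when Branch
    node.children.map { |child| spans(child) }.join
  end
end

# Assumes the root's children are the source lines, one branch per line.
def highlight(root)
  lines = root.children.map { |line| "<div><pre>#{spans(line)}</pre></div>" }
  "<html><body>\n#{lines.join("\n")}\n</body></html>"
end

tree = Branch.new(:program, [
  Branch.new(:line, [
    Leaf.new(:identifier, 'foo'), Leaf.new(:whitespace, ' '),
    Leaf.new(:operator, '='), Leaf.new(:whitespace, ' '),
    Leaf.new(:identifier, 'bar'), Leaf.new(:whitespace, ' '),
    Leaf.new(:operator, '<'), Leaf.new(:whitespace, ' '),
    Leaf.new(:integer, '123')
  ])
])
html = highlight(tree)
puts html
```

Whitespace is kept as leaves too, so the output reproduces the source text exactly once the tags are stripped.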

Page 27: Parsing for Fun and Profit

Switch to Chrome, idiot

Page 28: Parsing for Fun and Profit

Switch to Sublime Text again, idiot

Code is now available: https://github.com/patchspace/parsing_for_fun_and_profit/

Page 29: Parsing for Fun and Profit

We’re doing this the hard way

Ruby’s grammar is too complex and undefined to easily implement as a PEG

Tools for parsing Ruby already exist

Page 30: Parsing for Fun and Profit

Ripper (Ruby 1.9.3)
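
A quick way to see what Ripper gives you, using the assignment from the earlier slide. `Ripper.lex` and `Ripper.sexp` are part of Ruby's standard library; the exact shape of the tree varies a little between Ruby versions, so treat the printed output as something to explore rather than rely on.

```ruby
require 'ripper'

source = 'foo = bar + 123'

# Ripper.lex returns one entry per token: [[line, column], type, text, ...].
tokens = Ripper.lex(source).map { |token| [token[1], token[2]] }
tokens.each { |type, text| puts "#{type.to_s.ljust(10)} #{text.inspect}" }

# Ripper.sexp returns the parse tree as nested arrays.
tree = Ripper.sexp(source)
p tree
```

The token types (`:on_ident`, `:on_op`, `:on_int` and so on) are ready-made labels for a highlighter, with no grammar to write.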

Page 31: Parsing for Fun and Profit

Learn more!

Skip theoretical physics, start by playing with Lego

Page 32: Parsing for Fun and Profit

Do more

Ideas you might like to try:

CSV parser
JSON parser (return arrays & hashes)
XML parser
JSON highlighter
A simple JavaScript minifier (just kill whitespace)

Page 33: Parsing for Fun and Profit

Thank you

Ash Moran

PatchSpace Ltd