NLTK & Python Day 8 LING 681.02 Computational Linguistics Harry Howard Tulane University

NLTK & Python
Day 8

LING 681.02
Computational Linguistics

Harry Howard
Tulane University

11-Sept-2009 LING 681.02, Prof. Howard, Tulane University

Course organization

NLTK should be installed on the computers in this room!

NLPP §2 Accessing text corpora and lexical resources

§2.2 Conditional frequency distributions
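
A conditional frequency distribution counts events separately for each condition, for example words counted per genre. The sketch below is a plain-Python illustration of that idea, not NLTK's own implementation; the helper name conditional_freq and the (genre, word) pairs are made up for the example. NLTK offers the same kind of table through nltk.ConditionalFreqDist.

```python
from collections import Counter, defaultdict

def conditional_freq(pairs):
    """Count how often each event occurs under each condition."""
    cfd = defaultdict(Counter)
    for condition, event in pairs:
        cfd[condition][event] += 1
    return cfd

# Hypothetical toy data: (condition, event) pairs, here (genre, word).
pairs = [
    ("news", "the"), ("news", "market"), ("news", "the"),
    ("romance", "the"), ("romance", "love"), ("romance", "love"),
]

cfd = conditional_freq(pairs)
print(cfd["news"]["the"])      # 2
print(cfd["romance"]["love"])  # 2
```

Each condition gets its own counter, so the same word can have a different count in each genre.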

Practice

Do "Your Turn" up to p. 55
Exercises 2.8.2-4, 2.8.8

NLPP §2 Accessing text corpora and lexical resources

§2.3 More Python: Reusing code

Creating a program with a text editor

Create the monty.py program.

Other IDEs

Eclipse (Java Dev) + Pydev plugin
  http://www.eclipse.org/downloads/
  Mac users should use the Cocoa version
  http://pydev.org/index.html

Xcode Tools now supports Python
  It is part of the optional installation on the DVD.
  You have to register as a developer to download it from http://developer.apple.com/

Functions

What might you want to put in your program? Why, a function, of course!

A function takes an input and produces an output, or return value:

>>> def my_function_name(my_inputs):
...     # calculate my_output
...     return my_output
...
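
To make the template concrete, here is one possible function written for Python 3. The name lexical_diversity and the sample token list are illustrative choices, not something prescribed by the slide.

```python
def lexical_diversity(tokens):
    """Return the ratio of distinct tokens to total tokens."""
    if not tokens:
        # Avoid dividing by zero on an empty text.
        return 0.0
    return len(set(tokens)) / len(tokens)

tokens = ["the", "cat", "saw", "the", "dog"]
print(lexical_diversity(tokens))  # 0.8
```

The input is the list of tokens, and the return value is a single number that the caller can print, store, or compare.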

Modules and higher

As you accumulate functions, you will want to store them somewhere.
Save them together in one text file with the .py suffix, e.g. my_mod.py. Such a file is called a module, and you import functions from it as needed:

from my_mod import my_function_name

Hierarchy
function < module < package < library
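
The following sketch shows the whole round trip in one runnable script: it writes a small my_mod.py into a temporary directory, puts that directory on the import path, and imports a function from it. In practice you would simply save my_mod.py next to your program with a text editor. The function plural and its rules are hypothetical and deliberately rough.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source text for a hypothetical module file, my_mod.py.
MODULE_SOURCE = '''
def plural(word):
    """Very rough English pluralizer, for illustration only."""
    if word.endswith("y"):
        return word[:-1] + "ies"
    if word.endswith(("s", "x", "sh", "ch")):
        return word + "es"
    return word + "s"
'''

# Save the module to disk and make its directory importable.
module_dir = tempfile.mkdtemp()
Path(module_dir, "my_mod.py").write_text(MODULE_SOURCE)
sys.path.insert(0, module_dir)
importlib.invalidate_caches()

# Import only the function that is needed.
from my_mod import plural

print(plural("fairy"))  # fairies
print(plural("wish"))   # wishes
```

Python can only import a module that lives in a directory on sys.path, which is why files saved elsewhere are not found automatically.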

NLPP §2 Accessing text corpora and lexical resources

§2.4 Lexical resources

Lexical resources

What is a lexicon?
  A collection of words and/or phrases, sometimes with additional information such as part of speech or meaning

What is a lexical entry?
  A headword/lemma, along with that other info
  saw1 [verb] past tense of see
  saw2 [noun] cutting instrument
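
One way to picture a lexicon in code is as a list of entries, each pairing a headword with its extra information. This is a minimal sketch with assumed names (LexicalEntry, lookup); it is not how NLTK stores its lexical resources.

```python
from typing import NamedTuple

class LexicalEntry(NamedTuple):
    """One lexical entry: a headword plus additional information."""
    headword: str
    sense: int
    pos: str
    gloss: str

# The two entries for "saw" from the slide.
lexicon = [
    LexicalEntry("saw", 1, "verb", "past tense of see"),
    LexicalEntry("saw", 2, "noun", "cutting instrument"),
]

def lookup(lexicon, headword):
    """Return every entry whose headword matches."""
    return [entry for entry in lexicon if entry.headword == headword]

for entry in lookup(lexicon, "saw"):
    print(f"{entry.headword}{entry.sense} [{entry.pos}] {entry.gloss}")
```

A lookup for "saw" returns two entries because the same spelling covers two distinct headwords.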

More corpora

Wordlist corpora
  words
  Names Corpus
  Do ex. 2.8.8

CMU Pronouncing Dictionary
  Do ex. 2.8.12

Comparative wordlists
  Swadesh wordlist
  Shoebox/Toolbox
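
A typical use of a wordlist corpus is to filter a text against it, for instance to find unusual or misspelled words. The sketch below uses a tiny made-up wordlist so that it runs without any corpus data; with NLTK installed, the list returned by nltk.corpus.words.words() could be passed in instead.

```python
def unusual_words(tokens, known_words):
    """Return the alphabetic tokens that are missing from the wordlist."""
    text_vocab = {t.lower() for t in tokens if t.isalpha()}
    known = {w.lower() for w in known_words}
    return sorted(text_vocab - known)

# Hypothetical toy wordlist and text.
known_words = ["the", "cat", "sat", "on", "mat"]
tokens = ["The", "cat", "sat", "on", "the", "zorblat", "mat", "."]

print(unusual_words(tokens, known_words))  # ['zorblat']
```

Lowercasing both sides keeps "The" from being reported as unusual, and the isalpha() check drops punctuation.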

NLPP §2 Accessing text corpora and lexical resources

§2.5 WordNet

Semantic relations

Synonym
  Synonyms are grouped into synsets in WordNet
  look at code
  Do "Your Turn"
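
To show what grouping synonyms into synsets means, here is a toy model in plain Python. It is not the NLTK interface (that is nltk.corpus.wordnet, which needs the WordNet data installed). The table TOY_SYNSETS is a hand-typed stand-in modeled on WordNet's naming style, and the helper names are assumptions.

```python
# Toy stand-in for WordNet: synset name -> lemmas (synonyms) in that synset.
TOY_SYNSETS = {
    "car.n.01": ["car", "auto", "automobile", "machine", "motorcar"],
    "car.n.02": ["car", "railcar", "railway_car", "railroad_car"],
}

def synsets_for(word):
    """Return the names of all synsets that contain the word."""
    return sorted(name for name, lemmas in TOY_SYNSETS.items() if word in lemmas)

def synonyms(word):
    """Return every other lemma that shares a synset with the word."""
    found = set()
    for name in synsets_for(word):
        found.update(TOY_SYNSETS[name])
    found.discard(word)
    return sorted(found)

print(synsets_for("motorcar"))  # ['car.n.01']
print(synsets_for("car"))       # ['car.n.01', 'car.n.02']
print(synonyms("motorcar"))     # ['auto', 'automobile', 'car', 'machine']
```

An ambiguous word such as "car" belongs to more than one synset, while "motorcar" belongs to only one.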

Next time

Q/P2

Do two of Ex. 2.8.16-19

Start NLPP §3