computational linguistics yoad winter

24
Computational Linguistics Yoad Winter * General overview * Examples: Transducers; Stanford Parser; Google Translate; Word-Sense Disambiguation * Finite State Automata and Formal Grammars

Upload: adia

Post on 31-Jan-2016

39 views

Category:

Documents


0 download

DESCRIPTION

Computational Linguistics Yoad Winter. *General overview *Examples: Transducers; Stanford Parser; Google Translate; Word-Sense Disambiguation * Finite State Automata and Formal Grammars. Linguistics - from Theory to Technology. Language Technology. Industrie. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Computational Linguistics  Yoad Winter

Computational Linguistics Yoad Winter

* General overview

* Examples: Transducers; Stanford Parser; Google Translate; Word-Sense

Disambiguation

* Finite State Automata and Formal Grammars

Page 2: Computational Linguistics  Yoad Winter

Linguistics - from Theory to Technology

Computational Linguistics

Theoretical Linguistics

Natural Language Processing

Language Technology

INFO

TLW

Industrie

Page 3: Computational Linguistics  Yoad Winter

Goals of CL:* Foundations for Linguistics in Computer Science (e.g. Formal Language theory)* Computable linguistic theories (HPSG, LFG,

Categorial Grammar)* Implementation of demos for linguistic

theories* (Mathematical Linguistics)

Computational Linguistics

Page 4: Computational Linguistics  Yoad Winter

Goals of NLP – practical applications of CL:* Speech recognition/synthesis* Machine translation* Summarization* Question answering* Text categorization* Grammar checking

Statistical NLP:* Unsupervised* Supervised (corpus-based)

Natural Language Processing

Page 5: Computational Linguistics  Yoad Winter

Goals of LT:* Useful linguistic resources (lexicons, grammar

rules, semantics webs)* Implementation of most useful tools involving

language processing(Google translation, Word spell checker, MS Speech Recognizer etc.)

Language Technology

Computational Linguistics

Theoretical Linguistics

Natural Language Processing

Language Technology

Page 6: Computational Linguistics  Yoad Winter

Input: Output: J&M (2009)

Words in text Part of speech (Noun/Verb); Morphological Information

Speech Sound TextWave

text Speech Sound Wave

Sentence in text Phrases in Sentences (noun phrase, verb phrase)

Sentence/text Action/Reasoning

Sentence/text Translation

Language Processing - Tasks

I: Words

II: Speech

III: Syntax

IV: Semantics & Pragmatics

V: Applications

Page 7: Computational Linguistics  Yoad Winter

Start with null information state I=0Repeat while there is language to read:

- Read a language token T- Recognize T: extract information

I(T) - Update information state I using I(T)- Do some action using I(T)

Processing - General Idea

Page 8: Computational Linguistics  Yoad Winter

I want to cash a check:- Start from state 0- Read “I want to”, move to state 1, and

output nothing- Read “cash”, move to state 3, and output V- Read “a check”, move to state 4, and

output nothing

Example 1 – Finite State Transducers

I want to have some |ε

I want to |ε cash | V

cash | Na check |ε0

2

1 3

4

Page 9: Computational Linguistics  Yoad Winter

I want to have some cash:- Start from state 0- Read “I want to have some”, move to state

2, and output nothing- Read “cash”, move to state 4, and output N

Example 1 – Finite State Transducers

I want to have some |ε

I want to |ε cash | V

cash | Na check |ε0

2

1 3

4

Page 10: Computational Linguistics  Yoad Winter

Your queryI want to cash a checkTaggingI/PRP want/VBP to/TO cash/VB a/DT check/NNParse

Example 2 – Stanford Parser

http://nlp.stanford.edu/software/lex-parser.shtml link

Page 11: Computational Linguistics  Yoad Winter

Your queryI want to have some cashTaggingI/PRP want/VBP to/TO have/VB some/DT cash/NNParse

Example 2 – Stanford Parser

http://nlp.stanford.edu/software/lex-parser.shtml link

Page 12: Computational Linguistics  Yoad Winter

Example 3 – Google Translate

Page 13: Computational Linguistics  Yoad Winter

Example 3 – Google Translate

Page 14: Computational Linguistics  Yoad Winter

Summary

We have seen ways to process:- words, word-by-word:

transducers- sentences, with a tree structure:

Stanford Parser

A word like CASH must be disambiguated for Noun or Verb, in order to have a correct translation.

Other kinds of disambiguation?

Page 15: Computational Linguistics  Yoad Winter

Example 4 - Word-Sense Disambiguation

the light blue car: 1. de lichtblauwe auto2. de lichte blauwe auto

John likes the light blue car but not the deep blue car

John was able to lift the light blue car but not the heavy blue car

Google Translate: lichtblauwe auto in both cases

Word-Sense disambiguation: finding the right sense of the word

Page 16: Computational Linguistics  Yoad Winter

Basic Model 1: Finite State Automata (FSA)

q0- start stateq4- accepting statearrows – transitions, also defined by a transition

table

Page 17: Computational Linguistics  Yoad Winter

FSA - formally

Page 18: Computational Linguistics  Yoad Winter

Tracing the execution of an FSA

“baaa!” is accepted because when taking the input symbols one by one, we reached the accepting state q4.

Page 19: Computational Linguistics  Yoad Winter

FSA’s as Grammars

An FSA possibly describes an infinite set of strings over a finite input alphabet Σ.

We thus say that an FSA describes a grammar over Σ, which derives a formal language over Σ.

More officially:Σ – a finite set Σ* – all the strings over Σ (infinite)L(FSA) = the language of the FSA

is the set of strings S in Σ* that are derived by the FSA.

Any set described by an FSA is called regular.

Page 20: Computational Linguistics  Yoad Winter

Non-regular languages and complexity

L = { ab, aabb, aaabbb, aaaabbbb, … }can be shown to be non-regular.

No FSA can derive this language L!

But there are grammars that can also generate non-regular languages!

Are natural languages regular or non-regular?How hard it is for a computer to recognize regular

and non-regular languages?Are there different classes of formal languages in

terms of their complexity?

Page 21: Computational Linguistics  Yoad Winter

Another way to define regular langauges – regular expressions

A regular expression is a compact way for describing a regular language.

Example: baa(a*)!

descibes the same language as the FSA we saw.We say that this regular expression matches any string in this

language, and does not match other strings.

Page 22: Computational Linguistics  Yoad Winter

Regular expressions - formally

Σ - a finite alphabet1- Any string in Σ is a regular expression that matches

itself“a” matches “a”; “b” matches “b”; etc.

2- If A and B are regular expressions then AB is a regular expression that matches any concatenation of a string that A matches with a string that B matches.

“ab” matches “ab”3- If A and B are regular expressions then A|B is a regular

expression that matches any string that A or B match. “a|b” matches both “a” and “b”

4- If A is regular expression then A* matches any string that has zero or more As.

“a*” matches the empty string, “a”, “aa”, “aaa” etc.

Page 23: Computational Linguistics  Yoad Winter

Examples

Convention: we give precedence to *.AB* = A(B*)

Convenience: we let ε match the empty string.

a|b* matches {ε, "a", "b", "bb", "bbb", ...}

(a|b)* matches the set of all strings with no symbols other than "a" and "b", including the empty string: {ε, "a", "b", "aa", "ab", "ba", "bb", "aaa", ...}

ab*(c|ε) denotes the set of strings starting with "a", then zero or more "b"s and finally optionally a "c": {"a", "ac", "ab", "abc", "abb", "abbc", ...}

Page 24: Computational Linguistics  Yoad Winter

At home

Read 3.4-3.6 on Transducers as preparation for Eva’s class.