Upload: james-crawford

Post on 17-Dec-2015

Natural Language Processing

DCG and Syntax

• NLP
• DCG
• A “translation” example: special case
• A DCG recogniser

Natural Language Processing

NLP is the art and science of getting computers to understand natural language.

• NLP draws on materials from other disciplines: computer science, formal philosophy and formal linguistics

• NLP is an “AI-complete” task: the activities that turn up elsewhere in AI, such as knowledge representation, planning and inference, all turn up in one form or another in NLP.

DCG (Definite Clause Grammar)

• An example
• $120pw $200perweek $150pweek
• [$, 1, 2, 0, p, w]

price --> dollar, number, unit.
dollar --> [$].
number --> digit, number.
number --> digit.
unit --> [p, w].
unit --> [p, e, r, w, e, e, k].
unit --> [p, w, e, e, k].

digit --> [1].
digit --> [2].
digit --> [3].
...

• ?- price([$, 2, 3, 4, p, w], []).
• Yes
• ?- price([8, 0, 0, p, w, e, e, k], []).
• No.

Expand DCG to standard predicates

price --> dollar, number, unit.

price(List1, List2) :-
    dollar(List1, List11),
    number(List11, List12),
    unit(List12, List2).

dollar --> [$].

dollar([$|List], List).

digit --> [1].

digit([1|List], List).

number-->digit, number.

number-->digit.
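The expansion above can be carried through the whole price grammar by hand. The following is a minimal sketch, assuming a standard Prolog system such as SWI-Prolog. The variable names L0, L1, L2 and L are illustrative, only the three digits listed on the slide are included, and the dollar sign is written as the quoted atom '$' as a precaution, since some systems treat a bare $ specially.

```prolog
% Hand-expanded price grammar: every nonterminal becomes a predicate with
% two extra arguments, the input list and the list left after parsing.

% price --> dollar, number, unit.
price(L0, L) :-
    dollar(L0, L1),
    number(L1, L2),
    unit(L2, L).

% dollar --> [$].
dollar(['$'|L], L).

% number --> digit, number.
number(L0, L) :-
    digit(L0, L1),
    number(L1, L).
% number --> digit.
number(L0, L) :-
    digit(L0, L).

% Only the digits shown on the slide; the remaining ones are elided there.
digit([1|L], L).
digit([2|L], L).
digit([3|L], L).

% unit --> [p,w].  unit --> [p,e,r,w,e,e,k].  unit --> [p,w,e,e,k].
unit([p,w|L], L).
unit([p,e,r,w,e,e,k|L], L).
unit([p,w,e,e,k|L], L).
```

With these clauses loaded, the query price(['$',1,2,3,p,w], []) succeeds, while price([1,2,3,p,w], []) fails because the list does not start with the dollar sign.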

Extending DCG

1. Add variables
2. Add normal predicates in { }

• ?- price(X, [$, 1, 2, 3, p, w], []).
• X = [1, 2, 3]

price(X)-->dollar, number(X), unit.

number([D|T])-->digit(D), number(T).

number([D])-->digit(D).

digit(1) --> [1].
digit(2) --> [2].

price(X)-->dollar, number(X), unit, {length(X,N), N<3}.

Expand extended DCG to standard predicates

price(X)-->dollar, number(X), unit, {length(X, N), N<3}.

price(X, List1, List2) :-
    dollar(List1, List11),
    number(X, List11, List12),
    unit(List12, List2),
    length(X, N),
    N < 3.
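To see the effect of the goal in braces, the extended grammar can be loaded and queried through the standard phrase/2 predicate. This is a sketch under a few assumptions: digit(3) is added by analogy with the two digit rules on the slide, '$' is quoted as a precaution, and the unit rules are copied from the earlier recogniser.

```prolog
% Extended price grammar: number//1 collects the digits in a list, and the
% goal inside { } is ordinary Prolog that runs once the parsing goals
% before it have succeeded.
price(X) --> dollar, number(X), unit, { length(X, N), N < 3 }.

dollar --> ['$'].

number([D|T]) --> digit(D), number(T).
number([D])   --> digit(D).

digit(1) --> [1].
digit(2) --> [2].
digit(3) --> [3].   % assumed, by analogy with the rules above

unit --> [p, w].
unit --> [p, e, r, w, e, e, k].
unit --> [p, w, e, e, k].

% ?- phrase(price(X), ['$', 1, 2, p, w]).
% X = [1, 2].
% ?- phrase(price(X), ['$', 1, 2, 3, p, w]).
% false.
```

The second query fails because the collected list has three digits and the test N < 3 rejects it. Without the braces, as in the first version of price(X), the same input would give X = [1, 2, 3].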

A “machine translation” example

• three hundred and thirty four: 334
• twenty one: 21
• fourteen: 14
• five: 5

• ?-to_number(N, [three, hundred, and, thirty, four],[]).

• N=334.

A “translation” example

• Vocabulary, lexicon

digit(1) --> [one].
digit(2) --> [two].
…..
digit(9) --> [nine].

teen(10) --> [ten].
teen(11) --> [eleven].
…..
teen(19) --> [nineteen].

tens(20) --> [twenty].
tens(30) --> [thirty].
…..
tens(90) --> [ninety].

A “translation” example

• Numbers with one or two digits.

xx(N) --> digit(N).
xx(N) --> teen(N).
xx(N) --> tens(T), rest_xx(N1), {N is T+N1}.

rest_xx(N) --> digit(N).
rest_xx(0) --> [].

A “translation” example

% numbers with 3 or fewer digits
xxx(N) --> digit(D), [hundred], rest_xxx(N1), {N is D*100+N1}.
xxx(N) --> xx(N).

rest_xxx(N) --> [and], xx(N).
rest_xxx(0) --> [].

% top level
to_number(0) --> [zero].
to_number(N) --> xxx(N).

Query
?- to_number(N, [two, hundred, and, twenty, one], []).
N = 221
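The pieces of the translation grammar can be put together into one loadable file. The sketch below copies the rules from the slides but keeps only a subset of the lexicon, namely the words needed for the example phrases listed earlier; the full lexicon is elided in the slides and is not reconstructed here.

```prolog
% Lexicon: a subset, just enough for the example phrases.
digit(1) --> [one].
digit(2) --> [two].
digit(3) --> [three].
digit(4) --> [four].
digit(5) --> [five].

teen(14) --> [fourteen].

tens(20) --> [twenty].
tens(30) --> [thirty].

% Numbers with one or two digits.
xx(N) --> digit(N).
xx(N) --> teen(N).
xx(N) --> tens(T), rest_xx(N1), { N is T + N1 }.

rest_xx(N) --> digit(N).
rest_xx(0) --> [].

% Numbers with three or fewer digits.
xxx(N) --> digit(D), [hundred], rest_xxx(N1), { N is D*100 + N1 }.
xxx(N) --> xx(N).

rest_xxx(N) --> [and], xx(N).
rest_xxx(0) --> [].

% Top level.
to_number(0) --> [zero].
to_number(N) --> xxx(N).

% ?- phrase(to_number(N), [three, hundred, and, thirty, four]).
% N = 334.
% ?- phrase(to_number(N), [twenty, one]).
% N = 21.
```

Calling phrase(to_number(N), Words) is equivalent to the three-argument call to_number(N, Words, []) used in the slides; it simply hides the two list arguments that the DCG translation adds.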

Representing Syntactic Knowledge

Syntactic knowledge:
– Syntactic categories: e.g. Noun, Sentence.
– Grammatical features: e.g. Singular, Plural.
– Grammar rules.

• Why bother?

Parts of language

• Regard sentences as being built out of constituents

• Two types of constituents:
– words (simple constituents), which have lexical categories like noun, verb, etc.
– phrases (compound constituents), like noun phrases, verb phrases, etc.

• How to store syntactic knowledge?
– lexicon
– grammar rules

Words: Lexical Categories (Parts of Speech)

• Noun (N): Jack, tree, house, cannon

• Verb (V): build, walk, kill

• Adjective (Adj): big, red, unpleasant

• Determiner (Det): the, a, which, that
– Jack built {the, a, that} big, red house;
– Which house did Jack build?

• Preposition (Prep): with, for, in, from, to, through, via, under

Words: Lexical Categories (ctd)

• Pronoun (Pro): her, him, she, itself, that, it
– I saw the man in the park with the telescope
– Don't do that to him

• Conjunction (Conj): and, or, but.

Two kinds of lexical categories:

1. Open categories (“content words”): N, V, Adj

2. Closed categories (“function words”): Det, Prep, Pro, Conj

Compound Constituents

Some compound constituents:

Sentence (S): Jack built the house.

Noun Phrase (NP):
John;
the big, red house;
the house that Jack built;
the destruction of the city.

Verb Phrase (VP):
built the house quickly;
saw the man in the park.

Prepositional Phrase (PP):
with the telescope;
on the table

A Simple Grammar

S → NP VP
VP → V NP
NP → Proper_N
NP → det N
Proper_N → John
Proper_N → Mary
N → cake
V → loves
V → ate
det → the

Sentences in this language:
“John loves Mary”
“John ate the cake”
“John loves the cake”

Definite Clause Grammars (DCGs)

The above grammar can be simply implemented in DCG notation as follows:

s --> np, vp.
vp --> v, np.
np --> proper_n.
np --> det, n.
proper_n --> [john].
proper_n --> [mary].
n --> [cake].
v --> [loves].
v --> [ate].
det --> [the].

Translating DCG

Consider the rule
s --> np, vp.

Prolog translates this as:
s(Ws1, Ws2) :- np(Ws1, Ws), vp(Ws, Ws2).

This says that an s can be taken off the start of Ws1, leaving Ws2, if an np can be taken off the start of Ws1, leaving Ws, and a vp can then be taken off the start of Ws, leaving Ws2.

The rule
proper_n --> [john].

is translated as
proper_n([john|Ws], Ws).

Query
• ?- s([john, ate, the, cake], []).
• Yes
• ?- s([ate, john, cake, the], []).
• No
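Because this grammar is finite and not recursive, the same rules that recognise a sentence can also enumerate every sentence in the language. The sketch below assumes a Prolog system with the standard phrase/2 and findall/3; the helper all_sentences/1 is a hypothetical name introduced only for this illustration.

```prolog
% The sentence grammar from the slides.
s --> np, vp.
vp --> v, np.
np --> proper_n.
np --> det, n.
proper_n --> [john].
proper_n --> [mary].
n --> [cake].
v --> [loves].
v --> [ate].
det --> [the].

% all_sentences(-Sentences): collect every word list accepted by s.
% (Hypothetical helper; it terminates only because the grammar is finite.)
all_sentences(Sentences) :-
    findall(S, phrase(s, S), Sentences).

% Recognition:
% ?- phrase(s, [john, ate, the, cake]).
% true.
% ?- phrase(s, [ate, john, cake, the]).
% false.
```

There are three noun phrases (john, mary, the cake) and two verbs, so all_sentences/1 returns 3 × 2 × 3 = 18 word lists. These include syntactically valid but odd sentences such as [the, cake, loves, john], which shows that the grammar captures only syntax and not meaning.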