Informatics 2A: Language Complexity and the Chomsky Hierarchy
Slides by Bonnie Webber (modified by Stuart Anderson)
September 28, 2010
Outline
Review
Chomsky’s Models
Dependency as a measure of Complexity of Language
The Chomsky Hierarchy
Starter 1
Is there a finite state machine that recognises all those strings s from the alphabet {a, b} where the difference between the number of as and the number of bs is less than k for some constant k?
- True or
- False?
Starter 2
Is there a finite state machine that recognises all those strings s from the alphabet {a, b} where the difference between the number of as and the number of bs is less than k for some constant k in every prefix of s?
A prefix of any string s is a string p such that there is a string q such that s = pq. Note that it is possible that q = ε.
- True or
- False?
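One way to explore Starter 2 is to simulate a candidate machine. The sketch below is only an illustration, not an official solution: the names `accepts_bounded_prefix` and `number_of_states` are hypothetical, and k is assumed to be at least 1. The only thing remembered while reading is the current difference between the number of as and bs, which must stay strictly between -k and k.

```python
def accepts_bounded_prefix(s: str, k: int) -> bool:
    """Simulate a machine whose state is (#a - #b) over the prefix read so far.

    Assumes k >= 1. Accepts iff every prefix of s keeps |#a - #b| < k.
    """
    diff = 0  # current state: one of -(k-1), ..., (k-1)
    for symbol in s:
        if symbol == "a":
            diff += 1
        elif symbol == "b":
            diff -= 1
        else:
            return False  # symbol outside the alphabet {a, b}
        if abs(diff) >= k:
            return False  # dead state: this prefix violates the bound
    return True


def number_of_states(k: int) -> int:
    """Live states -(k-1)..(k-1), plus one dead state."""
    return (2 * k - 1) + 1
```

For a fixed k the number of states does not depend on the length of the input, which is the property worth comparing against Starter 1.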
Readings and Labs
- J&M [2nd Ed.] ch. 15 (pp. 1–4)
- Kozen: Lecture 21
Languages: Collection and Generation
A formal language is a possibly infinite set of strings over a finite set of symbols (called a vocabulary or lexicon).
Such strings are also called sentences of the language.
Where do the sentences come from?
- from a (finite) list – useful, but not very interesting (maybe more interesting when we have collections of really large samples of speech or text).
- from a grammar – an abstract characterisation of the strings belonging to a language. Grammars are a generative mechanism: they give rules for generating a potentially infinite collection of finite strings.
Different kinds of Language
Programming language: Programmers are given an explicit grammar for the syntactically valid strings of the language that they must adhere to.
Human language: Children hear/see sentences of a language (their “mother tongue” or other languages used at home or in their community) and are sometimes (but not always!) corrected if a string they generate isn’t in the language.
Without being given an explicit grammar, how do children learn the grammar(s) for the infinite number of sentences that belong to the language(s) they speak and understand?
Structure and Meaning
Small red androids sleep quietly. √
Colorless green ideas sleep furiously. √
Sleep green furiously ideas colorless. ✗
Mary persuaded John to wash himself with lavender soap. √
Mary persuaded John to wash herself with lavender soap. ✗
Mary persuaded John to wash her with lavender soap. √
Mary promised John to wash herself with lavender soap. √
Mary promised John to wash himself with lavender soap. ✗
Mary promised John to wash him with lavender soap. √
- Characterising child language acquisition is one goal of Linguistics.
- Characterising language learnability (grammar induction) is one goal of Informatics.
Natural and Formal Languages
More broadly, the goals of Linguistics are to characterise:
- individual languages: figuring out and specifying their sound systems, grammars, and semantics;
- how children learn language and what allows them to do so;
- the social systems of language use;
- how individual languages change over time, and how new languages arise.
Work on formal languages in Informatics contributes to achieving these goals through
- clear computational methods of characterising the complexity of languages;
- clear computational methods for processing languages;
- clear computational theories of language learnability.
Questions
We heard from Lecture 2 that grammars differ in their complexity.
- What is complex about a complex grammar?
- How does adding a data structure to an automaton allow its corresponding grammar to be more complex?
- How does removing limits on how the store on an automaton is accessed allow its corresponding grammar to be more complex?
- Is there any relationship between language complexity and how hard a language is to learn?
Chomsky’s desire to find a “simple and revealing” grammar that generates exactly the sentences of English led him to the discovery that some models of language were more powerful than others. [Noam Chomsky, Three Models for the Description of Language, IRE Transactions on Information Theory 2 (1956), pp. 113–124.]
Noam Chomsky
- Credited with the creation of the theory of generative grammar
- Significant contributions to the field of theoretical linguistics
- Sparked the cognitive revolution in psychology through his review of B.F. Skinner’s Verbal Behavior
- Credited with the establishment of the Chomsky-Schützenberger hierarchy, a classification of formal languages in terms of their generative power
Three Models for the Description of Language
- Linguistic theory attempts to explain the ability of a speaker to produce and understand new sentences, and to reject as ungrammatical other new sequences, on the basis of his limited linguistic experience. [Chomsky 1956, p. 113]
- The adequacy of a linguistic theory can be tested by looking at a grammar for a language constructed according to the theory and seeing if it makes predictions that accord with what’s found in a large corpus of sentences of that language.
- What about what is not found in a large corpus of sentences?
- Chomsky’s paper explores the sort of linguistic theory that is “required as a basis for an English grammar that will describe the set of English sentences in an interesting and satisfying manner”.
Three Models for the Description of Language
For that description to be “interesting and satisfying”, Chomsky felt that a grammar had to be
- finite
- “revealing”, in allowing strings to be associated with meaning (semantics) in a systematic way
The three models he considered were:
1. Grammars based on finite-state Markov processes [Shannon & Weaver 1949, The Mathematical Theory of Communication] – regular grammars
2. Phrase structure grammars reflecting pedagogical ideas of “sentence diagramming”
3. Transformational grammars
Dependency and Complexity
Much of Chomsky’s argument in Three Models for the Description of Language (3MDL) is based on the notion of dependency:
Suppose s = a_1 a_2 … a_n is a sentence of language L.
We say that s has an i-j dependency if, when symbol a_i is replaced with symbol b_i, the string is no longer a sentence of L, and when symbol a_j is then replaced by some new symbol b_j, the resulting string is a sentence of L.
We’ve already seen such a dependency in English:
Mary persuaded John to wash himself with lavender soap.
John ⇒ Sue
himself ⇒ herself
Mary persuaded Sue to wash herself with lavender soap.
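The definition can be checked mechanically once we have a membership test for L. The sketch below is a minimal illustration under stated assumptions: `has_dependency`, `TOY_LANGUAGE` and `in_toy_language` are hypothetical names, the toy language contains just the two English sentences above (treated as sequences of words), and the function checks one given pair of replacements b_i, b_j rather than searching for them.

```python
from typing import Callable, Sequence, Tuple

Sentence = Tuple[str, ...]


def has_dependency(
    s: Sequence[str],
    i: int,
    j: int,
    b_i: str,
    b_j: str,
    in_language: Callable[[Sentence], bool],
) -> bool:
    """Check an i-j dependency (1-indexed) witnessed by replacements b_i, b_j."""
    if not in_language(tuple(s)):
        return False  # s must itself be a sentence of L
    broken = list(s)
    broken[i - 1] = b_i
    if in_language(tuple(broken)):
        return False  # replacing a_i alone must leave L
    repaired = list(broken)
    repaired[j - 1] = b_j
    return in_language(tuple(repaired))  # then replacing a_j must restore it


# Toy language (an assumption for illustration): only these two sentences.
TOY_LANGUAGE = {
    ("Mary", "persuaded", "John", "to", "wash", "himself"),
    ("Mary", "persuaded", "Sue", "to", "wash", "herself"),
}


def in_toy_language(sentence: Sentence) -> bool:
    return sentence in TOY_LANGUAGE
```

With `s = ("Mary", "persuaded", "John", "to", "wash", "himself")`, the call `has_dependency(s, 3, 6, "Sue", "herself", in_toy_language)` reports the dependency between the third and sixth words.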
Dependencies don’t need to be binary
- R.D. Laing took this to extremes in Knots, his play on sanity in everyday language.
- “There must be something the matter with him because he would not be acting as he does unless there was therefore he is acting as he is because there is something the matter with him”
Dependency Sets
- If we restrict ourselves to binary dependencies, then for any sentence s we can construct a dependency set D = {(i_1, j_1), …, (i_k, j_k)} where each pair is a dependency in s.
- For example: If Mary has persuaded John to wash himself with lavender soap, then he is clean. (dependency set size = 4)
- Sentences in the language generated by a regular grammar can have dependencies.
- Consider the regular language described by the regular expression:
L0 = (b* + (ab*c))*
i.e. the language in which every a is eventually followed by a c and only bs may intervene.
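Membership in L0 can be tested directly with a regular expression engine. This is a small sketch using Python's standard `re` module; the names `L0_PATTERN` and `in_l0` are ours, and the pattern is written as `(?:b|ab*c)*`, which denotes the same language as the expression for L0 because a run of bs is just the single-b choice repeated.

```python
import re

# Same language as (b* + (ab*c))*, written without a starred empty-matching branch.
L0_PATTERN = re.compile(r"(?:b|ab*c)*")


def in_l0(s: str) -> bool:
    """Return True iff the whole string s is a sentence of L0."""
    return L0_PATTERN.fullmatch(s) is not None
```

For instance, `in_l0("abbc")` holds, while `in_l0("aac")` does not, since a second a appears before the first has met its c.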
An example: L0
- bbbabbcbbbabcbbbb ∈ L0 is a typical sentence in the language.
- {(4, 7), (11, 13)} is the dependency set for the sentence.
- Suppose we colour both symbols of each dependency pair the same colour, and allow a colour to be reused for parts of the string after the later symbol of its pair has appeared. How many colours do we need to colour the symbols in sentences of L0?
- bbbabbcbbbabcbbbb uses just one colour.
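The dependency set of an L0 sentence can be read off in a single left-to-right pass, remembering at most one unmatched a. The function name `l0_dependency_set` below is hypothetical, and the sketch assumes the natural pairing of each a with the c that closes it; positions are 1-indexed, as on the slide.

```python
from typing import List, Tuple


def l0_dependency_set(s: str) -> List[Tuple[int, int]]:
    """Pair each a with the c that follows it (1-indexed positions).

    Raises ValueError if s is not a sentence of L0.
    """
    pairs = []
    open_a = None  # position of the a still waiting for its c, if any
    for pos, symbol in enumerate(s, start=1):
        if symbol == "a":
            if open_a is not None:
                raise ValueError("a second a before the first was closed")
            open_a = pos
        elif symbol == "c":
            if open_a is None:
                raise ValueError("a c with no preceding a")
            pairs.append((open_a, pos))
            open_a = None
        elif symbol != "b":
            raise ValueError(f"symbol {symbol!r} is not in {{a, b, c}}")
    if open_a is not None:
        raise ValueError("an a was never followed by a c")
    return pairs
```

Because only one position is ever held in `open_a`, the pass mirrors the observation that one colour suffices for L0.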
Limits to Dependencies
- The number of colours we need to colour the dependency set of a sentence gives us a measure of the amount that has to be remembered about earlier symbols to get the dependencies right. If we need k colours then we need to remember at most k symbols at any one time.
- For any regular language R there must exist a constant k_R such that the dependency set for any sentence in the language can be coloured with at most k_R colours.
- What do you make of this claim?
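To experiment with the claim, the number of colours for a given dependency set can be computed. The sketch below is one possible reading of the colouring convention, not a definitive one: it assumes each pair (i, j) has i < j and holds its colour from position i up to and including position j, so the answer is the largest number of pairs open at the same time. The name `colours_needed` and the second example set are hypothetical.

```python
import heapq
from typing import Iterable, Tuple


def colours_needed(dependency_set: Iterable[Tuple[int, int]]) -> int:
    """Fewest colours when a colour is freed after its pair's later symbol."""
    open_ends = []  # min-heap of later positions of pairs still open
    most = 0
    for i, j in sorted(dependency_set):
        # Free every colour whose pair closed strictly before position i.
        while open_ends and open_ends[0] < i:
            heapq.heappop(open_ends)
        heapq.heappush(open_ends, j)
        most = max(most, len(open_ends))
    return most
```

For example, `colours_needed({(4, 7), (11, 13)})` gives 1, while a hypothetical nested set such as `{(1, 6), (2, 5), (3, 4)}` gives 3.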
Example 1
- L1 consists of all (and only) sentences over {a, b} containing n as followed by n bs: e.g., ab, aabb, aaabbb, ….
- Suggest a dependency set for aaaaaabbbbbb.
- How many colours does it take to colour the dependencies?
- How many colours does it take to colour the dependencies for a^n b^n?
- Is this a good example? What would you need to add to improve it?
Example 2
- L2 consists of all (and only) sentences over {a, b} containing a string of as and bs followed by its reverse, {αα^R | α ∈ {a, b}*}: e.g., aa, bb, abba, baab, abaabbaaabbbbbbaaabbaaba, ….
- What is the dependency set for aaaaaaaa?
- How many colours are required to colour this dependency set?
- How many colours does it take to colour the dependency set for a^{2n}?
Example 3
- L3 consists of all (and only) sentences over {a, b} containing a string of as and bs followed by the same string over again, {αα | α ∈ {a, b}*}: e.g., aa, bb, abab, baba, abbabb, abaaba, ….
- What is the dependency set for aaaaaaaa?
- How many colours does it take to colour the dependencies?
- How many colours does it take to colour the dependency set for a^{2n}?
Questions
- For any string of length 2k in L2, what is its dependency set?
- For any string of length 2k in L3, what is its dependency set?
- Is the dependency set unique for strings in L1? Strings in L2? Strings in L3?
- For each of the languages L1, L2 and L3, what is the minimum and maximum size of the dependency set for any string of length 2k?
- Give an example language in which some sentences have more than one dependency set.
- Can you devise a language which is regular (i.e. recognisable by a FSM) and whose dependency set needs more than one colour?
The simplest languages – ones that can be described by a regular grammar – have a fixed bound on the number of colours needed to colour any dependency set in the language.
They are at the lowest rung of the Chomsky Hierarchy.
[Diagram: regular grammars]
Are all languages with arbitrarily many dependencies equally complex?
Phrase Structure Grammars
Phrase structure grammars provide a way of analysing sentences very much like some of us were taught to do:
the man    took    the book
   NP      verb       NP
           |_____ VP _____|
|________ Sentence ________|
This is called an “Immediate Constituent Analysis”.
It shows a sentence made of a noun phrase (NP) followed by a verb phrase (VP), and a verb phrase made of a verb followed by an NP.
How is phrase structure specified?
A phrase structure grammar consists of
- a finite vocabulary V
- a finite set Σ of initial strings over V
- a finite set of rules of the form X → Y where
  1. X and Y are strings over V
  2. Y is formed from X by replacing one symbol of X with a string over V
  3. Neither the replaced symbol nor the replacing string is empty (ε).
Context-free Phrase Structure Grammars
Rules of the simplest PS Grammars contain only a single symbol on their left-hand side – e.g.,
Σ: {S}
S → NP VP
VP → verb NP
NP → the man
NP → the book
verb → took
These are called Context-free PSGs or, for short, Context-free Grammars (CFGs).
Derivations in CFGs
The sequence of strings over V produced by a sequence of PS rule applications, starting from an initial string, is called a derivation:
S ⇒ NP VP ⇒ NP verb NP ⇒ NP verb the book ⇒ NP took the book ⇒ the man took the book
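The derivation above can be replayed step by step in code. This is a minimal sketch, assuming sentential forms are lists of symbols and each step names the position to rewrite and the right-hand side to use; `RULES`, `rewrite`, `derive` and `SLIDE_STEPS` are hypothetical names introduced for illustration.

```python
from typing import List, Tuple

# The example context-free grammar: one left-hand symbol per rule.
RULES = {
    "S": [["NP", "VP"]],
    "VP": [["verb", "NP"]],
    "NP": [["the", "man"], ["the", "book"]],
    "verb": [["took"]],
}


def rewrite(form: List[str], position: int, rhs: List[str]) -> List[str]:
    """Replace the symbol at `position` using a rule of the grammar."""
    lhs = form[position]
    if rhs not in RULES.get(lhs, []):
        raise ValueError(f"no rule {lhs} -> {' '.join(rhs)}")
    return form[:position] + rhs + form[position + 1:]


def derive(steps: List[Tuple[int, List[str]]]) -> List[List[str]]:
    """Return every sentential form of the derivation, starting from S."""
    history = [["S"]]
    for position, rhs in steps:
        history.append(rewrite(history[-1], position, rhs))
    return history


# The steps of the derivation shown above, in the same order.
SLIDE_STEPS = [
    (0, ["NP", "VP"]),     # S  => NP VP
    (1, ["verb", "NP"]),   # VP => verb NP
    (2, ["the", "book"]),  # second NP => the book
    (1, ["took"]),         # verb => took
    (0, ["the", "man"]),   # first NP => the man
]
```

Joining the last form of `derive(SLIDE_STEPS)` with spaces yields the sentence at the end of the derivation.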
Dependency and PS Grammars
Some dependencies that are beyond the capability of a regular grammar can be captured by a context-free grammar. Such dependencies are ones that can be generated locally.
Recall L1: all (and only) sentences over {a, b} containing n a’s followed by n b’s.
Here, the presence of a b on the right of the string depends on there being a corresponding a on the left.
Simple PSG for generating L1:
V = {a, b, S}
Σ = {S}
PS rules: S → aSb
S → ab
Derivation
Sample derivation:
S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaaabbbb
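The same derivation can be produced for any n by applying S → aSb n - 1 times and then S → ab once. The helper name `derive_anbn` is hypothetical; this is a sketch of the grammar for L1 given above, with sentential forms kept as plain strings.

```python
from typing import List


def derive_anbn(n: int) -> List[str]:
    """Return the sentential forms in the derivation of a^n b^n (n >= 1)."""
    if n < 1:
        raise ValueError("L1 has no sentence with n < 1")
    form = "S"
    forms = [form]
    for _ in range(n - 1):
        form = form.replace("S", "aSb")  # rule S -> aSb
        forms.append(form)
    form = form.replace("S", "ab")       # rule S -> ab ends the derivation
    forms.append(form)
    return forms
```

Each application adds one a and one b together, around the single S, which is how the grammar keeps the two counts equal without counting.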
Dependency and Complexity Revisited
Are all dependencies local?
Are there dependencies that cannot be captured in a CFG?
[Diagram: regular grammars nested inside context-free grammars]
The dependency in L3 = {XX}, where X is a string over {a, b}, cannot be captured by a CFG, nor can the dependency in L4, consisting of all (and only) sentences over {a, b, c} containing a string of n a’s, then n b’s, followed by n c’s – e.g., abc, aabbcc, aaabbbccc, etc.
Context-sensitive PSGs
Phrase structure grammars with rules whose LHS contains more than one symbol are called context-sensitive phrase structure grammars or, simply, context-sensitive grammars.
Simple context-sensitive grammar for generating L4:
V = {a, b, c, S, B}
Σ = {S}
PS rules: S → abc | aSBc
cB → Bc
bB → bb
Sample Derivation
Sample derivation:
S ⇒ aSBc ⇒ aaSBcBc ⇒ aaabcBcBc ⇒ aaabBccBc ⇒aaabbccBc ⇒ aaabbcBcc ⇒ aaabbBccc ⇒ aaabbbccc
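The sample derivation can be replayed as plain string rewriting. In this sketch each step rewrites the leftmost occurrence of a rule's left-hand side, which happens to reproduce the derivation above; `apply_rule`, `replay` and `STEPS` are hypothetical names introduced for illustration.

```python
from typing import List, Tuple


def apply_rule(form: str, lhs: str, rhs: str) -> str:
    """Rewrite the leftmost occurrence of lhs in form with rhs."""
    if lhs not in form:
        raise ValueError(f"{lhs!r} does not occur in {form!r}")
    return form.replace(lhs, rhs, 1)


# Rule applications in the order used by the sample derivation.
STEPS = [
    ("S", "aSBc"),
    ("S", "aSBc"),
    ("S", "abc"),
    ("cB", "Bc"),
    ("bB", "bb"),
    ("cB", "Bc"),
    ("cB", "Bc"),
    ("bB", "bb"),
]


def replay(steps: List[Tuple[str, str]], start: str = "S") -> List[str]:
    """Return every sentential form produced by applying steps in order."""
    forms = [start]
    for lhs, rhs in steps:
        forms.append(apply_rule(forms[-1], lhs, rhs))
    return forms
```

The rules cB → Bc and bB → bb inspect two adjacent symbols at once, which is exactly the extra context a context-free rule cannot see.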
Context on the LHS allows for more dependencies and hence more complexity.
[Diagram: regular grammars nested inside context-free grammars, nested inside context-sensitive grammars]
Top of the Chomsky Hierarchy
At the top of the hierarchy are arbitrary re-write systems that can take account of any amount of context on the LHS and re-write any number of symbols. These are called Type 0 grammars.
[Diagram: regular grammars nested inside context-free grammars, nested inside context-sensitive grammars, nested inside Type 0 grammars]
This is what is normally called the Chomsky hierarchy.