computational language
![Page 1: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/1.jpg)
Computational Language
Andrew Hippisley
![Page 2: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/2.jpg)
Computational Language
• Computational language and AI
• Language engineering: applied computational language
• Case study: spell checkers
![Page 3: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/3.jpg)
Computational language & AI
• Artificial Intelligence: “the simulation on computer of distinctly human mental functions.” Wilks (1993)
![Page 4: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/4.jpg)
Computational language & AI
• Language integral to intelligent systems
  • Artificial Intelligence
  • Turing Test
  • ELIZA
![Page 5: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/5.jpg)
ELIZA
Eliza: Hello. I am ELIZA. How can I help you?
Me: I am feeling a bit depressed today
Eliza: Do you think coming here will help you not to be depressed?
Me: yes
Eliza: You are sure?
Me: a friend told me I should talk to you
Eliza: Do you wish to you should talk to me?
Me: I am willing to try
Eliza: How long have you been willing to try?
Me: I have been willing to try for about two weeks
Eliza: Can you elaborate on that?
Me: a friend said I should try to talk to you about my problem
Eliza: Why do you say your problem?
Me: I have a problem with depression
Eliza: Can you elaborate on that?
Me: no, I don't wish to discuss it
Eliza: You are being a bit negative
![Page 6: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/6.jpg)
Computational language & AI
• Why language engineering?
• Language integral to intelligent systems
  • Artificial Intelligence
  • Turing Test
  • ELIZA
  • Expert systems: natural language interface, natural language database
![Page 7: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/7.jpg)
Computational language & AI
• Methods shared across systems
  • Finite State Transition Networks (FSTN)
  • Logic
  • Formal rules
  • Probability
• Data: you know it!
![Page 8: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/8.jpg)
Applied computational language
• History of the field
  • Machine Translation: 1960, 1966, post 1966
  • Database access
  • Text interpretation
  • Information retrieval
  • Text categorisation
![Page 9: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/9.jpg)
Language engineering
• Information overload
• Need a way of automatically processing text documents
  • Information extraction
![Page 10: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/10.jpg)
Language engineering
• Information extraction
  • GIDA: system for automatically monitoring financial market sentiment
![Page 11: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/11.jpg)
GIDA
[Figure: actual closing % change and calculated % change (y-axis: % change, -5 to 5) plotted against trading day (x-axis: 1 to 10). Output obtained for the period 1st to 12th July 2002.]
![Page 12: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/12.jpg)
Language engineering
• Information overload
• Need a way of automatically processing text documents
  • Information extraction
  • Summarisation
![Page 13: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/13.jpg)
Automatic summarisation (courtesy of Paulo FERNANDES de OLIVEIRA, PhD)
• Get information source;
• Extract some content from it;
• Present the most important part to the user
![Page 14: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/14.jpg)
Lexical Cohesion
Sentence 23:
J&J's stock added 83 cents to $65.49.
Sentence 26:
Flagging stock markets kept merger activity and new stock offerings on the wane, the firm said.
Sentence 42:
Lucent, the most active stock on the New York Stock Exchange, skidded 47 cents to $4.31, after falling to a low at $4.30.
Sentence 15:
"For the stock market this move was so deeply discounted that I don't think it will have a major impact".
Links Example
Text title: U.S. stocks hold some gains.
Collected from Reuters’ Website on 20 March 2002.
![Page 15: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/15.jpg)
Lexical Cohesion
17. In other news, Hewlett-Packard said preliminary estimates showed shareholders had approved its purchase of Compaq Computer -- a result unconfirmed by voting officials.
19. In a related vote, Compaq shareholders are expected on Wednesday to back the deal, catapulting HP into contention against International Business Machines for the title of No. 1 computer company.
Bonds Example
Text title: U.S. stocks hold some gains.
Collected from Reuters’ Website on 20 March 2002.
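The links and bonds examples above can be approximated in code. The following is only a rough sketch under stated assumptions: a "link" is taken to be a content word shared by two sentences, and a "bond" is taken to be a pair of sentences joined by at least three links. The threshold of three, the tiny stop-word list and the function names are illustrative assumptions, not taken from the slides.

```python
import re

# Assumption: a very small stop-word list; words of one or two letters are
# also ignored so that "in", "a", "of", "to" never count as links.
STOPWORDS = {"the", "and", "for", "its", "had", "are", "into", "after"}


def content_words(sentence):
    """Lowercased alphabetic tokens, minus stop words and very short words."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return {t for t in tokens if len(t) > 2 and t not in STOPWORDS}


def links(s1, s2):
    """Content words repeated across the two sentences."""
    return content_words(s1) & content_words(s2)


def is_bonded(s1, s2, threshold=3):
    """Assumption: sentences bond when they share at least `threshold` links."""
    return len(links(s1, s2)) >= threshold


S17 = ("In other news, Hewlett-Packard said preliminary estimates showed "
       "shareholders had approved its purchase of Compaq Computer -- a result "
       "unconfirmed by voting officials.")
S19 = ("In a related vote, Compaq shareholders are expected on Wednesday to "
       "back the deal, catapulting HP into contention against International "
       "Business Machines for the title of No. 1 computer company.")
S23 = "J&J's stock added 83 cents to $65.49."
S42 = ("Lucent, the most active stock on the New York Stock Exchange, skidded "
       "47 cents to $4.31, after falling to a low at $4.30.")

print(sorted(links(S17, S19)))  # ['compaq', 'computer', 'shareholders']
print(is_bonded(S17, S19))      # True
print(sorted(links(S23, S42)))  # ['cents', 'stock']
print(is_bonded(S23, S42))      # False
```

Exact string matching misses related forms such as "vote"/"voting" and "Hewlett-Packard"/"HP"; a fuller treatment would need stemming and alias handling.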
![Page 16: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/16.jpg)
Language engineering
• Information overload
• Need a way of automatically processing text documents
  • Information extraction
  • Summarisation
  • Translation
  • Retrieve only relevant documents
  • Voice processing
![Page 17: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/17.jpg)
Language engineering
• Two main approaches
  • Symbolic
  • Stochastic
![Page 18: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/18.jpg)
Case study: spell checkers
![Page 19: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/19.jpg)
Spelling dictionaries
• Aim? Given a sequence of symbols:
  1. identify misspelled strings
  2. generate a list of possible ‘candidate’ correct strings
  3. select most probable candidate from the list
![Page 20: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/20.jpg)
Spelling dictionaries
• Implementation:
  • Probabilistic framework
  • Bayesian rule
  • noisy channel model
![Page 21: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/21.jpg)
Spelling dictionaries
• Types of spelling error
  • actual word errors
  • non-word errors
![Page 22: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/22.jpg)
Spelling dictionaries
• Types of spelling error
  • actual word errors: /piece/ instead of /peace/, /there/ instead of /their/
  • non-word errors
![Page 23: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/23.jpg)
Spelling dictionaries
• Types of spelling error
  • actual word errors: /piece/ instead of /peace/, /there/ instead of /their/
  • non-word errors: /graffe/ instead of /giraffe/
![Page 24: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/24.jpg)
Spelling dictionaries
• Types of spelling error
  • actual word errors: /piece/ instead of /peace/, /there/ instead of /their/
  • non-word errors: /graffe/ instead of /giraffe/
• Of all errors in typewritten texts, 80% are non-word errors
![Page 25: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/25.jpg)
Spelling dictionaries
• non-word errors
  • Cognitive errors: /seperate/ instead of /separate/
  • a phonetically equivalent sequence of symbols has been substituted due to lack of knowledge about spelling conventions
![Page 26: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/26.jpg)
Spelling dictionaries
• non-word errors
  • Cognitive errors
  • Typographic (‘typo’) errors
    • influenced by keyboard, e.g. substitution of /w/ for /e/ due to its adjacency on the keyboard: /thw/ instead of /the/
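The keyboard-adjacency idea can be illustrated with a small check. This is a minimal sketch under simplifying assumptions: a QWERTY layout, and "adjacent" meaning only the immediate left or right neighbour on the same row. The names `QWERTY_ROWS`, `row_neighbours` and `is_adjacent_substitution` are hypothetical.

```python
# Assumption: QWERTY letter rows; only same-row neighbours count as adjacent.
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]


def row_neighbours(key):
    """Keys immediately left and right of `key` on its row."""
    for row in QWERTY_ROWS:
        i = row.find(key)
        if i != -1:
            return set(row[max(i - 1, 0):i] + row[i + 1:i + 2])
    return set()


def is_adjacent_substitution(typo, intended):
    """True if `typo` differs from `intended` in exactly one position and the
    typed key is a same-row neighbour of the intended key."""
    if len(typo) != len(intended):
        return False
    diffs = [(t, c) for t, c in zip(typo, intended) if t != c]
    return len(diffs) == 1 and diffs[0][0] in row_neighbours(diffs[0][1])


print(is_adjacent_substitution("thw", "the"))  # True: w sits next to e
print(is_adjacent_substitution("thx", "the"))  # False: x is not next to e
```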
![Page 27: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/27.jpg)
Spelling dictionaries
• non-word errors: noisy channel model
  • The actual word has been passed through a noisy communication channel
  • This has distorted the word, thereby changing it in some way
  • The misspelled word is the distorted version of the actual word
  • Aim: recover the actual word by hypothesising about the possible ways in which it could have been distorted
![Page 28: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/28.jpg)
Spelling dictionaries
• non-word errors: noisy channel model
• What are the possible distortions?
  • insertion
  • deletion
  • substitution
  • transposition
• all of these viewed as transformations that take place in the noisy channel
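The four transformations can be enumerated mechanically. The sketch below is illustrative only: it assumes a lowercase English alphabet and returns, for a given string, every string reachable by one insertion, deletion, substitution or transposition. The function name `single_edits` is hypothetical.

```python
import string

ALPHABET = string.ascii_lowercase  # assumption: lowercase English letters only


def single_edits(word):
    """Map each transformation type to the set of strings it yields."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    return {
        # add one letter at any position
        "insertion": {left + ch + right
                      for left, right in splits for ch in ALPHABET},
        # drop one letter
        "deletion": {left + right[1:] for left, right in splits if right},
        # replace one letter with a different one
        "substitution": {left + ch + right[1:]
                         for left, right in splits if right
                         for ch in ALPHABET if ch != right[0]},
        # swap two neighbouring, non-identical letters
        "transposition": {left + right[1] + right[0] + right[2:]
                          for left, right in splits
                          if len(right) > 1 and right[0] != right[1]},
    }


edits = single_edits("thw")
print("the" in edits["substitution"])                   # True
print("giraffe" in single_edits("graffe")["insertion"])  # True
```

Because each transformation has an inverse among the four (insertion undoes deletion, and so on), applying them to the typo reaches the same words the channel could have started from.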
![Page 29: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/29.jpg)
Spelling dictionaries
• Implementing spelling identification and correction algorithm
![Page 30: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/30.jpg)
Spelling dictionaries
• Implementing spelling identification and correction algorithm
  • STAGE 1: compare each string in document with a list of legal strings; if no corresponding string in list, mark as misspelled
  • STAGE 2: generate list of candidates
    • Apply any single transformation to the typo string
    • Filter the list by checking against a dictionary
  • STAGE 3: assign probability values to each candidate in the list
  • STAGE 4: select best candidate
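Stages 1 and 2 can be sketched as follows. The word list `LEGAL_WORDS` is a toy stand-in for a real dictionary, and the helper names are hypothetical; this is one possible reading of the stages, not the implementation behind the slides.

```python
import string

# Assumption: a tiny list of legal strings standing in for a full dictionary.
LEGAL_WORDS = {"the", "thaw", "then", "they", "giraffe", "is", "tall"}
ALPHABET = string.ascii_lowercase


def find_misspelled(tokens, legal_words):
    """STAGE 1: mark every token that has no match in the legal list."""
    return [t for t in tokens if t.lower() not in legal_words]


def one_transformation(s):
    """All strings one insertion, deletion, substitution or transposition away."""
    splits = [(s[:i], s[i:]) for i in range(len(s) + 1)]
    inserts = {a + ch + b for a, b in splits for ch in ALPHABET}
    deletes = {a + b[1:] for a, b in splits if b}
    substitutes = {a + ch + b[1:] for a, b in splits if b for ch in ALPHABET}
    transposes = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    return (inserts | deletes | substitutes | transposes) - {s}


def candidates(typo, legal_words):
    """STAGE 2: apply any single transformation, then filter by dictionary."""
    return sorted(one_transformation(typo) & legal_words)


tokens = "thw giraffe is tall".split()
print(find_misspelled(tokens, LEGAL_WORDS))  # ['thw']
print(candidates("thw", LEGAL_WORDS))        # ['thaw', 'the']
print(candidates("graffe", LEGAL_WORDS))     # ['giraffe']
```

Note that "then" and "they" are excluded for "thw" because each needs two transformations, which matches the single-transformation restriction in STAGE 2.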
![Page 31: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/31.jpg)
Spelling dictionaries
• STAGE 3
  • prior probability
    • given all the words in English, is this candidate more likely to be what the typist meant than that candidate?
    • P(c) = freq(c)/N, where freq(c) is the number of times candidate c occurs in a corpus and N is the total number of words in that corpus
  • likelihood
    • given the possible errors, or transformations, how likely is it that error y has operated on candidate x to produce the typo?
    • P(t|c), calculated using a corpus of errors, or transformations
  • Bayesian rule:
    • get the product of the prior probability and the likelihood: P(c) × P(t|c)
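The scoring step can be worked through on toy numbers. In the sketch below the corpus is a ten-word stand-in for a large text corpus, and the likelihood values for the typo given each candidate are invented purely for illustration; in practice they would be estimated from a corpus of errors. All names are hypothetical.

```python
from collections import Counter

# Assumption: a toy corpus; a real prior would come from a large corpus.
CORPUS = "the cat sat by the fire until the thaw came".split()
WORD_COUNTS = Counter(CORPUS)
N = len(CORPUS)  # total number of words in the corpus

# Assumption: invented likelihoods of typo t given candidate c.
LIKELIHOOD = {
    ("thw", "the"): 0.002,   # substitution of w for e
    ("thw", "thaw"): 0.001,  # deletion of a
}


def prior(candidate):
    """Relative frequency of the candidate in the corpus."""
    return WORD_COUNTS[candidate] / N


def score(typo, candidate):
    """Product of prior probability and likelihood."""
    return prior(candidate) * LIKELIHOOD.get((typo, candidate), 0.0)


def best_candidate(typo, candidate_list):
    """Pick the candidate with the highest score."""
    return max(candidate_list, key=lambda c: score(typo, c))


print(best_candidate("thw", ["thaw", "the"]))  # the
```

Here "the" wins on both counts: it is three times as frequent as "thaw" in the toy corpus and has the higher assumed likelihood. With real data the two factors can pull in opposite directions, which is why the product is used rather than either factor alone.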
![Page 32: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/32.jpg)
Spelling dictionaries
• non-word errors
• Implementing spelling identification and correction algorithm
  • STAGE 1: identify misspelled words
  • STAGE 2: generate list of candidates
  • STAGE 3a: rank candidates for probability
  • STAGE 3b: select best candidate
• Implement:
  • noisy channel model
  • Bayesian rule
![Page 33: Computational Language](https://reader034.vdocuments.mx/reader034/viewer/2022051517/56814d61550346895dbaa8ad/html5/thumbnails/33.jpg)
Next week
Finite state machines and regular expressions