Computational Models of Text Quality
Ani NenkovaUniversity of Pennsylvania
ESSLLI 2010, Copenhagen
1
The ultimate text quality application
Imagine your favorite text editor
With a spell-checker and grammar checker, but also functions that tell you:
  "Word W is repeated too many times"
  "'Fill the gap' is a cliché"
  "You might consider using this more figurative expression"
  "This sentence is unclear and hard to read"
  "What is the connection between these two sentences?"
  ...
2
Currently
It is our friends who give such feedback
Often conflicting
We might agree that a text is good, but find it hard to explain exactly why
Computational linguistics should have some answers
Though far from offering a complete solution yet
3
In this course
We will overview research dealing with various aspects of text quality
A unified approach does not yet exist, but many proposals have been tested on corpus data and integrated in applications
4
Current applications: education
Grading student writing
Is this a good essay?
One of the graders of SAT and GRE essays is in fact a machine! [1]
http://www.ets.org/research/capabilities/automated_scoring
Providing appropriate reading material
Is this text good for a particular user?
Appropriate grade level
Appropriate language competency in L2 [2,3]
http://reap.cs.cmu.edu/
5
Current applications: information retrieval
Particularly user-generated content
Questions and answers on the web
Blogs and comments
Searching over such content poses new problems [4]
What is a good question/answer/comment? http://answers.yahoo.com/
Relevant for general IR as well
Of the many relevant documents, some are better written
6
Current applications: NLP
Models of text quality lead to improved systems [5] and offer possibilities for automatic evaluation [6]
Automatic summarization
Select important content and organize it as a well-written text
Language generation
Select, organize and present content at the document, paragraph, sentence and phrase level
Machine translation
7
Text quality factors
Interesting
Style (clichés, figurative language)
Vocabulary use
Grammatical and fluent sentences
Coherent and easy to understand
In most types of writing, well-written means clear and easy to understand. Not necessarily so in literary works.
Problems with clarity of instructions motivated a fair amount of early work.
8
Early work: keep in mind these predate modern computers!
Common words are easier to understand
  stentorian vs. loud; myocardial infarction vs. heart attack
Common words are short
Standard readability metrics:
  percentage of words not among the N most frequent
  average number of syllables per word
Syntactically simple sentences are easier to understand
  average number of words per sentence
[Flesch-Kincaid, Automated Readability Index, Gunning-Fog, SMOG, Coleman-Liau]
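As a concrete illustration, here is a minimal sketch of the Flesch-Kincaid Grade Level formula (0.39 * words/sentence + 11.8 * syllables/word - 15.59); the syllable counter is a crude vowel-group heuristic, not the dictionary-based count the original metric assumes.

```python
import re

def count_syllables(word):
    # Crude heuristic: count groups of consecutive vowels (at least one per word).
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text):
    """0.39 * (words/sentences) + 11.8 * (syllables/words) - 15.59"""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words) - 15.59)

print(flesch_kincaid_grade("The cat sat on the mat. It was happy."))
```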
9
Modern equivalents
Language models Word probabilities from a large collection
http://www.speech.cs.cmu.edu/SLM_info.html
Features derived from syntactic parse [2,7,8,9]
  Parse tree height
  Number of subordinating conjunctions
  Number of passive voice constructions
  Number of noun and verb phrases
10
Language models
Unigram and bigram language models
Really, just huge tables
Smoothing necessary to account for unseen words

p(w) = n_w / N

p(w1 | w2) = n_{w2 w1} / n_{w2}
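A minimal sketch of these maximum-likelihood estimates, with add-one smoothing standing in for whatever smoothing scheme a real system would use (function names are illustrative):

```python
from collections import Counter

def train_lm(sentences):
    """Collect unigram and bigram counts from tokenized sentences."""
    uni, bi = Counter(), Counter()
    for sent in sentences:
        uni.update(sent)
        bi.update(zip(sent, sent[1:]))
    return uni, bi

def p_bigram(w1, w2, uni, bi):
    """P(w1 | w2) with add-one smoothing over the observed vocabulary."""
    V = len(uni)
    return (bi[(w2, w1)] + 1) / (uni[w2] + V)

uni, bi = train_lm([["the", "cat", "sat"], ["the", "dog", "sat"]])
print(p_bigram("cat", "the", uni, bi))  # P(cat | the)
```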
11
Features from language models
Assessing the readability of text t consisting of m words, for intended audience class c
Number of out of vocabulary words in the text with respect to the language model for c
Text likelihood and perplexity
L(t) = P(c) P(w1 | c) ... P(wm | c)

PP = 2^{H(t | c)}

H(t | c) = -(1/m) log2 P(t | c)
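A sketch of how these features could be computed from a class unigram model; the OOV floor probability is an assumption of this sketch, not part of the formulas above:

```python
import math

def perplexity_and_oov(text_tokens, class_lm, oov_prob=1e-6):
    """class_lm maps word -> P(word | c); unseen words get an assumed floor.
    Returns per-word perplexity 2^H(t|c) and the OOV count feature."""
    log2p, oov = 0.0, 0
    for w in text_tokens:
        p = class_lm.get(w)
        if p is None:
            p, oov = oov_prob, oov + 1
        log2p += math.log2(p)
    H = -log2p / len(text_tokens)   # cross-entropy H(t|c)
    return 2 ** H, oov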
12
Application to grade level prediction (Collins-Thompson and Callan, NAACL 2004) [10]
13
Application to grade level prediction (Collins-Thompson and Callan, NAACL 2004) [10]
14
Results on predicting grade level (Schwarm and Ostendorf, ACL 2005) [11]
Flesch-Kincaid Grade Level index: number of syllables per word, sentence length
Lexile: word frequency, sentence length
SVM: language model and syntax features
15
Models of text coherence
Global coherence: overall document organization
Local coherence: adjacent sentences
16
Text structure can be learnt in an unsupervised manner
Human-written examples from a domain
[Figure: topic sequence in earthquake reports: location and time, magnitude, damage, relief efforts]
17
Content model (Barzilay & Lee, 2004) [5]
Hidden Markov Model (HMM)-based
States: clusters of related sentences, the "topics"
Transition prob.: sentence precedence in the corpus
Emission prob.: bigram language model
Example domain: earthquake reports (topics such as location, magnitude, casualties, relief efforts)

p(s_i, h_i | s_{i-1}, h_{i-1}) = p_t(h_i | h_{i-1}) * p_e(s_i | h_i)

(first factor: transition from the previous topic; second factor: generating the sentence in the current topic)
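A sketch of how this factorization scores a document under one hypothesized topic assignment; in the actual model p_e is a bigram language model per state, and the best assignment is found with forward/Viterbi dynamic programming, which is omitted here:

```python
import math

def score_topic_sequence(sentences, topics, trans, emit):
    """Log-probability of a document under an assumed topic assignment.
    trans[(h_prev, h)] = p_t(h | h_prev); emit[h](sentence) = p_e(s | h)."""
    logp, prev = 0.0, "START"
    for s, h in zip(sentences, topics):
        logp += math.log(trans[(prev, h)]) + math.log(emit[h](s))
        prev = h
    return logp
```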
18
Generating Wikipedia articles (Sauper and Barzilay, 2009) [12]
Articles on diseases and American film actors
Create templates of subtopics
Focus only on subtopic-level structure
Use paragraphs from documents on the web
19
Template creation
Cluster similar headings
  e.g., "signs and symptoms", "symptoms", "early symptoms", ...
Choose k clusters
  k = average number of subtopics in that domain
Find majority ordering for the clusters

Example templates:
  Biography: Early life, Career, Personal life, Death
  Diseases: Symptoms, Causes, Diagnosis, Treatment
20
Extraction of excerpts and ranking
Candidates for a subtopic
Paragraphs from the top 10 pages of search results
Measure relevance of candidates for that subtopic
Features ~ unigrams, bigrams, number of sentences, ...
21
Need to control redundancy across subtopics
Integer Linear Program (see the sketch below)
Variables: one per excerpt (value 1 = chosen, 0 = not chosen)
Objective: minimize the sum of the ranks of the chosen excerpts
Constraints:
  cosine similarity between any selected pair <= 0.5
  one excerpt per subtopic
[Figure: candidate excerpts ranked 1-5 for the subtopics causes, symptoms, diagnosis, treatment]
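A sketch of this ILP using the PuLP library (an assumption; any ILP solver would do). The variable, objective, and constraint structure follow the slide; function and parameter names are illustrative:

```python
import pulp

def select_excerpts(ranks, subtopic_of, sim, max_sim=0.5):
    """ranks[i]: search rank of excerpt i; subtopic_of[i]: its subtopic;
    sim(i, j): cosine similarity between excerpts i and j."""
    prob = pulp.LpProblem("excerpt_selection", pulp.LpMinimize)
    n = len(ranks)
    x = [pulp.LpVariable(f"x{i}", cat="Binary") for i in range(n)]
    # Objective: minimize the sum of ranks of the chosen excerpts.
    prob += pulp.lpSum(ranks[i] * x[i] for i in range(n))
    # Exactly one excerpt per subtopic.
    for t in set(subtopic_of):
        prob += pulp.lpSum(x[i] for i in range(n) if subtopic_of[i] == t) == 1
    # Redundancy control: never pick two excerpts that are too similar.
    for i in range(n):
        for j in range(i + 1, n):
            if sim(i, j) > max_sim:
                prob += x[i] + x[j] <= 1
    prob.solve()
    return [i for i in range(n) if x[i].value() == 1]
```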
22
Linguistic models of coherence (Halliday and Hasan, 1976) [13]
Coherent text is characterized by the presence of various types of cohesive links that facilitate text comprehension
Reference and lexical reiteration
  Pronouns, definite descriptions, semantically related words
Discourse relations (conjunction)
  I closed the window because it started raining.
Substitution ("one") or ellipsis ("do")
23
Referential coherence
Centering theory
  tracking focus of attention across adjacent sentences [14, 15, 16, 17]
Syntactic form of references
  particularly first and subsequent mention [18, 19], pronominalization
Lexical chains
  identifying and tracking topics within a text [20, 21, 22, 23]
24
Discourse relations
Explicit vs. implicit
Explicit: signaled by a discourse connective
  I stayed home because I had a headache.
Implicit: inferred without the presence of a connective
  I took my umbrella. [Because] The forecast was for rain in the afternoon.
25
Lexical chains
Often discussed as a cohesion indicator and implemented in systems, but not used in text quality tasks
Find all words that refer to the same topic
Find the correct sense of the words
LexChainer Tool: http://www1.cs.columbia.edu/nlp/tools.cgi [23]
Applications: summarization, IR, spell checking, hypertext construction
John bought a Jaguar. He loves the car.
LC = {jaguar, car, engine, it}
26
Centering theory ingredients (Grosz et al., 1995)
Deals with local coherence: what happens to the flow from sentence to sentence
Does not deal with global structuring of the text (paragraphs/segments)
Defines coherence as an estimate of the processing load required to "understand" the text
27
Processing load
Upon hearing an utterance, a person
  expends cognitive effort to interpret the expressions in the utterance
  integrates the meaning of the utterance with that of the previous sentence
  creates some expectations about what might come next
28
Example
(1) John met his friend Mary today.
(2) He was surprised to see her.
(3) He thought she was still in Italy.
Form of referring expressions
  Anaphora needs to be resolved
  "Create" a discourse entity at first mention with a full noun phrase
Creating expectations
29
Creating and meeting expectations
(1) a. John went to his favorite music store to buy a piano.
    b. He had frequented the store for many years.
    c. He was excited that he could finally buy a piano.
    d. He arrived just as the store was closing for the day.
(2) a. John went to his favorite music store to buy a piano.
    b. It was a store John had frequented for many years.
    c. He was excited that he could finally buy a piano.
    d. It was closing just as John arrived.
30
Interpreting pronouns
a. Terry really goofs sometimes.
b. Yesterday was a beautiful day and he was excited about trying out his new sailboat.
c. He wanted Tony to join him on a sailing expedition.
d. He called him at 6am.
e. He was sick and furious at being woken up so early.
31
Basic centering definitions
Centers of an utterance
Set of entities serving to link that utterance to the other utterances in the discourse segment that contains it
Not the words or phrases themselves, but semantic interpretations of noun phrases
32
Types of centers
Forward looking centers
An ordered set of entities: what we could expect to hear about next
Ordered by salience as determined by grammatical function:
  Subject > Indirect object > Object > Others
John gave the textbook to Mary. Cf = {John, Mary, textbook}
Preferred center Cp
The highest ranked forward looking center
High expectation that the next utterance in the segment will be about Cp
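A toy sketch of the salience ordering; the grammatical-function labels are assumed inputs from a parser:

```python
SALIENCE = {"subject": 0, "indirect_object": 1, "object": 2, "other": 3}

def forward_looking_centers(mentions):
    """mentions: list of (entity, grammatical_function) for one utterance.
    Returns Cf ordered by salience; Cf[0] is the preferred center Cp."""
    return [e for e, f in sorted(mentions, key=lambda m: SALIENCE.get(m[1], 3))]

cf = forward_looking_centers([("textbook", "object"),
                              ("John", "subject"),
                              ("Mary", "indirect_object")])
print(cf)     # ['John', 'Mary', 'textbook']
print(cf[0])  # Cp = 'John'
```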
33
Backward looking center
Single backward looking center, Cb(U), for each utterance other than the segment-initial one
The backward looking center of utterance Un+1 connects with one of the forward looking centers of Un
Cb(Un+1) is the most highly ranked element of Cf(Un) that is also realized in Un+1
34
Centering transitions ordering
                          Cb(Un+1) = Cb(Un)        Cb(Un+1) != Cb(Un)
                          or Cb(Un) undefined
Cb(Un+1) = Cp(Un+1)       continue                 smooth-shift
Cb(Un+1) != Cp(Un+1)      retain                   rough-shift
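The table above translates directly into a small classifier; treating an undefined Cb(Un) like the "same Cb" column follows the first row of the table:

```python
def centering_transition(cb_prev, cb, cp):
    """Transition into U_{n+1}, given Cb(U_n), Cb(U_{n+1}), Cp(U_{n+1})."""
    same_cb = cb_prev is None or cb == cb_prev  # undefined Cb(U_n) counts as same
    if cb == cp:
        return "continue" if same_cb else "smooth-shift"
    return "retain" if same_cb else "rough-shift"

print(centering_transition("Terry", "Terry", "Terry"))  # continue
print(centering_transition("Terry", "Tony", "Tony"))    # smooth-shift
```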
35
Centering constraints
There is precisely one backward-looking center Cb(Un)
Cb(Un+1) is the highest-ranked element of Cf(Un) that is realized in Un+1
36
Centering rules
If some element of Cf(Un) is realized as a pronoun in Un+1 then so is Cb(Un+1)
Transitions are not equally coherent: continue > retain > smooth-shift > rough-shift
37
Centering analysis
Terry really goofs sometimes. Cf={Terry}, Cb=?, undef
Yesterday was a beautiful day and he was excited about trying out his new sailboat. Cf={Terry,sailboat}, Cb=Terry, continue
He wanted Tony to join him in a sailing expedition. Cf={Terry, Tony, expedition}, Cb=Terry, continue
He called him at 6am. Cf={Terry,Tony}, Cb=Terry, continue
38
He called him at 6am. Cf={Terry,Tony}, Cb=Terry, continue
Tony was sick and furious at being woken up so early. Cf={Tony}, Cb=Tony, smooth shift
He told Terry to get lost and hung up. Cf={Tony,Terry}, Cb=Tony, continue
Of course, Terry hadn’t intended to upset Tony. Cf={Terry,Tony}, Cb = Tony, retain
39
Rough shifts in evaluation of writing skills (Miltsakaki and Kukich, 2002)
Automatic grading of essays by E-rater
Syntactic variety: represented by features that quantify the occurrence of clause types
Clear transitions: cue phrases in certain syntactic constructions
Existence of main and supporting points
Appropriateness of the vocabulary content of the essay
What about local coherence?
40
Essay score model
Human score available
E-rater prediction available
Percentage of rough-shifts in each essay: analysis done manually
Negative correlation between the human score and the percentage of rough-shifts
41
Linear multi-factor regression
Approximate the human score as a linear function of the e-rater prediction and the percentage of rough-shifts
Adding rough-shifts significantly improves the model of the score: 0.5 improvement on a 1 to 6 scale
How easy/difficult would it be to fully automate the rough-shift variable?
42
Variants of centering and application to information ordering
Karamanis et al. (2009) [16] is the most comprehensive overview of variants of centering theory, and an evaluation of centering in a specific task related to text quality
43
Information ordering task
Given a set of sentences/clauses, what is the best presentation?
Take a newspaper article and jumble the sentences: the result will be much more difficult to read than the original
Negative examples constructed by randomly permuting the original
Criteria for deciding which of two orderings is better
Centering would definitely be applicable
44
Centering variations
Continuity (NOCB = lack of continuity): Cf(Un) and Cf(Un+1) share at least one element
Coherence: Cb(Un) = Cb(Un+1)
Salience: Cb(U) = Cp(U)
Cheapness (fulfilled expectations): Cb(Un+1) = Cp(Un)
45
Metrics of coherence
M.NOCB (no continuity)
M.CHEAP (expectations not met)
M.KP: sum of the violations of continuity, cheapness, coherence and salience
M.BFP: seeks to maximize transitions according to Rule 2
46
Experimental methodology
Gold-standard ordering
The original order of the text (object description, news article)
Assume that other orderings are inferior
Classification error rate
Percentage of orderings that score better than the gold standard + 0.5 * percentage of orderings that score the same
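A sketch of the error-rate computation; whether "scores better" means lower or higher depends on the metric, so the comparison direction here (lower = more coherent) is an assumption of this sketch:

```python
def classification_error_rate(gold_score, permutation_scores):
    """Fraction of permuted orderings the metric (wrongly) prefers to the
    original, counting ties as half an error. Assumes lower = more coherent."""
    better = sum(1 for s in permutation_scores if s < gold_score)
    ties = sum(1 for s in permutation_scores if s == gold_score)
    return (better + 0.5 * ties) / len(permutation_scores)
```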
47
Results
M.NOCB gives the best results
Significantly better than the other metrics
Consistent results for three different corpora: museum artifact descriptions (two corpora), news reports on airplane accidents
M.BFP is the second best metric
48
Entity grid (Barzilay and Lapata, 2005, 2008)
Inspired by centering
Tracks entities across adjacent sentences, as well as their syntactic positions
Much easier to compute from raw text
Brown Coherence Toolkit:
http://www.cs.brown.edu/~melsner/manual.html
50
Entity grid: applications
Several applications, with very good results
Information ordering
Comparing the coherence of pairs of summaries
Distinguishing readability levels (child vs. adult); improves over Petersen & Ostendorf [3]
51
Entity grid example
1 [The Justice Department]S is conducting an [anti-trust trial]O against [Microsoft Corp.]X with [evidence]X that [the company]S is increasingly attempting to crush [competitors]O.
2 [Microsoft]O is accused of trying to forcefully buy into [markets]X where [its own products]S are not competitive enough to unseat [established brands]O.
3 [The case]S revolves around [evidence]O of [Microsoft]S aggressively pressuring [Netscape]O into merging [browser software]O.
4 [Microsoft]S claims [its tactics]S are commonplace and good economically.
5 [The government]S may file [a civil suit]O ruling that [conspiracy]S to curb [competition]O through [collusion]X is [a violation of the Sherman Act]O.
6 [Microsoft]S continues to show [increased earnings]O despite [the trial]X.
52
Entity grid representation
53
16 entity grid features
The probability of each type of transition in the text
Four syntactic distinctions: S (subject), O (object), X (other), _ (absent)
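A sketch of computing these 16 transition probabilities from a grid represented as one role dictionary per sentence (the representation is illustrative):

```python
from collections import Counter
from itertools import product

def transition_probs(grid):
    """grid[i][e]: role of entity e in sentence i, one of 'S', 'O', 'X', '_'.
    Returns the 16 adjacent-sentence transition probabilities used as features."""
    counts = Counter()
    entities = grid[0].keys()
    for i in range(len(grid) - 1):
        for e in entities:
            counts[(grid[i][e], grid[i + 1][e])] += 1
    total = sum(counts.values())
    return {t: counts[t] / total for t in product("SOX_", repeat=2)}

grid = [{"Microsoft": "S", "trial": "O"},
        {"Microsoft": "O", "trial": "_"},
        {"Microsoft": "S", "trial": "X"}]
print(transition_probs(grid)[("S", "O")])
```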
54
Type of reference and info ordering (Elsner and Charniak, 2008)
Entity grid features not concerned with how an entity is mentioned
Discourse old vs. discourse new
Kent Wells, a BP senior vice president said on Saturday during a technical briefing that the current cap, which has a looser fit and has been diverting about 15,000 barrels of oil a day to a drillship, will be replaced with a new one in 4 to 7 days.
The new cap will take 4 to 7 days to be installed, and in case the new cap is not effective, Mr. Wells said engineers were prepared to replace it with an improved version of the current cap.
55
The probability of a given sequence of discourse new and old realizations gives a further indication about ordering
Similarly, pronouns should have reasonable antecedents
Adding both models to the entity grid improves performance on the information ordering task
56
Sentence ordering: n sentences
Output from a generation or summarization system
Find the most coherent of the n! possible orderings
With local coherence metrics (adjacent-sentence flow), finding the best ordering is NP-complete, by reduction from the Traveling Salesman Problem
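A brute-force sketch that makes the search-space point concrete: it enumerates all n! orderings, so it is only feasible for very small n; flow_score is an assumed pluggable local coherence metric:

```python
from itertools import permutations

def best_ordering(sentences, flow_score):
    """Exhaustive search over all n! orderings. flow_score(a, b) rates
    putting sentence b right after sentence a; the general problem is
    NP-complete, so real systems use approximate search instead."""
    def total(order):
        return sum(flow_score(order[i], order[i + 1])
                   for i in range(len(order) - 1))
    return max(permutations(sentences), key=total)
```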
57
Word co-occurrence model (Lapata, ACL 2003; Soricut and Marcu, 2005) [23,24]
Idea from statistical machine translation: alignment models

John went to a restaurant. He ordered fish. The waiter was very attentive. ...
John est allé à un restaurant. Il ordonna de poisson. Le garçon était très attentif. ...
P(fish | poisson)

John went to a restaurant. He ordered fish. The waiter was very attentive. ...
He ordered fish. The waiter was very attentive. John gave him a huge tip. ...
P(ordered | restaurant)
P(waiter | ordered)
P(tip | waiter)
…
We ate at a restaurant yesterday.
We also ordered some take away.
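A sketch of estimating such cross-sentence word-pair probabilities from ordered training documents; the normalization choice is one plausible reading of the alignment analogy, not necessarily Lapata's exact estimator:

```python
from collections import Counter

def train_cooccurrence(documents):
    """Count cross-sentence word pairs (w_prev, w_next) from adjacent
    sentences of ordered training documents (each a list of token lists)."""
    pair_counts, prev_counts = Counter(), Counter()
    for doc in documents:
        for s1, s2 in zip(doc, doc[1:]):
            for w1 in s1:
                prev_counts[w1] += len(s2)   # normalizer for P(. | w1)
                for w2 in s2:
                    pair_counts[(w1, w2)] += 1
    return pair_counts, prev_counts

def p_next(w2, w1, pair_counts, prev_counts):
    """P(w2 in next sentence | w1 in previous sentence), MLE."""
    return pair_counts[(w1, w2)] / prev_counts[w1] if prev_counts[w1] else 0.0
```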
58
Discourse (coherence) relations
Only recently have empirical results shown that discourse relations are predictive of text quality (Pitler and Nenkova, 2008)
59
PDTB discourse relations annotations
Largest corpus of annotated discourse relations
http://www.seas.upenn.edu/~pdtb/
Four broad classes of relations: Contingency, Comparison, Temporal, Expansion
Explicit and implicit
60
Implicit and explicit relations
(E1) He is very tired because he played tennis all morning.
(E2) He is not very strong but he can run amazingly fast.
(E3) We had some tea in the afternoon and later went to a restaurant for a big dinner.
(I1) I took my umbrella this morning. [because] The forecast was for rain.
(I2) She is never late for meetings. [but] He always arrives 10 minutes late.
(I3) She woke up early. [afterwards] She had breakfast and went for a walk in the park.
61
What is the relative importance of factors in determining text quality?
Competent readers (native English speakers): graduate students at Penn
Wall Street Journal texts: 30 texts rated on a scale of 1 to 5
  How well-written is this article?
  How well does the text fit together?
  How easy was it to understand?
  How interesting is the article?
62
Several judgments for each text
Final quality score was the average
Scores range from 1.5 to 4.33, mean 3.2
63
Which of the many indicators will work best?
Usually research studies focus on only one or two
How do indicators combine?
Metrics
  Correlation coefficient
  Accuracy of pair-wise ranking prediction
64
Correlation coefficients between assessor ratings and different features
65
Baseline measures
Average Characters/Word r = -.0859 (p = .6519)
Average Words/Sentence r = .1637 (p = .3874)
Max Words/Sentence r = .0866 (p = .6489)
Article length r = -.3713 (p = .0434)
66
Vocabulary factors
Language model probability of the article
M estimated from PTB (WSJ)
M estimated from general news (NEWS)

p(w | M) = C(w) / Σ_w' C(w')

LL(t | M) = Σ_w c(w) log p(w | M)

(C(w): count of w in the training collection; c(w): count of w in the article)
67
Correlations with 'well-written' assessment
Log likelihood, WSJ: r = .3723 (p = .0428)
Log likelihood, NEWS: r = .4497 (p = .0127)
Log likelihood with length, WSJ: r = .3732 (p = .0422)
Log likelihood with length, NEWS: r = .6359 (p = .0002)
68
Syntactic features
Average parse tree height r = -.0634 (p = .7439)
Avr. number of noun phrases per sentence r = .2189 (p = .2539)
Average SBARs r = .3405 (p = .0707)
Avr. number of verb phrases per sentence r = .4213 (p = .0228)
69
Elements of lexical cohesion
Avr. cosine similarity between adjacent sents r = -.1012 (p = .5947)
Avr. word overlap between adjacent sentences r = -.0531, p = .7806
Avr. Noun+Pronoun Overlap r = .0905, p = .6345
Avr. # Pronouns/Sent r = .2381, p = .2051
Avr # Definite Articles r = .2309, p = .2196
70
Correlation with 'well-written' score
Prob. of S-S transition r = -.1287 (p = .5059)
Prob. of S-O transition r = -.0427 (p = .8261)
Prob. of S-X transition r = -.1450 (p = .4529)
Prob. of S-N transition r = .3116 (p = .0999)
Prob. of O-S transition r = .1131 (p = .5591)
Prob. of O-O transition r = .0825 (p = .6706)
Prob. of O-X transition r = .0744 (p = .7014)
Prob. of O-N transition r = .2590 (p = .1749)
71
Prob. of X-S transition r = .1732 (p = .3688)
Prob. of X-O transition r = .0098 (p = .9598)
Prob. of X-X transition r = -.0655 (p = .7357)
Prob. of X-N transition r = .1319 (p = .4953)
Prob. of N-S transition r = .1898 (p = .3242)
Prob. of N-O transition r = .2577 (p = .1772)
Prob. of N-X transition r = .1854 (p = .3355)
Prob. of N-N transition r = -.2349 (p = .2200)
72
Well-writtenness and discourse
Log likelihood of discourse rels r = .4835 (p = .0068)
# of discourse relations r = -.2729 (p = .1445)
Log likelihood of rels with # of rels r = .5409 (p = .0020)
# of relations with # of words r = .3819 (p = .0373)
Explicit relations only r = .1528 (p = .4203)
Implicit relations only r = .2403 (p = .2009)
73
Summary: significant factors
Log likelihood of discourse relations r = .4835
Log likelihood, NEWS r = .4497
Average verb phrases per sentence r = .4213
Log likelihood, WSJ r = .3723
Number of words r = -.3713
74
Text quality prediction as ranking
Every pair of texts with ratings differing by 0.5
Features are the difference of feature values for each text
Task: predict which of the two articles has higher text quality score
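A sketch of the pairwise construction; each article is an assumed (feature_vector, rating) pair:

```python
def pairwise_examples(articles, min_gap=0.5):
    """For each pair of articles whose quality ratings differ by at least
    min_gap, the feature vector is the difference of their feature vectors
    and the label says which article is better."""
    examples = []
    for i, (f1, s1) in enumerate(articles):
        for f2, s2 in articles[i + 1:]:
            if abs(s1 - s2) >= min_gap:
                diff = [a - b for a, b in zip(f1, f2)]
                examples.append((diff, 1 if s1 > s2 else 0))
    return examples
```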
75
Prediction accuracy (10-fold cross-validation)

None (majority class)                   50.21%
Number of words                         65.84%
ALL                                     88.88%
Grid only                               79.42%
Log likelihood of discourse relations   77.77%
Avg. VPs per sentence                   69.54%
Log likelihood, NEWS                    66.25%
76
Findings
Complex interplay between features
Entity grid features are not significantly correlated with the 'well-written' score, but are very useful for the ranking task
Discourse information is very helpful
But here we used gold-standard annotations; developing an automatic classifier is underway
77
Implicit and explicit discourse relations

Class         Explicit   Implicit
Comparison    69%        31%
Contingency   47%        53%
Temporal      80%        20%
Expansion     42%        58%
78
Sense classification based on connectives only
Four-way classification
Explicit relations only: 93% accuracy
All relations (implicit + explicit): 75% accuracy
Implicit relations are the real challenge
79
Explicit discourse relations, tasks (Pitler and Nenkova, 2009) [25]
Discourse vs. non-discourse use
  I will be happier once the semester is over.
  I have been to Ohio once.
Relation sense: contingency, comparison, temporal, expansion
  I haven't been to Paris since I went there on a school trip in 1998. [Temporal]
  I haven't been to Antarctica since it is very far away. [Contingency]
80
Penn Discourse Treebank
Largest available annotated corpus of discourse relations
Penn Treebank WSJ articles
18,459 explicit discourse relations
100 connectives
"although": 91% discourse use vs. "or": 3% discourse use
81
Discourse Usage Experiments
Positive examples: discourse connectives
Negative examples: the same strings in the PDTB, unannotated
10-fold cross validation
Maximum Entropy classifier
82
Discourse Usage Results
83
Discourse Usage Results
84
Sense Disambiguation: Comparison, Contingency, Expansion, or Temporal?
Features Accuracy
Connective 93.67%
Connective + Syntax 94.15%
Interannotator Agreement 94%
85
Tool
Automatic annotation of discourse use and sense of discourse connectives
Discourse Connectives Tagger: http://www.cis.upenn.edu/~epitler/discourse.html
86
Is there hope to have a usable tool soon?
Early studies on unannotated data gave reason for optimism
But when recently tested on the PDTB, their performance is poor
Accuracy on contingency, comparison, and temporal is below 50%
What about implicit relations?
87
Not easy to infer from combined results how early systems performed on implicits
As we saw, one can get reasonable overall performance by doing nothing for explicits
Relations within the same sentence only [26]
GraphBank corpus: doesn't distinguish implicit and explicit [27]
Classify implicits and explicits together
88
Classify on large unannotated corpus
89
Experiments with the PDTB
Pitler et al., ACL 2009 [31]: wide variety of features to capture semantic opposition and parallelism
Lin et al., EMNLP 2009 [32]: (lexicalized) syntactic features
Results improve over baselines and give a better understanding of the features, but the classifiers are not suitable for application in real tasks
90
Word pairs as features (Marcu and Echihabi, 2002)
The most basic feature for implicits
  Span 1: I am a little tired
  Span 2: there is a 13 hour time difference
  Pairs: I_there, I_is, ..., tired_time, tired_difference
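A minimal sketch of the word-pair feature extraction; the `w1_w2` string encoding is illustrative:

```python
def word_pairs(span1, span2):
    """Cross-product word-pair features for a relation between two
    adjacent text spans, as in the slide above."""
    return {f"{w1}_{w2}" for w1 in span1 for w2 in span2}

print(sorted(word_pairs(["I", "am", "tired"],
                        ["there", "is", "a", "time", "difference"]))[:4])
```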
91
The recent explosion of country funds mirrors the "closed-end fund mania of the 1920s," Mr. Foot says, "when narrowly focused funds grew wildly popular."
They fell into oblivion after the 1929 crash.
Intuition: with large amounts of data, will find semantically-related pairs
92
Meta error analysis of prior work
Using just content words reduces performance, but gives a steeper learning curve (Marcu and Echihabi, 2002)
Nouns and adjectives don't help at all (Lapata and Lascarides, 2004) [33]
Filtering out stopwords lowers results (Blair-Goldensohn et al., 2007)
93
Word pairs experiments (Pitler et al., 2009)
94
Function words have highest information gain
But... didn't we remove the connective?
95
“but” signals “Not-Comparison” in synthetic data
96
Results: word pairs
Features: pairs of words from the two text spans
What doesn't work: training on synthetic implicits
What really works: use synthetic implicits for feature selection, train on the PDTB
97
Best results: f-scores (baseline in parentheses)

Comparison     21.96 (17.13)
Contingency    47.13 (31.10)
Expansion      76.41 (63.84)
Temporal       16.76 (16.21)

Comparison/Contingency baseline: synthetic implicits word pairs
Expansion/Temporal baseline: real implicits word pairs
98
Further experiments using context
Results above come from classifying each relation independently (Naïve Bayes, MaxEnt, AdaBoost)
Since context features were helpful, tried a CRF
6-way classification, word pairs as features:
  Naïve Bayes accuracy: 43.27%
  CRF accuracy: 44.58%
99
Do we need more coherence factors? (Louis and Nenkova, 2010) [34]
If we had perfect co-reference and discourse relation information, would we be able to explain local discourse coherence?
Our recent corpus study indicates the answer is NO
30% of adjacent sentences in the same paragraph in the PDTB neither share an entity nor have an implicit comparison, contingency, or temporal relation
Lexical chains?
100
References
[1] Burstein, J. & Chodorow, M. (in press). Progress and new directions in technology for automated essay evaluation. In R. Kaplan (Ed.), The Oxford handbook of applied linguistics (2nd Ed.). New York: Oxford University Press.
[2] Heilman, M., Collins-Thompson, K., Callan, J., and Eskenazi, M. (2007). Combining Lexical and Grammatical Features to Improve Readability Measures for First and Second Language Texts. Proceedings of the Human Language Technology Conference. Rochester, NY.
[3] S. Petersen and M. Ostendorf, “A machine learning approach to reading level assessment,” Computer, Speech and Language, vol. 23, no. 1, pp. 89-106, 2009
[4] Finding High Quality Content in Social Media, Eugene Agichtein, Carlos Castillo, Debora Donato, Aristides Gionis, Gilad Mishne, ACM Web Search and Data Mining Conference (WSDM), 2008
[5] Regina Barzilay and Lillian Lee, Catching the Drift: Probabilistic Content Models, with Applications to Generation and Summarization, HLT-NAACL 2004: Proceedings of the Main Conference, pp. 113-120, 2004
101
References
[6] Emily Pitler, Annie Louis and Ani Nenkova, Automatic Evaluation of Linguistic Quality in Multi-Document Summarization, Proceedings of ACL 2010
[7] Schwarm, S. E. and Ostendorf, M. 2005. Reading level assessment using support vector machines and statistical language models. In Proceedings of ACL 2005.
[8] Jieun Chae, Ani Nenkova: Predicting the Fluency of Text with Shallow Structural Features: Case Studies of Machine Translation and Human-Written Text. In proceedings of EACL 2009: 139-147
[9] Charniak, E. and Johnson, M. 2005. Coarse-to-fine n-best parsing and MaxEnt discriminative reranking. In Proceedings of ACL 2005.
[10] K. Collins-Thompson and J. Callan. (2004). A language modeling approach to predicting reading difficulty. Proceedings of HLT/NAACL 2004.
[11] Sarah E. Schwarm and Mari Ostendorf. Reading Level Assessment Using Support Vector Machines and Statistical Language Models. In Proceedings of ACL, 2005.
102
References
[12] Automatically generating Wikipedia articles: A structure-aware approach, C. Sauper and R. Barzilay, ACL-IJCNLP 2009
[13] Halliday, M. A. K., and Ruqaiya Hasan. 1976. Cohesion in English. London: Longman
[14] B. Grosz, A. Joshi, and S. Weinstein. 1995. Centering: a framework for modelling the local coherence of discourse. Computational Linguistics, 21(2):203-226
[15] E. Miltsakaki and K. Kukich. 2000. The role of centering theory’s rough-shift in the teaching and evaluation of writing skills. In Proceedings of ACL’00, pages 408– 415.
[16] Karamanis, N., Mellish, C., Poesio, M., and Oberlander, J. 2009. Evaluating centering for information ordering using corpora. Comput. Linguist. 35, 1 (Mar. 2009), 29-46.
[17] Regina Barzilay, Mirella Lapata, "Modeling Local Coherence: An Entity-based Approach”, Computational Linguistics, 2008.
[18] Ani Nenkova, Kathleen McKeown: References to Named Entities: a Corpus Study. HLT-NAACL 2003
103
References
[19] Micha Elsner, Eugene Charniak: Coreference-inspired Coherence Modeling. ACL (Short Papers) 2008: 41-44
[20] Morris, J. and Hirst, G. 1991. Lexical cohesion computed by thesaural relations as an indicator of the structure of text. Comput. Linguist. 17, 1 (Mar. 1991), 21-48.
[21] Regina Barzilay and Michael Elhadad, "Text summarizations with lexical chains”, In Inderjeet Mani and Mark Maybury, editors, Advances in Automatic Text Summarization. MIT Press, 1999.
[22] Silber, H. G. and McCoy, K. F. 2002. Efficiently computed lexical chains as an intermediate representation for automatic text summarization. Comput. Linguist. 28, 4 (Dec. 2002), 487-496.
[23] Mirella Lapata, Probabilistic Text Structuring: Experiments with Sentence Ordering, Proceedings of ACL 2003.
[24] Discourse generation using utility-trained coherence models, R. Soricut & D. Marcu, COLING-ACL 2006
104
References
[25] Emily Pitler and Ani Nenkova. Using Syntax to Disambiguate Explicit Discourse Connectives in Text. Proceedings of ACL, short paper, 2009
[26] Radu Soricut and Daniel Marcu. 2003. Sentence Level Discourse Parsing using Syntactic and Lexical Information. Proceedings of the Human Language Technology and North American Association for Computational Linguistics Conference (HLT/NAACL-2003)
[27] Ben Wellner, James Pustejovsky, Catherine Havasi, Roser Sauri and Anna Rumshisky. Classification of Discourse Coherence Relations: An Exploratory Study using Multiple Knowledge Sources. In Proceedings of the 7th SIGDIAL Workshop on Discourse and Dialogue
[28] Daniel Marcu and Abdessamad Echihabi (2002). An Unsupervised Approach to Recognizing Discourse Relations. Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL-2002)
[29] Sasha Blair-Goldensohn, Kathleen McKeown, Owen Rambow: Building and Refining Rhetorical-Semantic Relation Models. HLT-NAACL 2007: 428-435
105
References
[30] Sporleder, C. and Lascarides, A. 2008. Using automatically labelled examples to classify rhetorical relations: An assessment. Nat. Lang. Eng. 14, 3 (Jul. 2008), 369-416.
[31] Emily Pitler, Annie Louis, and Ani Nenkova. Automatic Sense Prediction for Implicit Discourse Relations in Text. Proceedings of ACL, 2009.
[32] Ziheng Lin, Min-Yen Kan and Hwee Tou Ng (2009). Recognizing Implicit Discourse Relations in the Penn Discourse Treebank. In Proceedings of EMNLP
[33] Lapata, Mirella and Alex Lascarides. 2004. Inferring Sentence-internal Temporal Relations. In Proceedings of the North American Chapter of the Assocation of Computational Linguistics, 153-160.
[34] Annie Louis and Ani Nenkova, Creating Local Coherence: An Empirical Assessment, Proceedings of NAACL-HLT 2010
106