CERATOPS Center for Extraction and Summarization of Events and Opinions in Text
Janyce Wiebe, U. Pittsburgh; Claire Cardie, Cornell U.; Ellen Riloff, U. Utah

Upload: giles-oconnor

Post on 18-Jan-2018



Outline

• Sample of research results since March 2008
  – Coreference resolution
  – Topics and discourse
    • Topic identification for fine-grained opinion analysis
    • Discourse-level opinion graphs for polarity classification
• Resources and systems
  – Upcoming releases
  – Jodange system for extracting and aggregating opinions from on-line content
• Interactions
• Research plans for the remainder of the grant


NP Coreference Resolution

Australian press has launched a bitter attack on Italy after seeing their beloved Socceroos eliminated on a controversial late penalty. Italian coach Lippi has also been blasted for his comments after the game. In the opposite camp Lippi is preparing his side for the upcoming game with Ukraine. He hailed 10-man Italy's determination to beat Australia and said the penalty was rightly given.


Coreference and Opinion Holders

Stoyanov & Cardie [2006]

Australian press has launched a bitter attack on Italy after seeing their beloved Socceroos eliminated on a controversial late penalty. Italian coach Lippi has also been blasted for his comments after the game. In the opposite camp Lippi is preparing his side for the upcoming game with Ukraine. He hailed 10-man Italy's determination to beat Australia and said the penalty was rightly given.


Reconcile

• NP coreference resolution toolbox/testbed
  – Public domain
• Joint effort of Cornell, LLNL, Univ of Utah
• Algorithms for
  – Training NP coref classifiers
  – Generating standard feature vectors
  – Testing/training on standard data sets (MUC, ACE)
  – Evaluating w.r.t. multiple metrics (MUC, B3, CEAF, etc.)
• Trained versions of state-of-the-art systems


Topics in Fine-grained Opinion Analysis

– Opinion expression
– Polarity
  • positive
  • negative
  • neutral
– Strength/intensity
  • low..extreme
– Source (opinion holder)
– Target (topic)

“The Australian Press launched a bitter attack on Italy”

Opinion Frame
  Polarity: negative
  Intensity: high
  Source: “The Australian Press”
  Target: “Italy”
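The components listed above map naturally onto a small record type. The following is a minimal sketch, not the project's actual data model: the class and field names are hypothetical, and the expression span chosen for the example is an assumption, since the slide gives only polarity, intensity, source, and target.

```python
from dataclasses import dataclass
from enum import Enum


class Polarity(Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"


@dataclass(frozen=True)
class OpinionFrame:
    """Hypothetical container for one fine-grained opinion."""
    expression: str      # text span that expresses the opinion
    polarity: Polarity   # positive, negative, or neutral
    intensity: str       # value on a scale running from "low" to "extreme"
    source: str          # opinion holder
    target: str          # topic of the opinion


# The example sentence from the slide; the expression span is an assumption.
frame = OpinionFrame(
    expression="launched a bitter attack",
    polarity=Polarity.NEGATIVE,
    intensity="high",
    source="The Australian Press",
    target="Italy",
)
print(frame)
```

Making the record immutable keeps extracted frames safe to share between later stages such as summarization.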


Opinion Summaries

[Figure: opinion summary graph with nodes Australian Press, Italy, Marcello Lippi, penalty, Socceroos]


Definitions

• Topic - the real-world object, event or abstract entity that is the subject of the opinion as intended by the opinion holder

• Topic span - the closest, minimal span of text that mentions the topic

• Target span - the span of text that covers the syntactic surface form comprising the contents of the opinion

Stoyanov & Cardie [LREC 2008, Coling 2008]


Definitions

(1) [OH John] likes Marseille for its weather and cultural diversity.

(1) [OH John] likes [TARGET+TOPIC SPAN Marseille] for its weather and cultural diversity.


Definitions

(2) [OH Al] thinks that the government should tax gas more in order to curb CO2 emissions.

(2) [OH Al] thinks that [TARGET SPAN the government should tax gas more in order to curb CO2 emissions].

(2) [OH Al] thinks that [TARGET SPAN [TOPIC SPAN? the government] should [TOPIC SPAN? tax gas] more in order to [TOPIC SPAN? curb [TOPIC SPAN? CO2 emissions]]].

(3) Although he doesn’t like government imposed taxes, he thinks that a fuel tax is the only effective solution.


Issues in Opinion Topic Identification

• Multiple potential topics
• Opinion topics are not always explicitly mentioned

(3) [OH John] believes the violation of Palestinian human rights is one of the main factors.

Topic: ISRAELI-PALESTINIAN CONFLICT

(4) [OH I] disagree entirely!


A Coreference Approach

• Hypothesize that the notion of topic coreference will facilitate identification of opinion topics

• Easier than specifying the topic of each opinion in isolation

Two opinions are topic-coreferent if they share the same opinion topic.


A Coreference Approach to Opinion Topic Identification

• Find the clusters of topic-coreferent opinions
  – the critical step for topic identification
• Label the clusters with the name of the topic
  – we currently ignore this step
  – address in future work through frequency analysis of the terms in each of the clusters


The Topic Coreference Algorithm

• Adapt a standard machine learning-based approach to NP coreference resolution (Soon et al., 2001; Ng & Cardie, 2002)
  1. identify topic spans
  2. perform pairwise classification of the associated opinions and topic spans to determine whether or not they are topic-coreferent
  3. cluster the opinions according to the results of (2)
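Steps 2 and 3 can be sketched as follows. This is an illustration under stated assumptions, not the authors' system: the learned pairwise classifier is replaced by a toy word-overlap test (`same_topic`, hypothetical), and clustering is plain transitive closure over positive pairs, which is simpler than the clustering used in the cited NP coreference work.

```python
from itertools import combinations


def same_topic(span_a: str, span_b: str, threshold: float = 0.3) -> bool:
    """Toy stand-in for the learned pairwise classifier: Jaccard word overlap."""
    a, b = set(span_a.lower().split()), set(span_b.lower().split())
    if not a | b:
        return False
    return len(a & b) / len(a | b) >= threshold


def cluster_opinions(topic_spans: list[str]) -> list[set[int]]:
    """Group opinion indices so that every positively classified pair shares a cluster."""
    parent = list(range(len(topic_spans)))

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # Step 2: classify every pair; step 3: merge the pairs judged coreferent.
    for i, j in combinations(range(len(topic_spans)), 2):
        if same_topic(topic_spans[i], topic_spans[j]):
            parent[find(i)] = find(j)

    clusters: dict[int, set[int]] = {}
    for i in range(len(topic_spans)):
        clusters.setdefault(find(i), set()).add(i)
    return list(clusters.values())


# One topic span per opinion (step 1 is assumed to have been done already).
spans = ["a fuel tax", "government imposed taxes", "the fuel tax", "Marseille"]
print(cluster_opinions(spans))
```

With these toy spans only the two "fuel tax" opinions are merged; a real classifier would use richer features than word overlap.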


Results

• Annotated subset of the MPQA corpus for topic-coreferent opinions
  – Acceptable inter-annotator agreement
• 10-fold cross-validation
• Baselines

                          B3           CEAF
all-in-one                .37   -.10   .30
1 opinion per cluster     .29    .22   .27
same paragraph            .55    .31   .50
Choi 2000                 .54    .37   .54


Results

                              B3           CEAF
all-in-one                    .37   -.10   .30
1 opinion per cluster         .29    .22   .27
same paragraph                .55    .31   .50
Choi 2000                     .54    .37   .54
sentence spans                .57    .40   .54
automatic (heuristics)        .57    .41   .54
modified manual topic spans   .64    .51   .61
manual topic spans            .71    .66   .62
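For readers unfamiliar with the B3 column, here is a minimal sketch of the standard B-cubed computation over clusterings. It is illustrative only: the slides do not say which B3 variant or weighting was used, the function name is hypothetical, and the toy data below does not reproduce any number in the table.

```python
def b_cubed(gold: list[set[str]], system: list[set[str]]) -> tuple[float, float, float]:
    """Return (precision, recall, F1), averaged over items.

    Assumes both clusterings cover exactly the same set of items.
    """
    gold_of = {item: cluster for cluster in gold for item in cluster}
    sys_of = {item: cluster for cluster in system for item in cluster}
    items = list(gold_of)

    # Per-item overlap between the item's system cluster and its gold cluster.
    precision = sum(len(gold_of[i] & sys_of[i]) / len(sys_of[i]) for i in items) / len(items)
    recall = sum(len(gold_of[i] & sys_of[i]) / len(gold_of[i]) for i in items) / len(items)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data: four opinions, gold has two topics, system puts everything in one cluster.
gold = [{"o1", "o2", "o3"}, {"o4"}]
all_in_one = [{"o1", "o2", "o3", "o4"}]
p, r, f = b_cubed(gold, all_in_one)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

The all-in-one clustering gets perfect recall but is penalized on precision, which is why it serves as a baseline rather than a solution.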


Discourse level opinion graphs for polarity classification

• Somasundaran et al. Discourse level Opinion Graphs for Polarity Classification. Submitted.

• Somasundaran et al. (2008). Discourse Level Opinion Interpretation. COLING-2008.

• Somasundaran et al. (2008). Discourse Level Opinion Relations: An Annotation Study. SIGdial Workshop on Discourse and Dialogue-2008.


Motivation

• The interpretations of opinions often depend on the other opinions in the discourse


Motivation

• I like the LCD feature
• We must implement the LCD
• I think the LCD is hot

Interdependent interpretation of opinions in the discourse


Motivation

• Shapes should be curved, so round shapes. Nothing square-like.
• ... So we shouldn’t have too square corners and that kind of thing.

Opinion annotations: Arguing Negative, Arguing Positive, Arguing Negative, Sentiment Negative

What were the opinions regarding the curved shape?
Will the curved shape be accepted?


Motivation

• Shapes should be curved, so round shapes. Nothing square-like.
• ... So we shouldn’t have too square corners and that kind of thing.

Opinion annotations: Arguing Negative, Arguing Positive, Arguing Negative, Sentiment Negative

Direct opinion
Opinions towards mutually exclusive option (alternative)

More opinion information: opinions towards the square shapes reveal more about the speaker’s opinion regarding the curved shape


Opinion Frames

• This blue remote is cool. [Sentiment Pos]
• What’s more, the rubbery material is ergonomic. [Sentiment Pos]
• We must really go for the blue remote. [Arguing Pos]
• I feel the red remote is a better choice. [Sentiment Pos]
• The blue remote will be too expensive. [Sentiment Neg]

Target relations: alternative, same, same


Annotation tool


Opinion Frames

• This blue remote is cool. [Sentiment Pos]
• What’s more, the rubbery material is ergonomic. [Sentiment Pos]
• We must really go for the blue remote. [Arguing Pos]
• I feel the red remote is a better choice. [Sentiment Pos]
• The blue remote will be too expensive. [Sentiment Neg]

Opinion frame: SPSPsame
Target relations: alternative, same, same


Opinion Frames

• This blue remote is cool. [Sentiment Pos]
• What’s more, the rubbery material is ergonomic. [Sentiment Pos]
• We must really go for the blue remote. [Arguing Pos]
• I feel the red remote is a better choice. [Sentiment Pos]
• The blue remote will be too expensive. [Sentiment Neg]

Opinion frame: SPSNalt
Target relations: alternative, same, same
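The frame names on these slides (SPSPsame, SPSNalt) appear to concatenate the type and polarity of two opinions with the relation between their targets. The sketch below composes such names; all function and table names are hypothetical. The `is_reinforcing` test encodes an assumed rule, consistent with the examples here but not stated on the slides: opinions on the same target reinforce each other when their polarities agree, and opinions on alternative targets reinforce each other when their polarities differ.

```python
TYPE_CODE = {"sentiment": "S", "arguing": "A"}
POLARITY_CODE = {"pos": "P", "neg": "N"}
RELATION_CODE = {"same": "same", "alternative": "alt"}

Opinion = tuple[str, str]  # (opinion type, polarity)


def frame_name(first: Opinion, second: Opinion, relation: str) -> str:
    """Compose a frame label such as 'SPSPsame' from two opinions and a target relation."""
    codes = "".join(TYPE_CODE[kind] + POLARITY_CODE[pol] for kind, pol in (first, second))
    return codes + RELATION_CODE[relation]


def is_reinforcing(first: Opinion, second: Opinion, relation: str) -> bool:
    """Assumed rule: same target needs equal polarity, alternative target needs opposite."""
    same_polarity = first[1] == second[1]
    return same_polarity if relation == "same" else not same_polarity


cool = ("sentiment", "pos")           # "This blue remote is cool."
ergonomic = ("sentiment", "pos")      # "... the rubbery material is ergonomic."
red_better = ("sentiment", "pos")     # "I feel the red remote is a better choice."
too_expensive = ("sentiment", "neg")  # "The blue remote will be too expensive."

print(frame_name(cool, ergonomic, "same"), is_reinforcing(cool, ergonomic, "same"))
print(frame_name(red_better, too_expensive, "alternative"),
      is_reinforcing(red_better, too_expensive, "alternative"))
```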


Data

• AMI meeting corpus (Carletta et al., 2005)
  – Rich in opinions
  – Rich interplay between opinion types, targets, and polarities
  – Same and alternative target relations are prominent
  – Targets are relatively concrete


The Project roadmap

Accomplished
  – Developed a multi-stage annotation scheme
  – Tested human reliability in recognizing opinion frames
  – Showed that the frame presence between two opinion-bearing sentences can be automatically detected
  – Showed that interdependent interpretation improves automatic detection of opinion polarity

To Do
  – Fully automate effective discourse level opinion interpretation
  – Text data
    • Do the opinion frames behave the same way? What knowledge sources can we use for our task in text?


Discourse Level Opinion Graphs (DLOGs)

• Pairwise relationships from opinion frames are used to build Discourse Level Opinion Graphs (DLOGs)


[Figure: example DLOG. Legend: alternative target relation; same target relation; reinforcing frame relation; non-reinforcing frame relation]


Improving Polarity Classification using DLOG

• Computationally model discourse level relations between opinions

• Leverage these relations to improve polarity classification


Improving Polarity Classification using DLOG

• Initial step:
  – Individual classifiers for recognizing opinions, polarities, and the 4 types of links
• Iterative steps
  – Relational classifiers that use information of the neighboring nodes
    • The classes assigned to neighbors thus far
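The two-phase scheme above can be illustrated with a toy iterative procedure. This is a sketch under stated assumptions, not the submitted system: the local classifier is replaced by hand-set scores, the relational classifier by a simple neighbor vote with unit weight, and each link is reduced to a single flag saying whether the linked opinions are expected to share polarity. All names and numbers are hypothetical.

```python
def iterative_polarity(
    local_scores: dict[str, float],
    links: list[tuple[str, str, bool]],
    max_rounds: int = 10,
) -> dict[str, int]:
    """Return +1/-1 per node after repeatedly re-labeling with neighbor information.

    local_scores: node -> score from a local classifier (> 0 leans positive).
    links: (node_a, node_b, agree); agree=True means the two opinions are
           expected to have the same polarity.
    """
    # Initial step: labels from the local scores alone.
    labels = {node: (1 if score >= 0 else -1) for node, score in local_scores.items()}

    # Iterative steps: combine the local score with the neighbors' current labels.
    for _ in range(max_rounds):
        updated = {}
        for node, score in local_scores.items():
            vote = 0
            for a, b, agree in links:
                if node in (a, b):
                    other = b if node == a else a
                    vote += labels[other] if agree else -labels[other]
            updated[node] = 1 if score + vote >= 0 else -1
        if updated == labels:
            break
        labels = updated
    return labels


# "hot" is ambiguous in isolation, so its local score leans slightly negative.
scores = {"like the LCD": 0.9, "must implement the LCD": 0.6, "LCD is hot": -0.2}
links = [
    ("like the LCD", "must implement the LCD", True),
    ("must implement the LCD", "LCD is hot", True),
]
print(iterative_polarity(scores, links))
```

In this toy run the ambiguous "hot" opinion flips to positive once its neighbor's label is taken into account.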


DLOG

• I like the LCD feature
• We must implement the LCD
• I think the LCD is hot ?


DLOG

• I like the LCD feature
• We must implement the LCD
• I think the LCD is hot

Target relations: same, same
Frame relations: Reinforcing, Reinforcing


Findings

• Link prediction is hard due to data skew, so the fully automatic system does not significantly improve polarity classification


Findings

• Can discourse-level opinion graphs be exploited to improve the performance of opinion polarity classification?

• Yes!

• 8-16 percentage point improvement in F-measure when both the graph structure and the neighbors’ polarities are known

• 3-5 percentage point improvement in F-measure when only the graph structure is known


Upcoming releases

• Reconcile NP coreference system
• OpinionFinder version 2
  – Improved performance on opinion and polarity recognition
• MPQA opinion annotated corpus version 3
  – Attitude and target span annotations added
  – Opinion topic coreference annotations added
• New lexicon representation and toolkit
  – Uniform representation for different types of subjectivity clues
  – Attributes, such as polarity, intensity
  – Toolkit for adding entries to the dictionary and finding instances of them in a corpus


Jodange

• Demo www.jodange.com
• Commercialization of methods to identify and aggregate opinions and their attributes

Opinion Frame
  Expression: “blasted”
  Polarity: negative
  Source: “The Australian Press”
  Topic: “Italy”

NP Coref Resolution
  Source: “The Australian Press” = “their” = “the reporters”


Interactions

• UACs
  – Riloff and Hovy have continued their collaboration on Information Extraction
  – Riloff was on sabbatical at ISI through the summer
• Other
  – Riloff and Cardie are engaged in coreference resolution work with researchers at LLNL
  – Wiebe is in the planning stages of work with researchers at LLNL on a news search project
  – Riloff is collaborating with the Veterinary Information Network on research in pet health surveillance
  – Wiebe is collaborating with researchers in the U. Pittsburgh Medical School on an NIH syndromic surveillance project
  – Cardie is collaborating with researchers in the Cornell U. Information Science program on a project on the detection of deception in on-line communication


Research Plans

Information extraction for pet health surveillance
  – collaborating with the Veterinary Information Network to study message board discussions between veterinary professionals
  – goal is information extraction of entities and relations associated with pet health events, to discover:
    • associations between symptoms and diseases
    • spikes or trends in diseases (e.g., kidney failure cases during pet food recall)
    • unusual or unexplained clusters of symptoms or diseases


Research Plans

Coreference resolution
  – Developing methods to incorporate knowledge from the web as corroborating evidence for coreference resolution.
    • Ex: Orrin Hatch and Ralph Becker are possible antecedents for “the mayor” … issue web queries to decide!
  – Experimenting with methods to automatically generate training data for domain-specific coreference resolution.
  – Developing state-of-the-art coreference resolver called Reconcile that will be made publicly available.
    • testbed for research and analysis
    • easily configurable to use different subcomponents


Research Plans

Selective information extraction with relevant regions
  – developing weakly supervised learners to create relevant region classifiers
  – learning several different types of clues with varying levels of reliability
    • primary indicators are reliable enough to apply everywhere
    • secondary indicators are only applied within relevant regions of text

Domains: terrorist events & disease outbreak reports
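The primary/secondary distinction can be sketched in a few lines. This is an illustration only: the patterns, the relevance test, and the example sentences are all hypothetical stand-ins, whereas the plan above is to learn both the clues and the relevant-region classifier with weak supervision.

```python
import re
from typing import Callable


def selective_extract(
    sentences: list[str],
    is_relevant: Callable[[str], bool],
    primary: list[re.Pattern],
    secondary: list[re.Pattern],
) -> list[tuple[str, str]]:
    """Apply primary patterns everywhere, secondary patterns only in relevant sentences."""
    hits = []
    for sentence in sentences:
        patterns = list(primary)
        if is_relevant(sentence):
            patterns += secondary
        for pattern in patterns:
            for match in pattern.finditer(sentence):
                hits.append((pattern.pattern, match.group(0)))
    return hits


# Hypothetical clues for a terrorism domain.
PRIMARY = [re.compile(r"\bassassinated\b")]  # assumed reliable in any context
SECONDARY = [re.compile(r"\battack\b")]      # assumed ambiguous outside relevant regions


def toy_relevance(sentence: str) -> bool:
    """Stand-in for a learned relevant-sentence classifier."""
    return any(cue in sentence.lower() for cue in ("gunmen", "bomb"))


sentences = [
    "Gunmen staged an attack on the embassy.",
    "The press launched a bitter attack on Italy.",
    "The mayor was assassinated.",
]
hits = selective_extract(sentences, toy_relevance, PRIMARY, SECONDARY)
print(hits)
```

The second sentence produces no extraction: "attack" is only a secondary indicator, and the sentence is not classified as relevant.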


Architecture for Selective IE

[Figure: architecture diagram. Components: Unstructured Text; Relevant Sentence Classifier; Relevant Sentences; IE System; IE Patterns and Clues; Extractions; Pattern & Clue Learner; Relevant Region Training; Seed IE Patterns; Relevant and Irrelevant Documents]


Research Plans

• Complete the releases of the MPQA corpus, OpinionFinder, & lexicon and toolkit


Discourse level opinion interpretation

• Move from task-oriented meetings to blogs and other texts

• Challenges
  – Other types of relations between targets are more dominant than in task-oriented meetings
  – Targets are often more abstract
• Opportunities
  – Richer lexically
  – Can exploit parse trees
  – Explicit discourse cues are more common
  – Sentences are longer and more self contained


Lexicon

• Explore different uses of words, to zero in on the subjective ones

• Example: benefit


Lexicon

• Example: benefit
• Very often objective, as a verb: Children with ADHD benefited from a 15-course of fish oil


Lexicon

• Noun uses look more promising: The innovative economic program has shown benefits to humanity


Lexicon

• However, there are objective noun uses too:
  …tax benefits.
  …employee benefits.
  …tax benefits to provide a stable economy.
  …health benefits to cut costs.


Lexicon

• Pattern: benefits as the head of a noun phrase containing a prepositional phrase

• Matches this: The innovative economic program has shown proven benefits to humanity
• But none of these:
  …tax benefits.
  …employee benefits.
  …tax benefits to provide a stable economy.
  …health benefits to cut costs.
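A rough approximation of this pattern can be written over part-of-speech-tagged tokens. The sketch below is an assumption-laden simplification: it uses hand-assigned coarse tags instead of a parser, and it treats "benefits followed directly by a preposition that is not introducing a verb" as a proxy for "head of a noun phrase containing a prepositional phrase". The tag set and function name are hypothetical.

```python
Tagged = list[tuple[str, str]]  # (word, coarse tag): N, V, J, D, P, TO


def matches_benefit_pattern(tagged: Tagged) -> bool:
    """True if noun 'benefits' is directly followed by a preposition plus a non-verb."""
    for i, (word, tag) in enumerate(tagged):
        if word.lower() == "benefits" and tag == "N":
            following = tagged[i + 1 : i + 3]
            if len(following) == 2:
                (_, first_tag), (_, second_tag) = following
                # "to humanity" is a prepositional phrase; "to provide" is an infinitive.
                if first_tag in ("P", "TO") and second_tag != "V":
                    return True
    return False


subjective = [("has", "V"), ("shown", "V"), ("proven", "J"),
              ("benefits", "N"), ("to", "TO"), ("humanity", "N")]
objective_examples = [
    [("tax", "N"), ("benefits", "N")],
    [("employee", "N"), ("benefits", "N")],
    [("tax", "N"), ("benefits", "N"), ("to", "TO"), ("provide", "V"),
     ("a", "D"), ("stable", "J"), ("economy", "N")],
    [("health", "N"), ("benefits", "N"), ("to", "TO"), ("cut", "V"), ("costs", "N")],
]

print(matches_benefit_pattern(subjective))
print([matches_benefit_pattern(example) for example in objective_examples])
```

A parser-based version would check the noun phrase structure directly instead of relying on adjacency.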


Entry representation

be soft on crime

<item index="1">
  <itemMorphoSyntax>
    <lemma>be</lemma>
  </itemMorphoSyntax>
  <itemRelation xsi:type="ngramPattern">
    <distance>2</distance>
    <landmark>2</landmark>
  </itemRelation>
</item>
<item index="2">
  <itemMorphoSyntax>
    <word>soft</word>
    <majorClass>J</majorClass>
  </itemMorphoSyntax>
  <itemRelation xsi:type="ngramPattern">
    <distance>1</distance>
    <landmark>3</landmark>
  </itemRelation>
</item>
<item index="3">
  <itemMorphoSyntax>
    <word>on</word>
  </itemMorphoSyntax>
  <itemRelation xsi:type="ngramPattern">
    <distance>1</distance>
    <landmark>4</landmark>
  </itemRelation>
</item>
<item index="4">
  <itemMorphoSyntax>
    <word>crime</word>
    <majorClass>N</majorClass>
  </itemMorphoSyntax>


Attributive information

<entryAttributes origin="j">
  <name>be soft on crime</name>
  <subjective>true</subjective>
  <reliability>h</reliability>
  <confidence>h</confidence>
  <subType>sen</subType>
  <example>The Obama campaign rejected the notion that the senator might be vulnerable to accusations that he is soft on crime.</example>
  <morphosyn>vp</morphosyn>
  <target>s</sp_target>
  <polarity>n</polarity>
  <intensity>m</intensity>
  <confidence>h</confidence>
  <regex>1:[morph:[lemma="be"] order:[distance="2" landmark="2"]] 2:[morph:[word="soft" majorClass="J"] order:[distance="1" landmark="3"]] 3:[morph:[word="on"] order:[distance="1" landmark="4"]] 4:[morph:[word="crime" majorClass="N"]]</regex>
  <patterntype>ngramPattern</patterntype>
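To make the n-gram pattern concrete, here is a small matcher for the "be soft on crime" entry. The semantics are an assumption, since the slides do not define them: `distance` is read as the maximum number of token positions between an item and its landmark (the next item). The entry is transcribed by hand into Python dictionaries; the toolkit's real API and file handling are not shown in the slides and are not modeled here.

```python
# Hand transcription of the four items; "distance" applies to the following item.
ENTRY = [
    {"lemma": "be", "distance": 2},
    {"word": "soft", "majorClass": "J", "distance": 1},
    {"word": "on", "distance": 1},
    {"word": "crime", "majorClass": "N"},
]


def item_matches(item: dict, token: dict) -> bool:
    """Every morphosyntactic constraint on the item must hold for the token."""
    return all(token.get(key) == value for key, value in item.items() if key != "distance")


def match_at(entry: list[dict], tokens: list[dict], start: int) -> bool:
    """True if the entry matches with its first item at tokens[start]."""
    def extend(item_index: int, position: int) -> bool:
        if not item_matches(entry[item_index], tokens[position]):
            return False
        if item_index == len(entry) - 1:
            return True
        max_gap = entry[item_index]["distance"]
        last = min(position + max_gap, len(tokens) - 1)
        return any(extend(item_index + 1, nxt) for nxt in range(position + 1, last + 1))

    return extend(0, start)


def find_entry(entry: list[dict], tokens: list[dict]) -> list[int]:
    """Return every start index at which the entry matches."""
    return [i for i in range(len(tokens)) if match_at(entry, tokens, i)]


# "he is soft on crime", with hand-assigned lemmas and major classes.
tokens = [
    {"word": "he", "lemma": "he", "majorClass": "P"},
    {"word": "is", "lemma": "be", "majorClass": "V"},
    {"word": "soft", "lemma": "soft", "majorClass": "J"},
    {"word": "on", "lemma": "on", "majorClass": "I"},
    {"word": "crime", "lemma": "crime", "majorClass": "N"},
]
print(find_entry(ENTRY, tokens))
```

Under this reading, a distance of 2 on the first item would also let one intervening word through, as in "is very soft on crime".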


Research Plans

• Learn subjective uses from corpora (bodies of texts)
• Capture longer subjective constructions
• Incorporate word senses
  – OntoNotes (Hovy)
• Add relevant knowledge about expressions
• Exploit the richer knowledge resource to improve opinion extraction