Semantic Annotation & Utility Evaluation Meeting: Feb 14, 2008
Outline: Project Organization; Who is here?; Agenda; Meaning Layers and Applications; Ongoing work
Project Organization
Columbia (Rambow, Passonneau): Dialogic Content; Factiveness
CMU (Mitamura, Levin, Nyberg): Coreference; Entity relations; Factiveness
BBN (Ramshaw, Habash): Temporal Annotation; Coreference (complex)
UMBC (Nirenburg, McShane): Modality: polarity, epistemic, belief, deontic, volitive, potential, permissive, evaluative
Affiliated Efforts: Ed Hovy, Martha Palmer, George Wilson (Mitre)
Evaluation: Bonnie Dorr, David Yarowsky, Keith Hall, Saif Mohammad
Who is here? Kathy Baker (DoD) Mona Diab (Columbia) Bonnie Dorr (UMD) Jason Duncan (DoD) Tim Finin (JHU/APL) Nizar Habash (Columbia) Keith Hall (JHU) Eduard Hovy (USC/ISI) Lori Levin (CMU) James Mayfield (JHU/APL) Marjorie McShane (UMBC) Teruko Mitamura (CMU) Saif Mohammad (UMD) Smaranda Muresan (UMD)
Sergei Nirenburg (UMBC) Eric Nyberg (CMU) Doug Oard (UMD) Boyan Onyshkevych (DoD) Martha Palmer (Colorado) Rebecca Passonneau (Columbia) Owen Rambow (Columbia) Lance Ramshaw (BBN) Gary Strong (DoD) Clare Voss (ARL) Ralph Weischedel (BBN) George Wilson (Mitre) David Yarowsky (JHU)
Semantic Annotation & Utility Evaluation Meeting: Today’s Plan
MORNING:
Site presentations should include an overview of the phenomena covered and utility-motivating examples, extracted from the target corpus. Discussion of annotation conventions and interoperability issues should wait until the afternoon.
Discussion will be seeded by preliminary analysis from the hosts. The primary goal of the discussion is to flesh out our collective assessment of what additional capabilities could be achieved if a machine reached near-human performance on annotating these meaning layers, relative to applications operating on text without such meaning-layer analysis.
AFTERNOON: Compatibility, interoperability, and database population, including integration into a larger KB environment. Participants should bring thoughts on specific issues regarding compatibility/interoperability and database population relative to their meaning layers. (Slides forwarded in advance.)
Semantic Annotation & Utility Evaluation Meeting Agenda
9:00 Boyan Onyshkevych - remarks on annotation and utility; goals for the future
9:30 David or Bonnie - Utility analysis overview
9:45 Brief presentation of meaning layers and utility discussion:
  Factiveness: Columbia, CMU; Factiveness utility discussion [Yarowsky]
  Coreference: CMU, BBN; Coreference utility discussion [Mohammad]
  Propbanking/Ontonotes: Palmer, Hovy; Propbanking/Ontonotes utility discussion [Dorr]
11:00 Break
11:15 Continuation
  Temporal Annotation: BBN; Temporal Annotation utility discussion [Dorr]
  Dialogic Content: Columbia; Dialogic Content utility discussion [Hall]
  Modality: UMBC; Modality utility discussion [Yarowsky]
12:15 Working Lunch: Continue discussion (from above)
1:15 Compatibility, Interoperability, Database population [Mayfield]
2:00 Discussion about interoperability and database population
3:30 Break
3:45 Future Plans: Immediate follow-on for Y1 completion; Broader goals for Y2
4:45 Wrap-up
Meaning Layers and Coarse Utility (matrix of meaning layers vs. applications)
Application columns: Sentiment Analysis; Biog. Creation; Knowledge Discovery / Causality, Anomalies; Deception Detection; SN Analysis; Visualization; IR and KB Population; Self-learning tutor/guide; Text Mining, Pattern Detection; MT; Q/A
Meaning layers, with the number of application columns marked:
Contradiction/Redundancy Detection: 7 of 11
Coreference Resolution: 11 of 11
Dialogic Analysis: 3 of 11
Temporal Ordering: 9 of 11
Factiveness, Confidence: 10 of 11
Modality (polarity, epistemic, deontic, volitive, belief, potential, permissive, evaluative): 11 of 11
Techniques, Issues, Applications: Contradiction/Redundancy
Techniques: Identifying contradictory word pairs:
  "X has only a token presence in Europe" / "X has a large presence in Europe"
Issues: Identifying word pairs that are not antonyms but convey the opposite point when taken in context (as above); identifying which pairs are not contradictions because they refer to different entities.
Applications: Knowledge discovery, KB population, question answering, summarization, coreference filtering
Capabilities beyond IR/keyword matching:
• KB population: Finding evidence that supports and refutes a hypothesis
• QA: Conveying opposite views on the same point of discussion
• Knowledge Discovery: Identifying new information (anomalies) when analyzing extended-time events
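The word-pair technique can be sketched as a lookup against a list of contextually opposing words. This is a minimal illustration only: the pair lexicon below is invented, and it deliberately ignores the harder issue noted above of filtering out pairs that refer to different entities.

```python
# Toy sketch: flag sentence pairs that contain a known opposing word pair.
# The lexicon is a made-up example, not a real resource.
OPPOSING_PAIRS = {
    ("token", "large"),    # "token presence" vs. "large presence"
    ("rarely", "often"),
    ("denied", "confirmed"),
}

def potentially_contradictory(sent_a: str, sent_b: str) -> bool:
    """Return True if the two sentences contain a known opposing word pair."""
    words_a = set(sent_a.lower().split())
    words_b = set(sent_b.lower().split())
    return any(
        (x in words_a and y in words_b) or (y in words_a and x in words_b)
        for x, y in OPPOSING_PAIRS
    )

print(potentially_contradictory(
    "X has only a token presence in Europe",
    "X has a large presence in Europe"))  # True
```

A real system would also need entity coreference to confirm that both sentences describe the same X before reporting a contradiction.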
Techniques, Issues, Applications: Coreference Resolution
Techniques (CMU, BBN): Member and subset, member-base, reference type:
  "Militant leaders including Bill Andres have received death threats"
Resolution of pronominal coreference ambiguity:
  "Darius and his son ruled Persia. He was born in August."
Issues: Teasing out information about the entity of interest from information about other entities sharing the same name; determining that the person mentioned here is the same as the person we know from earlier.
Applications: Biography creation, information retrieval, KB population, question answering, inferencing
Capabilities beyond IR/keyword matching:
• Biography creation and SN Analysis: Identifying more information about an entity through coreferential inferencing than is available by keyword matching.
• Knowledge Discovery: Determining that a person mentioned in one part of the text (or in a different text) refers to a person who is currently being tracked.
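One way to store coreference decisions like those above is as chains of mentions, which a union-find structure captures compactly. The sketch below is illustrative only; the mention strings and the link decisions are assumptions, not the project's actual annotation scheme.

```python
# Minimal coreference-chain store backed by union-find.
class CorefChains:
    def __init__(self):
        self.parent = {}

    def _find(self, m):
        self.parent.setdefault(m, m)
        while self.parent[m] != m:
            self.parent[m] = self.parent[self.parent[m]]  # path halving
            m = self.parent[m]
        return m

    def link(self, m1, m2):
        """Record that two mentions corefer (merges their chains)."""
        self.parent[self._find(m1)] = self._find(m2)

    def corefer(self, m1, m2):
        return self._find(m1) == self._find(m2)

chains = CorefChains()
chains.link("Darius", "his")    # "Darius and his son ruled Persia."
chains.link("his son", "He")    # one reading of "He was born in August."
print(chains.corefer("Darius", "He"))  # False: "He" resolves to the son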
Techniques, Issues, Applications: Dialogic Analysis
Techniques (Columbia):
Thread-wide annotation: Information-Fact ("Meetings run late on Mondays."), Information-Opinion ("His progress on that project is slow."), External-event-planning ("The project report is due tomorrow."), Social ("Did you see the game last night?")
Dialog Function Units: INFORM ("The meeting is at 10"), REJECT ("I don’t know"), REQUEST ("When is it due?"), COMMIT ("I will get back to you on that.")
Belief: Non-committed belief ("I am not sure"), Committed belief ("I am certain")
Issues:
"If" clauses are complex: "If J dies, I will cry." is purely hypothetical, but the causal link could be committed belief.
Missing emails in a thread.
Applications: Deception Detection, SN Analysis, Sentiment
Capabilities beyond IR/keyword matching:
• Deception: Determine if person X is consistently telling person Y something that isn’t true.
• SN Analysis: Determine the structure of a network from information beyond metadata, e.g., bureaucratic structure from communication patterns.
• Sentiment: Determine the opinion of X, beyond who said what to whom.
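To make the Dialog Function Unit labels concrete, here is a toy rule-based tagger over the four units listed above. The cue patterns are invented for illustration; the project's annotation relies on human judgment, not keyword rules.

```python
# Toy DFU tagger: first matching cue pattern wins, default is INFORM.
import re

RULES = [
    ("REQUEST", re.compile(r"\?\s*$")),                       # ends in a question mark
    ("REJECT",  re.compile(r"\b(don't know|can't|won't)\b", re.I)),
    ("COMMIT",  re.compile(r"\bI (will|'ll)\b", re.I)),
]

def tag_dfu(utterance: str) -> str:
    for label, pattern in RULES:
        if pattern.search(utterance):
            return label
    return "INFORM"  # default for plain statements

print(tag_dfu("When is it due?"))                   # REQUEST
print(tag_dfu("I will get back to you on that."))   # COMMIT
print(tag_dfu("The meeting is at 10"))              # INFORM
```

Even this sketch shows why rules are fragile: "I don't know" is tagged REJECT only because of a literal cue, and an indirect request ("Let me know the due date.") would be missed entirely.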
Techniques, Issues, Applications: Temporal Ordering
Techniques (BBN):
Time unit identification: temporally salient phrase, e.g., "last night"
Temporal type assignment: Event, Say, Be, Date, Range
Inherent time: hypothetical, partially specified, past, current, future
Temporal parent assignment: relative clause, conjunction, etc.
Temporal relationship assignment: before, after, etc.: "After Obama’s presentation [e1] yesterday evening [t1], Clinton made [e2] a few remarks." → during(e1, t1), after(e2, e1)
Issues: Temporal coreference still being worked out ("Monday is his return day.")
Applications: Biography Creation, SN Analysis, KB population, self-learning tutor/guide, knowledge discovery, MT
Capabilities beyond IR/keyword matching:
• KB population: Queries should take time into consideration
• Knowledge discovery: Identifying unusual/anomalous events
• MT: Generation of appropriate tense depends on temporal analysis
Techniques, Issues, Applications: Factiveness, Confidence
Techniques (CMU, Columbia): Deducing the probability of truth of a statement based on text analysis; analysis of other aspects of the "assertional force" or conditional truth of a statement. For example: "Guzman lives in Lima", "Guzman must live in Lima", "I doubt Guzman lives in Lima", "If … then Guzman lives in Lima"
Issues: Knowledge representation for truth status and probability; integrating and modifying the truth value of individual assertions relative to other facts, either elsewhere in the document or in existing databases; resolving ambiguities such as "must" as requirement vs. confidence estimation
Applications: KB population, text mining and visualization, question answering, sentiment and deception analysis
Capabilities beyond IR/keyword matching:
• Text Mining: Filtering imported textual assertions based on truth status (e.g., negated, conditional) and assigning confidence values to the imported knowledge
• KB Population: Determine which systems onsite are vulnerable to a threat
• Sentiment: "Should the US continue fighting?"
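The lexical-cue idea behind the Guzman examples can be sketched as a mapping from markers to truth-probability estimates. Both the cue list and the probability values below are invented placeholders, and the sketch collapses exactly the ambiguity flagged above: "must" as requirement vs. confidence estimation.

```python
# Toy factiveness estimator: first matching cue sets the probability.
CUES = {
    "must":  0.8,   # confident inference: "Guzman must live in Lima"
    "doubt": 0.2,   # speaker doubt: "I doubt Guzman lives in Lima"
    "if":    0.5,   # conditional: truth not asserted
}

def estimate_factiveness(sentence: str, default: float = 0.95) -> float:
    """Rough probability that the embedded statement is asserted as true."""
    words = sentence.lower().replace(",", " ").split()
    for cue, prob in CUES.items():
        if cue in words:
            return prob
    return default  # bare assertion: "Guzman lives in Lima"

print(estimate_factiveness("Guzman lives in Lima"))          # 0.95
print(estimate_factiveness("I doubt Guzman lives in Lima"))  # 0.2
```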
Techniques, Issues, Applications: Modality
Techniques (UMBC): Assessment of a broad set of "modality" conditions of textual statements: polarity, epistemic, belief, deontic, volitive, potential, permissive, evaluative, epiteuctic, etc.:
• "He is trying to get Hamas to co-exist with Israel" (volitive)
• "Conservative Israelis are skeptical" (belief & uncertainty)
Analysis of potential linguistic indicators for each modality type, and disambiguation when multiple types are possible
Issues: Relationship to (and inter-rater agreement with) factiveness analysis; coverage and inherent ambiguity
Applications: Knowledge discovery, KB population, text mining, visualization, question answering, sentiment and deception analysis
Capabilities beyond IR/keyword matching:
• Knowledge discovery: Adding modality "status" attributes to extracted facts and supporting decisions based on these distinctions (e.g., desire, intention, expectation)
• KB population: Determine whether a particular country succeeded in building weapons (epiteuctic modality)
• Sentiment analysis: Determine whether a particular political group is skeptical (belief & uncertainty)
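The indicator-based approach, together with its disambiguation problem, can be made concrete in a few lines: some indicators map to a single modality type, others to several. The indicator lists here are illustrative assumptions, not UMBC's actual inventory.

```python
# Toy modality indicator lookup; more than one candidate type signals
# a disambiguation task (e.g. "must": requirement vs. inference).
INDICATORS = {
    "trying":    ["volitive"],
    "skeptical": ["belief", "epistemic"],   # belief & uncertainty
    "must":      ["deontic", "epistemic"],  # requirement vs. inference
    "may":       ["permissive", "epistemic"],
}

def candidate_modalities(sentence: str) -> list:
    """Collect candidate modality types for the indicators in a sentence."""
    found = []
    for word in sentence.lower().split():
        found.extend(INDICATORS.get(word, []))
    return found

print(candidate_modalities("He is trying to get Hamas to co-exist with Israel"))
# ['volitive']
print(candidate_modalities("He must report daily"))
# ['deontic', 'epistemic']
```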
Ongoing work
• Analysis of intra-site and cross-site annotation agreement rates
• Additional rounds of utility analysis
• Initial assessment of computational feasibility
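Agreement rates of the kind mentioned above are typically reported with a chance-corrected statistic; a minimal Cohen's kappa for two annotators is sketched below. The example labels are made up, and real studies on these meaning layers would likely use multi-annotator or weighted variants.

```python
# Minimal Cohen's kappa: observed agreement corrected for chance agreement.
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[lab] * freq_b[lab] for lab in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

a = ["FACT", "FACT", "OPINION", "FACT", "OPINION", "FACT"]
b = ["FACT", "OPINION", "OPINION", "FACT", "OPINION", "FACT"]
print(round(cohens_kappa(a, b), 3))  # 0.667
```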