Machine Reading as a Process of Partial Question-Answering
Peter Clark and Phil Harrison
Boeing Research & Technology
June 2010
Overview
Machine Reading and Question-Answering
Approach
Algorithm
Preliminary Results
Summary
Machine Reading
Machine Reading = A “holy grail” of AI
Constructing an inference-supporting representation from text
Connecting what is read with what is already known
Reader already knows something
Text is elaborating/deepening that knowledge
Do I already know this?
Can I interpret this as something that I know?
Can I interpret some of this as something I know?
Machine Reading: any remainder = new knowledge
Question-Answering: any remainder = failed query
Main insight: These are similar processes
Can apply question-answering techniques to machine reading.
Why is that important?
Question-answering is precisely a technology for linking what is said (asked) with what is known.
i.e., to read text T, ask: Is it true that T?
General Approach
Text: “The mitotic spindle consists of hollow microtubules.”
Question: “Does the mitotic spindle consist of hollow microtubules?”
Partial Answer: “The mitotic spindle has parts [hollow] microtubules”
New Knowledge: “Those microtubules are hollow”
Knowledge has guided interpretation
...and identified the “anchor points” in the KB for new knowledge
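The "to read T, ask whether T is true" idea can be sketched in a few lines. This is a minimal illustration only, assuming a toy KB of (relation, subject, object) triples and exact-match lookup in place of real inference; the names `KB` and `read_as_question` are hypothetical and not part of the system described in the slides.

```python
# Hypothetical toy KB: a set of (relation, subject, object) triples.
KB = {
    ("has-part", "mitotic-spindle", "microtubule"),
    ("has-part", "eukaryotic-cell", "nucleus"),
}


def read_as_question(text_facts, kb):
    """Ask "is it true that T?" for every fact extracted from the text T.

    Returns (known, new): facts the KB already supports (the partial
    answer) and the remainder. A question-answerer would report the
    remainder as a failed query; a reader treats it as new knowledge.
    """
    known = [fact for fact in text_facts if fact in kb]
    new = [fact for fact in text_facts if fact not in kb]
    return known, new


# "The mitotic spindle consists of hollow microtubules."
text = [
    ("has-part", "mitotic-spindle", "microtubule"),
    ("shape", "microtubule", "hollow"),
]
known, new = read_as_question(text, KB)
extended_kb = KB | set(new)  # the remainder is added to the KB
```

Here the has-part fact is the partial answer and the shape fact is the new knowledge; a real system would prove facts by inference rather than by set membership.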
Pipelined (KB-independent) NLP
[Diagram: “During prophase, the cell…” → Parse, logical form → Word-Sense Disambiguation → Semantic Role Labeling → ? → Topic in the KB]
Interleaved Interpretation and Answering
[Diagram, built up over several slides: “During prophase, the cell…” → Logical Form, matched piece by piece against the Topic in the KB]
Pieces of the logical form are recognized as Existing Knowledge
Suppose this is the best we can do, interpreting text as existing knowledge
Traditional NLP interprets the rest as New Knowledge, giving the Extended KB
Word sense choices, semantic role choices, paraphrase rewrites
Some Possible Semantic Role Labels…
“DNA synthesized by the polymerase”
agent? location? means?
KB
Some Possible Paraphrases (DIRT)…
“spindle consists of microtubules”
“microtubules are part of the spindle”
“spindle is staffed by microtubules”
“microtubules participate in the spindle”
…
KB
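One way to picture how the KB selects among candidate readings is the sketch below. It assumes a hand-written paraphrase table standing in for a resource such as DIRT, and exact triple lookup standing in for proof; `PARAPHRASES`, `candidates`, and `interpret` are hypothetical names introduced for illustration.

```python
# Hypothetical KB fragment as (relation, subject, object) triples.
KB = {("has-part", "spindle", "microtubule")}

# Hypothetical paraphrase table: surface pattern -> candidate KB readings.
# Each reading is (relation, swap); swap=True reverses the two arguments.
PARAPHRASES = {
    "consists of": [("material", False), ("has-part", False)],
    "are part of": [("has-part", True)],
    "participate in": [("agent-of", False), ("has-part", True)],
}


def candidates(subj, pattern, obj):
    """Enumerate every candidate triple for the phrase 'subj pattern obj'."""
    for relation, swap in PARAPHRASES.get(pattern, []):
        yield (relation, obj, subj) if swap else (relation, subj, obj)


def interpret(subj, pattern, obj, kb):
    """Return the first candidate reading the KB can confirm, else None."""
    for triple in candidates(subj, pattern, obj):
        if triple in kb:
            return triple
    return None


reading_1 = interpret("spindle", "consists of", "microtubule", KB)
reading_2 = interpret("microtubule", "participate in", "spindle", KB)
reading_3 = interpret("spindle", "is staffed by", "microtubule", KB)
```

Both of the first two phrasings resolve to the same has-part fact, while the third has no entry in this toy table and stays uninterpreted.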
Knowledge Representation
Ontology: ~400 biology concepts, ~400 general concepts
Axioms: Mainly “Forall…exists…” axioms, e.g.,
“All eukaryotic cells contain a nucleus”
“Subevents of mitosis are prophase, metaphase, …”
Inference: Reason about an instance of a concept
Conclusions apply to all instances of the concept (via universal generalization, UG)
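As a rough illustration of reasoning with “Forall…exists…” axioms, the sketch below creates one instance of a concept and adds a fresh individual for each existential. The axiom encoding, the relation names, and the function `instantiate` are assumptions made for this example, not the representation used by the actual KB.

```python
import itertools

# Hypothetical axioms of the form
#   forall x: Concept(x) -> exists y: Filler(y) and relation(x, y)
# written as (concept, relation, filler_concept).
AXIOMS = [
    ("Eukaryotic-Cell", "has-part", "Nucleus"),
    ("Mitosis", "subevent", "Prophase"),
    ("Mitosis", "subevent", "Metaphase"),
]

_ids = itertools.count()


def instantiate(concept, axioms):
    """Create one instance of `concept` and apply each of its axioms once."""
    x = f"X{next(_ids)}"
    facts = [("isa", x, concept)]
    for axiom_concept, relation, filler in axioms:
        if axiom_concept == concept:
            y = f"Y{next(_ids)}"  # fresh individual for the "exists"
            facts.append(("isa", y, filler))
            facts.append((relation, x, y))
    return x, facts


cell, facts = instantiate("Eukaryotic-Cell", AXIOMS)
```

Because nothing is assumed about `cell` beyond its concept, anything derived about it holds for every eukaryotic cell, which is the universal generalization step.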
Topics
Topic = the concept that a text describes
We assume a text is about a single topic
Topic could be identified using ML (we do it by hand)
Given topic, can find (some) expected “participants” from KB
The centrosomes are pushed apart to opposite ends of the cell nucleus by the action of molecular motors acting on the microtubules. The nuclear envelope breaks down, allowing…
Topic: Prophase
Participants = individuals implied to exist given the topic
Can infer (some) participants using the KB
KB, Prophase:
→ centrosome moves to the pole of a eukaryotic cell
→ nucleus, cytoplasm
→ nuclear membrane, etc.
Text provides information about participants
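Finding the expected participants of a topic can be sketched as a graph traversal. The KB contents below are illustrative guesses at a few Prophase relations, not the real ontology, and `participants` is a hypothetical helper.

```python
from collections import deque

# Hypothetical KB: for each concept, the (relation, concept) pairs that
# every instance is asserted to have. Contents are illustrative only.
KB = {
    "Prophase": [("subevent", "Move"), ("subevent", "Create")],
    "Move": [("object", "Centrosome"), ("destination", "Pole")],
    "Create": [("object", "Mitotic-Spindle")],
    "Mitotic-Spindle": [("has-part", "Microtubule")],
}


def participants(topic, kb):
    """Collect every concept implied to exist by the topic, breadth first."""
    seen = [topic]
    queue = deque([topic])
    while queue:
        concept = queue.popleft()
        for _relation, filler in kb.get(concept, []):
            if filler not in seen:
                seen.append(filler)
                queue.append(filler)
    return seen[1:]  # drop the topic itself


expected = participants("Prophase", KB)
```

Words in the text that name one of these concepts can then be tied to the corresponding known individual instead of being treated as something new.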
Algorithm
Identify the topic of the text
Parse and create initial “logical form”:
“The mitotic spindle consists of hollow microtubules.”
"mitotic-spindle"(s), "consist"(c), "hollow"(h), "microtubule"(m), subject(c,s), "of"(c,m), modifier(m,h).
1. Setup: Create representation of topic + (known) participants in KB
2. Search:
repeat: interpret + (try to) prove parts of the LF
until: as much proved as possible
Interpret remainder (normal NLP) and add to KB
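The setup and search steps can be approximated as follows. This is a heavily simplified sketch under stated assumptions: the logical form is collapsed to word predicates plus binary links (so "consist" + "of" becomes one `consist-of` link), the search is a single greedy pass rather than a repeat/until loop, and the last candidate relation stands in for "normal NLP". All tables and the function `read` are hypothetical.

```python
# Hypothetical topic graph for one Prophase instance (fragment only).
KB_ISA = {"Y4": "Mitotic-Spindle", "Y7": "Microtubule"}
KB_FACTS = {("has-part", "Y4", "Y7")}

# Hypothetical lexicon and candidate semantic relations per syntactic link.
WORD_TO_CONCEPT = {
    "mitotic-spindle": "Mitotic-Spindle",
    "microtubule": "Microtubule",
    "hollow": "Hollow",
}
CANDIDATE_RELATIONS = {"consist-of": ["material", "has-part"], "mod": ["shape"]}

# Simplified logical form of "The mitotic spindle consists of hollow microtubules."
LF_WORDS = {"s": "mitotic-spindle", "m": "microtubule", "h": "hollow"}
LF_LINKS = [("consist-of", "s", "m"), ("mod", "m", "h")]


def read(words, links, kb_isa, kb_facts):
    """Split a logical form into recognized old knowledge and new knowledge."""
    # Bind LF variables to known participants of the matching concept.
    binding = {}
    for var, word in words.items():
        concept = WORD_TO_CONCEPT[word]
        match = next((i for i, c in kb_isa.items() if c == concept), None)
        if match is not None:
            binding[var] = match

    known, new = [], []
    for link, a, b in links:
        options = CANDIDATE_RELATIONS[link]
        if a in binding and b in binding:
            # Try each candidate reading and keep the first one the KB proves.
            proved = next(
                (r for r in options if (r, binding[a], binding[b]) in kb_facts),
                None,
            )
            if proved:
                known.append((proved, binding[a], binding[b]))
                continue
        # Not provable: create individuals for unbound variables and fall
        # back to the last candidate relation (stand-in for normal NLP).
        for var in (a, b):
            if var not in binding:
                concept = WORD_TO_CONCEPT[words[var]]
                binding[var] = f"New-{concept}"  # the slides name this one Y8
                new.append(("isa", binding[var], concept))
        new.append((options[-1], binding[a], binding[b]))
    return known, new


known, new = read(LF_WORDS, LF_LINKS, KB_ISA, KB_FACTS)
```

In this run the `material` reading fails, `has-part` is proved against the topic graph, and the `hollow` modifier is left over as new knowledge attached to the known microtubule.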
Topic: Prophase
Create a representation of the topic in the KB
[Graph: nodes X0:Prophase, Y0:Move, Y1:Centrosome, Y2:Eukaryotic-Cell, Y3:Elongate, Y4:Mitotic-Spindle, Y5:Pole, Y6:Create, Y7:Microtubule, …; edge labels subevent, has-part, has-region, object, destination]
Generate Logical Form
“The mitotic spindle consists of hollow microtubules.”
LF interpretation:
"mitotic-spindle"(s), "consist"(c), "hollow"(h), "microtubule"(m), subject(c,s), "of"(c,m), mod(m,h).
Interpret and (try) prove some part of the LF
Bind a LF variable:
isa(Y4,MSpindle), "consist"(c), "hollow"(h), "microtubule"(m), subject(c,Y4), "of"(c,m), mod(m,h)
Try material:
isa(Y4,MSpindle), "hollow"(h), "microtubule"(m), material(Y4,m), mod(m,h). ?
Try has-part:
isa(Y4,MSpindle), "hollow"(h), "microtubule"(m), has-part(Y4,m), mod(m,h). ?
Recognized Old Knowledge (Y4:Mitotic-Spindle has-part Y7:Microtubule):
isa(Y4,MSpindle), "hollow"(h), isa(Y7,Microtubule), has-part(Y4,Y7), modifier(Y7,h). !
Traditional NLP for the rest…
isa(Y4,MSpindle), isa(Y8,Hollow), isa(Y7,Microtubule), has-part(Y4,Y7), shape(Y7,Y8).
Add to the KB
New Knowledge: Y8:Hollow, attached to Y7:Microtubule by shape
Illustration
Input Text + Topic (here, Prophase):
“During prophase, chromosomes become visible, the nucleolus disappears, the mitotic spindle forms, and the nuclear envelope disappears. Chromosomes become more coiled and can be viewed under a light microscope. Each duplicated chromosome is seen as a pair of sister chromatids joined by the duplicated but unseparated centromere. The nucleolus disappears during prophase. In the cytoplasm, the mitotic spindle, consisting of microtubules and other proteins, forms between the two pairs of centrioles as they migrate to opposite poles of the cell. The nuclear envelope disappears at the end of prophase. This signals the beginning of the substage called prometaphase.”
Output Axioms (expressed in English):
In all prophase events:
• The chromosome moves.
• The chromatids are attached by the centromere.
• The nucleolus disappears during the prophase.
• The mitotic spindle has parts the microtubule and the protein.
• The mitotic spindle is created between the centrioles in the cytoplasm.
• The centrioles move to the poles.
• The nuclear envelope disappears at the end.
• Something signals.
Good interpretation using paraphrases
Useful New Knowledge
Good interpretation
Not very useful
Bad interpretation
A Preliminary Experiment
10 paragraphs (110 sentences) about prophase, from the Web
114 logic statements created:
23 (20%) fully known to the KB
27 (24%) partially new knowledge
64 (56%) completely new knowledge
Biologist ranked the statements (expressed in English) as:
c = correct; useful knowledge for the KB
q = questionable; not useful (meaningless, vague)
i = incorrect
Statements that are:
               Fully new   Mixture of known & new   Fully known
Correct           25                19                  22
Questionable      38                 8                   1
Incorrect          1                 0                   0
“The membrane break down”: questionable due to poor rendering in English, not the original logic
70% judged correct (19 of 27 statements that mix known and new knowledge)
39% judged correct (25 of 64 fully new statements)
Is extracting and integrating some useful knowledge
Potentially useful as interactive tool
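The percentages quoted for this experiment can be cross-checked from the counts. The sketch below assumes the reading of the results table as Correct 25/19/22, Questionable 38/8/1, Incorrect 1/0/0 for fully new, mixed, and fully known statements; the dictionary layout and the helper `percent_correct` are illustrative.

```python
# Counts per statement type, as read from the results table (an assumed
# parse of the slide: columns fully new / mixture / fully known).
COUNTS = {
    "fully known": {"correct": 22, "questionable": 1, "incorrect": 0},
    "mixture": {"correct": 19, "questionable": 8, "incorrect": 0},
    "fully new": {"correct": 25, "questionable": 38, "incorrect": 1},
}


def percent_correct(column):
    """Share of statements in one column judged correct, in whole percent."""
    return round(100 * column["correct"] / sum(column.values()))


totals = {name: sum(column.values()) for name, column in COUNTS.items()}
grand_total = sum(totals.values())

for name, column in COUNTS.items():
    share = round(100 * totals[name] / grand_total)
    print(f"{name}: {totals[name]} statements ({share}%), "
          f"{percent_correct(column)}% judged correct")
```

The column totals come to 23, 27, and 64 out of 114, which agrees with the 20%, 24%, and 56% split reported for the experiment.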
Summary
Clearly only a first step
Simple KR, single parse, contradictions, noisy, …
But:
Interpretation guided by knowledge
Identifies the “hooks” for new knowledge
Is a “real” context for machine reading
To read T, ask “Is it true that T?”