information extraction, conditional random fields, and social network analysis
DESCRIPTION
Information Extraction, Conditional Random Fields, and Social Network Analysis. Andrew McCallum Computer Science Department University of Massachusetts Amherst Joint work with Aron Culotta, Charles Sutton, Ben Wellner, Khashayar Rohanimanesh, Wei Li, Andres Corrada, Xuerui Wang. Goal:. - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/1.jpg)
Information Extraction,Conditional Random Fields,and Social Network Analysis
Andrew McCallum
Computer Science Department
University of Massachusetts Amherst
Joint work with
Aron Culotta, Charles Sutton, Ben Wellner, Khashayar Rohanimanesh, Wei Li,Andres Corrada, Xuerui Wang
![Page 2: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/2.jpg)
Goal:
Mine actionable knowledgefrom unstructured text.
![Page 3: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/3.jpg)
Extracting Job Openings from the Web
foodscience.com-Job2
JobTitle: Ice Cream Guru
Employer: foodscience.com
JobCategory: Travel/Hospitality
JobFunction: Food Services
JobLocation: Upper Midwest
Contact Phone: 800-488-2611
DateExtracted: January 8, 2001
Source: www.foodscience.com/jobs_midwest.html
OtherCompanyJobs: foodscience.com-Job1
![Page 4: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/4.jpg)
A Portal for Job Openings
![Page 5: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/5.jpg)
Job
Op
enin
gs:
Cat
ego
ry =
Hig
h T
ech
Key
wo
rd =
Jav
a L
oca
tio
n =
U.S
.
QuickTime™ and aTIFF (LZW) decompressor
are needed to see this picture.
![Page 6: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/6.jpg)
Data Mining the Extracted Job Information
![Page 7: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/7.jpg)
IE fromChinese Documents regarding Weather
Department of Terrestrial System, Chinese Academy of Sciences
200k+ documentsseveral millennia old
- Qing Dynasty Archives- memos- newspaper articles- diaries
![Page 8: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/8.jpg)
IE from Cargo Container Ship Manifests
Cargo Tracking Div.US Navy
![Page 9: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/9.jpg)
IE from Research Papers[McCallum et al ‘99]
![Page 10: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/10.jpg)
IE from Research Papers
![Page 11: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/11.jpg)
Mining Research Papers
QuickTime™ and aTIFF (LZW) decompressor
are needed to see this picture.
QuickTime™ and aTIFF (LZW) decompressor
are needed to see this picture.
[Giles et al]
[Rosen-Zvi, Griffiths, Steyvers, Smyth, 2004]
![Page 12: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/12.jpg)
What is “Information Extraction”
Information Extraction = segmentation + classification + clustering + association
As a familyof techniques:
October 14, 2002, 4:00 a.m. PT
For years, Microsoft Corporation CEO Bill Gates railed against the economic philosophy of open-source software with Orwellian fervor, denouncing its communal licensing as a "cancer" that stifled technological innovation.
Today, Microsoft claims to "love" the open-source concept, by which software code is made public to encourage improvement and development by outside programmers. Gates himself says Microsoft will gladly disclose its crown jewels--the coveted code behind the Windows operating system--to select customers.
"We can be open source. We love the concept of shared source," said Bill Veghte, a Microsoft VP. "That's a super-important shift for us in terms of code access.“
Richard Stallman, founder of the Free Software Foundation, countered saying…
Microsoft CorporationCEOBill GatesMicrosoftGatesMicrosoftBill VeghteMicrosoftVPRichard StallmanfounderFree Software Foundation
![Page 13: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/13.jpg)
What is “Information Extraction”
Information Extraction = segmentation + classification + association + clustering
As a familyof techniques:
October 14, 2002, 4:00 a.m. PT
For years, Microsoft Corporation CEO Bill Gates railed against the economic philosophy of open-source software with Orwellian fervor, denouncing its communal licensing as a "cancer" that stifled technological innovation.
Today, Microsoft claims to "love" the open-source concept, by which software code is made public to encourage improvement and development by outside programmers. Gates himself says Microsoft will gladly disclose its crown jewels--the coveted code behind the Windows operating system--to select customers.
"We can be open source. We love the concept of shared source," said Bill Veghte, a Microsoft VP. "That's a super-important shift for us in terms of code access.“
Richard Stallman, founder of the Free Software Foundation, countered saying…
Microsoft CorporationCEOBill GatesMicrosoftGatesMicrosoftBill VeghteMicrosoftVPRichard StallmanfounderFree Software Foundation
![Page 14: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/14.jpg)
What is “Information Extraction”
Information Extraction = segmentation + classification + association + clustering
As a familyof techniques:
October 14, 2002, 4:00 a.m. PT
For years, Microsoft Corporation CEO Bill Gates railed against the economic philosophy of open-source software with Orwellian fervor, denouncing its communal licensing as a "cancer" that stifled technological innovation.
Today, Microsoft claims to "love" the open-source concept, by which software code is made public to encourage improvement and development by outside programmers. Gates himself says Microsoft will gladly disclose its crown jewels--the coveted code behind the Windows operating system--to select customers.
"We can be open source. We love the concept of shared source," said Bill Veghte, a Microsoft VP. "That's a super-important shift for us in terms of code access.“
Richard Stallman, founder of the Free Software Foundation, countered saying…
Microsoft CorporationCEOBill GatesMicrosoftGatesMicrosoftBill VeghteMicrosoftVPRichard StallmanfounderFree Software Foundation
![Page 15: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/15.jpg)
What is “Information Extraction”
Information Extraction = segmentation + classification + association + clustering
As a familyof techniques:
October 14, 2002, 4:00 a.m. PT
For years, Microsoft Corporation CEO Bill Gates railed against the economic philosophy of open-source software with Orwellian fervor, denouncing its communal licensing as a "cancer" that stifled technological innovation.
Today, Microsoft claims to "love" the open-source concept, by which software code is made public to encourage improvement and development by outside programmers. Gates himself says Microsoft will gladly disclose its crown jewels--the coveted code behind the Windows operating system--to select customers.
"We can be open source. We love the concept of shared source," said Bill Veghte, a Microsoft VP. "That's a super-important shift for us in terms of code access.“
Richard Stallman, founder of the Free Software Foundation, countered saying…
Microsoft CorporationCEOBill GatesMicrosoftGatesMicrosoftBill VeghteMicrosoftVPRichard StallmanfounderFree Software Foundation
NAME
TITLE ORGANIZATION
Bill Gates
CEO
Microsoft
Bill Veghte
VP
Microsoft
Richard Stallman
founder
Free Soft..
*
*
*
*
![Page 16: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/16.jpg)
Larger Context
SegmentClassifyAssociateCluster
Filter
Prediction Outlier detection Decision support
IE
Documentcollection
Database
Discover patterns - entity types - links / relations - events
DataMining
Spider
Actionableknowledge
![Page 17: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/17.jpg)
Outline
• Examples of IE and Data Mining.
• Conditional Random Fields and Feature Induction.
• Joint inference: Motivation and examples
– Joint Labeling of Cascaded Sequences (Belief Propagation)
– Joint Labeling of Distant Entities (BP by Tree Reparameterization)
– Joint Co-reference Resolution (Graph Partitioning)
– Joint Segmentation and Co-ref (Iterated Conditional Samples.)
• Two example projects
– Email, contact address book, and Social Network Analysis
– Research Paper search and analysis
![Page 18: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/18.jpg)
Hidden Markov Models
St - 1
St
Ot
St+1
Ot +1
Ot -1
...
...
Finite state model Graphical model
Parameters: for all states S={s1,s2,…} Start state probabilities: P(st ) Transition probabilities: P(st|st-1 ) Observation (emission) probabilities: P(ot|st )Training: Maximize probability of training observations (w/ prior)
∏=
−∝||
11 )|()|(),(
o
ttttt soPssPosP
vvv
HMMs are the standard sequence modeling tool in genomics, music, speech, NLP, …
...transitions
observations
o1 o2 o3 o4 o5 o6 o7 o8
Generates:
State sequenceObservation sequence
Usually a multinomial over atomic, fixed alphabet
![Page 19: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/19.jpg)
IE with Hidden Markov Models
Yesterday Rich Caruana spoke this example sentence.
Yesterday Rich Caruana spoke this example sentence.
Person name: Rich Caruana
Given a sequence of observations:
and a trained HMM:
Find the most likely state sequence: (Viterbi)
Any words said to be generated by the designated “person name”state extract as a person name:
person name
location name
background
![Page 20: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/20.jpg)
We want More than an Atomic View of Words
Would like richer representation of text: many arbitrary, overlapping features of the words.
St - 1
St
Ot
St+1
Ot +1
Ot -1
identity of wordends in “-ski”is capitalizedis part of a noun phraseis in a list of city namesis under node X in WordNetis in bold fontis indentedis in hyperlink anchorlast person name was femalenext two words are “and Associates”
…
…part of
noun phrase
is “Wisniewski”
ends in “-ski”
![Page 21: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/21.jpg)
Problems with Richer Representationand a Joint Model
These arbitrary features are not independent.– Multiple levels of granularity (chars, words, phrases)
– Multiple dependent modalities (words, formatting, layout)
– Past & future
Two choices:
Model the dependencies.Each state would have its own Bayes Net. But we are already starved for training data!
Ignore the dependencies.This causes “over-counting” of evidence (ala naïve Bayes). Big problem when combining evidence, as in Viterbi!
St - 1
St
Ot
St+1
Ot +1
Ot -1
St - 1
St
Ot
St+1
Ot +1
Ot -1
![Page 22: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/22.jpg)
Conditional Sequence Models
• We prefer a model that is trained to maximize a conditional probability rather than joint probability:P(s|o) instead of P(s,o):
– Can examine features, but not responsible for generating them.
– Don’t have to explicitly model their dependencies.
– Don’t “waste modeling effort” trying to generate what we are given at test time anyway.
![Page 23: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/23.jpg)
€
vs = s1,s2,...sn
v o = o1,o2,...on
Joint
Conditional
St-1 St
Ot
St+1
Ot+1Ot-1
St-1 St
Ot
St+1
Ot+1Ot-1
...
...
...
...
€
P(v s ,
v o ) = P(st | st−1)P(ot | st )
t=1
|v o |
∏
€
Φo(t) = exp λ k fk (st ,ot )k
∑ ⎛
⎝ ⎜
⎞
⎠ ⎟ (A super-special case of
Conditional Random Fields.)
[Lafferty, McCallum, Pereira 2001]
€
P(v s |
v o ) =
1
P(v o )
P(st | st−1)P(ot | st )t=1
|v o |
∏
€
=1
Z(v o )
Φs(st ,st−1)Φo(ot ,st )t=1
|v o |
∏
where
From HMMs to Conditional Random Fields
Set parameters by maximum likelihood, using optimization method on L.
![Page 24: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/24.jpg)
Conditional Random Fields
St St+1 St+2
O = Ot, Ot+1, Ot+2, Ot+3, Ot+4
St+3 St+4
1. FSM special-case: linear chain among unknowns, parameters tied across time steps.
∏ ∑=
− ⎟⎠
⎞⎜⎝
⎛=
||
11 ),,,(exp
)(
1)|(
o
t kttkk tossf
oZosP
vv
vvv
λ
[Lafferty, McCallum, Pereira 2001]
2. In general: CRFs = "Conditionally-trained Markov Network" arbitrary structure among unknowns
3. Relational Markov Networks [Taskar, Abbeel, Koller 2002]: Parameters tied across hits from SQL-like queries ("clique templates")
![Page 25: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/25.jpg)
Training CRFs
€
Log - likelihood gradient :
∂L
∂λ k
= Ck (v s ( i),
v o (i))
i
∑ − P{λk }(v s |
v o (i)) Ck (
v s ,
v o ( i))
v s
∑i
∑ − λ k2
Ck (v s ,
v o ) = fk (
t
∑ v o , t,st−1,st )
€
Maximize log - likelihood of parameters given training data :
L({λ k} |{v o ,
v s
( i)})
Feature count using correct labels
Feature count using predicted labels
Smoothing penalty- -
![Page 26: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/26.jpg)
Linear-chain CRFs vs. HMMs
• Comparable computational efficiency for inference
• Features may be arbitrary functions of any or all observations
– Parameters need not fully specify generation of observations; can require less training data
– Easy to incorporate domain knowledge
![Page 27: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/27.jpg)
IE from Research Papers[McCallum et al ‘99]
![Page 28: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/28.jpg)
IE from Research Papers
Field-level F1
Hidden Markov Models (HMMs) 75.6[Seymore, McCallum, Rosenfeld, 1999]
Support Vector Machines (SVMs) 89.7[Han, Giles, et al, 2003]
Conditional Random Fields (CRFs) 93.9[Peng, McCallum, 2004]
error40%
![Page 29: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/29.jpg)
Table Extraction from Government ReportsCash receipts from marketings of milk during 1995 at $19.9 billion dollars, was slightly below 1994. Producer returns averaged $12.93 per hundredweight, $0.19 per hundredweight below 1994. Marketings totaled 154 billion pounds, 1 percent above 1994. Marketings include whole milk sold to plants and dealers as well as milk sold directly to consumers. An estimated 1.56 billion pounds of milk were used on farms where produced, 8 percent less than 1994. Calves were fed 78 percent of this milk with the remainder consumed in producer households. Milk Cows and Production of Milk and Milkfat: United States, 1993-95 -------------------------------------------------------------------------------- : : Production of Milk and Milkfat 2/ : Number :------------------------------------------------------- Year : of : Per Milk Cow : Percentage : Total :Milk Cows 1/:-------------------: of Fat in All :------------------ : : Milk : Milkfat : Milk Produced : Milk : Milkfat -------------------------------------------------------------------------------- : 1,000 Head --- Pounds --- Percent Million Pounds : 1993 : 9,589 15,704 575 3.66 150,582 5,514.4 1994 : 9,500 16,175 592 3.66 153,664 5,623.7 1995 : 9,461 16,451 602 3.66 155,644 5,694.3 --------------------------------------------------------------------------------1/ Average number during year, excluding heifers not yet fresh. 2/ Excludes milk sucked by calves.
![Page 30: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/30.jpg)
Table Extraction from Government Reports
Cash receipts from marketings of milk during 1995 at $19.9 billion dollars, was
slightly below 1994. Producer returns averaged $12.93 per hundredweight,
$0.19 per hundredweight below 1994. Marketings totaled 154 billion pounds,
1 percent above 1994. Marketings include whole milk sold to plants and dealers
as well as milk sold directly to consumers.
An estimated 1.56 billion pounds of milk were used on farms where produced,
8 percent less than 1994. Calves were fed 78 percent of this milk with the
remainder consumed in producer households.
Milk Cows and Production of Milk and Milkfat:
United States, 1993-95
--------------------------------------------------------------------------------
: : Production of Milk and Milkfat 2/
: Number :-------------------------------------------------------
Year : of : Per Milk Cow : Percentage : Total
:Milk Cows 1/:-------------------: of Fat in All :------------------
: : Milk : Milkfat : Milk Produced : Milk : Milkfat
--------------------------------------------------------------------------------
: 1,000 Head --- Pounds --- Percent Million Pounds
:
1993 : 9,589 15,704 575 3.66 150,582 5,514.4
1994 : 9,500 16,175 592 3.66 153,664 5,623.7
1995 : 9,461 16,451 602 3.66 155,644 5,694.3
--------------------------------------------------------------------------------
1/ Average number during year, excluding heifers not yet fresh.
2/ Excludes milk sucked by calves.
CRFLabels:• Non-Table• Table Title• Table Header• Table Data Row• Table Section Data Row• Table Footnote• ... (12 in all)
[Pinto, McCallum, Wei, Croft, 2003 SIGIR]
Features:• Percentage of digit chars• Percentage of alpha chars• Indented• Contains 5+ consecutive spaces• Whitespace in this line aligns with prev.• ...• Conjunctions of all previous features,
time offset: {0,0}, {-1,0}, {0,1}, {1,2}.
100+ documents from www.fedstats.gov
![Page 31: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/31.jpg)
Table Extraction Experimental Results
Line labels,percent correct
Table segments,F1
95 % 92 %
65 % 64 %
85 % -
HMM
StatelessMaxEnt
CRF w/outconjunctions
CRF
52 % 68 %
[Pinto, McCallum, Wei, Croft, 2003 SIGIR]
![Page 32: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/32.jpg)
Feature Induction for CRFs
1. Begin with knowledge of atomic features, but no features yet in the model.
2. Consider many candidate features, including atomic and conjunctions.
3. Evaluate each candidate feature.
4. Add to the model some that are ranked highest.
5. Train the model.
[McCallum, 2003, UAI]
![Page 33: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/33.jpg)
Candidate Feature Evaluation
Common method: Information Gain
∑∈
−=Ff
fCHfPCHFC )|( )()(),(InfoGain
True optimization criterion: Likelihood of training data
Λ+Λ −= LLf fλλ ),(Gain Likelihood
Technical meat is in how to calculate this efficiently for CRFs• Mean field approximation• Emphasize error instances (related to Boosting)• Newton's method to set
[McCallum, 2003, UAI]
![Page 34: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/34.jpg)
Named Entity Recognition
CRICKET - MILLNS SIGNS FOR BOLAND
CAPE TOWN 1996-08-22
South African provincial side Boland said on Thursday they had signed Leicestershire fast bowler David Millns on a one year contract. Millns, who toured Australia with England A in 1992, replaces former England all-rounder Phillip DeFreitas as Boland's overseas professional.
Labels: Examples:
PER Yayuk BasukiInnocent Butare
ORG 3MKDPCleveland
LOC ClevelandNirmal HridayThe Oval
MISC JavaBasque1,000 Lakes Rally
![Page 35: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/35.jpg)
Automatically Induced Features
Index Feature
0 inside-noun-phrase (ot-1)
5 stopword (ot)
20 capitalized (ot+1)
75 word=the (ot)
100 in-person-lexicon (ot-1)
200 word=in (ot+2)
500 word=Republic (ot+1)
711 word=RBI (ot) & header=BASEBALL
1027 header=CRICKET (ot) & in-English-county-lexicon (ot)
1298 company-suffix-word (firstmentiont+2)
4040 location (ot) & POS=NNP (ot) & capitalized (ot) & stopword (ot-1)
4945 moderately-rare-first-name (ot-1) & very-common-last-name (ot)
4474 word=the (ot-2) & word=of (ot)
[McCallum & Li, 2003, CoNLL]
![Page 36: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/36.jpg)
Named Entity Extraction Results
Method F1
HMMs BBN's Identifinder 73%
CRFs w/out Feature Induction 83%
CRFs with Feature Induction 90%based on LikelihoodGain
[McCallum & Li, 2003, CoNLL]
![Page 37: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/37.jpg)
Outline
• Examples of IE and Data Mining.
• Conditional Random Fields and Feature Induction.
• Joint inference: Motivation and examples
– Joint Labeling of Cascaded Sequences (Belief Propagation)
– Joint Labeling of Distant Entities (BP by Tree Reparameterization)
– Joint Co-reference Resolution (Graph Partitioning)
– Joint Segmentation and Co-ref (Iterated Conditional Samples.)
• Two example projects
– Email, contact address book, and Social Network Analysis
– Research Paper search and analysis
![Page 38: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/38.jpg)
Problem:
Combined in serial juxtaposition,IE and KD are unaware of each others’ weaknesses and opportunities.
1) KD begins from a populated DB, unaware of where the data came from, or its inherent uncertainties.
2) IE is unaware of emerging patterns and regularities in the DB.
The accuracy of both suffers, and significant mining of complex text sources is beyond reach.
SegmentClassifyAssociateCluster
IE
Documentcollection
Database
Discover patterns - entity types - links / relations - events
KnowledgeDiscovery
Actionableknowledge
![Page 39: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/39.jpg)
SegmentClassifyAssociateCluster
Filter
Prediction Outlier detection Decision support
IE
Documentcollection
Database
Discover patterns - entity types - links / relations - events
DataMining
Spider
Actionableknowledge
Uncertainty Info
Emerging Patterns
Solution:
![Page 40: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/40.jpg)
SegmentClassifyAssociateCluster
Filter
Prediction Outlier detection Decision support
IE
Documentcollection
ProbabilisticModel
Discover patterns - entity types - links / relations - events
DataMining
Spider
Actionableknowledge
Solution:
Conditional Random Fields [Lafferty, McCallum, Pereira]
Conditional PRMs [Koller…], [Jensen…], [Geetor…], [Domingos…]
Discriminatively-trained undirected graphical models
Complex Inference and LearningJust what we researchers like to sink our teeth into!
Unified Model
![Page 41: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/41.jpg)
1. Jointly labeling cascaded sequencesFactorial CRFs
Part-of-speech
Noun-phrase boundaries
Named-entity tag
English words
[Sutton, Khashayar, McCallum, ICML 2004]
![Page 42: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/42.jpg)
1. Jointly labeling cascaded sequencesFactorial CRFs
Part-of-speech
Noun-phrase boundaries
Named-entity tag
English words
[Sutton, Khashayar, McCallum, ICML 2004]
![Page 43: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/43.jpg)
1. Jointly labeling cascaded sequencesFactorial CRFs
Part-of-speech
Noun-phrase boundaries
Named-entity tag
English words
[Sutton, Khashayar, McCallum, ICML 2004]
But errors cascade--must be perfect at every stage to do well.
![Page 44: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/44.jpg)
1. Jointly labeling cascaded sequencesFactorial CRFs
Part-of-speech
Noun-phrase boundaries
Named-entity tag
English words
[Sutton, Khashayar, McCallum, ICML 2004]
Joint prediction of part-of-speech and noun-phrase in newswire,matching accuracy with only 50% of the training data.
Inference:Tree reparameterization BP
[Wainwright et al, 2002]
![Page 45: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/45.jpg)
2. Jointly labeling distant mentionsSkip-chain CRFs
Senator Joe Green said today … . Green ran for …
…
[Sutton, McCallum, SRL 2004]
Dependency among similar, distant mentions ignored.
![Page 46: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/46.jpg)
2. Jointly labeling distant mentionsSkip-chain CRFs
Senator Joe Green said today … . Green ran for …
…
[Sutton, McCallum, SRL 2004]
14% reduction in error on most repeated field in email seminar announcements.
Inference:Tree reparameterization BP
[Wainwright et al, 2002]
![Page 47: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/47.jpg)
3. Joint co-reference among all pairsAffinity Matrix CRF
. . . Mr Powell . . .
. . . Powell . . .
. . . she . . .
45
99Y/N
Y/N
Y/N
11
[McCallum, Wellner, IJCAI WS 2003, NIPS 2004]
~25% reduction in error on co-reference ofproper nouns in newswire.
Inference:Correlational clusteringgraph partitioning
[Bansal, Blum, Chawla, 2002]
“Entity resolution”“Object correspondence”
![Page 48: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/48.jpg)
Coreference Resolution
Input
AKA "record linkage", "database record deduplication", "entity resolution", "object correspondence", "identity uncertainty"
Output
News article, with named-entity "mentions" tagged
Number of entities, N = 3
#1 Secretary of State Colin Powell he Mr. Powell Powell
#2 Condoleezza Rice she Rice
#3 President Bush Bush
Today Secretary of State Colin Powell met with . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . he . . . . . .. . . . . . . . . . . . . Condoleezza Rice . . . . .. . . . Mr Powell . . . . . . . . . .she . . . . . . . . . . . . . . . . . . . . . Powell . . . . . . . . . . . .. . . President Bush . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . Rice . . . . . . . . . . . . . . . . Bush . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
![Page 49: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/49.jpg)
Inside the Traditional Solution
Mention (3) Mention (4)
. . . Mr Powell . . . . . . Powell . . .
N Two words in common 29Y One word in common 13Y "Normalized" mentions are string identical 39Y Capitalized word in common 17Y > 50% character tri-gram overlap 19N < 25% character tri-gram overlap -34Y In same sentence 9Y Within two sentences 8N Further than 3 sentences apart -1Y "Hobbs Distance" < 3 11N Number of entities in between two mentions = 0 12N Number of entities in between two mentions > 4 -3Y Font matches 1Y Default -19
OVERALL SCORE = 98 > threshold=0
Pair-wise Affinity Metric
Y/N?
![Page 50: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/50.jpg)
The Problem
. . . Mr Powell . . .
. . . Powell . . .
. . . she . . .
affinity = 98
affinity = 11
affinity = 104
Pair-wise mergingdecisions are beingmade independentlyfrom each otherY
Y
N
Affinity measures are noisy and imperfect.
They should be madein relational dependencewith each other.
![Page 51: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/51.jpg)
A Markov Random Field for Co-reference
. . . Mr Powell . . .
. . . Powell . . .
. . . she . . .
45
30Y/N
Y/N
Y/N
[McCallum & Wellner, 2003, ICML](MRF)
Make pair-wise mergingdecisions in dependent relation to each other by- calculating a joint prob.- including all edge weights- adding dependence on consistent triangles.
11
€
P(v y |
v x ) =
1
Z v x
exp λ l f l (x i, x j , y ij ) + λ ' f '(y ij , y jk,y ik )i, j,k
∑l
∑i, j
∑ ⎛
⎝ ⎜ ⎜
⎞
⎠ ⎟ ⎟
![Page 52: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/52.jpg)
A Markov Random Field for Co-reference
. . . Mr Powell . . .
. . . Powell . . .
. . . she . . .
45
30Y/N
Y/N
Y/N
[McCallum & Wellner, 2003](MRF)
Make pair-wise mergingdecisions in dependent relation to each other by- calculating a joint prob.- including all edge weights- adding dependence on consistent triangles.
11 ∞−
€
P(v y |
v x ) =
1
Z v x
exp λ l f l (x i, x j , y ij ) + λ ' f '(y ij , y jk,y ik )i, j,k
∑l
∑i, j
∑ ⎛
⎝ ⎜ ⎜
⎞
⎠ ⎟ ⎟
![Page 53: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/53.jpg)
A Markov Random Field for Co-reference
. . . Mr Powell . . .
. . . Powell . . .
. . . she . . .
Y
Y
N
[McCallum & Wellner, 2003]
€
P(v y |
v x ) =
1
Z v x
exp λ l f l (x i, x j , y ij ) + λ ' f '(y ij , y jk,y ik )i, j,k
∑l
∑i, j
∑ ⎛
⎝ ⎜ ⎜
⎞
⎠ ⎟ ⎟
(MRF)
infinity
45)
30)
(11)
![Page 54: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/54.jpg)
A Markov Random Field for Co-reference
. . . Mr Powell . . .
. . . Powell . . .
. . . she . . .
N
Y
N
[McCallum & Wellner, 2003]
€
P(v y |
v x ) =
1
Z v x
exp λ l f l (x i, x j , y ij ) + λ ' f '(y ij , y jk,y ik )i, j,k
∑l
∑i, j
∑ ⎛
⎝ ⎜ ⎜
⎞
⎠ ⎟ ⎟
(MRF)
64
45)
30)
(11)
![Page 55: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/55.jpg)
Inference in these MRFs = Graph Partitioning[Boykov, Vekler, Zabih, 1999], [Kolmogorov & Zabih, 2002], [Yu, Cross, Shi, 2002]
. . . Mr Powell . . .
. . . Powell . . .
. . . she . . .
45
11
30
. . . Condoleezza Rice . . .
134
10
€
log P(v y |
v x )( )∝ λ l f l (x i,x j , y ij )
l
∑i, j
∑ = w ij
i, j w/inparitions
∑ − w ij
i, j acrossparitions
∑
106
![Page 56: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/56.jpg)
Inference in these MRFs = Graph Partitioning[Boykov, Vekler, Zabih, 1999], [Kolmogorov & Zabih, 2002], [Yu, Cross, Shi, 2002]
. . . Mr Powell . . .
. . . Powell . . .
. . . she . . . . . . Condoleezza Rice . . .
= 22
45
11
30
134
10
106
€
log P(v y |
v x )( )∝ λ l f l (x i,x j , y ij )
l
∑i, j
∑ = w ij
i, j w/inparitions
∑ − w ij
i, j acrossparitions
∑
![Page 57: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/57.jpg)
Inference in these MRFs = Graph Partitioning[Boykov, Vekler, Zabih, 1999], [Kolmogorov & Zabih, 2002], [Yu, Cross, Shi, 2002]
. . . Mr Powell . . .
. . . Powell . . .
. . . she . . . . . . Condoleezza Rice . . .
€
log P(v y |
v x )( )∝ λ l f l (x i, x j , y ij )
l
∑i, j
∑ = w ij
i, j w/inparitions
∑ + w'iji, j acrossparitions
∑ = 314
45
11
30
134
10
106
![Page 58: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/58.jpg)
Co-reference Experimental Results
Proper noun co-reference
DARPA ACE broadcast news transcripts, 117 stories
Partition F1 Pair F1Single-link threshold 16 % 18 %Best prev match [Morton] 83 % 89 %MRFs 88 % 92 %
error=30% error=28%
DARPA MUC-6 newswire article corpus, 30 stories
Partition F1 Pair F1Single-link threshold 11% 7 %Best prev match [Morton] 70 % 76 %MRFs 74 % 80 %
error=13% error=17%
[McCallum & Wellner, 2003]
![Page 59: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/59.jpg)
Joint co-reference among all pairsAffinity Matrix CRF
. . . Mr Powell . . .
. . . Powell . . .
. . . she . . .
45
99Y/N
Y/N
Y/N
11
[McCallum, Wellner, IJCAI WS 2003, NIPS 2004]
~25% reduction in error on co-reference ofproper nouns in newswire.
Inference:Correlational clusteringgraph partitioning
[Bansal, Blum, Chawla, 2002]
![Page 60: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/60.jpg)
p
Databasefield values
c
4. Joint segmentation and co-reference
o
s
o
s
c
c
s
o
Citation attributes
y y
y
Segmentation
[Wellner, McCallum, Peng, Hay, UAI 2004]Inference:Variant of Iterated Conditional Modes
Co-reference decisions
Laurel, B. Interface Agents: Metaphors with Character, in The Art of Human-Computer Interface Design, B. Laurel (ed), Addison-Wesley, 1990.
Brenda Laurel. Interface Agents: Metaphors with Character, in Laurel, The Art of Human-Computer Interface Design, 355-366, 1990.
[Besag, 1986]
World Knowledge
35% reduction in co-reference error by using segmentation uncertainty.
6-14% reduction in segmentation error by using co-reference.
Extraction from and matching of research paper citations.
see also [Marthi, Milch, Russell, 2003]
![Page 61: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/61.jpg)
Joint IE and Coreference from Research Paper Citations
Textual citation mentions(noisy, with duplicates)
Paper database, with fields,clean, duplicates collapsed
QuickTime™ and aTIFF (LZW) decompressor
are needed to see this picture.
AUTHORS TITLE VENUECowell, Dawid… Probab… SpringerMontemerlo, Thrun…FastSLAM… AAAI…Kjaerulff Approxi… Technic…
4. Joint segmentation and co-reference
![Page 62: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/62.jpg)
Laurel, B. Interface Agents: Metaphors with Character , in
The Art of Human-Computer Interface Design , T. Smith (ed) ,
Addison-Wesley , 1990 .
Brenda Laurel . Interface Agents: Metaphors with Character , in
Smith , The Art of Human-Computr Interface Design , 355-366 , 1990 .
Citation Segmentation and Coreference
![Page 63: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/63.jpg)
Laurel, B. Interface Agents: Metaphors with Character , in
The Art of Human-Computer Interface Design , T. Smith (ed) ,
Addison-Wesley , 1990 .
Brenda Laurel . Interface Agents: Metaphors with Character , in
Smith , The Art of Human-Computr Interface Design , 355-366 , 1990 .
1) Segment citation fields
Citation Segmentation and Coreference
![Page 64: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/64.jpg)
Laurel, B. Interface Agents: Metaphors with Character , in
The Art of Human-Computer Interface Design , T. Smith (ed) ,
Addison-Wesley , 1990 .
Brenda Laurel . Interface Agents: Metaphors with Character , in
Smith , The Art of Human-Computr Interface Design , 355-366 , 1990 .
1) Segment citation fields
2) Resolve coreferent citations
Citation Segmentation and Coreference
Y?N
![Page 65: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/65.jpg)
Laurel, B. Interface Agents: Metaphors with Character , in
The Art of Human-Computer Interface Design , T. Smith (ed) ,
Addison-Wesley , 1990 .
Brenda Laurel . Interface Agents: Metaphors with Character , in
Smith , The Art of Human-Computr Interface Design , 355-366 , 1990 .
1) Segment citation fields
2) Resolve coreferent citations
3) Form canonical database record
Citation Segmentation and Coreference
AUTHOR = Brenda Laurel TITLE = Interface Agents: Metaphors with CharacterPAGES = 355-366BOOKTITLE = The Art of Human-Computer Interface DesignEDITOR = T. SmithPUBLISHER = Addison-WesleyYEAR = 1990
Y?N
Resolving conflicts
![Page 66: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/66.jpg)
Laurel, B. Interface Agents: Metaphors with Character , in
The Art of Human-Computer Interface Design , T. Smith (ed) ,
Addison-Wesley , 1990 .
Brenda Laurel . Interface Agents: Metaphors with Character , in
Smith , The Art of Human-Computr Interface Design , 355-366 , 1990 .
1) Segment citation fields
2) Resolve coreferent citations
3) Form canonical database record
Citation Segmentation and Coreference
AUTHOR = Brenda Laurel TITLE = Interface Agents: Metaphors with CharacterPAGES = 355-366BOOKTITLE = The Art of Human-Computer Interface DesignEDITOR = T. SmithPUBLISHER = Addison-WesleyYEAR = 1990
Y?N
Perform jointly.
![Page 67: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/67.jpg)
x
s
Observed citation
CRF Segmentation
IE + Coreference Model
J Besag 1986 On the…
AUT AUT YR TITL TITL
![Page 68: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/68.jpg)
x
s
Observed citation
CRF Segmentation
IE + Coreference Model
Citation mention attributes
J Besag 1986 On the…
AUTHOR = “J Besag”YEAR = “1986”TITLE = “On the…”
c
![Page 69: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/69.jpg)
x
s
IE + Coreference Model
c
J Besag 1986 On the…Smyth . 2001 Data Mining…
Smyth , P Data mining…
Structure for each citation mention
![Page 70: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/70.jpg)
x
s
IE + Coreference Model
c
Binary coreference variablesfor each pair of mentions
J Besag 1986 On the…Smyth . 2001 Data Mining…
Smyth , P Data mining…
![Page 71: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/71.jpg)
x
s
IE + Coreference Model
c
y n
n
J Besag 1986 On the…Smyth . 2001 Data Mining…
Smyth , P Data mining…
Binary coreference variablesfor each pair of mentions
![Page 72: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/72.jpg)
y n
n
x
s
IE + Coreference Model
c
J Besag 1986 On the…Smyth . 2001 Data Mining…
Smyth , P Data mining…
Research paper entity attribute nodes
AUTHOR = “P Smyth”YEAR = “2001”TITLE = “Data Mining…”...
![Page 73: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/73.jpg)
p
Databasefield values
c
4. Joint segmentation and co-reference
o
s
o
s
c
c
s
o
Citation attributes
y y
y
Segmentation
[Wellner, McCallum, Peng, Hay, UAI 2004]
Inference:Variant of Iterated Conditional Modes
Co-reference decisions
Laurel, B. Interface Agents: Metaphors with Character, in The Art of Human-Computer Interface Design, B. Laurel (ed), Addison-Wesley, 1990.
Brenda Laurel. Interface Agents: Metaphors with Character, in Laurel, The Art of Human-Computer Interface Design, 355-366, 1990.
[Besag, 1986]
World Knowledge
35% reduction in co-reference error by using segmentation uncertainty.
6-14% reduction in segmentation error by using co-reference.
Extraction from and matching of research paper citations.
![Page 74: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/74.jpg)
Outline
• Examples of IE and Data Mining.
• Conditional Random Fields and Feature Induction.
• Joint inference: Motivation and examples
– Joint Labeling of Cascaded Sequences (Belief Propagation)
– Joint Labeling of Distant Entities (BP by Tree Reparameterization)
– Joint Co-reference Resolution (Graph Partitioning)
– Joint Segmentation and Co-ref (Iterated Conditional Samples.)
• Two example projects
– Email, contact address book, and Social Network Analysis
– Research Paper search and analysis
![Page 75: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/75.jpg)
Workplace effectiveness ~ Ability to leverage network of acquaintances
But filling Contacts DB by hand is tedious, and incomplete.
QuickTime™ and aTIFF (Uncompressed) decompressor
are needed to see this picture.
QuickTime™ and aTIFF (Uncompressed) decompressor
are needed to see this picture.
Email Inbox Contacts DB
QuickTime™ and aTIFF (Uncompressed) decompressorare needed to see this picture.
WWW
Automatically
Managing and UnderstandingConnections of People in our Email World
![Page 76: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/76.jpg)
System Overview
ContactInfo andPerson Name
Extraction
Person Name
Extraction
NameCoreference
HomepageRetrieval
Social NetworkAnalysis
KeywordExtraction
CRFWWW
names
Email QuickTime™ and aTIFF (Uncompressed) decompressorare needed to see this picture.
![Page 77: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/77.jpg)
An ExampleTo: “Andrew McCallum” [email protected]
Subject ...
First Name:
Andrew
Middle Name:
Kachites
Last Name:
McCallum
JobTitle: Associate Professor
Company: University of Massachusetts
Street Address:
140 Governor’s Dr.
City: Amherst
State: MA
Zip: 01003
Company Phone:
(413) 545-1323
Links: Fernando Pereira, Sam Roweis,…
Key Words:
Information extraction,
social network,…
Search for new people
![Page 78: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/78.jpg)
Summary of Results
Token
Acc
Field
Prec
Field
Recall
Field
F1
CRF 94.50 85.73 76.33 80.76
Person Keywords
William Cohen Logic programming
Text categorization
Data integration
Rule learning
Daphne Koller Bayesian networks
Relational models
Probabilistic models
Hidden variables
Deborah McGuiness
Semantic web
Description logics
Knowledge representation
Ontologies
Tom Mitchell Machine learning
Cognitive states
Learning apprentice
Artificial intelligence
Contact info and name extraction performance (25 fields)
Example keywords extracted
1. Expert Finding: When solving some task, find friends-of-friends with relevant expertise. Avoid “stove-piping” in large org’s by automatically suggesting collaborators. Given a task, automatically suggest the right team for the job. (Hiring aid!)
2. Social Network Analysis: Understand the social structure of your organization. Suggest structural changes for improved efficiency.
QuickTime™ and aTIFF (LZW) decompressor
are needed to see this picture.
![Page 79: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/79.jpg)
Clustering words into topics withLatent Dirichlet Allocation
[Blei, Ng, Jordan 2003]
![Page 80: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/80.jpg)
STORYSTORIES
TELLCHARACTER
CHARACTERSAUTHOR
READTOLD
SETTINGTALESPLOT
TELLINGSHORT
FICTIONACTION
TRUEEVENTSTELLSTALE
NOVEL
MINDWORLDDREAM
DREAMSTHOUGHT
IMAGINATIONMOMENT
THOUGHTSOWNREALLIFE
IMAGINESENSE
CONSCIOUSNESSSTRANGEFEELINGWHOLEBEINGMIGHTHOPE
WATERFISHSEA
SWIMSWIMMING
POOLLIKE
SHELLSHARKTANK
SHELLSSHARKSDIVING
DOLPHINSSWAMLONGSEALDIVE
DOLPHINUNDERWATER
DISEASEBACTERIADISEASES
GERMSFEVERCAUSE
CAUSEDSPREADVIRUSES
INFECTIONVIRUS
MICROORGANISMSPERSON
INFECTIOUSCOMMONCAUSING
SMALLPOXBODY
INFECTIONSCERTAIN
Example topicsinduced from a large collection of text
FIELDMAGNETIC
MAGNETWIRE
NEEDLECURRENT
COILPOLESIRON
COMPASSLINESCORE
ELECTRICDIRECTION
FORCEMAGNETS
BEMAGNETISM
POLEINDUCED
SCIENCESTUDY
SCIENTISTSSCIENTIFIC
KNOWLEDGEWORK
RESEARCHCHEMISTRY
TECHNOLOGYMANY
MATHEMATICSBIOLOGY
FIELDPHYSICS
LABORATORYSTUDIESWORLD
SCIENTISTSTUDYINGSCIENCES
BALLGAMETEAM
FOOTBALLBASEBALLPLAYERS
PLAYFIELD
PLAYERBASKETBALL
COACHPLAYEDPLAYING
HITTENNISTEAMSGAMESSPORTS
BATTERRY
JOBWORKJOBS
CAREEREXPERIENCE
EMPLOYMENTOPPORTUNITIES
WORKINGTRAINING
SKILLSCAREERS
POSITIONSFIND
POSITIONFIELD
OCCUPATIONSREQUIRE
OPPORTUNITYEARNABLE
[Tennenbaum et al]
![Page 81: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/81.jpg)
STORYSTORIES
TELLCHARACTER
CHARACTERSAUTHOR
READTOLD
SETTINGTALESPLOT
TELLINGSHORT
FICTIONACTION
TRUEEVENTSTELLSTALE
NOVEL
MINDWORLDDREAM
DREAMSTHOUGHT
IMAGINATIONMOMENT
THOUGHTSOWNREALLIFE
IMAGINESENSE
CONSCIOUSNESSSTRANGEFEELINGWHOLEBEINGMIGHTHOPE
WATERFISHSEA
SWIMSWIMMING
POOLLIKE
SHELLSHARKTANK
SHELLSSHARKSDIVING
DOLPHINSSWAMLONGSEALDIVE
DOLPHINUNDERWATER
DISEASEBACTERIADISEASES
GERMSFEVERCAUSE
CAUSEDSPREADVIRUSES
INFECTIONVIRUS
MICROORGANISMSPERSON
INFECTIOUSCOMMONCAUSING
SMALLPOXBODY
INFECTIONSCERTAIN
FIELDMAGNETIC
MAGNETWIRE
NEEDLECURRENT
COILPOLESIRON
COMPASSLINESCORE
ELECTRICDIRECTION
FORCEMAGNETS
BEMAGNETISM
POLEINDUCED
SCIENCESTUDY
SCIENTISTSSCIENTIFIC
KNOWLEDGEWORK
RESEARCHCHEMISTRY
TECHNOLOGYMANY
MATHEMATICSBIOLOGY
FIELDPHYSICS
LABORATORYSTUDIESWORLD
SCIENTISTSTUDYINGSCIENCES
BALLGAMETEAM
FOOTBALLBASEBALLPLAYERS
PLAYFIELD
PLAYERBASKETBALL
COACHPLAYEDPLAYING
HITTENNISTEAMSGAMESSPORTS
BATTERRY
JOBWORKJOBS
CAREEREXPERIENCE
EMPLOYMENTOPPORTUNITIES
WORKINGTRAINING
SKILLSCAREERS
POSITIONSFIND
POSITIONFIELD
OCCUPATIONSREQUIRE
OPPORTUNITYEARNABLE
Example topicsinduced from a large collection of text
[Tennenbaum et al]
![Page 82: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/82.jpg)
From LDA to Author-Recipient-Topic(ART)
![Page 83: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/83.jpg)
Inference and Estimation
Gibbs Sampling:- Easy to implement- Reasonably fast
r
![Page 84: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/84.jpg)
Enron Email Corpus
• 250k email messages• 23k people
Date: Wed, 11 Apr 2001 06:56:00 -0700 (PDT)From: [email protected]: [email protected]: Enron/TransAltaContract dated Jan 1, 2001
Please see below. Katalin Kiss of TransAlta has requested an electronic copy of our final draft? Are you OK with this? If so, the only version I have is the original draft without revisions.
DP
Debra PerlingiereEnron North America Corp.Legal Department1400 Smith Street, EB 3885Houston, Texas [email protected]
![Page 85: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/85.jpg)
Topics, and prominent sender/receiversdiscovered by ART
![Page 86: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/86.jpg)
Topics, and prominent sender/receiversdiscovered by ART
Beck = “Chief Operations Officer”Dasovich = “Government Relations Executive”Shapiro = “Vice Presidence of Regulatory Affairs”Steffes = “Vice President of Government Affairs”
![Page 87: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/87.jpg)
Comparing Role Discovery
connection strength (A,B) =
distribution overauthored topics
Traditional SNA
distribution overrecipients
distribution overauthored topics
Author-TopicART
![Page 88: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/88.jpg)
Comparing Role Discovery Tracy Geaconne Dan McCarty
Traditional SNA Author-TopicART
Similar roles Different rolesDifferent roles
Geaconne = “Secretary”McCarty = “Vice President”
![Page 89: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/89.jpg)
Traditional SNA Author-TopicART
Different roles Very similarNot very similar
Geaconne = “Secretary”Hayslett = “Vice President & CTO”
Comparing Role Discovery Tracy Geaconne Rod Hayslett
![Page 90: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/90.jpg)
Traditional SNA Author-TopicART
Different roles Very differentVery similar
Blair = “Gas pipeline logistics”Watson = “Pipeline facilities planning”
Comparing Role Discovery Lynn Blair Kimberly Watson
![Page 91: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/91.jpg)
Traditional SNA Author-TopicART
Block structured NotNot
Comparing Group Discovery Enron TransWestern Division
![Page 92: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/92.jpg)
McCallum Email Corpus 2004
• January - October 2004• 23k email messages• 825 people
From: [email protected]: NIPS and ....Date: June 14, 2004 2:27:41 PM EDTTo: [email protected]
There is pertinent stuff on the first yellow folder that is completed either travel or other things, so please sign that first folder anyway. Then, here is the reminder of the things I'm still waiting for:
NIPS registration receipt.CALO registration receipt.
Thanks,Kate
![Page 93: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/93.jpg)
McCallum Email Blockstructure
![Page 94: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/94.jpg)
Four most prominent topicsin discussions with ____?
![Page 95: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/95.jpg)
![Page 96: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/96.jpg)
Two most prominent topicsin discussions with ____?
Words Problove 0.030514house 0.015402
0.013659time 0.012351great 0.011334hope 0.011043dinner 0.00959saturday 0.009154left 0.009154ll 0.009009
0.008282visit 0.008137evening 0.008137stay 0.007847bring 0.007701weekend 0.007411road 0.00712sunday 0.006829kids 0.006539flight 0.006539
Words Probtoday 0.051152tomorrow 0.045393time 0.041289ll 0.039145meeting 0.033877week 0.025484talk 0.024626meet 0.023279morning 0.022789monday 0.020767back 0.019358call 0.016418free 0.015621home 0.013967won 0.013783day 0.01311hope 0.012987leave 0.012987office 0.012742tuesday 0.012558
![Page 97: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/97.jpg)
Topic 37Words Prob Sender Recipient Probjean 0.032349 jean mccallum 0.111932james 0.02684 mccallum jean 0.072785lab 0.017229 jean jean 0.056024space 0.016292 mccallum allan 0.054266ciir 0.015471 jean allan 0.035162students 0.015354 allan mccallum 0.027778bruce 0.014885 mccallum croft 0.025668office 0.014768 mccallum grupen 0.021917staff 0.013244 mccallum stowell 0.01887funding 0.012072 mccallum [email protected] 0.011017 grupen mccallum 0.018167adam 0.010666 gwking mccallum 0.016057move 0.010666 [email protected] 0.014299grad 0.01008 mccallum saunders 0.014299ll 0.009962 mccallum jensen 0.007267rod 0.009494 mccallum culotta 0.006915time 0.008556 gauthier mccallum 0.006564asked 0.008556 grupen grupen 0.006446student 0.008556 corrada mccallum 0.006212today 0.008322 mccallum pal 0.005626
![Page 98: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/98.jpg)
Topic 40
Words Prob Sender Recipient Probcode 0.060565 hough mccallum 0.067076mallet 0.042015 mccallum hough 0.041032files 0.029115 mikem mccallum 0.028501al 0.024201 culotta mccallum 0.026044file 0.023587 saunders mccallum 0.021376version 0.022113 mccallum saunders 0.019656java 0.021499 pereira mccallum 0.017813test 0.020025 casutton mccallum 0.017199problem 0.018305 mccallum ronb 0.013514run 0.015356 mccallum pereira 0.013145cvs 0.013391 hough melinda.gervasio 0.013022add 0.012776 mccallum casutton 0.013022directory 0.012408 fuchun mccallum 0.010811release 0.012285 mccallum culotta 0.009828output 0.011916 ronb mccallum 0.009705bug 0.011179 westy hough 0.009214source 0.010197 xuerui corrada 0.0086ps 0.009705 ghuang mccallum 0.008354log 0.008968 khash mccallum 0.008231created 0.0086 melinda.gervasiomccallum 0.008108
![Page 99: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/99.jpg)
![Page 100: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/100.jpg)
Pairs with highestrank difference between ART & SNA
5 other professors3 other ML researchers
![Page 101: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/101.jpg)
Role-Author-Recipient-Topic Models
![Page 102: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/102.jpg)
Main Application Project:
![Page 103: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/103.jpg)
Main Application Project:
ResearchPaper
Cites
![Page 104: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/104.jpg)
Main Application Project:
ResearchPaper
Cites
Person
UniversityVenue
Grant
Groups
Expertise
![Page 105: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/105.jpg)
![Page 106: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/106.jpg)
Summary
• Conditional Random Fields combine the benefits of– Conditional probability models (arbitrary features)– Markov models (for sequences or other relations)
• Success in– Factorial finite state models– Jointly labeling distant entities– Coreference analysis– Segmentation uncertainty aiding coreference
• Current projects– Email, contact management, expert-finding, SNA– Mining the scientific literature
![Page 107: Information Extraction, Conditional Random Fields, and Social Network Analysis](https://reader031.vdocuments.mx/reader031/viewer/2022013115/56813763550346895d9ef1b1/html5/thumbnails/107.jpg)
End of Talk