coupled bayesian sets algorithm for semi-supervised learning and information extraction

Post on 24-Feb-2016

40 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

DESCRIPTION

Coupled Bayesian Sets Algorithm for Semi-supervised Learning and Information Extraction. Saurabh Verma IIT (BHU), Varanasi Estevam R. Hruschka Jr. F ederal University of São Carlos, Brazil. http:// rtw.ml.cmu.edu. NELL: Never-Ending Language Learner. Inputs: initial ontology - PowerPoint PPT Presentation

TRANSCRIPT

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Coupled Bayesian Sets Algorithm for Semi-supervised Learning and

Information Extraction

Saurabh VermaIIT (BHU), Varanasi

Estevam R. Hruschka Jr.Federal University of São Carlos, Brazil

ECML/PKDD2012 Bristol, UK September, 26th, 2012

http://rtw.ml.cmu.edu

ECML/PKDD2012 Bristol, UK September, 26th, 2012

NELL: Never-Ending Language Learner

Inputs: initial ontology handful of examples of each predicate in ontology the web occasional interaction with human trainers

The task: run 24x7, forever• each day:

1. extract more facts from the web to populate the initial ontology

2. learn to read (perform #1) better than yesterday

ECML/PKDD2012 Bristol, UK September, 26th, 2012

NELL: Never-Ending Language Learner

Goal:• run 24x7, forever• each day:

1. extract more facts from the web to populate given ontology

2. learn to read better than yesterday

Today...Running 24 x 7, since January, 2010Input: • ontology defining ~800 categories and relations • 10-20 seed examples of each • 1 billion web pages (ClueWeb – Jamie Callan)

Result: • continuously growing KB with +1,300,000 extracted beliefs

ECML/PKDD2012 Bristol, UK September, 26th, 2012

NELL: Never-Ending Language Learner

http://rtw.ml.cmu.edu

ECML/PKDD2012 Bristol, UK September, 26th, 2012

The Problem with Semi-Supervised Bootstrap Learning

ParisPittsburghSeattleCupertino

ECML/PKDD2012 Bristol, UK September, 26th, 2012

The Problem with Semi-Supervised Bootstrap Learning

ParisPittsburghSeattleCupertino

… mayor of arg1… live in arg1

ECML/PKDD2012 Bristol, UK September, 26th, 2012

The Problem with Semi-Supervised Bootstrap Learning

ParisPittsburghSeattleCupertino

… mayor of arg1… live in arg1

San FranciscoAustindenial

ECML/PKDD2012 Bristol, UK September, 26th, 2012

The Problem with Semi-Supervised Bootstrap Learning

ParisPittsburghSeattleCupertino

… mayor of arg1… live in arg1

San FranciscoAustindenial

ECML/PKDD2012 Bristol, UK September, 26th, 2012

The Problem with Semi-Supervised Bootstrap Learning

ParisPittsburghSeattleCupertino

… mayor of arg1… live in arg1

San FranciscoAustindenial

arg1 is home of …… traits such as arg1

ECML/PKDD2012 Bristol, UK September, 26th, 2012

The Problem with Semi-Supervised Bootstrap Learning

ParisPittsburghSeattleCupertino

… mayor of arg1… live in arg1

San FranciscoAustindenial

arg1 is home of …… traits such as arg1

ECML/PKDD2012 Bristol, UK September, 26th, 2012

The Problem with Semi-Supervised Bootstrap Learning

ParisPittsburghSeattleCupertino

… mayor of arg1… live in arg1

San FranciscoAustindenial

arg1 is home of …… traits such as arg1

ECML/PKDD2012 Bristol, UK September, 26th, 2012

The Problem with Semi-Supervised Bootstrap Learning

ParisPittsburghSeattleCupertino

… mayor of arg1… live in arg1

San FranciscoAustindenial

arg1 is home of …… traits such as arg1

it’s underconstrained!!

Bayesian Sets (BS)

Given and , rank the elements of by how well they would “fit into” a set which includes

Define a score for each :

From Bayes rule, the score can be re-written as:

}{xD DDc D

cD

)()(

)(x

xx

pDp

score c

Dx

)()(),()(c

c

DppDpscore

xxx

Ghahramani & Heller; NIPS 2005

Bayesian Sets (BS)

Intuitively, the score compares the probability that x and Dc were generated by the same model with the same unknown parameters θ, to the probability that x and Dc came from models with different parameters θ and θ’.

Ghahramani & Heller; NIPS 2005

)()(),()(c

c

DppDpscore

xxx

Bayesian Sets (BS)

Intuitively, the score compares the probability that x and Dc were generated by the same model with the same unknown parameters θ, to the probability that x and Dc came from models with different parameters θ and θ’.

Ghahramani & Heller; NIPS 2005

)()(),()(c

c

DppDpscore

xxx

Bayesian Sets (BS)

Intuitively, the score compares the probability that x and Dc were generated by the same model with the same unknown parameters θ, to the probability that x and Dc came from models with different parameters θ and θ’.

Ghahramani & Heller; NIPS 2005

)()(),()(c

c

DppDpscore

xxx

Bayesian Sets (BS)

Intuitively, the score compares the probability that x and Dc were generated by the same model with the same unknown parameters θ, to the probability that x and Dc came from models with different parameters θ and θ’.

Ghahramani & Heller; NIPS 2005

)()(),()(c

c

DppDpscore

xxx

ECML/PKDD2012 Bristol, UK September, 26th, 2012

BS using NELL’s Ontology

Initial ontology: Everything

PersonCompany VegetableSport

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Initial ontology: Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahoo

Peter FlachBill ClintonJeremy Lin

AdeleBarak Obama

BasketballFootball

SwimmingTennisGolf

BristolPittsburgh

Rio de JaneiroTokyo

Cape Town

BS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus, run BS onceEverything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahoo

Peter FlachBill ClintonJeremy Lin

AdeleBarak Obama

BasketballFootball

SwimmingTennisGolf

BristolPittsburgh

Rio de JaneiroTokyo

Cape Town

BS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus, run BS onceEverything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahooAT&T

BoeingBrazil Telecom

TexacoFacebook

DELL…

Peter FlachBill ClintonJeremy Lin

AdeleBarak ObamaDalai Lama

FreudTom Mitchell

AristotleAlan Turing

Alexander Fleming…

BasketballFootball

SwimmingTennisGolf

SoccerVolleyballJogging

MarathonBaseball

Badminton…

BristolPittsburgh

Rio de JaneiroTokyo

Cape TownNew YorkLondon

Sao PauloBrisbaneBeijingCairo

BS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

BS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

BS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

BS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus, iteratively run BSEverything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahooAT&T

Boeing

Peter FlachBill ClintonJeremy Lin

AdeleBarak ObamaDalai Lama

Freud

BasketballFootball

SwimmingTennisGolf

SoccerVolleyball

BristolPittsburgh

Rio de JaneiroTokyo

Cape TownNew YorkLondon

Iterative BS using NELL’s OntologyZhang & Liu, 2011

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus, iteratively run BSEverything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahooAT&T

BoeingBrazil Telecom

Texaco

Peter FlachBill ClintonJeremy Lin

AdeleBarak ObamaDalai Lama

FreudTom Mitchell

Aristotle

BasketballFootball

SwimmingTennisGolf

SoccerVolleyballJogging

Marathon

BristolPittsburgh

Rio de JaneiroTokyo

Cape TownNew YorkLondon

Sao PauloBrisbane

Iterative BS using NELL’s OntologyZhang & Liu, 2011

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus, iteratively run BSEverything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahooAT&T

BoeingBrazil Telecom

TexacoFacebook

DELL…

Peter FlachBill ClintonJeremy Lin

AdeleBarak ObamaDalai Lama

FreudTom Mitchell

AristotleAlan Turing

Alexander Fleming…

BasketballFootball

SwimmingTennisGolf

SoccerVolleyballJogging

MarathonBaseball

Badminton…

BristolPittsburgh

Rio de JaneiroTokyo

Cape TownNew YorkLondon

Sao PauloBrisbaneBeijingCairo

Iterative BS using NELL’s OntologyZhang & Liu, 2011

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Iterative BS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Iterative BS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

NELL: Coupled semi-supervised training of many functions

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Coupled Training Type 2:Structured Outputs, Multitask, Posterior

Regularization, Multilabel

Learn functions with the same input, different outputs, where we know some constraint

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Coupled Training Type 2:Structured Outputs, Multitask, Posterior

Regularization, Multilabel

Learn functions with the same input, different outputs, where we know some constraint

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Coupled Training Type 2:Structured Outputs, Multitask, Posterior

Regularization, Multilabel

Learn functions with the same input, different outputs, where we know some constraint

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Coupled Bayesian Sets (CBS)

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Coupled Bayesian Sets (CBS)

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Coupled Bayesian Sets (CBS)

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Coupled Bayesian Sets (CBS)

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Coupled Bayesian Sets (CBS)

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Coupled Bayesian Sets (CBS)

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahoo

Peter FlachBill ClintonJeremy Lin

AdeleBarak Obama

BasketballFootball

SwimmingTennisGolf

BristolPittsburgh

Rio de JaneiroTokyo

Cape Town

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahoo

Peter FlachBill ClintonJeremy Lin

AdeleBarak Obama

BasketballFootball

SwimmingTennisGolf

BristolPittsburgh

Rio de JaneiroTokyo

Cape Town

CBS using NELL’s Ontology

MutuallyExclusive(Company,Person);MutuallyExclusive(Company,Sport);MutuallyExclusive(Company,City);MutuallyExclusive(Pearson,Sport);…

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahoo

Peter FlachBill ClintonJeremy Lin

AdeleBarak Obama

BasketballFootball

SwimmingTennisGolf

BristolPittsburgh

Rio de JaneiroTokyo

Cape Town

CBS using NELL’s Ontology

MutuallyExclusive(Company,Person);MutuallyExclusive(Company,Sport);MutuallyExclusive(Company,City);MutuallyExclusive(Pearson,Sport);…

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahoo

Peter FlachBill ClintonJeremy Lin

AdeleBarak Obama

BasketballFootball

SwimmingTennisGolf

BristolPittsburgh

Rio de JaneiroTokyo

Cape Town

CBS using NELL’s Ontology

MutuallyExclusive(Company,Person);MutuallyExclusive(Company,Sport);MutuallyExclusive(Company,City);MutuallyExclusive(Pearson,Sport);…

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahoo

Peter FlachBill ClintonJeremy Lin

AdeleBarak Obama

BasketballFootball

SwimmingTennisGolf

BristolPittsburgh

Rio de JaneiroTokyo

Cape Town

CBS using NELL’s Ontology

MutuallyExclusive(Company,Person);MutuallyExclusive(Company,Sport);MutuallyExclusive(Company,City);MutuallyExclusive(Pearson,Sport);…

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahoo

Peter FlachBill ClintonJeremy Lin

AdeleBarak Obama

BasketballFootball

SwimmingTennisGolf

BristolPittsburgh

Rio de JaneiroTokyo

Cape Town

CBS using NELL’s Ontology

MutuallyExclusive(Company,Person);MutuallyExclusive(Company,Sport);MutuallyExclusive(Company,City);MutuallyExclusive(Pearson,Sport);…

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahoo

Peter FlachBill ClintonJeremy Lin

AdeleBarak Obama

BasketballFootball

SwimmingTennisGolf

BristolPittsburgh

Rio de JaneiroTokyo

Cape Town

CBS using NELL’s Ontology

MutuallyExclusive(Company,Person);MutuallyExclusive(Company,Sport);MutuallyExclusive(Company,City);MutuallyExclusive(Pearson,Sport);…

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahoo

Peter FlachBill ClintonJeremy Lin

AdeleBarak Obama

BasketballFootball

SwimmingTennisGolf

BristolPittsburgh

Rio de JaneiroTokyo

Cape Town

CBS using NELL’s Ontology

MutuallyExclusive(Company,Person);MutuallyExclusive(Company,Sport);MutuallyExclusive(Company,City);MutuallyExclusive(Pearson,Sport);…

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahoo

Peter FlachBill ClintonJeremy Lin

AdeleBarak Obama

BasketballFootball

SwimmingTennisGolf

BristolPittsburgh

Rio de JaneiroTokyo

Cape Town

CBS using NELL’s Ontology

MutuallyExclusive(Company,Person);MutuallyExclusive(Company,Sport);MutuallyExclusive(Company,City);MutuallyExclusive(Pearson,Sport);…

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahooAT&T

Boeing

Peter FlachBill ClintonJeremy Lin

AdeleBarak Obama

BasketballFootball

SwimmingTennisGolf

BristolPittsburgh

Rio de JaneiroTokyo

Cape Town

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahooAT&T

Boeing

Peter FlachBill ClintonJeremy Lin

AdeleBarak Obama

BasketballFootball

SwimmingTennisGolf

BristolPittsburgh

Rio de JaneiroTokyo

Cape Town

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahooAT&T

Boeing

Peter FlachBill ClintonJeremy Lin

AdeleBarak Obama

…AT&T

Boeing

BasketballFootball

SwimmingTennisGolf

BristolPittsburgh

Rio de JaneiroTokyo

Cape Town

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahooAT&T

Boeing

Peter FlachBill ClintonJeremy Lin

AdeleBarak Obama

…AT&T

Boeing

BasketballFootball

SwimmingTennisGolf…

AT&TBoeing

BristolPittsburgh

Rio de JaneiroTokyo

Cape Town

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahooAT&T

Boeing

Peter FlachBill ClintonJeremy Lin

AdeleBarak Obama

…AT&T

Boeing

BasketballFootball

SwimmingTennisGolf…

AT&TBoeing

BristolPittsburgh

Rio de JaneiroTokyo

Cape Town…

AT&TBoeing

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahooAT&T

Boeing

Peter FlachBill ClintonJeremy Lin

AdeleBarak Obama

BasketballFootball

SwimmingTennisGolf

BristolPittsburgh

Rio de JaneiroTokyo

Cape Town

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahooAT&T

BoeingBrazil Telecom

Texaco

Peter FlachBill ClintonJeremy Lin

AdeleBarak Obama

BasketballFootball

SwimmingTennisGolf

BristolPittsburgh

Rio de JaneiroTokyo

Cape Town

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahooAT&T

BoeingBrazil Telecom

Texaco

Peter FlachBill ClintonJeremy Lin

AdeleBarak Obama

BasketballFootball

SwimmingTennisGolf

BristolPittsburgh

Rio de JaneiroTokyo

Cape Town

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahooAT&T

BoeingBrazil Telecom

Texaco

Peter FlachBill ClintonJeremy Lin

AdeleBarak Obama

…Brazil Telecom

Texaco

BasketballFootball

SwimmingTennisGolf

BristolPittsburgh

Rio de JaneiroTokyo

Cape Town

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahooAT&T

BoeingBrazil Telecom

Texaco

Peter FlachBill ClintonJeremy Lin

AdeleBarak Obama

…Brazil Telecom

Texaco

BasketballFootball

SwimmingTennisGolf…

Brazil TelecomTexaco

BristolPittsburgh

Rio de JaneiroTokyo

Cape Town

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahooAT&T

BoeingBrazil Telecom

Texaco

Peter FlachBill ClintonJeremy Lin

AdeleBarak Obama

…Brazil Telecom

Texaco

BasketballFootball

SwimmingTennisGolf…

Brazil TelecomTexaco

BristolPittsburgh

Rio de JaneiroTokyo

Cape Town…

Brazil TelecomTexaco

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahooAT&T

BoeingBrazil Telecom

Texaco

Peter FlachBill ClintonJeremy Lin

AdeleBarak Obama

BasketballFootball

SwimmingTennisGolf

BristolPittsburgh

Rio de JaneiroTokyo

Cape Town

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s Ontology

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s OntologyWhat if we do not have the mutual exclusiveness constraints?

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahooAT&T

BoeingBrazil Telecom

Texaco…

Great BritainKeyboard

Pencil

Peter FlachBill ClintonJeremy Lin

AdeleBarak ObamaDalai Lama

FreudTom Mitchell

Aristotle…

BasketballFootball

SwimmingTennisGolf

SoccerVolleyballJogging

Marathon…

BristolPittsburgh

Rio de JaneiroTokyo

Cape TownNew YorkLondon

Sao PauloBrisbane

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s OntologyWhat if we do not have the mutual exclusiveness constraints?

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahooAT&T

BoeingBrazil Telecom

Texaco…

Great BritainKeyboard

Pencil

Peter FlachBill ClintonJeremy Lin

AdeleBarak ObamaDalai Lama

FreudTom Mitchell

Aristotle…

BasketballFootball

SwimmingTennisGolf

SoccerVolleyballJogging

Marathon…

BristolPittsburgh

Rio de JaneiroTokyo

Cape TownNew YorkLondon

Sao PauloBrisbane

Not Company

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s OntologyWhat if we do not have the mutual exclusiveness constraints?

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahooAT&T

BoeingBrazil Telecom

Texaco…

Great BritainKeyboard

Pencil

Peter FlachBill ClintonJeremy Lin

AdeleBarak ObamaDalai Lama

FreudTom Mitchell

Aristotle…

BasketballFootball

SwimmingTennisGolf

SoccerVolleyballJogging

Marathon…

BristolPittsburgh

Rio de JaneiroTokyo

Cape TownNew YorkLondon

Sao PauloBrisbane

Not Company

Great BritainKeyboard

Pencil

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s OntologyWhat if we do not have the mutual exclusiveness constraints?

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahooAT&T

BoeingBrazil Telecom

Texaco…

Great BritainKeyboard

Pencil

Peter FlachBill ClintonJeremy Lin

AdeleBarak ObamaDalai Lama

FreudTom Mitchell

Aristotle…

BasketballFootball

SwimmingTennisGolf

SoccerVolleyballJogging

Marathon…

BristolPittsburgh

Rio de JaneiroTokyo

Cape TownNew YorkLondon

Sao PauloBrisbane

Not Company

Great BritainKeyboard

Pencil

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s OntologyWhat if we do not have the mutual exclusiveness constraints?

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahooAT&T

BoeingBrazil Telecom

Texaco…

Great BritainKeyboard

Pencil

Peter FlachBill ClintonJeremy Lin

AdeleBarak ObamaDalai Lama

FreudTom Mitchell

Aristotle…

BasketballFootball

SwimmingTennisGolf

SoccerVolleyballJogging

Marathon…

BristolPittsburgh

Rio de JaneiroTokyo

Cape TownNew YorkLondon

Sao PauloBrisbane

Not Company

Great BritainKeyboard

Pencil

MutuallyExclusive(Company,NotCompany);

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s OntologyWhat if we do not have the mutual exclusiveness constraints?

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahooAT&T

BoeingBrazil Telecom

Texaco…

Great BritainKeyboard

Pencil

Peter FlachBill ClintonJeremy Lin

AdeleBarak ObamaDalai Lama

FreudTom Mitchell

Aristotle…

BasketballFootball

SwimmingTennisGolf

SoccerVolleyballJogging

Marathon…

BristolPittsburgh

Rio de JaneiroTokyo

Cape TownNew YorkLondon

Sao PauloBrisbane

Not Company

Great BritainKeyboard

Pencil

MutuallyExclusive(Company,NotCompany);

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s OntologyWhat if we do not have the mutual exclusiveness constraints?

Everything

PersonCompany CitySport

AppleMicrosoftGoogle

IBMYahooAT&T

BoeingBrazil Telecom

Texaco…

Great BritainKeyboard

Pencil

Peter FlachBill ClintonJeremy Lin

AdeleBarak ObamaDalai Lama

FreudTom Mitchell

Aristotle…

BasketballFootball

SwimmingTennisGolf

SoccerVolleyballJogging

Marathon…

BristolPittsburgh

Rio de JaneiroTokyo

Cape TownNew YorkLondon

Sao PauloBrisbane

Not Company

Great BritainKeyboard

Pencil

MutuallyExclusive(Company,NotCompany);

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s Ontology

What if we do not have the mutual exclusiveness constraints?

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s Ontology

What if we do not have the mutual exclusiveness constraints?

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s Ontology

What if we do not have the mutual exclusiveness constraints?

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s Ontology

What if we do not have the mutual exclusiveness constraints?

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s Ontology

What if we do not have the mutual exclusiveness constraints?

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s Ontology

What if we do not have the mutual exclusiveness constraints?

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s Ontology

What if we do not have the mutual exclusiveness constraints?

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s Ontology

What about Semantic Relations?

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s Ontology

What about Semantic Relations?

ECML/PKDD2012 Bristol, UK September, 26th, 2012

CBS using NELL’s Ontology

What about Semantic Relations?

ECML/PKDD2012 Bristol, UK September, 26th, 2012

ConclusionsCoupled Bayesian Sets

• semi-supervised learning approach to extract category instances (e.g. country(USA), city(New York) from web pages;

• based on the original Bayesian Sets • can outperform algorithms such as the original Bayesian

Set, the Naive Bayes classifier, the Bas-all and the coupled semi-supervised logistic regression algorithm (CPL);

• can be used to automatically generate new constraints to the set expansion task even when no mutually exclusiveness relationship is previously defined

ECML/PKDD2012 Bristol, UK September, 26th, 2012

AcknowledgementsThanks to:

ECML/PKDD2012 audience! Also Thanks to:

• Department of Science and Technology, Government of India under Indo-Brazil cooperation programme;

• Brazilian research agencies CAPES and CNPq;

• Zoubin Ghahramani and K.A. Heller;

• All the Read The Web group;

contact: estevam.hruschka@gmail.comhttp://rtw.ml.cmu.edu

top related