![Page 1: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/1.jpg)
Knowledge-Based Trust: Estimating the Trustworthiness of Web Sources
Xin Luna Dong, Evgeniy Gabrilovich, Kevin Murphy, Van Dang, Wilko Horn, Camillo Lugaresi, Shaohua Sun, Wei Zhang
Google Inc.
@VLDB’2015
![Page 2: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/2.jpg)
Motivation for Knowledge-Based Trust (KBT)
● Providing a new perspective to evaluate Web source quality
● What we have now: exogenous signals
  ○ Link-based
  ○ Search log and click-through rate
  ○ Web spam
● Key idea: Evaluate the trustworthiness of a source by the correctness of its factual information (endogenous signals)
![Page 3: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/3.jpg)
Correctness of Factual Information
Fact 1 ✓
Fact 2 ✓
Fact 3 ✘
Fact 4 ✓
Fact 5 ✘
Fact 6 ✓
Fact 7 ✓
Fact 8 ✓
Fact 9 ✓
Fact 10 ✘
... ...
Accu 0.7
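The accuracy figure on this slide is the fraction of facts marked correct. A minimal sketch, assuming the ten marks shown are the whole sample (the names `fact_is_correct` and `source_accuracy` are illustrative and do not come from the talk):

```python
# Correctness marks for Fact 1..Fact 10 as shown on the slide (True = ✓, False = ✘).
fact_is_correct = [True, True, False, True, False, True, True, True, True, False]


def source_accuracy(marks):
    """Return the fraction of a source's facts that are marked correct."""
    return sum(marks) / len(marks)


accuracy = source_accuracy(fact_is_correct)
print(accuracy)  # 0.7
```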
![Page 4: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/4.jpg)
How Can Trustworthiness Help?
![Page 5: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/5.jpg)
Knowledge-Based Trust (KBT)
Trustworthiness in [0,1] for 5.6M websites and 119M webpages
![Page 6: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/6.jpg)
Knowledge-Based Trust vs. PageRank
Correlated scores
Often tail sources w. high trustworthiness
![Page 7: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/7.jpg)
I. Tail Sources w. Low PageRank May Provide Valuable Info
Among 100 sampled websites, 85 are indeed trustworthy.
![Page 8: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/8.jpg)
Knowledge-Based Trust vs. PageRank
Often tail sources w. high trustworthiness
Correlated scores
Often sources w. low accuracy
![Page 9: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/9.jpg)
II. Popular Websites May Not Be Trustworthy
http://www.ebizmba.com/articles/gossip-websites
Gossip Websites
Domain
www.eonline.com
perezhilton.com
radaronline.com
www.zimbio.com
mediatakeout.com
gawker.com
www.popsugar.com
www.people.com
www.tmz.com
www.fishwrapper.com
celebrity.yahoo.com
wonderwall.msn.com
hollywoodlife.com
www.wetpaint.com
14 out of 15 have a PageRank among top 15% of the websites
All have knowledge-based trust in bottom 50%
![Page 10: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/10.jpg)
II. Popular Websites May Not Be Trustworthy
![Page 11: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/11.jpg)
III. Website Recommendation by Vertical
![Page 12: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/12.jpg)
III. Website Recommendation by Vertical
![Page 13: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/13.jpg)
Now, How to Compute KBT?
![Page 14: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/14.jpg)
Key Idea in KBT
Fact 1 ✓
Fact 2 ✓
Fact 3 ✘
Fact 4 ✓
Fact 5 ✘
Fact 6 ✓
Fact 7 ✓
Fact 8 ✓
Fact 9 ✓
Fact 10 ✘
... ...
Accu 0.7
![Page 15: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/15.jpg)
Knowledge Vault: Probabilistic Knowledge Fusion
#Triples: 3.0B (0.3B w. pr>=0.7)
#URLs: 2.5B (28M websites)
#Extractors: 16
[SIGKDD, 2014] [VLDB, 2014]
![Page 16: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/16.jpg)
KV Makes This Possible
Fact 1 ✓
Fact 2 ✓
Fact 3 ✘
Fact 4 ✓
Fact 5 ✘
Fact 6 ✓
Fact 7 ✓
Fact 8 ✓
Fact 9 ✓
Fact 10 ✘
... ...
Accu 0.7
![Page 17: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/17.jpg)
KV Makes This Possible
Triple 1 1.0
Triple 2 0.9
Triple 3 0.3
Triple 4 0.8
Triple 5 0.4
Triple 6 0.8
Triple 7 0.9
Triple 8 1.0
Triple 9 0.7
Triple 10 0.2
... ...
Accu 0.7
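With probabilistic triples, the 0.7 on this slide can be read as the mean correctness probability instead of a count of check marks. A small sketch under that reading, treating the ten listed probabilities as the full sample (function and variable names are illustrative):

```python
# Correctness probabilities for Triple 1..Triple 10, copied from the slide.
triple_prob = [1.0, 0.9, 0.3, 0.8, 0.4, 0.8, 0.9, 1.0, 0.7, 0.2]


def expected_accuracy(probs):
    """Return the mean correctness probability over a source's triples."""
    return sum(probs) / len(probs)


print(round(expected_accuracy(triple_prob), 2))  # 0.7
```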
![Page 18: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/18.jpg)
Challenges
Triple 1 1.0
Triple 2 0.9
Triple 3 0.3
Triple 4 0.8
Triple 5 0.4
Triple 6 0.8
Triple 7 0.9
Triple 8 1.0
Triple 9 0.7
Triple 10 0.2
... ...
Accu 0.7
How to decide if a triple is indeed claimed by the source instead of an extraction error?
![Page 19: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/19.jpg)
Extractions Can Be Wrong
● (Obama, nationality, Kenya): 2087 extractions
  ○ Example of a correct extraction: http://beforeitsnews.com/obama-birthplace-controversy/2013/04/alabama-supreme-court-chief-justice-roy-moore-to-preside-over-obama-eligibility-case-2458624.html
  ○ Example of a wrong extraction: http://www.monitor.co.ug/News/National/US+will+respect+winner+of+Kenya+election++Obama+says/-/688334/1685814/-/ksxagx/-/index.html
![Page 20: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/20.jpg)
Extractions Can Be Wrong
● (Obama, nationality, USA): 2481 extractions
  ○ Example of a correct extraction: http://www.dogonews.com/2009/10/9/a-nobel-prize-for-our-awesome-president
  ○ Example of a wrong extraction: http://blogs.telegraph.co.uk/news/timstanley/100169248/barack-obamas-life-story-contains-myth-not-truth-says-biographer-so-why-did-the-media-report-it-as-truth/
![Page 21: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/21.jpg)
KBT Strategies
1. Graphical model: predict at the same time
   a. extraction correctness
   b. triple correctness
   c. source accuracy
   d. extractor precision/recall
2. Un(Semi-)supervised learning (Bayesian)
   a. leverage source/extractor agreements
   b. trust a source/extractor w. high quality
3. Source/extractor hierarchy
   a. Break down “large” sources
   b. Group “small” sources
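The third strategy (break down large sources, group small ones) can be illustrated with a toy granularity rule. This is only a sketch under assumed thresholds: `min_triples`, `max_triples`, and the function `assign_sources` are hypothetical, and the slides do not specify how the hierarchy is actually traversed.

```python
from collections import Counter


def assign_sources(page_counts, min_triples=5, max_triples=1000):
    """Map each (website, page) to the source unit its triples are pooled into.

    page_counts: dict mapping (website, page) -> number of extracted triples.
    Thresholds are hypothetical values chosen for illustration only.
    """
    site_totals = Counter()
    for (site, _page), n in page_counts.items():
        site_totals[site] += n

    assignment = {}
    for (site, page), n in page_counts.items():
        if site_totals[site] > max_triples and n >= min_triples:
            # "Large" website: break it down, this page is its own source.
            assignment[(site, page)] = f"{site}/{page}"
        else:
            # "Small" page or website: group at the website level.
            assignment[(site, page)] = site
    return assignment


page_counts = {
    ("a.com", "p1"): 900,
    ("a.com", "p2"): 200,
    ("a.com", "p3"): 2,
    ("b.org", "p1"): 3,
}
print(assign_sources(page_counts))
```

Pages with too few triples fall back to their website, so their accuracy estimate is shared with sibling pages instead of resting on a handful of observations.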
![Page 22: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/22.jpg)
Graphical Model
Observations
● X_ewdv: whether extractor e extracts from source w the (d,v) item-value pair
Latent variables
● C_wdv: whether source w indeed provides the (d,v) pair
● V_d: the correct value(s) for d
Parameters
● A_w: Trust of source w
● P_e: Precision of extractor e
● R_e: Recall of extractor e
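To make the role of the observations and extractor parameters concrete, here is a naive-Bayes sketch of the posterior that a source really provides a pair, Pr(C_wdv = 1 | X). It assumes extractors act independently and that each has a known false-positive rate; that rate (`false_pos`) and the `prior` are assumptions of this sketch, standing in for what extractor precision would imply, and are not quantities named on the slide.

```python
def extraction_posterior(extracted, recall, false_pos, prior=0.5):
    """Return Pr(C_wdv = 1 | X) for one (source, item, value) pair.

    extracted: dict extractor -> bool, the observation X_ewdv.
    recall:    dict extractor -> R_e = Pr(X = 1 | C = 1).
    false_pos: dict extractor -> assumed Pr(X = 1 | C = 0).
    prior:     assumed prior Pr(C = 1).
    """
    like_true, like_false = prior, 1.0 - prior
    for e, x in extracted.items():
        like_true *= recall[e] if x else 1.0 - recall[e]
        like_false *= false_pos[e] if x else 1.0 - false_pos[e]
    return like_true / (like_true + like_false)


recall = {"e1": 0.8, "e2": 0.8}
false_pos = {"e1": 0.1, "e2": 0.1}
both = extraction_posterior({"e1": True, "e2": True}, recall, false_pos)
neither = extraction_posterior({"e1": False, "e2": False}, recall, false_pos)
print(round(both, 3), round(neither, 3))
```

Agreement between extractors pushes the posterior up, while an extraction made by a single low-quality extractor is discounted.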
![Page 23: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/23.jpg)
Algorithm
E-Step
● Compute Pr(W provides T | Extractor quality) by Bayesian analysis
● Compute Pr(T | Source quality) by Bayesian analysis
M-Step
● Compute source accuracy
● Compute extractor precision and recall
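The alternation between the two steps can be sketched with a much simpler single-layer model that ignores extractor noise entirely: the E-step scores candidate values from current source accuracies, and the M-step re-estimates each accuracy from those scores. This is a simplification for illustration, not the multi-layer model of the talk; `n_false` (the assumed number of wrong values per item), `init_accuracy`, and all function names are assumptions.

```python
import math
from collections import defaultdict


def em_source_accuracy(claims, n_false=10, init_accuracy=0.8, iterations=5):
    """claims: list of (source, item, value) tuples, one value per source per item."""
    sources = {s for s, _, _ in claims}
    accuracy = {s: init_accuracy for s in sources}
    prob = {}
    for _ in range(iterations):
        # E-step: accumulate log-odds votes for each claimed value.
        votes = defaultdict(lambda: defaultdict(float))
        for s, d, v in claims:
            a = min(max(accuracy[s], 0.01), 0.99)  # keep the log-odds finite
            votes[d][v] += math.log(n_false * a / (1.0 - a))
        prob = {}
        for d, by_value in votes.items():
            # Values nobody claimed each contribute exp(0) = 1 to the normalizer.
            unseen = max(n_false + 1 - len(by_value), 0)
            z = sum(math.exp(vc) for vc in by_value.values()) + unseen
            for v, vc in by_value.items():
                prob[(d, v)] = math.exp(vc) / z
        # M-step: accuracy is the mean probability of the values a source claims.
        totals, counts = defaultdict(float), defaultdict(int)
        for s, d, v in claims:
            totals[s] += prob[(d, v)]
            counts[s] += 1
        accuracy = {s: totals[s] / counts[s] for s in sources}
    return accuracy, prob


claims = [(s, d, "x") for s in ("A", "B") for d in ("d1", "d2", "d3")]
claims += [("C", d, "y") for d in ("d1", "d2", "d3")]
accuracy, prob = em_source_accuracy(claims)
print({s: round(a, 3) for s, a in sorted(accuracy.items())})
```

In this toy input, sources A and B agree on every item and C always disagrees, so the loop drives A and B toward high accuracy and C toward low accuracy.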
![Page 24: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/24.jpg)
Web Source Trustworthiness
Fact 1 ✓
Fact 2 ✓
Fact 3 ✘
Fact 4 ✓
Fact 5 ✘
Fact 6 ✓
Fact 7 ✓
Fact 8 ✓
Fact 9 ✓
Fact 10 ✘
... ...
Accu 0.7

Triple | ExtractionCorr | TripleCorr
Triple 1 | 1.0 | 1.0
Triple 2 | 1.0 | 0.9
Triple 3 | 1.0 | 0.3
Triple 4 | 1.0 | 0.8
Triple 5 | 0.9 | 0.4
Triple 6 | 0.9 | 0.8
Triple 7 | 0.8 | 0.9
Triple 8 | 0.2 | 1.0
Triple 9 | 0.1 | 0.7
Triple 10 | 0.1 | 0.2
... | ... | ...
Accu 0.73
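One reading that reproduces the 0.73 on this slide is an average of triple correctness weighted by extraction correctness, so that triples the source probably never stated count for less. The sketch below assumes the two columns pair up row by row in the order listed; that pairing and the function name are assumptions.

```python
# Per-triple probabilities from the slide, paired by row order (an assumption).
extraction_corr = [1.0, 1.0, 1.0, 1.0, 0.9, 0.9, 0.8, 0.2, 0.1, 0.1]
triple_corr = [1.0, 0.9, 0.3, 0.8, 0.4, 0.8, 0.9, 1.0, 0.7, 0.2]


def weighted_accuracy(ext_probs, triple_probs):
    """Average triple correctness, weighted by extraction correctness."""
    weighted = sum(c * v for c, v in zip(ext_probs, triple_probs))
    return weighted / sum(ext_probs)


kbt = weighted_accuracy(extraction_corr, triple_corr)
print(round(kbt, 2))  # 0.73
```

Here the weighted sum is 5.09 and the total weight is 7.0, giving about 0.727.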
![Page 25: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/25.jpg)
Predicting Extraction and Triple Correctness
● (Obama, nationality, Kenya): 2087 extractions
  ○ Example of a correct extraction (Pr_extCorr=0.792): http://beforeitsnews.com/obama-birthplace-controversy/2013/04/alabama-supreme-court-chief-justice-roy-moore-to-preside-over-obama-eligibility-case-2458624.html
  ○ Example of a wrong extraction (Pr_extCorr=0.130): http://www.monitor.co.ug/News/National/US+will+respect+winner+of+Kenya+election++Obama+says/-/688334/1685814/-/ksxagx/-/index.html
● Pr_tripleCorr=0 (not enough support)
![Page 26: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/26.jpg)
Predicting Extraction and Triple Correctness
● (Obama, nationality, USA): 2481 extractions
  ○ Example of a correct extraction (Pr_extCorr=0.999): http://www.dogonews.com/2009/10/9/a-nobel-prize-for-our-awesome-president
  ○ Example of a wrong extraction (Pr_extCorr=0.261): http://blogs.telegraph.co.uk/news/timstanley/100169248/barack-obamas-life-story-contains-myth-not-truth-says-biographer-so-why-did-the-media-report-it-as-truth/
● Pr_tripleCorr=1 (higher support)
![Page 27: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/27.jpg)
Predicting Extraction and Triple Correctness
Distribution of providers for Kenya and USA
![Page 28: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/28.jpg)
Predicting Extraction and Triple Correctness
![Page 29: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/29.jpg)
Predicting Triple Correctness
![Page 30: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/30.jpg)
What is the Future of KBT?
![Page 31: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/31.jpg)
Future Work
1. Extraction is still very sparse
   a. 74% of URLs each contribute fewer than 5 triples
   b. We compute reliable KBT for <20% of websites and <<5% of webpages
2. Extraction is of low quality
   a. Overall accuracy is as low as 11.5%
   b. Low accuracy for some good sources because of undetected extraction errors
Call to arms: Leave NO Valuable Data Behind
![Page 32: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/32.jpg)
Press Coverage of the Paper
![Page 33: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/33.jpg)
KBT Anecdote (Emails Dated 3/2015)
... I read with interest your recent paper on KBT … Actually, that’s false – I tried to read it, and did read all of the parts that weren’t numbers and Greek characters. It is quite an interesting proposal, though.
I’m writing because XXX published a piece claiming that YYY would be injured under a ranking system that took KBT into account they got that from footnote 16 in your paper ...
I’m writing with a simple request: Can you provide me with the XXX’s KBT score and percentile ranking, and how it compares to YYY’s? …
![Page 34: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/34.jpg)
https://www.washingtonpost.com/news/the-intersect/wp/2015/03/02/google-has-developed-a-technology-to-tell-whether-facts-on-the-internet-are-true/
![Page 35: Sources the Trustworthiness of Web Knowledge-Based Trust ...Motivation for Knowledge-Based Trust (KBT) Providing a new perspective to evaluate Web source quality What we have now--Exogenous](https://reader034.vdocuments.mx/reader034/viewer/2022050302/5f6b546ae9d9332c8639011d/html5/thumbnails/35.jpg)
THANK YOU!