Semantic Web Munich Meetup Fall 2012 - Web Information Extraction to Bootstrap Semantic E-commerce
Semantic Web Meetup Munich 10/12 - Google Munich
PhD Topic: Web Information Extraction to bootstrap Semantic E-commerce
Uwe Stoll
Uwe Stoll, Universität der Bundeswehr München
Motivation & Problem
✴ GoodRelations web vocabulary for e-commerce is great!
✴ Rich snippets on SERPs
✴ Browser plugins
✴ Applications on Dataspaces
✴ already, ~10k shops use it
✴ but there are ~500k shops
pie chart: GR :) 2%, no GR :( 98%
Why is it still low?
✴ Deployment happens shop by shop, no central switch to turn it on
✴ some expertise needed
✴ Incentive limited mostly to SEO benefits
What can we do?
✴ Automate the problem
✴ Web information extraction (WIE) FTW!
✴ Web information extraction is a way to get structured data automatically out of web sites
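
To make "structured data out of web sites" concrete, here is a minimal sketch of an extractor using only the Python standard library. The CSS class names (`product-title`, `product-price`) and the sample page are assumptions made up for illustration; the talk does not prescribe any particular extraction technique.

```python
from html.parser import HTMLParser


class ProductExtractor(HTMLParser):
    """Collects text from elements whose class attribute marks a product field.

    The class names in FIELD_CLASSES are hypothetical; a real shop template
    would use its own names.
    """

    FIELD_CLASSES = {"product-title": "name", "product-price": "price"}

    def __init__(self):
        super().__init__()
        self.record = {}
        self._current_field = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        classes = (dict(attrs).get("class") or "").split()
        for css_class in classes:
            if css_class in self.FIELD_CLASSES:
                self._current_field = self.FIELD_CLASSES[css_class]

    def handle_data(self, data):
        # Store the first non-empty text that follows a matching start tag.
        if self._current_field and data.strip():
            self.record[self._current_field] = data.strip()
            self._current_field = None

    def handle_endtag(self, tag):
        self._current_field = None


def extract_product(html):
    """Return a dict of the product fields found in the given HTML string."""
    parser = ProductExtractor()
    parser.feed(html)
    parser.close()
    return parser.record


page = """
<div class="item">
  <h1 class="product-title">Espresso Machine X1</h1>
  <span class="product-price">199.00 EUR</span>
</div>
"""
print(extract_product(page))
```

A hand-written rule set like this only works for one page template, which is why the granularity of WIE depends on how well the extractor fits each shop.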
image: coursera
Pros & Cons of shop deployment and WIE for Semantic E-commerce
                    Shop deployment   Web Information Extraction
main need           incentive         computing power
market coverage     relatively low    potentially high
granularity         high              potentially low
publishing SW data  decentralized     basically centralized
How to improve granularity
✴ build specific extractors that recognize shop software systems
✴ use existing markup as a learning set for WIE
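
The first idea above, recognizing the shop software and then applying an extractor written for it, could be sketched roughly as follows. The shop system names (`ExampleShop`, `DemoCart`), their fingerprints, and their page templates are entirely hypothetical and do not describe real products.

```python
import re

# Hypothetical fingerprints: names and patterns are invented for illustration.
FINGERPRINTS = {
    "ExampleShop": re.compile(
        r'<meta\s+name="generator"\s+content="ExampleShop', re.I
    ),
    "DemoCart": re.compile(r"/democart/assets/", re.I),
}


def extract_exampleshop(html):
    """Extractor for the (hypothetical) ExampleShop product template."""
    match = re.search(r'<h1 id="es-title">(.*?)</h1>', html, re.S)
    return {"name": match.group(1).strip()} if match else {}


def extract_democart(html):
    """Extractor for the (hypothetical) DemoCart product template."""
    match = re.search(r'<div class="dc-name">(.*?)</div>', html, re.S)
    return {"name": match.group(1).strip()} if match else {}


EXTRACTORS = {
    "ExampleShop": extract_exampleshop,
    "DemoCart": extract_democart,
}


def detect_shop_system(html):
    """Return the name of the first shop system whose fingerprint matches."""
    for system, pattern in FINGERPRINTS.items():
        if pattern.search(html):
            return system
    return None


def extract(html):
    """Detect the shop system, then run its specific extractor."""
    system = detect_shop_system(html)
    if system is None:
        return None, {}
    return system, EXTRACTORS[system](html)


sample = (
    '<html><head><meta name="generator" content="ExampleShop 2.1"></head>'
    '<body><h1 id="es-title"> Espresso Machine X1 </h1></body></html>'
)
print(extract(sample))
```

Because many shops run the same software, one specific extractor per system can cover many sites at once, which is where the gain in granularity would come from.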
How to improve publishing
✴ build an API for shops to request structured markup for their pages
diagram: Shop, Cache, Semantic E-Commerce Web Information Extraction API, structured markup
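
A rough sketch of what such an API might do on the server side: look up a page in a cache, run extraction on a miss, and return GoodRelations markup as an RDFa snippet. The terms `gr:Offering`, `gr:name`, `gr:hasPriceSpecification`, `gr:UnitPriceSpecification`, `gr:hasCurrency` and `gr:hasCurrencyValue` come from the GoodRelations vocabulary; the function names, the dict-based cache, the `extract` callback and the example URL are assumptions for illustration, and the snippet has not been validated with an RDFa parser.

```python
from html import escape

GR_NAMESPACE = "http://purl.org/goodrelations/v1#"


def render_offering_markup(name, price, currency):
    """Return an RDFa snippet describing one offer with GoodRelations terms."""
    return (
        f'<div xmlns:gr="{GR_NAMESPACE}" typeof="gr:Offering">\n'
        f'  <span property="gr:name">{escape(name)}</span>\n'
        f'  <div rel="gr:hasPriceSpecification">\n'
        f'    <div typeof="gr:UnitPriceSpecification">\n'
        f'      <span property="gr:hasCurrency" content="{escape(currency)}"></span>\n'
        f'      <span property="gr:hasCurrencyValue" content="{price:.2f}"></span>\n'
        f'    </div>\n'
        f'  </div>\n'
        f'</div>'
    )


def markup_for_page(page_url, cache, extract):
    """Serve markup for a shop page, extracting only on a cache miss.

    `extract` is a hypothetical callback that maps a URL to a dict with the
    keys name, price and currency.
    """
    if page_url not in cache:
        product = extract(page_url)
        cache[page_url] = render_offering_markup(**product)
    return cache[page_url]


def fake_extract(page_url):
    # Stand-in for the real extraction step; returns fixed demo data.
    return {"name": "Espresso Machine X1", "price": 199.0, "currency": "EUR"}


cache = {}
print(markup_for_page("http://shop.example/p/1", cache, fake_extract))
```

The shop would embed the returned snippet in its own page, so publishing stays decentralized even though extraction runs centrally.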
Take-away
✴ Exploit Web Information Extraction to grow the GoodRelations slice of the pie!
✴ Let machines work for men again and not the other way round!
pie chart: shop deployment 2%, WIE 4%, no GR :( 94%
Thank you
Uwe Stoll
http://www.semantium.de
twitter: ustoll