linked data marketplaces

v0.6 / Mar 2011 (Linked) Data Marketplaces Marin Dimitrov (Ontotext)

Upload: marin-dimitrov

Post on 07-May-2015

TRANSCRIPT

Page 1: Linked Data Marketplaces

v0.6 / Mar 2011

(Linked) Data Marketplaces

Marin Dimitrov (Ontotext)

Page 2: Linked Data Marketplaces

Contents

• Introduction

• Data Marketplaces

– Factual, InfoChimps, Azure DataMarket, Freebase, Socrata, Kasabi

– DataMarket, Timetric, xIgnite

• Data Marketplaces for Linked Data

#2(Linked) Data Marketplaces Jan 2011

Page 3: Linked Data Marketplaces

INTRODUCTION


Page 4: Linked Data Marketplaces

Definitions

• Data-as-a-Service (DaaS)

– “Like all members of the "as a Service" (XaaS) family, DaaS is based on the concept that the product, data in this case, can be provided on demand to the user regardless of geographic or organizational separation of provider and consumer. Additionally, the emergence of service-oriented architecture (SOA) has rendered the actual platform on which the data resides also irrelevant” (Wikipedia)

• Data Marketplaces

– “Services that make it easy to find data from a range of secondary data sources, then consume the data in a usable and unified format. Several of these services are trying to create marketplaces for data, envisioning that data providers can offer their data sets for sale to data seekers” (DataMarket.com)


Page 5: Linked Data Marketplaces

Data Marketplaces properties

• Proposed classification by Bauereiss & Fensel

1. Data domain

2. Population of content

3. Community management

4. Operating party

5. Pricing models

6. Data exchange

• Some additional differentiating characteristics

– Data model, Data size, Data export

– Branded marketplaces, SLA

– Query languages, Data tools

Page 6: Linked Data Marketplaces

DATA MARKETPLACES


Page 7: Linked Data Marketplaces

Factual

• www.factual.com / @factual


Page 8: Linked Data Marketplaces

Factual (2)

• Data domain

– Travel, finance, sports, autos, movies, music, TV, books, health, food, politics, education, science, arts, …

– High quality local data

• USA, Germany, France, Italy, UK, Japan, Switzerland, Australia, …

• Used by Facebook Places

• Data population

– Crawling the web

– Public data sources

– Community contributions

• Upload XLS/ODS, CSV


Page 9: Linked Data Marketplaces

Factual (3)

• Data model

– tabular

– Taxonomy of 400 categories

• 13 Level 1 categories: Arts, Automotive, Business, Government, …

• Data size – 500,000 datasets

• Company info

– Factual Inc. (USA)

– $27M VC funding so far


Page 10: Linked Data Marketplaces

Factual (4)

• Monetization model

– Pricing model not finalised yet (currently free)

– Pay-per-use pricing (per API call) with subscriptions

• Companies that contribute data will have a fee reduction

• Data access options

– REST API

• Read from table, Add/Write to table, Get schema info

– Web applications

• Read/write raw data from a web page (JavaScript)

• Web widgets for visualising, filtering and sorting data


Page 11: Linked Data Marketplaces

Factual (5)

• Data tools

– AutoClipper – find tables on the web

– PageClipper – extract tabular data from a web page

– FactClipper – find individual facts (query templates)


Page 12: Linked Data Marketplaces

InfoChimps

• www.infochimps.com / @infochimps


Page 13: Linked Data Marketplaces

InfoChimps (2)

• Data domain

– All purpose

• Including data from Freebase, Wikipedia infoboxes, CKAN, Twitter, Data.gov, Data.gov.uk, GeoNames, …

• Data population

– Public datasets

– User submitted datasets

• Data model is dataset specific

• 10,000+ datasets organised in 13 collections


Page 14: Linked Data Marketplaces

InfoChimps (3)

• Company info

– InfoChimps (USA)

– $1.6M VC funding so far

– Acquired DataMarketplace in 12/2010

• Monetization model

– Charge data sellers

• Data sellers choose the price & licensing of their data

• Charge for data storage

• 30% commission for InfoChimps on each sale


Page 15: Linked Data Marketplaces

InfoChimps (4)

• Monetization model (2)

– Charge data buyers

• Baboon – free, 100K API calls / mo

• Brass Monkey – $20/mo, 500K API calls / mo

• Silverback – $250/mo, 2M API calls / mo

• Golden Ape – $4,000/mo, 15M API calls / mo

• Data access options

– REST API

• api.infochimps.com/DATASET/METHOD.json?PARAM=VALUE

– YQL tables
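
The REST URL pattern above can be filled in mechanically. The following is a minimal sketch only: the slide gives just the pattern, so the `http` scheme and the dataset, method and parameter names used here are assumptions for illustration, not documented InfoChimps calls.

```python
from urllib.parse import urlencode

# Assumption: only the pattern api.infochimps.com/DATASET/METHOD.json?PARAM=VALUE
# comes from the slide; the scheme and all names passed in below are hypothetical.
BASE = "http://api.infochimps.com"


def build_call_url(dataset: str, method: str, **params: str) -> str:
    """Fill in the DATASET/METHOD.json?PARAM=VALUE pattern."""
    path = f"{BASE}/{dataset.strip('/')}/{method}.json"
    # Sort parameters so the same call always yields the same URL.
    query = urlencode(sorted(params.items()))
    return f"{path}?{query}" if query else path


url = build_call_url("example/dataset", "search", q="linked data", limit="10")
print(url)
```

Sorting the parameters is a design choice that keeps URLs stable, which helps when calls are metered or cached per URL.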


Page 16: Linked Data Marketplaces

Azure DataMarket

• https://datamarket.azure.com


Page 17: Linked Data Marketplaces

Azure DataMarket (2)

• Data domain

– All purpose, incl. Data.gov, UN data, Wolfram|Alpha, ESRI

• Data population

– Data publishers (need prior approval)

• Data can be stored on SQL Azure, Azure Storage or 3rd party clouds (via Data Access Layers)

• Data model

– Depends on the dataset and the storage, but always presented as OData to consumers

• Data size – 90 datasets


Page 18: Linked Data Marketplaces

Azure DataMarket (3)


(c) Microsoft

Page 19: Linked Data Marketplaces

Azure DataMarket (4)

• Company info

– Microsoft

• Monetization model

– Subscription for data buyers (limited/unlimited API calls)

• Access options

– OData (feeds, queries, updates)

• Data tools

– Service Explorer

– Excel add-in (find, purchase, consume data)

– Integration with SQL Server Reporting Services / Integration Services
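
Since every dataset is exposed as OData, a consumer query is essentially a URL with system query options. A rough sketch, assuming a made-up service root and entity set (the slides do not list real feed addresses); `$filter`, `$orderby` and `$top` are standard OData query options:

```python
from urllib.parse import quote

# Hypothetical service root; real DataMarket feed URLs are not given on the slides.
SERVICE_ROOT = "https://example.invalid/odata/ExampleDataset"


def odata_query_url(entity_set, filter_expr=None, orderby=None, top=None):
    """Build an OData query URL from standard system query options."""
    options = []
    if filter_expr is not None:
        options.append("$filter=" + quote(filter_expr, safe="'()"))
    if orderby is not None:
        options.append("$orderby=" + quote(orderby, safe=""))
    if top is not None:
        options.append(f"$top={int(top)}")
    url = f"{SERVICE_ROOT}/{entity_set}"
    return url + ("?" + "&".join(options) if options else "")


# "Indicators" and "Year" are illustrative names, not a real dataset schema.
print(odata_query_url("Indicators", filter_expr="Year ge 2005", top=10))
```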


Page 20: Linked Data Marketplaces

DataMarket

• www.datamarket.com / @datamarket


Page 21: Linked Data Marketplaces

DataMarket (2)

• Data domain

– Statistical data from 2,000 providers, incl. UN, Eurostat, World Bank, US agencies, BP, FIFA, …

• Data population

– Data aggregation (2,000 data providers)

• Data size

– 13K datasets, 100M time series, 600M facts

• Company info

– DataMarket (Iceland)


Page 22: Linked Data Marketplaces

DataMarket (3)

• Monetization model

– Charge data sellers

• Free datasets – $249/mo; Paid datasets – 25% commission; Branded datasets – $699/mo + commission

– Charge data buyers

• Free – 50 API calls/mo; $99 – 500 API calls/mo; $299 – 10K API calls/mo; $799 – 100K API calls/mo

• Data access

– REST API
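
The buyer tiers and seller commission above lend themselves to a small worked calculation. This sketch assumes the listed buyer prices are monthly (the slide only gives the call volume per month) and uses the 25% commission quoted for paid datasets; the function names are illustrative.

```python
# Buyer plans as listed on the slide: (price in USD, API calls per month).
# Assumption: the price is charged per month.
BUYER_PLANS = [(0, 50), (99, 500), (299, 10_000), (799, 100_000)]
SELLER_COMMISSION = 0.25  # commission on paid datasets, from the slide


def cheapest_plan(calls_per_month):
    """Return the lowest-priced listed plan that covers the call volume."""
    for price, quota in BUYER_PLANS:  # plans are ordered by price
        if calls_per_month <= quota:
            return price, quota
    return None  # volume exceeds every listed plan


def seller_payout(sale_price):
    """Amount left for the seller after the marketplace commission."""
    return sale_price * (1 - SELLER_COMMISSION)


print(cheapest_plan(2_000))
print(seller_payout(100))
```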


Page 23: Linked Data Marketplaces

Socrata

• www.socrata.com / @socrata


Page 24: Linked Data Marketplaces

Socrata (2)

• Data domain

– Business, education, government data

• Data population

– Uploads from data publishers

• Data size

– 13K datasets

• Data model

– tabular


Page 25: Linked Data Marketplaces

Socrata (3)

• Company info

– Socrata (USA)

• Monetization model

– Charge data buyers (“Plans starting at $499 per month”)

• Basic – 100K API calls/mo + 50GB traffic; Plus – 250K API calls/mo + 250GB traffic; Premium – 1M API calls/mo + 1.2TB traffic; Ultimate – 10M API calls/mo + 5TB traffic

• Data access

– REST API (Socrata Open Data API)

– Data export (XLS, CSV, RDF, XML)

– RSS updates


Page 26: Linked Data Marketplaces

Kasabi

• www.kasabi.com / @TeamKasabi


Page 27: Linked Data Marketplaces

Kasabi (2)

• Data domain

– All purpose, incl. DBpedia, GeoNames, BBC Linked Data, …

• Data population

– Public datasets

– User submitted datasets

• Data size

– 55 datasets

• Data model

– RDF


Page 28: Linked Data Marketplaces

Kasabi (3)

• Company info

– Talis (UK)

• Monetization model

– Charge data consumers

– Data hosting is free

• Data access

– SPARQL / Linked Data endpoint

– REST API

– Additional APIs

– PHP & Ruby client libraries
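
Querying a SPARQL endpoint such as the one listed above usually means sending the query text in a `query` parameter, as the SPARQL Protocol defines. The sketch below only prepares such a request and does not send it; the endpoint URL is a placeholder, and any API key or authentication Kasabi may require is not covered.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder endpoint; the slides do not give Kasabi's endpoint addresses.
ENDPOINT = "https://example.invalid/sparql"

QUERY = """
SELECT ?s ?label WHERE {
  ?s <http://www.w3.org/2000/01/rdf-schema#label> ?label .
} LIMIT 5
"""


def sparql_get_request(endpoint, query):
    """Prepare (but do not send) a SPARQL Protocol query-via-GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


req = sparql_get_request(ENDPOINT, QUERY)
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would execute the query against a real endpoint.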


Page 29: Linked Data Marketplaces

Freebase

• www.freebase.com / @fbase


Page 30: Linked Data Marketplaces

Freebase (2)

• Data domain

– General purpose

• Data model

– Graph (RDF dumps available)

• Data population

– Community curated data (licensed as CC-BY)

– Import of public data sources (Wikipedia, MusicBrainz, WordNet, LoC, …)

• Data size

– 20M entities


Page 31: Linked Data Marketplaces

Freebase (3)

• Company info

– Metaweb (USA), now Google

• Monetization model

– Free for 100K read API calls per day (10K write)

– Paid for higher volumes

• Data access

– REST API

– Linked Data endpoint (http://rdf.freebase.com)

– Triple uploader / RDF dumps

– Acre (application hosting platform)
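
The free tier quoted above (100K reads and 10K writes per day) can be modelled as a simple daily counter. This is an illustrative sketch, not Freebase's actual metering; the class and method names are invented.

```python
from collections import Counter

# Free daily limits from the slide; calls above them are treated as billable.
FREE_DAILY_LIMITS = {"read": 100_000, "write": 10_000}


class DailyQuota:
    """Counts API calls for one day and reports how many exceed the free tier."""

    def __init__(self, limits=FREE_DAILY_LIMITS):
        self.limits = dict(limits)
        self.used = Counter()

    def record(self, kind, calls=1):
        if kind not in self.limits:
            raise ValueError(f"unknown call type: {kind}")
        self.used[kind] += calls

    def billable(self, kind):
        return max(0, self.used[kind] - self.limits[kind])


quota = DailyQuota()
quota.record("read", 120_000)
quota.record("write", 5_000)
print(quota.billable("read"), quota.billable("write"))
```

Over a 30-day month the read limit adds up to 100,000 × 30 = 3,000,000 free calls.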


Page 32: Linked Data Marketplaces

Freebase (4)

• Data tools

– Web based – schema editor, review queue, viewers, …

– GridWorks (Google Refine)

• Exploring, data cleaning, transformation of tabular data

• Map data to Freebase schema & RDF export (3rd party extension)

– Acre

• Application hosting platform

– User contributed JavaScript code (compiled to Java bytecode with Rhino)

• Access & store data directly into Freebase


Page 33: Linked Data Marketplaces

timetric

• www.timetric.com / @timetric


Page 34: Linked Data Marketplaces

timetric (2)

• Data domain

– Economic data

• Data population

– Aggregates data from the world's leading sources of economic data (World Bank, Eurostat, …)

– User uploaded data

• Data size

– 2.5M public statistics


Page 35: Linked Data Marketplaces

timetric (3)

• Company info

– Timetric Ltd. (UK)

• Monetization model

– Free public datasets

– Paid exclusive datasets

• Data access

– REST API


Page 36: Linked Data Marketplaces

xIgnite

• www.xignite.com


Page 37: Linked Data Marketplaces

xIgnite (2)

• Data domain

– Financial data

• Data population

– Aggregates data from leading sources (Dow Jones, Thomson Reuters, stock exchanges, …)

– Public datasets (national banks, SEC, Federal Reserve, …)

– User uploaded data

• Company info

– Xignite (USA)


Page 38: Linked Data Marketplaces

xIgnite (3)

• Monetization model

– Paid subscriptions

• Data access

– Web services (REST/SOAP)


Page 39: Linked Data Marketplaces

Coming soon…

• BuzzData

– www.buzzdata.com / @buzzdata

– Company: BuzzData


Page 40: Linked Data Marketplaces

Data marketplaces – features summary

• Data

– Data model, domain, export options

• Monetization

– Charge buyers / sellers

– Free API calls

– Branded marketplaces & Service Level Agreements

• For developers

– REST API; query language

– Tools for data management / integration

– Application hosting


Page 41: Linked Data Marketplaces

Feature matrix

                         Factual  InfoChimps  Azure DataMarket  DataMarket  Socrata  Kasabi  Freebase  timetric  xIgnite

DATA

Data from all domains    +        +           +                 -           +        +       +         -         -
Data model               tabular  various     various           ?           tabular  RDF     graph     ?         ?
Data export              -        -           +                 -           +        ?       +         -         -
RDF export               -        -           -                 -           +        +       +         -         -

MONETIZATION

Charge buyers            +        +/-         +                 +/-         +        +       +/-       +/-       +
Charge sellers           ?        +           -                 +           -        ?       -         ?         ?
Free API calls (month)   ?        100K        ?                 50          -        ?       3M        ?         -
Branded marketplaces     -        -           +                 +           +        ?       -         -         -
Service Level guarantee  ?        -           -                 -           -        ?       -         -         -

TOOLS

REST API                 +        +           +                 +           +        +       +         +         +
Query language           +        -           +                 -           -        +       +         -         -
Tools                    +        -           +                 -           -        +       +         -         -
App hosting              -        -           +                 -           -        ?       +         -         -

Page 42: Linked Data Marketplaces

LINKED DATA + MARKETPLACES


Page 43: Linked Data Marketplaces

Linked Data cloud (Sep 2010)


(c) R. Cyganiak and A. Jentzsch

Page 44: Linked Data Marketplaces

Benefits of Linked Data for Data Marketplaces

• Unified data representation model (RDF)

– Easy consumption of the data

• Global identifiers for all objects (URI)

– Makes incremental data integration & federation easier

• Interlinked datasets

– New data added to the marketplace can be integrated with existing data

– Network effects

• Data marketplace interoperability

– Data from different marketplaces can be easily integrated
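
The integration benefits listed above follow from two properties: one data model and shared identifiers. A toy illustration with plain Python tuples standing in for RDF triples; every URI and value here is made up for the example.

```python
# Triples are plain (subject, predicate, object) tuples; all URIs and values
# are invented for illustration.
BERLIN = "http://example.org/id/Berlin"

dataset_a = {
    (BERLIN, "http://example.org/a/population", "example-value"),
}
dataset_b = {
    (BERLIN, "http://example.org/b/country", "http://example.org/id/Germany"),
}

# Both datasets use the same data model, so integration is a set union.
merged = dataset_a | dataset_b


def describe(graph, subject):
    """Collect every predicate/object pair known about a subject URI."""
    return {p: o for s, p, o in graph if s == subject}


print(describe(merged, BERLIN))
```

Because both sources use the same URI for the entity, facts from either one show up in a single description without any record-matching step.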


Page 45: Linked Data Marketplaces

Benefits of Linked Data for Data Marketplaces (2)

• Derived knowledge / facts

– RDF inference of additional implicit facts

– (see FactForge and LinkedLifeData)

• Rich queries

– SPARQL offers unmatched query expressivity

• Easy import of existing LOD datasets

– Linked Open Data cloud already includes 200+ datasets with 20+ billion RDF triples
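
To make "inference of additional implicit facts" concrete, here is a minimal forward-chaining sketch over tuple triples. It applies only two RDFS rules (subclass transitivity and type propagation) and is not a description of how FactForge or LinkedLifeData compute their inferred statements; the `ex:` names are invented.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

# Explicit statements; the ex: names are invented for the example.
explicit = {
    ("ex:Sofia", RDF_TYPE, "ex:City"),
    ("ex:City", SUBCLASS_OF, "ex:PopulatedPlace"),
    ("ex:PopulatedPlace", SUBCLASS_OF, "ex:Location"),
}


def infer(triples):
    """Apply two RDFS rules until no new triples appear (forward chaining)."""
    closure = set(triples)
    while True:
        new = set()
        for s, p, o in closure:
            for s2, p2, o2 in closure:
                if p2 != SUBCLASS_OF or o != s2:
                    continue
                if p == SUBCLASS_OF:  # subclass transitivity (rule rdfs11)
                    new.add((s, SUBCLASS_OF, o2))
                elif p == RDF_TYPE:  # type propagation (rule rdfs9)
                    new.add((s, RDF_TYPE, o2))
        if new <= closure:
            return closure
        closure |= new


inferred = infer(explicit) - explicit
for triple in sorted(inferred):
    print(triple)
```

Three explicit statements yield three implicit ones here, which gives a feel for why the inferred statement counts on the later slides are of the same order as the explicit ones.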


Page 46: Linked Data Marketplaces

Linked Data for marketplaces – challenges

• Quality of data

– Different (public) datasets may come with inconsistent or controversial data

– Quality more important than quantity

• Large scale data integration

– Ontology (schema) mapping of different datasets & vocabularies

• Licensing

– Some datasets come with “CC-BY-NC” or unclear licensing

• Billing

– API calls / SPARQL queries with varying computational cost
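
The ontology (schema) mapping challenge listed above can be pictured, in its simplest form, as rewriting dataset-specific predicates to a shared vocabulary. This is a deliberately naive sketch with invented URIs; real mappings also have to handle classes, value formats and conflicting definitions.

```python
# Hypothetical mapping from dataset-specific predicates to one shared vocabulary.
PREDICATE_MAP = {
    "http://example.org/a/name": "http://example.org/common/label",
    "http://example.org/b/title": "http://example.org/common/label",
}


def normalise(triples, mapping=PREDICATE_MAP):
    """Rewrite mapped predicates; leave unmapped ones untouched."""
    return {(s, mapping.get(p, p), o) for s, p, o in triples}


source = {
    ("ex:1", "http://example.org/a/name", "Alpha"),
    ("ex:2", "http://example.org/b/title", "Beta"),
    ("ex:2", "http://example.org/b/other", "kept as-is"),
}
unified = normalise(source)
print(sorted(unified))
```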

Page 47: Linked Data Marketplaces

Linked Data for marketplaces – challenges (2)

• Billing

– API calls / SPARQL queries with varying computational cost

• Operations

– Service Level guarantees

– Availability & scalability challenges

• Most Linked Data endpoints at present are neither scalable nor available
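
One way to approach the billing challenge above is to price each request by the resources it consumed rather than at a flat per-call rate. The slides name the problem but propose no formula, so the rates and the formula below are purely hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryUsage:
    execution_ms: int   # measured server-side execution time
    rows_returned: int  # size of the result set


# Hypothetical rates in an arbitrary currency unit.
BASE_FEE = 0.001         # per request
FEE_PER_SECOND = 0.01    # per second of execution time
FEE_PER_1K_ROWS = 0.002  # per thousand result rows


def charge(usage: QueryUsage) -> float:
    """Price one request by the resources it consumed, not a flat rate."""
    return round(
        BASE_FEE
        + FEE_PER_SECOND * usage.execution_ms / 1000
        + FEE_PER_1K_ROWS * usage.rows_returned / 1000,
        6,
    )


light = QueryUsage(execution_ms=20, rows_returned=10)
heavy = QueryUsage(execution_ms=2000, rows_returned=5000)
print(charge(light), charge(heavy))
```

A simple lookup and an expensive SPARQL query then cost different amounts even though each counts as one API call.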


Page 48: Linked Data Marketplaces

LinkedLifeData & FactForge


(c) R. Cyganiak and A. Jentzsch

FactForge

LinkedLifeData

Page 49: Linked Data Marketplaces

LinkedLifeData & FactForge

• FactForge

– Integrates some of the most central LOD datasets

– General-purpose information (not specific to a domain)

– 1.2 billion explicit and 1 billion inferred statements

– The largest upper-level knowledge base

– http://www.FactForge.net

• Linked Life Data

– 25 of the most popular life-science datasets

– 2.7 billion explicit and 1.4 billion inferred statements

– http://www.LinkedLifeData.com


Page 50: Linked Data Marketplaces

Strategic questions

• Monetization strategy

– Which (linked) datasets can be monetized?

– Charge buyers / charge sellers / free quota

– Branded marketplaces

• Community building

– Crowdsource the data curation to the community

– How to provide incentives to data curators?


Page 51: Linked Data Marketplaces

Strategic questions (2)

• Operations

– How to ensure Service Level guarantees?

– How to deal with licensing issues?

– Account management, metering, billing

• Platform

– RDF database – data volume, query volume

– ETL tools

– Curation tools

– Data export & consumption


Page 52: Linked Data Marketplaces

Data monetization with WebServius

• Benefits

– User management, quotas & restrictions

– Metering, pricing, billing

– Security, scalability, SLAs


(c) WebServius

Page 53: Linked Data Marketplaces

Q & A

Questions? @ontotext
