Exploring the Semantic Web

Exploring Semantic Web Data and particularly Linked Data. Roberto García. AIC Seminar Series, SRI International, Menlo Park, August 14th 2012. Human-Computer Interaction and Data Integration Research Group, Universitat de Lleida, Spain.

Upload: rogargon

Post on 06-May-2015

Category:

Technology


DESCRIPTION

Talk about Exploring the Semantic Web, and particularly Linked Data, and the Rhizomer approach. Presented August 14th 2012 at the SRI AIC Seminar Series, Menlo Park, CA

TRANSCRIPT

1. Exploring Semantic Web Data and particularly Linked Data. Roberto García. AIC Seminar Series, SRI International, Menlo Park, August 14th 2012. Human-Computer Interaction and Data Integration Research Group, Universitat de Lleida, Spain.

2. Who: Associate Professor, Universitat de Lleida, Spain. Visiting Associate Professor, Stanford University, Stanford HCI Group. 12+ years of Semantic Web research. 1999 MSc Thesis: Knowledge Management using RDF plus reasoning (SiLRI). 2006 PhD Thesis: A Semantic Web approach to DRM. 2006-: Copyright Ontology. 2007-: Lleida HCI Group, Semantic Web User Interfaces.

3. What is Open Data? "Open data is data that can be freely used, reused and redistributed by anyone - subject only, at most, to the requirement to attribute and sharealike" (Open Knowledge Foundation). Make your data OPEN: available online with an open license, for instance Creative Commons CC-BY; at no more than reproduction cost; no matter the format.

4. Open Data Worldwide: 169 initiatives. Rate: City (40); Country, Region or State (125); Supranational (4). http://datos.fundacionctic.org/sandbox/catalog/faceted/

5. Welcome to Data.CA.Gov

6. Open Data Formats: However, encourage formats that facilitate reuse and interoperability. Tim Berners-Lee's 5-star classification: http://5stardata.info

7. Open Data: Make data available on the Web under an open license. Data licenses: Public Domain Dedication and License (PDDL), Open Data Commons Attribution License (ODC-by) or Creative Commons Public Domain Dedication (CC0). Whatever format. Example: PDF. But the data is locked up in a document; it is hard to get the data out and custom scrapers are needed. http://5stardata.info

8. Open Data: Make it available as structured data. Example: Excel instead of an image scan of a table. But the data is still locked up: you depend on proprietary software. http://5stardata.info

9. Open Data: Use non-proprietary formats. Example: CSV instead of Excel.

    "Temperature forecast for Galway",
    "Day","Lowest Temperature (C)"
    "Saturday, 13 November 2010",2
    "Sunday, 14 November 2010",4
    "Monday, 15 November 2010",7

But this is data on the Web, not data in the Web. What does Galway mean? Is it a temperature? What is the unit? Local time? ... http://5stardata.info

10. Galway (disambiguation). Places: Ireland: Galway, County Galway, Galway Bay. Sri Lanka: Galway's Land National Park. United States: Galway (town), New York; Galway (village), New York. Things: Galway (sheep), a breed of sheep that originated in Galway, Ireland; Galway harp, a type of harp; Galway Hooker, a type of sailing boat; Galway or Claddagh Ring, a type of wedding ring made in Galway.

11. Open Data: Use URIs to identify things, so that people can point at your stuff. Example: RDF (Resource Description Framework, http://www.w3.org/RDF/), but also Atom, OData, JSON-LD, ...

    @prefix meteo: .
    @prefix galweather: .
    @prefix xsd: .
    meteo:forecast [
        meteo:predicted "2010-11-13T12:00:00Z"^^xsd:dateTime ;
        meteo:temperature [ meteo:celsius "2"^^xsd:decimal ]
    ] .

The prefixes point to vocabularies and ontologies. But what if we (humans or computers) don't know what http://example.org/Galway means?

12. Linked Open Data: Link your data to other data to provide context (semantics, meaning). Example: HTTP GET http://dbpedia.org/resource/Galway

    @prefix dbpedia: ....
    dbpedia:Galway a , , ;
        rdfs:label "Galway"@en ;
        dbp:populationBlank "Galwegian, Tribesman"@en ;
        dbp:populationTotal "75529"^^xsd:int ;
        dbp:populationUrban "76778"^^xsd:int ;
        dcterms:subject , , , ;
        rdfs:comment "Galway or City of Galway (Cathair na Gaillimhe) is a city on the west coast of Ireland. It is located on the River Corrib between Lough Corrib and Galway Bay and is surrounded by County Galway. It is the third largest city within the state, though if the wider urban area is included then it falls into fourth place behind Limerick. The population of Galway city at the 2011 census was 75,529, rising to 76,778 across the entire urban area."@en ;
        geo:lat 53.2719 ;
        geo:long -9.04889 ;
        foaf:homepage .

... and also dbp:PopulationTotal, dct:subject, ...

13. Network Effect: ~31 billion statements. http://linkeddata.org

14. Fine for computers but people? C. Warren (blogger): "I'm writing about Films I Like. Can I reuse LinkedMDB?" M. Harper (developer): "I'm developing a bird watching application. Can I reuse DBPedia?" http://linkeddata.org

15. User Testing: Users' typical questions: Where do I start? Where do I go now? What is this data about? How do I find this? What do Linked Data user interfaces offer?

16. DBPedia Scenario: Linked Data version of Wikipedia. 3.5 million things described. Ontology: 257 classes and 1276 properties.

17. Target Technical Users: DBPedia main page.

18. Semantic Query Languages: SPARQL:

    select distinct(?c) (count(?i) as ?n)
    where {?i a ?c}
    order by desc(?n)

    c                                                 n
    http://www.w3.org/2002/07/owl#Thing               1668503
    http://www.w3.org/2004/02/skos/core#Concept       632607
    http://www.opengis.net/gml/_Feature               571764
    http://dbpedia.org/ontology/Place                 462349
    http://dbpedia.org/ontology/Person                363751
    http://dbpedia.org/ontology/Work                  355100
    http://dbpedia.org/ontology/PopulatedPlace        340443
    http://xmlns.com/foaf/0.1/Person                  296595

19. Text Search: What to type? A URI? A URI label? How to take advantage of semantics?

20. Semantic Query UIs: iSPARQL, http://dbpedia.org/isparql/

21. Proposal: Automatic UI generation from ontologies and dataset structure, combining Information Architecture Components [Morville] with Interaction Patterns for Data Analysis [Shneiderman]. Overview: Menus, Sitemaps, ... Zoom & Filter: Facets. Details: Lists, Maps, Timelines.

22. IA Components. Menus: Generated from dataset ontologies and thesauri. For each class/topic: URI, label, number of instances/uses, subclasses/subtopics. Flatten to the desired number of entries and subentries. When there is room for more entries or subentries, divide the class/topic with the most instances. When there are too many, group those with the fewest. "Other" is the generic group.

23. IA Components. Menus: 7 menus with 10 submenus. Automatic generation.

24. IA Components. Menus: Provide a DBPedia overview, but what about 12,334 birds? DEMO: http://rhizomik.net/dbpedia/

25. IA Components. Facets: Pre-computed list of facets per class or topic. Ontologies or thesauri + instance data. Facet metrics: frequency, #values, most common value cardinality. DBPedia Birds class: 226 properties. dbo:kingdom, 100%, 3 values, 6846 (Animalia), ...

26. Scenario DBPedia. DEMO: http://rhizomik.net/dbpedia/

27. Scenario LinkedMDB. DEMO: http://rhizomik.net/linkedmdb

28. Testing LinkedMDB: Evaluation with lay users as part of a RITE (Rapid Iterative Testing and Evaluation) development process. Iteration test with 6 users. LinkedMDB (Linked Data version of IMDb). User task: find three films where Woody Allen is director and also actor.

29. Evaluation Results: The task seemed easy, but no user completed it without help. Really, just 1 issue: users started from Actor instead of from Film, and got lost from there. User interaction is too constrained by the underlying explicit data structure. Lack of context while browsing the graph.

30. New Features: Facets for all inverse properties (explicit or implicit). Film has the property "actor" pointing to Actor, so Actor gets the facet "is actor of Film". Breadcrumbs show the query built so far. Click Film, then for facet Actor search "Woody Allen": "Showing Film has actor where actor name is Woody Allen".

31. New Features: What about getting from Actors to Films to restrict by director? Add an Actor facet "directed by"? DANGER: facets explosion. Chain Director, Film, Country, Continent: a Director facet for the continents of the countries where their films were directed!

32. New Features: Pivoting: switch from a faceted view to a related faceted view (keeping filters). E.g.: from the Actors facets move to the Films facets through the "is Actor of Film" facet. For each class facet also compute the most specific class for the target instances: Actor "is Actor of" Film and TV Episode, so Audiovisual Work. Pivot on that facet to get the faceted view for the target class + the filters so far.

33. DEMO: http://rhizomik.net/linkedmdb/

34. Next Round Evaluation: Semantic Web Exploration Tools Quality in Use Model: Task success, Task time, Satisfaction, UI Component Efficiency, Task Flexibility, Layout Flexibility, ... Task: Films with Woody Allen as director and actor. Task time:

                Pre-pivot   Pivot   Reduction
    Minimum     1.05        0.89    15%
    Maximum     5.23        2.23    57%
    Mean        2.41        1.69    30%
    St. Dev.    1.49        0.57    62%

35. Summary: Menus: overview of dataset classes (topics). Facets: filter a class using properties and values. Pivoting: switch faceted views, carrying filters.

36. Conclusions: Users build queries without knowledge of SPARQL or of the dataset structure. Example: Who has directed more films in Oceania?

    SELECT DISTINCT ?r1 WHERE {
        ?r1 a movie:Director .
        ?r2 movie:director ?r1 .
        ?r2 a movie:Film .
        ?r2 movie:country ?r3 .
        ?r3 movie:country_continent ?r3var0
        FILTER(str(?r3var0)="Oceania")
    }

DEMO: http://rhizomik.net/linkedmdb/

37. Work in Progress: Interaction design: explore the best way to make pivoting, and un-pivoting, evident for users. Improve breadcrumbs. Specialized facets: range dependent: histogram for numbers, calendar for dates, ...

38. Work in Progress: Integrate RDF2SVG.

39. Work in Progress: Object-Action interaction paradigm. Object properties determine actions. Actions: pluggable Semantic Web Services (lat, long, point; time, date, start, end).

40. Work in Progress: Other IA components: sitemaps. DEMO: http://lodvisualization.appspot.com

41. Work in Progress: Interactively select data and configure visualizations. DEMO: http://rhizomik.net/apollo/

42. Data Quality

43. Assisted Editing (and Trust): WebID, http://www.w3.org/2008/09/msnws/papers/foaf+ssl.html

44. Thanks for your attention. Roberto García, http://rhizomik.net/ Human-Computer Interaction and Data Integration Research Group, Universitat de Lleida, Spain.
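The facet metrics (slide 25), inverse-property facets (slide 30) and pivoting (slide 32) can be sketched roughly as follows. This is only an illustration over a tiny, hypothetical in-memory triple list, not Rhizomer's implementation: the identifiers (`film:...`, `actor:...`), the helper names and the "is ... of" labelling are assumptions, and directors are stored as plain name strings to keep the example short.

```python
from collections import Counter, defaultdict

# Hypothetical toy dataset of (subject, property, object) triples.
TRIPLES = [
    ("film:Annie_Hall", "rdf:type", "Film"),
    ("film:Manhattan", "rdf:type", "Film"),
    ("film:Midnight_in_Paris", "rdf:type", "Film"),
    ("actor:Woody_Allen", "rdf:type", "Actor"),
    ("actor:Diane_Keaton", "rdf:type", "Actor"),
    ("actor:Owen_Wilson", "rdf:type", "Actor"),
    ("film:Annie_Hall", "actor", "actor:Woody_Allen"),
    ("film:Annie_Hall", "actor", "actor:Diane_Keaton"),
    ("film:Manhattan", "actor", "actor:Woody_Allen"),
    ("film:Manhattan", "actor", "actor:Diane_Keaton"),
    ("film:Midnight_in_Paris", "actor", "actor:Owen_Wilson"),
    ("film:Annie_Hall", "director", "Woody Allen"),
    ("film:Manhattan", "director", "Woody Allen"),
    ("film:Midnight_in_Paris", "director", "Woody Allen"),
]


def instances_of(cls):
    """Return the set of resources typed with the given class."""
    return {s for s, p, o in TRIPLES if p == "rdf:type" and o == cls}


def facet_values(instances):
    """Map facet label -> {instance: set of values}.

    Direct properties become facets named after the property; properties
    pointing *at* an instance become inverse facets named "is <p> of".
    """
    facets = defaultdict(lambda: defaultdict(set))
    for s, p, o in TRIPLES:
        if p == "rdf:type":
            continue
        if s in instances:
            facets[p][s].add(o)
        if o in instances:
            facets["is " + p + " of"][o].add(s)
    return facets


def facet_metrics(instances, facets):
    """Per facet: share of instances using it, number of distinct values,
    and the most common value with its count."""
    metrics = {}
    for label, by_instance in facets.items():
        counts = Counter(v for vals in by_instance.values() for v in vals)
        metrics[label] = {
            "frequency": len(by_instance) / len(instances),
            "values": len(counts),
            "most_common": counts.most_common(1)[0],
        }
    return metrics


def pivot(selected, facets, label):
    """Move to the resources reached through a facet from the selected
    (already filtered) instances, so the filter is carried along."""
    return {v for inst in selected for v in facets[label].get(inst, ())}


films = instances_of("Film")
film_metrics = facet_metrics(films, facet_values(films))

# Start from Actor (as the test users did), filter, then pivot to Film.
actors = instances_of("Actor")
actor_facets = facet_values(actors)
selected = {"actor:Woody_Allen"}
films_of_selected = pivot(selected, actor_facets, "is actor of")

# In the pivoted Film view, restrict further by the director facet.
film_facets = facet_values(films_of_selected)
result = {f for f in films_of_selected
          if "Woody Allen" in film_facets["director"].get(f, ())}
print(sorted(result))  # ['film:Annie_Hall', 'film:Manhattan']
```

Starting from Actor and pivoting through the inverse facet reaches the same films a user would get by starting from Film, which is the path the evaluation in slide 29 found users actually take.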