The Semantic Web for Librarians and Publishers, by Michael Keller, Stanford University

Post on 01-Nov-2014

TRANSCRIPT

  • 1. Semantic Web for Libraries & Publishers. Charleston Conference 111103. Monday, November 21, 11. So, what's the problem?
  • 2. The Problem Set
  • 3. Silos
  • 4. More silos
  • 5. Lots of different silos
  • 6. Blue silos
  • 7. Old silos. We in the library and publishing trades force readers, some of whom are authors as well, to search iteratively for information they want or need or think might exist, in many different silos, using many different search engines, forms, and vocabularies. We do not make it easy for them to discover what is locally available, what is more or less easy to get, or everything that might be available. No wonder the young and foolish depend upon and believe in Google's searches. Google is quick... and, in terms of relevance, very, very dirty.
  • 8. We give them better interfaces, ones that permit refinement of results, to our holdings at the title level, BUT...
  • 9. Simultaneously, we show them many other tools, each excellent in some ways, to continue their exploration of the literature. No single tool is comprehensive. We do not refer our clients to the Web, at least not on our own web sites! Our OPACs refer to our holdings, while indices and abstracts refer our readers to articles in journals to which we may have licensed access. SFX and similar tools provide readers with links to those revealed titles to which we have subscribed. Neither our OPACs nor the secondary databases refer directly to more than a tiny percentage of the vast collection of pages that is the World Wide Web. The Web, of course, refers in fragmentary fashion to information resources we might, I emphasize, MIGHT have on hand for our readers.
  • 10. And the results of using other, often very good, discovery tools differ in relevance ranking, format, and options from the ones we provide for our OPACs, thus adding confusion.
  • 11. Some of us provide our readers with lots of databases to search. Too many, really, for all but a few of them are not forensic-level scholars.
  • 12. Selecting a licensed database is an art in itself! Once again, notice that we rarely offer a web search engine as an option, and for good reasons. Nevertheless, the discoverable relevant information resources on the web apparently are not part of our repertory.
  • 13. !!! We have not conspired to make the search for relevant information objects difficult. We just have not yet had the tools, the methods, the vision, and yes, the gumption to try something new.
  • 14. ATLAS at LHC -- 150*10^6 sensors; Ntl Cntr for Biotech Info; NSF CyberInfrastructure; quake engineering simulation. Here's a teensy slice of the information and communication environment in which our faculty and students find themselves. And it gets more complex every day. Alas, the larger the number of websites indexed by Bing or Google or whatever search engine du jour, the more likely it is that the returns will be less pointed and less precisely matched to what the searcher hoped to find.
  • 15. Too many silos. Here's the biggest of the lot...
  • 16.
  • 17. One size fits all??? Does one size fit all?
  • 18. Not quite. Even Google has silos and uses, as do others, clever interfaces to hide the fact of the silos.
  • 19. Given all these silos and search engines, our users, our authors, and readers, and teachers, and students, people on the street, our nations... need us to find a better way. Facts about the information objects we have acquired or leased, facts about books, articles, films, and so forth that we have published need to be found in the wild, on the web. Ideally, we, librarians and publishers, will get the facts about what we have and what we are making public, for fun or profit, discoverable on the Web.
  • 20. Discovery & Access ... the problems. Let's dwell on the problems briefly...
  • 21. 1. Too many stovepipe systems; 2. Too little precision with inadequate recall; 3. Too far removed from the World Wide Web
  • 22. 1. Too many stovepipe systems
  • 23. 1. Too many stovepipe systems. The landscape of discovery & access services is a shambles.
  • 24. 1. Too many stovepipe systems. The landscape of discovery & access services is a shambles. It can't be mapped in any logical way.
  • 25. 1. Too many stovepipe systems. The landscape of discovery & access services is a shambles. It can't be mapped in any logical way: not by us (the supposed information pros), not by the faculty & students who must navigate the chaos.
  • 26. 1. Too many stovepipe systems. The landscape of discovery & access services is a shambles. It can't be mapped in any logical way: not by us (the supposed information pros), not by the faculty & students who must navigate the chaos. This state of affairs shouldn't be a surprise.
  • 27. 2. Too little precision with inadequate recall
  • 28. 2. Too little precision with inadequate recall. Some of the problem ... too many stovepipe systems.
  • 29. 2. Too little precision with inadequate recall. Some of the problem ... too many stovepipe systems; dumbing-down effects of federation often hinder explicit searches; each interface has its own search-refinement tricks; numerous, overlapping discovery paths hamper full recall.
  • 30. 2. Too little precision with inadequate recall. Some of the problem ... too many systems; dumbing-down effects of federation often hinder explicit searches; each interface has its own search-refinement tricks; numerous, overlapping discovery paths hamper full recall. Most of the problem ... limitations in the design & execution of infrastructure that supports discovery & access.
  • 31. the 1st limiting factor ... ambiguity
  • 32. the 1st limiting factor ... ambiguity. Most of our metadata uses a string of bytes to label a semantic entity [people, places, things, events, ...].
  • 33. the 1st limiting factor ... ambiguity. Most of our metadata uses a string of bytes to label a semantic entity [person, place, thing, event, ...]; discovery is based on matching text labels, not on the gist of semantic entities.
  • 34. the 1st limiting factor ... ambiguity. Most of our metadata uses a string of bytes to label a semantic entity [person, place, thing, event, ...]; discovery is based on matching text labels, not on the gist of semantic entities. For libraries, the fix is authorities: authoritative forms of strings (names, organizations, titles, places, events, topics, etc.).
  • 35. the 1st limiting factor ... ambiguity. Most of our metadata uses a string of bytes to label a semantic entity [person, place, thing, event, ...]; discovery is based on matching text labels, not on the gist of semantic entities. For libraries, the fix is authorities: authoritative forms of strings (names, organizations, titles, places, events, topics, etc.) work to improve precision and recall. Hold on ... what about cases where no one-to-one relationship exists between a string-of-text label & the underlying semantic entity?
  • 36. the 1st limiting factor ... ambiguity. Most of our metadata uses a string of bytes to label a semantic entity [person, place, thing, event, ...]; discovery is based on matching text labels, not on the gist of semantic entities. For libraries, the fix is authorities: authoritative forms of strings (names, organizations, titles, places, events, topics, etc.) work to improve precision and recall. Hold on ... what about cases where no one-to-one relationship exists between a string-of-text label & the underlying semantic entity? Take for example the text string: jaguar; byte string: 4a 61 67 75 61 72.
  • 37. ... a rose is a rose is a rose. company: Ltd.; cars: XK series, in production since 1996; E-Type (UK) or XK-E (US), mftg 1961 to 1974; etc. hardware & software: Atari video game console; Macintosh OS X 10.2. John Giannandrea, CTO, Metaweb. Imagine this keyword search and realize the ambiguity of the term jaguar. Inspired by John Giannandrea, CTO, Metaweb ... from his presentation at PARC in April, 2008.
  • 38. ... a rose is a rose is a rose. company: Ltd.; cars: XK series, in production since 1996; E-Type (UK) or XK-E (US), mftg 1961 to 1974; etc. hardware & software: Atari video game console; Macintosh OS X 10.2. music: heavy metal band formed in Bristol, England, Dec 1979; Fender electric guitar, introduced in 1962; Philadelphia-based singer/songwriter Jaguar Wright. military: type 140 Jaguar class fast attack craft [torpedo], Germany WWII; Anglo-French ground attack aircraft; XF10F prototype swing-wing fighter, early 1950s, Grumman. John Giannandrea, CTO, Metaweb. Inspired by John Giannandrea, CTO, Metaweb ... from his presentation at PARC in April, 2008.
  • ...
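
The "jaguar" ambiguity described in slides 36 to 38 can be illustrated with a minimal Python sketch. It is not taken from the talk: the record layout, the function names, and the "ex:" entity identifiers are all hypothetical, invented here to contrast matching on a text label with matching on an identifier for the underlying entity. Note that the byte string on slide 36 (4a 61 67 75 61 72) corresponds to the capitalized form "Jaguar".

```python
# Hypothetical catalog records. The labels and kinds come from the slides; the
# entity identifiers (the "ex:" values) are invented purely for illustration.
RECORDS = [
    {"label": "Jaguar", "entity": "ex:jaguar-company", "kind": "company"},
    {"label": "Jaguar", "entity": "ex:atari-jaguar", "kind": "video game console"},
    {"label": "Jaguar", "entity": "ex:mac-os-x-10-2", "kind": "operating system"},
    {"label": "Jaguar", "entity": "ex:fender-jaguar", "kind": "electric guitar"},
    {"label": "Jaguar Wright", "entity": "ex:jaguar-wright", "kind": "singer/songwriter"},
]


def label_bytes(label):
    """Return the label as space-separated hex bytes (needs Python 3.8+)."""
    return label.encode("ascii").hex(" ")


def search_by_label(records, query):
    """Keyword search: match on the text label only, ignoring case."""
    needle = query.casefold()
    return [r for r in records if needle in r["label"].casefold()]


def search_by_entity(records, entity_id):
    """Entity search: match on the identifier, which names exactly one thing."""
    return [r for r in records if r["entity"] == entity_id]


def precision(hits, wanted_entity):
    """Fraction of returned records that are about the entity the reader wanted."""
    if not hits:
        return 0.0
    relevant = [r for r in hits if r["entity"] == wanted_entity]
    return len(relevant) / len(hits)


if __name__ == "__main__":
    print(label_bytes("Jaguar"))

    # A reader who wants the car company types the label and gets everything.
    by_label = search_by_label(RECORDS, "jaguar")
    print([r["kind"] for r in by_label])
    print(precision(by_label, "ex:jaguar-company"))

    # Searching on the entity identifier returns only the company record.
    by_entity = search_by_entity(RECORDS, "ex:jaguar-company")
    print([r["kind"] for r in by_entity])
    print(precision(by_entity, "ex:jaguar-company"))
```

In this toy data set the label search returns every record whose label contains the string, so most hits are irrelevant to a reader who wanted the car company, while the identifier search returns only the one matching record. Real authority files and linked-data identifiers are far richer than this; the sketch only shows why a shared byte string cannot by itself distinguish the entities.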
