Local content in a Europeana cloud: Alternative methods of ingestion for small institutions. (Stein) Runar Bergheim, Asplan Viak Internet AS. LoCloud is funded by the European Commission's ICT Policy Support Programme.


Post on 01-Apr-2015


TRANSCRIPT

<ul><li>Slide 1</li></ul>
<p>Local content in a Europeana cloud: Alternative methods of ingestion for small institutions</p>
<p>(Stein) Runar Bergheim, Asplan Viak Internet AS</p>
<p>LoCloud is funded by the European Commission's ICT Policy Support Programme</p>

<ul><li>Slide 2</li></ul>
<p>Overview of Presentation</p>
<ul>
<li>Characteristics of Europeana content providers</li>
<li>Present ingestion methods for Europeana</li>
<li>Alternative ingestion methods out there</li>
<li>Experiments that may be conducted as part of LoCloud</li>
</ul>
<p>7 slides, 284 words, 1 858 characters, 2 illustrations (a seemingly endless stream of words)</p>

<ul><li>Slide 3</li></ul>
<p>Characteristics of Europeana content providers</p>
<p>Those who are in:</p>
<ul>
<li>Professional cultural heritage institutions</li>
<li>Capacity for investment in infrastructure &amp; projects</li>
<li>Technical skills beyond what may be expected</li>
<li>Entities that fit into a hierarchy of aggregators</li>
<li>Patient</li>
</ul>
<p>Those who are out:</p>
<ul>
<li>Very small collections</li>
<li>Collections by individuals (tens to hundreds of objects)</li>
<li>Independent institutions with strained funding</li>
<li>Non-conforming online content structure: 1 web page 1 object</li>
</ul>

<ul><li>Slide 4</li></ul>
<p>Present Europeana ingestion process</p>

<ul><li>Slide 5</li></ul>
<p>Weaknesses of present Europeana ingestion process</p>
<ul>
<li>Puts great demands on content providers
<ul><li>Partly mitigated by the excellent MINT-MORE tools</li></ul></li>
<li>Limited capacity at harvesting end
<ul><li>Partly mitigated by aggregator hierarchy</li></ul></li>
<li>Low frequency of updates: each iteration takes a long time
<ul><li>Partly mitigated by modified content/aggregation architecture of Europeana Cloud</li></ul></li>
</ul>

<ul><li>Slide 6</li></ul>
<p>Alternative ingestion methods out there</p>

<ul><li>Slide 7</li></ul>
<p>Considerations for alternative ingestion methods</p>
<ul>
<li>Difficult to create complete ESE/EDM from crawling
<ul>
<li>But... the typical Europeana record is not really all that complete</li>
<li>Schema.org, Microformats and other embedded semantics may help</li>
</ul></li>
<li>Deep-content URLs hidden for crawlers
<ul><li>Simple site-map protocol may be applied</li></ul></li>
<li>Increases capacity for small content providers</li>
<li>Decreases time-consumption of the content ingestion life-cycle</li>
<li>Will serve more than one publishing channel</li>
</ul>

<ul><li>Slide 8</li></ul>
<p>Experiments that may be conducted as part of LoCloud</p>
<ul>
<li>Content assessment
<ul><li>Assess quantity of new content that can be reached using alternative ingestion methods</li></ul></li>
<li>Technology experiments
<ul>
<li>HTML embedded semantics based on open standards</li>
<li>Creating a test-spider for auto-extraction of metadata from web pages</li>
<li>Transformation of data to ESE/EDM</li>
</ul></li>
<li>Design of processes
<ul>
<li>Embedding of spider into aggregator organizations' business processes</li>
<li>Ingestion + Quality assurance</li>
</ul></li>
</ul>

<ul><li>Slide 9</li></ul>
<p>Thank you for the attention</p>
<p>rb@avinet.no</p>

<ul><li>Slide 10</li></ul>
<p>Funding</p>
<p>LoCloud is funded by the European Commission's ICT Policy Support Programme. The views and opinions expressed in this presentation are the sole responsibility of the authors and do not necessarily reflect the views of the European Commission.</p>
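<p>The crawling approach outlined on Slides 7 and 8 (site-map protocol for discovering deep-content URLs, embedded semantics for metadata, transformation towards ESE/EDM) might be sketched roughly as follows. This is only an illustrative sketch, not the spider planned in LoCloud: the function names, the sample data and in particular the <code>FIELD_MAP</code> from Schema.org property names to ESE-style element names are assumptions made for the example. It reads only flat HTML microdata (<code>itemprop</code> attributes) and ignores nested items, Microformats and JSON-LD.</p>

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Namespace defined by the sitemap protocol (sitemaps.org).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# ASSUMPTION: an illustrative mapping, not an official Schema.org-to-ESE crosswalk.
FIELD_MAP = {
    "name": "dc:title",
    "description": "dc:description",
    "creator": "dc:creator",
    "dateCreated": "dc:date",
    "image": "europeana:object",
}


def sitemap_urls(xml_text):
    """Return the <loc> URLs listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip()
            for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
            if loc.text]


class MicrodataParser(HTMLParser):
    """Collects flat itemprop name/value pairs; nested items are not handled."""

    def __init__(self):
        super().__init__()
        self.props = {}
        self._open = None   # itemprop whose text content is being collected
        self._tag = None    # tag that opened the current itemprop
        self._buf = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        name = attr.get("itemprop")
        if not name:
            return
        # Prefer an attribute value (meta/img/a), else collect the element text.
        value = attr.get("content") or attr.get("src") or attr.get("href")
        if value is not None:
            self.props.setdefault(name, value)
        else:
            self._open, self._tag, self._buf = name, tag, []

    def handle_data(self, data):
        if self._open:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if self._open and tag == self._tag:
            self.props.setdefault(self._open, "".join(self._buf).strip())
            self._open = None


def extract_record(html, page_url):
    """Build an ESE-like dictionary from the microdata found on one page."""
    parser = MicrodataParser()
    parser.feed(html)
    record = {"europeana:isShownAt": page_url}
    for source_name, target_name in FIELD_MAP.items():
        if source_name in parser.props:
            record[target_name] = parser.props[source_name]
    return record


# Hypothetical sample data standing in for a small institution's website.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://example.org/objects/1</loc></url>
</urlset>"""

SAMPLE_PAGE = """<div itemscope itemtype="http://schema.org/CreativeWork">
  <h1 itemprop="name">Old fishing boat</h1>
  <p itemprop="description">Wooden boat, built around 1900.</p>
  <img itemprop="image" src="http://example.org/img/1.jpg">
</div>"""
```

<p>In a real spider each URL returned by <code>sitemap_urls</code> would be fetched over HTTP and passed to <code>extract_record</code>; the resulting dictionaries would then still need quality assurance and serialisation to ESE/EDM XML before ingestion, as Slide 8 notes.</p>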
