FEDORA at Northwestern University

Bill Parod

Academic Technologies

Northwestern University

[email protected]

CNI April 15, 2004, Northwestern University, Bill Parod

Academic Technologies

General Background

• Academic Technologies
• Faculty projects
• Library partnerships
• Institutional partnerships
• Diverse clientele
• Diverse content
• “One-off” projects

Current FEDORA Projects

• Block Museum of Art
• The Last Expression Art Collection
• Introduction to Asian Art History
• BBC Spoken Word Archive
• Paris Map Collection
• Encyclopedia of Chicago
• WordHoard Text Analysis Project

Diversity of Content

• Art collections
• Wall murals
• Photographs
• Historical maps
• GIS maps
• Newspapers
• Book page images
• Digital video
• Spoken word
• Literary works
• Encyclopedias
• Lexical data
• Census data
• Event data

Diversity of Systems

• Wavelet Image Servers
• Vector Image Processors
• Streaming Media Servers
• RDBMS
• XML Databases
• XSLT Processors
• GIS
• Servlet Engines

Abstract Image Models

• Art collections
• Wall murals
• Photographs
• Historical maps
• GIS maps
• Newspapers
• Book page images
• Digital video
• Spoken word
• Literary works
• Encyclopedias
• Lexical data
• Census data
• Event data

4 Image Behavior Classes

• Core behavior
– getCoverpage
– getThumbnail
• Basic image (UVa)
– getThumbnail
– getMedium
– getHigh
– getVeryHigh
• Addressable image
– getRegion(rgn,size)
– getViewer
• Layered image
– getRegion(,,layers)
– getViewer(layers)
• Geographic image
– getRegion(,,, coords)
– getViewer(, coords)
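One way to read these behavior classes is as layered interfaces. The Python sketch below is illustrative only: the method names come from the slide, but the class names, the argument types, the inheritance relationship, and the `StubImage` implementation are assumptions, not the FEDORA behavior definitions themselves.

```python
from abc import ABC, abstractmethod


class CoreBehavior(ABC):
    """Behaviors shared by every object (method names from the slide)."""

    @abstractmethod
    def getCoverpage(self) -> str: ...

    @abstractmethod
    def getThumbnail(self) -> str: ...


class AddressableImage(CoreBehavior):
    """Adds region addressing; the argument types are assumptions."""

    @abstractmethod
    def getRegion(self, rgn: tuple, size: tuple) -> str: ...

    @abstractmethod
    def getViewer(self) -> str: ...


class StubImage(AddressableImage):
    """Hypothetical implementation that only returns descriptive strings."""

    def __init__(self, pid: str):
        self.pid = pid

    def getCoverpage(self) -> str:
        return f"coverpage for {self.pid}"

    def getThumbnail(self) -> str:
        return f"thumbnail for {self.pid}"

    def getRegion(self, rgn: tuple, size: tuple) -> str:
        x, y, w, h = rgn
        return f"region {x},{y},{w},{h} of {self.pid} at {size[0]}x{size[1]}"

    def getViewer(self) -> str:
        return f"viewer for {self.pid}"


image = StubImage("demo:1")
print(image.getRegion((0, 0, 100, 50), (200, 100)))
```

An application written against `CoreBehavior` can call `getThumbnail` on any object without knowing whether it is a simple or an addressable image.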

4 Image Content Models

• Core behavior
– XML Metadata
– HTML XSLT script
– Thumbnail Image
• Basic image (UVa)
– Thumbnail jpeg
– Medium Res jpeg
– High Res jpeg
– Very High Res jpeg
• Addressable image
– Image metadata
– Viewer XSLT script
• Layered image
– Layer metadata
• Geographic image
– World file for projection

Images    Simple        Zoom          Layers
Core      getThumbnail  getThumbnail  getThumbnail
          getCoverpage  getCoverpage  getCoverpage
Basic     getThumbnail  getThumbnail  getThumbnail
          getMedium     getMedium     getMedium
          getHigh       getHigh       getHigh
          getVeryHigh   getVeryHigh   getVeryHigh
Addr                    getRegion     getRegion
                        getViewer     getViewer
Layered                               getRegion
                                      getViewer
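Reading the table as three image types (Simple, Zoom, Layers) against behavior classes, the method set of a type is the union of the classes it subscribes to. The lookup below is a sketch under that reading; the assignment of Addr to Zoom and Layers, and of Layered to Layers only, is inferred from the flattened table and is an assumption.

```python
# Behavior classes and their methods, as listed on the slides.
BEHAVIOR_METHODS = {
    "Core": ["getThumbnail", "getCoverpage"],
    "Basic": ["getThumbnail", "getMedium", "getHigh", "getVeryHigh"],
    "Addr": ["getRegion", "getViewer"],
    "Layered": ["getRegion", "getViewer"],
}

# Which behavior classes each image type subscribes to (assumed column mapping).
TYPE_BEHAVIORS = {
    "Simple": ["Core", "Basic"],
    "Zoom": ["Core", "Basic", "Addr"],
    "Layers": ["Core", "Basic", "Addr", "Layered"],
}


def methods_for(image_type: str) -> list:
    """Return the distinct method names available on an image type."""
    seen = []
    for behavior in TYPE_BEHAVIORS[image_type]:
        for method in BEHAVIOR_METHODS[behavior]:
            if method not in seen:
                seen.append(method)
    return seen


print(methods_for("Zoom"))
```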

BDEF Interface Definition

BMECH Description

• Method bindings to implementation
• HTTP URL templates to image servlet
• Accepts image server metadata stream
• Accepts specific user parameters
• Provides implementation flexibility
• Currently using TrueSpectra/Scene7 image server
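The binding idea (a URL template per method, filled from the object's image metadata and the caller's parameters) can be sketched as follows. The host name, query parameter names, and the `bind` helper are placeholders invented for illustration; they are not the TrueSpectra/Scene7 API or the actual BMECH syntax.

```python
from string import Template
from urllib.parse import quote

# Hypothetical URL templates, one per behavior method. Host, path, and
# parameter names are placeholders, not a real image server API.
METHOD_TEMPLATES = {
    "getThumbnail": Template("http://imageserver.example.edu/is?src=$path&wid=100"),
    "getRegion": Template(
        "http://imageserver.example.edu/is?src=$path&rgn=$rgn&wid=$width"
    ),
}


def bind(method: str, image_metadata: dict, user_params: dict) -> str:
    """Fill a method's URL template from datastream metadata and user input."""
    values = {"path": quote(image_metadata["path"], safe="/")}
    values.update({k: quote(str(v), safe=",") for k, v in user_params.items()})
    return METHOD_TEMPLATES[method].substitute(values)


url = bind(
    "getRegion",
    {"path": "block/item 42.tif"},
    {"rgn": "0,0,512,512", "width": 256},
)
print(url)
```

Because callers only see the method name, the templates could be pointed at a different image server without changing the applications that request disseminations.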

getCoverPage() for simple image – Block Museum Collection

getCoverPage() for zoomable image – History of Asian Art class

Ingesting Images

• Imaging person deposits master TIFF images in WebDAV-enabled file store
• Image server is configured with a “virtual path” to the WebDAV store for the master TIFF image
• TIFF master is converted to FlashPix and cached in image server
• Image server handles requests for FEDORA dissemination

Image Workflow: FEDORA – TrueSpectra – Xythos

[Diagram labels: Metadata in Excel; METS; FEDORA; Tiffs in Xythos; TrueSpectra Image Server; Dissemination Requests; Users. Roles: Department; Academic Technologies. Legend: Data flow; Requests.]

• Catalog in Excel converted to METS for FEDORA ingest
• Tiff Masters deposited in collection’s Xythos directory
• Access to Xythos directory enabled for TrueSpectra virtual paths
• METS/FEDORA record includes link to TrueSpectra image
• Access to image is through FEDORA image behaviors
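The catalog-to-METS step in this workflow might look roughly like the sketch below. It assumes the Excel sheet has been exported as CSV with hypothetical `id`, `title`, and `image_url` columns, and it emits only a minimal METS skeleton (no `structMap`, no FEDORA-specific extensions), so the output is not schema-valid ingest input.

```python
import csv
import io
import xml.etree.ElementTree as ET

METS = "http://www.loc.gov/METS/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS)
ET.register_namespace("xlink", XLINK)

# Stand-in for a catalog sheet exported from Excel as CSV (hypothetical columns).
CATALOG_CSV = """id,title,image_url
block-001,Untitled Print,http://imageserver.example.edu/is?src=block/001.tif
"""


def row_to_mets(row: dict) -> ET.Element:
    """Build a minimal METS record: descriptive metadata plus an image link."""
    mets = ET.Element(f"{{{METS}}}mets", {"OBJID": row["id"], "LABEL": row["title"]})
    dmd = ET.SubElement(mets, f"{{{METS}}}dmdSec", {"ID": "DMD1"})
    wrap = ET.SubElement(dmd, f"{{{METS}}}mdWrap", {"MDTYPE": "OTHER"})
    xml_data = ET.SubElement(wrap, f"{{{METS}}}xmlData")
    ET.SubElement(xml_data, "title").text = row["title"]
    file_sec = ET.SubElement(mets, f"{{{METS}}}fileSec")
    grp = ET.SubElement(file_sec, f"{{{METS}}}fileGrp")
    file_el = ET.SubElement(grp, f"{{{METS}}}file", {"ID": "IMG1"})
    # The record points at the image server URL rather than embedding the image.
    ET.SubElement(
        file_el,
        f"{{{METS}}}FLocat",
        {"LOCTYPE": "URL", f"{{{XLINK}}}href": row["image_url"]},
    )
    return mets


records = [row_to_mets(row) for row in csv.DictReader(io.StringIO(CATALOG_CSV))]
print(ET.tostring(records[0], encoding="unicode"))
```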

Physical Collection Management Scenario: FEDORA – Content Service – Xythos Integration

[Diagram labels: Auto-ingester; FEDORA; Files in Xythos; TrueSpectra; Streaming Server; Search; Dissemination Requests; Metadata update; Users. Roles: Faculty or Support; Academic Technologies. Legend: Data flow; Requests.]

• FEDORA collection object attached to Xythos directory
• Xythos notifies collection object of changes in the directory
• File added – collection creates new member item
• File updated – item accepts new version for file stream
• File removed – item is set dormant in FEDORA
• Metadata added/updated online or batch
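The notification handling described in this scenario can be sketched as a small event handler. Everything here is hypothetical: the class names, the event strings, and the in-memory storage stand in for the collection object, the Xythos notifications, and the repository respectively.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    versions: list = field(default_factory=list)  # successive file stream versions
    state: str = "active"


@dataclass
class CollectionObject:
    """Hypothetical collection object attached to one directory."""

    directory: str
    items: dict = field(default_factory=dict)  # file name -> Item

    def on_change(self, event: str, name: str, content: bytes = b"") -> None:
        """React to a directory notification (event names are assumptions)."""
        if event == "added":
            self.items[name] = Item(versions=[content])
        elif event == "updated":
            self.items[name].versions.append(content)
        elif event == "removed":
            self.items[name].state = "dormant"  # kept in the repository, not deleted
        else:
            raise ValueError(f"unknown event: {event}")


collection = CollectionObject("/collections/block")
collection.on_change("added", "print01.tif", b"v1")
collection.on_change("updated", "print01.tif", b"v2")
collection.on_change("removed", "print01.tif")
print(collection.items["print01.tif"])
```

Marking a removed file's item dormant instead of deleting it keeps earlier versions available if the file later reappears.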

Basic Collection Object

• Collection behavior
– getSearchForm
– performSearch()
– getItem()
– getItems()
– addItem()
– deleteItem()
– reindex()
– displayItem()
• Core behavior
– getCoverpage
– getThumbnail

Block Museum of Art, The Last Expression, Vesalius Figures, BBC Audio, History of Asian Art

Collection Content Model

• Search Form
• XSLT for search results
• Index
• Header/footer XML for result stream
• Member PIDs

Search Implementation

• FEDORA METS files currently indexed offline
• Plan to integrate update notification and indexing
• Search Engine
– Have 3 implementations: FEDORA native search, Sgrep, OpenText
• Investigating SRW/CQL
• Search results passed through XSLT
• Easy to provide search capability to collections

FEDORA – External Service

[Diagram labels: FEDORA; Dissemination Requests; External Services; Cache data; Data Request; Dissemination; BMECH; Image Server; Search Engine]

Virtual Collections

• Collection maintenance
– Topical galleries
• Ad-hoc or dynamic collections
– For classes...
– personal collections…
– special exhibits…

Database Integration

• SQL/XQuery for object “data streams”
• SQL/XQuery for object disseminations

Encyclopedia of Chicago

• In active development
• Metadata continually updated by research staff in Microsoft Access
• New content continually added to MS Access and file store
• Varied entry types
• All have dynamic “See Also”s

SQL Datastreams

• “See Also” and “Content” datastreams
– Cocoon URLs that perform SQL queries on dynamic research data and convert the results to XML
– Dynamic updates during development
– When the project is finished, will consider moving to a more robust database or “freezing” streams in the repository as “managed”
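The query-then-serialize pattern behind such a datastream can be sketched without Cocoon. The example below uses an in-memory SQLite database as a stand-in for the research database; the table, column names, element names, and sample rows are all hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# In-memory stand-in for the research database; schema and rows are
# hypothetical sample data.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE see_also (entry_id TEXT, target_id TEXT, label TEXT)")
db.executemany(
    "INSERT INTO see_also VALUES (?, ?, ?)",
    [("e100", "e200", "Railroads"), ("e100", "e300", "Stockyards")],
)


def see_also_datastream(entry_id: str) -> str:
    """Run the query at request time and serialize the rows as XML."""
    root = ET.Element("seeAlso", {"entry": entry_id})
    rows = db.execute(
        "SELECT target_id, label FROM see_also WHERE entry_id = ? ORDER BY label",
        (entry_id,),
    )
    for target_id, label in rows:
        ET.SubElement(root, "ref", {"target": target_id}).text = label
    return ET.tostring(root, encoding="unicode")


print(see_also_datastream("e100"))
```

Because the query runs on each request, edits made by research staff show up immediately; saving the returned XML into the repository would correspond to the “freeze” option.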

FEDORA – External Service

[Diagram labels: FEDORA; Dissemination Requests; External Services; Cache data; Data Request; Dissemination; Data stream; BMECH; RDBMS; Search Engine; Image Server]

WordHoard Text Analysis

• Large TEI XML Etext corpora
• Word level grammatical and frequency data
• Text requests via XQuery
• Word level lexical queries via SQL

Basic Text Behavior

• BMECH backed by eXist database

Viewer Object

• Presentation uncoupled from data object

Example Book Model

TEXT TOC Service

• Request for TOC keyed by text PID
• TOC XML requested from text
• TOC DOM cached in service
• User requests with “open nodes” parameter
• Pruned DOM styled with XSLT from Viewer content model
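The cache-and-prune steps of such a TOC service can be sketched as below. The TOC XML shape, the node ids, and the function names are assumptions made for illustration, and the final XSLT styling step is left out because the Python standard library has no XSLT processor.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical TOC XML, standing in for what a text object would return.
TOC_XML = """<toc>
  <node id="a1" label="Act 1">
    <node id="a1s1" label="Scene 1"/>
    <node id="a1s2" label="Scene 2"/>
  </node>
  <node id="a2" label="Act 2">
    <node id="a2s1" label="Scene 1"/>
  </node>
</toc>"""

_toc_cache = {}  # text PID -> parsed TOC tree


def fetch_toc(pid: str) -> ET.Element:
    """Parse the TOC once per PID; later requests reuse the cached tree."""
    if pid not in _toc_cache:
        # A real service would request the TOC from the text object here.
        _toc_cache[pid] = ET.fromstring(TOC_XML)
    return _toc_cache[pid]


def pruned_toc(pid: str, open_nodes: set) -> ET.Element:
    """Return a copy of the TOC with the children of closed nodes removed."""
    tree = copy.deepcopy(fetch_toc(pid))  # never mutate the cached tree

    def prune(node: ET.Element) -> None:
        for child in list(node):
            if child.get("id") in open_nodes:
                prune(child)
            else:
                for grandchild in list(child):
                    child.remove(grandchild)

    prune(tree)
    return tree


result = pruned_toc("demo:text1", {"a1"})
print([n.get("id") for n in result.iter("node")])
```

With only `a1` open, the result keeps both acts but shows scenes only under Act 1; the cached tree itself is left complete for the next request.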

Abstract Text Model

• Art collections
• Wall murals
• Photographs
• Historical maps
• GIS maps
• Newspapers
• Book page images
• Digital video
• Spoken word
• Literary works
• Encyclopedias
• Lexical data
• Census data
• Event data

Text Methods

• Structured text (UVa)
– getHeading
– getTOC(level)
– getChunk(idref)
– getPage(idref)
• Core behavior
– getCoverpage
– getThumbnail

Time-based Media Model

• Digital video
• Spoken word
• Literary works
• Encyclopedias
• Lexical data
• Census data
• Event data
• Art collections
• Wall murals
• Photographs
• Historical maps
• GIS maps
• Newspapers
• Book page images

Time-based Media Behaviors

• Core behavior
– getCoverpage
– getThumbnail
• Time-based media
– Play
– playSection()

Behaviors by Type

[Table: object types (Image, Map, A/V, Book, News, EText) against behaviors (Core, Image, Hi-Res, Layered, Geo, Time, Text); cell contents not preserved in the transcript]

Next Steps

• Implement more object types
– Event, video, tabular data
• Transactions
– Ad-hoc groupings of repository objects
– Asset management, Annotation
– Access control for user editing
• Interoperability
– Search protocols and repository interactions
• Consider application models
• Specialized clients

Specialized Clients

Viewer Object

Summary

• Code reuse through object abstraction
• Flexible implementation binding
• Comprehensible APIs for applications
• Stable APIs for content reuse

Thank You

Bill Parod

Academic Technologies

Northwestern University

[email protected]