ALA Annual June 2008 CONTENTdm in ConTEXT Geri Ingram OCLC Digital Collection Services Manager, Customer Services


Page 1: CONTENTdm in ConTEXT

ALA Annual, June 2008

Geri Ingram

OCLC Digital Collection Services

Manager, Customer Services

Page 2: Who should attend this morning?

To get the most from the next hour and a half, you should have either:

• Experience building CONTENTdm collections

OR

• Attended CONTENTdm Training

• Hands-on: on-site or on-line

• Demonstration only: Basic Use Webinar

Page 3: Outline

Part One: Review

• Software architecture

• Collections and Projects

Part Two: Demonstration

• Importing and searching full text

• Research papers

• Yearbooks

• Postcards

• Books

Page 4: CONTENTdm Architecture

[Architecture diagram. Components shown:]

• Acquisition Stations or “clients”

• JPEG2000 Extension

• OCR Extension

• Web-based ‘Add’

• OCLC Connexion ‘digital import’

• CONTENTdm Server: Unix (Linux, Solaris) or Windows (2000, 2003)

• Administration tools: Statistics, Authorization settings, Exporting to WorldCat

• CONTENTdm site pages

• Custom Web interfaces

• Archival repository

• Search engines (e.g., Google®), WorldCat.org, WorldCat Local

Page 5: Configuring a collection

What’s a Collection?

A group of objects (items) that

• Share the same metadata schema

• Live on the same CONTENTdm server

How many Collections can I have?

• Up to 200 collections per server

How many items can be in a collection?

• Up to 16 million items per collection

Page 6: Populating a collection

Through the use of a “Project”

What’s a CONTENTdm Project?

A workspace on your personal computer

• Into which you import up to 5000 items at a time

• Where items reside until you upload to the server

A group of settings that are applied to the items

• E.g., image display resolution, file format, branding

• E.g., automatic metadata input

How many Projects can I have at one time?

Limited only by the disk space on your workstation
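The 5000-item import limit suggests planning a large load as several Project-sized batches. A minimal Python sketch of that planning step follows. The helper name `batch_for_projects` and the file names are hypothetical and not part of CONTENTdm; the code only groups file names and does not drive the Acquisition Station.

```python
from typing import Iterator, List, Sequence

# Limit stated on the slide: up to 5000 items per import into a Project.
MAX_ITEMS_PER_IMPORT = 5000


def batch_for_projects(
    files: Sequence[str], limit: int = MAX_ITEMS_PER_IMPORT
) -> Iterator[List[str]]:
    """Yield consecutive slices of `files`, each holding at most `limit` names.

    Hypothetical planning helper; it does not talk to CONTENTdm in any way.
    """
    if limit < 1:
        raise ValueError("limit must be at least 1")
    for start in range(0, len(files), limit):
        yield list(files[start:start + limit])


# Made-up example: 12,000 scans split into import-sized batches.
scans = [f"scan_{n:05d}.tif" for n in range(12000)]
batches = list(batch_for_projects(scans))
print([len(batch) for batch in batches])
```

Each batch could then be imported into a Project and uploaded to the server before the next one is started.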

Page 7: RELATIONSHIP of Collection to Projects

[Diagram: a single Collection with many Projects: Project 1, Project 2, Project 3]

Page 8: What’s a CONTENTdm object or item?

CONTENTdm can store/index/search items in various formats

Display any file format:

• Viewed natively in a Web browser or via a plug-in

• Including: JPEG, JPEG2000, TIFF, PDF, WAV or MP3 audio, AVI or MPEG video, HTML, MrSID®

Simple items—e.g., images, sound files, research papers (We’ll load papers today as PDF items.)

Compound objects—multiple simple items assembled together

Page 9: CONTENTdm Compound Objects

CONTENTdm defined classes

• Documents: we will load a section of a yearbook

• Postcards: we will load a handwritten postcard with a typescript

• Monographs (structured documents): we will load a book with chapters

• Picture Cube (six-sided views)

Page 10: Dublin Core metadata element set

Page 11: Review: Basics of CONTENTdm

Simple and Qualified Dublin Core element sets offered

• 100 fields per collection

• Only DC.Title required to create a record

• Dublin Core is basis for cross-collection searching

Text is stored in a metadata field

• 128,000 characters per “full text search” field

200 collections per server, i.e., up to 200 different metadata schemas
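The limits above can be checked before a load. The sketch below is only an illustration in Python: the function `check_record` and the field names "Title" and "Full Text" are assumptions made for the example, not CONTENTdm names, and Python's `len` is used as a stand-in for the character count.

```python
from typing import Dict, List

# Limits quoted on the slide.
MAX_FIELDS_PER_COLLECTION = 100
MAX_FULL_TEXT_CHARS = 128_000


def check_record(
    record: Dict[str, str],
    full_text_field: str = "Full Text",  # assumed field name
    title_field: str = "Title",          # assumed field name
) -> List[str]:
    """Return a list of problems found in one metadata record."""
    problems = []
    if len(record) > MAX_FIELDS_PER_COLLECTION:
        problems.append(
            f"{len(record)} fields; limit is {MAX_FIELDS_PER_COLLECTION}"
        )
    # Only the title is required to create a record.
    if not record.get(title_field, "").strip():
        problems.append("Title is required")
    text_len = len(record.get(full_text_field, ""))
    if text_len > MAX_FULL_TEXT_CHARS:
        problems.append(
            f"full text has {text_len} characters; limit is {MAX_FULL_TEXT_CHARS}"
        )
    return problems


ok = check_record({"Title": "Yearbook 1952", "Full Text": "Class of 1952"})
bad = check_record({"Title": "", "Full Text": "x" * 128_001})
print(ok)
print(bad)
```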

Page 12: Providing searchable text

Remember: metadata fields can be made searchable

In addition, full text extracted from the digital object itself can be stored in a metadata field designated with the “Full text search” data type, in any of three ways:

1. Extracted by the server from PDFs (if the text was embedded to begin with)

2. Imported as a .txt transcript, either

• Typed from a handwritten original

or

• OCR’d in advance (external OCR engine)

3. Generated by OCR “on-the-fly” (integrated ABBYY FineReader®)
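The three routes can be read as a simple decision per file. The Python sketch below is an assumption-laden illustration: `choose_text_route` is a hypothetical helper, the caller must say whether a PDF carries embedded text (the code does not inspect the file), and the convention that a transcript shares its base name with the scan is assumed for the example.

```python
from pathlib import PurePath
from typing import Set


def choose_text_route(
    filename: str, has_embedded_text: bool, transcripts: Set[str]
) -> str:
    """Pick one of the three routes from the slide for a single file.

    `transcripts` holds the names of .txt files prepared in advance.
    Whether OCR suits a given file type is not checked here.
    """
    path = PurePath(filename)
    if path.suffix.lower() == ".pdf" and has_embedded_text:
        return "1: extracted by server from PDF"
    # Assumed naming convention: postcard.tif pairs with postcard.txt.
    if path.with_suffix(".txt").name in transcripts:
        return "2: imported .txt transcript"
    return "3: OCR on the fly"


print(choose_text_route("paper.pdf", True, set()))
print(choose_text_route("postcard.tif", False, {"postcard.txt"}))
print(choose_text_route("yearbook_p01.tif", False, set()))
```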

Page 13: Review: Populating collections

Acquisition Station Projects (PC client)

Add from CONTENTdm Administration (Browser-based)

Connexion digital import (WorldCat cataloging client function)

Page 14: Review: 1. Acquisition Station (PC client)

Project workspace

Project settings

Tools to manage

• Image settings

• Metadata settings

Page 15: Review: 2. Add (Web-based function)

Platform independent

Simple item add function may be used for single import of:

• Images—.jpg, .jp2, .tif (if bandwidth allows)

• PDF—single and multi-page

• Audio

• Video

Page 16: Review: 3. Connexion digital import function

Page 17: Simple items: some examples that carry text

Reformatted materials, e.g., books, documents, posters, broadsides, memos: scans may all contain text

Born-digital files, e.g., PDFs, single or multi-page

• Single-page PDFs viewed as items

• May opt for ‘in-line’ Adobe viewer

• Multi-page PDFs may be handled as if compound object of type “document”

• Server side conversion

• Import as simple item regardless of conversion choice

Page 18: Excerpted from Creating and managing text collections using CONTENTdm

Page 19: First things first. Recap: Prepare the Collection

For importing searchable text items, whether singly or in batch—at minimum:

1. One empty, searchable field is configured as “Full text search” data type to hold text

2. Collection is configured to treat PDFs as compound objects.

3. Collection is configured to provide Full Resolution file management.

4. Other fields are made searchable, hidden, moved, or added, as needed.

5. OPTIONAL: the Web templates are adjusted to suppress display of components of compound objects in search results.

Page 20: Recap: Prepare the items

These PDFs have been created with searchable text embedded.

Beware: Not all PDFs are created equal!

Page 21: Demonstration 1a: Simple items

• One simple item—PDF with ‘hidden’ text

Acquisition Station Import file

Web-based Add

Page 22: Demonstration 1b: Multiple simple items (Acquisition Station)

A batch of simple items, two ways:

Method A: Import a batch of simple digital items stored in folders

(where only Template Creator is used to generate metadata automatically)

Method B: Import a tab-delimited text file naming and describing the digital items

(where metadata also resides in the imported tab-delimited file)

Page 23: Recap: Behind the scenes: prepare the items, organize folders

Method A:

PDFs had been created with text (Adobe, Word conversion)

For importing a batch of PDFs in one load,

• All PDFs were stored in one folder.

• Digitization Training

Page 24: Recap: Behind the scenes: prepare the items, organize folders

Method B:

PDFs had been created with text (Adobe, Word conversion)

For importing a batch of PDFs in one load,

• All PDFs were stored in one folder.

For loading with tab-delimited files:

• Prepare a .txt file of metadata

• Place it in a directory different from the one holding the .pdf files
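A tab-delimited metadata file of this kind can be produced with a few lines of Python. This is a sketch under stated assumptions: the column names, the sample rows, the "File Name" column, the UTF-8 encoding, and the directory names are all invented for the example, and the exact layout the Acquisition Station expects should be taken from the CONTENTdm help files.

```python
import csv
import tempfile
from pathlib import Path

# Illustrative columns and rows; a real file uses the collection's own fields.
COLUMNS = ["Title", "Creator", "Date", "File Name"]
ROWS = [
    ["Research paper one", "Example, Author", "2008", "paper1.pdf"],
    ["Research paper two", "Sample, Writer", "2007", "paper2.pdf"],
]


def write_tab_delimited(target: Path, columns, rows) -> Path:
    """Write a header row plus one tab-separated line per item."""
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(columns)
        writer.writerows(rows)
    return target


# Keep the metadata file in a directory separate from the PDFs.
root = Path(tempfile.mkdtemp())
pdf_dir = root / "pdfs"
pdf_dir.mkdir()
metadata_file = write_tab_delimited(root / "metadata" / "papers.txt", COLUMNS, ROWS)
print(metadata_file.read_text(encoding="utf-8"))
```

Writing the file with the `csv` module rather than by joining strings keeps a value that happens to contain a tab or quote from silently shifting columns.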

Page 25: Demonstration 2: Single compound objects

Yearbook (OCR’d transcript produced on the fly)

Handwritten Postcard (with a previously created typescript file)

Book (Separate transcript produced in advance)

Page 26: Questions & Answers

Getting help with Text

• User Support Center

• Downloading the appropriate Acquisition Station

• JPEG2000

• Installing, activating the OCR extension

• Tutorials to study

• Help files related to text works

• Write [email protected]

Page 27: Questions?

[email protected]

Page 28: Collections of documents

Text-based letters, newspapers, diaries, yearbooks, PDFs, and more

Page 29: 60-Day Free CONTENTdm Evaluation

https://www3.oclc.org/app/contentdm/evaluation/

Page 30

Contact: Ron Gardner, OCLC

[email protected]

1-800-848-5878

For more information about CONTENTdm…

www.oclc.org/contentdm/