The DigiTool to FDA Program Lydia Motyka Florida Center for Library Automation

Post on 30-Dec-2015

The DigiTool to FDA Program

Lydia Motyka

Florida Center for Library Automation

What is the DigiTool to FDA Program?

A program developed by FCLA that converts exported DigiTool entities into Submission Information Packages (SIPs) for archiving in the FDA repository.

Archiving DigiTool objects

Archiving DigiTool objects is a four-step process:

– Step 1: Affiliates flag DigiTool objects for export.
– Step 2: DigiTool objects flagged for export by Affiliates are exported using the “Export Digital Entities” job.
– Step 3: The DigiTool to FDA program (D2F) aggregates DigiTool objects into Intellectual Entities and creates Submission Information Packages (SIPs) and descriptors in the format required by the FDA.
– Step 4: The standard FDA Ingest process and program are used to archive the SIPs in the FDA repository.

DigiTool to Preservation Archive Workflow

[Diagram: ETD information (a PDF plus metadata such as <title>) is flagged in DigiTool; the flag causes export of metadata and files; the program creates a Submission Information Package (SIP); the SIP is ingested into the Florida Digital Archive preservation repository.]

Definitions

DigiTool Digital Entity:

Digital entities contain the following components:
– A persistent DigiTool internal ID (PID)
– Metadata of various types that describe the object
– A stream_ref section that points to an object

Definitions

Submission Information Package (SIP):

An FDA Submission Information Package (SIP) is a set of files intended for ingest into the Florida Digital Archive. (It is recommended practice that a single SIP should include only those files that comprise a single Intellectual Entity.)

Definitions

Intellectual entity:

“An Intellectual Entity is defined as something that can be reasonably described and used as a unit, and corresponds roughly to what might be described by a bibliographic record: a book, a sound recording, a photograph. (In the case of serial publications, it is recommended that a SIP include only a single issue, not a volume or set of volumes.)”

FCLA Digital Archive (FDA) SIP Specification, Version 1.0

Selecting DigiTool entities for export to the FDA

• Only those objects with filestreams in formats suitable for long-term preservation should be selected for archiving. (Format information can be found on the FDA website.) Examples:
– ETDs containing PDFs
– Institutional Repository materials
– Masters of scanned images when TIFF files have been loaded into DigiTool

• Complex objects can be exported to the FDA, but care must be taken in flagging them.

Flagging DigiTool entities for export

• DigiTool entities must have the following Control Fields in order to be exported for archiving in the FDA:
– Pres. Level = “Preservation Master”
– Partition C must contain a valid FDA Account and Project code, separated by a comma

[Screenshot callouts: Pres. Level, Partition C]
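The two control-field requirements on this slide can be checked mechanically before export. The following is a minimal sketch, not FCLA's implementation: the `ControlFields` class, its attribute names, and the `export_problems` function are hypothetical, and the codes `ACCT` and `PROJ` are placeholders. The check only tests that Partition C has the right shape (two non-empty values separated by a comma); it cannot tell whether the codes are actually registered with the FDA.

```python
from dataclasses import dataclass

PRESERVATION_MASTER = "Preservation Master"


@dataclass
class ControlFields:
    """Hypothetical stand-in for the control fields of one DigiTool entity."""
    pid: str
    pres_level: str = ""
    partition_c: str = ""


def export_problems(fields: ControlFields) -> list[str]:
    """Return reasons the entity would not be archived (empty list if none)."""
    problems = []
    if fields.pres_level != PRESERVATION_MASTER:
        problems.append('Pres. Level is not "Preservation Master"')
    # Partition C must look like "Account,Project": exactly two non-empty parts.
    parts = [part.strip() for part in fields.partition_c.split(",")]
    if len(parts) != 2 or not all(parts):
        problems.append("Partition C is not 'Account,Project'")
    return problems


if __name__ == "__main__":
    master = ControlFields("111", PRESERVATION_MASTER, "ACCT,PROJ")  # placeholder codes
    thumbnail = ControlFields("222")  # both fields left blank
    print(export_problems(master))     # no problems expected
    print(export_problems(thumbnail))  # both checks fail
```

Returning a list of reasons rather than a single true/false value makes it possible to show an Affiliate everything that needs fixing on an entity in one pass.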

Flagging DigiTool objects

• Note that each DigiTool object desired for archiving must be flagged with Pres. Level = “Preservation Master”.
• Related objects not flagged as “Preservation Master” will not be exported for archiving.
• Objects without proper Partition C content will not be archived.
• Note that Usage Type = “Archive” is irrelevant to the DigiTool to FDA process.

Example – manifestations

3 manifestations

View Main (primary manifestation)

Do NOT flag THUMBNAIL or INDEX for archiving

The Export Process

• FCLA will run the DigiTool “Export Digital Entities” job nightly to extract all flagged DigiTool entities and their filestreams and metadata.

• Only those objects flagged with Pres. Level = “Preservation Master” will be exported. Related objects (manifestations, parent/children) not flagged as Preservation Masters will not be exported.

• The objects output by this program are copied to a special workspace where the DigiTool to FDA (D2F) program uses them as input.
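The selection rule above can be illustrated as a simple filter. This is only a sketch of the rule, not the DigiTool job itself: the `Entity` tuple, its fields, the file names, and the `select_for_export` function are hypothetical.

```python
from typing import NamedTuple


class Entity(NamedTuple):
    """Hypothetical, simplified view of a DigiTool digital entity."""
    pid: str
    pres_level: str
    filestream: str


def select_for_export(entities):
    """Keep only entities flagged as Preservation Masters.

    Related entities are not pulled in automatically: each one must carry
    the flag itself.
    """
    return [e for e in entities if e.pres_level == "Preservation Master"]


if __name__ == "__main__":
    entities = [
        Entity("111", "Preservation Master", "master.tif"),
        Entity("222", "Preservation Master", "alternate.tif"),
        Entity("333", "", "thumbnail.jpg"),  # unflagged manifestation
    ]
    print([e.pid for e in select_for_export(entities)])  # ['111', '222']
```

The point of the sketch is that the filter looks at each entity in isolation, which is why an unflagged manifestation or child is left behind even when its relatives are exported.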

The DigiTool to FDA conversion process

• Step 1: exported objects (metadata and filestreams) are aggregated into packages, one for each Intellectual Entity

• Step 2: metadata is extracted from the exported objects and a SIP descriptor file is created for the package

• Step 3: filestreams are listed as content files in the SIP descriptor

Aggregation into Intellectual Entities

• An Intellectual Entity (e.g. a book) in DigiTool can consist of a number of digital entities linked by “Manifestation”, “Includes” and “Part of” relationship links

• The “Export Digital Entities” job exports each flagged digital object separately

• After export, DigiTool to FDA uses relationship links to aggregate the exported objects into SIPs that include all of the filestreams that constitute the Intellectual Entity
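One way to picture the aggregation step is as grouping exported objects that are connected by relationship links. The sketch below is an illustration under stated assumptions, not the D2F code: it represents links as plain PID pairs (a hypothetical format), treats all three link types alike, follows links transitively, and ignores links to objects that were not exported.

```python
from collections import defaultdict


def aggregate(exported_pids, links):
    """Group exported PIDs into Intellectual Entities.

    exported_pids: iterable of PID strings present in the export workspace.
    links: iterable of (pid_a, pid_b) pairs, one per "Manifestation",
           "Includes" or "Part of" relationship (hypothetical representation).
    Returns a list of sorted PID lists, one list per Intellectual Entity.
    """
    exported = set(exported_pids)
    neighbours = defaultdict(set)
    for a, b in links:
        # Links to entities that were not exported cannot be followed.
        if a in exported and b in exported:
            neighbours[a].add(b)
            neighbours[b].add(a)

    groups, seen = [], set()
    for start in sorted(exported):
        if start in seen:
            continue
        # Depth-first walk over everything reachable from this PID.
        stack, group = [start], []
        while stack:
            pid = stack.pop()
            if pid in seen:
                continue
            seen.add(pid)
            group.append(pid)
            stack.extend(neighbours[pid] - seen)
        groups.append(sorted(group))
    return groups


if __name__ == "__main__":
    exported = ["111", "222", "444", "900"]
    links = [("111", "222"), ("111", "444"), ("111", "333")]  # 333 not exported
    print(aggregate(exported, links))  # [['111', '222', '444'], ['900']]
```

Under these assumptions, children exported without their parent have no exported object linking them together, so each ends up in a group of its own.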

Rules to Remember

• If you wish to archive multiple manifestations, make sure that one of the manifestations is flagged Usage Type = “View Main”

• If you have a complex object (a parent and its child objects), make sure to flag the parent for export

Example of Aggregation/Flagging in DigiTool: Single Master (ETD)

PID 111 (manifestation)
– Dublin Core descriptive metadata
– Filestream: PDF
– Pres. Level = Pres. Master
– Partition C = Account,Project

PID 222 (manifestation)
– Filestream: thumbnail
– Pres. Level = blank
– Partition C = blank

Example of Aggregation – Export: Single Master (ETD)

“Export Digital Entities” query: Select Pres. Level = Pres. Master and Date = today

In DigiTool: PID 111, PID 222

DigiTool Export Workspace: PID 111

Example of Aggregation – D2F: Single Master

Export Workspace: PID 111

D2F Workspace: SIP 111
• Descriptor (descriptive metadata)
• PDF content file

Example of Aggregation/Flagging in DigiTool: Manifestations

PID 111 (manifestation)
– Dublin Core descriptive metadata
– Usage Type = View (primary)
– Filestream: TIFF
– Pres. Level = Pres. Master
– Partition C = Account,Project

PID 222 (manifestation)
– Filestream: TIFF
– Pres. Level = Pres. Master
– Partition C = Account,Project

PID 333 (manifestation)
– Filestream: thumbnail
– Pres. Level = blank
– Partition C = blank

Example of Aggregation – Export: Manifestations

“Export Digital Entities” query: Select Pres. Level = Pres. Master and Date = today

In DigiTool: PID 111, PID 222, PID 333

DigiTool Export Workspace: PID 111, PID 222

Example of Aggregation – D2F: Manifestations

Export Workspace: PID 111 (View Primary), PID 222

D2F Workspace: SIP 111
• Descriptor (descriptive metadata)
• TIFF content file
• TIFF content file

The D2F program creates one SIP from the two exported objects, based on “Manifestation” links.

Example of Aggregation/Flagging in DigiTool: Complex Object

PID 111 (parent and manifestation)
– Dublin Core descriptive metadata
– No filestream
– Pres. Level = Pres. Master
– Partition C = Account,Project

PID 222 (child and manifestation)
– Filestream: TIFF
– Pres. Level = Pres. Master
– Partition C = Account,Project

PID 333 (manifestation)
– Filestream: thumbnail
– Pres. Level = blank
– Partition C = blank

PID 444 (child and manifestation)
– Filestream: JP2
– Pres. Level = Pres. Master
– Partition C = Account,Project

PID 555 (child and manifestation)
– Filestream: _*index.html
– Pres. Level = blank
– Partition C = blank

Example of Aggregation – Export: Complex Object

“Export Digital Entities” query: Select Pres. Level = Pres. Master and Date = today

In DigiTool: PID 111, PID 222, PID 333, PID 444, PID 555

DigiTool Export Workspace: PID 111, PID 222, PID 444

Example of Aggregation – D2F: Complex Object

Export Workspace: PID 111 (parent), PID 222, PID 444

D2F Workspace: SIP 111
• Descriptor (descriptive metadata)
• TIFF content file
• JP2 content file

The D2F program creates one SIP from the three exported objects, based on “Part of” and “Includes” links.

Creation of metadata in SIP descriptor

• Descriptive metadata is copied from the parent entity or main manifestation into the SIP descriptor (dmdSec).
• A checksum is generated for every file in the SIP and stored in the SIP descriptor.
• Other technical metadata is not copied from DigiTool into the SIP descriptor because the FDA generates its own.
• Administrative metadata (change history) is not copied into the SIP descriptor at this time. It may be added as Phase 2.
• Access restrictions are not copied into the SIP descriptor because the information is local to DigiTool.
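Generating a checksum for every file in a package is straightforward to sketch. The slides do not say which checksum algorithm D2F uses, so the algorithm is a parameter here and the SHA-256 default is chosen only for illustration; the function names and the one-directory-per-SIP layout are likewise assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def file_checksum(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of one file, read in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksums_for_sip(sip_dir, algorithm="sha256"):
    """Map each file's path (relative to the SIP directory) to its checksum."""
    root = Path(sip_dir)
    return {
        path.relative_to(root).as_posix(): file_checksum(path, algorithm)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


if __name__ == "__main__":
    # Build a throwaway one-file "SIP" so the example runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "thesis.pdf").write_bytes(b"example content")
        for name, value in checksums_for_sip(tmp).items():
            print(name, value)
```

Reading in fixed-size chunks matters for preservation masters such as large TIFF files, which may not fit comfortably in memory.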

Descriptive metadata in DigiTool

DigiTool supports the following descriptive metadata formats:
– MARC21
– MODS
– Dublin Core

The FDA currently loads title information into its database only from MODS and Dublin Core metadata, although all MARC21 metadata is archived in the descriptor file. (MARC21 title information will be included in DAITSS 2)

Step 3: Archiving converted SIPs

• SIPs created by D2F are sent to the FDA Ingest queue and processed by the standard FDA programs like all other SIPs.

• A successful ingest of a D2F-created SIP will result in an Ingest report being sent to the usual Affiliate reports address.

• Any D2F-created SIPs rejected by the FDA will result in Error reports being sent to the usual Affiliate reports address

Why would the FDA reject D2F SIPs?

Even though D2F creates SIPs according to FDA specifications, the SIPs can be rejected for the following reasons:
– The FDA Account and Project codes in Partition C are invalid or are not comma-separated.
– The SIP contains no content files (DigiTool filestreams).

Problems that won’t be reported:

The FDA ingest program does not recognize the following conditions as errors:

• If you flag a parent for export to the FDA but do not flag all of the appropriate children, critical portions of the Intellectual Entity won’t be archived.

• If you flag children but do not flag the parent, each child will create a separate SIP.

• If you don’t flag all manifestations appropriate for archiving, critical portions of the Intellectual Entity won’t be archived.
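Because ingest will not report these conditions, a check before export is one way to catch them. The sketch below is hypothetical throughout: the `Entity` record, the `usage_type` strings, and the `silent_problems` function are invented for illustration and are not part of DigiTool or D2F. It covers only the two parent/child cases and skips thumbnail and index objects, which are not meant to be flagged. Whether every manifestation "appropriate for archiving" has been flagged still needs human judgement.

```python
from dataclasses import dataclass
from typing import Optional

# Usage types that should not be flagged for archiving (labels are assumed).
NOT_ARCHIVED = {"THUMBNAIL", "INDEX"}


@dataclass
class Entity:
    """Hypothetical, simplified record for one DigiTool entity."""
    pid: str
    flagged: bool                 # Pres. Level = "Preservation Master"
    parent: Optional[str] = None  # PID of the parent, if this is a child
    usage_type: str = ""


def silent_problems(entities):
    """Return warnings about flagging mistakes that ingest would not report."""
    by_pid = {e.pid: e for e in entities}
    warnings = []
    for e in entities:
        if e.usage_type in NOT_ARCHIVED or e.parent is None:
            continue
        parent = by_pid.get(e.parent)
        if parent is None:
            continue  # parent unknown to this check; nothing to compare
        if parent.flagged and not e.flagged:
            warnings.append(
                f"{e.pid}: child of flagged parent {parent.pid} is not flagged"
            )
        elif e.flagged and not parent.flagged:
            warnings.append(
                f"{e.pid}: flagged but parent {parent.pid} is not; "
                "it would become a separate SIP"
            )
    return warnings


if __name__ == "__main__":
    entities = [
        Entity("111", True),
        Entity("222", True, parent="111"),
        Entity("444", False, parent="111"),  # forgotten child
        Entity("555", False, parent="111", usage_type="INDEX"),
    ]
    for warning in silent_problems(entities):
        print(warning)
```

The results are worded as warnings rather than errors because an unflagged child may be intentional; the check only points an Affiliate at objects worth a second look.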

What to do after D2F SIPs are archived

FCLA recommends that you record the FDA IEID (Intellectual Entity ID) in the Note Control Field of the DigiTool entity.

FDA IEID (from Ingest Report)

Next Steps?

• Beta testing, DigiTool workflow by DigiTool workflow

• Volunteers needed for beta testing

End