Federal Digitization: Moving to Common Guidelines
The U.S. Federal Agencies Digitization Initiative
http://www.digitizationguidelines.gov/
NDIIPP Partners Meeting, June 2009
Carl Fleischhauer, cfle@loc.gov
Michael Stelmach, mste@loc.gov
Library of Congress, Washington, DC
http://www.digitizationguidelines.gov/
Participating agencies . . .
http://www.digitizationguidelines.gov/stillimages/
Advisory Board
http://www.digitizationguidelines.gov/audio-visual/
Selected use case objectives for master images
Digitizing organization (or successor/receiving agency with an archiving mission) sustains the master (or migrated copies) for the long term without loss of essential features.
Selected use case objectives for master images
Digitizing organization uses master to produce derivative images for use cases like these:
(1) end-user-access interface
(2) other patron uses as listed
(3) OCR or other text-creation process
(4) document the condition of the original item
Selected use case objectives for derivative (service) images
Publisher uses image to illustrate a book.
Publisher uses image to illustrate a large poster.
Exhibit designer uses image for display "mural."
Broadcaster uses image in high-definition television program, zooming in for Ken Burns effect.
Selected use case objectives for derivative (service) images
Patron sees inline image or image set in interface. Some view the complete work, a virtual replica.
Patron prints images. Some require print-on-demand copy of complete work, a physical replica.
Patron is confident that the content received is an authentic reproduction, also receives information on restrictions.
Patron downloads a derivative image and, later, uses embedded metadata to identify content and determine technical provenance.
Plan to move from specifications with these factors
• color/monochromatic
• pixel density (good old “dpi”)
• bit depth
• . . . usually output-referred
To specifications with these factors
Categories: Tone; Resolution; Color; Uniformity; Noise
Metrics: Gamma; White Balance; Spatial Frequency Response (SFR); Resolution; Sampling Efficiency; Sampling Frequency; Luminance; Delta E 2000; Delta E(a*b*) 2000; Channel Mis-registration; % Lighting Non-uniformity; Total rms deviation
Working document from the National Library of the Netherlands.
Three columns, three categories. Specifications in the various rows.
Tools to Support Image Performance Measurement
Digital Image Conformance Evaluation (DICE) System
Device Target – Imaging Device Performance
Object Target – Actual Image Quality
Software for Evaluation/Validation
Based in LabVIEW
Data export for use in SQC/SPC
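Exported measurements of the kind mentioned on this slide could feed a simple statistical process control check. The following is a minimal sketch in Python, not part of DICE: it assumes the export is a plain series of one measured metric per scan session, and it uses the standard individuals-chart limits (mean plus or minus 2.66 times the average moving range). The function names and the sample values in the usage note are hypothetical.

```python
from statistics import mean


def individuals_limits(baseline):
    """Return (lower, center, upper) control limits for an individuals chart.

    Limits are center +/- 2.66 * average moving range, the usual
    constant for charts built from single measurements.
    """
    if len(baseline) < 2:
        raise ValueError("need at least two baseline measurements")
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    center = mean(baseline)
    spread = 2.66 * mean(moving_ranges)
    return center - spread, center, center + spread


def out_of_control(values, baseline):
    """Return the indexes of values that fall outside the baseline limits."""
    lower, _, upper = individuals_limits(baseline)
    return [i for i, v in enumerate(values) if not lower <= v <= upper]
```

For example, with a hypothetical baseline of noise readings `[2.0, 2.2, 1.9, 2.1, 2.0, 2.3]`, calling `out_of_control([2.1, 3.0, 1.4], baseline)` flags the second and third readings for review.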
Device and Object Targets
Object target as positioned for use
Thanks to OCLC for help with this part of the effort.
DICE Software – Main Panel
DICE – QC Summary Panel
Beyond performance measurement
Embedding metadata
TIFF header specification online now
Future: exploration of XMP
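To make the idea of embedded TIFF header metadata concrete, here is a minimal sketch in Python that reads ASCII tags from the first image file directory of a TIFF byte string. It is an illustration only, not the initiative's tooling. The tag numbers come from the TIFF 6.0 baseline; which tags the online specification actually calls for is not stated on the slide, so the selection here is an assumption. The helper `demo_tiff` is hypothetical and builds a header-only byte string for demonstration, not a viewable image.

```python
import struct

ASCII_TYPE = 2  # TIFF field type for NUL-terminated ASCII strings

# A few baseline TIFF 6.0 tag numbers (selection is an assumption).
TAG_NAMES = {
    270: "ImageDescription",
    305: "Software",
    306: "DateTime",
    315: "Artist",
    33432: "Copyright",
}


def read_ascii_tags(data):
    """Return {tag name or number: text} for ASCII tags in the first IFD."""
    order = {b"II": "<", b"MM": ">"}[data[:2]]  # little- or big-endian
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a TIFF file")
    (count,) = struct.unpack_from(order + "H", data, ifd_offset)
    tags = {}
    for i in range(count):
        entry = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes
        tag, field_type, n, value = struct.unpack_from(order + "HHI4s", data, entry)
        if field_type != ASCII_TYPE:
            continue
        if n <= 4:
            raw = value[:n]  # short values are stored inline
        else:
            (offset,) = struct.unpack(order + "I", value)
            raw = data[offset:offset + n]
        tags[TAG_NAMES.get(tag, tag)] = raw.rstrip(b"\x00").decode("ascii")
    return tags


def demo_tiff(artist):
    """Build a little-endian, header-only TIFF holding one Artist tag.

    Hypothetical helper; expects a string longer than three characters
    so that the value is stored at an offset rather than inline.
    """
    text = artist.encode("ascii") + b"\x00"
    ifd_offset = 8
    data_offset = ifd_offset + 2 + 12 + 4  # after count, one entry, next-IFD pointer
    header = struct.pack("<2sHI", b"II", 42, ifd_offset)
    entry = struct.pack("<HHII", 315, ASCII_TYPE, len(text), data_offset)
    return header + struct.pack("<H", 1) + entry + struct.pack("<I", 0) + text
```

Calling `read_ascii_tags(demo_tiff("Library of Congress"))` returns a dictionary with a single `Artist` entry, which is the kind of lookup a patron-side tool could use to identify content.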
Beyond performance measurement
Other “gaps” in prior guidelines to be investigated
• Image Sharpening
• Quality Management
• Image Specification Metric Aims and Limits
• Foldouts and Inserts in Bound Materials
• Color Encoding Accuracy
• Color Space Encoding
• Selection Criteria for Master Image File Format
Working draft pertaining to quality assurance and quality control
Work in progress at the National Archives and Records Administration
Audio-visual effort: recorded sound
Compile guidelines for recorded sound
Work in progress
Audio-visual effort: recorded sound
Audio-visual effort: video
While we wait for agencies to gain experience . . .
Exploration of “target formats”
Library of Congress Packard Campus, Culpeper
Smithsonian Institution Archives
National Archives, College Park
Lossless compressed
Each frame is a JPEG 2000 image
Lossless (reversible) transform
Produced by the SAMMA device
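As background on what "reversible" means here: in JPEG 2000 Part 1 the lossless path uses an integer 5/3 wavelet computed by lifting, so every step can be undone exactly. The sketch below is a simplified one-dimensional, single-level illustration in Python. It is not a JPEG 2000 codec and says nothing about how the SAMMA device is implemented; the function names are hypothetical and the mirrored boundary handling is chosen for brevity.

```python
def forward_53(x):
    """One level of integer 5/3 lifting on an even-length list of ints.

    Returns (s, d): the low-pass (smooth) and high-pass (detail) halves.
    """
    if len(x) < 2 or len(x) % 2:
        raise ValueError("expected an even-length signal")
    even, odd = x[0::2], x[1::2]
    half = len(even)
    # Predict step: each odd sample minus the floored mean of its even neighbours.
    d = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]  # mirror at the edge
        d.append(odd[i] - ((even[i] + right) >> 1))
    # Update step: each even sample plus a rounded quarter of nearby details.
    s = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[i]  # mirror at the edge
        s.append(even[i] + ((left + d[i] + 2) >> 2))
    return s, d


def inverse_53(s, d):
    """Undo forward_53 exactly by reversing the two lifting steps."""
    half = len(s)
    even = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[i]
        even.append(s[i] - ((left + d[i] + 2) >> 2))
    odd = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        odd.append(d[i] + ((even[i] + right) >> 1))
    x = [0] * (2 * half)
    x[0::2] = even
    x[1::2] = odd
    return x
```

Because only integer additions and shifts are used, and the inverse subtracts exactly what the forward pass added, `inverse_53(*forward_53(x))` returns the original samples bit for bit.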
What about film?
Most activity is service to outside customers, usually television documentary makers
Addressed by making a video copy, often still standard definition, understood to be an imperfect solution
National Aeronautics and Space Administration
www.nasa.gov
Most active high-resolution film scanning program: NASA Johnson Space Center
Please review our work and pass along your comments:
http://www.digitizationguidelines.gov/contact/
One of the subcategories
T.3. Documents with poor legibility or diffuse characters, e.g., carbon copies, Thermofax/Verifax, etc.; manuscripts or printed/typed pages with handwritten annotations or other markings; items with low inherent contrast, staining, fading, printed halftone illustrations, or included photographs.
One of the subcategories
Valuation: determined by curator or end users to have informational and artifactual value, but not requiring color reproduction.
From this document: http://www.digitizationguidelines.gov/stillimages/documents/Digital_Imaging_Framework.pdf
Image recommendation in 2004 guidelines from NARA
8-bit grayscale mode - adjust scan resolution to produce a QI of 8 for smallest significant character
or
8-bit grayscale mode - 400 ppi for documents with smallest significant character of 1.0 mm or larger
NOTE: Regardless of approach used, adjust scan resolution to produce a minimum pixel measurement across the long dimension of 4,000 lines for 8-bit files
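The recommendation above can be turned into a small calculation. The sketch below is written in Python and rests on one assumption not stated on the slide: that QI follows the commonly cited grayscale benchmarking formula QI = (ppi × 0.039 × h) / 2, with h the character height in millimetres and 0.039 an approximate millimetre-to-inch factor. Under that formula, 400 ppi at 1.0 mm gives a QI of about 7.8, which is consistent with the slide pairing "QI of 8" with "400 ppi". The function names are hypothetical.

```python
def ppi_for_qi(qi, char_height_mm):
    """Scan resolution needed to reach a target QI in grayscale mode.

    Assumes QI = (ppi * 0.039 * h) / 2, solved here for ppi.
    """
    return 2 * qi / (0.039 * char_height_mm)


def required_ppi(char_height_mm, long_dimension_in, qi=8, min_long_pixels=4000):
    """Apply both rules: the QI target and the minimum long-dimension pixel count."""
    by_legibility = ppi_for_qi(qi, char_height_mm)
    by_size = min_long_pixels / long_dimension_in
    return max(by_legibility, by_size)
```

For a letter-size page (11 inches on the long side) with 1.0 mm characters, the QI rule governs, at roughly 410 ppi. For a smaller 8-inch original with 2.0 mm characters, the 4,000-pixel rule governs instead, at 500 ppi.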
Uncompressed video
Stanford, Rutgers
4:2:2 or 4:4:4, 10-bit SDI stream
About 100 GB per content-hour
Another source reported 70 GB for 8-bit video
Rutgers spec: http://rucore.libraries.rutgers.edu/collab/ref/dos_avwg_video_obj_standard.pdf
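The storage figures quoted for uncompressed video can be sanity-checked with simple arithmetic. The sketch below, in Python, assumes standard-definition frames of 720 × 486 pixels at 29.97 frames per second; the slide does not state the raster or frame rate, so both are assumptions, and the function name is hypothetical. It counts active picture samples only and ignores audio, blanking, and any padding a file format may add.

```python
def gigabytes_per_hour(width, height, fps, bits_per_sample, samples_per_pixel):
    """Storage for one hour of uncompressed active picture, in decimal GB."""
    bits_per_second = width * height * samples_per_pixel * bits_per_sample * fps
    return bits_per_second * 3600 / 8 / 1e9


# 4:2:2 sampling averages two samples per pixel; 4:4:4 carries three.
ten_bit_422 = gigabytes_per_hour(720, 486, 29.97, 10, 2)
eight_bit_422 = gigabytes_per_hour(720, 486, 29.97, 8, 2)
ten_bit_444 = gigabytes_per_hour(720, 486, 29.97, 10, 3)
```

Under these assumptions, 10-bit 4:2:2 comes to roughly 94 GB per hour and 8-bit 4:2:2 to roughly 76 GB, in the same range as the "about 100 GB" and "70 GB" figures reported on the slide.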
The Netherlands Institute for Sound and Vision
Lossy compressed: MPEG-2 @ 50 mbps and 30 mbps (news)
SONY IMX, MPEG-2 @ 50 mbps
MPEG-2, all I-frames, 50 mbps
File size about 28 GB/hour
MPEG-4 (ITU-T H.263 and H.264) may come to play a bigger role as high-resolution content increases
From: http://www.edithouse.com.au/information/imx.html
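For constant-bitrate encodings like these, file size follows directly from the bitrate. The sketch below, in Python, converts a video bitrate in megabits per second to decimal gigabytes per hour; the function name is hypothetical.

```python
def gb_per_hour(megabits_per_second):
    """Convert a constant bitrate in Mbit/s to decimal gigabytes per hour."""
    bits_per_hour = megabits_per_second * 1e6 * 3600
    return bits_per_hour / 8 / 1e9


video_50 = gb_per_hour(50)  # the 50 mbps video stream alone
video_30 = gb_per_hour(30)  # the 30 mbps rate quoted for news
```

The 50 mbps video stream alone works out to 22.5 GB per hour, so the quoted file size of about 28 GB/hour presumably also includes audio and wrapper overhead. That explanation is an inference, not something the source states.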
Lossless compressed
Audio-visual effort: video
Video reformatting target format
Federal Agencies Working Group planned action: Documentation of MXF wrapping JPEG 2000 and uncompressed video
Emerging encoding preferences
For high value, uncompressed or lossless compressed is very attractive.
For second-rank content, some make a good case for modest-but-lossy compressed.
Audio-visual effort: recorded sound
System performance testing
Considering IASA TC04 pass-fail specifications
Appropriate, affordable equipment for tone generation not at hand
Audio-visual effort: video
Target encoding options
• Uncompressed
• Lossy compressed
• Lossless compressed
File wrapper options
• MXF
• AVI, QuickTime, other