Warsaw, 19-20 December 2006
Antonella Fresa

• General reference framework
• Practical guidelines for digitisation
• Cost reduction in digitisation
• Quality of cultural websites

Phases of a digitisation project
1. Digitisation project planning
2. Selecting source material for digitisation
3. Preparation for digitisation
4. Handling of originals
5. The digitisation process
6. Preservation of the digital master material
7. Metadata
8. Publication
9. IPR and copyright
10. Managing digital projects

1. Digitisation project planning

This is the first step in any digitisation project.
Time spent on planning makes the management and execution of the project easier.
- Clarify goals and objectives
- Specify timing and expected results
- Identify suitable personnel, with knowledge and skills
- Put in place a training plan for the additional expertise that the project will need
- Establish the copyright status of the material to be digitised
- Run a technical pilot to ensure full feasibility of the project

Important steps to be followed:
- the reasons for the project
- human resources
- research
- risks

The Reasons for the Project

- Aims should be concrete, explicit and documented

- Expected results should be realistic when compared with available resources

- The steps of the project should be defined and validated against the fixed aims

- Clear justification for the project should be given from an institutional point of view

Human Resources

- Ensure sufficient staff to carry out the project
- Assign staff to each task
- Identify training requirements
- Carry out training using the software and hardware which will be used during the project
- Aim at a small core of skilled, dedicated staff (rather than a large group of ‘occasional’ staff)

Research of relevant information

Research into other projects which are addressing similar issues

- it helps in avoiding mistakes
- it puts the project team in contact with others who have completed similar projects, giving the opportunity to learn from their experience
- it adds credibility and enhances the results of the project

Risks
- Intellectual Property Rights management should be clarified from the beginning of the project for all the content that will be made available online
- It is important to guarantee that source material is not corrupt and has been produced by authorised institutions
- Authenticity
- Financing of the project
- Level of skill in the project

2. Selecting source material for digitisation
- The ideal choice is to digitise all the material that belongs to a certain unique collection
- The selection of the material depends on the goals of the project, e.g.:
  - a school could decide to digitise material in line with a syllabus;
  - a museum could decide to digitise its best-known holdings
- Other reasons could be:
  - Legal and financial constraints,
  - Institutional policies,
  - Technical difficulties,
  - etc.

Important steps to be followed:
• Establish selection criteria
• Selection against criteria

Establish selection criteria
- At least the following criteria may be considered:
  - Access to material which would otherwise be unavailable, or of limited availability;
  - Wider and easier access to very popular material;
  - Condition of the originals;
  - Preservation of delicate originals;
  - Project theme;
  - Copyright and IPR;
  - Availability of existing digital versions;
  - Cost of digitisation;
  - Whether the source material can be viewed online;

- The selected criteria should be discussed with the relevant stakeholders

- Criteria should be fully documented

Selection against criteria

How to manage the actual selection process:
- Candidates for digitisation should be checked against the criteria, and each decision must be documented;
- At this point the project comes into contact with the items to be digitised for the first time; it is the best time to create a knowledge base of the items, within the scope of the project (e.g. the location of the items, whether the original item is a rare artifact, etc.)
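
The selection step can be pictured as a small record-keeping routine. The following is only a minimal sketch: the criterion names, the `Candidate` fields and the rule "rights must be cleared and at least two criteria met" are assumptions made for illustration, not rules taken from the guidelines.

```python
from dataclasses import dataclass, field

# Hypothetical criterion names, loosely based on the criteria listed in the slides.
CRITERIA = ("improves_access", "fragile_original", "fits_project_theme", "rights_cleared")


@dataclass
class Candidate:
    identifier: str
    location: str            # knowledge-base entry: where the original is kept
    is_rare_artifact: bool   # knowledge-base entry
    checks: dict = field(default_factory=dict)  # criterion name -> True/False


def decide(candidate, required=("rights_cleared",), minimum_met=2):
    """Check one candidate against the criteria and return a documented decision."""
    met = [name for name in CRITERIA if candidate.checks.get(name, False)]
    missing_required = [name for name in required if name not in met]
    selected = not missing_required and len(met) >= minimum_met
    return {
        "identifier": candidate.identifier,
        "selected": selected,
        "criteria_met": met,
        "missing_required": missing_required,
    }


item = Candidate(
    "MS-0042", "Reading room, shelf 3", True,
    {"improves_access": True, "fragile_original": True, "rights_cleared": True},
)
record = decide(item)
print(record)
```

Keeping the returned record alongside the item data gives a written trace of why each item was or was not selected.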

3. Preparation for digitisation

This phase is about putting in place the hardware and software systems that will be used throughout the digitisation project.

The working environment should be appropriate to the material being digitised (e.g. light, humidity, vibration, movement of the original, etc.)

Important steps to be followed:
• Hardware
• Software
• Environment

Hardware

This generally consists of image capture equipment connected to an appropriate computing platform.

Two digitisation methods can be distinguished:
- Scanning
- Use of a digital camera
Plus the generation of metadata and descriptions

Hardware (cont.)
- Install the equipment and check its quality and functionality before starting the actual digitisation
- Work only on non-sensitive material until the full installation of the hardware is completed
- Try to avoid folding or mosaic scanning, and therefore choose the largest scanner available
- Capture at the highest resolution possible (lower resolutions can be derived from high resolutions, but not vice versa)
- Whenever possible, use a lossless file format (e.g. TIFF)
- Three parameters are important for the quality of the digital image:
  - number of pixels in the image
  - bit depth
  - quality of the optical lens
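
To see how pixel count and bit depth drive storage needs, the rough arithmetic below may help. It is a sketch only: the 8 x 10 inch original and the two resolutions are assumed example values, and the result ignores file headers and any compression.

```python
def uncompressed_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Approximate uncompressed size of a scan, ignoring file headers."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Assumed example: an 8 x 10 inch original captured in 24-bit colour.
for dpi in (300, 600):
    size = uncompressed_size_bytes(8, 10, dpi, 24)
    print(f"{dpi} dpi: {size / 1_000_000:.1f} MB")
```

Doubling the resolution quadruples the number of pixels, so storage capacity should be planned before capture starts.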

Hardware (cont.)
- Have appropriate holders for originals
- If possible, consult an experienced digital photographer before beginning
- Keep the photographic plane exactly parallel to the plane of the material to be digitised, in order to avoid distortions
- Provide appropriate lighting for the use of digital cameras
- Apply suitable filters
- Prepare a computer with significant storage capacity and back up regularly

Software
- Set an automatic calibration routine to run when the scanner or the digital camera is turned on
- Install suitable image processing software
- The software should be able to do, at a minimum:
  - Open very large files
  - Modify the resolution and the colour depth
  - Save multiple different versions, in different file sizes
  - Select and copy a portion of the image and save it as a different file
  - Export images in different formats

Software (cont.)

- Be careful with free software
- For OCR, the software should:
  - Allow review and editing on a single screen
  - Suggest possible corrections for misread words
  - Support the use of multiple text columns (e.g. newspaper layout),
  - etc.

Environment

This is mostly relevant to rare or delicate material
Experts’ opinion should be sought
Use a dedicated area for the digitisation activities for the whole duration of the project: excessive movement and rearrangement of the work space can lead to damage or loss of the originals, as well as loss of time
Smoking, eating and drinking in the vicinity of the items should of course not be permitted.

4. Handling of the originals

This phase deals with ensuring that no damage is done to the originals during the digitisation process

Some steps should be taken for this purpose, on the basis of the specific requirements of the originals, e.g.:
- Establish the correct micro-climate
- Move the digitisation lab close to the location of the material, instead of moving the originals, etc.

Steps to be taken before moving and manipulating the original material

- Consult the person who is usually responsible for the source material
- Avoid unbinding bound books and records
- Always remove staples, paper clips and other fasteners
- Prepare a ‘discipline’ to be followed by all the people who will handle delicate originals

5. The digitisation process

Three operations are taken into consideration here:
- Using scanners
- Using digital cameras
- OCR

Using scanners
- Use a flatbed scanner only for material which will not be damaged by being pressed flat;
- Ensure that the glass scanning plate is perfectly clean; this generates better images and protects the original from being soiled
- When scanning an item in multiple parts, ensure sufficient overlap to allow the image to be reconstructed during post-processing
- Test scanners with non-sensitive material, and train users with non-sensitive material as well
- Establish a file naming convention that will also allow, in the future, mapping back to the original source
- Establish quality control procedures for the digital image and the associated metadata
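
A file naming convention that maps back to the original source could look like the sketch below. The pattern `<collection>_<item>_<page>_<version>.tif`, the field widths and the function names are all hypothetical; any convention that can be parsed back unambiguously serves the same purpose.

```python
import re

# Hypothetical convention: <collection>_<item number>_<page number>_<version>.tif
NAME_RE = re.compile(
    r"^(?P<collection>[a-z0-9]+)_(?P<item>\d{5})_(?P<page>\d{4})_(?P<version>master|delivery)\.tif$"
)


def build_name(collection, item, page, version="master"):
    """Build a file name and reject anything that breaks the convention."""
    name = f"{collection.lower()}_{item:05d}_{page:04d}_{version}.tif"
    if NAME_RE.match(name) is None:
        raise ValueError(f"name does not follow the convention: {name}")
    return name


def parse_name(name):
    """Recover the source identifiers from a file name."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"name does not follow the convention: {name}")
    return {
        "collection": match["collection"],
        "item": int(match["item"]),
        "page": int(match["page"]),
        "version": match["version"],
    }


name = build_name("maps", 42, 7)
print(name)
print(parse_name(name))
```

Validating names at creation time also gives a simple quality control check: a file that cannot be parsed cannot be traced to its original.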

Using digital cameras
- Use a motorized carriage for the camera and a steady stand board for holding the material
- Use tailed lights
- Organise training from a specialist in digital photography
- Ensure that the background shows the items clearly
- Avoid changing light conditions between shots and between parts or sides of the same image
- Use appropriate lenses and lens filters to combat color distortion

OCR
OCR is a mechanism to extract text from images, making possible searching, indexing, format conversion and other data processing operations

Some hints:
- Evaluate multiple OCR packages to find the one that best suits the needs of the material to be digitised
- Choose OCR software packages that provide a user-friendly interface for manual handling of mistakes
- Apply pre-treatment of originals, when possible, and/or image processing on the digitised image, to improve the quality of the scanned image to be processed by the OCR software
- Verify the availability of dictionaries in the language of the source material
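
The role of a dictionary in OCR correction can be illustrated with Python's standard `difflib` module. This is a sketch under stated assumptions: the five-word list stands in for a real dictionary in the language of the source material, and the 0.8 similarity cutoff is an arbitrary example value, not a recommendation from the guidelines.

```python
import difflib

# Tiny stand-in word list; a real project would load a full dictionary
# in the language of the source material.
DICTIONARY = ["digitisation", "library", "museum", "archive", "collection"]


def suggest(word, dictionary=DICTIONARY, n=3, cutoff=0.8):
    """Return [] for known words, otherwise close dictionary matches."""
    lowered = word.lower()
    if lowered in dictionary:
        return []
    return difflib.get_close_matches(lowered, dictionary, n=n, cutoff=cutoff)


for token in ["museum", "1ibrary", "archlve", "xyzzy"]:
    print(token, "->", suggest(token))
```

Tokens with no suggestion at all are the ones most likely to need manual review.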

6. Preservation of digital master material

Aims:
- To protect and keep accessible the data which were created through the digitisation process (images, OCRed texts and metadata)
- To deal with obsolescence of digital formats and storage media

Important steps to be followed:
• File formats
• Media choices

File formats

- Before deciding on the file format, take into account the relevant standards

- The default digitisation output file format for digital images is TIFF (Tagged Image File Format)

- Large master files are generally stored locally; smaller versions are derived from the master, to be transmitted on the Internet (TIFF, JPEG or GIF, …)

Media choices
- The output of the digitisation process will be stored on servers; these machines need to be backed up and, if the server is not dedicated exclusively to the project, the digital content is better stored on removable media;
- The rapid change of storage media, primarily driven by the electronics industry, has had major effects on digitisation projects in the past. The increasing trend to ‘store on the Internet’ facilitates the migration of data from place to place and from medium to medium.
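
When data is migrated from one medium to another, it is useful to confirm that the copy is identical to the master. The slides do not prescribe a verification method; the sketch below assumes a SHA-256 checksum comparison, and the file name and temporary directories are placeholders standing in for real storage media.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Checksum a file in chunks so that very large masters fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate(source, target_dir):
    """Copy a file to a new location and confirm the copy is bit-identical."""
    target = Path(target_dir) / Path(source).name
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"checksum mismatch after copying {source}")
    return target


# Demonstration with temporary directories standing in for two media.
with tempfile.TemporaryDirectory() as old, tempfile.TemporaryDirectory() as new:
    master = Path(old) / "maps_00042_0007_master.tif"
    master.write_bytes(b"placeholder image data")
    copy = migrate(master, new)
    copy_matches = sha256_of(master) == sha256_of(copy)
    print(copy.name, "verified:", copy_matches)
```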

7. Metadata
This area is one of the most dynamic in the whole digitisation sector, as well as in information retrieval, web searching, data exchange, enterprise application integration, etc.

The selected metadata model is of particular importance, as it influences the choice of attributes used to describe an object.

Related to this is the selection of a standard model for metadata.
Three categories of metadata can be considered:
- Descriptive – for description and identification of items,
- Structural – for navigation and presentation,
- Administrative – for management and processing.

Metadata for object description

- The use of appropriate metadata is very important for enabling search and retrieval of material from digital collections.

- There exist many metadata models. The model should be chosen on the basis of the project’s goals.

- Where possible, controlled vocabularies should be used

Metadata standards

- Ref. Dublin Core: include as many DC fields as possible in your model, and provide a mapping to DC if you use a different model

- Avoid creating a totally new metadata standard just for your own project
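
A mapping from a local model to Dublin Core can be as simple as a lookup table. In the sketch below the fifteen element names are those of the Dublin Core Metadata Element Set, while the local field names (`object_name`, `maker`, and so on) and the sample record are invented purely for illustration.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}

# Hypothetical local field names mapped to DC elements.
LOCAL_TO_DC = {
    "object_name": "title",
    "maker": "creator",
    "date_made": "date",
    "inventory_no": "identifier",
    "licence": "rights",
}


def to_dublin_core(record):
    """Map a local record to DC elements and report fields with no mapping."""
    dc, unmapped = {}, []
    for field_name, value in record.items():
        element = LOCAL_TO_DC.get(field_name)
        if element in DC_ELEMENTS:
            dc.setdefault(element, []).append(value)
        else:
            unmapped.append(field_name)
    return dc, unmapped


local_record = {
    "object_name": "Map of Warsaw",
    "maker": "Unknown",
    "inventory_no": "MS-0042",
    "shelf": "3B",
}
dc_record, unmapped_fields = to_dublin_core(local_record)
print(dc_record)
print("no DC mapping for:", unmapped_fields)
```

Listing the unmapped fields makes it visible which parts of a local model fall outside Dublin Core.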

8. Publication

At this stage of the project:
- The digital master material has been created, stored and backed up,
- A suitable metadata model has been selected,
- Metadata associated with each item has been created.
Next, publication includes:
- Processing of the digital master (e.g. to reduce its size so that it can be downloaded from the Internet)
- Actual publication.

Image processing

- Create delivery versions of the digital masters: at least one delivery version, plus “thumbnails”.

- Typically, reduction will occur on:
  - File format (e.g. JPEG or GIF)
  - Color resolution (e.g. 256 colors)
  - DPI (e.g. 72 DPI)
  - For video: reduce frames per second
  - For audio: reduce sampling rate.
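
The effect of reducing DPI on pixel dimensions can be worked out before any image is processed. The sketch below assumes a master of 4800 x 6000 pixels captured at 600 DPI and a thumbnail limit of 150 pixels on the longest side; both are example values, and the function names are illustrative.

```python
def scaled_dimensions(width_px, height_px, master_dpi, delivery_dpi):
    """Pixel dimensions of a delivery version at a lower DPI."""
    factor = delivery_dpi / master_dpi
    return max(1, round(width_px * factor)), max(1, round(height_px * factor))


def thumbnail_dimensions(width_px, height_px, max_side=150):
    """Fit an image inside a square box, keeping its aspect ratio."""
    factor = min(1.0, max_side / max(width_px, height_px))
    return max(1, round(width_px * factor)), max(1, round(height_px * factor))


# Assumed master: 4800 x 6000 pixels captured at 600 DPI.
print(scaled_dimensions(4800, 6000, 600, 72))
print(thumbnail_dimensions(4800, 6000))
```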

3D and virtual reality

- Ensure that viewers for 3D and VR material are readily available. Make the viewer software available from the same site as the material.

- Evaluate multiple viewers before endorsing one or another.

Online publication

- Ref. to MINERVA quality principles for cultural web sites

9. IPR and copyright

- Publication of any material online should be accompanied by some consideration of the IPR associated with the material and of the derived rights to copy the material that are given to users.

- A range of technical options is available to protect the IPR and copyright of material placed on the Internet. They are surveyed in the following slides.

Establishing copyright
Initial step: establish ownership of the rights
- Each country has a different regulatory framework
- Certain items (e.g. newspapers) have clear copyright rules associated with them, which allow free copying once the papers are of a certain age.
- For items whose copyright is vested in the institution carrying out the project, internal permission will be required for digitisation and online publication

- For items whose copyright is held by a third party, permission must be obtained in writing

- Securing permission to digitise and publish may involve payment.

Safeguarding copyright
Establish whether or not copyright must be safeguarded. If yes, then agree on the procedure to be used for safeguarding:
- visible watermark on each image
- invisible watermark on each image
- encryption of images
- restricting publication to low-resolution images
- displaying images only to registered, authorised members of a particular community (e.g. eLearning licenses)

10. Managing digitisation projects

- The success of any project, including digitisation projects, is influenced to a large degree by the management of the project.

Important steps to be followed:
• Digitisation process management
• Team development
• Staff training
• Working with third parties for technical assistance
• Cooperative projects and content sharing
• Costs

Digitisation process management
- Workflow and knowledge base are important instruments for the success of the project
- Name, identifier and any other relevant information for each item should be entered into the knowledge base as soon as the item is selected to be digitised;
- Items which require similar activities should be digitised together
- Locations and phone numbers of key service delivery personnel should be noted and remain available throughout the project.
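
A knowledge base entry and the grouping of items that need similar activities might be sketched as follows. The `Item` fields and the idea of grouping by capture method are assumptions for illustration; a real knowledge base would hold whatever information the project finds relevant.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Item:
    identifier: str
    name: str
    location: str
    method: str  # assumed grouping key, e.g. "flatbed scanner" or "digital camera"


def batches_by_method(items):
    """Group item identifiers so that similar work is done together."""
    groups = defaultdict(list)
    for item in items:
        groups[item.method].append(item.identifier)
    return dict(groups)


knowledge_base = [
    Item("MS-0042", "Map of Warsaw", "Shelf 3B", "digital camera"),
    Item("LT-0007", "Letter, 1921", "Box 12", "flatbed scanner"),
    Item("MS-0043", "City plan", "Shelf 3B", "digital camera"),
]
batches = batches_by_method(knowledge_base)
print(batches)
```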

Team development
- If possible, include at least one person with appropriate skills in the project team
- At the beginning of the project, assess the state of knowledge of the personnel, identify training needs and fulfill these before starting
- IT skills should be complemented with specialist skills whenever necessary (e.g. for metadata generation, for handling of delicate material, etc.)
- In general, it is better to have a small core of skilled personnel than a larger population of occasional participants

Staff training
- Areas to be considered: technology, metadata generation, handling of source material
- Requirements for training should be included in the knowledge base; this helps with the recruitment of new staff should it become necessary
- Some training can be obtained ‘on the job’; some must be delivered before starting
- Technology training can be delivered by other projects in the same institution
- Curatorial training is better provided by the people who are actually responsible for the care of the original material.

Working with third parties for technical assistance

Engaging the services of one or more third parties for technical assistance allows the cultural body to concentrate on its own areas of expertise, to the benefit of the quality of the whole project’s results.

Most commonly provided services:
- Actual digitisation
- Management of the project
- Integration with third party systems
- Software development
- etc.

Suggestions for the relationship with third parties for technical assistance
- Define clear and strict contracts, including a documented and signed specification of the product or service to be provided
- Review the work on a regular basis, to ensure that what is being delivered is in fact what the project requested
- Once a working relationship with a supplier has been established, the value of changing supplier may need to be estimated in advance.

Cooperative projects and content sharing
Guidelines for establishing and managing multi-partner projects are many, but a few pointers are included here for the convenience of the attending trainees:
- Assign clear roles and responsibilities to each partner
- Establish a common mode of communication across partners
- Document clearly the IPR of all partners
- Establish and sign a partnership agreement among all the participants

Costs
Digitisation projects are normally expensive.
Careful monitoring of financial aspects is therefore very important for the success of the project.
In addition to the costs for the actual digitisation work:
- Take into account all start-up and infrastructural costs, i.e.: initial planning, data specifications, tracking and documentation systems, staff training, facilities, storage and delivery systems, etc.
- Plan the running costs for the maintenance of the digital collection, which will start as soon as the development phase is completed

Operational costs
These include:
- Time for handling source material
- Preparation of source material (conservation, cleaning, etc.)
- Capture time (from set-up to naming and saving)
- Cataloguing and handling of metadata
- Hardware and software costs per digitised item (depreciation)
- Quality assurance time
- Hardware and software maintenance
- Technical support related to capture
- Project management time
- Training related to capture
Time costs are normally calculated as a percentage of total salary cost per day
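
The rule that time costs are a percentage of the daily salary cost can be turned into a rough per-item estimate. Everything numeric below is an assumption chosen for the example (a daily salary cost of 200, the shares of the day per activity, an equipment charge of 40 per day and 80 items per day); only the calculation pattern follows the slide.

```python
def time_cost(daily_salary_cost, share_of_day):
    """Cost of an activity that takes a given share of one working day."""
    return daily_salary_cost * share_of_day


def cost_per_item(daily_salary_cost, shares, equipment_cost_per_day, items_per_day):
    """Staff time plus daily equipment depreciation, spread over the day's output."""
    staff = sum(time_cost(daily_salary_cost, share) for share in shares.values())
    return (staff + equipment_cost_per_day) / items_per_day


# Assumed shares of one operator's day spent on each activity.
shares = {
    "handling": 0.10,
    "capture": 0.50,
    "metadata": 0.25,
    "quality assurance": 0.15,
}
per_item = cost_per_item(200.0, shares, equipment_cost_per_day=40.0, items_per_day=80)
print(f"estimated cost per item: {per_item:.2f}")
```

Tracking the shares per activity also shows where time goes, which helps when looking for ways to reduce costs.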