pdfa in a nutshell 211

Upload: bookbinding

Post on 02-Jun-2018

225 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/11/2019 PDFA in a Nutshell 211

    1/20

    PDF/Ain a Nutshell 2.0PDF for long-term archiving

    The history of the ISO standard

    All versions from PDF/A-1 to PDF/A-3

    How users benefit from PDF/A

    The technical background

    Tools for creating PDF/A files

    Validating PDF/A files

    PDF/A in law and administration

    PDF/A in finance and industry

    Alexandra Oettler

  • 8/11/2019 PDFA in a Nutshell 211

    2/20

    PDF/A in a Nutshell 2.0PDF for long-term archivingThe ISO Standard from PDF/A-1 to PDF/A-3

  • 8/11/2019 PDFA in a Nutshell 211

    3/20

    This work, including all its component parts, is copyright protected. All rights based thereupon are reserved, including those of translation, reprinting,

    presentation, extraction of illustrations or tables, broadcasting, microfilming or reproduction by any other means, or storage in any data-processing device,

    in whole or in part. Reproduction of this work or any part of this work is only permitted where legally specified in the Copyright Act of the Federal Republic

    of Germany dated the 9thof September 1965.

    2013 Association for Digital Document Standards e. V., Berlin

    [email protected]

    Printed in Germany

    The use of any names, trade names, trade descriptions etc. in this work, even those not specially identified as such, does not justify the assumption that

    these names are free according to trademark protection law and thus usable by anyone.

    Text: Alexandra Oettler

    Layout, cover design, design and composition: Alexandra Oettler

    Cover image: Paulgeor, Dreamstime.com

    Picture credits: Page 5: Photocase; Page 6: Sepp Huberbauer, Photocase; Page 8: aoe; Page 13: EU Publications Office; Page 14: Rui Frias, Istockphoto.com;

    Page 15: MBPHOTO, Istockphoto.com; Page 18: Photocase.

    Printed by: Galrev Druck- und Verlagsgesellschaft Hesse & Partner OHG

  • 8/11/2019 PDFA in a Nutshell 211

    4/20

    PDF/A the ISO standard for long-term archiving 5

    Te decisive advantages o PDF/AWidespread acceptance o PDF/A

    PDF/A facts an introduction to the standard 6

    An archiving ormatWhy PDF/A and not just PDF?

    A short history of PDF/A 7

    Becoming an ISO standard

    PDF/A catches on

    The technical side of the PDF/A standard 8

    PDF/A-1: Te first archiving standardPDF/A-2: Based on PDF 1.7PDF/A-3: One more eatureConormance levels: A, B, U

    The most important reasons to use PDF/A 9

    Typical uses for PDF/A 10

    PDF/A creation tools 11

    Desktop sofwareServer-based solutionsProgramming librariesIntegrated PDF/A unctions

    Validation: Is it really PDF/A? 12

    When do I need to validate?Finding the right validation solution

    PDF/A in public administration 13

    PDF/A in finance and industry 14

    Industry documentationBanking and insuranceHealthcareE-invoicing

    PDF/A in legislation and justice 15

    Federal jurisdiction in the United States courts

    Te Italian Chamber o CommerceAustria: BAIKGermany: Land registration

    What the users and experts say 16

    PDF/A and the other PDF standards 17

    PDF/XPDF/APDF/E

    PDFPDF/VPDF/UA

    The myths and legends surrounding PDF/A 18

    Further information on PDF/A 19

    Te portal to the PDF AssociationPDF Association events

    Membership

    Contents

    PDF/A in a Nutshell 2.0 III

    PDF/A in a Nutshell 2.0 III

  • 8/11/2019 PDFA in a Nutshell 211

    5/20

    PDF/A the ISO standard forlong-term archiving

    Up to the end o the 20th

    century, physi-cal media ormats (paper, microfilmsand microfiches) were the only optionor businesses and public authoritiesstoring documents or the long term in areproducible ormat. Te major draw-back to these analogue approaches wasthe significant time and effort required:documents are hard to search through,trained personnel are required, specialistequipment is needed to read microfilms,and entire climate-controlled rooms are

    needed to store documents.Te first digital archiving ormat to

    gain ground in many countries was theIFF image ormat. In 1993, however, amodern, more powerul ormat becameavailable in the orm o PDF. Tis be-came the basis on which the standard ar-chive ormat PDF/A was developed (seepage 7).

    For companies, public authorities andprivate users needing to store digital in-ormation or a long period o time be it

    5 years, 50 or 500 the PDF/A standardis now the clear choice o file ormat.

    PDF/A is a multi-part ISO standarddeveloped over many years o committeework by industry associations, business-es and public authorities around theworld. Te result is a file format basedon PDF, known as PDF/A, which pro-vides a mechanism for representing elec-tronic documents in a manner that

    preserves their visual appearance over

    time, independent of the tools and sys-

    tems used for creating, storing or render-ing the files.(ISO 19005-1, quoted romthe introduction).

    Te first part o the standard, PDF/A-1,has been available since the 1sto Octo-ber 2005. Its official designation is ISO19005-1:2005. Document management Electronic document file format for long-term preservation Part 1: Use of PDF 1.4(PDF/A-1).

    Since then, two urther parts have beenmade available to users: PDF/A-2 (since

    2011) and PDF/A-3 (since 2012). Teseparts exist in parallel and are optimisedto meet particular needs (see page 8).

    Te PDF/A standards amily regulateshow to create electronic documents toensure they can be reliably reproducedor decades to come. Te standard doesnot describe how to build a revision-saearchive, nor the theory behind one.

    The decisive advantages of PDF/A

    A PDF/A file contains everythingneeded to display it and nothing whichcould negatively impact the display.

    PDF/A files can be used on any plat-orm.

    Free programs exist or displayingPDF/A files.

    Te multi-part PDF/A standard offers

    great flexibility to users.

    Widespread acceptance of PDF/APDF/A is becoming more and morecommon, be it in industry, public ad-ministration, financial services or ac-ademia. A large number o authoritiesand institutions worldwide recommendPDF/A or specifically require the use othe standard (see page 13).

    The ISO (International Organization

    for Standardization) is the largest

    organisation in the world for devel-

    oping and publishing international

    standards .

    Introduction

    PDF/A in a Nutshell 2.0 5

    Introduction

    PDF/A in a Nutshell 2.0 5

  • 8/11/2019 PDFA in a Nutshell 211

    6/20

    PDF/A facts an introductionto the standard

    Current file ormats used by popular ap-plications are simply not suitable or pub-lic authorities, businesses and individualusers needing to store unalterable digitaldocuments or long periods o time. Wordprocessors such as Microsof Word orOpenOffice Writer create files which canlook very different depending on the plat-orm used to view them. ext and imagesmay appear different than intended orthey may not appear at all. Nowadays,there are also the questions o how these

    programs will develop in the uture, andwhether or not it will still be possible toopen and view older files an unaccept-able risk when considering the timescalesinvolved in long-term archiving.

    An archiving formatWhen using email or the internet to dis-tribute careully designed documentscontaining text and images, users are in-creasingly choosing PDF. Afer all, thePortable Document Format can embed

    all elements o a document within itsel.Tis can include onts and images, butalso 3D objects, audio and video. Em-bedded onts are optional; it is also possi-ble (in order to save on file size, orexample) to link to one instead. Tis,however, carries the risk that not all ma-chines will correctly display the PDF.

    PDF has also gained such broad world-wide acceptance because ree programsexist or all devices and operating sys-

    tems to view PDF documents. Whether

    viewed on a tablet, a smartphone or adesktop computer, a PDF file will usuallylook the same.Document archives, however, require anexceptionally high standard: the contentmust always appear exactly the sameunder all circumstances. Particularlybecause o its universal availability andworldwide acceptance, it makes sense tobuild on PDF to create an archiving stan-dard or digital documents.

    Why PDF/A and not just PDF?Put in the simplest possible terms,PDF/A is a PDF which orbids certainunctions which could hinder long-termarchiving. PDF/A also demands that thefile meet certain requirements whichguarantee reliable reproduction.

    For example, files must not be encrypt-ed with a password, as all content mustalways be ully available. Embedded vid-eo and audio data are also prohibited:PDF/A consciously avoids anything that

    requires external sofware or display orplayback. JavaScript and certain actionsare also orbidden, as executing themcould potentially alter the PDF.

    PDF/A also places higher demands onthe inormation it contains. All requiredonts (or at least all glyphs or the specificcharacters used) must be embedded withinthe PDF. o ensure a uniorm colour ap-pearance on a variety o platorms and de-vices, colour inormation must be given in

    a platorm-independent ormat using ICCcolour profiles. Te sofware must also usethe XMP ormat or metadata (which isused to store the data identiying the file asa PDF/A, or example).

    PDF/A also sets technical limits: or ex-ample, the page size is limited to an edgelength o either 5.08 metres (PDF/A-1)or up to 381 kilometres (PDF/A-2 andPDF/A-3).

    PDF/A is an industry-recognised

    ISO standard. Future software

    development must reflect the

    need to work reliably with

    these documents.

    6 PDF/A in a Nutshell 2.0

    Introduction

    6 PDF/A in a Nutshell 2.0

    Introduction

  • 8/11/2019 PDFA in a Nutshell 211

    7/20

    A short history of PDF/A

    Tose who first needed to store docu-ments in a uture-proo digital ormatused the popular image ormat IFF(agged Image File Format). Tis ormatwas used or a long time, particularly orscanned documents, but it has a numbero drawbacks.For example, the IFF raster ormat con-tains no text-based inormation, meaningfiles cannot be searched by their text con-tent. And i the IFF file contains colourimages or pages, it will become signifi-

    cantly larger; effective compression is allbut impossible. Only black-and-white lineimages (which is sometimes enough orscanned text pages) can save much spacein IFF ormat.

    Contrary to popular belie, IFF is not anISO standard. Te resolution, colour andmetadata settings or IFF files are mostlylef to the individual users discretion.

    Becoming an ISO standardAs Adobe Systems 1993-published Por-

    table Document Format (PDF) grew inpopularity, users and developers beganto recognise its potential or long-termarchiving.

    In 2002, specialists rom libraries andarchives, rom administrative bodies,rom industry and rom the judicial sys-tem assembled in order to develop a pur-pose-built file ormat or standardisedarchiving. A working group within theISO (International Organisation or

    Standardisation) took up the task: repre-sentatives rom a wide range o US-basedassociations and ederal authorities in-cluding AIIM (Association or Inor-mation and Image Management), NPES(Association or Suppliers o Printing,Publishing and Converting echnolo-gies) and NARA (National Archives andRecords Administration) met with ex-perts rom the library sector (Harvard

    University Libraries, Library o Con-gress), the judicial system (Administra-tive Office o the United States Courts)and industry developers (including Ado-be Systems and Kodak). Afer a numbero meetings and a comprehensive testingand approval phase, the ISO publishedPDF/A on the 1sto October 2005 underthe designation ISO 19005-1:2005. Itwas the worlds first standard file ormator digital long-term archiving.

    PDF/A catches onIn 2006, to promote recognition oPDF/A, a group o sofware developersounded the PDF/A Competence Centre(today a part o the PDF Association) asan industry association or digital doc-ument standards. Trough seminars,conerences, publications and not leastthrough its website www.pda.org, theassociation has helped spread practicalinormation about the ISO standard (seepage 19).

    Initially active in Germany and Swit-zerland in particular, within a ew yearsthe PDF Association was able to expandits area o operations across Europe,America, the Middle East, Asia and Aus-tralia. By the end o 2012, the Associationhad 143 members across 25 countries.

    oday, PDF/A has ound broad ac-ceptance in all sectors where docu-ments are stored long-term. Numerousdocument management solutions pro-

    vide direct support or archiving withPDF/A. More and more countries arerecommending the standard in publicadministration, or even specifically re-quiring it (see page 13). Meanwhile,a correspondingly broad selection oPDF/A creation and validation sofwareis now available (see page 12), romsingle-workstation solutions to auto-mated server-based systems.

    PDF/As wide-ranging every-

    day use is also seen in the

    number of common programs

    which support it. Free word

    processing software such as

    OpenOffice and LibreOffice can

    create PDF/A files at the click

    of a button, and Adobe Reader

    faithfully displays PDF/A docu-

    ments as they were intended

    to be seen. Microsoft Officehas also supported directly

    saving as PDF/A since 2007.

    Introduction

    PDF/A in a Nutshell 2.0 7

    Introduction

    PDF/A in a Nutshell 2.0 7

  • 8/11/2019 PDFA in a Nutshell 211

    8/20

    The technical side of thePDF/A standard

    Afer the first part o PDF/A was pub-lished, two more parts arrived. Tese arenot replacements or part 1, however;rather, they offer additional options orarchiving PDF documents. All existingPDF/A files remain ully valid.

    PDF/A-1: The first archiving standardPDF/A-1 is based on PDF version 1.4,which first appeared in 2001. All re-sources (images, graphics, typographiccharacters) must be embedded within

    the PDF/A document itsel. A PDF/A filerequires precise, platorm-independentcolour data using ICC profiles, and XMPor the document metadata. ransparentelements, some orms o compression(LZW, JPEG2000), PDF layers, and cer-tain actions or JavaScript are orbidden.A PDF/A file must not be password-pro-tected. PDF/A-1 expressly supports em-bedded digital signatures and the use ohyperlinks.

    PDF/A-2: Based on PDF 1.7PDF/A-2 was published in 2011 as ISO19005-2. Based on PDF version 1.7 (seepage 17), , which has since been stan-dardised as ISO 32000-1, it makes useo this versions new eatures. Tis meansPDF/A-2 allows JPEG2000 compression,transparent elements and PDF layers.PDF/A-2 also allows you to embed Open-ype onts and supports PAdES (PDFAdvanced Electronic Signatures)-com-

    pliant digital signatures. One particularlyimportant innovation is the containerunction: PDF/A files can be embeddedwithin a PDF/A-2 document.

    PDF/A-3: One more featurePDF/A-3 has been available since October2012. A PDF/A-3 document allows youto embed any file ormat desired notjust PDF/A documents. For example, a

    PDF/A-3 file can contain the original filerom which it was generated. Te PDF/Astandard does not regulate the suitabilityo these embedded files or archiving.

    Conformance levels: A, B, UTe different conormance levels reflectthe quality o the archived document anddepend on the input material and thedocuments purpose.

    Level A (Accessible) meets all require-

    ments or the standard, including thelogical structure o the document and itscorrect reading order. ext must be ex-tractable and the logical structure mustmatch the natural reading order. Fontsused must meet stringent requirements.Tis PDF/A level can usually only be metby converting born-digital documents.

    Level B (Basic) guarantees that the con-tent o the document can be unambigu-ously reproduced. Level B files are easier

    to create than Level A, but Level B doesnot guarantee 100% text extraction orsearchability. It does not necessarily meanthat the content can be reused withoutany problems. Scanned paper documentscan usually be converted to PDF/A Con-ormance Level B without any extra work.

    Level U(Unicode) was introduced alongwith PDF/A-2. It expands ConormanceLevel B to speciy that all text can be

    mapped to standard Unicode charactercodes.

    Nomenclature: PDF/A versionsand levels are simply given one

    after another. A PDF/A-1b file,

    for example, is a PDF file for

    long-term archiving, of the first

    generation, with visually repro-

    ducible content.

    8 PDF/A in a Nutshell 2.0

    Technical Information

    8 PDF/A in a Nutshell 2.0

    Technical Information

  • 8/11/2019 PDFA in a Nutshell 211

    9/20

    The most important reasonsto use PDF/A

    he PDF/A standard oers practicalsolutions or a wide variety o tasks,bringing advantages to many areas oapplication.

    Long-term archiving: PDF/A providesan ISO-standardised ormat to allthose who need to store digital docu-ments or long periods o time. hiscan include archives, libraries, banks,insurance irms and others.

    Legally binding documents: PDF/A isan excellent option or digitally signeddocuments and records. Te ISO stan-dard allows embedded electronic signa-tures and specifies only their minimumrequirements. Tis means that PDF/Adocuments can always be digitallysigned using the very latest technology,even as it develops in the uture.

    Science and research: PDF/A reliablydisplays special characters or math-

    ematical ormulas or old languages,as all required symbols are embeddedinto the ile itsel. ICC proiles pro-vide total colour control, supportingresearch work in ields such as medi-cine, archaeology or cultural history.As a result, people are always indingmore uses or PDF/A in the academicsector: some universities now only ac-cept assignments and dissertations inthis ISO-standard ormat.

    Global integration: storing inorma-tion in different languages requirescomprehensive support or all kinds owriting systems around the world. InJapanese, Arabic, or Cyrillic, PDF/Amakes sure that texts can always be cor-rectly displayed on any device, includ-ing the reading direction. It also allowsfixed-layout printing.

    Platform-independent : PDF, and soPDF/A too, are platorm-indepen-dent. hanks to PDF/A, documentssuch as invoices, brochures, manualsor research reports can be made reli-ably available through a wide range ochannels.

    Full text searching: PDF/A helps youto ind and access speciic inormationwithin a data set. his is even possiblewith scanned documents, as the stan-

    dard permits searchable text createdthrough optical character recognition(OCR). Even Conormance Level B(Basic) supports this eature.

    Extra search options: XMP metadatacan be used to add additional struc-tured inormation to the document,such as the author, description o thecontent, or source and copyright in-ormation. As a result, the user cansearch or additional stored keywords,

    categories or values within the dataset.

    Use content again and again: PDF/AConormance Level A makes it easierto reuse content. Such iles are veryeasy to convert to Word, HML or eB-ook ormats.

    Use PDF/A in combination with otherstandards: PDF/A is closely related to

    the ISOs other PDF standards. As a re-sult, a PDF/A ile can oten meet therequirements or universally accessiblePDFs (or disabled users) as deined inthe PDF/UA standard. Digital booksin PDF/A ormat are well-suited orprinting on demand i they also meetthe PDF/X standard or digital printdocuments. For an overview o PDFstandards, see page 17.

    Uses and Benefits

    PDF/A in a Nutshell 2.0 9

    Uses and Benefits

    PDF/A in a Nutshell 2.0 9

  • 8/11/2019 PDFA in a Nutshell 211

    10/20

    Typical uses for PDF/A

    Te PDF/A standard has proved its suit-ability or a wide variety o tasks. Herewe can show you just a ew brie practicalexamples.

    Scanned documents for archiving: PDF/Ais widely used to digitise paper-basedfiles and records. A document scannerreads the original text, and specially de-signed sofware automatically convertsthe data into a searchable PDF/A file.

    Archive migration: Solutions exist tohelp digital archives which are still usingolder ormats to migrate to PDF/A. Inmost cases, the process can even be au-tomated.

    Incoming and outgoing mail: Wheth-er a company receives letters or emails,PDF/A provides a reliable storage ormator them. Letters can be automaticallyscanned and archived as PDF/A files,and emails and their attachments can be

    stored in PDF/A too. Tere are also greatadvantages to storing a copy o all out-going mail in PDF/A ormat. Outgoingmail data can be retrieved rom popu-lar print data streams such as AFP (Ad-vanced Function Presentation).

    Office documents: I your presentations,spreadsheets and text documents arelikely to have long-term relevance andneed to remain available or long periods

    o time, then PDF/A is the perect ormatin which to archive them. Te originalprogramme used may directly supportthis to some extent, or you can use addi-tional sofware packages. In both cases,the process can be automated.

    Documentation and typesetting: Bro-chures and instruction books are usuallycreated using layout programs and edi-

    torial systems. As a result, an archivablePDF can be created at the same time with-out significant extra work. Tis can eitherbe done using external solutions (ratherthan using the original program that cre-ated the document) or using a print-readyPDF which is ofen already available inthe PDF/X ormat, the ISO standard orprint documents (see page 17).

    Creating documents from databases:Many PDF/A files were originally creat-

    ed rom databases or were created usingXML data. Tis structured input data o-ten allows you to create PDF/A Conor-mance Level A documents. You can alsoconvert orms to PDF/A.

    Digital document folders: As oPDF/A-3, source documents can alsobe embedded directly into a PDF/A filein their original ormat. Tis eliminatestime-consuming hybrid archiving pro-cesses in which additional documents

    (Excel tables, image files, CAD draw-ings) had to be managed separatelyrom the archived PDF/A file in theiroriginal ormats. Tanks to PDF/A-3,all relevant inormation is now con-tained within a single file.

    Team collaboration: PDF/A-3 in par-ticular is exceptionally powerul andflexible when used within modern col-laboration rameworks. Its hybrid ap-

    proach means that each document cancontain the current working version o adocument and the final archive-ready version. As a result, PDF/A-3 providesideal support or all the most importantunctions within a Microsof SharePointenvironment, or example. In particu-lar, this includes collaborative work ondocuments as well as distribution andarchiving.

    10 PDF/A in a Nutshell 2.0

    Uses and Benefits

    10 PDF/A in a Nutshell 2.0

    Uses and Benefits

  • 8/11/2019 PDFA in a Nutshell 211

    11/20

    PDF/A creation tools

    PDF/A documents can be created in avariety o ways:

    From scanned documents

    By direct conversion o the sourcedata

    By exporting rom the program usedto create the source document

    Using an intermediate step which

    turns a PDF ile into PDF/A

    Using print output ormats or printdata streams such as GDI, PCL,PostScript, AFP and XPS.

    his section will sketch out just a ewtypical approaches. For a more exten-sive list, including speciic productswhich allow you to create PDF/A iles,visit the PDF Associations website atwww.pda.org.

    Desktop softwareOn a standard workstation, oice ap-plications in particular will already o-er inbuilt tools (or can easily beretroitted) to export word-processediles, spreadsheets or presentations di-rectly to PDF/A. I the agged PDFoption is enabled, then Microsot O-ice, OpenOice and LibreOice willeven support PDF/A Conormance

    Level A or semantically structureddata.Some PDF/A conversion solutions

    use print data creation tools to generatePDF or PDF/A iles. Another approachis to use programming libraries to con-vert data or directly write to PDF.

    Adobe Acrobat is used or PDFs inmany industries, and it provides com-prehensive support or PDF/A. his

    sotware can be used to examine PDF/Ailes to ensure they actually meet thePDF/A standard (see page 12).

    Individual-workstation products al-so exist which allow users to scan toPDF/A, including OCR. his sotwareis sometimes supplied with the scanneritsel.

    Server-based solutionsServer-based solutions exist or massPDF/A creation. his allows busi-

    ness-wide standardisation o yourworking processes and lets you managelarge volumes o data. Some desktopPC products also have server-basedversions or high-volume processing.

    Programming librariesProgramming libraries allow devel-opers to add PDF/A unctionality totheir own applications without hav-ing to develop the needed technologyrom scratch. Some desktop or serv-

    er-based products are also availableas programming libraries. Supplierscan thus integrate extra unctionalityinto their solutions with minimal de-velopment work on their part. heseextra unctions may include PDF/Acreation, validation and management.A business I department can alsoadd PDF/A eatures to the companysown sotware environment or internalprojects.

    Integrated PDF/A functionsMany document and output manage-ment solutions providers oer mod-ules which can be used to perormPDF/A unctions. Many systems arealready available or high-volumemanagement o a wide variety o in-put and output channels in PDF andPDF/A ormat.

    Some word proce ssing softwa re

    (such as OpenOffice, shown here) al-

    low you to create PDF/A-1, includingTagged PDF as a prerequisite for

    Conformance Level A.

    Tools

    PDF/A in a Nutshell 2.0 11

    Tools

    PDF/A in a Nutshell 2.0 11

  • 8/11/2019 PDFA in a Nutshell 211

    12/20

    Validation: Is it reallyPDF/A?

    It is not always easy to tell at first glancewhether an existing PDF file actuallymeets the ISOs PDF/A standard. Appli-cations such as Adobe Acrobat and Ado-be Reader do use a pale blue banner toindicate when a file claims to be PDF/Acompliant, but this is only an indicatorand should not be used in place o a ullexamination to ensure the documentmeets the PDF/A standard.

    o be absolutely certain, you can per-orm a validation check which examines

    all relevant parts o a document.

    When do I need to validate?During the typical lie cycle o a PDF/Afile, there are particular points when

    it should be checked or complete ISOcompliance.

    After creation: A PDF/A file should bevalidated immediately afer creation, toensure that the process was carried outsuccessully.

    On receipt: I a company receives aPDF/A file (by email, or example, or by

    other means), validation is advised. A-ter all, it is impossible to know how thePDF/A file was created.

    Prior to transmission/distribution: Whensending a PDF/A file by email or makingit available online, validation in advanceis recommended.

    Before archiving: You must validatedata beore placing it in a digital archive.

    At the end of certain processes: Certainprocessing stages which should not ad-versely affect a PDF/A file under normalcircumstances (such as inserting extrapages) may, in rare cases, cause a PDF/Ato become invalid. Validation will clariythe situation or you.

    I the validation process detects a vio-lation o the PDF/A standard, the file canofen be repaired using the appropriatesofware. I this is not possible, the only

    other option is to recreate the PDF docu-ment rom scratch.

    Finding the right validation solutionSeveral validation programs are availableon the market. As with PDF/A creationsofware, you can choose between work-station applications, server-based solu-tions and modules or workflow systems.PDF/A files can also be validated usingprogramming libraries.

    Some PDF/A creation tools can alsoperorm a test afer conversion to ensurethe result meets the ISO standard. Nat-urally, these solutions can also validatePDF/A documents delivered rom else-where.

    Adobe Acrobat Pro, already used inmany industries, can test almost all PDFISO standards including the three partso PDF/A, using its Preflight unction.

    Adobe Acroba t and Adobe Rea der canindicate whe ther a docum ent may

    be PDF/A compliant, but this is not a

    replace ment for a ful l validati on.

    Acrobat s Preflight func tion checks for co mpliance wi th

    PDF standards.

    Note regarding process valida-

    tion: during an automated stage

    of processing, such as scanning

    to PDF/A, the process as a whole

    is validated rather than each

    individual PDF in the process.

    12 PDF/A in a Nutshell 2.0

    Validation

    12 PDF/A in a Nutshell 2.0

    Validation

  • 8/11/2019 PDFA in a Nutshell 211

    13/20

    PDF/A in publicadministration

    Many government authorities and publicinstitutions worldwide now speciy or-mats to use or digital data. Governmentoffices ofen recommend that workingdocuments use open file ormats. Moreand more ofen, PDF/A is the only or-mat accepted or final-version files.

    EU Publications Office: Te EU Publica-tions Office is tasked with providing ac-cess to all laws, declarations andpublications. Since 2007, the EU Digital

    Library has been tasked with storingprinted texts some o which date backto 1957 in digital orm as well. In a pilotproject, an external digitisation teamtook two years to turn 130,000 paperdocuments in eleven languages intoPDF/A-1b files with searchable text. Animportant actor in choosing PDF/A wasthat XMP metadata can be used or key-words and other bibliographic inorma-tion. o simpliy print-on-demand bookorders, the archive files are now also

    available in the ISO standard ormat ordigital print data, PDF/X-3.

    The European Patent Office: Since April2010, the European Patent Office haspublished patent documents not just inPDF ormat, but also in PDF/A. For thePatent Office, an important eature o thePDF/A ormat is ound in the way it usesmetadata: the XMP metadata fields caninclude the publication number, the pat-

    entee and the international patent clas-sification.

    Comply or Explain in the Netherlands:Te government o the Netherlands hasa Comply or Explain policy regardingopen standard sofware. Te national ac-tion plan Nederland Open in Verbind-ing enorces the use o open standardsand requests the use o standard file or-

    mats, namely ODF, PDF and PDF/A. Allpublic institutions in the country mustuse open standard sofware, as must allcompanies which take on public con-tracts. Any entity which cannot meetthese requirements must ully justiy thisdecision. In many cases, it is generallyeasier and ultimately more cost-effectiveto switch over to a standardised process.

    Brazil: In 2007, the Brazilian govern-ment introduced the e-PING architec-

    ture which regulates the provision odigital services. For final versions o adocument to be transmitted or archived,Brazil preers PDF/A.

    Denmark: Since April 2011, all Danishgovernment bodies are required to savenon-editable documents in PDF/A or-mat.

    France: Since early 2009, the Frenchauthorities have recommended the ISOs

    PDF/A standard or archiving adminis-trative documents with static, unchang-ing content.

    Switzerland: Due to archiving require-ments, all electronic communicationbetween citizens and administrative au-thorities is required to use the PDF/A fileormat. Tis regulation has been in orcesince 2008.

    Germany: German registry offices haverun an electronic register o births, mar-riages and deaths since 2009; or regis-tered data, they use PDF/A and XML. By2014, these offices are expected to haveswitched over to an all-digital system.

    For urther user reports and the latestPDF/A recommendations, visit the web-site o PDF Association: www.pda.org.

    The EU Publications Office in Luxem-

    bourg.

    Libraries and archives are taking

    a leading role in implementingand developing PDF/A. In the

    USA and Europe in particular,

    these institutions are choosing

    the ISO standard for long-term

    archiving.

    Areas of Application

    PDF/A in a Nutshell 2.0 13

    Areas of Application

    PDF/A in a Nutshell 2.0 13

  • 8/11/2019 PDFA in a Nutshell 211

    14/20

    PDF/A in finance andindustry

    Businesses benefit rom the ISO stan-dard or long-term archiving because ithelps them to store digital documentsin compliance with legal requirements.PDF/A-3 has urther increased adoptionrates within the financial sector, as thisnew part o the standard can also be usedto keep source documents organised (seepage 8).

    Industry documentationTe aeroplane manuacturer Airbus was

    a pioneer in the use o PDF/A. Aeroplaneblueprints must be preserved or at least99 years. Back in 2002, one o the man-uacturers working groups recognisedthat PDF was in some respects well-suit-ed to long-term archiving, but also thatit contained a number o problematicunctions. Te team thereore first devel-oped a minimal PDF which was usedor digital archiving until PDF/A becameavailable.

    Te construction industry has rec-

    ognised the advantages o eatures likePDF/A-3s container unctionality.Mechanical engineering companies, orexample, can preserve original 3D mod-els in any ormat as part o the PDF/A-3file.

    NIRMA (the Nuclear Inormation andRecords Management Association) alsorecommends PDF/A when working withnuclear technology in the United States.Te US-based energy provider Southern

    Co. has been using PDF/A or years toensure that all digital documents relatingto nuclear installations will remain read-able into the uture.

    Banking and insurancePDF/A helps the German health insur-ance provider echniker Krankenkassewith a bonus programme or its custom-ers. It begins with the company scan-

    ning existing bonus booklets in colour.Tis file is used to create a PDF/A withcompressed images and searchable text,which can then be archived and sent on-wards.

    Helaba, the state bank o Hesse andTuringia, uses PDF/A to handle incom-ing post and to archive emails. It alsostores digital credit documents in PDF/Aormat.

    Te banking and insurance sector o-ten requires credit and insurance files to

    be retained or 50 or more years.

    HealthcareAs a rule, documents such as doctorsnotes, statements, lab reports, and X-rayand tomographic images must be re-tained or 30 years or more.

    Te medical centre at Greiswald Uni-versity Hospital, or example, uses PDF/Ato archive its patient records includingdigital signatures and timestamps. Dueto requirements o legal certainty, digital

    signatures play a critical role in medicalstatements.

    PDF/A is used in doctors practices aswell as in clinics. Te Lake ConstanceRadiation Oncology Centre uses PDF/Ato process and archive digital patient re-cords.

    E-invoicingTe standardised document and dataormat ZUGFeRD was created to make

    it easier to exchange digital invoices. Itis the result o a joint initiative betweenBIKOM (Bundesverband Inormati-onswirtschaf, elekommunikation undneue Medien e.V.) and FeRD (Forumelektronische Rechnung Deutschland).Te ZUGFeRD exchange ormat alsouses PDF/A-3. It embeds invoice data inXML ormat to allow the recipient o aninvoice to process it automatically.

    14 PDF/A in a Nutshell 2.0

    Areas of Application

    14 PDF/A in a Nutshell 2.0

    Areas of Application

  • 8/11/2019 PDFA in a Nutshell 211

    15/20

    Digital documents are increasingly re-placing traditional paper documents.Existing paper documents are beingscanned and digitised, while digi-tal-only processes are becoming moreand more common and PDF/A is atthe very oreront o this trend.

    PDF/A also plays a signiicant role inlegislation and justice in many coun-tries. Legal bills and court records usu-ally have to be stored or exceptionallylong periods o time. he ability to

    search text and XMP metadata inPDF/A iles can make it much quickerand easier to ind and allocate digitalrecords.

    Federal jurisdiction in the United

    States courts

    o see the huge signiicance o PDF/Ain the legal ield, you need only look atthe Administrative Oice o the UnitedStates Courts, which is taking a lead-

    ing role in standardising PDF/A. Workhas been ongoing since 2002 with theAmerican associations AIIM (Asso-ciation or Inormation and ImageManagement) and NPES (the Nation-al Printing Equipment Association)to develop the standard or archived

    documents. he goal was to turn largequantities o paper-based documentsinto a reliable, uture-proo digital or-mat with no expensive specialist view-ing sotware required.

    The Italian Chamber of CommerceSince 2010, Italian businesses havebeen required to send reports to theappropriate commercial register inPDF/A ormat. his includes balanc-es, certiicates and reports o business

    transactions, acquisitions, mergers andinsolvencies.

    A data transer platorm is usedor input; sotware is used to converttext or existing PDF documents toPDF/A-ormat certiicates. he CGN,a specialist network or the inancial,legal, iscal and labour sectors, wasclosely involved in establishing theplatorm.

    Austria: BAIKhe Bundeskammer der Architektenund Ingenieurkonsulenten in Austria,the Federal Chamber o Architects andConsulting Engineers, requires public-ly available digital certiicates to con-orm to the PDF/A-1b standard. hisguarantees the authenticity o all dig-ital documents accepted into the titleregister, thanks to a qualiied digitalsignature.

    Germany: Land registrationhe Decree o the Baden-Wrttem-berg Ministry o Justice or the intro-duction o digital legal proceduresand digital records o land registrationprocedures (ERGA-VO) requires thatall digital data (ASCII, Unicode, RF,PDF, IFF, Word) submitted must beconvertible to PDF/A. his decree hasbeen in orce since March 2012.

    PDF/A in legislation andjustice

    Areas of Application

    PDF/A in a Nutshell 2.0 15

    Areas of Application

    PDF/A in a Nutshell 2.0 15

  • 8/11/2019 PDFA in a Nutshell 211

    16/20

    Stephen LevensonU.S. District Courts:

    PDF/A now pro-vides a ull decadeo design ideasrom the best andbrightest digitalpreservation prac-

    titioners. Rich opportunities exist orhaving all the eatures o PDF that youhave come to expect in presentation

    and now with PDF/A-3 the ability tohave machine-processible data as XMLor text. PDF/A-3 will also allow orkeeping the original editable version oyour content.

    Most o the world has adopted PDF/Aas their long-term static ormat. It hasbecome the true replacement or paper,as its designers envisioned. Other or-mats will result in the loss o content. Itused to be said, no one was ever fired orpicking IBM. It will be said some day, no

    one was ever fired or picking PDF/A. Itis that dependable.

    Anton Zagar

    EU Publications Office:

    he EuropeanUnion PublicationsOice stores itsdigital archive o

    over 150,000 pub-lications, some othem stretching back to 1952, in thePDF/A ormat. We also use the PDF/Aormat to publish the Oicial Journalo the European Union in 23 languagesevery day: in 2012 alone we produced1.2 million pages.

    Te Office has set itsel the goal omaking all publications, by all bodies o

    the European Community and the Euro-pean Union, available in digital orm.

    Kai Volmar

    Landesbank Hessen-

    Thringen (Helaba):

    In the world o I,sometimes you dontwant to be the first touse a new technolo-

    gy. But the advantag-es o PDF/A convinced us straight away,when compared with the IFF ormat wehad previously used. As a result, the deci-sion in 2006 to use PDF/A to digitise ourrecords was not a hard one.

    Meanwhile, the state bank o Hesseand Turingia now uses nothing butPDF/A or digital archiving, whetherthe documents first need to be digitisedor were born digital in the first place.

    Jaco b Bie lfeldt, Techn i-

    ker Krankenkasse:

    In an internalworkshop in 2006,the echniker Kran-kenkasse identifiedPDF/A as a prom-ising uture-proo

    document ormat; today, this is con-

    firmed by the many advantages PDF/Abrings.Te echniker Krankenkasse is intro-

    ducing PDF/A into its ongoing projectsstep by step. Our first project was to digi-tise staff records; our second was to usePDF/A in output management. We haveincreasingly used PDF/A or input man-agement since 2011 and are planning touse PDF/A urther.

    What the users andexperts say

    16 PDF/A in a Nutshell 2.0

    Expert Opinions

    16 PDF/A in a Nutshell 2.0

    Expert Opinions

  • 8/11/2019 PDFA in a Nutshell 211

    17/20

    PDF/A and the other PDFstandards

    Specialist ISO standards based on thePortable Document Format are avail-able or a wide range o purposes.

    PDF/XBack in 2001, an ISO working groupdeveloped a pre-press PDF standard,ISO 15930. At this time, customersusually sent printers open iles romlayout sotware. his method, howev-er, always carried the r isk o onts andimages going missing. PDF/X is able

    to eliminate all o these problems; italso has the advantage o carrying re-liable colour inormation thanks tocolour management settings.

    he X identiier stands or Ex-change, as PDF/X is intended or re-liable print data exchange. Additionalstandardisation or PDF/X versions 4and 5 has taken into account the newereatures available to the PDF ile or-mat, including transparent elementsand JPEG2000 image compression.

    PDF/X-5 also supports externally re-erenced elements.

    PDF/APDF was also recognised early on ashaving great potential or archivingdigital documents. In 2005, the ISOpublished the irst part o the PDFstandard or long-term archiving,PDF/A.

    PDF/Ehis standard has been available since2008 as ISO 24517; it is aimed atengineering documents such as con-struction drawings. he original dataoten comes rom CAD sotware usedor digital drating. PDF/E can displayrotating and olding 3D objects on-screen, using tools like the ree AdobeReader.

    PDFPDF itsel was also standardised in2008 as ISO 32000. he basis o thestandard was the then-current PDFversion 1.7. With this, PDF became anopen standard. PDF 2.0 is expected tobe published in 2014.

    PDF/VTPDF/V is a standard based onPDF/X-4 and PDF/X-5, supportingvariable data printing. It was pub-

    lished in August 2010.he abbreviation V stands or

    Variable data and transactionalprinting. his includes invoices andpersonalised advertisements, or ex-ample.

    PDF/UAhe PDF/UA (Universal Access) stan-dard, approved in 2012, allows univer-sal access to PDF iles content. his isuseul or users with disabilities (or

    example the partially sighted) andothers.O particular importance is a clear co-herent logical structure o the PDFselements, to ensure that navigationalaids, reading sotware or Braille dis-plays can handle all content includingtext, images and diagrams.

    PDF/UA builds on proven conceptsor accessible web content and addsconcrete demands on the semantic

    structure o PDF documents (whichPDF/A Conormance Level A hadpreviously only given in a very gener-al sense).

    PDF/UA oers users with disabili-ties the best possible access to content.It also makes it easier or mobile de-vices to use this content and supportsits lexible reuse in other orms o pre-sentation.

    PDF/X since 2001

    Prepress digital data

    exchange using PDF

    ISO-Standard for the

    printing industry

    PDF/A since 2005

    PDF Archive

    Standardised long-term

    archiving with PDF

    PDF/E since 2008

    PDF Engineering

    Construction diagrams

    with moving 3D models

    where required

    PDF since 2008

    Portable Document Format

    The ISO standard

    corresponds with PDFversion 1.7

    PDF/VT since 2010

    PDF for Variable Data and

    Transactional Printing

    Used for variable data

    printing

    PDF/UA since 2012

    PDF for Universal Access

    ISO standard for

    universally accessible

    PDF documents

    The Family of PDF Standards

    PDF/A in a Nutshell 2.0 17

    The Family of PDF Standards

    PDF/A in a Nutshell 2.0 17

  • 8/11/2019 PDFA in a Nutshell 211

    18/20

    A number o critics have spoken outagainst PDF/A, especial ly when the stan-dard was first introduced. Many criti-cisms o the ormat, however, are basedon misunderstandings.

    Tese are some o the most commonlyencountered myths and legends:

    PDF/A files are too large: PDF/A actu-ally allows exceptionally small file sizesthanks to its sophisticated use o pow-erul compression algorithms such as

    JBIG2 and JPEG (and JPEG2000, romPDF/A-2 onwards). Embedded onts canslightly increase the size o a PDF/A file.When archiving a very large numbero individual, airly similar documents,this can in some cases (such as or massmailings) prove problematic.

    PDF/A is not as revision-safe as TIFF: IFFfiles are easier to alter than PDF andPDF/A documents. In any case, however,revision saety is not achieved through

    your choice o file ormat. It can only beachieved by using an appropriate docu-ment management or archiving system.

    PDF/A does not allow signatures: Quitethe opposite. PDF/A expressly supportsembedded digital signatures. PDF/A-2 re-quires PADeS-standard compliance here.

    Links are not allowed: Tis claim is alsoalse. Hyperlinks are allowed in princi-ple. Te PDF/A standard sets no require-

    ments as to whether an external linkshould lead to a valid destination.

    PDF is a proprietary format: PDF wasoriginally developed by Adobe Systems,but since then PDF (ISO 32000) andPDF/A (ISO 19005) have become ISOstandards. IFF, on the other hand, isa specification belonging to Adobe Sys-tems alone, and it has not achieved thestatus o ISO standard.

    Scanned documents cannot be searchedby text: PDF/A permits text recognitionprocesses, meaning that even scannedPDF/A documents can be searched.

    PDF/A is not supported by DMS systems:Any ECM system which works withPDF can also handle PDF/A in princi-ple. Many DMS suppliers offer solutionswhich support PDF/A.

    PDF/A does not allow metadata: Not

    at all: PDF/A specifically requires em-bedded standardised metadata corre-sponding to the modern XMP metadatastandard, which was published in Febru-ary 2012 as ISO 16684-1. XMP meta-data can be directly embedded into thePDF/A document.

    PDF/A is not globally relevant: Tis state-ment is alse. Although the very firstPDF/A initiatives and products did come

    rom German-speaking countries, theISO standard has since become a recom-mendation or even a legal requirementin many countries and industries.

    PDF/A is expensive to implement: Yesand no. Implementing PDF/A solutionsand training staff will incur costs at first,but these investments very ofen pay orthemselves within months.

    The myths and legendssurrounding PDF/A

    18 PDF/A in a Nutshell 2.0

    Tidbits

    18 PDF/A in a Nutshell 2.0

    Tidbits

  • 8/11/2019 PDFA in a Nutshell 211

    19/20

    Further information onPDF/A

    he PDF/A Competence Centre, to-day a part o the PDF Association, wasounded very shortly ater PDF/A irstappeared as an ISO standard. his in-ternational organisation aims to pro-mote the development and usage oPDF standards. o that end, the PDFAssociation targets users, developersand decision-makers equally and helpsits members exchange inormationworldwide.

    The portal to the PDF AssociationA good starting point or anyone withquestions about PDF/A or PDF in gen-eral is the PDF Associations website.At www.pda.org, you can find inor-mation in English and German aboutcurrent development, rom all relevantindustries and rom suppliers world-wide. Comprehensive background inor-mation and example applications explainthe technology behind PDF/A andits use in practice. You can view a vid-

    eo-on-demand series about PDF/A andother PDF standards. Te website offersan overview o PDF- and PDF/A-relatedsofware products and services.

    Users can contact the Associationwith speciic questions. o do so, sim-ply register on the website and de-scribe your request on the discussionorum. Specialists and practitionersrom around the world will provide in-ormed answers and suggestions.

    PDF Association eventshe PDF Association attends interna-tional trade shows and other eventson subjects such as document man-agement, digital media and electronicarchiving. For years the Associationhas also organised specialist seminarsand technical conerences around theworld. Companies and public author-

    ities seeking a speaker on PDF/A ortheir event can ind support at the PDFAssociation: interested parties can sim-ply enquire about a presentation usinga orm at www.pda.org.

    Some countries also have direct con-tact persons in the Associations localchapters. For a complete list, includingcontact details, please visit the Associa-tions website.

    Membership

    Anyone aiming to work actively to de-velop and expand use o the PDF stan-dard can become a member o the PDFAssociation, whether an individual oran entire organisation.

    Membership allows a company topresent itsel on the PDF Associationsportal and to publish its own an-nouncements, press releases and arti-cles on the website. hey can alsopresent their sotware solutions andservices within the Associations prod-

    uct showcase. he Association also hasespecially avorable terms or present-ing products and strategies at tradeshows and other events. Members alsohave exclusive access to the Associa-tions intranet.

    The PDF Associations website can be

    found at www.pdfa.org.

    Further Information

    PDF/A in a Nutshell 2.0 19

    Further Information

    PDF/A in a Nutshell 2.0 19

  • 8/11/2019 PDFA in a Nutshell 211

    20/20

    PDF/A is an ISO standard or using the PDF ormat orlong-term archiving o digital documents. Since its pub-lication in 2005, PDF/A has become the ormat o choiceor archiving digital documents in a wide range o indus-tries and applications. PDF/A in a Nutshell 2.0 pro-vides a comprehensive introduction to the material andshows off the latest developments available with PDF/A-2and PDF/A-3. Te brochure provides inormation aboutPDF/A tools and strategies or creating and validating

    PDF/A files. Examples rom around the world demon-strate how users in the areas o finance, administration,academia and law can benefit rom PDF/A.

    Contents:

    Facts about PDF/A

    Te history o PDF/A

    Te technical side

    Who can benefit rom PDF/A and why

    ypical applications

    PDF/A creation tools

    PDF/A validation

    PDF/A in public administration

    PDF/A in finance and industry

    PDF/A in legislation and justice

    What the users and experts have to say

    PDF/A and the other PDF standards

    About the author

    Alexandra Oettler has worked oryears as a reelance journalist in theareas o sofware, print and media.Her work is regularly published inspecialist journals on the subject oprepress in practice, sofware tech-nology and financial developments in

    the publishing sector. She regularly writes news and back-

    ground reports or the online editions o several journals.She also was one o the co-authors o the first edition oPDF/A in a Nutshell.

    PDF/Ain a Nutshell 2.0 PDF for long-term archiving