Advisory Committee on the Electronic Records Archives April 29-30, 2009 Program Director’s Update


Page 1

Advisory Committee on the Electronic Records Archives

April 29-30, 2009

Program Director’s Update

Page 2

Topics

Development and deployment of the ERA instance for the G.W. Bush presidential records

Plans for further development

Page 3

Where is ERA?

Page 4

Rocket Center, WV

Page 5

Erma Ora Byrd Conference & Learning Center

Page 6
Page 7

The Search & Access ERA Instance for G. W. Bush Electronic Presidential Records

Page 8

What Does the Base ERA Do?

Focus:
  Federal Records
  Nationwide records management program
  National Archives

Functions:
  Creation, review and approval of records schedules
  Manage transfer of physical and legal custody of all types of records
  Systematically collect, create, and manage lifecycle data about records
  Actual transfer, inspection, and archival storage of electronic records

Page 9

What Does the Search & Access ERA Do?

Focus:
  Presidential Electronic Records
  George W. Bush Presidential Library

Functions:
  Rapid ingest of very large volumes of electronic records
  Automatic indexing on ingest
  Immediate searchability, based on index
  Creation of different versions to support structured search of priority records
  Basic case management for review and redaction of sensitive content.

Page 10

Search and Access Instance Development

Achieved Initial Operating Capability December 8, 2008

LMC proposed and received NARA and EOP agreement on an expedited method for transfer of electronic records.

NARA has enjoyed excellent collaboration from the EOP.

NARA implemented a contingency plan for access to high priority e-records, the finding aid for WH paper records and the database of digital photography, pending completion of processing into ERA.

Page 11

EOP Transfer & Ingest Overview

[Figure: timeline of EOP record transfers and ingest into ERA, with rows labeled Storage Arrays, Data Type, SW Drops, and SASS Operations (Ingest).]

Timeline markers: 12/5, IOC, 12/8, 12/12, 12/15, 1/15, 1/20, 1/26, 1/30, 2/11, and a tentative "? 5/16?".

Version labels (apparently the software drops): 6.0, 7.0, 7.1, 7.2.

Storage labels: SAN A, SAN A2, SAN B1, SAN B2, Snap Server, "SAN B Returns".

Other labels: ARMS (SAN), RMS (Update).

Data volumes labeled in the figure:
  ARMS (PRA) = 1.9 TB
  ARMS (FRA) = 5.1 TB
  PDS = 0.0005 TB
  PDS (delta) = 0.0005 TB
  WARDS = 0.018 TB
  WARDS (delta) = 0.001 TB
  RMS = 1.0 TB
  Merlin One = 36 TB
  Exchange = 57 TB
  Non-Pri Types = 20 TB and 0.2 TB (two separate labels)

Page 12

G.W. Bush Presidential Electronic Records

Records | Number of objects | Gigabytes of data | Shipped to ERA Data Center | Status
Priority Records
Email (2000-2003) | 44,815,184 | 1,688 | 12/8/2008 | >99% available for search in ERA. There are technical problems with the remaining messages.
MS Exchange email (2003-2008) | 150,000,000 estimated | 16,500 estimated | Expected mid May | In temporary storage. Conversion to standard format, separation from federal records, and identification of responsible EOP component largely complete.
Presidential Diary | 682,193 | 1 | 12/8/2008 and 1/26/2009 | 100% available for search in ERA
Digital photography | 11,220,044 | 31,000 | 1/26/2009 | Problems require shipment of a second set, expected in mid May
Index to White House paper records | 313,850 | 583 | 1/26/2009 | 100% available for search in ERA, but about 6% of the records appear to be missing some pieces of data.
Visitor and worker access to EOP buildings | 28,922,988 | 14 | 12/8/2008 and 1/26/2009 | 100% available for search in ERA
Index to motion video | 305 | 5 | 1/26/2009 | In ERA, being processed
Email from WH Counsel | 572,051 | 1,057 | 1/26/2009 | In ERA, being processed
Other Records | >12,000,000 | >5,450 | Partial shipment 1/26/2009 | Some in ERA, being processed. Remainder expected mid May
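As a quick check on the scale of the transfer, the figures in the table above can be totaled. The sketch below is illustrative only: the row values are transcribed from the slide, the variable and function names are made up for this example, and the two rows given only as estimates or lower bounds are kept separate from the exact counts.

```python
# Rows transcribed from the slide's table: (record set, objects, gigabytes).
# Only rows with exact figures are listed here.
EXACT_ROWS = [
    ("Email (2000-2003)", 44_815_184, 1_688),
    ("Presidential Diary", 682_193, 1),
    ("Digital photography", 11_220_044, 31_000),
    ("Index to White House paper records", 313_850, 583),
    ("Visitor and worker access to EOP buildings", 28_922_988, 14),
    ("Index to motion video", 305, 5),
    ("Email from WH Counsel", 572_051, 1_057),
]

# Rows the slide gives only as estimates or lower bounds.
APPROXIMATE_ROWS = [
    ("MS Exchange email (2003-2008), estimated", 150_000_000, 16_500),
    ("Other Records, lower bound", 12_000_000, 5_450),
]


def totals(rows):
    """Return (total objects, total gigabytes) for a list of rows."""
    return sum(row[1] for row in rows), sum(row[2] for row in rows)


exact_objects, exact_gb = totals(EXACT_ROWS)
all_objects, all_gb = totals(EXACT_ROWS + APPROXIMATE_ROWS)

print(f"Exact rows: {exact_objects:,} objects, {exact_gb:,} GB")
print(f"Including estimates: roughly {all_objects:,} objects, {all_gb:,} GB")
```

The exact rows sum to 86,526,615 objects and 34,348 GB. Adding the estimated and lower-bound rows gives roughly 248.5 million objects and 56,298 GB, which should be read as an approximation rather than a reported total.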

Page 13

Processing Status - 1

All Bush e-records have been transferred to NARA’s custody.
  Not all have been transferred to the ERA Data Center in WV.
  EOP is maintaining copies until NARA successfully completes ingest.

Archives Operational Issues

Several sets of records were not transferred in the formats previously agreed by NARA and EOP.
  o NARA required retransmission.

Some records exhibited anomalies.
  o Some ARMS email records had binary data in the “To” field.
  o Some metadata in the digital photography system did not have corresponding images.
  o Some entries in the Records Management System are missing some fields.
  o MS Exchange email was not separated into presidential and federal records or associated with EOP component, and contained numerous duplicates. EOP is addressing these problems prior to transfer to ABL. EOP has converted the email from proprietary to standard format. NARA will preserve both the original files and the output of the EOP processing.
  o Encoding of date of birth in the Access system impeded searches on that field.

Viruses have been found in a small percentage of files.
  o Infected files have been successfully quarantined. LMC & NARA are working to produce clean copies.
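The slides do not say how duplicate Exchange messages were identified. One common approach, shown here purely as an illustration, is to group messages by a cryptographic hash of their raw bytes. The function name `find_duplicates` and the `(message_id, raw_bytes)` input shape are assumptions made for this sketch, and byte-identical matching would miss near-duplicates that differ only in transport headers.

```python
import hashlib


def find_duplicates(messages):
    """Group message IDs by the SHA-256 digest of their raw bytes.

    `messages` is a hypothetical iterable of (message_id, raw_bytes) pairs.
    Returns (unique_ids, duplicates), where `duplicates` maps each kept ID
    to the IDs of later byte-identical copies.
    """
    first_seen = {}   # digest -> ID of the first message with that content
    duplicates = {}   # kept ID -> list of IDs that duplicate it
    for message_id, raw in messages:
        digest = hashlib.sha256(raw).hexdigest()
        if digest in first_seen:
            duplicates.setdefault(first_seen[digest], []).append(message_id)
        else:
            first_seen[digest] = message_id
    return list(first_seen.values()), duplicates


sample = [
    ("m1", b"Subject: hello\r\n\r\nbody"),
    ("m2", b"Subject: other\r\n\r\nbody"),
    ("m3", b"Subject: hello\r\n\r\nbody"),  # byte-identical to m1
]
unique_ids, duplicates = find_duplicates(sample)
print(unique_ids)   # IDs kept, in first-seen order
print(duplicates)   # kept ID -> duplicate IDs
```

Keeping the first copy and recording the others, rather than deleting them, is in the spirit of the slide's note that NARA preserves both the original files and the processed output.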

Page 14

Processing Status - 2

Technical Issues

Issues with COTS products:
  o Automatic indexing of a batch of records stops when errors are found in any of the records; e.g., binary data in headers of email.
  o Erroneous results returned in certain conditions.
  o Incomplete search results returned in other cases.
  o LMC underestimated storage space needed for the index.

Additional hardware has been ordered.

Unanticipated software development needed to ensure complete and accurate mapping between ‘.eml’ email produced by the EOP and the original MS ‘.pst’ files.

NARA directed LMC to hire a subcontractor to perform actual ingest of records.
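The batch-indexing problem above, where one bad record halts the whole batch, is often handled by isolating failures per record. The sketch below is not the COTS product's behavior or the fix that was actually applied. It is a minimal illustration in which `looks_binary`, `index_batch`, `add_to_index`, and the dictionary record shape are all hypothetical.

```python
def looks_binary(value):
    """True if a header value contains control characters other than tab."""
    return any((ord(ch) < 32 and ch != "\t") or ord(ch) == 127 for ch in value)


def index_batch(records, index_one):
    """Index each record independently so one failure cannot stop the batch.

    Returns (indexed_ids, quarantined), where `quarantined` lists
    (record_id, reason) pairs for records set aside for later review.
    """
    indexed, quarantined = [], []
    for record in records:
        try:
            if looks_binary(record.get("to", "")):
                raise ValueError("binary data in 'To' header")
            index_one(record)
            indexed.append(record["id"])
        except Exception as exc:
            # Set the record aside and continue with the rest of the batch.
            quarantined.append((record["id"], str(exc)))
    return indexed, quarantined


index = {}  # toy inverted index: token -> set of record IDs


def add_to_index(record):
    for token in record["to"].lower().split():
        index.setdefault(token, set()).add(record["id"])


batch = [
    {"id": 1, "to": "staff@example.gov"},
    {"id": 2, "to": "\x00\x8f\x12"},        # simulated binary header data
    {"id": 3, "to": "press@example.gov"},
]
indexed, quarantined = index_batch(batch, add_to_index)
print(indexed)
print(quarantined)
```

With this structure, records 1 and 3 are indexed and record 2 is reported with a reason, so the affected messages can be counted and repaired separately.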

Page 15

Status of Requests for Bush Records

28 requests for access as of March 17, 2009

Primarily for paper records
  NARA has responded using data about the paper records in the Records Management System

A few requests were for digital photographs.

Most requests were addressed using the two systems NARA set up under the Contingency Plan because processing of the records had not been completed at the time the requests were received.

Three requests fulfilled using records on temporary ERA storage.

Page 16

Plans for Further Development

Page 17

What’s in Store for the Future?

Increment 2
  Preservation Framework
    o Introduction and use of a variety of tools for different preservation needs
  Public access
    o Information about all types of records
    o Online access to electronic records
  Initial system evolution

Increments 3 - 5
  Incremental enhancements in capability & capacity
  Continuing system evolution
  Governmentwide expansion
  Full Lifecycle Management Plans
  Appraisal case management and workflow
  Search Framework supporting different tools
  FOIA and other access case management
  Review and redaction of sensitive content

Page 18

ERA Functional View: Current Status

[Diagram. Shared Services: System Management, Help Desk, Network, Data Management. Instances: Base Instance, EOP Instance. Users: Agencies, White House. Also shown: Enterprise Service Bus.]

Page 19

ERA Functional View: Planned

[Diagram. Shared Services: System Management, Preservation Framework, Public Access, Help Desk, Network, Data Management. Instances: Base Instance, EOP Instance, Congressional Instance, Records Center Instance. Users: Agencies, White House, Committees, Public. Also shown: Enterprise Service Bus.]

Legend: current capability is shown with solid fill; future capability with hashed fill.

Page 20

ERA Instances

Base Instance (June 2008)
  Used by NARA and federal agencies
  For management of all federal records
  For transfer, inspection and management of federal electronic records

EOP Instance (December 2008)
  Used by NARA and Presidential Administrations
  For transfer, inspection, and management of presidential electronic records

Congressional Instance (future)
  Used by NARA for Congressional Committees
  For transfer, inspection, and management of congressional electronic records

Federal Records Center Instance (future)
  Used by NARA and other federal agencies
  For transfer and storage of temporary and permanent federal electronic records that remain under the control of the originating agency

Page 21

ERA Shared Services

System Management (current)
  System operation and maintenance
  Security
  User account management
  Deployment of new & updated software
  Backup & other common services

Help Desk (current)
  Respond to technical questions and issues from users

Network
  Link to the Internet, NARANET (current)
  Interfaces with other systems (future)

Data Management
  Data about records and transactions related to them (current)
  Description of NARA holdings (Increment 2)
  Review and redaction of records with restricted content (future)

Preservation Framework (Increment 2)
  Tools to overcome obsolescence of different digital formats (future)

Public Access (Inc. 2 +)
  Search and retrieval of information about records, regardless of custody
  Search and access to electronic records in NARA’s custody
  Search and access to digitized records from NARA’s holdings
  Freedom of Information Act for restricted records in NARA’s custody

Page 22

Advantages of the Instances & Shared Services Approach

Instances enable different business rules and processes for different mission requirements:
  Base Instance: Federal Records Act provisions on governmentwide records management and on the National Archives
  EOP Instance: Presidential Records Act
  Congressional Instance: House and Senate rules.
  Federal Records Center Instance: Federal Records Act provisions on storage of temporary and permanent records under originating agencies’ authority.

Page 23

Advantages of the Instances & Shared Services Approach

Shared services maximize utilization of resources, reduce redundancy and provide a stable foundation for system growth and evolution over time.

Shared services deliver capabilities and capacity wherever needed, regardless of differences in mission and business needs.
  E.g., the Preservation Framework can be used to preserve any electronic records, regardless of whether they came from Congress, the White House or a federal agency.
  E.g., a citizen seeking access to information will be able to find it using a single web portal, regardless of whether
    o it is information about records or in the records,
    o the records are in NARA’s physical custody,
    o the records are electronic or hard copy,
    o they originated in the White House, Congress or an agency.
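The split between instances and shared services can be pictured as several instance objects, each carrying its own governing rules, that all delegate to one common service layer. The sketch below is a toy model under that assumption, not ERA's actual design. The class names, methods, record IDs, and descriptions are invented for illustration, and only the instance names and governing authorities come from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class SharedServices:
    """One service layer used by every instance (a stand-in for Data
    Management and Public Access as named on the slides)."""
    holdings: dict = field(default_factory=dict)

    def register(self, instance_name, record_id, description):
        self.holdings[record_id] = {
            "instance": instance_name,
            "description": description,
        }

    def search(self, term):
        """Single search across all instances, regardless of origin."""
        term = term.lower()
        return sorted(
            record_id
            for record_id, entry in self.holdings.items()
            if term in entry["description"].lower()
        )


@dataclass
class Instance:
    """An instance applies its own rules but reuses the shared services."""
    name: str
    governing_rules: str
    services: SharedServices

    def accept(self, record_id, description):
        self.services.register(self.name, record_id, description)


services = SharedServices()
base = Instance("Base Instance", "Federal Records Act", services)
eop = Instance("EOP Instance", "Presidential Records Act", services)

# Hypothetical records, one accepted through each instance.
base.accept("fed-001", "Agency budget correspondence")
eop.accept("eop-001", "Presidential Diary budget meeting entry")

results = services.search("budget")
print(results)
```

Because both instances hold a reference to the same `SharedServices` object, a single search finds records from either one, which mirrors the single-portal example given on the slide.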

Page 24

Preservation

[Diagram: Electronic Record 1, Electronic Record 2, ... Electronic Record n pass through the Preservation Framework (Tool 1, Tool 2, ... Tool n) and appear as Electronic Record 1’, Electronic Record 2’, ... Electronic Record n’. Labels on the framework: Record Identity, Record Integrity, Original Order.]

The Preservation Framework supports the introduction and use of an arbitrary number and variety of processes under the control of archival requirements for authenticity.
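One way to read the Preservation slide is as a plug-in registry: any number of tools can be registered, and every derived record keeps the original's identity and position while recording a checksum of the content it came from. The sketch below illustrates that reading only. The class names, fields, and the sample newline-normalizing tool are hypothetical, and the slides do not describe how ERA enforces identity, integrity, or original order.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ElectronicRecord:
    record_id: str                  # record identity
    content: bytes
    sequence: int                   # position in the original order
    derived_from_sha256: str = ""   # empty for an original record


class PreservationFramework:
    """Registry of preservation tools; each tool maps bytes to bytes."""

    def __init__(self):
        self.tools = {}

    def register_tool(self, name, tool):
        self.tools[name] = tool

    def apply(self, tool_name, record):
        """Return a derived record; the original is left untouched."""
        derived_content = self.tools[tool_name](record.content)
        return ElectronicRecord(
            record_id=record.record_id,    # identity carried over
            content=derived_content,
            sequence=record.sequence,      # original order carried over
            derived_from_sha256=hashlib.sha256(record.content).hexdigest(),
        )


framework = PreservationFramework()
# Hypothetical tool: normalize Windows line endings to Unix line endings.
framework.register_tool(
    "normalize-newlines", lambda data: data.replace(b"\r\n", b"\n")
)

original = ElectronicRecord("rec-1", b"line one\r\nline two\r\n", sequence=1)
derived = framework.apply("normalize-newlines", original)
print(derived.record_id, derived.sequence, derived.derived_from_sha256[:12])
```

Storing the source checksum on the derived record lets a later audit confirm which original a transformed version came from, while the frozen dataclass keeps either version from being altered in place.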

Page 25

Public Access

Information about all records
  From Records Schedules
  Archival Descriptions
  Other NARA information

Online access to electronic records

Online access to scanned versions of hard copy records

Requests for copies of records

Freedom of Information Act requests for restricted records

Assistance from NARA staff

Page 26

Increment 3 Work Status

Authority to Proceed Issued for Early Analysis
  Architectural Framework
  Preservation examination and prototyping
  Search Engine examination and selection
  Open Access examination and selection
  Enhancements to address authorized user defined changes and software defects not addressed at IOC

Discussions begun on scope of work and technical details for full proposal

Target date for award: 7/09

Page 27

Governmentwide Expansion

Initial Implementation: June 2008 – June 2009
  Four collaborating agencies
  NARA staff proxy for other agencies

Invitational Phase: June 2009 – February 2010
  Additional agencies by invitation

Voluntary Phase: February 2010 – December 2010
  Additional agencies who volunteer and meet criteria

Mandatory Phase: January 2011
  All agencies

Page 28

The Development Timeline

[Figure: timeline with yearly marks 9/05, 9/06, 9/07, 9/08, 9/09, 9/10, 9/11. Labeled milestones: Initial Operating Capability (6/08) and Full Operating Capability. Labeled phases: ERA Base, Search & Access ERA, Public Access & Preservation Framework, Enhancement (shown twice), and Operation & Maintenance.]