Post on 20-Dec-2015
Brief Overview of Major Enhancements to PAWN
Producer – Archive Workflow Network (PAWN)
Distributed and secure ingestion of digital objects into the archive.
Use of web/grid technologies – platform independent
Ease of integration with data grids or digital libraries.
XML representation of metadata and bitstream
• Self-describing bitstream submissions
Accountability of transfer and guarantee of data integrity
Ingestion Workflow (PAWN)
1. Negotiate Submission Agreement.
2. Workflow Initialization and Submission Information Packet (SIP) creation.
3. Transfer of SIPs to receiving servers.
4. Validation of SIP transfer
5. Organization of data into collections and transfer into the distributed archive.
Distributed Ingestion
[Figure: four Producers and the Distributed Archive]
Each Producer registers and arranges files locally prior to transport.
Multiple distributed archival receiving stations.
X.509-based authentication between sites.
Independent Certificate Authorities at each Producer.
Persistent archive is geographically distributed and managed by a data grid.
Producer
Provides data to an Archive based on a prior agreement.
Consists of a management/metadata server and an ingestion client.
Provides initial arrangement, context, and metadata.
[Figure labels: Producer Management Interface, Producer data suppliers, Archive, Management Server]
Enhancements to the Producer
Data submissions are organized through a logical hierarchy negotiated between the archive and the producer.
Clients no longer see the entire hierarchy, but rather attachment points.
Better state tracking and oversight of submissions.
METS documents are no longer merged together, but rather kept separate to support larger submissions.
A submission can be broken into multiple METS documents linked together through pointers.
Submissions are signed by the producer to ensure integrity.
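The linking of several METS documents through pointers could be sketched as follows. This is only an illustration: it assumes the pointers are expressed with the standard METS `mptr` element, which the slides do not confirm, and the helper names (`build_parent_mets`, `linked_documents`) and file names are hypothetical.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def build_parent_mets(child_hrefs):
    """Build a parent METS document whose structMap points at child METS files.

    Assumption: pointers are expressed with the METS <mptr> element.
    """
    root = ET.Element(f"{{{METS_NS}}}mets")
    struct_map = ET.SubElement(root, f"{{{METS_NS}}}structMap")
    top_div = ET.SubElement(struct_map, f"{{{METS_NS}}}div", {"LABEL": "submission"})
    for href in child_hrefs:
        # One <div> per linked document, each carrying a single pointer.
        part = ET.SubElement(top_div, f"{{{METS_NS}}}div")
        ET.SubElement(part, f"{{{METS_NS}}}mptr", {
            "LOCTYPE": "URL",
            f"{{{XLINK_NS}}}href": href,
        })
    return root


def linked_documents(root):
    """Return the hrefs of all child METS documents referenced by <mptr>."""
    return [m.get(f"{{{XLINK_NS}}}href") for m in root.iter(f"{{{METS_NS}}}mptr")]


# Hypothetical file names for two parts of one large submission.
parent = build_parent_mets(["part-001.xml", "part-002.xml"])
```

Keeping each part in its own document means no single file has to hold the metadata for the whole submission, which matches the stated goal of supporting larger submissions.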
Different administrator and client views
Manager / Record Manager / Administrator
• Views entire producer hierarchy
Producer / Record Creator
• View restricted to allowable submission points
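The difference between the two views could be sketched as below. The representation of the hierarchy as path strings, the example node names, and the function `visible_nodes` are assumptions made for illustration; the slides do not describe how PAWN stores the hierarchy.

```python
from pathlib import PurePosixPath


def visible_nodes(hierarchy, attachment_points):
    """Return the nodes a client may see: each attachment point and its descendants."""
    points = [PurePosixPath(p) for p in attachment_points]
    visible = []
    for node in hierarchy:
        path = PurePosixPath(node)
        # A node is visible if it is an attachment point or lies beneath one.
        if any(path == p or p in path.parents for p in points):
            visible.append(node)
    return visible


# Hypothetical producer hierarchy.
hierarchy = [
    "/records",
    "/records/finance",
    "/records/finance/2005",
    "/records/personnel",
]

admin_view = list(hierarchy)  # administrator sees everything
client_view = visible_nodes(hierarchy, ["/records/finance"])
```

Here the record creator is attached at `/records/finance`, so sibling branches such as `/records/personnel` never appear in the client view.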
New Interactions Between Client and Receiving Servers
Ability of client to reserve resources before starting to transfer data into the archive.
Client creates a session with a receiving server and uploads metadata.
Clients upload bitstreams, and receiving server validates checksums during transfer
Client can resume or retransmit failed submissions
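Checksum validation during transfer, together with resuming a failed upload, could look roughly like this on the receiving side. The class `ReceivingSession`, the offset-based resume rule, and the choice of SHA-256 are all assumptions; the slides name neither the checksum algorithm nor the resume mechanism.

```python
import hashlib


class ReceivingSession:
    """Hypothetical receiver-side state for one bitstream upload."""

    def __init__(self, expected_sha256):
        self.expected = expected_sha256
        self.digest = hashlib.sha256()
        self.received = 0  # bytes accepted so far; doubles as the resume offset

    def write_chunk(self, offset, data):
        """Accept a chunk only if it continues exactly where the upload stopped.

        Returns the offset from which the client should send next.
        """
        if offset != self.received:
            return self.received  # out-of-order chunk: ask the client to resume here
        self.digest.update(data)  # checksum is updated as the data arrives
        self.received += len(data)
        return self.received

    def finish(self):
        """True if the checksum computed during transfer matches the declared one."""
        return self.digest.hexdigest() == self.expected


payload = b"example bitstream contents"
session = ReceivingSession(hashlib.sha256(payload).hexdigest())
session.write_chunk(0, payload[:10])
resume_at = session.write_chunk(15, payload[15:])  # gap after a failure: rejected
session.write_chunk(resume_at, payload[resume_at:])  # client resumes at byte 10
ok = session.finish()
```

Because the digest is updated chunk by chunk, the receiver never needs a second pass over the stored file to validate it.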
Client-Receiver Interaction
1. Reservation Request*
2. Reservation Negotiation*
3. Reservation Information*
4. Open Session
[5. Signed METS Package and Acknowledgement]**
6...n Send Payload and Acknowledgement
Finished Transmitting
* Placeholder calls
** Only required once
[Figure labels: Scheduler, Receiving Server, Client]
Archive - receiving
Receives data from a Producer
Validates bitstreams and metadata, and sends acknowledgement to Producer.
Arranges into collections and specifies preservation policy.
Publishes bitstreams into a digital archive.
[Figure labels: Producer 1, Producer 2, Producer n, Load Balancer, Bitstream Validation Service, Digital Archive]
New Features for Receiver
Validation Services
• Designed a standard API and test suite for rapid development of validation services.
• New classes of services can be easily developed.
Receiving Server
• Configurable endpoints into storage or metadata repositories
• Better handling of multiple producers
Scheduler
Allocates the processing of data streams from multiple clients to a cluster of receiving servers.
Clients are required to request a resource reservation.
Receiving server will acknowledge/deny the reservation.
Client will be informed about reservation/receiving server.
Currently, receiving server has hooks for scheduler
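The reservation exchange described above could be sketched as a small scheduler. The slides do not state an allocation policy, so the least-loaded rule used here is an assumption, as are the class `Scheduler`, its methods, and the server names.

```python
class Scheduler:
    """Hypothetical scheduler: assigns each reservation to the least-loaded server."""

    def __init__(self, capacity_by_server):
        self.capacity = dict(capacity_by_server)      # max concurrent sessions
        self.active = {name: 0 for name in self.capacity}
        self.assignments = {}                         # client id -> server name

    def reserve(self, client_id):
        """Return the assigned receiving server, or None if the request is denied."""
        free = [s for s in self.capacity if self.active[s] < self.capacity[s]]
        if not free:
            return None  # every server is full: reservation denied
        server = min(free, key=lambda s: self.active[s])
        self.active[server] += 1
        self.assignments[client_id] = server
        return server

    def release(self, client_id):
        """Free the slot held by a client once it has finished transmitting."""
        server = self.assignments.pop(client_id)
        self.active[server] -= 1


scheduler = Scheduler({"rs1": 1, "rs2": 1})  # hypothetical receiving servers
first = scheduler.reserve("client-a")
second = scheduler.reserve("client-b")
third = scheduler.reserve("client-c")        # no capacity left
scheduler.release("client-a")
retry = scheduler.reserve("client-c")
```

A denied client simply retries later; in this sketch the third client is accepted once the first one releases its slot.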