UC3 Services In-Depth:
Data Curation for Practitioners 2012 Workshop
Deep dive
• Who is it for?
• Why use it?
• How much does it cost?
• What can it do?
• How do I use it?
• Next steps
• Q & A
Who is it for?
• Libraries/archives/museums
• ORU/MRUs
• Faculty/staff
• Centrally hosted by UC3/CDL
• Mediated through campus libraries
Why use it?
• Curation repository
– Supporting long-term preservation and access
– Publish, share, preserve, discover, (re-)use
• “Model free”
– There are no prescriptive requirements for content genre, format, structure, or accompanying metadata
How much does it cost?
• UC affiliates pricing model and service agreement available soon
– Charged for storage fees only, at $390 per TB
– Pay as you go (1 year)
– Pay once, store for a long time (10 years)
• Services will be available to non-UC contributors
Sample pricing scenarios
Pay-as-you-go (1 year)
< 100 GB = $39
< 500 GB = $195
< 1 TB = $390
< 5 TB = $1,950
Paid-up (10 years)
< 100 GB = $290
< 500 GB = $1,450
< 1 TB = $2,900
< 5 TB = $14,500
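The sample figures are consistent with a flat rate of $390 per TB for one year and $2,900 per TB for ten years, counting 1 TB as 1,000 GB. A minimal sketch of that arithmetic follows. It assumes straight proration, which the slides do not confirm (fees may instead be charged in tiers), and the function and plan names are illustrative only.

```python
# Rates in USD per TB. The 1-year rate is stated on the slide; the 10-year
# rate is inferred from the sample table (< 1 TB = $2,900).
RATE_PER_TB = {
    "pay-as-you-go": 390,  # 1 year
    "paid-up": 2900,       # 10 years
}

GB_PER_TB = 1000  # assumption: decimal units, matching 100 GB = $39


def storage_cost(gigabytes, plan="pay-as-you-go"):
    """Return the estimated storage fee in USD for the given plan."""
    return RATE_PER_TB[plan] * gigabytes / GB_PER_TB


if __name__ == "__main__":
    # Reproduce the sample scenarios for both plans.
    for gb in (100, 500, 1000, 5000):
        print(gb, storage_cost(gb), storage_cost(gb, "paid-up"))
```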
Modes of use: dark archive
Modes of use: bright archive
"The Vault is Open" by Patrick Gage Kelley, 10/18/2008. Available on Flickr: http://www.flickr.com/photos/patrickgage/2961930014/
Modes of use: bright archive
• Provide preservation and end user access
• Option to designate collection as open to the public or keep restricted to designated users
• Can provide public access to entire object in the collection (metadata + associated files)
• Or can provide public access to view metadata while restricting associated files
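The access options above can be read as three collection settings: restricted, public metadata only, and fully public. The sketch below models them. The class and function names are hypothetical and do not reflect how Merritt stores or enforces permissions.

```python
from dataclasses import dataclass, field
from enum import Enum


class Visibility(Enum):
    RESTRICTED = "restricted"            # designated users only
    PUBLIC_METADATA = "public-metadata"  # anyone sees metadata; files restricted
    PUBLIC = "public"                    # anyone sees metadata and files


@dataclass
class Collection:
    name: str
    visibility: Visibility
    designated_users: set = field(default_factory=set)


def can_view_metadata(collection, user=None):
    """Metadata is visible to everyone unless the collection is restricted."""
    if collection.visibility in (Visibility.PUBLIC, Visibility.PUBLIC_METADATA):
        return True
    return user in collection.designated_users


def can_download_files(collection, user=None):
    """Files are open to everyone only when the whole object is public."""
    if collection.visibility is Visibility.PUBLIC:
        return True
    return user in collection.designated_users
```

A guest (user left as None) can view metadata in a public-metadata collection but cannot download its files.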
There’s an option for “guest” login
Select a collection to browse
Choosing one of the collections brings you to the list of objects
• View key citation metadata
• Download individual versions
• ... or download all versions
The permanent link is listed on the landing page for each object
Click-through agreements
• Possible to add a usage agreement at the collection level
• Every time someone tries to download an object from the collection, they would first be presented with the terms of use
• Copies sent to CDL, content owner/depositor, and end user
For collections using data use agreements (DUAs), clicking on the “Download object” button or using the download URL leads to…
Terms of use
Personal information
Once the required fields are filled in, click the Accept button…
And the download begins
Modes of use: preservation “back end”
eScholarship
UCSF Clinical & Translational Science Institute: XTF + Merritt (DataShare)
UCB Information Services and Technology: Alfresco + Merritt (Research Hub)
UCLA Library: Islandora + Merritt
Modes of use: distributed data grids
• DataONE: “Enable new science and knowledge creation through universal access to data about life on earth and the environment that sustains it”
How do I use it?
• Contact us for an account: [email protected]
• Agree to service terms and fee schedule (available soon)
• We’ll work with you to establish and configure collection(s) for depositing content:
– Level of access (dark vs. bright archive)
– User accounts and permissions to deposit and access the collection
Depositing content
• User interface: manual deposits
• METS feeder: automated deposits (METS)
• API: automated deposits
See the User’s Guide and online help for more information: http://merritt.cdlib.org/
Deposit content through the UI
• The submission package is always a single file:
– For a single object:
• A single file (image, PDF, A/V, etc.). Supply metadata through the UI
• A container (zip, gzip, tar.gz), comprising file(s) for a single object. Supply metadata through the UI, or include as a metadata file
• A manifest, enumerating file(s) for a single object. Supply metadata through the UI, or reference metadata file in the manifest
– For multiple objects:
• A manifest, enumerating files and metadata records for multiple objects
• A manifest, enumerating multiple containers
• A manifest, enumerating multiple manifests
Upload a file, container, or manifest
Optionally supply metadata for the file or container
Using manifests
• A “packing slip” for an object, providing URLs for all of the object’s file components
An Excel macro is available for automagically generating manifests. See the User’s Guide and online help for more information: http://merritt.cdlib.org/
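The “packing slip” idea can be sketched as a list that pairs each file component with the URL it can be fetched from. This is not Merritt’s manifest syntax, which is documented in the User’s Guide. The function name, the base URL, and the output shape are all assumptions made for illustration.

```python
from urllib.parse import quote


def packing_slip(base_url, file_names):
    """Pair each file name with the URL where that component can be fetched.

    Illustrative only: a real manifest follows the format described in the
    User's Guide, not this (name, url) layout.
    """
    base = base_url.rstrip("/")
    # quote() percent-encodes spaces and other unsafe characters in the name.
    return [(name, base + "/" + quote(name)) for name in file_names]


if __name__ == "__main__":
    # example.org is a placeholder host, not a real deposit location.
    for name, url in packing_slip("http://example.org/object1/",
                                  ["report.pdf", "raw data.csv"]):
        print(name, url)
```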
Providing metadata for objects
• Optional; if you don’t supply it, we’ll derive it
• How to supply metadata:
– Provide it through the UI
– Include it as part of the manifest
– Include it as a metadata file:
• Simple Dublin Core record: mrt-dc.xml
• ERC record: mrt-erc.txt
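As a rough sketch of the “include it as a metadata file” option, the following builds a zip container for a single object and adds an mrt-erc.txt file alongside the content. The file name comes from the slide. The record layout (an erc: header followed by who/what/when lines) is an assumption and should be checked against the User’s Guide, and build_container is a hypothetical helper.

```python
import io
import zipfile


def build_container(files, who, what, when):
    """Return the bytes of a zip holding the object's files plus mrt-erc.txt.

    files maps file names to their content (str or bytes).
    The ERC layout below is assumed, not taken from the Merritt documentation.
    """
    erc = f"erc:\nwho: {who}\nwhat: {what}\nwhen: {when}\n"
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
        # Metadata travels inside the container as its own file.
        archive.writestr("mrt-erc.txt", erc)
    return buffer.getvalue()


if __name__ == "__main__":
    data = build_container({"data.csv": "a,b\n1,2\n"},
                           who="Example Author",
                           what="Example dataset",
                           when="2012")
    with open("object.zip", "wb") as out:
        out.write(data)
```

The resulting single file is what would be uploaded through the UI as the submission package.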
Deposit notification
• You will receive two separate email notifications:
– Initial notification that we have received your submission, and that it is queued for subsequent processing
– Final notification that we have fully processed your submission
Future developments
• Improved download for large objects
• Moving some functionality to object level—public option, DUAs
• Updating objects with just the changed components
• Enhanced UI to open access content in Merritt (exploring use of XTF and Solr)
• Drag-and-drop submission similar to Flickr, Dropbox
For more information
Merritt repository: http://merritt.cdlib.org/
Merritt overview and resources:
http://www.cdlib.org/services/uc3/merritt/
https://confluence.ucop.edu/display/Curation/
• Development plans
• Webinars
• Use cases and deployment profiles
• Cost modeling
• TRAC self-audit
• And more...
Contact [email protected]
Service Managers
Perry Willett, 510/[email protected]
Adrian Turner, 714/[email protected]