Preservation and institutional repositories for the digital arts and humanities

Dorothea Salo, University of Wisconsin

Post on 21-Oct-2014


DESCRIPTION

For the Digital Humanities Data Curation Institute

TRANSCRIPT

Page 1: Preservation and institutional repositories for the digital arts and humanities

Dorothea Salo
University of Wisconsin

Page 2: Preservation and institutional repositories for the digital arts and humanities

Institutional repositories for the digital arts and humanities

Dorothea Salo
University of Wisconsin

Page 3: Preservation and institutional repositories for the digital arts and humanities

Preservation for the digital arts and humanities

Dorothea Salo
University of Wisconsin

Page 4: Preservation and institutional repositories for the digital arts and humanities

Dorothea Salo
University of Wisconsin

Preservation and institutional repositories for the digital arts and humanities

Page 5: Preservation and institutional repositories for the digital arts and humanities

And I said...

... you’re giving me how much time for this?

Page 6: Preservation and institutional repositories for the digital arts and humanities

Environment

•As several of you are intimately aware, higher ed is trying to figure out What To Do About Data.

•This spells opportunity... IF you can get a seat at the table, and IF you know what to ask for!

• Humanists will not be the first people they think of, sadly.

•Serious (insoluble?) problem: data diversity
• Expect compromise “solutions.”
•Do not let IT pros intimidate you.
• They do not know everything they think they know.

Page 7: Preservation and institutional repositories for the digital arts and humanities

Friendly word of advice:

PICK SOFTWARE LAST.

Photo: “Briana Calderon; future educator of america.” http://www.flickr.com/photos/46132085@N03/4703617843/

Arielle Calderon / CC-BY 2.0

Page 8: Preservation and institutional repositories for the digital arts and humanities

It’s not what the software does that’ll kill you.

IT’S WHAT THE SOFTWARE WON’T DO.

Photo: “Briana Calderon; future educator of america.” http://www.flickr.com/photos/46132085@N03/4703617843/

Arielle Calderon / CC-BY 2.0

Page 9: Preservation and institutional repositories for the digital arts and humanities

Another friendly word of advice:

DON’T CHASE THE SHINY.

Photo: “Sparkle Texture” http://www.flickr.com/photos/abbylanes/3214921616/ Abby Lane / CC-BY 2.0

Page 10: Preservation and institutional repositories for the digital arts and humanities

In five years...

it’s much less shiny.

Photo: “Sparkle Texture” http://www.flickr.com/photos/abbylanes/3214921616/ Abby Lane / CC-BY 2.0

Page 11: Preservation and institutional repositories for the digital arts and humanities

In ten years...

it’s not shiny at all.

Photo: “Sparkle Texture” http://www.flickr.com/photos/abbylanes/3214921616/ Abby Lane / CC-BY 2.0

Page 12: Preservation and institutional repositories for the digital arts and humanities

In twenty years...

it’s probably useless.

Page 13: Preservation and institutional repositories for the digital arts and humanities

NOT A SOLUTION: your graduate students

•You have a bright, tech-savvy grad student.
•She builds an Awesome Tech Thing.
•You have no idea how it works.
•She graduates. You’re hosed.
• Because she didn’t (know how to) build it sustainably...
• Because you don’t have any documentation...
• Because nobody made contingency plans for it...
• I have seen this pattern over and over again.
• It’s killed more digital culture and research materials than anything I can think of in academe.

Page 14: Preservation and institutional repositories for the digital arts and humanities

Am I saying “don’t experiment?”

•Nah, of course not.
• I’m saying “know what an experiment means.”
• I’m saying “don’t mistake an experiment for an archive.”
• I’m saying “don’t experiment and then expect everybody else to pick up your pieces because you didn’t plan for metadata or preservation.”

Page 15: Preservation and institutional repositories for the digital arts and humanities

That said?

•You gotta do what you gotta do.
•Some friendly advice:
• Know where the exits are. (Can you export your data? In a reusable format?)
• When you finish a project, USE that export.
• Triply true if you’re relying on the cloud!

•Your overriding goal, while a project is in progress: keep your eventual options open!

•Long-term... is a totally other kettle of fish.
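
One way to act on “know where the exits are” is to check that project records can leave the tool as plain, widely readable text. The following is a minimal Python sketch, not a recommendation of any particular tool: the function name `export_records` and the sample records are hypothetical, and it assumes the project data can be represented as flat records with a fixed set of fields.

```python
import csv
import io
import json


def export_records(records, fieldnames):
    """Serialize a list of dicts to CSV text and JSON text.

    Both are plain-text formats that most software can read back in.
    """
    csv_buffer = io.StringIO()
    writer = csv.DictWriter(csv_buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for record in records:
        # Missing fields are written as empty cells rather than raising an error.
        writer.writerow({name: record.get(name, "") for name in fieldnames})
    json_text = json.dumps(records, ensure_ascii=False, indent=2)
    return csv_buffer.getvalue(), json_text


# Hypothetical project data, invented purely for illustration.
records = [
    {"id": "letter-001", "title": "Letter to a publisher", "year": "1843"},
    {"id": "letter-002", "title": "Reply", "year": "1844"},
]
csv_text, json_text = export_records(records, ["id", "title", "year"])
```

Writing both formats is a deliberate redundancy: CSV opens in a spreadsheet, while JSON keeps any nesting the records might later acquire. Real projects with richer structure (TEI, images, relational data) would need a different export path.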

Page 16: Preservation and institutional repositories for the digital arts and humanities

Your best strategy

•The single best strategy for a digital humanist concerned about long-term preservation...
• ... is to figure out how to make it Somebody Else’s Problem.
• Right now, this is hard. I do believe it will get easier.
• It’s a lot easier to figure this out from the start than at the end.
• Different Somebody Elses will have different things that they want. If you know that from the get-go, you’re much better off.

Page 25: Preservation and institutional repositories for the digital arts and humanities

Institution-internal solutions

•Rolling your own
• Please don’t, if you can possibly avoid it.
•Adopting open-source software
• e.g. Omeka, Dataverse, ArchivesSpace...
• Better, but not foolproof. Upgrades? Security? Backups?
• Writing plugins/mods = rolling your own. Avoid if possible.
•Adopting institutional infrastructure
• Make sure it’ll survive your departure from the institution!

Page 32: Preservation and institutional repositories for the digital arts and humanities

Outside the institution

•Lists of data repositories
• Databib: http://databib.org/
• re3data: http://re3data.org/
• N.b. you will find less here on the humanities than you would probably prefer. Long story.
•Figshare
• ... and other web services springing up, e.g. omeka.net

Page 33: Preservation and institutional repositories for the digital arts and humanities

You will be limited by...

• Infrastructure your library/IT has already committed to
• this is why you want to be in on ground-floor discussions!

•Their willingness and ability to tweak, rewrite, or replace it with something suiting your needs

•Your willingness and ability to evaluate, install, and maintain a software stack that suits you

• ... perhaps indefinitely!

•The availability of hosted solutions, and your ability to pay for them (perhaps indefinitely!)

Page 34: Preservation and institutional repositories for the digital arts and humanities

You need to know what the options are like.

•Your library and IT folks may well need guidance. At minimum, they need clearly-expressed requirements.

•The requirements you give them need to go beyond end-user access, use, and UI.

• Back end: getting material in as efficiently as possible, allowing for additions/changes/deletions

• Preservation requirements
• Data and metadata purity, clarity, preservability, reusability, mashuppability, migratability, standards

Page 35: Preservation and institutional repositories for the digital arts and humanities

Institutional repositories

Page 36: Preservation and institutional repositories for the digital arts and humanities

What’s an IR?

•“‘[A]ttics’ (and often fairly empty ones), with random assortments of content of questionable importance”

• Brown, Griffiths, Rascoff, “University publishing in a digital age.” Ithaka 2007. http://www.sr.ithaka.org/research-publications/university-publishing-digital-age

•A basic digital preservation-and-access platform designed to allow faculty to deposit and describe single PDFs.

•Quite commonly available in research libraries or through library consortia.

• You probably have one available to you!

Page 37: Preservation and institutional repositories for the digital arts and humanities

IR software

•Open source
• Fedora Commons: http://fedora-commons.info/ (you’ll need a layer on top of this)
• DSpace: http://dspace.org/
• EPrints: http://eprints.org/
•Commercial
• ContentDM: http://contentdm.com/
• DigiTool: http://www.exlibrisgroup.com/category/DigiToolOverview
•Hosted
• ContentDM: http://contentdm.com/
• BePress: http://bepress.com/

Page 38: Preservation and institutional repositories for the digital arts and humanities

Two minutes!

•Find an IR available to you for depositing content.

Page 39: Preservation and institutional repositories for the digital arts and humanities

You can typically expect...

•To get in touch with someone in the library to get an account set up, and a space for you to deposit into
• Have a collection name and description ready.
• Default descriptors, if you have any, also a good idea.
• Need access controls? To delegate deposit? Talk about this.

•To be able to put materials in on your own, through web forms

• To find the deposit process fiddly and annoying

•To have material appear on the web right after deposit.
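
The “default descriptors” idea above can be pictured as collection-wide metadata that every deposited item inherits unless it says otherwise. Here is a small Python sketch of that merge. It does not reflect how any particular IR platform implements defaults; the function `describe_item`, the field names, and the sample values are all assumptions made up for illustration.

```python
def describe_item(defaults, item):
    """Merge collection-wide default descriptors with one item's own metadata.

    Item values win over defaults; empty item values fall back to the default.
    The defaults dict itself is left unmodified.
    """
    merged = dict(defaults)
    for key, value in item.items():
        if value not in ("", None):
            merged[key] = value
    return merged


# Hypothetical collection defaults and item, invented purely for illustration.
collection_defaults = {
    "creator": "Example Project Team",
    "rights": "CC-BY 4.0",
    "language": "en",
}
item = {"title": "Interview transcript 12", "language": "", "date": "2013-05-02"}
record = describe_item(collection_defaults, item)
```

Agreeing on the defaults before the first deposit means less retyping in those fiddly web forms and more consistent description across the collection.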

Page 40: Preservation and institutional repositories for the digital arts and humanities

IRs work for...

•Small(ish), discrete files that never change
• So an Excel-using researcher is just fine with an IR.
•Documentation for data held elsewhere
•Some IRs can handle static website captures.
•Files with uncomplicated IP lives
• ... which complicates the “static website” question.

•Access restriction may be possible, as may dark archiving; it depends on the IR platform. Expect it to be annoying to implement, though.

Page 41: Preservation and institutional repositories for the digital arts and humanities

IRs don’t work for

•Really Big Data
• including, sometimes, audio and video
• This is less a reflection on IR software than on most IRs being horribly underprovisioned with storage and bandwidth.
•Work in progress; files that may change or be updated
•Complex digital objects (except static websites)
•Digital objects that need interactivity
• Even something as simple as video streaming. IRs can’t.

•Anything that needs a DOI. (You’ll get a permanent identifier, but it won’t be a DOI.)

•Datasets where the researcher wants to vet any potential reusers

Page 42: Preservation and institutional repositories for the digital arts and humanities

Digital libraries

Page 43: Preservation and institutional repositories for the digital arts and humanities

Digital-library software

• Omeka, Greenstone (aging), ContentDM...
• Again, chances are your library already has some kind of digital-collections software.
• Go ask a librarian what it is, and whether you can add material to it!
• Also ask if it’s attached to any kind of digitization or metadata-help service. It may not be, but you never know.
• If not, there are hosted options
• if you’re prepared to pay for them indefinitely.
• Designed for image exhibitions
• May extend to audio and video, but UI won’t be ideal.
• May extend to page-scanned books, but may not. (Omeka is terrible at these.)

Page 44: Preservation and institutional repositories for the digital arts and humanities

Be aware

•The digital-preservation underpinnings of this class of software are weak to nonexistent.

• It’s designed for exhibiting, not for archiving!
• It may also entice you into poor sustainability decisions, such as using web-friendly but lossy JPG as your master image format. Or not making backups.

•On the plus side, though, if it’s a library service, the library feels an institutional commitment to the materials in it.

• That’s a lot of the preservation battle won, right there.
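
Since exhibit software will not watch over master files or backups for you, one modest safeguard is to record a checksum for every master file and compare later copies against it. The Python sketch below illustrates that idea using only the standard library. The helper names (`sha256_of`, `build_manifest`, `changed_files`) and the sample file are hypothetical, and this is an illustration under those assumptions rather than a substitute for real preservation tooling.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory):
    """Map each file's path (relative to the directory) to its checksum."""
    root = Path(directory)
    return {
        str(path.relative_to(root)): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(old_manifest, new_manifest):
    """List files whose checksum differs or that are missing from the newer manifest."""
    return sorted(
        name
        for name, checksum in old_manifest.items()
        if new_manifest.get(name) != checksum
    )


# Demonstration with a throwaway directory and a made-up "master image".
with tempfile.TemporaryDirectory() as workdir:
    master = Path(workdir) / "page-001.tif"
    master.write_bytes(b"pretend this is a lossless master image")
    before = build_manifest(workdir)
    master.write_bytes(b"pretend this file was silently damaged")
    after = build_manifest(workdir)
    damaged = changed_files(before, after)
```

Keeping the manifest alongside the backup, and re-running the comparison now and then, turns “we made backups” into “we know the backups still match.”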

Page 45: Preservation and institutional repositories for the digital arts and humanities

Emerging solutions

•Often involve combining software to attack different parts of the problem

• Preservation underlayers: Fedora Commons, microservices
• Deposit and management UI: Hydra, Islandora
• End-user UI: Hydra, Islandora, Omeka, plugins, mods, etc.
•Are still pretty DIY at this point
• If your library is doing active development, you’re one of the lucky ones.
• The rest of you may have to wait. And lobby.

Page 46: Preservation and institutional repositories for the digital arts and humanities

Archives platforms

•Designed for coping with an undifferentiated mess of random digital stuff.

• I know, right? Nobody has that problem...

•Not usually designed to help other people use or interact with that stuff.

• Also, designed for archivists’ ways of thinking. Archivists are humanists, but not all humanists are archivists.

• Worth getting a software tour from an archivist!

•Archivematica, ArchivesSpace (in beta), Duke Data Accessioner, CollectiveAccess, BitCurator

Page 47: Preservation and institutional repositories for the digital arts and humanities

Data-management platforms

•Usually designed for the sciences, not the humanities!
• But that doesn’t necessarily mean they won’t work for what you have in mind.
• (“E-lab notebooks” will probably feel pretty foreign, though.)
•Look at Dataverse Network, http://thedata.org/
•Gigantor lists of everything ever:
• http://www.dcc.ac.uk/resources/external/tools-services/archiving-and-preserving-information
• http://dirt.projectbamboo.org/categories/publishing
• http://foss4lib.org/packages
• Less helpful than you might think; there’s rarely any decision apparatus alongside.

Page 48: Preservation and institutional repositories for the digital arts and humanities

Dorothea’s cantankerous, crabby, cynical, crude, choleric, churlish, other-words-beginning-with-C take on digital humanists working with librarians and IT pros

Page 49: Preservation and institutional repositories for the digital arts and humanities

Neil Gaiman on George R.R. Martin and his eager fans

CENSORED

CENSORED

From: http://journal.neilgaiman.com/2009/05/entitlement-issues.html

Page 50: Preservation and institutional repositories for the digital arts and humanities

Digital humanists:

Librarians and IT professionals are not your bitch.

CENSORED

Page 51: Preservation and institutional repositories for the digital arts and humanities

Not entirely your fault!

•$$$ is a consideration, unfortunately
• The sciences have it. Unfortunately.
•Your colleagues may have poisoned the well by being prima donnas, even though you’re not!
•Different professional-advancement infrastructure
•We may just. not. be. ready.
• Or the infrastructure we rely on may not be.

Page 52: Preservation and institutional repositories for the digital arts and humanities

DH and libraries should be friends

•Involve the library from the outset.
• Please do NOT ask us to pick up your messes at the end!
• Expect us to have work for you to do, and quality expectations.
• Yes, I know that’s how it used to work with analog materials and archivists. Digital is different.
•Come to us in groups.
• We serve all of campus. We cannot afford to move heaven and earth for any one person. Please don’t be a prima donna!
• At minimum, have an idea how what you’re asking will concretely benefit other campus constituencies.
•Offer quid pro quo. What’s in it for us?
• (Library advocacy in high places is always a good trade.)
•Be patient, please.
• We don’t turn on a dime.

Page 53: Preservation and institutional repositories for the digital arts and humanities

Will this always work?

•Sadly, no.
•A good many libraries are just not ready to take digital preservation and DH support seriously.
• The presence of a “DH center” in the library is not always proof of serious intent.
•Others have been burned before.
•Still others are skeptics.
• I can’t promise you’ll find help in the library, or with campus IT. I can promise you won’t if you don’t seriously approach them.
• (Miriam Posner’s article is a must-read!)

Page 54: Preservation and institutional repositories for the digital arts and humanities

Thanks! Questions?