Auditing PLN’s: Preliminary Results and Next Steps


Page 1: Auditing PLN’s: Preliminary Results and Next Steps

Auditing PLN’s: Preliminary Results and Next Steps

Prepared for PLN 2012

UNC, Chapel Hill, October 2012

Micah Altman, Director of Research, MIT Libraries

Nonresident Senior Fellow, The Brookings Institution

Jonathan Crabtree, Assistant Director of Computing and Archival Research, HW Odum Institute for Research in Social Science, UNC

Page 2: Auditing PLN’s: Preliminary Results and Next Steps

Collaborators*

• Nancy McGovern
• Tom Lipkis & the LOCKSS Team

Research Support: Thanks to the Library of Congress, the National Science Foundation, IMLS, the Sloan Foundation, the Harvard University Library, the Institute for Quantitative Social Science, and the Massachusetts Institute of Technology.


* And co-conspirators

Page 3: Auditing PLN’s: Preliminary Results and Next Steps

Related Work

Reprints available from: micahaltman.com

• M. Altman & J. Crabtree, 2011. “Using the SafeArchive System: TRAC-Based Auditing of LOCKSS.” Proceedings of Archiving 2011, Society for Imaging Science and Technology.
• M. Altman, B. Beecher, & J. Crabtree, 2009. “A Prototype Platform for Policy-Based Archival Replication.” Against the Grain 21(2), 44-47.


Page 4: Auditing PLN’s: Preliminary Results and Next Steps

Preview

• Why audit?
• Theory & Practice
  – Round 0: Setting up the Data-PASS PLN
  – Round 1: Self-Audit
  – Round 2: Compliance (almost)
  – Round 3: Auditing Other Networks

• What’s next?


Page 5: Auditing PLN’s: Preliminary Results and Next Steps

Why audit?


Page 6: Auditing PLN’s: Preliminary Results and Next Steps

Short Answer: Why the heck not?


“Don’t believe in anything you hear, and only half of what you see”

- Lou Reed

“Trust, but verify.”

- Ronald Reagan

Page 7: Auditing PLN’s: Preliminary Results and Next Steps

Slightly Long Answer: Things Go Wrong

• Media
• Physical & Hardware
• Software
• Curatorial Error
• Organizational Failure
• Insider & External Attacks

Page 8: Auditing PLN’s: Preliminary Results and Next Steps

Full Answer: It’s our responsibility


Page 9: Auditing PLN’s: Preliminary Results and Next Steps

OAIS Model Responsibilities
• Accept appropriate information from Information Producers.
• Obtain sufficient control of the information to ensure long-term preservation.
• Determine which groups should become the Designated Community able to understand the information.
• Ensure that the preserved information is independently understandable to the Designated Community.
• Ensure that the information can be preserved against all reasonable contingencies.
• Ensure that the information can be disseminated as authenticated copies of the original, or as traceable back to the original.
• Make the preserved data available to the Designated Community.


Page 10: Auditing PLN’s: Preliminary Results and Next Steps

OAIS Basic Implied Trust Model
• Organization is axiomatically trusted to identify designated communities
• Organization is engineered with the goal of:
  – Collecting appropriate, authentic documents
  – Reliably delivering authentic documents, in understandable form, at a future time
• Success depends upon:
  – Reliability of storage systems: e.g., LOCKSS network, Amazon Glacier
  – Reliability of organizations: MetaArchive, Data-PASS, Digital Preservation Network
  – Document contents and properties: formats, metadata, semantics, provenance, authenticity


Page 11: Auditing PLN’s: Preliminary Results and Next Steps

Reflections on OAIS Trust Model

• Specific bundle of trusted properties
• Not complete, either instrumentally or ultimately


Page 12: Auditing PLN’s: Preliminary Results and Next Steps

Trust Engineering Approaches
• Incentive-based approaches:
  – Rewards, penalties, incentive-compatible mechanisms
• Modeling and analysis:
  – Statistical quality control & reliability estimation, threat modeling and vulnerability assessment
• Portfolio theory:
  – Diversification (financial, legal, technical…); hedges
• Over-engineering approaches:
  – Safety margin, redundancy
• Informational approaches:
  – Transparency (release of information needed to directly evaluate compliance); cryptographic signatures, fingerprints, common knowledge, non-repudiation
• Social engineering:
  – Recognized practices; shared norms
  – Social evidence
  – Reduce provocations
  – Remove excuses
• Regulatory approaches:
  – Disclosure; review; certification; audits; regulations & penalties
• Security engineering:
  – Increase effort: harden target (reduce vulnerability); increase technical/procedural controls
  – Increase risk: surveillance, detection, likelihood of response
  – Design patterns: minimal privileges, separation of privileges
  – Reduce reward: deny benefits, disrupt markets, identify property, remove/conceal targets

Page 13: Auditing PLN’s: Preliminary Results and Next Steps

Audit [aw-dit]:

An independent evaluation of records and activities to assess a system of controls

Fixity mitigates risk only if used for auditing.
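The slides contain no code, but as a minimal sketch of the kind of fixity check an audit would run against stored content (the manifest layout and the choice of SHA-256 are assumptions for illustration, not anything specified in the deck):

```python
import hashlib
from pathlib import Path

def sha256(path: Path) -> str:
    """Stream a file through SHA-256 so large objects need not fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def fixity_audit(manifest: dict[str, str], root: Path) -> list[str]:
    """Compare current digests against a stored manifest; return paths that fail the check."""
    failures = []
    for rel_path, expected in manifest.items():
        p = root / rel_path
        if not p.exists() or sha256(p) != expected:
            failures.append(rel_path)
    return failures
```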

Page 14: Auditing PLN’s: Preliminary Results and Next Steps

Functions of Storage Auditing

• Detect corruption/deletion of content

• Verify compliance with storage/replication policies

• Prompt repair actions

Page 15: Auditing PLN’s: Preliminary Results and Next Steps

Bit-Level Audit Design Choices
• Audit regularity and coverage: on-demand (manually); on object access; on event; randomized sample; scheduled/comprehensive
• Fixity check & comparison algorithms
• Auditing scope: integrity of object; integrity of collection; integrity of network; policy compliance; public/transparent auditing
• Trust model
• Threat model

Page 16: Auditing PLN’s: Preliminary Results and Next Steps

Repair: Key Design Elements

• Repair granularity
• Repair trust model
• Repair latency:
  – Detection to start of repair
  – Repair duration
• Repair algorithm

Auditing mitigates risk only if used for repair.
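As a companion sketch of how an audit result might drive repair, with file-level granularity and a "trust any peer whose copy still verifies" repair trust model (the helper names and copy-from-peer strategy are invented for illustration; in a LOCKSS network the daemons themselves perform repair):

```python
import hashlib
import shutil
from pathlib import Path

def sha256(path: Path) -> str:
    # Whole-file read is fine for a sketch; stream in practice.
    return hashlib.sha256(path.read_bytes()).hexdigest()

def repair(failures: list[str], local_root: Path, peer_roots: list[Path],
           manifest: dict[str, str]) -> list[str]:
    """Repair damaged files from peer copies that still match the manifest; return what's left."""
    unrepaired = []
    for rel_path in failures:
        fixed = False
        for peer in peer_roots:                 # repair trust model: any peer whose copy verifies
            candidate = peer / rel_path
            if candidate.exists() and sha256(candidate) == manifest[rel_path]:
                shutil.copyfile(candidate, local_root / rel_path)   # repair granularity: one file
                fixed = True
                break
        if not fixed:
            unrepaired.append(rel_path)
    return unrepaired
```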

Page 17: Auditing PLN’s: Preliminary Results and Next Steps

LOCKSS Auditing & Repair
Decentralized, peer-to-peer, tamper-resistant replication & repair

Regularity: Scheduled
Algorithms: Bespoke, peer-reviewed, tamper-resistant
Scope: Collection integrity; collection repair
Trust model: Publisher is canonical source of content; changed content treated as new; replication peers are untrusted
Main threat models: Media failure; physical failure; curatorial error; external attack; insider threats; organizational failure
Key auditing limitations: Correlated software failure; lack of policy auditing and of public/transparent auditing
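LOCKSS’s actual polling protocol is far more sophisticated and tamper-resistant than this, but purely as a toy illustration of the idea that untrusted peers vote on content agreement (peer names and hash values are made up):

```python
from collections import Counter

def poll(peer_hashes: dict[str, str]) -> tuple[str | None, list[str]]:
    """Toy majority vote over per-peer content hashes for one AU.

    Returns the winning hash (or None if there is no majority) and the
    peers that disagree with it and therefore need repair."""
    counts = Counter(peer_hashes.values())
    winner, votes = counts.most_common(1)[0]
    if votes <= len(peer_hashes) / 2:
        return None, list(peer_hashes)      # no majority: can't tell which copies are damaged
    losers = [peer for peer, h in peer_hashes.items() if h != winner]
    return winner, losers

# e.g. poll({"boxX": "aa", "boxY": "aa", "boxZ": "bb"}) -> ("aa", ["boxZ"])
```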

Page 18: Auditing PLN’s: Preliminary Results and Next Steps

Auditing & Repair (SafeArchive)
TRAC-aligned policy auditing as an overlay network

Regularity: Scheduled; manual
Fixity algorithms: Relies on underlying replication system
Scope: Collection integrity; network integrity; network repair; high-level (e.g., TRAC) policy auditing
Trust model: External auditor, with permissions to collect metadata/log information from the replication network; replication network is untrusted
Main threat models: Software failure; policy implementation failure (curatorial error; insider threat); organizational failure; media/physical failure through underlying replication system
Key auditing limitations: Relies on the underlying replication system, (now) LOCKSS, for fixity check and repair
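SafeArchive’s real policy schema is not shown in the deck; purely as an invented illustration of what checking a TRAC-aligned replication policy against a harvested network map might look like (the replica fields, thresholds, and report format are all assumptions):

```python
from dataclasses import dataclass

@dataclass
class Replica:
    host: str
    region: str
    au_ids: set[str]   # AUs this box reports holding

def check_policy(collection_aus: set[str], replicas: list[Replica],
                 min_copies: int = 3, min_regions: int = 2) -> dict[str, str]:
    """Report, per AU, whether the observed network map satisfies a toy replication policy."""
    report = {}
    for au in collection_aus:
        holders = [r for r in replicas if au in r.au_ids]
        regions = {r.region for r in holders}
        if len(holders) >= min_copies and len(regions) >= min_regions:
            report[au] = "compliant"
        else:
            report[au] = f"non-compliant ({len(holders)} copies, {len(regions)} regions)"
    return report
```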

Page 19: Auditing PLN’s: Preliminary Results and Next Steps

Theory vs. Practice Round 0: Setting up the Data-PASS PLN


“Looks ok to me”

- PHB Motto

Page 20: Auditing PLN’s: Preliminary Results and Next Steps

Theory


Expose content (through OAI + DDI + HTTP) → Install LOCKSS (on 7 servers) → Harvest content (through OAI plugin) → Set up PLN configurations (through OAI plugin) → LOCKSS magic → Done

Page 21: Auditing PLN’s: Preliminary Results and Next Steps

Theory
Expose content (through OAI + DDI + HTTP) → Install LOCKSS (on 7 servers) → Harvest content (through OAI plugin) → Set up PLN configurations (through OAI plugin) → LOCKSS magic → Done

Practice (Year 1)
• OAI plugin extensions required:
  – Non-DC metadata
  – Large metadata
  – Alternate authentication method
  – Save metadata record
  – Support for OAI sets
  – Non-fatal error handling
• OAI provider required:
  – Authentication extensions
  – Performance handling for delivery
  – Performance handling for errors
  – Metadata validation
• PLN configuration required:
  – Stabilization around LOCKSS versions
  – Coordination around plugin repository
  – Coordination around AU definition

Page 22: Auditing PLN’s: Preliminary Results and Next Steps

Theory vs. Practice
Round 1: Self-Audit


“A mere matter of implementation”

- PHB Motto

Page 23: Auditing PLN’s: Preliminary Results and Next Steps

Theory


Gather information from each replica → Integrate information → Map network state → Compare current network state to policy → State == policy? If YES: Success. If NO: Add replica and repeat.
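A runnable toy of the loop in the diagram above, with the gather/integrate steps reduced to sets of AU identifiers and a minimum-copies policy (all names and the spare-pool mechanism are invented for illustration, not SafeArchive’s API):

```python
def self_audit(replicas: list[set[str]], required_aus: set[str],
               min_copies: int, spare_pool: list[set[str]]) -> str:
    """Gather -> integrate -> compare-to-policy -> add-replica loop."""
    replicas = list(replicas)
    while True:
        # Gather + integrate: count copies of each required AU across the network.
        copies = {au: sum(au in r for r in replicas) for au in required_aus}
        # Compare current network state to policy.
        if all(n >= min_copies for n in copies.values()):
            return "success"
        if not spare_pool:
            return "non-compliant: no replicas left to add"
        # Otherwise add a replica and audit again.
        replicas.append(spare_pool.pop())

# e.g. self_audit([{"au1"}, {"au1", "au2"}], {"au1", "au2"}, 2, [{"au1", "au2"}])
#      returns "success" after one replica is added
```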

Page 24: Auditing PLN’s: Preliminary Results and Next Steps

Implementation: www.safearchive.org

Page 25: Auditing PLN’s: Preliminary Results and Next Steps

Practice (Year 2)
• Gathering information required:
  – Permissions
  – Reverse-engineering UIs (with help)
  – Network magic
• Integrating information required:
  – Heuristics for lagged information
  – Heuristics for incomplete information
  – Heuristics for aggregated information
• Comparing map to policy required:
  – A mere matter of implementation
• Adding replicas:
  – Uh-oh: most policies failed, and adding replicas wasn’t going to resolve most issues

Theory
Gather information from each replica → Integrate information → Map network state → Compare current state map to policy → State == policy? If YES: Success. If NO: Add replica and repeat.

Page 26: Auditing PLN’s: Preliminary Results and Next Steps

Theory vs. Practice
Round 2: Compliance (almost)


“How do you spell ‘backup’? R-E-C-O-V-E-R-Y”

Page 27: Auditing PLN’s: Preliminary Results and Next Steps

Practice (and adjustment) makes perfekt?

• Timings (e.g., crawls, polls)
  – Understand
  – Tune
  – Parameterize heuristics, reporting
  – Track trends over time
• Collections
  – Change partitioning to AU’s at source
  – Extend mapping to AU’s in plugin
  – Extend reporting/policy framework to group AU’s
• Diagnostics
  – When things go wrong, information to inform adjustment


Page 28: Auditing PLN’s: Preliminary Results and Next Steps

Theory vs. Practice
Round 3: Auditing Other PLNs


“In theory, theory and practice are the same – in practice, they differ.”

Page 29: Auditing PLN’s: Preliminary Results and Next Steps

Theory


Gather information from each replica → Integrate information → Map network state → Compare current network state to policy → State == policy? If YES: Success. If NO: AU sizes and polling intervals already adjusted? If NO: Adjust AU sizes and polling intervals, then re-audit. If YES: Add replica.

Page 30: Auditing PLN’s: Preliminary Results and Next Steps

Practice (Year 3)
• 100% of what?
• Diagnostic inference

Theory
Gather information from each replica → Integrate information → Map network state → Compare current network state to policy → State == policy? If YES: Success. If NO: AU sizes and polling intervals already adjusted? If NO: Adjust them, then re-audit. If YES: Add replica.

Page 31: Auditing PLN’s: Preliminary Results and Next Steps

100% of what?

• No: Of LOCKSS boxes?
• No: Of AU’s?
• Almost: Of policy overall
• Yes: Of policy for a specific collection
• Maybe: Of files?
• Maybe: Of bits in a file?

Page 32: Auditing PLN’s: Preliminary Results and Next Steps

What you see


Boxes X, Y, Z all agree on AU A.

What you can conclude:
• Boxes X, Y, Z have the same content
• The content is good

Assumption: Failures on file harvest are independent, and the number of harvested files is large.
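Illustrative arithmetic, not from the slides: under the independence assumption, with an assumed per-box damage probability p, agreement across three boxes is overwhelmingly more likely to mean good content than identical damage.

```python
# Suppose each box independently ends up with a damaged copy of a given file
# with probability p (value chosen only for illustration), and that independent
# failures virtually never produce bit-identical damage.
p = 0.01
prob_all_good = (1 - p) ** 3        # all three copies fine and therefore agree
prob_identical_bad = p ** 3         # upper bound: even if damage were always bit-identical
print(prob_all_good, prob_identical_bad)   # ~0.9703 vs ~1e-06
```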

Page 33: Auditing PLN’s: Preliminary Results and Next Steps

What you see


Boxes X, Y, Z don’t agree.

What can you conclude?

Page 34: Auditing PLN’s: Preliminary Results and Next Steps

Hypothesis 1: Disagreement is real, but doesn’t really matter.
Non-substantive AU differences (arising from dynamic elements in AU’s that have no bearing on the substantive content):
1.1 Individual URLs/files that are dynamic and non-substantive (e.g., logo images, plugins, Twitter feeds, etc.) cause content changes (this is common in the GLN).
1.2 Dynamic content embedded in substantive content (e.g., a customized per-client header page embedded in the PDF for a journal article).

Hypothesis 2: Disagreement is real, but doesn’t really matter in the longer run (even if disagreement persists over the long run!)
2.1 Temporary AU differences: versions of objects temporarily out of sync (e.g., if harvest frequency << source update frequency, but harvest times across boxes vary significantly).
2.2 Objects temporarily missing (e.g., recently added objects are picked up by some replicas, not by others).

Hypothesis 3: Disagreement is real, and it matters.
Substantive AU differences:
3.1 Content corruption (e.g., from corruption in storage, or during transmission/harvesting).
3.2 Objects persistently missing from some replicas (e.g., because of a permissions issue at the provider; technical failures during harvest; plugin problems).
3.3 Versions of objects persistently missing or out of sync on some replicas (e.g., harvest frequency > source update frequency, leading to different replicas harvesting different versions of the content). Note that later “agreement” signifies that a particular version was verified, not that all versions have been replicated and verified.

Hypothesis 4: AU’s really do agree, but we think they don’t.
4.1 Appearance of disagreement caused by incomplete diagnostic information: poll data are missing as a result of system reboot, daemon updates, or other causes.
4.2 Poll data are lagging (from different periods); polls fail, but contain information about agreement that is ignored.

Page 35: Auditing PLN’s: Preliminary Results and Next Steps


Page 36: Auditing PLN’s: Preliminary Results and Next Steps

Design Challenge

• Create more sophisticated algorithms and

• Instrument PLN data collection

Such that

Observed behavior allows us to distinguish between hypotheses 1-4.
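As a sketch of the kind of triage rule this design challenge calls for (the diagnostic fields below are invented; real LOCKSS/SafeArchive poll data would look different), mapping observable evidence onto Hypotheses 1-4:

```python
from dataclasses import dataclass

@dataclass
class Disagreement:
    """Invented diagnostic record for one AU disagreement (not real LOCKSS/SafeArchive output)."""
    poll_data_complete: bool           # do we have poll results from every box?
    poll_data_current: bool            # are the poll results from the same period?
    differing_urls_substantive: bool   # do the differing URLs carry substantive content?
    persists_across_harvests: bool     # does the difference survive repeated harvest cycles?

def triage(d: Disagreement) -> str:
    """Map observable evidence onto the hypothesis classes from the previous slide."""
    if not (d.poll_data_complete and d.poll_data_current):
        return "H4: apparent disagreement (incomplete or lagging poll data)"
    if not d.differing_urls_substantive:
        return "H1: real but non-substantive (dynamic elements)"
    if not d.persists_across_harvests:
        return "H2: real but temporary (objects/versions out of sync)"
    return "H3: real and substantive (corruption or persistently missing content)"
```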


Page 37: Auditing PLN’s: Preliminary Results and Next Steps

Approaches to Design Challenge

[Tom Lipkis’s Talk]


Page 38: Auditing PLN’s: Preliminary Results and Next Steps

What’s Next?


“It’s tough to make predictions, especially about the future”

- Attributed to Woody Allen, Yogi Berra, Niels Bohr, Vint Cerf, Winston Churchill, Confucius, Disreali [sic], Freeman Dyson, Cecil B. DeMille, Albert Einstein, Enrico Fermi, Edgar R. Fiedler, Bob Fourer, Sam Goldwyn, Allan Lamport, Groucho Marx, Dan Quayle, George Bernard Shaw, Casey Stengel, Will Rogers, M. Taub, Mark Twain, Kerr L. White, and others

Page 39: Auditing PLN’s: Preliminary Results and Next Steps

Short Term

• Complete round 3 data collection
• Refinements of current auditing algorithms
  – More tunable parameters (yeah?!)
  – Better documentation
  – Simple health metrics
• Reports and dissemination


Page 40: Auditing PLN’s: Preliminary Results and Next Steps

Longer Term

• Health metrics, diagnostics, decision support
• Additional audit standards
• Support additional replication networks
• Audit other policy sets


Page 41: Auditing PLN’s: Preliminary Results and Next Steps

Bibliography (Selected)

• B. Schneier, 2012. Liars and Outliers. John Wiley & Sons.
• H.M. Gladney & J.L. Bennett, 2003. “What Do We Mean by Authentic?” D-Lib Magazine 9(7/8).
• K. Thompson, 1984. “Reflections on Trusting Trust.” Communications of the ACM 27(8), August 1984, pp. 761-763.
• D.S.H. Rosenthal, T.S. Robertson, T. Lipkis, V. Reich, & S. Morabito, 2005. “Requirements for Digital Preservation Systems: A Bottom-Up Approach.” D-Lib Magazine 11(11), November 2005.
• Reference Model for an Open Archival Information System (OAIS). CCSDS 650.0-B-1, Blue Book, January 2002.


Page 42: Auditing PLN’s: Preliminary Results and Next Steps

Questions?

E-mail: [email protected]

Web: micahaltman.com
Twitter: @drmaltman

E-mail: [email protected]
