NetarchiveSuite
Sabine Schostag, The Netarchive
How we use NetarchiveSuite
Questions and answers on NetarchiveSuite:
Lifecycle: What aspects of the web archiving life cycle model does the tool cover? What aspects of the model would you like to/do you intend to build into the tool? What functionality does the tool provide that isn't reflected in the model?
Development: What resources are committed to the tool's ongoing development? What are major features in the roadmap? Is the code open source?
Adoption: What is the user base for the tool? How environment-specific is the tool as opposed to readily reusable by other organizations?
Functionality: What are the tool's unique features? What are its shortcomings?
NetarchiveSuite
Lifecycle: What aspects of the web archiving life cycle model does the tool cover?
What aspects of the model would you like to/do you intend to build into the tool?
Extended documentation, search functions, time schedules ≤ 1 hour
What functionality does the tool provide that isn't reflected in the model?
Time schedules: min. once an hour, max. ??
Development: What resources are committed to the tool's ongoing development?
2.6 MP
What are major features in the roadmap?
Technical improvements
Upgrade to or support for Heritrix 3
Replacing the current NetarchiveSuite Archive module
Better integration of documentation
Is the code open source?
https://sbforge.org/display/NASDOC42/NetarchiveSuite+Overview
Adoption: What is the user base for the tool? How environment-specific is the tool as opposed to readily reusable by other organizations?
Even though the NetarchiveSuite software is developed in Java, and is therefore mostly platform independent, we do have a couple of external calls to the Unix sort command. The parts of our software that use this external command therefore run only on Linux/Unix, or on Windows with Cygwin installed.
See the installation manual: https://sbforge.org/display/NASDOC42/Installation+Overview
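To illustrate why an external call to sort ties otherwise portable Java code to a Unix-like environment, here is a minimal sketch. It is not taken from the NetarchiveSuite code base: the class name ExternalSortSketch and the method sortLines are hypothetical, and the sketch assumes Java 9 or later and a POSIX sort command on the PATH (as on Linux/Unix, or on Windows with Cygwin).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical illustration only; not NetarchiveSuite's actual implementation.
public class ExternalSortSketch {

    /** Sorts lines by delegating to the external Unix "sort" command. */
    public static List<String> sortLines(List<String> lines)
            throws IOException, InterruptedException {
        Path input = Files.createTempFile("unsorted", ".txt");
        Path output = Files.createTempFile("sorted", ".txt");
        try {
            Files.write(input, lines, StandardCharsets.UTF_8);

            // "sort -o OUT IN" is the POSIX form for writing the result to a file.
            ProcessBuilder pb = new ProcessBuilder(
                    "sort", "-o", output.toString(), input.toString());
            // Force byte-order collation so the result does not depend on the locale.
            pb.environment().put("LC_ALL", "C");
            pb.redirectErrorStream(true);
            pb.redirectOutput(ProcessBuilder.Redirect.DISCARD);

            // start() throws IOException when no "sort" executable is found,
            // which is what happens on a platform without a Unix-like toolset.
            Process process = pb.start();
            int exitCode = process.waitFor();
            if (exitCode != 0) {
                throw new IOException("sort exited with status " + exitCode);
            }
            return Files.readAllLines(output, StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(input);
            Files.deleteIfExists(output);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(
                sortLines(List.of("netarkivet.dk", "example.com", "sbforge.org")));
    }
}
```

Delegating to the operating system's sort is a common choice when files may be too large to sort in memory; the cost is the platform dependency described above.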
Functionality: What are the tool's unique features? What are its shortcomings?
Multifaceted application:
Selective Harvests
Snapshot Harvests
Domains
Schedules
Extended fields
Heritrix GUI Access
Global Crawler Traps
Harvest History
Harvester Templates
Quality Assurance
System State
Bit Preservation
See: https://sbforge.org/display/NASDOC42/User+Manual
Netarchive use of NAS / overview
Broad crawls
Selective crawls
“Selective crawls”
Event crawls
Special crawls (e.g. at a scholar's request)
Focused crawls: social media (special templates), very big sites, ...