open science - 2011.rmll.info/img/pdf/openscience.pdf
Open Science
Data & Source Code dissemination for scientific research
Charles Marion
Julien Jomier
Introduction
• Publications do not cure cancer!
• Doctors do not prescribe “reading papers” as a
treatment.
• So... why do scientists care so much about publishing?
Introduction
• Scientific/Technical papers disseminate knowledge
• Academic/Scientific achievements are assessed via
publishing: “publish or perish”
Why publish research?
“. . . whenever I found out anything remarkable, I have thought it my
duty to put down my discovery on paper, so that all ingenious people
might be informed thereof.”
Antony van Leeuwenhoek. Letter of June 12, 1716
Introduction
• How long does it take to publish a paper in a journal?
– Typically 2 years
• How much do you have to pay to publish a paper in a journal?
– About 500€ / paper
• How much do you have to pay to read the same paper?
– About 30€ / paper
Introduction
• How much does it cost to post a PDF on the Web?
New ways of collaboration
• Creating public repositories for source code
• Creating public image databases
• Creating forums for hosting positive discussions online
• Validating others’ methods and suggesting improvements
Open Science
• Open Source
• Open Access
• Open Computing
The Open Access Revolution
• Few journals enforce REPRODUCIBILITY
• Few journals publish CODE, DATA and PARAMETERS
• No journal publishes NEGATIVE results
The Insight Journal
[Diagram labels: Open Source, Open Science, Agile Programming, Agile Publishing, Insight Journal]
The Insight Journal
• Started in 2000 with the Insight Toolkit (ITK)
• Web-based open-access journal
• Technical work must be reproducible
• Papers should be publicly accessible
• It should take less than 2 years to publish
• The Peer-Review process must be open
The Insight Journal: submission
[Diagram labels: Author, PDF doc, Code, Input Data, Results Data, Web Site, Journal CVS Repository, Build Machines]
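One reading of the submission diagram is that the author sends code, input data and results data along with the PDF, and that build machines re-run the code to confirm the results. The Python sketch below illustrates that idea only. It is an assumption, not the Insight Journal's actual implementation: `digest`, `is_reproducible` and the toy `threshold` algorithm are hypothetical names introduced for this example.

```python
import hashlib
import json


def digest(result):
    """Return a SHA-256 hex digest of a JSON-serializable result."""
    payload = json.dumps(result, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def is_reproducible(algorithm, input_data, submitted_digest):
    """Re-run the submitted code on the submitted input and compare digests."""
    return digest(algorithm(input_data)) == submitted_digest


def threshold(pixels, level=128):
    """Toy stand-in for an author's algorithm: binarize a list of intensities."""
    return [1 if p >= level else 0 for p in pixels]


# The author submits code, input data and the digest of the results data.
input_data = [12, 200, 130, 90]
submitted = digest(threshold(input_data))

# A build machine (hypothetically) re-runs the code and compares the results.
print(is_reproducible(threshold, input_data, submitted))
```

Comparing digests demands bit-identical results. Real numerical code would more likely need a tolerance-based comparison, which this sketch leaves out.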
The Insight Journal: review
[Diagram labels: Reviewers, Selected Papers, Checked Papers, Web Site]
The Insight Journal: Demo
Open Access: Dataset
• Scientific datasets are becoming larger and larger
• Storing datasets is the first step but querying and retrieving them
is even more important
• Data without metadata are useless
• Distributed and remote computing is becoming a necessity
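The point about metadata can be made concrete with a small Python sketch. It is purely illustrative and does not use the MIDAS API: the `Dataset` class, the `query` function and the sample file names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """A stored file plus the metadata that makes it findable."""
    name: str
    size_gb: float
    metadata: dict = field(default_factory=dict)


def query(datasets, **criteria):
    """Return datasets whose metadata matches every key/value criterion."""
    return [
        d for d in datasets
        if all(k in d.metadata and d.metadata[k] == v for k, v in criteria.items())
    ]


archive = [
    Dataset("scan_001.mha", 1.2, {"modality": "MRI", "organ": "brain"}),
    Dataset("scan_002.mha", 0.8, {"modality": "CT", "organ": "liver"}),
    Dataset("blob.bin", 3.5),  # no metadata: no key/value criterion can match it
]

for d in query(archive, modality="MRI"):
    print(d.name)
```

The largest file in this toy archive is also the only one that no metadata criterion can retrieve, which is the sense in which stored data without metadata is of little use.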
What is MIDAS?
A Web-based Multimedia Digital Archiving System to store, search,
share and manage (any) digital media.
• Open source (BSD license) and cross-platform
• Modular, extensible and highly customizable framework
• Remote and local solutions
MIDAS
Don’t think
Think
MIDAS Interface
Server Side Processing
• Reliable, scalable, distributed computing using Hadoop:
Server Side Processing
Online Visualization
• Distributed visualization: ParaViewWeb
• Client-side visualization: WebGL
Online Visualization
MIDAS Instances
• Open science journals
- Insight Journal: http://www.insight-journal.org (1,500 users, 380+ open-access publications, 765 reviews)
- MIDAS Journal: http://www.midas-journal.org
• Publication Database
- Harvard: http://www.slicer.org/publications (1500+ publications)
- Kitware: http://www.kitware.com/publications
• Data server
- Kitware Public: http://www.insight-journal.org/midas (30+ GB of open-access data)
- NCI Small Animal Imaging (multiple TB of data)
- NLM Visible Human
- Optical Society of America
What’s next?
• Unification of technologies for complete reproducibility
– CTest, CDash, MIDAS, ParaViewWeb
• Integration with Git / GitHub
• Algorithm Validation (COVALIC)
– Source code
– Testing data
– Validation metrics
– Online reporting, comparison, rating
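As an illustration of what "validation metrics" and "comparison, rating" could look like, the Python sketch below scores hypothetical segmentation results against a reference with the Dice overlap coefficient and ranks them. The slides do not say which metrics COVALIC uses. Dice is an assumption chosen here because it is a common overlap measure, and `dice`, `rank` and the sample masks are all hypothetical.

```python
def dice(prediction, reference):
    """Dice overlap between two binary masks given as equal-length 0/1 lists."""
    if len(prediction) != len(reference):
        raise ValueError("masks must have the same length")
    overlap = sum(p and r for p, r in zip(prediction, reference))
    total = sum(prediction) + sum(reference)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * overlap / total


def rank(submissions, reference):
    """Score a name -> mask dict against the reference, best score first."""
    scored = [(name, dice(mask, reference)) for name, mask in submissions.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


reference = [0, 1, 1, 1, 0, 0]
submissions = {
    "method_a": [0, 1, 1, 0, 0, 0],
    "method_b": [1, 1, 1, 1, 1, 0],
}

for name, score in rank(submissions, reference):
    print(f"{name}: {score:.2f}")
```

Here `method_a` misses one reference voxel and `method_b` adds two false positives, so the two score 0.80 and 0.75 respectively.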