An information environment for neuroscientists
David Wallom
Project background
• JISC Information Environment Programme 2009-2011
• Strand A2: Developing e-Infrastructure to support research disciplines
• Project partners: Oxford, Southampton, Reading
– An e-Research South Consortium project
The Neuroscience partners
• The Oxford group is focused on specific issues such as the study of the molecular basis of synapse formation, plasticity and the regulation of neuronal morphology in the normal and diseased brain. They are examining the mechanisms of an activity-dependent form of neural plasticity known as long-term potentiation (LTP).
• Research at CINN focuses on physiological and psychological mechanisms underpinning complex cognitive behaviours, targeting typical and atypical development and decline in individuals. There is also a strong research group working on issues in signal analysis and computational modelling.
• Southampton focuses on the integrative analysis of brain function and dysfunction, modelling aspects of this across levels of biological organization ranging from molecules, cells, tissue and systems to animal behaviour.
Challenges
• Interdisciplinary teams – different expectations, cultures, requirements
• Agreed standards
– Different data formats: microscopes (multi-photon or confocal), live cell fluorescent imaging, electrophysiology recordings
– Metadata standards
• Complexity of tools used in community
• Ability to share images, data, analysis
• Network connectivity not the best
More specifically: help managing data
• Initial Experimental Idea
• Experimental Design
• Data Collection
• Analysis
• Publication
Experimental Idea & Design
Microsoft
• Word
• Excel
Shared Documents
Google Docs
Shared Calendar
Experimental Data Collection
Lasersharp 2000
• 2 photon imaging
• Confocal imaging
• Photolysis
Andor IQ
• TIRF
• QD imaging
• Photolysis
Fortran compiler
Java
• Modelling
WinWCP/ Strathclyde
• Electrophysiology
Analysis
Clampfit
ImageJ
Origin
Maple
Microsoft Excel
MATLAB
SPSS/InStat
Mathematica
Data Archive Bespoke
Publication/Dissemination
Adobe
• Acrobat
• Photoshop
• Illustrator
Microsoft
• Word
• Excel
Websites
Microsoft PowerPoint
What we would like to see:
1. VRE – single point of contact
2. A consistent annotation method for data archiving
3. Web & shared filesystem based repository for data. Many file formats to be supported
4. A searchable database for images
5. A searchable database for video images
6. A document share tool for ‘live’ manuscript editing
7. File space for literature sharing (PDFs)
8. Blog area
Release 2.0
• Drupal – Frontend content management. Based on Drupal Commons,
• Alfresco – Backend data management. Modified Alfresco module,
• Apache Solr – Search engine,
• Apache Tika – Metadata extraction toolkit for documents,
• Google services – Docs and Calendar,
• Cloud-based computation using GPUs,
• NCBO ontology-based tagging,
• LDAP – Single sign-on,
• Digital Pens – Used for recording experiments,
• XML-RPC desktop client – uploading and generating content.
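To make the XML-RPC desktop client item concrete, here is a minimal sketch of how an upload request could be serialised with Python's standard `xmlrpc.client` module. The method name `neurohub.uploadFile` and the field names `filename`, `tags` and `content` are hypothetical, chosen only for illustration; the slides do not describe the project's actual XML-RPC interface.

```python
import xmlrpc.client


def build_upload_call(filename, payload, tags):
    """Serialise a file upload as an XML-RPC method call (no network access)."""
    params = (
        {
            "filename": filename,
            "tags": list(tags),
            # Binary wraps raw bytes so they are base64-encoded in the request.
            "content": xmlrpc.client.Binary(payload),
        },
    )
    # Hypothetical method name, not taken from the NeuroHub code base.
    return xmlrpc.client.dumps(params, methodname="neurohub.uploadFile")


if __name__ == "__main__":
    request_xml = build_upload_call(
        "slice01.tif", b"\x49\x49\x2a\x00", ["LTP", "confocal"]
    )
    # Round-trip the request to show what a server would decode.
    decoded, method = xmlrpc.client.loads(request_xml, use_builtin_types=True)
    print(method, decoded[0]["filename"], decoded[0]["tags"])
```

A real client would hand the same parameters to `xmlrpc.client.ServerProxy` pointed at the server's endpoint; building the request offline keeps the sketch runnable without a server.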
What is Alfresco?
• Open source document management system (also enterprise edition),
• Alternative to Microsoft Sharepoint,
• It provides the following features:
– Unified repository – Manages documents, images, video, audio, etc.
– Network share services – CIFS/Samba, WebDAV, IMAP and SharePoint protocol.
– Connectivity – CMIS, JSR 168, REST, Microsoft Office integration.
– Version control – Tracking of major and minor document versions.
– Folder-based Rules and Actions – Support for document workflows.
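The idea of tracking major and minor document versions can be illustrated with a small sketch. This is not Alfresco's API: the class `VersionLabel` is hypothetical, and the numbering scheme (starting at 1.0, with a major change resetting the minor number) is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionLabel:
    """Hypothetical major.minor label for a stored document."""

    major: int = 1
    minor: int = 0

    def bump(self, major_change):
        """Return the next label; a major change resets the minor number."""
        if major_change:
            return VersionLabel(self.major + 1, 0)
        return VersionLabel(self.major, self.minor + 1)

    def __str__(self):
        return f"{self.major}.{self.minor}"


if __name__ == "__main__":
    label = VersionLabel()
    history = [str(label)]
    # Two small edits followed by one substantial revision.
    for is_major in (False, False, True):
        label = label.bump(is_major)
        history.append(str(label))
    print(" -> ".join(history))
```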
What is Apache Solr?
• Open source search platform based on Apache Lucene,
• Apache Solr search platform features include:
– Full-text searching,
– Faceted searching,
– Dynamic clustering,
– Database integration,
– Caching and replication,
– Document handling via Apache Tika – e.g. Word, PDF, etc.
– Connectivity – HTTP/XML, JSON API
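Full-text and faceted searching over the HTTP/JSON interface can be sketched as follows. The request parameters `q`, `rows`, `wt`, `facet` and `facet.field` are standard Solr query parameters; the base URL and the field names `file_type` and `lab` are assumptions for illustration and do not come from the NeuroHub schema. The sketch only builds the URL and parses a response that has already been decoded, so it runs without a Solr server.

```python
from urllib.parse import urlencode


def build_solr_query(base_url, text, facet_fields, rows=10):
    """Build a Solr select URL for a full-text query with field facets."""
    params = [
        ("q", text),
        ("rows", rows),
        ("wt", "json"),
        ("facet", "true"),
    ]
    # facet.field may be repeated, once per field to facet on.
    params += [("facet.field", field) for field in facet_fields]
    return f"{base_url}/select?{urlencode(params)}"


def parse_facet_counts(response, field):
    """Turn Solr's default flat [value, count, value, count, ...] list into a dict."""
    flat = response["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[0::2], flat[1::2]))


if __name__ == "__main__":
    # Hypothetical Solr location and field names.
    url = build_solr_query(
        "http://localhost:8983/solr/neurohub",
        "long-term potentiation",
        ["file_type", "lab"],
    )
    print(url)
```

Faceting is what would let a results page offer filters such as "by file type" next to the hit list without issuing a second query.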
What is Apache Tika?
• Open source content analysis toolkit,
• Detection and extraction of metadata and structured text from various documents:
– Compressed formats – tar, jar, zip, bzip2, gz, tgz.
– Text Documents – Word, Excel, PowerPoint, RTF, PDF, HTML, XHTML, OpenDocument, Plain text.
– Images – BMP, GIF, PNG, JPEG, TIFF.
– Audio – MP3, AIFF, AU, MIDI, WAV.
• Extensible parser – Allows for custom parsers to be developed for other document types.
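The two ideas on this slide, type detection followed by a pluggable parser per type, can be illustrated in a few lines of Python. This is not Tika's API (Tika is a Java toolkit); the names `detect_type`, `register_parser` and `extract_metadata` are hypothetical, and only a handful of well-known file signatures are included.

```python
import struct

# Leading byte signatures for a few common formats.
MAGIC_SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"%PDF-", "application/pdf"),
    (b"PK\x03\x04", "application/zip"),
]

PARSERS = {}  # MIME type -> parser function


def detect_type(data):
    """Guess a MIME type from the leading bytes of a file."""
    for signature, mime in MAGIC_SIGNATURES:
        if data.startswith(signature):
            return mime
    return "application/octet-stream"


def register_parser(mime):
    """Decorator that plugs a custom parser in for one MIME type."""
    def decorator(func):
        PARSERS[mime] = func
        return func
    return decorator


@register_parser("image/png")
def parse_png(data):
    # After the 8-byte signature, 4-byte chunk length and 4-byte "IHDR" tag,
    # width and height are big-endian uint32 values at offsets 16 and 20.
    if len(data) < 24:
        return {}
    width, height = struct.unpack(">II", data[16:24])
    return {"width": width, "height": height}


def extract_metadata(data):
    """Detect the type, then run the matching parser if one is registered."""
    mime = detect_type(data)
    metadata = {"content_type": mime}
    parser = PARSERS.get(mime)
    if parser is not None:
        metadata.update(parser(data))
    return metadata
```

Supporting a lab-specific format in this sketch would mean adding one signature and one decorated function, which is the kind of extension point the "extensible parser" bullet describes.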
Architecture
[Diagram components: digital microscope, electrophysiology rig, digital pens, PC, workflows, Drupal, continuous integration, Alfresco, Google Services]
Site Usage
• Oxford:
– Have tested different input methods including digital pens, iPad and tablet PC.
– Uploads of their 6000 image files are working fine, with ~250GB of data stored.
• Reading:
– Want to use NeuroHub as a frontend for their whole centre.
– Workflow development has improved scientists' working patterns and supported integration with other complex activities.
• Southampton:
– Have ~100GB of data stored and have started to organise their files.
– Interested in digital pen input and want to continue uploading their files to NeuroHub.
Demonstrable Features
• Groups – CINN
• Lab Books – Southampton
• File system & browsing – Laptop
• Shared Calendaring – Platt Lab instance
• Searching – Southampton
• Livescribe Integration – Laptop
• Other content
– Blogs
– Documents
– Discussions
– Wikis
• Taxonomy Manager – CINN
• Timeline – Development