The big data challenges of connectomics. Jeff W. Lichtman, Hanspeter Pfister, Nir Shavit. Presented by...

Post on 20-Jan-2016


The big data challenges of connectomics

JEFF W LICHTMAN, HANSPETER PFISTER, NIR SHAVIT

PRESENTED BY YUJIE LI, OCT 21ST, 2015

Connectomics

• The study of the structural and functional connections among brain cells.

• Product is the “connectome,” a detailed map of those connections.

• Significant to understanding of the healthy and diseased brain.

• “I am my connectome” -- Sebastian Seung

Neuron structures

http://www.ncbi.nlm.nih.gov/books/NBK21535/

http://science.kennesaw.edu/~jdirnber/Bio2108/Lecture/LecPhysio/PhysioNervous.html

How many neurons in a human brain?

100 billion neurons

How many neurons in a Drosophila?

~100,000 neurons, ~10^7 synapses

A video to appreciate the challenge faced with connectomics

Brainbow Technique

A Voyage Into the Brain http://ngm.nationalgeographic.com/2014/02/brain/voyage-video

Acquisition

Analytical problems stand between the acquired images and having access to the data in a useful form:

• Alignment

• Reconstruction

• Feature detection

• Graph generation

Alignment

Sections collected on a belt may rotate relative to one another, so successive images must be brought back into register.
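One way to picture the rotation part of alignment is a least-squares fit between matched landmarks in two neighbouring sections. The sketch below is illustrative only and is not the method used by the authors: `estimate_rotation` and `rotate` are hypothetical helper names, and it assumes the landmark pairs have already been matched.

```python
import math

def rotate(points, angle):
    """Rotate (x, y) points counter-clockwise about the origin by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(x * c - y * s, x * s + y * c) for x, y in points]

def estimate_rotation(points_a, points_b):
    """Least-squares rotation angle (radians) taking points_a onto points_b.

    Both inputs are equal-length lists of matched (x, y) landmarks
    (an assumption of this sketch). Each set is centred on its own
    centroid first, so a pure translation does not affect the angle.
    """
    n = len(points_a)
    ca = (sum(x for x, _ in points_a) / n, sum(y for _, y in points_a) / n)
    cb = (sum(x for x, _ in points_b) / n, sum(y for _, y in points_b) / n)
    dot = cross = 0.0
    for (ax, ay), (bx, by) in zip(points_a, points_b):
        ax, ay = ax - ca[0], ay - ca[1]
        bx, by = bx - cb[0], by - cb[1]
        dot += ax * bx + ay * by      # accumulates the cosine component
        cross += ax * by - ay * bx    # accumulates the sine component
    return math.atan2(cross, dot)

# Toy usage: a section rotated by 0.3 rad and shifted on the belt.
landmarks = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (1.0, 5.0)]
moved = [(x + 10.0, y - 2.0) for x, y in rotate(landmarks, 0.3)]
angle = estimate_rotation(landmarks, moved)
```

Rotating the second section by `-angle` would undo the estimated rotation. Real sections also stretch and warp, which a rigid fit like this does not capture.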

Reconstruction

Challenges for automatic segmentation:

• Irregular neuron objects

• Lateral resolution is several-fold finer than section thickness

• Under/over segmentation
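To make the under/over segmentation point concrete, the sketch below compares a predicted labelling against a reference labelling voxel by voxel. It is a toy illustration under stated assumptions: `split_and_merge_errors` is a hypothetical name, labels are plain integers with 0 meaning background, and a reference labelling is assumed to exist.

```python
from collections import defaultdict

def split_and_merge_errors(truth, predicted):
    """Find over- and under-segmented objects.

    truth, predicted: equal-length sequences of integer labels, one per
    voxel, with 0 meaning background (assumptions of this sketch).
    Returns (splits, merges):
      splits: reference labels covered by more than one predicted label
              (over-segmentation),
      merges: predicted labels covering more than one reference label
              (under-segmentation).
    """
    truth_to_pred = defaultdict(set)
    pred_to_truth = defaultdict(set)
    for t, p in zip(truth, predicted):
        if t == 0 or p == 0:
            continue  # ignore background voxels
        truth_to_pred[t].add(p)
        pred_to_truth[p].add(t)
    splits = sorted(t for t, ps in truth_to_pred.items() if len(ps) > 1)
    merges = sorted(p for p, ts in pred_to_truth.items() if len(ts) > 1)
    return splits, merges

# Toy usage: neuron 1 is split into 5 and 6; neurons 2 and 3 are merged into 7.
truth     = [1, 1, 1, 2, 2, 3, 3]
predicted = [5, 5, 6, 7, 7, 7, 7]
splits, merges = split_and_merge_errors(truth, predicted)
```

Here `splits` is `[1]` and `merges` is `[7]`.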

Goal: Obtain saturated reconstructions of very large (1 mm³) brain volumes in a fully automatic way, with minimal errors and in a reasonably short time.

Human tracers; the problem is comparable to cursive handwriting recognition

Feature detection

Subcellular features: mitochondria, synaptic vesicles, etc.

Difficult to find cell boundaries

Irregular shape

Reduce error and analysis time

Graph generation

• Data are turned into a form that represents the wiring diagram.

• Data reduction step

• How much of original data to retain?

• How to store the graph?

• Skip oct-trees.
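As a rough picture of the data reduction step, the sketch below collapses a list of detected synapses into a weighted wiring diagram. It is only an illustration: the record format (pre-synaptic id, post-synaptic id) and the name `build_wiring_graph` are assumptions, and it uses a plain adjacency map rather than the skip oct-tree storage mentioned in the slide.

```python
from collections import defaultdict

def build_wiring_graph(synapses):
    """Reduce synapse records to a weighted, directed wiring diagram.

    synapses: iterable of (pre_neuron, post_neuron) pairs, one per detected
    synapse (a hypothetical record format for this sketch).
    Returns {pre_neuron: {post_neuron: synapse_count}}.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for pre, post in synapses:
        graph[pre][post] += 1
    # Convert to plain dicts so the result is easy to compare and serialise.
    return {pre: dict(posts) for pre, posts in graph.items()}

# Toy usage: two synapses from n1 onto n2, one from n2 onto n3.
detected = [("n1", "n2"), ("n1", "n2"), ("n2", "n3")]
wiring = build_wiring_graph(detected)
```

The result keeps only connectivity and synapse counts, which is the sense in which graph generation discards most of the original image data.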

Common theme: Dehumanizing the pipeline

An irony is that humans are especially good at these tasks. If we knew how our brain is wired, it would be easier to develop tools to automate these processes.

Big data challenges of connectomics

• Data size

• Data rate

• Computational complexity

• Parallel computing

• Compute system

• A heterogeneous hierarchical approach

• Data management and sharing

Data size

Imaging 1 mm³ of rat cortex = 2 million gigabytes = 2 petabytes

A complete rat cortex (500 mm³) = 1,000 petabytes = 1 exabyte

(Walmart database manages a few petabytes of data)

A complete human cortex is ~1,000 times larger than a rodent's = 1,000 × 1,000 petabytes = 1 zettabyte

(Comparable to all information recorded globally today)
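The scaling above can be checked with a few lines of arithmetic. The sketch uses decimal (SI) units, which is an assumption; the figures themselves (2 petabytes per mm³, 500 mm³, a factor of 1,000) come from the slide.

```python
# Decimal (SI) byte units, assumed for this sketch.
GIGABYTE = 10**9
PETABYTE = 10**15
EXABYTE = 10**18
ZETTABYTE = 10**21

bytes_per_mm3 = 2_000_000 * GIGABYTE        # 2 million gigabytes per mm^3
rat_cortex_bytes = 500 * bytes_per_mm3      # 500 mm^3 of rat cortex
human_cortex_bytes = 1_000 * rat_cortex_bytes  # ~1,000 times larger

print(bytes_per_mm3 / PETABYTE, "PB per mm^3")
print(rat_cortex_bytes / EXABYTE, "EB for a rat cortex")
print(human_cortex_bytes / ZETTABYTE, "ZB for a human cortex")
```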

Data rate

- Imaging task distributed to different labs

- A complete connectome of a human cortex is the goal!

- Maybe start with substructures.

Data management and sharing

- Assuming we have obtained the data, do we store it?

◦ Yes, both the images and the graph.

- How do we move the data from the microscope to the computer system? Transfer bandwidth is the constraint.

◦ Place the computers near the microscope.

◦ 500 standard 4-core 3.6 GHz processors would suffice (cost: $1 million).

- Where to store?

◦ Disk or tapes.

- How to share?

◦ Internet. Current achievable data rates: 300 megabits/second

◦ Central sharing sites

◦ The reconstructed layout graph is easier to handle than the raw images.
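A quick estimate shows why sharing raw images over the Internet is hard. The sketch assumes the quoted rate means 300 megabits per second sustained, and uses the 2 petabytes per mm³ figure from the data size slide; `transfer_days` is a hypothetical helper.

```python
SECONDS_PER_DAY = 86_400

def transfer_days(size_bytes, rate_bits_per_second):
    """Days needed to move size_bytes at a constant link rate."""
    seconds = size_bytes * 8 / rate_bits_per_second
    return seconds / SECONDS_PER_DAY

one_mm3_bytes = 2 * 10**15   # 2 petabytes (decimal units assumed)
link_rate = 300 * 10**6      # 300 megabits per second (assumed reading)

days = transfer_days(one_mm3_bytes, link_rate)
print(f"{days:.0f} days to move 1 mm^3 of image data")
```

Under these assumptions a single cubic millimetre takes roughly 617 days to transfer, which is why the much smaller layout graph is the more practical thing to share.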

Computational complexity

The goal of many big data systems is more than simply to allow storage of and access to large amounts of data. Rather, it is to discover correlations within the data.

◦ Sampling

◦ Parallel computing

◦ Image segmentation and feature extraction are embarrassingly parallel.
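"Embarrassingly parallel" here means each image tile can be processed without reference to any other tile. The sketch below illustrates the pattern with the standard library. The per-tile function `count_bright_voxels` is a hypothetical stand-in for real segmentation or feature extraction, and threads are used only to keep the example short; CPU-bound work would more likely use processes or separate machines.

```python
from concurrent.futures import ThreadPoolExecutor

def count_bright_voxels(tile, threshold=128):
    """Toy per-tile feature: number of voxels at or above a threshold."""
    return sum(1 for value in tile if value >= threshold)

def process_tiles(tiles, workers=4):
    """Apply the per-tile function to every tile independently.

    No tile depends on another, so the work can be spread over any number
    of workers; results come back in the same order as the input tiles.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(count_bright_voxels, tiles))

# Toy usage: three tiny "tiles" of grey values.
tiles = [[0, 200, 255], [10, 20], [128]]
counts = process_tiles(tiles)
```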

A heterogeneous hierarchical approach

Combines bottom-up information from the image data with top-down information from the assembled layout graph to decide dynamically on the appropriate level of computational intensity to apply to a given sub-volume.

1) Initially, apply the lowest-cost computations to small volumes.

2) The resulting sub-graphs are tested for consistency.

3) If discrepancies are found, more expensive computation is used.

4) The process continues hierarchically, growing the volume of merged segments.
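Steps 1 to 3 amount to an escalation loop: try the cheapest method first and pay for a more expensive one only when a consistency test fails. The sketch below shows that control flow with toy stand-ins. `reconstruct`, `make_method`, and the "difficulty" field are all hypothetical, and the hierarchical merging of step 4 is not modelled.

```python
def reconstruct(subvolume, methods, is_consistent):
    """Try methods from cheapest to most expensive.

    methods: non-empty list of callables ordered by increasing cost.
    is_consistent: callable that tests a result (standing in for the
    sub-graph consistency check).
    Returns (result, level), where level is the index of the method that
    produced the returned result. If no method passes, the result of the
    most expensive method is returned.
    """
    if not methods:
        raise ValueError("at least one method is required")
    for level, method in enumerate(methods):
        result = method(subvolume)
        if is_consistent(result):
            break
    return result, level

def make_method(power):
    """Toy segmenter that only 'solves' sub-volumes with difficulty <= power."""
    def method(subvolume):
        return {"id": subvolume["id"], "ok": subvolume["difficulty"] <= power}
    return method

# Toy usage: three methods of increasing cost and capability.
methods = [make_method(1), make_method(2), make_method(3)]
passes_check = lambda result: result["ok"]

easy = reconstruct({"id": "a", "difficulty": 1}, methods, passes_check)
hard = reconstruct({"id": "b", "difficulty": 3}, methods, passes_check)
```

In this toy run the easy sub-volume is settled at level 0, while the hard one escalates to level 2, so the expensive method is applied only where the cheap ones fail.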

Prospects

- The field needs significant investment to advance.

- Commercial value in connectomics

◦ Treating brain diseases

◦ Applying lessons learnt to making computers smarter

- Challenges beyond the horizon: still a big data problem

Comments

The technical limitations of EM are not addressed:

• Samples are post-mortem, not in vivo

• Physical damage during sectioning, with potential distortion

• Lack of functional information

No comparison with the currently popular approaches to the problem:

• Two-photon, confocal, and bright-field imaging

• Neuron-labelling approaches (physical dyes, genetic approaches)

Big data is not only about handling super-large datasets.

• It is also about finding a smart way to fuse data from different modalities and different sources to obtain a comprehensive understanding
