![Page 1: Open & Big Data for Life Imaging - Inriajust digging and computing ... NoSQL - Cassandra, MongoDB. Security API encrypted communication SSL tunnelling / VPN API data access only controlled](https://reader035.vdocuments.mx/reader035/viewer/2022081522/5f0cd3f57e708231d43754b4/html5/thumbnails/1.jpg)
Open & Big Data for Life Imaging
Technical aspects: existing solutions, main difficulties
Pierre Mouillard MD
What is Big Data?
‣ lots of data: more than you can process using common database software and standard computers
‣ complex data
‣ dataflows and time series
the origins: big science, demographics, economics
the next-generation origins: e-commerce
What could we possibly do with all this data? Could we discover new, unknown things just by digging into and computing over these huge datasets?
The more data we have, the better: ‘lots of data means better accuracy’ is a common-sense paradigm, and mostly a misconception.
If you don’t mind, I’ll grab some big data from you.
Medical imaging
‣ prevention, screening
‣ diagnostic and decision aid
‣ training & planning before treatment
‣ real-time imaging during the therapeutic act
‣ reference sets and atlases
‣ follow-up of pathology, treatment assessment
and:
‣ research
‣ epidemiology
Big data & medical imaging
‣ medical images are big: dynamic 3D CT scan = 1 GB, digitized XR = 100 MB
‣ an average hospital generates about 10-300 TB / year
‣ mostly unstructured data (60-80%)
‣ medical image archives are increasing by 20-40% / year
‣ for one patient, an average of 3 imaging modalities per medical act
very different from e-commerce! ‘fewer’ instances, more data per capita
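To give a feel for what a 20-40% yearly increase means, here is a minimal sketch of compound archive growth. The function name and the 100-terabyte starting archive are assumptions chosen for illustration; only the growth rates come from the slide.

```python
def projected_archive_tb(current_tb: float, yearly_growth: float, years: int) -> float:
    """Archive size after `years` of compound growth at `yearly_growth` (0.2 = 20%)."""
    if years < 0:
        raise ValueError("years must be non-negative")
    return current_tb * (1.0 + yearly_growth) ** years


if __name__ == "__main__":
    start_tb = 100.0  # hypothetical starting archive, not a figure from the slides
    for rate in (0.20, 0.40):
        size = projected_archive_tb(start_tb, rate, years=5)
        print(f"{rate:.0%} growth per year -> {size:.1f} TB after 5 years")
```

Under these assumptions the archive grows to roughly 2.5 times its size in five years at 20% per year, and to more than 5 times its size at 40% per year.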
Sharing medical images
‣ medical practice is becoming increasingly collective
‣ multi-modality (CT, MRI, ultrasound, XR, PET…) means numerous experts are involved
‣ telemedicine and tele-diagnostics mean remote access to images & medical data
‣ nosology and semiotics have to be redefined and expanded because of continuous advances in imaging
‣ research is more focused on specific diseases
all this means we need to improve and facilitate medical image sharing
Big data: what for?
big & open data in medical imaging is a challenge, on a global scale
‣ digital reference image databases
‣ patient similarity searching
‣ disease progression monitoring, clinical follow-up
‣ case studies, training and learning, expertise sharing
‣ redefinition of nosology and semiotics
‣ testing of new algorithms and image editing tools
‣ shared archives
‣ epidemiology
Big data: 2 concepts
STATIC BIG DATA
‣ retrospective and prospective studies on finite sets of images
‣ statistical analysis at some point and conclusions
‣ = ‘knowledge extraction’ and rules definition
DYNAMIC BIG DATA
‣ ‘forever’ ongoing studies, with always expanding image sets
‣ self-adaptive and machine learning systems (neural networks, genetic algorithms…)
‣ = ‘always improving’ and automatic optimization
digital expertise and AI are rising!
Imaging data
‣ images are just big packs of digits
‣ each pixel (voxel) is a measurement (each image is a rich dataset)
‣ images are just raw data matrices, unstructured data
‣ to be used as big data, image processing is needed
‣ the image has to be normalized, segmented, and reviewed by an expert
‣ ROI definition, measurement of volumes & distances, characterization of structures, 3D reconstruction, dynamic analysis…
‣ automatic processing or manual (assisted)
‣ metadata is the key for relevant retrieval and selection
and clinical context data is always needed!
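As an illustration of "each voxel is a measure", the sketch below turns a raw voxel matrix into one derived number: the volume of a region of interest. It assumes the simplest possible segmentation (an intensity threshold) and a known voxel spacing; the function name, threshold and sample values are hypothetical and not taken from the slides.

```python
from typing import List, Tuple

# A volume is a nested list of intensities indexed as volume[z][y][x].
Volume = List[List[List[float]]]


def roi_volume_mm3(volume: Volume, threshold: float,
                   spacing_mm: Tuple[float, float, float]) -> float:
    """Volume of all voxels at or above `threshold`, in cubic millimetres.

    `spacing_mm` is the physical voxel size along (z, y, x).
    """
    dz, dy, dx = spacing_mm
    voxel_mm3 = dz * dy * dx
    # Threshold segmentation: count every voxel that belongs to the ROI.
    count = sum(1 for plane in volume for row in plane for value in row
                if value >= threshold)
    return count * voxel_mm3


if __name__ == "__main__":
    tiny_scan = [
        [[10.0, 120.0], [130.0, 20.0]],
        [[15.0, 25.0], [140.0, 30.0]],
    ]
    print(roi_volume_mm3(tiny_scan, threshold=100.0, spacing_mm=(2.0, 0.5, 0.5)))
```

Real pipelines would use a proper segmentation step and read the spacing from the image metadata, which is one reason the metadata matters so much for reuse.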
And then came the Internet…
‣ Internet technologies are becoming the de facto standards for data sharing, collaboration, and global access to information
‣ the Internet is now the strongest incentive for technological innovation in IT
Technological trends
‣ data encapsulation
‣ distributed storage
‣ processing and storage virtualization
‣ data-centric architectures
‣ APIs, front / back dissociation
‣ new database systems
‣ security
‣ open formats, open source, normalization
innovation is coming from high-demand Internet application projects: social networks, large e-commerce platforms, data sharing clouds
Encapsulation
[diagram: the imaging data object (DICOM) encapsulates settings, measures, tags, patient ID, report, clinical data, images, overlays, 3D infos]
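The idea of an encapsulated imaging data object can be sketched as a single record type that carries the pixel payloads together with everything needed to interpret them. This is only an illustration: the class and field names below are assumptions derived from the labels on the slide, and it is not the DICOM data model or any library's API.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class ImagingDataObject:
    """Hypothetical container mirroring the labels on the slide."""
    patient_id: str
    settings: Dict[str, Any] = field(default_factory=dict)       # acquisition settings
    measures: Dict[str, float] = field(default_factory=dict)     # derived measurements
    tags: List[str] = field(default_factory=list)
    report: str = ""
    clinical_data: Dict[str, Any] = field(default_factory=dict)
    images: List[bytes] = field(default_factory=list)            # raw pixel payloads
    overlays: List[bytes] = field(default_factory=list)
    info_3d: Dict[str, Any] = field(default_factory=dict)

    def metadata(self) -> Dict[str, Any]:
        """Everything except the bulky payloads, e.g. for indexing and search."""
        data = asdict(self)
        data.pop("images")
        data.pop("overlays")
        data["image_count"] = len(self.images)
        return data


if __name__ == "__main__":
    study = ImagingDataObject(patient_id="P001", tags=["CT", "thorax"],
                              images=[b"\x00\x01"])
    print(study.metadata())
```

Keeping a lightweight `metadata()` view separate from the payloads is a design choice of this sketch: search and selection can then work on small records without loading the images.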
Distributed storage
[diagram: data query → load balancer → storage nodes]
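One way to picture how a query reaches the right storage node is hash-based placement: the key of an object decides which node holds it, so any front end can find it again without a central index. This is a swapped-in illustration, not necessarily what the slide's load balancer does; the class, node names and keys are hypothetical, and the in-memory dictionaries stand in for real storage servers.

```python
import hashlib
from typing import Dict, List


class StorageCluster:
    """Toy cluster: each node is a dict, and keys are placed by hashing."""

    def __init__(self, node_names: List[str]) -> None:
        if not node_names:
            raise ValueError("at least one storage node is required")
        self._names = sorted(node_names)
        self.nodes: Dict[str, Dict[str, bytes]] = {name: {} for name in self._names}

    def node_for(self, key: str) -> str:
        """Pick a node deterministically from the SHA-256 hash of the key."""
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._names)
        return self._names[index]

    def put(self, key: str, blob: bytes) -> None:
        self.nodes[self.node_for(key)][key] = blob

    def get(self, key: str) -> bytes:
        return self.nodes[self.node_for(key)][key]


if __name__ == "__main__":
    cluster = StorageCluster(["node-a", "node-b", "node-c"])
    cluster.put("study-42", b"pixel data")
    print(cluster.node_for("study-42"), cluster.get("study-42"))
```

A simple modulo scheme like this moves most keys when a node is added or removed; production systems typically use consistent hashing or a placement service to limit that.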
Distributed storage
outsourced hosting, health data agreement
Processing & storage virtualization
Hadoop / Hive, MapReduce, Mahout
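The MapReduce model named on this slide can be sketched in a few lines of plain Python: a map step emits key/value pairs, a shuffle step groups them by key, and a reduce step aggregates each group. This is a single-process illustration of the idea and does not use the Hadoop API; the record fields (`modality`, `size_mb`) and function names are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Record = Dict[str, object]


def map_phase(record: Record) -> List[Tuple[str, int]]:
    """Emit (modality, size in MB) for one study record."""
    return [(str(record["modality"]), int(record["size_mb"]))]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Aggregate one group: total storage per modality."""
    return key, sum(values)


def run_job(records: Iterable[Record]) -> Dict[str, int]:
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    studies = [
        {"modality": "CT", "size_mb": 1000},
        {"modality": "XR", "size_mb": 100},
        {"modality": "CT", "size_mb": 500},
    ]
    print(run_job(studies))
```

In a real cluster the map and reduce calls run in parallel on the nodes that hold the data, which is what makes the model attractive for large image archives.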
APIs - front / back
[diagram: user ↔ front-end app ↔ API ↔ back-end app ↔ server DB (data)]
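The front / back dissociation can be sketched as two pieces of code that only ever exchange messages through an API, here plain JSON strings. Everything below is hypothetical (the action name, the in-memory `_DB`, the status codes); the point is only that the front end never touches the data store directly.

```python
import json

# Back end: owns the data. The dict stands in for the server database.
_DB = {"s1": {"modality": "CT", "slices": 120}}


def backend_handle(request_json: str) -> str:
    """The API entry point: JSON request in, JSON response out."""
    request = json.loads(request_json)
    if request.get("action") == "get_study":
        study = _DB.get(request.get("study_id"))
        if study is None:
            return json.dumps({"status": 404, "error": "unknown study"})
        return json.dumps({"status": 200, "data": study})
    return json.dumps({"status": 400, "error": "unsupported action"})


def frontend_show_study(study_id: str, transport=backend_handle) -> str:
    """Front end: builds a request, sends it through the API, formats the reply."""
    request = json.dumps({"action": "get_study", "study_id": study_id})
    response = json.loads(transport(request))
    if response["status"] != 200:
        return f"error: {response['error']}"
    data = response["data"]
    return f"{study_id}: {data['modality']}, {data['slices']} slices"


if __name__ == "__main__":
    print(frontend_show_study("s1"))
    print(frontend_show_study("missing"))
```

Because the front end depends only on the message format, the `transport` argument could be replaced by an HTTP call without changing the display code.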
APIs - front / back
[diagram labels: front-end app, API, user computer, server; pre-processing and normalization; individual analytics and interface; store-search-retrieval and batch processing; distributed computing]
Flat database systems
[diagram: user → app server → DBMS, data in proprietary format (no direct access)]
[diagram: user → app server → flat data (file system access); NoSQL: Cassandra, MongoDB]
Security
‣ API
‣ encrypted communication (SSL)
‣ tunnelling / VPN
‣ API data access only
‣ controlled access
‣ data striping
‣ isolated database systems
‣ encrypted data
‣ firewall
‣ backups / mirrors
‣ redundant servers
‣ blind server & DB maintenance
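Two of the items above, "API data access only" and "controlled access", can be sketched together: callers never read the records directly, every read goes through one method that checks a grant, and the patient identifier is replaced by a keyed pseudonym before anything leaves the server. The class, the grant table and the use of HMAC-SHA256 for pseudonyms are assumptions made for this illustration, not a description of a specific product.

```python
import hashlib
import hmac
from typing import Dict, Set


class ImagingAPI:
    """Hypothetical gatekeeper: the only path to the stored records."""

    def __init__(self, records: Dict[str, Dict[str, str]],
                 grants: Dict[str, Set[str]], secret: bytes) -> None:
        self._records = records    # study_id -> record
        self._grants = grants      # user -> study_ids the user may read
        self._secret = secret      # key used to derive pseudonyms

    def _pseudonym(self, patient_id: str) -> str:
        """Stable keyed pseudonym: same patient, same token, no readable ID."""
        mac = hmac.new(self._secret, patient_id.encode("utf-8"), hashlib.sha256)
        return mac.hexdigest()[:16]

    def fetch_study(self, user: str, study_id: str) -> Dict[str, str]:
        if study_id not in self._grants.get(user, set()):
            raise PermissionError(f"{user} may not read {study_id}")
        record = dict(self._records[study_id])  # copy, never the stored object
        record["patient_id"] = self._pseudonym(record["patient_id"])
        return record


if __name__ == "__main__":
    api = ImagingAPI(
        records={"s1": {"patient_id": "P001", "modality": "CT"}},
        grants={"dr_a": {"s1"}},
        secret=b"demo-secret",  # placeholder, a real key would be managed securely
    )
    print(api.fetch_study("dr_a", "s1"))
```

Encrypted transport, firewalls and backups from the list are infrastructure concerns and are deliberately left out of this sketch.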
[diagram labels: data providers + data architect + back end builder + front end designer; database, backups, scalability, virtualization, cloud cluster, snapshots, ghost server, hosting outsourcing; log / data supervisor; CRUD app + interface; web app (SaaS); data replication, recordsets; open data, queries, linked data, API sandbox; reports, boards, stats; other SaaS (controlled access); logs; imagery system, anonymization, SaaS]
Imaging data architecture
‣ non-vendor PACS (open PACS): DCM4CHEE, Osirix server, PACSOne
‣ open source visualization software: Osirix, Weasis (webapp)
‣ BYOD and scaling down: standard desktop computers, laptops, even tablets as viewing stations
‣ accessing the PACS at the patient's bedside or from a remote location
‣ linking with the in-house medical record system and the global medical record system
‣ research connection: direct access to dynamic big data
‣ integration of new algorithms as add-on modules (local processing)
objective: promote imaging as actionable data, for research and for the benefit of other patients
Standard PACS
[diagram labels: CT scan, MRI, ultrasound, D X-rays; consoles; viewing stations; shared storage (NAS)]
e-PACS
[diagram labels: radiology dept, clinical dept; image DB; patient records system; remote viewing station; remote PACS; global patient records system (DMP)]
[diagram labels: imagery department; API; remote processing; remote / mobile viewing; remote image repository; backup archives; connected research; added analytics / overlays, normalized atlases; new algorithms, heuristics, analytics; studies, epidemiology, publications; open data]
Difficulties
‣ pooling images from different sources is difficult: different resolutions, different angles, registration, image quality…
‣ image analysis is compute-intensive, but radiologists cannot wait too long for results in everyday use
‣ who wants to share? medical conservatism… too much privacy regulation?
‣ extracting the important and relevant features from images is a daunting task
‣ image formats and clinical records are not always coherent
‣ incompatible proprietary systems: we need better open source software
‣ what do we do with old data? heterogeneous data?
‣ ethics: who owns the data? what about patient consent?
‣ security concerns: black hat hackers are everywhere
‣ IT people should not have access to patients' health data
‣ how do we integrate new algorithm modules into PACS?
‣ who pays for this?
Perspectives
‣ new imaging sources: mobile ultrasound, mobile MRI
‣ connected objects: calibrated images taken with a smartphone
‣ dynamic images and time series
‣ robotic surgery
‣ in vivo image multimodal overlay (guided surgery)
‣ PACS app for iPhone?