data intensive computing graph algorithms for irregular, unstructured data – john feo, pacific...

5
Data Intensive Computing • Graph algorithms for irregular, unstructured data – John Feo, Pacific Northwest National Laboratory • graph500 and data-intensive computing – Richard Murphy, Sandia National Laboratories • Large-scale knowledge discovery – Steve Reinhardt, Microsoft • Data intensive computing at SNL – Andrew Wilson, Sandia National Laboratories • IBM's InfoSphere Streams – Roger Rea, IBM • Graph500 and Data Intensive HPC – Richard Murphy, Sandia National Laboratory • Data Analytics – Phillip Morris, Platform Computing • Data Intensive Computing – Richard Altmaier, Intel

Upload: edward-murphy

Post on 19-Jan-2016

213 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Data Intensive Computing Graph algorithms for irregular, unstructured data – John Feo, Pacific Northwest National Laboratory graph500 and data-intensive

Data Intensive Computing• Graph algorithms for irregular, unstructured data

– John Feo, Pacific Northwest National Laboratory• graph500 and data-intensive computing

– Richard Murphy, Sandia National Laboratories• Large-scale knowledge discovery

– Steve Reinhardt, Microsoft• Data intensive computing at SNL

– Andrew Wilson, Sandia National Laboratories• IBM's InfoSphere Streams

– Roger Rea, IBM• Graph500 and Data Intensive HPC

– Richard Murphy, Sandia National Laboratory• Data Analytics

– Phillip Morris, Platform Computing• Data Intensive Computing

– Richard Altmaier, Intel

Page 2: Data Intensive Computing Graph algorithms for irregular, unstructured data – John Feo, Pacific Northwest National Laboratory graph500 and data-intensive

1. Please provide a definition for "Data Intensive Computing". Please explain the difference between "finding a needle in a haystack" and "knowledge discovery".

Page 3: Data Intensive Computing Graph algorithms for irregular, unstructured data – John Feo, Pacific Northwest National Laboratory graph500 and data-intensive

2. "Data Intensive Computing" generally involves analyses of non-numeric data and the number of combinatorial possibilities grows rapidly. The objective of the analysis is to find in the data, meaningful relationships. How do we (a) test for convergence when not evaluating all possible combinations and (b) test for statistical significance--when the data is non-numeric?

Page 4: Data Intensive Computing Graph algorithms for irregular, unstructured data – John Feo, Pacific Northwest National Laboratory graph500 and data-intensive

3. "Data Intensive Computing" often involves the use of incomplete data. How does this affect the analysis process?

Page 5: Data Intensive Computing Graph algorithms for irregular, unstructured data – John Feo, Pacific Northwest National Laboratory graph500 and data-intensive

4. If you could design an ideal computing architecture for Data Intensive Computing, what would it look like?