SC12 workshop write-up


Upload: azet

Post on 14-Jul-2015



SUNDAY:

HPC databases workshop:

rasdaman:

• adding arrays to SQL queries
• array query operators

  • general array constructor
  • subset trim & slice
  • array nest/unnest
  • matrix multiplication
  • histograms
  • formal encoding (e.g. C, C++, Java arrays)
  • nested queries
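The trim/slice distinction above maps directly onto ordinary array slicing semantics. A minimal NumPy sketch (illustrative only, not rasdaman's rasql syntax):

```python
import numpy as np

# Hypothetical 2-D array standing in for a rasdaman collection.
img = np.arange(16).reshape(4, 4)

# "Trim": restrict an axis to an interval -- dimensionality is preserved.
trimmed = img[1:3, 0:2]          # still 2-D, shape (2, 2)

# "Slice": fix an axis to a single point -- that dimension is dropped.
sliced = img[2, :]               # now 1-D, shape (4,)

print(trimmed.shape, sliced.shape)   # -> (2, 2) (4,)
```

The point of the operator pair: a trim keeps the array's dimensionality, a slice removes the fixed axis.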

• storage mapping: variants
  • coordinate-free sequence
  • BLOBs
  • ROLAP
  • imaging / multidimensional OLAP

• tiled array storage
  • regular
  • directional
  • area of interest
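What "regular" tiled storage means in one line: the array is cut into fixed-size tiles and a cell's coordinate determines its tile by integer division (function name made up for illustration):

```python
def tile_of(coord, tile_shape):
    """Index of the regular tile containing `coord` (one entry per axis)."""
    return tuple(c // t for c, t in zip(coord, tile_shape))

# A 1000x1000 array cut into regular 256x256 tiles:
print(tile_of((300, 700), (256, 256)))  # -> (1, 2)
```

Directional and area-of-interest tilings replace the fixed `tile_shape` with per-axis or per-region partitions, but the lookup idea is the same.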

• in-situ databases
  • approach: reference external files
  • related: SciQL

• adding tertiary storage
  • tapes
  • problem: spatial clustering
  • approach: super-tiles = all of the particular index nodes (Reiner 2001 paper)

• query processing
  • optimization 1: query rewriting
  • optimization 2: JIT compilation

• approach: cluster suitable ops
  • compile & dynamically bind
  • benefit: speeds up complex, repeated operations
  • variation: compile code for GPU
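A toy Python sketch of the "cluster suitable ops, compile & dynamically bind" idea: chain elementwise operations into one fused function and cache it, so repeated queries reuse the already-built pipeline (all names invented; the real system compiles clustered ops to native or GPU code):

```python
import functools

@functools.lru_cache(maxsize=None)      # "compile" once, rebind on repeats
def fuse(ops):
    """Fuse a tuple of unary elementwise ops into one function."""
    def fused(x):
        for op in ops:
            x = op(x)
        return x
    return fused

double = lambda v: v * 2
inc = lambda v: v + 1

pipeline = fuse((double, inc))
print(pipeline(10))                     # -> 21
```

The benefit named in the notes shows up in the cache: the second query with the same op cluster gets the fused function back without rebuilding it.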

• intra-operator parallelization
  • ...too fast

• query processing in a federation
  • query splitting
  • work in progress

• examples
  • human brain imaging
  • gene expression analysis (db queries, sexy as fuck) -> output JPEG, correlations, ...
  • geo service standardization (OGC, SIC)

• use cases, e.g.:
  • sat imaging
  • 3D clients/vis.

• history of array DBMSs
  • array as table


• conclusion
  • awesome for science and so on ...

NEEEEEED SLIDES. So many enhanced SQL statement examples.

Energy Efficient HPC:

A LOT of information via slides and talk, graphs, ... Extremely interesting. Read the slides yourself if you are interested: http://eehpcwg.lbl.gov/documents

Data-aware networking workshop:

GridFTP (Fatih University, TR):

https://sites.google.com/a/lbl.gov/ndm2012/home/accepted-papers (first one)

• intro: pipelining, parallelism, concurrency
• pipelining:

  • useful for large numbers of small files
  • higher throughput on small files (1 MB)
  • number of files affects total throughput but not the optimal pipelining level
  • throughput increases as the number of files increases, ...
  • BDP = BW * RTT: the optimal window size (pfo)
  • ...
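The BDP bullet is just bandwidth times round-trip time; a quick sketch of the arithmetic:

```python
def bdp_bytes(bandwidth_bits_per_s, rtt_s):
    """Bandwidth-delay product: bytes in flight needed to keep the pipe full."""
    return bandwidth_bits_per_s / 8 * rtt_s

# A 10 Gbit/s link with 50 ms RTT wants a ~62.5 MB window:
print(bdp_bytes(10e9, 0.05))  # -> 62500000.0
```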

• parallelism:
  • when the buffer size is too small compared to the BDP
  • advantageous with large files

• concurrency:
  • advantages over parallelism:

  • parallelism deteriorates performance with small files (pipelining)
  • concurrency + pipelining performs better than cc+pp+p
  • small RTT: quicker ascent to peak throughput
  • ...

• rules of thumb:
  • always use pipelining

  • set different levels
  • keep chunks as big as possible
  • use concurrency with pipelining for small files and a small number of files
  • add parallelism to cc and pp with bigger files
  • use parallelism when the number of files is insufficient to feed the BDP
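The rules of thumb above collapse into a toy parameter chooser; the thresholds and the BDP comparison here are my own simplification, not from the talk:

```python
def choose_mechanisms(n_files, mean_size, bdp):
    """Toy chooser for the GridFTP rules of thumb; thresholds invented."""
    plan = ["pipelining"]                    # rule: always use pipelining
    small_files = mean_size < bdp
    if small_files:
        plan.append("concurrency")           # cc + pp for small files
    if not small_files or n_files * mean_size < bdp:
        plan.append("parallelism")           # big files, or too few files to feed the BDP
    return plan

# Many 1 MB files on a 12.5 MB-BDP path -> pipelining + concurrency:
print(choose_mechanisms(n_files=10_000, mean_size=1e6, bdp=12.5e6))
```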

• recursive chunk size division
  • mean-based algorithm to construct clusters of files with different optimal pipelining levels
  • calculate the optimal pipelining level by dividing the BDP by the mean file size of the chunk
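A minimal sketch of those two bullets, assuming sizes in bytes: split the file set around the mean size (one step of the recursive division), then set each chunk's pipelining level to BDP divided by the chunk's mean file size:

```python
import math

def split_by_mean(sizes):
    """One step of the mean-based division: cluster files around the mean size."""
    mean = sum(sizes) / len(sizes)
    small = [s for s in sizes if s < mean]
    large = [s for s in sizes if s >= mean]
    return small, large

def pipelining_level(bdp, mean_file_size):
    """Optimal pipelining level: BDP divided by the chunk's mean file size."""
    return max(1, math.ceil(bdp / mean_file_size))

files = [1e6, 2e6, 1e6, 64e6, 128e6]                       # file sizes in bytes
small, large = split_by_mean(files)
print(pipelining_level(12.5e6, sum(small) / len(small)))   # -> 10
```

The recursion in the paper re-splits each chunk until the sizes within a chunk are homogeneous enough to share one pipelining level.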

• results
  • awesome (slides needed, graphs and so on ...)

Sandhya Narayan, Hadoop acceleration in an OpenFlow-based cluster:

• overview of SDN/OpenFlow
• use case: Hadoop


• Hadoop overview
• Hadoop acceleration approaches (usual stuff)
• overview of the MapReduce pipeline (ibid.)
• overview of Hadoop network traffic (ibid.)

• Floodlight as the OpenFlow controller
• OpenFlow switch: Open vSwitch and link (research link)
• queues in OpenFlow (for different bandwidths: 50 Mbps, 200 Mbps, ...)
• improvement in latency due to BW queues
• conclusion: SDN is awesome, but we don't use much of it now
• further work: QoS, dynamic Hadoop flows

no news there.

Mehmet Balman, Streaming exascale data over 100 Gbps networks:

• lots-of-small-files problem! file-centric tools (not high speed), latency still a problem
• framework for a memory-mapped network channel

• blocks
• memory caches are logically mapped between client and server
• advantages:

  • decoupling I/O and network ops (frontend/backend)
  • not limited by file size characteristics
  • moving climate files efficiently (GridFTP, fopen, ...)

• SC11 100 Gbps demo
  • CMIP3 data (35 TB) over GPFS at NERSC
  • block size 4 MB
  • each block's data section was aligned to the system page size
  • 1 GB cache
  • testbed overview:

  • many TCP streams
  • effect: crazy CPU usage

• MemzNet's performance (buffer size 5 MB)

wtf?! no new information AT ALL.

MONDAY:

parallel storage workshop:

keynote (Eric Barton)

• http://www.pdsw.org/keynote.shtml
• http://www.pdsw.org/pdsw12/slides/keynote-FF-IO-Storage.pdf

poster session slides and papers available online: http://www.pdsw.org/index.shtml

slides (papers if no slides available at the time):
1. http://www.pdsw.org/pdsw12/papers/he-pdsw12.pdf
2. http://www.pdsw.org/pdsw12/slides/crume-slides-pdsw12.pdf


3. http://www.pdsw.org/pdsw12/papers/grawinkle-pdsw12.pdf - no slides yet
4. http://www.pdsw.org/pdsw12/papers/kim-pdsw12.pdf - no slides yet
5. http://www.pdsw.org/pdsw12/slides/jwchoi_sc_SAN.pdf
6. http://www.pdsw.org/pdsw12/slides/ren-tablefs_giga_pdsw.pdf
7. http://www.pdsw.org/pdsw12/papers/goodell-pdsw12.pdf - no slides yet
8. http://www.pdsw.org/pdsw12/slides/watkins-datamods-pdsw12.pdf
9. http://www.pdsw.org/pdsw12/papers/carns-pdsw12.pdf - no slides (yet?)

HFT workshop:

http://www.cs.usfca.edu/~mfdixon/whpcf12/whpcf_12_program.html

2nd keynote: NVIDIA (John Ashley), "How not to be roadkill"

• overview
• background: EE, realtime data, big data, data mining, geospatial, ...
• drivers: power and heat
• drivers: financial regulators
• drivers: the world as we don't know it:

  • no single architecture for everything; multi-arch
  • Hadoop isn't the answer to everything
  • need to optimize cost and risk
  • need tools and techniques to implement across heterogeneous solutions
  • need metrics to identify tradeoffs

• examples:
  • Hanweck: reduced capital expenditure 10x, operating expenditure 13x
  • Citadel: each GPU saves 180.6K USD / year
  • JPMC: 80 percent operating-expenditure savings through GPUs

• drivers: information advantage
  • is knowledge power?

  • profit = f(knowledge, capital, capability)
  • low-latency/HFT teams know this, ...

  • knowing what your competition does
  • are you in the red with respect to the capability to price and risk deals, ...

  • analytical? better models? faster?
  • computationally? new technology -> time to market

• JPMorgan runs GPUs for risk analysis
• crossing the road without getting hit

• technology
  • no longer hw-agnostic
  • heterogeneous
  • suitable
  • data is the new bottleneck

• skills
  • parallel thinking

  • data awareness
  • multi-paradigm, multi-programming
  • experimentalism
  • the HFT guys are into all of this, and so on ...

• parallel thinking
  • chunking work

• distribution


  • tiling
  • cyclic reduction, parallel solvers, swarm optimization, Monte Carlo

• numerical issues
  • awareness of discrete-math issues, SP/DP
  • numerical stability, async algorithms, red/black coloring, multi-level grid solvers
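The SP/DP point is easy to demonstrate: naively accumulating 0.1 a million times drifts visibly in single precision but stays accurate in double (a contrived illustration, not from the talk):

```python
import numpy as np

acc32 = np.float32(0.0)                 # single precision accumulator
for _ in range(1_000_000):
    acc32 += np.float32(0.1)

acc64 = 0.0                             # Python float = double precision
for _ in range(1_000_000):
    acc64 += 0.1

# acc32 lands well away from 100000; acc64 is off by only ~1e-6
print(float(acc32), acc64)
```

This is why the slide pairs SP/DP awareness with stable algorithm choices: the fix is usually a wider accumulator or a compensated/pairwise summation, not just "more bits everywhere".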

• data awareness
  • not just Hadoop
  • efficient organization and delivery of data to compute is key
  • dataflow programming is key
  • HPC programmers already know this
  • examples:

  • structure of arrays vs. array of structures, esp. as vector units get wider
  • tiling algorithms vs. naive algorithms: drastic performance improvement
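The SoA-vs-AoS example in NumPy terms: a structured dtype interleaves fields, so per-field access is strided, while separate arrays keep each field contiguous for the vector units (a sketch, not from the talk):

```python
import numpy as np

n = 1_000_000

# Array of structures: x, y, z interleaved per element (12-byte stride per field).
aos = np.zeros(n, dtype=[("x", np.float32), ("y", np.float32), ("z", np.float32)])

# Structure of arrays: one contiguous buffer per field (4-byte stride).
soa_x = np.zeros(n, dtype=np.float32)
soa_y = np.zeros(n, dtype=np.float32)

# Same math either way, but the SoA operands stream contiguously.
r_aos = aos["x"] + aos["y"]
r_soa = soa_x + soa_y

print(aos["x"].strides, soa_x.strides)  # -> (12,) (4,)
```

The wider the vector unit, the more the strided AoS loads cost relative to the contiguous SoA ones, which is exactly the point the keynote was making.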

• some firms still believe that language-optimized and hardware-aware programming is wrong

• experimentalism
  • innovate
  • avoid analysis paralysis
  • define relevant metrics, collect them, and then act

• STAC-A2: a benchmark focused on metrics and the business problem
  • can be used to compare a range of innovative potential solutions
  • allows free rein to parallel and data-sensitive computing

• case study
  • CARMA: standalone ARM + GPU micro server; it's a dev kit, over narrow PCIe

  • Monte Carlo based
  • MPI
  • CARMA rocks for HFT
  • speed
  • low power consumption