CIS 610: Many-core visualization libraries
Hank Childs, University of Oregon, Jan. 21st, 2013



Schedule for this class
- We have done 5 lectures in 2 weeks
- We should have done 4 lectures over the last two weeks
- We will do 3 lectures this week
- We will be one full week ahead of schedule
- We will cancel two lectures over the coming weeks

Schedule this week
- Tuesday lecture: today
  - Review of data-parallel operations, general discussion of packages so far
- Thursday lecture: Ken Moreland (Thursday 12: Ken Moreland)
- Friday lecture: Ken Moreland
  - 8:30-10 (I can't make this time)
  - 11-12:30
  - 11:30-1:00

Upcoming schedule: Tuesday, Jan 28th
- 10-minute presentation by each student on the project they want to pursue
- Non-binding
- Discuss the problem, and some initial thoughts about how to do it in many-core libraries

Upcoming schedule: Thursday, Jan 30th
- Group session debugging problems
- Important that you have started your project by then

Upcoming schedule: weeks following
- Series of 20-minute presentations, 3 per lecture
- Two flavors of presentation:
  - Update on my project
  - Overview of a paper I read

How this class will be graded
- You will all submit a report at the end of the quarter
- The report will include:
  - A summary of what you have done
  - It will focus on your project
  - You should also include:
    - Presentations made
    - Porting of libraries
    - Assistance to other students
    - Bugs debugged (or reported)
    - Etc.

How this class will be graded
- It is not curved
- If you all decide not to present papers, you will all be penalized
- I expect you will all get very good grades
- But it is important that you work hard and accomplish something in this class
- Play with the libraries, present papers in class, and really try to nail your research project

Lectures
- I expect you will all make about 3 presentations
  - 1 research update, 2 papers
  - 2 research updates, 1 paper
- Some lectures in the short term on CUDA, Thrust, data parallelism, etc., would probably be helpful
EAVL: Extreme-scale Analysis and Visualization Library
Jeremy Meredith, January, 2014

A Simple Data-Parallel Operation

Internal Library API Provides This:

    void CellToCellDivide(Field &a,
                          Field &b,
                          Field &c)
    {
        for_each(i)
            c[i] = a[i] / b[i];
    }

Algorithm Developer Writes This:

    void CalculateDensity(...)
    {
        //...
        CellToCellDivide(mass, volume, density);
    }

Functor + Iterator Approach

Internal Library API Provides This:

    template <class T>
    void CellToCellBinaryOp(Field &a,
                            Field &b,
                            Field &c,
                            T &f)
    {
        for_each(i)
            f(a[i], b[i], c[i]);
    }

    struct Divide
    {
        void operator()(float &a,
                        float &b,
                        float &c)
        {
            c = a / b;
        }
    };

Algorithm Developer Writes This:

    void CalculateDensity(...)
    {
        //...
        CellToCellBinaryOp(mass, volume, density, Divide());
    }

Custom Functor

Internal Library API Provides This:

    template <class T>
    void CellToCellBinaryOp(Field &a,
                            Field &b,
                            Field &c,
                            T &f)
    {
        for_each(i)
            f(a[i], b[i], c[i]);
    }

Algorithm Developer Writes These:

    void CalculateDensity(...)
    {
        //...
        CellToCellBinaryOp(mass, volume, density, MyFunctor());
    }

    struct MyFunctor
    {
        void operator()(float &a,
                        float &b,
                        float &c)
        {
            c = a + 2*log(b);
        }
    };

Data Parallelism Basics

Map with 1 input, 1 output
- Simplest data-parallel operation.
- Each result item can be calculated from its corresponding input item alone.

    struct f
    {
        float operator()(float x) { return x*2; }
    };

[Diagram: input array x, output array result]

Map with 2 inputs, 1 output
- With two input arrays, the functor takes two inputs.
- You can also have multiple outputs.

    struct f
    {
        float operator()(float a, float b) { return a+b; }
    };

[Diagram: input arrays x and y, output array result]

Scatter with 1 input (and thus 1 output)
- Possibly inefficient, with risks of race conditions and uninitialized results. (Can also scatter to a larger array if desired.)
- Often used in a scatter_if type construct.
- No functor.

[Diagram: input array x, indices array, output array result]

Gather with 1 input (and thus 1 output)
- Unlike scatter, no risk of uninitialized data or race conditions.
- Plus, parallelization is over a shorter indices array, and caching helps more, so it can be more efficient.
- No functor.

[Diagram: input array x, indices array, output array result]

Reduction with 1 input (and thus 1 output)
- Example: max-reduction. Sum is also common.
- Often a fat-tree-based implementation.

    struct f
    {
        float operator()(float a, float b) { return a>b ? a : b; }
    };

[Diagram: input array x, single output result]

Inclusive Prefix Sum (a.k.a. Scan) with 1 input/output
- Value at result[i] is the sum of values x[0]..x[i].
- Surprisingly efficient parallel implementation.
- Basis for many more complex algorithms.
- No functor.

[Diagram: input array x, output array result]

Exclusive Prefix Sum (a.k.a. Scan) with 1 input/output
- Initialize with zero; the value at result[i] is the sum of only up to x[i-1].
- May be more commonly used than inclusive scan.
- No functor.

[Diagram: input array x, output array result]

Writing Algorithms in EAVL
Example: Threshold

Threshold
- Keep a cell if it meets some criteria, else discard it
- Criteria:
  - Pressure > 2
  - 10 < temperature < 20

[Diagram: cells that meet criteria]

How to implement threshold
- Iterate over cells
- If a cell meets the criteria, then place that cell in the output
- Output is an unstructured mesh

Explicit cells can be combined with structured coordinates.
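The primitives from the "Data Parallelism Basics" slides above can be sketched as serial reference versions in standard C++. This is only an illustration of what each primitive computes, not EAVL's API: the function names (`map_twice`, `gather`, `scatter`, `max_reduce`, `inclusive_prefix_sum`, `exclusive_prefix_sum`) and the `fill` parameter are hypothetical, and a many-core library would run each loop in parallel.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Map with 1 input, 1 output: result[i] = f(x[i]), here f(x) = x*2.
struct Twice {
    float operator()(float x) const { return x * 2; }
};

std::vector<float> map_twice(const std::vector<float>& x) {
    std::vector<float> result(x.size());
    std::transform(x.begin(), x.end(), result.begin(), Twice());
    return result;
}

// Gather: result[i] = x[indices[i]]. Every output slot is written exactly once.
std::vector<float> gather(const std::vector<float>& x,
                          const std::vector<int>& indices) {
    std::vector<float> result(indices.size());
    for (std::size_t i = 0; i < indices.size(); ++i)
        result[i] = x[indices[i]];
    return result;
}

// Scatter: result[indices[i]] = x[i]. Slots that no index points at keep
// `fill` (the "uninitialized results" risk); duplicate indices would be a
// race condition in a parallel version.
std::vector<float> scatter(const std::vector<float>& x,
                           const std::vector<int>& indices,
                           std::size_t n, float fill) {
    std::vector<float> result(n, fill);
    for (std::size_t i = 0; i < indices.size(); ++i)
        result[indices[i]] = x[i];
    return result;
}

// Max-reduction with the binary functor from the slides. Assumes x is non-empty.
struct Max {
    float operator()(float a, float b) const { return a > b ? a : b; }
};

float max_reduce(const std::vector<float>& x) {
    return std::accumulate(x.begin() + 1, x.end(), x.front(), Max());
}

// Inclusive prefix sum: result[i] = x[0] + ... + x[i].
std::vector<float> inclusive_prefix_sum(const std::vector<float>& x) {
    std::vector<float> result(x.size());
    std::partial_sum(x.begin(), x.end(), result.begin());
    return result;
}

// Exclusive prefix sum: result[0] = 0, result[i] = x[0] + ... + x[i-1].
std::vector<float> exclusive_prefix_sum(const std::vector<float>& x) {
    std::vector<float> result(x.size());
    float running = 0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        result[i] = running;
        running += x[i];
    }
    return result;
}
```

Note that the gather loop runs over the (possibly shorter) indices array, while the scatter loop runs over the input array; this is the asymmetry the slides point to when they prefer gathers.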
Example: Thresholding an RGrid (a)

[Diagram labels: eavlCoordinates (FieldName: x y z); eavlField#0 (Name: x, Association: LogicalDim0, Values[ni]); eavlField#1 (Name: y, Association: LogicalDim1, Values[nj]); eavlField#2 (Name: z, Association: LogicalDim2, Values[nk]); eavlStructuredCellSet (RegularStructure); eavlExplicitCellSet (Connectivity: a bunch of cells); Cells: ( ), Parent: ( )]

Example: Thresholding an RGrid (b)
- A second Cell Set can be added which refers to the first one.

[Diagram labels: eavlCoordinates (FieldName: x y z); eavlField#0 (Name: x, Association: LogicalDim0, Values[ni]); eavlField#1 (Name: y, Association: LogicalDim1, Values[nj]); eavlField#2 (Name: z, Association: LogicalDim2, Values[nk]); eavlStructuredCellSet (RegularStructure); eavlSubset]

Starting Mesh
- We want to threshold a mesh based on its density values (shown here)
- If we threshold 35 < density < 45, we want this result:

[Figures: input mesh labeled with density values; thresholded result]

Which Cells to Include?
- Evaluate a Map operation with this functor:

    struct InRange
    {
        float lo, hi;
        InRange(float l, float h) : lo(l), hi(h) { }
        int operator()(float x) { return x>lo && x<hi; }
    };

[...] functor.
- We can use this to create output cell length arrays

[Diagram: inrange array, plus, result 6]

Where Do the Output Cells Go?
- How do we create this mapping?

[Diagram: input indices and output indices, mapping each input cell to its output cell]

Create Input-to-Output Indexing?
- Exclusive Scan (exclusive prefix sum) gives us the output index positions

[Diagram: inrange array, startidx array]

Scatter Input Arrays to Output?
- NO. We can do this, but scatters can be risky/inefficient.
- Assuming we have multiple arrays to process, we can do something better
- Race condition unless we add a mask array!

[Diagram: density array, startidx array, output_density array]

Create Output-to-Input Indexing?
- We want to work in the shorter output-length arrays and use gathers.
- A specialized scatter in EAVL creates this reverse index

[Diagram: startidx array, revindex array]

Gather Input Mesh Arrays to Output?
- We can now use simple gathers to pull input arrays (density, pressure) into the output mesh

[Diagram: density array, revindex array, output_density array]
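The threshold steps above (map, reduce, exclusive scan, reverse index, gather) can be sketched end to end as a serial C++ reference. This is a minimal illustration under assumptions, not EAVL code: `ThresholdIndex`, `build_threshold_index`, and `gather` are hypothetical names, and the masked loop in step 4 is only a stand-in for the "specialized scatter" the slides mention, whose actual implementation is not shown. The array names `inrange`, `startidx`, and `revindex` follow the slides.

```cpp
#include <cstddef>
#include <vector>

// Functor from the slides: 1 if lo < x < hi, else 0.
struct InRange {
    float lo, hi;
    InRange(float l, float h) : lo(l), hi(h) {}
    int operator()(float x) const { return x > lo && x < hi; }
};

// Hypothetical container for the intermediate arrays.
struct ThresholdIndex {
    std::vector<int> inrange;   // per input cell: 1 if kept, 0 if discarded
    std::vector<int> startidx;  // per input cell: output position (exclusive scan)
    std::vector<int> revindex;  // per output cell: index of its input cell
};

ThresholdIndex build_threshold_index(const std::vector<float>& density,
                                     float lo, float hi) {
    ThresholdIndex t;
    const std::size_t n = density.size();
    const InRange f(lo, hi);

    // Step 1 (map): flag the cells that meet the criteria.
    t.inrange.resize(n);
    for (std::size_t i = 0; i < n; ++i)
        t.inrange[i] = f(density[i]);

    // Step 2 (reduce with plus): count the output cells.
    int count = 0;
    for (std::size_t i = 0; i < n; ++i)
        count += t.inrange[i];

    // Step 3 (exclusive scan): startidx[i] = number of kept cells before i.
    t.startidx.resize(n);
    int running = 0;
    for (std::size_t i = 0; i < n; ++i) {
        t.startidx[i] = running;
        running += t.inrange[i];
    }

    // Step 4 (masked scatter, assumed stand-in for EAVL's specialized
    // scatter): each kept cell writes its own index to its output slot.
    // The inrange mask ensures no two cells write to the same slot.
    t.revindex.resize(static_cast<std::size_t>(count));
    for (std::size_t i = 0; i < n; ++i)
        if (t.inrange[i])
            t.revindex[t.startidx[i]] = static_cast<int>(i);

    return t;
}

// Step 5 (gather): pull any input cell array into the output mesh.
std::vector<float> gather(const std::vector<float>& input,
                          const std::vector<int>& revindex) {
    std::vector<float> output(revindex.size());
    for (std::size_t i = 0; i < revindex.size(); ++i)
        output[i] = input[revindex[i]];
    return output;
}
```

Once `revindex` is built, the same index serves every cell array: calling `gather(density, t.revindex)` and `gather(pressure, t.revindex)` reuses it, which is why the slides favor one reverse index plus gathers over scattering each array separately.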