![Page 1: A Case for Source-Level Transformations in MATLAB Vijay Menon and Keshav Pingali Cornell University The MaJic Project at Illinois/Cornell George Almasi](https://reader031.vdocuments.mx/reader031/viewer/2022032522/56649d615503460f94a4357d/html5/thumbnails/1.jpg)
A Case for Source-Level Transformations in MATLAB
Vijay Menon and Keshav Pingali
Cornell University
The MaJic Project at Illinois/Cornell
• George Almasi
• Luiz De Rose
• David Padua
MATLAB
High-Level Interpreted Language for Numerical Computing
• Matrix is a first-class type
• Library of numerical functions
Application Domains
• Image Processing
• Structural Mechanics
• Computational Finance
The Problem
Development is fast... ~10X as concise as C/Fortran
Performance is slow! ~10X as slow as C/Fortran
Conventional Approach:
• Rewrite
• Compile
Our Approach: Source-Level Optimization
Apply high-level transformations directly on MATLAB codes
Significant performance benefit for:
• interpreted code
• compiled code
Outline
• Overheads in MATLAB
• Conventional Compilation
• Source-Level Optimization
• Comparison
• Implementation Status
Outline
• Overheads in MATLAB
  • Type/Shape Checking
  • Memory Management
  • Array Bounds Checking
• Conventional Compilation
• Source-Level Optimization
• Comparison
• Implementation Status
Type/Shape Checking
MATLAB has no type/shape declarations
Consider: A * B
The interpreter must check the operands at run time to decide how to perform the multiply (*):
• Shape: Scalar*Scalar, Scalar*Matrix, Matrix*Matrix
• Type: Real*Real, Real*Complex, Complex*Complex
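As a rough illustration of the checks listed above, the following sketch classifies the operands of a product before it would be evaluated. It only mimics the kind of dispatch an interpreter has to perform; the variable names (`operands`, `labels`) are made up for this example and nothing here reflects MATLAB's actual internals.

```matlab
% Sketch: the run-time questions an interpreter answers before A * B.
% Each row of 'operands' is one hypothetical (A, B) pair.
operands = {2, [1 2; 3 4]; 3, 2+1i; [1 2; 3 4], [1i 0; 0 1]};
labels = cell(size(operands, 1), 1);
for c = 1:size(operands, 1)
    A = operands{c, 1};
    B = operands{c, 2};
    % Shape check: is each operand a scalar or a matrix?
    if isscalar(A), sa = 'Scalar'; else, sa = 'Matrix'; end
    if isscalar(B), sb = 'Scalar'; else, sb = 'Matrix'; end
    % Type check: is each operand real or complex?
    if isreal(A), ta = 'Real'; else, ta = 'Complex'; end
    if isreal(B), tb = 'Real'; else, tb = 'Complex'; end
    labels{c} = sprintf('%s*%s, %s*%s', sa, sb, ta, tb);
    disp(labels{c});
end
```

In an interpreter these checks run every time the expression is evaluated, because no declaration tells it the answer in advance.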
Type/Shape Checking
Consider:
    for i = 1:n
        y = y + a * x(i);
    end
Loops repeat the same checks on every iteration, which magnifies interpreter overhead
Memory Management: Dynamic Resizing
Consider:
    x(10) = 10;
C/Fortran: x must have >= 10 elements
MATLAB: x is resized if needed
• Memory reallocated
• Data copied
Memory Management: Dynamic Resizing
MATLAB dynamically grows arrays:
    for i = 1 : 1000
        x(i) = i;
    end
Every iteration triggers resize!
• 1,000 memory allocations
• ~500,000 elements copied
Execution Time:
• x is undefined: 14.2 seconds
• x is already defined: 0.37 seconds
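The allocation and copy counts above can be reproduced with a simple cost model. The sketch below assumes that every store past the current end of the array allocates a new block and copies all existing elements into it; that is a simplification for illustration, not a description of how any particular MATLAB version manages memory.

```matlab
% Simplified cost model of dynamic resizing for: for i = 1:n, x(i) = i; end
n = 1000;
allocations = 0;   % number of reallocations
copied = 0;        % total elements copied during reallocations
len = 0;           % current length of x (x starts out undefined)
for i = 1:n
    if i > len
        allocations = allocations + 1;  % allocate a block of i elements
        copied = copied + len;          % move the old elements across
        len = i;
    end
end
fprintf('%d allocations, %d elements copied\n', allocations, copied);
```

Under this model the copy count is 0 + 1 + ... + 999 = 499,500, which matches the rounded figure of about 500,000.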
Array Bounds Checking
Consider array indexing:
    x(i) = y(i);
Failed Bounds Check on
• x(i) can trigger resize
• y(i) can trigger error
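The two outcomes of a failed bounds check can be seen in a small example. The values below are arbitrary and chosen only for illustration.

```matlab
x = [1 2 3];
y = [1 2 3];

% Store past the end: x is resized and the gap is zero-filled.
x(5) = 7;
disp(x);

% Load past the end: the interpreter raises a run-time error.
failed = false;
try
    v = y(5);
catch
    failed = true;
end
disp(failed);
```

Both checks are needed on every indexed access, since the interpreter cannot know in advance which case applies.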
Array Bounds Checking
In a loop:
    for i = 3:100
        x(i) = x(i-1) + x(i-2);
    end
Interpreter performs redundant checks
Compiler work:
• Nonresizable arrays: Gupta PLDI’90
• Resizable arrays: more difficult
Common Theme
Loops magnify overheads
• every iteration: redundant checks, resizes, …
MATLAB interprets naively
• computes as is
• no reorganization to optimize
Outline
• Overheads in MATLAB
• Conventional Compilation
  • Compile to C/Fortran
  • Rely on C/Fortran compiler for optimization
• Source-Level Optimization
• Comparison
• Implementation Status
MATLAB Compilers
Compile to C/C++/Fortran
• MCC -> C (The MathWorks)
• MATCOM -> C++ (Mathtools)
• FALCON -> F90 (U of Illinois)
Native compiler generates executable code:
• Link back into MATLAB environment
• Run as stand-alone program
The MCC Compiler
Safe Optimization:
• Type Inference - no declarations in MATLAB
• Eliminate Type Checks / Reduce Storage
• Specialize for real input variables
• Always legal!
Unsafe Optimization:
• Assume all data is real
• Eliminate all bounds checks - disallow resizing
• User must ensure legality!
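One way to see why assuming all data is real cannot be done safely without analysis: real inputs do not guarantee real intermediate values. The snippet below is a minimal illustration with made-up data; it says nothing about how MCC itself behaves on such code.

```matlab
x = [4 1 -9];        % real input data
r = sqrt(x);         % the negative entry makes the result complex
disp(isreal(x));     % input is real
disp(isreal(r));     % result is not
```

Code compiled under a blanket "all data is real" assumption would have to be checked by the user for operations like this one.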
Falcon Benchmarks
Collected by DeRose from MATLAB users at Illinois/NCSA
Element/Loop Intensive
• CN - Crank-Nicolson PDE Solver
• Di - Dirichlet PDE Solver
• FD - Finite Difference PDE Solver
• Ga - Galerkin PDE Solver
• IC - Incomplete Cholesky Factorization
Memory Intensive
• AQ - Adaptive Quadrature w/ Simpson’s Rule
• EC - Euler-Cromer 2 body problem
• RK - Runge-Kutta 2 body problem
Library Intensive
• CG - Conjugate Gradients Iterative Solver
• Mei - 3D Surface Generation
• QMR - Quasi-Minimal Residual
• SOR - Successive Over-Relaxation
MCC: Safe Optimizations
[Bar chart: Execution Time (s), axis 0 to 80, for AQ, CG, CN, Di, FD, Ga, IC, Mei, EC, RK, QMR, SOR. Series: Interpreted, MCC Safe.]
MCC: Unsafe Optimizations
[Bar chart: Execution Time (s), axis 0 to 70, for CG, Di, FD, IC, QMR, SOR. Series: Interpreted, MCC Safe, MCC Unsafe All.]
Note: User must ensure legality!
Outline
• Overheads in MATLAB
• Conventional Compilation
• Source-Level Optimization
  • Vectorization
  • Preallocation
  • Expression Optimization
• Comparison
• Implementation Status
Vectorization
Loops are expensive
• Overheads are magnified
Idea: Eliminate Loops
• Map loops to higher-level matrix operations
• Interpreter uses efficient libraries
  • BLAS
  • LINPACK/EISPACK
Example of Vectorization
In Galerkin, 98% of execution spent in:
    for i = 1:N
        for j = 1:N
            phi(k) = phi(k) + a(i,j)*x(i)*y(j);
        end
    end
Vectorized Code
In Optimized Galerkin:
    phi(k) = phi(k) + x*a*y';
Fragment Speedup: 260
Program Speedup: 110
Note: Not always possible!
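The equivalence behind this rewrite can be checked numerically. The sketch below assumes `x` and `y` are 1-by-N row vectors and that the loop accumulates `a(i,j)*x(i)*y(j)`, which is what the product `x*a*y'` computes; the random test data and the names `s_loop` and `s_vec` are introduced only for this example.

```matlab
N = 50;
a = rand(N, N);
x = rand(1, N);    % assumed row vector
y = rand(1, N);    % assumed row vector

% Element-wise double loop.
s_loop = 0;
for i = 1:N
    for j = 1:N
        s_loop = s_loop + a(i,j) * x(i) * y(j);
    end
end

% Single matrix expression: (1xN) * (NxN) * (Nx1) gives a scalar.
s_vec = x * a * y';

rel_err = abs(s_loop - s_vec) / abs(s_vec);
fprintf('relative difference: %g\n', rel_err);
```

The two results agree up to floating-point rounding, since the library routine may sum the terms in a different order than the loop.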
Effect of Vectorization
[Bar chart: Execution Time (s), axis 0 to 80, for CN, Di, FD, Ga, IC. Series: Original, Vectorized.]
Preallocation
Eliminate Dynamic Resizing
• Try to predict eventual size of array
Insert early allocation when possible:
    x = zeros(1000,1);
Resizing will not be triggered
Example of Preallocation
In Euler-Cromer, 87% of time spent in:
    for i = 1:N
        r(i) = …
        th(i) = …
        t(i) = …
        k(i) = …
        p(i) = …
        …
    end
Preallocated Code
In Optimized Euler-Cromer:
    r = zeros(1,N);
    ...
    for i = 1:N
        r(i) = …
        …
    end
Fragment Speedup: 7
Program Speedup: 4
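A self-contained version of the same pattern is sketched below. The loop body `i^2` is a placeholder standing in for the elided Euler-Cromer computation, and the names `r_grow` and `r_pre` are hypothetical; the point is only that preallocating changes when memory is allocated, not what the loop computes.

```matlab
N = 1000;

% Without preallocation: r_grow is resized as the loop stores past its end.
r_grow = [];
for i = 1:N
    r_grow(i) = i^2;    % placeholder computation
end

% With preallocation: the final size is allocated once, up front.
r_pre = zeros(1, N);
for i = 1:N
    r_pre(i) = i^2;     % same placeholder computation
end

disp(isequal(r_grow, r_pre));
```

Because the results are identical, the transformation is safe whenever the final size can be determined before the loop runs.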
Effect of Preallocation
[Bar chart: Execution Time (s), axis 0 to 80, for CN, Ga, EC, RK. Series: Original, Preallocated.]
Expression Optimization
MATLAB interprets expressions naïvely, in left-to-right order
Simple restructuring can significantly affect execution time, e.g.:
• A*B*x : O(n^3) flops
• A*(B*x) : O(n^2) flops
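The flop counts can be worked out directly. The sketch below assumes dense n-by-n matrices `A` and `B`, an n-by-1 vector `x`, and the textbook cost of 2n-1 flops per length-n dot product; the size n = 200 and the variable names are arbitrary choices for illustration.

```matlab
n = 200;
matvec = n * (2*n - 1);          % one matrix-vector product: n dot products
matmat = n * matvec;             % one matrix-matrix product: n matrix-vector products

flops_left  = matmat + matvec;   % (A*B)*x : matrix-matrix, then matrix-vector
flops_right = 2 * matvec;        % A*(B*x) : two matrix-vector products
ratio = flops_left / flops_right;
fprintf('(A*B)*x: %d flops, A*(B*x): %d flops, ratio %.1f\n', ...
        flops_left, flops_right, ratio);

% Both orderings give the same vector up to rounding.
A = rand(n); B = rand(n); x = rand(n, 1);
rel_diff = norm(A*B*x - A*(B*x)) / norm(A*(B*x));
```

Under these assumptions the ratio simplifies to (n+1)/2, so the benefit of reassociating grows linearly with the matrix size.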
Example of Expression Optimization
In QMR, 70% of execution spent in:
    w = A'*q;
A : 420x420 matrix
q, w : 420x1 vectors
A' = transpose(A)
Expression Optimized Code
In Optimized QMR: A'*q == (q'*A)'
    w = (q'*A)';
Transpose 2 vectors instead of 1 matrix
Fragment Speedup: 20
Program Speedup: 3
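The identity used here can be confirmed on random data of the same size. This is only a numerical check with made-up inputs; it does not reproduce the QMR benchmark or its timings.

```matlab
n = 420;
A = rand(n);          % stand-in for the 420x420 matrix
q = rand(n, 1);       % stand-in for the 420x1 vector

w1 = A' * q;          % original form: transposes the full matrix
w2 = (q' * A)';       % rewritten form: transposes two vectors

rel_diff = norm(w1 - w2) / norm(w1);
fprintf('relative difference: %g\n', rel_diff);
```

The rewritten form touches only 2n elements in its transposes rather than n^2, which is where the saving comes from when the interpreter materializes `A'` explicitly.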
Effect of Expression Optimization
[Bar chart: Execution Time (s), axis 0 to 70, for EC, RK, QMR. Series: Original, Expr. Optimized.]
Summary Source-Level
[Bar chart: Execution Time (s), axis 0 to 80, for AQ, CG, CN, Di, FD, Ga, IC, Mei, EC, RK, QMR, SOR. Series: Original, Source Optimized.]
Comparison
[Bar chart: Execution Time (s), axis 0 to 80, for AQ, CG, CN, Di, FD, Ga, IC, Mei, EC, RK, QMR, SOR. Series: Interpreted, MCC Safe, MCC Best, Opt. Interpreted, Opt. MCC Safe, Opt. MCC Best.]
Point #1:
Source optimizations can outperform MCC
[Bar chart: Execution Time (s), axis 0 to 70, for FD, Ga, IC, QMR. Series: Interpreted, MCC Safe, MCC Best, Opt. Interpreted, Opt. MCC Safe, Opt. MCC Best.]
Point #2:
Source optimizations complement MCC
[Bar chart: Execution Time (s), axis 0 to 80, for CN, FD, Ga, IC, EC. Series: Interpreted, MCC Safe, MCC Best, Opt. Interpreted, Opt. MCC Safe, Opt. MCC Best.]
Benefits of Source-Level Optimizations
Vectorization
• Directly eliminates loop overhead
• Moves work to hand-optimized BLAS
Preallocation
• Eliminates resizing overhead
• Enables MCC array bounds elimination
Expression Optimization
• Uses algebraic info unavailable in C/Fortran
Implementation Status
Illinois/Cornell MaJic system
• Just-in-time MATLAB interpreter/compiler
• Incorporates Source-Level Transformation
Semantic Optimization (Menon/Pingali ICS’99)
• Vectorization/BLAS call generation
• Expression Optimization
Preallocation/Bounds Check Optimization (Work in progress)
Conclusion
Source-level optimizations are important for enhancing the performance of MATLAB, whether the code is only interpreted or later compiled
THE END
Unsafe Type Check Removal
[Bar chart: Execution Time (s), axis 0 to 80. Series: Interpreted, MCC Safe, MCC Unsafe Type.]
Correct on 11/12 Codes
Unsafe Bounds Check Removal
[Bar chart: Execution Time (s), axis 0 to 70, for CG, Di, FD, IC, Mei, QMR, SOR. Series: Interpreted, MCC Safe, MCC Unsafe Bounds.]
Correct on 7/12 Codes