update now--whats-new-in-intel-compilers-and-libraries
DESCRIPTION
Build fast code faster with the compilers and libraries in the new Intel® Compiler Version 15.0. We’ll examine new capabilities in the 2015 Intel compiler release, including new optimization and vectorization reports that streamline performance enhancements, increased language standards support, and useful new compiler options. And, what’s in a name? We've simplified Intel® Parallel Studio XE into three editions; join us and find out more.
Buy online, read more about the software, or download a trial from our online store: http://www.appcenter.com.tw/ or http://www.cheerchain.com.tw
Cheer Chain Enterprise Co., Ltd. | T +886 4 2386 3559 | F +886 4 2386 3159 | www.cheerchain.com.tw
Distribution of Software | Training Courses | Consulting Services
TRANSCRIPT
What’s new in the Intel® Parallel Studio XE 2015 Composer Edition
August 2014
Copyright © 2014, Intel Corporation. All rights reserved. *Other names and brands may be claimed as the property of others.Optimization Notice
Support for majority of OpenMP* 4.0
• Major item not included: user-defined reductions
Feature Complete! C++ 11
• Language features only, library features dependent on the standard C++ library with the platform
Feature Complete! Fortran 2003
Fortran 2008 BLOCK (F08 is a work in progress)
Redesign of compiler Optimization Reports (including vec-report and inlining reports)
• Line and loop based reports, line-ordered not phase ordered
Key Features: Big Items
• The compiler is included in three major packages or bundles
• All three major packages begin with the name “Intel® Parallel Studio XE 2015 <bundle name>” where bundle name is:
• “Composer Edition” formerly named “Intel Composer XE”
• “Professional Edition” formerly named “Studio”s
• “Cluster Edition” formerly named “Intel® Cluster Studio XE”
• Installation directories and naming follow previous compiler methods and naming conventions
New Naming
New Features Common to both C++ and Fortran
What’s in IFORT & ICC v15.0
• Windows XP* no longer supported
• Microsoft Visual Studio 2008* no longer supported
• Compilers will not integrate into this version
• Latent support for “Windows* 8.2”, “Microsoft Visual Studio 2014*” (i.e. next version)
• IMSL available as add-on for any product containing Visual Fortran for Windows
• Microsoft Visual Studio 2010* (including Fortran Shell): VS2010 SP1 (Service Pack 1), or vsshell2010sp1.msi from Intel, must be installed
Windows Support Changes
• Red Hat Enterprise Linux* 7 support added (RHEL 5+ supported)
• Ubuntu*: 12.04 (LTS, 64-bit only), 13.10, 14.04 supported
• Fedora*: Fedora Core 20 support added
• SuSE Linux Enterprise Server* 10 (SLES 10) no longer supported
• SLES 11 supported
• Latent support for Fedora Core 21 and SLES 12
• Intel® Debugger (IDB) no longer supplied with compiler (GNU* gdb provided with Composer Edition + extensions supplied)
• Read the friendly Release Notes, Requirements for specifics
Linux Support Changes
• As always, latest OS X is supported (subject to compiler update release schedule alignment)
• Currently 10.9 supported
• When a new OS is released OFFICIALLY we will OFFICIALLY add support in a subsequent Update
• Xcode* 5.0/5.1 supported
• Xcode 4.x no longer supported
• Intel® Debugger (IDB) no longer supported (GNU* gdb provided with Composer XE + extensions supplied)
OS X* Changes
A New Optimization Report!
We heard your feedback…
Vectorization and other reports were hard to understand
Messages too cryptic
Loop optimization messages split between multiple reports
Confusion due to multiple versions of one source loop created by the compiler
Hard to relate messages to source when code was inlined
One huge report stream hard to navigate; unsuitable for parallel builds
Composer 2015: Why a new Optimization Report?
Improve the user experience
• Make the report easier to read and understand
• A single, unified report
• Loop-based reporting
• Focus on user actionable information
• Make it easy to select the desired information
• Expand the range of output modes
Make the report’s information accessible
• In a text file
• In the assembly listing
• Through the Microsoft* Visual Studio* IDE
• Through other Intel software tools
Report Goals for the 15.0 Compiler
Introduced in Intel® Compiler v15.0 for C, C++ and Fortran
• for Windows*, Linux* and OS X*
Main options:
/Qopt-report:N (Windows), -qopt-report=N (Linux and OS X)
  N = 1-5 for increasing levels of detail (default N=2)
/Qopt-report-phase:str[,str1,…] (Windows), -qopt-report-phase=str[,str1,…] (Linux and OS X)
  str = loop, par, vec, openmp, ipo, pgo, cg, offload, tcollect, all
/Qopt-report-file:[stdout | stderr | filename] (Windows), -qopt-report-file=[stdout | stderr | filename] (Linux and OS X)
General
Output goes to a text file by default, no longer stderr
• File extension is .optrpt, root is same as object file
• One report file per object file, in object directory
• created from scratch or overwritten (no appending)
/Qopt-report-file:stderr or -qopt-report-file=stderr gives the old behavior (output to stderr)
/Qopt-report-file:filename or -qopt-report-file=filename changes the default file name
/Qopt-report-format:vs formats the report for the Visual Studio* IDE
For debug builds, (-g on Linux* or OS X*, /Zi on Windows*), assembly code and object files contain loop optimization info
• /Qopt-report-embed to enable this for non-debug builds
Report Output
The optimization report can be large
Filtering can restrict the content to the most performance-critical parts of an application
[-q | /Q]opt-report-routine[: | =]<function1>[,<function2>,…]
“function1” can be a substring of a function name or a regular expression
Can also restrict the report to a particular range of line numbers, e.g.:
icl /Qopt-report-filter="test.cpp,100-300" test.cpp
ifort -qopt-report-filter="test.f90,100-300" test.f90
Also select the optimization phase(s) of interest with option -qopt-report-phase or /Qopt-report-phase
Filtering Report Output
Hierarchical display of loop nest
• Easier to read and understand
• For loops for which the compiler generates multiple versions, each version gets its own set of messages
Where code has been inlined, caller/callee info available
The “loop” phase (formerly hlo) includes messages about memory and cache optimizations, such as blocking, unrolling and prefetching
• Now integrated with vectorization & parallelization reports
Loop, Vectorization and Parallelization Phases
Hierarchically Presented Loop Optimization Report
 1  double a[1000][1000],b[1000][1000],c[1000][1000];
 2
 3  void foo() {
 4    int i,j,k;
 5
 6    for( i=0; i<1000; i++) {
 7      for( j=0; j< 1000; j++) {
 8        c[j][i] = 0.0;
 9        for( k=0; k<1000; k++) {
10          c[j][i] = c[j][i] + a[k][i] * b[j][k];
11        }
12      }
13    }
14  }
LOOP BEGIN at …\mydir\dev\test.c(7,5)
Distributed chunk2
….
   LOOP BEGIN at …\mydir\dev\test.c(9,7)
   Distributed chunk2
   ….
      LOOP BEGIN at …\mydir\dev\test.c(6,3)
      ….
      LOOP END

      LOOP BEGIN at …\mydir\dev\test.c(6,3)
      ….
      LOOP END
   LOOP END
LOOP END

Callouts: source location, loop nesting, header info, report contents
By comparison: 14.0 vs 15.0 reports

14.0 report:

HPO VECTORIZER REPORT (foo) LOG OPENED ON Mon Feb 24 12:10:15 2014

<d:\dev\test.c;-1:-1;hpo_vectorization;foo;0>
HPO Vectorizer Report (foo)

d:\dev\test.c(7:5-7:5):VEC:foo: loop was not vectorized: loop was transformed to memset or memcpy
d:\dev\test.c(6:3-6:3):VEC:foo: PERMUTED LOOP WAS VECTORIZED
d:\dev\test.c(9:7-9:7):VEC:foo: loop was not vectorized: not inner loop
d:\dev\test.c(7:5-7:5):VEC:foo: loop was not vectorized: not inner loop

HLO REPORT LOG OPENED ON Mon Feb 24 12:10:15 2014

<test.c;-1:-1;hlo;foo;0>
High Level Optimizer Report (foo)

<d:\dev\test.c;7:7;hlo_distribution;in foo;0>
LOOP DISTRIBUTION in foo at line 7

<d:\dev\test.c;6:6;hlo_linear_trans;foo;0>
LOOP INTERCHANGE in loops at line: 6 7
Loopnest permutation ( 1 2 ) --> ( 2 1 )
LOOP INTERCHANGE in loops at line: 6 7 9
Loopnest permutation ( 1 2 3 ) --> ( 2 3 1 )

<d:\dev\test.c;7:7;hlo_reroll;foo;0>
Loop at line:7 memset generated

Block, Unroll, Jam Report:
(loop line numbers, unroll factors and type of transformation)

<d:\dev\test.c;6:6;hlo_unroll;foo;0>
Loop at line 6 completely unrolled by 8

Loop Collapsing Report:

<d:\dev\test.c;7:7;hlo_loop_collapsing;foo;0>
Loops at line:7 and line:6 collapsed

15.0 report:

Report from: Loop nest, Vector & Auto-parallelization optimizations [loop, vec, par]

LOOP BEGIN at d:\dev\test.c(7,5)
Distributed chunk1
   remark #25430: LOOP DISTRIBUTION (2 way)
   remark #25448: Loopnest Interchanged : ( 1 2 ) --> ( 2 1 )
   remark #25424: Collapsed with loop at line 6
   remark #25412: memset generated
   remark #15144: loop was not vectorized: loop was transformed to memset or memcpy
LOOP END

LOOP BEGIN at d:\dev\test.c(7,5)
Distributed chunk2
   remark #25448: Loopnest Interchanged : ( 1 2 3 ) --> ( 2 3 1 )
   remark #15018: loop was not vectorized: not inner loop

   LOOP BEGIN at d:\dev\test.c(9,7)
   Distributed chunk2
      remark #15018: loop was not vectorized: not inner loop

      LOOP BEGIN at d:\dev\test.c(6,3)
         remark #15145: vectorization support: unroll factor set to 4
         remark #15003: PERMUTED LOOP WAS VECTORIZED
      LOOP END

      LOOP BEGIN at d:\dev\test.c(6,3)
         remark #15003: REMAINDER LOOP WAS VECTORIZED
      LOOP END
   LOOP END
LOOP END
Ordered Reporting of Transformations
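The 15.0 report lists a 2-way loop distribution followed by the loopnest permutation ( 1 2 3 ) --> ( 2 3 1 ) for the matrix example. The sketch below writes those two transformations out by hand so they are easier to picture. It is an illustration only, not the compiler's internal form: the size N, the array names c_ref and c_perm, and all function names are hypothetical, and the arrays are filled with small integers so that both versions produce exactly equal sums.

```cpp
#include <cstddef>

namespace {
const std::size_t N = 16;  // hypothetical small size; the slide uses 1000
double a[N][N], b[N][N], c_ref[N][N], c_perm[N][N];
}

// Fill the inputs with small integer values so every sum is exact in double.
void init_inputs() {
    for (std::size_t r = 0; r < N; ++r)
        for (std::size_t s = 0; s < N; ++s) {
            a[r][s] = static_cast<double>(r + s);
            b[r][s] = static_cast<double>(r) - static_cast<double>(s);
        }
}

// Loop order as written on the slide: i (1), j (2), k (3).
void multiply_original() {
    for (std::size_t i = 0; i < N; ++i)
        for (std::size_t j = 0; j < N; ++j) {
            c_ref[j][i] = 0.0;
            for (std::size_t k = 0; k < N; ++k)
                c_ref[j][i] = c_ref[j][i] + a[k][i] * b[j][k];
        }
}

// Hand-written equivalent of what the report describes.
void multiply_interchanged() {
    // Distributed chunk1: the zeroing loop, which the report turns into a memset.
    for (std::size_t j = 0; j < N; ++j)
        for (std::size_t i = 0; i < N; ++i)
            c_perm[j][i] = 0.0;
    // Distributed chunk2: order j, k, i, so the innermost loop walks
    // c_perm[j][i] and a[k][i] with unit stride while b[j][k] is invariant.
    for (std::size_t j = 0; j < N; ++j)
        for (std::size_t k = 0; k < N; ++k)
            for (std::size_t i = 0; i < N; ++i)
                c_perm[j][i] = c_perm[j][i] + a[k][i] * b[j][k];
}

// Returns true when both loop orders produce identical results.
bool interchange_matches() {
    init_inputs();
    multiply_original();
    multiply_interchanged();
    for (std::size_t j = 0; j < N; ++j)
        for (std::size_t i = 0; i < N; ++i)
            if (c_ref[j][i] != c_perm[j][i]) return false;
    return true;
}
```

With i innermost, the loop at line 6 of the slide's source becomes the one that is vectorized, which is consistent with the "PERMUTED LOOP WAS VECTORIZED" remark attached to test.c(6,3).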
LOOP BEGIN at ggFineSpectrum.cc(124,5) inlined into ggFineSpectrum.cc(56,7)
   remark #15018: loop was not vectorized: not inner loop

   LOOP BEGIN at ggFineSpectrum.cc(138,5) inlined into ggFineSpectrum.cc(60,15)
   Peeled
      remark #25460: Loop was not optimized
   LOOP END

   LOOP BEGIN at ggFineSpectrum.cc(138,5) inlined into ggFineSpectrum.cc(60,15)
      remark #15145: vectorization support: unroll factor set to 4
      remark #15002: LOOP WAS VECTORIZED
   LOOP END

   LOOP BEGIN at ggFineSpectrum.cc(138,5) inlined into ggFineSpectrum.cc(60,15)
   Remainder
      remark #15003: REMAINDER LOOP WAS VECTORIZED
   LOOP END
LOOP END
Multi-versioned loops: peel loop, remainder loop and kernel
Vectorized with Peeling and Remainder
Annotated Assembly Listings
.L11: # optimization report
# LOOP WAS INTERCHANGED
# loop was not vectorized: not inner loop
xorl %edi, %edi #38.3
movsd b.279.0.2(%rax,%rsi,8), %xmm0 #41.32
unpcklpd %xmm0, %xmm0 #41.32
# LOE rax rcx rbx rsi rdi r12 r13 r14 r15 edx xmm0
..B1.11: # Preds ..B1.11 ..B1.10
..L12: # optimization report
# LOOP WAS INTERCHANGED
# LOOP WAS VECTORIZED
# VECTORIZATION HAS UNALIGNED MEMORY REFERENCES
# VECTORIZATION SPEEDUP COEFFECIENT 2.250000
movaps a.279.0.2(%rcx,%rdi,8), %xmm1 #41.22
movaps 16+a.279.0.2(%rcx,%rdi,8), %xmm2 #41.22
movaps 32+a.279.0.2(%rcx,%rdi,8), %xmm3 #41.22
movaps 48+a.279.0.2(%rcx,%rdi,8), %xmm4 #41.22
mulpd %xmm0, %xmm1 #41.32
mulpd %xmm0, %xmm2 #41.32
<…>
L4::                                  ; optimization report
                                      ; PEELED LOOP FOR VECTORIZATION
$LN36:
$LN37:
        vaddss xmm1, xmm0, DWORD PTR [r8+r10*4] ;4.5
snip snip snip
L5::                                  ; optimization report
                                      ; LOOP WAS VECTORIZED
                                      ; VECTORIZATION HAS UNALIGNED MEMORY REFERENCES
                                      ; VECTORIZATION SPEEDUP COEFFECIENT 8.398438
$LN46:
        vaddps ymm1, ymm0, YMMWORD PTR [r8+r9*4] ;4.5
snip snip snip
L6::                                  ; optimization report
                                      ; LOOP WAS VECTORIZED
                                      ; REMAINDER LOOP FOR VECTORIATION
                                      ; VECTORIZATION HAS UNALIGNED MEMORY REFERENCES
                                      ; VECTORIZATION SPEEDUP COEFFECIENT 2.449219
$LN78:
        add r10, 4 ;3.3
snip snip snip
L7::                                  ; optimization report
                                      ; REMAINDER LOOP FOR VECTORIATION
$LN93:
        inc rax ;3.3
-qopt-report-phase=vec -qopt-report=N (Linux and OS X)
/Qopt-report-phase:vec /Qopt-report:N (Windows)
N specifies the level of detail; default N=2 if N omitted
Level 0: No vectorization report
Level 1: Reports when vectorization has occurred.
Level 2: Adds diagnostics why vectorization did not occur.
Level 3: Adds vectorization loop summary diagnostics.
Level 4: Additional detail, e.g. on data alignment
Level 5: Adds detailed data dependency information
Vectorization – report levels
$ icc -c -opt-report=4 -opt-report-phase=loop,vec -opt-report-file=stderr foo.c

Begin optimization report for: foo

Report from: Loop nest & Vector optimizations [loop, vec]

LOOP BEGIN at foo.c(4,3)
Multiversioned v1
   remark #25231: Loop multiversioned for Data Dependence
   remark #15135: vectorization support: reference theta has unaligned access
   remark #15135: vectorization support: reference sth has unaligned access
   remark #15127: vectorization support: unaligned access used inside loop body
   remark #15145: vectorization support: unroll factor set to 2
   remark #15164: vectorization support: number of FP up converts: single to double precision 1
   remark #15165: vectorization support: number of FP down converts: double to single precision 1
   remark #15002: LOOP WAS VECTORIZED
   remark #36066: unmasked unaligned unit stride loads: 1
   remark #36067: unmasked unaligned unit stride stores: 1
   …. (loop cost summary) ….
   remark #25018: Estimate of max trip count of loop=32
LOOP END

LOOP BEGIN at foo.c(4,3)
Multiversioned v2
   remark #15006: loop was not vectorized: non-vectorizable loop instance from multiversioning
LOOP END
===========================================================================
Actionable Messages Example (1)
#include <math.h>
void foo (float * theta, float * sth) {
  int i;
  for (i = 0; i < 128; i++)
    sth[i] = sin(theta[i]+3.1415927);
}
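Remarks #15164 and #15165 in the report point at something a user can act on: the double-precision constant and the double-precision sin() force each element to be widened to double and narrowed back to float. One possible response, sketched here in C++ and not taken from the slides, is to keep the whole expression in single precision. The function names foo_double and foo_float and the length parameter n are hypothetical; whether the change pays off should be checked by regenerating the report.

```cpp
#include <cmath>
#include <cstddef>

// Same computation as the slide's foo(): the double constant and the double
// overload of sin promote each element to double, then the store narrows it.
void foo_double(const float* theta, float* sth, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        sth[i] = static_cast<float>(std::sin(theta[i] + 3.1415927));
}

// Single-precision variant: float literal plus the float overload of std::sin,
// so no up or down conversion is needed inside the loop.
void foo_float(const float* theta, float* sth, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        sth[i] = std::sin(theta[i] + 3.1415927f);
}
```

The two versions are not bit-identical, since the float variant rounds the argument earlier, so this change is only appropriate when single-precision accuracy is acceptable.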
Optimization Reports
Windows Optimization Report integrations in Visual Studio
Compiler Optimization Report Tool Windows
Hierarchically Presented Report
Jump to source position by double click
Jump to call site by click
Filter messages by context scope
Filter messages by optimization phase
Filter messages by any keyword
Compiler Flags of Interest in Microsoft* Visual Studio* Project Properties
Intel C++ integration
Intel Visual Fortran integration
Optimization Notes in Text Editor
Optimization notes at callee site
Report content
Jump to call site by click
Click ‘?’ to get help for message
Optimization notes at call site
Jump to callee site by click
Inlining options
Show report for my code only
Filter by any keyword
Use text editor context menu for advanced scenarios
Compiler Inline Report Tool Window
Jump to source position by click
Routine size and increase in size due to inlining
Reason why not inlined
Details on how to force inlining
Optimization Reports Options (Tools->Options)
• Video recording of the webinar “Getting the most out of your compiler with the new Optimization Reports”
• https://software.intel.com/en-us/videos/getting-the-most-out-of-the-intel-compiler-with-new-optimization-reports
• Slides for “Getting the most out of your compiler with the new Optimization Reports”
• https://software.intel.com/sites/default/files/managed/55/b1/new-compiler-optimization-reports.pdf
More Information on Optimization Reports
OpenMP 4.0
What’s new in Composer XE 2015?
• Everything in 4.0 now supported except for user-defined reductions
• CANCEL directive
Requests cancellation of the innermost enclosing region
• CANCELLATION POINT directive
Defines a point at which implicit or explicit tasks check to see if cancellation has been requested
• DEPEND clause on TASK directive
Enforces additional constraints on the scheduling of a task by enabling dependences between sibling tasks in the task region.
• Combined constructs (TEAMS DISTRIBUTE, etc.)
OpenMP* 4.0 Support
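As an illustration of the DEPEND clause on the TASK directive, here is a minimal C++ sketch, assuming an OpenMP 4.0 capable compiler with OpenMP enabled. The function name pipeline_sum and the variables a, b and c are hypothetical. The third task names a and b as inputs, so the runtime may not start it until the two sibling tasks that produce them have finished.

```cpp
// Three sibling tasks created by one thread. The depend clauses order them:
// the task computing c waits for the tasks that write a and b.
int pipeline_sum() {
    int a = 0, b = 0, c = 0;
    #pragma omp parallel
    #pragma omp single
    {
        #pragma omp task depend(out: a) shared(a)
        a = 1;

        #pragma omp task depend(out: b) shared(b)
        b = 2;

        #pragma omp task depend(in: a, b) depend(out: c) shared(a, b, c)
        c = a + b;

        #pragma omp taskwait  // wait for all three child tasks
    }
    return c;
}
```

If the code is built without OpenMP the pragmas are ignored and the statements simply run in program order, which gives the same result.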
• Fortran WORKSHARE can go parallel (sometimes)
Simple array assignments such as A = B + C parallelize.
Simple array assignments with overlap such as A = A + B + C parallelize.
Array assignments with user-defined function calls parallelize such as A = A + F (B). F must be ELEMENTAL.
Array assignments with array slices on the right hand side of the assignment such as A = A + B(1:4) + C(1:4) parallelize. If the lower bound of the left hand side or the array slice lower bound or the array slice stride on the right hand side is not 1, then the statement does not parallelize.
Assigning into array slices does not parallelize.
Scalar assignments do not parallelize – there is no work that needs to be done in parallel.
FORALL and WHERE constructs do not parallelize.
New OpenMP* Support
New C++ Features
What’s coming in ICC v15.0
-ansi-alias enabled by default at -O2 or -O3 on Linux* (gcc* enables -fstrict-aliasing at -O2 and -O3)
-ansi-alias asserts that code follows ANSI aliasing rules, allowing the compiler to make more aggressive optimizations
This can improve performance if the assertion is true
If it is not true, this can result in bad behavior
Alias checking at compile time may catch this
But it may not, resulting in bad run-time results
If in doubt, use -no-ansi-alias to disable
With gcc, use -fno-strict-aliasing
-ansi-alias now default on Linux* C++
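To make the aliasing rule concrete, here is a small C++ sketch of a common violation and a conforming alternative. It is an illustration and not taken from the slides; the function name bits_by_memcpy is hypothetical, and the example assumes a 32-bit IEEE 754 float.

```cpp
#include <cstdint>
#include <cstring>

// A pattern that violates the aliasing rules (shown as a comment only):
//
//     std::uint32_t u = *reinterpret_cast<std::uint32_t*>(&f);  // f is a float
//
// It reads a float object through an unrelated integer type. With -ansi-alias
// (or gcc's -fstrict-aliasing) the compiler may assume this never happens.

// Conforming alternative: copy the object representation byte by byte.
inline std::uint32_t bits_by_memcpy(float f) {
    static_assert(sizeof(std::uint32_t) == sizeof(float),
                  "this sketch assumes a 32-bit float");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}
```

Code that relies on the commented-out pattern is the kind that may need -no-ansi-alias until it is rewritten.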
Intel® C++ version 15.0 completes all C++11 standard language features!
virtual overrides
inheriting constructors, i.e.:
struct Derived : Base { using Base::Base; };
deprecation of exception specifications
user defined literals
thread_local (C++11 semantics) (Linux only)
C++11 library features are dependent on the support provided by the standard C++ library on the platform:
Windows*: msvcrt/libcmt, Linux*: libstdc++, OS X*: libc++/libstdc++
C++11 features (COMPLETE)
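A few of the listed language features can be seen together in a short sketch. The example below is illustrative only: the types Base and Derived, the member names, and the _kib literal suffix are hypothetical, and it should compile with any C++11 compiler when C++11 mode is enabled.

```cpp
struct Base {
    explicit Base(int v) : value(v) {}
    virtual ~Base() {}
    virtual int scaled() const { return value; }
    int value;
};

struct Derived : Base {
    using Base::Base;  // inheriting constructors: Derived(int) comes from Base

    // "override" makes the compiler reject this declaration if it does not
    // actually override a virtual function of Base.
    int scaled() const override { return 2 * value; }
};

// User-defined literal: the suffix _kib converts kibibytes to bytes.
constexpr unsigned long long operator""_kib(unsigned long long n) {
    return n * 1024ULL;
}
```

For example, Derived d(21) uses the inherited constructor, a call to scaled() through a Base reference returns 42, and 3_kib evaluates to 3072.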
• To enable C++11 support you need to use the -std=c++11 (or -std=c++0x) option
• We currently support all C++11 features used in the GNU 4.8 versions of the headers that are enabled when you use the option
• Depending upon the GNU on your system (i.e. g++ in your PATH) you may get different features enabled
• Support of C++11 features requires support from C++ header files included with GNU C/C++ installation – these features vary by version.
• Toolchain for Intel® Many Integrated Core Architecture specifically has issues. See release notes for details.
• Recommend use of GNU 4.8 or newer packages
GNU* Compatibility
Microsoft* implementation of C++11 features in latest Microsoft Visual Studio* version 2013
C++11 support in VS2012 and older versions depends on the features Microsoft included in their C++ header files
No special command line switch or feature macro used in standard lib headers to access C++11 functionality
Intel® C++ compiler is compatible by default (i.e. whatever C++11 features are provided by the reference Microsoft compiler are available with Intel Windows* compiler)
To get additional C++11 functionality with our compiler use Intel-specific /Qstd=c++0x or /Qstd=c++11 switch
Intel C++ Composer XE 2015 is fully compatible with the Microsoft Visual C++ 2013 with respect to C++11 functionality
Microsoft* compatibility/Microsoft* standard library
New Fortran Features
What’s in IFORT v15.0
• Full language Fortran 2003 support
Parameterized Derived Types
Expanded support of intrinsics in specification and constant expressions
• BLOCK from Fortran 2008
Fortran Standard Features
• Allows programmer to create a template for a type that can have KIND and length parameters deferred
• KIND type parameters are compile-time constants, length parameters can be run-time. Example:
TYPE humongous_matrix(k, d)
  INTEGER, KIND :: k = kind(0.0)
  INTEGER(selected_int_kind(12)), LEN :: d
  REAL(k) :: element(d,d)
END TYPE

TYPE(humongous_matrix(8,10000000)) :: giant
F03: Parameterized Derived Types
• Fortran 2003 relaxed the rules about which intrinsics could appear in specification and constant (formerly initialization) expressions
• Fortran 2008 relaxes them even more
• 14.0 doesn’t support all of the F2003-allowed intrinsics here, such as MERGE
• Inquiry functions from IEEE intrinsic modules are supported
Intrinsics in Specification and Constant Expressions
• An executable construct that may contain declarations
• Variables declared within the construct are local to that scope
• No COMMON, EQUIVALENCE, NAMELIST, IMPLICIT
• SAVE allowed, local to that construct
• SAVE in outer scope does not affect BLOCK
• Labels and formats are not local
• Useful with DO CONCURRENT for threadlocals
F08: BLOCK Construct
IF (swaxpy) THEN
  BLOCK
    REAL(KIND(x)) tmp
    tmp = x
    x = y
    y = tmp
  END BLOCK
END IF
BLOCK Example – a temporary variable
DO CONCURRENT (I = 1:N)
  BLOCK
    REAL T
    T = A(I) + B(I)
    C(I) = T + SQRT(T)
  END BLOCK
END DO
Without BLOCK, no way to create an iteration-local (threadprivate) temporary variable
F08: DO CONCURRENT with BLOCK
• Dynamic buffering of I/O for better performance of both short and long records
Can be controlled with FORT_BUFFERING_THRESHOLD environment variable
• MIC OFFLOAD changes
OFFLOAD SIGNAL argument can now be an expression (was constant only)
Non-contiguous array slice ok for OFFLOAD
OFFLOAD INOUT clause should allow a record with allocatable fields, and allocatable objects that are fields, e.g., INOUT (REC, REC%ALLOC)
Other New Features
• /Qinit:keyword or -init=keyword ([no]arrays, [no]snan, [no]zero)
Initializes static local variables to signaling NaN or zero (replaces /Qzero)
Default is to initialize scalars only
/Qsave also needed, typically
Only REAL and COMPLEX can be sNAN initialized
No EQUIVALENCE, no derived types, no automatic or allocatable variables
Sets -fpe0; warns if -fpe3 is explicitly specified
New Compiler Options
• /Qopt-dynamic-align[-] (-q[no-]opt-dynamic-align)
Allows user to turn off conditional code paths based on alignment, for better run-to-run reproducibility without larger performance impact of other options
• /Qprof-gen:[no]threadsafe
Accurately collect block counts while multiple threads are updating the same counter.
• -f[no-]fat-lto-objects (Linux only)
Determines whether a fat link-time optimization (LTO) object, containing both intermediate language and object code, is generated during an -ipo compile
• -xcore-avx512 coming in 15.0 Update 1 (fall 2014)
New Compiler Options
Other new features – the little things
Intel® Math Kernel Library (Intel® MKL) 11.2
• Intel® MKL Parallel Direct Sparse Solver for Clusters (CPARDISO), a distributed-memory version of the Intel MKL PARDISO direct sparse solver. CPARDISO solves a system Ax=b on a many-core cluster, where A is a sparse square matrix.
• Verbose mode support for BLAS and LAPACK domains, which allows echoing of input parameters on calls to Intel® MKL. By setting the environment variable MKL_VERBOSE to 1 or by calling the function mkl_verbose(1), every call of a verbose-enabled function prints a log containing information like version, name of function, value of arguments, etc.
• The Intel® Math Kernel Library Cookbook is a new document with recipes for assembling Intel® MKL routines for solving complex problems. (http://software.intel.com/en-us/mkl_cookbook)
• Improved ?GEMM performance on small matrices.
Intel® Integrated Performance Primitives 8.1
• Support for Intel® Xeon Phi™ coprocessors
New Libraries Features
• DWARF3 is now the default debug format
• -stdlib=libc++ default on OS X*
• Specific GNU standard C headers provided with compiler
• -fast and -Ofast now imply /fp:fast=2 or -fp-model fast=2
• __intel_simd_lane() intrinsic for SIMD enabled functions
• gcc-compatible function multiversioning
• Microsoft vectorcall calling convention supported
• New aligned_new header for C++11 type alignment
• /Qcheck-pointers-narrowing-, -no-check-pointers-narrowing to relax Pointer Checker analysis of struct fields
• Improved debugging information for C++11 lambda functions
• #pragma offload now permits non-contiguous data
• _Simd, _Safelen, and _Reduction keywords for explicit vectorization
• /Qno-builtin-<name> option added for Windows* to disable intrinsic functions by name
Other compiler changes
• Arithmetic or logical operators usable for SIMD data types (like __m128)
• -fno-fat-lto-objects to separate IPO-generated IL from object
• inline-max-per-routine, inline-max-total-size pragmas to control inlining per function
• INTEL_PROF_DYN_PREFIX environment variable to specify custom .dyn filename prefix for PGO profiles
• /Qprof-gen:threadsafe or -prof-gen=threadsafe to provide threadsafe PGO instrumentation
• Inlining limit diagnostic remarks added
Other compiler changes
• Base Platform Toolset Visual Studio project property added to specify Visual Studio toolset to use with Intel compiler
• Improved Fortran IDE module/procedure navigation
• Visual Studio integration change to allow selection of which MPI to use for Cluster Intel® Math Kernel Library builds
• Intel® Performance Guide improves analysis of applications with flat performance profiles
• gdb* provides support for Intel® Cilk™ Plus threading and Fortran
• Online install option to create customized packages for offline installation of only those subcomponents selected
IDE/Debugger/Install Changes
Legal Disclaimer & Optimization Notice
INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
Copyright © 2014, Intel Corporation. All rights reserved. Intel, Pentium, Xeon, Xeon Phi, Core, VTune, Cilk, and the Intel logo are trademarks of Intel Corporation in the U.S. and other countries.
Optimization Notice
Intel’s compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804
Name Changes, 2015 version decoder