White Paper Performance Report PRIMERGY RX2540 M2
http://ts.fujitsu.com/primergy Page 1 (52)
White Paper FUJITSU Server PRIMERGY Performance Report PRIMERGY RX2540 M2
This document contains a summary of the benchmarks executed for the FUJITSU Server PRIMERGY RX2540 M2.
The PRIMERGY RX2540 M2 performance data are compared with the data of other PRIMERGY models and discussed. In addition to the benchmark results, an explanation has been included for each benchmark and for the benchmark environment.
Version
1.1
2016-07-21
White Paper Performance Report PRIMERGY RX2540 M2 Version: 1.1 2016-07-21
Page 2 (52) http://ts.fujitsu.com/primergy
Contents
Document history
Version 1.0 (2016-05-04)
New:
Technical data
SPECcpu2006: Measurements with Intel® Xeon® Processor E5-2600 v4 Product Family
SPECpower_ssj2008: Measurement with Xeon E5-2699 v4
Disk I/O: Performance of RAID controllers: Measurements with “LSI SW RAID on Intel C610 (Onboard SATA)”, “PRAID CP400i”, “PRAID EP400i” and “PRAID EP420i” controllers
SAP SD: Certification number 2016004
OLTP-2: Results for Intel® Xeon® Processor E5-2600 v4 Product Family
TPC-E: Measurement with Xeon E5-2699 v4
VMmark V2: “Performance Only” measurement with Xeon E5-2699 v4; “Performance with Server Power” measurement with Xeon E5-2699 v4
STREAM: Measurements with Intel® Xeon® Processor E5-2600 v4 Product Family
Version 1.1 (2016-07-21)
New:
vServCon: Results for Intel® Xeon® Processor E5-2600 v4 Product Family
Updated:
SPECcpu2006: Additional measurements with Intel® Xeon® Processor E5-2600 v4 Product Family
Document history
Technical data
SPECcpu2006
SPECpower_ssj2008
Disk I/O: Performance of RAID controllers
SAP SD
OLTP-2
TPC-E
vServCon
VMmark V2
STREAM
Literature
Contact
Technical data
Decimal prefixes according to the SI standard are used for measurement units in this white paper (e.g. 1 GB = 10^9 bytes). In contrast, these prefixes should be interpreted as binary prefixes (e.g. 1 GB = 2^30 bytes) for the capacities of caches and memory modules. Separate reference will be made to any further exceptions where applicable.
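The difference between the two conventions can be illustrated with a short Python sketch. The helper names are hypothetical and chosen only for this illustration; the 24 × 64 GB example corresponds to the maximum memory configuration of 1536 GB listed in the technical data.

```python
# Decimal (SI) vs. binary interpretation of the "GB" prefix.
SI_GB = 10**9       # used for measurement units in general
BINARY_GB = 2**30   # used for capacities of caches and memory modules


def si_gb_to_bytes(gb):
    """Bytes for a capacity given with the decimal (SI) prefix."""
    return gb * SI_GB


def binary_gb_to_bytes(gb):
    """Bytes for a capacity given with the binary prefix."""
    return gb * BINARY_GB


# 24 memory slots populated with 64 GB modules (binary prefix) = 1536 GB
max_memory_bytes = binary_gb_to_bytes(24 * 64)
print(max_memory_bytes)            # size in bytes
print(max_memory_bytes / SI_GB)    # the same size expressed with the SI prefix
```

Expressed with the SI prefix, the same memory configuration yields a number that is about 7% larger, which is why the convention in use matters when comparing capacities.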
Model PRIMERGY RX2540 M2
Model versions
PY RX2540 M2 4x 3.5' expandable
PY RX2540 M2 12x 3.5'
PY RX2540 M2 8x 2.5' expandable
PY RX2540 M2 24x 2.5'
Form factor Rack server
Chipset Intel® C612
Number of sockets 2
Number of processors orderable 1 or 2
Processor type Intel® Xeon
® Processor E5-2600 v4 Product Family
Number of memory slots 24 (12 per processor)
Maximum memory configuration 1536 GB
Onboard HDD controller Controller with RAID 0, RAID 1 or RAID 10 for up to 8 SATA HDDs
PCI slots 3 × PCI-Express 3.0 x8, 3 × PCI-Express 3.0 x16
Max. number of internal hard disks
PY RX2540 M2 4x 3.5' expandable: 8 × 3.5" + 4 × 2.5"
PY RX2540 M2 12x 3.5': 12 × 3.5" + 4 × 2.5"
PY RX2540 M2 8x 2.5' expandable: 28 × 2.5"
PY RX2540 M2 24x 2.5': 28 × 2.5"
PRIMERGY RX2540 M2
Processors (since system release)
Processor         Cores  Threads  Cache  QPI Speed  Rated Frequency  Max. Turbo Frequency  Max. Memory Frequency  TDP
                                  [MB]   [GT/s]     [GHz]            [GHz]                 [MHz]                  [Watt]
Xeon E5-2623 v4   4      8        10     8.00       2.60             3.20                  2133                   85
Xeon E5-2637 v4   4      8        15     9.60       3.50             3.70                  2400                   135
Xeon E5-2603 v4   6      6        15     6.40       1.70             n/a                   1866                   85
Xeon E5-2643 v4   6      12       20     9.60       3.40             3.70                  2400                   135
Xeon E5-2609 v4   8      8        20     6.40       1.70             n/a                   1866                   85
Xeon E5-2620 v4   8      16       20     8.00       2.10             3.00                  2133                   85
Xeon E5-2667 v4   8      16       25     9.60       3.20             3.60                  2400                   135
Xeon E5-2630L v4  10     20       25     8.00       1.80             2.90                  2133                   55
Xeon E5-2630 v4   10     20       25     8.00       2.20             3.10                  2133                   85
Xeon E5-2640 v4   10     20       25     8.00       2.40             3.40                  2133                   90
Xeon E5-2650 v4   12     24       30     9.60       2.20             2.90                  2400                   105
Xeon E5-2650L v4  14     28       35     9.60       1.70             2.50                  2400                   65
Xeon E5-2660 v4   14     28       35     9.60       2.00             3.20                  2400                   105
Xeon E5-2680 v4   14     28       35     9.60       2.40             3.30                  2400                   120
Xeon E5-2690 v4   14     28       35     9.60       2.60             3.50                  2400                   135
Xeon E5-2683 v4   16     32       40     9.60       2.10             3.00                  2400                   120
Xeon E5-2697A v4  16     32       40     9.60       2.60             3.60                  2400                   145
Xeon E5-2695 v4   18     36       45     9.60       2.10             3.30                  2400                   120
Xeon E5-2697 v4   18     36       45     9.60       2.30             3.60                  2400                   145
Xeon E5-2698 v4   20     40       50     9.60       2.20             3.60                  2400                   135
Xeon E5-2699 v4   22     44       55     9.60       2.20             3.60                  2400                   145
All the processors that can be ordered with the PRIMERGY RX2540 M2, apart from Xeon E5-2603 v4 and Xeon E5-2609 v4, support Intel® Turbo Boost Technology 2.0. This technology allows you to operate the processor with higher frequencies than the nominal frequency. The "Max. Turbo Frequency" listed in the processor table is the theoretical maximum frequency with only one active core per processor. The maximum frequency that can actually be achieved depends on the number of active cores, the current consumption, the electrical power consumption and the temperature of the processor.
As a matter of principle Intel does not guarantee that the maximum turbo frequency will be reached. This is related to manufacturing tolerances, which result in a variance in performance between individual specimens of a processor model. The range of the variance covers the entire scope between the nominal frequency and the maximum turbo frequency.
The turbo functionality can be set via a BIOS option. Fujitsu generally recommends leaving the "Turbo Mode" option set at the standard setting "Enabled", as performance is substantially increased by the higher frequencies. However, since the higher frequencies depend on general conditions and are not always guaranteed, it can be advantageous to disable the "Turbo Mode" option for application scenarios with intensive use of AVX instructions and a high number of instructions per clock cycle, as well as for those that require constant performance or lower electrical power consumption.
Memory modules (since system release)
Memory module                         Capacity [GB]  Ranks  Bit width of the memory chips  Frequency [MHz]  Load reduced  Registered  ECC
8GB (1x8GB) 1Rx4 DDR4-2400 R ECC      8              1      4                              2400                           ✓           ✓
8GB (1x8GB) 2Rx8 DDR4-2400 R ECC      8              2      8                              2400                           ✓           ✓
16GB (1x16GB) 2Rx4 DDR4-2400 R ECC    16             2      4                              2400                           ✓           ✓
16GB (1x16GB) 2Rx8 DDR4-2400 R ECC    16             2      8                              2400                           ✓           ✓
32GB (1x32GB) 2Rx4 DDR4-2400 R ECC    32             2      4                              2400                           ✓           ✓
64GB (1x64GB) 4Rx4 DDR4-2400 LR ECC   64             4      4                              2400             ✓                         ✓
Power supplies (since system release) Max. number
Modular PSU 450W platinum hp 2
Modular PSU 800W platinum hp 2
Modular PSU 800W titanium hp 2
Modular PSU 1200W platinum hp 2
Some components may not be available in all countries or sales regions.
Detailed technical information is available in the data sheet PRIMERGY RX2540 M2.
SPECcpu2006
Benchmark description
SPECcpu2006 is a benchmark which measures the system efficiency with integer and floating-point operations. It consists of an integer test suite (SPECint2006) containing 12 applications and a floating-point test suite (SPECfp2006) containing 17 applications. Both test suites are extremely computing-intensive and concentrate on the CPU and the memory. Other components, such as Disk I/O and network, are not measured by this benchmark.
SPECcpu2006 is not tied to a special operating system. The benchmark is available as source code and is compiled before the actual measurement. The compiler version used and its optimization settings also affect the measurement result.
SPECcpu2006 contains two different performance measurement methods: the first method (SPECint2006 or SPECfp2006) determines the time which is required to process a single task. The second method (SPECint_rate2006 or SPECfp_rate2006) determines the throughput, i.e. the number of tasks that can be handled in parallel. Both methods are also divided into two measurement runs, “base” and “peak”, which differ in the use of compiler optimization. When publishing the results the base values are always used; the peak values are optional.
Benchmark              Arithmetics     Type  Compiler optimization  Measurement result  Application
SPECint2006            integer         peak  aggressive             Speed               single-threaded
SPECint_base2006       integer         base  conservative           Speed               single-threaded
SPECint_rate2006       integer         peak  aggressive             Throughput          multi-threaded
SPECint_rate_base2006  integer         base  conservative           Throughput          multi-threaded
SPECfp2006             floating point  peak  aggressive             Speed               single-threaded
SPECfp_base2006        floating point  base  conservative           Speed               single-threaded
SPECfp_rate2006        floating point  peak  aggressive             Throughput          multi-threaded
SPECfp_rate_base2006   floating point  base  conservative           Throughput          multi-threaded
The measurement results are the geometric average of normalized ratio values which have been determined for the individual benchmarks. The geometric average, in contrast to the arithmetic average, means that there is a weighting in favour of the lower individual results. Normalized means that the measurement states how fast the test system is compared to a reference system. Value “1” was defined for the SPECint_base2006, SPECint_rate_base2006, SPECfp_base2006 and SPECfp_rate_base2006 results of the reference system. For example, a SPECint_base2006 value of 2 means that the measuring system has handled this benchmark twice as fast as the reference system. A SPECfp_rate_base2006 value of 4 means that the measuring system has handled this benchmark some 4/[# base copies] times faster than the reference system. “# base copies” specifies how many parallel instances of the benchmark have been executed.
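The way a speed result is formed from normalized ratios can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the SPEC tooling itself; the function names and the run times are hypothetical and do not come from any measurement in this report.

```python
import math


def normalized_ratio(reference_seconds, measured_seconds):
    """How many times faster the test system is than the reference system."""
    return reference_seconds / measured_seconds


def geometric_mean(ratios):
    """Geometric average of positive ratio values."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# Hypothetical run times in seconds for three individual benchmarks
reference_times = [9000.0, 8000.0, 10000.0]
measured_times = [150.0, 400.0, 100.0]

ratios = [normalized_ratio(r, m) for r, m in zip(reference_times, measured_times)]
score = geometric_mean(ratios)
arithmetic = sum(ratios) / len(ratios)

print(f"ratios: {ratios}")
print(f"geometric average:  {score:.1f}")
print(f"arithmetic average: {arithmetic:.1f}")
```

With these example values the geometric average is lower than the arithmetic average, which shows the weighting in favour of the lower individual results.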
Not every SPECcpu2006 measurement is submitted by us for publication at SPEC. This is why the SPEC web pages do not have every result. As we archive the log files for all measurements, we can prove the correct implementation of the measurements at any time.
Benchmark environment
System Under Test (SUT)
Hardware
Model PRIMERGY RX2540 M2
Processor 2 processors of Intel® Xeon® Processor E5-2600 v4 Product Family
Memory 16 × 16GB (1x16GB) 2Rx4 DDR4-2400 R ECC
Software
BIOS settings Energy Performance = Performance
Utilization Profile = Unbalanced
CPU C1E Support = Disabled
SPECint_base2006, SPECint2006:
Package C State limit = C6 (Retention)
COD Enable = Disabled
Early Snoop = Disabled
Home Snoop Dir OSB = Enabled
SPECfp_base2006, SPECfp2006:
Package C State limit = C6 (Retention)
COD Enable = Disabled
Early Snoop = Disabled
Home Snoop Dir OSB = Disabled
SPECint_rate_base2006, SPECint_rate2006, SPECfp_rate_base2006, SPECfp_rate2006:
Package C State limit = C0
Xeon E5-2603 v4, E5-2609 v4, E5-2620 v4, E5-2623 v4, E5-2630 v4, E5-2630L v4, E5-2637 v4, E5-2640 v4, E5-2643 v4, E5-2667 v4:
COD Enable = Disabled
Early Snoop = Enabled
Home Snoop Dir OSB = Disabled
All others:
COD Enable = Enabled
Early Snoop = Disabled
Home Snoop Dir OSB = Disabled
Operating system SUSE Linux Enterprise Server 12 SP1 (x86_64)
Operating system settings
echo always > /sys/kernel/mm/transparent_hugepage/enabled
Compiler
SPECint_base2006, SPECint2006, SPECint_rate_base2006, SPECint_rate2006:
C/C++: Version 16.0.0.101 of Intel C++ Studio XE for Linux
SPECfp_base2006, SPECfp2006, SPECfp_rate_base2006, SPECfp_rate2006:
C/C++: Version 16.0.0.101 of Intel C++ Studio XE for Linux
Fortran: Version 16.0.0.101 of Intel Fortran Studio XE for Linux
Some components may not be available in all countries or sales regions.
Benchmark results
In terms of processors the benchmark result depends primarily on the size of the processor cache, the support for Hyper-Threading, the number of processor cores and on the processor frequency. In the case of processors with Turbo mode the number of cores, which are loaded by the benchmark, determines the maximum processor frequency that can be achieved. In the case of single-threaded benchmarks, which largely load one core only, the maximum processor frequency that can be achieved is higher than with multi-threaded benchmarks.
Processor         Number of processors  SPECint_base2006  SPECint2006  SPECint_rate_base2006  SPECint_rate2006
Xeon E5-2623 v4   2                     56.1              58.2         376                    396
Xeon E5-2637 v4   2                     65.0              67.8         479                    504
Xeon E5-2603 v4   2                     34.6              35.9         311                    324
Xeon E5-2643 v4   2                     67.9              71.4         702                    739
Xeon E5-2609 v4   2                     35.7              37.2         413                    431
Xeon E5-2620 v4   2                     57.0              59.7         638                    671
Xeon E5-2667 v4   2                     68.9              72.4         899                    944
Xeon E5-2630L v4  2                     56.4              59.0         693                    730
Xeon E5-2630 v4   2                     60.1              63.0         810                    850
Xeon E5-2640 v4   2                     64.6              67.9         859                    900
Xeon E5-2650 v4   2                     57.5              60.0         999                    1050
Xeon E5-2650L v4  2                     51.4              53.2         950                    1000
Xeon E5-2660 v4   2                     63.0              65.4         1120                   1170
Xeon E5-2680 v4   2                     65.4              68.0         1260                   1320
Xeon E5-2690 v4   2                     68.6              71.5         1330                   1390
Xeon E5-2683 v4   2                     61.3              63.1         1310                   1370
Xeon E5-2697A v4  2                     70.8              73.0         1470                   1530
Xeon E5-2695 v4   2                     65.7              67.6         1430                   1490
Xeon E5-2697 v4   2                                                    1510                   1570
Xeon E5-2698 v4   2                                                    1600                   1670
Xeon E5-2699 v4   2                     71.2              72.9         1750                   1820
Processor         Number of processors  SPECfp_base2006  SPECfp2006  SPECfp_rate_base2006  SPECfp_rate2006
Xeon E5-2623 v4   2                     90.7             94.6        357                   363
Xeon E5-2637 v4   2                     108              112         442                   451
Xeon E5-2603 v4   2                     66.0             67.8        350                   357
Xeon E5-2643 v4   2                     117              121         609                   621
Xeon E5-2609 v4   2                     68.9             70.9        438                   448
Xeon E5-2620 v4   2                     100              106         573                   585
Xeon E5-2667 v4   2                     124              128         734                   750
Xeon E5-2630L v4  2                     97.6             104         608                   622
Xeon E5-2630 v4   2                     105              111         671                   687
Xeon E5-2640 v4   2                     111              117         695                   711
Xeon E5-2650 v4   2                     107              112         805                   824
Xeon E5-2650L v4  2                     89.7             95.5        752                   771
Xeon E5-2660 v4   2                     111              117         869                   890
Xeon E5-2680 v4   2                     116              122         923                   948
Xeon E5-2690 v4   2                     118              125         933                   959
Xeon E5-2683 v4   2                     109              115         939                   965
Xeon E5-2697A v4  2                     118              124         994                   1020
Xeon E5-2695 v4   2                     110              117         978                   1010
Xeon E5-2697 v4   2                     119              126         1020                  1050
Xeon E5-2698 v4   2                     118              125         1050                  1090
Xeon E5-2699 v4   2                     116              124         1100                  1130
On 31st March 2016 the PRIMERGY RX2540 M2 with two Xeon E5-2699 v4 processors was ranked first in the 2-socket systems category for the benchmark SPECfp_rate_base2006. The result can be found at http://www.fujitsu.com/fts/products/computing/servers/primergy/benchmarks/rx2540m2/.
The following four diagrams illustrate the throughput of the PRIMERGY RX2540 M2 in comparison to its predecessor PRIMERGY RX2540 M1, in their respective most performant configuration.
SPECcpu2006: integer performance PRIMERGY RX2540 M2 vs. PRIMERGY RX2540 M1
Diagram 1 (SPECint_base2006 / SPECint2006): PRIMERGY RX2540 M1, 2 × Xeon E5-2643 v3: 64.0 / 67.5; PRIMERGY RX2540 M2, 2 × Xeon E5-2699 v4: 71.2 / 72.9
Diagram 2 (SPECint_rate_base2006 / SPECint_rate2006): PRIMERGY RX2540 M1, 2 × Xeon E5-2699 v3: 1370 / 1410; PRIMERGY RX2540 M2, 2 × Xeon E5-2699 v4: 1750 / 1820
SPECcpu2006: floating-point performance PRIMERGY RX2540 M2 vs. PRIMERGY RX2540 M1
Diagram 3 (SPECfp_base2006 / SPECfp2006): PRIMERGY RX2540 M1, 2 × Xeon E5-2667 v3: 115 / 119; PRIMERGY RX2540 M2, 2 × Xeon E5-2667 v4: 124 / 128
Diagram 4 (SPECfp_rate_base2006 / SPECfp_rate2006): PRIMERGY RX2540 M1, 2 × Xeon E5-2699 v3: 917 / 946; PRIMERGY RX2540 M2, 2 × Xeon E5-2699 v4: 1100 / 1130
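The relative gain of the PRIMERGY RX2540 M2 over the PRIMERGY RX2540 M1 can be derived from the data labels of the four diagrams. The following Python sketch assumes that the labels pair up as (base, peak) per system, which agrees with the PRIMERGY RX2540 M2 values in the result tables; the dictionary layout and the function name are chosen only for this illustration.

```python
# (base, peak) values read from the data labels of the four diagrams
results = {
    "SPECint2006 (speed)": {"M1": (64.0, 67.5), "M2": (71.2, 72.9)},
    "SPECint_rate2006": {"M1": (1370, 1410), "M2": (1750, 1820)},
    "SPECfp2006 (speed)": {"M1": (115, 119), "M2": (124, 128)},
    "SPECfp_rate2006": {"M1": (917, 946), "M2": (1100, 1130)},
}


def gain_percent(old, new):
    """Relative improvement of 'new' over 'old' in percent."""
    return (new / old - 1.0) * 100.0


for name, systems in results.items():
    base_gain = gain_percent(systems["M1"][0], systems["M2"][0])
    peak_gain = gain_percent(systems["M1"][1], systems["M2"][1])
    print(f"{name}: base +{base_gain:.1f}%, peak +{peak_gain:.1f}%")
```

Note that the speed diagrams compare the respective fastest configurations, so the processor models differ between the two systems in the integer speed comparison.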
SPECpower_ssj2008
Benchmark description
SPECpower_ssj2008 is the first industry-standard SPEC benchmark that evaluates the power and performance characteristics of a server. With SPECpower_ssj2008 SPEC has defined standards for server power measurements in the same way they have done for performance.
The benchmark workload represents typical server-side Java business applications. The workload is scalable, multi-threaded, portable across a wide range of platforms and easy to run. The benchmark tests CPUs, caches, the memory hierarchy and scalability of symmetric multiprocessor systems (SMPs), as well as the implementation of Java Virtual Machine (JVM), Just In Time (JIT) compilers, garbage collection, threads and some aspects of the operating system.
SPECpower_ssj2008 reports power consumption for servers at different performance levels — from 100% to “active idle” in 10% segments — over a set period of time. The graduated workload recognizes the fact that processing loads and power consumption on servers vary substantially over the course of days or weeks. To compute a power-performance metric across all levels, measured transaction throughputs for each segment are added together and then divided by the sum of the average power consumed for each segment. The result is a figure of merit called “overall ssj_ops/watt”. This ratio provides information about the energy efficiency of the measured server. The defined measurement standard enables customers to compare it with other configurations and servers measured with SPECpower_ssj2008. The diagram shows a typical graph of a SPECpower_ssj2008 result.
The benchmark runs on a wide variety of operating systems and hardware architectures and does not require extensive client or storage infrastructure. The minimum equipment for SPEC-compliant testing is two networked computers, plus a power analyzer and a temperature sensor. One computer is the System Under Test (SUT) which runs one of the supported operating systems and the JVM. The JVM provides the environment required to run the SPECpower_ssj2008 workload which is implemented in Java. The other computer is a “Control & Collection System” (CCS) which controls the operation of the benchmark and captures the power, performance and temperature readings for reporting. The diagram provides an overview of the basic structure of the benchmark configuration and the various components.
Benchmark environment
System Under Test (SUT)
Hardware
Model PRIMERGY RX2540 M2
Processor Xeon E5-2699 v4
Memory 8 × 8GB (1x8GB) 2Rx8 DDR4-2400 R ECC
Network interface 1 × PLAN AP 1x1Gbit Cu Intel I210-T1 LP
Disk subsystem Onboard HDD controller 1 × SSD SATA 6G 64GB DOM N H-P
Power Supply Unit 1 × Modular PSU 800W titanium hp
Software
BIOS R1.6.0
BIOS settings Hardware Prefetcher = Disabled
Adjacent Cache Line Prefetch = Disabled
DCU Streamer Prefetcher = Disabled
ASPM Support = L1 Only
DMI Control = Gen1
Onboard USB Controllers = Disabled
Power Technology = Custom
Turbo Mode = Disabled
Autonomous C-state Support = Enabled
Uncore Frequency Override = Nominal
QPI Link Frequency Select = 6.4 GT/s
Override OS Energy Performance = Enabled
Energy Performance = Energy Efficient
DDR Performance = Energy optimized
Intel Virtualization Technology = Disabled
COD Enable = Enabled
Firmware 8.20F
Operating system Microsoft Windows Server 2012 R2 Standard
Operating system settings
Using the local security settings console, “lock pages in memory” was enabled for the user running the benchmark.
Power Management: Enabled (“Fujitsu Enhanced Power Settings” power plan)
Set “Turn off hard disk after = 1 Minute” in OS.
Benchmark was started via Windows Remote Desktop Connection.
Each JVM instance was affinitized to 22 logical processors.
JVM Oracle Java HotSpot(TM) 64-Bit Server VM (build 24.80-b11, mixed mode), version 1.7.0_80
JVM settings -server -Xmn11g -Xms13g -Xmx13g -XX:SurvivorRatio=60 -XX:TargetSurvivorRatio=90 -XX:AllocatePrefetchDistance=256 -XX:AllocatePrefetchLines=4 -XX:LoopUnrollLimit=45 -XX:InitialTenuringThreshold=12 -XX:MaxTenuringThreshold=15 -XX:ParallelGCThreads=22 -XX:InlineSmallCode=3900 -XX:MaxInlineSize=270 -XX:FreqInlineSize=2500 -XX:+AggressiveOpts -XX:+UseLargePages -XX:+UseParallelOldGC -XX:-UseAdaptiveSizePolicy
Some components may not be available in all countries or sales regions.
Benchmark results
The PRIMERGY RX2540 M2 achieved the following result:
SPECpower_ssj2008 = 11,638 overall ssj_ops/watt
The adjoining diagram shows the result of the configuration described above. The red horizontal bars show the performance to power ratio in ssj_ops/watt (upper x-axis) for each target load level tagged on the y-axis of the diagram. The blue line shows the average power consumption (bottom x-axis) at each target load level, each marked with a small diamond. The black vertical line shows the benchmark result of 11,638 overall ssj_ops/watt for the PRIMERGY RX2540 M2. This is the quotient of the sum of the transaction throughputs for each load level and the sum of the average power consumed for each measurement interval.
The following table shows the benchmark results for the throughput in ssj_ops, the power consumption in watts and the resulting energy efficiency for each load level.
Performance Power Energy Efficiency
Target Load ssj_ops Average Power (W) ssj_ops/watt
100% 3,445,027 255 13,504
90% 3,096,297 224 13,814
80% 2,748,129 196 13,988
70% 2,410,319 174 13,820
60% 2,065,074 158 13,081
50% 1,720,551 145 11,868
40% 1,374,694 132 10,433
30% 1,030,251 114 9,048
20% 688,822 98.9 6,965
10% 342,368 84.7 4,043
Active Idle 0 43.6 0
∑ssj_ops / ∑power = 11,638
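The calculation of the overall result can be retraced from the table with a short Python sketch. The variable names are chosen only for this illustration. Because the power values in the table are rounded, the recomputed figures deviate slightly from the published ones (by well under 0.1% for the overall result).

```python
# (target load, ssj_ops, average power in W) taken from the table above
levels = [
    ("100%", 3445027, 255.0),
    ("90%", 3096297, 224.0),
    ("80%", 2748129, 196.0),
    ("70%", 2410319, 174.0),
    ("60%", 2065074, 158.0),
    ("50%", 1720551, 145.0),
    ("40%", 1374694, 132.0),
    ("30%", 1030251, 114.0),
    ("20%", 688822, 98.9),
    ("10%", 342368, 84.7),
    ("Active Idle", 0, 43.6),
]

total_ops = sum(ops for _, ops, _ in levels)
total_power = sum(power for _, _, power in levels)

# Overall metric: sum of throughputs divided by sum of average power,
# including the power consumed at active idle
overall = total_ops / total_power

# Energy efficiency per load level
per_level = {load: ops / power for load, ops, power in levels}

print(f"overall ssj_ops/watt (from rounded table values): {overall:.0f}")
for load, efficiency in per_level.items():
    print(f"{load:>11}: {efficiency:8.0f} ssj_ops/watt")
```

The active idle level contributes no throughput but does contribute power, so a low idle power consumption directly improves the overall result.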
The following diagram shows for each load level the power consumption (on the right y-axis) and the throughput (on the left y-axis) of the PRIMERGY RX2540 M2 compared to the predecessor PRIMERGY RX2540 M1.
Thanks to the new Broadwell processors, the PRIMERGY RX2540 M2 achieves a higher throughput than the PRIMERGY RX2540 M1 at substantially lower power consumption. Together, these result in an overall increase in energy efficiency of 9.2% for the PRIMERGY RX2540 M2.
SPECpower_ssj2008: PRIMERGY RX2540 M2 vs. PRIMERGY RX2540 M1
SPECpower_ssj2008 overall ssj_ops/watt: PRIMERGY RX2540 M2 vs. PRIMERGY RX2540 M1
Disk I/O: Performance of RAID controllers
Benchmark description
Performance measurements of disk subsystems for PRIMERGY and PRIMEQUEST servers are used to assess their performance and enable a comparison of the different storage connections for these servers. As standard, these performance measurements are carried out with a defined measurement method, which models the accesses of real application scenarios on the basis of specifications.
The essential specifications are:
Share of random accesses / sequential accesses
Share of read / write access types
Block size (kB)
Number of parallel accesses (# of outstanding I/Os)
A given value combination of these specifications is known as “load profile”. The following five standard load profiles can be allocated to typical application scenarios:
In order to model applications that access in parallel with a different load intensity, the "# of Outstanding I/Os" is increased from 1 to 512 (in powers of two).
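The resulting series of load intensities can be written down as a small Python sketch; the function name is hypothetical and only serves this illustration.

```python
def outstanding_io_steps(maximum=512):
    """Return the '# of Outstanding I/Os' values 1, 2, 4, ... up to maximum."""
    steps = []
    value = 1
    while value <= maximum:
        steps.append(value)
        value *= 2
    return steps


steps = outstanding_io_steps()
print(steps)
print(len(steps), "load intensities per load profile")
```

Each standard load profile is therefore measured at ten different load intensities.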
The measurements of this document are based on these standard load profiles.
The main results of a measurement are:
Throughput [MB/s]: Throughput in megabytes per second
Transactions [IO/s]: Transaction rate in I/O operations per second
Latency [ms]: Average response time in ms
The data throughput has established itself as the normal measurement variable for sequential load profiles, whereas the measurement variable “transaction rate” is mostly used for random load profiles with their small block sizes. Data throughput and transaction rate are directly proportional to each other and can be converted into each other according to the formula
Data throughput [MB/s] = Transaction rate [IO/s] × Block size [MB]
Transaction rate [IO/s] = Data throughput [MB/s] / Block size [MB]
This section specifies capacities of storage media on a basis of 10 (1 TB = 10^12 bytes) while all other capacities, file sizes, block sizes and throughputs are specified on a basis of 2 (1 MB/s = 2^20 bytes/s).
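The conversion between data throughput and transaction rate can be sketched in Python as follows. The function names and the example figures are hypothetical; the sketch assumes that block sizes are given in kB on a basis of 2 (1 MB = 1024 kB), as stated for this section.

```python
KB_PER_MB = 1024  # block sizes and throughputs use a basis of 2 in this section


def throughput_mb_s(transaction_rate_io_s, block_size_kb):
    """Data throughput [MB/s] = transaction rate [IO/s] x block size [MB]."""
    return transaction_rate_io_s * block_size_kb / KB_PER_MB


def transaction_rate_io_s(throughput, block_size_kb):
    """Transaction rate [IO/s] = data throughput [MB/s] / block size [MB]."""
    return throughput * KB_PER_MB / block_size_kb


# Hypothetical example: 20000 IO/s with the 8 kB blocks of the "Database" profile
print(throughput_mb_s(20000, 8))         # 156.25 MB/s
# Hypothetical example: 1000 MB/s with the 64 kB blocks of the "Streaming" profile
print(transaction_rate_io_s(1000, 64))   # 16000.0 IO/s
```

The two examples show why small-block random profiles are usually rated by transaction rate and large-block sequential profiles by data throughput.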
All the details of the measurement method and the basics of disk I/O performance are described in the white paper “Basics of Disk I/O Performance”.
Standard load profile  Access      Type of access   Block size [kB]  Application
                                   read    write
File copy              random      50%     50%      64               Copying of files
File server            random      67%     33%      64               File server
Database               random      67%     33%      8                Database (data transfer), Mail server
Streaming              sequential  100%    0%       64               Database (log file), Data backup; Video streaming (partial)
Restore                sequential  0%      100%     64               Restoring of files
Benchmark environment
All the measurement results discussed in this chapter were determined using the hardware and software components listed below:
System Under Test (SUT)
Hardware
Processor 2 × Xeon E5-2623 v4 @ 2.60GHz
Controller 1 × “LSI SW RAID on Intel C610 (Onboard SATA)”
Intel C610 PCH, Code name Wellsburg
Driver name: megasr1.sys, Driver version: 16.02.2014.0811
BIOS version: A.14.02121826R
1 × “PRAID CP400i”, “PRAID EP400i”, “PRAID EP420i”:
Driver name: megasas2.sys, Driver version: 6.706.06 Firmware package: 24.7.0-0061
Storage media For model versions PY RX2540 M2 4x 3.5' expandable and PY RX2540 M2 12x 3.5':
SSDs: Toshiba PX02SMF040, Samsung MZ7KM240HAGR
HDDs: HGST HUC156045CSS204, Seagate ST1000NM0033
For model versions PY RX2540 M2 8x 2.5' expandable and PY RX2540 M2 24x 2.5':
SSDs: Toshiba PX02SMF040, Samsung MZ7KM240HAGR
HDDs: HGST HUC156045CSS204, Seagate ST91000640NS
Software
BIOS settings Intel Virtualization Technology = Disabled
VT-d = Disabled
Energy Performance = Performance
Utilization Profile = Unbalanced
CPU C6 Report = Disabled
Operating system Microsoft Windows Server 2012 R2 Standard
Operating system settings
Choose or customize a power plan: High performance
For the processes that create disk I/Os: set the AFFINITY to the CPU node to which the PCIe slot of the RAID controller is connected
Administration software ServerView RAID Manager 6.2.1
Benchmark version 3.0
Stripe size Controller default
Measuring tool Iometer 1.1.0
Measurement area The first 10% of the usable LBA area is used for sequential accesses; the next 25% for random accesses.
File system raw
Total number of Iometer workers 1
Alignment of Iometer accesses Aligned to whole multiples of 4096 bytes
Some components may not be available in all countries / sales regions.
Benchmark results
The results presented here are designed to help you choose the right solution from the various configuration options of the PRIMERGY RX2540 M2 in the light of disk-I/O performance. Various combinations of RAID controllers and storage media will be analyzed below. Information on the selection of storage media themselves is to be found in the section “Disk I/O: Performance of storage media”.
Hard disks
The hard disks are the first essential component. If there is a reference below to “hard disks”, this is meant as the generic term for HDDs (“hard disk drives”, in other words conventional hard disks) and SSDs (“solid state drives”, i.e. non-volatile electronic storage media).
Mixed drive configurations of SAS and SATA hard disks in one system are permitted, unless they are excluded in the configurator for special hard disk types.
More hard disks per system are possible as a result of using 2.5" hard disks instead of 3.5" hard disks. Consequently, the load that each individual hard disk has to handle decreases and the maximum overall performance of the system increases.
More detailed performance statements about hard disk types are available in the section “Disk I/O: Performance of storage media” in this performance report.
Model versions
The maximum number of hard disks in the system depends on the system configuration. The following table lists the essential cases. Only the highest supported version is named for all the interfaces we have dealt with in this section.
Form factor   Interface          Connection type    Number of PCIe controllers  Maximum number of hard disks
2.5", 3.5"    SATA 6G            direct             0                           8
2.5", 3.5"    SATA 6G, SAS 12G   direct             1                           8
3.5"          SATA 6G, SAS 12G   Expander           1                           12
3.5", 2.5"    SATA 6G, SAS 12G   Expander + direct  1 + 1                       12 × 3.5" + 4 × 2.5"
2.5"          SATA 6G, SAS 12G   Expander           1                           24
2.5"          SATA 6G, SAS 12G   direct + direct    1 + 1                       8 + 8
2.5"          SATA 6G, SAS 12G   Expander + direct  1 + 1                       24 + 4
RAID controller
In addition to the hard disks the RAID controller is the second performance-determining key component. In the case of these controllers the “modular RAID” concept of the PRIMERGY servers offers a plethora of options to meet the various requirements of a wide range of different application scenarios.
The following table summarizes the most important features of the available RAID controllers of the PRIMERGY RX2540 M2. A short alias is specified here for each controller, which is used in the subsequent list of the performance values.
Controller name                           Alias         Cache  Supported interfaces           In the system                                            FBU
                                                                                              Max. # disks per controller  RAID levels
LSI SW RAID on Intel C610 (Onboard SATA)  Onboard C610  -      SATA 6G           -            4 × 2.5", 4 × 3.5"           0, 1, 10                    -
PRAID CP400i                              PRAID CP400i  -      SATA 6G, SAS 12G  PCIe 3.0 x8  8 × 2.5", 8 × 3.5"           0, 1, 1E, 5, 10, 50         -
PRAID EP400i                              PRAID EP400i  1 GB   SATA 6G, SAS 12G  PCIe 3.0 x8  24 × 2.5", 12 × 3.5"         0, 1, 1E, 5, 6, 10, 50, 60  ✓
PRAID EP420i                              PRAID EP420i  2 GB   SATA 6G, SAS 12G  PCIe 3.0 x8  24 × 2.5", 12 × 3.5"         0, 1, 1E, 5, 6, 10, 50, 60  ✓
Onboard RAID controllers are implemented in the chipset on the server system board and use the CPU of the server for the RAID functionality. This simple solution does not require a PCIe slot. The chipset does not communicate with the CPU via PCIe, but via the “Direct Media Interface”, in short DMI. The PRIMERGY RX2540 M2 has the Intel C610 chipset, in which two onboard RAID controllers are integrated. Each of these controllers can be used to form logical drives consisting of up to four hard disks. The alias of the onboard controller is used in this document to refer to one controller instance in the chipset.
System-specific interfaces
The interfaces of a controller in CPU direction (DMI or PCIe) and in the direction of hard disks (SAS or SATA) have in each case specific limits for data throughput. These limits are listed in the following table. The minimum of these two values is a definite limit, which cannot be exceeded. This value is highlighted in bold in the following table.
Effective in the configuration:
Controller alias   # Disk-side data channels   Limit for throughput of disk interface   # CPU-side data channels   Limit for throughput of CPU-side interface   Connection via expander
1 × Onboard C610   1 × 4 × SATA 6G             2060 MB/s                                4 × DMI 2.0                1716 MB/s                                    -
2 × Onboard C610   2 × 4 × SATA 6G             4120 MB/s                                4 × DMI 2.0 *)             1716 MB/s                                    -
PRAID CP400i       8 × SAS 12G                 8240 MB/s                                8 × PCIe 3.0               6761 MB/s                                    -
PRAID EP400i       8 × SAS 12G                 8240 MB/s                                8 × PCIe 3.0               6761 MB/s                                    -/✓
PRAID EP420i       8 × SAS 12G                 8240 MB/s                                8 × PCIe 3.0               6761 MB/s                                    -/✓
*) The second controller instance doesn't increase the throughput limit on the CPU side
An expander makes it possible to connect more hard disks in a system than the SAS channels that the controller has. An expander cannot increase the possible maximum throughput of a controller, but makes it available in total to all connected hard disks.
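The rule that the lower of the two interface limits is the definite limit for a controller can be sketched in a few lines of Python. The figures are copied from the table above; the dictionary layout and the function names are illustrative assumptions and do not belong to any Fujitsu tool.

```python
# Interface limits in MB/s, copied from the table above.
# The dictionary layout and all names are illustrative only.
INTERFACE_LIMITS_MB_S = {
    "1 x Onboard C610": {"disk_side": 2060, "cpu_side": 1716},
    "2 x Onboard C610": {"disk_side": 4120, "cpu_side": 1716},
    "PRAID CP400i": {"disk_side": 8240, "cpu_side": 6761},
    "PRAID EP400i": {"disk_side": 8240, "cpu_side": 6761},
    "PRAID EP420i": {"disk_side": 8240, "cpu_side": 6761},
}


def effective_limit(alias: str) -> int:
    """Return the definite throughput limit: the lower of the two interface limits."""
    limits = INTERFACE_LIMITS_MB_S[alias]
    return min(limits["disk_side"], limits["cpu_side"])


def is_plausible(alias: str, measured_mb_s: float) -> bool:
    """Check that a measured throughput does not exceed the definite limit."""
    return measured_mb_s <= effective_limit(alias)


if __name__ == "__main__":
    for alias in INTERFACE_LIMITS_MB_S:
        print(f"{alias}: {effective_limit(alias)} MB/s")
```

In every listed configuration the CPU-side interface is the lower value. This also shows why a second onboard controller instance does not raise the limit: the disk-side limit doubles, but the shared DMI link stays at 1716 MB/s.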
More details about the RAID controllers of the PRIMERGY systems are available in the white paper “RAID Controller Performance”.
Settings
In most cases, the cache of HDDs has a great influence on disk-I/O performance. It is frequently regarded as a data-safety risk in case of power failure and is thus switched off. On the other hand, it was integrated by hard disk manufacturers for the good reason of increasing the write performance. For performance reasons it is therefore advisable to enable the hard disk cache. To prevent data loss in case of power failure we recommend equipping the system with a UPS.
In the case of controllers with a cache there are several parameters that can be set. The optimal settings can depend on the RAID level, the application scenario and the type of data medium. In the case of RAID levels 5 and 6 in particular (and the more complex RAID level combinations 50 and 60) the controller cache must be enabled for application scenarios with a write share. If the controller cache is enabled, the data temporarily stored in the cache should be safeguarded against loss in case of power failure. Suitable accessories are available for this purpose (e.g. an FBU).
For easy and reliable handling of the settings for RAID controllers and hard disks it is advisable to use the software “ServerView RAID Manager” that is supplied for the server. All the cache settings for controllers and hard disks can usually be made en bloc, specifically for the application, by using the pre-defined modes “Performance”, “Data Protection” or “Fast Path optimum”. The “Performance” mode ensures the best possible performance settings for the majority of the application scenarios with HDDs. In connection with the “FastPath” RAID controller option, the “Fast Path optimum” mode should be selected if maximum transaction rates are to be achieved with SSDs for random accesses with small blocks (≤ 8 kB, e.g. OLTP operation of databases).
More information about the setting options of the controller cache is available in the white paper “RAID Controller Performance”.
Performance values
In general, disk-I/O performance of a logical drive depends on the type and number of hard disks, on the RAID level and on the RAID controller. If the limits of the system-specific interfaces are not exceeded, the statements on disk-I/O performance are therefore valid for all PRIMERGY systems. This is why all the performance statements of the document “RAID Controller Performance” also apply for the PRIMERGY RX2540 M2 if the configurations measured there are also supported by this system.
The performance values of the PRIMERGY RX2540 M2 are listed in table form below, specifically for different RAID levels, access types and block sizes. Substantially different configuration versions are dealt with separately. The established measurement variables, as already mentioned in the subsection Benchmark description, are used here. Thus, transaction rate is specified for random accesses and data throughput for sequential accesses. To avoid any confusion among the measurement units the tables have been separated for the two access types.
The table cells contain the maximum achievable values. This has three implications: On the one hand hard disks with optimal performance were used (the components used are described in more detail in the subsection Benchmark environment). Furthermore, cache settings of controllers and hard disks, which are optimal for the respective access scenario and the RAID level, are used as a basis. And ultimately each value is the maximum value for the entire load intensity range (# of outstanding I/Os).
In order to also visualize the numerical values each table cell is highlighted with a horizontal bar, the length of which is proportional to the numerical value in the table cell. All bars shown in the same scale of length have the same color. In other words, a visual comparison only makes sense for table cells with the same colored bars.
Since the horizontal bars in the table cells depict the maximum achievable performance values, they are shown by the color getting lighter as you move from left to right. The light shade of color at the right end of the bar tells you that the value is a maximum value and can only be achieved under optimal prerequisites. The darker the shade becomes as you move to the left, the more frequently it will be possible to achieve the corresponding value in practice.
2.5" - Random accesses (maximum performance values in IO/s):
PRIMERGY RX2540 M2, model versions PY RX2540 M2 8x 2.5" expandable and PY RX2540 M2 24x 2.5"
All values in IO/s for random accesses with 67% read.

RAID controller   # Disks   RAID level   HDDs, 8 kB blocks   HDDs, 64 kB blocks   SSDs, 8 kB blocks   SSDs, 64 kB blocks
Onboard C610      2         1            375                 328                  43156               8392
Onboard C610      4         0            563                 328                  74488               17622
Onboard C610      4         10           568                 325                  60497               14045
PRAID CP400i      2         1            1153                971                  65174               9478
PRAID CP400i      8         10           4294                2229                 117403              26336
PRAID CP400i      8         0            4784                2531                 164298              38806
PRAID CP400i      8         5            2435                1382                 28732               17952
PRAID EP400i      2         1            1590                888                  62721               9386
PRAID EP400i      8         10           4623                3265                 183026              28011
PRAID EP400i      8         0            5291                3784                 237818              45574
PRAID EP400i      8         5            3008                2097                 132953              16056
PRAID EP400i      16        10           9166                6377                 247194              72201
PRAID EP400i      24        0            14086               10367                240837              114361
PRAID EP400i      24        5            7299                5162                 139216              33047
PRAID EP420i      2         1            1544                994                  63189               9993
PRAID EP420i      8         10           4616                3213                 194669              25960
PRAID EP420i      8         0            5230                3729                 248073              44424
PRAID EP420i      8         5            2970                2039                 133538              16243
PRAID EP420i      16        10           9360                6461                 244679              71389
PRAID EP420i      24        0            14115               10381                247036              124201
PRAID EP420i      24        5            7466                5256                 137911              32814

Hard disk types:
Onboard C610: ST91000640NS SATA HDD, MZ7KM240HAGR SATA SSD
PRAID CP400i, PRAID EP400i, PRAID EP420i: HUC156045CSS204 SAS HDD, PX02SMF040 SAS SSD
2.5" - Sequential accesses (maximum performance values in MB/s):
PRIMERGY RX2540 M2, model versions PY RX2540 M2 8x 2.5" expandable and PY RX2540 M2 24x 2.5"
All values in MB/s for sequential accesses with 64 kB blocks.

RAID controller   # Disks   RAID level   HDDs, 100% read   HDDs, 100% write   SSDs, 100% read   SSDs, 100% write
Onboard C610      2         1            113               108                950               492
Onboard C610      4         0            427               149                1541              1262
Onboard C610      4         10           224               213                1375              641
PRAID CP400i      2         1            327               230                1910              422
PRAID CP400i      8         10           1051              892                5873              1640
PRAID CP400i      8         0            1795              1779               5851              3297
PRAID CP400i      8         5            1579              1558               5840              1760
PRAID EP400i      2         1            350               232                1877              406
PRAID EP400i      8         10           1148              945                5825              1486
PRAID EP400i      8         0            1874              1892               5817              3081
PRAID EP400i      8         5            1648              1658               5871              2661
PRAID EP400i      16        10           2097              1888               5831              3115
PRAID EP400i      24        0            5395              5356               5828              6283
PRAID EP400i      24        5            5162              3164               5881              3149
PRAID EP420i      2         1            375               232                1887              400
PRAID EP420i      8         10           1174              942                5795              1665
PRAID EP420i      8         0            1871              1892               5819              3077
PRAID EP420i      8         5            1648              1650               5869              2569
PRAID EP420i      16        10           2114              1889               5824              3119
PRAID EP420i      24        0            5396              5386               5824              6271
PRAID EP420i      24        5            5211              3083               5875              3085

Hard disk types:
Onboard C610: ST91000640NS SATA HDD, MZ7KM240HAGR SATA SSD
PRAID CP400i, PRAID EP400i, PRAID EP420i: HUC156045CSS204 SAS HDD, PX02SMF040 SAS SSD
3.5" - Random accesses (maximum performance values in IO/s):
PRIMERGY RX2540 M2, model versions PY RX2540 M2 4x 3.5" expandable and PY RX2540 M2 12x 3.5"
All values in IO/s for random accesses with 67% read.

RAID controller   # Disks   RAID level   HDDs, 8 kB blocks   HDDs, 64 kB blocks   SSDs, 8 kB blocks   SSDs, 64 kB blocks
Onboard C610      2         1            349                 319                  43156               8392
Onboard C610      4         0            646                 368                  74488               17622
Onboard C610      4         10           545                 315                  60497               14045
PRAID CP400i      2         1            1153                971                  65174               9478
PRAID CP400i      8         10           4294                2229                 117403              26336
PRAID CP400i      8         0            4784                2531                 164298              38806
PRAID CP400i      8         5            2435                1382                 28732               17952
PRAID EP400i      2         1            1590                888                  62721               9386
PRAID EP400i      8         10           4623                3265                 183026              28011
PRAID EP400i      8         0            5291                3784                 237818              45574
PRAID EP400i      8         5            3008                2097                 132953              16056
PRAID EP400i      12        10           6818                4731                 245069              53983
PRAID EP400i      12        0            7174                5185                 251403              113516
PRAID EP400i      12        5            4161                2838                 139247              33373
PRAID EP420i      2         1            1544                994                  63189               9993
PRAID EP420i      8         10           4616                3213                 194669              25960
PRAID EP420i      8         0            5230                3729                 248073              44424
PRAID EP420i      8         5            2970                2039                 133538              16243
PRAID EP420i      12        10           6969                4804                 244882              52958
PRAID EP420i      12        0            7201                5199                 248867              113668
PRAID EP420i      12        5            4183                2888                 138903              33433

Hard disk types:
Onboard C610: ST1000NM0033 SATA HDD, MZ7KM240HAGR SATA SSD
PRAID CP400i, PRAID EP400i, PRAID EP420i: HUC156045CSS204 SAS HDD, PX02SMF040 SAS SSD
3.5" - Sequential accesses (maximum performance values in MB/s):
PRIMERGY RX2540 M2, model versions PY RX2540 M2 4x 3.5" expandable and PY RX2540 M2 12x 3.5"
All values in MB/s for sequential accesses with 64 kB blocks.

RAID controller   # Disks   RAID level   HDDs, 100% read   HDDs, 100% write   SSDs, 100% read   SSDs, 100% write
Onboard C610      2         1            185               180                950               492
Onboard C610      4         0            716               712                1541              1262
Onboard C610      4         10           368               356                1375              641
PRAID CP400i      2         1            327               230                1910              422
PRAID CP400i      8         10           1051              892                5873              1640
PRAID CP400i      8         0            1795              1779               5851              3297
PRAID CP400i      8         5            1579              1558               5840              1760
PRAID EP400i      2         1            350               232                1877              406
PRAID EP400i      8         10           1148              945                5825              1486
PRAID EP400i      8         0            1874              1892               5817              3081
PRAID EP400i      8         5            1648              1658               5871              2661
PRAID EP400i      12        10           1636              1416               5824              2303
PRAID EP400i      12        0            2763              2825               5822              4584
PRAID EP400i      12        5            2550              2591               5875              3073
PRAID EP420i      2         1            375               232                1887              400
PRAID EP420i      8         10           1174              942                5795              1665
PRAID EP420i      8         0            1871              1892               5819              3077
PRAID EP420i      8         5            1648              1650               5869              2569
PRAID EP420i      12        10           1653              1416               5824              2299
PRAID EP420i      12        0            2805              2844               5824              4590
PRAID EP420i      12        5            2589              2605               5873              1965

Hard disk types:
Onboard C610: ST1000NM0033 SATA HDD, MZ7KM240HAGR SATA SSD
PRAID CP400i, PRAID EP400i, PRAID EP420i: HUC156045CSS204 SAS HDD, PX02SMF040 SAS SSD

Conclusion
At full configuration with powerful hard disks the PRIMERGY RX2540 M2 achieves a throughput of up to 6283 MB/s for sequential load profiles and a transaction rate of up to 251403 IO/s for typical, random application scenarios.
For best possible performance we recommend one of the plug-in PRAID controllers. To operate SSDs within the maximum performance range the PRAID CP400i is already suited for the simpler RAID levels 0, 1 and 10, and a PRAID controller with cache is to be preferred for RAID 5.
In the case of HDDs the controller cache has performance advantages for all RAID levels under random load profiles with a significant write share.
SAP SD
Benchmark description
The SAP application software consists of modules used to manage all standard business processes. These include modules for ERP (Enterprise Resource Planning), such as Assemble-to-Order (ATO), Financial Accounting (FI), Human Resources (HR), Materials Management (MM), Production Planning (PP) plus Sales and Distribution (SD), as well as modules for SCM (Supply Chain Management), Retail, Banking, Utilities, BI (Business Intelligence), CRM (Customer Relationship Management) or PLM (Product Lifecycle Management).
The application software is always based on a database, so that a SAP configuration consists of the hardware and the software components: the operating system, the database and the SAP software itself.
SAP AG has developed SAP Standard Application Benchmarks in order to verify the performance, stability and scaling of a SAP application system. The benchmarks, of which SD Benchmark is the most commonly used and most important, analyze the performance of the entire system and thus measure the quality of the integrated individual components.
The benchmark differentiates between a 2-tier and a 3-tier configuration. The 2-tier configuration has the SAP application and database installed on one server. With a 3-tier configuration the individual components of the SAP application can be distributed via several servers and an additional server handles the database.
The entire specification of the benchmark developed by SAP AG, Walldorf, Germany can be found at: http://www.sap.com/benchmark.
Benchmark environment
The measurement set-up is symbolically illustrated below:
[Figure: 2-tier environment. A benchmark driver is connected via the network to the System Under Test (SUT), which consists of the server and its disk subsystem.]
System Under Test (SUT)
Hardware
Model PRIMERGY RX2540 M2
Processor 2 × Xeon E5-2699 v4
Memory 16 × 32GB (1x32GB) 2Rx4 DDR4-2400 R ECC
Network interface 1Gbit/s LAN
Disk subsystem PRIMERGY RX2540 M2: 1 × SSD SATA 6Gb/s 2.5” 240GB 1 × SSD SATA 6Gb/s 2.5” 800GB
Software
BIOS settings Energy Performance = Performance
Operating system Microsoft Windows Server 2012 R2 Standard Edition
Database Microsoft SQL Server 2012 (64-bit)
SAP Business Suite Software
SAP enhancement package 5 for SAP ERP 6.0
Benchmark driver
Hardware
Model PRIMERGY RX300 S4
Processor 2 × Xeon X5460
Memory 32 GB
Network interface 1Gbit/s LAN
Software
Operating system SUSE Linux Enterprise Server 11 SP1
Some components may not be available in all countries or sales regions.
Benchmark results
Certification number 2016004
Number of SAP SD benchmark users 20,250
Average dialog response time 0.96 seconds
Throughput
Fully processed order line items/hour 2,217,330
Dialog steps/hour 6,652,000
SAPS 110,870
Average database request time (dialog/update) 0.013 sec / 0.026 sec
CPU utilization of central server 99%
Operating system, central server Windows Server 2012 R2 Standard Edition
RDBMS SQL Server 2012
SAP Business Suite software SAP enhancement package 5 for SAP ERP 6.0
Configuration Central Server
Fujitsu PRIMERGY RX2540 M2 2 processors / 44 cores / 88 threads Intel Xeon E5-2699 v4, 2.20 GHz, 64 KB L1 cache and 256KB L2 cache per core, 55 MB L3 cache per processor 512 GB main memory
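The three throughput figures of the result are linked by the definition of the SAPS unit. The definition used below (100 SAPS = 2,000 fully processed order line items per hour = 6,000 dialog steps per hour) comes from SAP's benchmark documentation, not from this report, and the function names are illustrative. A minimal sketch of the plausibility check:

```python
# SAPS definition taken from SAP's benchmark documentation (not from this
# report): 100 SAPS = 2,000 fully processed order line items per hour
# = 6,000 dialog steps per hour.
LINE_ITEMS_PER_100_SAPS = 2000
DIALOG_STEPS_PER_LINE_ITEM = 3


def saps_from_line_items(line_items_per_hour: float) -> float:
    """Convert fully processed order line items per hour into SAPS."""
    return line_items_per_hour / LINE_ITEMS_PER_100_SAPS * 100


def dialog_steps_from_line_items(line_items_per_hour: float) -> float:
    """Convert fully processed order line items per hour into dialog steps per hour."""
    return line_items_per_hour * DIALOG_STEPS_PER_LINE_ITEM


if __name__ == "__main__":
    line_items = 2_217_330  # value of certification number 2016004
    print(f"SAPS: {saps_from_line_items(line_items):.1f}")
    print(f"Dialog steps/hour: {dialog_steps_from_line_items(line_items):.0f}")
```

Both computed values agree with the certified figures of 110,870 SAPS and 6,652,000 dialog steps per hour once the rounding applied in the certificate is taken into account.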
The following chart shows a comparison of two-tier SAP SD Standard Application Benchmark results for 2-way Xeon E5 v4 based servers with Windows OS and SQL Server database (as of April 7, 2016). The PRIMERGY RX2540 M2 outperforms the comparably configured servers from HPE and Dell. The latest SAP SD 2-tier results can be found at http://www.sap.com/solutions/benchmark/sd2tier.epx.
[Chart: 2-way Xeon E5 v4 based Two-Tier SAP SD results with Windows OS and SQL Server RDBMS. Axis: Number of Benchmark Users. Bar values: 19300, 19750, 20250.]
Fujitsu PRIMERGY RX2540 M2, 2 x Xeon E5-2699 v4, 2 processors/44 cores/88 threads, Windows Server 2012 R2 / SQL Server 2012, SAP enhancement package 5 for SAP ERP 6.0, Certification number: 2016004
HPE ProLiant DL380 Gen9, 2 x Xeon E5-2699 v4, 2 processors/44 cores/88 threads, Windows Server 2012 / SQL Server 2012, SAP enhancement package 5 for SAP ERP 6.0, Certification number: 2016010
Dell PowerEdge R730, 2 x Xeon E5-2699 v4, 2 processors/44 cores/88 threads, Windows Server 2012 / SQL Server 2012, SAP enhancement package 5 for SAP ERP 6.0, Certification number: 2016008
OLTP-2
Benchmark description
OLTP stands for Online Transaction Processing. The OLTP-2 benchmark is based on the typical application scenario of a database solution. In OLTP-2 database access is simulated and the number of transactions achieved per second (tps) determined as the unit of measurement for the system.
In contrast to benchmarks such as SPECint and TPC-E, which were standardized by independent bodies and for which adherence to the respective rules and regulations is monitored, OLTP-2 is an internal benchmark of Fujitsu. OLTP-2 is based on the well-known database benchmark TPC-E. OLTP-2 was designed in such a way that a wide range of configurations can be measured to present the scaling of a system with regard to the CPU and memory configuration.
Even if the two benchmarks OLTP-2 and TPC-E simulate similar application scenarios using the same load profiles, the results cannot be compared or even treated as equal, as the two benchmarks use different methods to simulate user load. OLTP-2 values are typically similar to TPC-E values. A direct comparison, or even referring to the OLTP-2 result as TPC-E, is not permitted, especially because there is no price-performance calculation.
Further information can be found in the document Benchmark Overview OLTP-2.
Benchmark environment
The measurement set-up is symbolically illustrated below:
All results were determined by way of example on a PRIMERGY RX2540 M2.
[Figure: OLTP-2 measurement set-up. Labels: Driver (Clients), Network, Application Server (Tier A), Network, Database Server (Tier B), Disk subsystem, System Under Test (SUT).]
Database Server (Tier B)
Hardware
Model PRIMERGY RX2540 M2
Processor Intel® Xeon® Processor E5-2600 v4 Product Family
Memory 1 processor: 8 × 64GB (1x64GB) 4Rx4 DDR4-2400 LR ECC 2 processors: 16 × 64GB (1x64GB) 4Rx4 DDR4-2400 LR ECC
Network interface 2 × onboard LAN 10 Gb/s
Disk subsystem RX2540 M2: Onboard RAID controller PRAID EP420i
2 × 300 GB 15k rpm SAS Drive, RAID1 (OS),
4 × 600 GB 15k rpm SAS Drive, RAID10 (LOG)
5 × PRAID EP420e
5 × JX40: 16 × 400 GB SSD Drive each, RAID5 (data)
Software
BIOS Version R1.3.0
Operating system Microsoft Windows Server 2012 R2 Standard
Database Microsoft SQL Server 2016 Enterprise
Application Server (Tier A)
Hardware
Model 1 × PRIMERGY RX2530 M1
Processor 2 × Xeon E5-2697 v3
Memory 64 GB, 2133 MHz registered ECC DDR4
Network interface 2 × onboard LAN 10 Gb/s 1 × Dual Port LAN 1 Gb/s
Disk subsystem 2 × 300 GB 15k rpm SAS Drive
Software
Operating system Microsoft Windows Server 2012 Standard
Client
Hardware
Model 1 × PRIMERGY RX300 S7
Processor 2 × Xeon E5-2667 v2
Memory 64 GB, 1600 MHz registered ECC DDR3
Network interface 2 × onboard LAN 1 Gb/s 1 × Dual Port LAN 1Gb/s
Disk subsystem 1 × 300 GB 15k rpm SAS Drive
Software
Operating system Microsoft Windows Server 2012 R2 Standard
Benchmark OLTP-2 Software EGen version 1.14.0
Some components may not be available in all countries / sales regions.
Benchmark results
Database performance greatly depends on the configuration options with CPU and memory and on the connectivity of an adequate disk subsystem for the database. In the following scaling considerations for the processors we assume that both the memory and the disk subsystem have been adequately chosen and are not a bottleneck.
A guideline in the database environment for selecting main memory is that sufficient quantity is more important than the speed of the memory accesses. This is why a configuration with a total memory of 1024 GB was considered for the measurements with two processors and a configuration with a total memory of 512 GB for the measurements with one processor. Both memory configurations run at a memory frequency of 2400 MHz. Further information about memory performance can be found in the White Paper Memory performance of Xeon E5-2600 v4 (Broadwell-EP)-based systems.
The following diagram shows the OLTP-2 transaction rates that can be achieved with one and two processors of the Intel® Xeon® Processor E5-2600 v4 Product Family.
[Chart: OLTP-2 tps for one and two processors. Values are partly measured and partly calculated. HT: Hyper-Threading.]

Processor                1 CPU, 512 GB [tps]   2 CPUs, 1024 GB [tps]
E5-2623 v4 - 4C, HT      619.51                1126.26
E5-2637 v4 - 4C, HT      761.11                1383.69
E5-2603 v4 - 6C          466.86                848.74
E5-2643 v4 - 6C, HT      1107.96               2014.27
E5-2609 v4 - 8C          606.58                1102.77
E5-2620 v4 - 8C, HT      997.82                1814.04
E5-2667 v4 - 8C, HT      1378.42               2522.33
E5-2630L v4 - 10C, HT    1135.67               2064.65
E5-2630 v4 - 10C, HT     1282.21               2331.06
E5-2640 v4 - 10C, HT     1359.47               2471.52
E5-2650 v4 - 12C, HT     1569.89               2854.05
E5-2650L v4 - 14C, HT    1528.79               2779.65
E5-2660 v4 - 14C, HT     1726.05               3137.96
E5-2680 v4 - 14C, HT     1923.32               3496.59
E5-2690 v4 - 14C, HT     2120.58               3855.21
E5-2683 v4 - 16C, HT     1989.07               3616.13
E5-2697A v4 - 16C, HT    2350.72               4273.61
E5-2695 v4 - 18C, HT     2137.02               3885.10
E5-2697 v4 - 18C, HT     2432.91               4423.04
E5-2698 v4 - 20C, HT     2400.04               4363.27
E5-2699 v4 - 22C, HT     2597.30               4721.89
It is evident that a wide performance range is covered by the variety of released processors. If you compare the OLTP-2 value of the processor with the lowest performance (Xeon E5-2603 v4) with the value of the processor with the highest performance (Xeon E5-2699 v4), the result is a 5.6-fold increase in performance.
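The 5.6-fold figure can be reproduced from the OLTP-2 values shown in the diagram for these two processors. The following minimal sketch uses those four values; the dictionary layout and function names are illustrative.

```python
# OLTP-2 transaction rates (tps) from the diagram for the processor with the
# lowest and the processor with the highest performance.
OLTP2_TPS = {
    "E5-2603 v4": {"1 CPU": 466.86, "2 CPUs": 848.74},
    "E5-2699 v4": {"1 CPU": 2597.30, "2 CPUs": 4721.89},
}


def fold_increase(slow: str, fast: str, config: str) -> float:
    """Ratio of the OLTP-2 values of two processors in the same configuration."""
    return OLTP2_TPS[fast][config] / OLTP2_TPS[slow][config]


def socket_scaling(processor: str) -> float:
    """Ratio of the two-processor value to the one-processor value."""
    values = OLTP2_TPS[processor]
    return values["2 CPUs"] / values["1 CPU"]


if __name__ == "__main__":
    for config in ("1 CPU", "2 CPUs"):
        ratio = fold_increase("E5-2603 v4", "E5-2699 v4", config)
        print(f"{config}: {ratio:.2f}-fold")
    print(f"Scaling from 1 to 2 CPUs: {socket_scaling('E5-2699 v4'):.2f}")
```

The ratio is the same for the one-processor and the two-processor configuration, so the statement holds for both.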
The features of the processors are summarized in the section “Technical data”.
The relatively large performance differences between the processors can be explained by their features. The values scale on the basis of the number of cores, the size of the L3 cache and the CPU clock frequency and as a result of the features of Hyper-Threading and turbo mode, which are available in most processor types. Furthermore, the data transfer rate between processors (“QPI Speed”) also determines performance.
A low performance can be seen in the Xeon E5-2603 v4 and E5-2609 v4 processors, as they have to manage without Hyper-Threading (HT) and turbo mode (TM).
Within a group of processors with the same number of cores scaling can be seen via the CPU clock frequency.
If you compare the maximum achievable OLTP-2 values of the current system generation with the values that were achieved on the predecessor systems, the result is an increase of about 25%.
Current System TX2560 M2 RX2530 M2 RX2540 M2 RX2560 M2
Predecessor System TX2560 M1 RX2530 M1 RX2540 M1 RX2560 M1
[Chart: Maximum OLTP-2 tps, comparison of system generations. Predecessor system: 2 × E5-2699 v3, 512 GB, SQL 2014. Current system: 2 × E5-2699 v4, 1024 GB, SQL 2016. Increase: approx. +25%.]
TPC-E
Benchmark description
The TPC-E benchmark measures the performance of online transaction processing systems (OLTP) and is based on a complex database and a number of different transaction types that are carried out on it. TPC-E is not only a hardware-independent but also a software-independent benchmark and can thus be run on every test platform, i.e. proprietary or open. In addition to the results of the measurement, all the details of the systems measured and the measuring method must also be explained in a measurement report (Full Disclosure Report or FDR). Consequently, this ensures that the measurement meets all benchmark requirements and is reproducible. TPC-E does not just measure an individual server, but a rather extensive system configuration. Keys to performance in this respect are the database server, disk I/O and network communication.
The performance metric is tpsE, where tps means transactions per second. tpsE is the average number of Trade-Result-Transactions that are performed within a second. The TPC-E standard defines a result as the tpsE rate, the price per performance value (e.g. $/tpsE) and the availability date of the measured configuration.
Further information about TPC-E can be found in the overview document Benchmark Overview TPC-E.
Benchmark results
In March 2016 Fujitsu submitted a TPC-E benchmark result for the PRIMERGY RX2540 M2 with the 22-core processor Intel Xeon E5-2699 v4 and 1024 GB memory.
The result shows a performance increase of about 26% compared with the PRIMERGY RX2540 M1, together with a lower price per performance.
Some components may not be available in all countries / sales regions. More details about this TPC-E result, in particular the Full Disclosure Report, can be found via the TPC web page http://www.tpc.org/tpce/results/tpce_result_detail.asp?id=116032901.
FUJITSU Server PRIMERGY RX2540 M2
TPC-E 1.14.0 TPC Pricing 1.7.0
Report Date March 31, 2016
TPC-E Throughput 4,734.87 tpsE
Price/Performance $ 111.65 USD per tpsE
Availability Date July 31, 2016
Total System Cost $ 528,618 USD
Database Server Configuration
Operating System Microsoft Windows Server 2012 R2 Standard Edition
Database Manager Microsoft SQL Server 2016 Enterprise Edition
Processors/Cores/Threads 2/44/88
Memory 1024 GB
SUT
Tier A: PRIMERGY RX2530 M1, 2x Intel Xeon E5-2697 v3 2.60 GHz, 64 GB Memory, 2x 300 GB 15k rpm SAS Drive, 2x Onboard LAN 10 Gb/s, 1x Dual Port LAN 1 Gb/s, 1x SAS RAID controller
Tier B: PRIMERGY RX2540 M2, 2x Intel Xeon E5-2699 v4 2.20 GHz, 1024 GB Memory, 2x 300 GB 15k rpm SAS Drives, 4x 600 GB 15k rpm SAS Drives, 2x onboard LAN 10 Gb/s, 6x SAS RAID Controller
Storage: 1x PRIMECENTER Rack, 5x ETERNUS JX40 S2, 80x 400 GB SSD Drives
Initial Database Size 19,552 GB
Redundancy Level 1 RAID-5 data and RAID-10 log
Storage 80 x 400 GB SSD, 4 x 600 GB 15k rpm HDD
In March 2016, Fujitsu is represented with six results in the TPC-E list (without historical results).
System and Processors                           Throughput      Price / Performance   Availability Date
PRIMERGY RX300 S8 with 2 × Xeon E5-2697 v2      2472.58 tpsE    $135.14 per tpsE      September 10, 2013
PRIMEQUEST 2800E with 2 × Xeon E7-8890 v2       8582.52 tpsE    $205.43 per tpsE      May 1, 2014
PRIMERGY RX2540 M1 with 2 × Xeon E5-2699 v3     3772.08 tpsE    $130.44 per tpsE      December 1, 2014
PRIMERGY RX4770 M2 with 4 × Xeon E7-8890 v3     6904.53 tpsE    $126.49 per tpsE      June 1, 2015
PRIMEQUEST 2800E2 with 8 × Xeon E7-8890 v3      10058.28 tpsE   $187.53 per tpsE      November 11, 2015
PRIMERGY RX2540 M2 with 2 × Xeon E5-2699 v4     4734.87 tpsE    $111.65 per tpsE      July 31, 2016
See the TPC web site for more information and all the TPC-E results (including historical results) (http://www.tpc.org/tpce).
The following diagram for 2-socket PRIMERGY systems with different processor types shows the good performance of the 2-socket system PRIMERGY RX2540 M2.
System and Processors                           Throughput      Price / Performance   Availability Date
PRIMERGY RX300 S8 with 2 × Xeon E5-2697 v2      2472.58 tpsE    $135.14 per tpsE      September 10, 2013
PRIMERGY RX2540 M1 with 2 × Xeon E5-2699 v3     3772.08 tpsE    $130.44 per tpsE      December 1, 2014
PRIMERGY RX2540 M2 with 2 × Xeon E5-2699 v4     4734.87 tpsE    $111.65 per tpsE      July 31, 2016
In comparison with the PRIMERGY RX2540 M1 the increase in performance is +26% and in comparison with the PRIMERGY RX300 S8 +91%. The price per performance is $111.65/tpsE. Compared with the PRIMERGY RX2540 M1 the costs are reduced to 86% and with the PRIMERGY RX300 S8 to 83%.
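The percentages quoted above follow directly from the throughput and price/performance values of the three 2-socket results. The following minimal sketch recomputes them; the data layout and function names are illustrative.

```python
# TPC-E results of the three 2-socket PRIMERGY systems compared above:
# (throughput in tpsE, price/performance in USD per tpsE).
TPCE_RESULTS = {
    "RX300 S8": (2472.58, 135.14),
    "RX2540 M1": (3772.08, 130.44),
    "RX2540 M2": (4734.87, 111.65),
}


def performance_gain_percent(new: str, old: str) -> float:
    """Throughput increase of `new` over `old` in percent."""
    return (TPCE_RESULTS[new][0] / TPCE_RESULTS[old][0] - 1) * 100


def relative_cost_percent(new: str, old: str) -> float:
    """Price per performance of `new` as a percentage of that of `old`."""
    return TPCE_RESULTS[new][1] / TPCE_RESULTS[old][1] * 100


if __name__ == "__main__":
    for old in ("RX2540 M1", "RX300 S8"):
        gain = performance_gain_percent("RX2540 M2", old)
        cost = relative_cost_percent("RX2540 M2", old)
        print(f"vs {old}: +{gain:.1f}% throughput, {cost:.1f}% of the cost per tpsE")
```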
[Chart: tpsE (higher is better) and $ per tpsE (lower is better) for three 2-socket PRIMERGY systems. PRIMERGY RX300 S8, 2 × E5-2697 v2, 512 GB: 2,472.58 tpsE, $135.14 per tpsE. PRIMERGY RX2540 M1, 2 × E5-2699 v3, 512 GB: 3,772.08 tpsE, $130.44 per tpsE. PRIMERGY RX2540 M2, 2 × E5-2699 v4, 1024 GB: 4,734.87 tpsE, $111.65 per tpsE.]
The following overview, sorted according to price/performance, shows the best TPC-E price per performance ratios (as of March 31st, 2016, without historical results) and the corresponding TPC-E throughputs. PRIMERGY RX2540 M2 with a price per performance ratio of $111.65/tpsE achieved the best cost-effectiveness.
See the TPC web site for more information and all the TPC-E results (including historical results) (http://www.tpc.org/tpce).
System                        Processor type (processors/cores/threads)   tpsE (higher is better)   $/tpsE (lower is better)   availability date
Fujitsu PRIMERGY RX2540 M2    2 × Intel Xeon E5-2699 v4                   4,734.87                  111.65                     2016-07-31
Lenovo System x3650 M5        2 × Intel Xeon E5-2699 v4                   4,938.14                  117.91                     2016-07-31
Fujitsu PRIMERGY RX4770 M2    4 × Intel Xeon E7-8890 v3                   6,904.53                  126.49                     2015-06-01
Fujitsu PRIMERGY RX2540 M1    2 × Intel Xeon E5-2699 v3                   3,772.08                  130.44                     2014-12-01
Fujitsu PRIMERGY RX300 S8     2 × Intel Xeon E5-2697 v2                   2,472.58                  135.14                     2013-09-10
Lenovo System x3950 X6        8 × Intel Xeon E7-8890 v3                   11,058.99                 143.91                     2015-12-17
IBM System x3650 M4           2 × Intel Xeon E5-2697 v2                   2,590.93                  150.00                     2013-11-29
HP ProLiant DL385p Gen8       2 × AMD Opteron 6386SE                      1,416.37                  183.00                     2013-05-15
Fujitsu PRIMEQUEST 2800E2     8 × Intel Xeon E7-8890 v3                   10,058.28                 187.53                     2015-11-11
IBM System x3850 X6           4 × Intel Xeon E7-4890 v2                   5,576.27                  188.69                     2014-04-15
vServCon
Benchmark description
vServCon is a benchmark used by Fujitsu to compare server configurations with hypervisor with regard to their suitability for server consolidation. This allows both the comparison of systems, processors and I/O technologies as well as the comparison of hypervisors, virtualization forms and additional drivers for virtual machines.
vServCon is not a new benchmark in the true sense of the word. It is rather a framework that combines already established benchmarks, in some cases in modified form, as workloads in order to reproduce the load of a consolidated and virtualized server environment. Three proven benchmarks are used which cover the application scenarios database, application server and web server.
Each of the three application scenarios is allocated to a dedicated virtual machine (VM). Add to these a fourth machine, the so-called idle VM. These four VMs make up a “tile”. Depending on the performance capability of the underlying server hardware, you may as part of a measurement also have to start several identical tiles in parallel in order to achieve a maximum performance score.
Each of the three vServCon application scenarios provides a specific benchmark result in the form of application-specific transaction rates for the respective VM. In order to derive a normalized score, the individual benchmark results for one tile are put in relation to the respective results of a reference system. The resulting relative performance values are then suitably weighted and finally added up for all VMs and tiles. The outcome is a score for this tile number.
Starting as a rule with one tile, this procedure is performed for an increasing number of tiles until no further significant increase in this vServCon score occurs. The final vServCon score is then the maximum of the vServCon scores for all tile numbers. This score thus reflects the maximum total throughput that can be achieved by running the mix defined in vServCon that consists of numerous VMs up to the possible full utilization of CPU resources. This is why the measurement environment for vServCon measurements is designed in such a way that only the CPU is the limiting factor and that no limitations occur as a result of other resources.
The progression of the vServCon scores for the tile numbers provides useful information about the scaling behavior of the “System under Test”.
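The scoring procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the reference transaction rates, the measured rates and the equal weights are hypothetical, because this report does not list the actual reference system values or weighting factors.

```python
# Sketch of the vServCon scoring procedure described above.
# WEIGHTS, REFERENCE and the measured rates are hypothetical values.

WEIGHTS = {"database": 1 / 3, "java": 1 / 3, "web": 1 / 3}    # assumption
REFERENCE = {"database": 100.0, "java": 200.0, "web": 50.0}   # assumption


def tile_count_score(tiles):
    """Score for one tile count: weighted relative rates summed over all VMs and tiles."""
    return sum(
        WEIGHTS[app] * rate / REFERENCE[app]
        for tile in tiles
        for app, rate in tile.items()
    )


def final_score(measurements):
    """Final score: the maximum of the scores over all measured tile counts."""
    scores = {n: tile_count_score(tiles) for n, tiles in measurements.items()}
    best = max(scores, key=scores.get)
    return scores[best], best


# Hypothetical transaction rates per application VM; throughput per VM drops
# as more tiles share the CPU.
measurements = {
    1: [{"database": 250.0, "java": 500.0, "web": 125.0}],
    2: [{"database": 240.0, "java": 480.0, "web": 120.0}] * 2,
    3: [{"database": 150.0, "java": 300.0, "web": 75.0}] * 3,
}

score, tiles = final_score(measurements)
print(f"{score:.2f}@{tiles} tiles")
```

In this made-up example the score rises from one to two tiles and falls again with three tiles, so the two-tile value would be reported as the final score.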
A detailed description of vServCon is in the document: Benchmark Overview vServCon.
Application scenario      Benchmark                                 No. of logical CPU cores   Memory
Database                  Sysbench (adapted)                        2                          1.5 GB
Java application server   SPECjbb (adapted, with 50% - 60% load)    2                          2 GB
Web server                WebBench                                  1                          1.5 GB
[Figure: System Under Test running Tile 1, Tile 2, Tile 3, … Tile n; each tile consists of a Database VM, a Web VM, a Java VM and an Idle VM.]
Benchmark environment
The measurement set-up is symbolically illustrated below:
All results were determined by way of example on a PRIMERGY RX2560 M2.
System Under Test (SUT)
Hardware
Processor Intel® Xeon® Processor E5-2600 v4 Product Family
Memory 1 processor: 8 × 32GB (1x32GB) 4Rx4 DDR4-2400 R ECC
2 processors: 16 × 32GB (1x32GB) 4Rx4 DDR4-2400 R ECC
Network interface 1 × Emulex OneConnect OCe14000 Dual Port Adapter with 10Gb SFP+ DynamicLoM interface module
Disk subsystem 1 × dual-channel FC controller Emulex LPe16002
LINUX/LIO based flash storage system
Software
Operating system VMware ESXi 6.0.0 U1b Build 3380124
Load generator (incl. Framework controller)
Hardware (Shared)
Enclosure PRIMERGY BX900
Hardware
Model 18 × PRIMERGY BX920 S1 server blades
Processor 2 × Xeon X5570
Memory 12 GB
Network interface 3 × 1 Gbit/s LAN
Software
Operating system Microsoft Windows Server 2003 R2 Enterprise with Hyper-V
[Figure: measurement set-up. The load generators and the framework controller are connected via multiple 1Gb or 10Gb networks to the System Under Test (SUT), which consists of the server and the disk subsystem.]
Load generator VM (per tile 3 load generator VMs on various server blades)
Hardware
Processor 1 × logical CPU
Memory 512 MB
Network interface 2 × 1 Gbit/s LAN
Software
Operating system Microsoft Windows Server 2003 R2 Enterprise Edition
Some components may not be available in all countries or sales regions.
Benchmark results
The PRIMERGY dual-socket rack and tower systems dealt with here are based on processors of the Intel® Xeon® Processor E5-2600 v4 Product Family. The features of the processors are summarized in the section “Technical data”.
The available processors of these systems with their results can be seen in the following table.
Intel® Xeon® Processor E5 v4 Product Family

Cores / features                          Processor      Score   #Tiles
4 Cores, Hyper-Threading, turbo mode      E5-2623 v4     7.28    4
                                          E5-2637 v4     9.03    4
6 Cores                                   E5-2603 v4     6.28    6
6 Cores, Hyper-Threading, turbo mode      E5-2643 v4     13.6    6
8 Cores                                   E5-2609 v4     8.39    8
8 Cores, Hyper-Threading, turbo mode      E5-2620 v4     12.9    8
                                          E5-2667 v4     17.9    8
10 Cores, Hyper-Threading, turbo mode     E5-2630L v4    14.6    10
                                          E5-2630 v4     16.6    10
                                          E5-2640 v4     17.6    10
12 Cores, Hyper-Threading, turbo mode     E5-2650 v4     20.1    12
14 Cores, Hyper-Threading, turbo mode     E5-2650L v4    19.9    13
                                          E5-2660 v4     22.9    14
                                          E5-2680 v4     25.8    14
                                          E5-2690 v4     27.2    14
16 Cores, Hyper-Threading, turbo mode     E5-2683 v4     27.1    16
                                          E5-2697A v4    30.3    16
18 Cores, Hyper-Threading, turbo mode     E5-2695 v4     30.3    18
                                          E5-2697 v4     31.8    18
20 Cores, Hyper-Threading, turbo mode     E5-2698 v4     34.7    20
22 Cores, Hyper-Threading, turbo mode     E5-2699 v4     38.7    22
Thanks to the progress made in processor technology, these PRIMERGY dual-socket rack and tower systems are very well suited to application virtualization. Compared with a system based on the previous processor generation, approximately 28% higher virtualization performance can be achieved (measured as vServCon score in the respective maximum configuration).
The first diagram compares the virtualization performance values that can be achieved with the processors reviewed here.
The relatively large performance differences between the processors can be explained by their features. The values scale on the basis of the number of cores, the size of the L3 cache and the CPU clock frequency and as a result of the features of Hyper-Threading and turbo mode, which are available in most processor types. Furthermore, the data transfer rate between processors (“QPI Speed”) also determines performance.
The Xeon E5-2603 v4 and E5-2609 v4 processors show low performance because they have to manage without Hyper-Threading (HT) and turbo mode (TM). These weakest processors are therefore suitable for the virtualization environment only to a limited extent.
Within a group of processors with the same number of cores scaling can be seen via the CPU clock frequency.
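As a small worked example of this scaling, the sketch below relates the scores of the three 8-core processors to their nominal clock frequencies. The scores are taken from the results table above and the frequencies from the STREAM results table later in this report; the helper name `relative_to_first` is only illustrative.

```python
# (processor, nominal frequency in GHz, final vServCon score) for the 8-core group.
eight_core = [
    ("E5-2609 v4", 1.70, 8.39),   # no Hyper-Threading, no turbo mode
    ("E5-2620 v4", 2.10, 12.9),
    ("E5-2667 v4", 3.20, 17.9),
]


def relative_to_first(rows):
    """Express frequency and score of every row relative to the first row."""
    _, base_freq, base_score = rows[0]
    return [(name, freq / base_freq, score / base_score) for name, freq, score in rows]


for name, freq_ratio, score_ratio in relative_to_first(eight_core):
    print(f"{name}: frequency x{freq_ratio:.2f}, score x{score_ratio:.2f}")
```

The score grows with the clock frequency within the group. The step from the E5-2609 v4 to the E5-2620 v4 is larger than the frequency ratio alone, which is consistent with the missing Hyper-Threading and turbo mode on the E5-2609 v4.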
As a matter of principle, the memory access speed also influences performance. A guideline in the virtualization environment for selecting main memory is that sufficient quantity is more important than the speed of the memory accesses. The vServCon scaling measurements presented here were all performed with a memory access speed – depending on the processor type – of at most 2400 MHz. More information about the topic “Memory Performance” and QPI architecture can be found in the White Paper Memory performance of Xeon E5-2600 v4 (Broadwell-EP)-based systems.
[Figure: Final vServCon score and #Tiles for each processor of the Intel® Xeon® Processor E5-2600 v4 Product Family. Horizontal axis: processor with number of cores; vertical axis: final vServCon score (0 to 40); #Tiles per processor: 4 4 6 6 8 8 8 10 10 10 12 13 14 14 14 16 16 18 18 20 22.]
Until now we have looked at the virtualization performance of a fully configured system. However, with a server with two sockets the question also arises as to how good performance scaling is from one to two processors. The better the scaling, the lower the overhead usually caused by the shared use of resources within a server. The scaling factor also depends on the application. If the server is used as a virtualization platform for server consolidation, the system scales with a factor of 1.94. When operated with two processors, the system thus achieves a significantly better performance than with one processor, as is illustrated in the diagram opposite using the processor version Xeon E5-2699 v4 as an example.
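The scaling factor quoted above follows directly from the two final scores shown in the diagram (19.9 with one Xeon E5-2699 v4 and 38.7 with two). The following lines are a minimal check; the function names and the "efficiency" figure are illustrative additions, not terms used by the benchmark.

```python
def scaling_factor(score_one_cpu, score_two_cpus):
    """Ratio of the two-processor score to the one-processor score."""
    return score_two_cpus / score_one_cpu


def scaling_efficiency(score_one_cpu, score_two_cpus, processors=2):
    """Share of ideal (linear) scaling that is actually reached."""
    return score_two_cpus / (processors * score_one_cpu)


factor = scaling_factor(19.9, 38.7)
efficiency = scaling_efficiency(19.9, 38.7)
print(f"scaling factor: {factor:.2f}")
print(f"share of linear scaling: {efficiency:.0%}")
```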
The next diagram illustrates the virtualization performance for increasing numbers of VMs based on the Xeon E5-2620 v4 (8 core) and E5-2660 v4 (14 core) processors.
In addition to the increased number of physical cores, Hyper-Threading, which is supported by almost all processors of the Intel® Xeon® Processor E5-2600 v4 Product Family, is an additional reason for the high number of VMs that can be operated. With Hyper-Threading a physical processor core is divided into two logical cores, so that the number of cores available to the hypervisor is doubled. This standard feature thus generally increases the virtualization performance of a system.
The previous diagram examined the total performance of all application VMs of a host. However, studying the performance from an individual application VM viewpoint is also interesting. This information is in the previous diagram. For example, the total optimum is reached in the above Xeon E5-2620 v4 situation with 24 application VMs (eight tiles, not including the idle VMs); the low load case is represented by three application VMs (one tile, not including the idle VM). Remember: the vServCon score for one tile is an average value across the three application scenarios in vServCon. This average performance of one tile drops when changing from the low load case to the total optimum of the vServCon score - from 2.57 to 12.9/8=1.61, i.e. to 63%. The individual types of application VMs can react very differently in the high load situation. It is thus clear that in a specific situation the performance requirements of an individual application must be balanced against the overall requirements regarding the numbers of VMs on a virtualization host.
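The drop in the average performance per tile can be reproduced from the Xeon E5-2620 v4 series of the diagram. The sketch below uses the scores for one to eight tiles as read from that diagram; the variable names are illustrative.

```python
# vServCon scores of the Xeon E5-2620 v4 for 1 to 8 tiles (from the diagram).
scores_2620 = [2.57, 5.24, 7.78, 9.40, 10.6, 11.8, 12.6, 12.9]

# Average score of one tile at each tile count.
per_tile = [score / tiles for tiles, score in enumerate(scores_2620, start=1)]

low_load = per_tile[0]     # one tile, three application VMs
optimum = per_tile[-1]     # eight tiles, 24 application VMs
remaining = optimum / low_load

print(f"per tile at low load: {low_load:.2f}")
print(f"per tile at total optimum: {optimum:.2f}")
print(f"remaining share: {remaining:.0%}")
```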
[Figure: vServCon score for an increasing number of tiles, Xeon E5-2620 v4 and Xeon E5-2660 v4; vertical axis: vServCon score (0 to 25).]
#Tiles:      1     2     3     4     5     6     7     8     9     10    11    12    13    14
E5-2620 v4:  2.57  5.24  7.78  9.40  10.6  11.8  12.6  12.9
E5-2660 v4:  2.69  5.40  8.13  10.6  12.9  15.0  16.4  18.1  19.1  20.4  21.3  22.2  22.5  22.9
[Figure: Final vServCon score with one and two processors; 1 × E5-2699 v4: 19.9@12 tiles, 2 × E5-2699 v4: 38.7@22 tiles, factor × 1.94; vertical axis: final vServCon score (0 to 40).]
The virtualization-relevant progress in processor technology since 2008 has an effect on the one hand on an individual VM and, on the other hand, on the possible maximum number of VMs up to CPU full utilization. The following comparison shows the proportions for both types of improvements.
Seven systems with similar housing construction are compared: a system from 2008, a system from 2009, a system from 2011, a system from 2012, a system from 2013, a system from 2014 and a current system with the best processors each (see table below) for few VMs and for highest maximum performance.
The clearest performance improvements arose from 2008 to 2009 with the introduction of the Xeon 5500 processor generation (e.g. via the feature “Extended Page Tables” (EPT) [1]). One sees an increase of the vServCon score by a factor of 1.28 with a few VMs (one tile).
[1] EPT accelerates memory virtualization via hardware support for the mapping between host and guest memory addresses.
2008       2009       2011       2012       2013       2014/2015   2016
RX200 S4   RX200 S5   RX200 S6   RX200 S7   RX200 S8   RX2530 M1   RX2530 M2
RX300 S4   RX300 S5   RX300 S6   RX300 S7   RX300 S8   RX2540 M1   RX2540 M2
-          -          TX300 S6   RX350 S7   RX350 S8   RX2560 M1   RX2560 M2
TX300 S4   TX300 S5   TX300 S6   TX300 S7   TX300 S8   TX2560 M1   TX2560 M2
        Best Performance Few VMs          Best Maximum Performance
Year    Processor     vServCon Score      Processor     vServCon Score
                      1 Tile                            max.
2008    X5460         1.91                X5460         2.94@2 tiles
2009    X5570         2.45                X5570         6.08@6 tiles
2011    X5690         2.63                X5690         9.61@9 tiles
2012    E5-2643       2.73                E5-2690       13.5@8 tiles
2013    E5-2667 v2    2.85                E5-2697 v2    17.1@11 tiles
2014    E5-2643 v3    3.22                E5-2699 v3    30.3@18 tiles
2016    E5-2637 v4    3.29                E5-2699 v4    38.7@22 tiles
[Figure: Virtualization relevant improvements, few VMs (1 tile). Horizontal axis: year, CPU, frequency and number of cores; vertical axis: vServCon score (0 to 10). Factors between successive systems: × 1.28, × 1.07, × 1.04, × 1.04, × 1.13, × 1.02.]
With full utilization of the systems with VMs there was an increase by a factor of 2.07 from 2008 to 2009. One reason was the performance increase that could be achieved for an individual VM (see the score for a few VMs). The other reason was that more VMs were possible at the total optimum (via Hyper-Threading). However, it can be seen that the optimum was “bought” with three times as many VMs (six tiles instead of two), each running with reduced performance.
Where exactly is the technology progress between 2009 and 2016?
The performance for an individual VM in low-load situations has only slightly increased for the processors compared here with the highest clock frequency per core. We must explicitly point out that the increased virtualization performance as seen in the score cannot be completely deemed as an improvement for one individual VM.
The decisive progress is in the higher number of physical cores and – associated with it – in the increased values of maximum performance (factor 1.58, 1.40, 1.27, 1.77 and 1.28 in the diagram).
Up to and including 2011 the best processor type of a processor generation had both the highest clock frequency and the highest number of cores. From 2012 there have been differently optimized processors on offer: Versions with a high clock frequency per core for few cores and versions with a high number of cores, but with a lower clock frequency per core. The features of the processors are summarized in the section “Technical data”.
Performance increases in the virtualization environment since 2009 are mainly achieved by increased VM numbers due to the increased number of available logical or physical cores. However, since 2012 it has been possible - depending on the application scenario in the virtualization environment – to also select a CPU with an optimized clock frequency if a few or individual VMs require maximum computing power.
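The factors quoted in this comparison can be recomputed from the scores in the table of best processors per year. The following sketch does this for both series; the dictionary and function names are illustrative.

```python
# vServCon scores per year from the comparison table above.
few_vms = {2008: 1.91, 2009: 2.45, 2011: 2.63, 2012: 2.73,
           2013: 2.85, 2014: 3.22, 2016: 3.29}
maximum = {2008: 2.94, 2009: 6.08, 2011: 9.61, 2012: 13.5,
           2013: 17.1, 2014: 30.3, 2016: 38.7}


def factors(series):
    """Improvement factor between each pair of successive systems."""
    years = sorted(series)
    return {
        f"{earlier}->{later}": round(series[later] / series[earlier], 2)
        for earlier, later in zip(years, years[1:])
    }


print("few VMs (1 tile):", factors(few_vms))
print("maximum score:   ", factors(maximum))

# Overall change from 2009 to 2016 for both views.
print(f"few VMs 2009->2016: x{few_vms[2016] / few_vms[2009]:.2f}")
print(f"maximum 2009->2016: x{maximum[2016] / maximum[2009]:.2f}")
```

The comparison of the two overall factors illustrates the statement above: since 2009 the gain comes mainly from the larger number of VMs rather than from the performance of an individual VM.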
[Figure: Virtualization relevant improvements, score at optimum tile count. Horizontal axis: year, CPU, frequency and number of cores; vertical axis: vServCon score (0 to 40). Factors between successive systems: × 2.07, × 1.58, × 1.40, × 1.27, × 1.77, × 1.28.]
VMmark V2
Benchmark description
VMmark V2 is a benchmark developed by VMware to compare server configurations with hypervisor solutions from VMware regarding their suitability for server consolidation. In addition to the software for load generation, the benchmark consists of a defined load profile and binding regulations. The benchmark results can be submitted to VMware and are published on their Internet site after a successful review process. After the discontinuation of the proven benchmark “VMmark V1” in October 2010, it was succeeded by “VMmark V2”, which requires a cluster of at least two servers and covers data center functions such as cloning and deployment of virtual machines (VMs), load balancing, and the moving of VMs with vMotion and Storage vMotion.
In addition to the “Performance Only” result, it is also possible from version 2.5 of VMmark to alternatively measure the electrical power consumption and publish it as a “Performance with Server Power” result (power consumption of server systems only) and/or “Performance with Server and Storage Power” result (power consumption of server systems and all storage components).
VMmark V2 is not a new benchmark in the actual sense. It is in fact a framework that consolidates already established benchmarks as workloads in order to simulate the load of a virtualized consolidated server environment. Three proven benchmarks, which cover the application scenarios mail server, Web 2.0 and e-commerce, were integrated in VMmark V2.
Each of the three application scenarios is assigned to a total of seven dedicated virtual machines. Then add to these an eighth VM called the “standby server”. These eight VMs form a “tile”. Because of the performance capability of the underlying server hardware, it is usually necessary to have started several identical tiles in parallel as part of a measurement in order to achieve a maximum overall performance.
A new feature of VMmark V2 is an infrastructure component, which is present once for every two hosts. It measures the efficiency levels of data center consolidation through VM Cloning and Deployment, vMotion and Storage vMotion. The Load Balancing capacity of the data center is also used (DRS, Distributed Resource Scheduler).
The result of VMmark V2 for test type “Performance Only” is a number, known as a “score”, which provides information about the performance of the measured virtualization solution. The score reflects the maximum total consolidation benefit of all VMs for a server configuration with hypervisor and is used as a comparison criterion of various hardware platforms.
This score is determined from the individual results of the VMs and an infrastructure result. Each of the five VMmark V2 application or front-end VMs provides a specific benchmark result in the form of application-specific transaction rates for each VM. In order to derive a normalized score the individual benchmark results for one tile are put in relation to the respective results of a reference system. The resulting dimensionless performance values are then averaged geometrically and finally added up for all VMs. This value is included in the overall score with a weighting of 80%. The infrastructure workload is only present in the benchmark once for every two hosts; it determines 20% of the result. The number of transactions per hour and the average duration in seconds respectively are determined for the score of the infrastructure workload components.
In addition to the actual score, the number of VMmark V2 tiles is always specified with each VMmark V2 score. The result is thus as follows: “Score@Number of Tiles”, for example “4.20@5 tiles”.
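The score calculation described above (geometric mean of the normalized results per tile, summation over the tiles, 80% application weighting and 20% infrastructure weighting) can be sketched as follows. This is an illustration only and not VMware's implementation: all input values are hypothetical, and it is an assumption here that the infrastructure result enters as a single normalized number.

```python
import math


def tile_score(normalized_results):
    """Geometric mean of the normalized (dimensionless) results of one tile."""
    return math.prod(normalized_results) ** (1 / len(normalized_results))


def overall_score(tiles, infrastructure_score):
    """Application part (sum over tiles) weighted 80%, infrastructure part 20%."""
    application_score = sum(tile_score(tile) for tile in tiles)
    return 0.8 * application_score + 0.2 * infrastructure_score


# Hypothetical normalized results of the five application/front-end VMs per tile.
tiles = [[2.0, 0.5, 1.0, 1.0, 1.0]] * 4
infrastructure = 1.0   # hypothetical normalized infrastructure result

score = overall_score(tiles, infrastructure)
print(f"{score:.2f}@{len(tiles)} tiles")
```

The output follows the “Score@Number of Tiles” notation used for VMmark V2 results.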
In the case of the two test types “Performance with Server Power” and “Performance with Server and Storage Power” a so-called “Server PPKW Score” and “Server and Storage PPKW Score” is determined, which is the performance score divided by the average power consumption in kilowatts (PPKW = performance per kilowatt (KW)).
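As a worked example of the PPKW definition, the following sketch divides a performance score by the average power consumption in kilowatts. The input values are hypothetical and do not correspond to any published result.

```python
def ppkw_score(performance_score, average_power_watts):
    """Performance per kilowatt: score divided by the average power in kW."""
    return performance_score / (average_power_watts / 1000.0)


# Hypothetical example: a performance score of 30.0 at an average of 800 W.
example = ppkw_score(30.0, 800.0)
print(f"{example:.4f}")
```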
The results of the three test types should not be compared with each other.
A detailed description of VMmark V2 is available in the document Benchmark Overview VMmark V2.
Application scenario   Load tool            # VMs
Mail server            LoadGen              1
Web 2.0                Olio client          2
E-commerce             DVD Store 2 client   4
Standby server         (IdleVMTest)         1
Benchmark environment
The measurement set-up is symbolically illustrated below:
System Under Test (SUT)
Hardware
Number of servers 2
Model PRIMERGY RX2540 M2
Processor 2 × Xeon E5-2699 v4
Memory 512 GB: 16 × 32GB (1x32GB) 2Rx4 DDR4-2400 R ECC
Network interface 1 × Emulex OneConnect OCe14000 Dual Port Adapter with 10Gb SFP+ DynamicLoM interface module
1 × Intel I350-T2 Dual Port 1GbE Adapter
Disk subsystem 1 × Dual port PFC EP LPe16002
2 × PRIMERGY RX300 S8 configured as Fibre Channel target:
9/8 × SAS-SSD (400 GB), 2 × Fusion-io ioDrive®2 PCIe-SSD (1.2 TB)
RAID 0 with several LUNs, total: 9.31 TB
Software
BIOS Version V5.0.0.11 R1.3.0
BIOS settings See details
Operating system VMware ESXi 6.0.0 U1b Build 3380124
Operating system settings
ESX settings: see details
Details
See disclosure http://www.vmware.com/a/assets/vmmark/pdf/2016-03-31-Fujitsu-RX2540M2.pdf http://www.vmware.com/a/assets/vmmark/pdf/2016-03-31-Fujitsu-RX2540M2-serverPPKW.pdf
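[Figure: measurement set-up. Clients & Management (load generators incl. Prime Client and Datacenter Management Server) are connected via multiple 1Gb or 10Gb networks to the System under Test (SUT), which consists of the server(s) and the storage system; a vMotion network links the servers.]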
Datacenter Management Server (DMS)
Hardware (Shared)
Enclosure PRIMERGY BX600
Network Switch 1 × PRIMERGY BX600 GbE Switch Blade 30/12
Hardware
Model 1 × server blade PRIMERGY BX620 S5
Processor 2 × Xeon X5570
Memory 24 GB
Network interface 6 × 1 Gbit/s LAN
Software
Operating system VMware ESXi 5.1.0 Build 799733
Datacenter Management Server (DMS) VM
Hardware
Processor 4 × logical CPU
Memory 10 GB
Network interface 2 × 1 Gbit/s LAN
Software
Operating system Microsoft Windows Server 2008 R2 Enterprise x64 Edition
Prime Client
Hardware (Shared)
Enclosure PRIMERGY BX600
Network Switch 1 × PRIMERGY BX600 GbE Switch Blade 30/12
Hardware
Model 1 × server blade PRIMERGY BX620 S5
Processor 2 × Xeon X5570
Memory 12 GB
Network interface 6 × 1 Gbit/s LAN
Software
Operating system Microsoft Windows Server 2008 Enterprise x64 Edition SP2
Load generator
Hardware
Model 2 × PRIMERGY RX600 S6
Processor 4 × Xeon E7-4870
Memory 512 GB
Network interface 5 × 1 Gbit/s LAN
Software
Operating system VMware ESX 4.1.0 U2 Build 502767
Load generator VM (per tile 1 load generator VM)
Hardware
Processor 4 × logical CPU
Memory 4 GB
Network interface 1 × 1 Gbit/s LAN
Software
Operating system Microsoft Windows Server 2008 Enterprise x64 Edition SP2
Some components may not be available in all countries or sales regions.
Benchmark results
“Performance Only” measurement result (March 31st, 2016)
On March 31, 2016, Fujitsu achieved a VMmark V2 score of “34.74@28 tiles” with a PRIMERGY RX2540 M2 with Xeon E5-2699 v4 processors and VMware ESXi 6.0 U1b, in a system configuration with a total of 2 × 44 processor cores and two identical servers in the “System under Test” (SUT). With this result the PRIMERGY RX2540 M2 is the most powerful 2-socket server in a “matched pair” configuration consisting of two identical hosts in the official VMmark V2 “Performance Only” ranking (valid as of benchmark results publication date).
All comparisons for the competitor products reflect the status of 31st March 2016. The current VMmark V2 “Performance Only” results as well as the detailed results and configuration data are available at http://www.vmware.com/a/vmmark/.
The diagram shows the “Performance Only” result of the PRIMERGY RX2540 M2 in comparison with the best 2-socket systems in a “matched pair” configuration.
The processors used, which with a good hypervisor setting could make optimal use of their processor features, were the essential prerequisites for achieving the PRIMERGY RX2540 M2 result. These features include Hyper-Threading. All this has a particularly positive effect during virtualization.
All VMs, their application data, the host operating system as well as additionally required data were on a powerful Fibre Channel disk subsystem. As far as possible, the configuration of the disk subsystem takes the specific requirements of the benchmark into account. The use of flash technology in the form of SAS SSDs and PCIe-SSDs in the powerful Fibre Channel disk subsystem resulted in further advantages in response times of the storage medium used.
The network connection to the load generators was implemented via 10Gb LAN ports. The infrastructure-workload connection between the hosts was by means of 1Gb LAN ports.
All the components used were optimally attuned to each other.
2-socket systems, “matched pair”    VMmark V2 Score   Difference
Fujitsu PRIMERGY RX2540 M2          34.74@28 tiles
HPE ProLiant ML350 Gen9             34.09@28 tiles    1.91%
Huawei FusionServer RH2288H V3      32.46@28 tiles    7.02%
Fujitsu PRIMERGY RX2540 M1          27.66@22 tiles    25.60%
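The “Difference” column expresses by how much the PRIMERGY RX2540 M2 score exceeds each of the other scores. The short sketch below recomputes it from the table values; the names are illustrative.

```python
# VMmark V2 "Performance Only" scores from the table above.
best_score = 34.74   # Fujitsu PRIMERGY RX2540 M2
other_scores = {
    "HPE ProLiant ML350 Gen9": 34.09,
    "Huawei FusionServer RH2288H V3": 32.46,
    "Fujitsu PRIMERGY RX2540 M1": 27.66,
}


def lead_in_percent(best, other):
    """Relative advantage of the best score over another score, in percent."""
    return (best / other - 1.0) * 100.0


differences = {name: lead_in_percent(best_score, s) for name, s in other_scores.items()}
for name, diff in differences.items():
    print(f"{name}: {diff:.2f}%")
```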
[Figure: VMmark V2 score, Performance Only, 2-socket, “matched pair”. Bars for 2 × Fujitsu PRIMERGY RX2540 M2 (2 × 2 × Xeon E5-2699 v4), 2 × Hewlett Packard Enterprise ProLiant ML350 Gen9 (2 × 2 × Xeon E5-2699 v4), 2 × Huawei FusionServer RH2288H V3 (2 × 2 × Xeon E5-2699 v4) and 2 × Fujitsu PRIMERGY RX2540 M1 (2 × 2 × Xeon E5-2699 v3); vertical axis: VMmark V2 score (0 to 40).]
“Performance with Server Power” measurement result (March 31st, 2016)
On March 31, 2016, Fujitsu achieved a VMmark V2 “Server PPKW Score” of “38.3065@28 tiles” with a PRIMERGY RX2540 M2 with Xeon E5-2699 v4 processors and VMware ESXi 6.0 U1b, in a system configuration with a total of 2 × 44 processor cores and two identical servers in the “System under Test” (SUT). With this result the PRIMERGY RX2540 M2 is the most energy-efficient virtualization server worldwide in the official VMmark V2 “Performance with Server Power” ranking (valid as of benchmark results publication date).
All comparisons for the competitor products reflect the status of 31st March 2016. The current VMmark V2 “Performance with Server Power” results as well as the detailed results and configuration data are available at http://www.vmware.com/a/vmmark/2/.
The diagram shows all VMmark V2 “Performance with Server Power” results.
[Figure: VMmark V2 Server PPKW Score, Performance with Server Power; vertical axis: 0 to 45. The leading result is marked +51.8% relative to the second-best result.]
System                             Processors                 Server PPKW Score
2 × Fujitsu PRIMERGY RX2540 M2     2 × 2 × Xeon E5-2699 v4    38.3065@28 tiles
2 × Fujitsu PRIMERGY RX2540 M1     2 × 2 × Xeon E5-2699 v3    25.2305@22 tiles
2 × Fujitsu PRIMERGY RX2530 M1     2 × 2 × Xeon E5-2699 v3    23.6493@22 tiles
2 × Fujitsu PRIMERGY RX4770 M2     2 × 4 × Xeon E7-8890 v3    22.8998@40 tiles
2 × Fujitsu PRIMEQUEST 2800E2      2 × 4 × Xeon E7-8890 v3    20.0410@40 tiles
2 × HP ProLiant DL380 Gen9         2 × 2 × Xeon E5-2699 v3    17.6899@20 tiles
STREAM
Benchmark description
STREAM is a synthetic benchmark that has been used for many years to determine memory throughput and which was developed by John McCalpin during his professorship at the University of Delaware. Today STREAM is supported at the University of Virginia, where the source code can be downloaded in either Fortran or C. STREAM continues to play an important role in the HPC environment in particular. It is for example an integral part of the HPC Challenge benchmark suite.
The benchmark is designed in such a way that it can be used both on PCs and on server systems. The unit of measurement of the benchmark is GB/s, i.e. the number of gigabytes that can be read and written per second.
STREAM measures the memory throughput for sequential accesses. These can generally be performed more efficiently than accesses that are randomly distributed on the memory, because the processor caches are used for sequential access.
Before execution the source code is adapted to the environment to be measured. In doing so, the size of the data area must be at least 12 times larger than the total of all last-level processor caches so that these have as little influence as possible on the result. The OpenMP program library is used to execute selected parts of the program in parallel during the runtime of the benchmark, which distributes the load optimally across the available processor cores.
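The sizing rule can be turned into a small calculation. The sketch below is an assumption-laden illustration: it uses 55 MB of last-level cache per processor (the value of the Xeon E5-2699 v4, taken here as an example), counts 1 MB as 2^20 bytes, and assumes that the rule applies to each of the arrays a, b and c. In the C version of STREAM the array length is set at compile time via `STREAM_ARRAY_SIZE`.

```python
def min_array_elements(llc_bytes_per_cpu, processors, factor=12, element_bytes=8):
    """Smallest element count whose size is `factor` times the total last-level cache."""
    required_bytes = factor * llc_bytes_per_cpu * processors
    return -(-required_bytes // element_bytes)   # ceiling division


llc_per_cpu = 55 * 2**20   # assumption: 55 MB L3 cache per processor
elements = min_array_elements(llc_per_cpu, processors=2)
print(f"at least {elements:,} elements of 8 bytes per array")
print(f"= {elements * 8 / 2**30:.2f} GiB per array")
```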
During execution the defined data area, consisting of 8-byte elements, is processed successively with four types of operation, some of which also perform arithmetic calculations.
Type    Execution               Bytes per step   Floating-point calculation per step
COPY    a(i) = b(i)             16               0
SCALE   a(i) = q × b(i)         16               1
SUM     a(i) = b(i) + c(i)      24               1
TRIAD   a(i) = b(i) + q × c(i)  24               2
The throughput is output in GB/s for each type of calculation. The differences between the various values are usually only minor on modern systems. In general, only the determined TRIAD value is used as a comparison.
The measured results primarily depend on the clock frequency of the memory modules; the processors influence the arithmetic calculations.
This chapter specifies throughputs on a basis of 10 (1 GB/s = 10^9 Byte/s).
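To illustrate how a throughput value results from the bytes per step in the table above, here is a minimal pure-Python sketch of the TRIAD operation. It is not the STREAM benchmark itself (which is written in C or Fortran and parallelized with OpenMP), and the absolute numbers it prints say nothing about memory performance; it only shows the bookkeeping.

```python
import time
from array import array

# Bytes moved per step for each operation type (8-byte elements).
BYTES_PER_STEP = {"COPY": 16, "SCALE": 16, "SUM": 24, "TRIAD": 24}


def triad(a, b, c, q):
    """TRIAD: a(i) = b(i) + q * c(i)."""
    for i in range(len(a)):
        a[i] = b[i] + q * c[i]


def throughput_gb_per_s(kind, elements, seconds):
    """Throughput on a basis of 10: 1 GB/s = 10^9 byte/s."""
    return BYTES_PER_STEP[kind] * elements / seconds / 1e9


n = 200_000                      # far too small for a real measurement
a = array("d", [0.0]) * n        # typecode "d": 8-byte floating-point elements
b = array("d", [1.0]) * n
c = array("d", [2.0]) * n

start = time.perf_counter()
triad(a, b, c, 3.0)
elapsed = time.perf_counter() - start

print(f"TRIAD: {throughput_gb_per_s('TRIAD', n, elapsed):.3f} GB/s (interpreter-bound)")
```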
Benchmark environment
System Under Test (SUT)
Hardware
Model PRIMERGY RX2540 M2
Processor 2 processors of Intel® Xeon® Processor E5-2600 v4 Product Family
Memory 16 × 16GB (1x16GB) 2Rx4 DDR4-2400 R ECC
Software
BIOS settings Energy Performance = Performance
Utilization Profile = Unbalanced
Package C State limit = C0
COD Enable = Disabled
Early Snoop = Disabled
Home Snoop Dir OSB = Enabled
All processors except Xeon E5-2603 v4 and E5-2609 v4: Hyper-Threading = Disabled
Operating system SUSE Linux Enterprise Server 12 SP1 (x86_64)
Operating system settings
Transparent Huge Pages inactivated
Compiler Intel C++ Composer XE 2016 for Linux
Benchmark STREAM Version 5.10
Some components may not be available in all countries or sales regions.
Benchmark results
Processor | Memory Frequency [MHz] | Max. Memory Bandwidth [GB/s] | Cores | Processor Frequency [GHz] | Number of Processors | TRIAD [GB/s]
Xeon E5-2603 v4 1866 59.7 6 1.70 2 89.3
Xeon E5-2609 v4 1866 59.7 8 1.70 2 99.3
Xeon E5-2623 v4 2133 68.3 4 2.60 2 71.2
Xeon E5-2620 v4 2133 68.3 8 2.10 2 108
Xeon E5-2630L v4 2133 68.3 10 1.80 2 109
Xeon E5-2630 v4 2133 68.3 10 2.20 2 109
Xeon E5-2640 v4 2133 68.3 10 2.40 2 110
Xeon E5-2637 v4 2400 76.8 4 3.50 2 101
Xeon E5-2643 v4 2400 76.8 6 3.40 2 118
Xeon E5-2667 v4 2400 76.8 8 3.20 2 120
Xeon E5-2650 v4 2400 76.8 12 2.20 2 130
Xeon E5-2650L v4 2400 76.8 14 1.70 2 133
Xeon E5-2660 v4 2400 76.8 14 2.00 2 133
Xeon E5-2680 v4 2400 76.8 14 2.40 2 133
Xeon E5-2690 v4 2400 76.8 14 2.60 2 132
Xeon E5-2683 v4 2400 76.8 16 2.10 2 133
Xeon E5-2697A v4 2400 76.8 16 2.60 2 133
Xeon E5-2695 v4 2400 76.8 18 2.10 2 132
Xeon E5-2697 v4 2400 76.8 18 2.30 2 132
Xeon E5-2698 v4 2400 76.8 20 2.20 2 133
Xeon E5-2699 v4 2400 76.8 22 2.20 2 132
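The “Max. Memory Bandwidth” column can be reproduced from the memory frequency. The sketch below assumes four memory channels per processor and 8 bytes per transfer; with these assumptions the table values result per processor, so the measured TRIAD value of a two-processor system is compared with twice that figure. The function names are illustrative.

```python
def max_bandwidth_gb_per_s(memory_mhz, channels=4, bytes_per_transfer=8):
    """Theoretical peak per processor (assumed: 4 channels, 8 bytes per transfer)."""
    return memory_mhz * 1e6 * bytes_per_transfer * channels / 1e9


def triad_share(triad_gb_per_s, memory_mhz, processors=2):
    """Measured TRIAD throughput as a share of the theoretical peak of all processors."""
    return triad_gb_per_s / (processors * max_bandwidth_gb_per_s(memory_mhz))


for mhz in (1866, 2133, 2400):
    print(f"{mhz} MHz: {max_bandwidth_gb_per_s(mhz):.1f} GB/s per processor")

# Xeon E5-2699 v4: 132 GB/s TRIAD with two processors at 2400 MHz.
print(f"E5-2699 v4: {triad_share(132, 2400):.0%} of the theoretical peak")
```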
The following diagram illustrates the throughput of the PRIMERGY RX2540 M2 in comparison to its predecessor, the PRIMERGY RX2540 M1.
Further information about memory performance can be found in the White Paper Memory performance of Xeon E5-2600 v4 (Broadwell-EP) based systems.
[Figure: STREAM TRIAD: PRIMERGY RX2540 M2 vs. PRIMERGY RX2540 M1. Horizontal bars in GB/s (axis 0 to 140) for the Xeon E5-2600 v3 processors (PRIMERGY RX2540 M1) and the Xeon E5-2600 v4 processors (PRIMERGY RX2540 M2); several bars are marked “est.”.]
Literature
PRIMERGY Servers
http://primergy.com/
PRIMERGY RX2540 M2
This White Paper:
http://docs.ts.fujitsu.com/dl.aspx?id=311bab2f-039e-4ff5-8765-c17f2ec22dbb
http://docs.ts.fujitsu.com/dl.aspx?id=58bc47eb-acd6-4666-88f3-26a6e9d3d4ac
http://docs.ts.fujitsu.com/dl.aspx?id=1dd5f698-5925-460a-8605-736681c6adfb
Data sheet http://docs.ts.fujitsu.com/dl.aspx?id=99064b9e-c735-414b-87fd-d8c3c72e29ec
PRIMERGY Performance
http://www.fujitsu.com/fts/x86-server-benchmarks
Performance of Server Components
http://www.fujitsu.com/fts/products/computing/servers/mission-critical/benchmarks/x86-components.html
BIOS optimizations for Xeon E5-2600 v4 based systems http://docs.ts.fujitsu.com/dl.aspx?id=eb90c352-8d98-4f5a-9eed-b5aade5ccae1
Memory performance of Xeon E5-2600 v4 (Broadwell-EP) based systems http://docs.ts.fujitsu.com/dl.aspx?id=8f372445-ee63-4369-8683-da9557673357
RAID Controller Performance http://docs.ts.fujitsu.com/dl.aspx?id=9845be50-7d4f-4ef7-ac61-bbde399c1014
Disk I/O: Performance of storage media and RAID controllers
Basics of Disk I/O Performance http://docs.ts.fujitsu.com/dl.aspx?id=65781a00-556f-4a98-90a7-7022feacc602
Information about Iometer http://www.iometer.org
OLTP-2
Benchmark Overview OLTP-2 http://docs.ts.fujitsu.com/dl.aspx?id=e6f7a4c9-aff6-4598-b199-836053214d3f
SAP SD
http://www.sap.com/benchmark
Benchmark overview SAP SD http://docs.ts.fujitsu.com/dl.aspx?id=0a1e69a6-e366-4fd1-a1a6-0dd93148ea10
SPECcpu2006
http://www.spec.org/osg/cpu2006
Benchmark overview SPECcpu2006 http://docs.ts.fujitsu.com/dl.aspx?id=1a427c16-12bf-41b0-9ca3-4cc360ef14ce
SPECpower_ssj2008
http://www.spec.org/power_ssj2008
Benchmark Overview SPECpower_ssj2008 http://docs.ts.fujitsu.com/dl.aspx?id=166f8497-4bf0-4190-91a1-884b90850ee0
STREAM
http://www.cs.virginia.edu/stream/
TPC-E
http://www.tpc.org/tpce
Benchmark Overview TPC-E http://docs.ts.fujitsu.com/dl.aspx?id=da0ce7b7-3d80-48cd-9b3a-d12e0b40ed6d
VMmark V2
Benchmark Overview VMmark V2 http://docs.ts.fujitsu.com/dl.aspx?id=2b61a08f-52f4-4067-bbbf-dc0b58bee1bd
VMmark V2 http://www.vmmark.com
vServCon
Benchmark Overview vServCon http://docs.ts.fujitsu.com/dl.aspx?id=b953d1f3-6f98-4b93-95f5-8c8ba3db4e59
Contact
FUJITSU
Website: http://www.fujitsu.com/
PRIMERGY Product Marketing
mailto:[email protected]
PRIMERGY Performance and Benchmarks
mailto:[email protected]
© Copyright 2016 Fujitsu Technology Solutions. Fujitsu and the Fujitsu logo are trademarks or registered trademarks of Fujitsu Limited in Japan and other countries. Other company, product and service names may be trademarks or registered trademarks of their respective owners. Technical data subject to modification and delivery subject to availability. Any liability that the data and illustrations are complete, actual or correct is excluded. Designations may be trademarks and/or copyrights of the respective manufacturer, the use of which by third parties for their own purposes may infringe the rights of such owner. For further information see http://www.fujitsu.com/fts/resources/navigation/terms-of-use.html
2016-07-21 WW EN