Benchmarking Best Practices 102
Presented by: Maxim Kuvyrkov
Date: March 9, 2016
Event: Linaro Connect BKK16 (session BKK16-300)
Overview
● Revision (Benchmarking Best Practices 101)
● Reproducibility
● Reporting
Revision
Previously, in Benchmarking-101...
● Approach benchmarking as an experiment. Be scientific.
● Design the experiment in light of your goal.
● Repeatability:
○ Understand and control noise.
○ Use statistical methods to find truth in noise.
And we briefly mentioned
● Reproducibility
● Reporting
So let’s talk some more about those.
Reproducibility
An experiment is reproducible if external teams can run the same experiment over long periods of time and get commensurate (comparable) results.
It is achieved if others can repeat what we did and get the same results as us, within the given confidence interval.
From Repeatability to Reproducibility
We must log enough information that anyone else can use that information to repeat our experiments.
We have achieved reproducibility if they can get the same results, within the given confidence interval.
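As a minimal sketch of that check, assuming the confidence interval is reported as a mean plus or minus a half-width. The function name and the numbers are illustrative and do not come from the talk.

```python
def reproduces(our_mean: float, half_width: float, their_mean: float) -> bool:
    """Return True if their_mean lies within our_mean +/- half_width."""
    return abs(their_mean - our_mean) <= half_width


# Hypothetical numbers: we reported 10.0 s +/- 0.3 s.
print(reproduces(10.0, 0.3, 10.2))  # an external team measured 10.2 s
print(reproduces(10.0, 0.3, 10.5))  # an external team measured 10.5 s
```

The first call falls inside the interval and the second does not, so only the first would count as a reproduction under this simple rule.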
Logging: Target
● CPU/SoC/Board
○ Revision, patch level, firmware version…
● Instance of the board
○ Is board 1 really identical to board 2?
● Kernel version and configuration
● Distribution
Example: Target
Board: Juno r0
CPU: 2 * Cortex-A57r0p0, 4 * Cortex-A53r0p0
Firmware version: 0.11.3
Hostname: juno-01
Kernel: 3.16.0-4-generic #1 SMP
Distribution: Debian Jessie
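Part of such a record can be gathered automatically. The sketch below uses only the Python standard library. The field names are assumptions, and the board and firmware entries are left empty because they are not visible from here and would have to be filled in by hand or by a board-specific tool.

```python
import json
import platform
import socket


def collect_target_info() -> dict:
    """Gather the target facts that the standard library can see."""
    uname = platform.uname()
    return {
        "hostname": socket.gethostname(),
        "machine": uname.machine,
        "kernel": f"{uname.system} {uname.release} {uname.version}",
        # Not discoverable from the standard library: record these manually.
        "board": None,
        "firmware": None,
    }


if __name__ == "__main__":
    print(json.dumps(collect_target_info(), indent=2))
```

Recording the hostname alongside the board type addresses the "is board 1 really identical to board 2?" question, since results can later be grouped by individual board.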
Logging: Build
● Exact toolchain version
● Exact libraries used
● Exact benchmark source
● Build system (scripts, makefiles etc)
● Full build log
Others should be able to acquire and rebuild all of these components.
Example: Build
Toolchain: Linaro GCC 2015.04
CLI: -O2 -fno-tree-vectorize -DFOO
Libraries: libBar.so.1.3.2, git.linaro.org/foo/bar #8d30a2c508468bb534bb937bd488b18b8636d3b1
Benchmark: MyBenchmark, git.linaro.org/foo/mb #d00fb95a1b5dbe3a84fa158df872e1d2c4c49d06
Build System: abe, git.linaro.org/toolchain/abe #d758ec431131655032bc7de12c0e6f266d9723c2
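One possible way to keep this information machine-readable is a small structured record. The class and field names below are assumptions made for illustration. The values are the ones from the example above.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Component:
    name: str
    repo: str    # where others can fetch the source
    commit: str  # exact revision that was built


@dataclass(frozen=True)
class BuildRecord:
    toolchain: str
    flags: str
    components: tuple


record = BuildRecord(
    toolchain="Linaro GCC 2015.04",
    flags="-O2 -fno-tree-vectorize -DFOO",
    components=(
        Component("libBar.so.1.3.2", "git.linaro.org/foo/bar",
                  "8d30a2c508468bb534bb937bd488b18b8636d3b1"),
        Component("MyBenchmark", "git.linaro.org/foo/mb",
                  "d00fb95a1b5dbe3a84fa158df872e1d2c4c49d06"),
        Component("abe", "git.linaro.org/toolchain/abe",
                  "d758ec431131655032bc7de12c0e6f266d9723c2"),
    ),
)

# Serialise so the record can be stored next to the results.
print(json.dumps(asdict(record), indent=2))
```

Pinning each component to a full commit hash, rather than a branch name, is what lets others rebuild exactly the same thing later.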
Logging: Run-time Environment
● Environment variables
● Command-line options passed to benchmark
● Mitigation measures taken
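A rough sketch of capturing these three items in one place. The function name, the benchmark command line and the mitigation notes are all hypothetical.

```python
import os


def snapshot_runtime(argv, mitigations):
    """Capture the run-time facts listed above as one dictionary."""
    return {
        "environment": dict(os.environ),   # every environment variable
        "command_line": list(argv),        # options passed to the benchmark
        "mitigations": list(mitigations),  # free-text notes, written by hand
    }


# Hypothetical benchmark invocation and mitigation notes.
snapshot = snapshot_runtime(
    ["./mybenchmark", "--iterations", "100"],
    ["pinned to one core", "CPU frequency fixed"],
)
print(len(snapshot["environment"]), "environment variables recorded")
```

Copying the whole environment is deliberate: it is cheap to store, and it is hard to know in advance which variable will turn out to matter.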
Logging: Other
All of the above may need modification depending on what is being measured.
● Network-sensitive benchmarks may need details of network configuration
● IO-sensitive benchmarks may need details of storage devices
● And so on...
Long Term Storage
All results should be stored with information required for reproducibility.
Results should be kept for the long term:
● Someone may ask you for some information
● You may want to do some new analysis in the future
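One way to keep results and their context together is to bundle them into a single document before archiving. This is a sketch under assumptions: the function name, the JSON layout and the sample values are illustrative and not prescribed by the talk.

```python
import json
from datetime import datetime, timezone


def bundle_result(samples, unit, metadata):
    """Return one JSON document holding raw samples and their context."""
    return json.dumps(
        {
            "recorded_at": datetime.now(timezone.utc).isoformat(),
            "unit": unit,
            "samples": list(samples),  # raw data, so new analysis stays possible
            "metadata": metadata,      # target, build and run-time logs
        },
        indent=2,
        sort_keys=True,
    )


# Hypothetical run: three timings in seconds on the board from the example.
document = bundle_result([10.1, 9.9, 10.0], "s", {"hostname": "juno-01"})
print(document)
```

Storing the raw samples rather than only a mean is what keeps future re-analysis possible.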
Reporting
● Clear, concise reporting allows others to utilise benchmark results.
● Does not have to include all data required for reproducibility.
● But that data should be available.
● Do not assume too much reader knowledge.
○ Err on the side of over-explanation
Reporting: Goal
Explain the goal of the experiment
● What decision will it help you to make?
● What improvement will it allow you to deliver?
Explain the question that the experiment asks
Explain how the answer to that question helps you to achieve the goal
Reporting
● Method: Sufficient high-level detail
○ Target, toolchain, build options, source, mitigation
● Limitations: Acknowledge and justify
○ What are the consequences for this experiment?
● Results: Discuss in context of goal
○ Co-locate data, graphs, discussion
○ Include units - numbers without units are useless
○ Include statistical data
○ Use the benchmark’s metrics
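To illustrate the point about units and statistical data, a small formatting helper might look like the sketch below. The function name, the wording of the output and the numbers are assumptions.

```python
def format_result(mean: float, stdev: float, runs: int, unit: str) -> str:
    """Render a result with its unit and statistical context."""
    return (
        f"{mean:.2f} {unit} "
        f"(arithmetic mean of {runs} runs, std dev {stdev:.2f} {unit})"
    )


# Hypothetical measurement.
print(format_result(12.0, 2.0, 3, "s"))
# prints "12.00 s (arithmetic mean of 3 runs, std dev 2.00 s)"
```

Routing every reported number through one helper like this makes it hard to publish a bare value without its unit.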
Presentation of Results
Graphs are always useful
Tables of raw data also useful
Statistical context essential:
● Number of runs
● (Which) mean
● Standard deviation
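The statistical context can be computed with Python's standard `statistics` module. This is a sketch with made-up samples. Both an arithmetic and a geometric mean are shown only to illustrate why the report has to say which mean was used.

```python
import statistics


def summarise(samples):
    """Return the statistical context for a list of positive measurements."""
    return {
        "runs": len(samples),
        "arithmetic_mean": statistics.mean(samples),
        "geometric_mean": statistics.geometric_mean(samples),
        "sample_stdev": statistics.stdev(samples),  # n - 1 in the denominator
    }


# Hypothetical timings in seconds.
summary = summarise([10.0, 12.0, 14.0])
for name, value in summary.items():
    print(name, value)
```

The two means differ even on this tiny data set, so a reader who is not told which one was reported cannot compare it with their own numbers.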
Experimental Conditions
Precisely what to report depends on what is relevant to the results
The following are guidelines
Of course, all the environmental data should be logged and therefore available on request
Include
Highlight key information, even if it could be derived. Including:
● All toolchain options
● Noise mitigation measures
● Testing domain
● For a memory-sensitive benchmark, for example, report bus speed and cache hierarchy
Leave Out
Everything not essential to the main point
● Environment variables
● Build logs
● Firmware
● ...
All of this information should be available on request.
Graphs: Strong Suggestions
Speedup Over Baseline (1/3)
Misleading scale
● A is about 3.5% faster than it was before, not 103.5%
Obfuscated regression
● B is a regression
Speedup Over Baseline (2/3)
Baseline becomes 0
Title now correct
Regression clear
But, no confidence interval.
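The re-basing that makes the baseline 0 is simple arithmetic. As a sketch, assuming scores are expressed relative to the baseline (1.0 means no change), with a hypothetical value standing in for benchmark B since the slide's number is not in the transcript:

```python
def percent_change(relative_score: float) -> float:
    """Map a score relative to baseline (1.0 = unchanged) to a percentage
    change, so that the baseline itself becomes 0."""
    return (relative_score - 1.0) * 100.0


print(round(percent_change(1.035), 1))  # A: 103.5% of baseline
print(round(percent_change(0.98), 1))   # hypothetical regression
print(round(percent_change(1.0), 1))    # the baseline itself
```

Plotted this way, a regression falls below the axis instead of appearing as a slightly shorter bar.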
Speedup Over Baseline (3/3)
Error bars tell us more
● Effect on D can be disregarded
● Effect on A is real, but noisy.
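One simple reading of the error bars, sketched below, is to treat an effect as real only when its interval does not span zero. The rule, the function name and the numbers for A and D are assumptions made for illustration. The slide's actual values are not in the transcript.

```python
def classify_effect(change_percent: float, half_width_percent: float) -> str:
    """Compare a measured change with the half-width of its error bar."""
    if abs(change_percent) <= half_width_percent:
        return "disregard"  # the error bar spans zero
    return "real"


# Hypothetical values: (change in %, error-bar half-width in %).
print("A:", classify_effect(3.5, 2.0))
print("D:", classify_effect(0.5, 1.0))
```

With these numbers A counts as real even though its error bar is wide, which corresponds to "real, but noisy".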
Labelling (1/2)
What is the unit?
What are we comparing?
Labelling (2/2)
Graphs: Weak Suggestions
Show the mean
Direction of ‘Good’ (1/2)
“Speedup” changes to “time to execute”
Direction of “good” flips
If possible, maintain a constant direction of good.
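One way to keep a constant direction is to convert execution times into speedups before plotting, so that higher is better on every graph. This is a sketch that assumes speedup is defined as baseline time divided by new time. The timings are hypothetical.

```python
def speedup(baseline_seconds: float, new_seconds: float) -> float:
    """Return baseline time divided by new time; values above 1.0 are faster."""
    return baseline_seconds / new_seconds


print(speedup(10.0, 8.0))   # the new build takes less time
print(speedup(10.0, 12.5))  # the new build takes more time
```

The faster build maps to a value above 1.0 and the slower one to a value below it, matching the speedup graphs elsewhere in the report.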
Direction of ‘Good’ (2/2)
If you have to change the direction of ‘good’, flag the direction (everywhere)
Can be helpful to flag it anyway
Consistent Order
Presents improvements neatly
But, hard to compare different graphs in the same report
Scale (1/2)
A few high scores make other results hard to see
A couple of alternatives may be more clear...
Scale (2/2)
Summary
● Log everything, in detail
● Be clear about:
○ What the goal of your experiment is
○ What your method is, and how it achieves your purpose
● Present results
○ Unambiguously
○ With statistical context
● Relate results to your goal