Practical SPARQL Benchmarking Revisited
DESCRIPTION
A talk given at SemTechBiz 2014 in San Jose, following up on the tool originally presented at the 2012 conference. It covers the limitations we encountered with the original tool and how we evolved it to address them and build a more robust, general-purpose, open source SPARQL testing tool. The tool is available in pre-built form on SourceForge at http://sourceforge.net/projects/sparql-query-bm/ and as source code on SourceForge or GitHub (https://github.com/rvesse/sparql-query-bm).
TRANSCRIPT
Overview
1. Rewind to 2012
2. Limitations
3. Evolving the Framework
4. Examples
5. Future Work
Rewind to 2012
Practical SPARQL Benchmarking
- Presentation I gave at this conference in 2012
  - Slides at http://www.slideshare.net/RobVesse/practical-sparql-benchmarking
- Highlighted some issues with SPARQL benchmarking:
  - Standard benchmarks all have known deficiencies
  - Lack of a standardized methodology
  - The best benchmark is the one you run with your own data and workload
- Introduced the 1.x version of our SPARQL Query Benchmarker tool
  - Java tool and API for benchmarking
  - Used a methodology based upon a combination of the BSBM runner and the Revelytix SP2B white paper
  - Reports various appropriate statistics
  - Various configuration options to change exactly what is benchmarked, e.g. whether results are fully parsed and counted
Obtaining the Tool
- The 1.x tool was open sourced shortly after the 2012 conference under a 3-clause BSD license
- Available on SourceForge
  - http://sourceforge.net/projects/sparql-query-bm/files/1.0.0/
- Also available as Maven artifacts (in Maven Central):
  - Group ID: net.sf.sparql-query-bm
  - Artifact IDs: cmd, core
- Latest 1.x version: 1.1.0
Limitations
SPARQL Queries Only
- The 1.x tool can only benchmark SPARQL queries
- SPARQL 1.1 has been standardized since the 1.x version of the tool was written and adds features that you may want to test:
  - SPARQL Updates
  - SPARQL Graph Store Protocol
- Queries are fixed
  - No parameterization support
- Can't pass in custom endpoint parameters
  - For example, to enable/disable reasoning
- Also no way to test endpoint-specific extensions, e.g. transactions
HTTP Endpoints Only
- Requires using HTTP endpoints to access the SPARQL system being tested
- Adds communication overhead to the results
  - Sometimes this may be desirable
- No ability to test SPARQL operations in-memory
  - i.e. can't test lower-level APIs
Lack of control over test methodology
- Only supports a single benchmarking methodology
  - Methodology is hard-coded
  - Can't do things like run a subset of the provided operations on each run
  - Or repeat an operation within a run
  - Or retry an operation under specific failure conditions
- Configuration of the methodology is tightly coupled to the methodology
  - Many aspects are actually independent of the methodology
Query Mix Definitions
- Used a simplistic text-based format
  - One query file per line
  - No way to specify additional parameters
  - No way to assign a friendly name to queries
  - Each query is simply assigned its filename as its name
Limited Progress Monitoring
- There is a progress monitoring API but it is limited
  - E.g. it gets called after a query completes but not before it starts
  - Makes it awkward or impossible to implement some kinds of monitoring, e.g. crash detection, memory usage
Poor Command Line Interface
- In the interests of speed over usability we rolled our own command line argument parser
  - This makes argument parsing awkward to extend
Evolving the Framework
2.x
- Earlier this year we found a compelling reason to rewrite the tool and address the various limitations
- First 2.x release was made on 9th June 2014
  - Minor bug fix and maintenance releases since
- Releases available at:
  - http://sourceforge.net/projects/sparql-query-bm/files/
- Code is now using Git
  - http://git.code.sf.net/p/sparql-query-bm/git sparql-query-bm-git
  - Mirror available on GitHub for those who think that it is the one true source: https://github.com/rvesse/sparql-query-bm
- Maven artifacts available through Maven Central as before:
  - Group ID: net.sf.sparql-query-bm
  - Artifact IDs: core, cmd and dist
  - Latest 2.x version: 2.0.1
Queries become Operations
- Concept of Queries replaced with the more general concept of Operations
- Also divorces the definition of an operation from how that operation is run
  - Makes it easier to change the runtime behaviour of operations
- 20 built-in operations provided
- API allows defining and plugging in new operations as desired
  - http://sparql-query-bm.sourceforge.net/javadoc/latest/core/
Query/Update Operations
- Several kinds of query/update:
  - Fixed
  - Parameterized
  - Dataset Size
- Variants for both remote endpoints and in-memory datasets
- Remote variants have additional NVP variants
  - Allows adding custom parameters to the remote request
- Accounts for 13 of the built-in operations
Graph Store Protocol Operations
- One for each Graph Store Protocol operation:
  - DELETE
  - GET
  - HEAD
  - POST
  - PUT
- Accounts for a further 5 of the built-in operations
Utility Operations
- Sleep
  - Do nothing for some period
  - Useful for simulating quiet periods as part of testing
- Mix
  - Allows grouping a set of operations into a single operation
  - Lets you compose mixes from other mixes
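The idea of a mix being an operation in its own right can be illustrated with a small composite sketch. This is not the tool's actual API: the `Operation`, `Leaf` and `Mix` types below are hypothetical stand-ins, and `flatten()` is an illustrative method that simply lists what would run.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of composing mixes; not the tool's actual Operation API.
final class MixCompositionSketch {

    /** Minimal stand-in for an operation: it only knows how to list what it would run. */
    interface Operation {
        List<String> flatten();
    }

    /** A leaf operation such as a query or a sleep. */
    static final class Leaf implements Operation {
        private final String name;

        Leaf(String name) {
            this.name = name;
        }

        @Override
        public List<String> flatten() {
            return Arrays.asList(name);
        }
    }

    /** A mix is itself an operation, so mixes can be nested inside other mixes. */
    static final class Mix implements Operation {
        private final List<Operation> children;

        Mix(Operation... children) {
            this.children = Arrays.asList(children);
        }

        @Override
        public List<String> flatten() {
            List<String> names = new ArrayList<>();
            for (Operation child : children) {
                names.addAll(child.flatten());
            }
            return names;
        }
    }

    public static void main(String[] args) {
        Mix warmup = new Mix(new Leaf("warmup-1"), new Leaf("warmup-2"));
        // The warm-up mix is reused as a single operation inside a larger mix
        Mix full = new Mix(warmup, new Leaf("sleep"), new Leaf("count"));
        System.out.println(full.flatten());
    }
}
```

Because a mix exposes the same interface as a single operation, anything that can run one operation can run a nested mix without special handling.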
In-Memory Testing
- As already noted, in-memory variants of some operations are now available
- These run tests against a Dataset implementation
  - Part of the Apache Jena ARQ API
- Removes SPARQL Protocol and HTTP overhead from testing
  - Of course, depending on the Dataset implementation, there may still be some communication overhead
  - But this is likely to use lower-level native back-end communication protocols instead
Runners API
- Addresses the limitation of the hard-coded methodology
- Separates test running into three components:
  - Overall runner
  - Mix runner
  - Operation runner
- Each has its own API and can be customized as desired
  - Various useful base/abstract implementations provided
- Four different test runners are provided:
  - Benchmark
  - Smoke
  - Soak
  - Stress
Runners API - New Runners
- Smoke
  - Runs the mix once and indicates whether it passes or fails
  - A pass is defined as all operations passing
- Soak
  - Runs the mix continuously for some period of time
  - Tests how a system reacts under continuous load
- Stress
  - Runs the mix with increasingly high load
  - Tests how a system reacts under increasing load
- AbstractRunner provides a basic framework and helper methods to make it easy to add custom runners or customize existing ones
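The slides do not say how the Stress runner actually increases load, so the following is only one plausible reading: a sketch that multiplies the number of parallel threads by a ramp-up factor at each stage. The class, method and parameter names (`threadSchedule`, `startThreads`, `rampUpFactor`, `maxThreads`) are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a ramp-up schedule; the real Stress runner may work differently.
final class StressRampSketch {

    /** Returns the thread count for each stage, growing by rampUpFactor until maxThreads is exceeded. */
    static List<Integer> threadSchedule(int startThreads, double rampUpFactor, int maxThreads) {
        if (startThreads < 1 || rampUpFactor <= 1.0) {
            throw new IllegalArgumentException("Need at least one thread and a factor above 1");
        }
        List<Integer> stages = new ArrayList<>();
        int threads = startThreads;
        while (threads <= maxThreads) {
            stages.add(threads);
            // Always grow by at least one thread so the loop terminates for small factors
            threads = Math.max(threads + 1, (int) Math.floor(threads * rampUpFactor));
        }
        return stages;
    }

    public static void main(String[] args) {
        // Each stage would run the whole mix with this many parallel threads
        System.out.println(threadSchedule(1, 2.0, 16));
    }
}
```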
Runners API - Mix & Operation Runners
- Allows customizing how mixes and individual operations are run
- Some alternative implementations are built in:
  - E.g. SamplingOperationMixRunner
    - Runs a sample of the operations in the mix
    - May include repeats
  - E.g. RetryingOperationRunner
    - Retries an operation if it doesn't succeed
- Easy to implement your own
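To make the sampling idea concrete, here is a minimal sketch of drawing a sample of operations from a mix, with or without repeats. It is not the code of SamplingOperationMixRunner; the `sample` method and its parameters are hypothetical, and operations are represented as plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of sampling a mix; not the tool's SamplingOperationMixRunner.
final class SamplingSketch {

    /** Draws sampleSize items from the mix; with repeats allowed, an item may be drawn more than once. */
    static <T> List<T> sample(List<T> mix, int sampleSize, boolean allowRepeats, Random random) {
        if (sampleSize > 0 && mix.isEmpty()) {
            throw new IllegalArgumentException("Cannot sample from an empty mix");
        }
        if (!allowRepeats && sampleSize > mix.size()) {
            throw new IllegalArgumentException("Sample is larger than the mix and repeats are disabled");
        }
        List<T> pool = new ArrayList<>(mix);
        List<T> sample = new ArrayList<>(sampleSize);
        for (int i = 0; i < sampleSize; i++) {
            int index = random.nextInt(pool.size());
            // Without repeats the chosen item is removed from the pool so it cannot be picked again
            sample.add(allowRepeats ? pool.get(index) : pool.remove(index));
        }
        return sample;
    }

    public static void main(String[] args) {
        List<String> mix = Arrays.asList("q1", "q2", "q3", "q4");
        System.out.println(sample(mix, 2, false, new Random()));
        System.out.println(sample(mix, 6, true, new Random()));
    }
}
```

Passing the `Random` in makes a run reproducible when a fixed seed is supplied, which can be handy when comparing results between systems.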
Options API
- Separates test configuration from the test runner
- Interface with all common configuration defined:
  - Endpoints
  - Timeouts
  - Progress Listeners
  - etc.
- NB - Runners are typically defined such that they restrict their input options to sub-interfaces that add runner-specific configuration, e.g.
  - Warm-ups for benchmarks
  - Total runtime for soak testing
  - Ramp-up factor for stress testing
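The pattern of a common options interface with runner-specific sub-interfaces might look roughly like the sketch below. The interface and method names here (`getEndpoint`, `getMaxRuntimeMinutes`, `describe` and so on) are invented for illustration and should not be read as the tool's real Options API.

```java
// Hypothetical sketch of the options pattern; interface and method names are illustrative only.
final class OptionsSketch {

    /** Configuration common to every kind of test run. */
    interface Options {
        String getEndpoint();
        int getTimeoutSeconds();
    }

    /** Adds the configuration that only a soak test needs. */
    interface SoakOptions extends Options {
        long getMaxRuntimeMinutes();
    }

    /** A runner declares the narrowest options type it requires. */
    interface Runner<T extends Options> {
        String describe(T options);
    }

    static final class SoakRunner implements Runner<SoakOptions> {
        @Override
        public String describe(SoakOptions options) {
            return "soak " + options.getEndpoint() + " for " + options.getMaxRuntimeMinutes()
                    + " min, timeout " + options.getTimeoutSeconds() + "s";
        }
    }

    /** Simple immutable implementation used for the demonstration. */
    static final class SimpleSoakOptions implements SoakOptions {
        private final String endpoint;
        private final int timeoutSeconds;
        private final long maxRuntimeMinutes;

        SimpleSoakOptions(String endpoint, int timeoutSeconds, long maxRuntimeMinutes) {
            this.endpoint = endpoint;
            this.timeoutSeconds = timeoutSeconds;
            this.maxRuntimeMinutes = maxRuntimeMinutes;
        }

        @Override
        public String getEndpoint() {
            return endpoint;
        }

        @Override
        public int getTimeoutSeconds() {
            return timeoutSeconds;
        }

        @Override
        public long getMaxRuntimeMinutes() {
            return maxRuntimeMinutes;
        }
    }

    public static void main(String[] args) {
        // The endpoint URL is a placeholder for whatever system is under test
        SoakOptions options = new SimpleSoakOptions("http://localhost:3030/ds/query", 30, 60);
        System.out.println(new SoakRunner().describe(options));
    }
}
```

With the type parameter bounded this way, handing plain `Options` to a runner that needs soak-specific settings fails at compile time rather than at run time.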
Operation Mix Definitions
- Now using TSV as the file format
  - Still wanted it to be simple enough that someone with zero RDF/SPARQL knowledge can configure it
  - Each line is a series of parameters separated by a tab character
  - First parameter is an identifier for the type of the operation
    - Used to decide how to interpret the remaining parameters
- Can define your own mix file format and register a loader for it
- Possible to override the loader for a specific operation identifier since this has an API
  - Means you can do neat tricks like using a mix designed for remote endpoints against an in-memory dataset
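As a rough illustration of how a line-oriented TSV mix could be loaded, the sketch below splits a line on tabs and dispatches on the first field to a registered loader. The `LineLoader` interface and the `register` and `loadLine` methods are hypothetical and stand in for whatever the real loader API provides; loaders here just return a description string.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of identifier-based loading; not the tool's actual loader API.
final class MixLineSketch {

    /** Hypothetical loader: turns the parameters after the identifier into a description. */
    interface LineLoader {
        String load(List<String> params);
    }

    private final Map<String, LineLoader> loaders = new HashMap<>();

    /** Registers (or overrides) the loader used for an operation type identifier. */
    void register(String type, LineLoader loader) {
        loaders.put(type, loader);
    }

    /** Splits one tab-separated line and dispatches on its first field. */
    String loadLine(String line) {
        String[] fields = line.split("\t", -1);
        LineLoader loader = loaders.get(fields[0]);
        if (loader == null) {
            throw new IllegalArgumentException("Unknown operation type: " + fields[0]);
        }
        return loader.load(Arrays.asList(fields).subList(1, fields.length));
    }

    public static void main(String[] args) {
        MixLineSketch mix = new MixLineSketch();
        mix.register("query", params -> "remote query from " + params.get(0));
        System.out.println(mix.loadLine("query\tcount.rq\tCount quads"));
        // Overriding the loader lets the same mix line target an in-memory dataset instead
        mix.register("query", params -> "in-memory query from " + params.get(0));
        System.out.println(mix.loadLine("query\tcount.rq\tCount quads"));
    }
}
```

The mix file itself is unchanged between the two calls; only the loader registered for the `query` identifier differs.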
Example Operation Mix
    query 806670-warmup1.rq 806670 Warmup Query 1
    query 806670-warmup2.rq 806670 Warmup Query 2
    query 806670-nofilter.rq 806670 Query with No Filter
    query 806670-filter3.rq 806670 Query with Filter (Variant 3)
    param-query 806670-filter3-params.rq instances.tsv Parameterized Query with Filter (Variant 3)
    query 806670-filter4.rq 806670 Query with Filter (Variant 4)
    query 806670-filter4a.rq 806670 Query with Filter (Variant 4a - Zero Results)
    param-query 806670-filter4-params.rq instances.tsv Parameterized Query with Filter (Variant 4)
    query 806238-warmup1.rq 806238 Warmup Query 1
    query 806238-warmup2.rq 806238 Warmup Query 2
    query 806238-comment43.rq 806238 Query (Comment 43)
    query 806238-comment43a.rq 806238 Query (Comment 43 - SELECT * sub-query)
    query 806238-comment45.rq 806238 Query (Comment 45 - Multiple sub-queries)
    query 806238-comment54.rq 806238 Query (Comment 54)
    param-update load-full1m.ru graph-names.tsv Load 1M Dataset into named graph
    param-query count-loaded.rq graph-names.tsv Count named graph
    param-update drop-loaded.ru graph-names.tsv Drop named graph
    query count.rq Count quads
    checkpoint 10 Checkpoint every 10 runs
    sleep 180 3 minute sleep
Improved Progress Monitoring
- Now provides notifications before and after operation and mix runs
- Improvements to how some of the built-in implementations handle multi-threaded output
  - Makes it easier to distinguish where errors occurred when running multi-threaded benchmarks
Improved CLI
- Now based upon the powerful open source Airline library
  - https://github.com/airlift/airline
- Provides a command line interface to each built-in runner
  - Also provides AbstractCommand with all standard options exposed
  - Standardized exit codes across all commands
- Comprehensive built-in help
  - Can help you define operation mixes:
    - ./operations
    - ./operation --op param-query
Examples
Examples
- These are things we've done (or are currently doing) with the framework that aren't in the open source releases
- However, the 2.x framework (hopefully) makes these easy to replicate yourself
Custom Operations
- Many stores have rich REST APIs in addition to their SPARQL APIs
  - Can be useful to include testing of these in your mixes
- Requires implementing two interfaces:
  - Operation
  - OperationCallable
- Abstract implementations of both are available to give you the boilerplate bits
- Internally we have 9 different custom operations defined which test a subset of our REST API:
  - Database Management
  - Asynchronous Queries
  - Import Management
Custom Progress Monitoring
- One thing we're particularly interested in is how operations affect memory usage
  - We added custom progress listeners that track and monitor memory usage
  - Reports on min, max and average memory usage
- We also have another progress listener that tracks processes to identify when a test run may have been impacted by other activity on the system
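The internal listeners are not part of the open source release, so the following is only a guess at the general shape: a listener with before/after hooks that samples memory and keeps min, max and average. The `ProgressListener` interface and its method names are hypothetical, and for simplicity the sketch samples the heap of the JVM running the test rather than the memory of the system under test.

```java
// Hypothetical sketch of a memory-tracking listener; not the tool's actual progress listener API.
final class MemoryListenerSketch {

    /** Hypothetical listener interface with hooks before and after each operation. */
    interface ProgressListener {
        void beforeOperation(String operationName);
        void afterOperation(String operationName);
    }

    /** Records a memory sample on every hook and keeps running min, max and average. */
    static final class MemoryTracker implements ProgressListener {
        private long min = Long.MAX_VALUE;
        private long max = Long.MIN_VALUE;
        private long total = 0;
        private int samples = 0;

        @Override
        public void beforeOperation(String operationName) {
            record(usedHeapBytes());
        }

        @Override
        public void afterOperation(String operationName) {
            record(usedHeapBytes());
        }

        /** Adds one sample, in bytes, to the running statistics. */
        void record(long usedBytes) {
            min = Math.min(min, usedBytes);
            max = Math.max(max, usedBytes);
            total += usedBytes;
            samples++;
        }

        /** Heap currently in use by this JVM (assumption: the test and the store share a JVM). */
        static long usedHeapBytes() {
            Runtime runtime = Runtime.getRuntime();
            return runtime.totalMemory() - runtime.freeMemory();
        }

        long getMin() {
            return min;
        }

        long getMax() {
            return max;
        }

        double getAverage() {
            return samples == 0 ? 0.0 : (double) total / samples;
        }
    }

    public static void main(String[] args) {
        MemoryTracker tracker = new MemoryTracker();
        tracker.beforeOperation("example-query");
        tracker.afterOperation("example-query");
        System.out.println("min=" + tracker.getMin() + " max=" + tracker.getMax()
                + " avg=" + tracker.getAverage());
    }
}
```

Sampling before as well as after each operation is only possible because the 2.x progress monitoring notifies listeners at both points.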
Retry on Auth Failure
    public class RetryOnAuthFailureOperationRunner extends RetryingOperationRunner {

        // By default, allow a single retry
        public RetryOnAuthFailureOperationRunner() {
            this(1);
        }

        public RetryOnAuthFailureOperationRunner(int maxRetries) {
            super(maxRetries);
        }

        // Only retry when the run failed with an authentication error
        @Override
        protected <T extends Options> boolean shouldRetry(Runner<T> runner, T options, Operation op, OperationRun run) {
            return run.getErrorCategory() == ErrorCategories.AUTHENTICATION;
        }
    }

- Extends the built-in RetryingOperationRunner
- Simply adds a constraint on retries by overriding the shouldRetry() method
Future Work
Future Work
- Embrace Java 7 features fully
- Use ServiceLoader to automatically discover new operations and mix formats
- Make it even easier to customize runners
  - i.e. provide more abstraction of the current implementations