1

Practical SPARQL Benchmarking Revisited

Rob Vesse

@RobVesse

2

Overview

1. Rewind to 2012
2. Limitations
3. Evolving the Framework
4. Examples
5. Future Work

3

Rewind to 2012

4

Practical SPARQL Benchmarking

Presentation I gave at this conference in 2012
- Slides at http://www.slideshare.net/RobVesse/practical-sparql-benchmarking

Highlighted some issues with SPARQL benchmarking:
- Standard benchmarks all have known deficiencies
- Lack of a standardized methodology
- The best benchmark is the one you run with your own data and workload

Introduced the 1.x version of our SPARQL Query Benchmarker tool
- Java tool and API for benchmarking
- Used a methodology based on a combination of the BSBM runner and the Revelytix SP2B white paper
- Reports various appropriate statistics
- Various configuration options to change exactly what is benchmarked, e.g. whether results are fully parsed and counted

5

Obtaining the Tool

The 1.x tool was open sourced shortly after the 2012 conference under a 3-clause BSD license

Available on SourceForge
- http://sourceforge.net/projects/sparql-query-bm/files/1.0.0/

Also as Maven artifacts (in Maven Central):
- Group ID: net.sf.sparql-query-bm
- Artifact IDs: cmd, core

Latest 1.x version: 1.1.0

6

Limitations

7

SPARQL Queries Only

The 1.x tool can only benchmark SPARQL queries
- SPARQL 1.1 has been standardized since the 1.x version of the tool was written and adds further SPARQL features that you may want to test:
  - SPARQL Updates
  - SPARQL Graph Store Protocol

Queries are fixed
- No parameterization support

Can't pass in custom endpoint parameters
- For example, to enable/disable reasoning

Also no way to test endpoint-specific extensions, e.g. transactions

8

HTTP Endpoints Only

Requires using HTTP endpoints to access the SPARQL system to be tested
- Adds communication overheads to the results
- Sometimes this may be desirable

No ability to test SPARQL operations in-memory, i.e. can't test lower level APIs

9

Lack of control over test methodology

Only supports a single benchmarking methodology
- Methodology is hard coded
- Can't do things like run a subset of the provided operations on each run
- Or repeat an operation within a run
- Or retry an operation under specific failure conditions

Configuration of the methodology is tightly coupled to the methodology
- Many aspects are actually independent of the methodology

10

Query Mix Definitions

Used a simplistic text based format
- One query file per line
- No way to specify additional parameters
- No way to assign a friendly name to queries
  - Assigns each query its filename as the name

11

Limited Progress Monitoring

There is a progress monitoring API but it is limited
- E.g. it gets called after a query completes but not before it starts
- Makes it awkward/impossible to implement some kinds of monitoring, e.g. crash detection, memory usage

12

Poor Command Line Interface

In the interests of speed over usability we rolled our own command line arguments parser

Means argument parsing is awkward to extend

13

Evolving the Framework

14

2.x

Earlier this year we found a compelling reason to rewrite the tool and address the various limitations

First 2.x release was made on 9th June 2014
- Minor bug fix and maintenance releases since
- Releases available at: http://sourceforge.net/projects/sparql-query-bm/files/
- Code is now using Git
  - http://git.code.sf.net/p/sparql-query-bm/git sparql-query-bm-git
  - Mirrors available on GitHub for those who think that it is the one true source: https://github.com/rvesse/sparql-query-bm

Maven artifacts available through Maven Central as before:
- Group ID: net.sf.sparql-query-bm
- Artifact IDs: core, cmd and dist
- Latest 2.x version: 2.0.1

15

Queries become Operations

Concept of Queries replaced with the general concept of Operations

Also divorces the definition of an operation from how to run said operation
- Makes it easier to change the runtime behaviour of operations

20 built-in operations provided
- API allows defining and plugging in new operations as desired
- http://sparql-query-bm.sourceforge.net/javadoc/latest/core/
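The split between defining an operation and running it can be sketched roughly as follows. This is a minimal, self-contained illustration: `SketchOperation`, `EchoOperation` and the use of `java.util.concurrent.Callable` are assumptions made for the example, not the framework's actual interfaces.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch only: these types are NOT the sparql-query-bm API.
final class OperationSketch {

    /** The definition: what the operation is called and how to obtain something that runs it. */
    interface SketchOperation {
        String getName();

        Callable<String> createCallable();
    }

    /** A trivial operation whose callable simply returns a fixed message. */
    static final class EchoOperation implements SketchOperation {
        private final String name;
        private final String message;

        EchoOperation(String name, String message) {
            this.name = name;
            this.message = message;
        }

        @Override
        public String getName() {
            return name;
        }

        @Override
        public Callable<String> createCallable() {
            // The execution logic lives here, separate from the definition above
            return () -> message;
        }
    }

    /** Runs any operation without knowing how it is implemented. */
    static String run(SketchOperation op) {
        try {
            return op.getName() + ": " + op.createCallable().call();
        } catch (Exception e) {
            throw new RuntimeException("Operation " + op.getName() + " failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run(new EchoOperation("echo", "hello")));
    }
}
```

Because the caller only sees the definition, swapping in a different callable (for example one that talks to an in-memory dataset rather than a remote endpoint) does not require changing the code that runs the mix.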

16

Query/Update Operations

Several kinds of query/update
- Fixed
- Parameterized
- Dataset Size

Variants for both remote endpoints and in-memory datasets

Remote variants have additional NVP (name-value pair) variants
- Allows adding custom parameters to the remote request

Accounts for 13 of the built-in operations
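To illustrate the idea behind parameterized operations, here is a rough sketch that binds one row of a TSV parameters file into a query template. It is an assumption-laden example rather than the tool's implementation: the class `ParamQuerySketch`, the convention that the TSV header holds bare variable names, and the naive textual substitution (with no escaping or validation of values) are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch only: NOT how sparql-query-bm implements parameterization.
final class ParamQuerySketch {

    /** Pairs a TSV header row of variable names with one TSV row of values. */
    static Map<String, String> row(String headerLine, String valueLine) {
        String[] names = headerLine.split("\t");
        String[] values = valueLine.split("\t");
        if (names.length != values.length) {
            throw new IllegalArgumentException("Header and row have different column counts");
        }
        Map<String, String> bindings = new LinkedHashMap<>();
        for (int i = 0; i < names.length; i++) {
            bindings.put(names[i], values[i]);
        }
        return bindings;
    }

    /** Replaces each ?name in the template with its bound value, leaving other variables alone. */
    static String bind(String template, Map<String, String> bindings) {
        String result = template;
        for (Map.Entry<String, String> e : bindings.entrySet()) {
            // \b stops ?graph from also matching inside a longer name such as ?graphName
            Pattern var = Pattern.compile("\\?" + Pattern.quote(e.getKey()) + "\\b");
            result = var.matcher(result).replaceAll(Matcher.quoteReplacement(e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> bindings = row("graph", "<http://example.org/g1>");
        System.out.println(bind("SELECT ?s WHERE { GRAPH ?graph { ?s ?p ?o } }", bindings));
    }
}
```

Each run can then pick a different row, so the same template exercises the store with varying inputs instead of one fixed query.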

17

Graph Store Protocol Operations

One for each graph store protocol operation:
- DELETE
- GET
- HEAD
- POST
- PUT

Accounts for a further 5 of the built-in operations

18

Utility Operations

Sleep
- Do nothing for some period
- Useful for simulating quiet periods as part of testing

Mix
- Allows grouping a set of operations into a single operation
- Lets you compose mixes from other mixes

19

In-Memory Testing

As already noted, in-memory variants of some operations are now available

These run tests against a Dataset implementation
- Part of the Apache Jena ARQ API

Removes SPARQL Protocol and HTTP overhead from testing
- Of course, depending on the Dataset implementation, there may still be some communication overhead
- But this is likely to use lower level, back end native communication protocols instead

20

Runners API

Addresses the limitation of hard coded methodology

Separates test running into three components:
- Overall runner
- Mix runner
- Operation runner

Each has its own API and can be customized as desired
- Various useful base/abstract implementations provided

Four different test runners are provided:
- Benchmark
- Smoke
- Soak
- Stress

21

Runners API - New Runners

Smoke
- Runs the mix once and indicates whether it passes/fails
- Pass is defined as all operations passing

Soak
- Runs the mix continuously for some period of time
- Tests how a system reacts under continuous load

Stress
- Runs the mix with increasingly high load
- Tests how a system reacts under increasing load

AbstractRunner provides a basic framework and helper methods to make it easy to add custom runners or customize existing runs
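The difference between a smoke run and a soak run can be sketched in a few lines. This is only an illustration under stated assumptions: operations are modelled as `BooleanSupplier` values that report pass/fail, the clock is injected so the loop can be tested, and `RunnerSketch` is a hypothetical name that does not correspond to AbstractRunner or any other class in the framework.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BooleanSupplier;
import java.util.function.LongSupplier;

// Hypothetical sketch only: NOT the sparql-query-bm runners.
final class RunnerSketch {

    /** Smoke: run the mix once; the run passes only if every operation passes. */
    static boolean smoke(List<BooleanSupplier> mix) {
        boolean allPassed = true;
        for (BooleanSupplier op : mix) {
            // &= does not short-circuit, so every operation runs even after a failure
            allPassed &= op.getAsBoolean();
        }
        return allPassed;
    }

    /** Soak: keep running the mix until the time limit is reached; returns the number of mix runs. */
    static int soak(List<BooleanSupplier> mix, long maxRuntimeMillis, LongSupplier clockMillis) {
        long start = clockMillis.getAsLong();
        int runs = 0;
        while (clockMillis.getAsLong() - start < maxRuntimeMillis) {
            smoke(mix);
            runs++;
        }
        return runs;
    }

    public static void main(String[] args) {
        List<BooleanSupplier> mix = Arrays.asList(() -> true, () -> true);
        System.out.println("Smoke passed: " + smoke(mix));
        System.out.println("Soak runs in 50ms: " + soak(mix, 50, System::currentTimeMillis));
    }
}
```

A stress runner would follow the same shape but increase the number of concurrent clients between iterations instead of holding the load constant.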

22

Runners API - Mix & Operation Runners

Allows customizing how mixes and individual operations are run

Some alternative implementations built in:
- E.g. SamplingOperationMixRunner
  - Runs a sample of the operations in the mix
  - May include repeats
- E.g. RetryingOperationRunner
  - Retries an operation if it doesn't succeed

Easy to implement your own
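As a rough idea of what sampling a mix with repeats might look like, the sketch below draws operations uniformly at random with replacement. The class `SamplingSketch` and its `sample` method are hypothetical, and uniform sampling with replacement is an assumption; the real SamplingOperationMixRunner may select operations differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical sketch only: NOT the SamplingOperationMixRunner implementation.
final class SamplingSketch {

    /** Picks sampleSize operations from the mix at random; the same operation may appear more than once. */
    static <T> List<T> sample(List<T> mix, int sampleSize, Random random) {
        if (mix.isEmpty()) {
            throw new IllegalArgumentException("Cannot sample from an empty mix");
        }
        List<T> sampled = new ArrayList<>(sampleSize);
        for (int i = 0; i < sampleSize; i++) {
            // Sampling with replacement is what allows repeats
            sampled.add(mix.get(random.nextInt(mix.size())));
        }
        return sampled;
    }

    public static void main(String[] args) {
        List<String> mix = Arrays.asList("warmup", "filter3", "filter4", "count");
        // A fixed seed makes the sampled order reproducible between runs
        System.out.println(sample(mix, 6, new Random(42L)));
    }
}
```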

23

Options API

Separates test configuration from the test runner
- Interface with all common configuration defined
  - Endpoints
  - Timeouts
  - Progress Listeners
  - etc.

NB: Runners are typically defined such that they restrict their input options to sub-interfaces that add runner-specific configuration, e.g.
- Warm-ups for benchmarks
- Total runtime for soak testing
- Ramp up factor for stress testing
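The pattern of runners restricting their input to an options sub-interface can be sketched with a bounded type parameter. All names here (`BaseOptions`, `SoakOptions`, `SketchRunner`, `SoakRunner`) and the individual getters are illustrative assumptions, not the framework's real interfaces.

```java
// Hypothetical sketch only: these are NOT the sparql-query-bm Options interfaces.
final class OptionsSketch {

    /** Configuration common to every kind of test run. */
    interface BaseOptions {
        int getTimeoutSeconds();
    }

    /** Adds configuration that only makes sense for soak testing. */
    interface SoakOptions extends BaseOptions {
        long getMaxRuntimeMinutes();
    }

    /** A runner declares which options type it needs via the type parameter. */
    interface SketchRunner<T extends BaseOptions> {
        String describe(T options);
    }

    /** Only accepts SoakOptions; passing plain BaseOptions would not compile. */
    static final class SoakRunner implements SketchRunner<SoakOptions> {
        @Override
        public String describe(SoakOptions options) {
            return "soak for " + options.getMaxRuntimeMinutes() + " min, timeout "
                    + options.getTimeoutSeconds() + "s";
        }
    }

    /** Convenience factory for an immutable SoakOptions instance. */
    static SoakOptions soakOptions(final int timeoutSeconds, final long maxRuntimeMinutes) {
        return new SoakOptions() {
            @Override
            public int getTimeoutSeconds() {
                return timeoutSeconds;
            }

            @Override
            public long getMaxRuntimeMinutes() {
                return maxRuntimeMinutes;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(new SoakRunner().describe(soakOptions(30, 60)));
    }
}
```

The benefit of this shape is that a misconfigured run (say, a soak test with no total runtime) is caught by the compiler rather than at runtime.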

24

Operation Mix Definitions

Now using TSV as the file format
- Still wanted it to be simple enough that someone with zero RDF/SPARQL knowledge can configure it
- Each line is a series of parameters separated by a tab character
- First parameter is an identifier for the type of the operation
  - Used to decide how to interpret the remaining parameters

Can define your own mix file format and register a loader for it

Possible to override the loader for a specific operation identifier since this has an API
- Means you can do neat tricks like use a mix designed for remote endpoints against an in-memory dataset
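The dispatch-on-first-column idea, including overriding the loader for an identifier, might look something like the sketch below. The `MixLoaderSketch` class, its `LineLoader` interface and the returned description strings are hypothetical stand-ins; the real loader API builds operation objects and is not reproduced here.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only: NOT the sparql-query-bm mix loader API.
final class MixLoaderSketch {

    /** Turns the parameters that follow the type identifier into a description of the operation. */
    interface LineLoader {
        String load(String[] params);
    }

    private final Map<String, LineLoader> loaders = new HashMap<>();

    /** Registers a loader; registering the same identifier again overrides the earlier loader. */
    void register(String typeId, LineLoader loader) {
        loaders.put(typeId, loader);
    }

    /** Splits a mix line on tabs and hands the remaining fields to the loader for the first field. */
    String loadLine(String line) {
        String[] fields = line.split("\t");
        LineLoader loader = loaders.get(fields[0]);
        if (loader == null) {
            throw new IllegalArgumentException("No loader registered for operation type: " + fields[0]);
        }
        return loader.load(Arrays.copyOfRange(fields, 1, fields.length));
    }

    public static void main(String[] args) {
        MixLoaderSketch mix = new MixLoaderSketch();
        mix.register("query", params -> "remote query from " + params[0]);
        System.out.println(mix.loadLine("query\tcount.rq\tCount quads"));

        // Overriding the loader lets the same mix file target an in-memory dataset instead
        mix.register("query", params -> "in-memory query from " + params[0]);
        System.out.println(mix.loadLine("query\tcount.rq\tCount quads"));
    }
}
```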

25

Example Operation Mix

query 806670-warmup1.rq 806670 Warmup Query 1
query 806670-warmup2.rq 806670 Warmup Query 2
query 806670-nofilter.rq 806670 Query with No Filter
query 806670-filter3.rq 806670 Query with Filter (Variant 3)
param-query 806670-filter3-params.rq instances.tsv Parameterized Query with Filter (Variant 3)
query 806670-filter4.rq 806670 Query with Filter (Variant 4)
query 806670-filter4a.rq 806670 Query with Filter (Variant 4a - Zero Results)
param-query 806670-filter4-params.rq instances.tsv Parameterized Query with Filter (Variant 4)
query 806238-warmup1.rq 806238 Warmup Query 1
query 806238-warmup2.rq 806238 Warmup Query 2
query 806238-comment43.rq 806238 Query (Comment 43)
query 806238-comment43a.rq 806238 Query (Comment 43 - SELECT * sub-query)
query 806238-comment45.rq 806238 Query (Comment 45 - Multiple sub-queries)
query 806238-comment54.rq 806238 Query (Comment 54)
param-update load-full1m.ru graph-names.tsv Load 1M Dataset into named graph
param-query count-loaded.rq graph-names.tsv Count named graph
param-update drop-loaded.ru graph-names.tsv Drop named graph
query count.rq Count quads
checkpoint 10 Checkpoint every 10 runs
sleep 180 3 minute sleep

26

Improved Progress Monitoring

Now provides notifications before and after operation and mix runs

Improvements to how some of the built-in implementations handle multi-threaded output
- Makes it easier to distinguish where errors occurred when running multi-threaded benchmarks

27

Improved CLI

Now based upon the powerful open source Airline library
- https://github.com/airlift/airline

Provides a command line interface to each built-in runner
- Also provides AbstractCommand with all standard options exposed
- Standardized exit codes across all commands

Comprehensive built-in help
- Can help you define operation mixes
  - ./operations
  - ./operation --op param-query

28

Examples

29

Examples

These are things we've done (or are currently doing) with the framework that aren't in the open source releases

However the 2.x framework makes these (hopefully) easy to replicate yourself

30

Custom Operations

Many stores have rich REST APIs in addition to their SPARQL APIs
- Can be useful to include testing of these in your mixes

Requires implementing two interfaces:
- Operation
- OperationCallable

Abstract implementations of both are available to give you the boilerplate bits

Internally we have 9 different custom operations defined which test a subset of our REST API:
- Database Management
- Asynchronous Queries
- Import Management

31

Custom Progress Monitoring

One thing we're particularly interested in is how operations affect memory usage
- We added custom progress listeners that track and monitor memory usage
- Reports on min, max and average memory usage

We also have another progress listener that tracks processes to identify when a test run may have been impacted by other activity on the system
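A memory-tracking listener built on before/after notifications could be sketched along these lines. Everything here is an assumption for illustration: `SketchProgressListener` is not the framework's listener interface, and `forJvm()` measures the heap of the JVM running the test, which is only meaningful for in-memory testing; monitoring a remote store would need a different memory source.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch only: NOT the sparql-query-bm progress listener API.
final class MemoryListenerSketch {

    /** Notified both before and after each operation runs. */
    interface SketchProgressListener {
        void beforeOperation(String name);

        void afterOperation(String name);
    }

    /** Samples memory usage at every notification and keeps min, max and average. */
    static final class MemoryTracker implements SketchProgressListener {
        private final LongSupplier usedMemory;
        private long min = Long.MAX_VALUE;
        private long max = Long.MIN_VALUE;
        private long total = 0;
        private int samples = 0;

        MemoryTracker(LongSupplier usedMemory) {
            this.usedMemory = usedMemory;
        }

        /** Tracks the heap used by the current JVM. */
        static MemoryTracker forJvm() {
            Runtime rt = Runtime.getRuntime();
            return new MemoryTracker(() -> rt.totalMemory() - rt.freeMemory());
        }

        @Override
        public void beforeOperation(String name) {
            sample();
        }

        @Override
        public void afterOperation(String name) {
            sample();
        }

        private void sample() {
            long used = usedMemory.getAsLong();
            min = Math.min(min, used);
            max = Math.max(max, used);
            total += used;
            samples++;
        }

        long getMin() {
            return min;
        }

        long getMax() {
            return max;
        }

        double getAverage() {
            return samples == 0 ? 0.0 : (double) total / samples;
        }
    }

    public static void main(String[] args) {
        MemoryTracker tracker = MemoryTracker.forJvm();
        tracker.beforeOperation("count");
        tracker.afterOperation("count");
        System.out.println("min=" + tracker.getMin() + " max=" + tracker.getMax()
                + " avg=" + tracker.getAverage());
    }
}
```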

32

Retry on Auth Failure

public class RetryOnAuthFailureOperationRunner extends RetryingOperationRunner {

    // Default to a single retry
    public RetryOnAuthFailureOperationRunner() {
        this(1);
    }

    public RetryOnAuthFailureOperationRunner(int maxRetries) {
        super(maxRetries);
    }

    // Only retry when the failed run was categorized as an authentication error
    @Override
    protected <T extends Options> boolean shouldRetry(Runner<T> runner, T options, Operation op, OperationRun run) {
        return run.getErrorCategory() == ErrorCategories.AUTHENTICATION;
    }
}

Extends the built-in RetryingOperationRunner
- Simply adds a constraint on retries by overriding the shouldRetry() method

33

Future Work

34

Future Work

Embrace Java 7 features fully

Use ServiceLoader to automatically discover new operations and mix formats

Make it even easier to customize runners
- i.e. provide more abstraction of the current implementations

35

Questions?

@RobVesse
