Upload: martinhanssonoracle

Post on 25-Jan-2017


The Query Rewrite Plugin Interface: Writing Your Own Plugin

Martin Hansson

Optimizer [email protected]

Copyright 2015, Oracle and/or its affiliates. All rights reserved.

Survey

Who has:

Used plugins (installing, uninstalling)

Written plugins

Used QR plugins

Written a QR plugin

Who saw Sveta's presentation?

Who understood it?

Program Agenda

What query rewrites are

The API's

Declaring plugins

Writing plugins

(Plugin services)

Bonus: Writing a Post-Parse Plugin

Here is my agenda for today. We are short on time, so this will be more superficial than I would like. This is my first time.

First we sort out what query rewrites are and what they aren't. I've had all sorts of questions.

Then we'll go over the QR API. Actually, API's.

We will take a look at how to declare a QR plugin.

We will actually write a little plugin together.

If time allows, I'll give you an introduction to plugin services.

I'll throw in a little bonus for you: a post-parse plugin. It's much more complex, and I don't know of anyone else who has dared to write one.

Inside the server, query rewrites could potentially get very complex, because there are all sorts of rewrites happening at different stages. We like to keep it simple, as if the query was intercepted on the way. Once a query is rewritten, it is the query. To the user it should look like a query was rewritten: you get notified, but then the old query is never seen again. Something like this.

Server

Client

Rewriter

What are Query Rewrites?

The Philosophy:

Talked about implementing it this way, like some sort of proxy, possibly even on a different machine. But then it'd be more complex, extra component running. Not to mention cruel to animals.

There's also some benefit of parsing the string first. And what do I mean by that?

What are Query Rewrites?

They come in two flavors

Pre-Parse: string-to-string

Low overhead

No structure

Post-Parse: parse tree

Retains structure

Require re-parse (no destructive editing)

Need to traverse parse tree

Only select statements

What QR plugins do is intercept an incoming SQL command before it reaches the query optimizer and change it into a different command. What they can't do is stop a query from happening, i.e. filter out queries. And you can't turn one query into two queries, for instance.

A pre-parse QR plugin takes a string and returns a string. Very fast and efficient. The downside is that there is no structure. Great for custom commands.

A post-parse QR plugin operates on the parse tree instead. The good thing is that we can pick out the parts of the query that we're interested in. If we want to look for a certain literal, 'abc', say, we won't get a false hit if it's inside a comment, because comments are removed.
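The false-hit problem with string-level matching can be made concrete with a small standalone sketch. It does not use the MySQL plugin API at all: `contains_naive` and `strip_block_comments` are hypothetical helper names, only `/* ... */` comments are handled, and quoted strings that happen to contain `/*` are not treated specially.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: plain substring search over the raw query text,
// which is all a string-to-string (pre-parse) rewriter has to work with.
bool contains_naive(const std::string &query, const std::string &needle)
{
  return query.find(needle) != std::string::npos;
}

// Hypothetical helper: drop /* ... */ comments, roughly imitating the fact
// that comments are gone by the time a parse tree exists.
// An unterminated comment swallows the rest of the query.
std::string strip_block_comments(const std::string &query)
{
  std::string out;
  std::string::size_type pos= 0;
  while (pos < query.size())
  {
    std::string::size_type open= query.find("/*", pos);
    if (open == std::string::npos)
    {
      out.append(query, pos, std::string::npos);  // no more comments
      break;
    }
    out.append(query, pos, open - pos);           // text before the comment
    std::string::size_type close= query.find("*/", open + 2);
    if (close == std::string::npos)
      break;
    pos= close + 2;                               // continue after the comment
  }
  return out;
}
```

With the query `SELECT 1 /* abc */`, the naive search reports a hit for `abc`, while searching the comment-free text does not.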

If we rewrite a query this way, we have to build a new string and re-parse it; there's no destructive editing. And we have a parse tree to traverse, which may be expensive. Remember, this happens on every query. We can, however, use query digests as a quick reject test, which I'll get to later.

We currently only support select statements.


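Query Rewrite Plugin API - Overview

[Figure: inside the server, the query flows from Network to Parser as text and from Parser to Optimizer as a LEX. The QR plugin(s) hook in through the pre-parse QR API before the parser and through the post-parse QR API after it.]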

Here's a schematic of the query rewrite API's. The query first comes in from the 'network' and is then normally sent straight to the parser. With the QR API we can intercept the query here. We then call out to the plugins.

They can intercept it at one of two different times: either before or right after parsing is done by the server.

Before the parser, the query is text. Between the parser and the optimizer, we are passing a parse tree, usually called the LEX. This is what is intercepted by the post-parse API.

mysql> INSTALL PLUGIN rewrite_example SONAME 'rewrite_example.so';
Query OK, 0 rows affected (0,01 sec)

mysql> SELECT 'ABC';
+-----+
| abc |
+-----+
| abc |
+-----+
1 row in set, 1 warning (0,00 sec)

mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
  Level: Note
   Code: 1105
Message: Query 'SELECT 'ABC'' rewritten to 'select 'abc'' by a query rewrite plugin
*************************** 2. row ***************************
  Level: Note
   Code: 1105
Message: Query 'SHOW WARNINGS' rewritten to 'show warnings' by a query rewrite plugin
*************************** 3. row ***************************

Query Rewrite in Practice

Let's take a moment to look at what query rewrites look like in practice. Here I use the example plugin that is included in the distribution. If you build from source, you need to start the server with the --plugin-dir flag.

First, install the plugin, and it will start rewriting queries right away. This example plugin will simply rewrite all queries to lowercase. As you can see it treats every character the same way. It doesn't matter if it's a keyword or a quoted string.

The QR API raises an SQL Note saying that the query was rewritten and from that point on the old query is gone and the new one is the query.

You use the SHOW WARNINGS command to see the Note. What is a little strange at first is that the SHOW WARNINGS command itself is rewritten, and the warning is queued up before parsing, so by the time the command is executed, there is a warning about the show command itself being rewritten.

Let's look at the API's.

API's

[Figure: layered stack, from bottom to top: Server, Audit API, General Query Rewrite API, then the Pre-Parse QR API and Post-Parse QR API side by side, with the QR plugin(s) on top.]

The API's are tiered with the server at the bottom and the specific pre- and post- parse API's at the top.

The specific API's call the plugins and handle the result.

The second layer from top is the general QR API. This is where the SQL Note 'warning' that a query is rewritten is produced.

The lowest layer above the server is the Audit API. Let's take a moment to look at it.

Database auditing involves observing a database so as to be aware of the actions of database users. Database administrators and consultants often set up auditing for security purposes, for example, to ensure that those without the permission to access information do not access it.

Source: http://en.wikipedia.org/wiki/Database_audit

Audit API

The audit API lets you use pluggable auditing. Here is the definition of database auditing from Wikipedia. Auditing is essentially a type of logging to detect illegal activities in the database, such as security breaches and suspicious activity. There are various laws that have different requirements on what should be logged.

Some laws require logging of which queries are performed.

Some laws require logging of who sees certain data.

Some laws require logging of modification of the data.

Some require logging of all meta-data changes.

Audit API provides infrastructure for

Locking

Caching

Gathering

Event model

Parameter passing

An audit plugin registers itself to listen to various events: logging in, what SELECTs are performed, what UPDATEs and INSERTs are performed, etc. All are events.

The audit API provides infrastructure for locking plugins (so they don't get uninstalled while logging). In order to make this locking scale to thousands of parse events per second, we need some clever caching here.

The audit API has this nice feature where it collects the plugins according to event. So by the time an event happens, we already have all the plugins queued up that want to get notified of this event. That does wonders for performance.
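The idea of collecting plugins per event ahead of time can be sketched in a few lines of standalone C++. This is only an illustration of the technique, not the server's data structures: `Registry`, `EventClass` and `Notify` are hypothetical names, and the mask here has one bit per event class, which is a simplification.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical event classes; the real audit API has its own set.
enum EventClass { GENERAL_CLASS= 0, CONNECTION_CLASS= 1, PARSE_CLASS= 2,
                  EVENT_CLASS_COUNT= 3 };

typedef std::function<void(const std::string &)> Notify;

// Hypothetical registry that buckets subscribers by event class at
// registration time, so firing an event only walks interested plugins.
class Registry
{
public:
  // class_mask has bit N set if the plugin wants events of class N.
  void subscribe(unsigned long class_mask, Notify notify)
  {
    for (int c= 0; c < EVENT_CLASS_COUNT; ++c)
      if (class_mask & (1UL << c))
        m_by_class[c].push_back(notify);
  }

  // Notifies every subscriber of this class; returns how many were called.
  std::size_t fire(EventClass event_class, const std::string &payload) const
  {
    assert(event_class >= 0 && event_class < EVENT_CLASS_COUNT);
    const std::vector<Notify> &subscribers= m_by_class[event_class];
    for (std::size_t i= 0; i < subscribers.size(); ++i)
      subscribers[i](payload);
    return subscribers.size();
  }

private:
  std::vector<Notify> m_by_class[EVENT_CLASS_COUNT];
};
```

The filtering cost is paid once, in `subscribe`; `fire` never has to ask a plugin whether it cares about the event.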

So we discovered after some time that query rewrites are just a special case of auditing. We just added a 'pre parse event' and a 'post parse event'. The parse events fit nicely into the event model of the audit api.

We also have a working infrastructure for passing parameters to the plugin. It's a bit crude, based on void* pointers and bitmaps, but this way we know that it works on all platforms.

With that, I'm going to dive into actual code snippets.


static struct st_mysql_audit my_rewrite_descriptor=
{
  MYSQL_AUDIT_INTERFACE_VERSION,            /* interface version */
  NULL,                                     /* release_thd()     */
  do_the_rewrite,                           /* event_notify()    */
  /* class masks */
  {
    0,                                      // General class
    0,                                      // Connection class
    (unsigned long) MYSQL_AUDIT_PARSE_ALL,  // Parser class
  }
};

MYSQL_AUDIT_PARSE_ALL == MYSQL_AUDIT_PREPARSE | MYSQL_AUDIT_POSTPARSE

https://dev.mysql.com/doc/refman/5.7/en/writing-audit-plugins.html

Specific Plugin Descriptor

These examples are more or less straight out of the manual; the link is below. I'll give you links at the end.

I'm going to start with the type-specific plugin descriptor. This is not really my area of expertise, so I'll give a shallow introduction.

For interface_version, the convention is that type-specific plugin descriptors use the interface version for the given plugin type. The actual versioning is done in the general descriptor on the next slide. For audit plugins, the value of the interface_version member is MYSQL_AUDIT_INTERFACE_VERSION

The plugin may wish to be notified when the server dissociates it from the current session. Typically it happens when the session is closed.

There's the notify function which is called when the event is fired. Which event it is is in the next part.

And the class masks. In this case the class mask says that the plugin is both a pre- and a post-parse QR plugin.

mysql_declare_plugin(audit_log)
{
  MYSQL_AUDIT_PLUGIN,         /* type                            */
  &my_rewrite_descriptor,     /* descriptor                      */
  "my_plugin",                /* name                            */
  "Me",                       /* author                          */
  "My audit plugin",          /* description                     */
  PLUGIN_LICENSE_GPL,
  my_init,                    /* init function (when loaded)     */
  my_deinit,                  /* deinit function (when unloaded) */
  0x0003,                     /* version                         */
  my_status_vars,             /* status variables                */
  my_system_variables,        /* system variables                */
  NULL,                       /* reserved                        */
  0,                          /* flags                           */
}
mysql_declare_plugin_end;

https://dev.mysql.com/doc/refman/5.7/en/writing-audit-plugins.html

Declaring an Audit Plugin

General plugin desc.

The first line identifies this as an audit plugin. The second line is the descriptor from the last slide.

The name, author and description are what you choose them to be. The name is what you use in the INSTALL PLUGIN statement.

We hope your plugin will be GPL'ed.

Init and deinit are called first and last, respectively. Typically you register pfs instrumentation in the init function. It gets torn down automatically when the plugin is uninstalled.

We have the versioning where the plugin API checks that a plugin is compatible with the API and otherwise doesn't load it. Instead it raises an error.

Any plugin can define status and system variables. I won't go into details on that here because my floor time is limited. But they are visible in information_schema and in SHOW STATUS/SHOW VARIABLES while the plugin is installed. They work automatically.


#include <my_global.h>          // Always include this first
#include <mysql/plugin.h>       // General plugin header
#include <mysql/plugin_audit.h> // The Query Rewrite API

static int plugin_init(MYSQL_PLUGIN) { return 0; } // return 0 for success

static int do_the_rewrite(MYSQL_THD, mysql_event_class_t event_class,
                          const void *event)
{
  return 0; // return 0 for success
}

https://dev.mysql.com/doc/refman/5.7/en/writing-audit-plugins.html

Skeleton of a Query Rewrite Plugin

Now we're ready to write a QR plugin! Start with #include's. You will want to include my_global first, or I guarantee you will be running into various problems and you will get pissed off and then you'll end up doing it anyway. So save yourself the aggravation and trust me on this.

Here's my init function. This is called when the plugin is installed, and if you don't return 0, you will get an error message that the plugin failed to install.

Here's the rewrite function. It takes an opaque session object. Opaque is a fancy way of saying void*.

Then we have the event class, and the event itself. The actual type of the event depends on the event class.

You can also abort a query, in which case you get a warning with the number that you return.

Let's look at the rewriting function.

static int do_the_rewrite(MYSQL_THD, mysql_event_class_t event_class,
                          const void *event)
{
  const mysql_event_parse *event_parse=
    static_cast<const mysql_event_parse *>(event);

  if (event_parse->event_subclass == MYSQL_AUDIT_PARSE_PREPARSE)
  {
    // do pre-parse rewrite ...
  }

  // or

  if (event_parse->event_subclass == MYSQL_AUDIT_PARSE_POSTPARSE)
  {
    // do post-parse rewrite ...
  }

  return 0; /* success */
}

Skeleton of a Query Rewrite Plugin (pt. 2)

Like I said, this plugin is both a pre- and a post-parse plugin. So there's two events.

Because we only subscribe to parse events, we can trust that the event is a parse event. If we subscribed to other events, we would've had to look at the event_class before this cast.
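For a plugin that subscribes to more than one event class, the check has to come before the cast. A minimal standalone sketch of that ordering follows; the `mock_` types are hypothetical stand-ins, not the structs from the audit plugin header.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the server's event types; the real ones live
// in the audit plugin header and have more members.
enum mock_event_class { MOCK_GENERAL_CLASS, MOCK_PARSE_CLASS };
enum mock_parse_subclass { MOCK_PREPARSE, MOCK_POSTPARSE };

struct mock_event_general { int error_code; };
struct mock_event_parse
{
  mock_parse_subclass event_subclass;
  const char *query;
};

// Returns the query text for parse events and NULL for anything else.
// The event_class test must come first: the void pointer carries no type
// information, so treating a general event as a parse event would read
// memory that is not there.
const char *query_of(mock_event_class event_class, const void *event)
{
  if (event_class != MOCK_PARSE_CLASS)
    return NULL;
  const mock_event_parse *event_parse=
    static_cast<const mock_event_parse *>(event);
  return event_parse->query;
}
```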

Now we need to know: are we before or after parsing? What we do next depends completely on this. Note that this is only an example to show as many details as possible. In real life you probably don't want to rewrite the query both before and after parsing.

When we're done, return 0.

Skeleton of a Query Rewrite Plugin (Pre-Parse)

Doing the Pre-Parse Rewrite

static int do_the_rewrite(MYSQL_THD, mysql_event_class_t event_class,
                          const void *event)
{
  const mysql_event_parse *event_parse=
    static_cast<const mysql_event_parse *>(event);

  if (event_parse->event_subclass == MYSQL_AUDIT_PARSE_PREPARSE)
  {
    size_t query_length= event_parse->query.length;
    char *rewritten_query=
      static_cast<char *>(my_malloc(key_memory_rewrite_example,
                                    query_length + 1, MYF(0)));

    // Copies query_length + 1 bytes so the terminating '\0' comes along.
    for (size_t i= 0; i < query_length + 1; ++i)
      rewritten_query[i]= tolower(event_parse->query.str[i]);

    event_parse->rewritten_query->str= rewritten_query;
    event_parse->rewritten_query->length= query_length;
    *reinterpret_cast<int *>(event_parse->flags)|=
      MYSQL_AUDIT_PARSE_REWRITE_PLUGIN_QUERY_REWRITTEN;
  }

  return 0; /* success */
}

Now let's take a look at how to rewrite a query string.

We start by using the alloc service to allocate a new query string. Then just copy character by character. Change to lower in this case. Point the event to the new query. Set the flag that it's rewritten or the rewritten query won't be used.

If you look closely at the my_malloc, it's using a pfs memory key to do instrumentation. I'll tell you more about that in a few slides.
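To see the whole round trip without a server, here is a standalone sketch of the same steps with both sides mocked. Everything prefixed `mock_` or `MOCK_` is hypothetical, `malloc` stands in for the alloc service, and the caller side is only an assumption about what a server could do with the flag, not MySQL's code.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical, simplified stand-ins for the event types on the slide.
struct mock_lex_string { char *str; std::size_t length; };
struct mock_event_parse
{
  const char *query;                 // incoming query text
  std::size_t query_length;
  mock_lex_string *rewritten_query;  // out: filled in by the plugin
  int *flags;                        // out: bit 0 set means "rewritten"
};
const int MOCK_QUERY_REWRITTEN= 1;

// Plugin side: allocate, copy in lowercase, point the event at the copy,
// set the flag.
int lowercase_rewrite(const mock_event_parse *event)
{
  char *copy= static_cast<char *>(std::malloc(event->query_length + 1));
  if (copy == NULL)
    return 1;                        // report failure
  for (std::size_t i= 0; i < event->query_length; ++i)
    copy[i]= static_cast<char>(
      std::tolower(static_cast<unsigned char>(event->query[i])));
  copy[event->query_length]= '\0';
  event->rewritten_query->str= copy;
  event->rewritten_query->length= event->query_length;
  *event->flags|= MOCK_QUERY_REWRITTEN;
  return 0;
}

// Caller side (an assumption): the rewritten text replaces the original
// only if the plugin succeeded and set the flag.
std::string run_through_plugin(const std::string &query)
{
  mock_lex_string rewritten= { NULL, 0 };
  int flags= 0;
  mock_event_parse event= { query.c_str(), query.size(), &rewritten, &flags };
  std::string effective= query;
  if (lowercase_rewrite(&event) == 0 && (flags & MOCK_QUERY_REWRITTEN))
    effective.assign(rewritten.str, rewritten.length);
  std::free(rewritten.str);          // in this mock the caller owns the copy
  return effective;
}
```

Leaving out the `*event->flags|= ...` line makes `run_through_plugin` return the original query unchanged, which mirrors the remark that the rewritten query won't be used unless the flag is set.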


What is a Plugin Service?

[Figure: the Audit API in the server calls the plugins; the plugins in turn call services offered by the server.]

Let's talk about services. Plugins can't accomplish very much without some help from the server. This is especially true of post-parse plugins, because all they have is an opaque pointer to a parse tree with no primitives to work on it.

In services the flow of calls goes like in this slide. The API calls the plugin, who processes the input. In so doing it may call up functionality in the server. Functionality offered by the server to plugins is called a service. I will mention two services that I deem the most important ones for writing QR plugins: The Parser service and the Alloc service.

The Parser Service

This Service Lets a Plugin:

Parse a string, get:

Normalized query

Query digest

Traverse a parse tree:

Find positions of literals

Print literals

First of all, the parser service lets you hand a string to the parser. You want to do this after your post-parse rewrite is complete. Of course, if you're writing a post-parse plugin, you already have a parse tree set up by the time the notify function is called.

There is only one post-parse QR plugin that I know of, the one we wrote. Called Rewriter. It works by using normalized queries and digests to pattern-match queries that should be rewritten.

A normalized query has all the literals anonymized. A digest is an md5 of it. We did it this way to get a reasonable performance, because query rewrites are potentially very expensive. With md5's and lookup in a hash table we get a few percent of overhead, which is reasonable.
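The quick-reject idea can be sketched without the server. The normalizer below is deliberately crude (single-quoted strings and bare integers become `?`), and `std::hash` stands in for md5; both are assumptions made for the sake of a runnable example, and `normalize`, `digest_of` and `PatternSet` are hypothetical names.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>

// Hypothetical, much simplified normalizer: single-quoted strings and bare
// integers become '?'. The server's real normalization is far more thorough.
std::string normalize(const std::string &query)
{
  std::string out;
  std::size_t i= 0;
  while (i < query.size())
  {
    unsigned char c= static_cast<unsigned char>(query[i]);
    // A digit that continues an identifier (t1, col_2) is not a literal.
    bool after_word= !out.empty() &&
      (std::isalnum(static_cast<unsigned char>(out[out.size() - 1])) ||
       out[out.size() - 1] == '_');
    if (c == '\'')
    {
      ++i;                                        // skip opening quote
      while (i < query.size() && query[i] != '\'')
        ++i;
      if (i < query.size())
        ++i;                                      // skip closing quote
      out+= '?';
    }
    else if (std::isdigit(c) && !after_word)
    {
      while (i < query.size() &&
             std::isdigit(static_cast<unsigned char>(query[i])))
        ++i;
      out+= '?';
    }
    else
    {
      out+= query[i];
      ++i;
    }
  }
  return out;
}

// Stand-in digest: std::hash instead of md5.
std::size_t digest_of(const std::string &query)
{
  return std::hash<std::string>()(normalize(query));
}

// Digests of all patterns we want to rewrite. A miss means the query can be
// rejected at once; a hit only means it is worth a closer look.
class PatternSet
{
public:
  void add(const std::string &pattern) { m_digests.insert(digest_of(pattern)); }
  bool may_match(const std::string &query) const
  {
    return m_digests.count(digest_of(query)) != 0;
  }
private:
  std::unordered_set<std::size_t> m_digests;
};
```

Two queries that differ only in their literals normalize to the same text and therefore share a digest, so one hash lookup decides whether the expensive part is needed.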

The way it's used is that we get the positions of literals that should be replaced. Then we build a new string and hand it to the parser service.
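Building the new string from literal positions is plain string splicing. The following standalone sketch shows one way to do it; `literal_span` and `replace_literals` are hypothetical names and say nothing about how the Rewriter plugin is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of where one literal sits in the query text.
struct literal_span { std::size_t offset; std::size_t length; };

// Builds a new query string with each span replaced by the matching
// replacement. Spans must be sorted by offset and must not overlap.
std::string replace_literals(const std::string &query,
                             const std::vector<literal_span> &spans,
                             const std::vector<std::string> &replacements)
{
  assert(spans.size() == replacements.size());
  std::string out;
  std::size_t pos= 0;
  for (std::size_t i= 0; i < spans.size(); ++i)
  {
    assert(spans[i].offset >= pos);
    assert(spans[i].offset + spans[i].length <= query.size());
    out.append(query, pos, spans[i].offset - pos);  // text before the literal
    out+= replacements[i];
    pos= spans[i].offset + spans[i].length;
  }
  out.append(query, pos, std::string::npos);        // tail after last literal
  return out;
}
```

The original string is never modified; the result is a fresh string that can then be handed to the parser, in line with the "no destructive editing" rule.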

Covering the parser service exhaustively would be a full presentation of its own, so I will just have to skip most of the gritty details here.

And covering the Rewriter plugin would also be a full presentation of its own.

The Parser Service

In code (include/mysql/service_parser.h):

int mysql_parser_parse(MYSQL_THD thd, const MYSQL_LEX_STRING query,
                       unsigned char is_prepared,
                       sql_condition_handler_function handle_cond,
                       void *condition_handler_state);

MYSQL_LEX_STRING mysql_parser_get_normalized_query(MYSQL_THD thd);

int mysql_parser_get_statement_digest(MYSQL_THD thd, uchar *digest);

typedef
int (*parse_node_visit_function)(MYSQL_ITEM item, unsigned char* arg);

int mysql_parser_visit_tree(MYSQL_THD thd,
                            parse_node_visit_function processor,
                            unsigned char* arg);

MYSQL_LEX_STRING mysql_parser_item_string(MYSQL_ITEM item);

This is what actual code looks like. Just an excerpt. Literals are visited with a callback function parse_node_visit_function.

The Alloc Service

Lets a Plugin:

Allocate

Deallocate

Instrument

Code (include/mysql/service_mysql_alloc.h)

my_malloc

my_realloc

my_claim

my_free

my_memdup

my_strdup

my_strndup

The alloc service lets you, you guessed it, allocate memory. So why would you want to do that instead of rolling your own memory allocation?

The main reason is that you get instrumentation through pfs. The good thing is that this data sits in pfs together with the instrumentation data of the rest of the server.
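What "instrumented allocation" buys you can be illustrated with a toy allocator that keeps a per-key byte count. This is a standalone sketch and not how the server or pfs does it: `mock_memory_key`, `mock_malloc`, `mock_free` and `g_bytes_by_key` are hypothetical, and the code is not thread-safe.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>

// Hypothetical memory key type; in the server, keys are registered with
// the performance schema rather than being plain integers.
typedef unsigned int mock_memory_key;

// Bytes currently allocated per key. Illustration only, not thread-safe.
static std::map<mock_memory_key, std::size_t> g_bytes_by_key;

// Each block is prefixed with a small header so mock_free knows what to
// subtract. The union pads the header to the strictest alignment.
union mock_header
{
  struct { mock_memory_key key; std::size_t size; } info;
  std::max_align_t align;
};

void *mock_malloc(mock_memory_key key, std::size_t size)
{
  mock_header *header=
    static_cast<mock_header *>(std::malloc(sizeof(mock_header) + size));
  if (header == NULL)
    return NULL;
  header->info.key= key;
  header->info.size= size;
  g_bytes_by_key[key]+= size;        // account the allocation to its key
  return header + 1;                 // caller sees the memory after the header
}

void mock_free(void *ptr)
{
  if (ptr == NULL)
    return;
  mock_header *header= static_cast<mock_header *>(ptr) - 1;
  g_bytes_by_key[header->info.key]-= header->info.size;
  std::free(header);
}
```

Because every allocation names its key, a question like "how much memory is my plugin holding right now?" becomes a lookup rather than guesswork.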


Skeleton of a Post-Parse Query Rewrite Plugin

Doing the Post-Parse Rewrite

...
  if (event_parse->event_subclass == MYSQL_AUDIT_PARSE_POSTPARSE)
  {
    MYSQL_LEX_STRING first_literal= {NULL, 0};
    mysql_parser_visit_tree(thd, catch_literal,
                            (unsigned char*)&first_literal);
    if (first_literal.str != NULL)
    {
      size_t query_length=
        first_literal.length + event_parse->query.length + 25;
      first_literal.str[first_literal.length]='\0';
      char *rewritten_query=
        static_cast<char *>(my_malloc(key_memory_post_parse_example,
                                      query_length, MYF(0)));
      sprintf(rewritten_query, "/* First literal: %s */ %s",
              first_literal.str, event_parse->query.str);
      MYSQL_LEX_STRING new_query= {rewritten_query, query_length};
      mysql_parser_free_string(first_literal);
      mysql_parser_parse(thd, new_query, false, handle_parse_error, NULL);
      *((int *)event_parse->flags)|=
        (int)MYSQL_AUDIT_PARSE_REWRITE_PLUGIN_QUERY_REWRITTEN;
    }
  }

Now let's take a look at a post-parse rewrite. This is a quick example that I cooked up. The post-parse API is kind of limited and heavily geared towards the existing Rewriter plugin. That plugin is way too complex to cover here.

This will simply pick out the first literal in the query and put it in a comment. Not very useful, but hopefully it can illustrate how it works.

We start by using the alloc service to allocate a new query string again.

So we call the parser service with a callback function catch_literal that catches the first literal.

Then we start building a new string.

Then parse it.

Set the flag.

Skeleton of a Post-Parse Query Rewrite Plugin

Catching a Literal

int catch_literal(MYSQL_ITEM item, unsigned char* arg)
{
  MYSQL_LEX_STRING *result_string_ptr= (MYSQL_LEX_STRING*)arg;
  if (result_string_ptr->str == NULL)
  {
    *result_string_ptr= mysql_parser_item_string(item);
    return 0;
  }
  return 1;
}

Here's the callback function that catches the first literal.

Uses the parser service to print the item.
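The callback pattern can be tried out with a mock traversal. The sketch below is standalone: `mock_item`, `mock_visit_literals` and `catch_first` are hypothetical, and it assumes that a nonzero return value from the callback stops the traversal, which should be checked against the parser service header before relying on it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-ins: an "item" is just a literal's text here.
typedef const std::string *mock_item;
typedef int (*mock_visit_function)(mock_item item, unsigned char *arg);

// Calls visit for each literal in order. ASSUMPTION: a nonzero return
// value from the callback stops the traversal. Returns the number of
// callbacks made.
std::size_t mock_visit_literals(const std::vector<std::string> &literals,
                                mock_visit_function visit, unsigned char *arg)
{
  std::size_t calls= 0;
  for (std::size_t i= 0; i < literals.size(); ++i)
  {
    ++calls;
    if (visit(&literals[i], arg) != 0)
      break;
  }
  return calls;
}

// Same shape as catch_literal on the slide: store the first item, then
// ask for the traversal to stop on the next call. An empty result string
// means "nothing caught yet", so empty literals are not supported here.
int catch_first(mock_item item, unsigned char *arg)
{
  std::string *result= reinterpret_cast<std::string *>(arg);
  if (result->empty())
  {
    *result= *item;
    return 0;
  }
  return 1;
}
```

State travels through the `unsigned char *` argument, just as the slide passes the address of `first_literal`; the callback itself stays a plain function.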

Skeleton of a Post-Parse Query Rewrite Plugin

Result

mysql> INSTALL PLUGIN post_parse_example SONAME 'post_parse_example.so';
Query OK, 0 rows affected (0,01 sec)

mysql> SELECT 'abc', 'def';
+-----+-----+
| abc | def |
+-----+-----+
| abc | def |
+-----+-----+
1 row in set, 1 warning (0,00 sec)

mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
  Level: Note
   Code: 1105
Message: Query 'SELECT 'abc', 'def'' rewritten to '/* First literal: 'abc' */ SELECT 'abc', 'def'' by a query rewrite plugin
1 row in set (0,00 sec)

Here is the result of running the plugin.

Links

Blog posts:

mysqlserverteam.com/write-yourself-a-query-rewrite-plugin-part-1/

mysqlserverteam.com/the-query-rewrite-plugins/

In the Manual:

dev.mysql.com/doc/refman/5.7/en/plugin-api.html

dev.mysql.com/doc/refman/5.7/en/plugin-types.html

dev.mysql.com/doc/refman/5.7/en/plugin-services.html

dev.mysql.com/doc/refman/5.7/en/writing-audit-plugins.html

dev.mysql.com/doc/refman/5.7/en/performance-schema-statement-digests.html

In the Code:

include/mysql/service_parser.h

include/mysql/services.h

The preceding is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.
