andrew dunstan 9.3 json presentation @ postgres open 2013

40
PostgreSQL 9.3 and JSON Andrew Dunstan [email protected] [email protected]

Upload: postgresopen

Post on 30-May-2015

1.591 views

Category:

Technology


4 download

TRANSCRIPT

Page 1: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

PostgreSQL 9.3 and JSON

Andrew Dunstan

[email protected]@pgexperts.com

Page 2: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

Overview

● What is JSON?● Why use JSON?● Quick review of 9.2 features● 9.3 new features● Building on 9.3 features● Some useful extensions● Future work

Page 3: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

What is JSON?

● Data serialization format● Lightweight● Human readable● Becoming ubiquitous● Simpler and more compact than XML

Page 4: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

What it looks like

{ "books" : [ { "title": "Catch 22”, "author": "Joseph Heller"}, { "title": "Catcher in the Rye", "author": "J. D. Salinger"} ], "publishers": [ { "name": "Random House" }, { "name": "Penguin" } ], "active": true, "version": 35, "date": "2003-09-13", “reference”: null}

Scalars:● quoted strings● numbers● true, false, null

No extensions

No date/time types

Page 5: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

Why use it?

● Everyone is moving that way● Understood everywhere there is a JavaScript

interpreter– Especially browsers

● ... and in a large number of other languages– e.g. Perl, Python

● node.js is becoming very widely used● More compact than XML● Most applications don't need the richer

structure of XML

Page 6: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

Why not use it?

● Overly verbose● Field names are repeated

● Arguably less readable than, say, YAML● Not suitable for huge objects

Page 7: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

Review – pre 9.2 facilities

Nothing – store JSON as text● No validation● No JSON production● No JSON extraction

Page 8: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

Review – 9.2 data type

New JSON type● Stored as text● Reasonably performant state-based validating

parser● Kudos: Robert Haas

Page 9: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

Review – 9.2 production functions

● Turn non-JSON data into JSON● row_to_json(anyrecord)● array_to_json(anyarray)● Optional second param for pretty printing

● My humble contribution ☺

Page 10: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

What's missing?

● JSON production features are incomplete● JSON processing is totally absent

● Have to use PLV8, PLPerl or some such

Page 11: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

9.3 Features – JSON production

● to_json(any)● Can be used on any datum, not just

arrays and records

● json_agg(record)● Much faster than

array_to_json(array_agg(record))

Page 12: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

9.3 and casts to JSON

● Production functions honor casts to JSON for non-builtin types

● Not needed for builtins, as we know how to convert them

● Saves syscache lookups where we know it's not necessary

Page 13: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

9.3 hstore and JSON

● hstore_to_json(hstore)● Also used as a cast function

● hstore_to_json_loose(hstore)● Uses heuristics about whether or not

certain possibly numeric and boolean values need to be quoted.

Page 14: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

9.3 JSON parser rewrite

● New parser uses recursive descent pattern

● Caller can supply event handlers for certain events

● c.f. XML SAX parsers● Validator uses NULL handlers for all

events

● Tokenizing routines of previous parser largely kept

Page 15: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

9.3 JSON processsing functions

● All leverage new parser API● Operators give a more natural style to

extraction operations● Many have two forms, producing either

JSON output, which can be further processed, or text output, which cannot.

● Text output is de-escaped and dequoted

Page 16: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

9.3 extraction operators (1)

● -> fetch an array element or object member as json

● '[4,5,6]'::json->2 ⟹ 6– json arrays are 0 based, unlike SQL arrays

● '{"a":1,"b":2}'::json->'b' ⟹ 2

Page 17: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

9.3 extraction operators (2)

● ->> fetch an array element orobject member as text

● '["a","b","c"]'::json->2 ⟹ c– Instead of "c"

Page 18: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

9.3 extraction operators (3)

● #> and #>> fetch data pointed at by a path

● Path is an array of text elements● Treats arrays correctly by some trying to

treat path element as an integer of necessary

● '{"a":[6,7,8]}'::json#>'{a,1}' ⟹ 7●

Page 19: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

9.3 extraction functions

● json_extract_path(json, VARIADIC path_elems text[]);

● json_extract_path_text(json, VARIADIC path_elems text[]);

● Same as #> and #>> operators, but you can pass the path as a variadic array

● json_extract_path('{"a":[6,7,8]}','a','1') ⟹ 7

Page 20: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

9.3 turn JSON into records

● CREATE TYPE x AS (a int, b int);● SELECT * FROM

json_populate_record(null::x, '{"a":1,"b":2}', false);

● SELECT * FROM json_populate_recordset(null::x,'[{"a":1,"b":2},{"a":3,"b":4}]', false);

Page 21: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

9.3 turn JSON into key/value pairs

● SELECT * FROM json_each('{"a":1,"b":"foo"}')

● SELECT * FROM json_each_text('{"a":1,"b":"foo"}')

● Deliver columns named “key” and “value”

Page 22: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

9.3 get keys from JSON object

● SELECT * FROM json_object_keys('{"a":1,"b":"foo"}')

Page 23: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

9.3 JSON array processing

● SELECT json_array_length('[1,2,3,4]');● SELECT * FROM

json_array_elements('[1,2,3,4]')

Page 24: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

9.3 the JSON C API

● Need● A state object● Some handlers● Stash these in a JsonSemAction object● call pg_json_parse()

Page 25: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

9.3 API example

● Code can be cloned from https://bitbucket.org/adunstan/json_typeof

● CREATE FUNCTION json_typeof(json) RETURNS text AS 'MODULE_PATHNAME' LANGUAGE C STRICT IMMUTABLE;

Page 26: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

9.3 API example state object

● #include <utils/jsonapi.h>● typedef struct typeof_state {

JsonLexContext *lex;JsonTokenType result;

} TypeofState;

Page 27: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

9.3 API example event handlers

● Need one for each of:● Array start● Object start● Scalar

Page 28: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

9.3 API example array/object start handlers

● static void json_typeof_astart (void *state){ TypeofState *_state = (TypeofState *) state; if (_state->lex->lex_level == 0)

_state->result = JSON_TOKEN_ARRAY_START;}static void json_typeof_ostart (void *state){ TypeofState *_state = (TypeofState *) state; if (_state->lex->lex_level == 0)

_state->result = JSON_TOKEN_OBJECT_START;}

Page 29: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

9.3 API example scalar handler

● static void json_typeof_scalar( void *state, char *token, JsonTokenType tokentype){ TypeofState *_state = (TypeofState *) state; if (_state->lex->lex_level == 0)

_state->result = tokentype;}

Page 30: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

9.3 example function codeDatum json_typeof(PG_FUNCTION_ARGS){ text *json = PG_GETARG_TEXT_P(0); TypeofState *state; JsonLexContext *lex = makeJsonLexContext(json, false); JsonSemAction *sem;

state = palloc0(sizeof(TypeofState)); sem = palloc0(sizeof(JsonSemAction)); state->lex = lex; sem->semstate = (void *) state; sem->object_start = json_typeof_ostart; sem->array_start = json_typeof_astart; sem->scalar = json_typeof_scalar; pg_parse_json(lex, sem); PG_RETURN_TEXT_P(token_to_type(state->result));}

Page 31: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

9.3 CAPI – what else is it good for?

● Transformations of all kinds● XML● YAML● ....

● An XML transformation was the original working model for creating the API

Page 32: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

Other useful modules

● Developed for use by my clients● json_build lets you compose json of arbitrary

complexity● json_object creates a json object from a 1- or

2-dimensional array of text, similar to hstore facility.

Page 33: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

json_build

● Two functions:● build_json_object(VARIADIC “any”)● build_json_array(VARIADIC “any”)● Together these let you compose deeply nested

tree structures safely.

Page 34: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

json_build example

SELECT build_json_object( 'a', build_json_object('b',false,'c',99), 'd', build_json_object('e',array[9,8,7]::int[], 'f', (select row_to_json(r) from ( select relkind, oid::regclass as name from pg_class where relname = 'pg_class') r)));

{"a" : {"b" : false, "c" : 99}, "d" : {"e" : [9,8,7], "f" : {"relkind":"r","name":"pg_class"}}}

Page 35: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

json_object examples

SELECT json_object('{a,1,b,2,3,NULL,"d e f","a b c"}');

json_object

-------------------------------------------------------

{"a" : "1", "b" : "2", "3" : null, "d e f" : "a b c"}

-- same but with two dimensions

SELECT json_object('{{a,1},{b,2},{3,NULL},{"d e f","a b c"}}');

json_object

-------------------------------------------------------

{"a" : "1", "b" : "2", "3" : null, "d e f" : "a b c"}

Page 36: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

json_object examples

SELECT json_object('{a,1,b,2,3,NULL,"d e f","a b c"}');

json_object

-------------------------------------------------------

{"a" : "1", "b" : "2", "3" : null, "d e f" : "a b c"}

-- same but with two dimensions

SELECT json_object('{{a,1},{b,2},{3,NULL},{"d e f","a b c"}}');

json_object

-------------------------------------------------------

{"a" : "1", "b" : "2", "3" : null, "d e f" : "a b c"}

Page 37: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

Extension repos

● https://github.com/pgexperts/json_build● https://bitbucket.org/qooleot/json_object● Also on PGXN

Page 38: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

Future of JSON in PostgreSQL

● Possible binary representation● Avoid reparsing● Could be vastly faster for some operations● Should we use a dictionary and store structure

separately?

● General document store● Need to get around the “rewrite a whole

datum” issue

Page 39: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

Credit

● 9.3 core development work sponsored by Heroku

● Other work sponsored by PGExperts clients and IVC

Page 40: Andrew Dunstan 9.3 JSON Presentation @ Postgres Open 2013

Questions?