gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015
![Page 1: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/1.jpg)
2ndQuadrant Italia Giuseppe Broccolo – [email protected]
FOSS4G.EU 2015, Como, Politecnico di Milano
July 14th-17th 2015
Use of indexes on geospatial databases with the PostgreSQL DBMS
Giuseppe Broccolo
www.2ndquadrant.it
![Page 2: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/2.jpg)
$~# whoami
• PostgreSQL and PostGIS consultant
– Development, Replication, Disaster Recovery, pre-production Benchmark, Remote DBA, 24/7 Support, Training
![Page 3: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/3.jpg)
Outline
• Indexes on geospatial DBs
• What does PostgreSQL offer?
• Examples of usage:
– Points in PostgreSQL
– Points in the PostGIS extension
– (LiDAR) points in the PointCloud extension
![Page 4: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/4.jpg)
Indexes on geospatial databases
• Binary structure used to speed up access to data:
– In the case of trees: balanced/unbalanced structure of nodes
– Theoretical performance:
  • R/W: ~O(log N), Size: ~O(N)
– Algorithms are defined not by ordering/comparison operators but by placement operators
– Index nodes are defined starting from the MBR (minimum bounding rectangle) containing the whole dataset
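The last two bullets can be illustrated with a minimal Python sketch. It is not PostgreSQL code: the function names `mbr` and `quadrant` and the sample points are assumptions made up for the illustration. It computes the MBR of a point set, then applies a simple placement rule that assigns each point to one of the four quadrants of that MBR, which is the kind of decision a space-partitioning index makes instead of a "less than / greater than" comparison.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def mbr(points: Iterable[Point]) -> Box:
    """Return the minimum bounding rectangle of a non-empty point set."""
    pts: List[Point] = list(points)
    if not pts:
        raise ValueError("the MBR of an empty set is undefined")
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return (min(xs), min(ys), max(xs), max(ys))


def quadrant(box: Box, p: Point) -> int:
    """Placement rule: index (0..3) of the quadrant of `box` that holds `p`.

    0 = lower-left, 1 = lower-right, 2 = upper-left, 3 = upper-right.
    """
    xmin, ymin, xmax, ymax = box
    cx, cy = (xmin + xmax) / 2, (ymin + ymax) / 2
    return (1 if p[0] >= cx else 0) + (2 if p[1] >= cy else 0)


# Hypothetical sample data, chosen only for the illustration.
sample = [(0.1, 0.2), (0.9, 0.4), (0.6, 0.8)]
root = mbr(sample)
print("root MBR:", root)
print("quadrants:", [quadrant(root, p) for p in sample])
```

Applying the same rule recursively inside each quadrant yields a quad-tree; an R-tree instead groups nearby entries and stores the MBR of each group in the parent node.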
![Page 5: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/5.jpg)
MBR
![Page 6: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/6.jpg)
MBR
![Page 7: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/7.jpg)
MBR
![Page 8: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/8.jpg)
MBR
Balanced:
● R-tree, etc.
![Page 9: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/9.jpg)
MBR
Unbalanced:
● Kd-tree, Quad-tree, etc.
![Page 10: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/10.jpg)
What PostgreSQL offers
• “in core” 2D geometric (not geographic) datatypes
– Fixed resolution: double precision
– point, circle, box
– @-@, @@, <->, &&, <<, >>, <<|, |>>, ...
![Page 11: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/11.jpg)
What PostgreSQL offers
• PostGIS extension:
– geometry, geography
– <@, @>, &&, <<, >>, <<|, |>>, ...
– ST_Length(), ST_Distance(), ...
![Page 12: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/12.jpg)
Tree indexes in PostgreSQL
• Balanced indexes
– B-Tree
– GIN (Generalized Inverted Index) – fast accesses to data
– GiST (Generalized Search Tree) – good concurrency, “lossy”
  • kNN searches
• Unbalanced index
– SP-GiST (Space-Partitioned GiST) – low I/O
  • Introduced in PostgreSQL 9.2
  • Usable in PostGIS >2.1
![Page 14: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/14.jpg)
Work with 2D point sets
• The test environment: Vagrant VM (Ubuntu 14.04)
– Single virtual core 2.26GHz, RAM 512MB, 7.2k rpm disk
• PostgreSQL 9.4 + PostGIS 2.1
– postgresql.conf: default
• ~10M points
– Nearest Neighbours search
– Bounding Box inclusion
![Page 17: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/17.jpg)
Index creation on the 2D sample
– point datatype supports both GiST and SP-GiST indexing
=# CREATE INDEX idx_gist_point ON many_point USING gist(point);
=# CREATE INDEX idx_spgist_point ON many_point USING spgist(point);
– geometry(point,0) datatype supports only GiST indexing
=# CREATE INDEX idx_gist_geom ON many_geom USING gist(point);
=# CREATE INDEX idx_spgist ON many_geom USING spgist(point);
ERROR: data type geometry has no default operator class for access method "spgist"
HINT: You must specify an operator class for the index or define a default operator class for the data type.
![Page 18: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/18.jpg)
Index creation on the 2D sample
| index | index size | table size | time |
|---|---|---|---|
| idx_gist_point | 715MB | 653MB | 214s |
| idx_spgist_point | 437MB | 653MB | 137s |
| idx_gist_geom | 523MB | 501MB | 290s |
![Page 19: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/19.jpg)
Nearest Neighbours search (2D)
– point
SELECT *
FROM many_point
ORDER BY point(0.5, 0.5) <-> point LIMIT 10;
– geometry(point,0)
SELECT *
FROM many_geom
ORDER BY ST_MakePoint(0.5, 0.5) <-> geom LIMIT 10;
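To make clear what an `ORDER BY ... <-> ... LIMIT 10` query has to compute when no index is available, here is a minimal brute-force sketch in Python. It is an analogy, not PostgreSQL internals: the helper name `knn_bruteforce`, the random data and the sample size are assumptions chosen for the illustration. Every point is visited and its distance to the query point is evaluated, so the cost grows linearly with the table size; a kNN-capable index avoids exactly this full pass.

```python
import heapq
import math
import random


def knn_bruteforce(points, query, k):
    """Return the k points closest to `query`, nearest first.

    Every point is examined once, which mirrors a full scan followed
    by a sort on the distance.
    """
    return heapq.nsmallest(k, points, key=lambda p: math.dist(p, query))


# Hypothetical data set: uniform random points in the unit square.
random.seed(42)
points = [(random.random(), random.random()) for _ in range(100_000)]

nearest = knn_bruteforce(points, (0.5, 0.5), 10)
for p in nearest:
    print(p, round(math.dist(p, (0.5, 0.5)), 6))
```

`math.dist` requires Python 3.8 or later.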
![Page 20: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/20.jpg)
Nearest Neighbours search (2D)
• Query timing (without & with indexes):
| datatype | planner strategy | exec. time |
|---|---|---|
| point | Seq. Scan + Sort | 7.3s |
| point | Index Scan (idx_gist_point) | 52ms |
| geometry(point,0) | Seq. Scan + Sort | 17.2s |
| geometry(point,0) | Index Scan (idx_gist_geom) | 18ms |
![Page 21: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/21.jpg)
Bounding Box inclusion (2D)
– point
SELECT *
FROM many_point
WHERE point <@ box(point(0.4, 0.4), point(0.6, 0.6));
– geometry(point,0)
SELECT *
FROM many_geom
WHERE point && ST_MakeBox2D(ST_MakePoint(0.4, 0.4), ST_MakePoint(0.6, 0.6));
![Page 22: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/22.jpg)
Bounding Box inclusion (2D)
• Query timing (without & with indexes):
| datatype | planner strategy | exec. time |
|---|---|---|
| point | Seq. Scan + <@ | 5.7s |
| point | Index Scan (idx_spgist_point) | 0.4s |
| geometry(point,0) | Seq. Scan + && | 2.0s |
| geometry(point,0) | Index Scan (idx_gist_geom) | 0.7s |
Unbalanced indexes intrinsically provide boxed samples in their nodes, and these are used in BB inclusion!
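A small quad-tree sketch in Python shows why boxed nodes help a bounding-box search. This is an illustrative toy, not the SP-GiST implementation: the class `QuadTree`, its `capacity` and `max_depth` parameters, and the random data are all assumptions. Each node covers a box, and a query skips any node whose box does not overlap the search box, so whole regions of the data are discarded without looking at their points.

```python
import random


def overlaps(a, b):
    """True if boxes a and b (xmin, ymin, xmax, ymax) intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def contains(box, p):
    """True if point p lies inside box (borders included)."""
    return box[0] <= p[0] <= box[2] and box[1] <= p[1] <= box[3]


class QuadTree:
    """Toy point quad-tree: every node covers one box of the plane."""

    def __init__(self, box, capacity=8, depth=0, max_depth=20):
        self.box = box
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth
        self.points = []
        self.children = None  # four sub-quadrants once the node is split

    def insert(self, p):
        if self.children is not None:
            self._child_for(p).insert(p)
            return
        self.points.append(p)
        # Split an overfull leaf; max_depth stops endless splitting
        # when many identical points are inserted.
        if len(self.points) > self.capacity and self.depth < self.max_depth:
            self._split()

    def _center(self):
        xmin, ymin, xmax, ymax = self.box
        return (xmin + xmax) / 2, (ymin + ymax) / 2

    def _split(self):
        xmin, ymin, xmax, ymax = self.box
        cx, cy = self._center()
        args = (self.capacity, self.depth + 1, self.max_depth)
        self.children = [
            QuadTree((xmin, ymin, cx, cy), *args),
            QuadTree((cx, ymin, xmax, cy), *args),
            QuadTree((xmin, cy, cx, ymax), *args),
            QuadTree((cx, cy, xmax, ymax), *args),
        ]
        pending, self.points = self.points, []
        for p in pending:
            self._child_for(p).insert(p)

    def _child_for(self, p):
        cx, cy = self._center()
        return self.children[(1 if p[0] >= cx else 0) + (2 if p[1] >= cy else 0)]

    def query(self, q, out):
        """Append the points inside box q to `out`; return nodes examined."""
        if not overlaps(self.box, q):
            return 0  # the whole quadrant is pruned
        examined = 1
        if self.children is None:
            out.extend(p for p in self.points if contains(q, p))
        else:
            for child in self.children:
                examined += child.query(q, out)
        return examined


# Hypothetical data set: uniform random points in the unit square.
random.seed(1)
pts = [(random.random(), random.random()) for _ in range(20_000)]
tree = QuadTree((0.0, 0.0, 1.0, 1.0))
for p in pts:
    tree.insert(p)

found = []
examined = tree.query((0.4, 0.4, 0.6, 0.6), found)
print(len(found), "points found,", examined, "nodes examined")
```

The search box here matches the one used in the 2D queries of the slides, (0.4, 0.4) to (0.6, 0.6).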
![Page 24: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/24.jpg)
Work with (many) 3D points in PostgreSQL
• The OpenGeo suite (Boundless – P. Ramsey)
– Includes the postgis and pointcloud extensions
  • Casting between the two point datatypes is allowed
  • pointcloud allows the use of patches to reduce the whole data size
– No packages available to work with PostgreSQL 9.4
– Can import LiDAR data from .LAS files
http://suite.opengeo.org/4.1/whatsnew.html
http://suite.opengeo.org/opengeo-docs/dataadmin/pointcloud/loadingdata.html#loading-with-pdal
![Page 25: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/25.jpg)
An example of usage: 1G points cloud
• The test environment:
– 16GB RAM, 1TB RAID1 storage, 8 CPU @3.3GHz, PostgreSQL 9.3
• Use the pointcloud extension
– one point → one record
• Search points inside a BB and NN
[Figure: point schema with four dimensions of 4B, 4B, 4B and 2B]
http://suite.opengeo.org/opengeo-docs/dataadmin/pointcloud/schemas.html
![Page 26: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/26.jpg)
Build the index
| table size | GiST index size | building time |
|---|---|---|
| 56GB | 59GB | 6h |
CREATE INDEX pc_gist_idx ON pcpoints USING gist(Geometry(pt));
You have to cast to the PostGIS point datatype to use the GiST index
![Page 27: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/27.jpg)
BB inclusion with 1G points cloud
| included points | execution time (no index) | execution time (with index) |
|---|---|---|
| 1M | 798s | 208ms |
| 10M | - | 9.27s |
| 100M | - | 99.7s |
| 300M | - | 682s |
SELECT * FROM pcpoints
WHERE Geometry(pt) &&
ST_SetSRID(ST_3DMakeBox(ST_MakePoint(0, 0, 100),
ST_MakePoint(100, 100, 500)), 4326);
Index is always used!
![Page 28: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/28.jpg)
BB inclusion with 1G points cloud using patches
WITH sel AS (
SELECT PC_Explode(pa) AS pc FROM pcpatch
WHERE ST_SetSRID(ST_GeomFromEWKB(PC_Envelope(pa)), 4326) &&
ST_SetSRID(ST_3DMakeBox(ST_MakePoint(0, 0, 100),
ST_MakePoint(100, 100, 500)), 4326)
)
SELECT pc FROM sel
WHERE ST_Within(Geometry(pc),
ST_SetSRID(ST_3DMakeBox(ST_MakePoint(0, 0, 100),
ST_MakePoint(100, 100, 500)), 4326));
100k patches, 10k points/patch (2h, 9.4GB)
http://suite.opengeo.org/4.1/dataadmin/pointcloud/objects.html#pcpatch
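The two-stage structure of the query above (select candidate patches by envelope, then explode them and test each point) can be sketched in plain Python. This is an analogy under stated assumptions, not the pointcloud API: `envelope`, `boxes_overlap`, `in_box` and `query_patches` are hypothetical helpers, the patches are synthetic 100 x 100 tiles of 50 points each, and the envelope test is done in 3D for simplicity.

```python
import random


def envelope(points):
    """Bounding box (xmin, ymin, zmin, xmax, ymax, zmax) of a patch."""
    xs, ys, zs = zip(*points)
    return (min(xs), min(ys), min(zs), max(xs), max(ys), max(zs))


def boxes_overlap(a, b):
    """True if two 3D boxes intersect on every axis."""
    return all(a[i] <= b[i + 3] and b[i] <= a[i + 3] for i in range(3))


def in_box(box, p):
    """True if the 3D point p lies inside box (borders included)."""
    return all(box[i] <= p[i] <= box[i + 3] for i in range(3))


def query_patches(patches, box):
    """Two-stage search over (envelope, points) pairs.

    Stage 1 keeps only patches whose envelope overlaps the box
    (cheap, one test per patch). Stage 2 "explodes" the surviving
    patches and tests every point exactly.
    """
    candidates = [pts for env, pts in patches if boxes_overlap(env, box)]
    hits = [p for pts in candidates for p in pts if in_box(box, p)]
    return hits, len(candidates)


# Synthetic data: a 10 x 10 grid of tiles, one patch per tile.
random.seed(7)
patches = []
for tx in range(10):
    for ty in range(10):
        pts = [
            (
                tx * 100 + random.uniform(0, 100),
                ty * 100 + random.uniform(0, 100),
                random.uniform(0, 600),
            )
            for _ in range(50)
        ]
        patches.append((envelope(pts), pts))

# Same corners as the slide query: (0, 0, 100) to (100, 100, 500).
box = (0, 0, 100, 100, 100, 500)
hits, n_candidates = query_patches(patches, box)
print(len(hits), "points from", n_candidates, "of", len(patches), "patches")
```

The trade-off is the one the slide points at: far fewer entries to index and filter (one per patch instead of one per point), at the price of exploding every candidate patch.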
![Page 29: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/29.jpg)
BB inclusion with 1G points cloud using patches
| included points | execution time (search of patches) | execution time (patch explosion) |
|---|---|---|
| 1M | 520ms | 3s |
| 10M | 3.8s | 16.5s |
| 100M | 33.8s | 150s |
So... indexed searches are faster!
![Page 30: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/30.jpg)
Nearest neighbours search with 1G points cloud
| searched points | execution time (no index) | execution time (with index) |
|---|---|---|
| 1M | 2000s | 1.41s |
| 10M | - | 13.8s |
SELECT *
FROM pcpoints
ORDER BY ST_SetSRID(ST_MakePoint(0, 0, 0), 4326) <-> Geometry(pt)
LIMIT <searched points>;
Index blocks in memory are used, then SeqScans:
| searched points | execution time |
|---|---|
| 100M | 2100s |
![Page 32: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/32.jpg)
Conclusions
• PostgreSQL includes many features to work with geospatial entities
– 2D in-core geometries, PostGIS, PointCloud (, ...)
• Indexes can be successfully used
– Improved performance for geospatial entities introduced with PostGIS
  • Waiting for SP-GiST indexes (PostGIS >2.1)
• The performance achieved with larger numbers of entries shows that the geospatial features of the PostgreSQL DBMS are suitable for the 100M-1G range
![Page 33: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/33.jpg)
Questions?
• @giubro
• gemini__81
• gbroccolo7
![Page 34: gbroccolo - Use of Indexes on geospatial databases with PostgreSQL - FOSS4G.EU 2015](https://reader031.vdocuments.mx/reader031/viewer/2022020106/55cef40ebb61ebca3d8b479b/html5/thumbnails/34.jpg)
Creative Commons License
Copyright 2012-2015,
2ndQuadrant Italia - http://www.2ndquadrant.it
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License