DOING MORE WITH SQL John Reiser MAC URISA 2014

Upload: john-reiser

Post on 20-May-2015

DESCRIPTION

A talk I gave at the 2014 MAC URISA Conference in Atlantic City. GIS users often have little exposure to SQL. This talk gives a brief overview of SQL from a GIS user's perspective and provides some examples of how it can replace common ArcGIS/desktop GIS tasks to improve efficiency.

TRANSCRIPT

Page 1: Doing more with SQL

DOING MORE WITH SQL

John Reiser MAC URISA 2014

Page 2: Doing more with SQL

#MACURISA2014

DBMS Systems

- Many modern DBMSs support spatial data.
- Oracle, MS SQL Server, and PostgreSQL are the most often used.
- PostgreSQL
  - is open source: free to use and modify
  - is incredibly reliable, extensible, and powerful
  - provides spatial capabilities through PostGIS
- DBMSs allow for “enterprise” functionality, such as multiple concurrent users and high throughput.

Page 3: Doing more with SQL

Structured Query Language

- SQL is the standardized method of interacting with a database.
- Even Access allows you to use SQL.
- Common, hopefully familiar, statements and clauses:
  - SELECT (read records from the database)
  - INSERT (add new records to the database)
  - UPDATE (change existing records in the database)
  - DELETE (remove records from the database)
  - WHERE (a clause that limits which records a statement acts on)
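The statements listed above can be tried without a database server. The following is a minimal sketch using Python's built-in sqlite3 module rather than PostgreSQL; the towns table, its columns, and its values are hypothetical and exist only for illustration.

```python
import sqlite3

# Hypothetical in-memory table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE towns (name TEXT PRIMARY KEY, county TEXT, population INTEGER)"
)

# INSERT: add new records.
conn.executemany(
    "INSERT INTO towns (name, county, population) VALUES (?, ?, ?)",
    [("Alpha", "North", 1200), ("Bravo", "North", 300), ("Charlie", "South", 4500)],
)

# UPDATE: change existing records; WHERE limits which rows are touched.
conn.execute("UPDATE towns SET population = 350 WHERE name = 'Bravo'")

# DELETE: remove records; WHERE again limits the effect.
conn.execute("DELETE FROM towns WHERE county = 'South'")

# SELECT: read records back, filtered by WHERE.
rows = conn.execute(
    "SELECT name, population FROM towns WHERE population > 100 ORDER BY name"
).fetchall()
print(rows)
```

Without the WHERE clauses, the UPDATE and DELETE would act on every row in the table.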

Page 4: Doing more with SQL

Select Statements

- The most common SQL query you will encounter.
- “Select By Attributes” has this as its foundation.
- Nothing more than “SELECT * FROM gis_layer WHERE …”

Page 5: Doing more with SQL
Page 6: Doing more with SQL

Joins

- In ArcGIS or Access, you join two (or more) tables together using a key field they share.
- Where the keys match, the columns of the secondary tables are tacked on to the first.
- Again, geospatial is special, so GIS has another type of join: the spatial join.

Page 7: Doing more with SQL

Combining Tables

- The simplest combination of two tables would be to combine each record from table A with each record from table B.
- This is the Cartesian product.
- Example: A has 2 records, B has 3.
- A ✕ B: {(A1, B1), (A1, B2), (A1, B3), (A2, B1), (A2, B2), (A2, B3)}
- Let’s take a deck of cards as an example.
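The deck-of-cards idea can be sketched as a Cartesian product of two small tables. This example assumes SQLite (through Python's sqlite3 module) purely so it runs anywhere; the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE suits (card_suit TEXT)")
conn.execute("CREATE TABLE ranks (card_rank TEXT)")

suits = ["Clubs", "Diamonds", "Hearts", "Spades"]
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
conn.executemany("INSERT INTO suits VALUES (?)", [(s,) for s in suits])
conn.executemany("INSERT INTO ranks VALUES (?)", [(r,) for r in ranks])

# CROSS JOIN pairs every rank with every suit: 13 x 4 rows.
deck = conn.execute(
    "SELECT card_rank, card_suit FROM ranks CROSS JOIN suits"
).fetchall()
print(len(deck))
```

No join predicate is involved: every record in one table is paired with every record in the other.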

Page 8: Doing more with SQL
Page 9: Doing more with SQL
Page 10: Doing more with SQL

Joins

- Think of a join as limiting the Cartesian product of two tables down to just the specific records desired.
- The manner in which you form your SELECT … JOIN is important for two reasons:
  - Correctness: ensure the desired records and columns are returned.
  - Speed: how the JOIN is written affects how quickly it is performed.
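To illustrate a join as a filtered Cartesian product, here is a small sketch assuming SQLite via Python's sqlite3 module. The parcels and owners tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parcels (parcel_id INTEGER, owner_id INTEGER);
    CREATE TABLE owners  (owner_id INTEGER, owner_name TEXT);
    INSERT INTO parcels VALUES (1, 10), (2, 20), (3, 10);
    INSERT INTO owners  VALUES (10, 'Smith'), (20, 'Jones');
""")

# Every possible pairing: 3 parcels x 2 owners.
cartesian = conn.execute("SELECT * FROM parcels CROSS JOIN owners").fetchall()

# The same pairing, limited by a predicate in WHERE...
filtered = conn.execute("""
    SELECT p.parcel_id, o.owner_name
    FROM parcels p, owners o
    WHERE p.owner_id = o.owner_id
    ORDER BY p.parcel_id
""").fetchall()

# ...which is what INNER JOIN ... ON states directly.
joined = conn.execute("""
    SELECT p.parcel_id, o.owner_name
    FROM parcels p
    INNER JOIN owners o ON p.owner_id = o.owner_id
    ORDER BY p.parcel_id
""").fetchall()

print(len(cartesian), filtered == joined)
```

Both forms return the same records here; the explicit JOIN syntax keeps the matching rule separate from any other filtering.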

Page 11: Doing more with SQL

Spatial Joins

- The relationship is determined not by a key, but by proximity or connectivity.
- Contains/Within/Overlaps
  - One feature falls entirely (Contains/Within) or partially (Overlaps) within another
- Touches/Intersects/Crosses
  - One feature touches, meets, or passes through another
- Equals or Disjoint

Page 12: Doing more with SQL

Set Theory

- General terms first, because these concepts are used across GIS and not just in SQL.
  - Union
  - Intersection
  - Relative Complement
  - Symmetric Difference
- Terms should be somewhat familiar…

Page 13: Doing more with SQL

Union

- ArcToolbox: returns all features from both inputs, with new features created where they intersect.
- SQL: the set of all values from both tables.
- Join: a FULL JOIN. All values from both tables, with NULLs where there are no shared values.
- Venn: [diagram]

Page 14: Doing more with SQL

Unions are not Cartesian

- A Union / FULL JOIN leaves NULLs where there are no matches across tables. All records are returned, but they are not “shuffled” together as in the cards example.
- FULL JOINs still require a join predicate (an ON or USING clause) to match records.
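The contrast between a Cartesian product and a FULL JOIN can be sketched with two tiny tables. This example assumes SQLite via Python's sqlite3 module. Because older SQLite versions lack FULL JOIN, it is emulated here with two LEFT JOINs; in PostgreSQL the equivalent would simply be SELECT a.val, b.val FROM a FULL JOIN b ON a.val = b.val. Tables a and b are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (val TEXT);
    CREATE TABLE b (val TEXT);
    INSERT INTO a VALUES ('x'), ('y');
    INSERT INTO b VALUES ('y'), ('z');
""")

# Cartesian product: every record of a paired with every record of b.
cartesian = conn.execute("SELECT a.val, b.val FROM a CROSS JOIN b").fetchall()

# FULL JOIN emulation: all of a with its matches in b,
# plus the records of b that matched nothing in a.
full_join = conn.execute("""
    SELECT a.val, b.val FROM a LEFT JOIN b ON a.val = b.val
    UNION ALL
    SELECT a.val, b.val FROM b LEFT JOIN a ON a.val = b.val
    WHERE a.val IS NULL
""").fetchall()

print(len(cartesian), len(full_join))
```

Every record from both tables appears in the FULL JOIN result, matched where the predicate holds and padded with NULL (None in Python) where it does not.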

Page 15: Doing more with SQL

Example of Cartesian vs FULL

!  From Wikipedia:

Page 16: Doing more with SQL

Cartesian Product

Page 17: Doing more with SQL

FULL JOIN

Page 18: Doing more with SQL

Intersection

- ArcToolbox: returns a set where the geometries of two different feature classes overlap.
- SQL: only where the two tables share values.
- Join: an INNER JOIN, the intersection of two tables.
- Venn: [diagram]

Page 19: Doing more with SQL

LEFT & RIGHT JOINs

- ArcToolbox: called Update.
- SQL: all records in Table A, along with the matching columns/records from Table B.
- Join: a LEFT JOIN. All records from A are returned; columns from B contain NULL where there is no match. (A RIGHT JOIN is just an easy way of writing the reverse.)
- Venn: [diagram]
- Examples?
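One possible example of a LEFT JOIN, sketched with SQLite via Python's sqlite3 module. The munis and mayors tables and their contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE munis  (mun TEXT, county TEXT);
    CREATE TABLE mayors (mun TEXT, mayor TEXT);
    INSERT INTO munis  VALUES ('Alpha', 'North'), ('Bravo', 'North'), ('Charlie', 'South');
    INSERT INTO mayors VALUES ('Alpha', 'Lee'), ('Charlie', 'Ortiz');
""")

# Every municipality is returned; mayor is NULL where mayors has no match.
rows = conn.execute("""
    SELECT m.mun, y.mayor
    FROM munis m
    LEFT JOIN mayors y ON m.mun = y.mun
    ORDER BY m.mun
""").fetchall()
print(rows)
```

Swapping the table order and writing RIGHT JOIN would express the same result in databases that support it.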

Page 20: Doing more with SQL

Symmetric Difference

- ArcToolbox: returns a set where the geometries of feature class A and feature class B do not overlap.
- SQL: only where the two tables do not share values.
- Join: a FULL JOIN ON a.value = b.value, WHERE a.value IS NULL OR b.value IS NULL
- Venn: [diagram]
- Examples?
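One possible example of a symmetric difference, assuming SQLite via Python's sqlite3 module, with the FULL JOIN emulated by two LEFT JOINs for portability. Tables a and b are hypothetical. The sketch also shows why the unmatched rows have to be picked out with IS NULL: a comparison such as <> against NULL evaluates to NULL, so it never selects them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (val TEXT);
    CREATE TABLE b (val TEXT);
    INSERT INTO a VALUES ('x'), ('y');
    INSERT INTO b VALUES ('y'), ('z');
""")

# Emulated FULL JOIN of a and b on equal values.
FULL_JOIN = """
    SELECT a.val AS a_val, b.val AS b_val FROM a LEFT JOIN b ON a.val = b.val
    UNION ALL
    SELECT a.val, b.val FROM b LEFT JOIN a ON a.val = b.val
    WHERE a.val IS NULL
"""

# Symmetric difference: keep only the rows with no partner on one side.
sym_diff = conn.execute(
    f"SELECT a_val, b_val FROM ({FULL_JOIN}) WHERE a_val IS NULL OR b_val IS NULL"
).fetchall()

# Filtering with <> instead drops those rows, because NULL <> 'z' is NULL.
not_equal = conn.execute(
    f"SELECT a_val, b_val FROM ({FULL_JOIN}) WHERE a_val <> b_val"
).fetchall()

print(sym_diff, not_equal)
```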

Page 21: Doing more with SQL

Many types of Joins

- INNER and OUTER (LEFT, RIGHT, FULL)
- Different from the Cartesian product because some comparison value needs to be tested for truth.
- The truth test can be =, <>, <, or >; it can also be the result of a function.
- Spatial joins in SQL:
  - ST_Intersects(a.shape, b.shape)
  - ST_Contains(a.shape, b.shape), ST_Within()
  - ST_Overlaps(a.shape, b.shape)
  - ST_Touches(a.shape, b.shape)

Page 22: Doing more with SQL

Let’s look at some spatial joins.

Page 23: Doing more with SQL

Fire Stations in Town

- How can we calculate the number of fire stations within a municipality?
- Can we find the most?
- Can we find the fewest?
- How about those towns with no fire stations?
- How about those with a specific number of fire stations?
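The questions above could be approached along these lines. This is a sketch only: it uses SQLite via Python's sqlite3 module, which has no spatial functions, so towns are hypothetical rectangles and a bounding-box test stands in for PostGIS's ST_Contains(t.shape, s.shape). All table names, column names, and data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical data: towns as rectangles, fire stations as points.
    CREATE TABLE towns (mun TEXT, xmin REAL, ymin REAL, xmax REAL, ymax REAL);
    CREATE TABLE stations (station TEXT, x REAL, y REAL);
    INSERT INTO towns VALUES
        ('Alpha',    0,  0, 10, 10),
        ('Bravo',   10,  0, 20, 10),
        ('Charlie',  0, 10, 10, 20);
    INSERT INTO stations VALUES
        ('Engine 1', 2, 3), ('Engine 2', 7, 8), ('Engine 3', 15, 5);
""")

# Stand-in for ST_Contains(t.shape, s.shape): point-in-rectangle test.
CONTAINS = "s.x >= t.xmin AND s.x < t.xmax AND s.y >= t.ymin AND s.y < t.ymax"

# LEFT JOIN keeps towns with no stations; COUNT(s.station) skips NULLs,
# so those towns report 0 instead of disappearing.
counts = conn.execute(f"""
    SELECT t.mun, COUNT(s.station) AS n
    FROM towns t
    LEFT JOIN stations s ON {CONTAINS}
    GROUP BY t.mun
    ORDER BY n DESC, t.mun
""").fetchall()

# Aggregates are filtered with HAVING, not WHERE.
no_stations = conn.execute(f"""
    SELECT t.mun
    FROM towns t
    LEFT JOIN stations s ON {CONTAINS}
    GROUP BY t.mun
    HAVING COUNT(s.station) = 0
""").fetchall()

print(counts, no_stations)
```

With an INNER JOIN in place of the LEFT JOIN, towns without a station would drop out of the result entirely, so the lowest count reported would never be zero. Changing the HAVING condition to another value finds towns with a specific number of stations.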

Page 24: Doing more with SQL

Fire Stations

Page 25: Doing more with SQL

Watch your groups!

Page 26: Doing more with SQL

Lowest Count… right?

Page 27: Doing more with SQL

Left Join on ST_Contains()

Page 28: Doing more with SQL

Aggregates in WHERE

Page 29: Doing more with SQL

Quick Subquery

Page 30: Doing more with SQL

Bus Routes

- How can we find the towns that are along a given bus route?
- How do we find the routes that cross through a town?
- How do we find the towns without service?
- bus.line = 553 AND ST_Intersects(bus.shape, mun.shape)

Page 31: Doing more with SQL

Self-Joins

- A table can be referenced twice in the same query.
- How could we use this to generate a “neighbor” list?
- How would we generate that list of towns?
- FROM nj_munis m, nj_munis x WHERE m.mun <> x.mun AND ST_Touches(m.shape, x.shape)
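A self-join of this shape can be sketched without PostGIS. The example below assumes SQLite via Python's sqlite3 module and places hypothetical towns on a grid; "one grid step apart" is only a stand-in for ST_Touches(m.shape, x.shape). The grid_x and grid_y columns and the town names are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical towns laid out on a grid instead of real geometries.
    CREATE TABLE nj_munis (mun TEXT, grid_x INTEGER, grid_y INTEGER);
    INSERT INTO nj_munis VALUES
        ('Alpha', 0, 0), ('Bravo', 1, 0), ('Charlie', 0, 1), ('Delta', 5, 5);
""")

# The same table appears twice, under aliases m and x.
# m.mun <> x.mun keeps a town from being listed as its own neighbor.
neighbors = conn.execute("""
    SELECT m.mun, x.mun
    FROM nj_munis m, nj_munis x
    WHERE m.mun <> x.mun
      AND abs(m.grid_x - x.grid_x) + abs(m.grid_y - x.grid_y) = 1
    ORDER BY m.mun, x.mun
""").fetchall()
print(neighbors)
```

Each neighboring pair appears twice, once from each town's point of view, which is usually what a per-town neighbor list needs.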

Page 32: Doing more with SQL

Denny's & La Quinta

- Using SQL to remove the humor from jokes…

SELECT d.city, d.state,
       ST_Transform(d.shape, 2163) <-> ST_Transform(l.shape, 2163) AS distance
FROM dennys d, laquinta l
WHERE (ST_Transform(d.shape, 2163) <-> ST_Transform(l.shape, 2163)) < 150
ORDER BY 3;

Page 33: Doing more with SQL

A Denny's between two La Quintas. Huntsville, Alabama

Page 34: Doing more with SQL

Power of SQL

- Speed.
- Flexibility.
- Data integrity and control.
- Automated reports as data changes.
- Views and functions can help automate and streamline your GIS workflow.
- A bit of a learning curve, but SQL is a standard and is supported and understood by a wide variety of applications and data stores.

Page 35: Doing more with SQL

More Info and Thanks!

- John Reiser
  - email: [email protected]
  - twitter: @johnjreiser
  - code: github.com/johnjreiser
- Articles on New Jersey Geographer: http://njgeo.org/
- "Mitch Hedberg & GIS" (using PostgreSQL): http://njgeo.org/2014/01/30/mitch-hedberg-and-gis/