
Queries, overlays and lookup tables

a.k.a. ‘How to preprocess your data in a meaningful way’

Devis Tuia, v0.1, March 14 2018

Why are we talking about it

In a GIS project, we create a LOT of datasets

● We like to avoid unnecessary information (it takes up space)

● We like to avoid redundancy (it makes a mess)

● We like to avoid erroneous calculations (sometimes the tool used has an impact on the final result)


Example: a “Merge” creates duplicates!

Attribute table after the merge:

ID   length
1    100     (same feature as ID 2)
2    100     (same feature as ID 1)
3    10

So if you calculate the total length, it will be 210 instead of 110.
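The arithmetic of the merge example can be checked with a minimal sketch in plain Python. The geometry tuples are hypothetical stand-ins for the real shapes; only the IDs and lengths come from the slide.

```python
# Toy attribute table after a merge: rows 1 and 2 describe the same segment.
# The "geometry" values are made up; they only serve to detect duplicates.
merged = [
    {"ID": 1, "geometry": ((0, 0), (100, 0)), "length": 100},
    {"ID": 2, "geometry": ((0, 0), (100, 0)), "length": 100},  # duplicate of ID 1
    {"ID": 3, "geometry": ((100, 0), (110, 0)), "length": 10},
]

# Summing blindly counts the duplicated segment twice.
naive_total = sum(row["length"] for row in merged)

# Keep one row per distinct geometry before summing.
unique_by_geometry = {row["geometry"]: row for row in merged}
deduplicated_total = sum(row["length"] for row in unique_by_geometry.values())

print(naive_total, deduplicated_total)
```

Comparing geometries exactly is a simplification; real duplicates may differ slightly and need a tolerance.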

Menu of the day

Different families of data handling types (can be queries, transformations or alterations)

Reviewing useful tools from each family

Introduction of lookup tables (called “reclassification tables” in ArcGIS)


Families of handling types

There are two main families

● Query-like: they work on the features, but do not modify the attributes

This means: you select, cut and paste, but the attributes stay the same.

● Overlays: they modify both geometry and attributes.

This means: the resulting features have a different set of attributes than the originals.


QUERIES, part 1

selecting


Queries, group 1: the select

Select only highlights features according to a query

It can be a

● Spatial selection: ‘select buildings WITHIN Wageningen’

● Attribute selection: ‘select buildings where GM_NAAM = 'Wageningen'’

In the first case you will need a second (polygon) feature class with the municipality boundaries; in the second, you need the municipality of every building stored in the attribute table, under ‘GM_NAAM’ (see the “overlays” tools later).
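The two kinds of selection can be contrasted with a small sketch in plain Python. The building coordinates and the square boundary are invented for illustration, and the ray-casting test is a generic point-in-polygon method, not the algorithm ArcGIS uses.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon (list of vertices)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray starting at (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical buildings (points) and a hypothetical municipality boundary.
buildings = [
    {"id": 1, "x": 2.0, "y": 3.0, "GM_NAAM": "Wageningen"},
    {"id": 2, "x": 8.0, "y": 1.0, "GM_NAAM": "Ede"},
    {"id": 3, "x": 4.0, "y": 4.0, "GM_NAAM": "Wageningen"},
]
wageningen_boundary = [(0, 0), (5, 0), (5, 5), (0, 5)]

# Spatial selection: needs the boundary polygon, not the attribute.
by_location = [b["id"] for b in buildings
               if point_in_polygon(b["x"], b["y"], wageningen_boundary)]

# Attribute selection: needs GM_NAAM in the table, not the boundary.
by_attribute = [b["id"] for b in buildings if b["GM_NAAM"] == "Wageningen"]

print(by_location, by_attribute)
```

Both routes return the same buildings here only because the toy attribute values agree with the toy geometry.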


Examples of select by location


To select within a buffer, you don’t need to create and save a buffer
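Why no saved buffer is needed can be seen in a minimal sketch: selecting within a distance is just a distance test against the selecting feature. The station, the buildings and the 100 m search distance below are hypothetical.

```python
import math

# Hypothetical selecting feature and candidate features (coordinates in metres).
station = (0.0, 0.0)
buildings = {"A": (30.0, 40.0), "B": (300.0, 400.0), "C": (0.0, 99.0)}
search_distance = 100.0

# A feature is selected when its distance to the station is within the
# search distance; no buffer polygon is ever created or stored.
selected = sorted(
    name for name, xy in buildings.items()
    if math.dist(station, xy) <= search_distance
)

print(selected)
```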

HINT: about the tool

If your selecting feature class has a feature selected, that feature is the only one the tool will select within.

[Figure: two INPUT panels with their corresponding RESULT panels]

But remember: it is just a selection!

You haven’t saved the resulting features (your selection is only in memory).

If you want, you can

● Save them manually (right click on the feature class being selected)

● Use another tool with direct saving options: this is necessary to guarantee reproducibility

Selections are taken into account when performing operations (see previous slide for buildings in Wageningen)


QUERIES, part 2

cutting out


Clip

A way to select by location AND save the result.

It does not create any new attribute; it just cuts the features and copies them into a new feature class (the entity stays the same)

But remember: IT CUTS features

With the previous approach (select, then save), you won’t alter their geometry


Comparing them:

The green feature class is being clipped to the area of the blue square

Clip modifies the geometry, selecting does not.

Neither adds attributes.
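The comparison can be reproduced on a one-dimensional toy model, where features are segments along a line and the blue square becomes a window. This is a sketch under that simplifying assumption; the data and function names are hypothetical.

```python
# Features are 1D segments (start, end), each with one attribute.
features = [
    {"id": 1, "span": (0, 4), "type": "forest"},
    {"id": 2, "span": (3, 8), "type": "field"},
    {"id": 3, "span": (9, 12), "type": "forest"},
]
window = (2, 6)  # the selecting / clipping area


def overlaps(span, area):
    """True if the segment and the area share more than a single point."""
    return span[0] < area[1] and area[0] < span[1]


def select_by_location(feats, area):
    """Keep whole features that overlap the area: geometry is untouched."""
    return [dict(f) for f in feats if overlaps(f["span"], area)]


def clip(feats, area):
    """Keep only the part of each feature inside the area: geometry is cut."""
    out = []
    for f in feats:
        if overlaps(f["span"], area):
            cut = (max(f["span"][0], area[0]), min(f["span"][1], area[1]))
            out.append({**f, "span": cut})
    return out


selected = select_by_location(features, window)
clipped = clip(features, window)
print([f["span"] for f in selected])
print([f["span"] for f in clipped])
```

In both results the attribute keys are exactly those of the input; only the spans differ.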


Erase

Works like Clip, but removes the part that intersects the second feature class instead of keeping it

So remember: it modifies the geometry.
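A minimal sketch of Erase on one-dimensional segments shows how the geometry changes: a feature crossing the erase area can even be split in two. The data and the function name are hypothetical.

```python
def erase(feats, area):
    """Remove the part of each feature inside the area; keep what lies outside."""
    out = []
    for f in feats:
        start, end = f["span"]
        left = (start, min(end, area[0]))    # piece before the erase area
        right = (max(start, area[1]), end)   # piece after the erase area
        for piece in (left, right):
            if piece[0] < piece[1]:          # skip empty pieces
                out.append({**f, "span": piece})
    return out


features = [
    {"id": 1, "span": (0, 4), "type": "forest"},
    {"id": 2, "span": (3, 8), "type": "field"},
    {"id": 3, "span": (9, 12), "type": "forest"},
    {"id": 4, "span": (1, 7), "type": "field"},  # crosses the whole area
]
erased = erase(features, (2, 6))
print([(f["id"], f["span"]) for f in erased])
```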


QUERIES, part 3

adding / removing features


The mergers

Sometimes we want to make a single feature class from two

= we concatenate the features

This time, they MUST have the same attributes! (entities must be the same)

Can merge (same attributes):

ID   Height   Width   Region
8    55       3       12
9    54       4       16
10   12       6       17

ID   Height   Width   Region
5    17       6       12
6    19       5       17
7    44       3       15

Cannot merge! (different attributes):

ID   Height   Width   Latitude
8    55       3       12N
9    54       4       16N
10   12       6       17N

ID   Height   Width   Region   Color
5    17       6       12       Red
6    19       5       17       Green
7    44       3       15       Red
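The rule illustrated by these tables can be written as a schema check before concatenating. This is a plain-Python sketch using the slide's values; `can_merge` and `merge` are hypothetical helper names, not ArcGIS functions.

```python
def can_merge(table_a, table_b):
    """Two tables can be concatenated only if their attribute names match."""
    return table_a["fields"] == table_b["fields"]


def merge(table_a, table_b):
    """Concatenate the rows of two tables that share the same attributes."""
    if not can_merge(table_a, table_b):
        raise ValueError("attribute structures differ: cannot merge")
    return {"fields": table_a["fields"], "rows": table_a["rows"] + table_b["rows"]}


a = {"fields": ["ID", "Height", "Width", "Region"],
     "rows": [(8, 55, 3, 12), (9, 54, 4, 16), (10, 12, 6, 17)]}
b = {"fields": ["ID", "Height", "Width", "Region"],
     "rows": [(5, 17, 6, 12), (6, 19, 5, 17), (7, 44, 3, 15)]}
c = {"fields": ["ID", "Height", "Width", "Latitude"],
     "rows": [(8, 55, 3, "12N"), (9, 54, 4, "16N"), (10, 12, 6, "17N")]}
d = {"fields": ["ID", "Height", "Width", "Region", "Color"],
     "rows": [(5, 17, 6, 12, "Red"), (6, 19, 5, 17, "Green"), (7, 44, 3, 15, "Red")]}

print(can_merge(a, b), can_merge(c, d))
print(len(merge(a, b)["rows"]))
```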

Append VS merge

Both combine datasets into a single one

Differences:

● APPEND writes into an existing dataset, MERGE creates a new one

● In APPEND, the existing (target) dataset does not need to have the same attribute structure as the inputs if you use the NO_TEST option; input fields that do not match the target's fields are then left out unless you map them explicitly

BOTH GENERATE DUPLICATE FEATURES WHERE THE INPUTS OVERLAP


Example

We want to merge these two overlapping buildings feature classes

APPEND and MERGE will create an extra building feature!


OVERLAYS


Overlays in general

They are based on the concept of spatial joins.

They combine two feature classes, which

● can be of different feature types (point, line, polygon)

● can have different attributes

1. First apply a spatial query based on location

2. Apply a specific geometric processing (based on set theory)

3. Concatenate (join) the attributes: the output will have as many attributes as the input feature classes have together.


Geometric operations

Union

Intersect

Identity

Update

Symmetric difference


Set overlays: union

Obtains all the feature primitives of both feature classes, after intersecting them (and removing duplicates)

The primitives of Union are used in all the other tools.


Set overlays: union (attribute table)

Attributes (green feature class)   Attributes (blue feature class)
Values                             Values
<null>                             Values
Values                             Values
Values                             <null>
Values                             Values
Values                             <null>

[Feature column: one output primitive per row, drawn on the slide]
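The pattern of values and nulls in the Union table can be reproduced with a toy model in which each feature class covers a set of grid cells and every cell stands in for one primitive. This is only a sketch: the layers, attribute names and values are made up.

```python
# Toy layers: a set of covered grid cells plus one attribute each.
green = {"cells": {(0, 0), (1, 0), (0, 1), (1, 1)}, "attrs": {"landuse": "forest"}}
blue = {"cells": {(1, 1), (2, 1), (1, 2), (2, 2)}, "attrs": {"status": "protected"}}


def union(a, b):
    """Keep every cell covered by a or b; missing attributes become None (<null>)."""
    rows = []
    for cell in sorted(a["cells"] | b["cells"]):
        row = {"cell": cell}
        for layer in (a, b):
            for name, value in layer["attrs"].items():
                row[name] = value if cell in layer["cells"] else None
        rows.append(row)
    return rows


union_rows = union(green, blue)
for row in union_rows:
    print(row)
```

Only the cell covered by both layers carries values in both attribute columns; all other rows have one None.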

Set overlays: union

Examples of use

Merge all built structures from a set of feature classes supplied by different offices (some buildings will be duplicated, for example)

Merge a multi-year dataset of deforestation areas (non-forest areas remain from year to year)


Set overlays: intersection

keeps only the primitives that belong to both inputs

= removes all primitives that belong to only one of the two inputs


Set overlays: intersection (attribute table)

Attributes (green feature class)   Attributes (blue feature class)
Values                             Values
Values                             Values
Values                             Values

[Feature column: one output primitive per row, drawn on the slide]
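That an intersection never produces null attributes follows directly from keeping only the shared primitives. A minimal sketch on sets of grid cells, with made-up layers and attribute values:

```python
# Hypothetical layers: covered grid cells and one attribute each.
green_cells = {(0, 0), (1, 0), (0, 1), (1, 1)}
blue_cells = {(1, 0), (1, 1), (2, 0), (2, 1)}
green_attrs = {"landuse": "forest"}
blue_attrs = {"municipality": "Wageningen"}

# Keep only the cells covered by both layers; each output row carries
# the attributes of both inputs, so no value is ever missing.
intersection = [
    {"cell": cell, **green_attrs, **blue_attrs}
    for cell in sorted(green_cells & blue_cells)
]

for row in intersection:
    print(row)
```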

Set overlays: intersection

Examples of use

Extract buildings within Wageningen (OK, this can also be done with a clip if joining the attributes is not desired)

Extract points of interest on the highway between Ede and Utrecht (and you need to be able to differentiate those in the provinces of Utrecht and Gelderland... Otherwise it’s a clip)


Set overlays: identity

keeps only the primitives spatially located on the original input feature class (in our case the blue square in the left image)


Set overlays: identity (attribute table)

Attributes (green feature class)   Attributes (blue feature class)
Values                             Values
<null>                             Values
Values                             Values
Values                             Values

[Feature column: one output primitive per row, drawn on the slide]
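In the Identity table the input feature class always has values, while the other one is null wherever it does not overlap. A sketch on grid cells, assuming a protected zone as input and forest as the second layer (both invented for illustration):

```python
# Hypothetical layers as sets of grid cells.
protected_cells = {(0, 0), (1, 0), (0, 1), (1, 1)}   # input feature class
forest_cells = {(1, 1), (2, 1), (1, 2), (2, 2)}      # identity feature class

# Every cell of the input is kept; the forest attribute is attached only
# where the forest layer overlaps, and is None (<null>) elsewhere.
identity = [
    {
        "cell": cell,
        "zone": "protected",
        "landuse": "forest" if cell in forest_cells else None,
    }
    for cell in sorted(protected_cells)
]

for row in identity:
    print(row)
```

Cells of the forest layer that lie outside the protected zone do not appear in the output at all.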

Set overlays: identity

Examples of use

Determine which areas within a protected zone are forest and which are not (forest areas are generally bigger polygons, so the features will need to be cut)


Set overlays: update

Replaces the geometry and attributes of the input feature class with those of the update feature class wherever the two overlap; the rest of the input is kept.


Set overlays: update (attribute table)

Attributes (green feature class)   Attributes (blue feature class)
<null>                             Values
Values                             <null>
Values                             <null>

[Feature column: one output primitive per row, drawn on the slide]

Set overlays: update if same attributes

Attributes
Values
Values
Values

[Feature column: one output primitive per row, drawn on the slide]

Set overlays: update

Examples of use

Extract forest areas EXCLUSIVE OR protected zones

Merge two versions of the same map. The most recent one remains unchanged, and the differences found in the old one are added.


Set overlays: symmetric difference

Removes primitives spatially common to both feature classes

= Union - Intersection
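The identity “Union - Intersection” can be checked directly with Python sets, whose `^` operator is the symmetric difference. The two sets of grid cells are hypothetical stand-ins for the feature classes.

```python
# Hypothetical layers as sets of grid cells.
forest = {(0, 0), (1, 0), (1, 1)}
protected = {(1, 1), (2, 1)}

# Symmetric difference: cells in exactly one of the two layers.
either_not_both = forest ^ protected

# The same result, built as union minus intersection.
union_minus_intersection = (forest | protected) - (forest & protected)

print(sorted(either_not_both))
print(either_not_both == union_minus_intersection)
```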


Set overlays: symmetric difference (attribute table)

Attributes (green feature class)   Attributes (blue feature class)
<null>                             Values
Values                             <null>
Values                             <null>

[Feature column: one output primitive per row, drawn on the slide]

Set overlays: symmetric difference

Examples of use

Extract areas that are EITHER forests OR protected (but not both)


Summing up

There are many tools at your disposal

First think of what you need for further calculations (do you need to carry attributes or not? Do you need to save the result, or is a selection enough?)

Then think of the most efficient way: most solutions can be reached with a combination of tools, but often also with a single one!

Remember: if features are selected in the processed feature class, only those will be processed.

Be careful of duplicate features (overlaps) when merging datasets!


LOOKUP TABLES


What is a lookup table?

Also called Reclassification tables

It is basically an attribute that

● Has several repeated entries in a single feature class

● Appears with same name and meaning across feature classes

A classic example is the TDN code

Another is the neighborhood code


TDN codes

They are a centrally defined description of land use

They fit our description

● Has several repeated entries in a single feature class

TDN codes are repeated: many polygons are of the same land use type

● Appears with same name and meaning across feature classes

Land use types can be found in attribute tables concerning agricultural fields, buildings, ...


Region codes

A single numerical code to define a region (it can also be the name of the region, of course, but then be careful with typos)

● Has several repeated entries in a single feature class

Region codes are repeated: many fields are located in the same region

● Appears with same name and meaning across feature classes

Fields and protection areas can have a “region” attribute, to select them more easily.


Lookup tables are good for

Keeping in mind the important grouping variables

Ensuring that you do not have duplicates

Structuring your data: if your grouping variables are in tables, you will reuse them!


Lookup tables (by building type)


Lookup tables (by TDN code)


Careful!

The extra attributes you add are summed (so it is not very helpful to sum TDN codes, for instance)

Use it for distances, counts, ...

To obtain a lookup table we use the Frequency tool.

This tool is not to be confused with the Reclassify tool, where we recode the values of a raster (e.g. all values for deciduous forest (value 10) and pine forest (value 20) are reclassified into generic forest (value 1))
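What such a frequency-style lookup table contains can be sketched in plain Python: one row per distinct code, a count, and the sum of an extra attribute. The parcel data and the code values below are made up, and `frequency` is a hypothetical helper, not the ArcGIS tool itself.

```python
from collections import defaultdict

# Hypothetical parcels with a grouping code and an attribute worth summing.
parcels = [
    {"tdn_code": 5213, "area_m2": 1200},
    {"tdn_code": 5213, "area_m2": 800},
    {"tdn_code": 5023, "area_m2": 500},
    {"tdn_code": 5213, "area_m2": 300},
]


def frequency(rows, group_field, sum_field):
    """One entry per distinct value of group_field: a count and a sum."""
    table = defaultdict(lambda: {"FREQUENCY": 0, sum_field: 0})
    for row in rows:
        entry = table[row[group_field]]
        entry["FREQUENCY"] += 1
        entry[sum_field] += row[sum_field]
    return dict(table)


lookup = frequency(parcels, "tdn_code", "area_m2")
for code, entry in lookup.items():
    print(code, entry)
```

Summing `area_m2` is meaningful; summing the codes themselves would not be, which is the point of the warning above.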


JOINS


What is a join?

It is a logical extension of the concept of lookup tables

In a nutshell:

● You have two datasets

● They share a lookup attribute or spatial locations

● You want to join them (i.e. the features of one get enriched with the attributes of the other)

Joins can be based on attributes or on spatial queries.


Spatial join

Performs a selection based on relative locations of the features belonging to two feature classes

Merges attributes only onto the features selected according to location

The others get <null> field values.

It allows flexibility of spatial selection criteria (matching operations in ArcGIS): e.g. join when

● Features intersect

● Features intersect the boundary of the other

● Features are contained completely in the other

● ...


Spatial join (2)

You can see it as creating a lookup table of locations and selecting only those with the same ID.

E.g. (let’s say we want to add flood risk values to a buildings feature class by intersecting it with flood risk areas):

● First it performs a geometrical matching operation (intersect, ...)

● Then it selects all the buildings according to it

● For the selected ones, it joins attributes from the corresponding flood risk feature class.
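The three steps above can be sketched in plain Python, assuming buildings are points and flood risk areas are axis-aligned rectangles (both simplifications, with invented coordinates and risk labels).

```python
# Hypothetical flood risk areas: bbox = (xmin, ymin, xmax, ymax).
flood_areas = [
    {"risk": "high", "bbox": (0, 0, 10, 10)},
    {"risk": "low", "bbox": (20, 0, 30, 10)},
]
# Hypothetical buildings as points.
buildings = [
    {"id": 1, "x": 5, "y": 5},
    {"id": 2, "x": 25, "y": 2},
    {"id": 3, "x": 15, "y": 5},   # outside every flood risk area
]


def contains(bbox, x, y):
    """Geometrical matching operation: is the point inside the rectangle?"""
    xmin, ymin, xmax, ymax = bbox
    return xmin <= x <= xmax and ymin <= y <= ymax


def spatial_join(targets, areas):
    """Attach the risk of the first matching area; unmatched targets get None."""
    joined = []
    for t in targets:
        match = next((a for a in areas if contains(a["bbox"], t["x"], t["y"])), None)
        joined.append({**t, "risk": match["risk"] if match else None})
    return joined


joined_buildings = spatial_join(buildings, flood_areas)
for row in joined_buildings:
    print(row)
```

All buildings are kept in the output; those matching no area carry None, the equivalent of a <null> field value.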


Attribute join (1)

It is the same concept, but based on one (or many) common attribute(s) (yes, a lookup table).

● Example: different suitabilities depending on road types AND municipality

You will need the common attributes in the two datasets being joined.

The second dataset can be a lookup table itself!


Ex: Road buffer size

For instance: in the light rail project, you need to assign different buffer values depending on the road type:

● First you create a LUT for road types

● You add a new attribute to the LUT, the buffer size

● You enter the values (e.g. cyclists (fietsers) get a buffer of 100 m)


Ex: road buffer size

● Now you join the roads feature class with the LUT

● Your buffer widths are now replicated correctly for each road segment
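The road buffer example amounts to a lookup on the shared attribute. A minimal sketch in plain Python: the road type names and the buffer sizes other than the 100 m for cyclists are assumptions made for illustration.

```python
# Lookup table (LUT): one row per road type, with the buffer size in metres.
# Only the 100 m cyclist value comes from the slides; the rest is made up.
road_lut = {"highway": 300, "local road": 50, "cycle path": 100}

# Hypothetical road segments sharing the "road_type" attribute with the LUT.
roads = [
    {"id": 1, "road_type": "highway"},
    {"id": 2, "road_type": "cycle path"},
    {"id": 3, "road_type": "cycle path"},
    {"id": 4, "road_type": "local road"},
]

# Attribute join: each segment is enriched with the buffer size of its type.
# A type missing from the LUT would get None instead of raising an error.
roads_with_buffer = [
    {**road, "buffer_m": road_lut.get(road["road_type"])} for road in roads
]

for row in roads_with_buffer:
    print(row)
```

Changing a buffer size later means editing one row of the LUT and redoing the join, rather than editing every road segment.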

Final point: do you need to save it?

If yes, use the Join and Spatial Join tools

● They save a new feature class with the joined attributes

Is it only temporary? Keep it in memory (right click on the feature class and use the option ‘joins and relates’)

BUT NOT WITHIN THE LIGHTRAIL PROJECT!

Today

We discussed a number of operations (tools) to preprocess and organize your data

We saw query and overlay operations and highlighted their specificities

We introduced the concept of lookup tables and joins.


Putting it in the context of the light rail project...

You will encounter these concepts (or already have)

● Clipping to reduce a feature class extent (and the number of features)

● Overlays to exclude features or join datasets

● Lookup tables to assign suitability values to features (via a join)


THANK YOU!
