l 9 - spatial data analysis



L 9. Spatial Data Analysis 1

Introduction to Geoinformatics

L-9. Spatial Data Analysis

Dr. György SZABÓ associate professor

Budapest University of Technology and Economics

Department of Photogrammetry and Geoinformatics

[email protected]

Contents

OVERVIEW

This chapter describes geographic analysis and modeling methods. It examines methods constructed around the concepts of location, distance, and area.

LEARNING OBJECTIVES

• Definitions of spatial data analysis.

• Methods to examine distance effects, in the creation of clusters, hotspots, and anomalies.

• Methods for measuring properties of areas.

• Measures that can be used to capture the centrality of geographic phenomena.

• Techniques for analyzing surfaces and for determining their morphologic properties.

• Techniques for the support of spatial decisions and the design of landscapes according to specific objectives.

Longley, Goodchild, Maguire, Rhind (2011): Geographical Information Systems and Science, Ch. 14, pp. 351-359.

What is spatial analysis?

• Spatial analysis is the engine of GIS because it includes all of the transformations, manipulations, and methods that can be applied to geographic data to add value to them, to support decisions, and to reveal patterns and anomalies that are not immediately obvious

• Spatial analysis is the process by which we turn raw data into useful information

A redrafting of the map made by Dr. John Snow in 1854, showing the deaths that occurred in an outbreak of cholera in the Soho district of London

Often cited as the first application of GIS-style spatial analysis in epidemiology

And the spatial analysis:

Hypothesis: the source of the infection is the water from the Broad Street pump

Action: remove the Broad Street pump handle

The map made by Openshaw and colleagues by applying their Geographical Analysis Machine to the incidence of childhood leukemia in northern England.

A very large number of circles of random sizes is randomly placed on the map, and a circle is drawn if the number of cases it encloses substantially exceeds the number expected in that area given the size of its population at risk

Data sets:

– Location of the disease

– Number of people at risk

(Source: International Journal of Geographical Information Systems)


Types of spatial analysis

• Inductive: to examine empirical evidence in the search for patterns that might support new theories or general principles, in this case with regard to disease causation.

• Deductive: focusing on the testing of known theories or principles against data

• Normative: using spatial analysis to develop or prescribe new or better designs

Analysis of attributes

• Selection of objects based on attribute values.

• One way to examine a suspected relationship between two attributes is to plot one variable against the other as a scatterplot.

• Regression analysis focuses on finding the simplest relationship indicated by the data.

• Relationships between variables can vary across space, an issue termed spatial heterogeneity (in GeoMedia: add a thematic legend entry)
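The regression and spatial heterogeneity points above can be sketched in a few lines of Python. This is a minimal ordinary least-squares fit, not the method used for the census figures; the function name `linear_regression` and the two sample regions are hypothetical and serve only to show that separate fits per region can give different slopes.

```python
def linear_regression(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x and cross-deviations of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical samples for two regions (illustrative values only)
region_a = ([1, 2, 3, 4], [3, 5, 7, 9])
region_b = ([1, 2, 3, 4], [9, 8, 7, 6])

slope_a, intercept_a = linear_regression(*region_a)
slope_b, intercept_b = linear_regression(*region_b)
```

Fitting each region separately, rather than pooling all observations, is one simple way to see whether the relationship changes across space.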

GeoMedia: Analysis -> Attribute Query

Cheap pub (price of beer <= 300 Ft)

SQL script:

SELECT <fields>
FROM <tables>
[WHERE <logical expressions>]
[GROUP BY <grouping fields>]
[ORDER BY <ordering fields>];
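As a concrete instance of the SQL template, the cheap-pub selection can be tried with Python's built-in sqlite3 module. This is only a sketch: the table name `pubs`, its columns, and the sample rows are assumptions made for illustration and do not come from the GeoMedia exercise data.

```python
import sqlite3


def cheap_pubs(rows, max_price=300):
    """Return (name, beer_price) rows with beer_price <= max_price, cheapest first."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute("CREATE TABLE pubs (name TEXT, beer_price INTEGER)")
    conn.executemany("INSERT INTO pubs VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT name, beer_price FROM pubs "
        "WHERE beer_price <= ? "
        "ORDER BY beer_price",
        (max_price,),
    ).fetchall()
    conn.close()
    return result


# Hypothetical pubs with beer prices in Ft
sample = [("Pub A", 250), ("Pub B", 450), ("Pub C", 300)]
selected = cheap_pubs(sample)
```

The `?` placeholders keep the price threshold out of the SQL string, so the same query can be reused for the medium and expensive classes.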

Scatterplots of median house value (y axis) versus percent black (x axis) for U.S. counties in 1990, with linear regressions:

(A) California

(B) Mississippi

(Source: U.S. Bureau of the Census)


Age-adjusted rates of mortality due to cancers of the trachea, bronchus, and lung, among white males between 1950 and 1969, by county

GeoMedia: Legend -> Add Thematic Legend Entry

Price level of pubs (cheap, medium, expensive)


Analysis of spatial properties

• Topological relations:

– Point in Polygon

– Polygon Overlay

– Spatial Joins

• Analysis based on distance:

– Measuring distance, length, area, perimeter

– Buffering: creation of zones around points, lines, and areas

The point in polygon problem, shown in the continuous-field case (the point must by definition lie in exactly one polygon, or outside the project area).

In only one instance (the orange polygon) is there an odd number of intersections between the polygon boundary and a line drawn vertically upward from the point.
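The odd/even intersection rule described above can be sketched as follows. This is a minimal version of the ray-casting test with the ray drawn vertically upward, as in the figure; the function name and the sample square are hypothetical, and points lying exactly on a boundary are not treated specially.

```python
def point_in_polygon(px, py, polygon):
    """Return True if (px, py) is inside the polygon given as a list of (x, y) vertices.

    A ray is cast vertically upward from the point; an odd number of
    boundary crossings means the point is inside.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # edge from vertex i to the next vertex
        # Does this edge straddle the vertical line x = px?
        if (x1 > px) != (x2 > px):
            # Height at which the edge crosses the vertical line
            y_cross = y1 + (px - x1) * (y2 - y1) / (x2 - x1)
            if y_cross > py:  # crossing lies above the point
                inside = not inside
    return inside


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
```

The half-open comparison `(x1 > px) != (x2 > px)` makes sure a ray passing exactly through a vertex is counted once rather than twice.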

Polygon overlay, in the discrete object case

Here the overlay of two polygons produces nine distinct polygons. One has the properties of both polygons, four have the properties of the yellow shaded polygon but not the blue (bounded) polygon, and four are outside the yellow polygon but inside the blue polygon.

Here a dataset representing two types of land cover (A on the left, B on the right) is overlaid on a dataset representing three types of ownership (the two datasets have been offset for visual clarity).

Polygon overlay in the continuous-field case

GeoMedia: Analysis -> Spatial Query

Topological relations of two object sets

The spatial relations:

• TOUCH

• CONTAIN

• ARE CONTAINED BY

• ENTIRELY CONTAIN

• ARE ENTIRELY CONTAINED BY

• OVERLAP

• MEET

• ARE SPATIALLY EQUAL

• ARE WITHIN DISTANCE OF

Touch returns features that touch the defined features in any way: meeting, overlapping, containing, or being contained by.

Touch with the Not qualifier

Contain returns features that surround defined features. Contained features can touch but not overlap the borders of the surrounding features. Points cannot contain other features.

Contain with the Not qualifier


Are contained by returns features that fall completely within the defined features. Contained features can touch but not overlap the borders of the surrounding features.

Are contained by with the Not qualifier

Entirely contain returns features that surround defined features. Contained features cannot touch or overlap the borders of the surrounding features. Points cannot entirely contain other features.

Entirely contain with the Not qualifier

Are entirely contained by returns features that fall completely within the defined features. Contained features cannot touch or overlap the borders of the surrounding features.

Are entirely contained by with the Not qualifier

Overlap returns features that overlap the defined features.

Overlap with the Not qualifier

Meet returns features that fall next to the defined features, touching without overlapping.

Meet with the Not qualifier

Are spatially equal returns features that occupy the same space and location. Features must be of the same type to be spatially equal.

Are spatially equal with the Not qualifier

Are within distance of returns features having any part located within the specified distance of the defined features. If either the starting or ending point of a linear feature, for example, falls within the specified distance, it is returned.

Are within distance of with the Not qualifier
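The difference between some of these operators is easiest to see on simple shapes. The sketch below restricts features to axis-aligned rectangles, which is a strong simplifying assumption; the `Box` type and the function names are hypothetical and are not GeoMedia's implementation. It covers touch, contain, entirely contain, and meet as defined above.

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle used as a stand-in for an area feature."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def touch(a, b):
    """True if the rectangles share at least one point (borders included)."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


def interiors_intersect(a, b):
    """True if the rectangles share area, not just border points."""
    return (a.xmin < b.xmax and b.xmin < a.xmax
            and a.ymin < b.ymax and b.ymin < a.ymax)


def contain(a, b):
    """True if a surrounds b; b may touch a's border."""
    return (a.xmin <= b.xmin and b.xmax <= a.xmax
            and a.ymin <= b.ymin and b.ymax <= a.ymax)


def entirely_contain(a, b):
    """True if a surrounds b and b does not touch a's border."""
    return (a.xmin < b.xmin and b.xmax < a.xmax
            and a.ymin < b.ymin and b.ymax < a.ymax)


def meet(a, b):
    """True if the rectangles touch without sharing any area."""
    return touch(a, b) and not interiors_intersect(a, b)


outer = Box(0, 0, 10, 10)
inner_on_border = Box(0, 2, 4, 6)   # lies inside outer, shares part of its left border
inner_free = Box(2, 2, 4, 4)        # lies strictly inside outer
neighbour = Box(10, 0, 15, 10)      # shares only the right border of outer
far_away = Box(20, 20, 25, 25)
```

The Not qualifier corresponds to negating the chosen predicate, for example `not touch(outer, far_away)`.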

Spatial Intersection allows you to perform a spatial overlay on two feature classes or queries to find the intersecting areas, or areas of coincidence.

Figure panels: original features, and the result after Spatial Intersection

The spatial operators available for this command are touch, contain, are contained by, entirely contain, are entirely contained by, overlap, meet, and are spatially equal. After you choose the two sets of input features to intersect and the type of spatial operation to perform, this command outputs the results as a new query.

Spatial Difference allows you to perform spatial masking, that is, to perform a difference operation on two sets of areas to produce resultant geometries.

This command takes as input two area feature classes or queries, the features to be masked or cropped (the from-feature), and the features to be used as a mask (the subtract-feature).


GeoMedia: Tools -> Validate Geometry, Tools -> Validate Connectivity

• Validate geometry: checks the consistency of point, line, and area feature geometry in a feature class

• Validate connectivity: checks the topological relations within one feature class or between two feature groups

Geometry Validation Error Conditions

Validating Connectivity

Overshoot: This condition occurs when the end of a linear geometry extends beyond the point at which it should intersect with, and stop at, another geometry.

Undershoot: This condition occurs when the end of a linear geometry or a point geometry falls short of intersecting another geometry.

Node Mismatch: This condition occurs when the end of a linear or point geometry falls short of intersecting with the end of another linear or point geometry.

Examples of Connectivity Conditions by Feature Class

Figure panels: Overshoot, Undershoot, Node Mismatch, Intersection Not Broken, Nearly Coincident, Intersections Not Coincident

Pythagoras’s Theorem and the straight-line distance between two points on a plane

The effects of the Earth’s curvature on the measurement of distance, and the choice of shortest paths
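Both distance measures named in these captions can be sketched briefly. The planar distance follows Pythagoras's theorem directly; for the curved Earth the sketch uses the haversine formula on a sphere, which is one common approximation and an assumption here (the slides do not say which formula is meant). The function names and the mean radius constant are illustrative choices.

```python
import math

# Mean Earth radius in metres; treating the Earth as a sphere is an assumption
EARTH_RADIUS_M = 6371008.8


def planar_distance(x1, y1, x2, y2):
    """Straight-line distance on a plane (Pythagoras's theorem)."""
    return math.hypot(x2 - x1, y2 - y1)


def great_circle_distance(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_M):
    """Shortest distance over a sphere between two lat/lon points in degrees (haversine)."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # min() guards against rounding pushing the square root slightly above 1
    return 2 * radius * math.asin(min(1.0, math.sqrt(a)))
```

For short distances in a projected coordinate system the planar formula is usually adequate; over long distances the great-circle path is the shortest route on the sphere and can differ noticeably from a straight line drawn on a map.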


The length of a path as traveled on the Earth’s surface (red line) may be substantially longer than the length of its horizontal projection as evaluated in a two-dimensional GIS

(A) shows three paths across part of Dorset in the UK. The green path is the straight route, the red path is the modern road system, and the gray path represents the route followed by the road in 1886

(Courtesy Michael De Smith)

(B) shows the vertical profiles of all three routes, with elevation plotted against the distance traveled horizontally in each case. 1 ft = 0.3048 m, 1 yd = 0.9144 m.

(Courtesy Michael De Smith)
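The gap between surface length and horizontal length can be checked numerically. The sketch below assumes the path is given as a list of (x, y, z) vertices in metres and that the surface is straight between vertices; the function name and sample profile are hypothetical.

```python
import math


def path_lengths(points):
    """Return (horizontal_length, surface_length) for a list of (x, y, z) vertices."""
    horizontal = 0.0
    surface = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(points, points[1:]):
        run = math.hypot(x2 - x1, y2 - y1)   # length of the horizontal projection
        horizontal += run
        surface += math.hypot(run, z2 - z1)  # slope length including the height change
    return horizontal, surface


# Hypothetical route over a hill: 4 m up over 3 m, then 4 m down over 3 m
profile = [(0, 0, 0), (3, 0, 4), (6, 0, 0)]
flat_length, slope_length = path_lengths(profile)
```

On gentle terrain the two values are close; the difference grows with slope, which is why a two-dimensional GIS can understate the distance actually traveled.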

GeoMedia: Edit -> Attribute -> Update Attribute

Static measurement of feature geometry:

Line - LENGTH(Input.Geometry)

Area - AREA(Input.Geometry), PERIMETER(Input.Geometry)
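What such length, area, and perimeter functions compute for planar coordinates can be sketched as follows. These are hypothetical stand-ins written for illustration, not GeoMedia's LENGTH, AREA, or PERIMETER functions; the area uses the shoelace formula and assumes a simple polygon without holes.

```python
import math


def length(line):
    """Length of a polyline given as a list of (x, y) vertices."""
    return sum(math.hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(line, line[1:]))


def perimeter(ring):
    """Perimeter of a polygon; the ring is closed automatically."""
    return length(list(ring) + [ring[0]])


def area(ring):
    """Area of a simple polygon by the shoelace formula."""
    n = len(ring)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1  # signed cross product of consecutive vertices
    return abs(twice_area) / 2


rectangle = [(0, 0), (4, 0), (4, 3), (0, 3)]
```

The absolute value makes the result independent of whether the vertices are listed clockwise or counterclockwise.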

Buffers (dilations) of constant width drawn around a point, a polyline, and a polygon

GeoMedia: Analysis -> Buffer Zone

100 m Buffer Zone of Pubs
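A constant-width buffer can be described by a distance test rather than by constructing the buffer polygon. The sketch below, with hypothetical function names and planar coordinates in metres, checks whether a location falls inside the buffer of a point or of a polyline; it does not build the buffer geometry that a GIS would draw.

```python
import math


def point_segment_distance(px, py, ax, ay, bx, by):
    """Shortest distance from point P to the segment AB."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:  # degenerate segment: A and B coincide
        return math.hypot(px - ax, py - ay)
    # Position of the perpendicular foot along AB, clamped to the segment
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def in_point_buffer(location, centre, width):
    """True if location lies within `width` of the centre point."""
    return math.hypot(location[0] - centre[0], location[1] - centre[1]) <= width


def in_polyline_buffer(location, polyline, width):
    """True if location lies within `width` of any segment of the polyline."""
    return any(
        point_segment_distance(location[0], location[1], ax, ay, bx, by) <= width
        for (ax, ay), (bx, by) in zip(polyline, polyline[1:])
    )


pub = (0.0, 0.0)                       # hypothetical pub location
street = [(0.0, 0.0), (100.0, 0.0)]    # hypothetical street centreline
```

Clamping `t` to the range 0 to 1 is what produces the rounded ends of a line buffer.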

Terrain surface representation and analysis

– Digital Contour Line Model (DCM)

– Digital Surface Model (DSM)

– Digital Elevation Model (DEM)

– Digital Terrain Model (DTM)


Surface representation: 2D, 2.5D, 3D

Figure panels: DCM, DEM, DTM, DSM

(A) Landslide risk map for Pisa, Italy (Courtesy of Earth Science Department, University of Siena, Italy)

(B) Yangtse River, China (Courtesy of Human Settlements Research Center, Tsinghua University, China)

Examples of applications that use the TIN data model

Sampling of Terrain

Random: stochastic elements

Systematic:

• Homogeneous (regular grid)

• Inhomogeneous (characteristic points, break lines, form lines, extreme points)

Grid model

Systematic grid, or raster of spot heights with constant density

– Creation: direct measurement, derivation

– Type: Grid, Cell, Point cloud

– Advantage: easy representation and storage in matrix form

– Disadvantage: on heterogeneous terrain, either a weak representation or too many points

Grid interpolation

Approximation:

• Linear

• Bilinear

• 2nd order

• 3rd order

• Spline
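Of the approximations listed above, bilinear interpolation is short enough to sketch. The code assumes a regular grid stored as a list of rows, with `grid[row][col]` holding the height at x = col * cell_size and y = row * cell_size; this layout, the function name, and the sample heights are assumptions for illustration.

```python
def bilinear(grid, x, y, cell_size=1.0):
    """Interpolate a height at (x, y) from the four surrounding grid nodes."""
    rows, cols = len(grid), len(grid[0])
    if rows < 2 or cols < 2:
        raise ValueError("grid must be at least 2 x 2")
    gx, gy = x / cell_size, y / cell_size  # position in grid units
    if not (0 <= gx <= cols - 1 and 0 <= gy <= rows - 1):
        raise ValueError("point lies outside the grid")
    # Lower-left node of the enclosing cell; clamp so the last row/column works
    c0 = min(int(gx), cols - 2)
    r0 = min(int(gy), rows - 2)
    tx, ty = gx - c0, gy - r0  # fractional position inside the cell (0..1)
    z00, z10 = grid[r0][c0], grid[r0][c0 + 1]
    z01, z11 = grid[r0 + 1][c0], grid[r0 + 1][c0 + 1]
    # Interpolate along x on both cell edges, then along y between them
    lower = z00 * (1 - tx) + z10 * tx
    upper = z01 * (1 - tx) + z11 * tx
    return lower * (1 - ty) + upper * ty


# Hypothetical 2 x 2 grid of spot heights in metres
heights = [[100.0, 110.0],
           [120.0, 130.0]]
```

At a grid node the function returns the stored height exactly, and along a cell edge it reduces to linear interpolation between the two nodes of that edge.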

TIN model

Triangulated Irregular Network

– Formation: Delaunay triangulation (dual definition: Voronoi diagram), which yields triangles with nearly equal sides

– Advantage: captures more characteristic features and represents the morphology of real terrain forms

– Disadvantage: complex structure, large storage and computational requirements
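Once a TIN exists, a height at any location is usually read from the plane of the triangle that contains it. The sketch below shows that step only, using barycentric weights; it does not build the Delaunay triangulation itself, and the function name and sample triangle are hypothetical.

```python
def interpolate_in_triangle(p, a, b, c):
    """Linear height interpolation inside one TIN triangle.

    p is (x, y); a, b, c are (x, y, z) vertices.
    Returns None when p lies outside the triangle.
    """
    # Twice the signed area of the triangle (zero for collinear vertices)
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    if det == 0:
        raise ValueError("triangle vertices are collinear")
    # Barycentric weights of p with respect to a, b, c
    w_a = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    w_b = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    w_c = 1.0 - w_a - w_b
    if min(w_a, w_b, w_c) < -1e-12:  # small tolerance for rounding on the edges
        return None
    return w_a * a[2] + w_b * b[2] + w_c * c[2]


# Hypothetical triangle lying on the plane z = 100 + 10x + 20y
vertex_a = (0.0, 0.0, 100.0)
vertex_b = (10.0, 0.0, 200.0)
vertex_c = (0.0, 10.0, 300.0)
```

Because each triangle is a flat facet, the interpolated surface is continuous across triangle edges but its slope changes abruptly there.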


TIN Topology

TIN surface of Death Valley, California

(A) “wireframe” showing all triangles

(B) shaded by elevation

(C) draped with satellite image

Tivadar – 2D contour line map

Tivadar – Digital Surface Model

Tivadar – Digital Terrain Model

Complex Model: Simulation of Bereg dam damage - 107 m water level


Complex networks

• Connections

• Rules

• Attributes

Model for movement over surfaces

• Representation of route location on raster data

• Complex surface representation of phenomena: water flow, drainage
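Route location on raster data is commonly treated as a least-cost path problem. The sketch below uses Dijkstra's algorithm on a cost raster with moves to the four edge neighbours, charging the mean of the two cell costs per step; both choices are simplifying assumptions, and the function name and sample raster are hypothetical.

```python
import heapq


def least_cost_path(cost, start, goal):
    """Return (path, total_cost) between two (row, col) cells of a cost raster."""
    rows, cols = len(cost), len(cost[0])
    best = {start: 0.0}       # cheapest known accumulated cost per cell
    previous = {}             # back-pointers for rebuilding the route
    queue = [(0.0, start)]
    while queue:
        dist, cell = heapq.heappop(queue)
        if cell == goal:
            break
        if dist > best.get(cell, float("inf")):
            continue          # stale queue entry
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # Step cost: average friction of the two cells involved
                new_dist = dist + (cost[r][c] + cost[nr][nc]) / 2
                if new_dist < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_dist
                    previous[(nr, nc)] = cell
                    heapq.heappush(queue, (new_dist, (nr, nc)))
    if goal not in best:
        return None, float("inf")
    path = [goal]
    while path[-1] != start:
        path.append(previous[path[-1]])
    path.reverse()
    return path, best[goal]


# Hypothetical friction raster: the middle column is expensive to cross
friction = [[1, 9, 1],
            [1, 9, 1],
            [1, 1, 1]]
route, route_cost = least_cost_path(friction, (0, 0), (0, 2))
```

In this sample the cheapest route detours around the high-friction column instead of crossing it directly. Allowing diagonal moves would need a step cost scaled by the longer diagonal distance.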

(A) The problem is solved in continuous space, with straight-line travel, for a warehouse to serve the 12 largest U.S. cities. In continuous space there is an infinite number of possible locations for the site.

(B) A similar problem is solved at the scale of a city neighborhood on a network, where Hakimi’s theorem states that only junctions (nodes) in the network and places where there is weight need to be considered, making the problem much simpler, but where travel must follow the street network.
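The network case in (B) can be sketched by evaluating every node as a candidate site and keeping the one with the lowest weighted travel distance. This is a minimal single-facility example, assuming customers sit at nodes and the network is connected; the graph, the weights, and the function names are hypothetical.

```python
import heapq


def shortest_distances(graph, source):
    """Dijkstra distances from source; graph maps node -> {neighbour: edge_length}."""
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, edge_length in graph[node].items():
            new_d = d + edge_length
            if new_d < dist.get(neighbour, float("inf")):
                dist[neighbour] = new_d
                heapq.heappush(queue, (new_d, neighbour))
    return dist


def network_median(graph, weights):
    """Return (node, cost) minimising the sum of weight * network distance."""
    best_node, best_cost = None, float("inf")
    for candidate in sorted(graph):  # only nodes are tested as candidate sites
        dist = shortest_distances(graph, candidate)
        total = sum(w * dist.get(node, float("inf"))
                    for node, w in weights.items())
        if total < best_cost:
            best_node, best_cost = candidate, total
    return best_node, best_cost


# Hypothetical street network (edge lengths) and customer weights per junction
streets = {
    "A": {"B": 2.0},
    "B": {"A": 2.0, "C": 3.0},
    "C": {"B": 3.0, "D": 4.0},
    "D": {"C": 4.0},
}
demand = {"A": 1, "B": 1, "C": 5, "D": 1}
site, site_cost = network_median(streets, demand)
```

Restricting the search to nodes turns an infinite set of possible locations along the streets into a finite list that can be checked exhaustively.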

Search for the best locations for a central facility to serve dispersed customers

GIS can be used to find locations for fire stations that result in better response times to emergencies (PhotoDisc/Getty Images)

Screenshot of the system used by drivers for Sears to schedule and navigate a day’s workload

Thank You
