data standardization and classification
DATA STANDARDIZATION and CLASSIFICATION. Cartographic Design for GIS (Geog. 340), Prof. Hugh Howard, American River College.
![Page 1: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/1.jpg)
DATA STANDARDIZATION
and
CLASSIFICATION
Cartographic Design for GIS (Geog. 340)
Prof. Hugh Howard
American River College
![Page 2: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/2.jpg)
STANDARDIZATION
![Page 3: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/3.jpg)
STANDARDIZATION
• Normalization
• Transformation of raw data values to different, more meaningful values
– To map densities instead of “raw” values
– To map proportions between variables
– To map other relationships between variables
– To map statistical summaries
![Page 4: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/4.jpg)
MAPPING DENSITY
• How much of a particular thing exists within a given area
• Larger enumeration units often have "more" of a particular thing
– Mapping density is not necessary if all you want to do is show where “more” is
– Accounting for the varying sizes of enumeration units can be more revealing
![Page 5: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/5.jpg)
MAPPING DENSITY
Population/Area
“persons per square mile”
![Page 6: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/6.jpg)
MAPPING DENSITY
Bushels/Area
“bushels per acre”
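The density calculation on these slides can be sketched in a few lines of Python. This is a minimal illustration, not part of the original deck: the function name `density` and the county figures are hypothetical.

```python
def density(count, area):
    """Return count per unit area (e.g. persons per square mile)."""
    if area <= 0:
        raise ValueError("area must be positive")
    return count / area


# Hypothetical enumeration units: (name, population, area in square miles)
counties = [
    ("County A", 500_000, 1_000.0),
    ("County B", 100_000, 50.0),
]

for name, population, area_sq_mi in counties:
    value = density(population, area_sq_mi)
    print(f"{name}: {value:.1f} persons per square mile")
```

County A has the larger raw population, but County B is four times as dense, which is the kind of pattern a raw-count map would hide.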
![Page 7: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/7.jpg)
MAPPING PROPORTIONS
• Proportions represent the relationship of a part to a whole
• Several ways to express proportions
– Quotient: 0.0-1.0
– Percentage: 0-100%
– Rate: 7 per 1,000
![Page 8: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/8.jpg)
MAPPING PROPORTIONS
Persons 60 and Over/Total Persons*100
“percentage of seniors”
Persons 60 and Over
![Page 9: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/9.jpg)
MAPPING PROPORTIONS
Non Grads/Total Population*100
“percentage of non grads”
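The three ways of expressing a proportion (quotient, percentage, rate) differ only in the multiplier applied to part/whole. The sketch below assumes a hypothetical helper named `proportion` and made-up counts chosen to match the "7 per 1,000" rate on the earlier slide.

```python
def proportion(part, whole, per=1):
    """Express part/whole scaled by `per`.

    per=1 gives a quotient (0.0-1.0), per=100 a percentage,
    per=1000 a rate per 1,000.
    """
    if whole <= 0:
        raise ValueError("whole must be positive")
    if not 0 <= part <= whole:
        raise ValueError("part must lie between 0 and whole")
    # Multiply before dividing to limit floating-point rounding.
    return part * per / whole


# Hypothetical enumeration unit: 175 persons in the group of interest
# out of 25,000 total persons.
part, whole = 175, 25_000
print("quotient:  ", proportion(part, whole))
print("percentage:", proportion(part, whole, per=100))
print("rate:      ", proportion(part, whole, per=1000), "per 1,000")
```

Because the part can never exceed the whole, the quotient is bounded by 0.0 and 1.0, which is what separates a proportion from the more general relationships on the following slides.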
![Page 10: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/10.jpg)
MAPPING RELATIONSHIPS
• It is often revealing to show how two variables are related (in a manner that is not strictly proportional)
• Several ways to express relationships
– Quotient: 0.0-infinity
– Percentage: 0-infinity%
– Rate: 1,500 per 100
![Page 11: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/11.jpg)
MAPPING RELATIONSHIPS
Females/Males
“ratio of females to males”
![Page 12: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/12.jpg)
MAPPING RELATIONSHIPS
• It is often revealing to show how two variables are related (in a manner that is not strictly proportional)
• Several ways to express relationships
– Quotient: 0.0-infinity
– Percentage: 0-infinity%
– Rate: 1,500 per 100
![Page 13: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/13.jpg)
MAPPING RELATIONSHIPS
Acres of Cropland/Population
“acres per 1,000 people”
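The two relationship examples (females to males, acres of cropland per 1,000 people) can be sketched as a single ratio function. The name `relationship` and all counts below are hypothetical; unlike a proportion, the numerator is not part of the denominator, so the result has no upper bound.

```python
def relationship(numerator, denominator, per=1):
    """Return numerator/denominator scaled by `per` (unbounded above)."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    # Multiply before dividing to limit floating-point rounding.
    return numerator * per / denominator


# Hypothetical enumeration unit
females, males = 5_200, 5_000
cropland_acres, population = 45_000, 30_000

print("ratio of females to males:", relationship(females, males))
print("acres per 1,000 people:  ",
      relationship(cropland_acres, population, per=1000))
```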
![Page 14: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/14.jpg)
MAPPING STAT. SUMMARIES
• Enumeration units can be represented according to calculated statistics
– Median
– Mean (average)
– Standard Deviation, etc.
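As a rough illustration of the summaries listed above, the sketch below uses Python's standard `statistics` module on a hypothetical set of observations collected inside one enumeration unit. The data values and the helper name `summarize` are assumptions, and the population standard deviation (`pstdev`) is one possible choice among several.

```python
import statistics


def summarize(values):
    """Return median, mean, and population standard deviation."""
    return {
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "std_dev": statistics.pstdev(values),
    }


# Hypothetical observations within a single enumeration unit
# (e.g. household incomes in thousands of dollars).
incomes = [30, 40, 50, 60, 120]

for name, value in summarize(incomes).items():
    print(f"{name}: {value:.2f}")
```

The single large value pulls the mean above the median, so the choice of summary statistic changes what the map says about the unit.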
![Page 15: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/15.jpg)
MAPPING STAT. SUMMARIES
![Page 16: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/16.jpg)
Animation showing raw and standardized values
(slow version)
![Page 17: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/17.jpg)
Animation showing raw and standardized values
(fast version)
![Page 18: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/18.jpg)
STANDARDIZATION
• Transformation of raw data values to different, more meaningful values
– Densities, Proportions, Relationships, and Statistical Summaries
• In conjunction with data classification, normalization allows us to craft our message…
![Page 19: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/19.jpg)
DATA CLASSIFICATION
![Page 20: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/20.jpg)
DATA CLASSIFICATION
• The act of organizing attribute values into categories, or groups
• Can be qualitative or quantitative, and based on any of the four measurement scales
– Nominal
– Ordinal
– Interval
– Ratio
![Page 21: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/21.jpg)
DATA CLASSIFICATION
RATIO (Population): 0 - 500; 501 - 1,000; 1,001 - 1,500
INTERVAL (Quality of Life): 2.4 - 4.7; 4.8 - 6.3; 6.4 - 8.6
ORDINAL (Visibility): Poor; Fair; Good
NOMINAL (Zoning): Commercial; Residential; Industrial
![Page 22: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/22.jpg)
DATA CLASSIFICATION
• One of the most interesting aspects of thematic mapping
– One set of attribute values can yield many different maps, depending on the classification scheme
– The scheme you choose can strongly influence how your map is perceived
![Page 23: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/23.jpg)
DATA CLASSIFICATION
![Page 24: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/24.jpg)
DATA CLASSIFICATION
• Animation showing population using equal interval, quantile, and natural breaks classification methods
![Page 25: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/25.jpg)
DATA CLASSIFICATION
There is no “best” method
Certain methods are not well suited to particular situations
![Page 26: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/26.jpg)
DATA CLASSIFICATION
• How many classes should you use?
– Anywhere from 3 to 7
– 5 is probably optimal
– An odd # has a “middle” class
Difficult to differentiate large numbers of tints
![Page 27: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/27.jpg)
DATA CLASSIFICATION
• Animation showing agricultural sales using 2, 4, and 6 classes
![Page 28: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/28.jpg)
DATA CLASSIFICATION
![Page 29: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/29.jpg)
DATA CLASSIFICATION
• Equal Interval
– Each class occupies an equal interval along the number line, or histogram
TOWN POPULATION
No gaps between classes
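One way to sketch equal interval classification in Python is shown below. The function names and the town populations are hypothetical, and assigning a value that falls exactly on a break to the lower class is an assumption made here, not something the slides specify.

```python
from bisect import bisect_left


def equal_interval_upper_bounds(values, num_classes):
    """Return the upper bound of each class, dividing min..max evenly."""
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_classes
    return [lo + width * i for i in range(1, num_classes)] + [hi]


def classify(value, upper_bounds):
    """Return the 0-based class index; break values go to the lower class."""
    return min(bisect_left(upper_bounds, value), len(upper_bounds) - 1)


# Hypothetical town populations (in thousands), skewed toward low values.
town_population = [0, 10, 20, 45, 100]
bounds = equal_interval_upper_bounds(town_population, 5)

counts = [0] * len(bounds)
for value in town_population:
    counts[classify(value, bounds)] += 1

print("upper bounds:", bounds)
print("members per class:", counts)
```

Each class spans the same width of the number line regardless of how many values fall into it, so with skewed data like this some classes can end up with no members at all.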
![Page 30: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/30.jpg)
DATA CLASSIFICATION
• Advantages of Equal Interval
– Can be easy to understand and interpret
– Good for attributes that are normally represented using uniform classes: elevation, precipitation, temperature
0 – 20
21 – 40
41 – 60
61 – 80
81 – 100
![Page 31: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/31.jpg)
DATA CLASSIFICATION
• Disadvantage of Equal Interval
*
![Page 32: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/32.jpg)
DATA CLASSIFICATION
• *Considers distribution of data along a number line (poor)
– Doesn't work well with skewed distributions (can result in empty classes)
![Page 33: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/33.jpg)
DATA CLASSIFICATION
• Quantile
– Each class contains the same (or similar) number of attribute values
4 classes: quartiles
5 classes: quintiles
6 classes: sextiles
TOWN POPULATION
Gaps between classes
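A minimal sketch of quantile classification follows: values are ranked, and the ranks are divided into classes of equal (or nearly equal) size. The function name and the 67 generated values are hypothetical, and tied values are split arbitrarily here, which a production tool may handle differently.

```python
from collections import Counter


def quantile_classes(values, num_classes):
    """Return a 0-based class index for each value, by rank."""
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    n = len(values)
    # Indices of the values, ordered from smallest to largest value.
    order = sorted(range(n), key=values.__getitem__)
    classes = [0] * n
    for rank, index in enumerate(order):
        classes[index] = rank * num_classes // n
    return classes


# 67 hypothetical, strongly skewed attribute values.
values = [i * i for i in range(67)]
classes = quantile_classes(values, 5)

sizes = Counter(classes)
print("members per class:", [sizes[c] for c in range(5)])
```

With 67 values and 5 classes, every class receives 13 or 14 members even though the values themselves are far from evenly spread along the number line.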
![Page 34: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/34.jpg)
DATA CLASSIFICATION
• Advantage of Quantile
– Ensures that a choropleth map will have the same number of darkest polygons as lightest, etc.
67 Counties, 5 Classes
≈13 Counties per Class
![Page 35: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/35.jpg)
DATA CLASSIFICATION
• Disadvantage of Quantile
*
![Page 36: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/36.jpg)
DATA CLASSIFICATION
• *Considers distribution of data along a number line (poor)
– Doesn’t work well with skewed distributions (one or two classes can occupy the majority of the range)
![Page 37: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/37.jpg)
DATA CLASSIFICATION
• Natural Breaks
– Each class contains a cluster of attribute values, with “natural” breaks between classes
More subjective
TOWN POPULATION
Gaps between classes
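The idea of placing breaks where the data naturally separate can be approximated by cutting at the largest gaps between sorted values. This is a deliberately simplified stand-in, not the Jenks optimization that GIS packages commonly implement, and the function name and sample values are hypothetical.

```python
def largest_gap_breaks(values, num_classes):
    """Return (last value of class, first value of next class) pairs,
    cutting at the num_classes - 1 largest gaps between sorted values."""
    ordered = sorted(set(values))
    if not 1 <= num_classes <= len(ordered):
        raise ValueError("num_classes must be between 1 and the "
                         "number of distinct values")
    # Positions of gaps, widest first.
    gap_positions = sorted(
        range(len(ordered) - 1),
        key=lambda i: ordered[i + 1] - ordered[i],
        reverse=True,
    )[: num_classes - 1]
    return [(ordered[i], ordered[i + 1]) for i in sorted(gap_positions)]


# Hypothetical town populations (in thousands) forming three clusters.
town_population = [2, 3, 4, 20, 22, 23, 60, 61]

for lower_end, upper_start in largest_gap_breaks(town_population, 3):
    print(f"break between {lower_end} and {upper_start}")
```

Because the breaks depend entirely on where the clusters fall in a particular dataset, two datasets will generally produce different class limits.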
![Page 38: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/38.jpg)
DATA CLASSIFICATION
• Advantage of Natural Breaks
*
![Page 39: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/39.jpg)
DATA CLASSIFICATION
• *Considers distribution of data along a number line (very good)
– Considers how the data are distributed along the number line; each classification is “custom tailored”
– Works well with skewed data distributions
![Page 40: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/40.jpg)
DATA CLASSIFICATION
• Disadvantages of Natural Breaks
– Subjective, and results will differ
– More difficult to compare with other maps
– One or two classes can end up occupying the majority of the data's range
![Page 41: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/41.jpg)
DATA CLASSIFICATION
• Classification for map comparison
– Use the same method for all maps (if possible)
– Equal interval with identical break values often works best (shown here)
– Quantile can also work well
– By definition, natural breaks will result in different classifications on different maps, making comparison difficult
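One possible way to obtain identical equal interval break values for several maps is to compute them once from the combined range of all datasets. The sketch below assumes this pooled-range approach, which the slides do not spell out; the function name and both datasets are hypothetical.

```python
def shared_equal_interval_bounds(datasets, num_classes):
    """Return class upper bounds computed over all datasets combined."""
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    all_values = [value for dataset in datasets for value in dataset]
    lo, hi = min(all_values), max(all_values)
    width = (hi - lo) / num_classes
    return [lo + width * i for i in range(1, num_classes)] + [hi]


# Hypothetical attribute values for the same units in two different years.
map_a = [0, 12, 30]
map_b = [5, 40, 100]

bounds = shared_equal_interval_bounds([map_a, map_b], 5)
print("upper bounds used on both maps:", bounds)
```

With the same bounds on both maps, a given tint means the same range of values on each, so differences between the maps reflect the data and not the classification.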
![Page 42: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/42.jpg)
![Page 43: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/43.jpg)
![Page 44: DATA STANDARDIZATION and CLASSIFICATION](https://reader033.vdocuments.mx/reader033/viewer/2022061607/568135c8550346895d9d2927/html5/thumbnails/44.jpg)