![Page 1: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/1.jpg)
Sub-County Statistical
Analysis and Visualization
using ArcGIS Pro and Python
Robert Gottlieb, GIS Data Analyst
Epidemiology Resource Center
Jenny Durica, Director of MCH Epidemiology
Division of Maternal and Child Health
Indiana GIS Day
September 17, 2019
![Page 2: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/2.jpg)
Infant Mortality Rates
U.S. & Selected Countries, 2010
![Page 3: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/3.jpg)
Infant Mortality Rates
U.S., 2017
![Page 4: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/4.jpg)
Infant Mortality Rates
Indiana, U.S. & Healthy People 2020 Goal, 2007 – 2017
![Page 5: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/5.jpg)
![Page 6: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/6.jpg)
Infant Mortality Data
Birth record
• Pre-term, low birthweight
• Prenatal care
• Smoking during pregnancy
• Insurance
Death record
• Cause of death
• Age at death
County-level analysis
![Page 7: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/7.jpg)
Infant Mortality and Birth Risk Factors
Infant Mortality Rates, 2013-17
Mothers Smoking During Pregnancy, 2013-17
![Page 8: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/8.jpg)
Infant Mortality and Birth Risk Factors
Infant Mortality Rates, 2013-17
Preterm Infants, 2013-17
Need the ability to look beyond county analysis,
and comprehensively assess multiple risk factors
![Page 9: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/9.jpg)
GIS in Health Begins with Geographic Coding
Records collected by ISDH
Records are a point on a map
![Page 10: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/10.jpg)
• What are we repeatedly and increasingly asked for?
– Sub-county statistics
• Can we share and distribute these stats?
– No, because of suppression rules (identifiability) and the accuracy (noise) of stats when less data is available at such scales
• How do we overcome these limitations?
– Provide statistics on observed cases and events at increased geographic detail to focus health resources within the county, using a multi-scale binning and smoothing methodology whose results can be shared and distributed in the public domain to promote and support targeted health interventions
![Page 11: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/11.jpg)
We are Here
![Page 12: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/12.jpg)
Describing Health by County as a Whole (share data with everyone)
Targeting Health of Neighborhoods (no sharing – actionable info not leveraged)
![Page 13: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/13.jpg)
Not enough information for public/partners
Too much information for public/partners (identifiable)
Finding the balance between coarse data and detailed data while maintaining data stability and confidentiality.
What is this solution?
![Page 14: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/14.jpg)
![Page 15: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/15.jpg)
Zip Code
- Wide range of population (100 – 100,000)
- Point-based zips aren’t often cross-walked to areas
- Small zips aren’t often cross-walked to large zips
- Zip Codes do not exactly equal Census ZCTA
- Zip boundaries change
- Data collection doesn’t check zip for accuracy
Census Tract
- Some tract populations are < 1,000
- Tracts can be very small areas
- Tracts can be oddly shaped
- Tract boundaries change
- Tract geography is considered ‘too identifiable’
![Page 16: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/16.jpg)
Used extensively by ISDH GIS in the past (Rushton)
Susceptible to false positives
Currently a popular option (“Hex-Binning”)
Introduces directional bias
Straightforward, out-of-box
Arbitrary
Ensures data stability (Rushton)
Large bins might not describe data at source point
Varying sizes of bins might be confusing
![Page 17: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/17.jpg)
Diamond Binning – Based on Reasoning
The road grid system covers nearly all of Indiana. One can drive further when travelling north, south, east or west from a point than when travelling NW, NE, SW or SE. The distance travelled in a given amount of time therefore creates an extent boundary in the general shape of a diamond. Since neighborhoods and communities are closely tied to streets, and people tend to live near people of similar demographic characteristics, we reason that a diamond better captures a ‘neighborhood’ of people.
Drive-Time Service Area
15 minutes
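The diamond reasoning can be illustrated with grid ("Manhattan") distance. The following is a minimal sketch, not the presenters' code: it assumes projected coordinates in miles, a road grid aligned north-south and east-west, and the hypothetical helper names `diamond_half_diagonal` and `in_diamond`.

```python
import math


def diamond_half_diagonal(area_sq_mi: float) -> float:
    """Half-diagonal r of a diamond (a square rotated 45 degrees) of given area.

    The diamond |dx| + |dy| <= r has area 2 * r**2, so r = sqrt(area / 2).
    """
    return math.sqrt(area_sq_mi / 2.0)


def in_diamond(px: float, py: float, cx: float, cy: float, r: float) -> bool:
    """True if (px, py) is within grid distance r of the centre (cx, cy)."""
    return abs(px - cx) + abs(py - cy) <= r


# Half-diagonal of a 5 square mile diamond, in miles.
r = diamond_half_diagonal(5.0)

# A point 1.5 miles due east is reachable along one road.
east_inside = in_diamond(1.5, 0.0, 0.0, 0.0, r)

# A point 1 mile east and 1 mile north is closer in a straight line,
# but needs 2 miles of grid travel, so it falls outside the diamond.
northeast_inside = in_diamond(1.0, 1.0, 0.0, 0.0, r)

print(round(r, 3), east_inside, northeast_inside)
```

The second point shows the directional effect described on the slide: diagonal travel on a grid costs more than travel along an axis, which is what turns a circular buffer into a diamond.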
![Page 18: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/18.jpg)
Concurrent Binning (Record Aggregation) for Urban and Rural Population
Accounting for a lot of data points AND too few data points
Multi-Scale:
Small bins for urban – more data points available in small area
Large bins for rural – more area needed to capture enough data points
![Page 19: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/19.jpg)
Multi-Scale Offset Approach:
5, 20, and 80 square miles and N, S, E, and W of seed
[Figure labels: 4.5, 4.1, 6.2]
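One way to picture the multi-scale offset bins around a single seed point is sketched below. The three areas and the N/S/E/W directions come from the slide. Everything else is an assumption made for illustration: the offset distance (half of each bin's half-diagonal), applying offsets at every scale, the bin naming, and the helper names `bin_centers` and `bin_rate`.

```python
import math

AREAS_SQ_MI = (5.0, 20.0, 80.0)  # bin sizes named on the slide
DIRECTIONS = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}


def bin_centers(seed_x, seed_y, offset_fraction=0.5):
    """Return {bin_name: (cx, cy, r)} for one seed point.

    offset_fraction is hypothetical: the slides do not say how far the
    offset bins are shifted from the seed.
    """
    bins = {}
    for area in AREAS_SQ_MI:
        r = math.sqrt(area / 2.0)  # half-diagonal of a diamond of this area
        name = f"{int(area)}mi2"
        bins[name] = (seed_x, seed_y, r)
        for label, (dx, dy) in DIRECTIONS.items():
            shift = offset_fraction * r
            bins[f"{name}_{label}"] = (seed_x + dx * shift, seed_y + dy * shift, r)
    return bins


def bin_rate(points, cx, cy, r):
    """Count records in a diamond bin and the share with the risk factor.

    points is an iterable of (x, y, flag) with flag equal to 0 or 1.
    """
    flags = [flag for x, y, flag in points if abs(x - cx) + abs(y - cy) <= r]
    n = len(flags)
    return n, (sum(flags) / n if n else None)


# Made-up records: (x miles, y miles, risk factor present)
records = [(0.5, 0.5, 1), (1.0, 0.2, 0), (3.0, 0.0, 1), (0.0, -5.0, 0)]
bins = bin_centers(0.0, 0.0)
for name in ("5mi2", "20mi2", "80mi2"):
    print(name, bin_rate(records, *bins[name]))
```

The larger bins pick up more records around the same seed, which is the point of using several scales at once: sparse areas can borrow records from further away.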
![Page 20: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/20.jpg)
Python Inputs and Outputs
Inputs
• Point layer with risk factors
• Bin template
• Need to create ArcGIS Pro project ahead of time
Outputs
• 20 mi² bins with mean weighted composite rate
• Point layer with significance for each diamond
• Intermediate bins and smoothed layers
![Page 21: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/21.jpg)
Point Data
Bin Template
20 mi² Bin with Mean Weighted Composite Rate
![Page 22: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/22.jpg)
![Page 23: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/23.jpg)
5 mi² Weighted Composite Rate* =
((3 * 5mi2 Rate)
+ (2 * 20mi2 Rate) + (2 * 20mi2_N Rate) + (2 * 20mi2_E Rate) + (2 * 20mi2_W Rate)
+ (80mi2 Rate) + (80mi2_N Rate) + (80mi2_E Rate) + (80mi2_W Rate)) / 15
*Only bins with variable counts greater than a user-defined threshold (typically 20) are included in the calculation
5mi2 Rate = 0.386364
20mi2 Rate = 0.541295
20mi2_N Rate = 0.538126
20mi2_E Rate = 0.495726
20mi2_W Rate = 0.534035
80mi2 Rate = 0.537832
80mi2_N Rate = 0.536505
80mi2_E Rate = 0.536672
80mi2_W Rate = 0.536396
0.501657 = ((3 * 0.386364) + (2 * 0.541295) + (2 * 0.538126) + (2 * 0.495726) + (2 * 0.534035) +
(0.537832) + (0.536505) + (0.536672) + (0.536396)) / 15
Weighting ensures that local data counts for more in the calculation of the composite rate
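The weighted composite calculation can be reproduced in a few lines of Python. This is a sketch rather than the presenters' script. The weights and rates are taken from the slide. The function name `weighted_composite_rate` is hypothetical. The handling of excluded bins (dropping them and dividing by the sum of the remaining weights) is an assumption, since the slide does not say how the divisor changes when a bin falls below the threshold.

```python
# Weights from the slide: 3 for the 5 mi2 bin, 2 for each 20 mi2 bin,
# 1 for each 80 mi2 bin (total weight 15).
WEIGHTS = {
    "5mi2": 3,
    "20mi2": 2, "20mi2_N": 2, "20mi2_E": 2, "20mi2_W": 2,
    "80mi2": 1, "80mi2_N": 1, "80mi2_E": 1, "80mi2_W": 1,
}


def weighted_composite_rate(rates, counts=None, threshold=20):
    """Weighted mean of the bin rates.

    If counts is given, bins whose count is not greater than threshold are
    skipped. Assumption: the divisor is the sum of the weights actually used.
    """
    total = 0.0
    weight_sum = 0
    for name, weight in WEIGHTS.items():
        if counts is not None and counts.get(name, 0) <= threshold:
            continue
        total += weight * rates[name]
        weight_sum += weight
    return total / weight_sum if weight_sum else None


# Rates from the worked example on the slide.
slide_rates = {
    "5mi2": 0.386364,
    "20mi2": 0.541295, "20mi2_N": 0.538126,
    "20mi2_E": 0.495726, "20mi2_W": 0.534035,
    "80mi2": 0.537832, "80mi2_N": 0.536505,
    "80mi2_E": 0.536672, "80mi2_W": 0.536396,
}

composite = weighted_composite_rate(slide_rates)
print(round(composite, 6))  # 0.501657, matching the slide
```

With all nine bins included the divisor is 15 and the result matches the slide's 0.501657. Passing a `counts` dictionary shows how the composite shifts when sparse bins are left out.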
![Page 24: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/24.jpg)
*** Histogram scale may not match.
![Page 25: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/25.jpg)
![Page 26: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/26.jpg)
![Page 27: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/27.jpg)
![Page 28: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/28.jpg)
![Page 29: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/29.jpg)
![Page 30: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/30.jpg)
![Page 31: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/31.jpg)
![Page 32: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/32.jpg)
![Page 33: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/33.jpg)
Initial 5 Composite Result
Composite Rate
County Rate
![Page 34: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/34.jpg)
Composite Rate
Composite Rate IDW
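The panel label "IDW" presumably refers to inverse distance weighted interpolation of the composite rates into a continuous surface. The slides give no parameters, so the sketch below is only a generic illustration of the technique in plain Python. The function name `idw`, the power of 2, and the use of every sample point are assumptions.

```python
def idw(x, y, samples, power=2.0):
    """Inverse distance weighted estimate at (x, y).

    samples is a list of (sx, sy, value). Each sample is weighted by
    1 / distance**power. A sample exactly at (x, y) is returned as is.
    """
    numerator = 0.0
    denominator = 0.0
    for sx, sy, value in samples:
        dist_sq = (x - sx) ** 2 + (y - sy) ** 2
        if dist_sq == 0.0:
            return value
        weight = 1.0 / dist_sq ** (power / 2.0)
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Two made-up bin centres with composite rates 0.4 and 0.6, two miles apart.
centres = [(0.0, 0.0, 0.4), (2.0, 0.0, 0.6)]
midpoint = idw(1.0, 0.0, centres)      # equal distances, so the plain average
near_first = idw(0.5, 0.0, centres)    # pulled toward the nearer centre
print(midpoint, round(near_first, 4))
```

Estimates near a bin centre stay close to that centre's rate, while locations between centres blend the surrounding values.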
![Page 35: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/35.jpg)
Raster Surface is Smoothed Using a Low Pass Filter Method to Remove Noise
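A low pass filter replaces each cell with an average of its neighbourhood, which damps isolated spikes. The sketch below uses a 3 by 3 mean in plain Python as an illustration. The slides do not state the kernel size or the edge handling, so both are assumptions here, as is the function name `low_pass_3x3`.

```python
def low_pass_3x3(grid):
    """Return a new grid where each cell is the mean of its 3x3 neighbourhood.

    Cells on the edge average only the neighbours that exist (an assumption).
    """
    rows, cols = len(grid), len(grid[0])
    smoothed = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            smoothed[r][c] = sum(window) / len(window)
    return smoothed


# A single noisy spike in an otherwise flat surface.
surface = [
    [0.0, 0.0, 0.0],
    [0.0, 9.0, 0.0],
    [0.0, 0.0, 0.0],
]
smoothed_surface = low_pass_3x3(surface)
for row in smoothed_surface:
    print(row)
```

The spike of 9 is spread across its neighbours, so no single cell dominates the smoothed surface.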
![Page 36: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/36.jpg)
Surface statistics are calculated within each 20 mi² bin
The mean of the pixel values in each bin represents the smoothed weighted composite rate
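Taking the mean of the surface within each bin is a zonal statistic. A minimal plain-Python sketch follows, assuming the smoothed surface and the bin membership are available as two grids of the same shape. The function name `zonal_mean` and the zone labels are hypothetical.

```python
from collections import defaultdict


def zonal_mean(values, zones):
    """Mean of the cell values for each zone label.

    values and zones are equally shaped nested lists. Cells whose zone or
    value is None are ignored.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            if zone is None or value is None:
                continue
            sums[zone] += value
            counts[zone] += 1
    return {zone: sums[zone] / counts[zone] for zone in sums}


# Made-up smoothed surface and bin labels for a 2 x 3 raster.
surface_values = [
    [0.40, 0.50, 0.30],
    [0.60, 0.70, None],
]
bin_labels = [
    ["A", "A", "B"],
    ["B", None, "B"],
]
bin_means = zonal_mean(surface_values, bin_labels)
print(bin_means)
```

Each bin ends up with a single value, which is what gets mapped as that bin's smoothed weighted composite rate.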
![Page 37: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/37.jpg)
Composite Rate Composite Rate
![Page 38: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/38.jpg)
Link to web app containing GUMSS maps
![Page 39: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/39.jpg)
4 Things to Remember When Interpreting the Maps
1. The value of a diamond bin is based on data within and around that diamond bin (bins are smoothed)
2. Values of the diamond are based on patient records that could be geocoded (typical geocode percentages are about 90% statewide)
3. Values in diamonds with zero or a small number of geocoded points rely more on data further away (interpolation or inference to fill data gaps)
4. The process is built on the assumption of spatial autocorrelation (like values tend to be nearer to one another in space)
![Page 40: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/40.jpg)
![Page 41: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/41.jpg)
![Page 42: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/42.jpg)
![Page 43: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/43.jpg)
Map Utilization
• Maternal & Child Health Needs Assessment research and outreach to high risk areas
• Grant proposals and program funding
• Maternal and Child Health strategic planning
• Education and data dissemination
![Page 44: Sub-County Statistical Analysis and Visualization using ...using ArcGIS Pro and Python Robert Gottlieb, GIS Data Analyst Epidemiology Resource Center Jenny Durica, Director of MCH](https://reader036.vdocuments.mx/reader036/viewer/2022062415/609936f1c9a1126ed06834a0/html5/thumbnails/44.jpg)
Questions?
Robert Gottlieb
Jenny Durica