madness in the dust: is having dementia linked to where you live?

By Frank Kelly, Ondrej Urban and Esther Remmelink

Post on 28-Jan-2018



Top 5 causes of death in London in 2012

Top five female death categories in London:

1. Ischaemic heart diseases
2. Dementia and Alzheimer’s disease
3. Cerebrovascular diseases
4. Malignant neoplasm of trachea, bronchus and lung
5. Chronic lower respiratory diseases

Top five male death categories in London:

1. Ischaemic heart diseases
2. Malignant neoplasm of trachea, bronchus and lung
3. Chronic lower respiratory diseases
4. Cerebrovascular diseases
5. Dementia and Alzheimer’s disease

http://cleanair.london/hot-topics/first-ever-rankings-of-top-10-death-rates-for-every-london-borough/

http://www.sciencemag.org/news/2017/01/brain-pollution-evidence-builds-dirty-air-causes-alzheimer-s-dementia

Alzheimer's Disease (AD)

A progressive neurodegenerative disorder:

● 15-20 years’ accumulation of “plaques”

● short-term memory problems → death (traumatic)

Most common form of dementia (60%+ of cases)

Cause poorly understood: genetic + environmental factors

Harmful factors: old age, female gender, poor cardiovascular function, diabetes

Protective factors: intellectual activities, physical activity, social interaction

Changes in demographics mean the global burden of AD will increase

http://www.alz.org/

http://www.un.org/esa/population/publications/

Background

WHO are we?

Frank, Ondrej and Esther are data scientists at @HAL24K (smart cities data science consultancy)

HOW are we affected?

● Family members with Alzheimer's Disease

● Have studied neuroscience

Also:

● Have lived in polluted cities

How pollutants can affect the central nervous system

Not all reach the brain:

Fine particles, e.g. Particulate Matter (PM) 2.5 (= particles <2.5 μm)

Heusinkveld, NeuroToxicology, 2016

PM2.5 particles

Two main sources:

● Anthropogenic = human-made, e.g. fuel combustion, industry, vehicles, agriculture

● Non-anthropogenic = non-human-made, e.g. forest fires

Measured in µg/m³

https://uk-air.defra.gov.uk/assets/documents/reports/cat09/1310021025_AQD_DD4_2011mapsrepv0.pdf

https://uk-air.defra.gov.uk/assets/documents/reports/cat11/1212141150_AQEG_Fine_Particulate_Matter_in_the_UK.pdf

Image credit: http://www.solarcrest.co.uk/images/PM2-point-5.jpg

Urban PM2.5

Black carbon = a major part of PM2.5 due to road traffic

https://uk-air.defra.gov.uk/assets/documents/reports/cat11/1212141150_AQEG_Fine_Particulate_Matter_in_the_UK.pdf

Project & Presentation Goals

- Use open data to verify for ourselves the link between Alzheimer’s prevalence and air pollution exposure

- Demonstrate nice map visualisation methods in Python

- Discuss challenges and interesting findings

Our Study: Air Quality (AQ) data

USA (California)

● Time span: 1990 - 2016 (AQ)

● Measurement of air pollution: PM2.5 continuously measured from state environmental agency operated air quality monitoring stations

● Website: https://www.epa.gov/outdoor-air-quality-data

UK (England)

● Time span: 2010 - 2015

● Measurement of air pollution: population-weighted anthropogenic and non-anthropogenic PM 2.5 estimates per local authority (started 2009)

● Website: https://uk-air.defra.gov.uk/

Netherlands (Netherlands)

● Time span: 2011 - 2016

● Measurement of air pollution: PM 2.5 measured from 50 air quality monitoring stations (part of national network)

● Website: http://www.rivm.nl/
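As a rough illustration of what "population-weighted" means for the UK estimates, the sketch below weights each concentration by the number of people exposed to it. The grid cells, concentrations and resident counts are made up for the example, and the function name is our own; the slides do not describe how the published estimates are actually computed.

```python
def population_weighted_mean(concentrations, populations):
    """Return sum(c_i * p_i) / sum(p_i) for paired areas."""
    if len(concentrations) != len(populations):
        raise ValueError("inputs must have the same length")
    total_population = sum(populations)
    if total_population <= 0:
        raise ValueError("total population must be positive")
    weighted_sum = sum(c * p for c, p in zip(concentrations, populations))
    return weighted_sum / total_population


# Hypothetical grid cells inside one local authority (made-up numbers).
pm25_ug_m3 = [8.0, 12.0, 16.0]
residents = [1000, 3000, 6000]

weighted = population_weighted_mean(pm25_ug_m3, residents)
unweighted = sum(pm25_ug_m3) / len(pm25_ug_m3)

print(f"population-weighted: {weighted:.1f} µg/m³")
print(f"simple average:      {unweighted:.1f} µg/m³")
```

In this toy case most residents live in the most polluted cell, so the weighted figure is higher than the simple average.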

PM2.5 monitoring

http://aqicn.org/map/netherland/

The Netherlands is in there, somewhere

California: PM2.5 monitoring in 3 cities (sparse in SF)

Our Study: Alzheimer’s (AD) prevalence data

USA (California)

● Time span: 2010 - 2014

● Disease prevalence sample: registered Medicare patient Alzheimer’s & other dementias prevalence, split over and under 65

● Region division: County

UK (England)

● Time span: 2013

● Disease prevalence sample: counts of diagnosed dementia cases in 12 age bins, ranging from 30-34 through to 90+ years of age

● Region division: Parliamentary constituency

Netherlands (Netherlands)

● Time span: 2010 - 2014

● Disease prevalence sample: counts of diagnosed dementia cases

● Region division: ‘Gemeente’ = city council

Geospatial Visualisation in Python: options

● Geopandas: http://geopandas.org/

● Bokeh: http://bokeh.pydata.org/en/latest/

See our demo on GitHub: https://github.com/ondrejiayc/PyDataLondon2017

Netherlands: PM 2.5

<- Cumulative PM2.5 count over 4 years

-> Long-term average wind speed

Wind + PM2.5 animation: http://aqicn.org/faq/2015-11-05/a-visual-study-of-wind-impact-on-pm25-concentration/

Netherlands: Dementia prevalence

<- Dementia prevalence in 2015

-> Fraction of population over 65

Adjusting for age distribution

Adjust for age in the population studied to be able to compare different populations, e.g. between counties.

How?

● Multiplying rates by a standard population distribution (e.g. the W.H.O. standard).

W.H.O. population statistics for developed nations, 2010
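The adjustment described above can be sketched as direct age standardisation: compute a rate per age band, then take a weighted sum using a standard population's age shares. The age bands, case counts, populations and standard weights below are all invented for illustration; they are not the W.H.O. figures or the data used in this study.

```python
def crude_rate(cases, population):
    """Overall prevalence, ignoring the age structure."""
    return sum(cases) / sum(population)


def directly_standardised_rate(cases, population, standard_weights):
    """Weight each age-specific rate by the standard population's share."""
    total_weight = sum(standard_weights)
    rate = 0.0
    for c, n, w in zip(cases, population, standard_weights):
        rate += (c / n) * (w / total_weight)
    return rate


# Hypothetical age bands: 30-64, 65-79, 80+ (made-up standard shares).
standard = [0.70, 0.20, 0.10]

# Two made-up regions with identical age-specific rates
# but very different age structures.
young_cases, young_pop = [8, 45, 50], [8000, 1500, 500]
old_cases, old_pop = [5, 90, 200], [5000, 3000, 2000]

crude_young = crude_rate(young_cases, young_pop)
crude_old = crude_rate(old_cases, old_pop)
adj_young = directly_standardised_rate(young_cases, young_pop, standard)
adj_old = directly_standardised_rate(old_cases, old_pop, standard)

print(f"crude:    {crude_young:.4f} vs {crude_old:.4f}")
print(f"adjusted: {adj_young:.4f} vs {adj_old:.4f}")
```

The crude rates differ almost threefold purely because one region is older, while the standardised rates agree. Note that this needs case counts per age band; where only totals are available, any adjustment is necessarily cruder.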

Netherlands: PM 2.5 vs. (adjusted) dementia prevalence

<- Cumulative PM2.5 count over 4 years

-> Crudely age adjusted dementia prevalence

California: AQ vs. AD prevalence with Bokeh

Limitations of Californian & Dutch data

● Dementia prevalence data lacked age breakdown in Netherlands and USA

● Sample bias in USA dementia data (Medicare scheme only)

● PM 2.5 monitors not evenly spread out across California

UK geospatial data: boundaries

● PM 2.5 data by local authority

● Dementia count for each age group recorded per parliamentary constituency

● Shape files available from Ordnance Survey

https://www.ordnancesurvey.co.uk/business-and-government/products/boundary-line.html

Geopandas capabilities - spatial joins for London

Cumulative 5 year human-made PM 2.5 by Local Authority

Dementia prevalence by Westminster constituency
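A minimal sketch of joining two layers with different boundaries in Geopandas, assuming a recent version in which `sjoin` takes a `predicate` argument. The rectangles, names and values are toy data, and assigning each constituency to the authority containing its centroid is a simplification chosen for the example; it is not necessarily how the maps in this talk were produced.

```python
import geopandas as gpd
from shapely.geometry import box

# Toy "local authorities" carrying a PM2.5 value (made-up numbers).
authorities = gpd.GeoDataFrame(
    {"authority": ["A", "B"], "pm25": [10.0, 14.0]},
    geometry=[box(0, 0, 2, 2), box(2, 0, 4, 2)],
    crs="EPSG:27700",  # British National Grid, a projected CRS
)

# Toy "constituencies" whose boundaries do not line up with the authorities.
constituencies = gpd.GeoDataFrame(
    {"constituency": ["X", "Y", "Z"], "dementia_per_1000": [4.0, 6.0, 5.0]},
    geometry=[box(0, 0, 1.5, 2), box(1.5, 0, 2.4, 2), box(2.4, 0, 4, 2)],
    crs="EPSG:27700",
)

# Reduce each constituency to its centroid, then find the containing authority.
points = constituencies.copy()
points["geometry"] = constituencies.geometry.centroid
joined = gpd.sjoin(points, authorities, how="left", predicate="within")

print(joined[["constituency", "dementia_per_1000", "authority", "pm25"]])
```

Constituency Y straddles both authorities but is assigned wholly to A here; an area-weighted overlay would be a more careful alternative when regions overlap substantially.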

England’s age distribution

● Urban / rural age distribution differs immensely

UK County = mostly countryside

UK Borough = mostly urban areas

● Detailed prevalence per age group available for UK data

UK: correction for age

<- Median 30-95+ dementia count per region

-> Directly age adjusted version

UK County = mostly countryside

UK Borough = mostly urban areas

Geopandas: dementia and age correction for London

^ Mean age by London ward (2013)

^ Dementia prevalence by Westminster constituency, 2013 (not age corrected)

AD age corrected ->

PM 2.5 in the UK (2011 - 2015)

Cumulative non human-made PM2.5

Cumulative human-made PM2.5

UK PM2.5 & age adjusted dementia

Cumulative non human-made PM2.5

Cumulative human-made PM2.5

UK mean wind profile 1981 - 2010

Age adjusted dementia prevalence (England only)

Age adjusted English dementia vs. PM 2.5

● Geospatially joined

○ (Local Authority ⇔ Parliamentary Constituency)

● No clear correlation between total age-adjusted dementia prevalence (total for all ages) and cumulative 5 year PM 2.5
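One simple way to check a statement like the one above is a Pearson correlation coefficient across regions. The sketch below implements it in plain Python; the PM2.5 totals and prevalence values are invented, and the slides do not say which statistic the authors used.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


# Made-up regional values: cumulative 5 year PM2.5 and
# age-adjusted dementia prevalence per 1000 people.
cumulative_pm25 = [40.0, 45.0, 50.0, 55.0, 60.0]
adjusted_prevalence = [5.0, 7.0, 4.0, 6.0, 5.0]

r = pearson_r(cumulative_pm25, adjusted_prevalence)
print(f"Pearson r = {r:.2f}")
```

A value of r near zero, as in this toy data, is what "no clear correlation" would look like; it says nothing about non-linear relationships or about individual exposure.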

Measuring lifetime exposure to PM2.5?

https://www.instituteforgovernment.org.uk/blog/dealing-diesel

● Ownership % of diesel cars increased drastically since 1990

Lifetime exposure

● Diesel cars only part of the picture

● Which age group was most affected by traffic related PM 2.5 levels?

Age-adjusted dementia prevalence in 30-39 year olds

UK County = mostly countryside

UK Borough = mostly urban areas

● Can we see evidence of movement of young people out from London to its surrounding commuter belt?

Confounders and covariates

Some other possible variables that could explain the relationship between poor AQ and early AD?

<- % not born in the UK (2011)

Median household income (2012/13)


Feature importance: Random Forest Regressor model applied to London areas’ 30-39yrs AD prevalence:

Education, income levels and life expectancy were significant →
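For readers unfamiliar with the technique, here is a minimal sketch of reading feature importances from scikit-learn's `RandomForestRegressor`. The data is synthetic: the feature names are placeholders and the target is constructed to depend on two of them, so the ranking reflects that construction and not the London results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n_areas = 200
feature_names = ["pm25", "education", "income", "life_expectancy"]

# Synthetic, standardised features for hypothetical areas.
X = rng.normal(size=(n_areas, len(feature_names)))

# Synthetic target: depends on education and income by construction.
y = 2.0 * X[:, 1] - 1.5 * X[:, 2] + rng.normal(scale=0.1, size=n_areas)

model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X, y)

# Impurity-based importances; they sum to 1 across features.
importances = dict(zip(feature_names, model.feature_importances_))
ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)

for name, score in ranked:
    print(f"{name:16s} {score:.3f}")
```

These importances rank how much each feature helped the trees reduce prediction error; they are not a test of statistical significance and do not show the direction of an effect.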

Our conclusions

● Some indication of a link between recent exposure to cumulative PM2.5 levels per region and early Alzheimer's prevalence in that region

● This correlation may be confounded by other (unknown) factors

⇒ Open Data and Data Science ≠ Epidemiology

● It is a challenge to compare open data from different domains and geographies

⇒ Recommend Geopandas

Where should Data Scientists live ideally?

● A windy place… (or rainy)

● Not too much traffic…

● Wherever you live:

○ Head for greener areas

○ Get decent sleep, eat healthily, get regular aerobic exercise … and intellectual stimulation.

THE END

Thank you!

Frank: @norhustla, Esther: @esrem5, Ondrej: @ondrejiayc

HAL24K: @hal24k

We would welcome your involvement!

- Contribute AD and / or AQ data

- Examine the data and draw your own conclusions

https://github.com/ondrejiayc/PyDataLondon2017