NGDM 2009 panel on Climate Change Mining Climate and Ecosystem Data : Challenges and Opportunities Vipin Kumar University of Minnesota

Page 1

NGDM 2009 panel on Climate Change

Mining Climate and Ecosystem Data : Challenges and Opportunities

Vipin Kumar, University of Minnesota

Page 2

Climate Change : The defining issue of our era

Greenhouse gas emissions are the cause of global warming

Human induced ecosystem changes (e.g. deforestation)

Increased use of fossil fuels

Consequences of Global Warming include :

Increased occurrence of extreme events

Melting ice caps/rising sea levels

Heat waves/Droughts/Floods

Shocks in supplies of water and food

Page 3

Need of the day

Ability to answer questions such as:

What is the impact of climate change on intensity, duration and frequency of extreme events? E.g. Droughts, Floods, Hurricanes, Heat Waves

What is the impact of deforestation on global carbon cycle?

What is the relationship of crop yield and prices to deforestation dynamics and greenhouse gas emissions?

Page 4

A Golden Opportunity for the KDD community

Data sets needed to answer the questions above are becoming available

Remote Sensing data from satellites and weather radars

Data from in-situ sensors and sensor networks

Output from climate and earth system models

Data-guided processes can complement hypothesis-guided data analysis to develop predictive insights for use by climate scientists, policy makers and the community at large.

Page 5

Challenges in Mining Earth Science Data

Analysis and Discovery approaches need to be cognizant of climate and ecosystem data characteristics such as:

Spatio-temporal autocorrelation

Low-frequency variability

Long-range spatial dependence

Long memory temporal processes (teleconnections)

Nonlinear processes

Multi-scale nature

Non-Stationarity
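As a small illustration of the first characteristic in the list above, temporal autocorrelation can be measured directly on a single time series. The following is a minimal sketch in plain Python; the function name `autocorrelation` and the toy inputs are assumptions made for illustration and are not part of the original slides.

```python
from statistics import mean


def autocorrelation(series, lag):
    """Sample autocorrelation of a time series at the given non-negative lag.

    Values near 1 mean that a reading strongly resembles the reading `lag`
    steps earlier, which violates the independence assumption made by many
    standard data mining methods.
    """
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mu = mean(series)
    # Total variation around the mean (the lag-0 autocovariance times n).
    denom = sum((v - mu) ** 2 for v in series)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    # Co-variation between the series and a copy of itself shifted by `lag`.
    num = sum((series[t] - mu) * (series[t + lag] - mu) for t in range(n - lag))
    return num / denom
```

A smooth, slowly varying series gives a clearly positive lag-1 value, while a series that flips sign at every step gives a negative one.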

Page 6

Illustrative Application: Forest Cover Change

Changes in forests account for over 20% of the greenhouse gas emissions

2nd only to fossil fuel emissions

Terrestrial carbon can provide up to 25% of the climate change solution

Ability to monitor changes in global forest cover over space and time is critical for enabling inclusion of forests in carbon trading

⇒ The need for a scalable technological solution to assess the state of forest ecosystems and how they are changing has become increasingly urgent.

Deforestation moves large amounts of carbon into the atmosphere in the form of CO2.

Good to Go Green: SFO Unveils Carbon Offset Kiosks

'Carbon Offset' Business Takes Root, by Martin Kaste

Page 7

Illustrative Application: Finding Climate Indices

A climate index is a time series of sea surface temperature or sea level pressure

Climate indices capture teleconnections: the simultaneous variation in climate and related processes over widely separated points on the Earth

[Figure: Correlation Between ANOM 1+2 and Land Temp (>0.2); map of latitude vs. longitude, color scale -0.8 to 0.8]

[Figure: map of latitude vs. longitude, color scale 0 to 1]

[Figure labels: El Nino Events; Nino 1+2 Index]

Nino 1+2 Index: sea surface temperature anomalies in the region bounded by 80°W-90°W and 0°-10°S
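The box-based definition above (SST anomalies averaged over a fixed latitude/longitude region) can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the presenter's code: the function name `box_mean_anomaly`, the nested-list grid layout, and the use of each cell's own time mean as the baseline are all illustrative choices. Operational indices typically subtract a monthly climatology and weight cells by area, both of which are omitted here.

```python
from statistics import mean


def box_mean_anomaly(sst, lats, lons, lat_range, lon_range):
    """Average SST anomaly over a latitude/longitude box, one value per time step.

    sst[t][i][j] is the temperature at time t, latitude lats[i], longitude lons[j].
    Longitudes west of Greenwich are negative, so 80W-90W is (-90, -80).
    Assumption: the anomaly is taken against each cell's own time mean.
    """
    cells = [
        (i, j)
        for i, lat in enumerate(lats)
        if lat_range[0] <= lat <= lat_range[1]
        for j, lon in enumerate(lons)
        if lon_range[0] <= lon <= lon_range[1]
    ]
    if not cells:
        raise ValueError("no grid cells fall inside the requested box")
    # Long-term mean of every cell in the box, used as the anomaly baseline.
    baseline = {(i, j): mean(frame[i][j] for frame in sst) for i, j in cells}
    # One index value per time step: the unweighted mean anomaly over the box.
    return [
        mean(frame[i][j] - baseline[(i, j)] for i, j in cells)
        for frame in sst
    ]
```

For the region given on the slide, the call would use `lat_range=(-10, 0)` and `lon_range=(-90, -80)`. Correlating the resulting series with land temperature at each land grid point is what produces a map like the first figure on this slide.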

Page 8

Discovery of Climate Indices Using Clustering

[Figure: SST Clusters With Relatively High Correlation to Land Temperature; map of latitude vs. longitude; labeled clusters 29, 75, 78, 67, 94]

An alternative approach for finding candidate indices.

– Clusters represent ocean regions with relatively homogeneous behavior.

– The centroids of these clusters are time series that summarize the behavior of these ocean areas, and thus, represent potential climate indices.

– Clusters are found using the Shared Nearest Neighbor (SNN) method that eliminates “noise” points and tends to find regions of “uniform density”.

– Clusters are filtered to eliminate those with low impact on land points

Many SST clusters and SLP cluster pairs reproduce well-known climate indices

Clusters provide a better physical interpretation than indices based on the SVD/EOF paradigm, and provide candidate indices with better predictive power than known indices for some land areas.
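The two building blocks named above, shared-nearest-neighbor similarity and cluster centroids, can be sketched as follows. This is a simplified illustration and not the full SNN clustering algorithm: it omits the density estimation and noise-elimination steps, and the names `pearson`, `snn_similarity` and `centroid`, the use of correlation to rank neighbors, and the parameter `k` are assumptions made for this sketch.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length series (undefined if one is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def snn_similarity(series, k):
    """Shared-nearest-neighbor similarity between every pair of time series.

    Each series keeps its k most correlated neighbors. Two series get a
    similarity equal to the number of neighbors they share, but only if each
    appears in the other's neighbor list; otherwise the similarity is 0.
    """
    n = len(series)
    neighbors = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: pearson(series[i], series[j]), reverse=True)
        neighbors.append(set(others[:k]))
    sim = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if j in neighbors[i] and i in neighbors[j]:
                sim[i][j] = sim[j][i] = len(neighbors[i] & neighbors[j])
    return sim


def centroid(members):
    """Point-wise mean of the member series: the cluster's candidate index."""
    return [mean(values) for values in zip(*members)]
```

Once ocean points have been grouped, `centroid` turns each group into a single time series. Filtering then amounts to keeping only the centroids whose correlation with land temperature is high at enough land points.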

[Figure labels: DMI, SOI, NAO, AO]

Steinbach, M., Tan, P., Kumar, V., Klooster, S., and Potter, C. 2003. Discovery of climate indices using clustering. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (Washington, D.C., August 24-27, 2003). KDD '03. ACM, New York, NY, 446-455.

Page 9

Finding New Patterns: Indian Monsoon Dipole Mode Index

• Recently discovered Indian Ocean Dipole Mode index (DMI)*

• DMI is defined as the difference in SST anomaly between the region 5S-5N, 55E-75E and the region 0-10S, 85E-95E.

• DMI is an indicator of a weak monsoon over the Indian subcontinent and heavy rainfall over East Africa.

• The difference of SLP clusters 16 and 22 is a surrogate for the DMI index that is defined using SST.

* N. H. Saji, B. N. Goswami, P. N. Vinayachandran and T. Yamagata, “A dipole mode in the tropical Indian Ocean,” Nature 401, 360-363 (23 September 1999).

Plot of cluster 16 – cluster 22 versus the Indian Ocean Dipole Mode index. (Indices smoothed using 12 month moving average.)
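The two operations behind this slide, a difference between two regional series and a 12-month moving average, can be sketched as below. This is a minimal sketch, assuming the two inputs are already box-averaged monthly anomaly series (for the DMI, the 5S-5N, 55E-75E box and the 0-10S, 85E-95E box given on the slide, or cluster centroids 16 and 22 for the SLP surrogate). The function names `dipole_index` and `moving_average` are hypothetical.

```python
def dipole_index(first_region, second_region):
    """Element-wise difference between two regional anomaly series."""
    if len(first_region) != len(second_region):
        raise ValueError("both series must cover the same time steps")
    return [a - b for a, b in zip(first_region, second_region)]


def moving_average(series, window=12):
    """Trailing moving average; returns len(series) - window + 1 values."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    running = sum(series[:window])
    smoothed = [running / window]
    for t in range(window, len(series)):
        # Slide the window forward: add the newest value, drop the oldest.
        running += series[t] - series[t - window]
        smoothed.append(running / window)
    return smoothed
```

Smoothing both the SST-based index and the SLP-based surrogate with the same window before plotting them together keeps the comparison like for like.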

Page 10

Dynamic Climate Indices

• Most well-known indices are based on data collected at fixed land stations.

• NAO is computed as the normalized difference between SLP at a pair of land stations in the Arctic and the subtropical Atlantic regions of the North Atlantic Ocean.

• However, the underlying phenomenon may not occur at the exact location of the land station, e.g. NAO.

• Challenge: Given sensor readings for SLP at different points in the ocean, how can we identify clusters of low/high pressure points that may move in space and time?

Source: Portis et al, Seasonality of the NAO, AGU Chapman Conference, 2000.
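The fixed-station definition described above can be sketched as follows: each station's SLP series is standardized, and the two standardized series are subtracted. This is a minimal sketch with hypothetical function names. The sign convention (subtropical minus Arctic) follows common usage and is an assumption here, since the slide does not state the order, and standardizing over the whole record rather than per calendar month is a simplification.

```python
from statistics import mean, pstdev


def standardize(series):
    """Subtract the mean and divide by the (population) standard deviation."""
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        raise ValueError("cannot standardize a constant series")
    return [(v - mu) / sigma for v in series]


def station_index(subtropical_slp, arctic_slp):
    """Normalized SLP difference between two fixed stations.

    Assumption: the index is positive when pressure is unusually high at the
    subtropical station and unusually low at the Arctic station.
    """
    if len(subtropical_slp) != len(arctic_slp):
        raise ValueError("both stations must cover the same time steps")
    south = standardize(subtropical_slp)
    north = standardize(arctic_slp)
    return [s - n for s, n in zip(south, north)]
```

Because both inputs are tied to fixed coordinates, the index cannot follow pressure centers that shift with the seasons. The challenge posed on this slide is to replace those two fixed points with clusters that are re-identified over time.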

Page 11

Illustrative Application: Relationship Mining

Example of a non-random association pattern between FPAR-Hi and NPP-Hi events and the land locations where such a pattern is observed frequently. Left: Locations that support the association pattern {abnormally high FPAR => abnormally high NPP}. Right: Land locations that correspond to grassland and shrubland regions. The remarkable similarity between the two figures suggests that grasslands are vegetation that is able to more quickly take advantage of periodically high precipitation (and possibly solar radiation) than forests.
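To make the pattern {abnormally high FPAR => abnormally high NPP} concrete, the sketch below flags "Hi" events at one land location and measures how often the rule holds there. It is a minimal sketch under stated assumptions: the slides do not say how "abnormally high" is defined, so the z-score threshold `z=1.5` and the function names `high_events` and `rule_stats` are illustrative choices.

```python
from statistics import mean, pstdev


def high_events(series, z=1.5):
    """Flag time steps whose value lies more than z standard deviations above the mean.

    Assumption: this z-score rule stands in for "abnormally high".
    """
    mu = mean(series)
    sigma = pstdev(series)
    if sigma == 0:
        return [False] * len(series)
    return [(v - mu) / sigma > z for v in series]


def rule_stats(fpar, npp, z=1.5):
    """Support and confidence of {FPAR-Hi => NPP-Hi} at a single location.

    support    = fraction of time steps where both events occur together
    confidence = fraction of FPAR-Hi time steps that are also NPP-Hi
    """
    if len(fpar) != len(npp):
        raise ValueError("both series must cover the same time steps")
    fpar_hi = high_events(fpar, z)
    npp_hi = high_events(npp, z)
    n_fpar = sum(fpar_hi)
    n_both = sum(a and b for a, b in zip(fpar_hi, npp_hi))
    support = n_both / len(fpar)
    confidence = n_both / n_fpar if n_fpar else 0.0
    return support, confidence
```

Running `rule_stats` at every land grid point and keeping the locations whose support exceeds a chosen cutoff yields a map of the kind described for the left panel.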