Automated Short-term Prediction of Solar Flares using Machine Learning

Rami Qahwaji (r.s.r.qahwaji@bradford.ac.uk) & Tufan Colak (t.colak@bradford.ac.uk)

EIMC, University of Bradford, BD7 1DP, U.K.

http://spaceweather.inf.brad.ac.uk/

SIPWORKIII 08/09/06

Organisation of this talk

Objectives & related work
Solar data (features and activities)
Data Association
Machine learning algorithms
Practical results
Conclusions and future work

Objective:

We aim to design an automated system that could provide short-term prediction of solar flares by establishing a correlation between sunspots and solar flares using machine learning.

Related Work

Despite the recent advances in solar imaging, machine learning has not been widely applied to solar data, except for verification purposes.

Solar activity (i.e., the Wolf number) was first predicted by Calvo et al. (1995).

Borda et al. (2002) described a method for the automatic detection of solar flares using a back-propagation multi-layer perceptron (BP MLP).

MLP, SVM and RBF were used for flare detection in (Qu et al. 2003).

Data?

Data from the publicly available National Geophysical Data Centre (NGDC) sunspot groups and flares catalogues are used in our study.

NGDC keeps records of data from several observatories around the world and holds one of the most comprehensive publicly available databases of solar features and activities.

The NGDC sunspots catalogue

The NGDC sunspot catalogue holds records of sunspot groups supplying their date, time, location, physical properties, sunspot area and classification data.

Two classification systems exist for sunspots: McIntosh, which depends on the size, shape and spot density of sunspots, and Mt. Wilson, which is based on the distribution of magnetic polarities within spot groups.

The NGDC Flares catalogue

This catalogue provides information about dates, starting and ending times for flare eruptions, location, NOAA number of the corresponding active region and x-ray classification for the detected flares.

Not all the flares have associated NOAA numbers. Flares without NOAA numbers are not included in our study.

Data

Associating Flares and Sunspots

We have investigated all the sunspot groups that were associated with flares from 01 Jan 1992 to 31 Dec 2005.

The degree of association was determined from the NOAA region number and the timing information.

A C++ platform that extracts flare and sunspot information online from the NGDC catalogues was created.

Our software has analysed the data for 29343 flares and 110241 sunspots and has associated 1425 M and X flares with their corresponding sunspot groups.
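
The association by NOAA region number and timing can be sketched roughly as follows. This is only an illustration in Python, not the authors' C++ platform: the record types, the field names and the 24-hour window are all assumptions made for the example, and the talk does not state the actual timing rule.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class SunspotRecord:          # hypothetical record layout
    noaa: int                 # NOAA active region number
    observed: datetime        # observation time of the sunspot group
    mcintosh: str             # McIntosh class, e.g. "Fkc"


@dataclass
class FlareRecord:            # hypothetical record layout
    noaa: Optional[int]       # None when the catalogue gives no NOAA number
    start: datetime           # flare start time
    xray_class: str           # X-ray class, e.g. "M1.2" or "X1.7"


def is_major(flare: FlareRecord) -> bool:
    """True for M- and X-class flares, the ones associated in the talk."""
    return flare.xray_class[:1].upper() in ("M", "X")


def associate(flare: FlareRecord,
              sunspots: List[SunspotRecord],
              window: timedelta = timedelta(hours=24)) -> Optional[SunspotRecord]:
    """Return the most recent sunspot record with the same NOAA number
    observed at most `window` before the flare start (assumed rule)."""
    if flare.noaa is None:
        return None           # flares without NOAA numbers are excluded
    candidates = [
        s for s in sunspots
        if s.noaa == flare.noaa
        and timedelta(0) <= flare.start - s.observed <= window
    ]
    return max(candidates, key=lambda s: s.observed, default=None)
```

Choosing the latest earlier observation is one plausible reading of "timing information"; a different window or tie-breaking rule would slot into `associate` without changing the rest.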

Associating Flares and Sunspots

Flare Prediction

The Theoretical Model

Various neural network topologies, support vector machines (SVM) and Radial Basis Function Networks (RBFN) are optimized and compared.

In our previous work (Qahwaji & Colak, CITSA 2006 and Colak & Qahwaji, WSC11), the performance of several NN topologies (e.g., Elman BP, FFBP and cascade FFBP) was compared, and it was concluded that CCNN provides a better association between solar flares and sunspot classes.

CCNN and RBFN are used because of their efficient performance in classification and time-series prediction (Frank et al. 1997).

SVM vs NN?

Comparing the performance of SVMs and NNs is a recent trend in machine learning.

The work reported in (Acir & Guzelis 2004), (Pal & Mather 2004), (Huang et al. 2004), and (Distante et al. 2003) supports this.

Similar performance for SVMs was reported for flare detection in (Qu et al. 2003).

Cascade FFBP

In cascade FFBP, the first layer has weights connecting it to the input layer. Each subsequent layer has weights connecting it to the input layer and to all previous layers.
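
The cascade connectivity can be illustrated with a small forward pass. This is a minimal sketch in plain Python, assuming tanh activations and random initial weights; the function names and the layer sizes in the usage note are illustrative, and no training step is shown.

```python
import math
import random


def make_cascade_net(n_in, layer_sizes, seed=0):
    """Build random weights for a cascade feed-forward network.

    Each neuron holds one weight per value it can see (the inputs plus the
    outputs of every earlier layer) followed by a bias term.
    """
    rng = random.Random(seed)
    layers = []
    fan_in = n_in
    for size in layer_sizes:
        layers.append([[rng.uniform(-1.0, 1.0) for _ in range(fan_in + 1)]
                       for _ in range(size)])
        fan_in += size  # the next layer also sees this layer's outputs
    return layers


def forward(layers, x):
    """Propagate input x; return the outputs of the last layer."""
    seen = list(x)  # inputs plus outputs of all layers computed so far
    out = []
    for layer in layers:
        # zip stops at len(seen), so row[-1] is left over as the bias
        out = [math.tanh(sum(w * v for w, v in zip(row, seen)) + row[-1])
               for row in layer]
        seen.extend(out)
    return out
```

For example, `make_cascade_net(4, [6, 4, 2])` gives neurons with 5, 11 and 15 weights in successive layers, which shows how each layer's fan-in grows with everything before it.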

SVM (Support Vector Machines)

An SVM finds the separating hyperplane that maximises the distance (the margin) between the hyperplane and the closest vectors of both classes.
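
The quantity being maximised can be made concrete with a short calculation. The sketch below, which assumes labels of +1 and -1 and a hyperplane written as w.x + b = 0, only measures the margin of a given hyperplane; it does not solve the SVM optimisation itself.

```python
import math


def geometric_margin(w, b, points, labels):
    """Smallest signed distance from any point to the hyperplane w.x + b = 0.

    A positive result means every point lies on the side matching its
    label (+1 or -1); an SVM looks for the w and b that make it largest.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    )
```

With one point of each class at (0, 0) and (2, 0), the hyperplane x = 1 has a larger margin than x = 0.5, so it is the one a maximum-margin classifier would prefer.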

Radial Basis Function Networks (RBFN)

Optimising the Learning Algorithms

A learning algorithm generalises best when it is optimised.

A NN is optimised when the optimum topology, learning algorithm and learning times have been found.

After finding that CCNN provides the best performance, we compared 100 different CCNN topologies.

We found that a CCNN with 6 hidden nodes in the first layer and 4 hidden nodes in the second layer gives the best results for CFP and CFTP.

Similar approaches were followed for SVM and RBFN.

Both NGDC catalogues were used. Our software analysed the data for 29343 flares and 110241 sunspots and associated 1425 M and X flares with their corresponding sunspot groups.

The total number of samples used for our training set is 2882, of which 1425 represent sunspots that produced flares.

The remaining 1457 samples represent sunspots that existed on non-flaring days and are unrelated to any of the sunspot groups in the flaring samples.

The Training and Testing Sets

NN training and testing were carried out using the statistical Jack-knife technique (Fukunaga 1990).

In every experiment, 80% of the samples are randomly selected for training and the remaining 20% are used for testing. Each experiment is repeated a number of times and the results are averaged.
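
The repeated random 80/20 split can be sketched as below. The `evaluate` callback is a hypothetical stand-in for training a learner on the first list and scoring it on the second; the number of repeats and the seed are arbitrary choices for the example, since the talk does not state them.

```python
import random


def repeated_holdout(samples, evaluate, repeats=10, train_fraction=0.8, seed=0):
    """Average evaluate(train, test) over several random splits.

    `evaluate` is assumed to train on `train` and return a score
    (for example the prediction rate) measured on `test`.
    """
    rng = random.Random(seed)
    n_train = int(len(samples) * train_fraction)
    scores = []
    for _ in range(repeats):
        shuffled = list(samples)
        rng.shuffle(shuffled)  # a fresh random split on every repeat
        scores.append(evaluate(shuffled[:n_train], shuffled[n_train:]))
    return sum(scores) / len(scores)
```

With the 2882 samples mentioned in the talk, each repeat would train on 2305 samples and test on the remaining 577.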

Initial Experiments

For each sample, the training vector consists of 5 elements (3 inputs and 2 outputs).

Initial Experiments

Several experiments based on the Jack-knife technique were carried out and we found that the prediction rate for flares in the best case scenario was 72.9%.

This indicates that a correlation exists between the input and output sets, but the rate is not high enough to provide reliable prediction of solar activity.

To improve the learning performance we tried to associate the classified sunspots with the sunspot cycle.

This seemed logical because the rise and fall of solar activity coincides with the sunspot cycle (Pap et al. 1990).

When the solar cycle is at a maximum, plenty of large active regions exist and many solar flares are detected. Both decrease in number as the Sun approaches the minimum of its cycle (Pap et al. 1990).

Solar Cycle and Flares

Source: Science@NASA, "Solar Minimum Explodes", 15 September 2005

Solar Cycle Modelling-Hathaway’s Model

a represents the amplitude and is related to the rate of rise from the cycle minimum; b is related to the time in months from minimum to maximum; c gives the asymmetry of the cycle; and t0 denotes the starting time.
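
These four parameters match the cycle-shape function published by Hathaway, Wilson & Reichmann (1994), f(t) = a (t - t0)^3 / (exp((t - t0)^2 / b^2) - c). Assuming that is the form shown on the slide, a minimal sketch of it follows; the parameter values in the usage note are hypothetical and chosen only to show the rise and decay of a cycle.

```python
import math


def hathaway_cycle(t, a, b, c, t0):
    """Modelled sunspot number at time t (in months) for a cycle starting at t0.

    Assumes c < 1 so the denominator stays positive; very large values of
    (t - t0) / b would overflow math.exp and are not handled here.
    """
    dt = t - t0
    if dt <= 0:
        return 0.0  # before the cycle starts
    return a * dt ** 3 / (math.exp((dt / b) ** 2) - c)
```

With the illustrative values a = 0.003, b = 50 and c = 0.8, the curve rises steeply after t0, peaks a few years later and then decays slowly, which is the asymmetric shape the parameter c controls.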

With the solar cycle added as an input, the training vector for each sample consists of 6 elements (4 inputs and 2 outputs).

Hence, for an Fkc sunspot group at solar maximum that produced an M flare, the training vector looks like this:
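
The numerical vector from the slide is not reproduced in this transcript. Purely as an illustration of how a 4-input, 2-output vector could be built, the sketch below assumes the inputs are the three McIntosh letters plus a normalised solar-cycle level, and the outputs are "flare occurred" and "flare was X-class". The ordinal scaling, the function names and the output layout are all hypothetical and are not taken from the authors' software.

```python
# McIntosh components in their conventional order (assumed ordinal scale)
ZURICH = "ABCDEFH"      # modified Zurich class
PENUMBRA = "xrsahk"     # penumbra of the largest spot
COMPACTNESS = "xoic"    # spot distribution within the group


def ordinal(letter, alphabet):
    """Map a class letter to a value in (0, 1] by its position."""
    return (alphabet.index(letter) + 1) / len(alphabet)


def training_vector(mcintosh, cycle_level, flare_class):
    """Return [z, p, c, cycle, flare?, x_class?] for one sample.

    mcintosh    -- three-letter class such as "Fkc"
    cycle_level -- solar-cycle input already scaled to [0, 1] (assumed)
    flare_class -- "M", "X" or None for a non-flaring sample
    """
    z, p, c = mcintosh[0].upper(), mcintosh[1].lower(), mcintosh[2].lower()
    inputs = [ordinal(z, ZURICH), ordinal(p, PENUMBRA),
              ordinal(c, COMPACTNESS), cycle_level]
    flared = flare_class is not None
    is_x = flared and flare_class.upper().startswith("X")
    return inputs + [1.0 if flared else 0.0, 1.0 if is_x else 0.0]
```

Under these assumptions, `training_vector("Fkc", 1.0, "M")` yields four inputs close to their maximum values followed by the outputs 1.0 and 0.0.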

Conclusions

A fully automated computer platform that uses machine learning to verify the correlation between sunspot classes and solar flares has been designed.

The association and learning software will be made public shortly at http://spaceweather.inf.bradford.ac.uk/

Our findings show that there is a direct relation between the eruption of flares and certain McIntosh classes of sunspots, such as Ekc, Fki and Fkc. Our findings are in accordance with (McIntosh 1990), (Warwick 1966), and (Sakurai 1970).

X-ray Flares versus McIntosh Classification
[Bar chart: percentage of M-class and X-class flares (axis 0 to 25) for each McIntosh class: CAI, CAO, CHO, CKO, CRO, CSI, CSO, DAC, DAI, DAO, DHC, DHI, DHO, DKC, DKI, DKO, DSI, DSO, EAC, EAI, EAO, EHC, EHO, EKC, EKI, EKO, ESI, ESO, FAC, FAI, FAO, FHC, FHI, FHO, FKC, FKI, FKO, FSO]

X-ray Flares versus Mt. Wilson Classification
[Bar chart: percentage of M-class and X-class flares (axis 0 to 50) for each Mt. Wilson class: B, BD, BG, BGD, G, GD]

A hybrid system, which combines both SVM and CCNN, will give better results for flare prediction.

Future Work

Apply image segmentation and classification algorithms to detect sunspots and classify them automatically, so that the platform is completed.

Track individual sunspot groups over their lifetime. The development of a sunspot group can contribute to the knowledge of the machine learning systems.

Will better prediction be achieved if the magnetic configuration of sunspots (Mt. Wilson classification) is combined with the sunspot area to replace the McIntosh classification (Sammis, Tang & Zirin, 2000, ApJ)?

Compare our findings with those of other authors who tested the correlation of the various McIntosh classes with flare rates and its application to solar flare prediction (e.g. McIntosh 1990; Bornmann & Shaw 1994, Sol. Phys. 150, p. 127; Gallagher et al. 2002, Sol. Phys. 209, p. 171; Wheatland 2004, ApJ 609, p. 1134).

Acknowledgment. This work is supported by an EPSRC Grant (GR/T17588/01), which is entitled “Image Processing and Machine Learning Techniques for Short-Term Prediction of Solar Activity”.