Estimation of Pareto Distribution Functions from Samples Contaminated by Measurement Errors

Presenter: Lwando Kondlo Supervisor: Prof. C. Koen SKA Postgrad Bursary Conference December 5, 2009



The model for a variable X measured with error is Y = X + ε, where Y is the observed value and ε is the measurement error.

Estimation of the density/distribution function of X is often important.

This is a classical deconvolution problem. The specific case where X has a Pareto form is discussed.


The Pareto distribution is a model for positive data. Examples include:

◦ The distribution of income and wealth among individuals

◦ The masses of molecular clouds


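The Finite-Support Pareto distribution (FSPD) is g(x) = a L^a x^(−a−1) / [1 − (L/U)^a] for L ≤ x ≤ U (and zero otherwise), with lower limit L, upper limit U and power-law exponent a.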
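To make the FSPD concrete, here is a minimal sketch of its density, distribution function and an inverse-CDF sampler. It assumes the standard truncated-Pareto form g(x) = a L^a x^(−a−1) / [1 − (L/U)^a] on [L, U]; the function names are illustrative and are not taken from the presentation.

```python
import random

def fspd_pdf(x, L, U, a):
    """Density of a Pareto law truncated to [L, U] (assumed form)."""
    if x < L or x > U:
        return 0.0
    return a * L**a * x**(-a - 1) / (1.0 - (L / U)**a)

def fspd_cdf(x, L, U, a):
    """Distribution function G(x) of the truncated Pareto law."""
    if x <= L:
        return 0.0
    if x >= U:
        return 1.0
    return (1.0 - (L / x)**a) / (1.0 - (L / U)**a)

def fspd_sample(n, L, U, a, rng):
    """Draw n values by solving G(x) = u for uniform u in [0, 1)."""
    c = 1.0 - (L / U)**a
    return [L * (1.0 - rng.random() * c)**(-1.0 / a) for _ in range(n)]

if __name__ == "__main__":
    rng = random.Random(0)
    draws = fspd_sample(5, 3.0, 6.0, 1.5, rng)
    print([round(x, 3) for x in draws])
```

All draws fall inside [L, U], which is the property that measurement errors later destroy.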


Distributional parameters are estimated by fitting the FSPD to a set of data.

This is not appropriate if the data are contaminated by measurement errors.


To develop a methodology for deconvolution when X is known to be of Pareto form.

To apply the methodology to real (radio astronomical) data.


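If X has the PDF g(·) and the measurement error ε = Y − X has the PDF h(·), then, with X and ε independent, Y has the PDF f(y) = ∫ g(x) h(y − x) dx.

With g taken to be the FSPD, f is the convolved PDF (CPDF).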
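The convolution can be evaluated numerically. The sketch below assumes a zero-mean Gaussian error density h with standard deviation σ (suggested by the σ parameter used in the later slides, but not stated explicitly in this transcript) and the standard truncated-Pareto form for g. It integrates g(x) h(y − x) over [L, U] with Simpson's rule. The function names are illustrative.

```python
import math

def normal_pdf(e, sigma):
    """Zero-mean Gaussian error density (an assumption for h)."""
    return math.exp(-0.5 * (e / sigma)**2) / (sigma * math.sqrt(2.0 * math.pi))

def cpdf(y, L, U, a, sigma, n=400):
    """Convolved density f(y): Simpson's rule over x in [L, U], n even."""
    norm = 1.0 - (L / U)**a
    step = (U - L) / n
    total = 0.0
    for i in range(n + 1):
        x = L + i * step
        g = a * L**a * x**(-a - 1) / norm      # truncated-Pareto density at x
        w = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += w * g * normal_pdf(y - x, sigma)
    return total * step / 3.0

if __name__ == "__main__":
    for y in (2.5, 3.0, 4.5, 6.0, 6.5):
        print(y, round(cpdf(y, 3.0, 6.0, 1.5, 0.4), 4))
```

Unlike the FSPD, this density is positive below L and above U, so it can account for observations outside the error-free support.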


The CPDF can differ substantially from the FSPD.

Probability-probability (P-P) plots, which compare the observed and theoretical distribution functions, can be used to assess the fit.
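A P-P plot pairs the empirical probability of each sorted observation with the probability the fitted model assigns to it. Points near the diagonal indicate a good fit. The sketch below is a rough illustration: it assumes the truncated-Pareto CDF as the theoretical model and Gaussian measurement errors, uses the plotting positions (i + 0.5)/n, and reports the largest vertical departure from the diagonal instead of drawing the plot. All names are hypothetical.

```python
import random

def fspd_cdf(x, L, U, a):
    """Truncated-Pareto distribution function (assumed form)."""
    if x <= L:
        return 0.0
    if x >= U:
        return 1.0
    return (1.0 - (L / x)**a) / (1.0 - (L / U)**a)

def pp_points(sample, cdf):
    """Return (empirical, theoretical) probability pairs for a P-P plot."""
    ys = sorted(sample)
    n = len(ys)
    return [((i + 0.5) / n, cdf(y)) for i, y in enumerate(ys)]

def max_departure(points):
    """Largest distance of the P-P points from the diagonal."""
    return max(abs(emp - theo) for emp, theo in points)

L, U, a, sigma = 3.0, 6.0, 1.5, 0.4
rng = random.Random(1)
c = 1.0 - (L / U)**a
clean = [L * (1.0 - rng.random() * c)**(-1.0 / a) for _ in range(2000)]
noisy = [x + rng.gauss(0.0, sigma) for x in clean]   # Gaussian errors assumed

model = lambda y: fspd_cdf(y, L, U, a)
dev_clean = max_departure(pp_points(clean, model))
dev_noisy = max_departure(pp_points(noisy, model))
print(f"error-free data: {dev_clean:.3f}, contaminated data: {dev_noisy:.3f}")
```

The contaminated sample departs visibly from the diagonal when compared against the FSPD, which is the kind of mismatch a P-P plot is meant to reveal.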


Simulated data with L = 3, U = 6, a = 1.5 and σ = 0.4 are used.
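A simulation of this kind might be sketched as follows. The sketch assumes the additive model Y = X + ε with X drawn from the truncated Pareto law by inverse-CDF sampling and ε drawn from a zero-mean Gaussian with standard deviation σ. The sample size and seed are arbitrary choices, not values from the presentation.

```python
import random

def simulate_contaminated(n, L, U, a, sigma, seed=0):
    """Return error-free draws xs and contaminated draws ys = xs + noise."""
    rng = random.Random(seed)
    c = 1.0 - (L / U)**a
    xs = [L * (1.0 - rng.random() * c)**(-1.0 / a) for _ in range(n)]
    ys = [x + rng.gauss(0.0, sigma) for x in xs]   # Gaussian errors assumed
    return xs, ys

L, U, a, sigma = 3.0, 6.0, 1.5, 0.4
xs, ys = simulate_contaminated(5000, L, U, a, sigma)
outside = sum(1 for y in ys if y < L or y > U) / len(ys)

print(f"error-free range:   [{min(xs):.2f}, {max(xs):.2f}]")
print(f"contaminated range: [{min(ys):.2f}, {max(ys):.2f}]")
print(f"fraction of contaminated values outside [L, U]: {outside:.3f}")
```

Under these assumptions a noticeable share of the contaminated values falls outside [L, U], mostly below L where the Pareto density is highest.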


1. The contaminated data extend beyond the interval [L, U] over which the error-free data occur.

2. The shape of the distribution is changed.

◦ Ignoring the errors will therefore lead to biased estimates of L, U and the power-law exponent a.


Estimation is based on maximising the likelihood (or log-likelihood) of the observed data given the model.

Log-likelihood of the CPDF f for observations y_1, ..., y_n: ℓ(L, U, a, σ) = Σ_i log f(y_i; L, U, a, σ).
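The log-likelihood is the sum of log f(y_i) over the observations, with f the convolved density. The sketch below evaluates it for simulated data under two assumptions not confirmed by this transcript: Gaussian measurement errors, and a Simpson's-rule approximation of the convolution integral. It compares the true parameters with a nearly error-free model (very small σ) to show how strongly the likelihood penalises ignoring the errors. It does not perform the maximisation itself.

```python
import math
import random

def cpdf(y, L, U, a, sigma, n=200):
    """Convolved density via Simpson's rule over [L, U] (Gaussian errors assumed)."""
    norm = 1.0 - (L / U)**a
    step = (U - L) / n
    total = 0.0
    for i in range(n + 1):
        x = L + i * step
        g = a * L**a * x**(-a - 1) / norm
        h = math.exp(-0.5 * ((y - x) / sigma)**2) / (sigma * math.sqrt(2.0 * math.pi))
        w = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += w * g * h
    return total * step / 3.0

def log_likelihood(ys, L, U, a, sigma):
    """Sum of log f(y_i); a tiny floor guards against log(0) from underflow."""
    return sum(math.log(max(cpdf(y, L, U, a, sigma), 1e-300)) for y in ys)

# Simulated contaminated sample (seed and size are arbitrary).
rng = random.Random(2)
L, U, a, sigma = 3.0, 6.0, 1.5, 0.4
c = 1.0 - (L / U)**a
ys = [L * (1.0 - rng.random() * c)**(-1.0 / a) + rng.gauss(0.0, sigma)
      for _ in range(300)]

ll_true = log_likelihood(ys, L, U, a, sigma)
ll_small = log_likelihood(ys, L, U, a, 0.05)   # almost ignores the errors
print(f"log-likelihood at true parameters: {ll_true:.1f}")
print(f"log-likelihood with sigma = 0.05:  {ll_small:.1f}")
```

In practice the four parameters would be found by passing the negative of `log_likelihood` to a numerical optimiser. A general-purpose routine such as `scipy.optimize.minimize` is one option, though the presentation does not say which was used.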


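Application to the simulated data in the histogram leads to the following maximum likelihood estimates (MLEs):

        L      U      a
FSPD    2.267  7.124  1.186
CPDF    3.047  6.028  1.445

N.B.: The CPDF fitted to the data with errors gives MLEs close to the true parameter values (L = 3, U = 6 and a = 1.5), whereas the FSPD estimates are biased.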


The methodology is illustrated by fitting the CPDF to a sample of giant molecular cloud masses in the galaxy M33 (Engargiola et al., 2003).


        L     U     a     σ
MLE     6.9   77.7  1.33  3.47
s.e.    0.65  4.97  0.26  0.55

The unit of mass is the solar mass.

The estimates agree well with those of Engargiola et al. (2003), in particular with their value a = 1.6 ± 0.3.


The linear form of the P-P plot indicates that the estimated distribution fits the sample of giant molecular clouds very well.


Deconvolution is a useful statistical method for recovering an unknown distribution of X in the presence of errors.

A methodology for deconvolution when X is known to be of Pareto form has been developed.

Satisfactory results were obtained with the maximum likelihood method.

The price paid is that the analysis is more complicated.


Everyone contributed to the work presented.

1. Prof. C. Koen (Supervisor)

2. Funding: SKA SA (Kim, Anna and Daphne)

3. University of the Western Cape (Leslie and Rennet)
