Estimation of Pareto Distribution Functions from Samples Contaminated by Measurement Errors

Presenter: Lwando Kondlo
Supervisor: Prof. C. Koen

SKA Postgrad Bursary Conference
December 5, 2009

The model for a variable X measured with error is Y = X + ε, where Y is the observed value and ε is the measurement error.

Estimation of the density/distribution function of X is often important.

This is a classical deconvolution problem. The specific case where X has a Pareto form is discussed.

Pareto distribution – a model for positive data. Examples include:

Distribution of income and wealth among individuals

Masses of molecular clouds, etc.

The Finite-Support Pareto distribution (FSPD) is g(x) = a L^a x^(-a-1) / [1 - (L/U)^a] for L ≤ x ≤ U, and zero otherwise.
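A minimal Python sketch of the FSPD may help make the definition concrete. It assumes the standard truncated power-law form on [L, U] with exponent a (the formula is spelled out in the docstrings); the function names `fspd_pdf`, `fspd_cdf` and `fspd_sample` are hypothetical and not taken from the presentation.

```python
import random

def fspd_pdf(x, L, U, a):
    """Assumed FSPD density: a * L**a * x**(-a-1) / (1 - (L/U)**a) on [L, U], else 0."""
    if x < L or x > U:
        return 0.0
    return a * L**a * x ** (-a - 1.0) / (1.0 - (L / U) ** a)

def fspd_cdf(x, L, U, a):
    """Distribution function obtained by integrating the density from L to x."""
    if x <= L:
        return 0.0
    if x >= U:
        return 1.0
    return (1.0 - (L / x) ** a) / (1.0 - (L / U) ** a)

def fspd_sample(n, L, U, a, rng):
    """Draw n values by inverting the distribution function at uniform variates."""
    c = 1.0 - (L / U) ** a
    return [L * (1.0 - rng.random() * c) ** (-1.0 / a) for _ in range(n)]

rng = random.Random(1)  # fixed seed so that the run is repeatable
draws = fspd_sample(5, 3.0, 6.0, 1.5, rng)
print([round(x, 3) for x in draws])
```

Inverse-CDF sampling is used here only because the distribution function has a closed-form inverse; any other sampler for the same density would serve equally well.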

Distributional parameters are estimated by fitting the FSPD to a set of data.

This is not appropriate if the data are contaminated by measurement errors.

To develop a methodology for deconvolution when X is known to be of Pareto form.

To apply the methodology to real (radio astronomical) data.

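If X has the PDF g(.) and the measurement error ε has the PDF h(.), then Y has the PDF f(y) = ∫ g(x) h(y - x) dx.

When g is the FSPD, this gives the convolved PDF (CPDF): f(y) = ∫ g(x) h(y - x) dx, with the integral taken over L ≤ x ≤ U.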
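The convolution integral generally has to be evaluated numerically. The sketch below does this with composite Simpson's rule, under two assumptions that are not stated on the slides: the FSPD has the standard truncated power-law form, and the errors are Gaussian with mean zero and standard deviation σ. The names `normal_pdf` and `cpdf` are hypothetical.

```python
import math

def normal_pdf(e, sigma):
    """Density of a zero-mean Gaussian error (assumed error model)."""
    return math.exp(-0.5 * (e / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def cpdf(y, L, U, a, sigma, steps=400):
    """Convolved PDF at y: integral over [L, U] of g(x) * h(y - x) dx.

    Uses composite Simpson's rule; the grid spacing should stay well
    below sigma for the quadrature to remain accurate.
    """
    steps += steps % 2                       # Simpson's rule needs an even count
    norm = a * L**a / (1.0 - (L / U) ** a)   # FSPD normalising constant
    dx = (U - L) / steps
    total = 0.0
    for i in range(steps + 1):
        x = L + i * dx
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * x ** (-a - 1.0) * normal_pdf(y - x, sigma)
    return norm * total * dx / 3.0

# The CPDF is positive outside [L, U], where the FSPD itself is zero.
for y in (2.5, 3.0, 4.5, 6.0, 6.5):
    print(y, round(cpdf(y, 3.0, 6.0, 1.5, 0.4), 4))
```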

The CPDF could differ substantially from the FSPD.

Probability-Probability (P-P) plots, which compare the observed and theoretical distribution functions, can be used to assess the fit.
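As an illustration of how P-P plot coordinates might be computed, the sketch below pairs empirical probabilities with theoretical distribution-function values for a sorted sample. The plotting position (i + 0.5)/n is one common convention and is an assumption here, as are the helper names and the truncated power-law form of the FSPD.

```python
import random

def fspd_cdf(x, L, U, a):
    """Assumed FSPD distribution function on [L, U]."""
    if x <= L:
        return 0.0
    if x >= U:
        return 1.0
    return (1.0 - (L / x) ** a) / (1.0 - (L / U) ** a)

def fspd_sample(n, L, U, a, rng):
    """Inverse-CDF sampling from the FSPD."""
    c = 1.0 - (L / U) ** a
    return [L * (1.0 - rng.random() * c) ** (-1.0 / a) for _ in range(n)]

def pp_points(sample, cdf):
    """Return (empirical, theoretical) probability pairs for a P-P plot."""
    ordered = sorted(sample)
    n = len(ordered)
    return [((i + 0.5) / n, cdf(y)) for i, y in enumerate(ordered)]

rng = random.Random(2)  # fixed seed for repeatability
sample = fspd_sample(500, 3.0, 6.0, 1.5, rng)
points = pp_points(sample, lambda y: fspd_cdf(y, 3.0, 6.0, 1.5))
max_dev = max(abs(p - q) for p, q in points)
print(f"largest departure from the diagonal: {max_dev:.3f}")
```

Points lying close to the diagonal indicate that the theoretical distribution describes the sample well; a systematic bow away from it indicates a poor fit.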

Simulated data with the following parameter values are used:

L    U    a     σ
3    6    1.5   0.4

1. The contaminated data extend beyond the interval [L,U] over which the error-free data occur

2. The shape of the distribution is changed

◦ Fitting the FSPD directly to such data will lead to biased estimates of L, U and the power-law exponent a.
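The first effect can be reproduced with a short simulation using the parameter values L = 3, U = 6, a = 1.5 and σ = 0.4. This is a sketch only: it assumes zero-mean Gaussian errors and the standard truncated power-law form of the FSPD, neither of which is spelled out on the slides, and the helper names are hypothetical.

```python
import random

def fspd_sample(n, L, U, a, rng):
    """Inverse-CDF sampling from the (assumed) FSPD on [L, U]."""
    c = 1.0 - (L / U) ** a
    return [L * (1.0 - rng.random() * c) ** (-1.0 / a) for _ in range(n)]

def contaminate(values, sigma, rng):
    """Add independent zero-mean Gaussian errors (assumed error model)."""
    return [v + rng.gauss(0.0, sigma) for v in values]

L, U, a, sigma = 3.0, 6.0, 1.5, 0.4
rng = random.Random(3)  # fixed seed for repeatability
x = fspd_sample(1000, L, U, a, rng)
y = contaminate(x, sigma, rng)

outside = sum(1 for v in y if v < L or v > U) / len(y)
print(f"error-free range:   [{min(x):.2f}, {max(x):.2f}]")
print(f"contaminated range: [{min(y):.2f}, {max(y):.2f}]")
print(f"fraction of contaminated values outside [L, U]: {outside:.3f}")
```

Because the sample minimum and maximum of the contaminated data fall outside [L, U], estimators of L and U that rely on the observed range are pulled outwards.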

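Estimation is based on maximising the likelihood (or log-likelihood) of the observed data given the model.

Log-likelihood of the CPDF for observations y_1, ..., y_n: ℓ(L, U, a, σ) = Σ log f(y_i; L, U, a, σ), summed over i = 1, ..., n.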
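The log-likelihood can be evaluated numerically by summing the logarithm of the convolved density over the observations. The sketch below assumes Gaussian errors and the standard truncated power-law FSPD; `cpdf`, `cpdf_loglik` and `simulate` are hypothetical names. It only evaluates the objective: in practice the negative log-likelihood would be handed to a general-purpose numerical optimiser, which is not shown here.

```python
import math
import random

def normal_pdf(e, sigma):
    """Zero-mean Gaussian error density (assumed error model)."""
    return math.exp(-0.5 * (e / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def cpdf(y, L, U, a, sigma, steps=200):
    """Convolved PDF at y via composite Simpson's rule over [L, U]."""
    steps += steps % 2
    norm = a * L**a / (1.0 - (L / U) ** a)
    dx = (U - L) / steps
    total = 0.0
    for i in range(steps + 1):
        x = L + i * dx
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * x ** (-a - 1.0) * normal_pdf(y - x, sigma)
    return norm * total * dx / 3.0

def cpdf_loglik(data, L, U, a, sigma):
    """Sum of log f(y_i); a tiny floor guards against log(0) far in the tails."""
    return sum(math.log(max(cpdf(y, L, U, a, sigma), 1e-300)) for y in data)

def simulate(n, L, U, a, sigma, rng):
    """FSPD draws (inverse-CDF method) plus Gaussian measurement error."""
    c = 1.0 - (L / U) ** a
    return [L * (1.0 - rng.random() * c) ** (-1.0 / a) + rng.gauss(0.0, sigma)
            for _ in range(n)]

rng = random.Random(4)  # fixed seed for repeatability
data = simulate(300, 3.0, 6.0, 1.5, 0.4, rng)
print("log-likelihood at the simulation parameters:", cpdf_loglik(data, 3.0, 6.0, 1.5, 0.4))
print("log-likelihood at L=2, U=8:                 ", cpdf_loglik(data, 2.0, 8.0, 1.5, 0.4))
```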

Application to the data in the histogram leads to the estimates below.

N.B.: The CPDF fitted to the data with errors gives MLEs close to the true parameter values 3, 6 and 1.5.

        L       U       a
FSPD    2.267   7.124   1.186
CPDF    3.047   6.028   1.445

The methodology is illustrated by fitting the CPDF to a sample of giant molecular cloud masses in the galaxy M33 (Engargiola et al., 2003).

        L      U      a      σ
MLE     6.9    77.7   1.33   3.47
s.e.    0.65   4.97   0.26   0.55

Masses are in units of solar masses.

The estimates agree well with those of Engargiola et al. (2003), in particular with their a = 1.6 ± 0.3.

The linear form of the P-P plot indicates that the estimated distribution fits the sample of giant molecular clouds very well.

Deconvolution is a useful statistical method for recovering an unknown distribution of X in the presence of errors.

A methodology for deconvolution when X is known to be of Pareto form has been developed.

The MLE method gave satisfactory results.

The price paid is that the analysis is more complicated.

Everyone contributed to the work presented.

1. Prof. C. Koen (Supervisor)
2. Funding: SKA SA (Kim, Anna and Daphne)
3. University of the Western Cape (Leslie and Rennet)
