

Page 1: Process Monitoring of Bivariate Poisson Data Sotiris Bersimis, Department of Statistics and Insurance Science, University of Piraeus, Piraeus, Greece,

Process Monitoring of Bivariate Poisson Data

Sotiris Bersimis, Department of Statistics and Insurance Science, University of Piraeus, Piraeus, Greece, and

Petros E. Maravelakis, Department of Statistics and Actuarial-Financial Mathematics, University of the Aegean, Samos, Greece.

A Problem Oriented Solution


The Problem Is Related to the Food Industry

• Greek industry is composed of 23 sectors and the most important of them is the food and drink sector.

• This sector represents about 21% of Greek manufacturing industry, includes more than 1,300 enterprises and creates 70,000 jobs.

• In 2002, Greece’s food sector ranked second in the European Union (out of 15 countries) in terms of growth, with a growth rate of 3.3% (in that period Spain held first place).

• Within the food sector, first place is taken by dairy products, which hold 24%.

• The 5 main sectors of the Greek dairy industry are: milk, yogurt, cheese, ice cream, and cream and butter.


Contribution of Each Food Sector in the Greek Food Industry


In the Dairy Industry and Especially in Milk Production

• The dairy industry has shown great signs of improvement in the last 10 years, mainly because of the high nutritional value of dairy products and their close relationship with the Greek diet (although it is now also caught in the economic crisis).

• From the preceding discussion it is clear that the dairy industry is of great importance for the Greek economy, while milk is of great importance for the Greek diet.

• Among the different categories, Greeks prefer fresh milk, which holds 47.4% of the total share.

• At the same time, all companies invest considerable money in research and development and in installing units to gather and process fresh milk of high quality and safety.


A Closer Look at the Problem

• In fresh milk, as in many food processing operations, product safety is controlled by checking only the final product with microbiological and chemical methods (Tokatli et al., 2005).

• A major drawback of this approach is the time delay. Collecting and examining the samples to determine the safety of the product takes too much time (the results of the microbiological analysis are completed only after the product is released to the market).

• Another drawback is that it can be a high-cost solution if any contamination is reported after production is completed. Furthermore, recalling the defective product and collecting it from retail outlets adds significant extra cost.


A Closer Look at the Problem

• Thus, it is clear that new process monitoring techniques are needed for this type of problem.

• The significance of new process monitoring techniques for this type of problem arises from the fact that these cases are related to public health, since many diseases are associated with low-quality milk (or similar food products):

• Leptospirosis

• Cowpox

• Tuberculosis

• Brucellosis

• Listeria

• Johne's Disease


The Exact Problem

• A milk pasteurization plant. A continuous pasteurization line.

• Our focus was concentrated on the time interval after pasteurization is completed and before the product is released to the consumer, since any pathogenic biological factor contained in the raw material is removed by the pasteurization process.

• What happens, then, if a pathogenic biological factor appears in the part of production after the pasteurization of the product?

• As we already said, microbiological methods are applied to the final product to ensure that the milk is safe for consumption.

• But we also said that there is a time delay and that usually the exact results of the microbiological analysis are obtained after the product is released to the market.


The Exact Problem

• Thus, we need a monitoring procedure!

• But what are we going to monitor?

• Usually, in that case, quality control departments monitor the percentage of non-conforming products.

• There are also a few cases in which quality control departments monitor the number of microorganisms of a specific type found in a sample (microorganisms per milliliter in a suspension created from a sample from the production line)…

• In that case, we monitor a Poisson-distributed test statistic with an appropriate control chart…

• Note here that milk (like almost all food) contains microorganisms that, as long as they do not exceed a threshold, cannot affect human health (in some cases they are even useful).


The Exact Problem

• But this needs time…

• The Poisson-based control chart is fed with measurements only after the product is in the hands of the consumer…

• The quality managers are instructed to wait a certain amount of time before counting the microorganisms on the plate.

• In that case, we assume that if a contamination factor exists, it affects new products increasingly (its effects are a function of time).

• In that case, a better solution is to use a CUSUM-type or an EWMA control chart.
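As a rough illustration of the EWMA idea, the sketch below smooths a sequence of counts and reports the first sampling point at which the smoothed value exceeds an upper limit. The in-control mean `lam0`, the smoothing weight `w`, and the limit width `L` are illustrative assumptions; none of them comes from the slides.

```python
import math

def ewma_poisson_signal(counts, lam0, w=0.2, L=3.0):
    """Return the 1-based index of the first upper EWMA signal, or None.

    Assumes the counts are Poisson(lam0) while the process is in control.
    """
    z = lam0  # start the EWMA at the in-control mean
    for t, x in enumerate(counts, start=1):
        z = w * x + (1.0 - w) * z
        # Exact EWMA variance after t points (for Poisson, variance = mean).
        var_z = lam0 * (w / (2.0 - w)) * (1.0 - (1.0 - w) ** (2 * t))
        ucl = lam0 + L * math.sqrt(var_z)
        if z > ucl:
            return t
    return None

# Hypothetical counts: stable at first, then drifting upwards.
print(ewma_poisson_signal([4, 5, 3, 4, 6, 7, 8, 9, 11], lam0=4.0))
```

Only the upper limit is checked, on the assumption that an increase in the count is the event of interest.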


The Exact Problem

• But the use of these types of charts does not solve the problem, because the time delay remains, and in that case an extreme event will be identified only when it is too late…

• A better solution is to measure the number of microorganisms (of a specific type) that develop in a test plate (created from a sample from the production line) at many time points, from the zero point to the final time point (and not only at the end of the time period given in the microbiological guidelines).

• In that way, we may be able to observe how fast the number of microorganisms is growing.

• The idea is that if a contaminating factor exists in the production line after the pasteurization process is completed then the number of microorganisms will be growing faster.


The Exact Problem

• Also, if a contaminating factor makes its appearance in the production line, then it evolves itself (since it is a biological factor), causing more and more contamination.

• Thus, the proposed sampling procedure is the following:

• Take one sample from the production line every k time units (say, for example, every 8 hours).

• Define a value l for the number of measurements on the microbiological system (say, for example, 6), usually made by optical density.

• If the guidelines instruct that the number of microorganisms of a specific type must be measured at r hours (say 48 hours), then perform the 1st test at the r/l (8th) hour, the 2nd test at the 2r/l (16th) hour, …, and finally the lth test at the r (48th) hour.
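The timing of the readings in this procedure can be sketched in a few lines. The function names are hypothetical, and the convention that the first sample is taken at hour 0 is an assumption made only for the illustration.

```python
def reading_times(r, l):
    """Hours after plating at which the l readings of one plate are taken."""
    if r <= 0 or l <= 0:
        raise ValueError("r and l must be positive")
    return [i * r / l for i in range(1, l + 1)]

def reading_schedule(k, r, l, n_samples):
    """Map sample j (assumed taken at hour (j-1)*k) to the clock hours of its readings."""
    return {j: [(j - 1) * k + t for t in reading_times(r, l)]
            for j in range(1, n_samples + 1)}

# Values from the example on the slide: k = 8 hours, r = 48 hours, l = 6 readings.
print(reading_times(48, 6))
print(reading_schedule(8, 48, 6, 3))
```

With k equal to r/l, as in the example, the readings of successive plates fall on the same clock hours, so at each sampling point one reading is available from each of the l most recent plates.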


Sampling Scheme

                  Sampling Point
Plate / Interval  1       2       3       4       5       6       7       8       9       10      11      12
1  (0,8]          X(1,1)  X(1,2)  X(1,3)  X(1,4)  X(1,5)  X(1,6)  X(1,7)  X(1,8)  X(1,9)  …       …       …
2  (8,16]                 X(2,1)  X(2,2)  X(2,3)  X(2,4)  X(2,5)  X(2,6)  X(2,7)  X(2,8)  X(2,9)  …       …
3  (16,24]                        X(3,1)  X(3,2)  X(3,3)  X(3,4)  X(3,5)  X(3,6)  X(3,7)  X(3,8)  X(3,9)  …
4  (24,32]                                X(4,1)  X(4,2)  X(4,3)  X(4,4)  X(4,5)  X(4,6)  X(4,7)  X(4,8)  X(4,9)
5  (32,40]                                        X(5,1)  X(5,2)  X(5,3)  X(5,4)  X(5,5)  X(5,6)  X(5,7)  X(5,8)
6  (40,48]                                                X(6,1)  X(6,2)  X(6,3)  X(6,4)  X(6,5)  X(6,6)  X(6,7)

• The null hypothesis is that the process is in control, that there is no time dependence, and that each of the components x(i,j), i=1,2,…,l=6 and j=1,2,…,+∞, follows a Poisson distribution with parameter λl.

• Thus, at each time point u, the sums of the form y(u)=x(1,u)+x(2,u-1)+…+x(l,u-l+1), being sums of l independent Poisson components, are also Poisson random variables, with parameter l∙λl.
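A small simulation can make the sums y(u) concrete. The sketch below generates in-control counts, forms y(u) for every sampling point at which all l readings exist, and compares the sample mean and variance of y(u) with l times the per-interval parameter. The parameter value `lam=2.0`, the seed, and all function names are assumptions chosen for the illustration.

```python
import math
import random

def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate (Knuth's multiplication method)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def simulate_counts(l, n_samples, lam, seed=0):
    """In-control counts: x[i][j] is interval i+1 of sample j+1, all Poisson(lam)."""
    rng = random.Random(seed)
    return [[poisson_draw(lam, rng) for _ in range(n_samples)] for _ in range(l)]

def diagonal_sums(x):
    """Return {u: y(u)} with y(u) = x(1,u) + x(2,u-1) + ... + x(l,u-l+1).

    Only sampling points u >= l are returned, so that all l terms exist.
    """
    l, n_samples = len(x), len(x[0])
    return {u: sum(x[i][u - i - 1] for i in range(l))
            for u in range(l, n_samples + 1)}

x = simulate_counts(l=6, n_samples=2000, lam=2.0)
y = diagonal_sums(x)
mean_y = sum(y.values()) / len(y)
var_y = sum((v - mean_y) ** 2 for v in y.values()) / (len(y) - 1)
print(f"mean of y(u): {mean_y:.2f}, variance: {var_y:.2f}, l*lam: {6 * 2.0}")
```

For a Poisson variable the mean and variance coincide, so both printed estimates should lie close to l times the per-interval parameter when the process is in control.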


Univariate Control Chart

• Thus, in case we are interested in only one type of bacterium, we may apply a univariate Shewhart type control chart on the statistic

y(u)=x(1,u)+x(2,u-1)+…+x(l,u-l+1) for u=1,2,…,+∞.
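One possible way to set up such a chart is with a probability limit taken from the Poisson distribution of y(u). In the sketch below, the in-control mean `mu` (l times the per-interval parameter), the false-alarm probability `alpha`, and the use of an upper limit only are assumptions; the slides do not state how the limits were chosen.

```python
import math

def poisson_cdf(k, mu):
    """Pr(Y <= k) for Y ~ Poisson(mu)."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total

def upper_control_limit(mu, alpha=0.0027):
    """Smallest integer h such that Pr(Y > h) <= alpha."""
    h = 0
    while 1.0 - poisson_cdf(h, mu) > alpha:
        h += 1
    return h

def first_signal(y_values, ucl):
    """1-based index of the first y(u) above the limit, or None."""
    for u, y in enumerate(y_values, start=1):
        if y > ucl:
            return u
    return None

ucl = upper_control_limit(mu=12.0)  # e.g. l = 6 readings, parameter 2 per interval
print("UCL:", ucl)
print("first signal at:", first_signal([11, 14, 9, 13, 31, 12], ucl))
```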


The Bivariate Case

• But what happens when we have more than one type of bacterium? Say, for example, two.

• In that case, we may apply the same technique to both types of bacterium.

• Thus, we end up with two sums of the form

• y(1,u)=x(1,u)+x(2,u-1)+…+x(l,u-l+1) for u=1,2,…,+∞, computed from the counts of the first type of bacterium

• y(2,u)=x(1,u)+x(2,u-1)+…+x(l,u-l+1) for u=1,2,…,+∞, computed from the counts of the second type of bacterium

• The two variables will in most cases be dependent, since the presence of a contaminating factor will trigger a chain reaction in the evolution of both types of bacterium.

• In that case, we define the two-dimensional random variable y=(y1,y2), which follows a two-dimensional Poisson distribution with parameters λ1, λ2, and λ3.


The Bivariate Case

• The two-dimensional random variable y=(y1,y2) has the following probability function

$$\Pr(Y_1 = y_1, Y_2 = y_2) = \exp\{-(\lambda_1+\lambda_2+\lambda_3)\}\,\frac{\lambda_1^{y_1}}{y_1!}\,\frac{\lambda_2^{y_2}}{y_2!}\sum_{k=0}^{\min(y_1,y_2)}\binom{y_1}{k}\binom{y_2}{k}\,k!\left(\frac{\lambda_3}{\lambda_1\lambda_2}\right)^{k}$$

• This bivariate setting is actually based on the joint distribution of the variables Y1, Y2, where in general

Y1=Z1+Z3 and Y2=Z2+Z3

and Z1, Z2, Z3 are mutually independent Poisson random variables with means λ1, λ2 and λ3, respectively.
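The probability function of the pair Y1=Z1+Z3, Y2=Z2+Z3 can be evaluated numerically by summing over the value k taken by the common component Z3. This is algebraically the same sum as the binomial-coefficient form, written so that no division by λ1λ2 is needed. The function name and the parameter values are assumptions for illustration.

```python
import math

def bivariate_poisson_pmf(y1, y2, lam1, lam2, lam3):
    """Pr(Y1=y1, Y2=y2) for Y1=Z1+Z3, Y2=Z2+Z3 with independent Poisson Z's."""
    total = 0.0
    for k in range(min(y1, y2) + 1):
        # k is the value of the shared component Z3
        total += (lam1 ** (y1 - k) / math.factorial(y1 - k)
                  * lam2 ** (y2 - k) / math.factorial(y2 - k)
                  * lam3 ** k / math.factorial(k))
    return math.exp(-(lam1 + lam2 + lam3)) * total

# Hypothetical parameters.
lam1, lam2, lam3 = 1.0, 1.5, 0.5
grid = range(40)
total = sum(bivariate_poisson_pmf(a, b, lam1, lam2, lam3) for a in grid for b in grid)
cov = sum(a * b * bivariate_poisson_pmf(a, b, lam1, lam2, lam3)
          for a in grid for b in grid) - (lam1 + lam3) * (lam2 + lam3)
print(f"total probability on the grid: {total:.6f}, covariance: {cov:.6f}")
```

The covariance of Y1 and Y2 equals the mean of the shared component Z3, which is what makes the two counts dependent in this construction.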


The Bivariate Case

• The next step in our methodology is to identify the variable that will be used for monitoring the bivariate process.

• A fact that motivates the selection of this variable is that the number of bacteria can only increase. Therefore, we are interested in a variable that can quickly detect this possible increase.

• A straightforward choice is the sum of the two dependent Poisson random variables Y1 and Y2, say Y.

• This random variable identifies an increase in the mean of either Y1 or Y2.

• The random variable Y follows a Hermite distribution (see Johnson, Kotz and Kemp (1992), pages 357-364) with probability function

$$\Pr(Y = y) = \exp(-\alpha_1-\alpha_2)\sum_{j=0}^{[y/2]}\frac{\alpha_1^{\,y-2j}\,\alpha_2^{\,j}}{(y-2j)!\,j!}, \qquad \alpha_1=\lambda_1+\lambda_2,\quad \alpha_2=\lambda_3.$$
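The Hermite probability function of Y=Y1+Y2 is straightforward to evaluate directly. The sketch below uses the parameterisation a1=λ1+λ2 and a2=λ3 and checks numerically that the probabilities sum to one and that the mean is a1+2·a2; the parameter values are assumptions.

```python
import math

def hermite_pmf(y, a1, a2):
    """Pr(Y = y) for the Hermite distribution with parameters a1 and a2."""
    total = 0.0
    for j in range(y // 2 + 1):
        total += (a1 ** (y - 2 * j) * a2 ** j
                  / (math.factorial(y - 2 * j) * math.factorial(j)))
    return math.exp(-(a1 + a2)) * total

# Hypothetical parameters: lam1 = 1.0, lam2 = 1.5, lam3 = 0.5.
a1, a2 = 1.0 + 1.5, 0.5
total = sum(hermite_pmf(y, a1, a2) for y in range(80))
mean = sum(y * hermite_pmf(y, a1, a2) for y in range(80))
print(f"total probability: {total:.6f}, mean: {mean:.6f}, a1 + 2*a2: {a1 + 2 * a2}")
```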


The Bivariate Case

• Consequently, to identify an out-of-control situation we may construct a Shewhart-type control chart with limits calculated using the Hermite distribution (see Montgomery (2008)).

• This chart detects a possible increase in the mean of any of the two variables.

Variable   Shift   ATS Difference
1          25%     82.4
1          75%     67.1
1          125%    22.4
2          25%     88.2
2          75%     65.1
2          125%    24.1
1,2        25%     52.2
1,2        75%     35.2
1,2        125%    20.1

Based on 1000 repetitions.
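For illustration, an upper control limit for the sum Y can be read off the Hermite distribution as a probability limit. The false-alarm probability `alpha` and the in-control parameter values below are assumptions; the slides do not report the limits or the in-control parameters behind the table, so this sketch does not reproduce those figures.

```python
import math

def hermite_pmf(y, a1, a2):
    """Pr(Y = y) for the Hermite distribution with parameters a1 and a2."""
    total = 0.0
    for j in range(y // 2 + 1):
        total += (a1 ** (y - 2 * j) * a2 ** j
                  / (math.factorial(y - 2 * j) * math.factorial(j)))
    return math.exp(-(a1 + a2)) * total

def hermite_ucl(lam1, lam2, lam3, alpha=0.0027):
    """Smallest integer h with Pr(Y1 + Y2 > h) <= alpha under in-control parameters."""
    a1, a2 = lam1 + lam2, lam3
    cdf, y = 0.0, 0
    while True:
        cdf += hermite_pmf(y, a1, a2)
        if 1.0 - cdf <= alpha:
            return y
        y += 1

def signals(y_values, ucl):
    """1-based sampling points at which the plotted sum exceeds the limit."""
    return [u for u, y in enumerate(y_values, start=1) if y > ucl]

ucl = hermite_ucl(1.0, 1.5, 0.5)  # hypothetical in-control parameters
print("UCL:", ucl)
print("signals at:", signals([3, 5, 2, 4, 15, 3], ucl))
```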


The Bivariate Case

• The next step required by the nature of the problem is to see what happens after an out-of-control signal is given.

• A method to identify the responsible variable is needed.

• In order to identify the responsible variable after a signal we have to properly select a random variable that will help us in this direction.

• Such a random variable is the difference of the two random variables Y1 and Y2, say Y’.

• From the definition of the bivariate Poisson distribution we deduce that Y’=Y1-Y2=Z1-Z2 (the common component Z3 cancels), which is the difference of two independent Poisson random variables.


The Bivariate Case

• Since we use Y’ after a signal is issued, we expect to see one of the following results

• a positive value of Y’ meaning that we have an increase in Z1.

• a negative value of Y’ meaning that we have an increase in Z2.

• a value of Y’ close to zero meaning that both Z1 and Z2 have shifted.

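• Therefore, the use of Y’ assures us that we will be able to identify the responsible variable in most cases. The probability distribution of Y’ is known; it is given in Johnson, Kotz and Kemp (1992), pages 190-192, and is of the form

$$\Pr(Y' = y) = \exp(-\lambda_1-\lambda_2)\left(\frac{\lambda_1}{\lambda_2}\right)^{y/2} I_{|y|}\!\left(2\sqrt{\lambda_1\lambda_2}\right),$$

where I_r(x) is the modified Bessel function of the first kind,

$$I_r(x) = \left(\frac{x}{2}\right)^{r}\sum_{k=0}^{\infty}\frac{(x^2/4)^{k}}{k!\,\Gamma(r+k+1)}.$$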
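The distribution of the difference Y’=Z1-Z2 can be computed from the Bessel-function form with a truncated power series. In the sketch below, the number of series terms, the function names, and the parameter values are assumptions; the result is cross-checked against a direct convolution of two Poisson probability functions.

```python
import math

def bessel_i(r, x, terms=80):
    """Modified Bessel function of the first kind, integer order r >= 0 (power series)."""
    term = 1.0 / math.factorial(r)  # k = 0 term: 1 / (0! * Gamma(r + 1))
    total = term
    for k in range(1, terms):
        term *= (x * x / 4.0) / (k * (r + k))
        total += term
    return (x / 2.0) ** r * total

def difference_pmf(y, lam1, lam2):
    """Pr(Z1 - Z2 = y) for independent Poisson Z1, Z2 with means lam1, lam2 > 0."""
    x = 2.0 * math.sqrt(lam1 * lam2)
    return (math.exp(-(lam1 + lam2)) * (lam1 / lam2) ** (y / 2.0)
            * bessel_i(abs(y), x))

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Cross-check against direct convolution, with hypothetical means 1.0 and 1.5.
direct = sum(poisson_pmf(m + 2, 1.0) * poisson_pmf(m, 1.5) for m in range(60))
print(difference_pmf(2, 1.0, 1.5), direct)
```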


The Bivariate Case

• Thus, we may use the distribution of Y’ in order to define a formal procedure for identifying the out-of-control variable.

• Specifically, if the value of Y’ is above the 95% percentage point of its theoretical distribution, then the responsible variable is Y1; if the value of Y’ is below the 5% percentage point, then Y2 is the responsible variable; and if the value of Y’ lies between the 5% and 95% percentage points, then both variables have shifted.
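The decision rule can be sketched as follows. The percentage points are computed here from the distribution of the difference under in-control means `lam1` and `lam2`, which is an assumption about what "theoretical distribution" refers to; the function names and the truncation of the sums are also illustrative choices.

```python
import math

def poisson_pmf(k, lam):
    # Log-space evaluation avoids overflow for large k (requires lam > 0).
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def difference_pmf(y, lam1, lam2, terms=200):
    """Pr(Z1 - Z2 = y) by direct convolution (truncated sum)."""
    start = max(0, -y)
    return sum(poisson_pmf(m + y, lam1) * poisson_pmf(m, lam2)
               for m in range(start, start + terms))

def difference_quantile(p, lam1, lam2):
    """Smallest integer q with Pr(Y' <= q) >= p."""
    sd = math.sqrt(lam1 + lam2)
    y = int(math.floor(lam1 - lam2 - 12.0 * sd)) - 10  # start far in the lower tail
    cdf = 0.0
    while True:
        cdf += difference_pmf(y, lam1, lam2)
        if cdf >= p:
            return y
        y += 1

def responsible_variable(y_diff, lam1, lam2, lower=0.05, upper=0.95):
    """Apply the 5% / 95% rule to an observed difference Y' = Y1 - Y2."""
    if y_diff > difference_quantile(upper, lam1, lam2):
        return "Y1"
    if y_diff < difference_quantile(lower, lam1, lam2):
        return "Y2"
    return "both"

# Hypothetical in-control means and observed differences after a signal.
for d in (12, -12, 1):
    print(d, responsible_variable(d, lam1=4.0, lam2=4.0))
```

The rule is applied only after the chart on the sum has signalled, so a difference near zero is read as a shift in both variables rather than as no shift.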


Correct Identification Rates

Shift Variable   Shift Size (%)   Correct Identification
1                25%              35.2%
1                50%              76.4%
1                75%              93.2%
1                100%             95.2%
1                150%             99.8%
2                25%              34.2%
2                50%              77.2%
2                75%              93.2%
2                100%             94.6%
2                150%             99.2%

Based on 1000 repetitions.


References

• Tokatli, F., Cinar, A. and Schlesser, J.E. (2005). HACCP with multivariate process monitoring and fault diagnosis techniques: application to a food pasteurization process, Food Control, 16, 411–422.

• Johnson, N.L., Kotz, S. and Kemp, A.W. (1992). Univariate Discrete Distributions, Wiley, New York.

• Montgomery, D.C. (2008). Introduction to Statistical Quality Control, Wiley, New York.