Structured replacement policies for a Markov-modulated shock model

Operations Research Letters 37 (2009) 280–284

journal homepage: www.elsevier.com/locate/orl

Murat Kurt *, Lisa M. Maillart
Department of Industrial Engineering, University of Pittsburgh, United States

* Corresponding author. Tel.: +1 412 624 9807; fax: +1 412 624 9831. E-mail address: [email protected] (M. Kurt).

0167-6377/$ – see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.orl.2009.03.008

Article info

Article history: Received 2 September 2008; Accepted 18 March 2009; Available online 31 March 2009

Keywords: Shock model; Markov-modulated Poisson process; Markov decision process; Control-limit policy

Abstract

We establish the optimality of structured replacement policies for a periodically inspected system that fails silently whenever the cumulative number of shocks, or the magnitude of a single shock it has received, exceeds a corresponding threshold. Shocks arrive according to a Markov-modulated Poisson process which represents the (controllable or uncontrollable) environment.

© 2009 Elsevier B.V. All rights reserved.

1. Introduction

Shock-damage models have been commonly applied in earthquake prediction, traffic and insurance modeling, and reliability modeling of computers, batteries and database systems [1]. One common assumption in the existing shock models is that the shock interarrival times are identically distributed. From a practical point of view, this assumption may not be valid due to external and/or internal environmental factors. Several researchers have studied shock models that relax this assumption [2–7]. However, these studies analyze the properties of the stochastic models they present, and do not consider optimal replacement policies for the system. More specifically, Sumita and Shanthikumar [5,6] analyze a cumulative damage shock model with correlated shocks (in both size and arrival rate) by using bivariate Markov renewal sequences. Igaki et al. [2] extend this work to incorporate the effects of exogenous factors and model the dependencies between the shock magnitudes and interarrival times as a trivariate stochastic process. Li and Luo [4] derive the lifetime distribution of a system under a cumulative damage shock model in which both the interarrival times and the magnitudes of the shocks are correlated through a Markov process. Lastly, Kharoufeh and Mixon [3] analyze the transient and asymptotic reliability of a system which fails by receiving shocks and continuous wear, the rates of which are modulated by a Markovian environment.

In this paper, we consider a system which fails silently when the cumulative number of shocks it has received exceeds a certain threshold or it receives a single shock of a magnitude above some threshold. Shocks arrive according to a Poisson process, the rate of which is modulated by a randomly varying environment modeled as a discrete-time Markov chain. In other words, the system receives shocks according to a Markov-modulated Poisson process (MMPP), which has been widely used in reliability modeling of manufacturing systems [3] and software that is subject to virus attacks or errors with varying interarrival times [8,9]. MMPPs can also extend the current modeling applications for the optimal control of single stage production lines where random process shifts may represent the shocks, and the shift probabilities from an in-control to an out-control state can be modulated by a randomly varying environment [10]. Although the term "MMPP" is usually reserved for a Poisson process whose rate is modulated by a continuous-time Markov chain, we use it to refer to a discrete-time process.

We consider two types of environments: controllable (internal), and uncontrollable (external). For a multi-component system, a controllable environment may refer to the status of the parts that are not affected by the shocks, but have an effect on the arrival rate of the shocks through some wear processes. On the other hand, an uncontrollable environment may refer to the weather conditions under which the system is operating, or the status of a computer network through which random virus attacks are sent to the system of interest.

We address the question of when to replace the system to optimally balance the trade-off among the costs of inspection, failure and replacement. To minimize total expected discounted cost, we formulate the problem as a discrete-time infinite-horizon Markov decision process (MDP). Waldmann [7] considers optimal replacement decisions for a repairable system using a continuous-time shock model under varying effects of uncontrollable environmental factors, and provides sufficient conditions for the optimality of control-limit replacement policies with respect to cumulative damage. Unlike Waldmann, we consider both controllable and uncontrollable environments, and are able to establish structural
properties of the optimal cost function and optimal replacement policy with respect to the environment state for the controllable case. We also offer insights into the structure of the optimal policy with respect to the environment state. We discuss the relationship between Waldmann's model and our uncontrollable environment model further in Section 2.

The remainder of the article is organized as follows. In Section 2, we provide a detailed problem description and model formulation. In Section 3, we present structural properties of the resulting optimal policies, which we illustrate by numerical examples in Section 4.

2. Problem description and model formulation

In this section, we provide a detailed problem description followed by MDP model formulations for both the controllable and the uncontrollable environment situations.

2.1. Problem description

A system is inspected at a cost of $I$ every $\tau$ units of time and receives shocks according to a Poisson process. The rate of the process follows a discrete-time homogeneous Markov chain (DTMC) with state space $\Lambda = \{\lambda_0, \lambda_1, \ldots, \lambda_R\}$, with $\lambda_i < \lambda_j$ for $i < j$. That is, the arrival rate of the shocks between the $n$th and $(n+1)$st inspection depends only on the arrival rate between the $(n-1)$st and $n$th inspection. Each shock has an independent and identically distributed random magnitude with cumulative distribution function $F$. The system fails between two successive inspections if the cumulative number of shocks it has received exceeds a certain threshold $S$ or it receives a shock with a magnitude that exceeds a threshold $\kappa$. System failure is detected only through inspections and each failure has a fixed penalty cost, $\eta$. If the system is in working condition, then inspection reveals the cumulative number of shocks and the current arrival rate. Inspection occurs at the end of each period and is assumed to take a negligible amount of time. After each inspection the system is either used for one more period or replaced with a new one at a fixed cost, $C$. If the system is found to have failed, it must be replaced. All costs are assumed to be incurred at the end of each period and discounted at a rate $\beta \in (0, 1)$. The objective is to minimize the total expected discounted inspection, failure and replacement cost.

2.2. Model formulation

We consider two separate cases for the environment that modulates the arrival rate of the shocks and provide a discrete-time, infinite-horizon MDP model for each of these cases. We refer to the models that correspond to the controllable and the uncontrollable environment cases as the controllable environment model (CEM) and the uncontrollable environment model (UEM), respectively. In the CEM, replacement in effect renews the arrival rate of the shocks; that is, a new system starts to receive shocks with rate $\lambda_0$. On the other hand, in the UEM, replacement only resets the cumulative number of shocks to zero without having any effect on the arrival rate. In our models, each decision epoch refers to the point in time immediately following an inspection, and we call each time interval between any two successive decision epochs a period. Below, we describe the additional notation common to both models:

• $(r, s)$: The state of the process just after an inspection, where $r \in \Psi = \{0, \ldots, R\}$ indicates that the system is receiving shocks at rate $\lambda_r$, and $s \in \Omega = \{0, 1, \ldots, S\}$ denotes the cumulative number of shocks, where $s = S$ corresponds to a failure. We define the state space of the process by $\Upsilon = \Psi \times \Omega$. We also let $\Theta = \bigcup_{r \in \Psi} \{(r, S)\}$ denote the set of failure states.

• $\varphi_r(i)$: Given that the system is receiving shocks at rate $\lambda_r$, $r \in \Psi$, the probability of receiving $i$ shocks, each of magnitude not exceeding $\kappa$, prior to the next inspection, i.e.,

$$\varphi_r(i) = e^{-\lambda_r \tau} \frac{(\lambda_r \tau)^i}{i!} [F(\kappa)]^i.$$

• $P(s'|s, r)$: The probability that the system will be in state $s'$ at epoch $t + 1$ given it is in state $s$ and receiving shocks at rate $\lambda_r$, $r \in \Psi$, at epoch $t$, or

$$P(s'|s, r) = \begin{cases} \varphi_r(s' - s) & \text{if } s \le s' < S, \\ 1 - \sum_{i=0}^{S-s-1} \varphi_r(i) & \text{if } s' = S, \\ 0 & \text{otherwise.} \end{cases} \tag{1}$$

• $P(r)$: The one-step transition probability matrix for the cumulative number of shocks when the system is receiving shocks at rate $\lambda_r$, $r \in \Psi$, i.e., $P(r) = [P(s'|s, r)]$, $s, s' \in \Omega$.

• $Q(r'|r)$: The probability that the system will start to receive shocks at rate $\lambda_{r'}$ at epoch $t + 1$ given it receives shocks with rate $\lambda_r$ in period $t$. We denote the one-step transition probability matrix for the shock arrival rate by $Q = [Q(r'|r)]$, $r, r' \in \Psi$.

• $a(r, s)$: The optimal action taken in state $(r, s)$, to be chosen from $\{0, 1\}$, where "0" stands for waiting one more period and "1" stands for the immediate replacement of the system. Note that "1" is the only feasible action in state $(r, S)$ for any $r \in \Psi$, since the system is down.

• $w(r, s)$: The total expected discounted cost incurred starting in state $(r, s) \in \Upsilon \setminus \Theta$ if action "0" is taken.

• $\vartheta(r, s)$: The minimum total expected discounted cost starting in state $(r, s) \in \Upsilon$.
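The definitions of $\varphi_r(i)$ and $P(r)$ above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names `phi` and `shock_matrix` and the argument `F_kappa` (standing for $F(\kappa)$) are hypothetical, and the row for the failure state $s = S$ follows from reading (1) with an empty sum.

```python
import math


def phi(i, lam, tau, F_kappa):
    """Probability of exactly i shocks in one period, none exceeding kappa."""
    return math.exp(-lam * tau) * (lam * tau) ** i / math.factorial(i) * F_kappa ** i


def shock_matrix(lam, tau, F_kappa, S):
    """Return P(r) from Eq. (1) as an (S+1) x (S+1) list of lists.

    Rows are the current shock count s, columns the next count s';
    column S collects every path that ends in failure.
    """
    P = [[0.0] * (S + 1) for _ in range(S + 1)]
    for s in range(S):
        for s_next in range(s, S):
            P[s][s_next] = phi(s_next - s, lam, tau, F_kappa)
        P[s][S] = 1.0 - sum(P[s][:S])   # failure gets the remaining mass
    P[S][S] = 1.0                        # Eq. (1) with s = S (empty sum)
    return P
```

For instance, `shock_matrix(2.0, 1.0, 0.8, 10)` gives an upper-triangular matrix whose rows sum to one and whose entries satisfy `P[s][s_next] == P[s + 1][s_next + 1]` whenever `s_next + 1 < S`.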

In the CEM (UEM), when the system is detected to be in state $(r, s) \in \Upsilon$ after an inspection and the decision-maker chooses to replace the system, the process moves to state $(0, 0)$ ($(r, 0)$). Otherwise, while operating prior to the next inspection, the process transitions into state $(r', s') \in \Upsilon$ with probability $Q(r'|r)P(s'|s, r)$, i.e.,

$$w(r, s) = \beta \Biggl( \sum_{r' \in \Psi} \sum_{s' \in \Omega \setminus \{S\}} P(s'|s, r) Q(r'|r) \bigl[\vartheta(r', s') + I\bigr] + P(S|s, r) \sum_{r' \in \Psi} Q(r'|r) \bigl[\vartheta(r', S) + I + \eta\bigr] \Biggr), \quad \text{for } (r, s) \in \Upsilon \setminus \Theta. \tag{2}$$

It is convenient to define $\ell(r, s)$ as the expected inspection and failure penalty cost starting in state $(r, s) \in \Upsilon \setminus \Theta$, i.e., $\ell(r, s) = \beta [\eta P(S|s, r) + I]$, and restate (2) as follows:

$$w(r, s) = \ell(r, s) + \beta \sum_{r' \in \Psi} \sum_{s' \in \Omega} P(s'|s, r) Q(r'|r) \vartheta(r', s'), \quad \text{for } (r, s) \in \Upsilon \setminus \Theta.$$
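The step from (2) to the restated form is not spelled out in the text. A short derivation, assuming only that the rows of $P(r)$ and $Q$ each sum to one, is:

```latex
\begin{align*}
w(r,s)
&= \beta \sum_{r' \in \Psi} \sum_{s' \in \Omega} P(s'|s,r)\, Q(r'|r)\, \vartheta(r',s')
 + \beta I \sum_{r' \in \Psi} Q(r'|r) \sum_{s' \in \Omega} P(s'|s,r)
 + \beta \eta\, P(S|s,r) \sum_{r' \in \Psi} Q(r'|r) \\
&= \beta \sum_{r' \in \Psi} \sum_{s' \in \Omega} P(s'|s,r)\, Q(r'|r)\, \vartheta(r',s')
 + \beta \bigl[ I + \eta\, P(S|s,r) \bigr] \\
&= \ell(r,s) + \beta \sum_{r' \in \Psi} \sum_{s' \in \Omega} P(s'|s,r)\, Q(r'|r)\, \vartheta(r',s').
\end{align*}
```

The first line regroups (2): the inspection cost $I$ is paid for every next state, including $s' = S$, while the penalty $\eta$ is paid only on failure.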

For both models, we determine the optimal actions that minimize the total expected discounted cost in each state $(r, s) \in \Upsilon$ by solving a set of recursive equations as follows.

For the CEM,

$$\vartheta(r, s) = \begin{cases} \min\{w(r, s),\ \vartheta(0, 0) + C\} & \text{for } (r, s) \in \Upsilon \setminus \Theta, \\ \vartheta(0, 0) + C & \text{if } s = S. \end{cases}$$

Clearly, $\vartheta(0, 0) = w(0, 0)$ in the CEM. In other words, it is not optimal to replace a new system unless the arrival rate of the shocks increases from $\lambda_0$ or the system receives at least one shock.

For the UEM,

$$\vartheta(r, s) = \begin{cases} \min\{w(r, s),\ \vartheta(r, 0) + C\} & \text{for } (r, s) \in \Upsilon \setminus \Theta, \\ \vartheta(r, 0) + C & \text{if } s = S. \end{cases}$$

Similar to the CEM, it is obvious that $\vartheta(r, 0) = w(r, 0)$ for all $r \in \Psi$ in the UEM. In other words, because replacement only removes the effects of shocks, it is not optimal to replace the system unless it receives at least one shock after replacement.

Our UEM can be viewed as a special case of Waldmann's [7] model. Waldmann [7] considers the optimal replacement of a periodically inspected system that is subject to failure as a function of cumulative damage where the magnitude of a single shock received is a Markov-modulated process, and the operating and replacement costs are state dependent. In our model, the amount of damage (i.e., the number of shocks) accumulated in each period follows an MMPP. Furthermore, our failure probabilities are realistically defined by two thresholds whereas Waldmann assumes no functional form.

3. Structural properties

In this section we derive structural properties of the MDP models formulated in Section 2.2. We prove the existence of optimal control-limit replacement policies, which are easy to implement because of their intuitive structure and may facilitate more efficient solution techniques [11]. We begin our structural analysis by defining the increasing failure rate concept for a stochastic matrix.

Definition 1 (Barlow and Proschan [12]). A stochastic matrix $H = [h(j|i)]$, $i, j = 1, \ldots, n$, is said to have the increasing failure rate (IFR) property if $\sum_{j=k}^{n} h(j|i)$ is nondecreasing in $i$ for all $k = 1, \ldots, n$.

The IFR property is equivalent to a first-order stochastic dominance ordering among the rows of a given stochastic matrix: each row dominates the row above it. It is often used to derive sufficient conditions that ensure the optimality of control-limit policies in maintenance and reliability theory [13–15]. In our problem, the IFR property of $P(r)$ can be interpreted as follows: The larger the number of shocks the system currently has, the more likely it is to have a larger number of shocks at the time of the next inspection. The results in Lemma 1 are stated without proof. They directly follow from the fact that $P(r)$ is upper triangular and $P(s'|s, r) = P(s'+1|s+1, r)$ for all $s \in \Omega \setminus \{S-1, S\}$, $s \le s' < S - 1$ and $r \in \Psi$.
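Definition 1 can be checked mechanically by comparing tail sums of consecutive rows. The following Python sketch is one way to do so; the helper names `tail_sums` and `is_ifr`, the tolerance `tol`, and the two small matrices are hypothetical illustrations, not taken from the paper.

```python
def tail_sums(row):
    """Return [sum(row[k:]) for every k], accumulated from right to left."""
    out, acc = [], 0.0
    for x in reversed(row):
        acc += x
        out.append(acc)
    return out[::-1]


def is_ifr(H, tol=1e-12):
    """True if every tail sum is nondecreasing down the rows of H."""
    tails = [tail_sums(row) for row in H]
    return all(
        tails[i + 1][k] >= tails[i][k] - tol
        for i in range(len(H) - 1)
        for k in range(len(H[0]))
    )


# Hypothetical 3 x 3 examples.
ifr_example = [[0.6, 0.3, 0.1],
               [0.2, 0.5, 0.3],
               [0.0, 0.2, 0.8]]
not_ifr_example = [[0.1, 0.1, 0.8],
                   [0.5, 0.4, 0.1],
                   [0.0, 0.0, 1.0]]
```

Here `is_ifr(ifr_example)` holds, while `is_ifr(not_ifr_example)` fails because the first row puts more mass on the last state than the second row does.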

Lemma 1. $P(r)$ is IFR and $\ell(r, s)$ is strictly increasing in $s \in \Omega \setminus \{S\}$ for all $r \in \Psi$.

In Proposition 1, for both the CEM and the UEM, we show that the optimal total expected discounted cost is monotonically nondecreasing in the number of shocks received under any given arrival rate.

Proposition 1. $\vartheta(r, s)$ is nondecreasing in $s \in \Omega$ for all $r \in \Psi$ in both the CEM and the UEM.

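Unless otherwise stated, we construct the proofs of our structural results by induction on the iterates of the following value iteration algorithm:

$$w^{n+1}(r, s) = \ell(r, s) + \beta \sum_{r' \in \Psi} \sum_{s' \in \Omega} P(s'|s, r) Q(r'|r) \vartheta^{n}(r', s'), \quad \text{for } (r, s) \in \Upsilon \setminus \Theta, \tag{3}$$

and, for the CEM,

$$\vartheta^{n+1}(r, s) = \begin{cases} \min\{w^{n+1}(r, s),\ \vartheta^{n}(0, 0) + C\} & \text{for } (r, s) \in \Upsilon \setminus \Theta, \\ \vartheta^{n}(0, 0) + C & \text{if } s = S; \end{cases} \tag{4}$$

for the UEM,

$$\vartheta^{n+1}(r, s) = \begin{cases} \min\{w^{n+1}(r, s),\ \vartheta^{n}(r, 0) + C\} & \text{for } (r, s) \in \Upsilon \setminus \Theta, \\ \vartheta^{n}(r, 0) + C & \text{if } s = S. \end{cases} \tag{5}$$

In (3)–(5), $\vartheta^{n}(r, s)$ and $w^{n}(r, s)$ are the respective values of $\vartheta(r, s)$ and $w(r, s)$ at the $n$th ($n \ge 0$) iteration of the algorithm, and $\vartheta^{0}(r, s) = 0$ for all $(r, s) \in \Upsilon$.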
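A minimal Python sketch of the recursion (3)–(5) is given below. It is an illustration under stated assumptions, not the authors' code: the function name `value_iteration`, the stopping rule (`tol`, `max_iter`), the tie-breaking in favor of waiting, and the two-rate toy instance with $S = 2$ are all hypothetical. The matrices are passed in directly, with `P[r][s][s2]` standing for $P(s'|s, r)$ and `Q[r][r2]` for $Q(r'|r)$.

```python
def value_iteration(P, Q, I, eta, C, beta, controllable, tol=1e-10, max_iter=100_000):
    """Iterate (3)-(5) from v = 0 and return (values, policy).

    policy[r][s] is 0 (wait) or 1 (replace); ties go to waiting (an assumption).
    """
    n_r, n_s = len(Q), len(P[0])
    S = n_s - 1                                    # index of the failure state
    v = [[0.0] * n_s for _ in range(n_r)]
    policy = [[1] * n_s for _ in range(n_r)]       # failure states keep action 1
    for _ in range(max_iter):
        # ev[r][s2] = sum over r2 of Q(r2|r) * v_n(r2, s2)
        ev = [[sum(Q[r][r2] * v[r2][s2] for r2 in range(n_r)) for s2 in range(n_s)]
              for r in range(n_r)]
        new_v = [[0.0] * n_s for _ in range(n_r)]
        for r in range(n_r):
            # Replacement resets the arrival rate only in the controllable model.
            replace_cost = (v[0][0] if controllable else v[r][0]) + C
            for s in range(S):
                ell = beta * (eta * P[r][s][S] + I)
                wait_cost = ell + beta * sum(P[r][s][s2] * ev[r][s2] for s2 in range(n_s))
                policy[r][s] = 0 if wait_cost <= replace_cost else 1
                new_v[r][s] = min(wait_cost, replace_cost)
            new_v[r][S] = replace_cost
        gap = max(abs(new_v[r][s] - v[r][s]) for r in range(n_r) for s in range(n_s))
        v = new_v
        if gap < tol:
            break
    return v, policy


# Hypothetical toy instance (not from the paper): two arrival rates, S = 2.
P_toy = [
    [[0.7, 0.2, 0.1], [0.0, 0.7, 0.3], [0.0, 0.0, 1.0]],   # lower rate
    [[0.5, 0.3, 0.2], [0.0, 0.5, 0.5], [0.0, 0.0, 1.0]],   # higher rate
]
Q_toy = [[0.7, 0.3], [0.2, 0.8]]
v_cem, a_cem = value_iteration(P_toy, Q_toy, I=0.0, eta=5.0, C=5.0, beta=0.97,
                               controllable=True)
v_uem, a_uem = value_iteration(P_toy, Q_toy, I=0.0, eta=5.0, C=5.0, beta=0.97,
                               controllable=False)
```

Passing the matrices as arguments keeps the sketch independent of how $P(r)$ is built; any matrices with the structure of (1) can be supplied.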

Proof. We prove the result for the CEM only, since the proof for the UEM is similar. By construction, the result holds at iteration 0. Suppose $\vartheta^{n}(r, s)$ is nondecreasing in $s \in \Omega$ for all $r \in \Psi$ and $n = 1, \ldots, m$. For an arbitrary $r \in \Psi$ and $s \in \Omega \setminus \{S-1, S\}$,

$$w^{m+1}(r, s+1) - w^{m+1}(r, s) > \beta \sum_{r' \in \Psi} Q(r'|r) \sum_{s' \in \Omega} \bigl[P(s'|s+1, r) - P(s'|s, r)\bigr] \vartheta^{m}(r', s') \ge 0. \tag{6}$$

In (6), the first inequality follows from the fact that $\ell(r, s+1) > \ell(r, s)$. Since $P(r)$ is IFR, by Lemma 4.7.2 of Puterman [11], the second inequality holds by the induction hypothesis that $\vartheta^{m}(r', s')$ is nondecreasing in $s' \in \Omega$ for all $r' \in \Psi$. Thus $w^{m+1}(r, s)$ is strictly increasing in $s \in \Omega \setminus \{S\}$ for all $r \in \Psi$. By (4), this implies $\vartheta^{m+1}(r, s)$ is nondecreasing in $s \in \Omega$ for all $r \in \Psi$, and by induction the result is established by the convergence of the value iteration algorithm.

As an immediate result of Proposition 1, Corollary 1 establishes the optimality of a shock-number-based control-limit policy, which we explain as follows: Given the system is receiving shocks with rate $\lambda_r$, $r \in \Psi$, it is optimal to replace the system if and only if the cumulative number of shocks is greater than or equal to some threshold shock number $s_r^*$.

Corollary 1. There exists an optimal shock-number-based control-limit policy for both the CEM and the UEM; that is, for each $r \in \Psi$ there exists a threshold $s_r^* \in \Omega$ such that $a(r, s) = 1$ if and only if $s \ge s_r^*$.
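To make the control-limit structure concrete, the sketch below reads the thresholds $s_r^*$ off a policy table and rejects tables that are not of control-limit form in $s$. The function name `shock_thresholds` and the small policy tables are hypothetical; `policy[r][s]` is assumed to hold the action $a(r, s)$, with the last column being the failure state, where the action is always 1.

```python
def shock_thresholds(policy):
    """Return [s*_r for each r] for a policy table with entries 0 (wait) / 1 (replace).

    Raises ValueError if some row switches back from replace to wait,
    i.e. if the row is not of control-limit type in s.
    """
    thresholds = []
    for row in policy:
        s_star = row.index(1)            # first shock count at which we replace
        if any(action != 1 for action in row[s_star:]):
            raise ValueError("policy is not of control-limit type in s")
        thresholds.append(s_star)
    return thresholds


# Hypothetical policy table with three arrival rates and S = 3.
example_policy = [[0, 0, 0, 1],
                  [0, 0, 1, 1],
                  [0, 1, 1, 1]]
example_thresholds = shock_thresholds(example_policy)
```

A table such as `[[0, 1, 0, 1]]` would raise `ValueError`, since replacing at one shock but waiting at two contradicts the form described in Corollary 1.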

Next, we provide a definition to compare two stochastic matrices based on the first-order stochastic dominance relationship.

Definition 2. Let $H_1 = [h_1(j|i)]$ and $H_2 = [h_2(j|i)]$, $i, j = 1, \ldots, n$, be two stochastic matrices. If $\sum_{j=k}^{n} h_1(j|i) \ge \sum_{j=k}^{n} h_2(j|i)$ for all $i, k = 1, \ldots, n$, then we say that $H_1$ stochastically dominates $H_2$, and denote it by $H_1 \succeq H_2$.

Lemma 2 states that as the shocks arrive more frequently, the system is more likely to receive a larger number of shocks or fail.

Lemma 2. $P(r+1) \succeq P(r)$ for all $r \in \Psi \setminus \{R\}$, and $\ell(r, s)$ is strictly increasing in $r \in \Psi$ for all $s \in \Omega \setminus \{S\}$.

Proof. Let $\phi_n(\lambda, x) = \sum_{i=0}^{n} e^{-\lambda\tau} \frac{(\lambda\tau x)^i}{i!}$ for $n \in \mathbb{Z}_+$, $\lambda \in (0, \infty)$ and $x \in (0, 1]$. By standard calculus, $\frac{\partial \phi_n(\lambda, x)}{\partial \lambda} < 0$ for all $n \in \mathbb{Z}_+$ and $x \in (0, 1]$, establishing the results by Definition 2.
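The monotonicity used in the proof can be spot-checked numerically. The sketch below evaluates the partial sum $\phi_n(\lambda, x)$ on a grid of rates and confirms that it decreases in $\lambda$, so that the failure probability $P(S|s, r) = 1 - \phi_{S-s-1}(\lambda_r, F(\kappa))$ from (1) increases. The names `phi_n` and `failure_probability` are hypothetical, and the grid ($\lambda_r = r + 1$, $F(\kappa) = 0.8$, $\tau = 1$) borrows the values used in Section 4. A numerical check on a grid is an illustration, not a substitute for the calculus argument.

```python
import math


def phi_n(n, lam, x, tau=1.0):
    """Partial sum phi_n(lam, x) = sum_{i=0}^{n} exp(-lam*tau) (lam*tau*x)^i / i!."""
    return math.exp(-lam * tau) * sum(
        (lam * tau * x) ** i / math.factorial(i) for i in range(n + 1))


def failure_probability(s, lam, S, F_kappa, tau=1.0):
    """P(S|s, r) for s < S, written via phi_n as in Eq. (1)."""
    return 1.0 - phi_n(S - s - 1, lam, F_kappa, tau)


rates = [r + 1 for r in range(10)]          # lambda_r = r + 1
phi_decreasing = all(
    phi_n(n, rates[k + 1], 0.8) < phi_n(n, rates[k], 0.8)
    for n in range(10)
    for k in range(len(rates) - 1)
)
```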

Conditional on $Q$ being IFR, Proposition 2 states that the optimal total expected discounted cost does not decrease as the expected shock interarrival time decreases.

Proposition 2. If $Q$ is IFR, then $\vartheta(r, s)$ is nondecreasing (strictly increasing) in $r \in \Psi$ for all $s \in \Omega$ in the CEM (UEM).

Proof. By construction, the result holds at iteration 0. Suppose $\vartheta^{n}(r, s)$ is nondecreasing in $r \in \Psi$ for all $s \in \Omega$ and $n = 1, \ldots, m$. Note that $\vartheta^{m}(r, s)$ is nondecreasing in $s \in \Omega$ for all $r \in \Psi$ (the proof of this auxiliary result is provided in the proof of Proposition 1). For an arbitrary $r \in \Psi \setminus \{R\}$ and $s \in \Omega \setminus \{S\}$ consider $w^{m+1}(r+1, s) - w^{m+1}(r, s)$:

$$
\begin{aligned}
w^{m+1}(r+1, s) - w^{m+1}(r, s)
&> \beta \sum_{r' \in \Psi} Q(r'|r+1) \sum_{s' \in \Omega} P(s'|s, r+1) \vartheta^{m}(r', s') - \beta \sum_{r' \in \Psi} Q(r'|r) \sum_{s' \in \Omega} P(s'|s, r) \vartheta^{m}(r', s') && \text{(7)} \\
&= \beta \sum_{r' \in \Psi} Q(r'|r+1) \sum_{s' \in \Omega} \bigl[P(s'|s, r+1) - P(s'|s, r)\bigr] \vartheta^{m}(r', s') \\
&\quad + \beta \sum_{r' \in \Psi} Q(r'|r+1) \sum_{s' \in \Omega} P(s'|s, r) \vartheta^{m}(r', s') - \beta \sum_{r' \in \Psi} Q(r'|r) \sum_{s' \in \Omega} P(s'|s, r) \vartheta^{m}(r', s') \\
&\ge \beta \sum_{r' \in \Psi} Q(r'|r+1) \sum_{s' \in \Omega} P(s'|s, r) \vartheta^{m}(r', s') - \beta \sum_{r' \in \Psi} Q(r'|r) \sum_{s' \in \Omega} P(s'|s, r) \vartheta^{m}(r', s') && \text{(8)} \\
&= \beta \sum_{s' \in \Omega} P(s'|s, r) \sum_{r' \in \Psi} \bigl[Q(r'|r+1) - Q(r'|r)\bigr] \vartheta^{m}(r', s') \\
&\ge 0, && \text{(9)}
\end{aligned}
$$

where (7) is implied by the fact that $\ell(r+1, s) > \ell(r, s)$ by Lemma 2; (8) is implied by the fact that $\sum_{s' \in \Omega} [P(s'|s, r+1) - P(s'|s, r)] \vartheta^{m}(r', s') \ge 0$ for all $r' \in \Psi$, which follows from Lemma 4.7.2 of Puterman [11] since $\vartheta^{m}(r', s')$ is nondecreasing in $s' \in \Omega$ for all $r' \in \Psi$ and $P(r+1) \succeq P(r)$ by Lemma 2; and lastly, the inequality in (9) is implied by the fact that $\sum_{r' \in \Psi} [Q(r'|r+1) - Q(r'|r)] \vartheta^{m}(r', s') \ge 0$ for all $s' \in \Omega$, which follows from Lemma 4.7.2 of Puterman [11] since $Q$ is IFR and $\vartheta^{m}(r', s')$ is nondecreasing in $r' \in \Psi$ for all $s' \in \Omega$ by the induction hypothesis. Thus, $w^{m+1}(r, s)$ is strictly increasing in $r \in \Psi$ for all $s \in \Omega \setminus \{S\}$. By (4), this implies that $\vartheta^{m+1}(r, s)$ is nondecreasing in $r \in \Psi$ for all $s \in \Omega$, and by induction the result is established by the convergence of the value iteration algorithm.

In our problem context, $Q$ being IFR implies that the shorter the expected shock interarrival time is in a given period, the more likely it is to be shorter in the following period. Analogous to an optimal shock-number-based control-limit policy, Proposition 2 implies an optimal shock arrival rate-based control-limit policy for the CEM, which we describe as follows: Given the system has received $s \in \Omega \setminus \{S\}$ shocks, replacing the system with a new one is optimal if and only if $r$ is at or above a certain threshold $r_s^* \in \Psi$. We formally state this result in Corollary 2 without proof. Note that $Q$ being IFR does not guarantee a control-limit structure with respect to the shock arrival rate for the optimal policy in the UEM.

Corollary 2. If $Q$ is IFR, then there exists a shock arrival rate-based optimal control-limit policy for the CEM; that is, for each $s \in \Omega$ there exists a threshold $r_s^* \in \Psi$ such that $a(r, s) = 1$ if and only if $r \ge r_s^*$.

Next, we analyze the effect of the environment type on the optimal cost function. Let $\vartheta_1(r, s)$ and $\vartheta_2(r, s)$ denote the optimal total expected discounted cost in state $(r, s) \in \Upsilon$ for the CEM and the UEM, respectively. Proposition 3 can be interpreted as follows: Conditional on $Q$ being IFR, the optimal expected discounted cost does not decrease in any state if the decision-maker loses the ability to control the environment. This result is quite intuitive, since in the CEM new systems start to receive shocks at the minimum rate, whereas in the UEM, for the same price the arrival rate of the shocks is not affected by replacement.

Proposition 3. If $Q$ is IFR, then $\vartheta_2(r, s) \ge \vartheta_1(r, s)$ for all $(r, s) \in \Upsilon$.

Fig. 1. Optimal policy in the CEM. (Figure not reproduced; horizontal axis: environment state $r = 0, \ldots, 9$; vertical axis: cumulative number of shocks $s = 0, \ldots, 10$; regions labeled REPLACE and WAIT.)

Proof. For $(r, s) \in \Upsilon$, we let $w_1(r, s)$ and $w_2(r, s)$ specify the value of $w(r, s)$ in the CEM and the UEM, respectively. We simultaneously apply the algorithms (3) and (4) (for the CEM) and (3) and (5) (for the UEM), and let $\vartheta_i^{n}(r, s)$ and $w_i^{n}(r, s)$ be the respective values of $\vartheta_i(r, s)$ and $w_i(r, s)$ at the $n$th ($n \ge 0$) iteration of the corresponding algorithm for $(r, s) \in \Upsilon$ and $i = 1, 2$. By induction on the iterates of these algorithms, we will show that $\vartheta_2^{n}(r, s) \ge \vartheta_1^{n}(r, s)$ for all $(r, s) \in \Upsilon$ and $n \ge 0$. By definition, $\vartheta_2^{0}(r, s) = \vartheta_1^{0}(r, s) = 0$ for all $(r, s) \in \Upsilon$. Suppose $\vartheta_2^{n}(r, s) \ge \vartheta_1^{n}(r, s)$ for all $(r, s) \in \Upsilon$ and $n = 1, \ldots, m$. For an arbitrary $(r, s) \in \Upsilon \setminus \Theta$ consider $w_2^{m+1}(r, s) - w_1^{m+1}(r, s)$. Note that since the models are identical except for the type of environment, they have the same immediate expected failure penalty costs in each state. This implies:

$$w_2^{m+1}(r, s) - w_1^{m+1}(r, s) = \beta \sum_{r' \in \Psi} \sum_{s' \in \Omega} P(s'|s, r) Q(r'|r) \bigl[\vartheta_2^{m}(r', s') - \vartheta_1^{m}(r', s')\bigr] \ge 0, \tag{10}$$

where the inequality in (10) follows from the induction hypothesis. Thus, $w_2^{m+1}(r, s) \ge w_1^{m+1}(r, s)$ for all $(r, s) \in \Upsilon \setminus \Theta$. By (4) and (5), and using the facts that $w_2^{m+1}(r, s) \ge w_1^{m+1}(r, s)$ for all $(r, s) \in \Upsilon \setminus \Theta$ and $\vartheta_1^{m}(r, 0)$ is nondecreasing in $r \in \Psi$ (the proof of this auxiliary result is given in the proof of Proposition 2), $\vartheta_2^{m+1}(r, s) \ge \vartheta_1^{m+1}(r, s)$ for all $(r, s) \in \Upsilon$, and the result follows by the convergence of the value iteration algorithms.

4. Numerical examples

In this section, we present two numerical examples to illustrate the structure of the optimal replacement policy in the CEM and the UEM. We use the same parameter values for both models. We assume a threshold shock number $S = 10$ and 10 different arrival rates. In other words, we let $\Omega = \{0, 1, \ldots, 10\}$, $\Psi = \{0, 1, 2, 3, \ldots, 9\}$, and set $\lambda_r = r + 1$ for $r \in \Psi$. We also assume the following IFR $Q$ matrix:

$Q(r|r) = 0.4$, $Q(r+1|r) = 0.3$, $Q(r+2|r) = 0.2$, $Q(r+3|r) = 0.1$, for $r = 0, 1, \ldots, 6$;
$Q(7|7) = 0.4$, $Q(8|7) = 0.3$, $Q(9|7) = 0.3$,
$Q(8|8) = 0.65$, $Q(9|8) = 0.35$, and
$Q(8|9) = 0.2$, $Q(9|9) = 0.8$.

We let the time between two consecutive epochs be $\tau = 1$ unit, and without assuming an explicit probability distribution for the random magnitudes of shocks, we set $F(\kappa) = 0.8$. Regarding the cost parameters of the problem, we assume $I = 0$, $\beta = 0.97$, and set $C = \eta = 5$ for the CEM and $C = 5$, $\eta = 15$ for the UEM.

Fig. 1 demonstrates the fact that the optimal policy is of control-limit type in both components of the state of the process for the CEM. For instance, given the environment is in state $r = 4$, it is optimal to replace the system only if it has received 4 or more shocks. From Fig. 1, it is also apparent that as the shocks arrive more frequently (i.e., the environment worsens) the decision-maker becomes more willing to replace the system (Corollary 2). For instance, given the environment occupies state $r = 0$, replacement is not optimal at any level, whereas it becomes optimal regardless of the system's cumulative number of shocks when the environment is in state $r = 5$ or higher.

Fig. 2. Optimal policy in the UEM. (Figure not reproduced; horizontal axis: environment state $r = 0, \ldots, 9$; vertical axis: cumulative number of shocks $s = 0, \ldots, 10$; regions labeled REPLACE and WAIT.)

In Fig. 2, note that the optimal policy for the UEM exhibits control-limit structure only with respect to the cumulative number of shocks, which verifies the fact that $Q$ being IFR is not sufficient to have a control-limit structure with respect to the arrival rate of the shocks in the UEM. Also, from Figs. 1 and 2, it is clear that the structure of the optimal policy in the UEM differs from that of the CEM, and we intuitively explain the single convex replacement region in the UEM policy as follows: Faster shock arrival rates induce earlier replacement, but only up to a point after which the decision-maker becomes less and less willing to pay to replace the system.
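The $Q$ matrix of this section can be written down and checked directly. The following Python sketch builds it from the entries listed above and verifies that every row sums to one and that the tail sums are nondecreasing down the rows (Definition 1). The function name `build_q`, the variable names, and the floating-point tolerance are hypothetical choices for illustration; `Q[r][r2]` stands for $Q(r'|r)$.

```python
def build_q():
    """Arrival-rate transition matrix from the numerical example (10 x 10)."""
    n = 10
    Q = [[0.0] * n for _ in range(n)]
    for r in range(7):                                  # rows r = 0, ..., 6
        for step, p in enumerate((0.4, 0.3, 0.2, 0.1)):
            Q[r][r + step] = p
    Q[7][7], Q[7][8], Q[7][9] = 0.4, 0.3, 0.3
    Q[8][8], Q[8][9] = 0.65, 0.35
    Q[9][8], Q[9][9] = 0.2, 0.8
    return Q


Q = build_q()
rates = [r + 1 for r in range(10)]                      # lambda_r = r + 1

# Each row is a probability distribution (up to rounding).
rows_ok = all(abs(sum(row) - 1.0) < 1e-12 for row in Q)

# IFR check: tail sums from column k onward never decrease from row i to row i + 1.
ifr_ok = all(
    sum(Q[i + 1][k:]) >= sum(Q[i][k:]) - 1e-12
    for i in range(9)
    for k in range(10)
)
```

Both checks pass for the matrix as stated, which is consistent with the text's description of $Q$ as IFR.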

Acknowledgments

The authors would like to thank Jeffrey P. Kharoufeh for his insightful comments, which helped us to improve the manuscript. Murat Kurt is also especially thankful to Andrew J. Schaefer for his support during the study. The authors also appreciate the constructive comments of an anonymous reviewer.

References

[1] T. Nakagawa, Shock and Damage Models in Reliability Theory, in: SpringerSeries in Reliability Engineering, 2007.

[2] I. Igaki, U. Sumita, M. Kowada, Analysis of Markov renewal shock models,Journal of Applied Probability 32 (3) (1995) 821–831.

[3] J.P. Kharoufeh, D. Mixon, On a Markov-modulated shock and wear process,Technical Report, University of Pittsburgh, 2008.

[4] G. Li, J. Luo, Shock model in Markovian environment, Naval Research Logistics52 (3) (2005) 253–260.

[5] J.G. Shanthikumar, U. Sumita, General shock models associated with correlated renewal sequences, Journal of Applied Probability 20 (3) (1983) 600–614.

[6] U. Sumita, J.G. Shanthikumar, A class of correlated cumulative shock models,Advances in Applied Probability 17 (2) (1985) 347–366.

[7] K.H. Waldmann, Optimal replacement under additive damage in randomlyvarying environments, Naval Research Logistics 30 (3) (1983) 377–386.

[8] S. Chatterjee, R.B. Misra, S.S. Alama, A generalised shock model for softwarereliability, Computers and Electrical Engineering 24 (5) (1998) 363–368.

[9] H. Pham, M. Pham, Software reliability models for critical applications,Technical Report, Idaho National Engineering Laboratory, 1991.

[10] R.K. Nurani, J.S. Seshadri, G. Shanthikumar, Optimal control of a single stage production system subject to random process shifts, Operations Research 45 (5) (1997) 713–724.

[11] M.L. Puterman, Markov Decision Processes, John Wiley and Sons, New York,1994.

[12] R.E. Barlow, F. Proschan, Mathematical Theory of Reliability, John Wiley & Sons, New York, 1965.

[13] M. Chen, R.M. Feldman, Optimal replacement policies with minimal repair and age-dependent costs, European Journal of Operational Research 98 (1) (1997) 75–84.

[14] W.P. Pierskalla, J.A. Voelker, A survey of maintenance models: The control and surveillance of deteriorating systems, Naval Research Logistics Quarterly 23 (3) (1976) 353–388.

[15] C. Valdez-Flores, R.M. Feldman, A survey of preventive maintenance models for stochastically deteriorating single-unit systems, Naval Research Logistics 36 (4) (1989) 419–446.