ECE 645 – Hypothesis Testing (cont’d.)
J. V. Krogmeier
February 21, 2014
Contents
1 Composite Hypothesis Testing
2 Bayesian Formulation
  2.1 Specialization of Bayesian for uniform costs
  2.2 Example: Testing on the radius of a point in the plane
3 Uniformly Most Powerful Tests
  3.1 Example: UMP Tests Don't Always Exist
  3.2 Example: UMP Testing of Location
  3.3 Unbiasedness
4 Locally Most Powerful Tests
  4.1 The general structure of LMP
5 Generalized Likelihood Ratio Tests
1 Composite Hypothesis Testing
• Previously we considered the case where under each hypothesis there was only one
  possible distribution for the observation:

      H0 : Y ∼ P0   vs.   H1 : Y ∼ P1.

  This is called a simple hypothesis test.
• Now consider the case where under each hypothesis there are many possible distribu-
tions for the observation Y . Hypotheses of this type are known as composite hypotheses.
• Consider a family {Pθ : θ ∈ Λ} of probability distributions on the observation
  space Γ. Here the parameter set is a disjoint union Λ = Λ0 ∪ Λ1 and the two
  hypotheses are

      H0 : Y ∼ Pθ, θ ∈ Λ0   vs.   H1 : Y ∼ Pθ, θ ∈ Λ1.
• Consider two approaches: Bayesian and non-Bayesian.
2 Bayesian Formulation
• Assume the parameter is a rv Θ taking values in Λ and that Pθ is the conditional
distribution of Y given Θ = θ.
• Wish to make a binary decision based on the observation Y = y about which of the
two sets Λ0 and Λ1 contains Θ = θ. Consider only non-randomized decision rules.
• Cost function. C(i, θ), for i = 0, 1 and θ ∈ Λ, is the cost of choosing decision i when
Y ∼ Pθ. For simplicity, assume that C is non-negative and bounded.
• Conditional risks (for a decision rule δ), where Eθ{·} denotes expectation
  assuming Y ∼ Pθ:

      Rθ(δ) = Eθ{C(δ(Y ), θ)}, for θ ∈ Λ.

• Average or Bayes risk:

      r(δ) = E{RΘ(δ)}.

  A Bayes rule is one that minimizes the Bayes risk.

• Using iterated expectations (E{X} = E{E{X|Y }}) and the definition
  Eθ{C(δ(Y ), θ)} = E{C(δ(Y ),Θ)|Θ = θ}:

      r(δ) = E{E{C(δ(Y ),Θ)|Θ}}
           = E{C(δ(Y ),Θ)}
           = E{E{C(δ(Y ),Θ)|Y }}.
• The last relation implies that r(δ) is minimized over δ if for each y ∈ Γ, δ(y) is chosen
to be the decision that minimizes the posterior cost
E{C(δ(Y ),Θ)|Y = y} = E{C(δ(y),Θ)|Y = y}.
• Since δ(y) can be only 0 or 1, a Bayes rule for this problem is given by

      δB(y) = 1      if E{C(1,Θ)|Y = y} < E{C(0,Θ)|Y = y}
              0 or 1 if E{C(1,Θ)|Y = y} = E{C(0,Θ)|Y = y}
              0      if E{C(1,Θ)|Y = y} > E{C(0,Θ)|Y = y},

  i.e., δB chooses the hypothesis that is least costly, on average, given the
  observation. For Λ = {0, 1} this reduces to the Bayes rule for simple hypothesis
  testing, which also had the interpretation of minimizing the posterior cost.
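As an illustration (not part of the notes), the posterior-cost minimization above can be carried out numerically for a hypothetical discrete parameter: Θ uniform on Λ0 = {−1, 0} and Λ1 = {1, 2}, with Y |Θ = θ ∼ N(θ, 1) and uniform costs. All numbers here are made up for the sketch.

```python
from statistics import NormalDist

# Hypothetical setup: Theta takes values in Lambda0 = {-1, 0} and
# Lambda1 = {1, 2}; given Theta = theta, Y ~ N(theta, 1).
lambda0, lambda1 = [-1.0, 0.0], [1.0, 2.0]
prior = {-1.0: 0.25, 0.0: 0.25, 1.0: 0.25, 2.0: 0.25}   # prior on Theta
# Uniform costs: 0 for a correct decision, 1 for an error
cost = lambda i, theta: 0.0 if (i == 1) == (theta in lambda1) else 1.0

def delta_B(y):
    """Bayes rule: choose the decision with the smaller posterior cost."""
    # Unnormalized posterior pi(theta | y) proportional to f(y|theta) pi(theta)
    post = {th: NormalDist(th, 1.0).pdf(y) * prior[th] for th in prior}
    z = sum(post.values())
    post = {th: p / z for th, p in post.items()}
    # Posterior expected cost E{C(i, Theta) | Y = y} for each decision i
    pc = [sum(cost(i, th) * post[th] for th in post) for i in (0, 1)]
    return 0 if pc[0] < pc[1] else 1

print(delta_B(3.0), delta_B(-2.0))   # -> 1 0
```

With uniform costs the posterior cost of deciding i is just the posterior probability of the opposite set, so large observations favor Λ1 and small ones favor Λ0.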
2.1 Specialization of Bayesian for uniform costs
• Suppose that the cost function is uniform over the two sets Λ0 and Λ1, i.e.,

      C(i, θ) = cij, for θ ∈ Λj, i, j ∈ {0, 1}.
• Then, under the reasonable assumption c11 < c01,

      δB(y) = 1      if P (Θ ∈ Λ1|Y = y)/P (Θ ∈ Λ0|Y = y) > (c10 − c00)/(c01 − c11)
              0 or 1 if P (Θ ∈ Λ1|Y = y)/P (Θ ∈ Λ0|Y = y) = (c10 − c00)/(c01 − c11)
              0      if P (Θ ∈ Λ1|Y = y)/P (Θ ∈ Λ0|Y = y) < (c10 − c00)/(c01 − c11).
• In addition, if we assume that Y has conditional densities fY (y|Θ ∈ Λ0) and
  fY (y|Θ ∈ Λ1), then the test can be rewritten (using Bayes’ formula3) as

      δB(y) = 1      if L(y) > π0(c10 − c00)/(π1(c01 − c11))
              0 or 1 if L(y) = π0(c10 − c00)/(π1(c01 − c11))
              0      if L(y) < π0(c10 − c00)/(π1(c01 − c11)),

  where L(y) = fY (y|Θ ∈ Λ1)/fY (y|Θ ∈ Λ0) and π0 = P (Θ ∈ Λ0), π1 = P (Θ ∈ Λ1).
• This reduces the problem back to that of simple Bayesian hypothesis testing.
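A quick numerical sketch of this reduction (illustrative model, not from the notes): with Θ uniform on Λ0 = {−1, 0} and Λ1 = {1, 2}, Y |Θ = θ ∼ N(θ, 1), and uniform costs c00 = c11 = 0, c01 = c10 = 1, the threshold is π0/π1 = 1 and the conditional densities are equal-weight mixtures.

```python
from statistics import NormalDist

def f_cond(y, thetas):
    # Conditional density f_Y(y | Theta in set): equal-weight mixture here,
    # because the (hypothetical) prior is uniform within each set
    return sum(NormalDist(th, 1.0).pdf(y) for th in thetas) / len(thetas)

def delta_lrt(y, pi0=0.5, pi1=0.5):
    # Simple-hypothesis LRT form: compare L(y) to pi0 (c10-c00) / (pi1 (c01-c11)) = 1
    L = f_cond(y, [1.0, 2.0]) / f_cond(y, [-1.0, 0.0])
    return 1 if L > pi0 / pi1 else 0

print(delta_lrt(3.0), delta_lrt(-2.0))   # -> 1 0
```

The decisions agree with the posterior-cost form of the Bayes rule, as the reduction predicts.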
2.2 Example: Testing on the radius of a point in the plane
3 Bayes’ formula says:

      P (Θ ∈ Λj |Y = y) = fY (y|Θ ∈ Λj)P (Θ ∈ Λj) / [fY (y|Θ ∈ Λ0)P (Θ ∈ Λ0) + fY (y|Θ ∈ Λ1)P (Θ ∈ Λ1)].
3 Uniformly Most Powerful Tests
The case considered here is that θ is modeled as a deterministic but unknown constant
(or as random, but with unknown statistics). A Bayes test is then not meaningful, so we
turn to Neyman-Pearson methods.
• The parameter set is given as a disjoint union Λ = Λ0 ∪ Λ1. Hypothesis H0 corre-
  sponds to a state of nature Pθ with θ ∈ Λ0, and similarly for H1.
• Let δ(y) be a randomized decision rule for H0 vs. H1. Define
– False-alarm probabilities.
PF (δ; θ) = Eθ{δ(Y )}, for θ ∈ Λ0.
– Detection probabilities.
PD(δ; θ) = Eθ{δ(Y )}, for θ ∈ Λ1.
• A Uniformly Most Powerful (UMP) test of level α is one that maximizes
PD(δ; θ)
for every θ ∈ Λ1 subject to
PF (δ; θ) ≤ α
for all θ ∈ Λ0.
• UMP tests do not always exist.
3.1 Example: UMP Tests Don’t Always Exist
• Λ = Λ0 ∪ Λ1
• Suppose H0 is simple, i.e., Λ0 = {θ0}
• Suppose that Pθ has a density fθ(·) for each θ ∈ Λ and consider the Neyman-Pearson
  problem for testing

      H0 : Y ∼ Pθ0   vs.   Hθ : Y ∼ Pθ

  for some fixed θ ∈ Λ1. To be clear, at the moment we are considering a simple
  hypothesis test.
• We know from the NPL that there exists a most powerful α-level test for this problem
with a critical region of the form

      Γθ = {y ∈ Γ : fθ(y) > τ fθ0(y)},

where τ and a possible randomization are chosen to give a size-α test. Also from the
NPL we know that the test is essentially unique and that any other α-level test will
have smaller power.
• So for two distinct parameter values θ′, θ′′ ∈ Λ1 the test with critical region Γθ′ will
have a smaller power for testing
H0 vs. Hθ′′
than the test with Γθ′′ (and vice versa) unless Γθ′ and Γθ′′ are essentially identical.
A UMP test for
H0 vs. Hθ : Y ∼ Pθ, θ ∈ Λ1
exists if and only if the critical region Γθ is (essentially) the same for all
θ ∈ Λ1.
Synonymously, we can say a UMP test exists if and only if the LRT for every θ ∈ Λ1
can be completely defined (including threshold) without knowledge of θ.
3.2 Example: UMP Testing of Location
• Consider the family of distributions {Pθ : θ ∈ Λ} where Λ is a subset of R and Pθ is
N (θ, σ2).
• Consider the hypothesis pair

      H0 : θ = µ0   vs.   H1 : θ > µ0
where µ0 is a fixed real number. Therefore, we have a simple null hypothesis Λ0 =
{µ0} and a composite alternative Λ1 = (µ0, ∞).
• From the previous example (i.e., location testing with Gaussian error) we know that
for each fixed θ ∈ Λ1 the most powerful α-level test for H0 versus Y ∼ N (θ, σ2) has
a critical region

      Γθ = {y ∈ Γ : y > σΦ−1(1 − α) + µ0}.

This region does not depend upon θ (note that θ is restricted to be > µ0) and thus
it gives a UMP test for H0 : θ = µ0 vs. H1 : θ > µ0.
• Let δ1 denote the decision rule for the critical region Γθ (as seen previously, random-
ization is not required). The detection probabilities are:
      PD(δ1; θ) = 1 − Φ( Φ−1(1 − α) − (θ − µ0)/σ ),   for θ > µ0.
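As a numerical sanity check (not part of the notes), the critical value and the power formula above can be verified with Python's standard library; the values α = 0.05, µ0 = 1, σ = 2, θ = 3 are illustrative.

```python
from statistics import NormalDist
import random

# UMP one-sided Gaussian test: reject when y > sigma * Phi^{-1}(1 - alpha) + mu0
alpha, mu0, sigma = 0.05, 1.0, 2.0
tau = sigma * NormalDist().inv_cdf(1 - alpha) + mu0   # critical value

def P_D(theta):
    # Power formula from the notes: 1 - Phi( Phi^{-1}(1-alpha) - (theta - mu0)/sigma )
    return 1 - NormalDist().cdf(NormalDist().inv_cdf(1 - alpha) - (theta - mu0) / sigma)

# Size: at theta = mu0 the formula gives exactly alpha
print(round(P_D(mu0), 6))   # -> 0.05

# Monte Carlo check of the power at some theta > mu0
random.seed(0)
theta = 3.0
hits = sum(random.gauss(theta, sigma) > tau for _ in range(200_000)) / 200_000
print(abs(hits - P_D(theta)) < 0.01)
```

The simulated rejection rate matches the closed-form power, and setting θ = µ0 confirms that the test has size α.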
• Now for the same family of distributions consider the hypothesis testing problem
      H0 : θ = µ0   vs.   H1 : θ ≠ µ0
where µ0 is a fixed real number. Therefore, we have the same simple null hypothesis
Λ0 = {µ0} and a new composite alternative Λ1 = (−∞, µ0) ∪ (µ0, ∞).
• For θ > µ0 the critical region of the most powerful α-level test is as before, but for
θ < µ0 it is different. With the two cases included in the same formula:
Γθ =
{y ∈ Γ : y > σΦ−1(1− α) + µ0} for θ > µ0
{y ∈ Γ : y < σΦ−1(α) + µ0} for θ < µ0.
In the sense that the critical region “switches” as a function of θ between these two
cases, it depends upon θ and
No UMP test exists for this two-sided hypothesis testing problem.
• Let δ2 be the test corresponding to critical region Γθ with θ < µ0. Then we can show
      PD(δ2; θ) = Φ( Φ−1(α) − (θ − µ0)/σ ),   for θ < µ0.
• We can certainly extend the definitions of PD(δ1; θ) and PD(δ2; θ) to θ ∈ R and then
plot the two power functions together on the same axis. Note that PD(δ1; θ) increases
as θ increases while PD(δ2; θ) decreases as θ increases. The curves cross when θ is
such that

      Φ( Φ−1(α) − (θ − µ0)/σ ) = 1 − Φ( Φ−1(1 − α) − (θ − µ0)/σ ),

which happens for θ = µ0, where both sides equal α.
• This shows that neither test performs well outside of its region of optimality. A more
  reasonable test than either δ1 or δ2 would compare |y − µ0| to a threshold, but it
  cannot be UMP for H0 : θ = µ0 vs. H1 : θ ≠ µ0 because no UMP test exists for this
  problem.
3.3 Unbiasedness
• The previous example illustrates that the UMP criterion is too strong for some
  problems: it is not useful to aim for an optimality criterion that no test can satisfy.
• Sometimes we can overcome the difficulty by imposing additional constraints that
  eliminate unreasonable tests.
• One such condition is to require unbiasedness4, meaning we require

      PD(δ; θ) ≥ α

  for all θ ∈ Λ1 in addition to the constraint PF (δ; θ) ≤ α for all θ ∈ Λ0. This would
  have eliminated both δ1 and δ2 from consideration in the previous example.
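To see concretely why unbiasedness rules out δ1 in the two-sided problem, one can evaluate its power function for θ < µ0 (a sketch with illustrative values α = 0.05, µ0 = 0, σ = 1):

```python
from statistics import NormalDist

# The one-sided test delta_1 is biased for the two-sided problem: its power
# 1 - Phi( Phi^{-1}(1-alpha) - (theta - mu0)/sigma ) drops below alpha for
# theta < mu0, violating P_D(delta; theta) >= alpha on Lambda1.
alpha, mu0, sigma = 0.05, 0.0, 1.0

def power_delta1(theta):
    return 1 - NormalDist().cdf(NormalDist().inv_cdf(1 - alpha) - (theta - mu0) / sigma)

print(power_delta1(-1.0) < alpha)             # power below alpha at theta = -1: biased
print(abs(power_delta1(mu0) - alpha) < 1e-9)  # power equals alpha at theta = mu0
```

At θ = −1 the power is roughly 0.004, far below α = 0.05, so δ1 detects left-sided alternatives less often than a coin flip with bias α would.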
4 Locally Most Powerful Tests
• Consider the case where Λ is of the form [θ0,∞) with Λ0 = {θ0} and Λ1 = (θ0,∞).
• Such comes up in many signal detection problems in which θ0 = 0 and θ is a signal
amplitude parameter.
• Often we are primarily interested in the case where, under H1, θ is close to θ0. When
  θ is a signal amplitude parameter this would correspond to small signal strength (i.e.,
  the low signal-to-noise ratio regime).

4 Recalling the definitions of the detection and false alarm probabilities, we can give a
more symmetric definition of unbiasedness: δ is unbiased in this composite binary
hypothesis testing problem if

      Eθ{δ(Y )} ≥ α for all θ ∈ Λ1,   and   Eθ{δ(Y )} ≤ α for all θ ∈ Λ0.
• Consider a decision rule δ. Then subject to regularity conditions we may expand
PD(δ; θ) in a Taylor series about θ0:
      PD(δ; θ) = PD(δ; θ0) + (θ − θ0)P ′D(δ; θ0) + O((θ − θ0)²)

  where P ′D(δ; θ0) = ∂/∂θ PD(δ; θ) evaluated at θ = θ0.
• Note that PD(δ; θ0) = PF (δ), so for all size-α tests we see that for θ near θ0

      PD(δ; θ) ≈ α + (θ − θ0)P ′D(δ; θ0).
Conclude that for θ near θ0 we can achieve an approximate maximum power with size
α by choosing δ to maximize P ′D(δ; θ0).
• A test which maximizes P ′D(δ; θ0) subject to a false alarm constraint PF (δ) ≤ α is
called an α-level locally most powerful (LMP) test or simply a locally optimum test.
4.1 The general structure of LMP
• Assume that Pθ has density fθ for each θ ∈ Λ1. Then we can write
      PD(δ; θ) = Eθ{δ(Y )} = ∫Γ δ(y) fθ(y) dy.
• If the family {fθ(y) : θ ∈ Λ1} is sufficiently regular that differentiation wrt θ and
integration wrt y may be interchanged then
      P ′D(δ; θ0) = ∫Γ δ(y) [∂/∂θ fθ(y)]|θ=θ0 dy.
• Comparison of this expression with our previous work on NP testing for simple hy-
potheses shows that the α-level LMP problem is the same as the α-level NP design
problem where we replace f1(y) with
      [∂/∂θ fθ(y)]|θ=θ0.
• From this analogy (within regularity) an α-level LMP test for H0 : θ = θ0 vs.
H1 : θ > θ0 is given by
      δlo(y) = 1   if [∂/∂θ fθ(y)]|θ=θ0 > η fθ0(y)
               γ   if [∂/∂θ fθ(y)]|θ=θ0 = η fθ0(y)
               0   if [∂/∂θ fθ(y)]|θ=θ0 < η fθ0(y),

  where η and γ are chosen so that PF (δlo) = α.
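For the Gaussian location family of the earlier example, the LMP statistic has a simple closed form: with fθ = N(θ, σ²), ∂/∂θ fθ(y)|θ=θ0 = ((y − θ0)/σ²) fθ0(y), so the ratio to fθ0(y) is linear in y and the LMP test again compares y to a threshold. A small sketch (illustrative values θ0 = 0, σ = 1.5) checks this identity against a finite-difference derivative:

```python
from statistics import NormalDist

# For f_theta = N(theta, sigma^2):
#   d/dtheta f_theta(y) |_{theta=theta0} = ((y - theta0)/sigma^2) f_{theta0}(y),
# so the LMP statistic is linear in y.
theta0, sigma = 0.0, 1.5

def dfdtheta(y, h=1e-6):
    # Central finite-difference approximation of the theta-derivative at theta0
    return (NormalDist(theta0 + h, sigma).pdf(y)
            - NormalDist(theta0 - h, sigma).pdf(y)) / (2 * h)

for y in (-1.0, 0.5, 2.0):
    exact = (y - theta0) / sigma ** 2 * NormalDist(theta0, sigma).pdf(y)
    assert abs(dfdtheta(y) - exact) < 1e-8
print("LMP statistic for the Gaussian location family is linear in y")
```

This is consistent with the UMP result for this family: locally optimum and uniformly optimum tests coincide here.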
5 Generalized Likelihood Ratio Tests
• When none of the above approaches applies, a commonly used test for composite
  hypotheses is to compare

      maxθ∈Λ1 fθ(y) / maxθ∈Λ0 fθ(y)

  to a threshold.
• Called a generalized likelihood ratio test (GLRT) or maximum likelihood test.
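As a sketch of the GLRT in the two-sided Gaussian example from Section 3.2 (illustrative values µ0 = 0, σ = 1; ignoring the measure-zero case y = µ0, where the supremum over θ ≠ µ0 is attained only in the limit): the numerator supremum is attained at the MLE θ̂ = y, so the generalized likelihood ratio reduces to exp((y − µ0)²/(2σ²)), which is monotone increasing in |y − µ0|.

```python
import math

# GLRT for H0: theta = mu0 vs H1: theta != mu0, Y ~ N(theta, sigma^2).
# sup over theta of f_theta(y) is f_y(y), so
#   GLR(y) = f_y(y) / f_{mu0}(y) = exp((y - mu0)^2 / (2 sigma^2)).
mu0, sigma = 0.0, 1.0

def glr(y):
    return math.exp((y - mu0) ** 2 / (2 * sigma ** 2))

# Monotone in |y - mu0|: thresholding GLR(y) is equivalent to
# thresholding |y - mu0|, the "reasonable" test from Section 3.2
print(glr(2.0) > glr(1.0) > glr(0.0) == 1.0)
```

So the GLRT recovers the |y − µ0| test that Section 3.2 suggested but that could not be justified by the UMP criterion.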