Analysis of tests of local realism
by
Yanbao Zhang
B.S., University of Science and Technology of China, 2006
A thesis submitted to the
Faculty of the Graduate School of the
University of Colorado in partial fulfillment
of the requirements for the degree of
Doctor of Philosophy
Department of Physics
2013
This thesis entitled: Analysis of tests of local realism
written by Yanbao Zhang has been approved for the Department of Physics
Emanuel Knill
Sae Woo Nam
Date
The final copy of this thesis has been examined by the signatories, and we find that both the content
and the form meet acceptable presentation standards of scholarly work in the above-mentioned discipline.
Zhang, Yanbao (Ph.D., Physics)
Analysis of tests of local realism
Thesis directed by Dr. Emanuel Knill
Reliable and loophole-free demonstrations of the violation of local realism (LR) are highly
desirable not only for understanding the foundation of quantum mechanics but also for facilitating
quantum information processing, such as quantum key distribution and randomness expansion.
To date, LR has been experimentally violated, but with loopholes, by testing predetermined Bell
inequalities. This thesis presents a framework for verifying and quantifying the violation of LR
without relying on a particular Bell inequality.
First, the experimental resources, such as the quantum state, measurement settings, detection
efficiency, and visibility required for a violation of LR, are studied via a measure called the statistical
strength. The higher the statistical strength, the more confidence in a violation of LR one has after
observing a sufficiently large amount of experimental data. In particular, we study the minimum detection
efficiency required to achieve any given statistical strength level in tests of LR with entangled states
created from two independent polarized photons passing through a polarizing beam splitter. It is
shown that, compared with photon detectors, photon counters make violations of LR easier to
detect for any nonzero probability of multiple photons in an output beam of the polarizing beam
splitter.
Second, to quantify the statistical evidence against LR obtained from a finite number of
experimental data, one can choose a test statistic, such as a Bell-inequality violation, to measure
the amount of violation of LR. It is desirable to bound the probability, according to LR, of obtaining
a test statistic at least as extreme as that observed. This probability is known as a p-value for
the hypothesis test of LR. We propose a protocol to bound such a p-value. The bound provided is
asymptotically tight, if the prepared quantum state and measurement settings are stable during an
experiment. Therefore, the proposed protocol is asymptotically optimal, and the bound provided
is a standardized measure of success for experimental tests of LR. One can quantitatively compare
different experimental tests based on this bound. Moreover, the bound provided is valid even if the
quantum state varies arbitrarily and local realistic models depend on previous measurement settings
and outcomes. Hence, this bound facilitates device-independent and nonlocality-based quantum
information processing. For comparison, bounds of p-values derived from Bell-inequality violations
using the number of standard deviations of violation of a Bell inequality or using martingale theory
are studied. It is found that putative bounds derived from the number of standard deviations of
violation are not valid and bounds from martingale theory are not tight.
Finally, a simplified and efficient data analysis protocol using a set of Bell inequalities is
proposed and compared with the above optimal and martingale-based protocols. The simplified
protocol provides p-value bounds at least as tight as, and typically tighter than, those of the
martingale-based protocol, and its bounds can even be asymptotically tight. Moreover, the simplified protocol
can be applied to any test with linear witnesses, such as tests for verifying entanglement, system
dimensionality, or steering.
Dedication
To my parents.
Acknowledgements
I would like to thank my advisor, Manny Knill. Manny has a tremendous number of ideas
and amazing intuition about how to solve problems. No matter what kind of problem I encountered
during my research, he always gave me helpful advice. During these years, I have learned from him
not only a great deal of physics and mathematics but also critical thinking. I cannot express all my
thanks to him.
I have also benefited greatly from working with Scott Glancy. Scott is especially good at
simplifying problems and making research and writing easier to understand. He helped me a
great deal with writing and presenting my work. I also enjoyed the time spent with Adam Meier,
Bryan Eastin, and Mike Mullan. They, especially Adam, helped me greatly with presenting my
work and improving my English.
I also would like to thank many other people at NIST and JILA, including, but not limited
to, Lorna Buhse, Kevin Coakley, Sae Woo Nam, Alan Migdall, Thomas Gerrits, Dominic Meiser,
and Murray Holland. Lorna helped me with all my conference travel. Kevin helped me a great
deal with mathematical statistics. I learned much and became interested in quantum optics through
discussions with Sae Woo, Alan, Thomas, Dominic, and Murray.
Contents
Chapter
1 Introduction 1
1.1 Why test local realism? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Bell’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Overview of the thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.4 Contents of the thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2 Bell inequalities 8
2.1 Locality, realism, and Bell inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 Geometric interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2.1 Special case: The CHSH inequality . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2.2 The general case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3 Various Bell inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3.1 Bell inequalities with many settings . . . . . . . . . . . . . . . . . . . . . . . 16
2.3.2 Bell inequalities with many outcomes . . . . . . . . . . . . . . . . . . . . . . 18
2.3.3 Bell inequalities with many parties . . . . . . . . . . . . . . . . . . . . . . . . 19
2.3.4 Derivation of Bell inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.4 Bell inequality and entanglement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.5 Bell inequality, steering, and contextuality . . . . . . . . . . . . . . . . . . . . . . . . 29
2.6 Bell inequality and private information . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3 Challenges of testing local realism 32
3.1 Experimental configuration for testing local realism . . . . . . . . . . . . . . . . . . . 32
3.2 The locality loophole . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.3 The detection loophole . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.4 The memory loophole . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.5 Possibilities of loophole-free violations of LR . . . . . . . . . . . . . . . . . . . . . . . 36
4 Statistical strength of experiments for rejecting local realism 39
4.1 Experimental configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.2 Data analysis method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
4.3 Results and discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.4 Concluding remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5 Asymptotically optimal data analysis for rejecting local realism 56
5.1 Statistical concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.2 Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.2.1 Bell functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.2.2 SD-based protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.2.3 Martingale-based protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.2.4 PBR protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.3 Technical details for applying the PBR protocol . . . . . . . . . . . . . . . . . . . . . 70
5.3.1 Estimating the experimental probability distribution . . . . . . . . . . . . . . 70
5.3.2 Effects of bad estimates of true distributions and optimal LR models . . . . . 73
5.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.4.1 Confidence-gain rates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.4.2 Application to experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
6 Efficient quantification of experimental violation of local realism 81
6.1 Simplified PBR protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
6.2 Protocol comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.2.1 Computational resource comparison . . . . . . . . . . . . . . . . . . . . . . . 85
6.2.2 Comparison of confidence-gain rates . . . . . . . . . . . . . . . . . . . . . . . 88
6.2.3 Comparison of protocols’ behavior for finite data . . . . . . . . . . . . . . . . 91
6.3 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
7 Conclusions and future directions 98
7.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
7.2 Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
Bibliography 100
Appendix
A User guide of the local realism analysis engine 111
A.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
A.2 LRE state . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
A.2.1 Experimental configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
A.2.2 Analysis and display variables . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
A.2.3 Data dependent variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
A.3 LRE interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
A.4 LRE support functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
A.5 LRE usage examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
A.5.1 Analyzing an existing data set . . . . . . . . . . . . . . . . . . . . . . . . . . 129
A.5.2 Monitoring an experiment in progress . . . . . . . . . . . . . . . . . . . . . . 133
A.6 Technical notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
A.6.1 Data half-lives and data weights . . . . . . . . . . . . . . . . . . . . . . . . . 137
B Optimization results for Chapter 4 141
B.1 Results for unbalanced Bell states using photon counters or detectors . . . . . . . . . 142
B.2 Tradeoff between visibility and efficiency using unbalanced Bell states where S = 10^−6 . . 145
B.3 Results for pseudo-Bell states using photon counters . . . . . . . . . . . . . . . . . . 146
B.4 Results for pseudo-Bell states using photon detectors . . . . . . . . . . . . . . . . . . 148
Tables
Table
4.1 Extreme conditions for tests of LR free of the detection loophole for photon counters
or photon detectors using the unbalanced Bell states |ψuB〉 defined in Eq. (4.11).
The asymptotic behavior when θ → 0 is consistent with results in Ref. [1], which are
shown in the last row. The angle parameters are explained in the text. . . . . . . . . 48
4.2 Extreme conditions for tests of LR free of the detection loophole for photon counters
and photon detectors using the pseudo-Bell states of Eq. (4.13). The angle parame-
ters are explained in the text. The minimum detection efficiencies for counters and
detectors when γ = 45◦ are the same as those found in Ref. [2]. . . . . . . . . . . . . 53
Figures
Figure
2.1 The regions achievable by LR, quantum mechanics, and all physical theories satis-
fying no signaling. Any correlation vector inside black squares is achievable under
the no-signaling conditions as in Eq. (2.8). The quantum convex set Q and the
LR polytope L are bounded by red curves and blue lines (with black lines in (b)),
respectively. (a) is the situation in the subspace E(A1B1) = 1 and E(A1B2) = 0,
while (b) is the situation in the subspace E(A1B1) = E(A1B2) = 1/2. . . . . . . . . 13
3.1 The experimental procedure for testing a bipartite Bell inequality. The inset reflects
the locality condition in the space-time diagram. . . . . . . . . . . . . . . . . . . . . 33
4.1 Schematic of a test of LR with the independent photons source. Two spatially and
temporally matched polarized photons are inserted at 1 and 2. The polarization
rotators PR1 and PR2 are set so that photons 1 and 2 are linearly polarized at the
same direction when they reach the polarizing beam splitter PBS1. After PBS1, the
photons are in a nonmaximally entangled state (see Eq. (4.3)) and are sent to Alice’s
and Bob’s measurement setups. Each measurement setup uses a PR, a PBS and
two detectors. The PR is used to select measurement bases by rotating the photon’s
polarization state. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.2 Detection efficiency of photon counters or photon detectors required for different
statistical strength levels S vs the parameter θ [Eq. (4.11)]. The empty squares
show our calculated points, and the dotted lines are linear interpolations to guide
the eye. In curve a, the linear extrapolation toward θ = 0 is shown. . . . . . . . . . 50
4.3 Tradeoff between the overall minimum detection efficiency minθ ηc(θ) and the visi-
bility V of unbalanced Bell states. Here, we fix the optimal statistical strength to
10^−6. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.4 Detection efficiencies of photon counters and photon detectors required for different
statistical strength levels S vs the parameter γ of the pseudo-Bell state of Eq. (4.13):
(a) S = 0, (b) S = 5 × 10^−5, (c) S = 5 × 10^−4, and (d) S = 1.5 × 10^−3. The calculated points
are labeled by squares for photon counters and by diamonds for photon detectors,
and the dotted lines are linear interpolations to guide the eye. . . . . . . . . . . . 54
5.1 Confidence-gain rates G achieved by the SD-based, martingale-based, and PBR pro-
tocols. The gain rate G is shown for a CHSH test of LR with an unbalanced Bell state
with no loss and perfect detectors. It depends on the parameter θ in the unbalanced
Bell state |ψuB〉. Given the state parameter θ, the measurement settings are chosen
to maximize the violation of the CHSH inequality (1.2). The line corresponding to
the gain rates achieved by the SD-based protocol crosses the line corresponding to
the optimal gain rates at θ = 33.41◦. . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
5.2 The confidence-gain rate G of a CHSH test of LR with a Bell state and varying detec-
tion efficiency η and visibility V. The measurement settings are chosen to maximize
the violation of the CHSH inequality (1.2). Measurement outcomes where no particle
is detected are assigned the value −1. (a) pA1 = pB1 = 0.5, (b) pA1 = pB1 = 0.51, (c)
pA1 = pB1 = 0.52, and (d) pA1 = pB1 = 0.53, where pA1 and pB1 are the probabilities
that at each trial Alice and Bob independently choose the settings A1 and B1, respec-
tively. Note that in subplot (a) the optimal gain rates are not shown, since the
optimal gain rate is at most 6 % larger than the corresponding martingale-based
gain rate, so the difference between them would not be visible. . . . . . . . . . . . . 77
5.3 Running log-p-values as functions of the number of trials n in a CHSH test of LR
with an unbalanced Bell state cos(θ)|00〉 + sin(θ)|11〉 where θ = 22.5◦. We assume
that there is no noise or detection inefficiency and the setting distribution is uniform.
The log-p-values are computed according to the three protocols discussed. The slopes
of the straight lines are the confidence-gain rate achieved by each protocol. (a) is
for one simulation of 5000 successive trials. (b) is an average of 30 independent
simulations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.4 Running log-p-values as functions of the number of trials n in the experiment of
Ref. [3]. In this experiment, different measurement settings are chosen uniformly
randomly. The dotted lines are provided only to guide the eye. . . . . . . . . . . . . 80
6.1 Confidence-gain rates in the test of the CGLMP inequality 〈Id(X)〉 ≤ 2. Here,
we use the quantum state and measurement settings of Ref. [4], Eqs. (15) and (9),
respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.2 Confidence-gain rates in the test of LR with an unbalanced Bell state |ψ(θ)〉. The
measurement settings are chosen to maximize the violation of the CHSH inequal-
ity (1.2) given the state |ψ(θ)〉. The gain rates achieved by the simplified PBR
protocol using the CHSH inequality are shown as circles (◦), while the gain rates by
the same protocol using the CHSH inequality together with no-signaling conditions
are shown as crosses (+). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
6.3 An example of running log-p-values as functions of the number of trials n in a test
of the CGLMP inequality. The dashed and solid lines are the asymptotic lines for
log-p-values based on gain rates achieved by the (full or simplified) PBR protocol
and the martingale-based protocol, respectively. Repetitions of this Monte Carlo
simulation show similar behavior. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Chapter 1
Introduction
1.1 Why test local realism?
Theories designed according to “local realism” (LR) include a set of hidden variables, which
if known would deterministically predict all measurement outcomes. Moreover, the values of the
hidden variables cannot be influenced by spacelike-separated events, hence these hidden variables
are called local hidden variables. In 1964, Bell first showed that quantum mechanics violates LR [5].
This profound result is known as Bell’s theorem. To prove this theorem, Bell and his followers
constructed a class of inequalities, called Bell inequalities. These inequalities are satisfied by all the
predictions according to LR, but can be violated by the predictions of quantum mechanics. A test
of LR showing violation was first realized by Freedman and Clauser in 1972 [6]. Since then, many
such tests have been performed. For reviews of this field, see Refs. [7, 8, 9, 10]. Naturally, one may
ask, “Why does anyone still perform tests of LR, given the many claimed experimental violations?
Why are they important?”
Most importantly and fundamentally, the violation of LR implies that the physical description
of the world contradicts at least one of the principles—locality or realism. Locality states that two
spacelike-separated events cannot affect each other, while realism is the ability to meaningfully
speak of the definiteness of the outcomes of measurements that have not been performed. Each
principle sounds natural in practice; however, their combination does not predict measurement
results correctly. To date, no test of LR has been performed that satisfies both principles without
introducing additional assumptions, i.e., without loopholes. Accordingly, no experimental result so far conclusively rules
out LR. This is the central motivation underlying the competition to perform a loophole-free
test of LR.
Secondly, quantum physicists are trying to build quantum computers and networks. Entanglement
and violation of LR are very important resources for these tasks. (See Sec. 2.4 of Chapter 2
for the definition of entanglement and relevant discussions.) To verify and quantify these quantum
resources, Bell inequalities or generalized Bell inequalities are useful and even indispensable tools.
For example, if experimental results violate LR, a family of quantum communication protocols is
secure even against adversaries that obey causality but are not limited by the laws of quantum mechanics [11, 12, 13].
Last but not least, quantum physicists are interested in understanding and quantifying
the violation of LR achievable in quantum mechanics. As is well known, the quantum violation of
LR is less than the maximal violation of LR possible according to a theory satisfying relativistic
causality (also called no signaling) [14]. Physicists would like to verify that the violations of LR in
experiments are consistent with the predictions of quantum mechanics.
1.2 Bell’s theorem
As mentioned above, Bell’s theorem states that no local realistic (LR) theory (i.e., a theory
designed according to LR) can reproduce all the predictions of quantum mechanics. In Ref. [5],
Bell derived the following inequality
1 + E(A1B1) ≥ |E(A2B2)− E(A2B1)|, (1.1)
where E(AiBj) with i, j ∈ {1, 2} is the correlation between Alice’s and Bob’s measurements Ai and
Bj with outcomes ±1 on two separated particles. This inequality (1.1) is satisfied by a restricted set
of LR theories for which the outcomes from the two separated particles are exactly anticorrelated
when Alice’s and Bob’s measurements are A1 and B2, respectively. This restriction is reasonable
if the two particles are spin-1/2 particles and they are in the singlet state (1/√2)(|↑↓〉 − |↓↑〉), a Bell
state. For this case, Bell gave a set of measurements for which the right-hand side of Eq. (1.1) is
larger than the left-hand side, thus Eq. (1.1) is violated. However, the ideal singlet state required
in Bell’s original proof cannot be prepared in practice.
Practical experimental tests of LR were made possible by the proposal by Clauser, Horne,
Shimony and Holt in 1969 [15]. They constructed Bell inequalities satisfied by a general LR theory.
One is the Clauser-Horne-Shimony-Holt (CHSH) inequality [15]
ICHSH ≡ E(A1B1) + E(A1B2) + E(A2B1)− E(A2B2) ≤ 2, (1.2)
where the terms E(AiBj) are the same as those in Eq. (1.1). Another Bell inequality, equivalent
to the CHSH inequality but easier to test, is the Clauser-Horne (CH) inequality [16]
ICH ≡ P (A1B1) + P (A1B2) + P (A2B1)− P (A2B2)− P (A1)− P (B1) ≤ 0, (1.3)
where P (AiBj) is the probability that both measurements Ai and Bj have outcome +1 and P (Ai)
or P (Bj) is the probability that measurement Ai or Bj has outcome +1. By choosing appropriate
measurements on a Bell state, the CHSH and CH expressions ICHSH and ICH take their maximum
quantum values 2√2 and 1/√2 − 1/2, respectively. To test inequalities (1.2) or (1.3), each of two
parties—Alice and Bob—receives one particle from a common source. Each of them randomly and
independently chooses a local measurement from a set consisting of two measurements, performs
the chosen measurement on their own particle, and records the outcome. This procedure is called
a trial. After a large number of trials, Alice and Bob collect enough data and can estimate ICHSH
or ICH from these data.
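The LR bound of 2 in the CHSH inequality (1.2) can be checked directly: under LR the outcomes at each trial are deterministic functions of the hidden variables, so it suffices to enumerate the 16 deterministic assignments of ±1 outcomes to the four settings. The following sketch (an illustration, not part of the thesis's analysis code) does this, and also evaluates the quantum value for a Bell state, using the fact that for measurements in a common plane with suitably parametrized angles a_i and b_j the correlations take the form E(AiBj) = cos(a_i − b_j):

```python
from itertools import product
import math

# LR bound: enumerate all deterministic local strategies, each fixing
# outcomes a1, a2 for Alice's settings and b1, b2 for Bob's.
lr_values = [a1*b1 + a1*b2 + a2*b1 - a2*b2
             for a1, a2, b1, b2 in product([+1, -1], repeat=4)]
print(max(lr_values))  # 2, the CHSH bound under local realism

# Quantum value for a Bell state: E(Ai, Bj) = cos(a_i - b_j) for
# measurement angles a_i, b_j chosen in a common plane.
a = [0.0, math.pi / 2]           # Alice's two measurement angles
b = [math.pi / 4, -math.pi / 4]  # Bob's two measurement angles
E = lambda i, j: math.cos(a[i] - b[j])
I_chsh = E(0, 0) + E(0, 1) + E(1, 0) - E(1, 1)
print(I_chsh)  # 2*sqrt(2) ≈ 2.828, the maximum quantum value
```

The enumeration makes the geometric picture of Sec. 2.2 concrete: the 16 deterministic strategies are the vertices of the LR polytope, and the CHSH expression attains its LR maximum at a vertex.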
1.3 Overview of the thesis
This thesis tells the story of our quest to quantify the evidence against LR obtained from
experimental data. Since the first test of LR in 1972 [6], it has been a convention to test a Bell
inequality and present the result in terms of the number of experimental standard deviations (SDs)
of violation of this Bell inequality. This is a way of claiming successful violation of LR with small
measurement uncertainties. However, a large number of SDs of violation does not necessarily imply
a small p-value, where a p-value is the probability, if LR holds, of a violation at least as high as that
observed. A small p-value means that the data is significant for rejecting LR. Hence, a reliable test
of LR requires a small p-value. The main work in this thesis is about how to upper bound a p-value
for the hypothesis test of LR. Specifically, we propose a method to compute an asymptotically tight
upper bound of a p-value. The proposed method can be simplified and adapted to quantify the
experimental evidence for rejecting an arbitrary set of hypothetical probability distributions from
which the experimental probability distribution is separated by hyperplanes in the probability space.
For example, the simplified method can be applied to verify entanglement or system dimensionality
with linear witnesses.
If one pretends that the distribution of the violation of a Bell inequality, if LR holds, is
Gaussian with mean less than or equal to 0 and SD equal to the observed one, one can estimate
a p-value from the number of SDs of violation. However, as our work [17] showed, this p-value
estimate is not an upper bound of the corresponding exact p-value, and so it is not valid. A valid
p-value bound was suggested by Gill [18, 19]. But, it is based on a conservative estimate of a tail
probability, and so this bound is not tight.
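For concreteness, the SD-based estimate criticized here is computed as follows (a sketch with an illustrative number of SDs): pretending the violation is Gaussian under LR with mean 0 and the observed SD, a violation of k SDs maps to the one-sided Gaussian tail probability.

```python
import math

def gaussian_pvalue_estimate(num_sds):
    """One-sided tail of a standard normal at num_sds: the naive
    SD-based p-value estimate, which is NOT a valid bound under LR."""
    return 0.5 * math.erfc(num_sds / math.sqrt(2))

print(gaussian_pvalue_estimate(5))  # ≈ 2.87e-7 for a "5-sigma" violation
```

The thesis's point is that this number, however small, is only an estimate under an unjustified Gaussian assumption and need not upper bound the exact p-value.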
We propose a method to compute a tighter p-value bound without relying on a particular Bell
inequality [17]. Specifically, our bound is asymptotically tight with respect to the number of trials, if
the prepared quantum state and measurement settings are stable over time. Hence, the proposed
bound is a standardized measure of success for experimental tests of LR, and one can compare the
strengths of rejecting LR in different experiments based on this measure. The proposed method
works even if the prepared quantum state and measurement settings vary arbitrarily and relevant
LR models depend on previous measurement settings and outcomes. That is, our method works in a
device-independent way and is robust against the memory loophole [20, 21, 22, 23]. Computing the
proposed p-value bound requires only the sequence of measurement settings and outcomes without
using a predetermined Bell inequality. Because the proposed method is not restricted to a single
Bell inequality, it enables wider searches for strong violations of LR. Also, this method adapts to
changes in experimental configuration over the time period for acquiring experimental data. We
implement this method in Matlab and Octave both for monitoring experiments in progress and
analyzing existing data sets.
Geometrically, the probability distributions accessible by LR form a convex polytope L, while
the distributions achievable by quantum mechanics form a bigger convex set Q such that L ⊂ Q.
Each face of the convex polytope L corresponds to a Bell inequality. If the joint probability distribu-
tion q of measurement settings and outcomes given by a quantum state and a set of measurements
violates a Bell inequality, then the distribution q is not in the polytope L, i.e., q /∈ L. However,
determining the violation of a predetermined Bell inequality by the distribution q may not be the
most effective way for showing that q /∈ L. The best Bell inequality that separates the distribu-
tion q as far from the polytope L as possible depends on the relative position of q to L. Our
proposed method [17] first estimates the position of the distribution q relative to L before a trial
using previous measurement results, and then it estimates the best Bell inequality with respect to
q. If the distribution q is stable over time, these estimates are asymptotically optimal and so is
the computed p-value bound. For typical experimental configurations, such as the configuration
for testing the CHSH inequality (1.2), our proposed method [17] works efficiently. However, it is
difficult to implement this method as the configuration parameters, that is, the numbers of parties,
measurement settings, and measurement outcomes, increase.
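The statistical core of this ratio-based idea can be sketched with a toy two-outcome model (the distributions below are hypothetical and chosen only for illustration; the actual method estimates the best Bell function adaptively from previous trials). If each trial contributes a nonnegative ratio whose expectation is at most 1 under every LR model, the running product T satisfies E_LR[T] ≤ 1, and Markov's inequality gives the p-value bound min(1, 1/T):

```python
import math
import random

# Toy binary-outcome model (illustrative only): p_lr is the best LR
# prediction, q the experimental distribution that departs from it.
p_lr = {0: 0.75, 1: 0.25}
q = {0: 0.60, 1: 0.40}

# Under p_lr, the per-trial ratio q(X)/p_lr(X) has expectation
# sum_x p_lr(x) * q(x)/p_lr(x) = sum_x q(x) = 1.
assert abs(sum(p_lr[x] * (q[x] / p_lr[x]) for x in q) - 1.0) < 1e-12

random.seed(1)
log_T = 0.0
for _ in range(1000):  # simulate trial outcomes drawn from q
    x = 0 if random.random() < q[0] else 1
    log_T += math.log(q[x] / p_lr[x])

# Markov's inequality: P_LR(T >= 1/alpha) <= alpha, so min(1, 1/T)
# is a valid p-value bound.
p_bound = min(1.0, math.exp(-log_T))
print(p_bound)
```

Because the ratios form a product with expectation at most 1 under LR regardless of how the per-trial model depends on the past, a bound of this type remains valid against memory effects, which is the robustness claimed for the method.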
The proposed method in [17] can be simplified if we consider the information about the
polytope L available before the test. For example, if we know a set of relevant faces of the polytope
L, i.e., a set of Bell inequalities, we can simplify the proposed method and at the same time make
p-value bounds sufficiently tight. The motivation is as follows: Given a set of Bell inequalities, we
can assume that the best Bell inequality with respect to the experimental probability distribution
can be expressed as a convex combination of the Bell inequalities in the set considered. Whether
or not this assumption actually holds affects only the tightness of a p-value bound computed, but
not its validity. It is easier to estimate the best Bell inequality in this case, and so the complexity
of computing a p-value bound is reduced. The efficiency of the simplified method depends on
the number of Bell inequalities considered, but not the configuration parameters. The bound
depends on the choice and number of Bell inequalities, and generally, more inequalities make the
bound tighter. We find that even trivial Bell inequalities such as those derived from no-signaling
conditions can improve the tightness of the bound. In general, we cannot guarantee that the p-value
bound provided by the simplified method is asymptotically tight.
We also study the quantification of an experimental violation of LR in the asymptotic limit.
The difference between the experimental probability distribution q and all the distributions ac-
cessible by LR can be characterized by a measure called the statistical strength, defined as the
minimum Kullback-Leibler divergence from q to all LR distributions [24]. The reason for using this
measure is that, the p-value for rejecting LR decays exponentially with the number of data points
in the asymptotic limit, and the decay rate cannot be larger than the statistical strength as defined
above [25]. This measure helps to quantify the experimental resources required for a loophole-free
test of LR. In particular, we study the minimum detection efficiency required to achieve any
given statistical strength level S. Our results [26] show that, for the tests with unbalanced Bell
states of the form cos(θ)|00〉+ sin(θ)|11〉, the minimum detection efficiency required for closing the
detection loophole [27] (corresponding to S = 0) is 2/3, consistent with Eberhard’s result [28]. For
the tests with entangled states created from two independent polarized photons passing through
a polarizing beam splitter, the minimum detection efficiencies for closing the detection loophole
are 89.71 % and 91.11 %, using photon counters and photon detectors respectively. These results
are a little better than the minimum efficiencies 90.62 % for counters and 92.23 % for detectors,
as presented in Ref. [2]. The results show that, compared with photon detectors, photon counters
make violations of LR easier to detect.
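The statistical strength defined above can be sketched numerically (using a single hypothetical LR distribution rather than the true minimization over the whole LR polytope), together with the rule of thumb it implies: since the p-value decays at best as exp(−nS), reaching a target p-value α requires roughly n = ln(1/α)/S trials.

```python
import math

def kl_divergence(q, p):
    """Kullback-Leibler divergence D(q || p) in nats."""
    return sum(qx * math.log(qx / px) for qx, px in zip(q, p) if qx > 0)

# Hypothetical distributions for illustration only; the statistical
# strength is the minimum of D(q || p) over all LR distributions p.
q = [0.60, 0.40]      # experimental distribution
p_lr = [0.75, 0.25]   # a candidate LR distribution
S = kl_divergence(q, p_lr)

# The p-value decays no faster than exp(-n*S), so a target p-value
# alpha requires roughly n = ln(1/alpha) / S trials.
alpha = 1e-6
n_required = math.log(1.0 / alpha) / S
print(S, n_required)
```

This is why statistical strength is a useful planning tool: a configuration with twice the strength needs about half as many trials to reach the same significance level.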
1.4 Contents of the thesis
Chapter 2 reviews theoretical works on the tests of LR and related subjects. In addition,
we discuss a systematic and efficient method for deriving various Bell inequalities. Chapter 3
reviews experimental challenges of performing a loophole-free test of LR and recent experimental
progress. Chapter 4 explains our contribution for closing the detection loophole. Chapter 5 studies
the quantification of the statistical evidence against LR through bounding p-values. As mentioned
in Sec. 1.3, our proposed data analysis is asymptotically optimal. In Chapter 6, we simplify the
proposed data analysis and discuss how to extend it to other tests that benefit quantum information
processing. Finally we conclude the thesis in Chapter 7. There are also two appendices, Appendix A
and Appendix B. In Appendix A, we provide the user guide and code information for implementing
our data analysis. The code can be used to both monitor experiments in progress and analyze
existing data sets. In Appendix B, we provide the details of the optimization results in Chapter 4.
Note that Chapters 4–6 are based on our papers [17, 26, 29]. Most of the content of these chapters is
the same as in the published papers, but more results and discussion are added.
Chapter 2
Bell inequalities
2.1 Locality, realism, and Bell inequalities
Starting from this chapter, the upper-case letters A,B, . . . characterizing the measurements
are also used as the random variables from which the measurement outcomes a, b, . . . are sampled.
As is conventional, the upper-case letter X denotes a random variable and the lower-case letter x
denotes the sampled value of this random variable. We apologize that the readers have to figure out
from context whether A,B, . . . mean the settings or the random variables from which the outcomes
a, b, . . . are sampled.
The two assumptions behind a Bell inequality are locality and realism, as mentioned in
Chapter 1. Considering the framework for deriving a Bell inequality helps to clarify what locality
and realism are.
Physical theories, whether classical, quantum-mechanical, or more general, describe a physical
system by a state. Suppose that there is a source of two particles, described by a joint state. The
two particles are going to Alice and Bob, respectively. Alice and Bob perform measurements A
and B chosen randomly on their own particle and get outcomes a and b, respectively. The above
procedure is called a trial. In a theory designed according to local realism (LR), the joint state of
two particles is λ. Given the state λ and measurement settings A and B at a trial, the outcomes a
and b are completely determined, i.e., a = a(λ,A,B) and b = b(λ,A,B). This determinism is the
meaning of realism. Suppose that the two measurement processes are space-like separated in the
space-time diagram. Then, there is no causal effect on the event of observing the outcome a from
the event of choosing the setting B, or on the event of observing the outcome b from the event
of choosing the setting A. Hence, the outcomes are expressible as a = a(λ,A) and b = b(λ,B),
conveying the locality assumption. The combination of locality and realism is called LR, and a
theory designed according to LR is called a local realistic (LR) theory. Note that the concept of
realism is different from the element of reality in the Einstein-Podolsky-Rosen paradox described in
Ref. [30]. In this paradox, the element of reality is assigned to a physical quantity only if, without
any disturbance, an experimenter can predict with certainty the value of this physical quantity.
As is well known, given a quantum state, quantum mechanics can predict only the probability
P (A = a,B = b) of observing the outcomes a and b at a trial after the measurements A and B,
respectively. However, from the above paragraph, we can see that an LR theory can predetermine
the outcome of each measurement at each trial given the associated state λ. Since the LR state λ is
not accessible in an experiment and can be different at different trials, a general LR theory assumes
that there is a probability distribution ρ(λ) over different states λ. The measurement-outcome
probability at a trial according to this general LR theory is
P (A = a′, B = b′) = ∫ ρ(λ) δ_{a′,a(λ,A)} δ_{b′,b(λ,B)} dλ, (2.1)
where the indicator function δx,y = 1 if x = y, otherwise δx,y = 0. As described so far, it is
conceivable that the LR state λ and measurement settings A,B at a trial statistically depend on
each other. Assuming that Alice and Bob have “free will”, as discussed by Bell [31], one can
ensure the statistical independence between λ and the setting choices A and B. With the free-will
assumption, from Eq. (2.1) one can derive various Bell inequalities satisfied by any LR theory, for
example, the Clauser-Horne-Shimony-Holt (CHSH) inequality (1.2) introduced in Chapter 1.
Suppose that each of Alice and Bob has two different measurement settings, A1 and A2 or
B1 and B2. Each measurement has two different outcomes ±1. At each trial, Alice and Bob choose
their own measurement setting randomly and independently. After many trials, they estimate the
correlations E(AiBj) between measurements Ai and Bj with i, j ∈ {1, 2}. For any physical theory,
these correlations satisfy
−1 ≤ E(AiBj) ≤ 1. (2.2)
According to an LR theory ρ(λ),
E(AiBj) = ∫ ρ(λ) ai(λ,Ai) bj(λ,Bj) dλ, (2.3)
where ai(λ,Ai), bj(λ,Bj) = ±1 for any λ. Hence, the CHSH expression
ICHSH = E(A1B1) + E(A1B2) + E(A2B1) − E(A2B2)
= ∫ ρ(λ) [a1(λ,A1)b1(λ,B1) + a1(λ,A1)b2(λ,B2) + a2(λ,A2)b1(λ,B1) − a2(λ,A2)b2(λ,B2)] dλ
= ∫ ρ(λ) {a1(λ,A1) [b1(λ,B1) + b2(λ,B2)] + a2(λ,A2) [b1(λ,B1) − b2(λ,B2)]} dλ. (2.4)
Using the fact that the expression a1(λ,A1)[b1(λ,B1)+b2(λ,B2)]+a2(λ,A2)[b1(λ,B1)−b2(λ,B2)] =
±2, from Eq. (2.4) we get the CHSH inequality −2 ≤ ICHSH ≤ 2.
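The ±2 bound can also be verified by brute force: an LR state λ fixes the four outcomes a1, a2, b1, b2, and enumerating all 16 deterministic assignments shows that each yields ICHSH = ±2, so any mixture ρ(λ) obeys −2 ≤ ICHSH ≤ 2. A minimal Python sketch:

```python
from itertools import product

# Brute-force check of the CHSH bound over all deterministic LR strategies.
# An LR state lambda fixes outcomes a1, a2 for A1, A2 and b1, b2 for B1, B2.
values = []
for a1, a2, b1, b2 in product([-1, 1], repeat=4):
    i_chsh = a1 * b1 + a1 * b2 + a2 * b1 - a2 * b2
    values.append(i_chsh)

# Every deterministic strategy gives I_CHSH = +/-2, so any mixture
# (i.e., any distribution rho(lambda)) stays within [-2, 2].
assert set(values) == {-2, 2}
print(min(values), max(values))  # -2 2
```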
2.2 Geometric interpretation
2.2.1 Special case: The CHSH inequality
Considering the permutation symmetry over the four correlations in the CHSH expres-
sion (2.4), we can get the following inequalities,
−2 ≤E(A1B1) + E(A1B2) + E(A2B1)− E(A2B2) ≤ 2,
−2 ≤E(A1B1) + E(A1B2)− E(A2B1) + E(A2B2) ≤ 2,
−2 ≤E(A1B1)− E(A1B2) + E(A2B1) + E(A2B2) ≤ 2, and
−2 ≤− E(A1B1) + E(A1B2) + E(A2B1) + E(A2B2) ≤ 2. (2.5)
As shown by Fine [32], the vector ~E = (E(A1B1), E(A1B2), E(A2B1), E(A2B2)) in the correlation
space where each dimension denotes the correlation between Ai and Bj can be explained by LR
if and only if ~E satisfies the above four CHSH inequalities (2.5) and the trivial inequalities (2.2).
We denote the set of correlation vectors in the correlation space satisfying the inequalities (2.5)
and (2.2) by L. Because the region L is defined by linear inequalities, it is a convex polytope,
namely a four-dimensional octahedron [33, 34]. Note that a set of points S is convex if and only if
for all points ~x, ~y ∈ S the set also contains the straight-line segment {ω~x + (1 − ω)~y : 0 ≤ ω ≤ 1}
between these points.
Quantum mechanics provides correlation vectors ~E outside of L. As first shown by Cirel’son [35],
the CHSH expression ICHSH in Eq. (2.4) can be as high as 2√2 according to quantum mechanics.
The maximum value can be achieved by the Bell state |ψBell〉 = (|00〉 + |11〉)/√2 with measurements
A1 = σz, A2 = σx, B1 = (σz + σx)/√2, and B2 = (σz − σx)/√2. Here, σz = (1, 0; 0, −1) and
σx = (0, 1; 1, 0) are two Pauli matrices. The region accessible by quantum mechanics in the correlation space was
first fully characterized by Masanes [36]. Masanes showed that a correlation vector ~E is obtainable
within quantum mechanics if and only if it satisfies the nonlinear inequalities
−π ≤ sin−1(E(A1B1)) + sin−1(E(A1B2)) + sin−1(E(A2B1))− sin−1(E(A2B2)) ≤ π
−π ≤ sin−1(E(A1B1)) + sin−1(E(A1B2))− sin−1(E(A2B1)) + sin−1(E(A2B2)) ≤ π
−π ≤ sin−1(E(A1B1))− sin−1(E(A1B2)) + sin−1(E(A2B1)) + sin−1(E(A2B2)) ≤ π
−π ≤ − sin−1(E(A1B1)) + sin−1(E(A1B2)) + sin−1(E(A2B1)) + sin−1(E(A2B2)) ≤ π (2.6)
and the trivial linear inequalities (2.2). The set of correlation vectors fulfilling the inequalities (2.6)
and (2.2) is denoted by Q, which is a convex set, but not a convex polytope, such that L ⊂ Q.
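The Tsirelson value 2√2 quoted above can be reproduced numerically from the stated state and measurement settings by evaluating E(AiBj) = 〈ψBell|Ai ⊗ Bj|ψBell〉; a short sketch, assuming NumPy is available:

```python
import numpy as np

# Quantum value of the CHSH expression for the Bell state
# |psi> = (|00> + |11>)/sqrt(2) with the measurements quoted above.
sz = np.array([[1, 0], [0, -1]], dtype=float)
sx = np.array([[0, 1], [1, 0]], dtype=float)
psi = np.array([1, 0, 0, 1], dtype=float) / np.sqrt(2)

A1, A2 = sz, sx
B1, B2 = (sz + sx) / np.sqrt(2), (sz - sx) / np.sqrt(2)

def corr(A, B):
    # E(AB) = <psi| A (x) B |psi>
    return psi @ np.kron(A, B) @ psi

i_chsh = corr(A1, B1) + corr(A1, B2) + corr(A2, B1) - corr(A2, B2)
print(i_chsh)  # 2*sqrt(2) ~ 2.828, above the LR bound of 2
```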
The violation of LR by quantum mechanics does not reach the maximal violation allowed
by relativistic causality. Relativistic causality forbids sending messages faster
than light, hence it is also called no signaling. In 1994, Popescu and Rohrlich formulated a specific
theory [14], later referred to as the Popescu-Rohrlich (PR) box, which achieves ICHSH = 4, the
algebraic maximum of the CHSH expression. According to the PR box, the joint probability is
given as
P (Ai = ai, Bj = bj) = 1/2 if (ai + 1)/2 ⊕ (bj + 1)/2 = (i − 1)(j − 1), and 0 otherwise, (2.7)
where ⊕ denotes addition modulo 2, i and j are 1 or 2, and ai and bj are −1 or 1. So, E(AiBj) =
(−1)^{(i−1)(j−1)} and ICHSH = 4. By relabeling the outcomes of Bj , i.e., bj ↔ −bj , the PR box can
also achieve the algebraic minimum ICHSH = −4. Probabilities such as those in Eq. (2.7) satisfy
the no-signaling conditions [14]
∑_{bj} P (Ai = ai, Bj = bj) = ∑_{bj′} P (Ai = ai, Bj′ = bj′) ≡ P (Ai = ai) ∀ ai, Ai, Bj , Bj′ ,
∑_{ai} P (Ai = ai, Bj = bj) = ∑_{ai′} P (Ai′ = ai′ , Bj = bj) ≡ P (Bj = bj) ∀ bj , Ai, Ai′ , Bj . (2.8)
Under the no-signaling conditions (2.8), the CHSH expression ICHSH can take any value between
−4 and 4. Considering the permutation symmetry, the set of correlation vectors ~E satisfying the
inequalities
−4 ≤ E(A1B1) + E(A1B2) + E(A2B1)− E(A2B2) ≤ 4
−4 ≤ E(A1B1) + E(A1B2)− E(A2B1) + E(A2B2) ≤ 4
−4 ≤ E(A1B1)− E(A1B2) + E(A2B1) + E(A2B2) ≤ 4
−4 ≤ −E(A1B1) + E(A1B2) + E(A2B1) + E(A2B2) ≤ 4 (2.9)
and the inequalities (2.2) is achievable according to a general theory constrained by only the no-
signaling conditions (2.8). This set of correlation vectors is a four-dimensional cube P containing
the LR polytope L and the quantum convex set Q. The relative relationships between L, Q, and
P are shown as in Fig. 2.1. The left and right plots show the situations in two different subspaces.
2.2.2 The general case
So far we have discussed only a particular case where each of Alice and Bob has two measure-
ments with outcomes ±1. For a general scenario, Alice and Bob can perform mA and mB measure-
ments, respectively. Each measurement Ai or Bj has a certain number, dA or dB, of possible out-
comes. After many trials in an experiment, we can estimate the probabilities P (Ai = ai, Bj = bj)
given the chosen measurement settings Ai and Bj . These probabilities satisfy the no-signaling
Figure 2.1: The regions achievable by LR, quantum mechanics, and all physical theories satisfying
no signaling. Any correlation vector inside the black squares is achievable under the no-signaling
conditions as in Eq. (2.8). The quantum convex set Q and the LR polytope L are bounded by red
curves and blue lines (with black lines in (b)), respectively. (a) is the situation in the subspace
E(A1B1) = 1 and E(A1B2) = 0, while (b) is the situation in the subspace E(A1B1) = E(A1B2) = 1/2.
conditions as in Eq. (2.8) and the following trivial constraints, i.e., the positivity conditions
P (Ai = ai, Bj = bj) ≥ 0 ∀ai, bj , Ai, Bj , (2.10)
and the normalization conditions
∑_{ai,bj} P (Ai = ai, Bj = bj) = 1 ∀ Ai, Bj . (2.11)
The d = mAmBdAdB probabilities can be considered as a point in a d-dimensional space. Since the
sum of these probabilities is equal to mAmB, the total number of joint measurement settings, this
space is not a conventional “probability space”. However, when it is not necessary to differentiate
it from the conventional probability space, we also call this space the probability space. Note that
the number of independent probabilities is less than d, the dimension of the probability space, since
the probabilities P (Ai = ai, Bj = bj) satisfy the no-signaling conditions as in Eq. (2.8) and the
normalization conditions as in Eq (2.11).
An LR state λ specifies a specific outcome for each measurement, so there are a total of
Nλ = dA^{mA} dB^{mB} different LR states. A general LR theory corresponds to a convex combination,
i.e., a probability distribution ρ(λ), over these LR states. Since there are only a finite number of
LR states, the set of LR theories constitutes a convex polytope L in the probability space [7, 8].
Note that a convex polytope can be defined either as the set of all convex combinations of a finite
set of points or as a bounded intersection of halfspaces. (For an introduction to the basic properties
of a convex polytope, see the first three lectures in the book [37].) A convex polytope has many
faces of different dimensions, and a face is the intersection of the polytope with a hyperplane whose
corresponding linear inequality is satisfied by all points inside of the polytope. Hence, the empty
set is a face for every convex polytope. The dimension of a face is the minimum of the dimensions
of linear vector spaces containing the face. A 0-dimensional face is a vertex (or an extreme point) of
a convex polytope, and a face with the maximal dimension is called a facet. For the LR polytope,
each vertex corresponds to an LR state, and any face corresponds to a Bell inequality. Given a
Bell inequality, one can also construct a face of the LR polytope such that this Bell inequality is
an equality on the face. If a Bell inequality is not satisfied by all no-signaling theories and the
inequality’s corresponding face of the LR polytope has a dimension greater than or equal to 1, the
Bell inequality is tight. Otherwise, the Bell inequality is not tight.
In contrast, all no-signaling theories are bounded by only the trivial inequalities as in Eqs. (2.8),
(2.10), and (2.11), which are linear inequalities. Hence, the set of no-signaling theories is a convex
polytope containing L, called the no-signaling polytope P [38]. Finally, the quantum set Q has
an infinite number of extreme points, obtainable by projective measurements on pure quantum
states. Hence, the quantum probabilities form a convex set Q, but not a convex polytope. Since
the probabilities achievable within quantum mechanics can violate Bell inequalities but they satisfy
all inequalities defining the no-signaling polytope P, it follows that the convex set Q is sandwiched
between L and P, as illustrated in Fig. 2.1.
The above discussion is for a bipartite case. But the same argument applies to
multipartite cases, and in general we have the relationship L ⊂ Q ⊂ P.
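Because L is the convex hull of finitely many deterministic strategies, the question "is a given behavior inside L?" can be phrased as a linear-programming feasibility problem. The sketch below is only an illustration, assuming SciPy's `linprog` is available (it is not the method used in this thesis); it tests membership in the 2-setting/2-outcome LR polytope:

```python
import numpy as np
from itertools import product
from scipy.optimize import linprog

# Membership test for the LR polytope L in the 2-setting/2-outcome scenario:
# P(a,b|i,j) is local iff some distribution q over the 16 deterministic
# strategies lambda = (a1, a2, b1, b2) reproduces it.  This is a feasibility LP.
strategies = list(product([0, 1], repeat=4))  # (a1, a2, b1, b2), outcomes 0/1

def is_local(P):
    # P[(a, b, i, j)] with a, b, i, j in {0, 1}; settings indexed from 0.
    A_eq, b_eq = [], []
    for a, b, i, j in product([0, 1], repeat=4):
        row = [1.0 if (s[i] == a and s[2 + j] == b) else 0.0 for s in strategies]
        A_eq.append(row)
        b_eq.append(P[(a, b, i, j)])
    res = linprog(c=np.zeros(16), A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
    return res.status == 0  # 0 = feasible (local), 2 = infeasible (nonlocal)

uniform = {(a, b, i, j): 0.25 for a, b, i, j in product([0, 1], repeat=4)}
pr_box = {(a, b, i, j): (0.5 if a ^ b == i * j else 0.0)
          for a, b, i, j in product([0, 1], repeat=4)}
print(is_local(uniform), is_local(pr_box))  # True False
```

Normalization of q is implied by the 16 equality constraints, since for each setting pair the rows sum to the all-ones vector.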
To help demonstrate the violation of LR in an experimental test, we need to characterize the
LR polytope L in terms of Bell inequalities. Given the number of parties per test, the number of
measurement settings per party, and the number of outcomes per measurement, one can list all LR
states λ, i.e., the vertices of the LR polytope L. Hence, in principle, all the faces of the polytope
L including all tight Bell inequalities can be constructed. However, this task is computationally
hard. Specifically, Pitowsky showed that determining whether or not a set of probabilities is inside
of the LR polytope L is an NP-complete problem [39]. So, it is not surprising that the complete
characterization of the LR polytope exists only in configurations where the numbers of
parties, settings, and outcomes are small [32, 40, 41, 42, 43] or where additional symmetries can
be exploited [33, 44, 45]. In the following section, we present some important and well-studied Bell
inequalities.
2.3 Various Bell inequalities
The simplest and most tested Bell inequality is the CHSH inequality (1.2) of Chapter 1.
It requires only two local parties with two dichotomic measurements (i.e., measurements with two
outcomes) at each party. It is violated by all the pure entangled states of d-level systems (d ≥ 2) [46,
47, 48] (see Sec. 2.4 below for the definition of entanglement). It is the Bell inequality most robust
against the special depolarizing noise (i.e., unpolarized photons) in an experimentally prepared
Bell state of two polarized photons, except for a slightly better Bell inequality with at least 465
dichotomic measurements at each party [49]. In addition, the violation of the CHSH inequality is
robust against detection inefficiencies (i.e., particle losses) in experiments [28, 50, 51, 52, 53]. In
the following, we mostly discuss different generalizations of the CHSH inequality, including but not
restricted to Bell inequalities with more measurement settings [27, 41, 52, 54, 55, 56, 57], with more
measurement outcomes [41, 58, 59], or with more parties [33, 44, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69].
2.3.1 Bell inequalities with many settings
The CHSH inequality (1.2) can be expressed in the following form
P (A2 ≠ B1) + P (B1 ≠ A1) + P (A1 ≠ B2) ≥ P (A2 ≠ B2), (2.12)
where P (Ai ≠ Bj) is the probability that measurements Ai and Bj have different outcomes. The
inequality (2.12) can be derived as follows: We define the function f(X,Y ) = |x− y| on the set of
measurement outcomes {X = x, Y = y} where X,Y ∈ {A1, A2, B1, B2} and x, y = ±1. Then, it is
easy to see that this function satisfies the triangle inequality f(X,Y ) + f(Y,Z) ≥ f(X,Z). So, we
have f(A2, B1) + f(B1, A1) ≥ f(A2, A1) and f(A2, A1) + f(A1, B2) ≥ f(A2, B2). Combining these
two inequalities, we get the inequality
f(A2, B1) + f(B1, A1) + f(A1, B2) ≥ f(A2, B2), (2.13)
which is satisfied by the measurement outcomes assigned by any LR state λ. Since a general LR
theory corresponds to a probability distribution ρ(λ) over all LR states and ∫ ρ(λ) f(X,Y ) dλ =
2P (X ≠ Y ), we get the inequality (2.12) from Eq. (2.13). By the same argument and using the
above triangle inequality 2(m − 1) times, we can extend the inequality (2.12) to the case where
each of Alice and Bob has m measurement settings:
P (Am ≠ B1) + P (B1 ≠ A1) + P (A1 ≠ B2) + ... + P (Bm−1 ≠ Am−1) + P (Am−1 ≠ Bm) ≥ P (Am ≠ Bm),
(2.14)
which is the same as
Ichained ≡ E(AmB1) + E(B1A1) + ...+ E(Am−1Bm)− E(AmBm) ≤ (2m− 2). (2.15)
The inequality (2.15) is the chained CHSH inequality as presented in Refs. [27, 54]. The chained
CHSH inequality can be violated by quantum mechanics with a value for Ichained as high as
2m cos(π/(2m)) by choosing appropriate measurement settings on a Bell state [70].
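The quantum value 2m cos(π/(2m)) can be reproduced with the standard construction in which the 2m measurement directions lie evenly spaced in the x–z plane, so that E(θa, θb) = cos(θa − θb) for a Bell state; a small sketch comparing it with the LR bound 2m − 2:

```python
import math

# Quantum value of the chained CHSH expression for a Bell state, using
# E(theta_a, theta_b) = cos(theta_a - theta_b) for measurements in the x-z plane.
def chained_value(m):
    # Place the 2m measurement directions evenly: each adjacent pair in the
    # chain differs by pi/(2m), so each of the 2m-1 chain terms contributes
    # cos(pi/(2m)), while E(Am, Bm) = cos((2m-1)pi/(2m)) = -cos(pi/(2m)).
    step = math.pi / (2 * m)
    chain = [(k * step, (k + 1) * step) for k in range(2 * m - 1)]
    value = sum(math.cos(a - b) for a, b in chain)
    value -= math.cos(0 - (2 * m - 1) * step)  # the subtracted E(Am, Bm) term
    return value

for m in range(2, 6):
    # quantum value 2m*cos(pi/(2m)) versus the LR bound 2m - 2
    print(m, round(chained_value(m), 4), 2 * m - 2)
```

For m = 2 this recovers 2√2 ≈ 2.828 against the CHSH bound 2; the gap to the LR bound shrinks as m grows, matching the higher efficiency requirements noted below.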
The chained CHSH inequality has interesting applications in situations where the CHSH in-
equality is inadequate. For example, the use of the chained CHSH inequality reduces the number of
trials required to reject LR at a specified confidence level in experiments without noise or detection
inefficiency [71]. Moreover, the use of the chained CHSH inequality with a large m improves the
security of quantum key distribution [11, 72], and shows that, if quantum correlations are expressed
as mixtures of local correlations and general (not necessarily quantum) correlations, the coefficients
of local correlations approach zero [72, 73]. In practice, however, the chained CHSH inequal-
ity (2.15) with m > 2 requires higher detection efficiency than the CHSH inequality (1.2) for closing
the detection loophole [74] (see Sec. 3.3 of Chapter 3 for an introduction of the detection loophole).
There are also other generalizations of the CHSH inequality for many settings [55, 56]. How-
ever, these Bell inequalities [27, 54, 55, 56] are generally not tight. In 2001, Pitowsky and Svozil
suggested a general method for obtaining all tight Bell inequalities for a given experimental con-
figuration [40]. By this method, many tight Bell inequalities are found [40, 41, 52, 57]. In the
following, we present some known tight Bell inequalities.
As discussed in Sec. 2.2, when each of Alice and Bob has two dichotomic measurements,
there is only one type of tight Bell inequality, i.e., the CHSH inequality (1.2) (or the Clauser-Horne
(CH) inequality (1.3)). When each party has three dichotomic measurements, besides the CHSH
inequality there is one more type of tight Bell inequality [40, 41]
I32 ≡P (A1B1) + P (A1B2) + P (A1B3) + P (A2B1) + P (A2B2)− P (A2B3)
+ P (A3B1)− P (A3B2)− P (A1)− 2P (B1)− P (B2) ≤ 0, (2.16)
where P (AiBj) is the probability that both measurements Ai and Bj have outcome +1 and P (Ai)
or P (Bj) is the probability that measurement Ai or Bj has outcome +1. Note that, here and below
two Bell inequalities are called of the same type if and only if they can be transformed to each
other by relabeling the parties, the settings, or the outcomes and by considering the no-signaling
conditions as in Eq. (2.8) and the normalization conditions as in Eq. (2.11). Also, the subscripts
m and d in the notation Imd mean that each of Alice and Bob has m different measurement settings
and each measurement has d possible outcomes.
There are two-qubit states that violate the inequality (2.16) but not a CHSH inequality [41].
(A qubit is a two-level quantum-mechanical system.) This new inequality also illustrates a sharing
of bipartite violation of LR between three qubits [41]. That is, given a state ρ123 of three qubits 1, 2
and 3, the corresponding states of the first two qubits and the last two qubits, ρ12 and ρ23, can show
violations of the inequality (2.16) at the same time by choosing three appropriate measurements on
each qubit 1, 2 or 3. However, such a phenomenon cannot be observed using a CHSH inequality.
In practice, the Bell inequality I32 ≤ 0 can tolerate more detection inefficiency than a CHSH
inequality. Specifically, Ref. [50] considered a situation where one party’s detector is perfect and
showed that the inequality I32 ≤ 0 can be violated by measurements on a two-qubit state when
the other party’s detector has an efficiency higher than 43%, which is lower than the minimum
detection efficiency 50% required for violating a CHSH inequality in the same situation.
For the case where both Alice and Bob have four dichotomic measurements, there are many
new types of tight Bell inequalities. Only a partial list of 155 inequivalent tight Bell inequalities
was found [52, 53]. Some of these inequalities tolerate more detection inefficiency than a CHSH
inequality [1, 52, 53]. Specifically, using one of these inequalities, the minimum detection efficiency
required at each party for closing the detection loophole is as low as 61.8% (using an entangled
state of two four-level quantum systems) [1], which is lower than the minimum efficiency 66.7%
when using a CHSH inequality and a two-qubit state [28].
For the case where there are m (m > 4) dichotomic measurements at each of Alice and Bob,
not many Bell inequalities have been constructed. Ref. [41] constructed one type of Bell inequality, Im2 ≤ 0,
but this Bell inequality may not be tight. Using the Bell inequality Im2 ≤ 0, it is shown that, if
one party’s detector is perfect and the other party’s detector has an efficiency higher than 1/m,
the detection loophole can be closed using an entangled state of two m-level quantum systems [1].
2.3.2 Bell inequalities with many outcomes
For two d-outcome measurements A1 and A2 (or B1 and B2) at Alice (or Bob), we label
the measurement outcomes by 0, 1, ..., or d − 1. Let the notation x mod d denote the nonnegative
remainder of the division of x by d, and define the function f(X,Y ) = (x− y) mod d on the set of
measurement outcomes {X = x, Y = y} where X,Y ∈ {A1, A2, B1, B2} and x, y ∈ {0, 1, ..., d− 1}.
It is easy to show that this function satisfies the triangle inequality f(X,Y ) + f(Y,Z) ≥ f(X,Z).
Using this triangle inequality twice, we can see that the measurement outcomes according to any
LR state satisfy the inequality f(A2, B1) + f(B1, A1) + f(A1, B2) ≥ f(A2, B2). Hence, we get a
Bell inequality
〈f(A2, B1)〉+ 〈f(B1, A1)〉+ 〈f(A1, B2)〉 ≥ 〈f(A2, B2)〉 . (2.17)
Eq. (2.17) is a simplified expression of the original Collins-Gisin-Linden-Massar-Popescu (CGLMP)
inequality [58], equivalent to the expression in Ref. [75]. By the same argument and using the above
triangle inequality 2(m− 1) times, we get a chained form of the CGLMP inequality [72]
〈f(Am, B1)〉+〈f(B1, A1)〉+〈f(A1, B2)〉+〈f(B2, A2)〉+...+〈f(Am−1, Bm)〉 ≥ 〈f(Am, Bm)〉 . (2.18)
When d = 2, Eqs. (2.17) and (2.18) reduce to the CHSH inequality (2.12) and the chained CHSH
inequality (2.14), respectively.
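Since every LR theory is a mixture of deterministic outcome assignments, the CGLMP inequality (2.17) can be checked by brute force over all d^4 assignments; a short sketch:

```python
from itertools import product

# Brute-force check of the CGLMP inequality (2.17) over all deterministic
# LR assignments: with f(x, y) = (x - y) mod d, every assignment satisfies
# f(a2,b1) + f(b1,a1) + f(a1,b2) >= f(a2,b2), so every LR mixture does too.
def f(x, y, d):
    return (x - y) % d

for d in [2, 3, 4, 5]:
    ok = all(
        f(a2, b1, d) + f(b1, a1, d) + f(a1, b2, d) >= f(a2, b2, d)
        for a1, a2, b1, b2 in product(range(d), repeat=4)
    )
    print(d, ok)  # True for every d
```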
One interesting property of the CGLMP inequality (2.17) is that it is a tight Bell inequality for
any number d of outcomes [43]. Particularly, when d ≤ 3 all tight Bell inequalities are the CGLMP
inequalities with the parameter d = 2 or 3 in the definition of the function f [43]. Also, when
d > 2 a CGLMP inequality is more robust against noise and detection inefficiency in an experiment
(due to the higher dimension of the tested quantum system) than a CHSH inequality [58, 76]. In
addition, unlike a CHSH inequality, when d > 2 the maximal violation of a CGLMP inequality
cannot be achieved with the maximally entangled state [4, 77, 78].
2.3.3 Bell inequalities with many parties
For two measurements with outcomes ±1 at each of n parties, the correlation functions
E(O1,k1 O2,k2 ... On,kn) for measurements Oi,ki at the ith party, where ki = 1, 2, can be explained by
LR theories if and only if the following set of Bell inequalities [33, 44]
|∑_{s1,...,sn=±1} S(s1, ..., sn) ∑_{k1,...,kn=1,2} s1^{k1−1} ... sn^{kn−1} E(O1,k1 O2,k2 ... On,kn)| ≤ 2^n (2.19)
is satisfied, where S(s1, ..., sn) stands for an arbitrary function of the indices s1, ..., sn ∈ {−1, 1}
such that its range is the set {−1, 1}. The Bell inequalities in Eq. (2.19) follow from the algebraic
identity
∑_{s1,...,sn=±1} S(s1, ..., sn) ∏_{i=1}^{n} [oi,1(λ,Oi,1) + si oi,2(λ,Oi,2)] = ±2^n,
where oi,ki(λ,Oi,ki) is the predetermined outcome for measurement Oi,ki given an LR state λ.
Since there are 2^{2^n} different functions S(s1, ..., sn), Eq. (2.19) represents a set of 2^{2^n} Bell
inequalities, which includes all tight Bell inequalities in the correlation space [33]. Many of these
inequalities are trivial. For example, when the choice for the function is S(s1, ..., sn) = 1 for all
arguments, one gets the condition E(O1,1O2,1...On,1) ≤ 1. Specific other choices give nontrivial
inequalities. For example, when S(s1, ..., sn) = √2 cos[(s1 + ... + sn − n − 1)π/4], which is always
±1 no matter what the values of s1, ..., sn are, one recovers the Mermin-Ardehali-Belinskii-Klyshko
(MABK) inequalities [60, 61, 62], in the form derived by Belinskii and Klyshko [62]. Specifically, for
n = 2, the CHSH inequality (1.2) follows.
The maximal violations of Bell inequalities in Eq. (2.19) are attained by the generalized
Greenberger-Horne-Zeilinger (GHZ) state |ψGHZ〉 = (|0〉1...|0〉n + |1〉1...|1〉n)/√2 with a choice of
measurements depending on the inequality under consideration [33]. Among these Bell inequalities,
the MABK inequality can be violated by the largest amount, and this maximal violation is 2^n √(2^{n−1}).
Considering the special depolarizing noise in an experiment, the experimental state has the form
ρ = V |ψGHZ〉〈ψGHZ|+ (1− V )ρnoise where ρnoise is the completely mixed state. Then, the MABK
inequality is violated if and only if V > 1/√(2^{n−1}) [44]. In addition, unlike the case where n is
even, there are pure and fully entangled states of an n-partite system that do not violate any Bell
inequality in Eq. (2.19) when n is odd [79].
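For n = 3, the MABK family contains Mermin's inequality; written in the common normalization with LR bound 2 (rather than the 2^n normalization of Eq. (2.19)), the GHZ state reaches the algebraic maximum 4, a factor √(2^{n−1}) = 2 above the LR bound. A numerical sketch, assuming NumPy is available:

```python
import numpy as np

# Mermin's inequality (the n = 3 MABK member) in the normalization with
# LR bound 2: E(XXX) - E(XYY) - E(YXY) - E(YYX) <= 2 for any LR theory.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
ghz = np.zeros(8, dtype=complex)
ghz[0] = ghz[7] = 1 / np.sqrt(2)  # (|000> + |111>)/sqrt(2)

def E(ops):
    # Expectation <GHZ| O1 (x) O2 (x) O3 |GHZ>
    M = np.kron(np.kron(ops[0], ops[1]), ops[2])
    return (ghz.conj() @ M @ ghz).real

mermin = E([X, X, X]) - E([X, Y, Y]) - E([Y, X, Y]) - E([Y, Y, X])
print(mermin)  # 4.0, a factor sqrt(2^(n-1)) = 2 above the LR bound of 2
```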
The above discussion is for the case where each of n parties has two dichotomic measurements.
For a general case where each party has more than two measurements or two measurements with
more than two outcomes, only a few Bell inequalities have been constructed [64, 65, 66, 67, 68, 69].
2.3.4 Derivation of Bell inequalities
There are many different types of tight Bell inequalities. These inequalities correspond to
the faces of the LR polytopes associated with different numbers of parties, settings, and outcomes.
Since it is a hard problem to find all faces of an LR polytope, deriving Bell inequalities from the
characterization of an LR polytope is not practical. Are there other guiding principles for
deriving Bell inequalities? In the following, we discuss two such principles.
First, an LR model corresponds to a probability distribution over all LR states λ, where given
the state λ the outcomes of all possible measurements are known. That is, according to an LR
model, there is a probability distribution over the outcomes of all measurements of all parties [32].
For example, in the configuration where there are two dichotomic measurements at each of two
parties, A1, A2 at Alice and B1, B2 at Bob, the existence of an LR model is equivalent to the
existence of the joint probability distribution P (A1 = a1, A2 = a2, B1 = b1, B2 = b2) where ai and
bj are the outcomes of the corresponding measurements Ai and Bj , i, j = 1, 2. However, since the
measurements A1 and A2 at Alice (or B1 and B2 at Bob) are not compatible with each other, this
joint probability distribution is not accessible. In an experiment, we can observe only marginal
distributions P (Ai = ai, Bj = bj), i, j = 1, 2. If there is a joint distribution consistent with the
experimental marginals, these marginals satisfy linear inequalities, i.e., Bell inequalities. Hence, we
can think of a Bell inequality as a consistency constraint on marginal distributions.
In general, it is difficult to test whether or not a set of marginal distributions is consistent.
However, some necessary conditions for consistency may be easy to characterize. Particularly, if we
associate each measurement context corresponding to a marginal distribution with a logical formula,
then it is possible to derive a Bell inequality from a logical consistency constraint on these formulas.
For example, for the above configuration we associate each measurement context (Ai, Bj) with the
logical formula Ai ≠ Bj , where Ai ≠ Bj means that the outcomes of measurements Ai and Bj are
different. Classical logic shows that, if A2 ≠ B2 is true, then (A1 ≠ B1) ∨ (A1 ≠ B2) ∨ (A2 ≠ B1) is
also true, where the notation ∨ is the logical or operator. As a result,
P (A2 ≠ B2) ≤ P ((A1 ≠ B1) ∨ (A1 ≠ B2) ∨ (A2 ≠ B1))
≤ P (A1 ≠ B1) + P (A1 ≠ B2) + P (A2 ≠ B1),
which is the CHSH inequality (2.12). To find a violation of such a logical Bell inequality, we only
need to find a situation where the logical formulas are jointly contradictory. Recently, Abramsky
and Hardy [80] showed that, for the configuration where all measurements of each party have 2p
outcomes, any Bell inequality can be derived from a logical consistency constraint (although such
a constraint may be hard to find in practice). Their proof works for any finite number of parties
and any finite number of measurement settings per party.
The second principle comes from Secs. 2.3.1 and 2.3.2, where we saw that both the CHSH
inequality (2.12) and the CGLMP inequality (2.17) can be derived from a triangle inequality
f(X,Y ) + f(Y,Z) ≥ f(X,Z), satisfied by the function f(X,Y ) = (x − y) mod d on the set of
measurement outcomes {X = x, Y = y} where x, y ∈ {0, 1, ..., d− 1}. To derive the CHSH inequal-
ity or the CGLMP inequality, we use the above triangle inequality twice. For a general configuration
where each party has more than two measurement settings, we can repeat the use of the triangle
inequality several times in order to derive new types of Bell inequalities. For example, when Alice
and Bob have m d-outcome measurements A1, A2, ..., Am and B1, B2, ..., Bm, respectively, using the
triangle inequality 2(m− 1) times we get the chained CGLMP inequality (2.18).
Also, we can consider other functions satisfying the triangle inequality. For example, the
function f(X,Y ) = max{0, x − y}, defined on the set of measurement outcomes {X = x, Y = y}
where X,Y ∈ {A1, A2, B1, B2} and x, y ∈ {0, 1, ..., d−1}, satisfies the triangle inequality f(X,Y )+
f(Y, Z) ≥ f(X,Z). Using this triangle inequality twice, we get that, according to any LR model,
〈f(A2, B1)〉+ 〈f(B1, A1)〉+ 〈f(A1, B2)〉 ≥ 〈f(A2, B2)〉. (2.20)
When d = 2, the above inequality (2.20) reduces to
P (A2 = 1, B1 = 0) + P (B1 = 1, A1 = 0) + P (A1 = 1, B2 = 0) ≥ P (A2 = 1, B2 = 0). (2.21)
Using the no-signaling conditions in Eq. (2.8), we get
P (B1 = 1, A1 = 0) = P (A1 = 0)− P (A1 = 0, B1 = 0),
P (A2 = 1, B1 = 0) = P (B1 = 0)− P (A2 = 0, B1 = 0), and
P (A1 = 1, B2 = 0)− P (A2 = 1, B2 = 0) = P (A2 = 0, B2 = 0)− P (A1 = 0, B2 = 0). (2.22)
One can see that the terms on the left-hand side of Eq. (2.22) appear in Eq. (2.21). Replacing these
terms in Eq. (2.21) by the terms on the right-hand side of Eq. (2.22), the CH inequality (1.3) follows.
In this sense, the inequality (2.20) is the generalized CH inequality for a high-dimensional bipartite
system. However, it is an open problem whether or not the generalized CH inequality (2.20)
is tight for any d, i.e., whether or not the Bell inequality (2.20) corresponds to a face of the
associated LR polytope. Also, the relationship between the generalized CH inequality and the
CGLMP inequality (2.17) deserves further investigation.
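As with the CGLMP inequality, the generalized CH inequality (2.20) can be checked over all deterministic assignments for small d; a short sketch:

```python
from itertools import product

# Brute-force check of the generalized CH inequality (2.20), now with
# f(x, y) = max(0, x - y), which also satisfies the triangle inequality.
def f(x, y):
    return max(0, x - y)

for d in [2, 3, 4, 5]:
    ok = all(
        f(a2, b1) + f(b1, a1) + f(a1, b2) >= f(a2, b2)
        for a1, a2, b1, b2 in product(range(d), repeat=4)
    )
    print(d, ok)  # True for every d
```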
Furthermore, using the triangle inequality satisfied by f(X,Y ) = max{0, x−y}, we can show
that, no matter what predetermined outcomes an LR state assigns to the measurements
A1, A2, A3 at Alice and B1, B2, B3 at Bob, the inequality
f(A1, B3) + f(B2, A1) + f(B1, A3) + f(A2, B1) + f(A1, B1) + f(A2, B2) ≥ f(A2, B3) + f(B2, A3), (2.23)
is satisfied. To show the above inequality, we need to consider three different cases: (i) when
f(A2, B3) = 0, we can get f(B2, A1) + f(A1, B1) + f(B1, A3) ≥ f(B2, A3) using the triangle
inequality twice. Hence, the inequality (2.23) follows. (ii) when f(B2, A3) = 0, as in case (i), we
can show the inequality (2.23) using the triangle inequality twice. (iii) when f(A2, B3) = k > 0 and
f(B2, A3) = l > 0, that is, A2 = B3 + k and B2 = A3 + l, the left-hand side of the inequality (2.23)
becomes
f(A1, B3) + f(B2, A1) + f(B1, A3) + f(A2, B1) + f(A1, B1) + f(A2, B2)
= f(A1, A2 − k) + f(A3 + l, A1) + f(B1, A3) + f(A2, B1) + f(A1, B1) + f(A2, B2)
≥ f(A3 + l, A2 − k) + f(A2, A3)
≥ k + l,
where the first inequality applies the triangle inequality to f(A3 + l, A1) + f(A1, A2 − k) and to f(A2, B1) + f(B1, A3) and drops the nonnegative terms f(A1, B1) and f(A2, B2), and the last inequality can be verified by distinguishing the cases A2 ≥ A3 + k + l, A3 ≤ A2 < A3 + k + l, and A2 < A3.
Hence, the inequality (2.23) is always satisfied by an LR state. Therefore, we get the following Bell
inequality
〈f(A1, B3)〉+ 〈f(B2, A1)〉+ 〈f(B1, A3)〉+ 〈f(A2, B1)〉+ 〈f(A1, B1)〉+ 〈f(A2, B2)〉
≥ 〈f(A2, B3)〉+ 〈f(B2, A3)〉. (2.24)
When d = 2, the above inequality (2.24) reduces to
P (A1 = 1, B3 = 0) + P (A1 = 0, B2 = 1) + P (A3 = 0, B1 = 1) + P (A2 = 1, B1 = 0)
+ P (A1 = 1, B1 = 0) + P (A2 = 1, B2 = 0) ≥ P (A2 = 1, B3 = 0) + P (A3 = 0, B2 = 1). (2.25)
Using the no-signaling conditions in Eq. (2.8), we get
P (A1 = 1, B3 = 0)− P (A2 = 1, B3 = 0) = P (A2 = 0, B3 = 0)− P (A1 = 0, B3 = 0),
P (A3 = 0, B1 = 1)− P (A3 = 0, B2 = 1) = P (A3 = 0, B2 = 0)− P (A3 = 0, B1 = 0),
P (A1 = 1, B1 = 0) = P (B1 = 0)− P (A1 = 0, B1 = 0),
P (A1 = 0, B2 = 1) = P (A1 = 0)− P (A1 = 0, B2 = 0),
P (A2 = 1, B1 = 0) = P (B1 = 0)− P (A2 = 0, B1 = 0), and
P (A2 = 1, B2 = 0) = P (B2 = 0)− P (A2 = 0, B2 = 0). (2.26)
One can see that the terms on the left-hand side of Eq. (2.26) appear in Eq. (2.25). Replacing these
terms in Eq. (2.25) by the corresponding terms on the right-hand side of Eq. (2.26), the I32 inequality (2.16) follows.
In this sense, the inequality (2.24) is a Bell inequality generalizing the I32 inequality (2.16) to a
high-dimensional bipartite system. However, whether or not the generalized Bell inequality (2.24)
is tight for any d > 2 is unknown.
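Inequality (2.24) can be checked in the same way, by enumerating all deterministic assignments of outcomes in {0, ..., d − 1} to the six measurements. A hedged Python sketch (our own check, not from the thesis):

```python
from itertools import product

def f(x, y):
    return max(0, x - y)   # satisfies the triangle inequality

def ineq_224_holds(d):
    # Enumerate all deterministic LR assignments to A1, A2, A3, B1, B2, B3.
    for a1, a2, a3, b1, b2, b3 in product(range(d), repeat=6):
        lhs = (f(a1, b3) + f(b2, a1) + f(b1, a3)
               + f(a2, b1) + f(a1, b1) + f(a2, b2))
        if lhs < f(a2, b3) + f(b2, a3):
            return False
    return True

print(all(ineq_224_holds(d) for d in range(2, 5)))  # -> True
```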
So far, only bipartite Bell inequalities have been derived from triangle inequalities. For
multipartite cases, from triangle inequalities we can also derive constraints on the predictions
according to LR. Suppose that there are n (n ≥ 3) parties, where each party i has two measurements
Oi and O′i with outcomes 0 or 1. To derive a constraint on all LR predictions, we use the triangle
inequality f(X,Y ) + f(Y,Z) ≥ f(X,Z), satisfied by the function f(X,Y ) = |x− y| defined on the
set of measurement outcomes {X = x, Y = y} where X and Y are measurements with outcomes 0
or 1. For the bipartite case, one can see that the expressions
S2 ≡ (1/2)[f(O′1, O2) + f(O2, O1) + f(O1, O′2) − f(O′1, O′2)]
and
S′2 ≡ (1/2)[f(O1, O′2) + f(O′2, O′1) + f(O′1, O2) − f(O1, O2)]
can be only 0 or 1 according to an LR state. We can think of S2 and S′2 as measurements on the first
two subsystems with outcomes 0 or 1. Then, we can use the triangle inequality f(X,Y )+f(Y,Z) ≥
f(X,Z) twice on the measurements S2, S′2 of the first two subsystems and the measurements O3
and O′3 of the third subsystem, in order to get a constraint on the LR predictions of the three
subsystems
〈f(S′2, O3)〉+ 〈f(O3, S2)〉+ 〈f(S2, O′3)〉 ≥ 〈f(S′2, O′3)〉. (2.27)
Note that this constraint is not expressed in terms of directly measurable quantities in an experiment. We conjecture that some Bell inequalities can be derived from Eq. (2.27). In general, for
the n-partite case, let
Sn ≡ (1/2)[f(S′n−1, On) + f(On, Sn−1) + f(Sn−1, O′n) − f(S′n−1, O′n)], (2.28)
where S′n−1 is obtained from Sn−1 by exchanging Oi and O′i for all i < n. By induction we can
show that the expressions S2, ..., Sn−1 and S′2, ..., S′n−1 can be only 0 or 1 according to an LR state.
Hence, the expectation of Sn according to LR cannot be less than 0.
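The base case and the first induction step can be verified by brute force. The sketch below (our own check; S is a hypothetical helper implementing Eq. (2.28) recursively) confirms that S2, S′2, and S3 take only the values 0 or 1 under every deterministic assignment of bits to the Oi and O′i:

```python
from itertools import product

def f(x, y):
    # f(X, Y) = |x - y|, which satisfies the triangle inequality.
    return abs(x - y)

def S(n, o, op):
    # S_n of Eq. (2.28) for a deterministic LR assignment, where o[i] and
    # op[i] are the outcomes assigned to O_{i+1} and O'_{i+1}.
    if n == 2:
        return (f(op[0], o[1]) + f(o[1], o[0])
                + f(o[0], op[1]) - f(op[0], op[1])) / 2
    s = S(n - 1, o, op)        # S_{n-1}
    sp = S(n - 1, op, o)       # S'_{n-1}: O_i and O'_i exchanged for i < n
    return (f(sp, o[n - 1]) + f(o[n - 1], s)
            + f(s, op[n - 1]) - f(sp, op[n - 1])) / 2

# Over all 2^6 deterministic assignments, S_2, S'_2, and S_3 take only the
# values 0 or 1, so <S_n> >= 0 for every LR model (here n = 3).
ok = all(
    S(2, (o1, o2), (p1, p2)) in (0, 1) and
    S(3, (o1, o2, o3), (p1, p2, p3)) in (0, 1)
    for o1, p1, o2, p2, o3, p3 in product((0, 1), repeat=6)
)
print(ok)   # -> True
```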
From the above, we can see that the triangle inequality is a powerful tool for deriving various
Bell inequalities. Whether or not all Bell inequalities can be derived from triangle inequalities is
an interesting open problem and deserves further investigation.
2.4 Bell inequality and entanglement
Let us first introduce several concepts. We call a state of a quantum system pure if this
state corresponds to a vector |ψ〉 in the Hilbert space, that is, the state space of the system. In
a general situation one does not know which pure state a quantum system is in; it is only known
that the system is, with some probability pi, in one of the pure states |ψi〉. In this situation, the
state of the system is described by a density matrix
ρ = ∑i pi|ψi〉〈ψi|, with ∑i pi = 1 and pi ≥ 0. (2.29)
For a bipartite system, if the two subsystems 1 and 2 are in the pure states |ψ〉1 and |φ〉2,
respectively, the state of the composite system is |ϕ〉12 = |ψ〉1⊗ |φ〉2, the tensor product of the two
subsystems’ states. This state |ϕ〉12 is called a product state. In a general situation, the state of a
bipartite system is described by a density matrix ρ12. If the state ρ12 can be written as a convex
combination of product states
ρ12 = ∑ij pij |ψi〉1〈ψi| ⊗ |φj〉2〈φj |, with ∑ij pij = 1 and pij ≥ 0, (2.30)
the state ρ12 is called separable. Otherwise, the state is called entangled. So, for a bipartite system
its state is either separable or entangled. However, for a multipartite system, its state space has
a richer structure. We call a state ρ12...n of an n-partite system fully separable, if ρ12...n can be
written as a convex combination of product states
ρ12...n = ∑i1...in pi1...in |ψi1〉1〈ψi1 | ⊗ |φi2〉2〈φi2 | ⊗ . . . ⊗ |ϕin〉n〈ϕin |, with ∑i1...in pi1...in = 1 and pi1...in ≥ 0. (2.31)
Otherwise, the state ρ12...n is called entangled; however, it can be either fully entangled or partially
entangled and partially separable. A state ρ12...n of an n-partite system is called fully entangled
if it is not biseparable, that is, if it cannot be prepared by mixing states that are separable with
respect to some bipartitions of the n subsystems. For example, a tripartite state ρ123 is biseparable
if it can be written as a convex combination
ρ123 = ∑ij p(1)ij |ψi〉1〈ψi| ⊗ |φj〉23〈φj | + ∑ij p(2)ij |ψ′i〉2〈ψ′i| ⊗ |φ′j〉13〈φ′j | + ∑ij p(3)ij |ψ′′i 〉3〈ψ′′i | ⊗ |φ′′j 〉12〈φ′′j |,
with ∑ijl p(l)ij = 1 and p(l)ij ≥ 0. (2.32)
Here, |φ′′j 〉12, |φj〉23, or |φ′j〉13 is a pure state of the subsystem 12, 23, or 13, respectively. Note
that these pure states can be entangled states of their corresponding subsystems. If a state of
a composite system is both pure and entangled, then it is called a pure entangled state. If a
state is entangled but not pure, it is called a mixed entangled state. For more discussions about
entanglement including its classification and quantification, see the review papers [10, 81].
Since a separable state can be decomposed into a convex combination of product states, it can
be shown that all local measurements on the subsystems of a composite system in a separable
state admit LR descriptions. Hence, the violation of a Bell inequality signifies that
the state is entangled, as first pointed out by Terhal [82]. For example, the violation of the CHSH
inequality (1.2) can detect all pure entangled two-qubit states [46] and some mixed entangled
two-qubit states [83]. Moreover, entanglement detection based on a Bell-inequality violation is
device-independent. That is, one can infer the presence of entanglement even when the tested
quantum state and the measurement settings chosen in an experiment are unknown.
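As an illustration for pure states, the CHSH value of the state cos(θ)|00〉 + sin(θ)|11〉 can be computed directly. The sketch below is our own parametrization (not from the thesis): it assumes measurements in the x–z plane, for which the quantum correlation is E(α, β) = cos α cos β + sin(2θ) sin α sin β, and uses Bob's angle β = arctan(sin 2θ), which is optimal for Alice's settings α ∈ {0, π/2}:

```python
import math

def chsh_value(theta):
    # Quantum CHSH value for |psi> = cos(theta)|00> + sin(theta)|11> with
    # measurements in the x-z plane:
    #   E(a, b) = cos(a) cos(b) + sin(2 theta) sin(a) sin(b)   (assumed model)
    c = math.sin(2 * theta)
    E = lambda a, b: math.cos(a) * math.cos(b) + c * math.sin(a) * math.sin(b)
    beta = math.atan(c)   # optimal Bob angle for Alice's settings 0 and pi/2
    return E(0, beta) + E(0, -beta) + E(math.pi / 2, beta) - E(math.pi / 2, -beta)

# Every theta in (0, pi/2) gives a value above the LR bound of 2;
# theta = pi/4 (a Bell state) gives 2*sqrt(2).
for theta in (0.1, 0.4, math.pi / 4):
    print(chsh_value(theta) > 2)  # -> True each time
```

Analytically this evaluates to 2√(1 + sin²2θ), which exceeds the LR bound 2 for every θ in (0, π/2), consistent with the claim of Ref. [46].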
In general, one can detect entanglement based on an entanglement witness. Such a witness
is an observable W with the properties Tr(Wρent) < 0 for at least one entangled state ρent and
Tr(Wρsep) ≥ 0 for all separable states ρsep [82, 84]. Note that an entanglement witness corresponds
to a linear inequality constraining the probability distribution of experimental results; by
subtracting suitable nonlinear terms, the modified expression can detect more entangled states than
the original witness [85].
For each entangled state ρent, there exists an entanglement witness detecting it [84]. For a
bipartite 2×2 or 2×3 dimensional quantum system, if a state ρ12 is entangled, the partial transpose
of ρ12 on the subsystem 1 or 2 has a negative eigenvalue. Suppose that the state of a bipartite
system is given as
ρ12 = ∑i,j,l,k Mil,jk|ψi〉1〈ψj | ⊗ |φl〉2〈φk|, (2.33)
where Mil,jk is a complex number, and |ψi〉1 and |φl〉2 are state-space bases of subsystems 1 and 2,
respectively. The partial transpose of ρ12 on a subsystem, for example on subsystem 2, is defined
as
Γ(2)(ρ12) = ∑i,j,l,k Mil,jk|ψi〉1〈ψj | ⊗ (|φl〉2〈φk|)T = ∑i,j,l,k Mil,jk|ψi〉1〈ψj | ⊗ |φk〉2〈φl|. (2.34)
From this definition, it is easy to see that the partial transpose on subsystem 1, Γ(1)(ρ12), is given as
the matrix transpose of Γ(2)(ρ12). Hence, the eigenvalues and corresponding eigenstates of Γ(1)(ρ12)
and Γ(2)(ρ12) are the same. The entanglement witness detecting ρ12 can be constructed from the
eigenstate of Γ(2)(ρ12) corresponding to a negative eigenvalue. The motivation for this construction
is that, the partial transpose of a separable state (of any dimension) has no negative eigenvalue [86].
However, this construction does not work for all entangled states, since there is an entangled state
whose partial transpose has no negative eigenvalue [87]. In general, constructing an entanglement
witness to detect an arbitrary entangled state is a hard problem.
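The partial-transpose criterion can be illustrated with a short numerical sketch (our own, using NumPy; the index convention follows Eq. (2.34)):

```python
import numpy as np

def partial_transpose(rho, dims=(2, 2)):
    # Transpose subsystem 2 of a bipartite density matrix, following the
    # index convention of Eq. (2.34): M_{il,jk} -> M_{ik,jl}.
    d1, d2 = dims
    r = rho.reshape(d1, d2, d1, d2)
    return r.transpose(0, 3, 2, 1).reshape(d1 * d2, d1 * d2)

# Bell state (|00> + |11>)/sqrt(2): entangled, so the partial transpose
# has a negative eigenvalue (here -1/2).
phi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
rho_ent = np.outer(phi, phi)
print(np.linalg.eigvalsh(partial_transpose(rho_ent)).min())  # negative

# A separable mixture of the product states |00><00| and |11><11|:
# its partial transpose has no negative eigenvalue [86].
rho_sep = 0.5 * np.diag([1.0, 0.0, 0.0, 0.0]) + 0.5 * np.diag([0.0, 0.0, 0.0, 1.0])
print(np.linalg.eigvalsh(partial_transpose(rho_sep)).min())  # nonnegative
```

The eigenstate of the partial transpose belonging to the negative eigenvalue is the ingredient from which the witness described above can be built.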
The detection of entanglement based on a witness, however, requires a detailed characterization of the system observed (such as its dimension) and of the measurements performed in an
experiment. Otherwise, the measured witness operator W ′ is different from the ideal witness W ,
so that even if Tr(Wρsep) ≥ 0 for all separable states ρsep, it is possible Tr(W ′ρsep) < 0 for some
separable states. To overcome this problem, researchers derived several generalized Bell-type inequalities [63, 88, 89, 90, 91, 92, 93, 94, 95] satisfied by all fully or partially separable states, so
that the violation of such a generalized inequality certifies that the quantum state is entangled or
fully multipartite entangled without assuming the dimension of the system observed or knowing the
measurements performed. Note that, in general, these generalized Bell-type inequalities are different
from Bell inequalities, since the former inequalities are derived without assuming LR descriptions
of quantum systems.
The exact relationship between the violation of LR and entanglement is still poorly under-
stood. Quantitatively, the violation of LR and entanglement are different. The maximal violation
of LR according to various measures (e.g., the violation of a Bell inequality or the Kullback-Leibler
divergence [24]) is generally not given by maximally entangled states [96]. For example, as pointed
out in Sec. 2.3.1, the maximal violation of the CGLMP inequality is not given by the maximally
entangled state of two d-level systems.
Qualitatively, all bipartite or multipartite pure entangled states, where each subsystem may
have a different dimension, violate a Bell inequality [97]. However, for mixed states, entanglement
does not promise a violation of LR. Specifically, some mixed states have bound entanglement,
that is, from these states a singlet state cannot be distilled. For example, if the partial transpose
of an entangled state has no negative eigenvalue, this state has bound entanglement [87]. Peres
conjectured that such states always admit LR descriptions [7]. Peres’s conjecture is an interesting
open problem, and there are many recent works trying to prove or disprove it. Recently, Peres’s
conjecture was disproved in the multipartite case [98]. Specifically, Ref. [98] exhibits a three-qubit
entangled state that is biseparable (see Eq. (2.32)) and so bound entangled, but the measurement
results according to this state violate a tripartite Bell inequality. But Peres’s conjecture remains
open in the bipartite case. (The numerical optimization results in the recent work [99] suggest that
Peres’s conjecture is correct in the bipartite case.) The above shows that the violation of LR and
entanglement are different concepts.
2.5 Bell inequality, steering, and contextuality
The violation of a Bell inequality certifies not only entanglement but also two other properties
of a quantum system, steering [100, 101, 102] and contextuality [103, 104].
Steering describes the ability of Alice, by performing different measurements on her own
system, to remotely prepare Bob’s system into different ensembles of pure states [100]. For example,
suppose that Alice and Bob share the Bell state |ψBell〉 = 1/√2(|0〉A|0〉B + |1〉A|1〉B), where |0〉 and
|1〉 are the two eigenstates of the measurement operator σz = (1 0; 0 −1). Then, if Alice performs the
measurement σz on her qubit, Bob's qubit will be left in a state from the ensemble {|0〉B, |1〉B};
however, if Alice performs the measurement σx = (0 1; 1 0) on her qubit, Bob's qubit will be left in a
state from a different ensemble {|+〉B, |−〉B}, where |+〉 and |−〉 are the two eigenstates of σx. Note
that, in each case, after Alice’s measurement Bob’s qubit will be prepared into one pure state in
the corresponding ensemble and the prepared state depends on Alice’s measurement outcome. So,
a Bell state exhibits steering. On the other hand, for a separable state as in Eq. (2.30) shared by
Alice and Bob, after any measurement performed by Alice on her system, the ensemble of possible
pure states for Bob’s system is always the same. Hence, a separable state does not exhibit steering.
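This steering of Bob's ensemble by Alice's setting can be computed directly. A minimal sketch (our own; bob_state is a hypothetical helper returning Bob's conditional state after Alice projects onto a given outcome):

```python
import numpy as np

# Bell state |psi_Bell> = (|0>_A|0>_B + |1>_A|1>_B) / sqrt(2)
psi = (np.kron([1.0, 0.0], [1.0, 0.0]) + np.kron([0.0, 1.0], [0.0, 1.0])) / np.sqrt(2)

def bob_state(alice_outcome):
    # Bob's normalized conditional state (<a| (x) I) |psi> after Alice
    # projects her qubit onto |a> = alice_outcome.
    v = np.kron(np.asarray(alice_outcome, dtype=float), np.eye(2)) @ psi
    return v / np.linalg.norm(v)

s = 1 / np.sqrt(2)
# sigma_z outcomes |0>, |1> steer Bob into the ensemble {|0>, |1>} ...
print(bob_state([1, 0]), bob_state([0, 1]))
# ... while sigma_x outcomes |+>, |-> steer Bob into {|+>, |->}.
print(bob_state([s, s]), bob_state([s, -s]))
```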
Recently, Wiseman et al. showed that the above operational definition of steering is equivalent
to the violation of the local hidden state model for Bob [101]. In the local hidden state model,
the measurement outcome of Alice is determined by a hidden variable while the distribution of the
measurement outcomes of Bob is explained by a set of quantum states that are correlated with the
values of the hidden variable. According to this mathematical formulation, a local hidden state
model is a special LR model. Specifically, the local hidden state models form a strict and convex
subset of LR models [102]. Hence, the violation of a Bell inequality demonstrates the ability of
Alice to steer the state of Bob’s system.
Contextuality means that, if there is a hidden variable explaining the outcome of a mea-
surement performed on a system, then the outcome assigned to this measurement by the hidden
variable depends on the experimental context, i.e., which other compatible measurements are per-
formed simultaneously on the system. Hence, to show the contextuality of a quantum system, we
consider the constraints satisfied by all non-contextual hidden variable models. It turns out that
non-contextual hidden variable models are special LR models. Hence, a constraint on LR models,
such as a Bell inequality, is also satisfied by all non-contextual hidden variable models. Therefore,
a Bell-inequality violation demonstrates the contextuality of a quantum system.
As in the case of local hidden variables, given a set of experimental contexts, the probability
distributions described by non-contextual hidden variables form a convex polytope, the
non-contextual polytope. One difference between these two kinds of hidden variables is that a
non-contextual hidden variable is associated with a set of compatible measurements, while a local
hidden variable is associated with a set of spacelike-separated measurements. Another difference is
that, unlike a Bell inequality, there exists a non-contextuality inequality, corresponding to a facet
of the non-contextual polytope, such that it can be maximally violated by all quantum states of a
system with the same set of measurements [105].
2.6 Bell inequality and private information
Suppose that two parties, Alice and Bob, share a pair of spin-1/2 particles which is in the
singlet state 1√2(| ↑↓〉 − | ↓↑〉). If they measure their particles’ spins along the same direction,
they will observe exactly anticorrelated outcomes that are unknown to a third party. Hence, if
Alice and Bob can verify that they share the singlet state, for example through testing the CHSH
inequality (1.2) as suggested by Ekert in 1991 [106], they can build a secure quantum channel for
sharing private information. However, Ekert’s scheme [106] relies on two assumptions that cannot be
verified in practice: (i) the system and any eavesdropper must obey the laws of quantum mechanics,
and (ii) Alice and Bob have perfect control of the state preparation and of the measurement devices,
i.e., they know how their devices work.
Gradually researchers realized that, provided the no-signaling principle can be trusted and
without assuming quantum mechanics, in the device-independent scenario one can still extract se-
cure private information from measurement outcomes that violate LR [3, 11, 13, 107, 108, 109, 110,
111, 112]. For example, in quantum key distribution, if Alice’s and Bob’s measurement outcomes
violate LR, whatever is the underlying physical theory that produces these outcomes, the eaves-
dropper cannot have full information about them. Otherwise, the eavesdropper’s information could
be treated as an LR description of these outcomes. Hence, the violation of LR can be thought of
as a privacy witness.
Chapter 3
Challenges of testing local realism
3.1 Experimental configuration for testing local realism
The experimental procedure for testing a bipartite Bell inequality is shown in Fig. 3.1. We call
such a procedure a trial. At each trial the locality assumption must be satisfied, as shown in the inset
space-time diagram of Fig. 3.1. After many trials, one can estimate the correlations or probabilities
appearing in a Bell inequality and determine whether the Bell inequality is violated based on these
estimates. If the Bell inequality is violated, then in view of the statistical fluctuations over finitely
many trials, experimenters conventionally present the violation in terms of the number of experimental
standard deviations (SDs) by which the Bell inequality is violated. For example, Weihs et al. [113] reported
an experimental estimate ICHSH = 2.73±0.02 and claimed a violation of the CHSH inequality (1.2)
by 30 SDs. However, there are several loopholes that are never closed simultaneously in a test of
local realism (LR). In the following, we discuss them individually.
3.2 The locality loophole
As discussed in Sec. 2.1 of Chapter 2, a Bell inequality is derived based on two conditions—
locality and realism. Hence, to show a violation of LR, an experiment where the data violate a Bell
inequality should satisfy these two conditions.
Realism assumes that the outcome of an arbitrary measurement on a quantum system is in
principle predetermined, independent of the interaction between the system and the measurement
apparatus. This assumption is the belief of local realistic (LR) theorists and so one can pretend
Figure 3.1: The experimental procedure for testing a bipartite Bell inequality. The inset reflects the locality condition in the space-time diagram.
that it is true in an experiment. But, the locality condition requires experimenters’ efforts to
ensure that it is satisfied. Specifically, the experiment should satisfy the following two conditions:
first, the distance between different parties of the experiment should be large enough to prevent
light-speed or slower communication between one observer’s measurement choice and the result
of another observer’s measurement; second, local measurement choices should be made randomly,
and one should make sure that these choices are independent of each other and also that they are
independent of the LR state of the particle pair emitted from a source. These requirements are
illustrated in Fig. 3.1. Unfortunately, the locality condition is not satisfied in most experiments
performed so far. The failure of this condition is called the locality loophole [31].
Before the epoch-making experiment reported in Ref. [114], all experiments for testing LR
were performed with static setups, in which measurement settings are held fixed for many successive
trials. This leaves open the possibility that the LR state of the particle pair emitted at a trial depends on the
setting choices made at that trial. In 1982, Aspect et al. [114] performed the first experiment
using time-varying measurement settings. However, the settings were switched periodically during
the experiment, so they were actually predictable, and communication even at a speed slower
than that of light could explain the observed results.
the first experiment that closed the locality loophole. In this experiment, Alice and Bob each
chose a local measurement setting randomly and independently after the entangled photon pair
left the source, and the measurement choice of one observer and the measurement outcome of
the other were spacelike separated. In 2010, another experiment [115] was performed that
improved on the result of Ref. [113] by having spacelike separation between the entangled photon
pair emission and the random local setting choices.
3.3 The detection loophole
To test LR, each observer needs to perform a local measurement on his or her own particle
emitted from a common entanglement source. During this process, the particles are subject to transmission
loss or detection loss. Furthermore, most experimental tests of LR utilize entangled photon
pairs, which are generated probabilistically by spontaneous parametric down-conversion [116, 117].
Due to particle loss or failed particle generation, particles are detected at only a small fraction of
experimental trials, and at most trials the detectors do not respond. It is possible
that the detected outcomes violate LR while the full pattern of measurement outcomes admits
an LR model. This reflects the fact that, even if the experimental probability distribution
lies inside the LR polytope in the probability space associated with the performed measurements
and all possible outcomes, the distribution conditioned on detections may still lie outside
the LR polytope in the corresponding subspace. This kind of problem is generally called the
detection loophole [27].
In photonic experiments such as in Refs. [113, 114, 115, 116, 117], the violations of LR are
inferred using only the outcomes from detected photons. Hence, these results are subject to the
detection loophole. To justify these results, one needs to employ the fair-sampling assumption, i.e.,
the subensemble of detected photons is assumed to be representative of the entire ensemble. Otherwise, one needs to improve the overall detection efficiency, including decreasing the transmission loss
and increasing the entangled photon pair generation probability. To show a loophole-free violation
of the CHSH inequality (1.2) or the CH inequality (1.3), the minimum detection efficiency using
a Bell state is 82.85 % [118], while using an unbalanced Bell state of the form cos(θ)|00〉+ sin(θ)|11〉
the minimum detection efficiency approaches 2/3 as θ goes to 0 [28]. If one particle in the entangled
pair is always detected, the minimum efficiency for detecting the other particle can be as low
as 1/2 [50, 51]. Using other Bell inequalities or other entangled states, the minimum detection
efficiency can be further decreased [1, 50, 76, 119, 120, 121].
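The 82.85 % figure for a Bell state can be recovered numerically from the CH expression: joint-detection probabilities scale as η² with overall detection efficiency η, singles as η, so the CH value turns positive only above a critical efficiency. A sketch under these assumptions (the settings and the prediction P(+,+ | a, b) = ½ cos²((a − b)/2) are the standard ones for a maximally entangled state; the helper names are ours):

```python
import math

def ch_value(eta):
    # CH expression for a Bell state with overall detection efficiency eta.
    p = lambda delta: 0.5 * math.cos(delta / 2) ** 2    # P(+,+ | a, b)
    a1, a2, b1, b2 = 0.0, math.pi / 2, math.pi / 4, -math.pi / 4
    joints = p(a1 - b1) + p(a1 - b2) + p(a2 - b1) - p(a2 - b2)
    singles = 0.5 + 0.5                                 # P(A1 = +) + P(B1 = +)
    return eta ** 2 * joints - eta * singles            # LR bound: <= 0

# Bisect for the efficiency at which the CH value crosses zero.
lo, hi = 0.5, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if ch_value(mid) < 0 else (lo, mid)
print(round(hi, 4))   # -> 0.8284, i.e. 2*(sqrt(2) - 1)
```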
The only experiments that have closed the detection loophole so far involve entangled ions
or atoms [3, 122, 123, 124]. In such an experiment, the outcome from every trial is used to test the
CHSH inequality. The overall efficiency for detecting an ion or atom is about 98 %, high enough for
closing the detection loophole. However, in the experiment reported in Ref. [122], entangled ions
were only a few micrometers apart and the detection time was a few milliseconds, so this experiment
did not close the locality loophole. Later, the separation between two entangled ions was extended
to about one meter [3, 123]. Most recently, heralded entanglement between two neutral atoms
trapped independently 20 meters apart was created, and the violation of the CHSH inequality was
observed [124]. But, to close the locality loophole, it is necessary to increase the separation.
3.4 The memory loophole
Besides the locality and detection loopholes, there is another loophole, introduced by considering time-varying experimental configurations or LR models with memory. In general, to test
LR, a sequence of measurements is performed successively on a state prepared repeatedly. In the
analysis of experimental data, it is usually assumed that the LR model at a trial is independent of
previous trial results, i.e., previous measurement-setting choices and outcomes. Also, in the con-
ventional way of presenting a Bell-inequality violation in terms of the number of SDs, it is implicitly
assumed that the prepared quantum state and measurement settings do not vary during an experi-
ment. However, LR models can take advantage of previous trial results in order to explain the next
trial result better; and the experimental configuration may be unstable so that the quantification
of the evidence against LR in terms of the observed number of SDs of violation is not justified. The
violation of LR due to these possibilities is called the memory loophole [20, 21, 22, 23].
As shown by Barrett et al. [23] and Gill et al. [125], a Bell inequality is satisfied by all
probability distributions of trial results predicted by LR models, no matter whether these models
have memory or not. Hence, if an experimental probability distribution does not satisfy a Bell
inequality, the experiment reliably demonstrates the violation of LR. However, since the experimental
probability distribution is unknown after finitely many trials, the arguments in Refs. [23, 125] cannot be
used to justify a violation of LR witnessed by a finite amount of data. The memory effect can
significantly influence the statistical fluctuations in LR predictions, and hence the uncertainty of
an observed violation of a Bell inequality after finitely many trials. It therefore affects the probability,
according to LR, of predicting a violation as high as that observed. A rigorous bound on such a
probability was proposed by Gill [18, 19]; it is satisfied by all LR models, with or without memory.
However, this bound is not tight. The main part of this thesis shows how to achieve a tight bound
(Chapter 5) and how to efficiently compute high-quality bounds (Chapter 6) while accounting for
memory effects.
3.5 Possibilities of loophole-free violations of LR
To date, no experiment has demonstrated a loophole-free violation of LR. It is still an open
problem to determine which systems are the best candidates for closing both the locality and
detection loopholes simultaneously in the near future. In the following, we discuss the challenges
that need to be overcome in different systems in order to perform loophole-free tests of LR.
In tests of LR using photons, such as the first experiment [113] that closed the locality
loophole, the photon detection efficiency is about 5 %, not high enough for closing the detection
loophole. With the rapid development of photon sources and detectors, it is likely that the detec-
tion loophole in photonic experiments will be closed soon. Recently, entangled photon pairs with
high generation probability [126, 127] and photon-number-resolving detectors with high detection
efficiency (≥ 95 %) [128, 129] and with short timing jitter (≤ 4 ns) [130, 131] were developed. The
problem left for closing the detection loophole is how to integrate state-of-the-art photon sources and
detectors and at the same time minimize the photon loss in the transmission and measurement
apparatuses.
Actually, only a few months ago Giustina et al. demonstrated the violation of the CH
inequality using entangled photons without the fair-sampling assumption [132]. However, their
data analysis has problems, so the claimed high number of SDs of violation is not justified.
Also, since the photon-pair source is continuous-wave rather than pulsed, there is no well-defined
“trial” in this experiment, and LR theorists can take advantage of this drawback to
explain the observed data.
In atomic experiments, the main problem is how to entangle two ions or atoms far away from
each other. Considering the fastest atom (or ion) detection scheme so far [133], the detection time
is about 1 µs and so the separation between two ions or atoms should be at least 300 m in order to
close the locality loophole. It is very difficult to entangle two ions or atoms at such a long distance.
Usually, the entanglement between one ion or atom and one photon is first established, which is
relatively easy to realize [134, 135]. After that, by an appropriate joint detection of the two photons
where each photon is entangled with one ion or atom, two distant ions or atoms are entangled [136].
In principle, entangling two distant ions or atoms is feasible. However, in experiments [3, 123], due
to the low photon collection and detection efficiency, the probability of heralding an entangled ion
pair at the distance of one meter is very low (about 2× 10−8), and only one entangled ion pair is
generated every 8 minutes. Also, in the most recent experiment [124], one pair of entangled atoms
20 m apart was generated every 2 minutes. So far no entanglement generation between two ions or
atoms separated by a longer distance has been reported.
Other proposals for loophole-free violations of LR include using entangled atom/ion-photon
systems [136, 137], entangled systems of dimension larger than two [1], continuous-variable measure-
ments on squeezed light [138, 139], or wave-particle correlations of entangled photons [140, 141, 142].
Compared with separating two entangled atoms or ions, it is easier to separate a photon
from an atom or ion. Due to the high efficiency of detecting an atom or ion, the minimum photon
detection efficiency required to close the detection loophole is decreased [50, 51]. However, the
overall efficiency of detecting a photon emitted from an atom or ion is still lower than the minimum
efficiency required. A more serious problem is that the entanglement creation between
an atom/ion and a photon is probabilistic and cannot be heralded, making a loophole-free test
difficult. There is one ion-photon experiment [143] demonstrating a violation of the CHSH
inequality, but this experiment does not close either the detection loophole or the locality loophole.
Using a higher-dimensional system can lower the minimum detection efficiency, but the required
state is hard to prepare. Continuous variables can be measured with efficiency close to one;
however, the required state is infeasible to prepare, or the displayed violation is very small and
sensitive to noise in an experiment. Using wave-particle correlations can partially reduce the
difficulty of the state preparation, but the photon detection efficiency required is still hard to
achieve. In summary, all these proposals have their advantages but also their disadvantages. There
are few experimental demonstrations of these proposals.
Chapter 4
Statistical strength of experiments for rejecting local realism
From Chapter 3, we can see that all experimental tests of local realism (LR) performed so far
are subject to at least one of the several loopholes discussed. Experiments carried out on trapped
atoms or ions closed the detection loophole [3, 122, 123, 124], but these particles were close to
each other (at most 20 meters apart) and the detection time was long (at least 1 µs), so these
experiments did not close the locality loophole. There have been photonic experiments addressing
the locality loophole [113, 114, 115, 144]. Yet due to low photon detection efficiency, photonic
experiments have not closed the detection loophole (at least at a significant level).
Previous results show that closure of the detection loophole requires a minimum detection
efficiency of 82.85 % when a Bell state is used [118]. With unbalanced Bell states cos(θ)|00〉 +
sin(θ)|11〉, the minimum detection efficiency approaches 2/3 as θ goes to 0 [28]. These results are
obtained via the Clauser-Horne-Shimony-Holt (CHSH) inequality (1.2). To test this inequality, each
local measurement should have two possible outcomes, such as whether a photon is horizontally
or vertically polarized. However, in a photonic experiment, if the measurement apparatus consists
of one polarization rotator, one polarizing beam splitter, and two photon detectors (see Fig. 4.1),
the measurement outcome can also be no detection at either detector. In this case, to test the
CHSH inequality, we need to combine the no-detection outcome with one polarization. After this
combination, the evidence against LR generally becomes weaker. To study the violation of LR
and the minimum detection efficiency required without choosing a particular Bell inequality, we
quantify the experimental evidence against LR by a measure called the statistical strength (see
Sec. 4.2).
In this chapter, we study the possibility of rejecting LR with a source of entangled states
created from two independent polarized photons passing through a polarizing beam splitter. Simi-
lar sources are used in Refs. [116, 117, 145]. We call this source the “independent inputs” source.
Although this source does not produce balanced or unbalanced Bell pairs, it does create some
entanglement. An advantage of this source is that the input photons do not need to be entan-
gled. The two independent polarized photons can be generated by spontaneous parametric down-
conversion (SPDC) in nonlinear crystals [116, 117, 145], or by other single-photon sources being
developed such as atoms, ions, molecules, solid-state quantum dots, or nitrogen-vacancy centers in
diamond [146, 147]. The states of the two photons can be detected by photon counters or pho-
ton detectors. (We use the term “photon detector” to refer to detectors that determine only the
presence or absence of photons, not their number.) Since experimenters can gain more information
with photon counters than with simple photon detectors, we expect that photon counters make
violation of LR more detectable. We also expect that photon counters can mitigate the influence
of the effectively unentangled part of the state.
Our results show that it is possible to perform a test of LR free of the detection loophole
using the independent inputs source, assuming that the detection efficiency of photon counters
(photon detectors) is at least 89.71 % (at least 91.11 %, respectively), showing a small advantage
for photon counters. Furthermore, we numerically quantify the statistical strength of such a test
of LR as a function of the counter or detector efficiency and state parameters. For comparison, we
obtain the same information for an ideal source of unbalanced Bell states. This makes it possible
to estimate the minimum number of trials required to gain reasonable confidence in rejecting LR,
as this number is inversely related to statistical strength.
In Sec. 4.1, we briefly describe the experimental scheme that we analyze. In Sec. 4.2, we
point out the deficiencies of the most commonly used method for quantifying the violation of LR
and summarize the method based on the Kullback-Leibler (KL) divergence proposed in Ref. [24].
We present our results in Sec. 4.3, and we make concluding remarks in Sec. 4.4. This chapter is
based on our previous work [26].
4.1 Experimental configuration
Here we consider a test of LR using pairs of polarized photons which are in the same spatial-
temporal mode. The two photons can be generated by an SPDC process [116, 117, 145] in the
weak-pumping regime, although single-photon sources could be used [146, 147]. Given such photon
pairs, they can be processed as shown in Fig. 4.1 to produce a state that can violate LR.
Figure 4.1: Schematic of a test of LR with the independent photons source. Two spatially and temporally matched polarized photons are inserted at 1 and 2. The polarization rotators PR1 and PR2 are set so that photons 1 and 2 are linearly polarized along the same direction when they reach the polarizing beam splitter PBS1. After PBS1, the photons are in a nonmaximally entangled state (see Eq. (4.3)) and are sent to Alice's and Bob's measurement setups. Each measurement setup uses a polarization rotator (PR), a polarizing beam splitter (PBS), and two photon detectors (D), which may be photon counters. The PR is used to select measurement bases by rotating the photon's polarization state.
Consider a pair of photons arriving in modes 1 and 2 of Fig. 4.1 in the state
|ψ〉12 = |H〉1|H〉2, (4.1)
where H (V ) denotes horizontal (vertical) polarization. We set the polarization rotators PR1 and
PR2 to the same angle to produce the state
|ψ′〉12 = (α|H〉1 + β|V 〉1)(α|H〉2 + β|V 〉2), (4.2)
where |α|2 + |β|2 = 1. After the polarizing beam splitter PBS1, we get the “pseudo-Bell” state
|ψpB〉 = α²|H〉3|H〉4 + β²|V〉3|V〉4 + αβ|H〉3|V〉3 + αβ|H〉4|V〉4. (4.3)
Using these states, we can perform a test of LR. Motivated by the result of Eberhard [28],
we investigate the possibility of reducing the minimum detection efficiency required to close the
detection loophole in a test of LR by changing the values of α and β in Eq. (4.3).
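As a quick check of Eq. (4.3), the action of PBS1 on the product state of Eq. (4.2) can be traced symbolically. The sketch below (a minimal illustration, assuming the common convention that the PBS transmits H and reflects V; the specific mode map is one choice of labeling, not fixed by the text) collects the amplitudes of the four output terms:

```python
import itertools

def pbs_output(alpha, beta):
    """Amplitudes of the two-photon state after PBS1 when each input
    photon is alpha|H> + beta|V>.  Assumed convention: the PBS
    transmits H and reflects V, with mode map
    (1,H)->(3,H), (1,V)->(4,V), (2,H)->(4,H), (2,V)->(3,V)."""
    mode_map = {(1, 'H'): (3, 'H'), (1, 'V'): (4, 'V'),
                (2, 'H'): (4, 'H'), (2, 'V'): (3, 'V')}
    amp = {'H': alpha, 'V': beta}
    out = {}
    for p1, p2 in itertools.product('HV', repeat=2):
        # route each photon through the PBS and collect the joint term
        term = tuple(sorted((mode_map[(1, p1)], mode_map[(2, p2)])))
        out[term] = out.get(term, 0) + amp[p1] * amp[p2]
    return out

# The four terms reproduce Eq. (4.3): alpha^2 |H>3|H>4, beta^2 |V>3|V>4,
# alpha*beta |H>3|V>3, and alpha*beta |H>4|V>4.
state = pbs_output(0.6, 0.8)
```

Running this with α = 0.6, β = 0.8 gives exactly the four amplitudes α², β², αβ, αβ of Eq. (4.3).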
When we set |α| = |β| = 1/√2 in Eq. (4.3) and condition on coincidence postselection, we
may treat the pseudo-Bell state as a maximally entangled state, as in the experiments reported in
Refs. [116, 117, 145]. This postselection process discards events where both photons leave PBS1
in the same direction, effectively projecting onto a Bell state. However, the discarded events may
create another loophole similar to the detection loophole for tests of LR [148, 149]. To close this
loophole, the entire pattern of experimental data must be included when evaluating a violation of
a Bell inequality [150]. Here, we also use all data without postselection, but instead of obtaining a
violation of a Bell inequality, we quantify the experimental evidence against all local realistic (LR)
models by means of measures derived from the KL divergence.
4.2 Data analysis method
Contradictions between experimental results and LR are often shown by the violation of a Bell
inequality, such as the CHSH inequality ICHSH ≤ 2 as in Eq. (1.2). To test the CHSH inequality in
an experiment, one needs to estimate the probabilities of various outcomes from a finite number of
trials. Due to uncertainties in the estimated probabilities, it is conventional to present the violation
of LR in terms of the number of experimental standard deviations (SDs) separating the estimate of
ICHSH from its LR upper bound, i.e., 2. For example, Weihs et al. [113] reported an experimental
estimate ICHSH = 2.73± 0.02 and claimed a violation of the CHSH inequality by 30 SDs.
While the experimental SD provides the precision with which a Bell-inequality violation is
measured, there are several problems with the number of SDs of violation. First, although the
SD partially quantifies the measurement uncertainty due to a finite number of trials, it does not
characterize the probability that an LR system could also violate a Bell inequality after a finite
number of trials. Because such a system’s (non-)violation can have a larger SD, the experimental
SD may suggest more confidence in rejecting LR than justified (see Chapter 5 for examples).
Second, one would expect that the probability distribution of the estimate of ICHSH under LR is
Gaussian, since this appears to be justified by the central limit theorem [151] as the number of
trials approaches infinity. It therefore seems reasonable to statistically quantify the violation by the
probability that a Gaussian random variable can exceed the mean by the number of SDs of violation
experimentally observed. However, for a finite number of trials and high violation, the Gaussianity
assumption fails. Third, the computation of SDs assumes that the trials are independent and
identically distributed; that is, it does not consider the memory effect [20, 21, 22, 23]. We cannot
expect the prepared states and experimental settings to be stable over the course of a long sequence
of trials. In addition, we cannot exclude the possibility that the LR model for the experiment at a
given trial depends on the previous trial results, i.e., previous measurement-setting choices and outcomes.
Fourth, it is desirable to compare experimental results from different tests of LR, but the effects of
the problems with experimental SDs depend on the Bell inequality, the quantum state, measurement
settings, detection efficiency, and other experimental parameters. Consequently, the number of SDs
of violation cannot be used to directly compare the amount of evidence for rejecting LR obtained
from different experimental tests.
To avoid these problems, in this chapter we quantify the violation of LR by the statistical
strength of a test of LR as proposed by van Dam et al. [24]. The statistical strength is charac-
terized by the KL divergence from the experimental probability distribution to the best prediction
according to LR. This measure is justified by the observation that the confidence at which the
experimental data violate LR is asymptotically related to the statistical strength [25]. In the next
two chapters we propose methods to rigorously quantify the confidence in rejecting LR obtained
from a finite set of data.
To better understand the approach based on the KL divergence, it is helpful to analyze tests
of LR in terms of a two-player game. The two players are the quantum experimenter QM and
the LR theorist LRT who wants LR to prevail. During the test of LR, given a source of quantum
states, experimenter QM can randomly change the measurement settings. After a large number N
of trials, QM estimates the probability distribution q of measurement settings and outcomes from
the experimental data, which, hopefully, is consistent with the quantum-mechanical prediction and
violates LR. At the same time, knowing the state preparation procedure and the distribution of
measurement settings but not the actual settings or outcomes at a trial, LRT can design all kinds
of different LR models, predicting different probability distributions p for the settings and outcomes.
(We are assuming that state preparation protocols and measurement-setting distributions are not
changed during the experiment.) The goal is to make p as consistent as possible with the eventually
obtained estimated distribution q. This requires minimizing a distance between QM's estimate
q and LRT’s prediction p. Following the argument in Ref. [24], this distance can be measured by
the KL divergence from q to p, as defined by
DKL(q ‖ p) = ∑_{k=1}^{K} ∑_{l=1}^{L} q(k, l) log2 [q(k, l)/p(k, l)], (4.4)
where k is the measurement-setting index, K is the number of different measurement settings, l is
the measurement outcome index, and L is the number of different measurement outcomes under
each measurement setting. For example, in the test of the CHSH inequality using photon pairs
entangled in polarization, k denotes one of the measurement settings (A1, B1), (A1, B2), (A2, B1),
or (A2, B2), and so K = 2× 2 = 4; l denotes one of the outcomes (H, H ), (H, V ), (V, H ), or (V,
V ) (assuming perfect detection) and so L = 2× 2 = 4.
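In code, the divergence of Eq. (4.4) is a short sum over the joint (setting, outcome) probabilities. A minimal sketch (the distributions q and p below are illustrative numbers, not taken from any experiment):

```python
import math

def kl_divergence(q, p):
    """D_KL(q||p) of Eq. (4.4) in bits, over joint (setting, outcome)
    probabilities flattened into lists; terms with q = 0 contribute zero."""
    return sum(qi * math.log2(qi / pi) for qi, pi in zip(q, p) if qi > 0)

# Illustrative four-entry distributions: q tilts the outcome
# probabilities away from a uniform prediction p.
q = [0.30, 0.20, 0.20, 0.30]
p = [0.25, 0.25, 0.25, 0.25]
d = kl_divergence(q, p)   # strictly positive since q != p
```

The divergence vanishes exactly when p = q and grows as the predictions separate, which is what makes it a usable notion of distance between LRT's model and QM's estimate.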
The KL divergence has the property that DKL(q ‖ p) ≥ 0, with equality if and only if p = q.
Since there are many different LR models, LRT has the freedom to choose the best one pLR, namely,
the one that minimizes the KL divergence from q. We then define the statistical strength Sq of the
distribution q for rejecting LR according to
Sq ≡ DKL(q ‖ pLR) = min_{p∈L} DKL(q ‖ p), (4.5)
where L is the set of LR models. Likewise, QM also has the freedom to choose different measurement
settings and setting distributions so that the best LR model explains the experimental data poorly.
Hence, the general problem is to determine the optimal statistical strength S of tests of LR subject
to experimental constraints, which is defined to be
S ≡ DKL(qs ‖ ps,LR) = max_{q∈Q} Sq = max_{q∈Q} min_{p∈L} DKL(q ‖ p), (4.6)
where qs is an optimal quantum strategy maximizing Eq. (4.5), ps,LR is the best LR model with
respect to qs, and Q is the set of accessible quantum strategies. The statistical strength is asymp-
totically related to the p-value, which is the probability according to LR of obtaining a violation
as high as that observed after finite trials. There is a statistical test such that if S > 0, then for
almost all infinite sequences of independent measurement-setting choices and outcomes, the p-value
after N trials is
pN = 2^{−NS+o(N)}, (4.7)
where o(N) is a data-dependent term satisfying o(N)/N → 0 as N → ∞ [25]. No statistical test can have a
better asymptotic p-value. Because 1− pN can be thought of as a confidence in rejecting LR, the
statistical strength S quantifies the asymptotic rate at which confidence is gained. In particular,
the number of trials required to have reasonable confidence in rejecting LR is necessarily greater
than 1/S.
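This inverse relationship can be made concrete by ignoring the o(N) term in Eq. (4.7): reaching a target p-value p requires roughly N ≈ −log₂(p)/S trials. A small sketch (the strength value ≈ 0.0463 bits per trial used below is the Bell-state value with perfect detection quoted in Sec. 4.3):

```python
import math

def trials_needed(strength, p_target=1e-6):
    """Trials N for the asymptotic p-value 2^(-N*S) of Eq. (4.7) to
    drop below p_target, ignoring the o(N) correction."""
    return math.ceil(-math.log2(p_target) / strength)

# With the Bell-state strength S ~ 0.0463 bits/trial, reaching a
# p-value of 1e-6 takes a few hundred trials.
n = trials_needed(0.0463)
```

The 1/S scaling is visible directly: halving the statistical strength roughly doubles the number of trials needed for the same confidence.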
LRT’s effort to minimize the KL divergence as in Eq. (4.5) is a maximum likelihood estimation
problem. Here, we use the expectation-maximization algorithm in Ref. [152]. The general problem
of computing the optimal statistical strength S is nontrivial. Given the prepared state and the
setting distribution in an experiment, to calculate S, we maximize Eq. (4.5) over measurement
settings with standard nonlinear optimization techniques.
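To make the inner minimization concrete, the sketch below implements it for the simplest case: two settings and two outcomes per side, uniform setting choices, and the standard CHSH angles on a Bell state (the same correlators, E = ±1/√2, as the θ = 45◦ settings of Table 4.1). The LR models are mixtures of the 16 deterministic local strategies, and the mixture weights are updated with the standard expectation-maximization step for a mixture with fixed components. This is an illustrative reimplementation, not the code of Ref. [152]:

```python
import numpy as np
from itertools import product

def chsh_quantum_dist():
    """q(a,b|x,y) for the Bell state (|HH>+|VV>)/sqrt(2) with x-z-plane
    Bloch angles giving the standard CHSH correlators
    E_xy = <A_x B_y> = cos(alpha_x - beta_y) = +-1/sqrt(2)."""
    alphas = [0.0, np.pi / 2]
    betas = [np.pi / 4, -np.pi / 4]
    q = np.empty((2, 2, 2, 2))                  # indices: x, y, a, b
    for x, y in product(range(2), repeat=2):
        E = np.cos(alphas[x] - betas[y])
        for a, b in product((1, -1), repeat=2):
            q[x, y, (1 - a) // 2, (1 - b) // 2] = (1 + a * b * E) / 4
    return q

def min_kl_over_lr(q, iters=3000):
    """min_p D_KL(q||p) in bits over LR models p, i.e., over mixtures of
    the 16 deterministic local strategies lam = (a1, a2, b1, b2), with
    uniform setting probabilities 1/4.  Weights are updated by the EM
    step for a mixture with fixed components."""
    comp = np.zeros((16, 2, 2, 2, 2))           # comp[i]: p_i(a,b|x,y)
    for i, (a1, a2, b1, b2) in enumerate(product((1, -1), repeat=4)):
        for x, y in product(range(2), repeat=2):
            a, b = (a1, a2)[x], (b1, b2)[y]
            comp[i, x, y, (1 - a) // 2, (1 - b) // 2] = 1.0
    w = np.full(16, 1 / 16)                     # start from uniform weights
    for _ in range(iters):
        p = np.tensordot(w, comp, axes=1)       # mixture p(a,b|x,y)
        # EM/multiplicative update; preserves normalization of w
        w = w * np.sum(comp * (q / p), axis=(1, 2, 3, 4)) / 4
    p = np.tensordot(w, comp, axes=1)
    return np.sum(q * np.log2(q / p)) / 4       # D_KL with P(x,y) = 1/4

S = min_kl_over_lr(chsh_quantum_dist())
```

The converged value lands near the Bell-state statistical strength quoted in Sec. 4.3 (≈ 0.0463 bits per trial); the EM iteration decreases the divergence monotonically, so the returned value upper-bounds the true minimum.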
To calculate the statistical strength of a test of LR, we need to learn how LRT predicts
the measurement results given the state preparation procedure and possible measurement settings.
Suppose that for a bipartite system with nA × nB measurement settings there are dA outcomes
for each of nA measurement settings at Alice’s side, and there are dB outcomes for each of nB
measurement settings at Bob’s side. Then the LR description implies the existence of a single joint
probability distribution over a d_A^{n_A} × d_B^{n_B}-element event space, which we write as
ProbLR (A1 = a1, . . . , AnA = anA ;B1 = b1, . . . , BnB = bnB ) , (4.8)
where a1, . . . , anA ∈ {1, 2, . . . , dA}, and b1, . . . , bnB ∈ {1, 2, . . . , dB}, with normalization
∑_{a1,...,anA=1}^{dA} ∑_{b1,...,bnB=1}^{dB} ProbLR (A1 = a1, . . . , AnA = anA ;B1 = b1, . . . , BnB = bnB ) = 1. (4.9)
Hence, the marginal probability for the measurement outcome (ai; bj) when settings Ai and Bj are
chosen is given by
ProbLR(Ai = ai;Bj = bj) = ∑_{a1,...,ai−1,ai+1,...,anA=1}^{dA} ∑_{b1,...,bj−1,bj+1,...,bnB=1}^{dB} ProbLR (A1 = a1, . . . , AnA = anA ;B1 = b1, . . . , BnB = bnB ) . (4.10)
Since the probabilities ProbLR(Ai = ai;Bj = bj) are constrained to be marginal distributions,
they satisfy nontrivial relationships. The goal of a test of LR is to choose states and settings
that result in quantum predictions that cannot be obtained as the marginals of a single LR model
for all i and j. The quantum-mechanical prediction of the probability is given by ProbQM(Ai =
ai;Bj = bj) = Tr(ρO(Ai = ai;Bj = bj)), where ρ is the density matrix of the quantum state,
and O(Ai = ai;Bj = bj) is the positive-operator valued measure element corresponding to the
measurement outcome (ai; bj) when Alice and Bob use settings Ai and Bj , respectively. Given the
distributions of measurement settings chosen by Alice and Bob, the KL divergence measures the
statistical distance of the optimal LR model from the quantum predictions as in Eq. (4.5).
4.3 Results and discussion
We consider tests of LR using the independent inputs source for pseudo-Bell pairs and tests
using unbalanced Bell pairs. In both cases, Alice and Bob use measurement devices like those shown
in Fig. 4.1. They use either counters or detectors for photon detection, and they independently
and uniformly randomly choose one of two measurement settings each, where the settings are
determined by the polarization rotators. We use Bloch-sphere Euler angles as explained below to
define measurement settings. We label the measurement settings A1 and A2 for Alice or B1 and
B2 for Bob and write the two-photon state coming out of modes 3 and 4 in Fig. 4.1 as |ψ〉AB. We
calculate the optimal statistical strength S according to Eq. (4.6) by maximizing over the Euler
angles of the measurement settings {A1, A2, B1, B2} and minimizing over the set of LR models L,
where we fix the two-photon state |ψ〉AB shared by Alice and Bob. The inner minimization as
implemented guarantees convergence to the optimal LR model pLR, whereas the outer one obtains
only a local optimum. Confidence in global optimality can be obtained by repetition from many
different starting points (which we have done) or by more sophisticated search strategies. A local
optimum satisfying S > 0 is sufficient for having found a detection-loophole-free test. On the other
hand, finding no solution with S > 0 is heuristic evidence that such a test does not exist subject
to the constraints of the experiment. Thus, with this optimization strategy, we can trace the
boundary of the region for which S > 0 (by searching for where S decreases to 0) to heuristically
determine the minimum detection efficiency ηmin and the associated optimal measurement settings
{A1min, A2min, B1min, B2min} needed to perform a test of LR free of the detection loophole with a
given state.
Note that as S → 0, the number of trials required to gain confidence close to unity diverges.
For a constant rate of gaining confidence (see the explanation below Eq. (4.7)), we set the desired
optimal statistical strength S = X > 0 and determine the minimum detection efficiency ηc and the
associated optimal measurement settings {A1c, A2c, B1c, B2c} that achieve this statistical strength.
The strategy for finding such a set of solutions {ηc, A1c, A2c, B1c, B2c} is as follows: First we start
with a set of solutions {ηold, A1old, A2old, B1old, B2old} achieving a statistical strength Xold ≥ X.
Second we optimize Eq. (4.5) over the measurement settings {A1, A2, B1, B2} with the fixed de-
tection efficiency ηold, which yields new settings {A1new, A2new, B1new, B2new} achieving S = Y
(Y ≥ Xold) under the efficiency ηold. Third, we decrease the detection efficiency from ηold to ηnew
as much as we can without reducing the statistical strength to below X, so that this new set of
solutions {ηnew, A1new, A2new, B1new, B2new} achieves S = Xnew with Xnew close to X (within nu-
merical error). We then repeat the above procedure several times replacing the old with the new
solutions, until we are unable to reduce the efficiency parameter. We thus heuristically find the set
of optimal solutions {ηc, A1c, A2c, B1c, B2c} to achieve a given statistical strength level S = X.
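The alternating search can be summarized in code. The sketch below uses a deliberately toy stand-in for the strength evaluation (a hypothetical threshold model, not the KL computation of Sec. 4.2), since the point here is only the control flow: optimize the settings at fixed efficiency, then push the efficiency down by bisection until the strength falls to the target, and repeat:

```python
def toy_strength(eta, theta):
    """Hypothetical stand-in for S(eta, settings): zero below a
    setting-dependent efficiency threshold, linear above it.  This is
    NOT the real KL-based strength; it only mimics its shape."""
    return max(0.0, 0.1 * (eta - (2 / 3 + 0.1 * abs(theta - 0.5))))

def find_eta_c(target, thetas=[i / 100 for i in range(101)]):
    """Alternate (i) optimizing the setting parameter at fixed
    efficiency and (ii) bisecting the efficiency down until the
    strength falls to the target level S = X."""
    eta, theta = 1.0, 0.0
    for _ in range(20):                      # repeat until stable
        # step 1: best settings at the current efficiency (grid search)
        theta = max(thetas, key=lambda t: toy_strength(eta, t))
        # step 2: lower efficiency while keeping strength >= target
        lo, hi = 0.0, eta
        for _ in range(50):                  # bisection on efficiency
            mid = (lo + hi) / 2
            if toy_strength(mid, theta) >= target:
                hi = mid
            else:
                lo = mid
        eta = hi
    return eta, theta

eta_c, theta_c = find_eta_c(1e-3)
```

In the real calculation, step 1 is the nonlinear optimization over the Euler angles and step 2 is the decrease of the efficiency parameter; the loop terminates when the efficiency can no longer be reduced.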
Table 4.1: Extreme conditions for tests of LR free of the detection loophole for photon counters or photon detectors using the unbalanced Bell states |ψuB〉 defined in Eq. (4.11). The asymptotic behavior when θ → 0 is consistent with results in Ref. [1], which are shown in the last row. The angle parameters are explained in the text.
θ α1min α2min β1min β2min ηmin
45◦ 22.50◦ −67.50◦ −22.50◦ 67.50◦ 82.85 %
40◦ 21.28◦ −66.89◦ −21.28◦ 66.89◦ 80.61 %
35◦ 19.40◦ −65.60◦ −19.40◦ 65.60◦ 78.50 %
30◦ 17.00◦ −63.58◦ −17.00◦ 63.58◦ 76.50 %
25◦ 14.21◦ −60.72◦ −14.21◦ 60.72◦ 74.60 %
20◦ 11.14◦ −56.79◦ −11.14◦ 56.79◦ 72.81 %
15◦ 7.92◦ −51.42◦ −7.92◦ 51.42◦ 71.12 %
10◦ 4.70◦ −43.88◦ −4.70◦ 43.88◦ 69.53 %
5◦ 1.81◦ −32.41◦ −1.81◦ 32.41◦ 68.06 %
4◦ 1.32◦ −29.25◦ −1.32◦ 29.25◦ 67.78 %
3◦ 0.87◦ −25.55◦ −0.87◦ 25.55◦ 67.52 %
2◦ 0.48◦ −21.04◦ −0.48◦ 21.04◦ 67.27 %
1◦ 0.17◦ −15.01◦ −0.17◦ 15.01◦ 67.06 %
→ 0   0   → −2θ^{1/2}   0   → 2θ^{1/2}   → 2/3
First, we analyze unbalanced Bell states of the form
|ψuB〉 = cos(θ)|H〉A|H〉B + sin(θ)|V 〉A|V 〉B, (4.11)
where θ ∈ (0, π/4]. Note that whether there is a relative phase ei∆φ between the second and first
terms of Eq. (4.11) is not important, since Alice can always adjust her polarization basis, i.e.,
|H〉A → |H〉A, and |V 〉A → e−i∆φ|V 〉A, to put the state in the above form. In principle, the state
|ψuB〉 can be simulated by postselection on the state |ψpB〉 in Eq. (4.3), although this introduces a
loophole as mentioned earlier. Experimental techniques to prepare |ψuB〉 without postselection have
been demonstrated and applied to tests of LR [153, 154]. Here we calculate the statistical strength
for photon detectors. Photon counters have no advantage over photon detectors here, because no
more than one photon simultaneously arrives at Alice’s or Bob’s detectors. Hence, counters and
detectors register the various outcomes with the same probabilities. Our
optimization results are summarized in Table 4.1 and Fig. 4.2. The measurement angle αi,min (or
βj,min) shown in Table 4.1 is the angle from the z axis of the polarization state of an incoming photon
that gets reflected at PBS2 (or PBS3) in Fig. 4.1, where we use the Bloch sphere representation for
this state. By convention, |H〉 and (1/√2)(|H〉 + |V〉) are polarization states associated with the z and x
axes, respectively. The measurement operators are related to the measurement angles αic and βjc
by Aic = cos(αic)σz + sin(αic)[cos(φic)σx + sin(φic)σy] and Bjc = cos(βjc)σz + sin(βjc)[cos(φ′jc)σx +
sin(φ′jc)σy], i, j = 1, 2. The optimizations show heuristically that we can take φic = φ′jc = 0
everywhere; i.e., all the optimal measurement settings lie in the (x, z) plane of the Bloch sphere,
an observation which has been proven for several special cases [46, 48, 155].
From Table 4.1, we can see that when the optimal statistical strength S approaches 0, αi,min =
−βi,min for i = 1, 2. The minimum detection efficiency ηmin decreases monotonically with the
parameter θ in |ψuB〉 and is 82.85 % when θ = π/4, where the state is a Bell state. It approaches
2/3 when θ approaches 0, where the state is very close to a product state. These results are
consistent with previous results [28, 118]. From Fig. 4.2, we can see how the optimal statistical
strength increases for η > ηmin and how the input state must change to achieve this statistical
strength. Note that not all unbalanced Bell states can achieve a given statistical strength level
S > 0, even if η = 1. For example, for S ≥ 10−4, the parameter θ must be greater than 0.98◦.
Associated measurement settings can be found in the tables in Appendix B.
We also study the effect of the depolarizing noise in the unbalanced Bell state on the minimum
detection efficiency. We model the effective state shared by Alice and Bob as
ρAB = V |ψuB〉〈ψuB|+ (1− V )I/4, (4.12)
where I is the identity matrix of size 4 × 4, and the visibility V characterizes the depolarizing
noise in an experiment. Given the visibility V , the minimum detection efficiency ηc required to
achieve a specific statistical strength level X is a function of the state parameter θ. We study
the relationship between the visibility V and the overall minimum detection efficiency minθ ηc(θ)
required to achieve the statistical strength level S = 10−6, as plotted in Fig. 4.3. From Fig. 4.3,
we can see how the overall minimum detection efficiency minθ ηc(θ) changes with the visibility V .
Generally, the higher the visibility, the lower the overall minimum detection efficiency. Associated
states and measurement settings can be found in the tables in Appendix B.
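For reference, the model of Eq. (4.12) is easy to set up numerically; the sketch below builds ρAB and evaluates an outcome probability as Tr(ρO) for projective polarization measurements in the (x, z) Bloch plane. The angle convention (Bloch angle a giving the projector onto cos(a/2)|H〉 + sin(a/2)|V〉, with |H〉 on the z axis) matches the convention stated in the text; the function names are illustrative:

```python
import numpy as np

def rho_unbalanced(theta, V):
    """Depolarized unbalanced Bell state of Eq. (4.12):
    rho = V |psi><psi| + (1 - V) I/4, with
    |psi> = cos(theta)|HH> + sin(theta)|VV>,
    in the basis (|HH>, |HV>, |VH>, |VV>)."""
    psi = np.zeros(4)
    psi[0], psi[3] = np.cos(theta), np.sin(theta)
    return V * np.outer(psi, psi) + (1 - V) * np.eye(4) / 4

def prob_plus_plus(rho, alpha, beta):
    """Tr(rho O) for the joint outcome in which both photons project
    onto cos(a/2)|H> + sin(a/2)|V>, where a is the Bloch angle from
    the z axis in the (x, z) plane (a = alpha for Alice, beta for Bob)."""
    ket = lambda a: np.array([np.cos(a / 2), np.sin(a / 2)])
    proj = np.kron(np.outer(ket(alpha), ket(alpha)),
                   np.outer(ket(beta), ket(beta)))
    return float(np.trace(rho @ proj))
```

At V = 1 this reproduces the pure-state probabilities; at V = 0 every joint projective outcome has probability 1/4, which is why low visibility washes out the violation and drives up the required efficiency.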
Figure 4.2: Detection efficiency of photon counters or photon detectors required for different statistical strength levels S vs the parameter θ [Eq. (4.11)]. The curves (a)–(e) correspond to S = 0, 10−6, 10−5, 10−4, and 10−3, respectively. The empty squares show our calculated points, and the dotted lines are linear interpolations to guide the eyes. In curve (a), the linear extrapolation toward θ = 0 (approaching the 2/3 limit) is shown.
We now consider the pseudo-Bell states of Eq. (4.3). Let α = cos(γ) and β = sin(γ)e^{iφ}; then
Eq. (4.3) can be rewritten as
|ψpB〉 = cos²(γ)|H〉3|H〉4 + sin²(γ)e^{i2φ}|V〉3|V〉4 + cos(γ) sin(γ)e^{iφ}(|H〉3|V〉3 + |H〉4|V〉4), (4.13)
Figure 4.3: Tradeoff between the overall minimum detection efficiency min_θ ηc(θ) and the visibility V of unbalanced Bell states. Here, we fix the optimal statistical strength to 10−6.
where γ ∈ (0, π/4], and φ ∈ [0, 2π). We can prepare different pseudo-Bell states by changing the
values of both γ and φ. However, for a given γ, as the following discussion shows, the optimal
statistical strength S is the same regardless of the value of φ. In the test of LR as shown in
Fig. 4.1, Alice’s and Bob’s measurements are restricted to polarization rotation followed by photon
counting. They cannot detect coherences between any two of the following three parts of |ψpB〉 as written in Eq. (4.13): the first two terms, the third term, and the last term, because these parts correspond to different
photon-number-distribution subspaces. Hence, the measurement outcomes determined by |ψpB〉
are equivalent to the outcomes given by a mixture of the following two states:
|ψ1〉〈ψ1|, with |ψ1〉 ∝ cos²(γ)|H〉3|H〉4 + sin²(γ)e^{i2φ}|V〉3|V〉4, (4.14)
and
ρ2 ∝ (|H〉3|V〉3)(3〈H| 3〈V|) + (|H〉4|V〉4)(4〈H| 4〈V|). (4.15)
Since the state |ψ1〉 can be written in the form |ψuB〉 as in Eq. (4.11) by changing the mode
labels and the state bases, the measurement outcomes attributable to |ψ1〉 can reveal a violation
of LR when γ ∈ (0, π/4], as our earlier results show. But ρ2 is a separable state and so the
outcomes attributable to ρ2 can be explained by LR no matter what the measurement settings
{A1, A2, B1, B2} are. Hence, in a test of LR, the information about whether or not LR is violated
is conveyed only by the outcomes from |ψ1〉, while the state ρ2 acts as noise. Based on these
considerations and the earlier argument about being able to eliminate a potential phase in |ψuB〉,
we do not need to consider different phases φ in the pseudo-Bell state |ψpB〉 when calculating the
optimal statistical strength S. Hence, we can choose a fixed value, such as φ = 0. Moreover, we
determined heuristically by extended optimizations in selected cases that the optimal measurement
settings {A1c, A2c, B1c, B2c} can be chosen to lie in the (x, z) plane of the Bloch sphere, just like for
|ψuB〉. Taking these observations into account reduces the number of free parameters and speeds
up the general calculations.
The optimization results for pseudo-Bell states are summarized in Table 4.2 and Fig. 4.4.
Similar to unbalanced Bell states, Table 4.2 shows that when the optimal statistical strength S
approaches 0, αi,min = −βi,min for i = 1, 2. Figure 4.4 shows that there is a lower bound on the
state parameter γ to achieve a nonzero statistical strength level S. Measurement settings for the
results shown in Fig. 4.4 are given in Appendix B.
Table 4.2 and Fig. 4.4 (a) show that the minimum detection efficiency ηmin required to close
the detection loophole achieves its minimum in the interior of the domain, in contrast to what was
found for the case of unbalanced Bell states. We might have expected this behavior based on the
following two observations: First, with respect to the measurement setups used (see Fig. 4.1), the
Table 4.2: Extreme conditions for tests of LR free of the detection loophole for photon counters and photon detectors using the pseudo-Bell states of Eq. (4.13). The angle parameters are explained in the text. The minimum detection efficiencies for counters and detectors when γ = 45◦ are the same as those found in Ref. [2].
Photon counter | Photon detector
γ α1min α2min β1min β2min ηmin | α1min α2min β1min β2min ηmin
45◦ 22.50◦ −67.50◦ −22.50◦ 67.50◦ 90.62 % | 11.64◦ −63.88◦ −11.64◦ 63.88◦ 92.23 %
40◦ 20.49◦ −66.01◦ −20.49◦ 66.01◦ 89.71 % | 11.08◦ −62.79◦ −11.08◦ 62.79◦ 91.31 %
35◦ 16.76◦ −62.14◦ −16.76◦ 62.14◦ 89.78 % | 9.79◦ −59.60◦ −9.79◦ 59.60◦ 91.11 %
30◦ 12.32◦ −56.16◦ −12.32◦ 56.16◦ 90.80 % | 7.93◦ −54.42◦ −7.93◦ 54.42◦ 91.71 %
25◦ 8.00◦ −48.43◦ −8.00◦ 48.43◦ 92.57 % | 5.73◦ −47.46◦ −5.73◦ 47.46◦ 93.05 %
20◦ 4.43◦ −39.49◦ −4.43◦ 39.49◦ 94.71 % | 3.53◦ −39.09◦ −3.53◦ 39.09◦ 94.89 %
15◦ 1.96◦ −29.88◦ −1.96◦ 29.88◦ 96.81 % | 1.68◦ −29.76◦ −1.68◦ 29.76◦ 96.85 %
10◦ 0.59◦ −19.98◦ −0.59◦ 19.98◦ 98.52 % | 0.54◦ −19.96◦ −0.54◦ 19.96◦ 98.53 %
5◦ 0.07◦ −10.00◦ −0.07◦ 10.00◦ 99.63 % | 0.07◦ −10.00◦ −0.07◦ 10.00◦ 99.63 %
state |ψpB〉 can be thought of as the state |ψuB〉 with noise, as pointed out above, and second, the
violation of LR given by |ψuB〉 is very sensitive to noise, particularly when θ in |ψuB〉 of Eq. (4.11)
is small (see the results under Fig. 4.3 and discussions in Ref. [50]). Table 4.2 and Fig. 4.4 (a) also
suggest that any pseudo-Bell state |ψpB〉 can violate LR using counters or detectors with sufficient
efficiency.
When we look at the minimum detection efficiency required to achieve a given statistical
strength level S, the efficiencies of photon counters and photon detectors are notably different,
showing the utility of the additional information available with photon counters. The advantage
of photon counters is most notable for γ between approximately 35◦ and 45◦. In particular, the
minimum detection efficiency ηmin is 89.71 % for photon counters and 91.11 % for photon detectors,
and ηmin is achieved for γ in this range. Loosely speaking, this advantage is because photon counters
are better at differentiating between measurement outcomes contributed by the entangled (|ψ1〉 in
Eq. (4.14)) and unentangled (ρ2 in Eq. (4.15)) parts of the state |ψpB〉.
A comparison between Fig. 4.2 and Fig. 4.4 suggests that higher efficiencies are required to
achieve given statistical strengths with pseudo-Bell states |ψpB〉 than with unbalanced Bell states
|ψuB〉. This again can be attributed to the noise added by ρ2 to measurement outcomes, which
reduces the statistical strength considerably. As an explicit example, consider the optimal statistical
Figure 4.4: Detection efficiencies of photon counters and photon detectors required for different statistical strength levels S vs the parameter γ of the pseudo-Bell state of Eq. (4.13): (a) S = 0, (b) S = 5 × 10−5, (c) S = 5 × 10−4, and (d) S = 1.5 × 10−3. The calculated points are labeled by squares for photon counters and by diamonds for photon detectors, and the dotted lines are linear interpolations to guide the eyes.
strengths SuB or SpB achievable with
|ψuB(θ = π/4)〉 = (1/√2)(|H〉A|H〉B + |V〉A|V〉B), (4.16)
or with
|ψpB(γ = π/4, φ = 0)〉 = (1/2)(|H〉3|H〉4 + |V〉3|V〉4 + |H〉3|V〉3 + |H〉4|V〉4). (4.17)
We find that SuB = 2SpB ≈ 0.04627 for perfect photon counters. The ratio can be explained
by observing that one half of the measurement outcomes of |ψpB(γ = π/4, φ = 0)〉 are from the
separable state ρ2 in Eq. (4.15).
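One way to make the factor of two explicit (a sketch giving an upper bound on S_pB, consistent with the numerical result; it assumes the LR model is free to match the photon-number sectors independently, which is allowed since a mixture of LR models is again an LR model): the outcome distribution splits over disjoint outcome supports as
\[
q_{\mathrm{pB}} \;=\; \tfrac{1}{2}\, q_1 \,\oplus\, \tfrac{1}{2}\, q_2 ,
\]
where \(q_1\) is the outcome distribution of \(|\psi_1\rangle\) (identical to that of the Bell state) and \(q_2\), generated by the separable \(\rho_2\), is itself attainable by an LR model. Choosing the LR mixture \(p = \tfrac{1}{2}\, p_{\mathrm{LR}} \oplus \tfrac{1}{2}\, q_2\), with \(p_{\mathrm{LR}}\) the best LR model for \(q_1\), the divergence splits sector by sector:
\[
D_{\mathrm{KL}}(q_{\mathrm{pB}} \,\|\, p)
\;=\; \tfrac{1}{2}\, D_{\mathrm{KL}}(q_1 \,\|\, p_{\mathrm{LR}})
\;+\; \tfrac{1}{2}\, D_{\mathrm{KL}}(q_2 \,\|\, q_2)
\;=\; \tfrac{1}{2}\, S_{\mathrm{uB}} ,
\]
so \(S_{\mathrm{pB}} \le S_{\mathrm{uB}}/2\), matching the numerically observed equality \(S_{\mathrm{uB}} = 2 S_{\mathrm{pB}}\).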
4.4 Concluding remarks
We have demonstrated a method to measure the statistical strength of tests of LR that is
based on the KL divergence from the predicted experimental probability distribution to the best
prediction given by LR. This method helps to design a loophole-free test of LR and quantifies the
confidence in violation of LR for sufficiently large experimental data sets. We used the method
to determine optimal statistical strengths of tests of LR using a typical measurement setup for
polarized photon pairs with inefficient detectors or counters. We considered both ideal unbalanced
Bell states and pseudo-Bell states obtained by combining independent polarized photons on a
polarizing beam splitter. Creating the latter can be easier [116, 117, 145], but observing a violation
of LR requires higher detection efficiencies. Our calculations show that with pseudo-Bell states,
we can close the detection loophole with a minimum detection efficiency of 89.71 % using photon
counters, or 91.11 % using photon detectors. For unbalanced Bell states, we confirmed previous
calculations [28] showing that violations of LR are possible at detection efficiencies above 2/3.
Furthermore, we numerically exhibited the relationships between state parameters (or visibilities)
and minimum detection efficiencies needed to achieve given levels of statistical strength. Given that
the current roadblock for performing loophole-free tests of LR with photons is detection inefficiency
rather than the difficulty of obtaining an entangled source, we cannot recommend using the pseudo-
Bell state for such an experiment.
In current experiments based on SPDC to produce entangled photon pairs, we must consider
other sources of potentially unwanted measurement outcomes. Such sources include dark counts
and the generation of more than one photon pair [156, 157]. The latter effect can be quite noticeable,
particularly for the brighter, more strongly pumped sources. Further work is required to analyze
the consequences of these effects for the statistical strength. It is also desirable to obtain rigorous
confidence levels for rejecting LR with moderately sized data sets, which we discuss in detail in the
following two chapters. Such confidence levels will improve on measures derived from experimental
SDs of Bell-inequality violation.
Chapter 5
Asymptotically optimal data analysis for rejecting local realism
From the discussion in Sec. 4.2 of the previous chapter, we know that there are several
problems with the conventional measure, the number of experimental standard deviations (SDs)
of violation of a Bell inequality. To avoid these problems, in this chapter we show how to analyze
data from experimental tests of LR to compute a measure of the strength of the evidence against
local realism (LR). By computing this measure, violations of LR by different experiments can
be rigorously assessed and compared. Specifically, the proposed analysis protocol quantifies the
violation of LR in terms of p-values, where small p-values imply strong violation. We call this
the prediction-based-ratio (PBR) protocol. Protocols such as this compute a p-value from a “test
statistic” (see Sec. 5.1 for details). A test statistic is a function of the sequence of trial results,
i.e., measurement-setting choices and outcomes of trials. There are many such statistics to choose
from; an example is the Bell-inequality violation estimated from a finite number of trials and used
by the SD-based protocol.
We prove that the PBR protocol is valid; see Sec. 5.1 for the definition of validity. We compare
the PBR protocol to SD-based and martingale-based [18, 19] protocols. For N independent and
identically distributed trials, these protocols have the property that the p-values computed decrease
to 0 exponentially as N → ∞. We can therefore compare different protocols’ performances in a
test of LR according to the (asymptotic) confidence-gain rate defined by
G = − lim_{N→∞} log2(p(prot)N)/N, (5.1)
where p(prot)N is the p-value computed by a protocol. It is desirable to have a high confidence-gain
rate as this implies that fewer trials are needed to achieve the same strength of violation of LR.
Given the experimental probability distribution q, the optimal confidence-gain rate that can be
achieved by any protocol is given by the statistical strength Sq as defined in Eq. (4.5) of Chapter 4.
We prove that the PBR protocol is asymptotically optimal. That is, its p-values always achieve the
optimal confidence-gain rate. The confidence-gain rates achieved by different protocols are shown
in Figs. 5.1 and 5.2 for a number of experimental configurations that are explained in Sec. 5.4. The
figures show that SD-based p-values are not valid in some regions. Because the ratio of the
SD-based confidence-gain rates to the asymptotically optimal ones varies substantially,
results of experiments with different configurations cannot be directly compared by the common
“number of SDs of violation” measure. The martingale-based protocol is valid and computationally
simple but achieves suboptimal confidence-gain rates.
The PBR protocol remains valid even if the prepared quantum state, measurement settings,
and relevant local realistic (LR) models vary arbitrarily during an experiment, that is, in the
presence of the memory effect [20, 21, 22, 23]. This is desirable not only for tests of LR but
also for practical applications of quantum information, such as device-independent quantum key
distribution [11, 13, 107, 108, 109], randomness expansion [3, 110, 111, 112], state estimation [158],
and certification of entangled measurements [159].
Compared with the other two protocols, an advantage of the PBR protocol is that it can
be applied to a wide variety of configurations (the combinations of quantum state, measurement
settings and other relevant parameters) without having to specify a Bell inequality. Since such
Bell inequalities characterize the family of probability distributions achievable by LR models, they
provide a useful guide to designing an experiment and determining good goal configurations to
be achieved. But since Bell-inequality violation is not directly related to statistical strength, it
is not obvious how to choose the best inequality with respect to an experiment. Moreover, the
predetermined Bell inequality restricts a successful experiment to configurations close to the goal,
closer than may be achievable in a given experiment. The PBR protocol automatically adapts to
deviations from the goal, achieving optimal confidence-gain rates for actual configurations. One
can exploit this adaptability by applying the PBR protocol to experiments in progress. This makes
it possible to monitor the current (non-)violation of LR for the purpose of optimizing configuration
parameters. Appendix A contains the code information and documentation for an implementation
of the PBR protocol (the local realism analysis engine) that can be used for monitoring experi-
ments in progress and for analyzing existing data sets. Our results show that the PBR protocol is
sufficiently efficient for practical use with typical experimental configurations.
The chapter is structured as follows: In Sec. 5.1, we provide the relevant statistical back-
ground and justify the use of p-values. In Sec. 5.2, we explain how to compute p-values using
the three protocols mentioned above. We then discuss the technical details for applying the PBR
protocol in Sec. 5.3. Finally in Sec. 5.4, we show how confidence-gain rates achieved by different
protocols compare for various tests of LR. The protocols are also applied to and compared on
simulated and actual experiments. This chapter is based on our previous work [17].
5.1 Statistical concepts
To quantify the strength of the experimental evidence against LR, one needs to take into
account the possibility that a finite set of data generated according to LR can violate a Bell
inequality due to statistical fluctuations in finite samples. This possibility can be formalized in
statistics via a p-value for the hypothesis test of LR. A p-value is associated with a test statistic T
that is a function of the sequence of trial results. If N is the total number of trials, the corresponding
sequence of results is denoted by x = (x1, . . . , xN ). As is conventional, we distinguish between
the sequence of results and the sequence of random variables X = (X1, . . . , XN ) giving rise to
these results. The exact p-value pN is defined as the maximum of the probabilities of the events
T (XLR) ≥ T (x) over all random-variable sequences XLR distributed according to LR models. That
is,
pN = maxLR ProbLR(T (XLR) ≥ T (x)). (5.2)
Due to the difficulty of determining worst-case tail probabilities of typical test statistics, we can
usually determine only upper bounds of exact p-values. Moreover, to close the memory loophole in
a test of LR, the computation of exact p-values is further complicated by the fact that the set of
null hypotheses includes all possible sequences of LR models depending on previous trial results.
Thus, for the remainder of this chapter, the term “p-value” refers to any putative upper-bound
b(T (x)), computed according to a protocol, on the exact p-value Eq. (5.2). That is, the p-value of
a protocol given the observed data x is defined by p(prot)N = b(T (x)).
In order to be able to interpret a protocol’s p-value as a measure of the violation of LR,
it must satisfy statistical validity: A protocol and its p-values are valid if and only if the bound
b(t) ≥ ProbLR(T (XLR) ≥ t) is true whenever XLR is distributed according to LR.
A main purpose of the PBR and related protocols is to evaluate the strength of the evidence
against LR by computing valid p-values given the data. Some care must be taken in interpreting
such p-values in terms of probabilities. For example, a p-value cannot be interpreted as a probability
that LR is true. Although p-values are computed for the data, their validity is defined in terms of
what is known before an experiment, not after. Strictly speaking, we can only state for sure that
before performing the trials, the following holds: For any fixed 0 ≤ α ≤ 1, if LR holds, then the
probability that the returned valid p-value satisfies p(prot)N ≤ α is at most α. Although we have no
intention of making an actual decision on the failure of LR, this statement can be viewed in terms
of traditional hypothesis testing: A protocol tests LR simultaneously at all significance levels α,
and “rejects” LR at a given α if p(prot)N ≤ α. The validity property is equivalent to the statement
that, if LR holds, the maximum probability of (falsely) rejecting at level α is bounded above by
α. This justifies the use of p-values to quantify the violation of LR. The definitions of significance
levels and p-values are based on Ref. [151], 2nd edition, pages 126 and 127.
We use the term “protocol” rather than “test” for two reasons. The first is that the
term “test” in “test of LR” typically refers to the experimental setup and subsequent analysis,
not a conventional hypothesis test. The second is that hypothesis tests, as the term is used in
mathematical statistics, are valid by definition. Thus, although we do not encourage it, one can
think of a valid analysis protocol as a family of hypothesis tests. For such a family to be useful,
the tests should also have high power. For our situation, one can express the power in terms
of the probabilities of rejection at given significance levels, supposing a set of data are sampled
from non-LR models. Alternatively, one can consider the expected p-values, and look for tests for
which the expected p-values are as small as possible. We do not expect that the PBR protocol has
particularly low p-values for a given finite number of trials. In fact, because of the conservative
nature of Markov’s inequality used in the PBR protocol (see Sec. 5.2.4), better protocols exist.
However, asymptotic optimality of the PBR protocol assures us that it performs well when the
evidence for rejection is very strong.
It is also worth noting that many issues that arise in applications of hypothesis testing, such
as selection biases, are less of a concern when one is considering the extremely low p-values that
are desirable when falsifying a physical theory. Corrections for such effects improve p-values by
relatively small terms in our setting. Also, one application of the PBR protocol is to quantify
the success of an experiment independent of the details of the configuration, so that different
experiments can be compared. For this application, the statistical interpretation of the p-value
serves only as a motivation.
5.2 Theory
In this section, we consider three protocols that determine p-values for rejecting LR from ex-
perimental data: SD-based, martingale-based, and PBR protocols. The first two protocols depend
on a Bell inequality, whereas the PBR protocol requires only a sequence of trial results. While all
three of these protocols apply to tests of LR with multiple parties, we discuss them explicitly for the
bipartite case to simplify the formulas. (Our implementation of the local realism analysis engine
is presently restricted to tests of LR with two parties.) The result of the n’th trial is denoted by
xn = (in, jn, an, bn), where in, jn are the n’th chosen settings and an, bn are the n’th observed
outcomes of Alice and Bob, respectively. Let i(X) and j(X) be Alice’s and Bob’s settings, respec-
tively, given the potential result X. The joint-setting distribution is fixed, and the probability of
choosing the settings i and j by Alice and Bob is given by pi,j .
Before explaining the details of the protocols, let us discuss how to use Bell inequalities in
the SD-based and martingale-based protocols.
5.2.1 Bell functions
To apply the SD-based or martingale-based protocol, we need to write a Bell inequality in
the following form
〈I(X)〉 ≤ B, (5.3)
where X is the random variable from which a trial result x is sampled, I is a real-valued function,
called a Bell function, and I = 〈I(X)〉 is its expectation. Here, the expectation is with respect to
the joint distribution of measurement settings and outcomes. An example is the Clauser-Horne-
Shimony-Holt (CHSH) inequality in Eq. (1.2). In this case, if the trial result x consists of setting
choices i, j and outcomes a, b, then
ICHSH(x) = (1− 2δi,2δj,2)ab/pi,j , and B = 2. (5.4)
The functional form ICHSH in Eq. (5.4) ensures that its expectation is equal to the left-hand side
of the CHSH inequality (1.2). In particular, this requires dividing by the known probabilities of
choosing different measurement settings. There is no loss of generality by fixing the setting distri-
bution in advance. Violation of LR requires that measurement settings be chosen independently of
local hidden variables. In particular, the locality and memory loopholes cannot be closed unless at
each trial, measurement settings are chosen randomly and independently by each party according
to a known probability distribution so that there is no possibility of a causal connection between
any two events of Alice’s setting choice, Bob’s setting choice, and the emission of the entangled
particle pair.
Given an experimentally obtained sequence of results x1, . . . , xN from N trials, the obvious
method for estimating I is to compute the average of the sequential values I(xn) given by
I = (1/N) ∑_{n=1}^{N} I(xn). (5.5)
However, this is not the minimum-variance estimate of I, since the setting distribution is fixed and
known. In fact, the conventional way of writing a Bell inequality is as a sum of expectations as in
Eq. (1.2), which makes it independent of the setting distribution. The correspondence between the
two ways of writing a Bell inequality is given by
〈I(X)〉 = ∑_{i,j} pi,j 〈I(X)|i(X) = i, j(X) = j〉, (5.6)
where the expectation in the sum is conditioned on the settings of Alice and Bob, as indicated. If
we assume that the state at each trial is identical and do not worry about the memory and locality
loopholes, we can estimate each expectation 〈I(X)|i(X) = i, j(X) = j〉 separately, experimentally
fixing the settings for each estimate if desired. The right-hand side of Eq. (5.6) can then be
computed formally. If we define c(i, j, a, b) to be the number of trials with setting choices i, j and
outcomes a, b, the estimate for I thus computed is
I = ∑_{i,j} pi,j [∑_{a,b} c(i, j, a, b) I(i, j, a, b)] / [∑_{a,b} c(i, j, a, b)], (5.7)
a nonlinear function of c(i, j, a, b). Its SD can be approximated by linear propagation of errors from
SDs for the counts c(i, j, a, b), assuming that these counts are independent and each count follows
a Poisson distribution, as is commonly done in experiments. The SD thus obtained is generally
smaller than that of I in Eq. (5.5).
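As a concrete illustration, the estimate of Eq. (5.7) and its propagated Poisson SD can be sketched for the CHSH function of Eq. (5.4). This is only a sketch under stated assumptions, not the thesis's analysis code: the counts, the uniform setting distribution pi,j = 1/4, and the function name `estimate_chsh` are hypothetical.

```python
import math

def estimate_chsh(counts):
    """Eq. (5.7) for the CHSH function of Eq. (5.4): per-setting
    conditional averages weighted by the known setting probabilities.
    With I(i,j,a,b) = sign(i,j)*a*b/p_ij and uniform p_ij = 1/4, the
    weights cancel and I reduces to the usual sum of signed correlators.
    Returns (I, sigma), where sigma comes from linear propagation of
    Poisson errors on the counts c(i, j, a, b)."""
    I, var = 0.0, 0.0
    for (i, j) in [(1, 1), (1, 2), (2, 1), (2, 2)]:
        sign = -1 if (i, j) == (2, 2) else 1
        n = sum(counts[(i, j, a, b)] for a in (1, -1) for b in (1, -1))
        corr = sum(a * b * counts[(i, j, a, b)]
                   for a in (1, -1) for b in (1, -1)) / n
        I += sign * corr
        var += (1 - corr ** 2) / n  # squared Poisson SD of one correlator
    return I, math.sqrt(var)

# Hypothetical counts approximating a singlet at CHSH-optimal angles.
counts = {}
for (i, j) in [(1, 1), (1, 2), (2, 1), (2, 2)]:
    E = (-1 if (i, j) == (2, 2) else 1) / math.sqrt(2)
    for a in (1, -1):
        for b in (1, -1):
            counts[(i, j, a, b)] = round(2500 * (1 + a * b * E) / 4)

I, sigma = estimate_chsh(counts)
print(I, sigma)  # I close to 2*sqrt(2), roughly 29 SDs above B = 2
```
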
5.2.2 SD-based protocol
The results from N trials are used to obtain I and estimate the SD σ of I as discussed in
Sec. 5.2.1. Given that I > B, it is conventional to give (I −B)/σ, the number of SDs of violation,
as a measure of the amount of violation. To convert the number of SDs to a p-value, we make the
unjustified assumption that, for any LR model the distribution of the random variable ILR, from
which I is sampled, is sufficiently close to Gaussian with the SD σ as estimated from N trial results
but with a mean bounded by B. With this assumption, according to any LR model, the probability
of the event ILR ≥ I is then bounded above by
ProbLR(ILR ≥ I) ≤ Q((I − B)/σ), (5.8)
where Q(z) is the Q-function, which is the probability that a standard normal random variable Z
satisfies Z ≥ z. This allows us to assign the p-value for the observed statistic I as
p(SD)N = Q((I − B)/σ), (5.9)
with the caveat that our assumption is not justified. As a function of the number of trials N ,
σ√N approaches σ1, where σ1 is an effective one-trial SD. For large N , the quantity Q((I −B)/σ)
approaches exp(−N(I − B)²/(2σ1²)). Thus, according to Eq. (5.1), the confidence-gain rate achieved by the
SD-based protocol is
GSD = log2(e) (I − B)²/(2σ1²). (5.10)
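For concreteness, converting the number of SDs of violation to the p-value of Eq. (5.9) is just an evaluation of the Q-function. A minimal sketch, with illustrative numbers (the inputs are hypothetical, and B = 2 is the CHSH bound):

```python
import math

def q_function(z):
    """Upper tail of the standard normal: Q(z) = Prob(Z >= z)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def sd_based_p_value(I, sigma, B=2.0):
    """Eq. (5.9) for an observed estimate I with estimated SD sigma.
    Valid only under the unjustified Gaussian assumption in the text."""
    return q_function((I - B) / sigma)

print(sd_based_p_value(2.5, 0.1))  # 5 SDs of violation: about 2.9e-7
```
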
SD-based p-values are not valid for two reasons. First, the experimental SD is different from
the worst-case SD assuming LR. While it may be possible to check the relevant SDs for all LR
models, this is a challenging task. Second, deviations from Gaussianity in the extreme tail of the
distribution for ILR cannot be asymptotically neglected. To explain this issue, define the random
variable F =√N(ILR − B)/σ1. For any LR model, the expectation 〈F 〉 ≤ 0. Assuming that
LR models have the same SD as the experimentally estimated one, we expect that according to
the central limit theorem, F − 〈F 〉 converges in distribution to a standard normal distribution.
Here, convergence in distribution implies that for a constant l, the probability of the event F ≥ l
converges to the standard normal distribution’s probability for this event. But for the computation
of p(SD)N , one needs the probability of the event F ≥ √N(I − B)/σ1, where √N(I − B)/σ1 scales as
√N and therefore goes to infinity as an experiment progresses. Thus, convergence in distribution
is insufficient for estimating this probability.
The number of SDs of violation is not normally explicitly converted to a p-value as done here.
Instead, it is primarily intended as a way of claiming successful violation with a good signal-to-
noise ratio. Naturally, one would like to use the measure to compare the strength of the violation
of LR in different experiments. Such a relative comparison works only if the experiments use the
same test of LR with the same state, experimental settings, losses, visibilities, and other relevant
parameters.
5.2.3 Martingale-based protocol
For fundamental tests of quantum mechanics, a serious deficiency of SD-based assessments of
experimental tests of LR is that they do not account for memory effects [20, 21, 22, 23], including
the possibility that the state and settings drift in the course of the experiment. To account for the
time dependence of the state and setting parameters and relevant LR models in an experiment,
R. Gill suggested a method for computing p-values based on the super-martingale structure of the
time sequence of observations in a test of LR [18, 19]. That is, given a Bell inequality 〈I(X)〉 ≤ B
as in Eq. (5.3), one can show that the time sequence Mn = ∑_{k=1}^{n} (I(Xk) − B), n = 1, 2, . . .,
is a super-martingale according to any LR model. Here, the measurement settings are assumed
to be chosen randomly and independently at each trial by Alice and Bob according to the fixed
probability distribution pi,j built into the Bell inequality. If the range of the Bell function I is
included in the finite interval [bl, bu], one then can apply large-deviation bounds for the super-
martingale {Mn : n = 1, 2, . . .} with bounded increments Mn−Mn−1 ∈ [bl−B, bu−B] to compute
p-values.
To show that the sequence Mn, n = 1, 2, . . ., is a super-martingale, let Wn be all the infor-
mation available before the n’th trial, including all previous trial results x1, . . . , xn−1. According
to any LR model, the conditional expectation of Mn given Wn satisfies
〈Mn|Wn〉 = 〈I(Xn)−B +Mn−1|Wn〉
= 〈I(Xn)|Wn〉 −B + 〈Mn−1|Wn〉
= 〈I(Xn)|Wn〉 −B +Mn−1
≤Mn−1. (5.11)
The last inequality follows from the fact that the Bell inequality 〈I(X)〉 ≤ B is satisfied for any
LR model, regardless of prior information. The inequality in Eq. (5.11) is the defining property for
a super-martingale {Mn : n = 1, 2, . . .}.
Given the results x1, . . . , xN after N trials, an experimental test yields an estimate I =
(1/N) ∑_{n=1}^{N} I(xn) of I. Suppose that the n’th trial result xn is distributed according to a ran-
dom variable XLR,n satisfying LR. In this case, the random variable from which I is sampled
is I ′LR = (1/N) ∑_{n=1}^{N} I(XLR,n). By applying the Azuma-Hoeffding inequality [160, 161, 162] for the
tail probability of the super-martingale {Mn : n = 1, 2, . . . , N} with bounded increments, we find
that, after N trials, the probability according to an LR model that I ′LR takes a value greater than
or equal to the observed I > B is bounded above by
ProbLR(I ′LR ≥ I) = ProbLR(MN ≥ N(I − B))
≤ exp(−2N(I − B)²/(bu − bl)²). (5.12)
We can further tighten the above bound according to Theorem 6.1 of Ref. [162]. The tighter bound
is
ProbLR(I ′LR ≥ I) = ProbLR(MN ≥ N(I − B))
≤ [((bu − B)/(bu − I))^((bu−I)/(bu−bl)) ((B − bl)/(I − bl))^((I−bl)/(bu−bl))]^N. (5.13)
This implies a valid p-value for the observed statistic I as
p(mart)N = [((bu − B)/(bu − I))^((bu−I)/(bu−bl)) ((B − bl)/(I − bl))^((I−bl)/(bu−bl))]^N, (5.14)
For large N , I approaches I, thus the confidence-gain rate according to Eq. (5.1) is
Gmart = ((bu − I)/(bu − bl)) log2((bu − I)/(bu − B)) + ((I − bl)/(bu − bl)) log2((I − bl)/(B − bl)). (5.15)
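As an illustration of Eqs. (5.14) and (5.15), the following sketch evaluates the martingale-based p-value and gain rate for the CHSH function of Eq. (5.4) with uniform settings, for which I(x) = ±4, so bl = −4 and bu = 4. The observed value of I and the trial number are hypothetical.

```python
import math

def martingale_p_value(I, N, B=2.0, b_l=-4.0, b_u=4.0):
    """Eq. (5.14): tail bound for an observed I > B after N trials,
    with the Bell function bounded in [b_l, b_u]."""
    w_u = (b_u - I) / (b_u - b_l)
    w_l = (I - b_l) / (b_u - b_l)
    per_trial = ((b_u - B) / (b_u - I)) ** w_u \
        * ((B - b_l) / (I - b_l)) ** w_l
    return min(per_trial ** N, 1.0)

def martingale_gain_rate(I, B=2.0, b_l=-4.0, b_u=4.0):
    """Eq. (5.15): the asymptotic confidence-gain rate in bits/trial."""
    w_u = (b_u - I) / (b_u - b_l)
    w_l = (I - b_l) / (b_u - b_l)
    return w_u * math.log2((b_u - I) / (b_u - B)) \
        + w_l * math.log2((I - b_l) / (B - b_l))

print(martingale_p_value(2.8, 1000))  # about 1.2e-13
print(martingale_gain_rate(2.8))      # about 0.043 bits per trial
```

Note that −log2(p(mart)N)/N reproduces Gmart exactly here, since the per-trial factor in Eq. (5.14) is 2 raised to minus the rate of Eq. (5.15).
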
Note that, although Theorem 6.1 of Ref. [162] is stated for a martingale that is a sequence of
random variables Mn, n = 1, 2, . . ., such that 〈Mn|Wn〉 = Mn−1, the same result and its proof also
apply to a super-martingale. The same bound is also derived in Theorem 1 of Ref. [160] for a sum
of independent and bounded random variables. From Refs. [160, 162], we can see that the bound
in Eq. (5.13) is tighter than bounds of ProbLR(I ′LR ≥ I) used in previous works [3, 17, 19], for
example, the bound as shown in Eq. (5.12). Moreover, from the proof of Theorem 6.1 in Ref. [162],
we can see that, even if a Bell function I and its bounds depend on n, that is, bl,n ≤ I(xn) ≤ bu,n
for any result xn at the n’th trial, the p-value assignment as in Eq. (5.14) is still valid with bu
and bl replaced by the averages bu = ∑_{n=1}^{N} bu,n/N and bl = ∑_{n=1}^{N} bl,n/N , respectively.
We cannot expect the bound on the tail probability in Eq. (5.13) to be asymptotically tight,
since the only constraints considered are the bounds on the Bell function I. The PBR protocol
takes advantage of all available constraints on the distributions of trial results according to LR,
implicitly including all relevant Bell inequalities.
5.2.4 PBR protocol
In contrast to a fixed Bell inequality used in the SD-based or martingale-based protocol,
given the setting distribution pi,j , after n trials but before the (n + 1)’th trial the PBR protocol
returns a special Bell inequality of the form
〈Rn(X)〉 ≤ 1 (5.16)
with a nonnegative Bell function Rn. Here, Rn can depend on previous trial results x1, . . . , xn and
other aspects of the experiment before starting the (n+1)’th trial. The construction of Rn typically
requires predicting the distribution of Xn+1. Thus, Rn is referred to as a prediction-based ratio
(PBR).
Given any sequence of PBRs Rn, n = 0, 1, 2, . . ., the PBR protocol computes a test statistic
according to Pn = ∏_{k=1}^{n} Rk−1(Xk), that is, the product of the values of Rk−1 at the potential
result Xk of the k’th trial. We claim that, according to any LR model with arbitrary memory, the
expectation of the test statistic satisfies
〈Pn〉 ≤ 1. (5.17)
To prove the claim, as in Sec. 5.2.3 let Wn denote all the information available before the n’th trial.
Then, according to any LR model with arbitrary memory, the expectation of Pn conditioned on
Wn satisfies
〈Pn|Wn〉 = 〈∏_{k=1}^{n} Rk−1(Xk) | Wn〉
= 〈∏_{k=1}^{n−1} Rk−1(Xk) × Rn−1(Xn) | Wn〉
= ∏_{k=1}^{n−1} Rk−1(Xk) × 〈Rn−1(Xn)|Wn〉
≤ Pn−1, (5.18)
where we used the facts that Wn includes Rk−1 and Xk−1 for k ≤ n, and that the LR bound
on 〈Rn−1(X)〉 is 1 given Wn, as the LR model in the bound is arbitrary. We can compute the
expectations of both sides of Eq. (5.18) to show that, according to any LR model, 〈Pn〉 ≤ 〈Pn−1〉,
and therefore, by induction, 〈Pn〉 ≤ 1, which is the inequality (5.17).
Given a sequence of experimental results x1, . . . , xN from N trials, the test statistic PN takes
a specific value P = ∏_{n=1}^{N} Rn−1(xn). Suppose that PN is constrained by LR, possibly with memory.
By construction PN ≥ 0, and the expectation according to an LR model 〈PN 〉 ≤ 1 as shown above.
According to Markov’s inequality, we conclude that
ProbLR(PN ≥ P ) ≤ min(1/P , 1), (5.19)
which shows that we can assign a valid p-value associated with the observed statistic P according
to
p(PBR)N = min((∏_{n=1}^{N} Rn−1(xn))^(−1), 1). (5.20)
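Computing the p-value of Eq. (5.20) from a sequence of PBR values is then a short accumulation. The ratios below are hypothetical placeholders for Rn−1(xn); in a long experiment one would accumulate log2 of the ratios instead, to avoid overflow.

```python
def pbr_p_value(ratios):
    """Eq. (5.20): ratios is the sequence R_0(x_1), R_1(x_2), ...,
    each trial's PBR evaluated at that trial's observed result."""
    product = 1.0
    for r in ratios:
        product *= r
    # Any zero ratio forces the p-value to 1 with no later recovery,
    # which is why zero probability estimates must be avoided (Sec. 5.3.1).
    return min(1.0 / product, 1.0) if product > 0 else 1.0

# 1000 trials each contributing a ratio of 1.03, i.e. log2(1.03) ~ 0.043
# bits of evidence against LR per trial.
print(pbr_p_value([1.03] * 1000))  # about 1.5e-13
```
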
Note that, Eq. (5.18) shows that the sequence Pn, n = 1, 2, . . ., is a super-martingale under
any LR model. Since the increment of this super-martingale is not bounded, we cannot apply the
method of Sec. 5.2.3 to bound the tail probability. However, we can use the optional stopping
theorem for a super-martingale [163], to get the following nice property of the PBR protocol:
Suppose that one stops the experiment if and only if the observed statistic P is greater than a
prespecified value P0 > 1 or the number of trials performed is greater than a prespecified value N0.
Then, the total number of trials N performed in an experiment is a random variable depending on
P0 and N0. According to the optional stopping theorem for a super-martingale, the expectation
of PN according to LR at the stopping time N is bounded above by 1. That is, even if the rule
for stopping the experiment is chosen in advance by a theorist who wants LR to prevail, the
theorist cannot explain the observed data on average.
For the extremely low p-values of interest in tests of LR, we are looking for large (negative) log-
p-value increments log2(Rn(xn+1)) at the (n+ 1)’th trial. Therefore, before the (n+ 1)’th trial, our
goal is to choose Rn so as to maximize the experimentally expected increment l = 〈log2(Rn(Xn+1))〉.
For this purpose, we can take advantage of anything we know about the probability distribution of
the random variable Xn+1 giving rise to the next trial result. Consider a probability distribution q
for Xn+1, which may be either the true distribution or an estimate thereof. Let p be the distribution
according to an LR model. Note that, because the setting distribution is under experimental control,
the probability distributions q and p must be consistent with the chosen setting distribution. Our
ability to distinguish the probability distributions q and p given a collection of independent samples
from q can be characterized by the Kullback-Leibler (KL) divergence from q to p,
DKL(q ‖ p) = ∑_x q(x) log2(q(x)/p(x)). (5.21)
The KL divergence is nonnegative, and it is zero if and only if p = q. This motivates seeking
an LR model whose probability distribution pLR minimizes the KL divergence from q [24]. We
define Sq = DKL(q ‖ pLR), and refer to Sq as the statistical strength for rejecting LR by means
of a test with the distribution q. As shown in Ref. [25], the statistical strength Sq is the optimal
valid confidence-gain rate for rejecting LR given that the experimental distribution is q. Thus, the
experimentally expected log-p-value increment l cannot exceed Sq, and our goal before the (n+1)’th
trial is to make l as close to Sq as possible.
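The KL divergence of Eq. (5.21) is straightforward to evaluate once q and p are in hand. A minimal sketch with toy binary distributions (the dict representation is our own illustrative choice):

```python
import math

def kl_divergence(q, p):
    """Eq. (5.21): D_KL(q || p) in bits. q and p are dicts mapping
    results x to probabilities; terms with q(x) = 0 contribute nothing."""
    return sum(q[x] * math.log2(q[x] / p[x]) for x in q if q[x] > 0)

q = {0: 0.75, 1: 0.25}
p = {0: 0.5, 1: 0.5}
print(kl_divergence(q, p))  # about 0.189 bits
print(kl_divergence(q, q))  # 0.0: zero if and only if p = q
```
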
We claim that if we define the PBR
Rn(x) = q(x)/pLR(x), (5.22)
then 0 ≤ Rn(x), and for any LR model the expectation satisfies 〈Rn(X)〉 ≤ 1. Consequently,
the p-value computed according to Eq. (5.20) is valid, and if q is the true distribution of Xn+1
the experimentally expected log-p-value increment l is Sq. To prove the claim, consider φ(β) =
DKL(q ‖ pLR + β(p− pLR)), where 0 ≤ β ≤ 1. For any p in the convex set of LR distributions, by
optimality of pLR, φ(β) ≥ φ(0). It follows that ∂φ/∂β|β=0+ ≥ 0. Consequently,
∑_x (pLR(x) − p(x)) q(x)/pLR(x) ≥ 0, (5.23)
which can be rearranged to show that according to any LR model’s probability distribution p the
expectation
〈Rn(X)〉 = ∑_x p(x) q(x)/pLR(x) ≤ 1. (5.24)
The claim follows. Bell inequalities of the form shown in Eq. (5.24), which are based on minimizing
the KL divergence, were introduced in Ref. [164].
In an experiment, however, we do not know the true distribution q of the random variable
Xn+1 giving rise to the (n+ 1)’th trial result. Instead, we obtain good estimates qn of q before the
(n+1)’th trial, and determine the corresponding optimal LR model’s probability distribution pLR,n.
We then set Rn(x) = qn(x)/pLR,n(x) to compute and update the PBR p-value. If the experiment
is sufficiently stable, good estimates can be obtained from the frequencies of results observed in
trials so far. The estimates can be improved by taking into account that the setting distribution
is known and the distributions of marginal outcomes for given settings of Alice or Bob must agree
due to no-signaling constraints. We discuss how to do this in Sec. 5.3.1. In Sec. 5.3.2, we show
that if the trials are independent and identically distributed, then PBR p-values computed with
any converging method for estimating the true probability distribution q have the property that
the confidence-gain rate
GPBR = Sq. (5.25)
Thus, we prove the asymptotic optimality of PBR p-values.
To determine the optimal LR model one can use numerical algorithms for optimizing convex
functions over a convex domain. In this case one can use the expectation-maximization algo-
rithm [152] as discussed in the previous chapter. A problem is that due to stopping criteria and
numerical precision, one cannot expect to find the exact optimum. We show in Sec. 5.3.2 that one
can compensate for this problem to maintain validity of the computed p-value.
Note that, probability ratios such as the ones we use to compute the values of Rn in Eq. (5.22)
are often referred to as likelihood ratios. Likelihood ratios play an important role in many statis-
tical tests as explained in statistics textbooks such as Ref. [151]. In the PBR protocol, the test
statistic can be computed from any sequence of nonnegative functions Rn satisfying the inequality
in Eq. (5.16). Thus, the probability ratios are simply an intermediate step to obtain such functions.
We do not ascribe any other meaning to the ratios.
5.3 Technical details for applying the PBR protocol
5.3.1 Estimating the experimental probability distribution
Consider n trials with observed results given by x1, . . . , xn. Our goal is to obtain an estimate
qn of the true probability distribution q of the (n+1)’th trial result. Assuming no other knowledge,
the estimate can be based on the empirical frequencies fn(x) = (1/n) ∑_{k=1}^{n} δ_{xk,x}. Due to statistical
fluctuations, the empirical frequencies are not likely to satisfy the following known constraints
satisfied by q:
• Setting distribution: The setting distribution pi,j is fixed, and q satisfies ∑_{a,b} q(i, j, a, b) = pi,j .
• No signaling: Given that Alice uses setting i, the distribution of Alice’s measurement
outcomes does not depend on Bob’s settings, and vice versa.
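The two constraints above can be checked mechanically on a candidate distribution. The following sketch is illustrative only: the dict layout, tolerance, and function name are our own choices, not the analysis engine's interface.

```python
def satisfies_constraints(q, p_ij, tol=1e-9):
    """Check the setting-distribution and no-signaling constraints on a
    distribution q over results (i, j, a, b), given the known setting
    probabilities p_ij over pairs (i, j)."""
    i_set = sorted({i for (i, j) in p_ij})
    j_set = sorted({j for (i, j) in p_ij})
    a_set = sorted({a for (_, _, a, _) in q})
    b_set = sorted({b for (_, _, _, b) in q})
    # Setting distribution: sum_{a,b} q(i, j, a, b) = p_ij.
    for (i, j), pij in p_ij.items():
        total = sum(q.get((i, j, a, b), 0.0) for a in a_set for b in b_set)
        if abs(total - pij) > tol:
            return False
    # No signaling: Alice's outcome distribution given her setting i
    # must not depend on Bob's setting j ...
    for i in i_set:
        for a in a_set:
            margs = [sum(q.get((i, j, a, b), 0.0) for b in b_set)
                     / p_ij[(i, j)] for j in j_set]
            if max(margs) - min(margs) > tol:
                return False
    # ... and symmetrically for Bob.
    for j in j_set:
        for b in b_set:
            margs = [sum(q.get((i, j, a, b), 0.0) for a in a_set)
                     / p_ij[(i, j)] for i in i_set]
            if max(margs) - min(margs) > tol:
                return False
    return True

# A distribution uniform over outcomes satisfies both constraints.
p_ij = {(i, j): 0.25 for i in (1, 2) for j in (1, 2)}
q = {(i, j, a, b): 0.0625 for i in (1, 2) for j in (1, 2)
     for a in (1, -1) for b in (1, -1)}
print(satisfies_constraints(q, p_ij))  # True
```
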
There are two other issues for computing PBR p-values. The first is that some empirical frequencies
fn(x) may be zero. If our estimate is qn = fn, zero frequencies can be disastrous. In the case where
the corresponding results occur at the next trial, the ratio contributing to the PBR p-value in
Eq. (5.20) can be zero, and then the p-value increases to 1 with no possibility of later reduction.
The second and related issue is that in the absence of prior knowledge, initially we have insufficient
information to make useful estimates of probability distributions of future trial results. Even if
the problem of zero frequencies has been taken care of, this can still result in initial “learning”
transients that cause a negative offset in the accumulated log-p-values (see Fig. 5.3 in Sec. 5.4 for
an example).
Our approach for estimating the next trial result’s probability distribution uses maximum
likelihood to obtain an estimate that respects the above constraints and then adjusts the estimate
by mixing in a distribution that is uniform conditional on the settings. To reduce the impact of
learning transients, we process the trials in blocks.
To apply maximum likelihood for computing a first estimate q0 of q, we assume independent
and identically distributed trials. Whether or not this assumption actually holds in an experiment
only affects the quality of the computed p-value, but not its validity. The probability of observing
empirical frequencies fn after n trials given that the true distribution is q is proportional to

L(fn|q) = ∏_x q(x)^{n fn(x)}.  (5.26)
We therefore set q0 according to
q0 = argmax_{q′∈V} L(fn|q′),  (5.27)
where V is the set of probability distributions satisfying the setting-distribution and no-signaling
constraints. These constraints are linear and log(L(fn|q)) is concave, so there is no difficulty in
applying available nonlinear optimization tools. Note that for the purpose of computing PBR
p-values, it is not critical that Eq. (5.27) be satisfied exactly, so it is not necessary to use extremely
tight stopping criteria to enforce equality at the best numerical precision possible. Also, whereas
the design of PBRs such as those in Eq. (5.22) requires that the setting-distribution constraint is
satisfied, the no-signaling constraint is not critical. Applying it helps improve our estimates, but
the effect on the log-p-value increments becomes negligible for large n.
There are different ways to solve the problem of zero empirical frequencies; some
are explained in Refs. [165, 166]. They generally involve mixing in a distribution that has no zero
probabilities, with a weight that decreases to zero as n grows. For the plots in Figs. 5.3 and 5.4
of Sec. 5.4, we modified q0 by setting qn = (n/(n+1)) q0 + (1/(n+1)) u, where the distribution u is
uniform conditionally on the settings, and u’s setting distribution is pi,j.
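The mixing adjustment can be sketched as follows (function and parameter names are illustrative; results are represented as tuples (i, j, a, b)):

```python
def mix_with_uniform(q0, n, p_setting, outcomes_per_setting):
    """q_n = n/(n+1) * q0 + 1/(n+1) * u, removing zero probabilities.

    q0: dict mapping results (i, j, a, b) to probabilities.
    p_setting: dict mapping (i, j) to the known setting probabilities,
        so that u has the correct setting distribution.
    outcomes_per_setting: number of joint outcomes (a, b) per setting
        pair; u(i, j, a, b) = p_setting[(i, j)] / outcomes_per_setting
        is uniform conditionally on the settings.
    """
    w = 1.0 / (n + 1)  # weight of the uniform part, -> 0 as n grows
    return {x: (1 - w) * q0[x] + w * p_setting[x[:2]] / outcomes_per_setting
            for x in q0}
```

Because u is strictly positive, the mixture has no zero probabilities, so a PBR built from it can never be zero at the next trial.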
There are different approaches to mitigate the effect of the initial learning transient. The
first is to “prime” the estimates with knowledge about the experiment available before the trials
are started. Such knowledge could be based on theory or on experiments designed to characterize
the quantum state and measurement setup. The prior information must be assigned a weight. In
our implementation of the local realism analysis engine (see Appendix A), the weight is determined
by the number of trials that would have been required to obtain an equally good estimate directly
from the frequencies. Proper use of priming requires that the initial estimates and parameters such
as the weight are determined “blindly” before any knowledge of the actual data to be analyzed is
available.
A second approach is to set Rn(x) = 1 for any x unless the statistical strength Sqn for qn’s
violation of LR seems sufficiently significant given that the estimated distribution qn is based on
n trials. While one might expect that the violation is sufficiently significant if nSqn ≥ c for some
constant c, simulations show that the best choice of c depends on the distribution of trial results
in an experiment.
The third and simplest approach is to block the data from the trials. Instead of updating
the log-p-value after every trial, we process data h trials at a time. The first block is used only
for estimating the probability distribution of future trial results. That is, we set Rk(x) = 1 for
k = 0, . . . , (h−1). Subsequently, we have Rmh+k = Rmh for k = 1, . . . , (h−1) and all m. Note that
neither the validity nor the asymptotic optimality of the computed p-values requires updating the
PBRs after each trial. Choosing h large enough ensures that the first block’s trials have sufficient
information for obtaining reasonable estimates of the distribution. An additional advantage of
blocking the trials is that we avoid unnecessarily invoking the computationally costly optimizations
required for updating the PBRs. We standardized the choice of block size so that if the total
number of trials to be analyzed is N, h is the maximum of ⌈N/1000⌉ and ⌈d ln(2d)⌉, where d is the
number of possible results at a trial. The first expression ensures that we do not lose too much
log-p-value by using the first block only for learning the trial results’ distribution. The second one
is chosen so that if q is uniform, the probability that every trial result occurs in each block is at
least 1/2.
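The block-size rule is easily written down directly (the function name is illustrative). For instance, it reproduces the 56-trial blocks used in Sec. 5.4 for a CHSH test, where N = 5000 trials and a trial has d = 16 possible results:

```python
import math

def block_size(N, d):
    """Standardized block size h = max(ceil(N/1000), ceil(d*ln(2d))).

    N: total number of trials to analyze; d: number of possible results
    at a trial.  The second term makes the probability that every result
    occurs in a block at least 1/2 when q is uniform, by the union bound
    d * (1 - 1/d)**h <= d * exp(-h/d) <= 1/2.
    """
    return max(math.ceil(N / 1000), math.ceil(d * math.log(2 * d)))
```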
We conclude this section with a note on implementing the PBR protocol. For monitoring
an experiment and to adapt to changes in experimental configuration, the estimated experimental
distributions used in the PBRs should be based on recent trials only. This can be accomplished by
windowing the trials with a window large enough to have statistically significant violation of LR
(if there is violation), but small enough to avoid seeing significant changes in configuration. Our
implementation of the local realism analysis engine uses a computationally simpler approach based
on weighting the trials with exponentially decreasing weights in time determined by a configurable
half-life. This feature was not used in the comparisons in Sec. 5.4.
5.3.2 Effects of bad estimates of true distributions and optimal LR models
Ideally the estimated distribution qn used in the numerator of Rn matches the true distribu-
tion q, and the LR distribution pLR,n in the denominator of Rn exactly minimizes the KL divergence
from qn. As shown in Sec. 5.2.4, having qn different from q does not affect the validity of the PBR
p-values. But it can reduce the expected log-p-value increment l. Let Sq be the statistical strength
of q for the violation of LR. We show that
Sq ≥ l ≥ Sq −DKL(q ‖ qn). (5.28)
For reasonable methods of estimating qn such as the one described in Sec. 5.3.1 and independent
and identically distributed trials, qn almost surely approaches q so that DKL(q ‖ qn) goes to zero.
This shows that the PBR protocol achieves the confidence-gain rate Sq and hence is optimal.
To prove the first inequality in Eq. (5.28), let pLR be the LR distribution that minimizes the
KL divergence from q, so that Sq = DKL(q ‖ pLR). We bound l as follows:

Sq − l = ∑_x q(x) log2( q(x)/pLR(x) ) − ∑_x q(x) log2( qn(x)/pLR,n(x) )
       = ∑_x q(x) log2( q(x)/t(x) ),  (5.29)
where we define t(x) = pLR(x) qn(x)/pLR,n(x). Since qn(x)/pLR,n(x) is a PBR, and pLR is an LR
distribution, we know that c ≡ ∑_x t(x) ≤ 1 (see Eq. (5.24)). Since t′ = t/c is a probability
distribution, we can continue the calculation:

Sq − l = log2(1/c) + ∑_x q(x) log2( q(x)/t′(x) ) ≥ 0,  (5.30)
because the second term is a KL divergence.
To obtain the second inequality of Eq. (5.28) we bound
l = ∑_x q(x) log2( qn(x)/pLR,n(x) )
  = ∑_x q(x) log2( q(x)/pLR,n(x) ) − ∑_x q(x) log2( q(x)/qn(x) )
  = DKL(q ‖ pLR,n) − DKL(q ‖ qn)
  ≥ DKL(q ‖ pLR) − DKL(q ‖ qn)
  = Sq − DKL(q ‖ qn).  (5.31)
The denominator pLR,n of the PBRs Rn must be computed numerically. Consequently, the
distribution p′LR,n actually obtained is typically not identical to pLR,n and may not minimize the
relevant KL divergence. Hence, there may be an LR distribution p according to which the
expectation 〈R′n(X)〉p = 〈qn(X)/p′LR,n(X)〉p is greater than 1, and so the PBR p-value is not
valid if it is computed according to Eq. (5.20) with R′n. To maintain validity, we determine the
maximum value 1 + ε of the expectations 〈R′n(X)〉p over all LR distributions p and then set
Rn = R′n/(1 + ε). To determine the bound 1 + ε, we recall that LR distributions are mixtures
of distributions pλ induced by “local hidden variables” λ. Each λ assigns deterministic outcomes
independently for each setting of Alice and each setting of Bob. We write a(λ,Ai) and b(λ,Bj) for
Alice’s and Bob’s measurement outcomes given settings Ai and Bj, according to λ. The probability
for the trial result x = (i, j, a, b) is given by p_{λ,(i,j,a,b)} = p_{i,j} δ_{a,a(λ,Ai)} δ_{b,b(λ,Bj)}. With these definitions,

1 + ε = max_{p is LR} 〈qn(X)/p′LR,n(X)〉_p = max_λ ∑_x p_{λ,x} qn(x)/p′LR,n(x).  (5.32)
Because the number of different λ is finite, the value 1+ε can be computed according to Eq. (5.32).
The expectation-maximization algorithm that we apply to KL-divergence minimization iteratively
updates the probability distribution over the set of hidden variables λ. To perform the updates,
it requires the set of values that are maximized in Eq. (5.32), so the computation of 1 + ε can be
integrated into the algorithm with little overhead. Furthermore, the quantity ε can be used as a
stopping criterion for minimization. That is, the expected log-p-value increment l′, assuming that
the random variable Xn+1 is distributed according to qn, satisfies
l′ = ∑_x qn(x) log2( qn(x) / (p′LR,n(x)(1 + ε)) )
   = DKL(qn ‖ p′LR,n) − log2(1 + ε)
   ≥ DKL(qn ‖ pLR,n) − log2(1 + ε).  (5.33)
Thus, for independent and identically distributed trials, the confidence-gain rate is lowered by at
most log2(1 + ε).
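For a bipartite configuration with finitely many settings and outcomes, the bound 1 + ε of Eq. (5.32) can also be obtained by brute-force enumeration of the deterministic LR states λ. The sketch below (illustrative names; standalone rather than integrated into the expectation-maximization iteration as described above) makes the maximization explicit:

```python
from itertools import product

def one_plus_eps(qn, pprime, p_setting, n_settings=2, n_outcomes=2):
    """Compute 1 + eps = max_lambda sum_x p_{lambda,x} qn(x)/pprime(x),
    as in Eq. (5.32).

    qn, pprime: dicts over results x = (i, j, a, b).
    p_setting: dict over setting pairs (i, j).
    A deterministic lambda assigns an outcome a(lambda, A_i) to each of
    Alice's settings and b(lambda, B_j) to each of Bob's settings.
    """
    outcomes = range(n_outcomes)
    best = 0.0
    for a_map in product(outcomes, repeat=n_settings):      # Alice's outcomes
        for b_map in product(outcomes, repeat=n_settings):  # Bob's outcomes
            val = sum(
                p_setting[(i, j)] * qn[(i, j, a_map[i], b_map[j])]
                / pprime[(i, j, a_map[i], b_map[j])]
                for i in range(n_settings) for j in range(n_settings))
            best = max(best, val)
    return best
```

For a CHSH configuration there are only 16 states λ, so the enumeration is cheap; as noted above, in our implementation the same quantities are already produced by the expectation-maximization updates.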
5.4 Results
In this section we present results obtained using the SD-based, martingale-based, and PBR protocols.
5.4.1 Confidence-gain rates
Let us first compare the confidence-gain rates achieved by different protocols in tests of LR
with different experimental configurations. In Fig. 5.1, we study the confidence-gain rates achieved
by different protocols in tests of LR using unbalanced Bell states |ψuB〉 = cos(θ)|00〉 + sin(θ)|11〉
with θ ∈ (0, π/4]. For the results shown in this figure, we chose a uniform setting distribution.
The family of unbalanced Bell states considered is of interest because they are more tolerant of low
detection efficiency, as studied in Chapter 4.
Figure 5.1: Confidence-gain rates G achieved by the SD-based, martingale-based, and PBR protocols. The gain rate G is shown for a CHSH test of LR with an unbalanced Bell state with no loss and perfect detectors. It depends on the parameter θ in the unbalanced Bell state |ψuB〉. Given the state parameter θ, the measurement settings are chosen to maximize the violation of the CHSH inequality (1.2). The line corresponding to the gain rates achieved by the SD-based protocol crosses the line corresponding to the optimal gain rates at θ = 33.41◦.
In Fig. 5.2, we study the confidence-gain rates in tests of LR with noisy and lossy Bell states.
Motivated by the result that the amount of randomness produced using a test of LR with a biased
setting distribution is more than that produced with a uniform setting distribution [3], here the
confidence-gain rates in tests with biased setting distributions are shown.
Figs. 5.1 and 5.2 show that the gain rates achieved by the SD-based protocol can be higher
than justified and are therefore not valid. The worst case is when the state used is a Bell state,
which is an aim of most experiments to date. Both figures also show that the gain rates achieved
by the martingale-based protocol are valid but generally not optimal.
Figure 5.2: The confidence-gain rate G of a CHSH test of LR with a Bell state and varying detection efficiency η and visibility V. The measurement settings are chosen to maximize the violation of the CHSH inequality (1.2). Measurement outcomes where no particle is detected are assigned the value −1. (a) pA1 = pB1 = 0.5, (b) pA1 = pB1 = 0.51, (c) pA1 = pB1 = 0.52, and (d) pA1 = pB1 = 0.53, where pA1 and pB1 are the probabilities that at each trial Alice and Bob independently choose the settings A1 and B1, respectively. Note that in subplot (a) the optimal gain rates are not shown, since the optimal gain rate can be at most 6% larger than the corresponding martingale-based gain rate, so the difference between them is not visible.
From Fig. 5.1, we can infer that if one uses the number of SDs to compare the violation of the
CHSH inequality in experiments involving different unbalanced Bell states, one tends to unfairly
favor the experiment with the more balanced state. From the results in Fig. 5.2, one can see that,
as the bias in the setting distribution increases, the confidence-gain rates achieved by the
martingale-based protocol fall further below the corresponding optimal gain rates.
Note that for the above results, the SD-based confidence-gain rates were computed with respect
to the conventional method for estimating violation. According to the discussion in Sec. 5.2.1,
the number of SDs of violation computed according to the conventional estimate I in Eq. (5.7) is
generally higher than that computed according to the estimate I in Eq. (5.5). Hence, the con-
ventional way of estimating the violation and the experimental SD worsens the validity problem
for SD-based gain rates. However, using the estimate I and the associated larger SD in Figs. 5.1
and 5.2 does not significantly alter the plots or their interpretation.
5.4.2 Application to experiments
The protocols discussed can compute p-values for recorded trials as an experiment progresses,
and such “running” p-values may be used to optimize experimental settings. Because we are
interested in extremely small p-values with exponential asymptotic behavior, we generally consider
and display the log-p-value.
The SD-based or martingale-based protocol is restricted to a fixed Bell inequality. The PBR
protocol does not have this restriction, which enables wider searches for strong violations of LR.
Running log-p-values are shown for a simulation in Fig. 5.3 and for data from Ref. [3] in Fig. 5.4.
The PBR p-values were computed with our implementation of the local realism analysis engine; see
the associated code information and documentation in Appendix A. Note that whereas running
log-p-values can be used to monitor and tweak an experiment, they must not be used as a stopping
criterion once an experiment has been configured.
For Fig. 5.3 we simulated a CHSH test of LR with an unbalanced Bell state and measurement
settings maximizing the violation of the CHSH inequality (1.2). We assumed an ideal experiment
(no loss of particles or visibility) and simulated 5000 successive trials. The log-p-values were updated
for successive blocks of 56 trials according to the discussion in Sec. 5.3.1. Here, we did not prime
Figure 5.3: Running log-p-values as functions of the number of trials n in a CHSH test of LR with an unbalanced Bell state cos(θ)|00〉 + sin(θ)|11〉, where θ = 22.5◦. We assume that there is no noise or detection inefficiency and that the setting distribution is uniform. The log-p-values are computed according to the three protocols discussed. The slopes of the straight lines are the confidence-gain rates achieved by each protocol. (a) is for one simulation of 5000 successive trials. (b) is an average of 30 independent simulations.
the PBRs before starting the simulation. The figure shows typical and average runs and compares
the running log-p-values to the asymptotic lines with slopes given by the respective gain rates. The
slopes of the running log-p-values approach the gain rates, but PBR log-p-values have a systematic
offset that can be attributed to an initial transient where the experimental probability distribution
is being learned. The transient can be removed if, before the experiment is started, we have a good
estimate of the experimental distribution. Such an estimate can be used to prime the PBRs.
For Fig. 5.4, we compute log-p-values for the data from the experiment described in Ref. [3].
In this experiment, two 171Yb+ ions separated by about one meter were entangled through a
probabilistic process. In this process, each ion is entangled with one emitted photon. By projecting
Figure 5.4: Running log-p-values as functions of the number of trials n in the experiment of Ref. [3]. In this experiment, different measurement settings are chosen uniformly at random. The dotted lines are provided only to guide the eye.
the two emitted photons into a Bell state the two remote ions are entangled with each other. On
the entangled two-ion system, a CHSH test of LR was performed. The results from 3016 trials were
recorded. The resulting estimate of the CHSH expression is ICHSH = 2.414± 0.058. For the figure,
we processed the data in blocks of 56 trials as before. The log-p-values computed by the PBR
protocol both with and without priming are shown. To prime the PBRs, we assumed that before
the experiment we had an estimate of the experimental probability distribution based on the exact
frequencies observed in this experiment after 3016 trials. In this experiment, there is insufficient
data for PBR log-p-values to exceed martingale-based ones.
Chapter 6
Efficient quantification of experimental violation of local realism
From the last chapter, we know that a small p-value means that the observed data is significant
for rejecting local realism (LR). Upper bounds of p-values for specified test statistics are required
for precise statements of experimental violations of LR. Such bounds not only help to reliably
demonstrate violations of LR, but also help to prove the security of quantum key distribution [11,
13, 109] or certify the generation of genuine randomness [3, 110, 111, 112].
In the last chapter, we discussed two available protocols that compute valid upper bounds
of p-values. One is the martingale-based protocol [18, 19], but the bounds computed are not tight.
The other is the prediction-based-ratio (PBR) protocol, which computes tighter bounds. Specifically,
the latter bounds are asymptotically tight with respect to the total number of trials in a test of
LR, if the prepared quantum states and measurement settings do not vary in time. While we
demonstrated that the PBR protocol is practical for many standard configurations, this protocol
is computationally inefficient with respect to the number of parties per test, settings per party,
and outcomes per setting. The reason is that it requires computing estimates of the experimental
probability distribution and the associated optimal local realistic (LR) model. These estimates
are difficult to find when there are many parties, settings, or outcomes. Extreme examples are
provided by experimental configurations involving continuous variables, where the PBR protocol
cannot be directly applied. In this chapter, we propose a simplified PBR protocol to efficiently
compute high-quality p-value bounds for all configurations.
The simplified PBR protocol has at least four advantages over other protocols. First, its
p-value bounds are as good as and typically better than those obtained by the martingale-based
protocol. Second, it can take multiple Bell inequalities into consideration at once in a statistically
rigorous way. Thus we can obtain high-quality p-value bounds even when we cannot determine
beforehand which inequality will work best. Third, it can adapt to changes in the experimental
results’ distribution. Fourth, this protocol can be applied to any test with linear witnesses, such as
entanglement detection [82, 84], without a full analysis of the relevant probability space.
Due to the difficulty of determining worst-case tail probabilities of typical test statistics, we
can usually determine only upper bounds of exact p-values as defined in Eq. (5.2) of Chapter 5.
Thus, for the remainder of this chapter, the term “p-value” refers to any valid upper bound on the
exact p-value. We can compare different protocols according to the (asymptotic) confidence-gain
rate as defined in Eq. (5.1) of Chapter 5. Higher gain rates imply better protocol performance.
In Sec. 6.1, we discuss how to simplify the PBR protocol. Like the martingale-based and
full PBR protocols, the simplified PBR protocol works even under memory effects [20, 21, 22, 23].
We then compare the simplified PBR protocol with the other two protocols in Sec. 6.2. Finally
in Sec. 6.3, we discuss the application of the simplified PBR protocol to other tests with linear
witnesses. This chapter is based on our previous work [29].
6.1 Simplified PBR protocol
The simplified PBR protocol chooses the PBRs from convex combinations of Bell functions
that are derived from a given set of Bell inequalities. To ensure that a convex combination is
a PBR, the Bell functions first need to be standardized so that they are nonnegative and have
expectations at most 1 for any LR model. Any Bell function that is lower-bounded has such a
standardized form. In particular, if 〈I(X)〉 ≤ B is a Bell inequality and I(x) ≥ bl for all x, then
r(x) = (I(x) − bl)/(B − bl) is standardized. Note that, as a constraint on the distribution of
X, 〈r(X)〉 ≤ 1 is equivalent to 〈I(X)〉 ≤ B. Given Bell inequalities 〈I(m)(X)〉 ≤ B(m) where
I(m) is lower-bounded and m = 1, 2, . . . ,M , we can construct the corresponding standardized Bell
functions r(m). We define r = (r(1), . . . , r(M)). The simplified PBR protocol chooses the PBR Rn
from among the convex combinations

ω · r = ∑_m ωm r(m),  (6.1)

where ωm ≥ 0 and ∑_m ωm = 1. Our implementation always includes the trivial Bell function
r(1) = 1. This ensures that the set of convex combinations is at least one-dimensional and that the
confidence-gain rate is at least as high as that achieved by the martingale-based protocol (see the
discussion below).
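The standardization r(x) = (I(x) − bl)/(B − bl) can be illustrated with a CHSH-type Bell function. The encoding below is hypothetical (uniform settings pi,j = 1/4, outcomes a, b ∈ {−1, +1}, settings indexed 1 and 2, with the per-trial estimator I(x) = 4·s·a·b whose sign s is −1 only for the setting pair (2, 2)); names are illustrative:

```python
def standardize(I, b_low, B):
    """Return r(x) = (I(x) - b_l)/(B - b_l): nonnegative, and
    <r> <= 1 is equivalent to the Bell inequality <I> <= B."""
    return lambda x: (I(x) - b_low) / (B - b_low)

def I_chsh(x):
    """Per-trial CHSH estimator for uniform settings: with p_ij = 1/4,
    <I> = <A1 B1> + <A1 B2> + <A2 B1> - <A2 B2>."""
    i, j, a, b = x
    s = -1 if (i, j) == (2, 2) else 1
    return 4 * s * a * b

# I ranges over {-4, +4}, so b_l = -4 and b_u = 4; the LR bound is B = 2.
r_chsh = standardize(I_chsh, b_low=-4, B=2)  # r takes values in {0, 4/3}
```

For any deterministic LR state the average of r over the four setting pairs is at most (2 + 4)/6 = 1, as required of a standardized Bell function.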
Like the full PBR protocol, the simplified PBR protocol aims to optimize the experimentally
expected log-p-value increment given previous trial results, under the assumption that the distri-
bution of Xn+1 is the same as the empirical-frequency distribution of the previous trial results.
Whether or not this assumption holds does not affect the validity of the p-value computed. The
log-p-value increment at the (n + 1)’th trial may be defined as log2Rn(xn+1). Its experimentally
expected value given that Xn+1 is distributed according to q is
∑_{xn+1} q(xn+1) log2 Rn(xn+1).  (6.2)
Before the (n+ 1)’th trial, the protocol attempts to maximize this expected log-p-value increment.
Since q is not known, it is empirically estimated based on the previous n trials. Expanding Rn
according to Eq. (6.1) yields the following estimate of the experimentally expected log-p-value
increment at the (n+ 1)’th trial:
Gn(ω) = (1/n) ∑_{k=1}^{n} log2(ω · r(xk)) = ∑_{x: fn(x)≠0} fn(x) log2(ω · r(x)),  (6.3)

where fn(x) = (1/n) ∑_{k=1}^{n} δ_{x_k,x} is the empirical frequency of x before the (n+1)’th trial. The protocol
thus determines Rn by maximizing Gn(ω) over ω, that is, Rn = r · argmaxωGn(ω). Note that,
unlike the full PBR protocol, the simplified PBR protocol does not require explicitly optimizing
over all LR models. Computing argmaxωGn(ω) requires optimizing a convex objective function
over an M -dimensional convex space, where the evaluation of the objective function involves a sum
of n terms. In our implementation, we apply the expectation-maximization algorithm [167] to solve
this problem.
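A generic sketch of this maximization (not the thesis implementation of the algorithm of Ref. [167]; names are illustrative) uses a multiplicative EM-type update that preserves the simplex constraints:

```python
import math

def optimize_weights(freqs, r_values, iters=500):
    """Maximize G(w) = sum_x f(x) log2(w . r(x)) over the simplex via the
    multiplicative EM-type update
        w_m <- w_m * sum_x f(x) r_m(x) / (w . r(x)),
    which keeps w nonnegative and, since sum_x f(x) = 1, keeps sum_m w_m = 1.

    freqs: empirical frequencies f(x) of the observed results.
    r_values: for each observed x, the vector (r1(x), ..., rM(x));
        including the trivial Bell function r1 = 1 keeps w . r(x) > 0.
    """
    M = len(r_values[0])
    w = [1.0 / M] * M  # start at the center of the simplex
    for _ in range(iters):
        dot = [sum(wm * rm for wm, rm in zip(w, rv)) for rv in r_values]
        w = [w[m] * sum(f * rv[m] / d
                        for f, rv, d in zip(freqs, r_values, dot))
             for m in range(M)]
    return w

def gain(w, freqs, r_values):
    """Estimated expected log-p-value increment Gn(w) of Eq. (6.3)."""
    return sum(f * math.log2(sum(wm * rm for wm, rm in zip(w, rv)))
               for f, rv in zip(freqs, r_values))
```

Each update is guaranteed not to decrease the concave objective, and the iteration converges to the maximizing convex combination.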
The performance of the simplified PBR protocol depends on the relationship between the
actual distribution of trial results and the set of standardized Bell functions used. If the results
are independent and identically distributed according to a known distribution that violates LR,
then there exists an optimal Bell inequality that can be derived from the optimal PBR as found
by the full PBR protocol. (Here, optimality refers to the optimality of the gain rate achieved by
the protocol; see Chapter 5.) If the optimal PBR is included in the convex set of standardized
Bell functions, the confidence-gain rate achieved by the simplified PBR protocol is optimal. But
since the actual distribution is unknown before an experiment, the above assumption may not
hold without making the dimension of the set of convex combinations in Eq. (6.1) impractically
large. Thus, before an experiment, it is important to choose a relevant (and preferably small) set
of standardized Bell functions. In Sec. 6.2, we show that it helps to include more than just the
obvious Bell functions.
The performance of the simplified PBR protocol can be compared with that of the martingale-
based protocol, the only valid non-PBR protocol considered so far. To compute a p-value, the
martingale-based protocol uses a Bell inequality 〈I(X)〉 ≤ B with a Bell function I whose range is
included in the interval [bl, bu]. Below we show that the simplified PBR protocol using the same Bell
inequality, together with the default trivial Bell function r = 1, achieves a gain rate at least as high
as the gain rate achieved by the martingale-based protocol. Also, the following proof shows that
these two gain rates are equal to each other if and only if the experimental range of the function I
is contained in the set {bl, bu}.
Let the experimental probability of observing the result x in a trial be q(x). The experimental
mean of I is Iq = ∫ q(x) I(x) dx. If Iq ≥ B, then from Eqs. (5.1) and (5.14) of Chapter 5 we get the
gain rate

Gmart = ((bu − Iq)/(bu − bl)) log2( (bu − Iq)/(bu − B) ) + ((Iq − bl)/(bu − bl)) log2( (Iq − bl)/(B − bl) )
      = ∫ q(x) [ ((bu − I(x))/(bu − bl)) log2( (bu − Iq)/(bu − B) )
               + ((I(x) − bl)/(bu − bl)) log2( (Iq − bl)/(B − bl) ) ] dx.  (6.4)
Here, we use the fact that the experimental estimate I approaches Iq as N →∞. By the concavity
of log2(x) and some algebra, we get that the gain rate Gmart satisfies the inequality

Gmart ≤ ∫ q(x) log2( ((bu − I(x))/(bu − bl)) ((bu − Iq)/(bu − B))
                   + ((I(x) − bl)/(bu − bl)) ((Iq − bl)/(B − bl)) ) dx
      = ∫ q(x) log2( ω0 (I(x) − bl)/(B − bl) + 1 − ω0 ) dx,  (6.5)

where 0 ≤ ω0 = (Iq − B)/(bu − B) ≤ 1.
From Eqs. (5.1) and (5.20) of Chapter 5 and according to the design of the PBRs by the
simplified PBR protocol, the gain rate achieved by this protocol is
GsPBR = max_{0≤ω≤1} ∫ q(x) log2( ω (I(x) − bl)/(B − bl) + 1 − ω ) dx.  (6.6)

Here, we use the fact that the empirical frequency fN(x) = (1/N) ∑_{n=1}^{N} δ_{x_n,x} approaches the
experimental probability q(x) as N → ∞. The inequality Gmart ≤ GsPBR follows directly from
comparing Eq. (6.5) with Eq. (6.6).
By considering the condition for equality in Eq. (6.5), we can show that Gmart = GsPBR if
and only if q(x) = 0 whenever bl < I(x) < bu. For this it suffices to note that log2(x) is strictly
concave, so equality holds in Eq. (6.5) if and only if I(x) = bu or I(x) = bl whenever q(x) ≠ 0.
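The comparison can be checked numerically for a toy distribution by implementing Eqs. (6.4) and (6.6) directly (function names are illustrative, and the maximization over ω uses a simple grid scan rather than the expectation-maximization algorithm of our implementation):

```python
import math

def gain_martingale(q, I, b_low, b_up, B):
    """Martingale-based gain rate, Eq. (6.4); equivalently, the KL
    divergence between the binary distributions obtained by rescaling
    Iq and B to [0, 1]."""
    Iq = sum(q[x] * I[x] for x in q)      # experimental mean of I
    p = (Iq - b_low) / (b_up - b_low)     # rescaled experimental mean
    p0 = (B - b_low) / (b_up - b_low)     # rescaled LR bound
    return p * math.log2(p / p0) + (1 - p) * math.log2((1 - p) / (1 - p0))

def gain_spbr(q, I, b_low, b_up, B, grid=20000):
    """Simplified-PBR gain rate, Eq. (6.6), using the standardized Bell
    function r = (I - b_low)/(B - b_low) and the trivial Bell function;
    the scan keeps w < 1 so the log argument stays positive."""
    def G(w):
        return sum(q[x] * math.log2(w * (I[x] - b_low) / (B - b_low) + 1 - w)
                   for x in q)
    return max(G(k / grid) for k in range(grid))

# Toy distribution with mass strictly between b_l and b_u, so the
# simplified PBR gain rate strictly exceeds the martingale-based one.
q = {1: 0.8, 2: 0.1, 3: 0.1}
I = {1: 4.0, 2: -4.0, 3: 0.0}
```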
6.2 Protocol comparison
6.2.1 Computational resource comparison
Of the available protocols for computing p-values, the martingale-based one is the least
resource-intensive and simplest to apply. It requires computing only an estimate of the mean
of the Bell function, which involves a sum of N terms. In the following, we compare the com-
putational resources required by the simplified and full PBR protocols in an experimental test of
LR.
We consider an experimental configuration involving l parties where each party has s mea-
surement settings and each local measurement has d outcomes. (The comparison below is readily
extended to more general configurations.) We suppose that the joint-setting distribution is uniform.
Then, the number of possible results (measurement settings and outcomes of all parties) at a trial
is K = (ds)^l. Since an LR state specifies the exact outcome for each local measurement of each
party at a trial, there are H = d^{ls} such LR states. A general LR model is a convex combination
of LR states, so the number of free parameters characterizing a general LR model is (H − 1).
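These counts are easily made concrete (the function name is illustrative); for the CHSH configuration (l = 2 parties, s = 2 settings, d = 2 outcomes) both counts equal 16:

```python
def configuration_sizes(l, s, d):
    """Counts for l parties, s settings per party, d outcomes per setting:
    K = (d*s)**l possible trial results and H = d**(l*s) deterministic
    LR states; a general LR model has H - 1 free parameters."""
    K = (d * s) ** l
    H = d ** (l * s)
    return K, H
```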
Let the total number of trials in an experimental test of LR be N . We assume that each
PBR protocol sets the initial value of the PBR to R0 = 1 and updates the PBR Rn before each
trial n (n > 1). (In practice this is unnecessary; see Sec. 5.3.1 of Chapter 5.) For updating the
PBR, each PBR protocol needs to optimize a convex objective function over a convex space. The
complexity of this optimization problem can be described in terms of variables that are functions
of the parameters n, l, s, and d characterizing the input data size. (Note that the stored size of
the first n trial results is O(n log(K)) = O(nl(log(d) + log(s))).) We need to quantify the resource
cost of implementing each protocol in terms of these parameters.
The complexity of the optimization problem solved before each trial can be parametrized by
the complexity of the convex search space, the complexity of evaluating the objective function, and
the precision needed for computing a high-quality p-value for rejecting LR. We assume that the sim-
plified and full PBR protocols use generic iterative optimization algorithms whose implementation
complexities as functions of these parameters are asymptotically the same. We also assume that the
complexity of the convex search space is dominated by its dimension. In particular, we do not ac-
count for the complexity of enforcing convex constraints. This is motivated by the observation that
there is no additional overhead for enforcing convex constraints in the expectation-maximization
algorithm [152, 167] used in our implementation. For quantifying the complexity of evaluating the
objective function, we assume that the Bell functions used can be evaluated in constant time given
any trial result. This assumption is realistic for many Bell functions, as their values are determined
by concise formulas derived from theory. Alternatively, these functions can be preprocessed as a
table stored in random-access memory; we do not include preprocessing time in our analysis. Also,
we assume that determining whether or not an arbitrary trial result x happens according to an
LR state takes constant time. (Strictly speaking, the time taken for such a determination process
is proportional to the number of parties l.) The precision needed affects the number of iterations
required by an algorithm to find a numerical solution. It affects only the quality of the p-value
computed by a protocol, but not its validity. (For the expectation-maximization algorithm used,
see Theorem 4 of Ref. [167] and Sec. 5.3.2 of Chapter 5 for the effects of the precision parameters
in the simplified and full PBR protocols, respectively.) We assume that the precision parameters in
both protocols are set to be the same, and we do not account for the number of iterations required
to achieve the specified precision. Therefore, for the purpose of comparing the computational re-
sources required by the simplified and full PBR protocols, we focus on comparing the dimensions
D of the convex search spaces and the complexities C of evaluating the objective functions in the
optimization problems solved by the two protocols before each trial.
We first consider the simplified PBR protocol. Given a set of M Bell inequalities, this
protocol sets Rn = ωn · r, where the size of ωn is M , r is defined before Eq. (6.1), and ωn is chosen
to maximize the estimated confidence gain as in Eq. (6.3). Note that, in the right-hand side of
Eq. (6.3), the sum is taken over only the results x already observed in the previous trials.
For the maximization of Eq. (6.3), the dimension of the convex search space is M . The eval-
uation of the objective function can use the left-hand or right-hand side of Eq. (6.3), whichever has
fewer terms. Thus it involves a sum of at most min(n,K) terms where each term requires comput-
ing a convex combination of M Bell-function values. Hence, for updating the PBR Rn before the
(n+1)’th trial, the complexity of evaluating the objective function is CsPBR = O(min(nM,KM)) =
O(min(nM, (ds)^l M)), and the dimension of the search space is DsPBR = O(M). Therefore, even
when any of the configuration parameters l, s, or d is large, CsPBR and DsPBR remain independent
of these parameters. Consequently, the numbers of parties, settings, and outcomes are not limiting factors for
applying the simplified PBR protocol. In this sense, the simplified PBR protocol is efficient for any
experimental configuration.
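To make the per-trial update concrete, the weight optimization of the simplified PBR protocol can be sketched in Python. This is a schematic sketch, not the implementation used in this thesis: `estimated_gain` stands in for the estimated confidence gain of Eq. (6.3), taken here as the empirical average of log2 of the combined Bell-function value over the results observed so far, and for M = 2 Bell functions a one-dimensional grid search over the weight simplex replaces a generic convex optimizer. The helper names are hypothetical.

```python
import math
from collections import Counter

def estimated_gain(weights, counts, bell_fns):
    # Empirical average of log2 R(x) over the observed results, where
    # R(x) = sum_k weights[k] * r_k(x) is a convex combination of
    # standardized Bell functions r_k (r_k >= 0 and <r_k> <= 1 under LR).
    n = sum(counts.values())
    total = 0.0
    for x, c in counts.items():
        R = sum(w * r(x) for w, r in zip(weights, bell_fns))
        if R <= 0.0:
            return float("-inf")
        total += c * math.log2(R)
    return total / n

def update_pbr(counts, bell_fns, grid=101):
    # For M = 2 Bell functions, a 1-d grid search over the weight simplex
    # suffices; larger M calls for a generic convex optimizer.
    best_g, best_w = float("-inf"), None
    for i in range(grid):
        w = i / (grid - 1)
        g = estimated_gain((w, 1.0 - w), counts, bell_fns)
        if g > best_g:
            best_g, best_w = g, (w, 1.0 - w)
    return best_w
```

Here `counts` would tally the results of the previous trials, and the returned weights define the PBR Rn = ωn · r used for the next trial; including the trivial Bell function r = 1 among the bell_fns keeps R(x) positive.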
The full PBR protocol as studied in Chapter 5 computes Rn in two steps. First, the protocol
estimates the probability qn(x) of the result x to be observed at the next trial. This estimate can be
obtained in different ways. The simplest is to let qn(x) be the empirical frequency fn(x) of x over
the previous n trials. However, one can consider additional constraints such as the known joint-
setting distribution and no-signaling conditions. Thus, in Chapter 5 we suggested maximizing the
likelihood function Eq. (5.26), subject to these constraints, and we observed that this can improve
the quality of the p-value computed. Since this maximization is not a resource bottleneck, we do
not consider its complexity in the comparison. Second, we find the LR model pLR,n closest to the
estimated distribution qn by minimizing the KL divergence [168] from qn to an LR model p:

DKL(qn ‖ p) = ∑x qn(x) log2( qn(x)/p(x) ).   (6.7)
The full PBR protocol then sets Rn(xn+1) = qn(xn+1)/pLR,n(xn+1).
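The two steps can be sketched numerically. The following is a minimal sketch assuming numpy, taking the empirical frequencies for qn and using an EM-style multiplicative update for the mixture weights over LR states; the algorithm actually used follows Ref. [167], and `closest_lr_model` is a hypothetical helper name.

```python
import numpy as np

def closest_lr_model(q, lr_states, iters=1000):
    # Minimize D_KL(q || p) over mixtures p = sum_h lam[h] * lr_states[h]
    # via EM-style multiplicative updates of the mixture weights lam.
    # q: length-K probability vector (e.g., the empirical frequencies f_n);
    # lr_states: (H, K) array whose rows are the LR-state distributions.
    H = lr_states.shape[0]
    lam = np.full(H, 1.0 / H)
    for _ in range(iters):
        p = lam @ lr_states                    # current mixture, length K
        ratio = np.divide(q, p, out=np.zeros_like(q), where=(p > 0))
        lam = lam * (lr_states @ ratio)        # lam[h] *= sum_x q(x) p_h(x)/p(x)
        lam = lam / lam.sum()
    return lam, lam @ lr_states

# The PBR for the next trial is then R(x) = q[x] / p_lr[x].
```

The objective is convex in lam, and each multiplicative update cannot decrease the likelihood, which is what makes this fixed-point iteration usable as a stand-in for the expectation-maximization step.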
For the minimization of Eq. (6.7), the dimension of the convex search space is H. The
evaluation of the objective function involves a sum of K terms where each term requires computing
p(x) according to a convex combination of H LR states. Hence, for updating the PBR Rn before
the (n + 1)’th trial, the complexity of evaluating the objective function is CfPBR = O(KH) =
O(d^{l(s+1)} s^l), and the dimension of the search space is DfPBR = O(H) = O(d^{ls}). While CfPBR and
DfPBR are polynomial in d, they are exponential in each of l and s. Therefore, the full PBR protocol
is not efficient with respect to these configuration parameters.
Before applying the simplified PBR protocol, one chooses a relevant and preferably small
set of Bell inequalities. In many cases of interest, l, s, or d is large, and so is H = d^{sl}. For
example, in field-quadrature measurements d is fundamentally infinite. Hence, M , the number of
Bell inequalities used in the simplified PBR protocol, is in general much smaller than H, the number
of LR states considered in the full PBR protocol. The complexities show that for such cases, the
simplified PBR protocol is substantially less resource-intensive than the full PBR protocol.
6.2.2 Comparison of confidence-gain rates
We begin by comparing the confidence-gain rates achieved by different protocols for ex-
perimental configurations designed to violate the Collins-Gisin-Linden-Massar-Popescu (CGLMP)
inequality [58]. To test the CGLMP inequality, there are two parties, and each of them performs
one of two possible measurements with d outcomes at each trial. This is an example where the full
PBR protocol is impractical for large d.

Figure 6.1: Confidence-gain rates G, as functions of the number of outcomes d, in the test of the
CGLMP inequality 〈Id(X)〉 ≤ 2, for the martingale-based and simplified PBR protocols. Here, we
use the quantum state and measurement settings of Ref. [4], Eqs. (15) and (9), respectively.

For this example and the one below, we assume that at
each trial each party’s measurement setting is chosen uniformly randomly. The CGLMP inequality
can be written as 〈Id(X)〉 ≤ 2, where the function Id takes d different values. The gain rates Gmart
and GsPBR, achieved by the martingale-based and simplified PBR protocols, are shown in Fig. 6.1.
Here the simplified PBR protocol uses only the CGLMP inequality. This figure illustrates that
GsPBR is higher than Gmart when d > 2.
The optimal gain rate Sq is achieved by the full PBR protocol and can be computed as the
minimum KL divergence from the experimental probability distribution q to any LR model [25].
For the results of Fig. 6.1, we find that the gain rates GsPBR are numerically indistinguishable from
Sq when d ≤ 13. For the case d > 13, it is difficult to compute Sq due to the large dimension of
the probability space over all possible LR models. For the tests studied in Fig. 6.1, we conjecture
that GsPBR = Sq. In general we cannot guarantee that GsPBR is optimal.
Next, we compare the performance of the simplified PBR protocol when using different num-
bers of Bell inequalities. The experimental configuration considered is for a test of the Clauser-
Horne-Shimony-Holt (CHSH) inequality [15] using an unbalanced Bell state |ψ(θ)〉 = cos(θ)|00〉 +
sin(θ)|11〉.

Figure 6.2: Confidence-gain rates G, as functions of θ, in the test of LR with an unbalanced Bell
state |ψ(θ)〉. The measurement settings are chosen to maximize the violation of the CHSH
inequality (1.2) given the state |ψ(θ)〉. The gain rates achieved by the simplified PBR protocol
using the CHSH inequality are shown as circles (◦), while the gain rates by the same protocol
using the CHSH inequality together with no-signaling conditions are shown as crosses (+). The
gain rates achieved by the full PBR protocol are also shown.

There are many different ways of expressing the CHSH inequality, and they are equivalent
considering no-signaling and normalization conditions. For comparison, we consider the sim-
plified PBR protocol with the CHSH inequality (1.2) alone or in conjunction with additional,
seemingly trivial Bell inequalities such as those derived from no-signaling conditions. With Bell
functions corresponding to no-signaling conditions, the gain rates are improved, as shown in Fig. 6.2.
This improvement suggests that the gain rate achieved by the simplified PBR protocol depends on
the form of a Bell inequality used.
6.2.3 Comparison of protocols’ behavior for finite data
Here we consider the behavior of each protocol given a finite amount of experimental data.
We simulate the test of the CGLMP inequality 〈I3(X)〉 ≤ 2 [58] with the quantum state and
measurement settings of Ref. [4], Eqs. (15) and (9) (with d = 3), respectively. We assume that
at each trial each party’s measurement setting is chosen uniformly randomly. The protocols’ gain
rates are Gmart = 0.0565 and GsPBR = 0.0675, while the optimal gain rate Sq achieved by the full
PBR protocol is numerically indistinguishable from GsPBR. For computing GsPBR, the simplified
PBR protocol uses the standardized CGLMP inequality and the trivial Bell function r = 1.
Figure 6.3: An example of running log-p-values (−log2 pn) as functions of the number of trials n in
a test of the CGLMP inequality, for the martingale-based, simplified PBR, and full PBR protocols.
The dashed and solid lines are the asymptotic lines for log-p-values based on gain rates achieved
by the (full or simplified) PBR protocol and the martingale-based protocol, respectively.
Repetitions of this Monte Carlo simulation show similar behavior.
The results from 10,000 successive trials are recorded. Fig. 6.3 shows the (negative) log-p-
values computed for the first n results from a simulated sequence of trials as functions of n. The
asymptotic lines for log-p-values, given by the products of n and the respective gain rates achieved
by different protocols, are also shown in Fig. 6.3.
In our discussion so far, we have assumed that each PBR protocol updates the PBR before
each trial. In practice, the PBR is updated only for a block of trial results at a time. Specifically,
for the simulation shown in Fig. 6.3, we update the PBRs and log-p-values only after every block
of 154 successive trials. (See Sec. 5.3.1 of Chapter 5 for a discussion of the block-size
choice and related issues.) This block-size choice limits PBR computations to when enough new
information has been obtained, thereby reducing the resource cost. It also mitigates the offset of
the computed log-p-values from the asymptotic line. This offset is due to an initial transient where
the relevant features of the experimental distribution are being learned. The learning offset can be
removed if, before an experiment, we have a good estimate of the experimental results’ distribution.
Such an estimate could be based on (quantum or otherwise) theory or previous experiments.
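The block-wise accounting can be sketched as follows. This is a schematic sketch: `pbr_update` is a hypothetical callback standing for whichever PBR construction (simplified or full) is in use, and the p-value bound after n trials is min(1, 1/∏i Ri(xi)).

```python
import math

def running_log_pvalues(results, pbr_update, block=154):
    # Accumulate -log2 of the p-value bound min(1, 1 / prod_i R_i(x_i)),
    # updating the PBR only after each block of trials. pbr_update(history)
    # must return a function R with R(x) > 0 and <R(X)> <= 1 under every
    # LR model, constructed from the trial results seen so far.
    log2_T = 0.0                      # log2 of the running product of R_i
    history = []
    R = lambda x: 1.0                 # trivial PBR before any data
    out = []
    for i, x in enumerate(results):
        log2_T += math.log2(R(x))     # R was fixed before this trial
        history.append(x)
        if (i + 1) % block == 0:
            R = pbr_update(list(history))
        out.append(max(0.0, log2_T))  # -log2 of the p-value bound
    return out
```

Because R is always fixed before the trials it is applied to, the validity of the p-value bound is unaffected by the block size; only the quality of the bound changes.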
The PBR protocols provide better results than the martingale-based protocol. However,
the PBR log-p-values show learning offsets from the asymptotic line. Our results show that the
simplified PBR log-p-values have a smaller learning offset than the full PBR log-p-values in each
of 30 independent simulations performed. The reason is that the simplified PBR protocol needs to
infer a much smaller number of parameters for constructing the PBRs.
In the above example, the simplified PBR protocol uses only two Bell functions. Given a
prescient choice of Bell functions, this is sufficient for computing asymptotically optimal p-values.
But in general, more Bell functions are needed for computing a high-quality p-value. However, this
involves inferring more parameters and thus requires more trials before a good inference can be
obtained. As a result, the learning offset is expected to increase when using more Bell functions.
One way to mitigate this problem may be to increase the number of Bell functions used over time,
adding new Bell functions only when there are enough trials for reliable inference of the additional
parameters.
6.3 Extensions
To compute a p-value, the simplified PBR protocol uses a set of linear inequalities that are
satisfied by the predictions of a null hypothesis before each trial in an experiment. Besides tests of
LR, there are many other types of tests based on linear witnesses, such as tests for entanglement [82,
84] and system dimensionality above a given bound [169, 170]. In any test based on linear witnesses,
such a witness can be expressed as 〈W (X)〉 ≤ B, where W is a real-valued function and X is the
random variable from which a trial result x is sampled. The result x consists of all choices made
at each trial, such as choices of states and measurement settings, and the outcomes observed
under these choices. Here, we assume that the choices are made randomly according to a known
probability distribution at each trial, so that a witness 〈W (X)〉 ≤ B is satisfied before each trial
assuming the null hypothesis. As for Bell functions, if a witness function W is lower-bounded it can
be standardized. The simplified PBR protocol can then be applied with any set of standardized
witnesses, as we did for tests of LR.
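The standardization used later in this chapter, r(x) = (W(x) − bl)/(B − bl) for a witness function bounded below by bl, can be written as a one-line helper; the function name is ours, not from the text.

```python
def standardize(W, B, b_l):
    # Map a witness function W with pointwise lower bound b_l and
    # null-hypothesis bound <W(X)> <= B to r with r(x) >= 0 and <r(X)> <= 1.
    assert b_l < B
    return lambda x: (W(x) - b_l) / (B - b_l)
```

For the entanglement-witness function W′′ of the next example, with Bsep = 1 and bl = −3, this gives r = (W′′ + 3)/4.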
To explain the above idea, let us consider a test of entanglement. Specifically, let us consider
the verification of entanglement in the mixed two-qubit state
ρ(V) = V |ψ−〉〈ψ−| + (1 − V) I4/4,   (6.8)

where the pure state |ψ−〉 is the singlet state (|10〉 − |01〉)/√2, I4/4 is the completely mixed state of
two qubits, and the parameter V characterizes the visibility of the singlet state in an experiment.
The state in Eq. (6.8) is entangled if and only if V > 1/3, which can be verified by measuring the
entanglement-witness operator [81, 171]
W = I4 − 2|ψ−〉〈ψ−|. (6.9)
It is easy to verify that Tr(Wρsep) ≥ 0 for any separable state ρsep and Tr(Wρ(V )) < 0 if and
only if V > 1/3. However, it is difficult to directly measure this witness operator in practice. The
reason is that the projector |ψ−〉〈ψ−| in this witness operator is a nonlocal operator, which is not
straightforward to measure. Therefore, for the experimental implementation we need to decompose
the witness operator in Eq. (6.9) into operators that can be measured locally. There are different
ways to decompose an entanglement-witness operator into local operators. As shown in Ref. [171],
the decomposition of Eq. (6.9) that involves the least number of joint measurement settings is
W = (1/2)(I4 + σx ⊗ σx + σy ⊗ σy + σz ⊗ σz),   (6.10)

where σx = ( 0 1 ; 1 0 ), σy = ( 0 −i ; i 0 ) with i as the imaginary unit, and σz = ( 1 0 ; 0 −1 )
are the Pauli matrices.
To verify the entanglement in the state Eq. (6.8) by the simplified PBR protocol, let the
operator W ′ = −σx⊗σx−σy⊗σy−σz⊗σz. Since the witness operator in Eq. (6.10) is W = (I4−
W ′)/2, the operator W ′ satisfies that Tr(W ′ρsep) ≤ 1 for any separable state ρsep and Tr(W ′ρ(V )) >
1 for the entangled state ρ(V ) with V > 1/3 in Eq. (6.8). To measure the operator W ′ in an
experiment, at a trial we choose the joint setting σx ⊗ σx, σy ⊗ σy, or σz ⊗ σz uniformly randomly
and observe the outcome (aj , bj) = (1, 1), (1,−1), (−1, 1), or (−1,−1) under the chosen setting
σj ⊗ σj , where 1 and −1 are the two eigenvalues of the Pauli matrix σj , j = x, y, or z. We then
denote the measurement-setting choice and outcome at a trial by X and define the entanglement-
witness function W′′(x) = −ajbj/pj = −3ajbj, where x = (j, aj, bj) includes the setting choice j and
the observed outcome (aj , bj) at a trial, and pj = 1/3 is the probability of choosing the joint setting
σj ⊗ σj at a trial. Here the function W ′′ is chosen so that its expectation 〈W ′′(X)〉 is equal to
Tr(W′ρ) for any state ρ. Hence, W′′ takes the values ±3, and the expectation according to any
separable state satisfies 〈W′′(X)〉 ≤ 1; while according to the state ρ(V) in Eq. (6.8), the probability
that W′′ = 3 or W′′ = −3 is (1 + V)/2 or (1 − V)/2, respectively, so 〈W′′(X)〉 = 3V is greater than
the separable bound Bsep = 1
if and only if V > 1/3. To compute the confidence-gain rate for rejecting separable-state models for
measurement results according to the state ρ(V ) with V > 1/3, we need to standardize the witness
function W ′′ by r(x) = (W ′′(x) − bl)/(Bsep − bl) with bl = −3 so that the standardized witness
function r satisfies that r(x) ≥ 0 and 〈r(X)〉 ≤ 1 for all separable states. Then, the confidence-gain
rate achieved by the simplified PBR protocol using the standardized entanglement-witness function
r and the trivial witness function r′ = 1 is
GsPBR = max_{0≤ω≤1} [ ((1 + V)/2) log2(1 + ω/2) + ((1 − V)/2) log2(1 − ω) ]
      = ((1 + V)/2) log2( 3(1 + V)/4 ) + ((1 − V)/2) log2( 3(1 − V)/2 ).   (6.11)
It is easy to verify that the gain rate GsPBR is positive if and only if V > 1/3 and that GsPBR
increases with V . Moreover, for this example the gain rate GsPBR is optimal, as shown in the
following.
Given the state ρ(V ) in Eq. (6.8) and the setting choice j (j = x, y, or z) at an experimental
trial, the probabilities of observing various outcomes are
ProbQM(aj = 1, bj = 1) = ProbQM(aj = −1, bj = −1) = (1 − V)/4, and
ProbQM(aj = 1, bj = −1) = ProbQM(aj = −1, bj = 1) = (1 + V)/4.   (6.12)
Then, we can compute the KL divergence from q, i.e., the experimental probability distribution of
setting choices and outcomes according to the entangled state ρ(V ) with V > 1/3, to p, i.e., the
distribution of setting choices and outcomes according to the separable state ρ(V = 1/3). It turns
out that the KL divergence DKL(q ‖ p) = ((1 + V)/2) log2( 3(1 + V)/4 ) + ((1 − V)/2) log2( 3(1 − V)/2 ), which is the
same as the confidence-gain rate in Eq. (6.11) achieved by the simplified PBR protocol. According
to Ref. [25], the optimal gain rate Sq in the test of entanglement is given by the minimum KL
divergence from the experimental probability distribution q to any distribution according to a
separable state. That is, Sq ≤ DKL(q ‖ p) = GsPBR. Since the gain rate achieved by the simplified
PBR protocol is valid, GsPBR ≤ Sq. Combining the above two points, we can see that GsPBR = Sq
for any entangled state ρ(V ) with V > 1/3.
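As a numerical check of Eq. (6.11) (not part of the derivation above), one can maximize the expected log2 of the PBR R = ω r + (1 − ω) directly: since r = 3/2 with probability (1 + V)/2 and r = 0 with probability (1 − V)/2 under ρ(V), the objective is concave in ω and a ternary search suffices. The closed form is attained at ω = (3V − 1)/2.

```python
import math

def gain(omega, V):
    # Expected log2 of R = omega * r + (1 - omega) under rho(V):
    # r = 3/2 with probability (1+V)/2 and r = 0 with probability (1-V)/2.
    return (0.5 * (1 + V) * math.log2(1 + omega / 2)
            + 0.5 * (1 - V) * math.log2(1 - omega))

def gain_numeric(V, tol=1e-12):
    # Ternary search over omega in [0, 1); gain(., V) is concave there.
    lo, hi = 0.0, 1.0 - 1e-12
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if gain(m1, V) < gain(m2, V):
            lo = m1
        else:
            hi = m2
    return gain(0.5 * (lo + hi), V)

def gain_closed(V):
    # Closed form of Eq. (6.11), attained at omega = (3V - 1) / 2.
    return (0.5 * (1 + V) * math.log2(3 * (1 + V) / 4)
            + 0.5 * (1 - V) * math.log2(3 * (1 - V) / 2))
```

The numerical maximum agrees with the closed form, and both vanish at V = 1/3, where the state ceases to be entangled.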
In the above example, we can also apply the martingale-based protocol. According to the
discussion in Sec. 6.1, since the witness function W′′ can take only one of two different values at an
experimental trial, the gain rate achieved by the martingale-based protocol is the same as GsPBR
in Eq. (6.11). However, for other entanglement witnesses, the simplified PBR protocol may have
an advantage over the martingale-based protocol. For example, let us consider the entanglement
verification of the noisy Greenberger-Horne-Zeilinger (GHZ) state
ρ = V |ψGHZ〉〈ψGHZ| + (1 − V) I8/8,   (6.13)

where the state |ψGHZ〉 = (|000〉 + |111〉)/√2 is the ideal GHZ state, and I8/8 is the completely mixed
state of three qubits. One can verify the entanglement of the state Eq. (6.13) by measuring the
entanglement-witness operator [81, 172]
W = I − 2|ψGHZ〉〈ψGHZ|
  = (1/4)(3I2 ⊗ I2 ⊗ I2 − I2 ⊗ σz ⊗ σz − σz ⊗ I2 ⊗ σz − σz ⊗ σz ⊗ I2 − 2σx ⊗ σx ⊗ σx
  + √2 σπ/4 ⊗ σπ/4 ⊗ σπ/4 + √2 σ−π/4 ⊗ σ−π/4 ⊗ σ−π/4),   (6.14)

where σ±π/4 = (σx ± σy)/√2, and I2 is the identity matrix of size 2 × 2. This decomposition of
the witness operator involves the least number of joint measurement settings. According to the
definition of an entanglement-witness operator, if Tr(Wρ) < 0 then the state ρ is entangled.
To apply the simplified PBR or martingale-based protocol, we define the operator W ′ =
I2 ⊗ σz ⊗ σz + σz ⊗ I2 ⊗ σz + σz ⊗ σz ⊗ I2 + 2σx ⊗ σx ⊗ σx − √2 σπ/4 ⊗ σπ/4 ⊗ σπ/4 − √2 σ−π/4 ⊗ σ−π/4 ⊗ σ−π/4.
Since the witness operator in Eq. (6.14) is W = (3I2⊗I2⊗I2−W ′)/4, the operator W ′ satisfies that
Tr(W ′ρsep) ≤ 3 for all separable states ρsep. So, if Tr(W ′ρ) > 3 the state ρ is entangled. We index
the joint settings I2 ⊗ σz ⊗ σz, σz ⊗ I2 ⊗ σz, σz ⊗ σz ⊗ I2, σx ⊗ σx ⊗ σx, σπ/4 ⊗ σπ/4 ⊗ σπ/4,
and σ−π/4 ⊗ σ−π/4 ⊗ σ−π/4 by j = 1, 2, 3, 4, 5, and 6, respectively. To measure the operator
W ′ in an experiment, at a trial we choose the joint setting indexed by j randomly and observe
the outcome (aj , bj , cj) under the chosen setting, where aj , bj , or cj is ±1. We then denote the
measurement-setting choice and outcome at a trial by X and define the entanglement-witness
function W ′′(x) = ajbjcjwj/pj , where x = (j, aj , bj , cj) includes the setting choice j and the
observed outcome (aj , bj , cj) at a trial, wj is a constant depending on the setting choice j, and pj
is the probability of choosing the setting indexed by j at a trial. To ensure that the expectation
〈W ′′(X)〉 is equal to Tr(W ′ρ) for any state ρ, the constants wj are chosen as follows: wj = 1 if
j = 1, 2, or 3, wj = −√2 if j = 5 or 6, and w4 = 2. Moreover, assuming the joint setting σz ⊗ σz ⊗ σz,
σx⊗ σx⊗ σx, σπ/4⊗ σπ/4⊗ σπ/4, or σ−π/4⊗ σ−π/4⊗ σ−π/4 is chosen uniformly randomly at a trial,
then the function W′′ can take six different values ±12, ±8, or ±4√2. (Note that, to measure
I2 ⊗ σz ⊗ σz, σz ⊗ I2 ⊗ σz, or σz ⊗ σz ⊗ I2, we use the measurement setup for the joint setting
σz ⊗ σz ⊗ σz and uniformly randomly choose which of the above three measurements to perform.)
Hence, according to the discussion in Sec. 6.1, the confidence-gain rate achieved by the simplified
PBR protocol is higher than that achieved by the martingale-based protocol.
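As a numerical aside (assuming numpy; this computation is not carried out in the text), one can build the operator W′ explicitly and evaluate it on the noisy GHZ state; one finds Tr(W′ρ) = 7V, so the witness detects entanglement exactly when V > 3/7.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sp = (sx + sy) / np.sqrt(2)          # sigma_{+pi/4}
sm = (sx - sy) / np.sqrt(2)          # sigma_{-pi/4}

def kron3(a, b, c):
    return np.kron(np.kron(a, b), c)

# The operator W' built from the decomposition of Eq. (6.14)
Wp = (kron3(I2, sz, sz) + kron3(sz, I2, sz) + kron3(sz, sz, I2)
      + 2 * kron3(sx, sx, sx)
      - np.sqrt(2) * kron3(sp, sp, sp)
      - np.sqrt(2) * kron3(sm, sm, sm))

ghz = np.zeros(8, dtype=complex)
ghz[0] = ghz[7] = 1 / np.sqrt(2)     # (|000> + |111>) / sqrt(2)

def rho(V):
    # Noisy GHZ state of Eq. (6.13)
    return V * np.outer(ghz, ghz.conj()) + (1 - V) * np.eye(8) / 8

def witness_value(V):
    return float(np.real(np.trace(Wp @ rho(V))))
```

All seven terms of W′ are traceless, so the completely mixed component contributes nothing, and the GHZ component contributes 1 per term after the weights: witness_value(V) = 7V, exceeding the separable bound 3 for V > 3/7 ≈ 0.43.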
In general, the simplified PBR protocol can also be applied to verify the type of entanglement
of a multipartite state. Moreover, as in a test of LR, the simplified PBR protocol can be applied with
a set of entanglement witnesses, so that this protocol typically behaves better than the martingale-
based protocol in practice. The above strategies are limited to linear witnesses. As is well known,
the set of separable states is convex but not a polytope, so a nonlinear entanglement witness can
detect more entangled states than a linear one. Whether or not the simplified PBR protocol can be
applied with nonlinear witnesses is an interesting open problem and deserves further investigation
in the future.
Chapter 7
Conclusions and future directions
7.1 Conclusion
The degree of violation of local realism (LR) in an experimental test is usually expressed in
terms of the number of standard deviations of violation of a Bell inequality. This quantity cannot,
however, be used to obtain valid p-values for rejecting LR by conventional means. It also fails
to quantitatively compare the success of different experimental tests of LR and does not account
for stability issues or memory effects in experiments. In Chapter 5, we solved these problems
by providing a method—the prediction-based-ratio (PBR) protocol—for determining valid and
asymptotically tight p-value upper bounds directly from the sequence of measurement settings and
outcomes in an experiment. The PBR protocol does not rely on a predetermined Bell inequality,
adapts to the actual experimental configuration, and is asymptotically optimal for independent and
identically distributed results sampled from the experimental probability distribution. It therefore
provides a standardized measure of success for experimental tests of LR. While the protocol remains
valid if the state and setting parameters drift during an experiment, how well it performs depends
on the nature of the drifts and how the protocol takes them into account.
Our simulations showed that it is practical to apply the PBR protocol to data from typical
experimental configurations, and that the running p-value upper bounds can be used for tweaking
an experiment in progress to find the experimentally accessible configuration that provides the
highest violation of LR. However, the PBR protocol is not efficient with respect to the number of
parties per test, settings per party, and outcomes per setting. In Chapter 6, we simplified the PBR
protocol and showed that the simplified PBR protocol is efficient. The simplified PBR protocol uses
a set of Bell inequalities chosen based on the estimated probability distribution of setting choices
and outcomes before an experiment. The behavior and implementation complexity of the simplified
PBR protocol depend on the choice and number of Bell inequalities considered. Compared with the
previously known and efficient protocol, the martingale-based protocol, the simplified PBR protocol
provides better and even optimal results given a relevant set of Bell inequalities. In Chapter 6, we
also briefly discussed how to apply the simplified PBR protocol to any test with linear witnesses,
such as tests of entanglement or system dimensionality.
The p-value for rejecting LR decays exponentially with the number of data points in the
asymptotic limit. The optimal decay rate is given by the minimum Kullback-Leibler divergence from
the experimental probability distribution to all distributions according to LR. In Chapter 4, we
studied the optimal decay rates of p-values in tests of LR using polarized photon pairs and inefficient
detectors or counters. Specifically, we studied the minimum detection efficiency or experimental
visibility required for achieving any given optimal decay rate.
7.2 Future work
Many quantum information tasks, such as device-independent quantum key distribution and
randomness expansion or amplification, have been proposed recently. In these tasks, before sharing
private information, the spatially separated parties need to verify violation of LR using a finite set
of data. It is desirable to apply the PBR protocol to these tasks, so that reliable experimental
realizations of these tasks can be guaranteed. Also, a detailed study of the application of the
simplified PBR protocol to tests of entanglement or system dimensionality is needed. In addition,
we described a systematic and efficient method for deriving Bell inequalities in Chapter 2. Whether
or not all Bell inequalities can be derived using this method is an interesting open problem and
deserves further investigation in the future.
Bibliography
[1] Tamas Vertesi, Stefano Pironio, and Nicolas Brunner. Closing the detection loophole in Bell experiments using qudits. Phys. Rev. Lett., 104:060401, 2010.
[2] Marek Zukowski, Dagomir Kaszlikowski, and Emilio Santos. Irrelevance of photon events distinguishability in a class of Bell experiments. Phys. Rev. A, 60:R2614–R2617, Oct 1999.
[3] S. Pironio, A. Acin, S. Massar, A. Boyer de la Giroday, D. N. Matsukevich, P. Maunz, S. Olmschenk, D. Hayes, L. Luo, T. A. Manning, and C. Monroe. Random numbers certified by Bell’s theorem. Nature, 464:1021, 2010.
[4] Jing-Ling Chen, Chunfeng Wu, L. C. Kwek, C. H. Oh, and Mo-Lin Ge. Violating Bell inequalities maximally for two d-dimensional systems. Phys. Rev. A, 74:032106, Sep 2006.
[5] J. S. Bell. On the Einstein Podolsky Rosen paradox. Physics, 1:195–200, 1964.
[6] S. J. Freedman and J. F. Clauser. Experimental test of local hidden-variable theories. Phys. Rev. Lett., 28:938–941, 1972.
[7] A. Peres. All the Bell inequalities. Found. Phys., 29:589–614, 1999.
[8] R. F. Werner and M. M. Wolf. Bell inequalities and entanglement. Quant. Inf. Comp., 1:1–25, 2001.
[9] M. Genovese. Research on hidden variable theories: A review of recent progresses. Phys. Rep., 413:319–396, 2005.
[10] R. Horodecki, P. Horodecki, M. Horodecki, and K. Horodecki. Quantum entanglement. Rev. Mod. Phys., 81:865–942, 2009.
[11] Jonathan Barrett, Lucien Hardy, and Adrian Kent. No signaling and quantum key distribution. Phys. Rev. Lett., 95:010503, Jun 2005.
[12] L. Masanes, R. Renner, A. Winter, J. Barrett, and M. Christandl. Security of key distribution from causality constraints. 2009. arXiv:quant-ph/0606049.
[13] Lluis Masanes. Universally composable privacy amplification from causality constraints. Phys. Rev. Lett., 102:140501, Apr 2009.
[14] S. Popescu and D. Rohrlich. Quantum nonlocality as an axiom. Found. Phys., 24:379, 1994.
[15] J. F. Clauser, M. A. Horne, A. Shimony, and R. A. Holt. Proposed experiment to test local hidden-variable theories. Phys. Rev. Lett., 23:880–884, 1969.
[16] J. F. Clauser and M. A. Horne. Experimental consequences of objective local theories. Phys. Rev. D, 10:526–535, 1974.
[17] Yanbao Zhang, Scott Glancy, and Emanuel Knill. Asymptotically optimal data analysis for rejecting local realism. Phys. Rev. A, 84:062118, Dec 2011.
[18] Richard D. Gill. Accardi contra Bell (cum mundi): The impossible coupling. In Mathematical Statistics and Applications: Festschrift for Constance van Eeden. Eds: M. Moore, S. Froda, and C. Leger. IMS Lecture Notes – Monograph Series, volume 42, pages 133–154. Institute of Mathematical Statistics, Beachwood, Ohio, 2003. Also available as arXiv:quant-ph/0110137.
[19] Richard D. Gill. Time, finite statistics, and Bell’s fifth position. In Proc. of “Foundations of Probability and Physics - 2”, Ser. Math. Modelling in Phys., Engin., and Cogn. Sc., volume 5, pages 179–206. Vaxjo Univ. Press, 2003.
[20] Luigi Accardi and Massimo Regoli. Locality and Bell’s inequality. arXiv:quant-ph/0007005.
[21] Luigi Accardi and Massimo Regoli. Non-locality and quantum theory: new experimental evidence. arXiv:quant-ph/0007019.
[22] Luigi Accardi and Massimo Regoli. The EPR correlations and the chameleon effect. arXiv:quant-ph/0110086.
[23] Jonathan Barrett, Daniel Collins, Lucien Hardy, Adrian Kent, and Sandu Popescu. Quantum nonlocality, Bell inequalities, and the memory loophole. Phys. Rev. A, 66(4):042111, Oct 2002.
[24] W. van Dam, R. D. Gill, and P. D. Grunwald. The statistical strength of nonlocality proofs. IEEE Trans. Inf. Theory, 51:2812–2835, 2005.
[25] R. R. Bahadur. An optimal property of the likelihood ratio statistic. In Proc. Fifth Berkeley Symp. on Math. Statist. and Prob., volume 1, pages 13–26. Univ. of Calif. Press, 1967.
[26] Yanbao Zhang, Emanuel Knill, and Scott Glancy. Statistical strength of experiments to reject local realism with photon pairs and inefficient detectors. Phys. Rev. A, 81:032117, Mar 2010.
[27] P. M. Pearle. Hidden-variable example based upon data rejection. Phys. Rev. D, 2:1418–1425, 1970.
[28] P. H. Eberhard. Background level and counter efficiencies required for a loophole-free Einstein-Podolsky-Rosen experiment. Phys. Rev. A, 47:R747–R750, 1993.
[29] Yanbao Zhang, Scott Glancy, and Emanuel Knill. Efficient quantification of experimental evidence against local realism. arXiv:1303.7464.
[30] A. Einstein, B. Podolsky, and N. Rosen. Can quantum-mechanical description of physical reality be considered complete? Phys. Rev., 47:777, 1935.
[31] J. S. Bell. Speakable and Unspeakable in Quantum Mechanics. Cambridge University Press, Cambridge, 2004. pp. 139–158.
[32] Arthur Fine. Hidden variables, joint probability, and the Bell inequalities. Phys. Rev. Lett., 48:291–295, Feb 1982.
[33] R. F. Werner and M. M. Wolf. All-multipartite Bell-correlation inequalities for two dichotomic observables per site. Phys. Rev. A, 64:032112, Aug 2001.
[34] Itamar Pitowsky. Geometry of quantum correlations. Phys. Rev. A, 77:062109, Jun 2008.
[35] B. S. Cirel’son. Quantum generalizations of Bell’s inequality. Lett. Math. Phys., 4:93, 1980.
[36] L. Masanes. Necessary and sufficient condition for quantum-generated correlations. arXiv:quant-ph/0309137, 2003.
[37] Gunter M. Ziegler. Lectures on Polytopes. Springer-Verlag, New York, 1995.
[38] Jonathan Barrett, Noah Linden, Serge Massar, Stefano Pironio, Sandu Popescu, and David Roberts. Nonlocal correlations as an information-theoretic resource. Phys. Rev. A, 71:022101, Feb 2005.
[39] I. Pitowsky. Quantum Probability–Quantum Logic. Springer, Berlin, 1989.
[40] Itamar Pitowsky and Karl Svozil. Optimal tests of quantum nonlocality. Phys. Rev. A, 64:014102, Jun 2001.
[41] Daniel Collins and Nicolas Gisin. A relevant two qubit Bell inequality inequivalent to the CHSH inequality. J. Phys. A: Math. Gen., 37:1775, 2004.
[42] David Avis, Hiroshi Imai, and Tsuyoshi Ito. On the relationship between convex bodies related to correlation experiments with dichotomic observables. J. Phys. A: Math. Gen., 39:11283, 2006.
[43] Lluis Masanes. Tight Bell inequality for d-outcome measurements correlations. Quantum Information & Computation, 3:345, 2002.
[44] Marek Zukowski and Caslav Brukner. Bell’s theorem for general N-qubit states. Phys. Rev. Lett., 88:210401, May 2002.
[45] Jean-Daniel Bancal, Nicolas Gisin, and Stefano Pironio. Looking for symmetric Bell inequalities. J. Phys. A: Math. Theor., 43:385303, 2010.
[46] N. Gisin. Bell’s inequality holds for all non-product states. Phys. Lett. A, 154:201–202, 1991.
[47] N. Gisin and A. Peres. Maximal violation of Bell’s inequality for arbitrarily large spin. Phys. Lett. A, 162:15, 1992.
[48] S. Popescu and D. Rohrlich. Generic quantum nonlocality. Phys. Lett. A, 166:293–297, 1992.
[49] T. Vertesi. More efficient Bell inequalities for Werner states. Phys. Rev. A, 78:032112, Sep 2008.
[50] N. Brunner, N. Gisin, V. Scarani, and C. Simon. Detection loophole in asymmetric Bell experiments. Phys. Rev. Lett., 98:220403, 2007.
[51] A. Cabello and J.-A. Larsson. Minimum detection efficiency for a loophole-free atom-photon Bell experiment. Phys. Rev. Lett., 98:220402, 2007.
[52] Nicolas Brunner and Nicolas Gisin. Partial list of bipartite Bell inequalities with four binary settings. Phys. Lett. A, 372:3162, 2008.
[53] Karoly F. Pal and Tamas Vertesi. Quantum bounds on Bell inequalities. Phys. Rev. A, 79:022120, Feb 2009.
[54] Samuel L. Braunstein and Carlton M. Caves. Wringing out better Bell inequalities. Annals of Physics, 202:22, 1990.
[55] N. Gisin. Bell inequality for arbitrary many settings of the analyzers. Phys. Lett. A, 260:1, 1999.
[56] Dagomir Kaszlikowski and Marek Zukowski. Bell theorem involving all possible local measurements. Phys. Rev. A, 61:022114, Jan 2000.
[57] N. Gisin. Bell inequalities: many questions, a few answers. arXiv:quant-ph/0702021.
[58] Daniel Collins, Nicolas Gisin, Noah Linden, Serge Massar, and Sandu Popescu. Bell inequalities for arbitrarily high-dimensional systems. Phys. Rev. Lett., 88:040404, Jan 2002.
[59] Li-Bin Fu. General correlation functions of the Clauser-Horne-Shimony-Holt inequality for arbitrarily high-dimensional systems. Phys. Rev. Lett., 92:130404, Mar 2004.
[60] N. David Mermin. Extreme quantum entanglement in a superposition of macroscopically distinct states. Phys. Rev. Lett., 65:1838–1840, Oct 1990.
[61] M. Ardehali. Bell inequalities with a magnitude of violation that grows exponentially with the number of particles. Phys. Rev. A, 46:5375–5378, Nov 1992.
[62] A. V. Belinskii and D. N. Klyshko. Interference of light and Bell’s theorem. Phys. Usp., 36:653, 1993.
[63] N. Gisin and H. Bechmann-Pasquinucci. Bell inequality, Bell states and maximally entangled states for n qubits. Phys. Lett. A, 246:1–6, 1998.
[64] Adan Cabello. Bell’s inequality for n spin-s particles. Phys. Rev. A, 65:062105, Jun 2002.
[65] Wiesław Laskowski, Tomasz Paterek, Marek Zukowski, and Caslav Brukner. Tight multipartite Bell’s inequalities involving many measurement settings. Phys. Rev. Lett., 93:200401, Nov 2004.
[66] W. Son, Jinhyoung Lee, and M. S. Kim. Generic Bell inequalities for multipartite arbitrary dimensional systems. Phys. Rev. Lett., 96:060406, Feb 2006.
[67] Koji Nagata, Wiesław Laskowski, and Tomasz Paterek. Bell inequality with an arbitrary number of settings and its applications. Phys. Rev. A, 74:062109, Dec 2006.
[68] Elena R. Loubenets. Multipartite Bell-type inequalities for arbitrary numbers of settings and outcomes per site. J. Phys. A: Math. Theor., 41:445304, 2008.
[69] Marek Zukowski and Dagomir Kaszlikowski. Critical visibility for n-particle Greenberger-Horne-Zeilinger correlations to violate local realism. Phys. Rev. A, 56:R1682–R1685, Sep 1997.
[70] Stephanie Wehner. Tsirelson bounds for generalized Clauser-Horne-Shimony-Holt inequalities. Phys. Rev. A, 73:022110, Feb 2006.
[71] A. Peres. Bayesian analysis of Bell inequalities. Fortsch. Phys., 48:531, 2000.
[72] Jonathan Barrett, Adrian Kent, and Stefano Pironio. Maximally nonlocal and monogamous quantum correlations. Phys. Rev. Lett., 97:170409, Oct 2006.
[73] Roger Colbeck and Renato Renner. Hidden variable models for quantum theory cannot have any local part. Phys. Rev. Lett., 101:050403, Aug 2008.
[74] A. Cabello, J.-A. Larsson, and D. Rodriguez. Minimum detection efficiency required for a loophole-free violation of the Braunstein-Caves chained Bell inequalities. Phys. Rev. A, 79:062109, Jun 2009.
[75] Antonio Acin, Richard Gill, and Nicolas Gisin. Optimal Bell tests do not require maximallyentangled states. Phys. Rev. Lett., 95:210402, Nov 2005.
[76] Serge Massar, Stefano Pironio, Jeremie Roland, and Bernard Gisin. Bell inequalities resistantto detector inefficiency. Phys. Rev. A, 66:052112, Nov 2002.
[77] A. Acin, T. Durt, N. Gisin, and J. I. Latorre. Quantum nonlocality in two three-level systems.Phys. Rev. A, 65:052325, May 2002.
[78] Stefan Zohren and Richard D. Gill. Maximal violation of the Collins-Gisin-Linden-Massar-Popescu inequality for infinite dimensional states. Phys. Rev. Lett., 100:120406, Mar 2008.
[79] Marek Zukowski, Caslav Brukner, Wies law Laskowski, and Marcin Wiesniak. Do all pure en-tangled states violate Bell’s inequalities for correlation functions? Phys. Rev. Lett., 88:210402,May 2002.
[80] Samson Abramsky and Lucien Hardy. Logical Bell inequalities. Phys. Rev. A, 85:062114,Jun 2012.
[81] Otfried Guhne and Geza Toth. Entanglement detection. Physics Reports, 474:1–75, 2009.
[82] Barbara M. Terhal. Bell inequalities and the separability criterion. Phys. Lett. A, 271:319,2000.
[83] Reinhard F. Werner. Quantum states with Einstein-Podolsky-Rosen correlations admittinga hidden-variable model. Phys. Rev. A, 40:4277–4281, Oct 1989.
[84] Michal Horodecki, Pawel Horodecki, and Ryszard Horodecki. Separability of mixed states:necessary and sufficient conditions. Phys. Lett. A, 223:1, 1996.
[85] Otfried Guhne and Norbert Lutkenhaus. Nonlinear entanglement witnesses. Phys. Rev. Lett.,96:170502, May 2006.
105
[86] Asher Peres. Separability criterion for density matrices. Phys. Rev. Lett., 77:1413–1415, Aug1996.
[87] Michal Horodecki, Pawel Horodecki, and Ryszard Horodecki. Mixed-state entanglement anddistillation: Is there a “bound” entanglement in nature? Phys. Rev. Lett., 80:5239–5242,Jun 1998.
[88] George Svetlichny. Distinguishing three-body from two-body nonseparability by a Bell-typeinequality. Phys. Rev. D, 35:3066–3069, May 1987.
[89] Michael Seevinck and George Svetlichny. Bell-type inequalities for partial separability inN -particle systems and quantum mechanical violations. Phys. Rev. Lett., 89:060401, Jul2002.
[90] Daniel Collins, Nicolas Gisin, Sandu Popescu, David Roberts, and Valerio Scarani. Bell-typeinequalities to detect true n-body nonseparability. Phys. Rev. Lett., 88:170405, Apr 2002.
[91] S. M. Roy. Multipartite separability inequalities exponentially stronger than local realityinequalities. Phys. Rev. Lett., 94:010402, Jan 2005.
[92] Wies law Laskowski and Marek Zukowski. Detection of N -particle entanglement with gener-alized Bell inequalities. Phys. Rev. A, 72:062112, Dec 2005.
[93] Michael Seevinck and Jos Uffink. Partial separability and entanglement criteria for multiqubitquantum states. Phys. Rev. A, 78:032101, Sep 2008.
[94] Jos Uffink and Michael Seevinck. Strengthened Bell inequalities for orthogonal spin directions.Phys. Lett. A, 372:1205, 2008.
[95] Jean-Daniel Bancal, Nicolas Gisin, Yeong-Cherng Liang, and Stefano Pironio. Device-independent witnesses of genuine multipartite entanglement. Phys. Rev. Lett., 106:250404,2011.
[96] A. A. Methot and V. Scarani. An anomaly of nonlocality. Quantum Information andComputation, 7:157–170, 2007.
[97] Sixia Yu, Qing Chen, Chengjie Zhang, C. H. Lai, and C. H. Oh. All entangled pure statesviolate a single Bell’s inequality. Phys. Rev. Lett., 109:120402, Sep 2012.
[98] Tamas Vertesi and Nicolas Brunner. Quantum nonlocality does not imply entanglementdistillability. arXiv:1106.4850v2.
[99] Tobias Moroder, Jean-Daniel Bancal, Yeong-Cherng Liang, Martin Hofmann, and OtfriedGuhne. Device-independent entanglement quantification and related applications. Phys.Rev. Lett., 111:030501, Jul 2013.
[100] E. Schrodinger. Discussion of probability relations between separated systems. Proc.Cambridge Philos. Soc., 31:555, 1935.
[101] H. M. Wiseman, S. J. Jones, and A. C. Doherty. Steering, entanglement, nonlocality, and theEinstein-Podolsky-Rosen paradox. Phys. Rev. Lett., 98:140402, Apr 2007.
106
[102] E. G. Cavalcanti, S. J. Jones, H. M. Wiseman, and M. D. Reid. Experimental criteria forsteering and the Einstein-Podolsky-Rosen paradox. Phys. Rev. A, 80:032112, Sep 2009.
[103] John S. Bell. On the problem of hidden variables in quantum mechanics. Rev. Mod. Phys.,38:447–452, Jul 1966.
[104] S. Kochen and E.P. Specker. The problem of hidden variables in quantum mechanics. Journalof Mathematics and Mechanics, 17:5987, 1967.
[105] Matthias Kleinmann, Costantino Budroni, Jan-Ake Larsson, Otfried Guhne, and Adan Ca-bello. Optimal inequalities for state-independent contextuality. Phys. Rev. Lett., 109:250402,2012.
[106] Artur K. Ekert. Quantum cryptography based on Bell’s theorem. Phys. Rev. Lett., 67:661–663, Aug 1991.
[107] Antonio Acin, Nicolas Gisin, and Lluis Masanes. From Bell’s theorem to secure quantum keydistribution. Phys. Rev. Lett., 97:120405, Sep 2006.
[108] Antonio Acin, Nicolas Brunner, Nicolas Gisin, Serge Massar, Stefano Pironio, and ValerioScarani. Device-independent security of quantum cryptography against collective attacks.Phys. Rev. Lett., 98:230501, Jun 2007.
[109] Lluis Masanes, Stefano Pironio, and Antonio Acin. Secure device-independent quantum keydistribution with causally independent measurement devices. Nat. Commun., 2:238, 2011.
[110] Roger Colbeck and Adrian Kent. Private randomness expansion with untrusted devices. J.Phys. A: Math. Theor., 44:095305, 2011.
[111] Roger Colbeck and Renato Renner. Free randomness can be amplified. Nature Physics,8:450453, 2012.
[112] Rodrigo Gallego, Lluis Masanes, Gonzalo Torre, Chirag Dhara, Leandro Aolita, and AntonioAcin. Full randomness from arbitrarily deterministic events. arXiv:1210.6514.
[113] G. Weihs, T. Jennewein, C. Simon, H. Weinfurter, and A. Zeilinger. Violation of Bell’sinequality under strict Einstein locality conditions. Phys. Rev. Lett., 81:5039–5043, 1998.
[114] A. Aspect, J. Dalibard, and G. Roger. Experimental test of Bell’s inequalities using time-varying analyzers. Phys. Rev. Lett., 49:1804–1807, 1982.
[115] Thomas Scheidl, Rupert Ursin, Johannes Kofler, Sven Ramelow, Xiao-Song Ma, ThomasHerbst, Lothar Ratschbacher, Alessandro Fedrizzi, Nathan K. Langford, Thomas Jennewein,and Anton Zeilinger. Violation of local realism with freedom of choice. Proc. Natl. Acad.Sci., 107:19708, 2010.
[116] Y. H. Shih and C. O. Alley. New type of Einstein-Podolsky-Rosen-Bohm experiment usingpairs of light quanta produced by optical parametric down conversion. Phys. Rev. Lett.,61:2921–2924, 1988.
[117] Z. Y. Ou and L. Mandel. Violation of Bell’s inequality and classical probability in a two-photon correlation experiment. Phys. Rev. Lett., 61:50–53, 1988.
107
[118] A. Garg and N. D. Mermin. Detector inefficiencies in the Einstein-Podolsky-Rosen experi-ment. Phys. Rev. D, 35:3831–3835, 1987.
[119] Jan-Ake Larsson. Necessary and sufficient detector-efficiency conditions for the Greenberger-Horne-Zeilinger paradox. Phys. Rev. A, 57:R3145–R3149, May 1998.
[120] Jan-Ake Larsson and Jason Semitecolos. Strict detector-efficiency bounds for n-site Clauser-Horne inequalities. Phys. Rev. A, 63:022117, Jan 2001.
[121] Adan Cabello, David Rodriguez, and Ignacio Villanueva. Necessary and sufficient detectionefficiency for the Mermin inequalities. Phys. Rev. Lett., 101:120402, Sep 2008.
[122] M. A. Rowe, D. Kielpinski, V. Meyer, C. A. Sackett, W. M. Itano, C. Monroe, and D. J.Wineland. Experimental violation of a Bell’s inequality with efficient detection. Nature,409:791–794, 2001.
[123] D. N. Matsukevich, P. Maunz, D. L. Moehring, S. Olmschenk, and C. Monroe. Bell inequalityviolation with two remote atomic qubits. Phys. Rev. Lett., 100:150404, Apr 2008.
[124] Julian Hofmann, Michael Krug, Norbert Ortegel, Lea Gerard, Markus Weber, WenjaminRosenfeld, and Harald Weinfurter. Heralded entanglement between widely separated atoms.Science, 337:72–75, 2012.
[125] R.D. Gill, G. Weihs, A. Zeilinger, and M. Zukowski. No time loophole in Bell’s theorem: thehess-philipp model is non-local. PNAS, 99:14632, 2002.
[126] Fabian Steinlechner, Pavel Trojek, Marc Jofre, Henning Weier, Daniel Perez, Thomas Jen-newein, Rupert Ursin, John Rarity, Morgan W. Mitchell, Juan P. Torres, Harald Weinfurter,and Valerio Pruneri. A high-brightness source of polarization-entangled photons optimizedfor applications in free space. Opt. Exp., 20:9640, 2012.
[127] Onur Kuzucu and Franco N. C. Wong. Pulsed sagnac source of narrow-band polarization-entangled photons. Phys. Rev. A, 77:032314, Mar 2008.
[128] A. E. Lita, A. J. Miller, and S. W. Nam. Counting near-infrared single-photons with 95%efficiency. Opt. Express, 16:3032–3040, 2008.
[129] A. E. Lita, B. Calkins, L. A. Pellochoud, A. J. Miller, and S. W. Nam. High-efficiencyphoton-number-resolving detectors based on Hafnium transition-edge sensors. AIP Conf.Proc., 1185:351, 2009.
[130] A. Lamas-Linares, B. Calkins, N. A. Tomlin, T. Gerrits, A. E. Lita, J. Beyer, R. P. Mirin,and S. W. Nam. Nanosecond-scale timing jitter in transition edge sensors at telecom andvisible wavelengths. ArXiv:1209.5721.
[131] F. Marsili, V. B. Verma, J. A. Stern, S. Harrington, A. E. Lita, T. Gerrits, I. Vayshenker,B. Baek, M. D. Shaw, R. P. Mirin, and S. W. Nam. Detecting single infrared photons with93% system efficiency. ArXiv:1209.5774.
[132] Marissa Giustina, Alexandra Mech, Sven Ramelow, Bernhard Wittmann, Johannes Kofler,Jorn Beyer, Adriana Lita, Brice Calkins, Thomas Gerrits, Sae Woo Nam, Rupert Ursin, andAnton Zeilinger. Bell violation using entangled photons without the fair-sampling assumption.Nature, 497:227–230, 2013.
108
[133] F. Henkel, M. Krug, J. Hofmann, W. Rosenfeld, M. Weber, and H. Weinfurter. Highly efficientstate-selective submicrosecond photoionization detection of single atoms. Phys. Rev. Lett.,105:253001, Dec 2010.
[134] B. B. Blinov, D. L. Moehring, L.-M. Duan, and C. Monroe. Observation of entanglementbetween a single trapped atom and a single photon. Nature, 428:153, 2004.
[135] Jurgen Volz, Markus Weber, Daniel Schlenk, Wenjamin Rosenfeld, Johannes Vrana, KarenSaucke, Christian Kurtsiefer, and Harald Weinfurter. Observation of entanglement of a singlephoton with a trapped atom. Phys. Rev. Lett., 96:030404, Jan 2006.
[136] Christoph Simon and William T. M. Irvine. Robust long-distance entanglement and aloophole-free Bell test with ions and photons. Phys. Rev. Lett., 91:110405, Sep 2003.
[137] N. Sangouard, J.-D. Bancal, N. Gisin, W. Rosenfeld, P. Sekatski, M. Weber, and H. Wein-furter. Loophole-free Bell test with one atom and less than one photon on average. Phys.Rev. A, 84:052122, Nov 2011.
[138] Hyunchul Nha and H. J. Carmichael. Proposed test of quantum nonlocality for continuousvariables. Phys. Rev. Lett., 93:020401, Jul 2004.
[139] R. Garcia-Patron, J. Fiurasek, N. J. Cerf, J. Wenger, R. Tualle-Brouri, and Ph. Grangier.Proposal for a loophole-free Bell test using homodyne detection. Phys. Rev. Lett., 93:130409,Sep 2004.
[140] Daniel Cavalcanti, Nicolas Brunner, Paul Skrzypczyk, Alejo Salles, and Valerio Scarani. Largeviolation of Bell inequalities using both particle and wave measurements. Phys. Rev. A,84:022105, Aug 2011.
[141] M.T. Quintino, M. Araujo, D. Cavalcanti, M.F. Santos, and M.T. Cunha. Maximal violationsand efficiency requirements for Bell tests with photodetection and homodyne measurements.Journal of Physics A: Mathematical and Theoretical, 45(21):215308, 2012.
[142] Mateus Araujo, Marco Tulio Quintino, Daniel Cavalcanti, Marcelo Fran ca Santos, Adan Ca-bello, and Marcelo Terra Cunha. Tests of Bell inequality with arbitrarily low photodetectionefficiency and homodyne measurements. Phys. Rev. A, 86:030101, Sep 2012.
[143] D. L. Moehring, M. J. Madsen, B. B. Blinov, and C. Monroe. Experimental Bell inequalityviolation with an atom and a photon. Phys. Rev. Lett., 93:090410, Aug 2004.
[144] W. Tittel, J. Brendel, N. Gisin, and H. Zbinden. Long-distance Bell-type tests using energy-time entangled photons. Phys. Rev. A, 59:4150–4163, 1999.
[145] T. E. Kiess, Y. H. Shih, A. V. Sergienko, and C. O. Alley. Einstein-Podolsky-Rosen-Bohmexperiment using pairs of light quanta produced by type-II parametric down-conversion. Phys.Rev. Lett., 71:3893–3897, 1993.
[146] B. Lounis and M. Orrit. Single-photon sources. Rep. Prog. Phys., 68:1129–1179, 2005.
[147] M. Oxborrow and A. G. Sinclair. Single-photon sources. Contemp. Phys., 46:173–206, 2005.
[148] P. G. Kwiat, P. H. Eberhard, A. M. Steinberg, and R. Y. Chiao. Proposal for a loophole-freeBell inequality experiment. Phys. Rev. A, 49:3209–3220, 1994.
109
[149] L. De Caro and A. Garuccio. Reliability of Bell-inequality measurements using polarizationcorrelations in parametric-down-conversion photon sources. Phys. Rev. A, 50:R2803–R2805,1994.
[150] S. Popescu, L. Hardy, and M. Zukowski. Revisiting Bell’s theorem for a class of down-conversion experiments. Phys. Rev. A, 56:R4353–R4356, 1997.
[151] Jun Shao. Mathematical Statistics. Springer, New York, 2nd edition, 2003.
[152] Y. Vardi and D. Lee. From image deblurring to optimal investments: Maximum likelihoodsolutions for positive linear inverse problems. J. Royal Stat. Soc. B, 55:569–612, 1993.
[153] A. G. White, D. F. V. James, P. H. Eberhard, and P. G. Kwiat. Nonmaximally entangledstates: Production, characterization, and utilization. Phys. Rev. Lett., 83:3103–3107, 1999.
[154] G. Brida, M. Genovese, C. Novero, and E. Predazzi. New experimental test of Bell inequalitiesby the use of a non-maximally entangled photon state. Phys. Lett. A, 268:12–16, 2000.
[155] V. Scarani and N. Gisin. Spectral decomposition of Bell’s operators for qubits. J. Phys. A:Math. Gen., 34:6043–6053, 2001.
[156] W. Wasilewski, A. I. Lvovsky, K. Banaszek, and C. Radzewicz. Pulsed squeezed light: Si-multaneous squeezing of multiple modes. Phys. Rev. A, 73:063819, 2006.
[157] A. I. Lvovsky, W. Wasilewski, and K. Banaszek. Decomposing a pulsed optical parametricamplifer into independent squeezers. J. Mod. Optics, 54:721–733, 2007.
[158] C.-E. Bardyn, T. C. H. Liew, S. Massar, M. McKague, and V. Scarani. Device-independentstate estimation based on Bell’s inequalities. Phys. Rev. A, 80:062327, 2009.
[159] Rafael Rabelo, Melvyn Ho, Daniel Cavalcanti, Nicolas Brunner, and Valerio Scarani. Device-independent certification of entangled measurements. Phys. Rev. Lett., 107:050502, 2011.
[160] W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of theAmerican Statistical Association, 58:13, 1963.
[161] K. Azuma. Weighted sums of certain dependent random variables. TohoKu MathematicalJournal, 19:357, 1967.
[162] C. McDiarmid. On the method of bounded differences. In Surveys in Combinatorics, volume141 of London Math. Soc. Lecture Notes, pages 148–188. Cambridge Univ. Press, 1989.
[163] Rick Durrett. Probability: Theory and Examples. Cambridge, 2010. Also see the optionalstopping theorem at http://en.wikipedia.org/wiki/Optional_stopping_theorem.
[164] A. Acin, N. Gisin, and L. Masanes. From Bell’s theorem to secure quantum key distribution.Phys. Rev. Lett., 97:120405, 2006.
[165] E. S. Ristad. A natural law of succession. 1995. arXiv:cmp-lg/9508012.
[166] Robin Blume-Kohout. Optimal, reliable estimation of quantum states. New J. Phys.,12:043034, 2010.
110
[167] T. M. Cover. An algorithm for maximizing expected log investment return. IEEE Transactionson Information Theory, 30:369, 1984.
[168] S. Kullback and R. A. Leibler. On information and sufficiency. Ann. Math. Statist., 22:79,1951.
[169] Nicolas Brunner, Stefano Pironio, Antonio Acin, Nicolas Gisin, Andre Allan Methot, andValerio Scarani. Testing the dimension of Hilbert spaces. Phys. Rev. Lett., 100:210503, May2008.
[170] Rodrigo Gallego, Nicolas Brunner, Christopher Hadley, and Antonio Acın. Device-independent tests of classical and quantum dimensions. Phys. Rev. Lett., 105:230501, Nov2010.
[171] O. Guhne, P. Hyllus, D. Brub, A. Ekert, M. Lewenstein, C. Macchiavello, and A. Sanpera.Experimental detection of entanglement via witness operators and local measurements. J.Mod. Opt., 50:1079, 2003.
[172] O. Guhne and P. Hyllus. Investigating three qubit entanglement with local measurements.Int. J. Theor. Phys., 42:1001, 2003.
Appendix A
User guide of the local realism analysis engine
A.1 Overview
The purpose of the Local Realism Analysis Engine (LRE) is to perform online and offline
analysis of measurements obtained with randomly chosen measurement settings at two well-
separated locations. The LRE determines the current and overall violation of local realism (LR).
If LR is violated, it provides a measure of this violation in terms of the log-p-value for violation.
It is asymptotically optimal for independent measurements of identical states. If LR is not vio-
lated, it can provide feedback on the degree of nonviolation. This is accomplished by comparing
the observed measurements to those that a reference or goal state would produce. The user must
specify the goal state’s probability distribution for measurement settings and outcomes that should
be expected if an experiment is set up as intended. The motivation and theory for the LRE are
described in Chapter 5.
This document gives a user-level specification of the LRE. The implementation of the LRE
provided with this guide can be used with Octave or Matlab. First we describe the state of the
LRE. Then we define the functions that are used to initialize, modify and update the LRE state.
Next we describe functions useful for supporting the LRE, such as functions to compute refer-
ence measurement distributions and calculate anticipated LR violations for standard experimental
situations. Finally we give functions that can be used to simulate an experiment with the LRE.
Examples of LRE usage are given for reference. For a quick start, one can go directly to Sec. A.5.
(The code is available online with the published paper at http://arxiv.org/abs/1108.2468.)
A.2 LRE state
The LRE state is characterized by an experimental configuration, analysis and display vari-
ables, and the saved statistics of experimental data so far. Once the variables are initialized, the
engine updates the state each time a new block of data is received. The state variables are main-
tained internally. From a user’s perspective, they serve to formally define the state and behavior of
the LRE and are therefore needed for fully understanding LRE dynamics. The state variables are
controlled through functions and are not intended to be accessed or changed directly. We define
them below as part of the LRE specification. In the implementation provided with this guide, they
are accessible as global variables. Future implementations may choose to hide them, so the normal
user should not rely on global access and use only the interface functions specified later.
A.2.1 Experimental configuration
For the purposes of a test of LR, an experimental configuration is characterized by the number
of measurement settings, their probability distribution, and the number of possible outcomes for a
setting. These configuration variables are set when the LRE is initialized. Note that each block of
data provided to the LRE must conform to the values of these parameters. The variables used by
the LRE are:
n_settings_a, n_settings_b: The number of measurement settings available to Alice and
Bob, respectively.
n_outcomes_a, n_outcomes_b: The number of possible measurement outcomes for each
setting of Alice and Bob, respectively.
p_settings: The joint probability distribution with which Alice and Bob choose their
settings. This choice is assumed to be independent of the state being measured.
A.2.2 Analysis and display variables
For the purposes of analysis, a “data point” is the setting and outcome combination that
is used and observed in one trial of an experiment. A “data block” consists of a number of data
points.
The main task of the LRE on receiving a new block of data is to update the log-p-values
test_lp_total, test_lp and goal_lp. (We define the log-p-value to be -log2(p_n), where p_n is the
p-value upper bound as computed after n trials according to our algorithm.) While test_lp_total
is the overall log-p-value reported after a completed experiment, the log-p-values test_lp and
goal_lp help for monitoring the experimental progress. A positive value of test_lp is sufficient
for an LR violation. If the experiment is stable, it is also asymptotically necessary. Negative
values of test_lp are not informative. The value of goal_lp reflects how well we are doing in
approaching a specified goal distribution for settings and outcomes and can be used to tweak an
experiment when there is no violation of LR. (When there is no violation, test_lp is near zero and
insensitive to tweaks.) A goal of tweaking the experiment can be to increase goal_lp. Its use is
explained in the example in Sec. A.5.2.
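The log-p-value bookkeeping described above can be illustrated with a short Python sketch (the LRE itself is Octave/Matlab; the function name `log_p_value` is hypothetical, and the `truncate` flag is included only for illustration). Each trial contributes log2 of a probability ratio R_k, and the log-p-value is the running sum; as noted below, the LRE does not truncate the underlying p-value bound at 1, so the sum may go negative.

```python
import math

def log_p_value(ratios, truncate=False):
    """Log-p-value -log2(p_n), where the p-value upper bound p_n is the
    inverse of the product of the per-trial probability ratios R_k.
    With truncate=True the p-value bound is capped at 1, so the
    log-p-value cannot go negative; the LRE leaves it untruncated so
    that negative values can guide experimental tweaking."""
    lp = sum(math.log2(r) for r in ratios)
    return max(lp, 0.0) if truncate else lp
```

For example, two trials with ratio 2 give a log-p-value of 2, i.e. a p-value bound of 1/4.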
In order for these log-p-values to be useful during an experiment, they need to adapt to
changes in the experimental state. The amount of data that plays a role is controlled via data “half-
lives”. Roughly, these half-lives control how many of the last data points are used in calculating
and updating the log-p-values. We use half-lives instead of data windows so that we can avoid
storing old data. As a result, the contribution of data points decays exponentially as more data is
acquired.
The relevant part of the LRE state is described by the following variables.
goal_frequencies: This describes the frequencies of settings and outcomes for a goal
state. To be useful, it should be possible to realistically approach them in an experiment,
and they should violate LR. We provide functions to compute them for some typical
experimental configurations. (Regarding negative log-p-values: we allow them for the
purpose of tweaking an experiment. That is, we set the p-value upper bound equal to
(∏_{k=1}^{n} R_{k-1}(x_k))^{-1} and do not truncate it at 1; see Chapter 5.) It is not a
good idea (and probably not realistic) to have any of
the goal state’s frequencies be zero. If the corresponding settings and outcomes occur in
an experiment, then goal_lp can become −∞, and, without intervention, will stay there.
When goal_frequencies is set or updated, the LRE computes the following:
goal_sd: The statistical strength of the goal frequencies for rejection of LR. If an
experimental state achieves the goal frequencies, then goal_lp should approach the
value of goal_sd multiplied by the effective number of trials according to
goal_lp_weight_snumber (defined below).
goal_ratios: The computation of goal_lp requires probability ratios as described
in Chapter 5. This variable contains the needed ratios.
data_half_life: The LRE maintains cumulative setting and outcome frequencies. Each
data point contributes to the cumulative frequencies with a weight that decays over time.
The idea is that a data point’s weight should be 1/2 of the most recent point’s weight after
data_half_life more points have been acquired. The calculation takes into account that
there are only finitely many data points so far but ensures that the ratio of weights for
data points in successive blocks is the one expected in the asymptotic case. For simplicity,
data points in the same block get the same weight. The details of the calculation are given
in Sec. A.6. To help with this calculation the LRE computes and updates the following:
data_weight: The weight used for the most recent data point contributing to the
stored setting and outcome frequencies, test_frequencies.
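The asymptotic half-life weighting can be sketched in Python (illustration only; `decayed_weights` and `effective_number` are hypothetical names, the finite-data boundary correction of Sec. A.6 is omitted, and taking the effective number of points to be the sum of relative weights is one natural reading of weight_snumber, not a quote from the text):

```python
def decayed_weights(n, half_life):
    """Relative weights of n data points, most recent last with weight 1;
    a point's weight is halved after half_life newer points arrive."""
    decay = 0.5 ** (1.0 / half_life)
    return [decay ** (n - 1 - i) for i in range(n)]

def effective_number(weights):
    """Effective number of contributing data points (cf. weight_snumber):
    here taken as the sum of the relative weights."""
    return sum(weights)
```

With half_life = 1, three points get relative weights 1/4, 1/2, and 1; with an infinite half-life every point would keep weight 1, recovering an ordinary cumulative count.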
lp_half_life: The effective half-life for computing test_lp and goal_lp can be set
independently of data_half_life. The interpretation of lp_half_life differs from that
of data_half_life. This is because the log-p-values are weighted sums of logarithms of
probability ratios such as goal_ratios, but not averages. In order to interpret the log-p-
values as intended, the weights have a maximum value of 1; see Sec. A.6 for an explanation.
To calculate updated weights, the LRE computes and updates the following:
test_lp_weight_snumber, goal_lp_weight_snumber: Effective numbers of data points
contributing to the computation of test_lp and goal_lp, respectively. These num-
bers are computed based on lp_half_life and the numbers of data points that have
contributed to the respective log-p-value calculations. The log-p-values test_lp and
goal_lp are updated accordingly.
test_lp_data_weight, goal_lp_data_weight: The weights of the most recent data
points according to lp_half_life for updating test_lp_weight_snumber and
goal_lp_weight_snumber, respectively.
test_lp_tolerance: The log-p-values for LR rejection based on the data (e.g., test_lp)
may be underestimated by a maximum of test_lp_tolerance per data point. The under-
estimate accounts for the possibility that the computation of optimal local realistic (LR)
models may not reach the exact optima after the stopping criteria are satisfied. The value
of test_lp_tolerance must be significantly smaller than the achieved log-p-value per data
point violation in an experiment. Thus, if 10^3 trials are needed before a violation is
apparent, test_lp_tolerance should be much less than 10^-3. Otherwise log-p-values may
accumulate negative values. The default value of test_lp_tolerance is 10^-6. Setting it
significantly higher may speed up updates. See Sec. 5.3.2 of Chapter 5 for the details.
A.2.3 Data dependent variables
test_frequencies: The cumulative, weighted frequencies of experimental settings and
outcomes.
data_weight: The weight of the most recent data point contributing to test_frequencies.
num_exps: The total number of trials since the LRE was last reset.
weight_snumber: The effective number of data points contributing to test_frequencies.
pred_ratios: The prediction-based probability ratios that will be used to update test_lp
and test_lp_total when the next block of data arrives. They are determined as explained
in Chapter 5. The estimated probability distribution in the numerator is derived from
test_frequencies by a maximum likelihood method to enforce the setting-distribution
and (if desired) no-signaling constraints. Then, to avoid the possibility of getting stuck
with log-p-values of −∞, the distribution is modified by mixing in the setting-conditional
uniform distribution, with weight 1/(1+weight_snumber).
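The mixing step can be sketched as follows (Python for illustration; the function name is hypothetical). For one fixed setting pair, the estimated outcome distribution is blended with the uniform one using mixing weight 1/(1+weight_snumber), which keeps every outcome probability strictly positive:

```python
def mix_with_uniform(p_outcomes, weight_snumber):
    """Mix an estimated outcome distribution (for one setting pair) with
    the setting-conditional uniform distribution, using mixing weight
    1/(1 + weight_snumber). No probability can then be zero, so the
    probability ratios cannot drive a log-p-value to -inf."""
    eps = 1.0 / (1.0 + weight_snumber)
    k = len(p_outcomes)
    return [(1.0 - eps) * p + eps / k for p in p_outcomes]
```

As more data accumulate, weight_snumber grows and the uniform admixture fades, so the mixed distribution converges to the maximum likelihood estimate.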
lr_use_cm: This option variable controls whether to enforce no-signaling (also called
“consistent marginals”) constraints when computing pred_ratios. Its default value
is 0 (false). Turning this option on can slow down updates but helps to reduce the
log-p-value offset caused by the learning transient, when the number of data points is
still small.
test_sd: The statistical strength of rejection of LR of the estimated probability distribu-
tion used to generate pred_ratios.
test_lp, goal_lp: The current log-p-values as described above. These are weighted ac-
cording to lp_half_life.
test_lp_weight_snumber, goal_lp_weight_snumber: Effective numbers of data points
contributing to the computation of test_lp and goal_lp, respectively.
test_lp_total: The total log-p-value for LR rejection since the last reset. Its value is the
one that test_lp would have if lp_half_life were infinity. This log-p-value should be
reported as the overall log-p-value when analyzing data from a completed experiment.
test_lp_v, goal_lp_v and test_lp_total_v: Estimated variances of test_lp, goal_lp,
and test_lp_total, respectively. When reporting test_lp_total, one can give the square root of
test_lp_total_v as its standard error when quoting it as a quantitative measure of success
of an experiment. The other two variances can be used to assess the expected fluctuations
in the corresponding log-p-values when tweaking an experiment.
test_sd2, goal_sd2: The predicted variances of the log-p-value increments at the
next trial, calculated based on pred_ratios and goal_ratios respectively. These
are internal variables needed to update the estimated variances.
A.3 LRE interface
The specification of the LRE interface functions given here includes explicit instructions for
calculating nominally inaccessible state variables and other internal parameters. The calculations
need not be performed exactly as described here, as long as the specified behavior is preserved.
> function lr_init(n_settings_a, n_settings_b, n_outcomes_a, n_outcomes_b,
p_settings)
Initialize the LRE. The arguments are
n_settings_a, n_settings_b: Positive integers giving the number of measurement
settings of Alice and Bob, respectively.
n_outcomes_a, n_outcomes_b: Positive integers giving the number of outcomes of
each setting of Alice and Bob, respectively.
p_settings: A column vector of dimension n_settings_a * n_settings_b giving
the joint probability distribution for the settings used by Alice and Bob. The prob-
ability that Alice and Bob measure the k’th and l’th setting respectively is given by
the k+(l-1)*n_settings_a’th entry of the vector. This argument is optional and
defaults to the uniform distribution.
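The flat indexing convention for p_settings (Alice's index varies fastest) and the uniform default can be sketched in Python (hypothetical helper names; the LRE itself is Octave/Matlab):

```python
def setting_index(k, l, n_settings_a):
    """1-based position in p_settings of the entry for Alice's k'th and
    Bob's l'th setting, following the k + (l-1)*n_settings_a convention."""
    return k + (l - 1) * n_settings_a

def uniform_p_settings(n_settings_a, n_settings_b):
    """Default joint setting distribution: uniform over all setting pairs."""
    n = n_settings_a * n_settings_b
    return [1.0 / n] * n
```

For two settings per side, the pair (k=2, l=3) would sit at position 2 + 2*2 = 6, and the default distribution assigns probability 1/4 to each of the four setting pairs.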
After lr_init is called, the state of the LRE satisfies the following:
• The experimental configuration variables are set according to the arguments.
• goal_frequencies, goal_sd, goal_sd2, test_frequencies, test_sd, and test_sd2
are set to an undefined value (the empty vector []).
• data_half_life and lp_half_life are set to Inf.
• num_exps, weight_snumber, goal_lp_weight_snumber, test_lp_weight_snumber,
data_weight, test_lp_data_weight, goal_lp_data_weight, goal_lp, test_lp,
test_lp_total, goal_lp_v, test_lp_v, and test_lp_total_v are set to 0.
• The ratios in goal_ratios and pred_ratios are set to 1.
> function lr_set_tolerance(tolerance)
Set the variable test_lp_tolerance to tolerance. This can be changed any time. It
should be much less than the predicted violation test_sd, when this is clearly positive.
> function lr_set_use_cm(t_or_f)
Set the variable lr_use_cm to 1 (true) or 0 (false), which determines whether no-signaling
constraints are used in optimizations. The argument is boolean.
> function lr_prime(frequencies_or_block, num)
Prime the LRE with initial frequencies. The arguments are
frequencies_or_block: The frequencies of settings and outcomes of a block of data,
or the block of data itself, as specified below.
num: If the first argument gives the frequencies, then this argument gives the number
of data points that contribute to the frequencies. The argument num is optional. If it
is given, the first argument must be a frequency array, not a block of data.
A block of data consists of a num by 4 matrix. Each row contains a specific data point,
specified by four non-negative integers in the following order: Alice’s setting, Alice’s mea-
surement outcome, Bob’s setting, Bob’s measurement outcome. The integers must be in
the appropriate range. For example, Alice’s settings must be between 1 and n_settings_a,
and her outcomes must be between 0 and n_outcomes_a-1.
Frequencies are entered as an array of dimension n_settings_a * n_settings_b by
n_outcomes_a * n_outcomes_b. The entry indexed by (k+(l-1)*n_settings_a,
1+r+s*n_outcomes_a) is the frequency with which Alice’s (Bob’s) outcome is r (s) for the
setting indexed by k (l). The frequencies are normalized so that sum(sum(frequencies)) = 1.
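The two input formats are related as follows. This sketch (ours, not part of the LRE) accumulates a block of data into a normalized frequency array using the indexing convention above:

```matlab
% Sketch: convert a block of data into the frequency-array format.
n_settings_a = 2; n_settings_b = 2; n_outcomes_a = 3; n_outcomes_b = 3;
block = [1 0 2 1;                    % each row: Alice's setting, Alice's
         2 1 1 0;                    % outcome, Bob's setting, Bob's outcome
         1 2 2 2];
freqs = zeros(n_settings_a * n_settings_b, n_outcomes_a * n_outcomes_b);
for i = 1:size(block, 1)
  k = block(i, 1); r = block(i, 2); l = block(i, 3); s = block(i, 4);
  freqs(k + (l-1)*n_settings_a, 1 + r + s*n_outcomes_a) = ...
      freqs(k + (l-1)*n_settings_a, 1 + r + s*n_outcomes_a) + 1;
end
freqs = freqs / size(block, 1);      % normalize: sum(sum(freqs)) == 1
```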
The function lr_prime assumes that the LRE has been initialized. It performs the following
actions:
• The LRE state is reset via lr_reset() (specified below).
• Let bf be the setting and outcome frequencies computed (if necessary) from the input,
and let n be the number of data points that contribute to these frequencies.
• Perform the following assignments:
num_exps = n;
data_weight = 1/n;
weight_snumber = n;
test_frequencies = bf;
• Use test_frequencies to estimate the probabilities fbf of future settings and out-
comes. Our implementation is equivalent to computing fbf in three steps. The first is
to modify bf so that it has the correct setting distribution. The modification is equiv-
alent to a maximum likelihood estimate with the setting distribution as a constraint.
If lr_use_cm is 1 (true), the next step is to obtain the maximum likelihood estimate
subject to no-signaling constraints. The last step adjusts the estimate by mixing in the
setting-conditional uniform distribution with weight 1/(1+weight_snumber). Future
implementations may perform this estimate differently.
• Let lf be the optimal LR frequencies witnessing the statistical strength of fbf. Com-
pute the statistical strength test_sd of fbf for rejection of LR according to test_sd =
sum(sum(fbf .* log2(fbf./lf))). The predicted variance is computed as test_sd2
= sum(sum(fbf .* log2(fbf./lf).^2)) - test_sd^2.
• Set pred_ratios = fbf./lf. We multiply pred_ratios by a factor slightly smaller
than 1 to account for not having found the exact optimal LR frequencies due to the
optimization stopping criterion test_lp_tolerance, see Sec. 5.3.2 of Chapter 5 for
details.
• If goal_frequencies is defined, compute the initial values of goal_lp and goal_lp_v,
and set goal_lp_data_weight = 1/n and goal_lp_weight_snumber = n. The initial
values of goal_lp and goal_lp_v are given by sum(sum(n * bf .* log2(goal_ratios)))
and n * goal_sd2, respectively. See function lr_set_goal below for how goal_ratios
and the initial goal_sd2 are calculated. Finally, update goal_sd2 by setting goal_sd2
= sum(sum(fbf .* log2(goal_ratios).^2 )) - sum(sum(fbf .* log2(goal_ratios)))^2.
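For a toy pair of distributions, the statistical-strength computation described above amounts to the following sketch (the values of fbf and lf are invented for illustration, not the result of an actual optimization):

```matlab
% Sketch: statistical strength and predicted variance from estimated
% frequencies fbf and optimal LR frequencies lf (toy values).
fbf = [0.30 0.20; 0.20 0.30];
lf  = [0.25 0.25; 0.25 0.25];
r = log2(fbf ./ lf);                          % log-probability ratios
test_sd  = sum(sum(fbf .* r));                % statistical strength
test_sd2 = sum(sum(fbf .* r.^2)) - test_sd^2; % predicted variance
```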
> function lr_update(frequencies_or_block, num)
Update the LRE state according to new data. The arguments are as for the function
lr_prime. If test_frequencies is undefined, lr_prime(frequencies_or_block, num)
is called. Otherwise the following actions are taken:
• Let bf be the setting and outcome frequencies computed (if necessary) from the input,
and let n be the number of data points that contribute to these frequencies.
• Compute the weight ndw of the contribution of each new data point to test_frequencies.
After computing ndw, test_frequencies is updated according to
test_frequencies = (1 - n*ndw) * test_frequencies + n*ndw * bf;
To compute ndw, solve the equation
(1 - n*ndw) * data_weight / ndw = 2^(-n/data_half_life);
The formula is explained in Sec. A.6. For constant block size n and no change in
data_half_life, this ensures that the weight of the contribution of each data point
in the (data_half_life/n)’th last block to test_frequencies is 1/2 of the weight
of the most recent data point.
• Compute the new effective number of data points ndwsn as follows:
ndwsn = ((1 - n*ndw)^2 / weight_snumber + n*ndw^2)^-1;
See Sec. A.6 for an explanation.
• To update test_lp do the following:
+ Calculate the log-p-value increment lpi with respect to pred_ratios for the
block of data:
lpi = sum(sum(n * bf .* log2(pred_ratios)));
test_lp_total = test_lp_total + lpi;
test_lp_total_v = test_lp_total_v + test_sd2 * n;
+ Compute tlp_ndw and tlp_ndwsn by following the steps used to compute ndw and
ndwsn, using test_lp_data_weight, test_lp_weight_snumber, and lp_half_life
instead of data_weight, weight_snumber, and data_half_life, respectively.
+ Obtain tlpw by solving
tlp_ndwsn = (1 - tlpw) * test_lp_weight_snumber + n;
(If test_lp_weight_snumber is zero, set tlpw = 1.)
+ Set test_lp = (1 - tlpw) * test_lp + lpi. This ensures that the weight
of each data-point’s contribution to the log-p-value is at most 1, necessary for
interpreting test_lp as a valid log-p-value, see Sec. A.6. It also ensures that the
sum of the weights is tlp_ndwsn, so that tlp_ndwsn is the effective number of
contributing data points consistent with the value of lp_half_life.
+ Set test_lp_v = (1 - tlpw)^2 * test_lp_v + test_sd2 * n.
• Update test_sd, test_sd2 and pred_ratios as explained in the description of the
function lr_prime.
• If goal_frequencies is defined, first calculate the log-p-value increment lpi_g from
goal_ratios, using the method for calculating lpi from pred_ratios. Then up-
date goal_lp and goal_lp_v by following the steps used to update test_lp and
test_lp_v, respectively. After that, update goal_sd2 as in the function lr_prime.
• Complete the update by setting
num_exps = num_exps + n;
data_weight = ndw; weight_snumber = ndwsn;
test_lp_data_weight = tlp_ndw;
test_lp_weight_snumber = tlp_ndwsn;
If goal_frequencies is defined, update goal_lp_data_weight and
goal_lp_weight_snumber similarly.
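The weight equations above have simple closed forms; the following sketch (variable names as in the specification, but the code and the example prior state are ours) performs one update step:

```matlab
% Sketch: one weight-update step for a block of n new data points.
n = 200; data_half_life = Inf;
data_weight = 1/1000; weight_snumber = 1000;   % example prior state
% Solve (1 - n*ndw) * data_weight / ndw = 2^(-n/data_half_life):
c = 2^(-n/data_half_life);
ndw = data_weight / (n*data_weight + c);
% New effective number of data points:
ndwsn = 1 / ((1 - n*ndw)^2 / weight_snumber + n*ndw^2);
% With data_half_life = Inf, this gives ndw = 1/1200 and ndwsn = 1200,
% i.e. no decay: all 1000 + 200 points contribute fully.
```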
> function lr_std_analysis(all_trials, verbose)
Run a full analysis of a complete data set using recommended parameters. The argument
all_trials has the same form as a block of data. verbose is a boolean variable 1 or
0 indicating whether to print progress information during the analysis. This argument
is optional and defaults to 1. The function returns the value of test_lp_total and the
square root of test_lp_total_v, which is the estimated standard error of test_lp_total
with respect to its mean for repetitions of the same experiment.
Let num be the number of trials as indicated by the number of rows in all_trials. Let
so_num be the number of possible setting and outcome combinations, n_settings_a *
n_settings_b * n_outcomes_a * n_outcomes_b. Our implementation of the function
lr_std_analysis does the following after resetting the LRE:
• Set lr_use_cm = 1 and test_lp_tolerance = min(1E-6, 1/(num*100)).
• Set block_size = ceil(max(num/1000, so_num * log(2*so_num))) as recommended
in Sec. 5.3.1 of Chapter 5.
• Apply the function lr_update to consecutive blocks of block_size rows of all_trials.
The last block may be smaller if block_size does not divide num.
• Return test_lp_total and the square root of test_lp_total_v.
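For the configuration used in the examples of Sec. A.5 (2 settings and 3 outcomes per side, so so_num = 36, and num = 100000 trials), the recommended parameters work out as follows:

```matlab
% Sketch: recommended parameters for lr_std_analysis.
num = 100000;                                 % number of trials
so_num = 2 * 2 * 3 * 3;                       % setting/outcome combinations
test_lp_tolerance = min(1e-6, 1/(num*100));   % here 1e-7
block_size = ceil(max(num/1000, so_num * log(2*so_num)));   % here 154
```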
> function lr_set_goal(frequencies)
(Re)set the variable goal_frequencies to the argument frequencies, and (re)compute
the value of the statistical strength goal_sd, the predicted variance goal_sd2 and the prob-
ability ratios goal_ratios. The computations of goal_sd, goal_sd2, and goal_ratios are
performed like those of test_sd, test_sd2, and pred_ratios, respectively (see function
lr_prime), but instead of the estimated probability distribution fbf the computations
use frequencies. Note that, if goal_frequencies is (re)set after priming the LRE, then
goal_sd2 is updated in the same way as in the function lr_prime. The log-p-value goal_lp
and the associated variables goal_lp_data_weight and goal_lp_weight_snumber are un-
changed. This means that the old value of goal_frequencies continues contributing to
goal_lp via the data added before this function was called. This contribution decays
according to the relevant half-life lp_half_life.
Note: If the function lr_prime is used as a way to reduce the learning transient by ini-
tializing pred_ratios with prior knowledge from theory or earlier experiments such as
tomography experiments, it is a good idea to delay setting the goal frequencies until after
priming. This makes sure that goal_lp remains zero until an experiment starts properly.
> function lr_set_data_half_life(half_life)
Set data_half_life = half_life. Contributions to frequencies from old and new data
decay at this rate from now on. Relative weights within contributions from old data are
unchanged.
> function lr_set_lp_half_life(half_life)
Set lp_half_life = half_life. Contributions to log-p-values from old and new data
decay at this rate from now on. Relative weights within contributions from old data are
unchanged.
> function lr_reset()
Reset the state of the LRE. Only the data-independent variables are kept. That is, the
state is the same as after
g = goal_frequencies; dh = data_half_life; lh = lp_half_life;
lr_init(n_settings_a, n_settings_b, n_outcomes_a, n_outcomes_b, ...
p_settings);
lr_set_goal(g); lr_set_data_half_life(dh); lr_set_lp_half_life(lh);
test_lp_tolerance and lr_use_cm are unchanged by the function lr_reset.
> function lr_lps()
Return goal_lp, test_lp, and test_lp_total.
> function lr_lp_vs()
Return goal_lp_v, test_lp_v, and test_lp_total_v.
> function lr_sds()
Return goal_sd and test_sd.
> function lr_snumbers()
Return weight_snumber, goal_lp_weight_snumber, and test_lp_weight_snumber.
A.4 LRE support functions
By setting the half-lives to infinity, the LRE can be used directly to analyze existing data
for violation of LR. If one can make a blind prediction of the frequencies of settings and outcomes,
then one can prime the engine with the predicted frequencies and get better log-p-values without
having to learn the setting and outcome frequencies from the initial blocks of data.
One of the applications of the LRE is to monitor an ongoing experiment, even if the data are
not currently in a region where LR is violated. For this we provide functions to compute realistic
goal frequencies associated with goal states and (possibly optimized) measurement settings. With
the goal frequencies, monitoring goal_lp may help to tweak an experiment toward an LR violation.
Once LR is violated, one can try to improve test_lp directly, without necessarily moving the
experiment to the hoped-for goal.
The LRE support functions enable computing goal frequencies and their statistical strengths
for rejecting LR, given experimental settings and noise parameters. They also make it possible
to generate simulated data to test and explore LRE functions. The support function arguments
include a state specification, state_spec; a noise specification (losses and visibility), noise_spec;
and a measurement-setting specification, settings_spec.
state_spec: State specification. Currently only balanced and unbalanced Bell states as
if prepared by idealized down-conversion can be specified. For this case, state_spec is a
two-component vector, specifying the bias θ and population p of the state
|ψ〉 = √(1 − p) |0〉_A|0〉_B + √p ( sin(θ)|H〉_A|H〉_B + cos(θ)|V〉_A|V〉_B ).   (A.1)
We assume that double and higher-order pair emission is negligible. Forms of state_spec
with more than two components are reserved for other types of states.
noise_spec: Noise specification. For the case of experiments involving the states in
Eq. (A.1) (as indicated by state_spec being a vector of length 2), there are up to three
parameters: losses ηa and ηb for Alice and Bob, and visibility v. If noise_spec is a real
number, it specifies the two identical losses, and visibility defaults to 1. If noise_spec is a
vector of length 2, the first entry is ηa = ηb, and the second is the visibility. If it is a vector
of length 3, it contains ηa, ηb, and v, in this order.
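A hypothetical helper (ours, not part of the LRE) that normalizes the three accepted forms of noise_spec into the triple (ηa, ηb, v):

```matlab
% Sketch: expand a noise_spec into [eta_a, eta_b, v].
function p = expand_noise_spec(noise_spec)
  switch numel(noise_spec)
    case 1                              % one number: eta_a = eta_b, v = 1
      p = [noise_spec, noise_spec, 1];
    case 2                              % [eta, v] with eta_a = eta_b
      p = [noise_spec(1), noise_spec(1), noise_spec(2)];
    case 3                              % [eta_a, eta_b, v]
      p = noise_spec(:).';
  end
end
```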
settings_spec: Description of Alice’s and Bob’s measurement configurations. For an
experimental test of LR, this is a cell array with the following components:
• settings_spec{1}: The numbers of settings available to Alice and Bob. If
length(settings_spec{1}) == 0, they default to 2. If length(settings_spec{1})
== 1, then both numbers of settings are given by settings_spec{1}. If
length(settings_spec{1}) == 2, then Alice’s and Bob’s numbers of settings are
the first and second entries, respectively.
• settings_spec{2}: The numbers of (setting-independent) measurement outcomes for
Alice and Bob. If length(settings_spec{2}) == 0, they default to 2. Otherwise,
they are treated like settings_spec{1}. For experiments involving the states in
Eq. (A.1), only the values 2 and 3 make sense. The measurement settings are specified
by observables with eigenvalues +1,−1 (see the description of settings_spec{4}).
The +1-eigenvalue outcomes are assigned to outcome 1, when entered in data for
the LRE. If settings_spec{2} == 2, we assume that there is only one detector,
and it “clicks” if the photon is found in the +1-eigenvalue eigenstate. Otherwise we
don’t detect the photon (it is lost, or it would have been found in the −1-eigenvalue
eigenstate), which is assigned to outcome 0. If settings_spec{2} == 3, we assume
that there are two detectors, one for each eigenvalue. The +1 and −1 eigenvalues are
assigned to outcomes 1 and 2, respectively, and no detection is assigned to 0.
• settings_spec{3}: The probability distribution of the measurement-setting choices.
If length(settings_spec{3}) == 0, it is assumed to be uniform. Otherwise it has
the same form as p_settings.
• settings_spec{4}: Alice’s measurement settings. This is an n_settings_a by 1 or 2
matrix. The k’th setting is given by its k’th row. It specifies angles for the Jones vector
of the polarization +1-eigenstate of the measurement operator mentioned above. If
there is only one parameter θ, the Jones vector is (cos(θ), sin(θ)), indicating polariza-
tion at angle θ from horizontal polarization. If there are two parameters θ and φ, the
Jones vector is (cos(θ), e^{iφ} sin(θ)). The vectors can be interpreted as photon polarization
states, with (0, 1), (1, 0), (1, 1)/√2, and (1 − i, 1 + i)/2 the vertically, horizontally,
diagonally, and right-circularly polarized states, respectively.
• settings_spec{5}: Bob’s measurement settings, in the same form as Alice’s but with
n_settings_b many rows.
The following functions are provided:
> function lr_config_freqs(state_spec, noise_spec, settings_spec)
Return the probabilities of settings and outcomes according to the given experimental
configuration. These probabilities are predicted by quantum theory. The returned value’s
data type is the same as that of goal_frequencies.
> function lr_config_analysis(state_spec, noise_spec, settings_spec)
Analyze the given experimental configuration for violation of LR. The function returns sd,
frequencies, ratios, p_settings, and lr_frequencies.
sd: The statistical strength of the configuration for rejecting LR.
frequencies: The quantum predicted frequencies of settings and outcomes. The data
type matches that of goal_frequencies.
ratios: The probability ratios for computing log-p-values. The data type matches
that of goal_ratios.
p_settings: The probability distribution of settings inferred from the arguments.
lr_frequencies: The frequencies of settings and outcomes according to the optimal
LR model. The data type is the same as that of goal_frequencies.
> function lr_config_optim(state_spec, noise_spec, settings_spec)
Optimize the measurement settings for a specified experimental configuration by maximiz-
ing the statistical strength of the LR violation. The optimization uses the set of given
measurement settings as an initial point. A local optimum is found. The function returns
opt_settings and opt_sd, where
opt_settings: The locally optimal settings found in the format matching the settings_spec
argument. Only setting parameters of the argument are changed.
opt_sd: The statistical strength for the LR violation of the optimal settings.
> function lr_test_simulation(state_spec, noise_spec, settings_spec, num)
Simulate num many trials according to the given experimental configuration. The function
returns a block of data data_block, in the correct form for use as the frequencies_or_block
argument in the function lr_prime.
A.5 LRE usage examples
Note: The implementation of the LRE provided with this guide is “research grade”. It is in-
consistently documented and should be considered unstable. It does not perform consistency checks
on user-provided inputs. The following observation may help: Inconsistencies in user-provided in-
put often result in array mismatch errors. If changes are made to the code, there is a minimal test
suite (see testsuite.m) to perform a few simple (but incomplete) specification tests.
In the examples, the commands are shown without a prompt and are intended to be invoked
in an Octave or Matlab shell. The LRE file prep_lre.m must be located on the path used to find
scripts. Commands that are specific to this implementation of the LRE have comments starting
with %**. Scripts for the examples are given with the code, see example1.m and example2.m.
A.5.1 Analyzing an existing data set
Here is an example of how to use the LRE for analyzing a data set.
For the implementation provided with this guide, the first step is always to initialize the
LRE. This sets up the needed paths and variables.
prep_lre;
%** CAUTION: This script defines the global variables used. Most of
%** them are explained in the description of LRE parameters and
%** states. There are some exceptions. Be aware of the possibility
%** of conflict if any of these variable names is used elsewhere.
%** Check prep_lre.m for the list of variable names.
Normally, one next loads the data set to be analyzed. For the purpose of this example, we
simulate it by using the provided support functions. We assume that the measurement setting is
chosen uniformly at random in each trial, the default.
sim_state_spec = [pi/4, 0.1]; % A Bell state, pair probability = 0.1.
sim_noise_spec = [0.95, 0.97]; % The efficiencies are 0.95,
% the visibility is 0.97.
sim_settings_spec = cell(5, 1);
sim_settings_spec{1} = 2; % Two measurement settings each.
sim_settings_spec{2} = 3; % Three outcomes for each setting.
sim_settings_spec{4} = [0; pi/4]; % Alice’s settings, PBS angles.
sim_settings_spec{5} = [pi/8; -pi/8];% Bob’s settings, PBS angles.
[sd, frequencies] = lr_config_analysis(sim_state_spec, sim_noise_spec, ...
sim_settings_spec);
disp(sd); % Expected statistical strength is sd = 0.0017331.
all_data = lr_test_simulation(sim_state_spec, sim_noise_spec, sim_settings_spec, ...
100000); % Simulate 100,000 trials.
The first step of the analysis is to initialize the LRE.
lr_init(2, 2, 3, 3); % The arguments specify numbers of settings and outcomes.
The recommended method for analyzing data from a completed experiment is to use the
provided function lr_std_analysis and report the returned value.
[violation_log_p_value, violation_log_p_value_sd] = lr_std_analysis(all_data);
% The final log-$p$-value is expected to be about 173 minus the learning
% transient, which we observed to be about 25 on average. The standard
% deviation of the reported log-$p$-value is around 20.
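The figure of 173 quoted in the comment is just the statistical strength per trial multiplied by the number of trials:

```matlab
% Rough expectation for the final log-p-value (before subtracting the
% learning transient): statistical strength times number of trials.
sd = 0.0017331;              % from lr_config_analysis above
num = 100000;
expected_lp = sd * num;      % about 173.3
```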
It may be desirable to choose some of the analysis parameters differently from the ones
recommended in lr_std_analysis. If so, such a choice must be made blindly, that is before any
information about the data to be analyzed is available. Information about an experiment before
the data was acquired can be used. With such information, simulation may help in making better
choices of analysis parameters.
Here is an instance of an explicit analysis with parameters chosen manually. First we reset
the analysis engine.
lr_reset();
The main choice to be made is the number of data points in each block of data to be processed.
It is necessary to make sure that the data in each block has a reasonable setting distribution. In
particular, do not provide blocks containing data from only one setting; otherwise the LRE’s
estimated distributions may vary excessively, potentially worsening the log-p-values.
Blocking the data improves efficiency, and having the first block be large enough to contain a
good sample of the possible setting and outcome combinations mitigates the log-p-value offset that
typically arises from the learning transient. If the blocks are too large, we may reduce the log-p-
value by, in effect, not considering the first block for the log-p-value calculations. The first block
is used only to estimate the next block’s setting and outcome frequencies. Thus, we recommend a
block size of the larger of N/1000, where N is the number of trials, and d ln(2d), where d is the
number of possible setting and outcome combinations. The first bound is chosen so that we do
not lose too much significance by using the first block only for learning the setting and outcome
distribution. The second ensures that if the setting and outcome distribution is uniform, the
probability of having at least one event for each setting and outcome combination in each block is
at least 1/2. For our example, the recommended block size is 154. For simplicity, we use 200.
Our implementation of the LRE has two parameters that affect the calculations of log-p-
values. The first is test_lp_tolerance, which affects the stopping criterion for the LR model
optimization and the conservative factors used in pred_ratios and goal_ratios to ensure valid
log-p-values. It should be a small fraction of the anticipated statistical strength. If the strength
is not known in advance, setting it to 1/(100N) ensures that if the data set has a sufficient
violation of LR, the stopping criterion does not significantly decrease the log-p-value compared to
what would be obtained with ideal LR model optimization. The second parameter, lr_use_cm,
determines whether no-signaling constraints are applied in estimating setting and outcome
distributions. Applying these constraints increases computation time, but improves the estimates,
particularly for the first few blocks.
lr_set_tolerance(1e-7); % 1/(100*N)
lr_use_cm = 1; % Turn on "no-signaling" constraints.
If we have a reasonable estimate of the setting and outcome frequencies that was obtained
before an experiment was started, we can “prime” the LRE with these frequencies. This can
reduce or eliminate the effect of the learning transient. If these frequencies are available in
predicted_frequencies and the statistics of the prediction are as good as if we had inferred
them from support_number many trials, then we can invoke the following before analyzing the
experimental data (after removing the comment symbol):
% lr_prime(predicted_frequencies, support_number);
If the experiment was not stable for the entire time that it took to acquire the data, it may
help to set the half-lives to values of the order of the stability time. This allows the LRE to adapt
to drifts in the experimental state. Here we assume that the experiment was sufficiently stable and
leave the half-lives at infinity, the default.
We now process the data one block of 200 points at a time.
for i = 1:500
lr_update(all_data(((i-1)*200+1):(i*200), 1:4));
%** The following can be used to monitor progress:
disp(sprintf('Log-p-value so far: %5.2f +/- %4.2f', test_lp_total, ...
sqrt(test_lp_total_v)));
end
[glp, tlp, tlpt] = lr_lps();
disp(tlpt); % The third value returned is the total
% log-$p$-value and the main result of the
% analysis.
[glpv, tlpv, tlptv] = lr_lp_vs();
disp(sqrt(tlptv)); % The third returned value is the variance of the total
% log-$p$-value; its square root estimates the standard deviation.
A.5.2 Monitoring an experiment in progress
The following shows how one can use the LRE for monitoring an experiment in progress.
The experiment to be simulated involves a (balanced) Bell state expected to be measured with
95% efficiency (for all detectors). Measurements have three possible outcomes (two orthogonal
polarizations and “no detection”). The setup aims for a photon-pair production probability of 0.1
and a visibility of 97%. The settings are intended to maximize the CHSH inequality. We begin
with setting up the LRE environment.
prep_lre;
We initialize the LRE with two settings and three outcomes for each of Alice and Bob. The
setting distribution is uniform, the default.
lr_init(2, 2, 3, 3);
For monitoring an experiment, it can be useful to set up ‘goal’ frequencies, which are the
setting and outcome frequencies that can be realistically expected if the experiment is configured
as intended. The goal frequencies are used by the LRE to calculate a quasi-log-p-value, goal_lp.
This behaves somewhat like the negative logarithm of a distance from the goal to the current
experimental state: To approach the goal one tries to increase goal_lp. More negative values
indicate that the current state is further from the goal.
By tweaking experimental parameters, one can aim for increasing goal_lp. If the goal state
has been reached, then goal_lp is expected to be approximately the statistical strength of the
goal frequencies times the effective number of contributing data points, goal_lp_weight_snumber.
While monitoring goal_lp, one can also monitor test_lp. Unlike goal_lp, negative values of
test_lp are not useful for tweaking. However, once test_lp becomes significantly positive, one
can tweak it directly to optimize the experimental configuration. This allows exploring states that
differ from the anticipated goal but may have better statistical strengths.
To set up the goal frequencies, we can use the support functions. We compute the goal
frequencies based on reasonable and achievable state parameters.
state_spec = [pi/4, 0.1]; % Bell state with 0.1 probability of pairs.
noise_spec = [0.95, 0.97]; % Hoped-for efficiencies are 0.95,
% and visibility is 0.97.
settings_spec = cell(5, 1);
settings_spec{1} = 2; % Two settings each.
settings_spec{2} = 3; % Three outcomes for each setting.
settings_spec{4} = [0; pi/4]; % Alice’s settings, PBS angles.
settings_spec{5} = [pi/8; -pi/8]; % Bob’s settings, PBS angles.
[sd, frequencies] = lr_config_analysis(state_spec, noise_spec, settings_spec);
disp(sd);
% The statistical strength for the goal frequencies is sd = 0.0017331.
% Set the goal frequencies in the LRE.
lr_set_goal(frequencies);
Next we set the half-lives for monitoring. A half-life sets the time scale (in number of data
points) after which data no longer contributes significantly to the current statistics. The values
should be a good multiple of the inverse of the expected statistical strength.
lr_set_data_half_life(4000);
lr_set_lp_half_life(4000);
For this example, we need to simulate the data, so we set up simulation parameters. For the
purpose of the example, we assume that all is well except for the visibility, which is only 0.7 at the
moment.
sim_state_spec = [pi/4, 0.1];
sim_noise_spec = [0.95, 0.7]; % Visibility is still low, at 0.7.
sim_settings_spec = settings_spec;
Normally we would not know the true experimental state parameters. For this example we
do, so we can check whether we are going to violate LR, and we find that we are not.
[sd, frequencies] = lr_config_analysis(sim_state_spec, sim_noise_spec, ...
sim_settings_spec);
disp(sd);
% Statistical strength is numerically 0, no violation.
This is how to get a block of 1000 data points.
data_block = lr_test_simulation(sim_state_spec, sim_noise_spec, ...
sim_settings_spec, 1000);
% lr_use_cm = 1; %** Turn on "no-signaling" constraints for slightly
%** better predictions.
Run the experiment for a while.
lr_update(data_block);
for i = 1:20
lr_update(lr_test_simulation(sim_state_spec, sim_noise_spec, ...
sim_settings_spec, 1000));
%** The following can be used to monitor progress:
disp(sprintf(['Block %2d, goal_lp:%5.2f +/- %4.2f, test_lp:%5.2f +/- %4.2f, ' ...
'test_lp_total:%5.2f +/- %4.2f'], i+1, goal_lp, sqrt(goal_lp_v), test_lp, ...
sqrt(test_lp_v), test_lp_total, sqrt(test_lp_total_v)));
end
We can check the status of the LRE by displaying various statistics. See the explanation of
the functions lr_lps, lr_lp_vs, lr_sds, and lr_snumbers.
[glp, tlp, tlpt] = lr_lps(); disp([glp, tlp, tlpt]); % Log-$p$-values.
[glpv, tlpv, tlptv] = lr_lp_vs();
disp([glpv, tlpv, tlptv]); % Variances of log-$p$-values.
[gsd, tsd] = lr_sds(); disp([gsd, tsd]); % Statistical strengths.
[ws, gws, tws] = lr_snumbers(); disp([ws, gws, tws]); % Effective numbers of points.
Suppose we tweak the experiment so that the visibility improves. Using our unrealistic
knowledge of the true visibility, we can check the expected statistical strength.
sim_noise_spec=[0.95, 0.8];
[sd, frequencies] = lr_config_analysis(sim_state_spec, sim_noise_spec, ...
sim_settings_spec);
disp(sd);
% Statistical strength is now 1.5e-5.
Run the experiment for a while.
for i = 1:40
lr_update(lr_test_simulation(sim_state_spec, sim_noise_spec, ...
sim_settings_spec, 1000));
%** The following can be used to monitor progress:
disp(sprintf(['Block %2d, goal_lp:%5.2f +/- %4.2f, test_lp:%5.2f +/- %4.2f, ' ...
'test_lp_total:%5.2f +/- %4.2f'], i+21, goal_lp, sqrt(goal_lp_v), test_lp, ...
sqrt(test_lp_v), test_lp_total, sqrt(test_lp_total_v)));
end
The half-lives were not set high enough to clearly see the violation yet. But the goal log-p-
value goal_lp should have increased noticeably, suggesting that we improved the experiment in a
useful direction. We tweak it a bit more, in this case improving the visibility to that assumed for
the goal frequencies, and continue running the experiment.
sim_noise_spec=[0.95, 0.97];
for i = 1:40
lr_update(lr_test_simulation(sim_state_spec, sim_noise_spec, ...
sim_settings_spec, 1000));
%** The following can be used to monitor progress:
disp(sprintf(['Block %2d, goal_lp:%5.2f +/- %4.2f, test_lp:%5.2f +/- %4.2f, ' ...
'test_lp_total:%5.2f +/- %4.2f'], i+61, goal_lp, sqrt(goal_lp_v), test_lp, ...
sqrt(test_lp_v), test_lp_total, sqrt(test_lp_total_v)));
end
The violation should now be noticeable, with test_lp now positive.
A.6 Technical notes
A.6.1 Data half-lives and data weights
The LRE is designed to require minimal memory of past data. Online monitoring of exper-
iments requires that only recently acquired data contributes significantly to the various statistics
being tracked. This is made possible by updating the recorded setting and outcome frequencies by
setting them to a convex combination of new and old frequencies. Let di be the i’th data point
represented by a 0/1-vector whose length is the number of setting and outcome combinations and
that has a single 1 in the position corresponding to the i’th setting and outcome combination. The
frequency vector for all data points is given by∑N
i=1 di/N , where N is the total number of data
points. This is the final value of test_frequencies if data_half_life = Inf. In general, after
the n’th data point is acquired, the value of test_frequencies is given by a weighted combination
fn =∑n
i=1wn,idi, where 0 ≤ wn,i ≤ 1 and∑n
i=1wn,i = 1. To minimize what we have to remember
of the past, we wish to update the frequencies so that fn+1 = vn+1dn+1 + (1 − vn+1)fn, where
138
0 ≤ vn+1 ≤ 1. The quantity vn+1 is the weight of the (n+ 1)’th data point after acquiring (n+ 1)
data points. For efficiency and because there is little new information in individual data points,
we prefer to update frequencies and other statistics one block of data at a time. Blocking does not
affect the validity of the computed log-p-values. We implement blocking by having the weights of
data points within a block be identical. Let D_k be the sum of the d_i contributing to the k'th block, and let m_k be the number of data points in the block. In terms of blocks, we use the frequency update F_{k+1} = V_{k+1} D_{k+1} + (1 − m_{k+1} V_{k+1}) F_k, with 0 ≤ V_{k+1} ≤ 1/m_{k+1}. The interpretation of V_{k+1} is similar to that of v_{n+1} above, but it is the identical weight of each data point in the (k+1)'th block. Its value is computed according to an approximate half-life λ (specified by the parameter data_half_life). For a normal geometric decay, this requires w_{n,i}/w_{n,i+1} = 2^{−1/λ}. Because we have blocked the data, matching the half-life on average requires V_k (1 − m_{k+1} V_{k+1}) / V_{k+1} = 2^{−m_{k+1}/λ}, the right ratio of weights between the last two contributing blocks after the update. Solving this equation for V_{k+1} gives

V_{k+1} = V_k / (m_{k+1} V_k + 2^{−m_{k+1}/λ}).    (A.2)

Thus the update requires keeping track of the weight of the most recent point, which after the update is given by V_{k+1}. The first block requires special treatment: we set V_1 = 1/m_1.
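For illustration, the blocked update of Eq. (A.2) can be sketched as follows. This is a Python transcription, not the LRE's Octave code; the function name and interface are hypothetical.

```python
import numpy as np

def update_block(F, V, D, m, lam):
    """One blocked update of the decaying frequency estimate, Eq. (A.2).

    F   : current frequency vector F_k (None before the first block)
    V   : per-point weight V_k of the most recent block
    D   : sum of the 0/1 data vectors d_i in the incoming block
    m   : number of data points in the incoming block
    lam : approximate half-life in data points; lam = inf gives plain averaging
    """
    if F is None:                    # first block: V_1 = 1/m_1
        return D / m, 1.0 / m
    V_new = V / (m * V + 2.0 ** (-m / lam))     # Eq. (A.2)
    F_new = V_new * D + (1.0 - m * V_new) * F   # convex combination
    return F_new, V_new
```

With lam = inf the factor 2^{−m/λ} equals 1, and the recursion reproduces the unweighted frequency vector ∑_i d_i / N.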
We use a similar geometric weighting strategy to update the log-p-values. However, the contribution to the log-p-value of the most recent data point should be weighted by 1, and the sum
of the weights should be an effective number of contributing data points. To ensure that the interpretation of the weighted combination as a log-p-value is valid, the weights must be between 0 and 1. The reason is as follows. As explained in Chapter 5, the starting point for computing a
valid log-p-value is that the ratio function R (the value of variables such as goal_ratios) satisfies
the two conditions 0 ≤ R(x) and 〈R(x)〉 ≤ 1 for any LR model, where x is a trial’s setting and
outcome combination. Given such a function R, if we modify it according to R′(x) = R(x)^γ, where 0 ≤ γ ≤ 1, then 0 ≤ R′(x), and for any LR model 〈R′(x)〉 ≤ 〈R(x)〉^γ ≤ 1 by concavity of the function y ↦ y^γ. Hence the log-p-value computed using R′ instead of R is also valid. In particular,
the weighted combination of valid log-p-value increments is a valid log-p-value, if all weights are
between 0 and 1.
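As a quick numerical illustration of this argument, consider the following self-contained Python sketch; the distribution p and ratio function R are made up for the check and are not taken from the LRE.

```python
import numpy as np

rng = np.random.default_rng(0)
p = rng.dirichlet(np.ones(8))        # a made-up LR-model distribution
R = rng.uniform(0.0, 2.0, size=8)    # a nonnegative candidate ratio function
R *= 0.9 / (p @ R)                   # rescale so that <R> = 0.9 <= 1
for gamma in (0.0, 0.3, 0.7, 1.0):
    # concavity of y -> y**gamma: <R**gamma> <= <R>**gamma <= 1
    assert p @ (R ** gamma) <= (p @ R) ** gamma + 1e-12
    assert (p @ R) ** gamma <= 1.0
```

So R′ = R^γ again satisfies the two conditions required for a valid log-p-value.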
One way to quantify an effective number of contributing data points is by noting that if we
have independent instances x_i of a random variable X and estimate the mean according to weights w_{n,i} as ∑_i w_{n,i} x_i, then the variance of the estimate is given by v_e = ∑_i w_{n,i}^2 v, where v is the variance of X. If the weights are all equal, v_e = v/n. For arbitrary weights, v_e = v/s_n, where s_n can be interpreted as an effective number of contributing points; thus s_n = 1/∑_i w_{n,i}^2. Let S_k be the effective number of contributing points after the k'th block has been received. To update S_k when it is nonzero, we use the following relationship implied by the formula for s_n:

1/S_{k+1} = m_{k+1} V_{k+1}^2 + (1 − m_{k+1} V_{k+1})^2 / S_k,    (A.3)

where V_{k+1} is given by Eq. (A.2).
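The update of the effective number, Eq. (A.3), can be sketched in the same way (Python for illustration; the function name is hypothetical):

```python
def update_snumber(S, V_new, m):
    """Update the effective number of contributing points, Eq. (A.3).

    S     : current effective number S_k (0 before the first contributing block)
    V_new : per-point weight V_{k+1} of the incoming block, from Eq. (A.2)
    m     : number of data points in the incoming block
    """
    if S == 0:                      # first contributing block: S_1 = m_1
        return m
    return 1.0 / (m * V_new ** 2 + (1.0 - m * V_new) ** 2 / S)
```

With equal weights (no decay), V_{k+1} = 1/(km) after k blocks of size m, and the recursion returns S_k = km, the actual number of contributing points.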
We also account for the following issues. First, the value of the first nonzero S_k depends on which of the effective-number variables (those suffixed by weight_snumber) is being tracked: for weight_snumber, S_1 = m_1; for test_lp_weight_snumber, S_1 = 0 and S_2 = m_2; and for goal_lp_weight_snumber, the S_k are nonzero only after the goal frequencies are set. Second, the weights V_k of the most recent data points differ between the effective numbers, because the half-life for the log-p-values can differ from the half-life for the setting and outcome frequencies, and because the goal frequencies can be set at any time. Hence, we separately keep track of these weights in the variables suffixed by data_weight. Also, the weight V_k is nonzero only after the corresponding S_k is nonzero: if S_l is the first nonzero effective number, then V_i = 0 for i = 1, …, l − 1, and V_l = 1/S_l. After that, the V_k are updated according to Eq. (A.2).
To see what to expect of the effective number of contributing data points given the half-life λ,
consider the case of block size 1 in the asymptotic limit. Compute 1/S_∞ = ∑_{i≥0} (1 − 2^{−1/λ})^2 2^{−2i/λ} = (1 − 2^{−1/λ})/(1 + 2^{−1/λ}). For large λ, S_∞ ≈ (2/ln(2))λ ≈ 2.9λ.
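This asymptotic can be checked numerically by iterating the two recursions with block size 1 (a self-contained Python sketch, not the LRE's Octave code):

```python
import math

lam = 100.0            # half-life in data points
V, S = 1.0, 1.0        # after the first data point, V_1 = 1 and S_1 = 1
for _ in range(20000):                       # run for many half-lives
    V = V / (V + 2.0 ** (-1.0 / lam))        # Eq. (A.2) with m = 1
    S = 1.0 / (V ** 2 + (1.0 - V) ** 2 / S)  # Eq. (A.3) with m = 1
# S converges to (1 + 2^{-1/lam}) / (1 - 2^{-1/lam}) ~= (2/ln 2) lam
closed = (1 + 2 ** (-1 / lam)) / (1 - 2 ** (-1 / lam))
```

For λ = 100 the iterated S agrees with the closed form, which in turn is within about 0.1% of (2/ln(2))λ ≈ 289.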
Given the corresponding effective numbers of contributing data points, we compute and
update the log-p-values test_lp and goal_lp as follows. Consider the case for test_lp. Write
L_k for the value of test_lp after the k'th block of data has been processed. We compute L_{k+1} so that it is a weighted combination of L_k and the log-p-value contributions from the last block of data, and the total weight of the log-p-value contributions is given by the corresponding S_{k+1}. Let B_k be the sum of the log-p-value contributions of the data points in the k'th block. Thus, we have L_{k+1} = B_{k+1} + W_{k+1} L_k, where W_{k+1} is obtained by solving m_{k+1} + W_{k+1} S_k = S_{k+1}. This ensures that 0 ≤ W_{k+1} ≤ 1, so that the weight of the log-p-value increment due to each data point is at most 1. Consequently, L_{k+1} is always a valid log-p-value. We can use the same strategy to compute and update goal_lp.
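The log-p-value update just described can be sketched as follows (again a Python transcription with hypothetical names, not the LRE's Octave code):

```python
def update_logp(L, S, B_new, S_new, m):
    """Decay-weighted update of an accumulated log-p-value.

    L     : current log-p-value L_k (0 before any contribution)
    S     : current effective number of contributing points S_k
    B_new : sum of the log-p-value increments in the incoming block
    S_new : effective number after the incoming block, from Eq. (A.3)
    m     : number of data points in the incoming block
    """
    if S == 0:
        return B_new
    W = (S_new - m) / S     # solves m + W*S = S_new, with 0 <= W <= 1
    return B_new + W * L
```

Without decay, S_k = k for blocks of size 1, so W_{k+1} = 1 and the log-p-value increments simply accumulate.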
Appendix B
Optimization results for Chapter 4
Using code written in Octave, which is available on request, we obtained the results shown in the following tables. In these tables, the units of the columns labeled theta, gamma, A_1, A_2, B_1, and B_2 are degrees (°). The column labeled theta (or gamma) contains the values of the parameter θ in the state |ψ_uB〉 = cos(θ)|H〉_A|H〉_B + sin(θ)|V〉_A|V〉_B (or the values of the parameter γ in the state |ψ_pB〉 = cos²(γ)|H〉_3|H〉_4 + sin²(γ)|V〉_3|V〉_4 + cos(γ)sin(γ)(|H〉_3|V〉_3 + |H〉_4|V〉_4)). The columns labeled A_1 and A_2 contain the two optimal measurement-setting angles for Alice, while the columns labeled B_1 and B_2 contain the two optimal measurement-setting angles for Bob. The columns labeled eta_1 and eta_2 give the detection efficiencies required to achieve the statistical strengths in the columns labeled S_1 and S_2, respectively. The column labeled V (if shown) gives the visibility of the prepared unbalanced Bell state. Due to limited numerical accuracy, we cannot find the exact detection efficiency η_c required to achieve a specified statistical strength level S. At the 10^−4 level, we list the two detection efficiencies closest to η_c: using eta_1, the statistical strength of a test of LR is slightly higher than the specified level, while using eta_2, it is slightly lower. In the plots of Fig. 4.2, Fig. 4.3, and Fig. 4.4, we use whichever of eta_1 and eta_2 gives a statistical strength closer to the specified level. Also, for calculations of the minimum detection efficiencies at 0 + ε statistical strength, we truncate the statistical strength to 0 when it is numerically calculated to be less than 10^−9 or 10^−10, depending on the situation.
B.1 Results for unbalanced Bell states using photon counters or detectors
Statistical strength ≈ 0
theta A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 22.50 -67.50 -22.50 67.50 82.85% 8.66E-009 82.84% 1.44E-013
40 21.28 -66.89 -21.28 66.89 80.61% 1.47E-010 80.60% 3.18E-015
35 19.40 -65.60 -19.40 65.60 78.50% 3.97E-009 78.49% 1.33E-013
30 17.00 -63.58 -17.00 63.58 76.50% 5.48E-009 76.49% 3.62E-011
25 14.21 -60.72 -14.21 60.72 74.60% 1.48E-009 74.59% 1.66E-013
20 11.14 -56.79 -11.14 56.79 72.81% 2.91E-009 72.80% 5.56E-011
15 7.92 -51.42 -7.92 51.42 71.12% 3.84E-009 71.11% 6.85E-010
10 4.70 -43.88 -4.70 43.88 69.53% 2.56E-009 69.52% 7.10E-010
5 1.81 -32.41 -1.81 32.41 68.06% 1.67E-009 68.05% 8.37E-010
4 1.32 -29.25 -1.32 29.25 67.78% 1.21E-009 67.77% 6.40E-010
3 0.87 -25.55 -0.87 25.55 67.52% 1.48E-009 67.51% 9.78E-010
2 0.48 -21.04 -0.48 21.04 67.27% 1.31E-009 67.26% 9.85E-010
1 0.17 -15.01 -0.17 15.01 67.06% 1.00E-009 67.05% 8.57E-010
Statistical strength ≈ 1E-6
theta A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 22.50 -67.50 -22.50 67.50 82.93% 1.24E-006 82.92% 9.71E-007
40 21.28 -66.89 -21.28 66.89 80.72% 1.00E-006 80.71% 8.30E-007
35 19.41 -65.60 -19.41 65.60 78.62% 1.03E-006 78.61% 8.79E-007
30 17.02 -63.59 -17.02 63.59 76.64% 1.09E-006 76.63% 9.48E-007
25 14.23 -60.73 -14.23 60.73 74.77% 1.07E-006 74.76% 9.53E-007
20 11.15 -56.80 -11.15 56.80 73.01% 1.01E-006 73.00% 9.18E-007
15 7.93 -51.43 -7.93 51.43 71.39% 1.07E-006 71.38% 1.00E-006
10 4.72 -43.89 -4.72 43.89 69.93% 1.04E-006 69.92% 9.88E-007
5 1.82 -32.42 -1.82 32.42 68.86% 1.02E-006 68.85% 9.96E-007
4 1.33 -29.25 -1.33 29.25 68.78% 1.01E-006 68.77% 9.90E-007
3 0.88 -25.55 -0.88 25.55 68.84% 1.00E-006 68.83% 9.86E-007
2.5 0.68 -23.43 -0.68 23.43 68.98% 1.01E-006 68.97% 9.94E-007
2 0.49 -21.05 -0.49 21.05 69.25% 1.01E-006 69.24% 9.96E-007
1.5 0.32 -18.31 -0.32 18.31 69.78% 1.00E-006 69.77% 9.94E-007
1 0.18 -15.01 -0.18 15.01 70.96% 1.00E-006 70.95% 9.98E-007
0.75 0.12 -13.03 -0.12 13.03 72.17% 1.00E-006 72.16% 9.98E-007
0.5 0.07 -10.66 -0.07 10.66 74.56% 1.00E-006 74.55% 9.98E-007
0.08 0.00 90.00 -0.16 0.16 100.00% 1.00E-006
Statistical strength ≈ 1E-5
theta A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 22.50 -67.50 -22.50 67.50 83.10% 1.08E-005 83.09% 9.95E-006
40 21.30 -66.90 -21.30 66.90 80.96% 1.00E-005 80.95% 9.46E-006
35 19.43 -65.62 -19.43 65.62 78.89% 1.01E-005 78.88% 9.56E-006
30 17.05 -63.61 -17.05 63.61 76.95% 1.02E-005 76.94% 9.79E-006
25 14.26 -60.76 -14.26 60.76 75.14% 1.03E-005 75.13% 9.96E-006
20 11.20 -56.84 -11.20 56.84 73.47% 1.03E-005 73.46% 9.99E-006
15 7.97 -51.47 -7.97 51.47 71.98% 1.01E-005 71.97% 9.90E-006
10 4.75 -43.92 -4.75 43.92 70.81% 1.01E-005 70.80% 9.92E-006
5 1.85 -32.45 -1.85 32.45 70.60% 1.01E-005 70.59% 9.99E-006
4 1.35 -29.27 -1.35 29.27 70.94% 1.00E-005 70.93% 9.97E-006
3 0.90 -25.57 -0.90 25.57 71.69% 1.00E-005 71.68% 9.98E-006
2.5 0.69 -23.45 -0.69 23.45 72.36% 1.00E-005 72.35% 9.98E-006
2 0.50 -21.06 -0.50 21.06 73.41% 1.00E-005 73.40% 9.98E-006
1.5 0.33 -18.32 -0.33 18.32 75.20% 1.00E-005 75.19% 1.00E-005
1 0.19 -15.02 -0.19 15.02 78.69% 1.00E-005 78.68% 9.99E-006
0.5 0.00 90.00 -0.95 0.95 90.11% 1.00E-005 90.10% 9.99E-006
0.31 0.00 90.00 -0.61 0.61 100.00% 1.00E-005
Statistical strength ≈ 1E-4
theta A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 22.50 -67.50 -22.50 67.50 83.63% 1.01E-004 83.62% 9.83E-005
40 21.34 -66.94 -21.34 66.94 81.71% 1.00E-004 81.70% 9.82E-005
35 19.51 -65.69 -19.51 65.69 79.74% 1.01E-004 79.73% 9.90E-005
30 17.16 -63.71 -17.16 63.71 77.92% 1.00E-004 77.91% 9.91E-005
25 14.39 -60.87 -14.39 60.87 76.28% 1.01E-004 76.27% 9.95E-005
20 11.33 -56.96 -11.33 56.96 74.87% 1.01E-004 74.86% 9.98E-005
15 8.09 -51.58 -8.09 51.58 73.81% 1.00E-004 73.80% 9.94E-005
10 4.86 -44.03 -4.86 44.03 73.50% 1.00E-004 73.49% 9.96E-005
7 3.03 -37.82 -3.03 37.82 74.22% 1.00E-004 74.21% 9.99E-005
5 1.92 -32.52 -1.92 32.52 75.73% 1.00E-004 75.72% 9.99E-005
4 1.41 -29.34 -1.41 29.34 77.21% 1.00E-004 77.20% 9.99E-005
3 0.95 -25.64 -0.95 25.64 79.74% 1.00E-004 79.73% 9.98E-005
2 0.54 -21.12 -0.54 21.12 84.64% 1.00E-004 84.63% 1.00E-004
1 0.00 90.00 -1.98 1.98 99.30% 1.00E-004 99.29% 1.00E-004
0.98 0.00 90.00 -1.93 1.93 100.00% 1.00E-004
Statistical strength ≈ 1E-3
theta A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 22.50 -67.50 -22.50 67.50 85.33% 1.01E-003 85.32% 9.99E-004
40 21.47 -67.06 -21.47 67.06 84.02% 1.01E-003 84.01% 1.00E-003
35 19.77 -65.90 -19.77 65.90 82.33% 1.00E-003 82.32% 9.95E-004
30 17.50 -63.99 -17.50 63.99 80.88% 1.00E-003 80.87% 9.97E-004
25 14.79 -61.22 -14.79 61.22 79.74% 1.00E-003 79.73% 9.98E-004
20 11.74 -57.34 -11.74 57.34 79.06% 1.00E-003 79.05% 9.97E-004
15 8.48 -51.97 -8.48 51.97 79.21% 1.00E-003 79.20% 9.99E-004
10 5.18 -44.37 -5.18 44.37 81.17% 1.00E-003 81.16% 9.99E-004
7 3.28 -38.13 -3.28 38.13 84.52% 1.00E-003 84.51% 9.99E-004
5 2.15 -32.89 -2.15 32.89 89.16% 1.00E-003 89.15% 1.00E-003
4 1.68 -29.59 -1.68 29.59 93.81% 1.00E-003 93.80% 1.00E-003
3.09 0.00 90.00 -6.10 6.10 100.00% 1.00E-003
B.2 Tradeoff between visibility and efficiency using unbalanced Bell states, where S = 10^−6
theta A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2 V
45 22.50 -67.50 -22.50 67.50 99.90% 1.05E-006 99.89% 8.57E-007 0.71
44.4 22.49 -67.49 -22.49 67.49 99.20% 1.15E-006 99.19% 9.48E-007 0.72
43.9 22.47 -67.48 -22.47 67.48 98.49% 1.03E-006 98.48% 8.41E-007 0.73
43.3 22.43 -67.44 -22.43 67.44 97.79% 1.10E-006 97.78% 9.12E-007 0.74
42.8 22.38 -67.42 -22.38 67.42 97.09% 1.18E-006 97.08% 9.89E-007 0.75
42.2 22.31 -67.37 -22.31 67.37 96.38% 1.09E-006 96.37% 9.05E-007 0.76
41.6 22.23 -67.31 -22.23 67.31 95.67% 1.03E-006 95.66% 8.55E-007 0.77
41 22.13 -67.24 -22.13 67.24 94.96% 1.02E-006 94.95% 8.47E-007 0.78
40.4 22.00 -67.16 -22.00 67.16 94.25% 1.07E-006 94.24% 8.94E-007 0.79
39.7 21.86 -67.05 -21.86 67.05 93.53% 1.01E-006 93.52% 8.45E-007 0.80
39.1 21.71 -66.94 -21.71 66.94 92.81% 1.05E-006 92.80% 8.82E-007 0.81
38.5 21.54 -66.82 -21.54 66.82 92.08% 1.02E-006 92.07% 8.59E-007 0.82
37.8 21.34 -66.67 -21.34 66.67 91.35% 1.13E-006 91.34% 9.59E-007 0.83
37.1 21.10 -66.50 -21.10 66.50 90.60% 1.04E-006 90.59% 8.83E-007 0.84
36.4 20.86 -66.32 -20.86 66.32 89.85% 1.14E-006 89.84% 9.77E-007 0.85
35.6 20.56 -66.09 -20.56 66.09 89.08% 1.12E-006 89.07% 9.57E-007 0.86
34.8 20.23 -65.84 -20.23 65.84 88.29% 1.02E-006 88.28% 8.65E-007 0.87
34 19.89 -65.57 -19.89 65.57 87.49% 1.04E-006 87.48% 8.90E-007 0.88
33.1 19.48 -65.24 -19.48 65.24 86.67% 1.08E-006 86.66% 9.34E-007 0.89
32.1 19.00 -64.84 -19.00 64.84 85.82% 1.06E-006 85.81% 9.12E-007 0.90
31.1 18.49 -64.40 -18.49 64.40 84.94% 1.04E-006 84.93% 9.04E-007 0.91
30 17.91 -63.89 -17.91 63.89 84.02% 1.00E-006 84.01% 8.69E-007 0.92
28.9 17.29 -63.34 -17.29 63.34 83.06% 1.06E-006 83.05% 9.28E-007 0.93
27.5 16.48 -62.56 -16.48 62.56 82.04% 1.12E-006 82.03% 9.87E-007 0.94
26.1 15.62 -61.71 -15.62 61.71 80.93% 1.04E-006 80.92% 9.21E-007 0.95
24.4 14.55 -60.56 -14.55 60.56 79.72% 1.08E-006 79.71% 9.69E-007 0.96
22.4 13.24 -59.05 -13.24 59.05 78.34% 1.03E-006 78.33% 9.27E-007 0.97
19.8 11.49 -56.80 -11.49 56.80 76.70% 1.05E-006 76.69% 9.60E-007 0.98
16.2 9.00 -53.00 -9.00 53.00 74.50% 1.01E-006 74.49% 9.32E-007 0.99
13.3 7.00 -49.23 -7.00 49.23 72.88% 1.05E-006 72.87% 9.84E-007 0.995
7.06 2.98 -37.85 -2.98 37.85 69.93% 1.03E-006 69.92% 1.00E-006 0.9995
4 1.32 -29.25 -1.32 29.25 68.78% 1.01E-006 68.77% 9.90E-007 1
B.3 Results for pseudo-Bell states using photon counters
Statistical strength ≈ 0
gamma A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 22.50 -67.50 -22.50 67.50 90.62% 3.61E-009 90.61% 2.02E-014
40 20.49 -66.01 -20.49 66.01 89.71% 8.90E-009 89.70% 8.94E-014
35 16.76 -62.14 -16.76 62.14 89.78% 4.24E-009 89.77% 3.63E-014
30 12.32 -56.16 -12.32 56.16 90.80% 2.55E-010 90.79% 1.18E-014
25 8.00 -48.43 -8.00 48.43 92.57% 8.08E-009 92.56% 6.21E-013
20 4.43 -39.49 -4.43 39.49 94.71% 8.13E-009 94.70% 1.08E-012
15 1.96 -29.88 -1.96 29.88 96.81% 4.08E-009 96.80% 5.21E-014
10 0.59 -19.98 -0.59 19.98 98.52% 1.36E-010 98.51% 5.54E-015
5 0.07 -10.00 -0.07 10.00 99.63% 5.76E-009 99.62% 1.77E-013
Statistical strength ≈ 5E-5
gamma A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 22.50 -67.50 -22.50 67.50 91.05% 5.11E-005 91.04% 4.88E-005
40 20.53 -66.05 -20.53 66.05 90.30% 5.11E-005 90.29% 4.94E-005
35 16.81 -62.20 -16.81 62.20 90.40% 5.06E-005 90.39% 4.89E-005
30 12.34 -56.21 -12.34 56.21 91.46% 5.12E-005 91.45% 4.97E-005
25 7.98 -48.45 -7.98 48.45 93.25% 5.06E-005 93.24% 4.91E-005
20 4.39 -39.49 -4.39 39.49 95.42% 5.15E-005 95.41% 5.00E-005
15 1.91 -29.87 -1.91 29.87 97.52% 5.02E-005 97.51% 4.87E-005
10 0.56 -19.97 -0.56 19.97 99.20% 5.01E-005 99.19% 4.84E-005
6.30 0.00 90.00 -1.38 1.38 100.00% 5.00E-005
Statistical strength ≈ 5E-4
gamma A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 22.50 -67.50 -22.50 67.50 91.98% 5.06E-004 91.97% 4.99E-004
40 0.00 90.00 -42.54 42.54 91.53% 5.03E-004 91.52% 4.97E-004
35 16.96 -62.38 -16.96 62.38 91.70% 5.02E-004 91.69% 4.97E-004
30 12.42 -56.35 -12.42 56.35 92.81% 5.05E-004 92.80% 5.00E-004
25 7.96 -48.53 -7.96 48.53 94.64% 5.04E-004 94.63% 4.99E-004
20 4.30 -39.51 -4.30 39.51 96.80% 5.03E-004 96.79% 4.97E-004
15 0.00 90.00 -7.99 7.99 98.73% 5.03E-004 98.72% 4.97E-004
11.26 0.00 90.00 -4.49 4.49 100.00% 5.00E-004
Statistical strength ≈ 1.5E-3
gamma A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 22.50 -67.50 -22.50 67.50 92.97% 1.51E-003 92.96% 1.49E-003
40 0.00 90.00 -42.81 42.81 92.73% 1.51E-003 92.72% 1.50E-003
35 0.00 90.00 -37.46 37.46 93.00% 1.51E-003 92.99% 1.50E-003
30 12.54 -56.55 -12.54 56.55 94.16% 1.51E-003 94.15% 1.50E-003
25 0.00 90.00 -21.93 21.93 95.92% 1.50E-003 95.91% 1.49E-003
20 0.00 90.00 -14.44 14.44 97.94% 1.51E-003 97.93% 1.50E-003
15 0.00 90.00 -8.11 8.11 99.97% 1.51E-003 99.96% 1.50E-003
14.92 0.00 90.00 -8.01 8.01 100.00% 1.50E-003
B.4 Results for pseudo-Bell states using photon detectors
Statistical strength ≈ 0
gamma A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 11.64 -63.88 -11.64 63.88 92.23% 1.96E-009 92.22% 2.57E-014
40 11.08 -62.79 -11.08 62.79 91.31% 4.10E-009 91.30% 7.36E-014
35 9.79 -59.60 -9.79 59.60 91.11% 7.81E-009 91.10% 3.78E-013
30 7.93 -54.42 -7.93 54.42 91.71% 3.60E-009 91.70% 7.68E-014
25 5.73 -47.46 -5.73 47.46 93.05% 1.20E-009 93.04% 2.56E-014
20 3.53 -39.09 -3.53 39.09 94.89% 3.16E-010 94.88% 1.03E-014
15 1.68 -29.76 -1.68 29.76 96.85% 2.71E-010 96.84% 7.47E-015
10 0.54 -19.96 -0.54 19.96 98.53% 3.26E-009 98.52% 3.30E-014
5 0.07 -10.00 -0.07 10.00 99.63% 5.66E-009 99.62% 1.44E-013
Statistical strength ≈ 5E-5
gamma A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 11.62 -63.89 -11.62 63.89 92.69% 5.13E-005 92.68% 4.91E-005
40 11.06 -62.85 -11.06 62.85 91.93% 5.11E-005 91.92% 4.94E-005
35 9.78 -59.69 -9.78 59.69 91.75% 5.12E-005 91.74% 4.96E-005
30 7.92 -54.50 -7.92 54.50 92.37% 5.05E-005 92.36% 4.90E-005
25 5.70 -47.51 -5.70 47.51 93.74% 5.11E-005 93.73% 4.96E-005
20 3.49 -39.10 -3.49 39.10 95.60% 5.07E-005 95.59% 4.92E-005
15 1.67 -29.76 -1.67 29.76 97.57% 5.12E-005 97.56% 4.97E-005
10 0.52 -19.96 -0.52 19.96 99.21% 5.11E-005 99.20% 4.94E-005
6.36 0.00 90.00 -1.40 1.40 100.00% 5.00E-005
Statistical strength ≈ 5E-4
gamma A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 11.58 -63.92 -11.58 63.92 93.67% 5.04E-004 93.66% 4.97E-004
40 11.00 -63.04 -11.00 63.04 93.23% 5.02E-004 93.22% 4.97E-004
35 9.74 -59.93 -9.74 59.93 93.09% 5.04E-004 93.08% 4.99E-004
30 7.88 -54.71 -7.88 54.71 93.74% 5.02E-004 93.73% 4.97E-004
25 5.64 -47.66 -5.64 47.66 95.13% 5.03E-004 95.12% 4.98E-004
20 3.40 -39.19 -3.40 39.19 96.98% 5.02E-004 96.97% 4.97E-004
15 1.58 -29.84 -1.58 29.84 98.85% 5.06E-004 98.84% 5.00E-004
11.64 0.00 90.00 -4.69 4.69 100.00% 5.00E-004
Statistical strength ≈ 1.5E-3
gamma A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 11.52 -63.93 -11.52 63.93 94.71% 1.51E-003 94.70% 1.50E-003
40 10.97 -63.22 -10.97 63.22 94.56% 1.50E-003 94.55% 1.49E-003
35 9.70 -60.23 -9.70 60.23 94.46% 1.51E-003 94.45% 1.50E-003
30 7.83 -55.00 -7.83 55.00 95.12% 1.51E-003 95.11% 1.50E-003
25 5.58 -47.90 -5.58 47.90 96.48% 1.50E-003 96.47% 1.49E-003
20 3.33 -39.45 -3.33 39.45 98.24% 1.50E-003 98.23% 1.49E-003
15.34 1.79 -30.61 -1.79 30.61 100.00% 1.50E-003