Analysis of tests of local realism
by
Yanbao Zhang
B.S., University of Science and Technology of China, 2006
A thesis submitted to the
Faculty of the Graduate School of the
University of Colorado in partial fulfillment
of the requirements for the degree of
Doctor of Philosophy
Department of Physics
2013
This thesis entitled: Analysis of tests of local realism
written by Yanbao Zhang has been approved for the Department of Physics
Emanuel Knill
Sae Woo Nam
Date
The final copy of this thesis has been examined by the signatories, and we find that both the content
and the form meet acceptable presentation standards of scholarly work in the above-mentioned discipline.
Zhang, Yanbao (Ph.D., Physics)
Analysis of tests of local realism
Thesis directed by Dr. Emanuel Knill
Reliable and loophole-free demonstrations of the violation of local realism (LR) are highly
desirable not only for understanding the foundation of quantum mechanics but also for facilitating
quantum information processing, such as quantum key distribution and randomness expansion.
To date, LR has been experimentally violated, but with loopholes, by testing predetermined Bell
inequalities. This thesis presents a framework for verifying and quantifying the violation of LR
without relying on a particular Bell inequality.
First, the experimental resources, such as the quantum state, measurement settings, detection
efficiency, and visibility required for a violation of LR, are studied via a measure called the statistical
strength. The higher the statistical strength, the more confidence in a violation of LR one has after
observing a sufficiently large amount of experimental data. In particular, we study the minimum detection
efficiency required to achieve any given statistical strength level in tests of LR with entangled states
created from two independent polarized photons passing through a polarizing beam splitter. It is
shown that, compared with photon detectors, photon counters make violations of LR easier to
detect for any nonzero probability of multiple photons in an output beam of the polarizing beam
splitter.
Second, to quantify the statistical evidence against LR obtained from a finite number of
experimental data, one can choose a test statistic, such as a Bell-inequality violation, to measure
the amount of violation of LR. It is desirable to bound the probability, according to LR, of obtaining
a test statistic at least as extreme as that observed. This probability is known as a p-value for
the hypothesis test of LR. We propose a protocol to bound such a p-value. The bound provided is
asymptotically tight, if the prepared quantum state and measurement settings are stable during an
experiment. Therefore, the proposed protocol is asymptotically optimal, and the bound provided
is a standardized measure of success for experimental tests of LR. One can quantitatively compare
different experimental tests based on this bound. Moreover, the bound provided is valid even if the
quantum state varies arbitrarily and local realistic models depend on previous measurement settings
and outcomes. Hence, this bound facilitates device-independent and nonlocality-based quantum
information processing. For comparison, bounds of p-values derived from Bell-inequality violations
using the number of standard deviations of violation of a Bell inequality or using martingale theory
are studied. It is found that putative bounds derived from the number of standard deviations of
violation are not valid and bounds from martingale theory are not tight.
Finally, a simplified and efficient data analysis protocol using a set of Bell inequalities is
proposed and compared with the above optimal and martingale-based protocols. The simplified
protocol provides p-value bounds at least as tight as, and typically tighter than, those of the
martingale-based protocol, and its bounds can even be asymptotically tight. Moreover, the simplified protocol
can be applied to any test with linear witnesses, such as tests for verifying entanglement, system
dimensionality, or steering.
Dedication
To my parents.
Acknowledgements
I would like to thank my advisor, Manny Knill. Manny has a tremendous number of ideas
and amazing intuition about how to solve problems. No matter what kind of problem I encountered
during my research, he always gave me helpful advice. During these years, I have learned from him
not only a great deal of physics and mathematics but also critical thinking. I cannot express all my
thanks to him.
I have also benefited greatly from working with Scott Glancy. Scott is especially good at
simplifying problems and making research and writing easier to understand. He helped me a
great deal with writing and presenting my work. I also enjoyed the time spent with Adam Meier,
Bryan Eastin, and Mike Mullan. They, especially Adam, helped me greatly with presenting my
work and improving my English.
I also would like to thank many other people at NIST and JILA, including, but not limited
to, Lorna Buhse, Kevin Coakley, Sae Woo Nam, Alan Migdall, Thomas Gerrits, Dominic Meiser,
and Murray Holland. Lorna helped me with all my conference travel. Kevin helped me a great
deal with mathematical statistics. I learned much and became interested in quantum optics through
discussions with Sae Woo, Alan, Thomas, Dominic, and Murray.
Contents
Chapter
1 Introduction 1
1.1 Why test local realism? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Bell’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.3 Overview of the thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.4 Contents of the thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2 Bell inequalities 8
2.1 Locality, realism, and Bell inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . 8
2.2 Geometric interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2.1 Special case: The CHSH inequality . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2.2 The general case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.3 Various Bell inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3.1 Bell inequalities with many settings . . . . . . . . . . . . . . . . . . . . . . . 16
2.3.2 Bell inequalities with many outcomes . . . . . . . . . . . . . . . . . . . . . . 18
2.3.3 Bell inequalities with many parties . . . . . . . . . . . . . . . . . . . . . . . . 19
2.3.4 Derivation of Bell inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.4 Bell inequality and entanglement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.5 Bell inequality, steering, and contextuality . . . . . . . . . . . . . . . . . . . . . . . . 29
2.6 Bell inequality and private information . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3 Challenges of testing local realism 32
3.1 Experimental configuration for testing local realism . . . . . . . . . . . . . . . . . . . 32
3.2 The locality loophole . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
3.3 The detection loophole . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.4 The memory loophole . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
3.5 Possibilities of loophole-free violations of LR . . . . . . . . . . . . . . . . . . . . . . . 36
4 Statistical strength of experiments for rejecting local realism 39
4.1 Experimental configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.2 Data analysis method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
4.3 Results and discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.4 Concluding remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
5 Asymptotically optimal data analysis for rejecting local realism 56
5.1 Statistical concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
5.2 Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.2.1 Bell functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.2.2 SD-based protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.2.3 Martingale-based protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
5.2.4 PBR protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.3 Technical details for applying the PBR protocol . . . . . . . . . . . . . . . . . . . . . 70
5.3.1 Estimating the experimental probability distribution . . . . . . . . . . . . . . 70
5.3.2 Effects of bad estimates of true distributions and optimal LR models . . . . . 73
5.4 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.4.1 Confidence-gain rates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.4.2 Application to experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
6 Efficient quantification of experimental violation of local realism 81
6.1 Simplified PBR protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
6.2 Protocol comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.2.1 Computational resource comparison . . . . . . . . . . . . . . . . . . . . . . . 85
6.2.2 Comparison of confidence-gain rates . . . . . . . . . . . . . . . . . . . . . . . 88
6.2.3 Comparison of protocols’ behavior for finite data . . . . . . . . . . . . . . . . 91
6.3 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
7 Conclusions and future directions 98
7.1 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
7.2 Future work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
Bibliography 100
Appendix
A User guide of the local realism analysis engine 111
A.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
A.2 LRE state . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
A.2.1 Experimental configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
A.2.2 Analysis and display variables . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
A.2.3 Data dependent variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
A.3 LRE interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
A.4 LRE support functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
A.5 LRE usage examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
A.5.1 Analyzing an existing data set . . . . . . . . . . . . . . . . . . . . . . . . . . 129
A.5.2 Monitoring an experiment in progress . . . . . . . . . . . . . . . . . . . . . . 133
A.6 Technical notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
A.6.1 Data half-lives and data weights . . . . . . . . . . . . . . . . . . . . . . . . . 137
B Optimization results for Chapter 4 141
B.1 Results for unbalanced Bell states using photon counters or detectors . . . . . . . . . 142
B.2 Tradeoff between visibility and efficiency using unbalanced Bell states where S = 10^−6 . . 145
B.3 Results for pseudo-Bell states using photon counters . . . . . . . . . . . . . . . . . . 146
B.4 Results for pseudo-Bell states using photon detectors . . . . . . . . . . . . . . . . . . 148
Tables
Table
4.1 Extreme conditions for tests of LR free of the detection loophole for photon counters
or photon detectors using the unbalanced Bell states |ψuB〉 defined in Eq. (4.11).
The asymptotic behavior when θ → 0 is consistent with results in Ref. [1], which are
shown in the last row. The angle parameters are explained in the text. . . . . . . . . 48
4.2 Extreme conditions for tests of LR free of the detection loophole for photon counters
and photon detectors using the pseudo-Bell states of Eq. (4.13). The angle parame-
ters are explained in the text. The minimum detection efficiencies for counters and
detectors when γ = 45◦ are the same as those found in Ref. [2]. . . . . . . . . . . . . 53
Figures
Figure
2.1 The regions achievable by LR, quantum mechanics, and all physical theories satis-
fying no signaling. Any correlation vector inside black squares is achievable under
the no-signaling conditions as in Eq. (2.8). The quantum convex set Q and the
LR polytope L are bounded by red curves and blue lines (with black lines in (b)),
respectively. (a) is the situation in the subspace E(A1B1) = 1 and E(A1B2) = 0,
while (b) is the situation in the subspace E(A1B1) = E(A1B2) = 1/2. . . . . . . . . 13
3.1 The experimental procedure for testing a bipartite Bell inequality. The inset reflects
the locality condition in the space-time diagram. . . . . . . . . . . . . . . . . . . . . 33
4.1 Schematic of a test of LR with the independent photons source. Two spatially and
temporally matched polarized photons are inserted at 1 and 2. The polarization
rotators PR1 and PR2 are set so that photons 1 and 2 are linearly polarized at the
same direction when they reach the polarizing beam splitter PBS1. After PBS1, the
photons are in a nonmaximally entangled state (see Eq. (4.3)) and are sent to Alice’s
and Bob’s measurement setups. Each measurement setup uses a PR, a PBS and
two detectors. The PR is used to select measurement bases by rotating the photon’s
polarization state. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
4.2 Detection efficiency of photon counters or photon detectors required for different
statistical strength levels S vs the parameter θ [Eq. (4.11)]. The empty squares
show our calculated points, and the dotted lines are linear interpolations to guide
the eye. In curve a, the linear extrapolation toward θ = 0 is shown. . . . . . . . . . 50
4.3 Tradeoff between the overall minimum detection efficiency minθ ηc(θ) and the visi-
bility V of unbalanced Bell states. Here, we fix the optimal statistical strength to
10^−6. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.4 Detection efficiencies of photon counters and photon detectors required for different
statistical strength levels S vs the parameter γ of the pseudo-Bell state of Eq. (4.13):
(a) S = 0, (b) S = 5 × 10^−5, (c) S = 5 × 10^−4, and (d) S = 1.5 × 10^−3. The calculated points
are labeled by squares for photon counters and by diamonds for photon detectors,
and the dotted lines are linear interpolations to guide the eye. . . . . . . . . . . . 54
5.1 Confidence-gain rates G achieved by the SD-based, martingale-based, and PBR pro-
tocols. The gain rate G is shown for a CHSH test of LR with an unbalanced Bell state
with no loss and perfect detectors. It depends on the parameter θ in the unbalanced
Bell state |ψuB〉. Given the state parameter θ, the measurement settings are chosen
to maximize the violation of the CHSH inequality (1.2). The line corresponding to
the gain rates achieved by the SD-based protocol crosses the line corresponding to
the optimal gain rates at θ = 33.41◦. . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
5.2 The confidence-gain rate G of a CHSH test of LR with a Bell state and varying detec-
tion efficiency η and visibility V. The measurement settings are chosen to maximize
the violation of the CHSH inequality (1.2). Measurement outcomes where no particle
is detected are assigned the value −1. (a) pA1 = pB1 = 0.5, (b) pA1 = pB1 = 0.51, (c)
pA1 = pB1 = 0.52, and (d) pA1 = pB1 = 0.53, where pA1 and pB1 are the probabilities
that at each trial Alice and Bob independently choose the settings A1 and B1, respec-
tively. Note that in subplot (a) the optimal gain rates are not shown, since the
optimal gain rate is at most 6 % larger than the corresponding martingale-based
gain rate, so the difference between them would not be visible. . . . . . . . . . . . . 77
5.3 Running log-p-values as functions of the number of trials n in a CHSH test of LR
with an unbalanced Bell state cos(θ)|00〉 + sin(θ)|11〉 where θ = 22.5◦. We assume
that there is no noise or detection inefficiency and the setting distribution is uniform.
The log-p-values are computed according to the three protocols discussed. The slopes
of the straight lines are the confidence-gain rate achieved by each protocol. (a) is
for one simulation of 5000 successive trials. (b) is an average of 30 independent
simulations. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.4 Running log-p-values as functions of the number of trials n in the experiment of
Ref. [3]. In this experiment, different measurement settings are chosen uniformly
randomly. The dotted lines are provided only to guide the eye. . . . . . . . . . . . . 80
6.1 Confidence-gain rates in the test of the CGLMP inequality 〈Id(X)〉 ≤ 2. Here,
we use the quantum state and measurement settings of Ref. [4], Eqs. (15) and (9),
respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.2 Confidence-gain rates in the test of LR with an unbalanced Bell state |ψ(θ)〉. The
measurement settings are chosen to maximize the violation of the CHSH inequal-
ity (1.2) given the state |ψ(θ)〉. The gain rates achieved by the simplified PBR
protocol using the CHSH inequality are shown as circles (◦), while the gain rates by
the same protocol using the CHSH inequality together with no-signaling conditions
are shown as crosses (+). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
6.3 An example of running log-p-values as functions of the number of trials n in a test
of the CGLMP inequality. The dashed and solid lines are the asymptotic lines for
log-p-values based on gain rates achieved by the (full or simplified) PBR protocol
and the martingale-based protocol, respectively. Repetitions of this Monte Carlo
simulation show similar behavior. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Chapter 1
Introduction
1.1 Why test local realism?
Theories designed according to “local realism” (LR) include a set of hidden variables, which
if known would deterministically predict all measurement outcomes. Moreover, the values of the
hidden variables cannot be influenced by spacelike-separated events, hence these hidden variables
are called local hidden variables. In 1964, Bell first showed that quantum mechanics violates LR [5].
This profound result is known as Bell’s theorem. To prove this theorem, Bell and his followers
constructed a class of inequalities, called Bell inequalities. These inequalities are satisfied by all the
predictions according to LR, but can be violated by the predictions of quantum mechanics. A test
of LR showing violation was first realized by Freedman and Clauser in 1972 [6]. Since then, many
such tests have been performed. For reviews of this field, see Refs. [7, 8, 9, 10]. Naturally, one may
ask, “Why does anyone still perform tests of LR, given the many claimed experimental violations?
Why are they important?”
Most importantly and fundamentally, the violation of LR implies that the physical description
of the world contradicts at least one of the principles—locality or realism. Locality states that two
spacelike-separated events cannot affect each other, while realism is the ability to meaningfully
speak of the definiteness of the outcomes of measurements that have not been performed. Each
principle sounds natural in practice; however, their combination does not predict measurement
results correctly. To date, no test of LR has been performed that satisfies both principles without
introducing additional assumptions, i.e., without loopholes. Accordingly, no experimental result so far conclusively rules
out LR. This is the central motivation underlying the competition to perform a loophole-free
test of LR.
Secondly, quantum physicists are trying to build quantum computers and networks. Entanglement
and violation of LR are very important resources for these tasks. (See Sec. 2.4 of Chapter 2
for the definition of entanglement and relevant discussions.) To verify and quantify these quantum
resources, Bell inequalities or generalized Bell inequalities are useful and even indispensable tools.
For example, if experimental results violate LR, a family of quantum communication protocols is
secure even against adversaries that obey causality but are not limited by the laws of quantum mechanics [11, 12, 13].
Last but not least, quantum physicists are interested in understanding and quantifying
the violation of LR achievable in quantum mechanics. As is well known, the quantum violation of
LR is less than the maximal violation of LR possible according to a theory satisfying relativistic
causality (also called no signaling) [14]. Physicists would like to verify that the violations of LR in
experiments are consistent with the predictions of quantum mechanics.
1.2 Bell’s theorem
As mentioned above, Bell’s theorem states that no local realistic (LR) theory (i.e., a theory
designed according to LR) can reproduce all the predictions of quantum mechanics. In Ref. [5],
Bell derived the following inequality
1 + E(A1B1) ≥ |E(A2B2)− E(A2B1)|, (1.1)
where E(AiBj) with i, j ∈ {1, 2} is the correlation between Alice’s and Bob’s measurements Ai and
Bj with outcomes ±1 on two separated particles. This inequality (1.1) is satisfied by a restricted set
of LR theories for which the outcomes from the two separated particles are exactly anticorrelated
when Alice’s and Bob’s measurements are A1 and B2, respectively. This restriction is reasonable
if the two particles are spin-1/2 particles and they are in the singlet state (1/√2)(|↑↓〉 − |↓↑〉), a Bell
state. For this case, Bell gave a set of measurements for which the right-hand side of Eq. (1.1) is
larger than the left-hand side, thus Eq. (1.1) is violated. However, the ideal singlet state required
in Bell’s original proof cannot be prepared in practice.
Practical experimental tests of LR were made possible by the proposal by Clauser, Horne,
Shimony and Holt in 1969 [15]. They constructed Bell inequalities satisfied by a general LR theory.
One is the Clauser-Horne-Shimony-Holt (CHSH) inequality [15]
ICHSH ≡ E(A1B1) + E(A1B2) + E(A2B1)− E(A2B2) ≤ 2, (1.2)
where the terms E(AiBj) are the same as those in Eq. (1.1). Another Bell inequality, equivalent
to the CHSH inequality but easier to test, is the Clauser-Horne (CH) inequality [16]
ICH ≡ P (A1B1) + P (A1B2) + P (A2B1)− P (A2B2)− P (A1)− P (B1) ≤ 0, (1.3)
where P (AiBj) is the probability that both measurements Ai and Bj have outcome +1 and P (Ai)
or P (Bj) is the probability that measurement Ai or Bj has outcome +1. By choosing appropriate
measurements on a Bell state, the CHSH and CH expressions ICHSH and ICH take their maximum
quantum values 2√2 and 1/√2 − 1/2, respectively. To test inequalities (1.2) or (1.3), each of two
parties—Alice and Bob—receives one particle from a common source. Each of them randomly and
independently chooses a local measurement from a set consisting of two measurements, performs
the chosen measurement on their own particle, and records the outcome. This procedure is called
a trial. After a large number of trials, Alice and Bob collect enough data and can estimate ICHSH
or ICH from these data.
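The LR bound of 2 in the CHSH inequality (1.2) can be checked directly: under LR the outcomes at each trial are deterministic functions of the hidden variables, so it suffices to enumerate the 16 deterministic assignments of ±1 outcomes to the four settings. The following sketch (an illustration, not part of the thesis's analysis code) does this, and also evaluates the quantum value for a Bell state, using the fact that for measurements in a common plane with suitably parametrized angles a_i and b_j the correlations take the form E(AiBj) = cos(a_i − b_j):

```python
from itertools import product
import math

# LR bound: enumerate all deterministic local strategies, each fixing
# outcomes a1, a2 for Alice's settings and b1, b2 for Bob's.
lr_values = [a1*b1 + a1*b2 + a2*b1 - a2*b2
             for a1, a2, b1, b2 in product([+1, -1], repeat=4)]
print(max(lr_values))  # 2, the CHSH bound under local realism

# Quantum value for a Bell state: E(Ai, Bj) = cos(a_i - b_j) for
# measurement angles a_i, b_j chosen in a common plane.
a = [0.0, math.pi / 2]           # Alice's two measurement angles
b = [math.pi / 4, -math.pi / 4]  # Bob's two measurement angles
E = lambda i, j: math.cos(a[i] - b[j])
I_chsh = E(0, 0) + E(0, 1) + E(1, 0) - E(1, 1)
print(I_chsh)  # 2*sqrt(2) ≈ 2.828, the maximum quantum value
```

The enumeration makes the geometric picture of Sec. 2.2 concrete: the 16 deterministic strategies are the vertices of the LR polytope, and the CHSH expression attains its LR maximum at a vertex.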
1.3 Overview of the thesis
This thesis tells the story of our quest to quantify the evidence against LR obtained from
experimental data. Since the first test of LR in 1972 [6], it has been a convention to test a Bell
inequality and present the result in terms of the number of experimental standard deviations (SDs)
of violation of this Bell inequality. This is a way of claiming successful violation of LR with small
measurement uncertainties. However, a large number of SDs of violation does not necessarily imply
a small p-value, where a p-value is the probability, if LR holds, of a violation at least as high as that
observed. A small p-value means that the data is significant for rejecting LR. Hence, a reliable test
of LR requires a small p-value. The main work in this thesis is about how to upper bound a p-value
for the hypothesis test of LR. Specifically, we propose a method to compute an asymptotically tight
upper bound of a p-value. The proposed method can be simplified and adapted to quantify the
experimental evidence for rejecting an arbitrary set of hypothetical probability distributions from
which the experimental probability distribution is separated by hyperplanes in the probability space.
For example, the simplified method can be applied to verify entanglement or system dimensionality
with linear witnesses.
If one pretends that the distribution of the violation of a Bell inequality, if LR holds, is
Gaussian with mean less than or equal to 0 and SD equal to the observed one, one can estimate
a p-value from the number of SDs of violation. However, as our work [17] showed, this p-value
estimate is not an upper bound of the corresponding exact p-value, and so it is not valid. A valid
p-value bound was suggested by Gill [18, 19]. But, it is based on a conservative estimate of a tail
probability, and so this bound is not tight.
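For concreteness, the SD-based estimate criticized here is computed as follows (a sketch with an illustrative number of SDs): pretending the violation is Gaussian under LR with mean 0 and the observed SD, a violation of k SDs maps to the one-sided Gaussian tail probability.

```python
import math

def gaussian_pvalue_estimate(num_sds):
    """One-sided tail of a standard normal at num_sds: the naive
    SD-based p-value estimate, which is NOT a valid bound under LR."""
    return 0.5 * math.erfc(num_sds / math.sqrt(2))

print(gaussian_pvalue_estimate(5))  # ≈ 2.87e-7 for a "5-sigma" violation
```

The thesis's point is that this number, however small, is only an estimate under an unjustified Gaussian assumption and need not upper bound the exact p-value.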
We propose a method to compute a tighter p-value bound without relying on a particular Bell
inequality [17]. Specifically, our bound is asymptotically tight with respect to the number of trials, if
the prepared quantum state and measurement settings are stable over time. Hence, the proposed
bound is a standardized measure of success for experimental tests of LR, and one can compare the
strengths of rejecting LR in different experiments based on this measure. The proposed method
works even if the prepared quantum state and measurement settings vary arbitrarily and relevant
LR models depend on previous measurement settings and outcomes. That is, our method works in a
device-independent way and is robust against the memory loophole [20, 21, 22, 23]. Computing the
proposed p-value bound requires only the sequence of measurement settings and outcomes without
using a predetermined Bell inequality. Because the proposed method is not restricted to a single
Bell inequality, it enables wider searches for strong violations of LR. Also, this method adapts to
changes in experimental configuration over the time period for acquiring experimental data. We
implement this method in Matlab and Octave both for monitoring experiments in progress and
analyzing existing data sets.
Geometrically, the probability distributions accessible by LR form a convex polytope L, while
the distributions achievable by quantum mechanics form a bigger convex set Q such that L ⊂ Q.
Each face of the convex polytope L corresponds to a Bell inequality. If the joint probability distribu-
tion q of measurement settings and outcomes given by a quantum state and a set of measurements
violates a Bell inequality, then the distribution q is not in the polytope L, i.e., q /∈ L. However,
determining the violation of a predetermined Bell inequality by the distribution q may not be the
most effective way for showing that q /∈ L. The best Bell inequality that separates the distribu-
tion q as far from the polytope L as possible depends on the relative position of q to L. Our
proposed method [17] first estimates the position of the distribution q relative to L before a trial
using previous measurement results, and then it estimates the best Bell inequality with respect to
q. If the distribution q is stable over time, these estimates are asymptotically optimal and so is
the computed p-value bound. For typical experimental configurations, such as the configuration
for testing the CHSH inequality (1.2), our proposed method [17] works efficiently. However, it is
difficult to implement this method as the configuration parameters, that is, the numbers of parties,
measurement settings, and measurement outcomes, increase.
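The statistical core of this ratio-based idea can be sketched with a toy two-outcome model (the distributions below are hypothetical and chosen only for illustration; the actual method estimates the best Bell function adaptively from previous trials). If each trial contributes a nonnegative ratio whose expectation is at most 1 under every LR model, the running product T satisfies E_LR[T] ≤ 1, and Markov's inequality gives the p-value bound min(1, 1/T):

```python
import math
import random

# Toy binary-outcome model (illustrative only): p_lr is the best LR
# prediction, q the experimental distribution that departs from it.
p_lr = {0: 0.75, 1: 0.25}
q = {0: 0.60, 1: 0.40}

# Under p_lr, the per-trial ratio q(X)/p_lr(X) has expectation
# sum_x p_lr(x) * q(x)/p_lr(x) = sum_x q(x) = 1.
assert abs(sum(p_lr[x] * (q[x] / p_lr[x]) for x in q) - 1.0) < 1e-12

random.seed(1)
log_T = 0.0
for _ in range(1000):  # simulate trial outcomes drawn from q
    x = 0 if random.random() < q[0] else 1
    log_T += math.log(q[x] / p_lr[x])

# Markov's inequality: P_LR(T >= 1/alpha) <= alpha, so min(1, 1/T)
# is a valid p-value bound.
p_bound = min(1.0, math.exp(-log_T))
print(p_bound)
```

Because the ratios form a product with expectation at most 1 under LR regardless of how the per-trial model depends on the past, a bound of this type remains valid against memory effects, which is the robustness claimed for the method.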
The proposed method in [17] can be simplified if we consider the information about the
polytope L available before the test. For example, if we know a set of relevant faces of the polytope
L, i.e., a set of Bell inequalities, we can simplify the proposed method and at the same time make
p-value bounds sufficiently tight. The motivation is as follows: Given a set of Bell inequalities, we
can assume that the best Bell inequality with respect to the experimental probability distribution
can be expressed as a convex combination of the Bell inequalities in the set considered. Whether
or not this assumption actually holds affects only the tightness of a p-value bound computed, but
not its validity. It is easier to estimate the best Bell inequality in this case, and so the complexity
of computing a p-value bound is reduced. The efficiency of the simplified method depends on
the number of Bell inequalities considered, but not the configuration parameters. The bound
depends on the choice and number of Bell inequalities, and generally, more inequalities make the
bound tighter. We find that even trivial Bell inequalities such as those derived from no-signaling
conditions can improve the tightness of the bound. In general, we cannot guarantee that the p-value
bound provided by the simplified method is asymptotically tight.
We also study the quantification of an experimental violation of LR in the asymptotic limit.
The difference between the experimental probability distribution q and all the distributions ac-
cessible by LR can be characterized by a measure called the statistical strength, defined as the
minimum Kullback-Leibler divergence from q to all LR distributions [24]. The reason for using this
measure is that, the p-value for rejecting LR decays exponentially with the number of data points
in the asymptotic limit, and the decay rate cannot be larger than the statistical strength as defined
above [25]. This measure helps to quantify the experimental resources required for a loophole-free
test of LR. In particular, we study the minimum detection efficiency required to achieve any
given statistical strength level S. Our results [26] show that, for the tests with unbalanced Bell
states of the form cos(θ)|00〉+ sin(θ)|11〉, the minimum detection efficiency required for closing the
detection loophole [27] (corresponding to S = 0) is 2/3, consistent with Eberhard’s result [28]. For
the tests with entangled states created from two independent polarized photons passing through
a polarizing beam splitter, the minimum detection efficiencies for closing the detection loophole
are 89.71 % and 91.11 %, using photon counters and photon detectors respectively. These results
are a little better than the minimum efficiencies 90.62 % for counters and 92.23 % for detectors,
as presented in Ref. [2]. The results show that, compared with photon detectors, photon counters
make violations of LR easier to detect.
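The statistical strength defined above can be sketched numerically (using a single hypothetical LR distribution rather than the true minimization over the whole LR polytope), together with the rule of thumb it implies: since the p-value decays at best as exp(−nS), reaching a target p-value α requires roughly n = ln(1/α)/S trials.

```python
import math

def kl_divergence(q, p):
    """Kullback-Leibler divergence D(q || p) in nats."""
    return sum(qx * math.log(qx / px) for qx, px in zip(q, p) if qx > 0)

# Hypothetical distributions for illustration only; the statistical
# strength is the minimum of D(q || p) over all LR distributions p.
q = [0.60, 0.40]      # experimental distribution
p_lr = [0.75, 0.25]   # a candidate LR distribution
S = kl_divergence(q, p_lr)

# The p-value decays no faster than exp(-n*S), so a target p-value
# alpha requires roughly n = ln(1/alpha) / S trials.
alpha = 1e-6
n_required = math.log(1.0 / alpha) / S
print(S, n_required)
```

This is why statistical strength is a useful planning tool: a configuration with twice the strength needs about half as many trials to reach the same significance level.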
1.4 Contents of the thesis
Chapter 2 reviews theoretical works on the tests of LR and related subjects. In addition,
we discuss a systematic and efficient method for deriving various Bell inequalities. Chapter 3
reviews experimental challenges of performing a loophole-free test of LR and recent experimental
progress. Chapter 4 explains our contribution for closing the detection loophole. Chapter 5 studies
the quantification of the statistical evidence against LR through bounding p-values. As mentioned
in Sec. 1.3, our proposed data analysis is asymptotically optimal. In Chapter 6, we simplify the
proposed data analysis and discuss how to extend it to other tests that benefit quantum information
processing. Finally we conclude the thesis in Chapter 7. There are also two appendices, Appendix A
and Appendix B. In Appendix A, we provide the user guide and code information for implementing
our data analysis. The code can be used to both monitor experiments in progress and analyze
existing data sets. In Appendix B, we provide the details of the optimization results in Chapter 4.
Note that Chapters 4–6 are based on our papers [17, 26, 29]. Most of the content of these chapters is
the same as in the published papers, but more results and discussion are added.
Chapter 2
Bell inequalities
2.1 Locality, realism, and Bell inequalities
Starting from this chapter, the upper-case letters A,B, . . . characterizing the measurements
are also used as the random variables from which the measurement outcomes a, b, . . . are sampled.
As is conventional, the upper-case letter X denotes a random variable and the lower-case letter x
denotes the sampled value of this random variable. We apologize that the readers have to figure out
from context whether A,B, . . . mean the settings or the random variables from which the outcomes
a, b, . . . are sampled.
The two assumptions behind a Bell inequality are locality and realism, as mentioned in
Chapter 1. Considering the framework for deriving a Bell inequality helps to clarify what locality
and realism are.
Physical theories, whether classical, quantum-mechanical, or more general, describe a physical
system by a state. Suppose that there is a source of two particles, described by a joint state. The
two particles are going to Alice and Bob, respectively. Alice and Bob perform measurements A
and B chosen randomly on their own particle and get outcomes a and b, respectively. The above
procedure is called a trial. In a theory designed according to local realism (LR), the joint state of
two particles is λ. Given the state λ and measurement settings A and B at a trial, the outcomes a
and b are completely determined, i.e., a = a(λ,A,B) and b = b(λ,A,B). This determinism is the
meaning of realism. Suppose that the two measurement processes are space-like separated in the
space-time diagram. Then, there is no causal effect on the event of observing the outcome a from
the event of choosing the setting B, or on the event of observing the outcome b from the event
of choosing the setting A. Hence, the outcomes are expressible as a = a(λ,A) and b = b(λ,B),
conveying the locality assumption. The combination of locality and realism is called LR, and a
theory designed according to LR is called a local realistic (LR) theory. Note that the concept of
realism is different from the element of reality in the Einstein-Podolsky-Rosen paradox described in
Ref. [30]. In this paradox, the element of reality is assigned to a physical quantity only if, without
any disturbance, an experimenter can predict with certainty the value of this physical quantity.
As is well known, given a quantum state, quantum mechanics can predict only the probability
P (A = a,B = b) of observing the outcomes a and b at a trial after the measurements A and B,
respectively. However, from the above paragraph, we can see that an LR theory can predetermine
the outcome of each measurement at each trial given the associated state λ. Since the LR state λ is
not accessible in an experiment and can be different at different trials, a general LR theory assumes
that there is a probability distribution ρ(λ) over different states λ. The measurement-outcome
probability at a trial according to this general LR theory is
P (A = a′, B = b′) = ∫ ρ(λ) δ_{a′,a(λ,A)} δ_{b′,b(λ,B)} dλ, (2.1)
where the indicator function δx,y = 1 if x = y, otherwise δx,y = 0. As described so far, it is
conceivable that the LR state λ and measurement settings A,B at a trial statistically depend on
each other. Assuming that Alice and Bob have “free will”, as discussed by Bell [31], one can
ensure the statistical independence between λ and the setting choices A and B. With the free-will
assumption, from Eq. (2.1) one can derive various Bell inequalities satisfied by any LR theory, for
example, the Clauser-Horne-Shimony-Holt (CHSH) inequality (1.2) introduced in Chapter 1.
Suppose that each of Alice and Bob has two different measurement settings, A1 and A2 or
B1 and B2. Each measurement has two different outcomes ±1. At each trial, Alice and Bob choose
their own measurement setting randomly and independently. After many trials, they estimate the
correlations E(AiBj) between measurements Ai and Bj with i, j ∈ {1, 2}. For any physical theory,
these correlations satisfy
−1 ≤ E(AiBj) ≤ 1. (2.2)
According to an LR theory ρ(λ),
E(AiBj) = ∫ ρ(λ) ai(λ,Ai) bj(λ,Bj) dλ, (2.3)
where ai(λ,Ai), bj(λ,Bj) = ±1 for any λ. Hence, the CHSH expression
ICHSH = E(A1B1) + E(A1B2) + E(A2B1) − E(A2B2)
= ∫ ρ(λ) [a1(λ,A1)b1(λ,B1) + a1(λ,A1)b2(λ,B2) + a2(λ,A2)b1(λ,B1) − a2(λ,A2)b2(λ,B2)] dλ
= ∫ ρ(λ) {a1(λ,A1) [b1(λ,B1) + b2(λ,B2)] + a2(λ,A2) [b1(λ,B1) − b2(λ,B2)]} dλ. (2.4)
Using the fact that the expression a1(λ,A1)[b1(λ,B1)+b2(λ,B2)]+a2(λ,A2)[b1(λ,B1)−b2(λ,B2)] =
±2, from Eq. (2.4) we get the CHSH inequality −2 ≤ ICHSH ≤ 2.
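The ±2 bound can also be verified by brute force: an LR state λ fixes the four outcomes a1, a2, b1, b2, and enumerating all 16 deterministic assignments shows that each yields ICHSH = ±2, so any mixture ρ(λ) obeys −2 ≤ ICHSH ≤ 2. A minimal Python sketch:

```python
from itertools import product

# Brute-force check of the CHSH bound over all deterministic LR strategies.
# An LR state lambda fixes outcomes a1, a2 for A1, A2 and b1, b2 for B1, B2.
values = []
for a1, a2, b1, b2 in product([-1, 1], repeat=4):
    i_chsh = a1 * b1 + a1 * b2 + a2 * b1 - a2 * b2
    values.append(i_chsh)

# Every deterministic strategy gives I_CHSH = +/-2, so any mixture
# (i.e., any distribution rho(lambda)) stays within [-2, 2].
assert set(values) == {-2, 2}
print(min(values), max(values))  # -2 2
```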
2.2 Geometric interpretation
2.2.1 Special case: The CHSH inequality
Considering the permutation symmetry over the four correlations in the CHSH expres-
sion (2.4), we can get the following inequalities,
−2 ≤E(A1B1) + E(A1B2) + E(A2B1)− E(A2B2) ≤ 2,
−2 ≤E(A1B1) + E(A1B2)− E(A2B1) + E(A2B2) ≤ 2,
−2 ≤E(A1B1)− E(A1B2) + E(A2B1) + E(A2B2) ≤ 2, and
−2 ≤− E(A1B1) + E(A1B2) + E(A2B1) + E(A2B2) ≤ 2. (2.5)
As shown by Fine [32], the vector ~E = (E(A1B1), E(A1B2), E(A2B1), E(A2B2)) in the correlation
space where each dimension denotes the correlation between Ai and Bj can be explained by LR
if and only if ~E satisfies the above four CHSH inequalities (2.5) and the trivial inequalities (2.2).
We denote the set of correlation vectors in the correlation space satisfying the inequalities (2.5)
and (2.2) by L. Because the region L is defined by linear inequalities, it is a convex polytope,
namely a four-dimensional octahedron [33, 34]. Note that a set of points S is convex if and only if
for all points ~x, ~y ∈ S the set also contains the straight-line segment {ω~x + (1 − ω)~y : 0 ≤ ω ≤ 1}
between these points.
Quantum mechanics provides correlation vectors ~E outside of L. As first shown by Cirel’son [35],
the CHSH expression ICHSH in Eq. (2.4) can be as high as 2√2 according to quantum mechanics.
The maximum value can be achieved by the Bell state |ψBell〉 = (|00〉 + |11〉)/√2 with measurements
A1 = σz, A2 = σx, B1 = (σz + σx)/√2, and B2 = (σz − σx)/√2. Here, σz = (1, 0; 0, −1) and
σx = (0, 1; 1, 0) are two Pauli matrices. The region accessible by quantum mechanics in the correlation space was
first fully characterized by Masanes [36]. Masanes showed that a correlation vector ~E is obtainable
within quantum mechanics if and only if it satisfies the nonlinear inequalities
−π ≤ sin−1(E(A1B1)) + sin−1(E(A1B2)) + sin−1(E(A2B1))− sin−1(E(A2B2)) ≤ π
−π ≤ sin−1(E(A1B1)) + sin−1(E(A1B2))− sin−1(E(A2B1)) + sin−1(E(A2B2)) ≤ π
−π ≤ sin−1(E(A1B1))− sin−1(E(A1B2)) + sin−1(E(A2B1)) + sin−1(E(A2B2)) ≤ π
−π ≤ − sin−1(E(A1B1)) + sin−1(E(A1B2)) + sin−1(E(A2B1)) + sin−1(E(A2B2)) ≤ π (2.6)
and the trivial linear inequalities (2.2). The set of correlation vectors fulfilling the inequalities (2.6)
and (2.2) is denoted by Q, which is a convex set, but not a convex polytope, such that L ⊂ Q.
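The Tsirelson value 2√2 quoted above can be reproduced numerically from the stated state and measurement settings by evaluating E(AiBj) = 〈ψBell|Ai ⊗ Bj|ψBell〉; a short sketch, assuming NumPy is available:

```python
import numpy as np

# Quantum value of the CHSH expression for the Bell state
# |psi> = (|00> + |11>)/sqrt(2) with the measurements quoted above.
sz = np.array([[1, 0], [0, -1]], dtype=float)
sx = np.array([[0, 1], [1, 0]], dtype=float)
psi = np.array([1, 0, 0, 1], dtype=float) / np.sqrt(2)

A1, A2 = sz, sx
B1, B2 = (sz + sx) / np.sqrt(2), (sz - sx) / np.sqrt(2)

def corr(A, B):
    # E(AB) = <psi| A (x) B |psi>
    return psi @ np.kron(A, B) @ psi

i_chsh = corr(A1, B1) + corr(A1, B2) + corr(A2, B1) - corr(A2, B2)
print(i_chsh)  # 2*sqrt(2) ~ 2.828, above the LR bound of 2
```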
The violation of LR by quantum mechanics does not reach the maximal violation allowed
by relativistic causality. Relativistic causality forbids sending messages faster
than light, hence it is also called no signaling. In 1994, Popescu and Rohrlich formulated a specific
theory [14], later referred to as the Popescu-Rohrlich (PR) box, which achieves ICHSH = 4, the
algebraic maximum of the CHSH expression. According to the PR box, the joint probability is
given as
P (Ai = ai, Bj = bj) = 1/2 if (ai + 1)/2 ⊕ (bj + 1)/2 = (i − 1)(j − 1), and 0 otherwise, (2.7)
where ⊕ denotes addition modulo 2, i and j are 1 or 2, and ai and bj are −1 or 1. So, E(AiBj) =
(−1)^{(i−1)(j−1)} and ICHSH = 4. By relabeling the outcomes of Bj , i.e., bj ↔ −bj , the PR box can
also achieve the algebraic minimum ICHSH = −4. Probabilities such as those in Eq. (2.7) satisfy
the no-signaling conditions [14]
∑_{bj} P (Ai = ai, Bj = bj) = ∑_{bj′} P (Ai = ai, Bj′ = bj′) ≡ P (Ai = ai) ∀ ai, Ai, Bj , Bj′ ,
∑_{ai} P (Ai = ai, Bj = bj) = ∑_{ai′} P (Ai′ = ai′ , Bj = bj) ≡ P (Bj = bj) ∀ bj , Ai, Ai′ , Bj . (2.8)
Under the no-signaling conditions (2.8), the CHSH expression ICHSH can take any value between
−4 and 4. Considering the permutation symmetry, the set of correlation vectors ~E satisfying the
inequalities
−4 ≤ E(A1B1) + E(A1B2) + E(A2B1)− E(A2B2) ≤ 4
−4 ≤ E(A1B1) + E(A1B2)− E(A2B1) + E(A2B2) ≤ 4
−4 ≤ E(A1B1)− E(A1B2) + E(A2B1) + E(A2B2) ≤ 4
−4 ≤ −E(A1B1) + E(A1B2) + E(A2B1) + E(A2B2) ≤ 4 (2.9)
and the inequalities (2.2) is achievable according to a general theory constrained by only the no-
signaling conditions (2.8). This set of correlation vectors is a four-dimensional cube P containing
the LR polytope L and the quantum convex set Q. The relative relationships between L, Q, and
P are shown as in Fig. 2.1. The left and right plots show the situations in two different subspaces.
2.2.2 The general case
So far we have discussed only a particular case where each of Alice and Bob has two measure-
ments with outcomes ±1. For a general scenario, Alice and Bob can perform mA and mB measure-
ments, respectively. Each measurement Ai or Bj has a certain number, dA or dB, of possible out-
comes. After many trials in an experiment, we can estimate the probabilities P (Ai = ai, Bj = bj)
given the chosen measurement settings Ai and Bj . These probabilities satisfy the no-signaling
Figure 2.1: The regions achievable by LR, quantum mechanics, and all physical theories satisfying
no signaling. Any correlation vector inside the black squares is achievable under the no-signaling
conditions as in Eq. (2.8). The quantum convex set Q and the LR polytope L are bounded by red
curves and blue lines (with black lines in (b)), respectively. (a) is the situation in the subspace
E(A1B1) = 1 and E(A1B2) = 0, while (b) is the situation in the subspace E(A1B1) = E(A1B2) = 1/2.
conditions as in Eq. (2.8) and the following trivial constraints, i.e., the positivity conditions
P (Ai = ai, Bj = bj) ≥ 0 ∀ai, bj , Ai, Bj , (2.10)
and the normalization conditions
∑_{ai,bj} P (Ai = ai, Bj = bj) = 1 ∀ Ai, Bj . (2.11)
The d = mAmBdAdB probabilities can be considered as a point in a d-dimensional space. Since the
sum of these probabilities is equal to mAmB, the total number of joint measurement settings, this
space is not a conventional “probability space”. However, when it is not necessary to differentiate
it from the conventional probability space, we also call this space the probability space. Note that
the number of independent probabilities is less than d, the dimension of the probability space, since
the probabilities P (Ai = ai, Bj = bj) satisfy the no-signaling conditions as in Eq. (2.8) and the
normalization conditions as in Eq (2.11).
An LR state λ specifies a specific outcome for each measurement, so there are a total of
Nλ = dA^{mA} dB^{mB} different LR states. A general LR theory corresponds to a convex combination,
i.e., a probability distribution ρ(λ), over these LR states. Since there are only a finite number of
LR states, the set of LR theories constitutes a convex polytope L in the probability space [7, 8].
Note that a convex polytope can be defined either as the set of all convex combinations of a finite
set of points or as a bounded intersection of halfspaces. (For an introduction to the basic properties
of a convex polytope, see the first three lectures in the book [37].) A convex polytope has many
faces of different dimensions, and a face is the intersection of the polytope with a hyperplane whose
corresponding linear inequality is satisfied by all points inside of the polytope. Hence, the empty
set is a face for every convex polytope. The dimension of a face is the minimum of the dimensions
of linear vector spaces containing the face. A 0-dimensional face is a vertex (or an extreme point) of
a convex polytope, and a face with the maximal dimension is called a facet. For the LR polytope,
each vertex corresponds to an LR state, and any face corresponds to a Bell inequality. Given a
Bell inequality, one can also construct a face of the LR polytope such that this Bell inequality is
an equality on the face. If a Bell inequality is not satisfied by all no-signaling theories and the
inequality’s corresponding face of the LR polytope has a dimension greater than or equal to 1, the
Bell inequality is tight. Otherwise, the Bell inequality is not tight.
In contrast, all no-signaling theories are bounded by only the trivial inequalities as in Eqs. (2.8),
(2.10), and (2.11), which are linear inequalities. Hence, the set of no-signaling theories is a convex
polytope containing L, called the no-signaling polytope P [38]. Finally, the quantum set Q has
an infinite number of extreme points, obtainable by projective measurements on pure quantum
states. Hence, the quantum probabilities form a convex set Q, but not a convex polytope. Since
the probabilities achievable within quantum mechanics can violate Bell inequalities but they satisfy
all inequalities defining the no-signaling polytope P, it follows that the convex set Q is sandwiched
between L and P, as illustrated in Fig. 2.1.
The above discussion is for a bipartite case. But the same argument applies to
multipartite cases, and in general we have the relationship L ⊂ Q ⊂ P.
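Because L is the convex hull of finitely many deterministic strategies, the question "is a given behavior inside L?" can be phrased as a linear-programming feasibility problem. The sketch below is only an illustration, assuming SciPy's `linprog` is available (it is not the method used in this thesis); it tests membership in the 2-setting/2-outcome LR polytope:

```python
import numpy as np
from itertools import product
from scipy.optimize import linprog

# Membership test for the LR polytope L in the 2-setting/2-outcome scenario:
# P(a,b|i,j) is local iff some distribution q over the 16 deterministic
# strategies lambda = (a1, a2, b1, b2) reproduces it.  This is a feasibility LP.
strategies = list(product([0, 1], repeat=4))  # (a1, a2, b1, b2), outcomes 0/1

def is_local(P):
    # P[(a, b, i, j)] with a, b, i, j in {0, 1}; settings indexed from 0.
    A_eq, b_eq = [], []
    for a, b, i, j in product([0, 1], repeat=4):
        row = [1.0 if (s[i] == a and s[2 + j] == b) else 0.0 for s in strategies]
        A_eq.append(row)
        b_eq.append(P[(a, b, i, j)])
    res = linprog(c=np.zeros(16), A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
    return res.status == 0  # 0 = feasible (local), 2 = infeasible (nonlocal)

uniform = {(a, b, i, j): 0.25 for a, b, i, j in product([0, 1], repeat=4)}
pr_box = {(a, b, i, j): (0.5 if a ^ b == i * j else 0.0)
          for a, b, i, j in product([0, 1], repeat=4)}
print(is_local(uniform), is_local(pr_box))  # True False
```

Normalization of q is implied by the 16 equality constraints, since for each setting pair the rows sum to the all-ones vector.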
To help demonstrate the violation of LR in an experimental test, we need to characterize the
LR polytope L in terms of Bell inequalities. Given the number of parties per test, the number of
measurement settings per party, and the number of outcomes per measurement, one can list all LR
states λ, i.e., the vertices of the LR polytope L. Hence, in principle, all the faces of the polytope
L including all tight Bell inequalities can be constructed. However, this task is computationally
hard. Specifically, Pitowsky showed that determining whether or not a set of probabilities is inside
of the LR polytope L is an NP-complete problem [39]. So, it is not surprising that the complete
characterization of the LR polytope exists only in configurations where the numbers of
parties, settings, and outcomes are small [32, 40, 41, 42, 43] or where additional symmetries can
be exploited [33, 44, 45]. In the following section, we present some important and well-studied Bell
inequalities.
2.3 Various Bell inequalities
The simplest and most tested Bell inequality is the CHSH inequality (1.2) of Chapter 1.
It requires only two local parties with two dichotomic measurements (i.e., measurements with two
outcomes) at each party. It is violated by all the pure entangled states of d-level systems (d ≥ 2) [46,
47, 48] (see Sec. 2.4 below for the definition of entanglement). It is the Bell inequality most robust
against the special depolarizing noise (i.e., unpolarized photons) in an experimentally prepared
Bell state of two polarized photons, except for a slightly better Bell inequality with at least 465
dichotomic measurements at each party [49]. In addition, the violation of the CHSH inequality is
robust against detection inefficiencies (i.e., particle losses) in experiments [28, 50, 51, 52, 53]. In
the following, we mostly discuss different generalizations of the CHSH inequality, including but not
restricted to Bell inequalities with more measurement settings [27, 41, 52, 54, 55, 56, 57], with more
measurement outcomes [41, 58, 59], or with more parties [33, 44, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69].
2.3.1 Bell inequalities with many settings
The CHSH inequality (1.2) can be expressed in the following form
P (A2 ≠ B1) + P (B1 ≠ A1) + P (A1 ≠ B2) ≥ P (A2 ≠ B2), (2.12)
where P (Ai ≠ Bj) is the probability that measurements Ai and Bj have different outcomes. The
inequality (2.12) can be derived as follows: We define the function f(X,Y ) = |x− y| on the set of
measurement outcomes {X = x, Y = y} where X,Y ∈ {A1, A2, B1, B2} and x, y = ±1. Then, it is
easy to see that this function satisfies the triangle inequality f(X,Y ) + f(Y,Z) ≥ f(X,Z). So, we
have f(A2, B1) + f(B1, A1) ≥ f(A2, A1) and f(A2, A1) + f(A1, B2) ≥ f(A2, B2). Combining these
two inequalities, we get the inequality
f(A2, B1) + f(B1, A1) + f(A1, B2) ≥ f(A2, B2), (2.13)
which is satisfied by the measurement outcomes assigned by any LR state λ. Since a general LR
theory corresponds to a probability distribution ρ(λ) over all LR states and ∫ ρ(λ) f(X,Y ) dλ =
2P (X ≠ Y ), we get the inequality (2.12) from Eq. (2.13). By the same argument and using the
above triangle inequality 2(m − 1) times, we can extend the inequality (2.12) to the case where
each of Alice and Bob has m measurement settings:
P (Am ≠ B1) + P (B1 ≠ A1) + P (A1 ≠ B2) + ... + P (Bm−1 ≠ Am−1) + P (Am−1 ≠ Bm) ≥ P (Am ≠ Bm),
(2.14)
which is the same as
Ichained ≡ E(AmB1) + E(B1A1) + ...+ E(Am−1Bm)− E(AmBm) ≤ (2m− 2). (2.15)
The inequality (2.15) is the chained CHSH inequality as presented in Refs. [27, 54]. The chained
CHSH inequality can be violated by quantum mechanics with a value for Ichained as high as
2m cos(π/(2m)) by choosing appropriate measurement settings on a Bell state [70].
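The quantum value 2m cos(π/(2m)) can be reproduced with the standard construction in which the 2m measurement directions lie evenly spaced in the x–z plane, so that E(θa, θb) = cos(θa − θb) for a Bell state; a small sketch comparing it with the LR bound 2m − 2:

```python
import math

# Quantum value of the chained CHSH expression for a Bell state, using
# E(theta_a, theta_b) = cos(theta_a - theta_b) for measurements in the x-z plane.
def chained_value(m):
    # Place the 2m measurement directions evenly: each adjacent pair in the
    # chain differs by pi/(2m), so each of the 2m-1 chain terms contributes
    # cos(pi/(2m)), while E(Am, Bm) = cos((2m-1)pi/(2m)) = -cos(pi/(2m)).
    step = math.pi / (2 * m)
    chain = [(k * step, (k + 1) * step) for k in range(2 * m - 1)]
    value = sum(math.cos(a - b) for a, b in chain)
    value -= math.cos(0 - (2 * m - 1) * step)  # the subtracted E(Am, Bm) term
    return value

for m in range(2, 6):
    # quantum value 2m*cos(pi/(2m)) versus the LR bound 2m - 2
    print(m, round(chained_value(m), 4), 2 * m - 2)
```

For m = 2 this recovers 2√2 ≈ 2.828 against the CHSH bound 2; the gap to the LR bound shrinks as m grows, matching the higher efficiency requirements noted below.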
The chained CHSH inequality has interesting applications in situations where the CHSH in-
equality is inadequate. For example, the use of the chained CHSH inequality reduces the number of
trials required to reject LR at a specified confidence level in experiments without noise or detection
inefficiency [71]. Moreover, the use of the chained CHSH inequality with a large m improves the
security of quantum key distribution [11, 72], and shows that, if quantum correlations are expressed
as mixtures of local correlations and general (not necessarily quantum) correlations, the coefficients
of local correlations approach zero [72, 73]. In practice, however, the chained CHSH inequal-
ity (2.15) with m > 2 requires higher detection efficiency than the CHSH inequality (1.2) for closing
the detection loophole [74] (see Sec. 3.3 of Chapter 3 for an introduction of the detection loophole).
There are also other generalizations of the CHSH inequality for many settings [55, 56]. How-
ever, these Bell inequalities [27, 54, 55, 56] are generally not tight. In 2001, Pitowsky and Svozil
suggested a general method for obtaining all tight Bell inequalities for a given experimental con-
figuration [40]. By this method, many tight Bell inequalities are found [40, 41, 52, 57]. In the
following, we present some known tight Bell inequalities.
As discussed in Sec. 2.2, when each of Alice and Bob has two dichotomic measurements,
there is only one type of tight Bell inequality, i.e., the CHSH inequality (1.2) (or the Clauser-Horne
(CH) inequality (1.3)). When each party has three dichotomic measurements, besides the CHSH
inequality there is one more type of tight Bell inequality [40, 41]
I32 ≡P (A1B1) + P (A1B2) + P (A1B3) + P (A2B1) + P (A2B2)− P (A2B3)
+ P (A3B1)− P (A3B2)− P (A1)− 2P (B1)− P (B2) ≤ 0, (2.16)
where P (AiBj) is the probability that both measurements Ai and Bj have outcome +1 and P (Ai)
or P (Bj) is the probability that measurement Ai or Bj has outcome +1. Note that, here and below
two Bell inequalities are called of the same type if and only if they can be transformed to each
other by relabeling the parties, the settings, or the outcomes and by considering the no-signaling
conditions as in Eq. (2.8) and the normalization conditions as in Eq. (2.11). Also, the subscripts
m and d in the notation Imd mean that each of Alice and Bob has m different measurement settings
and each measurement has d possible outcomes.
There are two-qubit states that violate the inequality (2.16) but not a CHSH inequality [41].
(A qubit is a two-level quantum-mechanical system.) This new inequality also illustrates a sharing
of bipartite violation of LR between three qubits [41]. That is, given a state ρ123 of three qubits 1, 2
and 3, the corresponding states of the first two qubits and the last two qubits, ρ12 and ρ23, can show
violations of the inequality (2.16) at the same time by choosing three appropriate measurements on
each qubit 1, 2 or 3. However, such a phenomenon cannot be observed using a CHSH inequality.
In practice, the Bell inequality I32 ≤ 0 can tolerate more detection inefficiency than a CHSH
inequality. Specifically, Ref. [50] considered a situation where one party’s detector is perfect and
showed that the inequality I32 ≤ 0 can be violated by measurements on a two-qubit state when
the other party’s detector has an efficiency higher than 43%, which is lower than the minimum
detection efficiency 50% required for violating a CHSH inequality in the same situation.
For the case where both Alice and Bob have four dichotomic measurements, there are many
new types of tight Bell inequalities. Only a partial list of 155 inequivalent tight Bell inequalities
was found [52, 53]. Some of these inequalities tolerate more detection inefficiency than a CHSH
inequality [1, 52, 53]. Specifically, using one of these inequalities, the minimum detection efficiency
required at each party for closing the detection loophole is as low as 61.8% (using an entangled
state of two four-level quantum systems) [1], which is lower than the minimum efficiency 66.7%
when using a CHSH inequality and a two-qubit state [28].
For the case where there are m (m > 4) dichotomic measurements at each of Alice and Bob,
not many Bell inequalities have been constructed. Ref. [41] constructed one type of Bell inequality, Im2 ≤ 0,
but this Bell inequality may not be tight. Using the Bell inequality Im2 ≤ 0, it is shown that, if
one party’s detector is perfect and the other party’s detector has an efficiency higher than 1/m,
the detection loophole can be closed using an entangled state of two m-level quantum systems [1].
2.3.2 Bell inequalities with many outcomes
For two d-outcome measurements A1 and A2 (or B1 and B2) at Alice (or Bob), we label
the measurement outcomes by 0, 1, ..., or d − 1. Let the notation x mod d denote the nonnegative
remainder of the division of x by d, and define the function f(X,Y ) = (x− y) mod d on the set of
measurement outcomes {X = x, Y = y} where X,Y ∈ {A1, A2, B1, B2} and x, y ∈ {0, 1, ..., d− 1}.
It is easy to show that this function satisfies the triangle inequality f(X,Y ) + f(Y,Z) ≥ f(X,Z).
Using this triangle inequality twice, we can see that the measurement outcomes according to any
LR state satisfy the inequality f(A2, B1) + f(B1, A1) + f(A1, B2) ≥ f(A2, B2). Hence, we get a
Bell inequality
〈f(A2, B1)〉+ 〈f(B1, A1)〉+ 〈f(A1, B2)〉 ≥ 〈f(A2, B2)〉 . (2.17)
Eq. (2.17) is a simplified expression of the original Collins-Gisin-Linden-Massar-Popescu (CGLMP)
inequality [58], equivalent to the expression in Ref. [75]. By the same argument and using the above
triangle inequality 2(m− 1) times, we get a chained form of the CGLMP inequality [72]
〈f(Am, B1)〉+〈f(B1, A1)〉+〈f(A1, B2)〉+〈f(B2, A2)〉+...+〈f(Am−1, Bm)〉 ≥ 〈f(Am, Bm)〉 . (2.18)
When d = 2, Eqs. (2.17) and (2.18) reduce to the CHSH inequality (2.12) and the chained CHSH
inequality (2.14), respectively.
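Since every LR theory is a mixture of deterministic outcome assignments, the CGLMP inequality (2.17) can be checked by brute force over all d^4 assignments; a short sketch:

```python
from itertools import product

# Brute-force check of the CGLMP inequality (2.17) over all deterministic
# LR assignments: with f(x, y) = (x - y) mod d, every assignment satisfies
# f(a2,b1) + f(b1,a1) + f(a1,b2) >= f(a2,b2), so every LR mixture does too.
def f(x, y, d):
    return (x - y) % d

for d in [2, 3, 4, 5]:
    ok = all(
        f(a2, b1, d) + f(b1, a1, d) + f(a1, b2, d) >= f(a2, b2, d)
        for a1, a2, b1, b2 in product(range(d), repeat=4)
    )
    print(d, ok)  # True for every d
```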
One interesting property of the CGLMP inequality (2.17) is that it is a tight Bell inequality for
any number d of outcomes [43]. Particularly, when d ≤ 3 all tight Bell inequalities are the CGLMP
inequalities with the parameter d = 2 or 3 in the definition of the function f [43]. Also, when
d > 2 a CGLMP inequality is more robust against noise and detection inefficiency in an experiment
(due to the higher dimension of the tested quantum system) than a CHSH inequality [58, 76]. In
addition, unlike a CHSH inequality, when d > 2 the maximal violation of a CGLMP inequality
cannot be achieved with the maximally entangled state [4, 77, 78].
2.3.3 Bell inequalities with many parties
For two measurements with outcomes ±1 at each of n parties, the correlation functions
E(O1,k1 O2,k2 ... On,kn) for measurements Oi,ki at the ith party, where ki = 1, 2, can be explained by
LR theories if and only if the following set of Bell inequalities [33, 44]
|∑_{s1,...,sn=±1} S(s1, ..., sn) ∑_{k1,...,kn=1,2} s1^{k1−1} ... sn^{kn−1} E(O1,k1 O2,k2 ... On,kn)| ≤ 2^n (2.19)
is satisfied, where S(s1, ..., sn) stands for an arbitrary function of the indices s1, ..., sn ∈ {−1, 1}
such that its range is the set {−1, 1}. The Bell inequalities in Eq. (2.19) follow from the algebraic
identity
∑_{s1,...,sn=±1} S(s1, ..., sn) ∏_{i=1}^{n} [oi,1(λ,Oi,1) + si oi,2(λ,Oi,2)] = ±2^n,
where oi,ki(λ,Oi,ki) is the predetermined outcome for measurement Oi,ki given an LR state λ.
Since there are 2^{2^n} different functions S(s1, ..., sn), Eq. (2.19) represents a set of 2^{2^n} Bell
inequalities, which includes all tight Bell inequalities in the correlation space [33]. Many of these
inequalities are trivial. For example, when the choice for the function is S(s1, ..., sn) = 1 for all
arguments, one gets the condition E(O1,1O2,1...On,1) ≤ 1. Specific other choices give nontrivial
inequalities. For example, when S(s1, ..., sn) = √2 cos[(s1 + ... + sn − n − 1)π/4], which is always
±1 no matter what the values of s1, ..., sn are, one recovers the Mermin-Ardehali-Belinskii-Klyshko
(MABK) inequalities [60, 61, 62], in the form derived by Belinskii and Klyshko [62]. Specifically, for
n = 2, the CHSH inequality (1.2) follows.
The maximal violations of Bell inequalities in Eq. (2.19) are attained by the generalized
Greenberger-Horne-Zeilinger (GHZ) state |ψGHZ〉 = (|0〉1...|0〉n + |1〉1...|1〉n)/√2 with a choice of
measurements depending on the inequality under consideration [33]. Among these Bell inequalities,
the MABK inequality can be violated by the largest amount, and this maximal violation is 2^n √(2^{n−1}).
Considering the special depolarizing noise in an experiment, the experimental state has the form
ρ = V |ψGHZ〉〈ψGHZ|+ (1− V )ρnoise where ρnoise is the completely mixed state. Then, the MABK
inequality is violated if and only if V > 1/√(2^{n−1}) [44]. In addition, unlike the case where n is
even, there are pure and fully entangled states of an n-partite system that do not violate any Bell
inequality in Eq. (2.19) when n is odd [79].
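For n = 3, the MABK family contains Mermin's inequality; written in the common normalization with LR bound 2 (rather than the 2^n normalization of Eq. (2.19)), the GHZ state reaches the algebraic maximum 4, a factor √(2^{n−1}) = 2 above the LR bound. A numerical sketch, assuming NumPy is available:

```python
import numpy as np

# Mermin's inequality (the n = 3 MABK member) in the normalization with
# LR bound 2: E(XXX) - E(XYY) - E(YXY) - E(YYX) <= 2 for any LR theory.
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])
ghz = np.zeros(8, dtype=complex)
ghz[0] = ghz[7] = 1 / np.sqrt(2)  # (|000> + |111>)/sqrt(2)

def E(ops):
    # Expectation <GHZ| O1 (x) O2 (x) O3 |GHZ>
    M = np.kron(np.kron(ops[0], ops[1]), ops[2])
    return (ghz.conj() @ M @ ghz).real

mermin = E([X, X, X]) - E([X, Y, Y]) - E([Y, X, Y]) - E([Y, Y, X])
print(mermin)  # 4.0, a factor sqrt(2^(n-1)) = 2 above the LR bound of 2
```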
The above discussion is for the case where each of n parties has two dichotomic measurements.
For a general case where each party has more than two measurements or two measurements with
more than two outcomes, only a few Bell inequalities have been constructed [64, 65, 66, 67, 68, 69].
2.3.4 Derivation of Bell inequalities
There are many different types of tight Bell inequalities. These inequalities correspond to
the faces of the LR polytopes associated with different numbers of parties, settings, and outcomes.
Since it is a hard problem to find all faces of an LR polytope, deriving Bell inequalities from the
characterization of an LR polytope is not practical. Are there other guiding principles for
deriving Bell inequalities? In the following, we discuss two such principles.
First, an LR model corresponds to a probability distribution over all LR states λ, where given
the state λ the outcomes of all possible measurements are known. That is, according to an LR
model, there is a probability distribution over the outcomes of all measurements of all parties [32].
For example, in the configuration where there are two dichotomic measurements at each of two
parties, A1, A2 at Alice and B1, B2 at Bob, the existence of an LR model is equivalent to the
existence of the joint probability distribution P (A1 = a1, A2 = a2, B1 = b1, B2 = b2) where ai and
bj are the outcomes of the corresponding measurements Ai and Bj , i, j = 1, 2. However, since the
measurements A1 and A2 at Alice (or B1 and B2 at Bob) are not compatible with each other, this
joint probability distribution is not accessible. In an experiment, we can observe only marginal
distributions P (Ai = ai, Bj = bj), i, j = 1, 2. If there is a joint distribution consistent with the
experimental marginals, these marginals satisfy linear inequalities, i.e., Bell inequalities. Hence, we
can think of a Bell inequality as a consistency constraint on marginal distributions.
In general, it is difficult to test whether or not a set of marginal distributions is consistent.
However, some necessary conditions for consistency may be easy to characterize. Particularly, if we
associate each measurement context corresponding to a marginal distribution with a logical formula,
then it is possible to derive a Bell inequality from a logical consistency constraint on these formulas.
For example, for the above configuration we associate each measurement context (Ai, Bj) with the
logical formula Ai ≠ Bj , where Ai ≠ Bj means that the outcomes of measurements Ai and Bj are
different. Classical logic shows that, if A2 ≠ B2 is true, then (A1 ≠ B1) ∨ (A1 ≠ B2) ∨ (A2 ≠ B1) is
also true, where the notation ∨ is the logical or operator. As a result,
P (A2 ≠ B2) ≤ P ((A1 ≠ B1) ∨ (A1 ≠ B2) ∨ (A2 ≠ B1))
≤ P (A1 ≠ B1) + P (A1 ≠ B2) + P (A2 ≠ B1),
which is the CHSH inequality (2.12). To find a violation of such a logical Bell inequality, we only
need to find a situation where the logical formulas are jointly contradictory. Recently, Abramsky
and Hardy [80] showed that, for the configuration where all measurements of each party have 2p
outcomes, any Bell inequality can be derived from a logical consistency constraint (although such
a constraint may be hard to find in practice). Their proof works for any finite number of parties
and any finite number of measurement settings per party.
The second principle comes from Secs. 2.3.1 and 2.3.2, where we saw that both the CHSH
inequality (2.12) and the CGLMP inequality (2.17) can be derived from a triangle inequality
f(X,Y ) + f(Y,Z) ≥ f(X,Z), satisfied by the function f(X,Y ) = (x − y) mod d on the set of
measurement outcomes {X = x, Y = y} where x, y ∈ {0, 1, ..., d− 1}. To derive the CHSH inequal-
ity or the CGLMP inequality, we use the above triangle inequality twice. For a general configuration
where each party has more than two measurement settings, we can repeat the use of the triangle
inequality several times in order to derive new types of Bell inequalities. For example, when Alice
and Bob have m d-outcome measurements A1, A2, ..., Am and B1, B2, ..., Bm, respectively, using the
triangle inequality 2(m− 1) times we get the chained CGLMP inequality (2.18).
Also, we can consider other functions satisfying the triangle inequality. For example, the
function f(X,Y ) = max{0, x − y}, defined on the set of measurement outcomes {X = x, Y = y}
where X,Y ∈ {A1, A2, B1, B2} and x, y ∈ {0, 1, ..., d−1}, satisfies the triangle inequality f(X,Y )+
f(Y, Z) ≥ f(X,Z). Using this triangle inequality twice, we get that, according to any LR model,
〈f(A2, B1)〉+ 〈f(B1, A1)〉+ 〈f(A1, B2)〉 ≥ 〈f(A2, B2)〉. (2.20)
When d = 2, the above inequality (2.20) reduces to
P (A2 = 1, B1 = 0) + P (B1 = 1, A1 = 0) + P (A1 = 1, B2 = 0) ≥ P (A2 = 1, B2 = 0). (2.21)
Using the no-signaling conditions in Eq. (2.8), we get
P (B1 = 1, A1 = 0) = P (A1 = 0)− P (A1 = 0, B1 = 0),
P (A2 = 1, B1 = 0) = P (B1 = 0)− P (A2 = 0, B1 = 0), and
P (A1 = 1, B2 = 0)− P (A2 = 1, B2 = 0) = P (A2 = 0, B2 = 0)− P (A1 = 0, B2 = 0). (2.22)
One can see that the terms on the left-hand side of Eq. (2.22) appear in Eq. (2.21). Replacing these
terms in Eq. (2.21) by the terms on the right-hand side of Eq. (2.22), the CH inequality (1.3) follows.
In this sense, the inequality (2.20) is the generalized CH inequality for a high-dimensional bipartite
system. However, it is an open problem whether or not the generalized CH inequality (2.20)
is tight for any d, i.e., whether or not the Bell inequality (2.20) corresponds to a face of the
associated LR polytope. Also, the relationship between the generalized CH inequality and the
CGLMP inequality (2.17) deserves further investigation.
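As with the CGLMP inequality, the generalized CH inequality (2.20) can be checked over all deterministic assignments for small d; a short sketch:

```python
from itertools import product

# Brute-force check of the generalized CH inequality (2.20), now with
# f(x, y) = max(0, x - y), which also satisfies the triangle inequality.
def f(x, y):
    return max(0, x - y)

for d in [2, 3, 4, 5]:
    ok = all(
        f(a2, b1) + f(b1, a1) + f(a1, b2) >= f(a2, b2)
        for a1, a2, b1, b2 in product(range(d), repeat=4)
    )
    print(d, ok)  # True for every d
```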
Furthermore, using the triangle inequality satisfied by f(X,Y ) = max{0, x−y}, we can show
that, no matter what predetermined outcomes an LR state assigns to the measurements
A1, A2, A3 at Alice and B1, B2, B3 at Bob, the inequality
f(A1, B3) + f(B2, A1) + f(B1, A3) + f(A2, B1) + f(A1, B1) + f(A2, B2) ≥ f(A2, B3) + f(B2, A3), (2.23)
is satisfied. To show the above inequality, we need to consider three different cases: (i) when
f(A2, B3) = 0, we can get f(B2, A1) + f(A1, B1) + f(B1, A3) ≥ f(B2, A3) using the triangle
inequality twice. Hence, the inequality (2.23) follows. (ii) when f(B2, A3) = 0, as in case (i), we
can show the inequality (2.23) using the triangle inequality twice. (iii) when f(A2, B3) = k > 0 and
f(B2, A3) = l > 0, that is, A2 = B3 + k and B2 = A3 + l, the left-hand side of the inequality (2.23)
becomes
f(A1, B3) + f(B2, A1) + f(B1, A3) + f(A2, B1) + f(A1, B1) + f(A2, B2)
= f(A1, A2 − k) + f(A3 + l, A1) + f(B1, A3) + f(A2, B1) + f(A1, B1) + f(A2, B2)
≥ f(A3 + l, A2 − k) + f(A2, A3)
≥ k + l,
where the first inequality applies the triangle inequality to f(A3 + l, A1) + f(A1, A2 − k) and to f(A2, B1) + f(B1, A3) and drops the nonnegative terms f(A1, B1) and f(A2, B2), and the last inequality can be verified by distinguishing the cases A2 ≥ A3 + k + l, A3 ≤ A2 < A3 + k + l, and A2 < A3.
Hence, the inequality (2.23) is always satisfied by an LR state. Therefore, we get the following Bell
inequality
〈f(A1, B3)〉+ 〈f(B2, A1)〉+ 〈f(B1, A3)〉+ 〈f(A2, B1)〉+ 〈f(A1, B1)〉+ 〈f(A2, B2)〉
≥ 〈f(A2, B3)〉+ 〈f(B2, A3)〉. (2.24)
When d = 2, the above inequality (2.24) reduces to
P (A1 = 1, B3 = 0) + P (A1 = 0, B2 = 1) + P (A3 = 0, B1 = 1) + P (A2 = 1, B1 = 0)
+ P (A1 = 1, B1 = 0) + P (A2 = 1, B2 = 0) ≥ P (A2 = 1, B3 = 0) + P (A3 = 0, B2 = 1). (2.25)
Using the no-signaling conditions in Eq. (2.8), we get
P (A1 = 1, B3 = 0)− P (A2 = 1, B3 = 0) = P (A2 = 0, B3 = 0)− P (A1 = 0, B3 = 0),
P (A3 = 0, B1 = 1)− P (A3 = 0, B2 = 1) = P (A3 = 0, B2 = 0)− P (A3 = 0, B1 = 0),
P (A1 = 1, B1 = 0) = P (B1 = 0)− P (A1 = 0, B1 = 0),
P (A1 = 0, B2 = 1) = P (A1 = 0)− P (A1 = 0, B2 = 0),
P (A2 = 1, B1 = 0) = P (B1 = 0)− P (A2 = 0, B1 = 0), and
P (A2 = 1, B2 = 0) = P (B2 = 0)− P (A2 = 0, B2 = 0). (2.26)
One can see that the terms on the left-hand side of Eq. (2.26) appear in Eq. (2.25). Replacing these
terms in Eq. (2.25) by the corresponding terms on the right-hand side of Eq. (2.26), the I32 inequality (2.16) follows.
In this sense, the inequality (2.24) is a Bell inequality generalizing the I32 inequality (2.16) to a
high-dimensional bipartite system. However, whether or not the generalized Bell inequality (2.24)
is tight for any d > 2 is unknown.
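Inequality (2.24) can be checked in the same way, by enumerating all deterministic assignments of outcomes in {0, ..., d − 1} to the six measurements. A hedged Python sketch (our own check, not from the thesis):

```python
from itertools import product

def f(x, y):
    return max(0, x - y)   # satisfies the triangle inequality

def ineq_224_holds(d):
    # Enumerate all deterministic LR assignments to A1, A2, A3, B1, B2, B3.
    for a1, a2, a3, b1, b2, b3 in product(range(d), repeat=6):
        lhs = (f(a1, b3) + f(b2, a1) + f(b1, a3)
               + f(a2, b1) + f(a1, b1) + f(a2, b2))
        if lhs < f(a2, b3) + f(b2, a3):
            return False
    return True

print(all(ineq_224_holds(d) for d in range(2, 5)))  # -> True
```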
So far, only bipartite Bell inequalities have been derived from triangle inequalities. For
multipartite cases, from triangle inequalities we can also derive constraints on the predictions
according to LR. Suppose that there are n (n ≥ 3) parties, where each party i has two measurements
Oi and O′i with outcomes 0 or 1. To derive a constraint on all LR predictions, we use the triangle
inequality f(X,Y ) + f(Y,Z) ≥ f(X,Z), satisfied by the function f(X,Y ) = |x− y| defined on the
set of measurement outcomes {X = x, Y = y} where X and Y are measurements with outcomes 0
or 1. For the bipartite case, one can see that the expressions
S2 ≡ (1/2)[f(O′1, O2) + f(O2, O1) + f(O1, O′2) − f(O′1, O′2)]
and
S′2 ≡ (1/2)[f(O1, O′2) + f(O′2, O′1) + f(O′1, O2) − f(O1, O2)]
can be only 0 or 1 according to an LR state. We can think of S2 and S′2 as measurements on the first
two subsystems with outcomes 0 or 1. Then, we can use the triangle inequality f(X,Y )+f(Y,Z) ≥
f(X,Z) twice on the measurements S2, S′2 of the first two subsystems and the measurements O3
and O′3 of the third subsystem, in order to get a constraint on the LR predictions of the three
subsystems
〈f(S′2, O3)〉+ 〈f(O3, S2)〉+ 〈f(S2, O′3)〉 ≥ 〈f(S′2, O′3)〉. (2.27)
Note that this constraint is not expressed in terms of directly measurable quantities in an experiment. We conjecture that some Bell inequalities can be derived from Eq. (2.27). In general, for
the n-partite case, let
Sn ≡ (1/2)[f(S′n−1, On) + f(On, Sn−1) + f(Sn−1, O′n) − f(S′n−1, O′n)], (2.28)
where S′n−1 is obtained from Sn−1 by exchanging Oi and O′i for all i < n. By induction we can
show that the expressions S2, ..., Sn−1 and S′2, ..., S′n−1 can be only 0 or 1 according to an LR state.
Hence, the expectation of Sn according to LR cannot be less than 0.
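The base case and the first induction step can be verified by brute force. The sketch below (our own check; S is a hypothetical helper implementing Eq. (2.28) recursively) confirms that S2, S′2, and S3 take only the values 0 or 1 under every deterministic assignment of bits to the Oi and O′i:

```python
from itertools import product

def f(x, y):
    # f(X, Y) = |x - y|, which satisfies the triangle inequality.
    return abs(x - y)

def S(n, o, op):
    # S_n of Eq. (2.28) for a deterministic LR assignment, where o[i] and
    # op[i] are the outcomes assigned to O_{i+1} and O'_{i+1}.
    if n == 2:
        return (f(op[0], o[1]) + f(o[1], o[0])
                + f(o[0], op[1]) - f(op[0], op[1])) / 2
    s = S(n - 1, o, op)        # S_{n-1}
    sp = S(n - 1, op, o)       # S'_{n-1}: O_i and O'_i exchanged for i < n
    return (f(sp, o[n - 1]) + f(o[n - 1], s)
            + f(s, op[n - 1]) - f(sp, op[n - 1])) / 2

# Over all 2^6 deterministic assignments, S_2, S'_2, and S_3 take only the
# values 0 or 1, so <S_n> >= 0 for every LR model (here n = 3).
ok = all(
    S(2, (o1, o2), (p1, p2)) in (0, 1) and
    S(3, (o1, o2, o3), (p1, p2, p3)) in (0, 1)
    for o1, p1, o2, p2, o3, p3 in product((0, 1), repeat=6)
)
print(ok)   # -> True
```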
From the above, we can see that the triangle inequality is a powerful tool for deriving various
Bell inequalities. Whether or not all Bell inequalities can be derived from triangle inequalities is
an interesting open problem and deserves further investigation.
2.4 Bell inequality and entanglement
Let us first introduce several concepts. We call a state of a quantum system pure if this
state corresponds to a vector |ψ〉 in the Hilbert space, that is, the state space of the system. In
a general situation one does not know which pure state a quantum system is in; it is only known
that the system is, with some probability pi, in one of the pure states |ψi〉. In this situation, the
state of the system is described by a density matrix
ρ = ∑i pi|ψi〉〈ψi|, with ∑i pi = 1 and pi ≥ 0. (2.29)
For a bipartite system, if the two subsystems 1 and 2 are in the pure states |ψ〉1 and |φ〉2,
respectively, the state of the composite system is |ϕ〉12 = |ψ〉1⊗ |φ〉2, the tensor product of the two
subsystems’ states. This state |ϕ〉12 is called a product state. In a general situation, the state of a
bipartite system is described by a density matrix ρ12. If the state ρ12 can be written as a convex
combination of product states
ρ12 = ∑ij pij |ψi〉1〈ψi| ⊗ |φj〉2〈φj |, with ∑ij pij = 1 and pij ≥ 0, (2.30)
the state ρ12 is called separable. Otherwise, the state is called entangled. So, for a bipartite system
its state is either separable or entangled. However, for a multipartite system, its state space has
a richer structure. We call a state ρ12...n of an n-partite system fully separable, if ρ12...n can be
written as a convex combination of product states
ρ12...n = ∑i1...in pi1...in |ψi1〉1〈ψi1 | ⊗ |φi2〉2〈φi2 | ⊗ . . . ⊗ |ϕin〉n〈ϕin |, with ∑i1...in pi1...in = 1 and pi1...in ≥ 0. (2.31)
Otherwise, the state ρ12...n is called entangled; however, it can be either fully entangled or partially
entangled and partially separable. A state ρ12...n of an n-partite system is called fully entangled
if it is not biseparable, that is, if it cannot be prepared by mixing states that are separable with
respect to some bipartitions of the n subsystems. For example, a tripartite state ρ123 is biseparable
if it can be written as a convex combination
ρ123 = ∑ij p(1)ij |ψi〉1〈ψi| ⊗ |φj〉23〈φj | + ∑ij p(2)ij |ψ′i〉2〈ψ′i| ⊗ |φ′j〉13〈φ′j | + ∑ij p(3)ij |ψ′′i 〉3〈ψ′′i | ⊗ |φ′′j 〉12〈φ′′j |,
with ∑ijl p(l)ij = 1 and p(l)ij ≥ 0. (2.32)
Here, |φ′′j 〉12, |φj〉23, or |φ′j〉13 is a pure state of the subsystem 12, 23, or 13, respectively. Note
that these pure states can be entangled states of their corresponding subsystems. If a state of
a composite system is both pure and entangled, then it is called a pure entangled state. If a
state is entangled but not pure, it is called a mixed entangled state. For more discussions about
entanglement including its classification and quantification, see the review papers [10, 81].
Since a separable state can be decomposed into a convex combination of product states, it can
be shown that all local measurements on the subsystems of a composite system in a separable
state admit LR descriptions. Hence, the violation of a Bell inequality signifies that
the state is entangled, as first pointed out by Terhal [82]. For example, the violation of the CHSH
inequality (1.2) can detect all pure entangled two-qubit states [46] and some mixed entangled
two-qubit states [83]. Moreover, entanglement detection based on a Bell-inequality violation is
device-independent. That is, one can infer the presence of entanglement even when the tested
quantum state and the measurement settings chosen in an experiment are unknown.
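As an illustration for pure states, the CHSH value of the state cos(θ)|00〉 + sin(θ)|11〉 can be computed directly. The sketch below is our own parametrization (not from the thesis): it assumes measurements in the x–z plane, for which the quantum correlation is E(α, β) = cos α cos β + sin(2θ) sin α sin β, and uses Bob's angle β = arctan(sin 2θ), which is optimal for Alice's settings α ∈ {0, π/2}:

```python
import math

def chsh_value(theta):
    # Quantum CHSH value for |psi> = cos(theta)|00> + sin(theta)|11> with
    # measurements in the x-z plane:
    #   E(a, b) = cos(a) cos(b) + sin(2 theta) sin(a) sin(b)   (assumed model)
    c = math.sin(2 * theta)
    E = lambda a, b: math.cos(a) * math.cos(b) + c * math.sin(a) * math.sin(b)
    beta = math.atan(c)   # optimal Bob angle for Alice's settings 0 and pi/2
    return E(0, beta) + E(0, -beta) + E(math.pi / 2, beta) - E(math.pi / 2, -beta)

# Every theta in (0, pi/2) gives a value above the LR bound of 2;
# theta = pi/4 (a Bell state) gives 2*sqrt(2).
for theta in (0.1, 0.4, math.pi / 4):
    print(chsh_value(theta) > 2)  # -> True each time
```

Analytically this evaluates to 2√(1 + sin²2θ), which exceeds the LR bound 2 for every θ in (0, π/2), consistent with the claim of Ref. [46].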
In general, one can detect entanglement based on an entanglement witness. Such a witness
is an observable W with the properties Tr(Wρent) < 0 for at least one entangled state ρent and
Tr(Wρsep) ≥ 0 for all separable states ρsep [82, 84]. Note that an entanglement witness corresponds
to a linear inequality constraining the probability distribution of experimental results; by
subtracting suitable nonlinear terms, the modified expression can detect more entangled states than
the original witness [85].
For each entangled state ρent, there exists an entanglement witness detecting it [84]. For a
bipartite 2×2 or 2×3 dimensional quantum system, if a state ρ12 is entangled, the partial transpose
of ρ12 on the subsystem 1 or 2 has a negative eigenvalue. Suppose that the state of a bipartite
system is given as
ρ12 = ∑i,j,l,k Mil,jk|ψi〉1〈ψj | ⊗ |φl〉2〈φk|, (2.33)
where Mil,jk is a complex number, and |ψi〉1 and |φl〉2 are state-space bases of subsystems 1 and 2,
respectively. The partial transpose of ρ12 on a subsystem, for example on subsystem 2, is defined
as
Γ(2)(ρ12) = ∑i,j,l,k Mil,jk|ψi〉1〈ψj | ⊗ (|φl〉2〈φk|)T = ∑i,j,l,k Mil,jk|ψi〉1〈ψj | ⊗ |φk〉2〈φl|. (2.34)
From this definition, it is easy to see that the partial transpose on subsystem 1, Γ(1)(ρ12), is given as
the matrix transpose of Γ(2)(ρ12). Hence, the eigenvalues and corresponding eigenstates of Γ(1)(ρ12)
and Γ(2)(ρ12) are the same. The entanglement witness detecting ρ12 can be constructed from the
eigenstate of Γ(2)(ρ12) corresponding to a negative eigenvalue. The motivation for this construction
is that, the partial transpose of a separable state (of any dimension) has no negative eigenvalue [86].
However, this construction does not work for all entangled states, since there is an entangled state
whose partial transpose has no negative eigenvalue [87]. In general, constructing an entanglement
witness to detect an arbitrary entangled state is a hard problem.
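The partial-transpose criterion can be illustrated with a short numerical sketch (our own, using NumPy; the index convention follows Eq. (2.34)):

```python
import numpy as np

def partial_transpose(rho, dims=(2, 2)):
    # Transpose subsystem 2 of a bipartite density matrix, following the
    # index convention of Eq. (2.34): M_{il,jk} -> M_{ik,jl}.
    d1, d2 = dims
    r = rho.reshape(d1, d2, d1, d2)
    return r.transpose(0, 3, 2, 1).reshape(d1 * d2, d1 * d2)

# Bell state (|00> + |11>)/sqrt(2): entangled, so the partial transpose
# has a negative eigenvalue (here -1/2).
phi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2)
rho_ent = np.outer(phi, phi)
print(np.linalg.eigvalsh(partial_transpose(rho_ent)).min())  # negative

# A separable mixture of the product states |00><00| and |11><11|:
# its partial transpose has no negative eigenvalue [86].
rho_sep = 0.5 * np.diag([1.0, 0.0, 0.0, 0.0]) + 0.5 * np.diag([0.0, 0.0, 0.0, 1.0])
print(np.linalg.eigvalsh(partial_transpose(rho_sep)).min())  # nonnegative
```

The eigenstate of the partial transpose belonging to the negative eigenvalue is the ingredient from which the witness described above can be built.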
The detection of entanglement based on a witness, however, requires a detailed characterization of the system observed (such as its dimension) and of the measurements performed in an
experiment. Otherwise, the measured witness operator W ′ is different from the ideal witness W ,
so that even if Tr(Wρsep) ≥ 0 for all separable states ρsep, it is possible Tr(W ′ρsep) < 0 for some
separable states. To overcome this problem, researchers derived several generalized Bell-type inequalities [63, 88, 89, 90, 91, 92, 93, 94, 95] satisfied by all fully or partially separable states, so
that the violation of such a generalized inequality certifies that the quantum state is entangled or
fully multipartite entangled without assuming the dimension of the system observed or knowing the
measurements performed. Note that, in general, these generalized Bell-type inequalities are different
from Bell inequalities, since the former inequalities are derived without assuming LR descriptions
of quantum systems.
The exact relationship between the violation of LR and entanglement is still poorly under-
stood. Quantitatively, the violation of LR and entanglement are different. The maximal violation
of LR according to various measures (e.g., the violation of a Bell inequality or the Kullback-Leibler
divergence [24]) is generally not given by maximally entangled states [96]. For example, as pointed
out in Sec. 2.3.1, the maximal violation of the CGLMP inequality is not given by the maximally
entangled state of two d-level systems.
Qualitatively, all bipartite or multipartite pure entangled states, where each subsystem may
have a different dimension, violate a Bell inequality [97]. However, for mixed states, entanglement
does not promise a violation of LR. Specifically, some mixed states have bound entanglement,
that is, from these states a singlet state cannot be distilled. For example, if the partial transpose
of an entangled state has no negative eigenvalue, this state has bound entanglement [87]. Peres
conjectured that such states always admit LR descriptions [7]. Peres’s conjecture is an interesting
open problem, and there are many recent works trying to prove or disprove it. Recently, Peres’s
conjecture was disproved in the multipartite case [98]. Specifically, Ref. [98] exhibits a three-qubit
entangled state that is biseparable (see Eq. (2.32)) and so bound entangled, but the measurement
results according to this state violate a tripartite Bell inequality. But Peres’s conjecture remains
open in the bipartite case. (The numerical optimization results in the recent work [99] suggest that
Peres’s conjecture is correct in the bipartite case.) The above shows that the violation of LR and
entanglement are different concepts.
2.5 Bell inequality, steering, and contextuality
The violation of a Bell inequality certifies not only entanglement but also two other properties
of a quantum system, steering [100, 101, 102] and contextuality [103, 104].
Steering describes the ability of Alice, by performing different measurements on her own
system, to remotely prepare Bob’s system into different ensembles of pure states [100]. For example,
suppose that Alice and Bob share the Bell state |ψBell〉 = 1/√2(|0〉A|0〉B + |1〉A|1〉B), where |0〉 and
|1〉 are the two eigenstates of the measurement operator σz = (1 0; 0 −1). Then, if Alice performs the
measurement σz on her qubit, Bob's qubit will be left in a state from the ensemble {|0〉B, |1〉B};
however, if Alice performs the measurement σx = (0 1; 1 0) on her qubit, Bob's qubit will be left in a
state from a different ensemble {|+〉B, |−〉B}, where |+〉 and |−〉 are the two eigenstates of σx. Note
that, in each case, after Alice’s measurement Bob’s qubit will be prepared into one pure state in
the corresponding ensemble and the prepared state depends on Alice’s measurement outcome. So,
a Bell state exhibits steering. On the other hand, for a separable state as in Eq. (2.30) shared by
Alice and Bob, after any measurement performed by Alice on her system, the ensemble of possible
pure states for Bob’s system is always the same. Hence, a separable state does not exhibit steering.
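This steering of Bob's ensemble by Alice's setting can be computed directly. A minimal sketch (our own; bob_state is a hypothetical helper returning Bob's conditional state after Alice projects onto a given outcome):

```python
import numpy as np

# Bell state |psi_Bell> = (|0>_A|0>_B + |1>_A|1>_B) / sqrt(2)
psi = (np.kron([1.0, 0.0], [1.0, 0.0]) + np.kron([0.0, 1.0], [0.0, 1.0])) / np.sqrt(2)

def bob_state(alice_outcome):
    # Bob's normalized conditional state (<a| (x) I) |psi> after Alice
    # projects her qubit onto |a> = alice_outcome.
    v = np.kron(np.asarray(alice_outcome, dtype=float), np.eye(2)) @ psi
    return v / np.linalg.norm(v)

s = 1 / np.sqrt(2)
# sigma_z outcomes |0>, |1> steer Bob into the ensemble {|0>, |1>} ...
print(bob_state([1, 0]), bob_state([0, 1]))
# ... while sigma_x outcomes |+>, |-> steer Bob into {|+>, |->}.
print(bob_state([s, s]), bob_state([s, -s]))
```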
Recently, Wiseman et al. showed that the above operational definition of steering is equivalent
to the violation of the local hidden state model for Bob [101]. In the local hidden state model,
the measurement outcome of Alice is determined by a hidden variable while the distribution of the
measurement outcomes of Bob is explained by a set of quantum states that are correlated with the
values of the hidden variable. According to this mathematical formulation, a local hidden state
model is a special LR model. Specifically, the local hidden state models form a strict and convex
subset of LR models [102]. Hence, the violation of a Bell inequality demonstrates the ability of
Alice to steer the state of Bob’s system.
Contextuality means that, if there is a hidden variable explaining the outcome of a mea-
surement performed on a system, then the outcome assigned to this measurement by the hidden
variable depends on the experimental context, i.e., which other compatible measurements are per-
formed simultaneously on the system. Hence, to show the contextuality of a quantum system, we
consider the constraints satisfied by all non-contextual hidden variable models. It turns out that
non-contextual hidden variable models are special LR models. Hence, a constraint on LR models,
such as a Bell inequality, is also satisfied by all non-contextual hidden variable models. Therefore,
a Bell-inequality violation demonstrates the contextuality of a quantum system.
As in the case of local hidden variables, given a set of experimental contexts, the probability
distributions described by non-contextual hidden variables form a convex polytope, the
non-contextual polytope. One difference between these two kinds of hidden variables is that a
non-contextual hidden variable is associated with a set of compatible measurements, while a local
hidden variable is associated with a set of spacelike-separated measurements. Another difference is
that, unlike a Bell inequality, there exists a non-contextuality inequality, corresponding to a facet
of the non-contextual polytope, such that it can be maximally violated by all quantum states of a
system with the same set of measurements [105].
2.6 Bell inequality and private information
Suppose that two parties, Alice and Bob, share a pair of spin-1/2 particles which is in the
singlet state 1√2(| ↑↓〉 − | ↓↑〉). If they measure their particles’ spins along the same direction,
they will observe exactly anticorrelated outcomes that are unknown to a third party. Hence, if
Alice and Bob can verify that they share the singlet state, for example through testing the CHSH
inequality (1.2) as suggested by Ekert in 1991 [106], they can build a secure quantum channel for
sharing private information. However, Ekert’s scheme [106] relies on two assumptions that cannot be
verified in practice: (i) the system and any eavesdropper must obey the laws of quantum mechanics,
and (ii) Alice and Bob have perfect control of the state preparation and of the measurement devices,
i.e., they know how their devices work.
Gradually researchers realized that, provided the no-signaling principle can be trusted and
without assuming quantum mechanics, in the device-independent scenario one can still extract se-
cure private information from measurement outcomes that violate LR [3, 11, 13, 107, 108, 109, 110,
111, 112]. For example, in quantum key distribution, if Alice’s and Bob’s measurement outcomes
violate LR, whatever is the underlying physical theory that produces these outcomes, the eaves-
dropper cannot have full information about them. Otherwise, the eavesdropper’s information could
be treated as an LR description of these outcomes. Hence, the violation of LR can be thought of
as a privacy witness.
Chapter 3
Challenges of testing local realism
3.1 Experimental configuration for testing local realism
The experimental procedure for testing a bipartite Bell inequality is shown in Fig. 3.1. We call
such a procedure a trial. At each trial the locality assumption must be satisfied, as shown in the inset
space-time diagram of Fig. 3.1. After many trials, one can estimate the correlations or probabilities
appearing in a Bell inequality and determine whether the Bell inequality is violated based on these
estimates. If the Bell inequality is violated, then in view of the statistical fluctuations over finitely
many trials, experimenters conventionally present the violation in terms of the number of experimental
standard deviations (SDs) by which the Bell inequality is violated. For example, Weihs et al. [113] reported
an experimental estimate ICHSH = 2.73±0.02 and claimed a violation of the CHSH inequality (1.2)
by 30 SDs. However, there are several loopholes that are never closed simultaneously in a test of
local realism (LR). In the following, we discuss them individually.
3.2 The locality loophole
As discussed in Sec. 2.1 of Chapter 2, a Bell inequality is derived based on two conditions—
locality and realism. Hence, to show a violation of LR, an experiment where the data violate a Bell
inequality should satisfy these two conditions.
Realism assumes that the outcome of an arbitrary measurement on a quantum system is in
principle predetermined, independent of the interaction between the system and the measurement
apparatus. This assumption is the belief of local realistic (LR) theorists and so one can pretend
Figure 3.1: The experimental procedure for testing a bipartite Bell inequality. The inset reflects the locality condition in the space-time diagram.
that it is true in an experiment. But, the locality condition requires experimenters’ efforts to
ensure that it is satisfied. Specifically, the experiment should satisfy the following two conditions:
first, the distance between different parties of the experiment should be large enough to prevent
light-speed or slower communication between one observer’s measurement choice and the result
of another observer’s measurement; second, local measurement choices should be made randomly,
and one should make sure that these choices are independent of each other and also that they are
independent of the LR state of the particle pair emitted from a source. These requirements are
illustrated in Fig. 3.1. Unfortunately, the locality condition is not satisfied in most experiments
performed so far. The failure of this condition is called the locality loophole [31].
Before the epoch-making experiment reported in Ref. [114], all experiments for testing LR
were performed with static setups, in which measurement settings are held fixed for many successive
trials. This leaves open the possibility that the LR state of the particle pair emitted at a trial depends on the
setting choices made at that trial. In 1982, Aspect et al. [114] performed the first experiment
using time-varying measurement settings. However, the settings were switched periodically during
the experiment, so they were actually predictable, and communication even at a speed slower
than that of light could explain the observed results.
the first experiment that closed the locality loophole. In this experiment, Alice and Bob each
chose a local measurement setting randomly and independently after the entangled photon pair
left the source, and the measurement choice of one observer and the measurement outcome of
the other were spacelike separated. In 2010, another experiment [115] was performed that
improved on the result of Ref. [113] by having spacelike separation between the entangled photon
pair emission and the random local setting choices.
3.3 The detection loophole
To test LR, each observer needs to perform a local measurement on his or her own particle
emitted from a common entanglement source. During this process, the particles are subject to transmission
loss or detection loss. Furthermore, most experimental tests of LR utilize entangled photon
pairs, which are generated probabilistically by spontaneous parametric down-conversion [116, 117].
Due to particle loss or failed particle generation, particles are detected at only a small fraction of
experimental trials, and at most trials the detectors do not respond. It is possible
that the detected outcomes violate LR while the full pattern of measurement outcomes admits
an LR model. This reflects the fact that, even if the experimental probability distribution
lies inside the LR polytope in the probability space associated with the performed measurements
and all possible outcomes, the distribution conditioned on detections may still lie outside
the LR polytope in the corresponding subspace. This kind of problem is generally called the
detection loophole [27].
In photonic experiments such as in Refs. [113, 114, 115, 116, 117], the violations of LR are
inferred using only the outcomes from detected photons. Hence, these results are subject to the
detection loophole. To justify these results, one needs to employ the fair-sampling assumption, i.e.,
the subensemble of detected photons is assumed to be representative of the entire ensemble. Otherwise, one needs to improve the overall detection efficiency, including decreasing the transmission loss
and increasing the entangled photon pair generation probability. To show a loophole-free violation
of the CHSH inequality (1.2) or the CH inequality (1.3), the minimum detection efficiency using
a Bell state is 82.85 % [118], while using an unbalanced Bell state of the form cos(θ)|00〉+ sin(θ)|11〉
the minimum detection efficiency approaches 2/3 as θ goes to 0 [28]. If one particle in the entangled
pair is always detected, the minimum efficiency for detecting the other particle can be as low
as 1/2 [50, 51]. Using other Bell inequalities or other entangled states, the minimum detection
efficiency can be further decreased [1, 50, 76, 119, 120, 121].
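The 82.85 % figure for a Bell state can be recovered numerically from the CH expression: joint-detection probabilities scale as η² with overall detection efficiency η, singles as η, so the CH value turns positive only above a critical efficiency. A sketch under these assumptions (the settings and the prediction P(+,+ | a, b) = ½ cos²((a − b)/2) are the standard ones for a maximally entangled state; the helper names are ours):

```python
import math

def ch_value(eta):
    # CH expression for a Bell state with overall detection efficiency eta.
    p = lambda delta: 0.5 * math.cos(delta / 2) ** 2    # P(+,+ | a, b)
    a1, a2, b1, b2 = 0.0, math.pi / 2, math.pi / 4, -math.pi / 4
    joints = p(a1 - b1) + p(a1 - b2) + p(a2 - b1) - p(a2 - b2)
    singles = 0.5 + 0.5                                 # P(A1 = +) + P(B1 = +)
    return eta ** 2 * joints - eta * singles            # LR bound: <= 0

# Bisect for the efficiency at which the CH value crosses zero.
lo, hi = 0.5, 1.0
for _ in range(60):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if ch_value(mid) < 0 else (lo, mid)
print(round(hi, 4))   # -> 0.8284, i.e. 2*(sqrt(2) - 1)
```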
The only experiments that have closed the detection loophole so far involve entangled ions
or atoms [3, 122, 123, 124]. In such an experiment, the outcome from every trial is used to test the
CHSH inequality. The overall efficiency for detecting an ion or atom is about 98 %, high enough for
closing the detection loophole. However, in the experiment reported in Ref. [122], entangled ions
were only a few micrometers apart and the detection time was a few milliseconds, so this experiment
did not close the locality loophole. Later, the separation between two entangled ions was extended
to about one meter [3, 123]. Most recently, heralded entanglement between two neutral atoms
trapped independently 20 meters apart was created, and the violation of the CHSH inequality was
observed [124]. But, to close the locality loophole, it is necessary to increase the separation.
3.4 The memory loophole
Besides the locality and detection loopholes, there is another loophole, introduced by considering time-varying experimental configurations or LR models with memory. In general, to test
LR, a sequence of measurements is performed successively on a state prepared repeatedly. In the
analysis of experimental data, it is usually assumed that the LR model at a trial is independent of
previous trial results, i.e., previous measurement-setting choices and outcomes. Also, in the con-
ventional way of presenting a Bell-inequality violation in terms of the number of SDs, it is implicitly
assumed that the prepared quantum state and measurement settings do not vary during an experi-
ment. However, LR models can take advantage of previous trial results in order to explain the next
trial result better; and the experimental configuration may be unstable so that the quantification
of the evidence against LR in terms of the observed number of SDs of violation is not justified. The
violation of LR due to these possibilities is called the memory loophole [20, 21, 22, 23].
As shown by Barrett et al. [23] and Gill et al. [125], a Bell inequality is satisfied by all
probability distributions of trial results predicted by LR models, no matter whether these models
have memory or not. Hence, if an experimental probability distribution does not satisfy a Bell
inequality, the experiment reliably demonstrates the violation of LR. However, since the experimental
probability distribution is unknown after finitely many trials, the arguments in Refs. [23, 125] cannot be
used to justify a violation of LR witnessed by a finite amount of data. The memory effect can
significantly influence the statistical fluctuations in LR predictions, and hence the uncertainty of
an observed violation of a Bell inequality after finitely many trials. It therefore affects the probability,
according to LR, of predicting a violation as high as that observed. A rigorous bound on such a
probability was proposed by Gill [18, 19]; it is satisfied by all LR models, with or without memory.
However, this bound is not tight. The main part of this thesis shows how to achieve a tight bound
(Chapter 5) and how to efficiently compute high-quality bounds (Chapter 6) while accounting for
memory effects.
3.5 Possibilities of loophole-free violations of LR
To date, no experiment has demonstrated a loophole-free violation of LR. It is still an open
problem to determine which systems are the best candidates for closing both the locality and
detection loopholes simultaneously in the near future. In the following, we discuss the challenges
that need to be overcome in different systems in order to perform loophole-free tests of LR.
In tests of LR using photons, such as the first experiment [113] that closed the locality
loophole, the photon detection efficiency is about 5 %, not high enough for closing the detection
loophole. With the rapid development of photon sources and detectors, it is likely that the detec-
tion loophole in photonic experiments will be closed soon. Recently, entangled photon pairs with
high generation probability [126, 127] and photon-number-resolving detectors with high detection
efficiency (≥ 95 %) [128, 129] and with short timing jitter (≤ 4 ns) [130, 131] were developed. The
problem left for closing the detection loophole is how to integrate state-of-the-art photon sources and
detectors and at the same time minimize the photon loss in the transmission and measurement
apparatuses.
Actually, only a few months ago Giustina et al. demonstrated the violation of the CH
inequality using entangled photons without the fair-sampling assumption [132]. However, their
data analysis has problems, so the claimed high number of SDs of violation is not justified.
Also, since the photon-pair source is continuous-wave rather than pulsed, there is no well-defined
“trial” in this experiment, and LR theorists can take advantage of this drawback to
explain the observed data.
In atomic experiments, the main problem is how to entangle two ions or atoms far away from
each other. Considering the fastest atom (or ion) detection scheme so far [133], the detection time
is about 1 µs and so the separation between two ions or atoms should be at least 300 m in order to
close the locality loophole. It is very difficult to entangle two ions or atoms at such a long distance.
Usually, the entanglement between one ion or atom and one photon is first established, which is
relatively easy to realize [134, 135]. After that, by an appropriate joint detection of the two photons
where each photon is entangled with one ion or atom, two distant ions or atoms are entangled [136].
In principle, entangling two distant ions or atoms is feasible. However, in experiments [3, 123], due
to the low photon collection and detection efficiency, the probability of heralding an entangled ion
pair at the distance of one meter is very low (about 2× 10−8), and only one entangled ion pair is
generated every 8 minutes. Also, in the most recent experiment [124], one pair of entangled atoms
20 m apart was generated every 2 minutes. So far no entanglement generation between two ions or
atoms separated by a longer distance has been reported.
Other proposals for loophole-free violations of LR include using entangled atom/ion-photon
systems [136, 137], entangled systems of dimension larger than two [1], continuous-variable measure-
ments on squeezed light [138, 139], or wave-particle correlations of entangled photons [140, 141, 142].
Compared with separating two entangled atoms or ions, it is easier to separate a photon
from an atom or ion. Due to the high efficiency of detecting an atom or ion, the minimum photon
detection efficiency required to close the detection loophole is decreased [50, 51]. However, the
overall efficiency of detecting a photon emitted from an atom or ion is still lower than the minimum
efficiency required. A more serious problem is that the entanglement creation between
an atom/ion and a photon is probabilistic and cannot be heralded, making a loophole-free test
difficult. There is one ion-photon experiment [143] demonstrating a violation of the CHSH
inequality, but this experiment does not close either the detection loophole or the locality loophole.
Using a higher-dimensional system can lower the minimum detection efficiency, but the required
state is hard to prepare. Continuous variables can be measured with efficiency close to one;
however, the required state is infeasible to prepare, or the displayed violation is very small and
sensitive to noise in an experiment. Using wave-particle correlations can partially reduce the
difficulty of the state preparation, but the photon detection efficiency required is still hard to
achieve. In summary, all these proposals have their advantages but also their disadvantages. There
are few experimental demonstrations of these proposals.
Chapter 4
Statistical strength of experiments for rejecting local realism
From Chapter 3, we can see that all experimental tests of local realism (LR) performed so far
are subject to at least one of the several loopholes discussed. Experiments carried out on trapped
atoms or ions closed the detection loophole [3, 122, 123, 124], but these particles were close to
each other (at most 20 meters apart) and the detection time was long (at least 1 µs), so these
experiments did not close the locality loophole. There have been photonic experiments addressing
the locality loophole [113, 114, 115, 144]. Yet due to low photon detection efficiency, photonic
experiments have not closed the detection loophole (at least at a significant level).
Previous results show that closure of the detection loophole requires a minimum detection
efficiency of 82.85 % when a Bell state is used [118]. With unbalanced Bell states cos(θ)|00〉 +
sin(θ)|11〉, the minimum detection efficiency approaches 2/3 as θ goes to 0 [28]. These results are
obtained via the Clauser-Horne-Shimony-Holt (CHSH) inequality (1.2). To test this inequality, each
local measurement should have two possible outcomes, such as whether a photon is horizontally
or vertically polarized. However, in a photonic experiment, if the measurement apparatus consists
of one polarization rotator, one polarizing beam splitter, and two photon detectors (see Fig. 4.1),
the measurement outcome can also be no detection at either detector. In this case, to test the
CHSH inequality, we need to combine the no-detection outcome with one polarization. After this
combination, the evidence against LR generally becomes weaker. To study the violation of LR
and the minimum detection efficiency required without choosing a particular Bell inequality, we
quantify the experimental evidence against LR by a measure called the statistical strength (see
Sec. 4.2).
In this chapter, we study the possibility of rejecting LR with a source of entangled states
created from two independent polarized photons passing through a polarizing beam splitter. Simi-
lar sources are used in Refs. [116, 117, 145]. We call this source the “independent inputs” source.
Although this source does not produce balanced or unbalanced Bell pairs, it does create some
entanglement. An advantage of this source is that the input photons do not need to be entan-
gled. The two independent polarized photons can be generated by spontaneous parametric down-
conversion (SPDC) in nonlinear crystals [116, 117, 145], or by other single-photon sources being
developed such as atoms, ions, molecules, solid-state quantum dots, or nitrogen-vacancy centers in
diamond [146, 147]. The states of the two photons can be detected by photon counters or pho-
ton detectors. (We use the term “photon detector” to refer to detectors that determine only the
presence or absence of photons, not their number.) Since experimenters can gain more information
with photon counters than with simple photon detectors, we expect that photon counters make
violation of LR more detectable. We also expect that photon counters can mitigate the influence
of the effectively unentangled part of the state.
Our results show that it is possible to perform a test of LR free of the detection loophole
using the independent inputs source, assuming that the detection efficiency of photon counters
(photon detectors) is at least 89.71 % (at least 91.11 %, respectively), showing a small advantage
for photon counters. Furthermore, we numerically quantify the statistical strength of such a test
of LR as a function of the counter or detector efficiency and state parameters. For comparison, we
obtain the same information for an ideal source of unbalanced Bell states. This makes it possible
to estimate the minimum number of trials required to gain reasonable confidence in rejecting LR,
as this number is inversely related to statistical strength.
In Sec. 4.1, we briefly describe the experimental scheme that we analyze. In Sec. 4.2, we
point out the deficiencies of the most commonly used method for quantifying the violation of LR
and summarize the method based on the Kullback-Leibler (KL) divergence proposed in Ref. [24].
We present our results in Sec. 4.3, and we make concluding remarks in Sec. 4.4. This chapter is
based on our previous work [26].
4.1 Experimental configuration
Here we consider a test of LR using pairs of polarized photons which are in the same spatial-
temporal mode. The two photons can be generated by an SPDC process [116, 117, 145] in the
weak-pumping regime, although single-photon sources could be used [146, 147]. Given such photon
pairs, they can be processed as shown in Fig. 4.1 to produce a state that can violate LR.
Figure 4.1: Schematic of a test of LR with the independent photons source. Two spatially and temporally matched polarized photons are inserted at 1 and 2. The polarization rotators PR1 and PR2 are set so that photons 1 and 2 are linearly polarized along the same direction when they reach the polarizing beam splitter PBS1. After PBS1, the photons are in a nonmaximally entangled state (see Eq. (4.3)) and are sent to Alice's and Bob's measurement setups. Each measurement setup uses a polarization rotator (PR), a polarizing beam splitter (PBS), and two photon detectors (D), which may be photon counters. The PR is used to select measurement bases by rotating the photon's polarization state.
Consider a pair of photons arriving in modes 1 and 2 of Fig. 4.1 in the state
|ψ〉12 = |H〉1|H〉2, (4.1)
where H (V ) denotes horizontal (vertical) polarization. We set the polarization rotators PR1 and
PR2 to the same angle to produce the state
|ψ′〉12 = (α|H〉1 + β|V 〉1)(α|H〉2 + β|V 〉2), (4.2)
where |α|2 + |β|2 = 1. After the polarizing beam splitter PBS1, we get the “pseudo-Bell” state
|ψpB〉 = α²|H〉3|H〉4 + β²|V〉3|V〉4 + αβ|H〉3|V〉3 + αβ|H〉4|V〉4. (4.3)
Using these states, we can perform a test of LR. Motivated by the result of Eberhard [28],
we investigate the possibility of reducing the minimum detection efficiency required to close the
detection loophole in a test of LR by changing the values of α and β in Eq. (4.3).
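As a quick check of Eq. (4.3), the action of PBS1 on the product state of Eq. (4.2) can be traced symbolically. The sketch below (a minimal illustration, assuming the common convention that the PBS transmits H and reflects V; the specific mode map is one choice of labeling, not fixed by the text) collects the amplitudes of the four output terms:

```python
import itertools

def pbs_output(alpha, beta):
    """Amplitudes of the two-photon state after PBS1 when each input
    photon is alpha|H> + beta|V>.  Assumed convention: the PBS
    transmits H and reflects V, with mode map
    (1,H)->(3,H), (1,V)->(4,V), (2,H)->(4,H), (2,V)->(3,V)."""
    mode_map = {(1, 'H'): (3, 'H'), (1, 'V'): (4, 'V'),
                (2, 'H'): (4, 'H'), (2, 'V'): (3, 'V')}
    amp = {'H': alpha, 'V': beta}
    out = {}
    for p1, p2 in itertools.product('HV', repeat=2):
        # route each photon through the PBS and collect the joint term
        term = tuple(sorted((mode_map[(1, p1)], mode_map[(2, p2)])))
        out[term] = out.get(term, 0) + amp[p1] * amp[p2]
    return out

# The four terms reproduce Eq. (4.3): alpha^2 |H>3|H>4, beta^2 |V>3|V>4,
# alpha*beta |H>3|V>3, and alpha*beta |H>4|V>4.
state = pbs_output(0.6, 0.8)
```

Running this with α = 0.6, β = 0.8 gives exactly the four amplitudes α², β², αβ, αβ of Eq. (4.3).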
When we set |α| = |β| = 1/√2 in Eq. (4.3) and condition on coincidence postselection, we
may treat the pseudo-Bell state as a maximally entangled state, as in the experiments reported in
Refs. [116, 117, 145]. This postselection process discards events where both photons leave PBS1
in the same direction, effectively projecting onto a Bell state. However, the discarded events may
create another loophole similar to the detection loophole for tests of LR [148, 149]. To close this
loophole, the entire pattern of experimental data must be included when evaluating a violation of
a Bell inequality [150]. Here, we also use all data without postselection, but instead of obtaining a
violation of a Bell inequality, we quantify the experimental evidence against all local realistic (LR)
models by means of measures derived from the KL divergence.
4.2 Data analysis method
Contradictions between experimental results and LR are often shown by the violation of a Bell
inequality, such as the CHSH inequality ICHSH ≤ 2 as in Eq. (1.2). To test the CHSH inequality in
an experiment, one needs to estimate the probabilities of various outcomes from a finite number of
trials. Due to uncertainties in the estimated probabilities, it is conventional to present the violation
of LR in terms of the number of experimental standard deviations (SDs) separating the estimate of
ICHSH from its LR upper bound, i.e., 2. For example, Weihs et al. [113] reported an experimental
estimate ICHSH = 2.73± 0.02 and claimed a violation of the CHSH inequality by 30 SDs.
While the experimental SD provides the precision with which a Bell-inequality violation is
measured, there are several problems with the number of SDs of violation. First, although the
SD partially quantifies the measurement uncertainty due to a finite number of trials, it does not
characterize the probability that an LR system could also violate a Bell inequality after a finite
number of trials. Because such a system’s (non-)violation can have a larger SD, the experimental
SD may suggest more confidence in rejecting LR than justified (see Chapter 5 for examples).
Second, one would expect that the probability distribution of the estimate of ICHSH under LR is
Gaussian, since this appears to be justified by the central limit theorem [151] as the number of
trials approaches infinity. It therefore seems reasonable to statistically quantify the violation by the
probability that a Gaussian random variable can exceed the mean by the number of SDs of violation
experimentally observed. However, for a finite number of trials and high violation, the Gaussianity
assumption fails. Third, the computation of SDs assumes that the trials are independent and
identically distributed; that is, it does not consider the memory effect [20, 21, 22, 23]. We cannot
expect the prepared states and experimental settings to be stable over the course of a long sequence
of trials. In addition, we cannot exclude the possibility that the LR model for the experiment at a
given trial depends on the previous trial results, i.e., previous measurement-setting choices and outcomes.
Fourth, it is desirable to compare experimental results from different tests of LR, but the effects of
the problems with experimental SDs depend on the Bell inequality, the quantum state, measurement
settings, detection efficiency, and other experimental parameters. Consequently, the number of SDs
of violation cannot be used to directly compare the amount of evidence for rejecting LR obtained
from different experimental tests.
To avoid these problems, in this chapter we quantify the violation of LR by the statistical
strength of a test of LR as proposed by van Dam et al. [24]. The statistical strength is charac-
terized by the KL divergence from the experimental probability distribution to the best prediction
according to LR. This measure is justified by the observation that the confidence at which the
experimental data violate LR is asymptotically related to the statistical strength [25]. In the next
two chapters we propose methods to rigorously quantify the confidence in rejecting LR obtained
from a finite set of data.
To better understand the approach based on the KL divergence, it is helpful to analyze tests
of LR in terms of a two-player game. The two players are the quantum experimenter QM and
the LR theorist LRT who wants LR to prevail. During the test of LR, given a source of quantum
states, experimenter QM can randomly change the measurement settings. After a large number N
of trials, QM estimates the probability distribution q of measurement settings and outcomes from
the experimental data, which, hopefully, is consistent with the quantum-mechanical prediction and
violates LR. At the same time, knowing the state preparation procedure and the distribution of
measurement settings but not the actual settings or outcomes at a trial, LRT can design all kinds
of different LR models, predicting different probability distributions p for the settings and outcomes.
(We are assuming that state preparation protocols and measurement-setting distributions are not
changed during the experiment.) The goal is to make p as consistent as possible with the eventually
obtained estimated distribution q. This requires minimizing a distance between QM's estimate
q and LRT’s prediction p. Following the argument in Ref. [24], this distance can be measured by
the KL divergence from q to p, as defined by
DKL(q ‖ p) = ∑_{k=1}^{K} ∑_{l=1}^{L} q(k, l) log2 [q(k, l)/p(k, l)], (4.4)
where k is the measurement-setting index, K is the number of different measurement settings, l is
the measurement outcome index, and L is the number of different measurement outcomes under
each measurement setting. For example, in the test of the CHSH inequality using photon pairs
entangled in polarization, k denotes one of the measurement settings (A1, B1), (A1, B2), (A2, B1),
or (A2, B2), and so K = 2× 2 = 4; l denotes one of the outcomes (H, H ), (H, V ), (V, H ), or (V,
V ) (assuming perfect detection) and so L = 2× 2 = 4.
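In code, the divergence of Eq. (4.4) is a short sum over the joint (setting, outcome) probabilities. A minimal sketch (the distributions q and p below are illustrative numbers, not taken from any experiment):

```python
import math

def kl_divergence(q, p):
    """D_KL(q||p) of Eq. (4.4) in bits, over joint (setting, outcome)
    probabilities flattened into lists; terms with q = 0 contribute zero."""
    return sum(qi * math.log2(qi / pi) for qi, pi in zip(q, p) if qi > 0)

# Illustrative four-entry distributions: q tilts the outcome
# probabilities away from a uniform prediction p.
q = [0.30, 0.20, 0.20, 0.30]
p = [0.25, 0.25, 0.25, 0.25]
d = kl_divergence(q, p)   # strictly positive since q != p
```

The divergence vanishes exactly when p = q and grows as the predictions separate, which is what makes it a usable notion of distance between LRT's model and QM's estimate.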
The KL divergence has the property that DKL(q ‖ p) ≥ 0, with equality if and only if p = q.
Since there are many different LR models, LRT has the freedom to choose the best one pLR, namely,
the one that minimizes the KL divergence from q. We then define the statistical strength Sq of the
distribution q for rejecting LR according to
Sq ≡ DKL(q ‖ pLR) = min_{p∈L} DKL(q ‖ p), (4.5)
where L is the set of LR models. Likewise, QM also has the freedom to choose different measurement
settings and setting distributions so that the best LR model explains the experimental data poorly.
Hence, the general problem is to determine the optimal statistical strength S of tests of LR subject
to experimental constraints, which is defined to be
S ≡ DKL(qs ‖ ps,LR) = max_{q∈Q} Sq = max_{q∈Q} min_{p∈L} DKL(q ‖ p), (4.6)
where qs is an optimal quantum strategy maximizing Eq. (4.5), ps,LR is the best LR model with
respect to qs, and Q is the set of accessible quantum strategies. The statistical strength is asymp-
totically related to the p-value, which is the probability according to LR of obtaining a violation
as high as that observed after finite trials. There is a statistical test such that if S > 0, then for
almost all infinite sequences of independent measurement-setting choices and outcomes, the p-value
after N trials is
pN = 2^{−NS+o(N)}, (4.7)
where o(N) is a data-dependent term satisfying o(N)/N → 0 as N → ∞ [25]. No statistical test can have a
better asymptotic p-value. Because 1− pN can be thought of as a confidence in rejecting LR, the
statistical strength S quantifies the asymptotic rate at which confidence is gained. In particular,
the number of trials required to have reasonable confidence in rejecting LR is necessarily greater
than 1/S.
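This inverse relationship can be made concrete by ignoring the o(N) term in Eq. (4.7): reaching a target p-value p requires roughly N ≈ −log₂(p)/S trials. A small sketch (the strength value ≈ 0.0463 bits per trial used below is the Bell-state value with perfect detection quoted in Sec. 4.3):

```python
import math

def trials_needed(strength, p_target=1e-6):
    """Trials N for the asymptotic p-value 2^(-N*S) of Eq. (4.7) to
    drop below p_target, ignoring the o(N) correction."""
    return math.ceil(-math.log2(p_target) / strength)

# With the Bell-state strength S ~ 0.0463 bits/trial, reaching a
# p-value of 1e-6 takes a few hundred trials.
n = trials_needed(0.0463)
```

The 1/S scaling is visible directly: halving the statistical strength roughly doubles the number of trials needed for the same confidence.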
LRT’s effort to minimize the KL divergence as in Eq. (4.5) is a maximum likelihood estimation
problem. Here, we use the expectation-maximization algorithm in Ref. [152]. The general problem
of computing the optimal statistical strength S is nontrivial. Given the prepared state and the
setting distribution in an experiment, to calculate S, we maximize Eq. (4.5) over measurement
settings with standard nonlinear optimization techniques.
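To make the inner minimization concrete, the sketch below implements it for the simplest case: two settings and two outcomes per side, uniform setting choices, and the standard CHSH angles on a Bell state (the same correlators, E = ±1/√2, as the θ = 45◦ settings of Table 4.1). The LR models are mixtures of the 16 deterministic local strategies, and the mixture weights are updated with the standard expectation-maximization step for a mixture with fixed components. This is an illustrative reimplementation, not the code of Ref. [152]:

```python
import numpy as np
from itertools import product

def chsh_quantum_dist():
    """q(a,b|x,y) for the Bell state (|HH>+|VV>)/sqrt(2) with x-z-plane
    Bloch angles giving the standard CHSH correlators
    E_xy = <A_x B_y> = cos(alpha_x - beta_y) = +-1/sqrt(2)."""
    alphas = [0.0, np.pi / 2]
    betas = [np.pi / 4, -np.pi / 4]
    q = np.empty((2, 2, 2, 2))                  # indices: x, y, a, b
    for x, y in product(range(2), repeat=2):
        E = np.cos(alphas[x] - betas[y])
        for a, b in product((1, -1), repeat=2):
            q[x, y, (1 - a) // 2, (1 - b) // 2] = (1 + a * b * E) / 4
    return q

def min_kl_over_lr(q, iters=3000):
    """min_p D_KL(q||p) in bits over LR models p, i.e., over mixtures of
    the 16 deterministic local strategies lam = (a1, a2, b1, b2), with
    uniform setting probabilities 1/4.  Weights are updated by the EM
    step for a mixture with fixed components."""
    comp = np.zeros((16, 2, 2, 2, 2))           # comp[i]: p_i(a,b|x,y)
    for i, (a1, a2, b1, b2) in enumerate(product((1, -1), repeat=4)):
        for x, y in product(range(2), repeat=2):
            a, b = (a1, a2)[x], (b1, b2)[y]
            comp[i, x, y, (1 - a) // 2, (1 - b) // 2] = 1.0
    w = np.full(16, 1 / 16)                     # start from uniform weights
    for _ in range(iters):
        p = np.tensordot(w, comp, axes=1)       # mixture p(a,b|x,y)
        # EM/multiplicative update; preserves normalization of w
        w = w * np.sum(comp * (q / p), axis=(1, 2, 3, 4)) / 4
    p = np.tensordot(w, comp, axes=1)
    return np.sum(q * np.log2(q / p)) / 4       # D_KL with P(x,y) = 1/4

S = min_kl_over_lr(chsh_quantum_dist())
```

The converged value lands near the Bell-state statistical strength quoted in Sec. 4.3 (≈ 0.0463 bits per trial); the EM iteration decreases the divergence monotonically, so the returned value upper-bounds the true minimum.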
To calculate the statistical strength of a test of LR, we need to learn how LRT predicts
the measurement results given the state preparation procedure and possible measurement settings.
Suppose that for a bipartite system with nA × nB measurement settings there are dA outcomes
for each of nA measurement settings at Alice’s side, and there are dB outcomes for each of nB
measurement settings at Bob’s side. Then the LR description implies the existence of a single joint
probability distribution over a d_A^{n_A} × d_B^{n_B}-element event space, which we write as
ProbLR (A1 = a1, . . . , AnA = anA ;B1 = b1, . . . , BnB = bnB ) , (4.8)
where a1, . . . , anA ∈ {1, 2, . . . , dA}, and b1, . . . , bnB ∈ {1, 2, . . . , dB}, with normalization
∑_{a1,...,anA=1}^{dA} ∑_{b1,...,bnB=1}^{dB} ProbLR (A1 = a1, . . . , AnA = anA ;B1 = b1, . . . , BnB = bnB ) = 1. (4.9)
Hence, the marginal probability for the measurement outcome (ai; bj) when settings Ai and Bj are
chosen is given by
ProbLR(Ai = ai;Bj = bj) = ∑_{a1,...,ai−1,ai+1,...,anA=1}^{dA} ∑_{b1,...,bj−1,bj+1,...,bnB=1}^{dB} ProbLR (A1 = a1, . . . , AnA = anA ;B1 = b1, . . . , BnB = bnB ) . (4.10)
Since the probabilities ProbLR(Ai = ai;Bj = bj) are constrained to be marginal distributions,
they satisfy nontrivial relationships. The goal of a test of LR is to choose states and settings
that result in quantum predictions that cannot be obtained as the marginals of a single LR model
for all i and j. The quantum-mechanical prediction of the probability is given by ProbQM(Ai =
ai;Bj = bj) = Tr(ρO(Ai = ai;Bj = bj)), where ρ is the density matrix of the quantum state,
and O(Ai = ai;Bj = bj) is the positive-operator valued measure element corresponding to the
measurement outcome (ai; bj) when Alice and Bob use settings Ai and Bj , respectively. Given the
distributions of measurement settings chosen by Alice and Bob, the KL divergence measures the
statistical distance of the optimal LR model from the quantum predictions as in Eq. (4.5).
4.3 Results and discussion
We consider tests of LR using the independent inputs source for pseudo-Bell pairs and tests
using unbalanced Bell pairs. In both cases, Alice and Bob use measurement devices like those shown
in Fig. 4.1. They use either counters or detectors for photon detection, and they independently
and uniformly randomly choose one of two measurement settings each, where the settings are
determined by the polarization rotators. We use Bloch-sphere Euler angles as explained below to
define measurement settings. We label the measurement settings A1 and A2 for Alice or B1 and
B2 for Bob and write the two-photon state coming out of modes 3 and 4 in Fig. 4.1 as |ψ〉AB. We
calculate the optimal statistical strength S according to Eq. (4.6) by maximizing over the Euler
angles of the measurement settings {A1, A2, B1, B2} and minimizing over the set of LR models L,
where we fix the two-photon state |ψ〉AB shared by Alice and Bob. The inner minimization as
implemented guarantees convergence to the optimal LR model pLR, whereas the outer one obtains
only a local optimum. Confidence in global optimality can be obtained by repetition from many
different starting points (which we have done) or by more sophisticated search strategies. A local
optimum satisfying S > 0 is sufficient for having found a detection-loophole-free test. On the other
hand, finding no solution with S > 0 is heuristic evidence that such a test does not exist subject
to the constraints of the experiment. Thus, with this optimization strategy, we can trace the
boundary of the region for which S > 0 (by searching for where S decreases to 0) to heuristically
determine the minimum detection efficiency ηmin and the associated optimal measurement settings
{A1min, A2min, B1min, B2min} needed to perform a test of LR free of the detection loophole with a
given state.
Note that as S → 0, the number of trials required to gain confidence close to unity diverges.
For a constant rate of gaining confidence (see the explanation below Eq. (4.7)), we set the desired
optimal statistical strength S = X > 0 and determine the minimum detection efficiency ηc and the
associated optimal measurement settings {A1c, A2c, B1c, B2c} that achieve this statistical strength.
The strategy for finding such a set of solutions {ηc, A1c, A2c, B1c, B2c} is as follows: First we start
with a set of solutions {ηold, A1old, A2old, B1old, B2old} achieving a statistical strength Xold ≥ X.
Second we optimize Eq. (4.5) over the measurement settings {A1, A2, B1, B2} with the fixed de-
tection efficiency ηold, which yields new settings {A1new, A2new, B1new, B2new} achieving S = Y
(Y ≥ Xold) under the efficiency ηold. Third, we decrease the detection efficiency from ηold to ηnew
as much as we can without reducing the statistical strength to below X, so that this new set of
solutions {ηnew, A1new, A2new, B1new, B2new} achieves S = Xnew with Xnew close to X (within nu-
merical error). We then repeat the above procedure several times replacing the old with the new
solutions, until we are unable to reduce the efficiency parameter. We thus heuristically find the set
of optimal solutions {ηc, A1c, A2c, B1c, B2c} to achieve a given statistical strength level S = X.
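The alternating search can be summarized in code. The sketch below uses a deliberately toy stand-in for the strength evaluation (a hypothetical threshold model, not the KL computation of Sec. 4.2), since the point here is only the control flow: optimize the settings at fixed efficiency, then push the efficiency down by bisection until the strength falls to the target, and repeat:

```python
def toy_strength(eta, theta):
    """Hypothetical stand-in for S(eta, settings): zero below a
    setting-dependent efficiency threshold, linear above it.  This is
    NOT the real KL-based strength; it only mimics its shape."""
    return max(0.0, 0.1 * (eta - (2 / 3 + 0.1 * abs(theta - 0.5))))

def find_eta_c(target, thetas=[i / 100 for i in range(101)]):
    """Alternate (i) optimizing the setting parameter at fixed
    efficiency and (ii) bisecting the efficiency down until the
    strength falls to the target level S = X."""
    eta, theta = 1.0, 0.0
    for _ in range(20):                      # repeat until stable
        # step 1: best settings at the current efficiency (grid search)
        theta = max(thetas, key=lambda t: toy_strength(eta, t))
        # step 2: lower efficiency while keeping strength >= target
        lo, hi = 0.0, eta
        for _ in range(50):                  # bisection on efficiency
            mid = (lo + hi) / 2
            if toy_strength(mid, theta) >= target:
                hi = mid
            else:
                lo = mid
        eta = hi
    return eta, theta

eta_c, theta_c = find_eta_c(1e-3)
```

In the real calculation, step 1 is the nonlinear optimization over the Euler angles and step 2 is the decrease of the efficiency parameter; the loop terminates when the efficiency can no longer be reduced.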
Table 4.1: Extreme conditions for tests of LR free of the detection loophole for photon counters or photon detectors using the unbalanced Bell states |ψuB〉 defined in Eq. (4.11). The asymptotic behavior when θ → 0 is consistent with results in Ref. [1], which are shown in the last row. The angle parameters are explained in the text.
θ α1min α2min β1min β2min ηmin
45◦ 22.50◦ −67.50◦ −22.50◦ 67.50◦ 82.85 %
40◦ 21.28◦ −66.89◦ −21.28◦ 66.89◦ 80.61 %
35◦ 19.40◦ −65.60◦ −19.40◦ 65.60◦ 78.50 %
30◦ 17.00◦ −63.58◦ −17.00◦ 63.58◦ 76.50 %
25◦ 14.21◦ −60.72◦ −14.21◦ 60.72◦ 74.60 %
20◦ 11.14◦ −56.79◦ −11.14◦ 56.79◦ 72.81 %
15◦ 7.92◦ −51.42◦ −7.92◦ 51.42◦ 71.12 %
10◦ 4.70◦ −43.88◦ −4.70◦ 43.88◦ 69.53 %
5◦ 1.81◦ −32.41◦ −1.81◦ 32.41◦ 68.06 %
4◦ 1.32◦ −29.25◦ −1.32◦ 29.25◦ 67.78 %
3◦ 0.87◦ −25.55◦ −0.87◦ 25.55◦ 67.52 %
2◦ 0.48◦ −21.04◦ −0.48◦ 21.04◦ 67.27 %
1◦ 0.17◦ −15.01◦ −0.17◦ 15.01◦ 67.06 %
→ 0   0   → −2θ^{1/2}   0   → 2θ^{1/2}   → 2/3
First, we analyze unbalanced Bell states of the form
|ψuB〉 = cos(θ)|H〉A|H〉B + sin(θ)|V 〉A|V 〉B, (4.11)
where θ ∈ (0, π/4]. Note that whether there is a relative phase ei∆φ between the second and first
terms of Eq. (4.11) is not important, since Alice can always adjust her polarization basis, i.e.,
|H〉A → |H〉A, and |V 〉A → e−i∆φ|V 〉A, to put the state in the above form. In principle, the state
|ψuB〉 can be simulated by postselection on the state |ψpB〉 in Eq. (4.3), although this introduces a
loophole as mentioned earlier. Experimental techniques to prepare |ψuB〉 without postselection have
been demonstrated and applied to tests of LR [153, 154]. Here we calculate the statistical strength
for photon detectors. Photon counters have no advantage over photon detectors here, because no
more than one photon simultaneously arrives at Alice’s or Bob’s detectors. Hence, counters and
detectors register the various outcomes with the same probabilities. Our
optimization results are summarized in Table 4.1 and Fig. 4.2. The measurement angle αi,min (or
βj,min) shown in Table 4.1 is the angle from the z axis of the polarization state of an incoming photon
that gets reflected at PBS2 (or PBS3) in Fig. 4.1, where we use the Bloch sphere representation for
this state. By convention, |H〉 and (1/√2)(|H〉 + |V〉) are polarization states associated with the z and x
axes, respectively. The measurement operators are related to the measurement angles αic and βjc
by Aic = cos(αic)σz + sin(αic)[cos(φic)σx + sin(φic)σy] and Bjc = cos(βjc)σz + sin(βjc)[cos(φ′jc)σx +
sin(φ′jc)σy], i, j = 1, 2. The optimizations show heuristically that we can take φic = φ′jc = 0
everywhere; i.e., all the optimal measurement settings lie in the (x, z) plane of the Bloch sphere,
an observation which has been proven for several special cases [46, 48, 155].
From Table 4.1, we can see that when the optimal statistical strength S approaches 0, αi,min =
−βi,min for i = 1, 2. The minimum detection efficiency ηmin decreases monotonically with the
parameter θ in |ψuB〉 and is 82.85 % when θ = π/4, where the state is a Bell state. It approaches
2/3 when θ approaches 0, where the state is very close to a product state. These results are
consistent with previous results [28, 118]. From Fig. 4.2, we can see how the optimal statistical
strength increases for η > ηmin and how the input state must change to achieve this statistical
strength. Note that not all unbalanced Bell states can achieve a given statistical strength level
S > 0, even if η = 1. For example, for S ≥ 10−4, the parameter θ must be greater than 0.98◦.
Associated measurement settings can be found in the tables in Appendix B.
We also study the effect of the depolarizing noise in the unbalanced Bell state on the minimum
detection efficiency. We model the effective state shared by Alice and Bob as
ρAB = V |ψuB〉〈ψuB|+ (1− V )I/4, (4.12)
where I is the identity matrix of size 4 × 4, and the visibility V characterizes the depolarizing
noise in an experiment. Given the visibility V , the minimum detection efficiency ηc required to
achieve a specific statistical strength level X is a function of the state parameter θ. We study
the relationship between the visibility V and the overall minimum detection efficiency minθ ηc(θ)
required to achieve the statistical strength level S = 10−6, as plotted in Fig. 4.3. From Fig. 4.3,
we can see how the overall minimum detection efficiency minθ ηc(θ) changes with the visibility V .
Generally, the higher the visibility, the lower the overall minimum detection efficiency. Associated
states and measurement settings can be found in the tables in Appendix B.
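For reference, the model of Eq. (4.12) is easy to set up numerically; the sketch below builds ρAB and evaluates an outcome probability as Tr(ρO) for projective polarization measurements in the (x, z) Bloch plane. The angle convention (Bloch angle a giving the projector onto cos(a/2)|H〉 + sin(a/2)|V〉, with |H〉 on the z axis) matches the convention stated in the text; the function names are illustrative:

```python
import numpy as np

def rho_unbalanced(theta, V):
    """Depolarized unbalanced Bell state of Eq. (4.12):
    rho = V |psi><psi| + (1 - V) I/4, with
    |psi> = cos(theta)|HH> + sin(theta)|VV>,
    in the basis (|HH>, |HV>, |VH>, |VV>)."""
    psi = np.zeros(4)
    psi[0], psi[3] = np.cos(theta), np.sin(theta)
    return V * np.outer(psi, psi) + (1 - V) * np.eye(4) / 4

def prob_plus_plus(rho, alpha, beta):
    """Tr(rho O) for the joint outcome in which both photons project
    onto cos(a/2)|H> + sin(a/2)|V>, where a is the Bloch angle from
    the z axis in the (x, z) plane (a = alpha for Alice, beta for Bob)."""
    ket = lambda a: np.array([np.cos(a / 2), np.sin(a / 2)])
    proj = np.kron(np.outer(ket(alpha), ket(alpha)),
                   np.outer(ket(beta), ket(beta)))
    return float(np.trace(rho @ proj))
```

At V = 1 this reproduces the pure-state probabilities; at V = 0 every joint projective outcome has probability 1/4, which is why low visibility washes out the violation and drives up the required efficiency.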
Figure 4.2: Detection efficiency of photon counters or photon detectors required for different statistical strength levels S vs the parameter θ [Eq. (4.11)]. The curves (a)–(e) correspond to S = 0, 10−6, 10−5, 10−4, and 10−3, respectively. The empty squares show our calculated points, and the dotted lines are linear interpolations to guide the eyes. In curve (a), the linear extrapolation toward θ = 0 (approaching the 2/3 limit) is shown.
We now consider the pseudo-Bell states of Eq. (4.3). Let α = cos(γ) and β = sin(γ)e^{iφ}; then
Eq. (4.3) can be rewritten as
|ψpB〉 = cos²(γ)|H〉3|H〉4 + sin²(γ)e^{i2φ}|V〉3|V〉4 + cos(γ) sin(γ)e^{iφ}(|H〉3|V〉3 + |H〉4|V〉4), (4.13)
Figure 4.3: Tradeoff between the overall minimum detection efficiency min_θ ηc(θ) and the visibility V of unbalanced Bell states. Here, we fix the optimal statistical strength to 10−6.
where γ ∈ (0, π/4], and φ ∈ [0, 2π). We can prepare different pseudo-Bell states by changing the
values of both γ and φ. However, for a given γ, as the following discussion shows, the optimal
statistical strength S is the same regardless of the value of φ. In the test of LR as shown in
Fig. 4.1, Alice’s and Bob’s measurements are restricted to polarization rotation followed by photon
counting. They cannot detect coherences between any two of the following three parts of |ψpB〉 as written in Eq. (4.13): the first two terms, the third term, and the last term, because these parts correspond to different
photon-number-distribution subspaces. Hence, the measurement outcomes determined by |ψpB〉
are equivalent to the outcomes given by a mixture of the following two states:
|ψ1〉〈ψ1|, with |ψ1〉 ∝ cos²(γ)|H〉3|H〉4 + sin²(γ)e^{i2φ}|V〉3|V〉4, (4.14)
and
ρ2 ∝ (|H〉3|V〉3)(3〈H| 3〈V|) + (|H〉4|V〉4)(4〈H| 4〈V|). (4.15)
Since the state |ψ1〉 can be written in the form |ψuB〉 as in Eq. (4.11) by changing the mode
labels and the state bases, the measurement outcomes attributable to |ψ1〉 can reveal a violation
of LR when γ ∈ (0, π/4], as our earlier results show. But ρ2 is a separable state and so the
outcomes attributable to ρ2 can be explained by LR no matter what the measurement settings
{A1, A2, B1, B2} are. Hence, in a test of LR, the information about whether or not LR is violated
is conveyed only by the outcomes from |ψ1〉, while the state ρ2 acts as noise. Based on these
considerations and the earlier argument about being able to eliminate a potential phase in |ψuB〉,
we do not need to consider different phases φ in the pseudo-Bell state |ψpB〉 when calculating the
optimal statistical strength S. Hence, we can choose a fixed value, such as φ = 0. Moreover, we
determined heuristically by extended optimizations in selected cases that the optimal measurement
settings {A1c, A2c, B1c, B2c} can be chosen to lie in the (x, z) plane of the Bloch sphere, just like for
|ψuB〉. Taking these observations into account reduces the number of free parameters and speeds
up the general calculations.
The optimization results for pseudo-Bell states are summarized in Table 4.2 and Fig. 4.4.
Similar to unbalanced Bell states, Table 4.2 shows that when the optimal statistical strength S
approaches 0, αi,min = −βi,min for i = 1, 2. Figure 4.4 shows that there is a lower bound on the
state parameter γ to achieve a nonzero statistical strength level S. Measurement settings for the
results shown in Fig. 4.4 are given in Appendix B.
Table 4.2 and Fig. 4.4 (a) show that the minimum detection efficiency ηmin required to close
the detection loophole achieves its minimum in the interior of the domain, in contrast to what was
found for the case of unbalanced Bell states. We might have expected this behavior based on the
following two observations: First, with respect to the measurement setups used (see Fig. 4.1), the
Table 4.2: Extreme conditions for tests of LR free of the detection loophole for photon counters and photon detectors using the pseudo-Bell states of Eq. (4.13). The angle parameters are explained in the text. The minimum detection efficiencies for counters and detectors when γ = 45◦ are the same as those found in Ref. [2].
Photon counter | Photon detector
γ α1min α2min β1min β2min ηmin | α1min α2min β1min β2min ηmin
45◦ 22.50◦ −67.50◦ −22.50◦ 67.50◦ 90.62 % | 11.64◦ −63.88◦ −11.64◦ 63.88◦ 92.23 %
40◦ 20.49◦ −66.01◦ −20.49◦ 66.01◦ 89.71 % | 11.08◦ −62.79◦ −11.08◦ 62.79◦ 91.31 %
35◦ 16.76◦ −62.14◦ −16.76◦ 62.14◦ 89.78 % | 9.79◦ −59.60◦ −9.79◦ 59.60◦ 91.11 %
30◦ 12.32◦ −56.16◦ −12.32◦ 56.16◦ 90.80 % | 7.93◦ −54.42◦ −7.93◦ 54.42◦ 91.71 %
25◦ 8.00◦ −48.43◦ −8.00◦ 48.43◦ 92.57 % | 5.73◦ −47.46◦ −5.73◦ 47.46◦ 93.05 %
20◦ 4.43◦ −39.49◦ −4.43◦ 39.49◦ 94.71 % | 3.53◦ −39.09◦ −3.53◦ 39.09◦ 94.89 %
15◦ 1.96◦ −29.88◦ −1.96◦ 29.88◦ 96.81 % | 1.68◦ −29.76◦ −1.68◦ 29.76◦ 96.85 %
10◦ 0.59◦ −19.98◦ −0.59◦ 19.98◦ 98.52 % | 0.54◦ −19.96◦ −0.54◦ 19.96◦ 98.53 %
5◦ 0.07◦ −10.00◦ −0.07◦ 10.00◦ 99.63 % | 0.07◦ −10.00◦ −0.07◦ 10.00◦ 99.63 %
state |ψpB〉 can be thought of as the state |ψuB〉 with noise, as pointed out above, and second, the
violation of LR given by |ψuB〉 is very sensitive to noise, particularly when θ in |ψuB〉 of Eq. (4.11)
is small (see the results under Fig. 4.3 and discussions in Ref. [50]). Table 4.2 and Fig. 4.4 (a) also
suggest that any pseudo-Bell state |ψpB〉 can violate LR using counters or detectors with sufficient
efficiency.
When we look at the minimum detection efficiency required to achieve a given statistical
strength level S, the efficiencies of photon counters and photon detectors are notably different,
showing the utility of the additional information available with photon counters. The advantage
of photon counters is most notable for γ between approximately 35◦ and 45◦. In particular, the
minimum detection efficiency ηmin is 89.71 % for photon counters and 91.11 % for photon detectors,
and ηmin is achieved for γ in this range. Loosely speaking, this advantage is because photon counters
are better at differentiating between measurement outcomes contributed by the entangled (|ψ1〉 in
Eq. (4.14)) and unentangled (ρ2 in Eq. (4.15)) parts of the state |ψpB〉.
A comparison between Fig. 4.2 and Fig. 4.4 suggests that higher efficiencies are required to
achieve given statistical strengths with pseudo-Bell states |ψpB〉 than with unbalanced Bell states
|ψuB〉. This again can be attributed to the noise added by ρ2 to measurement outcomes, which
reduces the statistical strength considerably. As an explicit example, consider the optimal statistical
Figure 4.4: Detection efficiencies of photon counters and photon detectors required for different statistical strength levels S vs the parameter γ of the pseudo-Bell state of Eq. (4.13): (a) S = 0, (b) S = 5 × 10−5, (c) S = 5 × 10−4, and (d) S = 1.5 × 10−3. The calculated points are labeled by squares for photon counters and by diamonds for photon detectors, and the dotted lines are linear interpolations to guide the eyes.
strengths SuB or SpB achievable with
|ψuB(θ = π/4)〉 = (1/√2)(|H〉A|H〉B + |V〉A|V〉B), (4.16)
or with
|ψpB(γ = π/4, φ = 0)〉 = (1/2)(|H〉3|H〉4 + |V〉3|V〉4 + |H〉3|V〉3 + |H〉4|V〉4). (4.17)
We find that SuB = 2SpB ≈ 0.04627 for perfect photon counters. The ratio can be explained
by observing that one half of the measurement outcomes of |ψpB(γ = π/4, φ = 0)〉 are from the
separable state ρ2 in Eq. (4.15).
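One way to make the factor of two explicit (a sketch giving an upper bound on S_pB, consistent with the numerical result; it assumes the LR model is free to match the photon-number sectors independently, which is allowed since a mixture of LR models is again an LR model): the outcome distribution splits over disjoint outcome supports as
\[
q_{\mathrm{pB}} \;=\; \tfrac{1}{2}\, q_1 \,\oplus\, \tfrac{1}{2}\, q_2 ,
\]
where \(q_1\) is the outcome distribution of \(|\psi_1\rangle\) (identical to that of the Bell state) and \(q_2\), generated by the separable \(\rho_2\), is itself attainable by an LR model. Choosing the LR mixture \(p = \tfrac{1}{2}\, p_{\mathrm{LR}} \oplus \tfrac{1}{2}\, q_2\), with \(p_{\mathrm{LR}}\) the best LR model for \(q_1\), the divergence splits sector by sector:
\[
D_{\mathrm{KL}}(q_{\mathrm{pB}} \,\|\, p)
\;=\; \tfrac{1}{2}\, D_{\mathrm{KL}}(q_1 \,\|\, p_{\mathrm{LR}})
\;+\; \tfrac{1}{2}\, D_{\mathrm{KL}}(q_2 \,\|\, q_2)
\;=\; \tfrac{1}{2}\, S_{\mathrm{uB}} ,
\]
so \(S_{\mathrm{pB}} \le S_{\mathrm{uB}}/2\), matching the numerically observed equality \(S_{\mathrm{uB}} = 2 S_{\mathrm{pB}}\).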
4.4 Concluding remarks
We have demonstrated a method to measure the statistical strength of tests of LR that is
based on the KL divergence from the predicted experimental probability distribution to the best
prediction given by LR. This method helps to design a loophole-free test of LR and quantifies the
confidence in violation of LR for sufficiently large experimental data sets. We used the method
to determine optimal statistical strengths of tests of LR using a typical measurement setup for
polarized photon pairs with inefficient detectors or counters. We considered both ideal unbalanced
Bell states and pseudo-Bell states obtained by combining independent polarized photons on a
polarizing beam splitter. Creating the latter can be easier [116, 117, 145], but observing a violation
of LR requires higher detection efficiencies. Our calculations show that with pseudo-Bell states,
we can close the detection loophole with a minimum detection efficiency of 89.71 % using photon
counters, or 91.11 % using photon detectors. For unbalanced Bell states, we confirmed previous
calculations [28] showing that violations of LR are possible at detection efficiencies above 2/3.
Furthermore, we numerically exhibited the relationships between state parameters (or visibilities)
and minimum detection efficiencies needed to achieve given levels of statistical strength. Given that
the current roadblock for performing loophole-free tests of LR with photons is detection inefficiency
rather than the difficulty of obtaining an entangled source, we cannot recommend using the pseudo-
Bell state for such an experiment.
In current experiments based on SPDC to produce entangled photon pairs, we must consider
other sources of potentially unwanted measurement outcomes. Such sources include dark counts
and the generation of more than one photon pair [156, 157]. The latter effect can be quite noticeable,
particularly for the brighter, more strongly pumped sources. Further work is required to analyze
the consequences of these effects for the statistical strength. It is also desirable to obtain rigorous
confidence levels for rejecting LR with moderately sized data sets, which we discuss in detail in the
following two chapters. Such confidence levels will improve on measures derived from experimental
SDs of Bell-inequality violation.
Chapter 5
Asymptotically optimal data analysis for rejecting local realism
From the discussion in Sec. 4.2 of the previous chapter, we know that there are several
problems with the conventional measure, the number of experimental standard deviations (SDs)
of violation of a Bell inequality. To avoid these problems, in this chapter we show how to analyze
data from experimental tests of LR to compute a measure of the strength of the evidence against
local realism (LR). By computing this measure, violations of LR by different experiments can
be rigorously assessed and compared. Specifically, the proposed analysis protocol quantifies the
violation of LR in terms of p-values, where small p-values imply strong violation. We call this
the prediction-based-ratio (PBR) protocol. Protocols such as this compute a p-value from a “test
statistic” (see Sec. 5.1 for details). A test statistic is a function of the sequence of trial results,
i.e., measurement-setting choices and outcomes of trials. There are many such statistics to choose
from; an example is the Bell-inequality violation estimated from a finite number of trials and used
by the SD-based protocol.
We prove that the PBR protocol is valid; see Sec. 5.1 for the definition of validity. We compare
the PBR protocol to SD-based and martingale-based [18, 19] protocols. For N independent and
identically distributed trials, these protocols have the property that the p-values computed decrease
to 0 exponentially as N → ∞. We can therefore compare different protocols’ performances in a
test of LR according to the (asymptotic) confidence-gain rate defined by
G = − lim_{N→∞} log2(p(prot)N)/N, (5.1)
where p(prot)N is the p-value computed by a protocol. It is desirable to have a high confidence-gain
rate as this implies that fewer trials are needed to achieve the same strength of violation of LR.
Given the experimental probability distribution q, the optimal confidence-gain rate that can be
achieved by any protocol is given by the statistical strength Sq as defined in Eq. (4.5) of Chapter 4.
We prove that the PBR protocol is asymptotically optimal. That is, its p-values always achieve the
optimal confidence-gain rate. The confidence-gain rates achieved by different protocols are shown
in Figs. 5.1 and 5.2 for a number of experimental configurations that are explained in Sec. 5.4. The
figures show that SD-based p-values are not valid in some regions. Because the ratio of the
SD-based confidence-gain rates to the asymptotically optimal ones varies substantially,
results of experiments with different configurations cannot be directly compared by the common
“number of SDs of violation” measure. The martingale-based protocol is valid and computationally
simple but achieves suboptimal confidence-gain rates.
The PBR protocol remains valid even if the prepared quantum state, measurement settings,
and relevant local realistic (LR) models vary arbitrarily during an experiment, that is, in the
presence of the memory effect [20, 21, 22, 23]. This is desirable not only for tests of LR but
also for practical applications of quantum information, such as device-independent quantum key
distribution [11, 13, 107, 108, 109], randomness expansion [3, 110, 111, 112], state estimation [158],
and certification of entangled measurements [159].
Compared with the other two protocols, an advantage of the PBR protocol is that it can
be applied to a wide variety of configurations (the combinations of quantum state, measurement
settings and other relevant parameters) without having to specify a Bell inequality. Since such
Bell inequalities characterize the family of probability distributions achievable by LR models, they
provide a useful guide to designing an experiment and determining good goal configurations to
be achieved. But since Bell-inequality violation is not directly related to statistical strength, it
is not obvious how to choose the best inequality with respect to an experiment. Moreover, the
predetermined Bell inequality restricts a successful experiment to configurations close to the goal,
closer than may be achievable in a given experiment. The PBR protocol automatically adapts to
deviations from the goal, achieving optimal confidence-gain rates for actual configurations. One
can exploit this adaptability by applying the PBR protocol to experiments in progress. This makes
it possible to monitor the current (non-)violation of LR for the purpose of optimizing configuration
parameters. Appendix A contains the code information and documentation for an implementation
of the PBR protocol (the local realism analysis engine) that can be used for monitoring experi-
ments in progress and for analyzing existing data sets. Our results show that the PBR protocol is
sufficiently efficient for practical use with typical experimental configurations.
The chapter is structured as follows: In Sec. 5.1, we provide the relevant statistical back-
ground and justify the use of p-values. In Sec. 5.2, we explain how to compute p-values using
the three protocols mentioned above. We then discuss the technical details for applying the PBR
protocol in Sec. 5.3. Finally in Sec. 5.4, we show how confidence-gain rates achieved by different
protocols compare for various tests of LR. The protocols are also applied to and compared on
simulated and actual experiments. This chapter is based on our previous work [17].
5.1 Statistical concepts
To quantify the strength of the experimental evidence against LR, one needs to take into
account the possibility that a finite set of data generated according to LR can violate a Bell
inequality due to statistical fluctuations in finite samples. This possibility can be formalized in
statistics via a p-value for the hypothesis test of LR. A p-value is associated with a test statistic T
that is a function of the sequence of trial results. If N is the total number of trials, the corresponding
sequence of results is denoted by x = (x1, . . . , xN ). As is conventional, we distinguish between
the sequence of results and the sequence of random variables X = (X1, . . . , XN ) giving rise to
these results. The exact p-value pN is defined as the maximum of the probabilities of the events
T (XLR) ≥ T (x) over all random-variable sequences XLR distributed according to LR models. That
is,
pN = maxLR ProbLR(T (XLR) ≥ T (x)). (5.2)
Due to the difficulty of determining worst-case tail probabilities of typical test statistics, we can
usually determine only upper bounds of exact p-values. Moreover, to close the memory loophole in
a test of LR, the computation of exact p-values is further complicated by the fact that the set of
null hypotheses includes all possible sequences of LR models depending on previous trial results.
Thus, for the remainder of this chapter, the term “p-value” refers to any putative upper-bound
b(T (x)), computed according to a protocol, on the exact p-value Eq. (5.2). That is, the p-value of
a protocol given the observed data x is defined by p(prot)N = b(T (x)).
In order to be able to interpret a protocol’s p-value as a measure of the violation of LR,
it must satisfy statistical validity: A protocol and its p-values are valid if and only if the bound
b(t) ≥ ProbLR(T (XLR) ≥ t) is true whenever XLR is distributed according to LR.
A main purpose of the PBR and related protocols is to evaluate the strength of the evidence
against LR by computing valid p-values given the data. Some care must be taken in interpreting
such p-values in terms of probabilities. For example, a p-value cannot be interpreted as a probability
that LR is true. Although p-values are computed for the data, their validity is defined in terms of
what is known before an experiment, not after. Strictly speaking, we can only state for sure that
before performing the trials, the following holds: For any fixed 0 ≤ α ≤ 1, if LR holds, then the
probability that the returned valid p-value satisfies p(prot)N ≤ α is at most α. Although we have no
intention of making an actual decision on the failure of LR, this statement can be viewed in terms
of traditional hypothesis testing: A protocol tests LR simultaneously at all significance levels α,
and “rejects” LR at a given α if p(prot)N ≤ α. The validity property is equivalent to the statement
that, if LR holds, the maximum probability of (falsely) rejecting at level α is bounded above by
α. This justifies the use of p-values to quantify the violation of LR. The definitions of significance
levels and p-values are based on Ref. [151], 2nd edition, pages 126 and 127.
We use the term “protocol” rather than “test” for two reasons. The first is that the
term “test” in “test of LR” typically refers to the experimental setup and subsequent analysis,
not a conventional hypothesis test. The second is that hypothesis tests, as the term is used in
mathematical statistics, are valid by definition. Thus, although we do not encourage it, one can
think of a valid analysis protocol as a family of hypothesis tests. For such a family to be useful,
the tests should also have high power. For our situation, one can express the power in terms
of the probabilities of rejection at given significance levels, supposing a set of data are sampled
from non-LR models. Alternatively, one can consider the expected p-values, and look for tests for
which the expected p-values are as small as possible. We do not expect that the PBR protocol has
particularly low p-values for a given finite number of trials. In fact, because of the conservative
nature of Markov’s inequality used in the PBR protocol (see Sec. 5.2.4), better protocols exist.
However, asymptotic optimality of the PBR protocol assures us that it performs well when the
evidence for rejection is very strong.
It is also worth noting that many issues that arise in applications of hypothesis testing, such
as selection biases, are less of a concern when one is considering the extremely low p-values that
are desirable when falsifying a physical theory. Corrections for such effects improve p-values by
relatively small terms in our setting. Also, one application of the PBR protocol is to quantify
the success of an experiment independent of the details of the configuration, so that different
experiments can be compared. For this application, the statistical interpretation of the p-value
serves only as a motivation.
5.2 Theory
In this section, we consider three protocols that determine p-values for rejecting LR from ex-
perimental data: SD-based, martingale-based, and PBR protocols. The first two protocols depend
on a Bell inequality, whereas the PBR protocol requires only a sequence of trial results. While all
three of these protocols apply to tests of LR with multiple parties, we discuss them explicitly for the
bipartite case to simplify the formulas. (Our implementation of the local realism analysis engine
is presently restricted to tests of LR with two parties.) The result of the n’th trial is denoted by
xn = (in, jn, an, bn), where in, jn are the n’th chosen settings and an, bn are the n’th observed
outcomes of Alice and Bob, respectively. Let i(X) and j(X) be Alice’s and Bob’s settings, respec-
tively, given the potential result X. The joint-setting distribution is fixed, and the probability of
choosing the settings i and j by Alice and Bob is given by pi,j .
Before explaining the details of the protocols, let us discuss how to use Bell inequalities in
the SD-based and martingale-based protocols.
5.2.1 Bell functions
To apply the SD-based or martingale-based protocol, we need to write a Bell inequality in
the following form
〈I(X)〉 ≤ B, (5.3)
where X is the random variable from which a trial result x is sampled, I is a real-valued function,
called a Bell function, and I = 〈I(X)〉 is its expectation. Here, the expectation is with respect to
the joint distribution of measurement settings and outcomes. An example is the Clauser-Horne-
Shimony-Holt (CHSH) inequality in Eq. (1.2). In this case, if the trial result x consists of setting
choices i, j and outcomes a, b, then
ICHSH(x) = (1− 2δi,2δj,2)ab/pi,j , and B = 2. (5.4)
The functional form ICHSH in Eq. (5.4) ensures that its expectation is equal to the left-hand side
of the CHSH inequality (1.2). In particular, this requires dividing by the known probabilities of
choosing different measurement settings. There is no loss of generality by fixing the setting distri-
bution in advance. Violation of LR requires that measurement settings be chosen independently of
local hidden variables. In particular, the locality and memory loopholes cannot be closed unless at
each trial, measurement settings are chosen randomly and independently by each party according
to a known probability distribution so that there is no possibility of a causal connection between
any two events of Alice’s setting choice, Bob’s setting choice, and the emission of the entangled
particle pair.
Given an experimentally obtained sequence of results x1, . . . , xN from N trials, the obvious
method for estimating I is to compute the average of the sequential values I(xn) given by
I = (1/N) ∑_{n=1}^{N} I(xn). (5.5)
However, this is not the minimum-variance estimate of I, since the setting distribution is fixed and
known. In fact, the conventional way of writing a Bell inequality is as a sum of expectations as in
Eq. (1.2), which makes it independent of the setting distribution. The correspondence between the
two ways of writing a Bell inequality is given by
〈I(X)〉 = ∑_{i,j} pi,j 〈I(X)|i(X) = i, j(X) = j〉, (5.6)
where the expectation in the sum is conditioned on the settings of Alice and Bob, as indicated. If
we assume that the state at each trial is identical and do not worry about the memory and locality
loopholes, we can estimate each expectation 〈I(X)|i(X) = i, j(X) = j〉 separately, experimentally
fixing the settings for each estimate if desired. The right-hand side of Eq. (5.6) can then be
computed formally. If we define c(i, j, a, b) to be the number of trials with setting choices i, j and
outcomes a, b, the estimate for I thus computed is
I = ∑_{i,j} pi,j [∑_{a,b} c(i, j, a, b) I(i, j, a, b)] / [∑_{a,b} c(i, j, a, b)], (5.7)
a nonlinear function of c(i, j, a, b). Its SD can be approximated by linear propagation of errors from
SDs for the counts c(i, j, a, b), assuming that these counts are independent and each count follows
a Poisson distribution, as is commonly done in experiments. The SD thus obtained is generally
smaller than that of I in Eq. (5.5).
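As a concrete illustration, the estimate of Eq. (5.7) and its propagated Poisson SD can be sketched for the CHSH function of Eq. (5.4). This is only a sketch under stated assumptions, not the thesis's analysis code: the counts, the uniform setting distribution pi,j = 1/4, and the function name `estimate_chsh` are hypothetical.

```python
import math

def estimate_chsh(counts):
    """Eq. (5.7) for the CHSH function of Eq. (5.4): per-setting
    conditional averages weighted by the known setting probabilities.
    With I(i,j,a,b) = sign(i,j)*a*b/p_ij and uniform p_ij = 1/4, the
    weights cancel and I reduces to the usual sum of signed correlators.
    Returns (I, sigma), where sigma comes from linear propagation of
    Poisson errors on the counts c(i, j, a, b)."""
    I, var = 0.0, 0.0
    for (i, j) in [(1, 1), (1, 2), (2, 1), (2, 2)]:
        sign = -1 if (i, j) == (2, 2) else 1
        n = sum(counts[(i, j, a, b)] for a in (1, -1) for b in (1, -1))
        corr = sum(a * b * counts[(i, j, a, b)]
                   for a in (1, -1) for b in (1, -1)) / n
        I += sign * corr
        var += (1 - corr ** 2) / n  # squared Poisson SD of one correlator
    return I, math.sqrt(var)

# Hypothetical counts approximating a singlet at CHSH-optimal angles.
counts = {}
for (i, j) in [(1, 1), (1, 2), (2, 1), (2, 2)]:
    E = (-1 if (i, j) == (2, 2) else 1) / math.sqrt(2)
    for a in (1, -1):
        for b in (1, -1):
            counts[(i, j, a, b)] = round(2500 * (1 + a * b * E) / 4)

I, sigma = estimate_chsh(counts)
print(I, sigma)  # I close to 2*sqrt(2), roughly 29 SDs above B = 2
```
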
5.2.2 SD-based protocol
The results from N trials are used to obtain I and estimate the SD σ of I as discussed in
Sec. 5.2.1. Given that I > B, it is conventional to give (I −B)/σ, the number of SDs of violation,
as a measure of the amount of violation. To convert the number of SDs to a p-value, we make the
unjustified assumption that, for any LR model the distribution of the random variable ILR, from
which I is sampled, is sufficiently close to Gaussian with the SD σ as estimated from N trial results
but with a mean bounded by B. With this assumption, according to any LR model, the probability
of the event ILR ≥ I is then bounded above by
ProbLR(ILR ≥ I) ≤ Q((I − B)/σ), (5.8)
where Q(z) is the Q-function, which is the probability that a standard normal random variable Z
satisfies Z ≥ z. This allows us to assign the p-value for the observed statistic I as
p(SD)N = Q((I − B)/σ), (5.9)
with the caveat that our assumption is not justified. As a function of the number of trials N ,
σ√N approaches σ1, where σ1 is an effective one-trial SD. For large N , the quantity Q((I −B)/σ)
approaches exp(−N(I − B)²/(2σ1²)). Thus, according to Eq. (5.1), the confidence-gain rate achieved by the
SD-based protocol is
GSD = log2(e) (I − B)²/(2σ1²). (5.10)
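For concreteness, converting the number of SDs of violation to the p-value of Eq. (5.9) is just an evaluation of the Q-function. A minimal sketch, with illustrative numbers (the inputs are hypothetical, and B = 2 is the CHSH bound):

```python
import math

def q_function(z):
    """Upper tail of the standard normal: Q(z) = Prob(Z >= z)."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def sd_based_p_value(I, sigma, B=2.0):
    """Eq. (5.9) for an observed estimate I with estimated SD sigma.
    Valid only under the unjustified Gaussian assumption in the text."""
    return q_function((I - B) / sigma)

print(sd_based_p_value(2.5, 0.1))  # 5 SDs of violation: about 2.9e-7
```
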
SD-based p-values are not valid for two reasons. First, the experimental SD is different from
the worst-case SD assuming LR. While it may be possible to check the relevant SDs for all LR
models, this is a challenging task. Second, deviations from Gaussianity in the extreme tail of the
distribution for ILR cannot be asymptotically neglected. To explain this issue, define the random
variable F =√N(ILR − B)/σ1. For any LR model, the expectation 〈F 〉 ≤ 0. Assuming that
LR models have the same SD as the experimentally estimated one, we expect that according to
the central limit theorem, F − 〈F 〉 converges in distribution to a standard normal distribution.
Here, convergence in distribution implies that for a constant l, the probability of the event F ≥ l
converges to the standard normal distribution’s probability for this event. But for the computation
of p(SD)N , one needs the probability of the event F ≥ √N(I − B)/σ1, where √N(I − B)/σ1 scales as
√N and therefore goes to infinity as an experiment progresses. Thus, convergence in distribution
is insufficient for estimating this probability.
The number of SDs of violation is not normally explicitly converted to a p-value as done here.
Instead, it is primarily intended as a way of claiming successful violation with a good signal-to-
noise ratio. Naturally, one would like to use the measure to compare the strength of the violation
of LR in different experiments. Such a relative comparison works only if the experiments use the
same test of LR with the same state, experimental settings, losses, visibilities, and other relevant
parameters.
5.2.3 Martingale-based protocol
For fundamental tests of quantum mechanics, a serious deficiency of SD-based assessments of
experimental tests of LR is that they do not account for memory effects [20, 21, 22, 23], including
the possibility that the state and settings drift in the course of the experiment. To account for the
time dependence of the state and setting parameters and relevant LR models in an experiment,
R. Gill suggested a method for computing p-values based on the super-martingale structure of the
time sequence of observations in a test of LR [18, 19]. That is, given a Bell inequality 〈I(X)〉 ≤ B
as in Eq. (5.3), one can show that the time sequence Mn = ∑_{k=1}^{n} (I(Xk) − B), n = 1, 2, . . .,
is a super-martingale according to any LR model. Here, the measurement settings are assumed
to be chosen randomly and independently at each trial by Alice and Bob according to the fixed
probability distribution pi,j built into the Bell inequality. If the range of the Bell function I is
included in the finite interval [bl, bu], one then can apply large-deviation bounds for the super-
martingale {Mn : n = 1, 2, . . .} with bounded increments Mn−Mn−1 ∈ [bl−B, bu−B] to compute
p-values.
To show that the sequence Mn, n = 1, 2, . . ., is a super-martingale, let Wn be all the infor-
mation available before the n’th trial, including all previous trial results x1, . . . , xn−1. According
to any LR model, the conditional expectation of Mn given Wn satisfies
〈Mn|Wn〉 = 〈I(Xn)−B +Mn−1|Wn〉
= 〈I(Xn)|Wn〉 −B + 〈Mn−1|Wn〉
= 〈I(Xn)|Wn〉 −B +Mn−1
≤Mn−1. (5.11)
The last inequality follows from the fact that the Bell inequality 〈I(X)〉 ≤ B is satisfied for any
LR model, regardless of prior information. The inequality in Eq. (5.11) is the defining property for
a super-martingale {Mn : n = 1, 2, . . .}.
Given the results x1, . . . , xN after N trials, an experimental test yields an estimate I =
(1/N) ∑_{n=1}^{N} I(xn) of I. Suppose that the n’th trial result xn is distributed according to a ran-
dom variable XLR,n satisfying LR. In this case, the random variable from which I is sampled
is I ′LR = (1/N) ∑_{n=1}^{N} I(XLR,n). By applying the Azuma-Hoeffding inequality [160, 161, 162] for the
tail probability of the super-martingale {Mn : n = 1, 2, . . . , N} with bounded increments, we find
that, after N trials, the probability according to an LR model that I ′LR takes a value greater than
or equal to the observed I > B is bounded above by
ProbLR(I ′LR ≥ I) = ProbLR(MN ≥ N(I − B))
≤ exp(−2N(I − B)²/(bu − bl)²). (5.12)
We can further tighten the above bound according to Theorem 6.1 of Ref. [162]. The tighter bound
is
ProbLR(I ′LR ≥ I) = ProbLR(MN ≥ N(I − B))
≤ [((bu − B)/(bu − I))^((bu−I)/(bu−bl)) ((B − bl)/(I − bl))^((I−bl)/(bu−bl))]^N. (5.13)
This implies a valid p-value for the observed statistic I as
p(mart)N = [((bu − B)/(bu − I))^((bu−I)/(bu−bl)) ((B − bl)/(I − bl))^((I−bl)/(bu−bl))]^N, (5.14)
For large N , I approaches I, thus the confidence-gain rate according to Eq. (5.1) is
Gmart = ((bu − I)/(bu − bl)) log2((bu − I)/(bu − B)) + ((I − bl)/(bu − bl)) log2((I − bl)/(B − bl)). (5.15)
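As an illustration of Eqs. (5.14) and (5.15), the following sketch evaluates the martingale-based p-value and gain rate for the CHSH function of Eq. (5.4) with uniform settings, for which I(x) = ±4, so bl = −4 and bu = 4. The observed value of I and the trial number are hypothetical.

```python
import math

def martingale_p_value(I, N, B=2.0, b_l=-4.0, b_u=4.0):
    """Eq. (5.14): tail bound for an observed I > B after N trials,
    with the Bell function bounded in [b_l, b_u]."""
    w_u = (b_u - I) / (b_u - b_l)
    w_l = (I - b_l) / (b_u - b_l)
    per_trial = ((b_u - B) / (b_u - I)) ** w_u \
        * ((B - b_l) / (I - b_l)) ** w_l
    return min(per_trial ** N, 1.0)

def martingale_gain_rate(I, B=2.0, b_l=-4.0, b_u=4.0):
    """Eq. (5.15): the asymptotic confidence-gain rate in bits/trial."""
    w_u = (b_u - I) / (b_u - b_l)
    w_l = (I - b_l) / (b_u - b_l)
    return w_u * math.log2((b_u - I) / (b_u - B)) \
        + w_l * math.log2((I - b_l) / (B - b_l))

print(martingale_p_value(2.8, 1000))  # about 1.2e-13
print(martingale_gain_rate(2.8))      # about 0.043 bits per trial
```

Note that −log2(p(mart)N)/N reproduces Gmart exactly here, since the per-trial factor in Eq. (5.14) is 2 raised to minus the rate of Eq. (5.15).
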
Note that, although Theorem 6.1 of Ref. [162] is stated for a martingale that is a sequence of
random variables Mn, n = 1, 2, . . ., such that 〈Mn|Wn〉 = Mn−1, the same result and its proof also
apply to a super-martingale. The same bound is also derived in Theorem 1 of Ref. [160] for a sum
of independent and bounded random variables. From Refs. [160, 162], we can see that the bound
in Eq. (5.13) is tighter than bounds of ProbLR(I ′LR ≥ I) used in previous works [3, 17, 19], for
example, the bound as shown in Eq. (5.12). Moreover, from the proof of Theorem 6.1 in Ref. [162],
we can see that, even if a Bell function I and its bounds depend on n, that is, bl,n ≤ I(xn) ≤ bu,n
for any result xn at the n’th trial, the p-value assignment as in Eq. (5.14) is still valid with bu
and bl replaced by the averages bu = ∑_{n=1}^{N} bu,n/N and bl = ∑_{n=1}^{N} bl,n/N , respectively.
We cannot expect the bound on the tail probability in Eq. (5.13) to be asymptotically tight,
since the only constraints considered are the bounds on the Bell function I. The PBR protocol
takes advantage of all available constraints on the distributions of trial results according to LR,
implicitly including all relevant Bell inequalities.
5.2.4 PBR protocol
In contrast to a fixed Bell inequality used in the SD-based or martingale-based protocol,
given the setting distribution pi,j , after n trials but before the (n + 1)’th trial the PBR protocol
returns a special Bell inequality of the form
〈Rn(X)〉 ≤ 1 (5.16)
with a nonnegative Bell function Rn. Here, Rn can depend on previous trial results x1, . . . , xn and
other aspects of the experiment before starting the (n+1)’th trial. The construction of Rn typically
requires predicting the distribution of Xn+1. Thus, Rn is referred to as a prediction-based ratio
(PBR).
Given any sequence of PBRs Rn, n = 0, 1, 2, . . ., the PBR protocol computes a test statistic
according to Pn = ∏_{k=1}^{n} Rk−1(Xk), that is, the product of the values of Rk−1 at the potential
result Xk of the k’th trial. We claim that, according to any LR model with arbitrary memory, the
expectation of the test statistic satisfies
〈Pn〉 ≤ 1. (5.17)
To prove the claim, as in Sec. 5.2.3 let Wn denote all the information available before the n’th trial.
Then, according to any LR model with arbitrary memory, the expectation of Pn conditioned on
Wn satisfies
〈Pn|Wn〉 = 〈∏_{k=1}^{n} Rk−1(Xk) | Wn〉
= 〈∏_{k=1}^{n−1} Rk−1(Xk) × Rn−1(Xn) | Wn〉
= ∏_{k=1}^{n−1} Rk−1(Xk) × 〈Rn−1(Xn)|Wn〉
≤ Pn−1, (5.18)
where we used the facts that Wn includes Rk−1 and Xk−1 for k ≤ n, and that the LR bound
on 〈Rn−1(X)〉 is 1 given Wn, as the LR model in the bound is arbitrary. We can compute the
expectations of both sides of Eq. (5.18) to show that, according to any LR model, 〈Pn〉 ≤ 〈Pn−1〉,
and therefore, by induction, 〈Pn〉 ≤ 1, which is the inequality (5.17).
Given a sequence of experimental results x1, . . . , xN from N trials, the test statistic PN takes
a specific value P = ∏_{n=1}^{N} Rn−1(xn). Suppose that PN is constrained by LR, possibly with memory.
By construction PN ≥ 0, and the expectation according to an LR model 〈PN 〉 ≤ 1 as shown above.
According to Markov’s inequality, we conclude that
ProbLR(PN ≥ P ) ≤ min(1/P , 1), (5.19)
which shows that we can assign a valid p-value associated with the observed statistic P according
to
p(PBR)N = min((∏_{n=1}^{N} Rn−1(xn))^(−1), 1). (5.20)
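Computing the p-value of Eq. (5.20) from a sequence of PBR values is then a short accumulation. The ratios below are hypothetical placeholders for Rn−1(xn); in a long experiment one would accumulate log2 of the ratios instead, to avoid overflow.

```python
def pbr_p_value(ratios):
    """Eq. (5.20): ratios is the sequence R_0(x_1), R_1(x_2), ...,
    each trial's PBR evaluated at that trial's observed result."""
    product = 1.0
    for r in ratios:
        product *= r
    # Any zero ratio forces the p-value to 1 with no later recovery,
    # which is why zero probability estimates must be avoided (Sec. 5.3.1).
    return min(1.0 / product, 1.0) if product > 0 else 1.0

# 1000 trials each contributing a ratio of 1.03, i.e. log2(1.03) ~ 0.043
# bits of evidence against LR per trial.
print(pbr_p_value([1.03] * 1000))  # about 1.5e-13
```
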
Note that, Eq. (5.18) shows that the sequence Pn, n = 1, 2, . . ., is a super-martingale under
any LR model. Since the increment of this super-martingale is not bounded, we cannot apply the
method of Sec. 5.2.3 to bound the tail probability. However, we can use the optional stopping
theorem for a super-martingale [163], to get the following nice property of the PBR protocol:
Suppose that one stops the experiment if and only if the observed statistic P is greater than a
prespecified value P0 > 1 or the number of trials performed is greater than a prespecified value N0.
Then, the total number of trials N performed in an experiment is a random variable depending on
P0 and N0. According to the optional stopping theorem for a super-martingale, the expectation
of PN according to LR at the stopping time N is bounded above by 1. That is, even if the rule
for stopping the experiment is chosen in advance by a theorist who wants LR to prevail, the
theorist cannot explain the observed data on average.
For the extremely low p-values of interest in tests of LR, we are looking for large (negative) log-
p-value increments log2(Rn(xn+1)) at the (n+ 1)’th trial. Therefore, before the (n+ 1)’th trial, our
goal is to choose Rn so as to maximize the experimentally expected increment l = 〈log2(Rn(Xn+1))〉.
For this purpose, we can take advantage of anything we know about the probability distribution of
the random variable Xn+1 giving rise to the next trial result. Consider a probability distribution q
for Xn+1, which may be either the true distribution or an estimate thereof. Let p be the distribution
according to an LR model. Note that, because the setting distribution is under experimental control,
the probability distributions q and p must be consistent with the chosen setting distribution. Our
ability to distinguish the probability distributions q and p given a collection of independent samples
from q can be characterized by the Kullback-Leibler (KL) divergence from q to p,
DKL(q ‖ p) = ∑_x q(x) log2(q(x)/p(x)). (5.21)
The KL divergence is nonnegative, and it is zero if and only if p = q. This motivates seeking
an LR model whose probability distribution pLR minimizes the KL divergence from q [24]. We
define Sq = DKL(q ‖ pLR), and refer to Sq as the statistical strength for rejecting LR by means
of a test with the distribution q. As shown in Ref. [25], the statistical strength Sq is the optimal
valid confidence-gain rate for rejecting LR given that the experimental distribution is q. Thus, the
experimentally expected log-p-value increment l cannot exceed Sq, and our goal before the (n+1)’th
trial is to make l as close to Sq as possible.
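The KL divergence of Eq. (5.21) is straightforward to evaluate once q and p are in hand. A minimal sketch with toy binary distributions (the dict representation is our own illustrative choice):

```python
import math

def kl_divergence(q, p):
    """Eq. (5.21): D_KL(q || p) in bits. q and p are dicts mapping
    results x to probabilities; terms with q(x) = 0 contribute nothing."""
    return sum(q[x] * math.log2(q[x] / p[x]) for x in q if q[x] > 0)

q = {0: 0.75, 1: 0.25}
p = {0: 0.5, 1: 0.5}
print(kl_divergence(q, p))  # about 0.189 bits
print(kl_divergence(q, q))  # 0.0: zero if and only if p = q
```
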
We claim that if we define the PBR
Rn(x) = q(x)/pLR(x), (5.22)
then 0 ≤ Rn(x), and for any LR model the expectation satisfies 〈Rn(X)〉 ≤ 1. Consequently,
the p-value computed according to Eq. (5.20) is valid, and if q is the true distribution of Xn+1
the experimentally expected log-p-value increment l is Sq. To prove the claim, consider φ(β) =
DKL(q ‖ pLR + β(p− pLR)), where 0 ≤ β ≤ 1. For any p in the convex set of LR distributions, by
optimality of pLR, φ(β) ≥ φ(0). It follows that ∂φ/∂β|β=0+ ≥ 0. Consequently,
∑_x (pLR(x) − p(x)) q(x)/pLR(x) ≥ 0, (5.23)
which can be rearranged to show that according to any LR model’s probability distribution p the
expectation
〈Rn(X)〉 = ∑_x p(x) q(x)/pLR(x) ≤ 1. (5.24)
The claim follows. Bell inequalities of the form shown in Eq. (5.24), which are based on minimizing
the KL divergence, were introduced in Ref. [164].
In an experiment, however, we do not know the true distribution q of the random variable
Xn+1 giving rise to the (n+ 1)’th trial result. Instead, we obtain good estimates qn of q before the
(n+1)’th trial, and determine the corresponding optimal LR model’s probability distribution pLR,n.
We then set Rn(x) = qn(x)/pLR,n(x) to compute and update the PBR p-value. If the experiment
is sufficiently stable, good estimates can be obtained from the frequencies of results observed in
trials so far. The estimates can be improved by taking into account that the setting distribution
is known and the distributions of marginal outcomes for given settings of Alice or Bob must agree
due to no-signaling constraints. We discuss how to do this in Sec. 5.3.1. In Sec. 5.3.2, we show
that if the trials are independent and identically distributed, then PBR p-values computed with
any converging method for estimating the true probability distribution q have the property that
the confidence-gain rate
GPBR = Sq. (5.25)
Thus, we prove the asymptotic optimality of PBR p-values.
To determine the optimal LR model one can use numerical algorithms for optimizing convex
functions over a convex domain. In this case one can use the expectation-maximization algo-
rithm [152] as discussed in the previous chapter. A problem is that due to stopping criteria and
numerical precision, one cannot expect to find the exact optimum. We show in Sec. 5.3.2 that one
can compensate for this problem to maintain validity of the computed p-value.
Note that, probability ratios such as the ones we use to compute the values of Rn in Eq. (5.22)
are often referred to as likelihood ratios. Likelihood ratios play an important role in many statis-
tical tests as explained in statistics textbooks such as Ref. [151]. In the PBR protocol, the test
statistic can be computed from any sequence of nonnegative functions Rn satisfying the inequality
in Eq. (5.16). Thus, the probability ratios are simply an intermediate step to obtain such functions.
We do not ascribe any other meaning to the ratios.
5.3 Technical details for applying the PBR protocol
5.3.1 Estimating the experimental probability distribution
Consider n trials with observed results given by x1, . . . , xn. Our goal is to obtain an estimate
qn of the true probability distribution q of the (n+1)’th trial result. Assuming no other knowledge,
the estimate can be based on the empirical frequencies fn(x) = (1/n) ∑_{k=1}^{n} δ_{xk,x}. Due to statistical
fluctuations, the empirical frequencies are not likely to satisfy the following known constraints
satisfied by q:
• Setting distribution: The setting distribution pi,j is fixed, and q satisfies ∑_{a,b} q(i, j, a, b) = pi,j .
• No signaling: Given that Alice uses setting i, the distribution of Alice’s measurement
outcomes does not depend on Bob’s settings, and vice versa.
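The two constraints above can be checked mechanically on a candidate distribution. The following sketch is illustrative only: the dict layout, tolerance, and function name are our own choices, not the analysis engine's interface.

```python
def satisfies_constraints(q, p_ij, tol=1e-9):
    """Check the setting-distribution and no-signaling constraints on a
    distribution q over results (i, j, a, b), given the known setting
    probabilities p_ij over pairs (i, j)."""
    i_set = sorted({i for (i, j) in p_ij})
    j_set = sorted({j for (i, j) in p_ij})
    a_set = sorted({a for (_, _, a, _) in q})
    b_set = sorted({b for (_, _, _, b) in q})
    # Setting distribution: sum_{a,b} q(i, j, a, b) = p_ij.
    for (i, j), pij in p_ij.items():
        total = sum(q.get((i, j, a, b), 0.0) for a in a_set for b in b_set)
        if abs(total - pij) > tol:
            return False
    # No signaling: Alice's outcome distribution given her setting i
    # must not depend on Bob's setting j ...
    for i in i_set:
        for a in a_set:
            margs = [sum(q.get((i, j, a, b), 0.0) for b in b_set)
                     / p_ij[(i, j)] for j in j_set]
            if max(margs) - min(margs) > tol:
                return False
    # ... and symmetrically for Bob.
    for j in j_set:
        for b in b_set:
            margs = [sum(q.get((i, j, a, b), 0.0) for a in a_set)
                     / p_ij[(i, j)] for i in i_set]
            if max(margs) - min(margs) > tol:
                return False
    return True

# A distribution uniform over outcomes satisfies both constraints.
p_ij = {(i, j): 0.25 for i in (1, 2) for j in (1, 2)}
q = {(i, j, a, b): 0.0625 for i in (1, 2) for j in (1, 2)
     for a in (1, -1) for b in (1, -1)}
print(satisfies_constraints(q, p_ij))  # True
```
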
There are two other issues for computing PBR p-values. The first is that some empirical frequencies
fn(x) may be zero. If our estimate is qn = fn, zero frequencies can be disastrous. In the case where
the corresponding results occur at the next trial, the ratio contributing to the PBR p-value in
Eq. (5.20) can be zero, and then the p-value increases to 1 with no possibility of later reduction.
The second and related issue is that in the absence of prior knowledge, initially we have insufficient
information to make useful estimates of probability distributions of future trial results. Even if
the problem of zero frequencies has been taken care of, this can still result in initial “learning”
transients that cause a negative offset in the accumulated log-p-values (see Fig. 5.3 in Sec. 5.4 for
an example).
Our approach for estimating the next trial result’s probability distribution uses maximum
likelihood to obtain an estimate that respects the above constraints and then adjusts the estimate
by mixing in a distribution that is uniform conditional on the settings. To reduce the impact of
learning transients, we process the trials in blocks.
To apply maximum likelihood for computing a first estimate q0 of q, we assume independent
and identically distributed trials. Whether or not this assumption actually holds in an experiment
only affects the quality of the computed p-value, but not its validity. The probability of observing
empirical frequencies fn after n trials given that the true distribution is q is proportional to

L(fn|q) = ∏_x q(x)^{n fn(x)}.  (5.26)
We therefore set q0 according to
q0 = argmax_{q′∈V} L(fn|q′),  (5.27)
where V is the set of probability distributions satisfying the setting-distribution and no-signaling
constraints. These constraints are linear and log(L(fn|q)) is concave, so there is no difficulty in
applying available nonlinear optimization tools. Note that for the purpose of computing PBR
p-values, it is not critical that Eq. (5.27) be satisfied exactly, so it is not necessary to use extremely
tight stopping criteria to enforce equality at the best numerical precision possible. Also, whereas
the design of PBRs such as those in Eq. (5.22) requires that the setting-distribution constraint is
satisfied, the no-signaling constraint is not critical. Applying it helps improve our estimates, but
the effect on the log-p-value increments becomes negligible for large n.
There are different ways to solve the problem of zero empirical frequencies; some
are explained in Refs. [165, 166]. They generally involve mixing in a distribution that has no zero
probabilities, with a weight that decreases to zero as n grows. For the plots in Figs. 5.3 and 5.4
of Sec. 5.4, we modified q0 by setting qn = (n/(n+1)) q0 + (1/(n+1)) u, where the distribution u is
uniform conditionally on the settings, and u’s setting distribution is pi,j.
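The mixing adjustment can be sketched as follows (function and parameter names are illustrative; results are represented as tuples (i, j, a, b)):

```python
def mix_with_uniform(q0, n, p_setting, outcomes_per_setting):
    """q_n = n/(n+1) * q0 + 1/(n+1) * u, removing zero probabilities.

    q0: dict mapping results (i, j, a, b) to probabilities.
    p_setting: dict mapping (i, j) to the known setting probabilities,
        so that u has the correct setting distribution.
    outcomes_per_setting: number of joint outcomes (a, b) per setting
        pair; u(i, j, a, b) = p_setting[(i, j)] / outcomes_per_setting
        is uniform conditionally on the settings.
    """
    w = 1.0 / (n + 1)  # weight of the uniform part, -> 0 as n grows
    return {x: (1 - w) * q0[x] + w * p_setting[x[:2]] / outcomes_per_setting
            for x in q0}
```

Because u is strictly positive, the mixture has no zero probabilities, so a PBR built from it can never be zero at the next trial.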
There are different approaches to mitigate the effect of the initial learning transient. The
first is to “prime” the estimates with knowledge about the experiment available before the trials
are started. Such knowledge could be based on theory or on experiments designed to characterize
the quantum state and measurement setup. The prior information must be assigned a weight. In
our implementation of the local realism analysis engine (see Appendix A), the weight is determined
by the number of trials that would have been required to obtain an equally good estimate directly
from the frequencies. Proper use of priming requires that the initial estimates and parameters such
as the weight are determined “blindly” before any knowledge of the actual data to be analyzed is
available.
A second approach is to set Rn(x) = 1 for any x unless the statistical strength Sqn for qn’s
violation of LR seems sufficiently significant given that the estimated distribution qn is based on
n trials. While one might expect that the violation is sufficiently significant if nSqn ≥ c for some
constant c, simulations show that the best choice of c depends on the distribution of trial results
in an experiment.
The third and simplest approach is to block the data from the trials. Instead of updating
the log-p-value after every trial, we process data h trials at a time. The first block is used only
for estimating the probability distribution of future trial results. That is, we set Rk(x) = 1 for
k = 0, . . . , (h−1). Subsequently, we have Rmh+k = Rmh for k = 1, . . . , (h−1) and all m. Note that
neither the validity nor the asymptotic optimality of the computed p-values requires updating the
PBRs after each trial. Choosing h large enough ensures that the first block’s trials have sufficient
information for obtaining reasonable estimates of the distribution. An additional advantage of
blocking the trials is that we avoid unnecessarily invoking the computationally costly optimizations
required for updating the PBRs. We standardized the choice of block size so that if the total
number of trials to be analyzed is N, h is the maximum of ⌈N/1000⌉ and ⌈d ln(2d)⌉, where d is the
number of possible results at a trial. The first expression ensures that we do not lose too much
log-p-value by using the first block only for learning the trial results’ distribution. The second one
is chosen so that if q is uniform, the probability that every trial result occurs in each block is at
least 1/2.
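The block-size rule is easily written down directly (the function name is illustrative). For instance, it reproduces the 56-trial blocks used in Sec. 5.4 for a CHSH test, where N = 5000 trials and a trial has d = 16 possible results:

```python
import math

def block_size(N, d):
    """Standardized block size h = max(ceil(N/1000), ceil(d*ln(2d))).

    N: total number of trials to analyze; d: number of possible results
    at a trial.  The second term makes the probability that every result
    occurs in a block at least 1/2 when q is uniform, by the union bound
    d * (1 - 1/d)**h <= d * exp(-h/d) <= 1/2.
    """
    return max(math.ceil(N / 1000), math.ceil(d * math.log(2 * d)))
```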
We conclude this section with a note on implementing the PBR protocol. For monitoring
an experiment and to adapt to changes in experimental configuration, the estimated experimental
distributions used in the PBRs should be based on recent trials only. This can be accomplished by
windowing the trials with a window large enough to have statistically significant violation of LR
(if there is violation), but small enough to avoid seeing significant changes in configuration. Our
implementation of the local realism analysis engine uses a computationally simpler approach based
on weighting the trials with exponentially decreasing weights in time determined by a configurable
half-life. This feature was not used in the comparisons in Sec. 5.4.
5.3.2 Effects of bad estimates of true distributions and optimal LR models
Ideally the estimated distribution qn used in the numerator of Rn matches the true distribu-
tion q, and the LR distribution pLR,n in the denominator of Rn exactly minimizes the KL divergence
from qn. As shown in Sec. 5.2.4, having qn different from q does not affect the validity of the PBR
p-values. But it can reduce the expected log-p-value increment l. Let Sq be the statistical strength
of q for the violation of LR. We show that
Sq ≥ l ≥ Sq −DKL(q ‖ qn). (5.28)
For reasonable methods of estimating qn such as the one described in Sec. 5.3.1 and independent
and identically distributed trials, qn almost surely approaches q so that DKL(q ‖ qn) goes to zero.
This shows that the PBR protocol achieves the confidence-gain rate Sq and hence is optimal.
To prove the first inequality in Eq. (5.28), let pLR be the LR distribution that minimizes the
KL divergence from q, so that Sq = DKL(q ‖ pLR). We bound l as follows:

Sq − l = ∑_x q(x) log2( q(x)/pLR(x) ) − ∑_x q(x) log2( qn(x)/pLR,n(x) )
       = ∑_x q(x) log2( q(x)/t(x) ),  (5.29)
where we define t(x) = pLR(x) qn(x)/pLR,n(x). Since qn(x)/pLR,n(x) is a PBR, and pLR is an LR
distribution, we know that c ≡ ∑_x t(x) ≤ 1 (see Eq. (5.24)). Since t′ = t/c is a probability
distribution, we can continue the calculation:

Sq − l = log2(1/c) + ∑_x q(x) log2( q(x)/t′(x) ) ≥ 0,  (5.30)
because the second term is a KL divergence.
To obtain the second inequality of Eq. (5.28) we bound
l = ∑_x q(x) log2( qn(x)/pLR,n(x) )
  = ∑_x q(x) log2( q(x)/pLR,n(x) ) − ∑_x q(x) log2( q(x)/qn(x) )
  = DKL(q ‖ pLR,n) − DKL(q ‖ qn)
  ≥ DKL(q ‖ pLR) − DKL(q ‖ qn)
  = Sq − DKL(q ‖ qn).  (5.31)
The denominator pLR,n of the PBRs Rn must be computed numerically. Consequently, the
distribution p′LR,n actually obtained is typically not identical to pLR,n and may not minimize the
relevant KL divergence. Hence, there may be an LR distribution p according to which the
expectation 〈R′n(X)〉p = 〈qn(X)/p′LR,n(X)〉p is greater than 1, and so the PBR p-value is not
valid if it is computed according to Eq. (5.20) with R′n. To maintain validity, we determine the
maximum value 1 + ε of the expectations 〈R′n(X)〉p over all LR distributions p and then set
Rn = R′n/(1 + ε). To determine the bound 1 + ε, we recall that LR distributions are mixtures
of distributions pλ induced by “local hidden variables” λ. Each λ assigns deterministic outcomes
independently for each setting of Alice and each setting of Bob. We write a(λ,Ai) and b(λ,Bj) for
Alice’s and Bob’s measurement outcomes given settings Ai and Bj, according to λ. The probability
for the trial result x = (i, j, a, b) is given by p_{λ,(i,j,a,b)} = p_{i,j} δ_{a,a(λ,Ai)} δ_{b,b(λ,Bj)}. With these definitions,

1 + ε = max_{p is LR} 〈qn(X)/p′LR,n(X)〉_p = max_λ ∑_x p_{λ,x} qn(x)/p′LR,n(x).  (5.32)
Because the number of different λ is finite, the value 1+ε can be computed according to Eq. (5.32).
The expectation-maximization algorithm that we apply to KL-divergence minimization iteratively
updates the probability distribution over the set of hidden variables λ. To perform the updates,
it requires the set of values that are maximized in Eq. (5.32), so the computation of 1 + ε can be
integrated into the algorithm with little overhead. Furthermore, the quantity ε can be used as a
stopping criterion for minimization. That is, the expected log-p-value increment l′, assuming that
the random variable Xn+1 is distributed according to qn, satisfies
l′ = ∑_x qn(x) log2( qn(x) / (p′LR,n(x)(1 + ε)) )
   = DKL(qn ‖ p′LR,n) − log2(1 + ε)
   ≥ DKL(qn ‖ pLR,n) − log2(1 + ε).  (5.33)
Thus, for independent and identically distributed trials, the confidence-gain rate is lowered by at
most log2(1 + ε).
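For a bipartite configuration with finitely many settings and outcomes, the bound 1 + ε of Eq. (5.32) can also be obtained by brute-force enumeration of the deterministic LR states λ. The sketch below (illustrative names; standalone rather than integrated into the expectation-maximization iteration as described above) makes the maximization explicit:

```python
from itertools import product

def one_plus_eps(qn, pprime, p_setting, n_settings=2, n_outcomes=2):
    """Compute 1 + eps = max_lambda sum_x p_{lambda,x} qn(x)/pprime(x),
    as in Eq. (5.32).

    qn, pprime: dicts over results x = (i, j, a, b).
    p_setting: dict over setting pairs (i, j).
    A deterministic lambda assigns an outcome a(lambda, A_i) to each of
    Alice's settings and b(lambda, B_j) to each of Bob's settings.
    """
    outcomes = range(n_outcomes)
    best = 0.0
    for a_map in product(outcomes, repeat=n_settings):      # Alice's outcomes
        for b_map in product(outcomes, repeat=n_settings):  # Bob's outcomes
            val = sum(
                p_setting[(i, j)] * qn[(i, j, a_map[i], b_map[j])]
                / pprime[(i, j, a_map[i], b_map[j])]
                for i in range(n_settings) for j in range(n_settings))
            best = max(best, val)
    return best
```

For a CHSH configuration there are only 16 states λ, so the enumeration is cheap; as noted above, in our implementation the same quantities are already produced by the expectation-maximization updates.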
5.4 Results
In this section we present results obtained using the SD-based, martingale-based, and PBR protocols.
5.4.1 Confidence-gain rates
Let us first compare the confidence-gain rates achieved by different protocols in tests of LR
with different experimental configurations. In Fig. 5.1, we study the confidence-gain rates achieved
by different protocols in tests of LR using unbalanced Bell states |ψuB〉 = cos(θ)|00〉 + sin(θ)|11〉
with θ ∈ (0, π/4]. For the results shown in this figure, we chose a uniform setting distribution.
The family of unbalanced Bell states considered is of interest because they are more tolerant of low
detection efficiency, as studied in Chapter 4.
Figure 5.1: Confidence-gain rates G achieved by the SD-based, martingale-based, and PBR protocols. The gain rate G is shown for a CHSH test of LR with an unbalanced Bell state with no loss and perfect detectors. It depends on the parameter θ in the unbalanced Bell state |ψuB〉. Given the state parameter θ, the measurement settings are chosen to maximize the violation of the CHSH inequality (1.2). The line corresponding to the gain rates achieved by the SD-based protocol crosses the line corresponding to the optimal gain rates at θ = 33.41◦.
In Fig. 5.2, we study the confidence-gain rates in tests of LR with noisy and lossy Bell states.
Motivated by the result that the amount of randomness produced using a test of LR with a biased
setting distribution is more than that produced with a uniform setting distribution [3], here the
confidence-gain rates in tests with biased setting distributions are shown.
Figs. 5.1 and 5.2 show that the gain rates achieved by the SD-based protocol can be higher
than justified and are therefore not valid. The worst case is when the state used is a Bell state,
which is an aim of most experiments to date. Both figures also show that the gain rates achieved
by the martingale-based protocol are valid but generally not optimal.
Figure 5.2: The confidence-gain rate G of a CHSH test of LR with a Bell state and varying detection efficiency η and visibility V. The measurement settings are chosen to maximize the violation of the CHSH inequality (1.2). Measurement outcomes where no particle is detected are assigned the value −1. (a) pA1 = pB1 = 0.5, (b) pA1 = pB1 = 0.51, (c) pA1 = pB1 = 0.52, and (d) pA1 = pB1 = 0.53, where pA1 and pB1 are the probabilities that at each trial Alice and Bob independently choose the settings A1 and B1, respectively. Note that in subplot (a) the optimal gain rates are not shown, since the optimal gain rate can be at most 6% larger than the corresponding martingale-based gain rate, so the difference between them is not visible.
From Fig. 5.1, we can infer that if one uses the number of SDs to compare the violation of the
CHSH inequality in experiments involving different unbalanced Bell states, one tends to unfairly
favor the experiment with the more balanced state. From the results in Fig. 5.2, one can see that,
as the bias in the setting distribution increases, the confidence-gain rates achieved by the
martingale-based protocol fall further below the corresponding optimal gain rates.
Note that for the above results, the SD-based confidence-gain rates were computed with respect
to the conventional method for estimating violation. According to the discussion in Sec. 5.2.1,
the number of SDs of violation computed according to the conventional estimate I in Eq. (5.7) is
generally higher than that computed according to the estimate I in Eq. (5.5). Hence, the con-
ventional way of estimating the violation and the experimental SD worsens the validity problem
for SD-based gain rates. However, using the estimate I and the associated larger SD in Figs. 5.1
and 5.2 does not significantly alter the plots or their interpretation.
5.4.2 Application to experiments
The protocols discussed can compute p-values for recorded trials as an experiment progresses,
and such “running” p-values may be used to optimize experimental settings. Because we are
interested in extremely small p-values with exponential asymptotic behavior, we generally consider
and display the log-p-value.
The SD-based or martingale-based protocol is restricted to a fixed Bell inequality. The PBR
protocol does not have this restriction, which enables wider searches for strong violations of LR.
Running log-p-values are shown for a simulation in Fig. 5.3 and for data from Ref. [3] in Fig. 5.4.
The PBR p-values were computed with our implementation of the local realism analysis engine; see
the associated code information and documentation in Appendix A. Note that whereas running
log-p-values can be used to monitor and tweak an experiment, they must not be used as a stopping
criterion once an experiment has been configured.
For Fig. 5.3 we simulated a CHSH test of LR with an unbalanced Bell state and measurement
settings maximizing the violation of the CHSH inequality (1.2). We assumed an ideal experiment
(no loss of particles or visibility) and simulated 5000 successive trials. The log-p-values were updated
for successive blocks of 56 trials according to the discussion in Sec. 5.3.1. Here, we did not prime
Figure 5.3: Running log-p-values as functions of the number of trials n in a CHSH test of LR with an unbalanced Bell state cos(θ)|00〉 + sin(θ)|11〉, where θ = 22.5◦. We assume that there is no noise or detection inefficiency and that the setting distribution is uniform. The log-p-values are computed according to the three protocols discussed. The slopes of the straight lines are the confidence-gain rates achieved by each protocol. (a) is for one simulation of 5000 successive trials. (b) is an average of 30 independent simulations.
the PBRs before starting the simulation. The figure shows typical and average runs and compares
the running log-p-values to the asymptotic lines with slopes given by the respective gain rates. The
slopes of the running log-p-values approach the gain rates, but PBR log-p-values have a systematic
offset that can be attributed to an initial transient where the experimental probability distribution
is being learned. The transient can be removed if, before the experiment is started, we have a good
estimate of the experimental distribution. Such an estimate can be used to prime the PBRs.
For Fig. 5.4, we compute log-p-values for the data from the experiment described in Ref. [3].
In this experiment, two 171Yb+ ions separated by about one meter were entangled through a
probabilistic process. In this process, each ion is entangled with one emitted photon. By projecting
Figure 5.4: Running log-p-values as functions of the number of trials n in the experiment of Ref. [3]. In this experiment, different measurement settings are chosen uniformly at random. The dotted lines are provided only to guide the eye.
the two emitted photons into a Bell state the two remote ions are entangled with each other. On
the entangled two-ion system, a CHSH test of LR was performed. The results from 3016 trials were
recorded. The resulting estimate of the CHSH expression is ICHSH = 2.414± 0.058. For the figure,
we processed the data in blocks of 56 trials as before. The log-p-values computed by the PBR
protocol both with and without priming are shown. To prime the PBRs, we assumed that before
the experiment we had an estimate of the experimental probability distribution based on the exact
frequencies observed in this experiment after 3016 trials. In this experiment, there is insufficient
data for PBR log-p-values to exceed martingale-based ones.
Chapter 6
Efficient quantification of experimental violation of local realism
From the last chapter, we know that a small p-value means that the observed data is significant
for rejecting local realism (LR). Upper bounds of p-values for specified test statistics are required
for precise statements of experimental violations of LR. Such bounds not only help to reliably
demonstrate violations of LR, but also help to prove the security of quantum key distribution [11,
13, 109] or certify the generation of genuine randomness [3, 110, 111, 112].
In the last chapter, we discussed two available protocols that compute valid upper bounds
of p-values. One is the martingale-based protocol [18, 19], but the bounds computed are not tight.
The other is the prediction-based-ratio (PBR) protocol, which computes tighter bounds. Specifically,
the latter bounds are asymptotically tight with respect to the total number of trials in a test of
LR, if the prepared quantum states and measurement settings do not vary in time. While we
demonstrated that the PBR protocol is practical for many standard configurations, this protocol
is computationally inefficient with respect to the number of parties per test, settings per party,
and outcomes per setting. The reason is that it requires computing estimates of the experimental
probability distribution and the associated optimal local realistic (LR) model. These estimates
are difficult to find when there are many parties, settings, or outcomes. Extreme examples are
provided by experimental configurations involving continuous variables, where the PBR protocol
cannot be directly applied. In this chapter, we propose a simplified PBR protocol to efficiently
compute high-quality p-value bounds for all configurations.
The simplified PBR protocol has at least four advantages over other protocols. First, its
p-value bounds are as good as and typically better than those obtained by the martingale-based
protocol. Second, it can take multiple Bell inequalities into consideration at once in a statistically
rigorous way. Thus we can obtain high-quality p-value bounds even when we cannot determine
beforehand which inequality will work best. Third, it can adapt to changes in the experimental
results’ distribution. Fourth, this protocol can be applied to any test with linear witnesses, such as
entanglement detection [82, 84], without a full analysis of the relevant probability space.
Due to the difficulty of determining worst-case tail probabilities of typical test statistics, we
can usually determine only upper bounds of exact p-values as defined in Eq. (5.2) of Chapter 5.
Thus, for the remainder of this chapter, the term “p-value” refers to any valid upper bound on the
exact p-value. We can compare different protocols according to the (asymptotic) confidence-gain
rate as defined in Eq. (5.1) of Chapter 5. Higher gain rates imply better protocol performance.
In Sec. 6.1, we discuss how to simplify the PBR protocol. Like the martingale-based and
full PBR protocols, the simplified PBR protocol works even under memory effects [20, 21, 22, 23].
We then compare the simplified PBR protocol with the other two protocols in Sec. 6.2. Finally
in Sec. 6.3, we discuss the application of the simplified PBR protocol to other tests with linear
witnesses. This chapter is based on our previous work [29].
6.1 Simplified PBR protocol
The simplified PBR protocol chooses the PBRs from convex combinations of Bell functions
that are derived from a given set of Bell inequalities. To ensure that a convex combination is
a PBR, the Bell functions first need to be standardized so that they are nonnegative and have
expectations at most 1 for any LR model. Any Bell function that is lower-bounded has such a
standardized form. In particular, if 〈I(X)〉 ≤ B is a Bell inequality and I(x) ≥ bl for all x, then
r(x) = (I(x) − bl)/(B − bl) is standardized. Note that, as a constraint on the distribution of
X, 〈r(X)〉 ≤ 1 is equivalent to 〈I(X)〉 ≤ B. Given Bell inequalities 〈I(m)(X)〉 ≤ B(m) where
I(m) is lower-bounded and m = 1, 2, . . . ,M , we can construct the corresponding standardized Bell
functions r(m). We define r = (r(1), . . . , r(M)). The simplified PBR protocol chooses the PBR Rn
from among the convex combinations

ω · r = ∑_m ωm r(m),  (6.1)

where ωm ≥ 0 and ∑_m ωm = 1. Our implementation always includes the trivial Bell function
r(1) = 1. This ensures that the set of convex combinations is at least one-dimensional and that the
confidence-gain rate is at least as high as that achieved by the martingale-based protocol (see the
discussion below).
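The standardization r(x) = (I(x) − bl)/(B − bl) can be illustrated with a CHSH-type Bell function. The encoding below is hypothetical (uniform settings pi,j = 1/4, outcomes a, b ∈ {−1, +1}, settings indexed 1 and 2, with the per-trial estimator I(x) = 4·s·a·b whose sign s is −1 only for the setting pair (2, 2)); names are illustrative:

```python
def standardize(I, b_low, B):
    """Return r(x) = (I(x) - b_l)/(B - b_l): nonnegative, and
    <r> <= 1 is equivalent to the Bell inequality <I> <= B."""
    return lambda x: (I(x) - b_low) / (B - b_low)

def I_chsh(x):
    """Per-trial CHSH estimator for uniform settings: with p_ij = 1/4,
    <I> = <A1 B1> + <A1 B2> + <A2 B1> - <A2 B2>."""
    i, j, a, b = x
    s = -1 if (i, j) == (2, 2) else 1
    return 4 * s * a * b

# I ranges over {-4, +4}, so b_l = -4 and b_u = 4; the LR bound is B = 2.
r_chsh = standardize(I_chsh, b_low=-4, B=2)  # r takes values in {0, 4/3}
```

For any deterministic LR state the average of r over the four setting pairs is at most (2 + 4)/6 = 1, as required of a standardized Bell function.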
Like the full PBR protocol, the simplified PBR protocol aims to optimize the experimentally
expected log-p-value increment given previous trial results, under the assumption that the distri-
bution of Xn+1 is the same as the empirical-frequency distribution of the previous trial results.
Whether or not this assumption holds does not affect the validity of the p-value computed. The
log-p-value increment at the (n + 1)’th trial may be defined as log2Rn(xn+1). Its experimentally
expected value given that Xn+1 is distributed according to q is
∑_{xn+1} q(xn+1) log2 Rn(xn+1).  (6.2)
Before the (n+ 1)’th trial, the protocol attempts to maximize this expected log-p-value increment.
Since q is not known, it is empirically estimated based on the previous n trials. Expanding Rn
according to Eq. (6.1) yields the following estimate of the experimentally expected log-p-value
increment at the (n+ 1)’th trial:
Gn(ω) = (1/n) ∑_{k=1}^{n} log2(ω · r(xk)) = ∑_{x: fn(x)≠0} fn(x) log2(ω · r(x)),  (6.3)

where fn(x) = (1/n) ∑_{k=1}^{n} δ_{x_k,x} is the empirical frequency of x before the (n+1)’th trial. The protocol
thus determines Rn by maximizing Gn(ω) over ω, that is, Rn = r · argmaxωGn(ω). Note that,
unlike the full PBR protocol, the simplified PBR protocol does not require explicitly optimizing
over all LR models. Computing argmaxωGn(ω) requires optimizing a convex objective function
over an M -dimensional convex space, where the evaluation of the objective function involves a sum
of n terms. In our implementation, we apply the expectation-maximization algorithm [167] to solve
this problem.
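A generic sketch of this maximization (not the thesis implementation of the algorithm of Ref. [167]; names are illustrative) uses a multiplicative EM-type update that preserves the simplex constraints:

```python
import math

def optimize_weights(freqs, r_values, iters=500):
    """Maximize G(w) = sum_x f(x) log2(w . r(x)) over the simplex via the
    multiplicative EM-type update
        w_m <- w_m * sum_x f(x) r_m(x) / (w . r(x)),
    which keeps w nonnegative and, since sum_x f(x) = 1, keeps sum_m w_m = 1.

    freqs: empirical frequencies f(x) of the observed results.
    r_values: for each observed x, the vector (r1(x), ..., rM(x));
        including the trivial Bell function r1 = 1 keeps w . r(x) > 0.
    """
    M = len(r_values[0])
    w = [1.0 / M] * M  # start at the center of the simplex
    for _ in range(iters):
        dot = [sum(wm * rm for wm, rm in zip(w, rv)) for rv in r_values]
        w = [w[m] * sum(f * rv[m] / d
                        for f, rv, d in zip(freqs, r_values, dot))
             for m in range(M)]
    return w

def gain(w, freqs, r_values):
    """Estimated expected log-p-value increment Gn(w) of Eq. (6.3)."""
    return sum(f * math.log2(sum(wm * rm for wm, rm in zip(w, rv)))
               for f, rv in zip(freqs, r_values))
```

Each update is guaranteed not to decrease the concave objective, and the iteration converges to the maximizing convex combination.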
The performance of the simplified PBR protocol depends on the relationship between the
actual distribution of trial results and the set of standardized Bell functions used. If the results
are independent and identically distributed according to a known distribution that violates LR,
then there exists an optimal Bell inequality that can be derived from the optimal PBR as found
by the full PBR protocol. (Here, optimality refers to the optimality of the gain rate achieved by
the protocol; see Chapter 5.) If the optimal PBR is included in the convex set of standardized
Bell functions, the confidence-gain rate achieved by the simplified PBR protocol is optimal. But
since the actual distribution is unknown before an experiment, the above assumption may not
hold without making the dimension of the set of convex combinations in Eq. (6.1) impractically
large. Thus, before an experiment, it is important to choose a relevant (and preferably small) set
of standardized Bell functions. In Sec. 6.2, we show that it helps to include more than just the
obvious Bell functions.
The performance of the simplified PBR protocol can be compared with that of the martingale-
based protocol, the only valid non-PBR protocol considered so far. To compute a p-value, the
martingale-based protocol uses a Bell inequality 〈I(X)〉 ≤ B with a Bell function I whose range is
included in the interval [bl, bu]. Below we show that the simplified PBR protocol using the same Bell
inequality, together with the default trivial Bell function r = 1, achieves a gain rate at least as high
as the gain rate achieved by the martingale-based protocol. Also, the following proof shows that
these two gain rates are equal to each other if and only if the experimental range of the function I
is contained in the set {bl, bu}.
Let the experimental probability of observing the result x in a trial be q(x). The experimental
mean of I is Iq = ∫ q(x) I(x) dx. If Iq ≥ B, then from Eqs. (5.1) and (5.14) of Chapter 5 we get the
gain rate

Gmart = ((bu − Iq)/(bu − bl)) log2( (bu − Iq)/(bu − B) ) + ((Iq − bl)/(bu − bl)) log2( (Iq − bl)/(B − bl) )
      = ∫ q(x) [ ((bu − I(x))/(bu − bl)) log2( (bu − Iq)/(bu − B) )
               + ((I(x) − bl)/(bu − bl)) log2( (Iq − bl)/(B − bl) ) ] dx.  (6.4)
Here, we use the fact that the experimental estimate I approaches Iq as N →∞. By the concavity
of log2(x) and some algebra, we get that the gain rate Gmart satisfies the inequality

Gmart ≤ ∫ q(x) log2( ((bu − I(x))/(bu − bl)) ((bu − Iq)/(bu − B))
                   + ((I(x) − bl)/(bu − bl)) ((Iq − bl)/(B − bl)) ) dx
      = ∫ q(x) log2( ω0 (I(x) − bl)/(B − bl) + 1 − ω0 ) dx,  (6.5)

where 0 ≤ ω0 = (Iq − B)/(bu − B) ≤ 1.
From Eqs. (5.1) and (5.20) of Chapter 5 and according to the design of the PBRs by the
simplified PBR protocol, the gain rate achieved by this protocol is
GsPBR = max_{0≤ω≤1} ∫ q(x) log2( ω (I(x) − bl)/(B − bl) + 1 − ω ) dx.  (6.6)

Here, we use the fact that the empirical frequency fN(x) = (1/N) ∑_{n=1}^{N} δ_{x_n,x} approaches the
experimental probability q(x) as N → ∞. The inequality Gmart ≤ GsPBR follows directly from
comparing Eq. (6.5) with Eq. (6.6).
By considering the condition for equality in Eq. (6.5), we can show that Gmart = GsPBR if
and only if q(x) = 0 whenever bl < I(x) < bu. For this it suffices to note that log2(x) is strictly
concave, so equality holds in Eq. (6.5) if and only if I(x) = bu or I(x) = bl whenever q(x) ≠ 0.
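The comparison can be checked numerically for a toy distribution by implementing Eqs. (6.4) and (6.6) directly (function names are illustrative, and the maximization over ω uses a simple grid scan rather than the expectation-maximization algorithm of our implementation):

```python
import math

def gain_martingale(q, I, b_low, b_up, B):
    """Martingale-based gain rate, Eq. (6.4); equivalently, the KL
    divergence between the binary distributions obtained by rescaling
    Iq and B to [0, 1]."""
    Iq = sum(q[x] * I[x] for x in q)      # experimental mean of I
    p = (Iq - b_low) / (b_up - b_low)     # rescaled experimental mean
    p0 = (B - b_low) / (b_up - b_low)     # rescaled LR bound
    return p * math.log2(p / p0) + (1 - p) * math.log2((1 - p) / (1 - p0))

def gain_spbr(q, I, b_low, b_up, B, grid=20000):
    """Simplified-PBR gain rate, Eq. (6.6), using the standardized Bell
    function r = (I - b_low)/(B - b_low) and the trivial Bell function;
    the scan keeps w < 1 so the log argument stays positive."""
    def G(w):
        return sum(q[x] * math.log2(w * (I[x] - b_low) / (B - b_low) + 1 - w)
                   for x in q)
    return max(G(k / grid) for k in range(grid))

# Toy distribution with mass strictly between b_l and b_u, so the
# simplified PBR gain rate strictly exceeds the martingale-based one.
q = {1: 0.8, 2: 0.1, 3: 0.1}
I = {1: 4.0, 2: -4.0, 3: 0.0}
```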
6.2 Protocol comparison
6.2.1 Computational resource comparison
Of the available protocols for computing p-values, the martingale-based one is the least
resource-intensive and simplest to apply. It requires computing only an estimate of the mean
of the Bell function, which involves a sum of N terms. In the following, we compare the com-
putational resources required by the simplified and full PBR protocols in an experimental test of
LR.
We consider an experimental configuration involving l parties where each party has s mea-
surement settings and each local measurement has d outcomes. (The comparison below is readily
extended to more general configurations.) We suppose that the joint-setting distribution is uniform.
Then, the number of possible results (measurement settings and outcomes of all parties) at a trial
is K = (ds)^l. Since an LR state specifies the exact outcome for each local measurement of each
party at a trial, there are H = d^{ls} such LR states. A general LR model is a convex combination
of LR states, so the number of free parameters characterizing a general LR model is (H − 1).
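These counts are easily made concrete (the function name is illustrative); for the CHSH configuration (l = 2 parties, s = 2 settings, d = 2 outcomes) both counts equal 16:

```python
def configuration_sizes(l, s, d):
    """Counts for l parties, s settings per party, d outcomes per setting:
    K = (d*s)**l possible trial results and H = d**(l*s) deterministic
    LR states; a general LR model has H - 1 free parameters."""
    K = (d * s) ** l
    H = d ** (l * s)
    return K, H
```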
Let the total number of trials in an experimental test of LR be N . We assume that each
PBR protocol sets the initial value of the PBR to R0 = 1 and updates the PBR Rn before each
trial n (n > 1). (In practice this is unnecessary; see Sec. 5.3.1 of Chapter 5.) For updating the
PBR, each PBR protocol needs to optimize a convex objective function over a convex space. The
complexity of this optimization problem can be described in terms of variables that are functions
of the parameters n, l, s, and d characterizing the input data size. (Note that the stored size of
the first n trial results is O(n log(K)) = O(nl(log(d) + log(s))).) We need to quantify the resource
cost of implementing each protocol in terms of these parameters.
The complexity of the optimization problem solved before each trial can be parametrized by
the complexity of the convex search space, the complexity of evaluating the objective function, and
the precision needed for computing a high-quality p-value for rejecting LR. We assume that the sim-
plified and full PBR protocols use generic iterative optimization algorithms whose implementation
complexities as functions of these parameters are asymptotically the same. We also assume that the
complexity of the convex search space is dominated by its dimension. In particular, we do not ac-
count for the complexity of enforcing convex constraints. This is motivated by the observation that
there is no additional overhead for enforcing convex constraints in the expectation-maximization
algorithm [152, 167] used in our implementation. For quantifying the complexity of evaluating the
objective function, we assume that the Bell functions used can be evaluated in constant time given
any trial result. This assumption is realistic for many Bell functions, as their values are determined
by concise formulas derived from theory. Alternatively, these functions can be preprocessed as a
table stored in random-access memory; we do not include preprocessing time in our analysis. Also,
we assume that determining whether or not an arbitrary trial result x happens according to an
LR state takes constant time. (Strictly speaking, the time taken for such a determination process
is proportional to the number of parties l.) The precision needed affects the number of iterations
required by an algorithm to find a numerical solution. It affects only the quality of the p-value
computed by a protocol, but not its validity. (For the expectation-maximization algorithm used,
see Theorem 4 of Ref. [167] and Sec. 5.3.2 of Chapter 5 for the effects of the precision parameters
in the simplified and full PBR protocols, respectively.) We assume that the precision parameters in
both protocols are set to be the same, and we do not account for the number of iterations required
to achieve the specified precision. Therefore, for the purpose of comparing the computational re-
sources required by the simplified and full PBR protocols, we focus on comparing the dimensions
D of the convex search spaces and the complexities C of evaluating the objective functions in the
optimization problems solved by the two protocols before each trial.
We first consider the simplified PBR protocol. Given a set of M Bell inequalities, this
protocol sets Rn = ωn · r, where the size of ωn is M , r is defined before Eq. (6.1), and ωn is chosen
to maximize the estimated confidence gain as in Eq. (6.3). Note that, in the right-hand side of
Eq. (6.3), the sum is taken over only the results x already observed in the previous trials.
For the maximization of Eq. (6.3), the dimension of the convex search space is M . The eval-
uation of the objective function can use the left-hand or right-hand side of Eq. (6.3), whichever has
fewer terms. Thus it involves a sum of at most min(n,K) terms where each term requires comput-
ing a convex combination of M Bell-function values. Hence, for updating the PBR Rn before the
(n+1)’th trial, the complexity of evaluating the objective function is CsPBR = O(min(nM,KM)) =
O(min(nM, (ds)^l M)), and the dimension of the search space is DsPBR = O(M). Therefore, even
when any of the configuration parameters l, s, or d is large, CsPBR and DsPBR remain independent
of these parameters. Consequently, the numbers of parties, settings, and outcomes are not limiting factors for
applying the simplified PBR protocol. In this sense, the simplified PBR protocol is efficient for any
experimental configuration.
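To make the per-trial update concrete, the weight optimization of the simplified PBR protocol can be sketched in Python. This is a schematic sketch, not the implementation used in this thesis: `estimated_gain` stands in for the estimated confidence gain of Eq. (6.3), taken here as the empirical average of log2 of the combined Bell-function value over the results observed so far, and for M = 2 Bell functions a one-dimensional grid search over the weight simplex replaces a generic convex optimizer. The helper names are hypothetical.

```python
import math
from collections import Counter

def estimated_gain(weights, counts, bell_fns):
    # Empirical average of log2 R(x) over the observed results, where
    # R(x) = sum_k weights[k] * r_k(x) is a convex combination of
    # standardized Bell functions r_k (r_k >= 0 and <r_k> <= 1 under LR).
    n = sum(counts.values())
    total = 0.0
    for x, c in counts.items():
        R = sum(w * r(x) for w, r in zip(weights, bell_fns))
        if R <= 0.0:
            return float("-inf")
        total += c * math.log2(R)
    return total / n

def update_pbr(counts, bell_fns, grid=101):
    # For M = 2 Bell functions, a 1-d grid search over the weight simplex
    # suffices; larger M calls for a generic convex optimizer.
    best_g, best_w = float("-inf"), None
    for i in range(grid):
        w = i / (grid - 1)
        g = estimated_gain((w, 1.0 - w), counts, bell_fns)
        if g > best_g:
            best_g, best_w = g, (w, 1.0 - w)
    return best_w
```

Here `counts` would tally the results of the previous trials, and the returned weights define the PBR Rn = ωn · r used for the next trial; including the trivial Bell function r = 1 among the bell_fns keeps R(x) positive.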
The full PBR protocol as studied in Chapter 5 computes Rn in two steps. First, the protocol
estimates the probability qn(x) of the result x to be observed at the next trial. This estimate can be
obtained in different ways. The simplest is to let qn(x) be the empirical frequency fn(x) of x over
the previous n trials. However, one can consider additional constraints such as the known joint-
setting distribution and no-signaling conditions. Thus, in Chapter 5 we suggested maximizing the
likelihood function Eq. (5.26), subject to these constraints, and we observed that this can improve
the quality of the p-value computed. Since this maximization is not a resource bottleneck, we do
not consider its complexity in the comparison. Second, we find the LR model pLR,n closest to the
estimated distribution qn by minimizing the KL divergence [168] from qn to an LR model p:

DKL(qn ‖ p) = ∑x qn(x) log2( qn(x)/p(x) ).   (6.7)
The full PBR protocol then sets Rn(xn+1) = qn(xn+1)/pLR,n(xn+1).
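The two steps can be sketched numerically. The following is a minimal sketch assuming numpy, taking the empirical frequencies for qn and using an EM-style multiplicative update for the mixture weights over LR states; the algorithm actually used follows Ref. [167], and `closest_lr_model` is a hypothetical helper name.

```python
import numpy as np

def closest_lr_model(q, lr_states, iters=1000):
    # Minimize D_KL(q || p) over mixtures p = sum_h lam[h] * lr_states[h]
    # via EM-style multiplicative updates of the mixture weights lam.
    # q: length-K probability vector (e.g., the empirical frequencies f_n);
    # lr_states: (H, K) array whose rows are the LR-state distributions.
    H = lr_states.shape[0]
    lam = np.full(H, 1.0 / H)
    for _ in range(iters):
        p = lam @ lr_states                    # current mixture, length K
        ratio = np.divide(q, p, out=np.zeros_like(q), where=(p > 0))
        lam = lam * (lr_states @ ratio)        # lam[h] *= sum_x q(x) p_h(x)/p(x)
        lam = lam / lam.sum()
    return lam, lam @ lr_states

# The PBR for the next trial is then R(x) = q[x] / p_lr[x].
```

The objective is convex in lam, and each multiplicative update cannot decrease the likelihood, which is what makes this fixed-point iteration usable as a stand-in for the expectation-maximization step.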
For the minimization of Eq. (6.7), the dimension of the convex search space is H. The
evaluation of the objective function involves a sum of K terms where each term requires computing
p(x) according to a convex combination of H LR states. Hence, for updating the PBR Rn before
the (n + 1)’th trial, the complexity of evaluating the objective function is CfPBR = O(KH) =
O(d^{l(s+1)} s^l), and the dimension of the search space is DfPBR = O(H) = O(d^{ls}). While CfPBR and
DfPBR are polynomial in d, they are exponential in each of l and s. Therefore, the full PBR protocol
is not efficient with respect to these configuration parameters.
Before applying the simplified PBR protocol, one chooses a relevant and preferably small
set of Bell inequalities. In many cases of interest, l, s, or d is large, and so is H = d^{sl}. For
example, in field-quadrature measurements d is fundamentally infinite. Hence, M , the number of
Bell inequalities used in the simplified PBR protocol, is in general much smaller than H, the number
of LR states considered in the full PBR protocol. The complexities show that for such cases, the
simplified PBR protocol is substantially less resource-intensive than the full PBR protocol.
6.2.2 Comparison of confidence-gain rates
We begin by comparing the confidence-gain rates achieved by different protocols for ex-
perimental configurations designed to violate the Collins-Gisin-Linden-Massar-Popescu (CGLMP)
inequality [58]. To test the CGLMP inequality, there are two parties, and each of them performs
one of two possible measurements with d outcomes at each trial. This is an example where the full
PBR protocol is impractical for large d.

Figure 6.1: Confidence-gain rates G, as functions of the number of outcomes d, in the test of the
CGLMP inequality 〈Id(X)〉 ≤ 2, for the martingale-based and simplified PBR protocols. Here, we
use the quantum state and measurement settings of Ref. [4], Eqs. (15) and (9), respectively.

For this example and the one below, we assume that at
each trial each party’s measurement setting is chosen uniformly randomly. The CGLMP inequality
can be written as 〈Id(X)〉 ≤ 2, where the function Id takes d different values. The gain rates Gmart
and GsPBR, achieved by the martingale-based and simplified PBR protocols, are shown in Fig. 6.1.
Here the simplified PBR protocol uses only the CGLMP inequality. This figure illustrates that
GsPBR is higher than Gmart when d > 2.
The optimal gain rate Sq is achieved by the full PBR protocol and can be computed as the
minimum KL divergence from the experimental probability distribution q to any LR model [25].
For the results of Fig. 6.1, we find that the gain rates GsPBR are numerically indistinguishable from
Sq when d ≤ 13. For the case d > 13, it is difficult to compute Sq due to the large dimension of
the probability space over all possible LR models. For the tests studied in Fig. 6.1, we conjecture
that GsPBR = Sq. In general we cannot guarantee that GsPBR is optimal.
Next, we compare the performance of the simplified PBR protocol when using different num-
bers of Bell inequalities. The experimental configuration considered is for a test of the Clauser-
Horne-Shimony-Holt (CHSH) inequality [15] using an unbalanced Bell state |ψ(θ)〉 = cos(θ)|00〉 +
sin(θ)|11〉.

Figure 6.2: Confidence-gain rates G, as functions of θ, in the test of LR with an unbalanced Bell
state |ψ(θ)〉. The measurement settings are chosen to maximize the violation of the CHSH
inequality (1.2) given the state |ψ(θ)〉. The gain rates achieved by the simplified PBR protocol
using the CHSH inequality are shown as circles (◦), while the gain rates by the same protocol
using the CHSH inequality together with no-signaling conditions are shown as crosses (+). The
gain rates achieved by the full PBR protocol are also shown.

There are many different ways of expressing the CHSH inequality, and they are equivalent
considering no-signaling and normalization conditions. For comparison, we consider the sim-
plified PBR protocol with the CHSH inequality (1.2) alone or in conjunction with additional,
seemingly trivial Bell inequalities such as those derived from no-signaling conditions. With Bell
functions corresponding to no-signaling conditions, the gain rates are improved, as shown in Fig. 6.2.
This improvement suggests that the gain rate achieved by the simplified PBR protocol depends on
the form of a Bell inequality used.
6.2.3 Comparison of protocols’ behavior for finite data
Here we consider the behavior of each protocol given a finite amount of experimental data.
We simulate the test of the CGLMP inequality 〈I3(X)〉 ≤ 2 [58] with the quantum state and
measurement settings of Ref. [4], Eqs. (15) and (9) (with d = 3), respectively. We assume that
at each trial each party’s measurement setting is chosen uniformly randomly. The protocols’ gain
rates are Gmart = 0.0565 and GsPBR = 0.0675, while the optimal gain rate Sq achieved by the full
PBR protocol is numerically indistinguishable from GsPBR. For computing GsPBR, the simplified
PBR protocol uses the standardized CGLMP inequality and the trivial Bell function r = 1.
Figure 6.3: An example of running log-p-values (−log2 pn) as functions of the number of trials n in
a test of the CGLMP inequality, for the martingale-based, simplified PBR, and full PBR protocols.
The dashed and solid lines are the asymptotic lines for log-p-values based on gain rates achieved
by the (full or simplified) PBR protocol and the martingale-based protocol, respectively.
Repetitions of this Monte Carlo simulation show similar behavior.
The results from 10,000 successive trials are recorded. Fig. 6.3 shows the (negative) log-p-
values computed for the first n results from a simulated sequence of trials as functions of n. The
asymptotic lines for log-p-values, given by the products of n and the respective gain rates achieved
by different protocols, are also shown in Fig. 6.3.
In our discussion so far, we have assumed that each PBR protocol updates the PBR before
each trial. In practice, the PBR is updated only for a block of trial results at a time. Specifically,
for the simulation shown in Fig. 6.3, we update the PBRs and log-p-values only after every block
of 154 successive trials. (See Sec. 5.3.1 of Chapter 5 for a discussion of the block-size
choice and related issues.) This block-size choice limits PBR computations to when enough new
information has been obtained, thereby reducing the resource cost. It also mitigates the offset of
the computed log-p-values from the asymptotic line. This offset is due to an initial transient where
the relevant features of the experimental distribution are being learned. The learning offset can be
removed if, before an experiment, we have a good estimate of the experimental results’ distribution.
Such an estimate could be based on (quantum or otherwise) theory or previous experiments.
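The block-wise accounting can be sketched as follows. This is a schematic sketch: `pbr_update` is a hypothetical callback standing for whichever PBR construction (simplified or full) is in use, and the p-value bound after n trials is min(1, 1/∏i Ri(xi)).

```python
import math

def running_log_pvalues(results, pbr_update, block=154):
    # Accumulate -log2 of the p-value bound min(1, 1 / prod_i R_i(x_i)),
    # updating the PBR only after each block of trials. pbr_update(history)
    # must return a function R with R(x) > 0 and <R(X)> <= 1 under every
    # LR model, constructed from the trial results seen so far.
    log2_T = 0.0                      # log2 of the running product of R_i
    history = []
    R = lambda x: 1.0                 # trivial PBR before any data
    out = []
    for i, x in enumerate(results):
        log2_T += math.log2(R(x))     # R was fixed before this trial
        history.append(x)
        if (i + 1) % block == 0:
            R = pbr_update(list(history))
        out.append(max(0.0, log2_T))  # -log2 of the p-value bound
    return out
```

Because R is always fixed before the trials it is applied to, the validity of the p-value bound is unaffected by the block size; only the quality of the bound changes.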
The PBR protocols provide better results than the martingale-based protocol. However,
the PBR log-p-values show learning offsets from the asymptotic line. Our results show that the
simplified PBR log-p-values have a smaller learning offset than the full PBR log-p-values in each
of 30 independent simulations performed. The reason is that the simplified PBR protocol needs to
infer a much smaller number of parameters for constructing the PBRs.
In the above example, the simplified PBR protocol uses only two Bell functions. Given a
prescient choice of Bell functions, this is sufficient for computing asymptotically optimal p-values.
But in general, more Bell functions are needed for computing a high-quality p-value. However, this
involves inferring more parameters and thus requires more trials before a good inference can be
obtained. As a result, the learning offset is expected to increase when using more Bell functions.
One way to mitigate this problem may be to increase the number of Bell functions used over time,
adding new Bell functions only when there are enough trials for reliable inference of the additional
parameters.
6.3 Extensions
To compute a p-value, the simplified PBR protocol uses a set of linear inequalities that are
satisfied by the predictions of a null hypothesis before each trial in an experiment. Besides tests of
LR, there are many other types of tests based on linear witnesses, such as tests for entanglement [82,
84] and system dimensionality above a given bound [169, 170]. In any test based on linear witnesses,
such a witness can be expressed as 〈W (X)〉 ≤ B, where W is a real-valued function and X is the
random variable from which a trial result x is sampled. The result x consists of all choices made
at each trial, such as choices of states and measurement settings, and the outcomes observed
under these choices. Here, we assume that the choices are made randomly according to a known
probability distribution at each trial, so that a witness 〈W (X)〉 ≤ B is satisfied before each trial
assuming the null hypothesis. As for Bell functions, if a witness function W is lower-bounded it can
be standardized. The simplified PBR protocol can then be applied with any set of standardized
witnesses, as we did for tests of LR.
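The standardization used later in this chapter, r(x) = (W(x) − bl)/(B − bl) for a witness function bounded below by bl, can be written as a one-line helper; the function name is ours, not from the text.

```python
def standardize(W, B, b_l):
    # Map a witness function W with pointwise lower bound b_l and
    # null-hypothesis bound <W(X)> <= B to r with r(x) >= 0 and <r(X)> <= 1.
    assert b_l < B
    return lambda x: (W(x) - b_l) / (B - b_l)
```

For the entanglement-witness function W′′ of the next example, with Bsep = 1 and bl = −3, this gives r = (W′′ + 3)/4.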
To explain the above idea, let us consider a test of entanglement. Specifically, let us consider
the verification of entanglement in the mixed two-qubit state
ρ(V) = V |ψ−〉〈ψ−| + (1 − V) I4/4,   (6.8)

where the pure state |ψ−〉 is the singlet state (|10〉 − |01〉)/√2, I4/4 is the completely mixed state of
two qubits, and the parameter V characterizes the visibility of the singlet state in an experiment.
The state in Eq. (6.8) is entangled if and only if V > 1/3, which can be verified by measuring the
entanglement-witness operator [81, 171]
W = I4 − 2|ψ−〉〈ψ−|. (6.9)
It is easy to verify that Tr(Wρsep) ≥ 0 for any separable state ρsep and Tr(Wρ(V )) < 0 if and
only if V > 1/3. However, it is difficult to directly measure this witness operator in practice. The
reason is that the projector |ψ−〉〈ψ−| in this witness operator is a nonlocal operator, which is not
straightforward to measure. Therefore, for the experimental implementation we need to decompose
the witness operator in Eq. (6.9) into operators that can be measured locally. There are different
ways to decompose an entanglement-witness operator into local operators. As shown in Ref. [171],
the decomposition of Eq. (6.9) that involves the least number of joint measurement settings is
W = (1/2)(I4 + σx ⊗ σx + σy ⊗ σy + σz ⊗ σz),   (6.10)

where σx = ( 0 1 ; 1 0 ), σy = ( 0 −i ; i 0 ) with i as the imaginary unit, and σz = ( 1 0 ; 0 −1 )
are the Pauli matrices.
To verify the entanglement in the state Eq. (6.8) by the simplified PBR protocol, let the
operator W ′ = −σx⊗σx−σy⊗σy−σz⊗σz. Since the witness operator in Eq. (6.10) is W = (I4−
W ′)/2, the operator W ′ satisfies that Tr(W ′ρsep) ≤ 1 for any separable state ρsep and Tr(W ′ρ(V )) >
1 for the entangled state ρ(V ) with V > 1/3 in Eq. (6.8). To measure the operator W ′ in an
experiment, at a trial we choose the joint setting σx ⊗ σx, σy ⊗ σy, or σz ⊗ σz uniformly randomly
and observe the outcome (aj , bj) = (1, 1), (1,−1), (−1, 1), or (−1,−1) under the chosen setting
σj ⊗ σj , where 1 and −1 are the two eigenvalues of the Pauli matrix σj , j = x, y, or z. We then
denote the measurement-setting choice and outcome at a trial by X and define the entanglement-
witness function W′′(x) = −ajbj/pj = −3ajbj, where x = (j, aj, bj) includes the setting choice j and
the observed outcome (aj , bj) at a trial, and pj = 1/3 is the probability of choosing the joint setting
σj ⊗ σj at a trial. Here the function W ′′ is chosen so that its expectation 〈W ′′(X)〉 is equal to
Tr(W′ρ) for any state ρ. Hence, W′′ takes the values ±3, and the expectation according to any
separable state satisfies 〈W′′(X)〉 ≤ 1; while according to the state ρ(V) in Eq. (6.8), the probability
that W′′ = 3 or W′′ = −3 is (1 + V)/2 or (1 − V)/2, respectively, so 〈W′′(X)〉 = 3V is greater than
the separable bound Bsep = 1
if and only if V > 1/3. To compute the confidence-gain rate for rejecting separable-state models for
measurement results according to the state ρ(V ) with V > 1/3, we need to standardize the witness
function W ′′ by r(x) = (W ′′(x) − bl)/(Bsep − bl) with bl = −3 so that the standardized witness
function r satisfies that r(x) ≥ 0 and 〈r(X)〉 ≤ 1 for all separable states. Then, the confidence-gain
rate achieved by the simplified PBR protocol using the standardized entanglement-witness function
r and the trivial witness function r′ = 1 is
GsPBR = max_{0≤ω≤1} [ ((1 + V)/2) log2(1 + ω/2) + ((1 − V)/2) log2(1 − ω) ]
      = ((1 + V)/2) log2( 3(1 + V)/4 ) + ((1 − V)/2) log2( 3(1 − V)/2 ).   (6.11)
It is easy to verify that the gain rate GsPBR is positive if and only if V > 1/3 and that GsPBR
increases with V . Moreover, for this example the gain rate GsPBR is optimal, as shown in the
following.
Given the state ρ(V ) in Eq. (6.8) and the setting choice j (j = x, y, or z) at an experimental
trial, the probabilities of observing various outcomes are
ProbQM(aj = 1, bj = 1) = ProbQM(aj = −1, bj = −1) = (1 − V)/4, and
ProbQM(aj = 1, bj = −1) = ProbQM(aj = −1, bj = 1) = (1 + V)/4.   (6.12)
Then, we can compute the KL divergence from q, i.e., the experimental probability distribution of
setting choices and outcomes according to the entangled state ρ(V ) with V > 1/3, to p, i.e., the
distribution of setting choices and outcomes according to the separable state ρ(V = 1/3). It turns
out that the KL divergence DKL(q ‖ p) = ((1 + V)/2) log2( 3(1 + V)/4 ) + ((1 − V)/2) log2( 3(1 − V)/2 ), which is the
same as the confidence-gain rate in Eq. (6.11) achieved by the simplified PBR protocol. According
to Ref. [25], the optimal gain rate Sq in the test of entanglement is given by the minimum KL
divergence from the experimental probability distribution q to any distribution according to a
separable state. That is, Sq ≤ DKL(q ‖ p) = GsPBR. Since the gain rate achieved by the simplified
PBR protocol is valid, GsPBR ≤ Sq. Combining the above two points, we can see that GsPBR = Sq
for any entangled state ρ(V ) with V > 1/3.
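As a numerical check of Eq. (6.11) (not part of the derivation above), one can maximize the expected log2 of the PBR R = ω r + (1 − ω) directly: since r = 3/2 with probability (1 + V)/2 and r = 0 with probability (1 − V)/2 under ρ(V), the objective is concave in ω and a ternary search suffices. The closed form is attained at ω = (3V − 1)/2.

```python
import math

def gain(omega, V):
    # Expected log2 of R = omega * r + (1 - omega) under rho(V):
    # r = 3/2 with probability (1+V)/2 and r = 0 with probability (1-V)/2.
    return (0.5 * (1 + V) * math.log2(1 + omega / 2)
            + 0.5 * (1 - V) * math.log2(1 - omega))

def gain_numeric(V, tol=1e-12):
    # Ternary search over omega in [0, 1); gain(., V) is concave there.
    lo, hi = 0.0, 1.0 - 1e-12
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if gain(m1, V) < gain(m2, V):
            lo = m1
        else:
            hi = m2
    return gain(0.5 * (lo + hi), V)

def gain_closed(V):
    # Closed form of Eq. (6.11), attained at omega = (3V - 1) / 2.
    return (0.5 * (1 + V) * math.log2(3 * (1 + V) / 4)
            + 0.5 * (1 - V) * math.log2(3 * (1 - V) / 2))
```

The numerical maximum agrees with the closed form, and both vanish at V = 1/3, where the state ceases to be entangled.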
In the above example, we can also apply the martingale-based protocol. According to the
discussion in Sec. 6.1, since the witness function W′′ can take only one of two different values at an
experimental trial, the gain rate achieved by the martingale-based protocol is the same as GsPBR
in Eq. (6.11). However, for other entanglement witnesses, the simplified PBR protocol may have
an advantage over the martingale-based protocol. For example, let us consider the entanglement
verification of the noisy Greenberger-Horne-Zeilinger (GHZ) state
ρ = V |ψGHZ〉〈ψGHZ| + (1 − V) I8/8,   (6.13)

where the state |ψGHZ〉 = (|000〉 + |111〉)/√2 is the ideal GHZ state, and I8/8 is the completely mixed
state of three qubits. One can verify the entanglement of the state Eq. (6.13) by measuring the
entanglement-witness operator [81, 172]
W = I − 2|ψGHZ〉〈ψGHZ|
  = (1/4)(3I2 ⊗ I2 ⊗ I2 − I2 ⊗ σz ⊗ σz − σz ⊗ I2 ⊗ σz − σz ⊗ σz ⊗ I2 − 2σx ⊗ σx ⊗ σx
  + √2 σπ/4 ⊗ σπ/4 ⊗ σπ/4 + √2 σ−π/4 ⊗ σ−π/4 ⊗ σ−π/4),   (6.14)

where σ±π/4 = (σx ± σy)/√2, and I2 is the identity matrix of size 2 × 2. This decomposition of
the witness operator involves the least number of joint measurement settings. According to the
definition of an entanglement-witness operator, if Tr(Wρ) < 0 then the state ρ is entangled.
To apply the simplified PBR or martingale-based protocol, we define the operator W ′ =
I2 ⊗ σz ⊗ σz + σz ⊗ I2 ⊗ σz + σz ⊗ σz ⊗ I2 + 2σx ⊗ σx ⊗ σx − √2 σπ/4 ⊗ σπ/4 ⊗ σπ/4 − √2 σ−π/4 ⊗ σ−π/4 ⊗ σ−π/4.
Since the witness operator in Eq. (6.14) is W = (3I2⊗I2⊗I2−W ′)/4, the operator W ′ satisfies that
Tr(W ′ρsep) ≤ 3 for all separable states ρsep. So, if Tr(W ′ρ) > 3 the state ρ is entangled. We index
the joint settings I2 ⊗ σz ⊗ σz, σz ⊗ I2 ⊗ σz, σz ⊗ σz ⊗ I2, σx ⊗ σx ⊗ σx, σπ/4 ⊗ σπ/4 ⊗ σπ/4,
and σ−π/4 ⊗ σ−π/4 ⊗ σ−π/4 by j = 1, 2, 3, 4, 5, and 6, respectively. To measure the operator
W ′ in an experiment, at a trial we choose the joint setting indexed by j randomly and observe
the outcome (aj , bj , cj) under the chosen setting, where aj , bj , or cj is ±1. We then denote the
measurement-setting choice and outcome at a trial by X and define the entanglement-witness
function W ′′(x) = ajbjcjwj/pj , where x = (j, aj , bj , cj) includes the setting choice j and the
observed outcome (aj , bj , cj) at a trial, wj is a constant depending on the setting choice j, and pj
is the probability of choosing the setting indexed by j at a trial. To ensure that the expectation
〈W ′′(X)〉 is equal to Tr(W ′ρ) for any state ρ, the constants wj are chosen as follows: wj = 1 if
j = 1, 2, or 3, wj = −√2 if j = 5 or 6, and w4 = 2. Moreover, assuming the joint setting σz ⊗ σz ⊗ σz,
σx⊗ σx⊗ σx, σπ/4⊗ σπ/4⊗ σπ/4, or σ−π/4⊗ σ−π/4⊗ σ−π/4 is chosen uniformly randomly at a trial,
then the function W′′ can take six different values ±12, ±8, or ±4√2. (Note that, to measure
I2 ⊗ σz ⊗ σz, σz ⊗ I2 ⊗ σz, or σz ⊗ σz ⊗ I2, we use the measurement setup for the joint setting
σz ⊗ σz ⊗ σz and uniformly randomly choose which of the above three measurements to perform.)
Hence, according to the discussion in Sec. 6.1, the confidence-gain rate achieved by the simplified
PBR protocol is higher than that achieved by the martingale-based protocol.
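As a numerical aside (assuming numpy; this computation is not carried out in the text), one can build the operator W′ explicitly and evaluate it on the noisy GHZ state; one finds Tr(W′ρ) = 7V, so the witness detects entanglement exactly when V > 3/7.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sp = (sx + sy) / np.sqrt(2)          # sigma_{+pi/4}
sm = (sx - sy) / np.sqrt(2)          # sigma_{-pi/4}

def kron3(a, b, c):
    return np.kron(np.kron(a, b), c)

# The operator W' built from the decomposition of Eq. (6.14)
Wp = (kron3(I2, sz, sz) + kron3(sz, I2, sz) + kron3(sz, sz, I2)
      + 2 * kron3(sx, sx, sx)
      - np.sqrt(2) * kron3(sp, sp, sp)
      - np.sqrt(2) * kron3(sm, sm, sm))

ghz = np.zeros(8, dtype=complex)
ghz[0] = ghz[7] = 1 / np.sqrt(2)     # (|000> + |111>) / sqrt(2)

def rho(V):
    # Noisy GHZ state of Eq. (6.13)
    return V * np.outer(ghz, ghz.conj()) + (1 - V) * np.eye(8) / 8

def witness_value(V):
    return float(np.real(np.trace(Wp @ rho(V))))
```

All seven terms of W′ are traceless, so the completely mixed component contributes nothing, and the GHZ component contributes 1 per term after the weights: witness_value(V) = 7V, exceeding the separable bound 3 for V > 3/7 ≈ 0.43.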
In general, the simplified PBR protocol can also be applied to verify the type of entanglement
of a multipartite state. Moreover, as in a test of LR, the simplified PBR protocol can be applied with
a set of entanglement witnesses, so that this protocol typically behaves better than the martingale-
based protocol in practice. The above strategies are limited to linear witnesses. As is well known,
the set of separable states is convex but not a polytope, so a nonlinear entanglement witness can
detect more entangled states than a linear one. Whether or not the simplified PBR protocol can be
applied with nonlinear witnesses is an interesting open problem and deserves further investigation
in the future.
Chapter 7
Conclusions and future directions
7.1 Conclusion
The degree of violation of local realism (LR) in an experimental test is usually expressed in
terms of the number of standard deviations of violation of a Bell inequality. This quantity cannot,
however, be used to obtain valid p-values for rejecting LR by conventional means. It also fails
to quantitatively compare the success of different experimental tests of LR and does not account
for stability issues or memory effects in experiments. In Chapter 5, we solved these problems
by providing a method—the prediction-based-ratio (PBR) protocol—for determining valid and
asymptotically tight p-value upper bounds directly from the sequence of measurement settings and
outcomes in an experiment. The PBR protocol does not rely on a predetermined Bell inequality,
adapts to the actual experimental configuration, and is asymptotically optimal for independent and
identically distributed results sampled from the experimental probability distribution. It therefore
provides a standardized measure of success for experimental tests of LR. While the protocol remains
valid if the state and setting parameters drift during an experiment, how well it performs depends
on the nature of the drifts and how the protocol takes them into account.
Our simulations showed that it is practical to apply the PBR protocol to data from typical
experimental configurations, and that the running p-value upper bounds can be used for tweaking
an experiment in progress to find the experimentally accessible configuration that provides the
highest violation of LR. However, the PBR protocol is not efficient with respect to the number of
parties per test, settings per party, and outcomes per setting. In Chapter 6, we simplified the PBR
protocol and showed that the simplified PBR protocol is efficient. The simplified PBR protocol uses
a set of Bell inequalities chosen based on the estimated probability distribution of setting choices
and outcomes before an experiment. The behavior and implementation complexity of the simplified
PBR protocol depend on the choice and number of Bell inequalities considered. Compared with the
previously known and efficient protocol, the martingale-based protocol, the simplified PBR protocol
provides better and even optimal results given a relevant set of Bell inequalities. In Chapter 6, we
also briefly discussed how to apply the simplified PBR protocol to any test with linear witnesses,
such as tests of entanglement or system dimensionality.
The p-value for rejecting LR decays exponentially with the number of data points in the
asymptotic limit. The optimal decay rate is given by the minimum Kullback-Leibler divergence from
the experimental probability distribution to all distributions according to LR. In Chapter 4, we
studied the optimal decay rates of p-values in tests of LR using polarized photon pairs and inefficient
detectors or counters. Specifically, we studied the minimum detection efficiency or experimental
visibility required for achieving any given optimal decay rate.
7.2 Future work
Many quantum information tasks, such as device-independent quantum key distribution and
randomness expansion or amplification, have been proposed recently. In these tasks, before sharing
private information, the spatially separated parties need to verify violation of LR using a finite set
of data. It is desirable to apply the PBR protocol to these tasks, so that reliable experimental
realizations of these tasks can be guaranteed. Also, a detailed study of the application of the
simplified PBR protocol to tests of entanglement or system dimensionality is needed. In addition,
we described a systematic and efficient method for deriving Bell inequalities in Chapter 2. Whether
or not all Bell inequalities can be derived using this method is an interesting open problem and
deserves further investigation in the future.
Bibliography
[1] Tamas Vertesi, Stefano Pironio, and Nicolas Brunner. Closing the detection loophole in Bell experiments using qudits. Phys. Rev. Lett., 104:060401, 2010.
[2] Marek Zukowski, Dagomir Kaszlikowski, and Emilio Santos. Irrelevance of photon events distinguishability in a class of Bell experiments. Phys. Rev. A, 60:R2614–R2617, Oct 1999.
[3] S. Pironio, A. Acin, S. Massar, A. Boyer de la Giroday, D. N. Matsukevich, P. Maunz, S. Olmschenk, D. Hayes, L. Luo, T. A. Manning, and C. Monroe. Random numbers certified by Bell’s theorem. Nature, 464:1021, 2010.
[4] Jing-Ling Chen, Chunfeng Wu, L. C. Kwek, C. H. Oh, and Mo-Lin Ge. Violating Bell inequalities maximally for two d-dimensional systems. Phys. Rev. A, 74:032106, Sep 2006.
[5] J. S. Bell. On the Einstein Podolsky Rosen paradox. Physics, 1:195–200, 1964.
[6] S. J. Freedman and J. F. Clauser. Experimental test of local hidden-variable theories. Phys. Rev. Lett., 28:938–941, 1972.
[7] A. Peres. All the Bell inequalities. Found. Phys., 29:589–614, 1999.
[8] R. F. Werner and M. M. Wolf. Bell inequalities and entanglement. Quant. Inf. Comp., 1:1–25, 2001.
[9] M. Genovese. Research on hidden variable theories: A review of recent progresses. Phys. Rep., 413:319–396, 2005.
[10] R. Horodecki, P. Horodecki, M. Horodecki, and K. Horodecki. Quantum entanglement. Rev. Mod. Phys., 81:865–942, 2009.
[11] Jonathan Barrett, Lucien Hardy, and Adrian Kent. No signaling and quantum key distribution. Phys. Rev. Lett., 95:010503, Jun 2005.
[12] L. Masanes, R. Renner, A. Winter, J. Barrett, and M. Christandl. Security of key distribution from causality constraints. 2009. arXiv:quant-ph/0606049.
[13] Lluis Masanes. Universally composable privacy amplification from causality constraints. Phys. Rev. Lett., 102:140501, Apr 2009.
[14] S. Popescu and D. Rohrlich. Quantum nonlocality as an axiom. Found. Phys., 24:379, 1994.
[15] J. F. Clauser, M. A. Horne, A. Shimony, and R. A. Holt. Proposed experiment to test local hidden-variable theories. Phys. Rev. Lett., 23:880–884, 1969.
[16] J. F. Clauser and M. A. Horne. Experimental consequences of objective local theories. Phys. Rev. D, 10:526–535, 1974.
[17] Yanbao Zhang, Scott Glancy, and Emanuel Knill. Asymptotically optimal data analysis for rejecting local realism. Phys. Rev. A, 84:062118, Dec 2011.
[18] Richard D. Gill. Accardi contra Bell (cum mundi): The impossible coupling. In Mathematical Statistics and Applications: Festschrift for Constance van Eeden. Eds: M. Moore, S. Froda, and C. Leger. IMS Lecture Notes – Monograph Series, volume 42, pages 133–154. Institute of Mathematical Statistics, Beachwood, Ohio, 2003. Also available as arXiv:quant-ph/0110137.
[19] Richard D. Gill. Time, finite statistics, and Bell’s fifth position. In Proc. of “Foundations of Probability and Physics - 2”, Ser. Math. Modelling in Phys., Engin., and Cogn. Sc., volume 5, pages 179–206. Vaxjo Univ. Press, 2003.
[20] Luigi Accardi and Massimo Regoli. Locality and Bell’s inequality. arXiv:quant-ph/0007005.
[21] Luigi Accardi and Massimo Regoli. Non-locality and quantum theory: new experimental evidence. arXiv:quant-ph/0007019.
[22] Luigi Accardi and Massimo Regoli. The EPR correlations and the chameleon effect. arXiv:quant-ph/0110086.
[23] Jonathan Barrett, Daniel Collins, Lucien Hardy, Adrian Kent, and Sandu Popescu. Quantum nonlocality, Bell inequalities, and the memory loophole. Phys. Rev. A, 66(4):042111, Oct 2002.
[24] W. van Dam, R. D. Gill, and P. D. Grunwald. The statistical strength of nonlocality proofs. IEEE Trans. Inf. Theory, 51:2812–2835, 2005.
[25] R. R. Bahadur. An optimal property of the likelihood ratio statistic. In Proc. Fifth Berkeley Symp. on Math. Statist. and Prob., volume 1, pages 13–26. Univ. of Calif. Press, 1967.
[26] Yanbao Zhang, Emanuel Knill, and Scott Glancy. Statistical strength of experiments to reject local realism with photon pairs and inefficient detectors. Phys. Rev. A, 81:032117, Mar 2010.
[27] P. M. Pearle. Hidden-variable example based upon data rejection. Phys. Rev. D, 2:1418–1425, 1970.
[28] P. H. Eberhard. Background level and counter efficiencies required for a loophole-free Einstein-Podolsky-Rosen experiment. Phys. Rev. A, 47:R747–R750, 1993.
[29] Yanbao Zhang, Scott Glancy, and Emanuel Knill. Efficient quantification of experimental evidence against local realism. arXiv:1303.7464.
[30] A. Einstein, B. Podolsky, and N. Rosen. Can quantum-mechanical description of physical reality be considered complete? Phys. Rev., 47:777, 1935.
[31] J. S. Bell. Speakable and Unspeakable in Quantum Mechanics. Cambridge University Press, Cambridge, 2004. pp. 139–158.
[32] Arthur Fine. Hidden variables, joint probability, and the Bell inequalities. Phys. Rev. Lett., 48:291–295, Feb 1982.
[33] R. F. Werner and M. M. Wolf. All-multipartite Bell-correlation inequalities for two dichotomic observables per site. Phys. Rev. A, 64:032112, Aug 2001.
[34] Itamar Pitowsky. Geometry of quantum correlations. Phys. Rev. A, 77:062109, Jun 2008.
[35] B. S. Cirel’son. Quantum generalizations of Bell’s inequality. Lett. Math. Phys., 4:93, 1980.
[36] L. Masanes. Necessary and sufficient condition for quantum-generated correlations. arXiv:quant-ph/0309137, 2003.
[37] Gunter M. Ziegler. Lectures on Polytopes. Springer-Verlag, New York, 1995.
[38] Jonathan Barrett, Noah Linden, Serge Massar, Stefano Pironio, Sandu Popescu, and David Roberts. Nonlocal correlations as an information-theoretic resource. Phys. Rev. A, 71:022101, Feb 2005.
[39] I. Pitowsky. Quantum Probability–Quantum Logic. Springer, Berlin, 1989.
[40] Itamar Pitowsky and Karl Svozil. Optimal tests of quantum nonlocality. Phys. Rev. A, 64:014102, Jun 2001.
[41] Daniel Collins and Nicolas Gisin. A relevant two qubit Bell inequality inequivalent to the CHSH inequality. J. Phys. A: Math. Gen., 37:1775, 2004.
[42] David Avis, Hiroshi Imai, and Tsuyoshi Ito. On the relationship between convex bodies related to correlation experiments with dichotomic observables. J. Phys. A: Math. Gen., 39:11283, 2006.
[43] Lluis Masanes. Tight Bell inequality for d-outcome measurements correlations. Quantum Information & Computation, 3:345, 2002.
[44] Marek Zukowski and Caslav Brukner. Bell’s theorem for general N-qubit states. Phys. Rev. Lett., 88:210401, May 2002.
[45] Jean-Daniel Bancal, Nicolas Gisin, and Stefano Pironio. Looking for symmetric Bell inequalities. J. Phys. A: Math. Theor., 43:385303, 2010.
[46] N. Gisin. Bell’s inequality holds for all non-product states. Phys. Lett. A, 154:201–202, 1991.
[47] N. Gisin and A. Peres. Maximal violation of Bell’s inequality for arbitrarily large spin. Phys. Lett. A, 162:15, 1992.
[48] S. Popescu and D. Rohrlich. Generic quantum nonlocality. Phys. Lett. A, 166:293–297, 1992.
[49] T. Vertesi. More efficient Bell inequalities for Werner states. Phys. Rev. A, 78:032112, Sep 2008.
[50] N. Brunner, N. Gisin, V. Scarani, and C. Simon. Detection loophole in asymmetric Bell experiments. Phys. Rev. Lett., 98:220403, 2007.
[51] A. Cabello and J.-A. Larsson. Minimum detection efficiency for a loophole-free atom-photon Bell experiment. Phys. Rev. Lett., 98:220402, 2007.
[52] Nicolas Brunner and Nicolas Gisin. Partial list of bipartite Bell inequalities with four binary settings. Phys. Lett. A, 372:3162, 2008.
[53] Karoly F. Pal and Tamas Vertesi. Quantum bounds on Bell inequalities. Phys. Rev. A, 79:022120, Feb 2009.
[54] Samuel L. Braunstein and Carlton M. Caves. Wringing out better Bell inequalities. Annals of Physics, 202:22, 1990.
[55] N. Gisin. Bell inequality for arbitrary many settings of the analyzers. Phys. Lett. A, 260:1, 1999.
[56] Dagomir Kaszlikowski and Marek Zukowski. Bell theorem involving all possible local measurements. Phys. Rev. A, 61:022114, Jan 2000.
[57] N. Gisin. Bell inequalities: many questions, a few answers. arXiv:quant-ph/0702021.
[58] Daniel Collins, Nicolas Gisin, Noah Linden, Serge Massar, and Sandu Popescu. Bell inequalities for arbitrarily high-dimensional systems. Phys. Rev. Lett., 88:040404, Jan 2002.
[59] Li-Bin Fu. General correlation functions of the Clauser-Horne-Shimony-Holt inequality for arbitrarily high-dimensional systems. Phys. Rev. Lett., 92:130404, Mar 2004.
[60] N. David Mermin. Extreme quantum entanglement in a superposition of macroscopically distinct states. Phys. Rev. Lett., 65:1838–1840, Oct 1990.
[61] M. Ardehali. Bell inequalities with a magnitude of violation that grows exponentially with the number of particles. Phys. Rev. A, 46:5375–5378, Nov 1992.
[62] A. V. Belinskii and D. N. Klyshko. Interference of light and Bell’s theorem. Phys. Usp., 36:653, 1993.
[63] N. Gisin and H. Bechmann-Pasquinucci. Bell inequality, Bell states and maximally entangled states for n qubits. Phys. Lett. A, 246:1–6, 1998.
[64] Adan Cabello. Bell’s inequality for n spin-s particles. Phys. Rev. A, 65:062105, Jun 2002.
[65] Wiesław Laskowski, Tomasz Paterek, Marek Zukowski, and Caslav Brukner. Tight multipartite Bell’s inequalities involving many measurement settings. Phys. Rev. Lett., 93:200401, Nov 2004.
[66] W. Son, Jinhyoung Lee, and M. S. Kim. Generic Bell inequalities for multipartite arbitrary dimensional systems. Phys. Rev. Lett., 96:060406, Feb 2006.
[67] Koji Nagata, Wiesław Laskowski, and Tomasz Paterek. Bell inequality with an arbitrary number of settings and its applications. Phys. Rev. A, 74:062109, Dec 2006.
[68] Elena R. Loubenets. Multipartite Bell-type inequalities for arbitrary numbers of settings and outcomes per site. J. Phys. A: Math. Theor., 41:445304, 2008.
[69] Marek Zukowski and Dagomir Kaszlikowski. Critical visibility for n-particle Greenberger-Horne-Zeilinger correlations to violate local realism. Phys. Rev. A, 56:R1682–R1685, Sep 1997.
[70] Stephanie Wehner. Tsirelson bounds for generalized Clauser-Horne-Shimony-Holt inequalities. Phys. Rev. A, 73:022110, Feb 2006.
[71] A. Peres. Bayesian analysis of Bell inequalities. Fortsch. Phys., 48:531, 2000.
[72] Jonathan Barrett, Adrian Kent, and Stefano Pironio. Maximally nonlocal and monogamous quantum correlations. Phys. Rev. Lett., 97:170409, Oct 2006.
[73] Roger Colbeck and Renato Renner. Hidden variable models for quantum theory cannot have any local part. Phys. Rev. Lett., 101:050403, Aug 2008.
[74] A. Cabello, J.-A. Larsson, and D. Rodriguez. Minimum detection efficiency required for a loophole-free violation of the Braunstein-Caves chained Bell inequalities. Phys. Rev. A, 79:062109, Jun 2009.
[75] Antonio Acin, Richard Gill, and Nicolas Gisin. Optimal Bell tests do not require maximallyentangled states. Phys. Rev. Lett., 95:210402, Nov 2005.
[76] Serge Massar, Stefano Pironio, Jeremie Roland, and Bernard Gisin. Bell inequalities resistantto detector inefficiency. Phys. Rev. A, 66:052112, Nov 2002.
[77] A. Acin, T. Durt, N. Gisin, and J. I. Latorre. Quantum nonlocality in two three-level systems.Phys. Rev. A, 65:052325, May 2002.
[78] Stefan Zohren and Richard D. Gill. Maximal violation of the Collins-Gisin-Linden-Massar-Popescu inequality for infinite dimensional states. Phys. Rev. Lett., 100:120406, Mar 2008.
[79] Marek Zukowski, Caslav Brukner, Wies law Laskowski, and Marcin Wiesniak. Do all pure en-tangled states violate Bell’s inequalities for correlation functions? Phys. Rev. Lett., 88:210402,May 2002.
[80] Samson Abramsky and Lucien Hardy. Logical Bell inequalities. Phys. Rev. A, 85:062114,Jun 2012.
[81] Otfried Guhne and Geza Toth. Entanglement detection. Physics Reports, 474:1–75, 2009.
[82] Barbara M. Terhal. Bell inequalities and the separability criterion. Phys. Lett. A, 271:319,2000.
[83] Reinhard F. Werner. Quantum states with Einstein-Podolsky-Rosen correlations admittinga hidden-variable model. Phys. Rev. A, 40:4277–4281, Oct 1989.
[84] Michal Horodecki, Pawel Horodecki, and Ryszard Horodecki. Separability of mixed states:necessary and sufficient conditions. Phys. Lett. A, 223:1, 1996.
[85] Otfried Guhne and Norbert Lutkenhaus. Nonlinear entanglement witnesses. Phys. Rev. Lett.,96:170502, May 2006.
105
[86] Asher Peres. Separability criterion for density matrices. Phys. Rev. Lett., 77:1413–1415, Aug1996.
[87] Michal Horodecki, Pawel Horodecki, and Ryszard Horodecki. Mixed-state entanglement anddistillation: Is there a “bound” entanglement in nature? Phys. Rev. Lett., 80:5239–5242,Jun 1998.
[88] George Svetlichny. Distinguishing three-body from two-body nonseparability by a Bell-typeinequality. Phys. Rev. D, 35:3066–3069, May 1987.
[89] Michael Seevinck and George Svetlichny. Bell-type inequalities for partial separability inN -particle systems and quantum mechanical violations. Phys. Rev. Lett., 89:060401, Jul2002.
[90] Daniel Collins, Nicolas Gisin, Sandu Popescu, David Roberts, and Valerio Scarani. Bell-typeinequalities to detect true n-body nonseparability. Phys. Rev. Lett., 88:170405, Apr 2002.
[91] S. M. Roy. Multipartite separability inequalities exponentially stronger than local realityinequalities. Phys. Rev. Lett., 94:010402, Jan 2005.
[92] Wies law Laskowski and Marek Zukowski. Detection of N -particle entanglement with gener-alized Bell inequalities. Phys. Rev. A, 72:062112, Dec 2005.
[93] Michael Seevinck and Jos Uffink. Partial separability and entanglement criteria for multiqubitquantum states. Phys. Rev. A, 78:032101, Sep 2008.
[94] Jos Uffink and Michael Seevinck. Strengthened Bell inequalities for orthogonal spin directions.Phys. Lett. A, 372:1205, 2008.
[95] Jean-Daniel Bancal, Nicolas Gisin, Yeong-Cherng Liang, and Stefano Pironio. Device-independent witnesses of genuine multipartite entanglement. Phys. Rev. Lett., 106:250404,2011.
[96] A. A. Methot and V. Scarani. An anomaly of nonlocality. Quantum Information andComputation, 7:157–170, 2007.
[97] Sixia Yu, Qing Chen, Chengjie Zhang, C. H. Lai, and C. H. Oh. All entangled pure statesviolate a single Bell’s inequality. Phys. Rev. Lett., 109:120402, Sep 2012.
[98] Tamas Vertesi and Nicolas Brunner. Quantum nonlocality does not imply entanglementdistillability. arXiv:1106.4850v2.
[99] Tobias Moroder, Jean-Daniel Bancal, Yeong-Cherng Liang, Martin Hofmann, and OtfriedGuhne. Device-independent entanglement quantification and related applications. Phys.Rev. Lett., 111:030501, Jul 2013.
[100] E. Schrodinger. Discussion of probability relations between separated systems. Proc.Cambridge Philos. Soc., 31:555, 1935.
[101] H. M. Wiseman, S. J. Jones, and A. C. Doherty. Steering, entanglement, nonlocality, and theEinstein-Podolsky-Rosen paradox. Phys. Rev. Lett., 98:140402, Apr 2007.
106
[102] E. G. Cavalcanti, S. J. Jones, H. M. Wiseman, and M. D. Reid. Experimental criteria forsteering and the Einstein-Podolsky-Rosen paradox. Phys. Rev. A, 80:032112, Sep 2009.
[103] John S. Bell. On the problem of hidden variables in quantum mechanics. Rev. Mod. Phys.,38:447–452, Jul 1966.
[104] S. Kochen and E.P. Specker. The problem of hidden variables in quantum mechanics. Journalof Mathematics and Mechanics, 17:5987, 1967.
[105] Matthias Kleinmann, Costantino Budroni, Jan-Ake Larsson, Otfried Guhne, and Adan Ca-bello. Optimal inequalities for state-independent contextuality. Phys. Rev. Lett., 109:250402,2012.
[106] Artur K. Ekert. Quantum cryptography based on Bell’s theorem. Phys. Rev. Lett., 67:661–663, Aug 1991.
[107] Antonio Acin, Nicolas Gisin, and Lluis Masanes. From Bell’s theorem to secure quantum keydistribution. Phys. Rev. Lett., 97:120405, Sep 2006.
[108] Antonio Acin, Nicolas Brunner, Nicolas Gisin, Serge Massar, Stefano Pironio, and ValerioScarani. Device-independent security of quantum cryptography against collective attacks.Phys. Rev. Lett., 98:230501, Jun 2007.
[109] Lluis Masanes, Stefano Pironio, and Antonio Acin. Secure device-independent quantum keydistribution with causally independent measurement devices. Nat. Commun., 2:238, 2011.
[110] Roger Colbeck and Adrian Kent. Private randomness expansion with untrusted devices. J.Phys. A: Math. Theor., 44:095305, 2011.
[111] Roger Colbeck and Renato Renner. Free randomness can be amplified. Nature Physics,8:450453, 2012.
[112] Rodrigo Gallego, Lluis Masanes, Gonzalo Torre, Chirag Dhara, Leandro Aolita, and AntonioAcin. Full randomness from arbitrarily deterministic events. arXiv:1210.6514.
[113] G. Weihs, T. Jennewein, C. Simon, H. Weinfurter, and A. Zeilinger. Violation of Bell’sinequality under strict Einstein locality conditions. Phys. Rev. Lett., 81:5039–5043, 1998.
[114] A. Aspect, J. Dalibard, and G. Roger. Experimental test of Bell’s inequalities using time-varying analyzers. Phys. Rev. Lett., 49:1804–1807, 1982.
[115] Thomas Scheidl, Rupert Ursin, Johannes Kofler, Sven Ramelow, Xiao-Song Ma, ThomasHerbst, Lothar Ratschbacher, Alessandro Fedrizzi, Nathan K. Langford, Thomas Jennewein,and Anton Zeilinger. Violation of local realism with freedom of choice. Proc. Natl. Acad.Sci., 107:19708, 2010.
[116] Y. H. Shih and C. O. Alley. New type of Einstein-Podolsky-Rosen-Bohm experiment usingpairs of light quanta produced by optical parametric down conversion. Phys. Rev. Lett.,61:2921–2924, 1988.
[117] Z. Y. Ou and L. Mandel. Violation of Bell’s inequality and classical probability in a two-photon correlation experiment. Phys. Rev. Lett., 61:50–53, 1988.
107
[118] A. Garg and N. D. Mermin. Detector inefficiencies in the Einstein-Podolsky-Rosen experi-ment. Phys. Rev. D, 35:3831–3835, 1987.
[119] Jan-Ake Larsson. Necessary and sufficient detector-efficiency conditions for the Greenberger-Horne-Zeilinger paradox. Phys. Rev. A, 57:R3145–R3149, May 1998.
[120] Jan-Ake Larsson and Jason Semitecolos. Strict detector-efficiency bounds for n-site Clauser-Horne inequalities. Phys. Rev. A, 63:022117, Jan 2001.
[121] Adan Cabello, David Rodriguez, and Ignacio Villanueva. Necessary and sufficient detectionefficiency for the Mermin inequalities. Phys. Rev. Lett., 101:120402, Sep 2008.
[122] M. A. Rowe, D. Kielpinski, V. Meyer, C. A. Sackett, W. M. Itano, C. Monroe, and D. J.Wineland. Experimental violation of a Bell’s inequality with efficient detection. Nature,409:791–794, 2001.
[123] D. N. Matsukevich, P. Maunz, D. L. Moehring, S. Olmschenk, and C. Monroe. Bell inequalityviolation with two remote atomic qubits. Phys. Rev. Lett., 100:150404, Apr 2008.
[124] Julian Hofmann, Michael Krug, Norbert Ortegel, Lea Gerard, Markus Weber, WenjaminRosenfeld, and Harald Weinfurter. Heralded entanglement between widely separated atoms.Science, 337:72–75, 2012.
[125] R.D. Gill, G. Weihs, A. Zeilinger, and M. Zukowski. No time loophole in Bell’s theorem: thehess-philipp model is non-local. PNAS, 99:14632, 2002.
[126] Fabian Steinlechner, Pavel Trojek, Marc Jofre, Henning Weier, Daniel Perez, Thomas Jen-newein, Rupert Ursin, John Rarity, Morgan W. Mitchell, Juan P. Torres, Harald Weinfurter,and Valerio Pruneri. A high-brightness source of polarization-entangled photons optimizedfor applications in free space. Opt. Exp., 20:9640, 2012.
[127] Onur Kuzucu and Franco N. C. Wong. Pulsed sagnac source of narrow-band polarization-entangled photons. Phys. Rev. A, 77:032314, Mar 2008.
[128] A. E. Lita, A. J. Miller, and S. W. Nam. Counting near-infrared single-photons with 95%efficiency. Opt. Express, 16:3032–3040, 2008.
[129] A. E. Lita, B. Calkins, L. A. Pellochoud, A. J. Miller, and S. W. Nam. High-efficiencyphoton-number-resolving detectors based on Hafnium transition-edge sensors. AIP Conf.Proc., 1185:351, 2009.
[130] A. Lamas-Linares, B. Calkins, N. A. Tomlin, T. Gerrits, A. E. Lita, J. Beyer, R. P. Mirin,and S. W. Nam. Nanosecond-scale timing jitter in transition edge sensors at telecom andvisible wavelengths. ArXiv:1209.5721.
[131] F. Marsili, V. B. Verma, J. A. Stern, S. Harrington, A. E. Lita, T. Gerrits, I. Vayshenker,B. Baek, M. D. Shaw, R. P. Mirin, and S. W. Nam. Detecting single infrared photons with93% system efficiency. ArXiv:1209.5774.
[132] Marissa Giustina, Alexandra Mech, Sven Ramelow, Bernhard Wittmann, Johannes Kofler,Jorn Beyer, Adriana Lita, Brice Calkins, Thomas Gerrits, Sae Woo Nam, Rupert Ursin, andAnton Zeilinger. Bell violation using entangled photons without the fair-sampling assumption.Nature, 497:227–230, 2013.
108
[133] F. Henkel, M. Krug, J. Hofmann, W. Rosenfeld, M. Weber, and H. Weinfurter. Highly efficientstate-selective submicrosecond photoionization detection of single atoms. Phys. Rev. Lett.,105:253001, Dec 2010.
[134] B. B. Blinov, D. L. Moehring, L.-M. Duan, and C. Monroe. Observation of entanglementbetween a single trapped atom and a single photon. Nature, 428:153, 2004.
[135] Jurgen Volz, Markus Weber, Daniel Schlenk, Wenjamin Rosenfeld, Johannes Vrana, KarenSaucke, Christian Kurtsiefer, and Harald Weinfurter. Observation of entanglement of a singlephoton with a trapped atom. Phys. Rev. Lett., 96:030404, Jan 2006.
[136] Christoph Simon and William T. M. Irvine. Robust long-distance entanglement and aloophole-free Bell test with ions and photons. Phys. Rev. Lett., 91:110405, Sep 2003.
[137] N. Sangouard, J.-D. Bancal, N. Gisin, W. Rosenfeld, P. Sekatski, M. Weber, and H. Wein-furter. Loophole-free Bell test with one atom and less than one photon on average. Phys.Rev. A, 84:052122, Nov 2011.
[138] Hyunchul Nha and H. J. Carmichael. Proposed test of quantum nonlocality for continuousvariables. Phys. Rev. Lett., 93:020401, Jul 2004.
[139] R. Garcia-Patron, J. Fiurasek, N. J. Cerf, J. Wenger, R. Tualle-Brouri, and Ph. Grangier.Proposal for a loophole-free Bell test using homodyne detection. Phys. Rev. Lett., 93:130409,Sep 2004.
[140] Daniel Cavalcanti, Nicolas Brunner, Paul Skrzypczyk, Alejo Salles, and Valerio Scarani. Largeviolation of Bell inequalities using both particle and wave measurements. Phys. Rev. A,84:022105, Aug 2011.
[141] M.T. Quintino, M. Araujo, D. Cavalcanti, M.F. Santos, and M.T. Cunha. Maximal violationsand efficiency requirements for Bell tests with photodetection and homodyne measurements.Journal of Physics A: Mathematical and Theoretical, 45(21):215308, 2012.
[142] Mateus Araujo, Marco Tulio Quintino, Daniel Cavalcanti, Marcelo Fran ca Santos, Adan Ca-bello, and Marcelo Terra Cunha. Tests of Bell inequality with arbitrarily low photodetectionefficiency and homodyne measurements. Phys. Rev. A, 86:030101, Sep 2012.
[143] D. L. Moehring, M. J. Madsen, B. B. Blinov, and C. Monroe. Experimental Bell inequalityviolation with an atom and a photon. Phys. Rev. Lett., 93:090410, Aug 2004.
[144] W. Tittel, J. Brendel, N. Gisin, and H. Zbinden. Long-distance Bell-type tests using energy-time entangled photons. Phys. Rev. A, 59:4150–4163, 1999.
[145] T. E. Kiess, Y. H. Shih, A. V. Sergienko, and C. O. Alley. Einstein-Podolsky-Rosen-Bohmexperiment using pairs of light quanta produced by type-II parametric down-conversion. Phys.Rev. Lett., 71:3893–3897, 1993.
[146] B. Lounis and M. Orrit. Single-photon sources. Rep. Prog. Phys., 68:1129–1179, 2005.
[147] M. Oxborrow and A. G. Sinclair. Single-photon sources. Contemp. Phys., 46:173–206, 2005.
[148] P. G. Kwiat, P. H. Eberhard, A. M. Steinberg, and R. Y. Chiao. Proposal for a loophole-freeBell inequality experiment. Phys. Rev. A, 49:3209–3220, 1994.
109
[149] L. De Caro and A. Garuccio. Reliability of Bell-inequality measurements using polarizationcorrelations in parametric-down-conversion photon sources. Phys. Rev. A, 50:R2803–R2805,1994.
[150] S. Popescu, L. Hardy, and M. Zukowski. Revisiting Bell’s theorem for a class of down-conversion experiments. Phys. Rev. A, 56:R4353–R4356, 1997.
[151] Jun Shao. Mathematical Statistics. Springer, New York, 2nd edition, 2003.
[152] Y. Vardi and D. Lee. From image deblurring to optimal investments: Maximum likelihoodsolutions for positive linear inverse problems. J. Royal Stat. Soc. B, 55:569–612, 1993.
[153] A. G. White, D. F. V. James, P. H. Eberhard, and P. G. Kwiat. Nonmaximally entangledstates: Production, characterization, and utilization. Phys. Rev. Lett., 83:3103–3107, 1999.
[154] G. Brida, M. Genovese, C. Novero, and E. Predazzi. New experimental test of Bell inequalitiesby the use of a non-maximally entangled photon state. Phys. Lett. A, 268:12–16, 2000.
[155] V. Scarani and N. Gisin. Spectral decomposition of Bell’s operators for qubits. J. Phys. A:Math. Gen., 34:6043–6053, 2001.
[156] W. Wasilewski, A. I. Lvovsky, K. Banaszek, and C. Radzewicz. Pulsed squeezed light: Si-multaneous squeezing of multiple modes. Phys. Rev. A, 73:063819, 2006.
[157] A. I. Lvovsky, W. Wasilewski, and K. Banaszek. Decomposing a pulsed optical parametricamplifer into independent squeezers. J. Mod. Optics, 54:721–733, 2007.
[158] C.-E. Bardyn, T. C. H. Liew, S. Massar, M. McKague, and V. Scarani. Device-independentstate estimation based on Bell’s inequalities. Phys. Rev. A, 80:062327, 2009.
[159] Rafael Rabelo, Melvyn Ho, Daniel Cavalcanti, Nicolas Brunner, and Valerio Scarani. Device-independent certification of entangled measurements. Phys. Rev. Lett., 107:050502, 2011.
[160] W. Hoeffding. Probability inequalities for sums of bounded random variables. Journal of theAmerican Statistical Association, 58:13, 1963.
[161] K. Azuma. Weighted sums of certain dependent random variables. TohoKu MathematicalJournal, 19:357, 1967.
[162] C. McDiarmid. On the method of bounded differences. In Surveys in Combinatorics, volume141 of London Math. Soc. Lecture Notes, pages 148–188. Cambridge Univ. Press, 1989.
[163] Rick Durrett. Probability: Theory and Examples. Cambridge, 2010. Also see the optionalstopping theorem at http://en.wikipedia.org/wiki/Optional_stopping_theorem.
[164] A. Acin, N. Gisin, and L. Masanes. From Bell’s theorem to secure quantum key distribution.Phys. Rev. Lett., 97:120405, 2006.
[165] E. S. Ristad. A natural law of succession. 1995. arXiv:cmp-lg/9508012.
[166] Robin Blume-Kohout. Optimal, reliable estimation of quantum states. New J. Phys.,12:043034, 2010.
110
[167] T. M. Cover. An algorithm for maximizing expected log investment return. IEEE Transactionson Information Theory, 30:369, 1984.
[168] S. Kullback and R. A. Leibler. On information and sufficiency. Ann. Math. Statist., 22:79,1951.
[169] Nicolas Brunner, Stefano Pironio, Antonio Acin, Nicolas Gisin, Andre Allan Methot, andValerio Scarani. Testing the dimension of Hilbert spaces. Phys. Rev. Lett., 100:210503, May2008.
[170] Rodrigo Gallego, Nicolas Brunner, Christopher Hadley, and Antonio Acın. Device-independent tests of classical and quantum dimensions. Phys. Rev. Lett., 105:230501, Nov2010.
[171] O. Guhne, P. Hyllus, D. Brub, A. Ekert, M. Lewenstein, C. Macchiavello, and A. Sanpera.Experimental detection of entanglement via witness operators and local measurements. J.Mod. Opt., 50:1079, 2003.
[172] O. Guhne and P. Hyllus. Investigating three qubit entanglement with local measurements.Int. J. Theor. Phys., 42:1001, 2003.
Appendix A
User guide of the local realism analysis engine
A.1 Overview
The purpose of the Local Realism Analysis Engine (LRE) is to perform online and offline
analysis of measurements obtained with randomly chosen measurement settings at two well-
separated locations. The LRE determines the current and overall violation of local realism (LR).
If LR is violated, it provides a measure of this violation in terms of the log-p-value for violation.
It is asymptotically optimal for independent measurements of identical states. If LR is not vio-
lated, it can provide feedback on the degree of nonviolation. This is accomplished by comparing
the observed measurements to those that a reference or goal state would produce. The user must
specify the goal state’s probability distribution for measurement settings and outcomes that should
be expected if an experiment is set up as intended. The motivation and theory for the LRE are
described in Chapter 5.
This document gives a user-level specification of the LRE. The implementation of the LRE
provided with this guide can be used with Octave or Matlab. First we describe the state of the
LRE. Then we define the functions that are used to initialize, modify and update the LRE state.
Next we describe functions useful for supporting the LRE, such as functions to compute refer-
ence measurement distributions and calculate anticipated LR violations for standard experimental
situations. Finally we give functions that can be used to simulate an experiment with the LRE.
Examples of LRE usage are given for reference. For a quick start, one can go directly to Sec. A.5.
(The code is available online with the published paper at http://arxiv.org/abs/1108.2468.)
A.2 LRE state
The LRE state is characterized by an experimental configuration, analysis and display vari-
ables, and the saved statistics of experimental data so far. Once the variables are initialized, the
engine updates the state each time a new block of data is received. The state variables are main-
tained internally. From a user’s perspective, they serve to formally define the state and behavior of
the LRE and are therefore needed for fully understanding LRE dynamics. The state variables are
controlled through functions and are not intended to be accessed or changed directly. We define
them below as part of the LRE specification. In the implementation provided with this guide, they
are accessible as global variables. Future implementations may choose to hide them, so the normal
user should not rely on global access and use only the interface functions specified later.
A.2.1 Experimental configuration
For the purposes of a test of LR, an experimental configuration is characterized by the number
of measurement settings, their probability distribution, and the number of possible outcomes for a
setting. These configuration variables are set when the LRE is initialized. Note that each block of
data provided to the LRE must conform to the values of these parameters. The variables used by
the LRE are:
n_settings_a, n_settings_b: The number of measurement settings available to Alice and
Bob, respectively.
n_outcomes_a, n_outcomes_b: The number of possible measurement outcomes for each
setting of Alice and Bob, respectively.
p_settings: The joint probability distribution with which Alice and Bob choose their
settings. This choice is assumed to be independent of the state being measured.
A.2.2 Analysis and display variables
For the purposes of analysis, a “data point” is the setting and outcome combination that
is used and observed in one trial of an experiment. A “data block” consists of a number of data
points.
The main task of the LRE on receiving a new block of data is to update the log-p-values
test_lp_total, test_lp and goal_lp. (We define the log-p-value to be -log2(p_n), where p_n is the
p-value upper bound as computed after n trials according to our algorithm.) While test_lp_total
is the overall log-p-value reported after a completed experiment, the log-p-values test_lp and
goal_lp help for monitoring the experimental progress. A positive value of test_lp is sufficient
for an LR violation. If the experiment is stable, it is also asymptotically necessary. Negative
values of test_lp are not informative. The value of goal_lp reflects how well we are doing in
approaching a specified goal distribution for settings and outcomes and can be used to tweak an
experiment when there is no violation of LR. (When there is no violation, test_lp is near zero and
insensitive to tweaks.) A goal of tweaking the experiment can be to increase goal_lp. Its use is
explained in the example in Sec. A.5.2.
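The log-p-value bookkeeping described above can be illustrated with a short Python sketch (the LRE itself is Octave/Matlab; the function name `log_p_value` is hypothetical, and the `truncate` flag is included only for illustration). Each trial contributes log2 of a probability ratio R_k, and the log-p-value is the running sum; as noted below, the LRE does not truncate the underlying p-value bound at 1, so the sum may go negative.

```python
import math

def log_p_value(ratios, truncate=False):
    """Log-p-value -log2(p_n), where the p-value upper bound p_n is the
    inverse of the product of the per-trial probability ratios R_k.
    With truncate=True the p-value bound is capped at 1, so the
    log-p-value cannot go negative; the LRE leaves it untruncated so
    that negative values can guide experimental tweaking."""
    lp = sum(math.log2(r) for r in ratios)
    return max(lp, 0.0) if truncate else lp
```

For example, two trials with ratio 2 give a log-p-value of 2, i.e. a p-value bound of 1/4.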
In order for these log-p-values to be useful during an experiment, they need to adapt to
changes in the experimental state. The amount of data that plays a role is controlled via data “half-
lives”. Roughly, these half-lives control how many of the last data points are used in calculating
and updating the log-p-values. We use half-lives instead of data windows so that we can avoid
storing old data. As a result, the contribution of data points decays exponentially as more data is
acquired.
The relevant part of the LRE state is described by the following variables.
goal_frequencies: This describes the frequencies of settings and outcomes for a goal
state. To be useful, it should be possible to realistically approach them in an experiment,
and they should violate LR. We provide functions to compute them for some typical
experimental configurations. (Regarding negative log-p-values: we allow them for the
purpose of tweaking an experiment. That is, we set the p-value upper bound equal to
(∏_{k=1}^{n} R_{k-1}(x_k))^{-1} and do not truncate it at 1; see Chapter 5.) It is not a
good idea (and probably not realistic) to have any of
the goal state’s frequencies be zero. If the corresponding settings and outcomes occur in
an experiment, then goal_lp can become −∞, and, without intervention, will stay there.
When goal_frequencies is set or updated, the LRE computes the following:
goal_sd: The statistical strength of the goal frequencies for rejection of LR. If an
experimental state achieves the goal frequencies, then goal_lp should approach the
value of goal_sd multiplied by the effective number of trials according to
goal_lp_weight_snumber (defined below).
goal_ratios: The computation of goal_lp requires probability ratios as described
in Chapter 5. This variable contains the needed ratios.
data_half_life: The LRE maintains cumulative setting and outcome frequencies. Each
data point contributes to the cumulative frequencies with a weight that decays over time.
The idea is that a data point’s weight should be 1/2 of the most recent point’s weight after
data_half_life more points have been acquired. The calculation takes into account that
there are only finitely many data points so far but ensures that the ratio of weights for
data points in successive blocks is the one expected in the asymptotic case. For simplicity,
data points in the same block get the same weight. The details of the calculation are given
in Sec. A.6. To help with this calculation the LRE computes and updates the following:
data_weight: The weight used for the most recent data point contributing to the
stored setting and outcome frequencies, test_frequencies.
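The asymptotic half-life weighting can be sketched in Python (illustration only; `decayed_weights` and `effective_number` are hypothetical names, the finite-data boundary correction of Sec. A.6 is omitted, and taking the effective number of points to be the sum of relative weights is one natural reading of weight_snumber, not a quote from the text):

```python
def decayed_weights(n, half_life):
    """Relative weights of n data points, most recent last with weight 1;
    a point's weight is halved after half_life newer points arrive."""
    decay = 0.5 ** (1.0 / half_life)
    return [decay ** (n - 1 - i) for i in range(n)]

def effective_number(weights):
    """Effective number of contributing data points (cf. weight_snumber):
    here taken as the sum of the relative weights."""
    return sum(weights)
```

With half_life = 1, three points get relative weights 1/4, 1/2, and 1; with an infinite half-life every point would keep weight 1, recovering an ordinary cumulative count.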
lp_half_life: The effective half-life for computing test_lp and goal_lp can be set
independently of data_half_life. The interpretation of lp_half_life differs from that
of data_half_life. This is because the log-p-values are weighted sums of logarithms of
probability ratios such as goal_ratios, but not averages. In order to interpret the log-p-
values as intended, the weights have a maximum value of 1; see Sec. A.6 for an explanation.
To calculate updated weights, the LRE computes and updates the following:
test_lp_weight_snumber, goal_lp_weight_snumber: Effective numbers of data points
contributing to the computation of test_lp and goal_lp, respectively. These num-
bers are computed based on lp_half_life and the numbers of data points that have
contributed to the respective log-p-value calculations. The log-p-values test_lp and
goal_lp are updated accordingly.
test_lp_data_weight, goal_lp_data_weight: The weights of the most recent data
points according to lp_half_life for updating test_lp_weight_snumber and
goal_lp_weight_snumber, respectively.
test_lp_tolerance: The log-p-values for LR rejection based on the data (e.g., test_lp)
may be underestimated by a maximum of test_lp_tolerance per data point. The under-
estimate accounts for the possibility that the computation of optimal local realistic (LR)
models may not reach the exact optima after the stopping criteria are satisfied. The value
of test_lp_tolerance must be significantly smaller than the achieved log-p-value per data
point violation in an experiment. Thus, if 10^3 trials are needed before a violation is
apparent, test_lp_tolerance should be much less than 10^-3. Otherwise log-p-values may
accumulate negative values. The default value of test_lp_tolerance is 10^-6. Setting it
significantly higher may speed up updates. See Sec. 5.3.2 of Chapter 5 for the details.
A.2.3 Data dependent variables
test_frequencies: The cumulative, weighted frequencies of experimental settings and
outcomes.
data_weight: The weight of the most recent data point contributing to test_frequencies.
num_exps: The total number of trials since the LRE was last reset.
weight_snumber: The effective number of data points contributing to test_frequencies.
pred_ratios: The prediction-based probability ratios that will be used to update test_lp
and test_lp_total when the next block of data arrives. They are determined as explained
in Chapter 5. The estimated probability distribution in the numerator is derived from
test_frequencies by a maximum likelihood method to enforce the setting-distribution
and (if desired) no-signaling constraints. Then, to avoid the possibility of getting stuck
with log-p-values of −∞, the distribution is modified by mixing in the setting-conditional
uniform distribution, with weight 1/(1+weight_snumber).
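The mixing step can be sketched as follows (Python for illustration; the function name is hypothetical). For one fixed setting pair, the estimated outcome distribution is blended with the uniform one using mixing weight 1/(1+weight_snumber), which keeps every outcome probability strictly positive:

```python
def mix_with_uniform(p_outcomes, weight_snumber):
    """Mix an estimated outcome distribution (for one setting pair) with
    the setting-conditional uniform distribution, using mixing weight
    1/(1 + weight_snumber). No probability can then be zero, so the
    probability ratios cannot drive a log-p-value to -inf."""
    eps = 1.0 / (1.0 + weight_snumber)
    k = len(p_outcomes)
    return [(1.0 - eps) * p + eps / k for p in p_outcomes]
```

As more data accumulate, weight_snumber grows and the uniform admixture fades, so the mixed distribution converges to the maximum likelihood estimate.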
lr_use_cm: This option variable controls whether to enforce no-signaling (also called
“consistent marginals”) constraints when computing pred_ratios. Its default value
is 0 (false). Turning this option on can slow down updates but helps to reduce the
log-p-value offset caused by the learning transient, when the number of data points is
still small.
test_sd: The statistical strength of rejection of LR of the estimated probability distribu-
tion used to generate pred_ratios.
test_lp, goal_lp: The current log-p-values as described above. These are weighted ac-
cording to lp_half_life.
test_lp_weight_snumber, goal_lp_weight_snumber: Effective numbers of data points
contributing to the computation of test_lp and goal_lp, respectively.
test_lp_total: The total log-p-value for LR rejection since the last reset. Its value is the
one that test_lp would have if lp_half_life were infinity. This log-p-value should be
reported as the overall log-p-value when analyzing data from a completed experiment.
test_lp_v, goal_lp_v and test_lp_total_v: Estimated variances of test_lp, goal_lp,
and test_lp_total, respectively. When reporting test_lp_total, one can give the square root of
test_lp_total_v as its standard error when quoting it as a quantitative measure of success
of an experiment. The other two variances can be used to assess the expected fluctuations
in the corresponding log-p-values when tweaking an experiment.
test_sd2, goal_sd2: The predicted variances of the log-p-value increments at the
next trial, calculated based on pred_ratios and goal_ratios respectively. These
are internal variables needed to update the estimated variances.
A.3 LRE interface
The specification of the LRE interface functions given here includes explicit instructions for
calculating nominally inaccessible state variables and other internal parameters. The calculations
need not be performed exactly as described here, as long as the specified behavior is preserved.
> function lr_init(n_settings_a, n_settings_b, n_outcomes_a, n_outcomes_b,
p_settings)
Initialize the LRE. The arguments are
n_settings_a, n_settings_b: Positive integers giving the number of measurement
settings of Alice and Bob, respectively.
n_outcomes_a, n_outcomes_b: Positive integers giving the number of outcomes of
each setting of Alice and Bob, respectively.
p_settings: A column vector of dimension n_settings_a * n_settings_b giving
the joint probability distribution for the settings used by Alice and Bob. The prob-
ability that Alice and Bob measure the k’th and l’th setting respectively is given by
the k+(l-1)*n_settings_a’th entry of the vector. This argument is optional and
defaults to the uniform distribution.
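The flat indexing convention for p_settings (Alice's index varies fastest) and the uniform default can be sketched in Python (hypothetical helper names; the LRE itself is Octave/Matlab):

```python
def setting_index(k, l, n_settings_a):
    """1-based position in p_settings of the entry for Alice's k'th and
    Bob's l'th setting, following the k + (l-1)*n_settings_a convention."""
    return k + (l - 1) * n_settings_a

def uniform_p_settings(n_settings_a, n_settings_b):
    """Default joint setting distribution: uniform over all setting pairs."""
    n = n_settings_a * n_settings_b
    return [1.0 / n] * n
```

For two settings per side, the pair (k=2, l=3) would sit at position 2 + 2*2 = 6, and the default distribution assigns probability 1/4 to each of the four setting pairs.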
After lr_init is called, the state of the LRE satisfies the following:
• The experimental configuration variables are set according to the arguments.
• goal_frequencies, goal_sd, goal_sd2, test_frequencies, test_sd, and test_sd2
are set to an undefined value (the empty vector []).
• data_half_life and lp_half_life are set to Inf.
• num_exps, weight_snumber, goal_lp_weight_snumber, test_lp_weight_snumber,
data_weight, test_lp_data_weight, goal_lp_data_weight, goal_lp, test_lp,
test_lp_total, goal_lp_v, test_lp_v, and test_lp_total_v are set to 0.
• The ratios in goal_ratios and pred_ratios are set to 1.
> function lr_set_tolerance(tolerance)
Set the variable test_lp_tolerance to tolerance. This can be changed any time. It
should be much less than the predicted violation test_sd, when this is clearly positive.
> function lr_set_use_cm(t_or_f)
Set the variable lr_use_cm to 1 (true) or 0 (false), which determines whether no-signaling
constraints are used in optimizations. The argument is boolean.
> function lr_prime(frequencies_or_block, num)
Prime the LRE with initial frequencies. The arguments are
frequencies_or_block: The frequencies of settings and outcomes of a block of data,
or the block of data itself, as specified below.
num: If the first argument gives the frequencies, then this argument gives the number
of data points that contribute to the frequencies. The argument num is optional. If it
is given, the first argument must be a frequency array, not a block of data.
A block of data consists of a num by 4 matrix. Each row contains a specific data point,
specified by four non-negative integers in the following order: Alice’s setting, Alice’s mea-
surement outcome, Bob’s setting, Bob’s measurement outcome. The integers must be in
the appropriate range. For example, Alice’s settings must be between 1 and n_settings_a,
and her outcomes must be between 0 and n_outcomes_a-1.
Frequencies are entered as an array of dimension n_settings_a * n_settings_b by
n_outcomes_a * n_outcomes_b. The entry indexed by (k+(l-1)*n_settings_a,
1+r+s*n_outcomes_a) is the frequency with which Alice’s (Bob’s) outcome is r (s) for the
setting indexed by k (l). The frequencies are normalized so that sum(sum(frequencies)) = 1.
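The two input formats are related as follows. This sketch (ours, not part of the LRE) accumulates a block of data into a normalized frequency array using the indexing convention above:

```matlab
% Sketch: convert a block of data into the frequency-array format.
n_settings_a = 2; n_settings_b = 2; n_outcomes_a = 3; n_outcomes_b = 3;
block = [1 0 2 1;                    % each row: Alice's setting, Alice's
         2 1 1 0;                    % outcome, Bob's setting, Bob's outcome
         1 2 2 2];
freqs = zeros(n_settings_a * n_settings_b, n_outcomes_a * n_outcomes_b);
for i = 1:size(block, 1)
  k = block(i, 1); r = block(i, 2); l = block(i, 3); s = block(i, 4);
  freqs(k + (l-1)*n_settings_a, 1 + r + s*n_outcomes_a) = ...
      freqs(k + (l-1)*n_settings_a, 1 + r + s*n_outcomes_a) + 1;
end
freqs = freqs / size(block, 1);      % normalize: sum(sum(freqs)) == 1
```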
The function lr_prime assumes that the LRE has been initialized. It performs the following
actions:
• The LRE state is reset via lr_reset() (specified below).
• Let bf be the setting and outcome frequencies computed (if necessary) from the input,
and let n be the number of data points that contribute to these frequencies.
• Perform the following assignments:
num_exps = n;
data_weight = 1/n;
weight_snumber = n;
test_frequencies = bf;
• Use test_frequencies to estimate the probabilities fbf of future settings and out-
comes. Our implementation is equivalent to computing fbf in three steps. The first is
to modify bf so that it has the correct setting distribution. The modification is equiv-
alent to a maximum likelihood estimate with the setting distribution as a constraint.
If lr_use_cm is 1 (true), the next step is to obtain the maximum likelihood estimate
subject to no-signaling constraints. The last step adjusts the estimate by mixing in the
setting-conditional uniform distribution with weight 1/(1+weight_snumber). Future
implementations may perform this estimate differently.
• Let lf be the optimal LR frequencies witnessing the statistical strength of fbf. Com-
pute the statistical strength test_sd of fbf for rejection of LR according to test_sd =
sum(sum(fbf .* log2(fbf./lf))). The predicted variance is computed as test_sd2
= sum(sum(fbf .* log2(fbf./lf).^2)) - test_sd^2.
• Set pred_ratios = fbf./lf. We multiply pred_ratios by a factor slightly smaller
than 1 to account for not having found the exact optimal LR frequencies due to the
optimization stopping criterion test_lp_tolerance, see Sec. 5.3.2 of Chapter 5 for
details.
• If goal_frequencies is defined, compute the initial values of goal_lp and goal_lp_v,
and set goal_lp_data_weight = 1/n and goal_lp_weight_snumber = n. The initial
values of goal_lp and goal_lp_v are given by sum(sum(n * bf .* log2(goal_ratios)))
and n * goal_sd2, respectively. See function lr_set_goal below for how goal_ratios
and the initial goal_sd2 are calculated. Finally, update goal_sd2 by setting goal_sd2
= sum(sum(fbf .* log2(goal_ratios).^2 )) - sum(sum(fbf .* log2(goal_ratios)))^2.
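For a toy pair of distributions, the statistical-strength computation described above amounts to the following sketch (the values of fbf and lf are invented for illustration, not the result of an actual optimization):

```matlab
% Sketch: statistical strength and predicted variance from estimated
% frequencies fbf and optimal LR frequencies lf (toy values).
fbf = [0.30 0.20; 0.20 0.30];
lf  = [0.25 0.25; 0.25 0.25];
r = log2(fbf ./ lf);                          % log-probability ratios
test_sd  = sum(sum(fbf .* r));                % statistical strength
test_sd2 = sum(sum(fbf .* r.^2)) - test_sd^2; % predicted variance
```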
> function lr_update(frequencies_or_block, num)
Update the LRE state according to new data. The arguments are as for the function
lr_prime. If test_frequencies is undefined, lr_prime(frequencies_or_block, num)
is called. Otherwise the following actions are taken:
• Let bf be the setting and outcome frequencies computed (if necessary) from the input,
and let n be the number of data points that contribute to these frequencies.
• Compute the weight ndw of the contribution of each new data point to test_frequencies.
After computing ndw, test_frequencies is updated according to
test_frequencies = (1 - n*ndw) * test_frequencies + n*ndw * bf;
To compute ndw, solve the equation
(1 - n*ndw) * data_weight / ndw = 2^(-n/data_half_life);
The formula is explained in Sec. A.6. For constant block size n and no change in
data_half_life, this ensures that the weight of the contribution of each data point
in the (data_half_life/n)’th last block to test_frequencies is 1/2 of the weight
of the most recent data point.
• Compute the new effective number of data points ndwsn as follows:
ndwsn = ((1 - n*ndw)^2 / weight_snumber + n*ndw^2)^-1;
See Sec. A.6 for an explanation.
• To update test_lp do the following:
+ Calculate the log-p-value increment lpi with respect to pred_ratios for the
block of data:
lpi = sum(sum(n * bf .* log2(pred_ratios)));
test_lp_total = test_lp_total + lpi;
test_lp_total_v = test_lp_total_v + test_sd2 * n;
+ Compute tlp_ndw and tlp_ndwsn by following the steps used to compute ndw and
ndwsn, using test_lp_data_weight, test_lp_weight_snumber, and lp_half_life
instead of data_weight, weight_snumber, and data_half_life, respectively.
+ Obtain tlpw by solving
tlp_ndwsn = (1 - tlpw) * test_lp_weight_snumber + n;
(If test_lp_weight_snumber is zero, set tlpw = 1.)
+ Set test_lp = (1 - tlpw) * test_lp + lpi. This ensures that the weight
of each data-point’s contribution to the log-p-value is at most 1, necessary for
interpreting test_lp as a valid log-p-value, see Sec. A.6. It also ensures that the
sum of the weights is tlp_ndwsn, so that tlp_ndwsn is the effective number of
contributing data points consistent with the value of lp_half_life.
+ Set test_lp_v = (1 - tlpw)^2 * test_lp_v + test_sd2 * n.
• Update test_sd, test_sd2 and pred_ratios as explained in the description of the
function lr_prime.
• If goal_frequencies is defined, first calculate the log-p-value increment lpi_g from
goal_ratios, using the method for calculating lpi from pred_ratios. Then up-
date goal_lp and goal_lp_v by following the steps used to update test_lp and
test_lp_v, respectively. After that, update goal_sd2 as in the function lr_prime.
• Complete the update by setting
num_exps = num_exps + n;
data_weight = ndw; weight_snumber = ndwsn;
test_lp_data_weight = tlp_ndw;
test_lp_weight_snumber = tlp_ndwsn;
If goal_frequencies is defined, update goal_lp_data_weight and
goal_lp_weight_snumber similarly.
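The weight equations above have simple closed forms; the following sketch (variable names as in the specification, but the code and the example prior state are ours) performs one update step:

```matlab
% Sketch: one weight-update step for a block of n new data points.
n = 200; data_half_life = Inf;
data_weight = 1/1000; weight_snumber = 1000;   % example prior state
% Solve (1 - n*ndw) * data_weight / ndw = 2^(-n/data_half_life):
c = 2^(-n/data_half_life);
ndw = data_weight / (n*data_weight + c);
% New effective number of data points:
ndwsn = 1 / ((1 - n*ndw)^2 / weight_snumber + n*ndw^2);
% With data_half_life = Inf, this gives ndw = 1/1200 and ndwsn = 1200,
% i.e. no decay: all 1000 + 200 points contribute fully.
```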
> function lr_std_analysis(all_trials, verbose)
Run a full analysis of a complete data set using recommended parameters. The argument
all_trials has the same form as a block of data. verbose is a boolean variable 1 or
0 indicating whether to print progress information during the analysis. This argument
is optional and defaults to 1. The function returns the value of test_lp_total and the
square root of test_lp_total_v, which is the estimated standard error of test_lp_total
with respect to its mean for repetitions of the same experiment.
Let num be the number of trials as indicated by the number of rows in all_trials. Let
so_num be the number of possible setting and outcome combinations, n_settings_a *
n_settings_b * n_outcomes_a * n_outcomes_b. Our implementation of the function
lr_std_analysis does the following after resetting the LRE:
• Set lr_use_cm = 1 and test_lp_tolerance = min(1E-6, 1/(num*100)).
• Set block_size = ceil(max(num/1000, so_num * log(2*so_num))) as recommended
in Sec. 5.3.1 of Chapter 5.
• Apply the function lr_update to consecutive blocks of block_size rows of all_trials.
The last block may be smaller if block_size does not divide num.
• Return test_lp_total and the square root of test_lp_total_v.
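For the configuration used in the examples of Sec. A.5 (2 settings and 3 outcomes per side, so so_num = 36, and num = 100000 trials), the recommended parameters work out as follows:

```matlab
% Sketch: recommended parameters for lr_std_analysis.
num = 100000;                                 % number of trials
so_num = 2 * 2 * 3 * 3;                       % setting/outcome combinations
test_lp_tolerance = min(1e-6, 1/(num*100));   % here 1e-7
block_size = ceil(max(num/1000, so_num * log(2*so_num)));   % here 154
```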
> function lr_set_goal(frequencies)
(Re)set the variable goal_frequencies to the argument frequencies, and (re)compute
the value of the statistical strength goal_sd, the predicted variance goal_sd2 and the prob-
ability ratios goal_ratios. The computations of goal_sd, goal_sd2, and goal_ratios are
performed like those of test_sd, test_sd2, and pred_ratios, respectively (see function
lr_prime), but instead of the estimated probability distribution fbf the computations
use frequencies. Note that, if goal_frequencies is (re)set after priming the LRE, then
goal_sd2 is updated in the same way as in the function lr_prime. The log-p-value goal_lp
and the associated variables goal_lp_data_weight and goal_lp_weight_snumber are un-
changed. This means that the old value of goal_frequencies continues contributing to
goal_lp via the data added before this function was called. This contribution decays
according to the relevant half-life lp_half_life.
Note: If the function lr_prime is used as a way to reduce the learning transient by ini-
tializing pred_ratios with prior knowledge from theory or earlier experiments such as
tomography experiments, it is a good idea to delay setting the goal frequencies until after
priming. This makes sure that goal_lp remains zero until an experiment starts properly.
> function lr_set_data_half_life(half_life)
Set data_half_life = half_life. Contributions to frequencies from old and new data
decay at this rate from now on. Relative weights within contributions from old data are
unchanged.
> function lr_set_lp_half_life(half_life)
Set lp_half_life = half_life. Contributions to log-p-values from old and new data
decay at this rate from now on. Relative weights within contributions from old data are
unchanged.
> function lr_reset()
Reset the state of the LRE. Only the data-independent variables are kept. That is, the
state is the same as after
g = goal_frequencies; dh = data_half_life; lh = lp_half_life;
lr_init(n_settings_a, n_settings_b, n_outcomes_a, n_outcomes_b, ...
p_settings);
lr_set_goal(g); lr_set_data_half_life(dh); lr_set_lp_half_life(lh);
test_lp_tolerance and lr_use_cm are unchanged by the function lr_reset.
> function lr_lps()
Return goal_lp, test_lp, and test_lp_total.
> function lr_lp_vs()
Return goal_lp_v, test_lp_v, and test_lp_total_v.
> function lr_sds()
Return goal_sd and test_sd.
> function lr_snumbers()
Return weight_snumber, goal_lp_weight_snumber, and test_lp_weight_snumber.
A.4 LRE support functions
By setting the half-lives to infinity, the LRE can be used directly to analyze existing data
for violation of LR. If one can make a blind prediction of the frequencies of settings and outcomes,
then one can prime the engine with the predicted frequencies and get better log-p-values without
having to learn the setting and outcome frequencies from the initial blocks of data.
One of the applications of the LRE is to monitor an ongoing experiment, even if the data are
not currently in a region where LR is violated. For this we provide functions to compute realistic
goal frequencies associated with goal states and (possibly optimized) measurement settings. With
the goal frequencies, monitoring goal_lp may help to tweak an experiment toward an LR violation.
Once LR is violated, one can try to improve test_lp directly, without necessarily moving the
experiment to the hoped-for goal.
The LRE support functions enable computing goal frequencies and their statistical strengths
for rejecting LR, given experimental settings and noise parameters. They also make it possible
to generate simulated data to test and explore LRE functions. The support function arguments
include a state specification, state_spec; a noise specification (losses and visibility), noise_spec;
and a measurement-setting specification, settings_spec.
state_spec: State specification. Currently only balanced and unbalanced Bell states as
if prepared by idealized down-conversion can be specified. For this case, state_spec is a
two-component vector, specifying the bias θ and population p of the state
|ψ〉 = √(1 − p) |0〉_A|0〉_B + √p ( sin(θ)|H〉_A|H〉_B + cos(θ)|V〉_A|V〉_B ).   (A.1)
We assume that double and higher-order pair emission is negligible. Forms of state_spec
with more than two components are reserved for other types of states.
noise_spec: Noise specification. For the case of experiments involving the states in
Eq. (A.1) (as indicated by state_spec being a vector of length 2), there are up to three
parameters: losses ηa and ηb for Alice and Bob, and visibility v. If noise_spec is a real
number, it specifies the two identical losses, and visibility defaults to 1. If noise_spec is a
vector of length 2, the first entry is ηa = ηb, and the second is the visibility. If it is a vector
of length 3, it contains ηa, ηb, and v, in this order.
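A hypothetical helper (ours, not part of the LRE) that normalizes the three accepted forms of noise_spec into the triple (ηa, ηb, v):

```matlab
% Sketch: expand a noise_spec into [eta_a, eta_b, v].
function p = expand_noise_spec(noise_spec)
  switch numel(noise_spec)
    case 1                              % one number: eta_a = eta_b, v = 1
      p = [noise_spec, noise_spec, 1];
    case 2                              % [eta, v] with eta_a = eta_b
      p = [noise_spec(1), noise_spec(1), noise_spec(2)];
    case 3                              % [eta_a, eta_b, v]
      p = noise_spec(:).';
  end
end
```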
settings_spec: Description of Alice’s and Bob’s measurement configurations. For an
experimental test of LR, this is a cell array with the following components:
• settings_spec{1}: The numbers of settings available to Alice and Bob. If
length(settings_spec{1}) == 0, they default to 2. If length(settings_spec{1})
== 1, then both numbers of settings are given by settings_spec{1}. If
length(settings_spec{1}) == 2, then Alice’s and Bob’s numbers of settings are
the first and second entries, respectively.
• settings_spec{2}: The numbers of (setting-independent) measurement outcomes for
Alice and Bob. If length(settings_spec{2}) == 0, they default to 2. Otherwise,
they are treated like settings_spec{1}. For experiments involving the states in
Eq. (A.1), only the values 2 and 3 make sense. The measurement settings are specified
by observables with eigenvalues +1,−1 (see the description of settings_spec{4}).
The +1-eigenvalue outcomes are assigned to outcome 1, when entered in data for
the LRE. If settings_spec{2} == 2, we assume that there is only one detector,
and it “clicks” if the photon is found in the +1-eigenvalue eigenstate. Otherwise we
don’t detect the photon (it is lost, or it would have been found in the −1-eigenvalue
eigenstate), which is assigned to outcome 0. If settings_spec{2} == 3, we assume
that there are two detectors, one for each eigenvalue. The +1 and −1 eigenvalues are
assigned to outcomes 1 and 2, respectively, and no detection is assigned to 0.
• settings_spec{3}: The probability distribution of the measurement-setting choices.
If length(settings_spec{3}) == 0, it is assumed to be uniform. Otherwise it has
the same form as p_settings.
• settings_spec{4}: Alice’s measurement settings. This is an n_settings_a by 1 or 2
matrix. The k’th setting is given by its k’th row. It specifies angles for the Jones vector
of the polarization +1-eigenstate of the measurement operator mentioned above. If
there is only one parameter θ, the Jones vector is (cos(θ), sin(θ)), indicating polariza-
tion at angle θ from horizontal polarization. If there are two parameters θ and φ, the
Jones vector is (cos(θ), e^{iφ} sin(θ)). The vectors can be interpreted as photon polarization
states, with (0, 1), (1, 0), (1, 1)/√2, and (1 − i, 1 + i)/2 the vertically, horizontally,
diagonally, and right-circularly polarized states, respectively.
• settings_spec{5}: Bob’s measurement settings, in the same form as Alice’s but with
n_settings_b many rows.
The following functions are provided:
> function lr_config_freqs(state_spec, noise_spec, settings_spec)
Return the probabilities of settings and outcomes according to the given experimental
configuration. These probabilities are predicted by quantum theory. The returned value’s
data type is the same as that of goal_frequencies.
> function lr_config_analysis(state_spec, noise_spec, settings_spec)
Analyze the given experimental configuration for violation of LR. The function returns sd,
frequencies, ratios, p_settings, and lr_frequencies.
sd: The statistical strength of the configuration for rejecting LR.
frequencies: The quantum predicted frequencies of settings and outcomes. The data
type matches that of goal_frequencies.
ratios: The probability ratios for computing log-p-values. The data type matches
that of goal_ratios.
p_settings: The probability distribution of settings inferred from the arguments.
lr_frequencies: The frequencies of settings and outcomes according to the optimal
LR model. The data type is the same as that of goal_frequencies.
> function lr_config_optim(state_spec, noise_spec, settings_spec)
Optimize the measurement settings for a specified experimental configuration by maximiz-
ing the statistical strength of the LR violation. The optimization uses the set of given
measurement settings as an initial point. A local optimum is found. The function returns
opt_settings and opt_sd, where
opt_settings: The locally optimal settings found in the format matching the settings_spec
argument. Only setting parameters of the argument are changed.
opt_sd: The statistical strength for the LR violation of the optimal settings.
> function lr_test_simulation(state_spec, noise_spec, settings_spec, num)
Simulate num many trials according to the given experimental configuration. The function
returns a block of data data_block, in the correct form for use as the frequencies_or_block
argument in the function lr_prime.
A.5 LRE usage examples
Note: The implementation of the LRE provided with this guide is “research grade”. It is in-
consistently documented and should be considered unstable. It does not perform consistency checks
on user-provided inputs. The following observation may help: Inconsistencies in user-provided in-
put often result in array mismatch errors. If changes are made to the code, there is a minimal test
suite (see testsuite.m) to perform a few simple (but incomplete) specification tests.
In the examples, the commands are shown without a prompt and are intended to be invoked
in an Octave or Matlab shell. The LRE file prep_lre.m must be located on the path used to find
scripts. Commands that are specific to this implementation of the LRE have comments starting
with %**. Scripts for the examples are given with the code, see example1.m and example2.m.
A.5.1 Analyzing an existing data set
Here is an example of how to use the LRE for analyzing a data set.
For the implementation provided with this guide, the first step is always to initialize the
LRE. This sets up the needed paths and variables.
prep_lre;
%** CAUTION: This script defines the global variables used. Most of
%** them are explained in the description of LRE parameters and
%** states. There are some exceptions. Be aware of the possibility
%** of conflict if any of these variable names is used elsewhere.
%** Check prep_lre.m for the list of variable names.
Normally, one next loads the data set to be analyzed. For the purpose of this example, we
simulate it by using the provided support functions. We assume that the measurement setting is
chosen uniformly at random in each trial, the default.
sim_state_spec = [pi/4, 0.1]; % A Bell state, pair probability = 0.1.
sim_noise_spec = [0.95, 0.97]; % The efficiencies are 0.95,
% the visibility is 0.97.
sim_settings_spec = cell(5, 1);
sim_settings_spec{1} = 2; % Two measurement settings each.
sim_settings_spec{2} = 3; % Three outcomes for each setting.
sim_settings_spec{4} = [0; pi/4]; % Alice’s settings, PBS angles.
sim_settings_spec{5} = [pi/8; -pi/8];% Bob’s settings, PBS angles.
[sd, frequencies] = lr_config_analysis(sim_state_spec, sim_noise_spec, ...
sim_settings_spec);
disp(sd); % Expected statistical strength is sd = 0.0017331.
all_data = lr_test_simulation(sim_state_spec, sim_noise_spec, sim_settings_spec, ...
100000); % Simulate 100,000 trials.
The first step of the analysis is to initialize the LRE.
lr_init(2, 2, 3, 3); % The arguments specify numbers of settings and outcomes.
The recommended method for analyzing data from a completed experiment is to use the
provided function lr_std_analysis and report the returned value.
[violation_log_p_value, violation_log_p_value_sd] = lr_std_analysis(all_data);
% The final log-$p$-value is expected to be about 173 minus the learning
% transient, which we observed to be about 25 on average. The standard
% deviation of the reported log-$p$-value is around 20.
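The figure of 173 quoted in the comment is just the statistical strength per trial multiplied by the number of trials:

```matlab
% Rough expectation for the final log-p-value (before subtracting the
% learning transient): statistical strength times number of trials.
sd = 0.0017331;              % from lr_config_analysis above
num = 100000;
expected_lp = sd * num;      % about 173.3
```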
It may be desirable to choose some of the analysis parameters differently from the ones
recommended in lr_std_analysis. If so, such a choice must be made blindly, that is before any
information about the data to be analyzed is available. Information about an experiment before
the data was acquired can be used. With such information, simulation may help in making better
choices of analysis parameters.
Here is an instance of an explicit analysis with parameters chosen manually. First we reset
the analysis engine.
lr_reset();
The main choice to be made is the number of data points in each block of data to be processed.
It is necessary to make sure that the data in each block has a reasonable setting distribution. In
particular, do not provide blocks containing data from only one setting; otherwise the LRE’s
estimated distributions may vary excessively, potentially worsening the log-p-values.
Blocking the data improves efficiency, and having the first block be large enough to contain a
good sample of the possible setting and outcome combinations mitigates the log-p-value offset that
typically arises from the learning transient. If the blocks are too large, we may reduce the log-p-
value by, in effect, not considering the first block for the log-p-value calculations. The first block
is used only to estimate the next block’s setting and outcome frequencies. Thus, we recommend a
block size of the larger of N/1000, where N is the number of trials, and d ln(2d), where d is the
number of possible setting and outcome combinations. The first bound is chosen so that we do
not lose too much significance by using the first block only for learning the setting and outcome
distribution. The second ensures that if the setting and outcome distribution is uniform, the
probability of having at least one event for each setting and outcome combination in each block is
at least 1/2. For our example, the recommended block size is 154. For simplicity, we use 200.
Our implementation of the LRE has two parameters that affect the calculations of log-p-
values. The first is test_lp_tolerance, which affects the stopping criterion for the LR model
optimization and the conservative factors used in pred_ratios and goal_ratios to ensure valid
log-p-values. It should be a small fraction of the anticipated statistical strength. If the strength
is not known in advance, setting it to 1/(100N) ensures that if the data set has a sufficient
violation of LR, the stopping criterion does not significantly decrease the log-p-value compared to
what would be obtained with ideal LR model optimization. The second parameter, lr_use_cm,
determines whether no-signaling constraints are applied in estimating setting and outcome
distributions. Applying these constraints increases computation time, but improves the estimates,
particularly for the first few blocks.
lr_set_tolerance(1e-7); % 1/(100*N)
lr_use_cm = 1; % Turn on "no-signaling" constraints.
If we have a reasonable estimate of the setting and outcome frequencies that was obtained
before an experiment was started, we can “prime” the LRE with these frequencies. This can
reduce or eliminate the effect of the learning transient. If these frequencies are available in
predicted_frequencies and the statistics of the prediction are as good as if we had inferred
them from support_number many trials, then we can invoke the following before analyzing the
experimental data (after removing the comment symbol):
% lr_prime(predicted_frequencies, support_number);
If the experiment was not stable for the entire time that it took to acquire the data, it may
help to set the half-lives to values of the order of the stability time. This allows the LRE to adapt
to drifts in the experimental state. Here we assume that the experiment was sufficiently stable and
leave the half-lives at infinity, the default.
We now process the data one block of 200 points at a time.
for i = 1:500
lr_update(all_data(((i-1)*200+1):(i*200), 1:4));
%** The following can be used to monitor progress:
disp(sprintf('Log-p-value so far: %5.2f +/- %4.2f', test_lp_total, ...
sqrt(test_lp_total_v)));
end
[glp, tlp, tlpt] = lr_lps();
disp(tlpt); % The third value returned is the total
% log-$p$-value and the main result of the
% analysis.
[glpv, tlpv, tlptv] = lr_lp_vs();
disp(sqrt(tlptv)); % The third returned value is the variance of the total
% log-$p$-value; its square root estimates the standard deviation.
A.5.2 Monitoring an experiment in progress
The following shows how one can use the LRE for monitoring an experiment in progress.
The experiment to be simulated involves a (balanced) Bell state expected to be measured with
95% efficiency (for all detectors). Measurements have three possible outcomes (two orthogonal
polarizations and “no detection”). The setup aims for a photon-pair production probability of 0.1
and a visibility of 97%. The settings are intended to maximize the CHSH inequality. We begin
with setting up the LRE environment.
prep_lre;
We initialize the LRE with two settings and three outcomes for each of Alice and Bob. The
setting distribution is uniform, the default.
lr_init(2, 2, 3, 3);
For monitoring an experiment, it can be useful to set up ‘goal’ frequencies, which are the
setting and outcome frequencies that can be realistically expected if the experiment is configured
as intended. The goal frequencies are used by the LRE to calculate a quasi-log-p-value, goal_lp.
This behaves somewhat like the negative logarithm of a distance from the goal to the current
experimental state: To approach the goal one tries to increase goal_lp. More negative values
indicate that the current state is further from the goal.
By tweaking experimental parameters, one can aim for increasing goal_lp. If the goal state
has been reached, then goal_lp is expected to be approximately the statistical strength of the
goal frequencies times the effective number of contributing data points, goal_lp_weight_snumber.
While monitoring goal_lp, one can also monitor test_lp. Unlike goal_lp, negative values of
test_lp are not useful for tweaking. However, once test_lp becomes significantly positive, one
can tweak it directly to optimize the experimental configuration. This allows exploring states that
differ from the anticipated goal but may have better statistical strengths.
To set up the goal frequencies, we can use the support functions. We compute the goal
frequencies based on reasonable and achievable state parameters.
state_spec = [pi/4, 0.1]; % Bell state with 0.1 probability of pairs.
noise_spec = [0.95, 0.97]; % Hoped-for efficiencies are 0.95,
% and visibility is 0.97.
settings_spec = cell(5, 1);
settings_spec{1} = 2; % Two settings each.
settings_spec{2} = 3; % Three outcomes for each setting.
settings_spec{4} = [0; pi/4]; % Alice’s settings, PBS angles.
settings_spec{5} = [pi/8; -pi/8]; % Bob’s settings, PBS angles.
[sd, frequencies] = lr_config_analysis(state_spec, noise_spec, settings_spec);
disp(sd);
% The statistical strength for the goal frequencies is sd = 0.0017331.
% Set the goal frequencies in the LRE.
lr_set_goal(frequencies);
Next we set the half-lives for monitoring. A half-life sets the time scale (in number of data
points) after which data no longer contributes significantly to the current statistics. The values
should be a good multiple of the inverse of the expected statistical strength.
lr_set_data_half_life(4000);
lr_set_lp_half_life(4000);
For this example, we need to simulate the data, so we set up simulation parameters. For the
purpose of the example, we assume that all is well except for the visibility, which is only 0.7 at the
moment.
sim_state_spec = [pi/4, 0.1];
sim_noise_spec = [0.95, 0.7]; % Visibility is still low, at 0.7.
sim_settings_spec = settings_spec;
Normally we would not know the true experimental state parameters. For this example we
do, so we can check whether we are going to violate LR, and we find that we are not.
[sd, frequencies] = lr_config_analysis(sim_state_spec, sim_noise_spec, ...
sim_settings_spec);
disp(sd);
% Statistical strength is numerically 0, no violation.
This is how to get a block of 1000 data points.
data_block = lr_test_simulation(sim_state_spec, sim_noise_spec, ...
sim_settings_spec, 1000);
% lr_use_cm = 1; %** Turn on "no-signaling" constraints for slightly
%** better predictions.
Run the experiment for a while.
lr_update(data_block);
for i = 1:20
lr_update(lr_test_simulation(sim_state_spec, sim_noise_spec, ...
sim_settings_spec, 1000));
%** The following can be used to monitor progress:
disp(sprintf(['Block %2d, goal_lp:%5.2f +/- %4.2f, test_lp:%5.2f +/- %4.2f, ' ...
'test_lp_total:%5.2f +/- %4.2f'], i+1, goal_lp, sqrt(goal_lp_v), test_lp, ...
sqrt(test_lp_v), test_lp_total, sqrt(test_lp_total_v)));
end
We can check the status of the LRE by displaying various statistics. See the explanation of
the functions lr_lps, lr_lp_vs, lr_sds, and lr_snumbers.
[glp, tlp, tlpt] = lr_lps(); disp([glp, tlp, tlpt]); % Log-$p$-values.
[glpv, tlpv, tlptv] = lr_lp_vs();
disp([glpv, tlpv, tlptv]); % Variances of log-$p$-values.
[gsd, tsd] = lr_sds(); disp([gsd, tsd]); % Statistical strengths.
[ws, gws, tws] = lr_snumbers(); disp([ws, gws, tws]); % Effective numbers of points.
Suppose we tweak the experiment so that the visibility improves. Using our unrealistic
knowledge of the true visibility, we can check the expected statistical strength.
sim_noise_spec=[0.95, 0.8];
[sd, frequencies] = lr_config_analysis(sim_state_spec, sim_noise_spec, ...
sim_settings_spec);
disp(sd);
% Statistical strength is now 1.5e-5.
Run the experiment for a while.
for i = 1:40
lr_update(lr_test_simulation(sim_state_spec, sim_noise_spec, ...
sim_settings_spec, 1000));
%** The following can be used to monitor progress:
disp(sprintf(['Block %2d, goal_lp:%5.2f +/- %4.2f, test_lp:%5.2f +/- %4.2f, ' ...
'test_lp_total:%5.2f +/- %4.2f'], i+21, goal_lp, sqrt(goal_lp_v), test_lp, ...
sqrt(test_lp_v), test_lp_total, sqrt(test_lp_total_v)));
end
The half-lives were not set high enough to clearly see the violation yet. But the goal log-p-
value goal_lp should have increased noticeably, suggesting that we improved the experiment in a
useful direction. We tweak it a bit more, in this case improving the visibility to that assumed for
the goal frequencies, and continue running the experiment.
sim_noise_spec=[0.95, 0.97];
for i = 1:40
lr_update(lr_test_simulation(sim_state_spec, sim_noise_spec, ...
sim_settings_spec, 1000));
%** The following can be used to monitor progress:
disp(sprintf(['Block %2d, goal_lp:%5.2f +/- %4.2f, test_lp:%5.2f +/- %4.2f, ' ...
'test_lp_total:%5.2f +/- %4.2f'], i+61, goal_lp, sqrt(goal_lp_v), test_lp, ...
sqrt(test_lp_v), test_lp_total, sqrt(test_lp_total_v)));
end
The violation should now be noticeable, with test_lp now positive.
A.6 Technical notes
A.6.1 Data half-lives and data weights
The LRE is designed to require minimal memory of past data. Online monitoring of exper-
iments requires that only recently acquired data contributes significantly to the various statistics
being tracked. This is made possible by updating the recorded setting and outcome frequencies by
setting them to a convex combination of new and old frequencies. Let di be the i’th data point
represented by a 0/1-vector whose length is the number of setting and outcome combinations and
that has a single 1 in the position corresponding to the i’th setting and outcome combination. The
frequency vector for all data points is given by∑N
i=1 di/N , where N is the total number of data
points. This is the final value of test_frequencies if data_half_life = Inf. In general, after
the n’th data point is acquired, the value of test_frequencies is given by a weighted combination
fn =∑n
i=1wn,idi, where 0 ≤ wn,i ≤ 1 and∑n
i=1wn,i = 1. To minimize what we have to remember
of the past, we wish to update the frequencies so that fn+1 = vn+1dn+1 + (1 − vn+1)fn, where
138
0 ≤ vn+1 ≤ 1. The quantity vn+1 is the weight of the (n+ 1)’th data point after acquiring (n+ 1)
data points. For efficiency and because there is little new information in individual data points,
we prefer to update frequencies and other statistics one block of data at a time. Blocking does not
affect the validity of the computed log-p-values. We implement blocking by having the weights of
data points within a block be identical. Let D_k be the sum of the d_i contributing to the k'th block, and let m_k be the number of data points in the block. In terms of blocks, we use the frequency update F_{k+1} = V_{k+1} D_{k+1} + (1 − m_{k+1} V_{k+1}) F_k, with 0 ≤ V_{k+1} ≤ 1/m_{k+1}. The interpretation of V_{k+1} is similar to that of v_{n+1} above, but it is the identical weight of each data point in the (k+1)'th block. Its value is computed according to an approximate half-life λ (specified by the parameter data_half_life). For a normal geometric decay, this requires w_{n,i}/w_{n,i+1} = 2^{−1/λ}. Because we have blocked the data, matching the half-life on average requires V_k (1 − m_{k+1} V_{k+1}) / V_{k+1} = 2^{−m_{k+1}/λ}, the right ratio of weights between the last two contributing blocks after the update. Solving this equation for V_{k+1} gives

V_{k+1} = V_k / (m_{k+1} V_k + 2^{−m_{k+1}/λ}).    (A.2)

Thus the update requires keeping track of the weight of the most recent point, which after the update is given by V_{k+1}. The first block requires special treatment: we set V_1 = 1/m_1.
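For illustration, the blocked update of Eq. (A.2) can be sketched as follows. This is a Python transcription, not the LRE's Octave code; the function name and interface are hypothetical.

```python
import numpy as np

def update_block(F, V, D, m, lam):
    """One blocked update of the decaying frequency estimate, Eq. (A.2).

    F   : current frequency vector F_k (None before the first block)
    V   : per-point weight V_k of the most recent block
    D   : sum of the 0/1 data vectors d_i in the incoming block
    m   : number of data points in the incoming block
    lam : approximate half-life in data points; lam = inf gives plain averaging
    """
    if F is None:                    # first block: V_1 = 1/m_1
        return D / m, 1.0 / m
    V_new = V / (m * V + 2.0 ** (-m / lam))     # Eq. (A.2)
    F_new = V_new * D + (1.0 - m * V_new) * F   # convex combination
    return F_new, V_new
```

With lam = inf the factor 2^{−m/λ} equals 1, and the recursion reproduces the unweighted frequency vector ∑_i d_i / N.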
We use a similar geometric weighting strategy to update the log-p-values. However, the contribution to the log-p-value of the most recent data point should be weighted by 1, and the sum
of the weights should be an effective number of contributing data points. To ensure that the interpretation of the weighted combination as a log-p-value is valid, the weights must be between 0 and 1. The reason is as follows. As explained in Chapter 5, the starting point for computing a
valid log-p-value is that the ratio function R (the value of variables such as goal_ratios) satisfies
the two conditions 0 ≤ R(x) and 〈R(x)〉 ≤ 1 for any LR model, where x is a trial’s setting and
outcome combination. Given such a function R, if we modify it according to R′(x) = R(x)^γ, where 0 ≤ γ ≤ 1, then 0 ≤ R′(x), and for any LR model 〈R′(x)〉 ≤ 〈R(x)〉^γ ≤ 1 by concavity of the function y ↦ y^γ. Hence the log-p-value computed using R′ instead of R is also valid. In particular,
the weighted combination of valid log-p-value increments is a valid log-p-value, if all weights are
between 0 and 1.
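As a quick numerical illustration of this argument, consider the following self-contained Python sketch; the distribution p and ratio function R are made up for the check and are not taken from the LRE.

```python
import numpy as np

rng = np.random.default_rng(0)
p = rng.dirichlet(np.ones(8))        # a made-up LR-model distribution
R = rng.uniform(0.0, 2.0, size=8)    # a nonnegative candidate ratio function
R *= 0.9 / (p @ R)                   # rescale so that <R> = 0.9 <= 1
for gamma in (0.0, 0.3, 0.7, 1.0):
    # concavity of y -> y**gamma: <R**gamma> <= <R>**gamma <= 1
    assert p @ (R ** gamma) <= (p @ R) ** gamma + 1e-12
    assert (p @ R) ** gamma <= 1.0
```

So R′ = R^γ again satisfies the two conditions required for a valid log-p-value.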
One way to quantify an effective number of contributing data points is by noting that if we
have independent instances x_i of a random variable X and estimate the mean according to weights w_{n,i} as ∑_i w_{n,i} x_i, then the variance of the estimate is given by v_e = ∑_i w_{n,i}^2 v, where v is the variance of X. If the weights are all equal, v_e = v/n. For arbitrary weights, v_e = v/s_n, where s_n can be interpreted as an effective number of contributing points; thus s_n = 1/∑_i w_{n,i}^2. Let S_k be the effective number of contributing points after the k'th block has been received. To update S_k when it is nonzero, we use the following relationship implied by the formula for s_n:

1/S_{k+1} = m_{k+1} V_{k+1}^2 + (1 − m_{k+1} V_{k+1})^2 / S_k,    (A.3)

where V_{k+1} is given by Eq. (A.2).
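The update of the effective number, Eq. (A.3), can be sketched in the same way (Python for illustration; the function name is hypothetical):

```python
def update_snumber(S, V_new, m):
    """Update the effective number of contributing points, Eq. (A.3).

    S     : current effective number S_k (0 before the first contributing block)
    V_new : per-point weight V_{k+1} of the incoming block, from Eq. (A.2)
    m     : number of data points in the incoming block
    """
    if S == 0:                      # first contributing block: S_1 = m_1
        return m
    return 1.0 / (m * V_new ** 2 + (1.0 - m * V_new) ** 2 / S)
```

With equal weights (no decay), V_{k+1} = 1/(km) after k blocks of size m, and the recursion returns S_k = km, the actual number of contributing points.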
We also account for the following issues. First, the value of the first nonzero S_k depends on which of the effective-number variables (those suffixed by weight_snumber) is being tracked: for weight_snumber, S_1 = m_1; for test_lp_weight_snumber, S_1 = 0 and S_2 = m_2; and for goal_lp_weight_snumber, the S_k are nonzero only after the goal frequencies are set. Second, the weights V_k of the most recent data points differ between the effective numbers, because the half-life for the log-p-values can differ from the half-life for the setting and outcome frequencies, and because the goal frequencies can be set at any time. Hence, we separately keep track of these weights in the variables suffixed by data_weight. Also, the weight V_k is nonzero only after the corresponding S_k is nonzero: if S_l is the first nonzero effective number, then V_i = 0 for i = 1, …, l − 1, and V_l = 1/S_l. After that, the V_k are updated according to Eq. (A.2).
To see what to expect of the effective number of contributing data points given the half-life λ,
consider the case of block size 1 in the asymptotic limit. Compute 1/S_∞ = ∑_{i≥0} (1 − 2^{−1/λ})^2 2^{−2i/λ} = (1 − 2^{−1/λ})/(1 + 2^{−1/λ}). For large λ, S_∞ ≈ (2/ln(2))λ ≈ 2.9λ.
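This asymptotic can be checked numerically by iterating the two recursions with block size 1 (a self-contained Python sketch, not the LRE's Octave code):

```python
import math

lam = 100.0            # half-life in data points
V, S = 1.0, 1.0        # after the first data point, V_1 = 1 and S_1 = 1
for _ in range(20000):                       # run for many half-lives
    V = V / (V + 2.0 ** (-1.0 / lam))        # Eq. (A.2) with m = 1
    S = 1.0 / (V ** 2 + (1.0 - V) ** 2 / S)  # Eq. (A.3) with m = 1
# S converges to (1 + 2^{-1/lam}) / (1 - 2^{-1/lam}) ~= (2/ln 2) lam
closed = (1 + 2 ** (-1 / lam)) / (1 - 2 ** (-1 / lam))
```

For λ = 100 the iterated S agrees with the closed form, which in turn is within about 0.1% of (2/ln(2))λ ≈ 289.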
Given the corresponding effective numbers of contributing data points, we compute and
update the log-p-values test_lp and goal_lp as follows. Consider the case for test_lp. Write
L_k for the value of test_lp after the k'th block of data has been processed. We compute L_{k+1} so that it is a weighted combination of L_k and the log-p-value contributions from the last block of data, and the total weight of the log-p-value contributions is given by the corresponding S_{k+1}. Let B_k be the sum of the log-p-value contributions of the data points in the k'th block. Thus, we have L_{k+1} = B_{k+1} + W_{k+1} L_k, where W_{k+1} is obtained by solving m_{k+1} + W_{k+1} S_k = S_{k+1}. This ensures that 0 ≤ W_{k+1} ≤ 1, so that the weight of the log-p-value increment due to each data point is at most 1. Consequently, L_{k+1} is always a valid log-p-value. We can use the same strategy to compute and update goal_lp.
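The log-p-value update just described can be sketched as follows (again a Python transcription with hypothetical names, not the LRE's Octave code):

```python
def update_logp(L, S, B_new, S_new, m):
    """Decay-weighted update of an accumulated log-p-value.

    L     : current log-p-value L_k (0 before any contribution)
    S     : current effective number of contributing points S_k
    B_new : sum of the log-p-value increments in the incoming block
    S_new : effective number after the incoming block, from Eq. (A.3)
    m     : number of data points in the incoming block
    """
    if S == 0:
        return B_new
    W = (S_new - m) / S     # solves m + W*S = S_new, with 0 <= W <= 1
    return B_new + W * L
```

Without decay, S_k = k for blocks of size 1, so W_{k+1} = 1 and the log-p-value increments simply accumulate.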
Appendix B
Optimization results for Chapter 4
Using code written in Octave, which is available on request, we obtained the results shown in the following tables. In these tables, the units of the columns labeled theta, gamma, A_1, A_2, B_1, and B_2 are degrees (°). The column labeled theta (or gamma) contains the values of the parameter θ in the state |ψ_uB〉 = cos(θ)|H〉_A|H〉_B + sin(θ)|V〉_A|V〉_B (or the values of the parameter γ in the state |ψ_pB〉 = cos²(γ)|H〉_3|H〉_4 + sin²(γ)|V〉_3|V〉_4 + cos(γ)sin(γ)(|H〉_3|V〉_3 + |H〉_4|V〉_4)). The columns labeled A_1 and A_2 contain the two optimal measurement-setting angles for Alice, while the columns labeled B_1 and B_2 contain the two optimal measurement-setting angles for Bob. The columns labeled eta_1 and eta_2 give the detection efficiencies required to achieve the statistical strengths in the columns labeled S_1 and S_2, respectively. The column labeled V (if shown) gives the visibility of the prepared unbalanced Bell state. Due to limited numerical accuracy, we cannot find the exact detection efficiency η_c required to achieve a specified statistical strength level S. At the 10^−4 level, we list the two detection efficiencies closest to η_c: using eta_1, the statistical strength of a test of LR is slightly higher than the specified level, while using eta_2, it is slightly lower. In the plots of Fig. 4.2, Fig. 4.3, and Fig. 4.4, we use whichever of eta_1 and eta_2 gives a statistical strength closer to the specified level. Also, for calculations of the minimum detection efficiencies at 0 + ε statistical strength, we truncate the statistical strength to 0 when it is numerically calculated to be less than 10^−9 or 10^−10, depending on the situation.
B.1 Results for unbalanced Bell states using photon counters or detectors
Statistical strength ≈ 0
theta A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 22.50 -67.50 -22.50 67.50 82.85% 8.66E-009 82.84% 1.44E-013
40 21.28 -66.89 -21.28 66.89 80.61% 1.47E-010 80.60% 3.18E-015
35 19.40 -65.60 -19.40 65.60 78.50% 3.97E-009 78.49% 1.33E-013
30 17.00 -63.58 -17.00 63.58 76.50% 5.48E-009 76.49% 3.62E-011
25 14.21 -60.72 -14.21 60.72 74.60% 1.48E-009 74.59% 1.66E-013
20 11.14 -56.79 -11.14 56.79 72.81% 2.91E-009 72.80% 5.56E-011
15 7.92 -51.42 -7.92 51.42 71.12% 3.84E-009 71.11% 6.85E-010
10 4.70 -43.88 -4.70 43.88 69.53% 2.56E-009 69.52% 7.10E-010
5 1.81 -32.41 -1.81 32.41 68.06% 1.67E-009 68.05% 8.37E-010
4 1.32 -29.25 -1.32 29.25 67.78% 1.21E-009 67.77% 6.40E-010
3 0.87 -25.55 -0.87 25.55 67.52% 1.48E-009 67.51% 9.78E-010
2 0.48 -21.04 -0.48 21.04 67.27% 1.31E-009 67.26% 9.85E-010
1 0.17 -15.01 -0.17 15.01 67.06% 1.00E-009 67.05% 8.57E-010
Statistical strength ≈ 1E-6
theta A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 22.50 -67.50 -22.50 67.50 82.93% 1.24E-006 82.92% 9.71E-007
40 21.28 -66.89 -21.28 66.89 80.72% 1.00E-006 80.71% 8.30E-007
35 19.41 -65.60 -19.41 65.60 78.62% 1.03E-006 78.61% 8.79E-007
30 17.02 -63.59 -17.02 63.59 76.64% 1.09E-006 76.63% 9.48E-007
25 14.23 -60.73 -14.23 60.73 74.77% 1.07E-006 74.76% 9.53E-007
20 11.15 -56.80 -11.15 56.80 73.01% 1.01E-006 73.00% 9.18E-007
15 7.93 -51.43 -7.93 51.43 71.39% 1.07E-006 71.38% 1.00E-006
10 4.72 -43.89 -4.72 43.89 69.93% 1.04E-006 69.92% 9.88E-007
5 1.82 -32.42 -1.82 32.42 68.86% 1.02E-006 68.85% 9.96E-007
4 1.33 -29.25 -1.33 29.25 68.78% 1.01E-006 68.77% 9.90E-007
3 0.88 -25.55 -0.88 25.55 68.84% 1.00E-006 68.83% 9.86E-007
2.5 0.68 -23.43 -0.68 23.43 68.98% 1.01E-006 68.97% 9.94E-007
2 0.49 -21.05 -0.49 21.05 69.25% 1.01E-006 69.24% 9.96E-007
1.5 0.32 -18.31 -0.32 18.31 69.78% 1.00E-006 69.77% 9.94E-007
1 0.18 -15.01 -0.18 15.01 70.96% 1.00E-006 70.95% 9.98E-007
0.75 0.12 -13.03 -0.12 13.03 72.17% 1.00E-006 72.16% 9.98E-007
0.5 0.07 -10.66 -0.07 10.66 74.56% 1.00E-006 74.55% 9.98E-007
0.08 0.00 90.00 -0.16 0.16 100.00% 1.00E-006
Statistical strength ≈ 1E-5
theta A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 22.50 -67.50 -22.50 67.50 83.10% 1.08E-005 83.09% 9.95E-006
40 21.30 -66.90 -21.30 66.90 80.96% 1.00E-005 80.95% 9.46E-006
35 19.43 -65.62 -19.43 65.62 78.89% 1.01E-005 78.88% 9.56E-006
30 17.05 -63.61 -17.05 63.61 76.95% 1.02E-005 76.94% 9.79E-006
25 14.26 -60.76 -14.26 60.76 75.14% 1.03E-005 75.13% 9.96E-006
20 11.20 -56.84 -11.20 56.84 73.47% 1.03E-005 73.46% 9.99E-006
15 7.97 -51.47 -7.97 51.47 71.98% 1.01E-005 71.97% 9.90E-006
10 4.75 -43.92 -4.75 43.92 70.81% 1.01E-005 70.80% 9.92E-006
5 1.85 -32.45 -1.85 32.45 70.60% 1.01E-005 70.59% 9.99E-006
4 1.35 -29.27 -1.35 29.27 70.94% 1.00E-005 70.93% 9.97E-006
3 0.90 -25.57 -0.90 25.57 71.69% 1.00E-005 71.68% 9.98E-006
2.5 0.69 -23.45 -0.69 23.45 72.36% 1.00E-005 72.35% 9.98E-006
2 0.50 -21.06 -0.50 21.06 73.41% 1.00E-005 73.40% 9.98E-006
1.5 0.33 -18.32 -0.33 18.32 75.20% 1.00E-005 75.19% 1.00E-005
1 0.19 -15.02 -0.19 15.02 78.69% 1.00E-005 78.68% 9.99E-006
0.5 0.00 90.00 -0.95 0.95 90.11% 1.00E-005 90.10% 9.99E-006
0.31 0.00 90.00 -0.61 0.61 100.00% 1.00E-005
Statistical strength ≈ 1E-4
theta A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 22.50 -67.50 -22.50 67.50 83.63% 1.01E-004 83.62% 9.83E-005
40 21.34 -66.94 -21.34 66.94 81.71% 1.00E-004 81.70% 9.82E-005
35 19.51 -65.69 -19.51 65.69 79.74% 1.01E-004 79.73% 9.90E-005
30 17.16 -63.71 -17.16 63.71 77.92% 1.00E-004 77.91% 9.91E-005
25 14.39 -60.87 -14.39 60.87 76.28% 1.01E-004 76.27% 9.95E-005
20 11.33 -56.96 -11.33 56.96 74.87% 1.01E-004 74.86% 9.98E-005
15 8.09 -51.58 -8.09 51.58 73.81% 1.00E-004 73.80% 9.94E-005
10 4.86 -44.03 -4.86 44.03 73.50% 1.00E-004 73.49% 9.96E-005
7 3.03 -37.82 -3.03 37.82 74.22% 1.00E-004 74.21% 9.99E-005
5 1.92 -32.52 -1.92 32.52 75.73% 1.00E-004 75.72% 9.99E-005
4 1.41 -29.34 -1.41 29.34 77.21% 1.00E-004 77.20% 9.99E-005
3 0.95 -25.64 -0.95 25.64 79.74% 1.00E-004 79.73% 9.98E-005
2 0.54 -21.12 -0.54 21.12 84.64% 1.00E-004 84.63% 1.00E-004
1 0.00 90.00 -1.98 1.98 99.30% 1.00E-004 99.29% 1.00E-004
0.98 0.00 90.00 -1.93 1.93 100.00% 1.00E-004
Statistical strength ≈ 1E-3
theta A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 22.50 -67.50 -22.50 67.50 85.33% 1.01E-003 85.32% 9.99E-004
40 21.47 -67.06 -21.47 67.06 84.02% 1.01E-003 84.01% 1.00E-003
35 19.77 -65.90 -19.77 65.90 82.33% 1.00E-003 82.32% 9.95E-004
30 17.50 -63.99 -17.50 63.99 80.88% 1.00E-003 80.87% 9.97E-004
25 14.79 -61.22 -14.79 61.22 79.74% 1.00E-003 79.73% 9.98E-004
20 11.74 -57.34 -11.74 57.34 79.06% 1.00E-003 79.05% 9.97E-004
15 8.48 -51.97 -8.48 51.97 79.21% 1.00E-003 79.20% 9.99E-004
10 5.18 -44.37 -5.18 44.37 81.17% 1.00E-003 81.16% 9.99E-004
7 3.28 -38.13 -3.28 38.13 84.52% 1.00E-003 84.51% 9.99E-004
5 2.15 -32.89 -2.15 32.89 89.16% 1.00E-003 89.15% 1.00E-003
4 1.68 -29.59 -1.68 29.59 93.81% 1.00E-003 93.80% 1.00E-003
3.09 0.00 90.00 -6.10 6.10 100.00% 1.00E-003
B.2 Tradeoff between visibility and efficiency using unbalanced Bell states, where S = 10^−6
theta A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2 V
45 22.50 -67.50 -22.50 67.50 99.90% 1.05E-006 99.89% 8.57E-007 0.71
44.4 22.49 -67.49 -22.49 67.49 99.20% 1.15E-006 99.19% 9.48E-007 0.72
43.9 22.47 -67.48 -22.47 67.48 98.49% 1.03E-006 98.48% 8.41E-007 0.73
43.3 22.43 -67.44 -22.43 67.44 97.79% 1.10E-006 97.78% 9.12E-007 0.74
42.8 22.38 -67.42 -22.38 67.42 97.09% 1.18E-006 97.08% 9.89E-007 0.75
42.2 22.31 -67.37 -22.31 67.37 96.38% 1.09E-006 96.37% 9.05E-007 0.76
41.6 22.23 -67.31 -22.23 67.31 95.67% 1.03E-006 95.66% 8.55E-007 0.77
41 22.13 -67.24 -22.13 67.24 94.96% 1.02E-006 94.95% 8.47E-007 0.78
40.4 22.00 -67.16 -22.00 67.16 94.25% 1.07E-006 94.24% 8.94E-007 0.79
39.7 21.86 -67.05 -21.86 67.05 93.53% 1.01E-006 93.52% 8.45E-007 0.80
39.1 21.71 -66.94 -21.71 66.94 92.81% 1.05E-006 92.80% 8.82E-007 0.81
38.5 21.54 -66.82 -21.54 66.82 92.08% 1.02E-006 92.07% 8.59E-007 0.82
37.8 21.34 -66.67 -21.34 66.67 91.35% 1.13E-006 91.34% 9.59E-007 0.83
37.1 21.10 -66.50 -21.10 66.50 90.60% 1.04E-006 90.59% 8.83E-007 0.84
36.4 20.86 -66.32 -20.86 66.32 89.85% 1.14E-006 89.84% 9.77E-007 0.85
35.6 20.56 -66.09 -20.56 66.09 89.08% 1.12E-006 89.07% 9.57E-007 0.86
34.8 20.23 -65.84 -20.23 65.84 88.29% 1.02E-006 88.28% 8.65E-007 0.87
34 19.89 -65.57 -19.89 65.57 87.49% 1.04E-006 87.48% 8.90E-007 0.88
33.1 19.48 -65.24 -19.48 65.24 86.67% 1.08E-006 86.66% 9.34E-007 0.89
32.1 19.00 -64.84 -19.00 64.84 85.82% 1.06E-006 85.81% 9.12E-007 0.90
31.1 18.49 -64.40 -18.49 64.40 84.94% 1.04E-006 84.93% 9.04E-007 0.91
30 17.91 -63.89 -17.91 63.89 84.02% 1.00E-006 84.01% 8.69E-007 0.92
28.9 17.29 -63.34 -17.29 63.34 83.06% 1.06E-006 83.05% 9.28E-007 0.93
27.5 16.48 -62.56 -16.48 62.56 82.04% 1.12E-006 82.03% 9.87E-007 0.94
26.1 15.62 -61.71 -15.62 61.71 80.93% 1.04E-006 80.92% 9.21E-007 0.95
24.4 14.55 -60.56 -14.55 60.56 79.72% 1.08E-006 79.71% 9.69E-007 0.96
22.4 13.24 -59.05 -13.24 59.05 78.34% 1.03E-006 78.33% 9.27E-007 0.97
19.8 11.49 -56.80 -11.49 56.80 76.70% 1.05E-006 76.69% 9.60E-007 0.98
16.2 9.00 -53.00 -9.00 53.00 74.50% 1.01E-006 74.49% 9.32E-007 0.99
13.3 7.00 -49.23 -7.00 49.23 72.88% 1.05E-006 72.87% 9.84E-007 0.995
7.06 2.98 -37.85 -2.98 37.85 69.93% 1.03E-006 69.92% 1.00E-006 0.9995
4 1.32 -29.25 -1.32 29.25 68.78% 1.01E-006 68.77% 9.90E-007 1
B.3 Results for pseudo-Bell states using photon counters
Statistical strength ≈ 0
gamma A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 22.50 -67.50 -22.50 67.50 90.62% 3.61E-009 90.61% 2.02E-014
40 20.49 -66.01 -20.49 66.01 89.71% 8.90E-009 89.70% 8.94E-014
35 16.76 -62.14 -16.76 62.14 89.78% 4.24E-009 89.77% 3.63E-014
30 12.32 -56.16 -12.32 56.16 90.80% 2.55E-010 90.79% 1.18E-014
25 8.00 -48.43 -8.00 48.43 92.57% 8.08E-009 92.56% 6.21E-013
20 4.43 -39.49 -4.43 39.49 94.71% 8.13E-009 94.70% 1.08E-012
15 1.96 -29.88 -1.96 29.88 96.81% 4.08E-009 96.80% 5.21E-014
10 0.59 -19.98 -0.59 19.98 98.52% 1.36E-010 98.51% 5.54E-015
5 0.07 -10.00 -0.07 10.00 99.63% 5.76E-009 99.62% 1.77E-013
Statistical strength ≈ 5E-5
gamma A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 22.50 -67.50 -22.50 67.50 91.05% 5.11E-005 91.04% 4.88E-005
40 20.53 -66.05 -20.53 66.05 90.30% 5.11E-005 90.29% 4.94E-005
35 16.81 -62.20 -16.81 62.20 90.40% 5.06E-005 90.39% 4.89E-005
30 12.34 -56.21 -12.34 56.21 91.46% 5.12E-005 91.45% 4.97E-005
25 7.98 -48.45 -7.98 48.45 93.25% 5.06E-005 93.24% 4.91E-005
20 4.39 -39.49 -4.39 39.49 95.42% 5.15E-005 95.41% 5.00E-005
15 1.91 -29.87 -1.91 29.87 97.52% 5.02E-005 97.51% 4.87E-005
10 0.56 -19.97 -0.56 19.97 99.20% 5.01E-005 99.19% 4.84E-005
6.30 0.00 90.00 -1.38 1.38 100.00% 5.00E-005
Statistical strength ≈ 5E-4
gamma A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 22.50 -67.50 -22.50 67.50 91.98% 5.06E-004 91.97% 4.99E-004
40 0.00 90.00 -42.54 42.54 91.53% 5.03E-004 91.52% 4.97E-004
35 16.96 -62.38 -16.96 62.38 91.70% 5.02E-004 91.69% 4.97E-004
30 12.42 -56.35 -12.42 56.35 92.81% 5.05E-004 92.80% 5.00E-004
25 7.96 -48.53 -7.96 48.53 94.64% 5.04E-004 94.63% 4.99E-004
20 4.30 -39.51 -4.30 39.51 96.80% 5.03E-004 96.79% 4.97E-004
15 0.00 90.00 -7.99 7.99 98.73% 5.03E-004 98.72% 4.97E-004
11.26 0.00 90.00 -4.49 4.49 100.00% 5.00E-004
Statistical strength ≈ 1.5E-3
gamma A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 22.50 -67.50 -22.50 67.50 92.97% 1.51E-003 92.96% 1.49E-003
40 0.00 90.00 -42.81 42.81 92.73% 1.51E-003 92.72% 1.50E-003
35 0.00 90.00 -37.46 37.46 93.00% 1.51E-003 92.99% 1.50E-003
30 12.54 -56.55 -12.54 56.55 94.16% 1.51E-003 94.15% 1.50E-003
25 0.00 90.00 -21.93 21.93 95.92% 1.50E-003 95.91% 1.49E-003
20 0.00 90.00 -14.44 14.44 97.94% 1.51E-003 97.93% 1.50E-003
15 0.00 90.00 -8.11 8.11 99.97% 1.51E-003 99.96% 1.50E-003
14.92 0.00 90.00 -8.01 8.01 100.00% 1.50E-003
B.4 Results for pseudo-Bell states using photon detectors
Statistical strength ≈ 0
gamma A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 11.64 -63.88 -11.64 63.88 92.23% 1.96E-009 92.22% 2.57E-014
40 11.08 -62.79 -11.08 62.79 91.31% 4.10E-009 91.30% 7.36E-014
35 9.79 -59.60 -9.79 59.60 91.11% 7.81E-009 91.10% 3.78E-013
30 7.93 -54.42 -7.93 54.42 91.71% 3.60E-009 91.70% 7.68E-014
25 5.73 -47.46 -5.73 47.46 93.05% 1.20E-009 93.04% 2.56E-014
20 3.53 -39.09 -3.53 39.09 94.89% 3.16E-010 94.88% 1.03E-014
15 1.68 -29.76 -1.68 29.76 96.85% 2.71E-010 96.84% 7.47E-015
10 0.54 -19.96 -0.54 19.96 98.53% 3.26E-009 98.52% 3.30E-014
5 0.07 -10.00 -0.07 10.00 99.63% 5.66E-009 99.62% 1.44E-013
Statistical strength ≈ 5E-5
gamma A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 11.62 -63.89 -11.62 63.89 92.69% 5.13E-005 92.68% 4.91E-005
40 11.06 -62.85 -11.06 62.85 91.93% 5.11E-005 91.92% 4.94E-005
35 9.78 -59.69 -9.78 59.69 91.75% 5.12E-005 91.74% 4.96E-005
30 7.92 -54.50 -7.92 54.50 92.37% 5.05E-005 92.36% 4.90E-005
25 5.70 -47.51 -5.70 47.51 93.74% 5.11E-005 93.73% 4.96E-005
20 3.49 -39.10 -3.49 39.10 95.60% 5.07E-005 95.59% 4.92E-005
15 1.67 -29.76 -1.67 29.76 97.57% 5.12E-005 97.56% 4.97E-005
10 0.52 -19.96 -0.52 19.96 99.21% 5.11E-005 99.20% 4.94E-005
6.36 0.00 90.00 -1.40 1.40 100.00% 5.00E-005
Statistical strength ≈ 5E-4
gamma A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 11.58 -63.92 -11.58 63.92 93.67% 5.04E-004 93.66% 4.97E-004
40 11.00 -63.04 -11.00 63.04 93.23% 5.02E-004 93.22% 4.97E-004
35 9.74 -59.93 -9.74 59.93 93.09% 5.04E-004 93.08% 4.99E-004
30 7.88 -54.71 -7.88 54.71 93.74% 5.02E-004 93.73% 4.97E-004
25 5.64 -47.66 -5.64 47.66 95.13% 5.03E-004 95.12% 4.98E-004
20 3.40 -39.19 -3.40 39.19 96.98% 5.02E-004 96.97% 4.97E-004
15 1.58 -29.84 -1.58 29.84 98.85% 5.06E-004 98.84% 5.00E-004
11.64 0.00 90.00 -4.69 4.69 100.00% 5.00E-004
Statistical strength ≈ 1.5E-3
gamma A_1 A_2 B_1 B_2 eta_1 S_1 eta_2 S_2
45 11.52 -63.93 -11.52 63.93 94.71% 1.51E-003 94.70% 1.50E-003
40 10.97 -63.22 -10.97 63.22 94.56% 1.50E-003 94.55% 1.49E-003
35 9.70 -60.23 -9.70 60.23 94.46% 1.51E-003 94.45% 1.50E-003
30 7.83 -55.00 -7.83 55.00 95.12% 1.51E-003 95.11% 1.50E-003
25 5.58 -47.90 -5.58 47.90 96.48% 1.50E-003 96.47% 1.49E-003
20 3.33 -39.45 -3.33 39.45 98.24% 1.50E-003 98.23% 1.49E-003
15.34 1.79 -30.61 -1.79 30.61 100.00% 1.50E-003