1, 2,3,4,5 and hidetoshi nishimori1,6,7

26
Breakdown of the weak-coupling limit in quantum annealing Yuki Bando, 1, * Ka-Wa Yip, 2, 3 Huo Chen, 2, 4 Daniel A. Lidar, 2, 3, 4, 5 and Hidetoshi Nishimori 6, 7, 8 1 Institute of Innovative Research, Tokyo Institute of Technology, Yokohama, Kanagawa 226-8503, Japan 2 Center for Quantum Information Science & Technology, University of Southern California, Los Angeles, California 90089, USA 3 Department of Physics & Astronomy, University of Southern California, Los Angeles, California 90089, USA 4 Department of Electrical & Computer Engineering, University of Southern California, Los Angeles, California 90089, USA 5 Department of Chemistry, University of Southern California, Los Angeles, California 90089, USA 6 International Research Frontiers Initiative, Tokyo Institute of Technology, Yokohama, Kanagawa 226-8503, Japan 7 Graduate School of Information Sciences, Tohoku University, Sendai, Miyagi 980-8579, Japan 8 RIKEN Interdisciplinary Theoretical and Mathematical Sciences (iTHEMS), Wako, Saitama 351-0198, Japan Reverse annealing is a variant of quantum annealing, in which the system is prepared in a classical state, reverse-annealed to an inversion point, and then forward-annealed. We report on reverse annealing experiments using the D-Wave 2000Q device, with a focus on the p =2 p-spin problem, which undergoes a second order quantum phase transition with a gap that closes polynomially in the number of spins. We concentrate on the total and partial success probabilities, the latter being the probabilities of finding each of two degenerate ground states of all spins up or all spins down, the former being their sum. The empirical partial success probabilities exhibit a strong asymmetry between the two degenerate ground states, depending on the initial state of the reverse anneal. To explain these results, we perform open-system simulations using master equations in the limits of weak and strong coupling to the bath. The former, known as the adiabatic master equation (AME), with decoherence in the instantaneous energy eigenbasis, predicts perfect symmetry between the two degenerate ground states, thus failing to agree with the experiment. In contrast, the latter, known as the polaron transformed Redfield equation (PTRE), is in close agreement with experiment. Thus our results present a challenge to the sufficiency of the weak system-bath coupling limit in describing the dynamics of current experimental quantum annealers, at least for reverse annealing on timescales of a μsec or longer. I. INTRODUCTION Quantum annealing is a quantum metaheuristic origi- nally conceived as a method to obtain the global solution of an optimization problem by using quantum fluctua- tions to escape from local minima [13] (see Refs. [48] for reviews). Commercial devices developed by D-Wave Systems that realize programmable quantum annealing in hard- ware have now been available for more than a decade [9]. Not only can these devices be used to solve optimization problems by physically realizing quantum annealing, but they are also able to perform physics experiments, for example, spin-glass phase transitions [10], the Kosterlitz- Thouless phase transition [11, 12], alternating-sector fer- romagnetic chains [13], the Kibble-Zurek mechanism [1416], the Griffiths-McCoy singularity [17], spin ice [18], the Shastry-Sutherland model [19], field theory [20], and spin liquids [21]. In this work we use a D-Wave 2000Q quantum an- nealer as a simulator of a simple spin system: the p-spin model [2224] with p = 2. But rather than using tradi- tional “forward” annealing, wherein the Hamiltonian of a system initialized in the ground state of a transverse * Present address: Arithmer Inc., R&D Headquarters, Terashima- honchonishi, Tokushima-shi, Tokushima 770-0831, Japan field evolves to a longitudinal target Hamiltonian [1], here we focus on reverse annealing, which is defined as the process of choosing an appropriate classical state as the initial state, starting from the target Hamiltonian and mixing it with the transverse field Hamiltonian to return to the original target Hamiltonian [25]. Reverse anneal- ing is conceptually richer than forward annealing, since it introduces at least two additional parameters: the inver- sion point of the anneal and the pause duration. These parameters can be chosen to make forward annealing a special case of (the forward direction of) reverse anneal- ing. Moreover, one can also iterate the process, leading to a strategy known as iterated reverse annealing [12, 26]. We note that there is some evidence that reverse anneal- ing can outperform forward annealing in solving opti- mization problems [2633]. However, our focus in this work is not on algorithmic performance, but rather on using the rich playground provided by the p =2 p-spin model under reverse annealing to answer the question of which of a variety of models best describes the re- sults obtained from the D-Wave annealer. This question has been addressed before many times [1113, 16, 3441], but as we shall show, the p =2 p-spin model under reverse annealing allows us to rather clearly reveal a defi- ciency of one of the most popular and successful models, the weak-coupling adiabatic master equation (AME) [42]. A relatively recent model, with strong system-bath cou- pling, the polaron-transformed Redfield master equation (PTRE) [43, 44], provides a closer match to the empirical arXiv:2111.07560v2 [quant-ph] 10 May 2022

Upload: others

Post on 21-May-2022

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

Breakdown of the weak-coupling limit in quantum annealing

Yuki Bando,1, ∗ Ka-Wa Yip,2, 3 Huo Chen,2, 4 Daniel A. Lidar,2, 3, 4, 5 and Hidetoshi Nishimori6, 7, 8

1Institute of Innovative Research, Tokyo Institute of Technology, Yokohama, Kanagawa 226-8503, Japan2Center for Quantum Information Science & Technology,

University of Southern California, Los Angeles, California 90089, USA3Department of Physics & Astronomy, University of Southern California, Los Angeles, California 90089, USA

4Department of Electrical & Computer Engineering,University of Southern California, Los Angeles, California 90089, USA

5Department of Chemistry, University of Southern California, Los Angeles, California 90089, USA6International Research Frontiers Initiative, Tokyo Institute of Technology, Yokohama, Kanagawa 226-8503, Japan

7Graduate School of Information Sciences, Tohoku University, Sendai, Miyagi 980-8579, Japan8RIKEN Interdisciplinary Theoretical and Mathematical Sciences (iTHEMS), Wako, Saitama 351-0198, Japan

Reverse annealing is a variant of quantum annealing, in which the system is prepared in a classicalstate, reverse-annealed to an inversion point, and then forward-annealed. We report on reverseannealing experiments using the D-Wave 2000Q device, with a focus on the p = 2 p-spin problem,which undergoes a second order quantum phase transition with a gap that closes polynomially inthe number of spins. We concentrate on the total and partial success probabilities, the latter beingthe probabilities of finding each of two degenerate ground states of all spins up or all spins down,the former being their sum. The empirical partial success probabilities exhibit a strong asymmetrybetween the two degenerate ground states, depending on the initial state of the reverse anneal. Toexplain these results, we perform open-system simulations using master equations in the limits ofweak and strong coupling to the bath. The former, known as the adiabatic master equation (AME),with decoherence in the instantaneous energy eigenbasis, predicts perfect symmetry between the twodegenerate ground states, thus failing to agree with the experiment. In contrast, the latter, knownas the polaron transformed Redfield equation (PTRE), is in close agreement with experiment. Thusour results present a challenge to the sufficiency of the weak system-bath coupling limit in describingthe dynamics of current experimental quantum annealers, at least for reverse annealing on timescalesof a µsec or longer.

I. INTRODUCTION

Quantum annealing is a quantum metaheuristic origi-nally conceived as a method to obtain the global solutionof an optimization problem by using quantum fluctua-tions to escape from local minima [1–3] (see Refs. [4–8]for reviews).

Commercial devices developed by D-Wave Systemsthat realize programmable quantum annealing in hard-ware have now been available for more than a decade [9].Not only can these devices be used to solve optimizationproblems by physically realizing quantum annealing, butthey are also able to perform physics experiments, forexample, spin-glass phase transitions [10], the Kosterlitz-Thouless phase transition [11, 12], alternating-sector fer-romagnetic chains [13], the Kibble-Zurek mechanism [14–16], the Griffiths-McCoy singularity [17], spin ice [18], theShastry-Sutherland model [19], field theory [20], and spinliquids [21].

In this work we use a D-Wave 2000Q quantum an-nealer as a simulator of a simple spin system: the p-spinmodel [22–24] with p = 2. But rather than using tradi-tional “forward” annealing, wherein the Hamiltonian ofa system initialized in the ground state of a transverse

∗ Present address: Arithmer Inc., R&D Headquarters, Terashima-honchonishi, Tokushima-shi, Tokushima 770-0831, Japan

field evolves to a longitudinal target Hamiltonian [1], herewe focus on reverse annealing, which is defined as theprocess of choosing an appropriate classical state as theinitial state, starting from the target Hamiltonian andmixing it with the transverse field Hamiltonian to returnto the original target Hamiltonian [25]. Reverse anneal-ing is conceptually richer than forward annealing, since itintroduces at least two additional parameters: the inver-sion point of the anneal and the pause duration. Theseparameters can be chosen to make forward annealing aspecial case of (the forward direction of) reverse anneal-ing. Moreover, one can also iterate the process, leadingto a strategy known as iterated reverse annealing [12, 26].We note that there is some evidence that reverse anneal-ing can outperform forward annealing in solving opti-mization problems [26–33]. However, our focus in thiswork is not on algorithmic performance, but rather onusing the rich playground provided by the p = 2 p-spinmodel under reverse annealing to answer the questionof which of a variety of models best describes the re-sults obtained from the D-Wave annealer. This questionhas been addressed before many times [11–13, 16, 34–41], but as we shall show, the p = 2 p-spin model underreverse annealing allows us to rather clearly reveal a defi-ciency of one of the most popular and successful models,the weak-coupling adiabatic master equation (AME) [42].A relatively recent model, with strong system-bath cou-pling, the polaron-transformed Redfield master equation(PTRE) [43, 44], provides a closer match to the empirical

arX

iv:2

111.

0756

0v2

[qu

ant-

ph]

10

May

202

2

Page 2: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

2

data we present.We remark that the p-spin model has been studied be-

fore in the context of reverse annealing, for p = 3 [45].The choice of p = 3 was motivated by the fact that thismodel has a first order phase transition in the thermo-dynamic limit [22–24] (gap closing exponentially in thesystem size) and thus is a hard problem for conventional(forward) quantum annealing. However, this model doesnot have a direct physical embedding in current quantumannealing hardware, since it involves three-body interac-tions. While gadgets can be used to embed such a modelin physical hardware supporting only two-body interac-tions, this comes at the cost of using up qubits and alsointroduces various errors [46]. Hence the study [45] waspurely numerical. The p-spin model with p = 2, whichundergoes a second-order phase transition (polynomiallyclosing gap), can be directly represented in the hardwaregraph of the D-Wave devices, and we do this here. Thedirect embedding allows us to avoid gadgetization errorsand also gives us access to larger system sizes than thep = 3 case: we experiment with up to 20 fully connected(logical) qubits. Another reason for our choice of p = 2is the existence of ground-state degeneracy, which leadsto additional information and the crucial ability to dis-tinguish the AME from the other models, as will be de-scribed below.

The structure of this paper is as follows. We firstdefine the p-spin model with p = 2 in Sec. II and de-scribe the reverse annealing protocol we employed in thisstudy. In Sec. III we present and discuss the empiri-cal results from the D-Wave 2000Q device, for a varietyof different parameter settings, including initial condi-tions, annealing time, pause duration, problem size, andthe effect of iteration. This is followed in Sec. IV byan examination of numerical results based first on theclosed-system Schrodinger equation, followed by quan-tum adiabatic master equation with both independentand collective system-bath coupling, and the PTRE. Inall cases we compare and contrast the predictions of thevarious models to the empirical D-Wave data. Conclu-sions are presented in Sec. V, and appendixes providefurther technical details, as well as additional results.

II. PROBLEM DEFINITION AND REVERSEANNEALING PROTOCOL

In this section we describe the problem Hamiltonianand the reverse annealing protocol we used both in ourexperiments on the D-Wave device and in numerical sim-ulations.

A. The p-spin model with p = 2

We consider a quantum annealing Hamiltonian com-prising a driver Hamiltonian HD and a target Hamilto-nian HT , the latter encoding the combinatorial optimiza-

tion problem represented as an Ising model, as a functionof dimensionless time 0 ≤ s(t) ≤ 1:

H(s) =A(s)

2HD +

B(s)

2HT . (1)

Here, A(s) and B(s) are device-dependent annealingschedules. The D-Wave device used in the present exper-iment has A(s) and B(s) as shown in Fig. 1(a). We workin units where ~ = 1 and kB = 1 except in Figs. 1(a),6, 15, and 18, where we opted to use units where h = 1in accordance with the conventions of the D-Wave devicedocumentation [47].

(a) (b)

0.0

1.0

s

t [μs]

FA

RA+Pause

tinv+tp

sinv

tinv 2tinv+tp 0

2

4

6

8

10

12

14

0 0.2 0.4 0.6 0.8 1

Ener

gy

[GH

z]s

A(s)B(s)

FIG. 1. (a) Annealing schedule of D-Wave 2000Q with the“DW 2000Q 6” solver. (b) Forward (FA) and reverse anneal-ing (RA) protocols as defined in Eqs. (4) and (5), respec-tively. The latter incorporates a pause of duration tp. Hereta = 1.2 µs for forward annealing, and τ = 1 µs and sinv = 0.2for reverse annealing.

The final values of the schedule functions are A(s =1) ≈ 0 and B(s = 1) > 0 so that ideally the ground stateof HT is realized as the final state. The time depen-dence of s(t), in particular forward or reverse annealingas depicted in Fig. 1(b), corresponds to different variantsof the general quantum annealing algorithm. The driverHamiltonian HD is usually chosen as

HD = −N∑i=1

σxi (2)

where N is the number of qubits and σxi is the x compo-nent of the Pauli matrix acting on the ith qubit.

We study the p-spin model,

HT = −N

(1

N

N∑i=1

σzi

)p(3)

with p = 2. This Hamiltonian couples every qubit to ev-ery other qubit, which exceeds the “Chimera” graph con-nectivity of the D-Wave 2000Q device. Thus the qubits inHT should be viewed as logical qubits, to be representedin the actual device by physical qubits. These physicalqubits are coupled in ferromagnetic chains to form logi-

Page 3: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

3

cal qubits, a procedure known as minor embedding [48] 1.We chose the ferromagnetic interaction between physicalqubits in a logical qubit to be JF = −1.0, as discussed inApp. A.

B. Reverse annealing protocol

The traditional protocol of forward annealing has

s(t) =t

tat ∈ [0, ta], (4)

where ta is the total annealing time; see the solid red linein Fig. 1(b). The initial state of forward annealing is theground state of the driver Hamiltonian HD.

In reverse annealing, we start from s = 1 and decreases to an intermediate value sinv, pause, then increase s tofinish the process at s = 1. The explicit time dependences(t) realized on the D-Wave device is

s(t) =

1− t

τ0 ≤ t ≤ tinv

1− tinv

τ= sinv tinv ≤ t ≤ tinv + tp

2sinv − 1− tpτ

+t

τtinv + tp < t ≤ 2tinv + tp,

(5)where 1/τ is the annealing rate, tinv is the inversion timedefined as tinv = τ(1−sinv), and tp is the duration of theintermediate pause. Our reverse annealing begins froms(t = 0) = 1 and ends at s(t = ta) = 1, where

ta = 2tinv + tp = 2τ(1− sinv) + tp, (6)

passing through the inversion point sinv; see the bluedashed line in Fig. 1(b). To distinguish between τ andta, we henceforth refer to the former as the “annealingtime” and the latter as the “total annealing time”. Theinitial state of reverse annealing is a classical state, usu-ally a candidate solution to the combinatorial optimiza-tion problem, i.e, a state that is supposed to be close tothe solution.

We can iteratively repeat the process of reverse anneal-ing with the final state of a cycle as the initial conditionof the next cycle. This protocol is called iterated reverseannealing [26] and we denote the iteration number by r.Additional details are provided in App. B.

III. EMPIRICAL RESULTS

In this section we report the results of performing re-verse annealing experiments for the p = 2 p-spin modelon the D-Wave 2000Q device.

1 Embedding a fully-connected graph was implemented in the stan-dard way; see, e.g., Fig. 1 of Ref. [49].

(a)

(b)

0.0

0.2

0.4

0.6

0.8

1.0

0.0 0.2 0.4 0.6 0.8 1.0

Succ

ess

Pro

bab

ilit

y

sinv

m0 = 0.9

m0 = 0.0

m0 = 0.8

N = 20τ = 5 [μs]

tp = 0 [μs]

0.0

0.2

0.4

0.6

0.8

1.0

0.0 0.2 0.4 0.6 0.8 1.0

Su

cces

s P

rob

abil

ity

sinv

updownm0 = 0.0

N = 20τ = 5 [μs]

tp = 0 [μs]

updownm0 = 0.8

updownm0 = 0.9

FIG. 2. Empirical success probabilities for different initialconditions m0 (magnetization) in reverse annealing on the D-Wave 2000Q device as a function of the inversion point sinv.The dashed line is the minimum-gap point at s∆ ≈ 0.36 forN = 20. When sinv < s∆ the reverse direction of the an-neal goes through and past the minimum gap point, and thencrosses it again during the forward anneal. When sinv > s∆

there is no crossing of the minimum gap point. (a) Partialsuccess probabilities for m0 = 0, 0.8, and 0.9. (b) Total suc-cess probabilities for the same set of m0 as in (a). Here andin all subsequent figures the labels “up” and “down” meanthe populations of the all-up state and the all-down state, re-spectively; sinv was incremented by steps of 0.02, and errorbars denote one standard deviation; see App. B for details onthe calculation of error bars.

A. Dependence on initial conditions

Figure 2 shows the empirical success probability, i.e.,the probability that the final state observed is one of thetwo degenerate ground states. This is shown for N = 20and no pause. The initial condition is specified by theinitial value of the normalized total magnetization,

m0 =1

N

N∑i=1

〈ψ0|σzi |ψ0〉, (7)

where |ψ0〉 denotes the initial wave function (a classicalstate). Since the doubly degenerate ground state of theproblem Hamiltonian HT has ±1 as the normalized mag-netization, a value of m0 closer to 1 (or −1) representsan initial condition exhibiting higher overlap with the

Page 4: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

4

ground state. In Fig. 2 we present results for a completelyunbiased initial condition (m0 = 0) and initial condi-tions strongly biased toward the all-up state σzi = 1 ∀i(m0 = 0.8, 0.9). The state with m0 = 0 is the highestexcited state, and the state with m0 = 0.9 is the firstexcited state.

1. m0 = 0

Figure 2(a) shows the probability that the systemreaches the all-up state (denoted “up”) and the all-downstate (〈σzi 〉 = −1, ∀i, denoted “down”). We call thesethe partial success probabilities.

When the initial condition is m0 = 0, the up and downprobabilities are equal to within the error bars for allsinv, as expected because the initial state is unbiased.Note that both success probabilities are close to 0.5 forsinv < 0.4 whereas they are zero for sinv > 0.5. The verti-cal dashed line at sinv ≈ 0.36 indicates the minimum-gappoint, i.e., the point where the energy gap between theground and the second excited states becomes minimum,which we denote by s∆ (the first excited state becomesdegenerate with the ground state at the end of the an-neal, hence it is the second excited state that is relevant).It is likely that the system remains close to its initialstate for sinv > 0.5, where the transverse field and theassociated quantum fluctuations are small, and the truefinal ground state with m0 = ±1 is hard to reach, sincethe initial state with m0 = 0 has small overlap with thelatter. The situation is different when sinv < 0.4, i.e.,when the system traverses the minimum-gap region. Inthis case the system enters the paramagnetic (quantum-disordered) phase dominated by the driver HD, so thatthe initial condition is effectively erased. This rendersthe process similar to conventional forward quantum an-nealing, by which the two true ground states are reachedwith nearly equal probability 0.5. Recall that the phasetransition around s∆ is of second order in the presentproblem, and therefore the system finds (one of) the truefinal ground states relatively easily by forward annealingbecause of the mild, polynomial, closing of the energygap, as we discuss in more detail later [see Fig. 6].

2. m0 = 0.8, 0.9: up/down symmetry breaking for0.3 . sinv . 0.5

For the initial states with m0 = 0.8 and m0 = 0.9,the experimental results shown in Fig. 2(a) reveal sig-nificant differences between the probabilities of the finalall-up and all-down states. Of course, the initial con-ditions m0 = 0.8, 0.9 have much larger overlap with theall-up state, and this fact alone suggests a mechanism forbreaking the symmetry between the probabilities of thefinal state being the all-up or all-down state. However, itis remarkable that the success probability for the all-upstate is almost 1 in a region 0.3 . sinv . 0.5 both to the

left and to the right of s∆ ≈ 0.36 (we discuss the addi-tional asymmetry for sinv . 0.3 below). As we explainin Sec. IV C, this behavior is inconsistent with a modelof an open quantum system that is weakly coupled toits environment and is thus described by the adiabaticmaster equation [42]. However, it is consistent with botha quantum model of a system that is strongly coupledto its environment, as explained in Sec. IV D, and witha simple semiclassical model captured by the spin-vectorMonte Carlo algorithm [39, 50], as explained in App. F.

3. sinv & 0.5 > s∆: freezing

Note that for all three initial conditions the successprobability eventually vanishes for sinv & 0.5 > s∆. Thereason is that in all cases the system is initialized in an ex-cited state, and remains in an excited state at the end ofthe anneal, since there is no mechanism for thermal relax-ation to a lower energy state when sinv is large. This is amanifestation of the phenomenon of freezing [37, 42, 51],i.e., the extreme slowdown of relaxation due to the factthat the system-bath interaction (nearly) commutes withthe system Hamiltonian when the transverse field is verysmall. In addition, the annealing timescale is manifestlytoo slow for downward diabatic transitions. This is trueeven with the discontinuity in the derivative of the sched-ule depicted in Fig. 1(b). Despite this, the reversal of theanneal direction is apparently too slow in practice to havea non-adiabatic effect, or if diabatic transitions do occur,then they exclusively populate higher excited states.

4. sinv . 0.3 < s∆: spin-bath polarization

It is also noteworthy from Fig. 2(a) that the initialcondition results in different probabilities for the all-up and all-down states even in the paramagnetic regionsinv . 0.3 < s∆, where quantum fluctuations are largeand the all-up and all-down states are expected to havethe same probability in equilibrium. The system “re-members” the initial condition to a certain extent, ananomaly that may be attributable to spin-bath polar-ization [52, 53]. Namely, the persistent current flowingin the qubit body during the anneal produces a mag-netic field that can partially align or polarize an ensem-ble of environmental spins local to the qubit wiring, witha much slower relaxation time than the anneal duration.Given the polarized initial condition m0 = 0.8 or 0.9,this polarized spin-bath will be aligned with the all-upstate even after the system crosses into the paramagneticphase, thus preventing the system from equilibrating andexplaining the observed memory effect. Spin-bath polar-ization is expected to be particularly pronounced underreverse annealing, since the strong polarization of the ini-tial state will act to polarize the spin bath, more so thanin forward annealing, where the initial state is unpolar-ized.

Page 5: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

5

0.0

0.2

0.4

0.6

0.8

1.0

0.0 0.2 0.4 0.6 0.8 1.0

Succ

ess

Pro

bab

ilit

y

sinv

τ = 5 [μs]

N = 20m0 = 0.9

tp = 0 [μs]

τ = 10 [μs]

τ = 20 [μs]

0.0

0.2

0.4

0.6

0.8

1.0

0.0 0.2 0.4 0.6 0.8 1.0

Succ

ess

Pro

bab

ilit

y

sinv

N = 20m0 = 0.9

tp = 0 [μs]

updown

updown

updownτ = 5 [μs]

τ = 10 [μs]

τ = 20 [μs]

(a)

(b)

FIG. 3. Empirical success probabilities for different annealingtimes τ in reverse annealing on the D-Wave 2000Q device as afunction of sinv. (a) Partial success probabilities for τ = 5, 10,and 20µs. (b) Total success probabilities for the same set ofτ as in (a). The dashed line is the minimum-gap point atN = 20, s∆ ≈ 0.36.

5. Total success probability

Figure 2(b) shows the total success probability, i.e., thesum of the final up and down probabilities. The result-ing curve stays almost flat and close to 1 for sinv < 0.4.This constant 1 is a mixture of the two effects mani-fest in Fig. 2(a), the genuine effects of reverse anneal-ing around sinv ≈ 0.4 for m0 = 0.9 and 0.8, and theeffectively-forward-annealing-like behavior for m0 = 0.Indeed, the case with sinv = 0 is essentially equivalentto standard, forward annealing: the inversion point isset at s = 0, so that the anneal restarts from a Hamil-tonian that is completely dominated by the transversefield. That the empirical total success probability is 1 inthis case shows that the p = 2 problem is easy also forforward annealing, in contrast to the p = 3 case studiedin Ref. [45], which numerically found very small valuesof success probability near sinv = 0. This may be ex-plained by the very fast forward annealing in this parame-ter region in Ref. [45], which keeps the system almost un-changed from the quantum-disordered state at s = sinv.Aside from this subtlety, our experimental data are con-sistent with the numerical results of Ref. [45], supportingthe latter’s expectation that the system-environment in-teraction (through spin-boson coupling) prompts relax-ation of the system toward the ground state around theminimum-gap region.

(a)

(b)

0.0

0.2

0.4

0.6

0.8

1.0

0.0 0.2 0.4 0.6 0.8 1.0

Succ

ess

Pro

bab

ilit

y

sinv

tp = 0 [μs]

tp = 2 [μs]

tp = 10 [μs]

N = 20τ = 5 [μs]

m0 = 0.9

0.0

0.2

0.4

0.6

0.8

1.0

0.0 0.2 0.4 0.6 0.8 1.0

Succ

ess

Pro

bab

ilit

y

sinv

N = 20τ = 5 [μs]

m0 = 0.9

updowntp = 10 [μs]

updowntp = 2 [μs]

updowntp = 0 [μs]

FIG. 4. Empirical success probabilities for different pausingtimes tp in reverse annealing as a function of sinv. (a) Partialsuccess probabilities for pause tp = 0, 2, and 10 µs. (b) Totalsuccess probabilities for the same set of tp as in (a). Thedashed line is the minimum-gap point at N = 20, s∆ ≈ 0.36.

We shall see in Sec. IV C 2 that all the salient featuresseen in Fig. 2(b), such as the shift to the left with decreas-ing m0, are captured by our open system simulations.

B. Dependence on annealing time

As seen from Eq. (6), the total annealing time ta islinearly dependent on the annealing time τ , and propor-tional to τ when tp = 0. Figure 3 shows the successprobability for different τ with N = 20, m0 = 0.9, andno pausing. The overall trend is similar to Fig. 2.

As shown in Fig. 3(a), an increase of τ leads to slightlyhigher all-up success probabilities for sinv > s∆ but theother way around for sinv < s∆. This can be explainedin terms of an increased relaxation to the ferromagneticground state near the minimum energy gap for larger τand an enhanced relaxation to the paramagnetic groundstate for sinv < s∆. As seen in Fig. 3(b), the total successprobability stays close to 1 for sinv < s∆ and benefitsslightly from increasing τ for sinv > s∆.

C. Effects of pausing

Figure 4 shows success probabilities for pausing timetp ∈ {0, 2, 10} µs as a function of sinv with N = 20,m0 = 0.9 and τ = 5 µs.

Page 6: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

6

0.0

0.2

0.4

0.6

0.8

1.0

0.0 0.2 0.4 0.6 0.8 1.0

Succ

ess

Pro

bab

ilit

y

sinv

N=4N=8

N=12

N=16N=20N=24

τ = 5 [μs] tp = 0 [μs]

0.0

0.2

0.4

0.6

0.8

1.0

0.0 0.2 0.4 0.6 0.8 1.0

Succ

ess

Pro

bab

ilit

y

sinv

N = 4updownupdown

updownupdownN = 16

N = 8

N = 12

updownN = 20updownN = 24

τ = 5 [μs] tp = 0 [μs](a)

(b)

FIG. 5. Empirical success probabilities for different systemsizes N in reverse annealing as a function of sinv. The initialstate for each N is a state with one spin flipped. (a) Partialsuccess probabilities for system size N ∈ {4, 8, 12, 16, 20, 24}.(b) Total success probabilities for the same set of N as in (a).

0 5 10 15 20 250.35

0.4

0.45

0.5

0.55

0.6

0.65

0.7

5 10 15 20

2

2.5

3

3.5

FIG. 6. The position s∆ of the minimum gap ∆ as a functionof system size N , along with the value s0.5 of sinv at whichthe total success probability = 0.5 for each N . The jumpat N = 16 is due to the non-monotonicity seen for the upprobabilities in Fig. 5(a). Inset: the minimum gap for each

N . The gap obeys a power law scaling N−11/30.

The overall trend is similar to Fig. 3, i.e., a longer pauseleads to an increased relaxation, but the effects are moresignificant in the present case. Pausing at 0.5 < sinv <0.6 greatly improves the success probability from nearly0 (tp = 0 µs) to nearly 1 (tp = 2 µs and 10 µs), implyingthat pausing in a relatively narrow region slightly pastthe minimum gap point is most effective. This is in line

with previous findings [31, 50, 54].

Another significant difference between pausing and notpausing is that the partial success probabilities shown inFig. 4(a) are nearly piecewise flat for tp > 0, showingthat the effect responsible for the anomaly disappearsunder pausing. This is consistent with the spin-bath po-larization effect discussed in Sec. III A, in that the pauseprovides the time needed for this polarization to relax toequilibrium, on a timescale of a few µs.

D. Size dependence

Figures 5(a) and 5(b), respectively, show the partialand total success probability for different system sizes Nas a function of sinv with τ = 5 µs. The initial state foreach N is the first excited state with m0 < 1.0 (closest tothe ground state for which m0 = 1.0), i.e., a state withone spin flipped. We observe a statistically significantslight non-monotonicity with N in the region sinv ≥ s∆

of the up success probabilities [Fig. 5(a)] and the totalsuccess probabilities [Fig. 5(b)], namely, the ordering ofthe shift to the left with N is: {4, 8, 16, 12, 20, 24}; we donot have an explanation for this effect, though it couldplausibly be an anomaly related to the minor embeddingprocedure. However, the down success probabilities aremonotonic in N , and the overall trend is clear, namely,the drop-off to zero success probability occurs at smallersinv as N is increased.2 This is consistent with the reduc-tion in the position of the minimum gap, s∆, as a func-tion of N , shown in Fig. 6, which is tracked by the valueof sinv at which the total success probability equals 0.5.This suggests that the success probabilities (both partialand total) are sensitive to the location of the minimumquantum gap. See App. C for more details about thespectrum of the p-spin problem.

Additionally, the asymmetry between the all-up andall-down states for sinv < s∆ is enhanced with increas-ing N . This is consistent with the formation of largerdomains of spin-bath polarization, which would be ex-pected to take longer to dissipate due to their size as Nincreases.

Finally, the overlap of data for N = 20 and 24 suggeststhat large-N effects have already converged at these sizes,i.e., that N ∼ 20 is sufficiently large to infer the behaviorat larger N . Also, even the smallest systems with N =4 and 8 already share qualitative features with largersystems.

2 We have confirmed that this tendency persists for larger systemsfrom preliminary data for up to N = 48. We did not collect datasystematically for larger systems because of technical difficulties.

Page 7: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

7

(a)

0.0

0.2

0.4

0.6

0.8

1.0

0.0 0.2 0.4 0.6 0.8 1.0

Su

cces

s P

rob

abil

ity

sinv

N = 20 τ = 5 [μs]

tp = 0 [μs]m0 = 0.9

updown

updownupdownupdown

r = 1

r = 10

r = 25

r = 50

0.0

0.2

0.4

0.6

0.8

1.0

0.0 0.2 0.4 0.6 0.8 1.0

Succ

ess

Pro

bab

ilit

y

sinv

N = 20τ = 5 [μs]

tp = 0 [μs]

m0 = 0.9

r = 50r = 25

r = 10r = 1

(b)

0.0

0.2

0.4

0.6

0.8

1.0

0.0 0.2 0.4 0.6 0.8 1.0

Succ

ess

Pro

bab

ilit

y

sinv

N = 20 τ = 5 [μs]

tp = 0 [μs]m0 = 0.0

updown

updownupdownupdown

r = 1

r = 10

r = 25

r = 50

(d)(c)

0.0

0.2

0.4

0.6

0.8

1.0

0.0 0.2 0.4 0.6 0.8 1.0

Su

cces

s P

robab

ilit

y

sinv

N = 20τ = 5 [μs]

tp = 0 [μs]

m0 = 0.0

r = 50r = 25

r = 10r = 1

FIG. 7. Empirical success probabilities for different numberof iterations r in iterated reverse annealing as a function ofsinv. Panels (a) and (b) are the partial and total successprobabilities with the initial state m0 = 0.9 (the first excitedstate), respectively. Panels (c) and (d) are the partial andtotal success probabilities with the initial state m0 = 0 (thehighest excited state). The dashed line is the minimum-gappoint s∆ ≈ 0.36 for N = 20.

E. Effects of iteration

We next study the effects of iteration on reverse an-nealing, i.e., how the success probability depends on thenumber of iterations r. For this purpose, we used thereinitialize state=False setting of the D-Wave de-vice, meaning that the output state of the previous iter-ation was the initial state of the next iteration. Figure 7shows the results for r ∈ {1, 10, 25, 50} as a function ofsinv, with N = 20, m0 = 0.9, τ = 5 µs, and no pausing.The success probabilities improve in the region sinv ≥ s∆

as the number of iterations r increases, regardless of theinitial state m0. We expect the relaxation to the low-energy state due to coupling to the environment to be in-duced by successive iterations, and the occupation of theground states to increase correspondingly. The resultsin Fig. 7 confirm this expectation, in that the successprobability is larger at given sinv ≥ s∆ as r is increased.

Comparing Figs. 7(a) and (b) with m0 = 0.9 andFigs. 7(c) and (d) with m0 = 0, we can see that theinitial state with m0 = 0.9 has a larger improvement insuccess probability with fewer iterations r in the regionwhere sinv ≥ s∆. The initial state with m0 = 0 deviatesgreatly from the ground states, and there are many ex-cited states between it and the latter. Therefore, whenthe system transitions from the initial state of m0 = 0to other states by relaxation due to coupling to the envi-ronment, those excited states become populated. Manyiterations are expected to be necessary to obtain a sig-nificant improvement in the success probability. On theother hand, there are only a few excited states between

the initial state m0 . 1.0 and the all-up ground state.Therefore, it is expected that only a few iterations willbe sufficient to make the transition to this ground state.Our results support this picture of improved performanceunder iterated reverse annealing.

IV. NUMERICAL RESULTS

We next present closed and open-systems simulationsand compare the results with the data presented in theprevious section. We choose relatively small system sizesN = 4 and 8 to facilitate numerical computations. Be-cause our experimental data do not show a strong sizedependence, as discussed in Sec. III D, the qualitativecomparison our numerical results enable should sufficeto draw relevant physical conclusions.

In all our numerical studies we simulate the logicalproblem directly, i.e., we do not simulate the embeddedproblem which replaces every logical spin by a ferromag-netic chain, as in our D-Wave experiments.

For simplicity, we focus on reverse annealing withoutpausing (i.e., tp = 0) in this section. We expect that ingeneral pausing will lead to an overall success probabilityincrease in open-system simulations.

A. Closed system model

An analysis of the closed system case, while being un-realistic due to the absence of thermal effects, is instru-mental in isolating the effect and importance of diabatictransitions in explaining our experimental results.

The time evolution of reverse annealing in a closedsystem is described as follows. The state after a singleiteration (cycle) is

|ψ(2tinv)〉 = U(2tinv, 0) |ψ(0)〉 , (8)

where |ψ(0)〉 is the initial state and

U(2tinv, 0) = T exp

[−i∫ 2tinv

0

H(t′)dt′]

(9)

is the unitary time-evolution operator and T denotes for-ward time ordering. At the end of r cycles, the final state|ψ(2rtinv)〉 can be expressed as:

|ψ(2rtinv)〉 = U(2rtinv, 0) |ψ(0)〉 , (10)

with

U(2rtinv, 0) =

r−1∏i=0

U(2(i+ 1)tinv, 2itinv) . (11)

Here, the final state of the rth cycle is the initial stateof the next cycle. This condition is shared by the exper-iments in the previous section. The solution states are

Page 8: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

8

0 0.2 0.4 0.6 0.8 10

0.02

0.04

0.06

0.08

0.1

0.12

0.14

(a)

0 0.2 0.4 0.6 0.8 10

0.01

0.02

0.03

0.04

0.05

0.06

0.07

(b)

FIG. 8. Total success probabilities as computed by solutionof the Schrodinger equation for a closed system with r = 1(single cycle) and a single-spin down as the initial state. (a)N = 4, (b) N = 8. Here and in the subsequent figures thenanoscecond timescale is set by the energy scale of the D-Wave annealing schedule shown in Fig. 1. Note the differentscale of the vertical axes.

doubly degenerate, and the total success probability atthe end of r cycles is

p(r) = | 〈ψ(2rtinv)|up〉 |2 + | 〈ψ(2rtinv)|down〉 |2 , (12)

where |up〉 = |↑〉⊗N and |down〉 = |↓〉⊗N . For highercomputational efficiency without loss of accuracy, we ro-tate the state vector into the instantaneous energy eigen-basis representation at each time step in our numericalsimulations.

1. Dependence on system size and annealing time

We initialize the state to have a single spin down, i.e.,as the computational basis state |0001〉 [|0〉 ≡ |↑〉, |1〉 ≡|↓〉 , (m0 = 0.5)] for N = 4 and |0000001〉 (m0 = 0.75)for N = 8, respectively. Note that in our simulationsthese are not exact eigenstates of H(1), due to a verysmall residual transverse field at s = 1, as in the D-Waveannealing schedule shown in Fig. 1.3 When the initialstate |ψ(0)〉 is a computational basis state, there is a sim-ple upper bound on the success probability achievable inclosed system reverse annealing: the population of theinitial state in the maximum-spin sector (see App. D).This upper bound explains why the following closed sys-tem results have relatively low success probabilities.

We plot in Fig. 8 the simulation results for the totalsuccess probability for various annealing rates, subjectto the D-Wave annealing schedule shown in Fig. 1. Wesee that the success probability after a single cycle isnon-negligible only when τ is small enough (τ < 1 ns),in which case diabatic transitions to states with lowerenergies take place when sinv < s∆. However, the success

3 A(1) = 1.9 × 10−6 GHz, B(1) = 11.97718 GHz, A(1)/B(1) =1.58635004 × 10−8.

0 0.2 0.4 0.6 0.8 10

0.02

0.04

0.06

0.08

0.1

0.12

(a)

0 0.2 0.4 0.6 0.8 10

0.002

0.004

0.006

0.008

0.01

0.012

(b)

0 0.2 0.4 0.6 0.8 10

0.005

0.01

0.015

0.02

0.025

0.03

0.035

(c)

0 0.2 0.4 0.6 0.8 10

0.002

0.004

0.006

0.008

0.01

0.012

0.014

(d)

FIG. 9. Total success probabilities as computed by solution ofthe Schrodinger equation for a closed system with two spinsdown as the initial state. (a) N = 4 and r = 1, (b) N = 8and r = 1, (c) N = 4 and r = 2, (d) N = 8 and r = 2. Notethe different scale of the vertical axes.

probability remains small even in this case, and in anycase much smaller than in our experimental results wherethermal relaxation plays the dominant role.

2. Dependence on initial state and number of iterations

We next choose the initial state |ψ(0)〉 with two spinsdown |0011〉 (m0 = 0) for N = 4 and |00000011〉 (m0 =0.5) for N = 8, respectively, i.e., the second excited statesof HT for these respective system sizes.

Figure 9 (top row) shows that as expected, for N = 4and r = 1, the success probability is overall lower thanthe case when the initial state is the first excited state,Fig. 8. Only for the highest annealing rate (smallest τ ,most diabatic) is the maximum success probability sim-ilar to the case of the first excited state. For N = 8,the success probability is much smaller, no matter howdiabatic the process is. This indirectly confirms the dom-inant role played by thermal effects in our experimentalresults.

We plot in Fig. 9 (bottom row) the results for r = 2 cy-cles, where we see that the success probability decreasescompared to the r = 1 case. This is consistent with theconclusions of Ref. [26].

Recalling the experimental results reported in the pre-vious section, we may therefore safely conclude that, asexpected, the closed-system picture is far from the ex-perimental reality of the D-Wave device.

Page 9: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

9

B. Open system model: setup

For an open system, the state after the first cycle is

ρ(2tinv) = V (2tinv, 0)ρ(0) , (13)

where

V (t, 0) = T exp

[∫ t

0

L(t′)dt′]. (14)

L(t) is the time-dependent Liouville superoperator,which generates a master equation of the form

dρ(t)

dt= L(t)ρ(t) (15a)

= i[ρ(t), H(t) +HLS(t)

]+D

[ρ(t)

]. (15b)

Here, HLS(t) is the Lamb shift term and D is the dissi-pator. Again, the time dependence of the integrand inEq. (14) is incorporated into s(t).

At the end of r cyles, the final state ρ(2rtinv) is ex-pressed as:

ρ(2rtinv) = V (2rtinv, 0)ρ(0) , (16)

with

V (2rtinv, 0) =

r−1∏i=0

V (2(i+ 1)tinv, 2itinv) . (17)

We next consider both the weak and the strong cou-pling cases, which give rise to different Liouville super-operators.

C. Weak coupling: adiabatic master equation

In this subsection we use the adiabatic master equation(AME), which holds under weak coupling to the environ-ment [42]. In this limit D can be expressed in a diagonalform with Lindblad operators Li,ω(t):

D[ρ(t)] =∑i

∑ω

γi(ω)

(Li,ω(t)ρ(t)L†i,ω(t)

−1

2

{L†i,ω(t)Li,ω(t), ρ(t)

}), (18)

where the summation runs over the qubit index i and theBohr frequencies ω [all possible differences of the time-dependent eigenvalues of H(t)]. This dissipator expressesdecoherence in the energy eigenbasis [55]: quantumjumps occur only between the eigenstates of H(t) [56].

We consider two different models of system-bath cou-pling: independent and collective dephasing [57]. In thefirst case, we assume that the qubit system is coupled toindependent, identical bosonic baths, with the bath and

interaction Hamiltonians being

HB =

N∑i=1

∞∑k=1

ωkb†k,ibk,i , (19a)

H indSB = g

N∑i=1

σzi ⊗∑k

(b†k,i + bk,i

), (19b)

where b†k,i and bk,i are, respectively, the raising and low-ering operators for the kth oscillator mode with naturalfrequency ωk. The rates appearing in Eq. (18) are givenby

γi(ω) = 2πηg2ωe−|ω|/ωc

1− e−βω, (20)

arising from an Ohmic spectral density, and satisfyingthe Kubo-Martin Schwinger (KMS) condition [58, 59]γi(−ω) = e−βωγi(ω), with β = 1/T the inverse tem-perature. We use ηg2 = 10−3, the cutoff frequencyωc = 1 THz, and the D-Wave device operating temper-ature T = 12.1mK = 1.57 GHz. With the independentdephasing assumption, the Lindblad operators are:

Li,ω(t) =∑

εb−εa=ω

|εa(t)〉〈εa(t)|σzi |εb(t)〉〈εb(t)| , (21)

corresponding to dephasing in the instantaneous eigen-basis {|εa(t)〉} of H(t). Similarly to the closed systemcase, we rotate the density matrix and Lindblad oper-ators into the instantaneous energy eigenbasis at eachtime step in our numerical simulations. This keeps thematrices sparse, without loss of accuracy. For N = 8 wetruncate the system size to the lowest n = 18 levels outof 256 [18 = 2(1 + 8): the total number of degenerateground and first excited states at s = 1].

We also consider the collective dephasing model, whereall the qubits are coupled to a collective bath with thesame coupling strength g, which preserves the spin sym-metry. In this case, the interaction Hamiltonian becomes

HcolSB = gSz ⊗B, (22)

where

Sz =∑i

σzi , B =∑k

(b†k + bk

). (23)

With this assumption, we can group together the Lind-blad operators corresponding to different qubits i into asingle one:

Lω(t) =∑

εb−εa=ω

|εa(t)〉〈εa(t)|Sz|εb(t)〉〈εb(t)| . (24)

The resulting number of Lindblad operators is a factorof N smaller than that of the independent system-bathcoupling model.

The total success probability at the end of the r cycles

Page 10: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

10

0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

(a)

0 0.2 0.4 0.6 0.8 10

0.05

0.1

0.15

0.2

0.25

(b)

0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

(c)

0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

(d)

FIG. 10. Top row: Total success probabilities as computedby the AME as a function of sinv with (a) independent and(b) collective dephasing (SB denotes system-bath coupling).Initial state: |0001〉 (m0 = 0.5) and r = 1, a single cycle.Bottom row: Total success probabilities as computed by theAME with different τ . Initial state: |0011〉 (m0 = 0). (c)r = 1, (d) r = 2. Note the different scale of the vertical axisof panel (b).

is

p(r) = 〈up| ρ(2rtinv) |up〉+ 〈down| ρ(2rtinv) |down〉 .(25)

Any relaxation during the reverse annealing dynamicsto the global instantaneous ground state or the instan-taneous first excited state of H(t) is beneficial, the lat-ter since it becomes degenerate with the ground states{|up〉 , |down〉} of HT at the end of the anneal (seeApp. C).

1. Dependence on the annealing time.

Figure 10 (top row) shows the AME simulation resultsfor the success probability as a function of sinv with N =4, various τ , and the initial state |0001〉 (m0 = 0.5),using the independent and collective dephasing models.

The independent dephasing model leads to a maximumsuccess probability of 1 for large τ and small sinv. How-ever, the collective dephasing model leads to a maximumsuccess probability of 1/4. The reason is the same asin the closed system simulations. For the initial state|0001〉, only 1/4 of its population is in the subspace ofmaximum quantum spin number. However, the global(instantaneous) ground state and the first excited stateof the annealing Hamiltonian H(s) both belong to the

maximum-spin subspace. Since the collective dephasingmodel preserves the spin symmetry and the dynamics isrestricted to each subspace, at most 1/4 of the initialpopulation can be relaxed to the instantaneous groundstate and the first excited state, and reach the correctsolutions at the end of the anneal (see App. D for moredetails). It is also noteworthy that coherent oscillationsare visible for τ = 1 ns [compare with Fig. 8(a)], but notfor the larger values of τ we have simulated. Recall thatthe experimental timescale in Sec. III is on the order ofa µs.

Comparing the simulation results in Fig. 10 and the ex-perimental results in Fig. 3(b), we observe that they sharethe same main features. Namely, the success probabilityincreases with τ ; and, as τ increases the maximum suc-cess probability can be achieved with a larger inversionpoint sinv. Most notably, the total success probabilityalso drops to zero for sufficiently large sinv in our simula-tions. This is the freeze-out effect that is well captured bythe adiabatic master equation (as reported in Ref. [42]),since thermal relaxation is suppressed when the trans-verse field magnitude becomes so small that the systemand system-bath Hamiltonians effectively commute.

Note that the experimental results in Sec. III have amaximum success probability as high as 1, while ourclosed system simulations and open-system simulationswith the collective dephasing model have success proba-bilities upper bounded by some constants < 1, as alreadydiscussed. The high total success probability observed inour experiments is evidence that, as expected, the dy-namics in the D-Wave device do not preserve spin sym-metry. Collective system-bath coupling cannot explainthe experimental results, leaving independent system-bath coupling as the only candidate consistent with theexperiments according to our simulations.

2. Dependence on initial condition and the number ofcycles.

For the initial state |0011〉 (m0 = 0) and with r = 1,the independent dephasing model still gives a maximumsuccess probability of 1 as seen in Fig. 10(c). The col-lective dephasing model (not shown) has a maximumsuccess probability of 1/6, since the initial state has1/(

42

)= 1/6 of its population in the maximum-spin sub-

space.Comparing Fig. 10(a) and (c), the dependence of the

success probability on sinv is similar to that of the initialstate |0001〉. The τ = 1 ns coherent oscillations visiblefor the latter are more attenuated for |0011〉, but this wasalso the case for the closed system simulations [contrastFig. 8(a) and Fig. 9(a) at τ = 1 ns]. The main featuredistinguishing Fig. 10(a) and (c) is the shift to the left ofthe τ ≥ 100 ns curves for the initial state |0011〉, i.e., asm0 is reduced from 0.5 to 0. This is consistent with ourexperimental results, as can be seen in Fig. 2(b).

In Fig. 10(d), we consider the dependence on the num-

Page 11: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

11

0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

0 0.2 0.4 0.6 0.8 10

0.1

0.2

0.3

0.4

0.5

0.6

0 0.2 0.4 0.6 0.8 10

0.1

0.2

0.3

0.4

0.5

0.6

FIG. 11. Total and partial success probabilities as computedby the AME as a function of sinv for independent dephasing.The initial state has one spin down and τ = 1 µs. The mainplot shows the total success probabilities for N = 4 and 8.The inset shows the partial success probabilities for N = 8.

ber of cycles r, using the independent system-bath model.For r = 2, we see that the results are similar to those of asingle cycle. However, we do see a small improvement inthe sense of a slight shift to the right of the curves withτ ≥ 100 ns compared with r = 1, which is consistent withthe experimental result shown in Fig. 7(d). We note thatthis improvement was not observed in the closed systemcase, as can be seen by contrasting Figs. 9(a) and (c).Interestingly, there is also a small signature of coherentoscillations for τ ≤ 100 ns.

Finally, we note that compared to the closed systemscase, the results depend much less on the initial condi-tion.

3. Size dependence and partial success probability.

In Fig. 11, we show the success probability for twodifferent system sizes N = 4 and 8. The initial state hasa single spin down. We observe that the results do notdepend much on the system size, with the total successprobabilities slightly larger (shifted to the right) for N =4. This is consistent with the experimental results forN = 4 and 8 shown in Fig. 5(b).

While the total success probabilities of our open-system simulations with independent dephasing are gen-erally consistent with the experimental total successprobabilities, our open-system simulations always pro-duce symmetric partial success probabilities as shown inFig. 11, in stark contrast with the experiments, wherethe all-up state is strongly favored over the all-down statein the region around the minimum gap (see all the toppanels in the figures in Sec. III). Clearly, this reflects asignificant failure of the AME model.

One reason for this discrepancy is that our simula-tions do not include a mechanism such as the polar-ized spin-bath which we believe explains the experimen-

tally observed asymmetry for small sinv (see Sec. III A 4).Moreover, our adiabatic master equation simulations re-sult in a symmetric final all-up and all-down popula-tion since any relaxation event during the anneal (atany s) to either the global instantaneous ground state|ε0(s)〉 or the instantaneous first excited state |ε1(s)〉 ofH(s), which become degenerate at s = 1, eventually con-

tributes amplitude to |ε0(s = 1)〉 = (|↑〉⊗N + |↓〉⊗N )/√

2

or |ε1(s = 1)〉 = (|↑〉⊗N − |↓〉⊗N )/√

2. These two stateshave equal all-up and all-down populations, which is whythe AME simulation results do not distinguish them. No-tice the importance of the energy eigenbasis (i.e., thestates |ε0(s)〉 and |ε1(s)〉) in this argument; this is spe-cial to the AME.

D. Strong coupling: PTRE

Given the troubling discrepancy between the empiri-cal partial success probability results and the AME sim-ulations, in this subsection we use the Lindblad formpolaron-transformed Redfield equation (PTRE) [43, 44].The idea is to check whether a quantum model that isnot subject to the weak coupling limit is capable of avoid-ing the discrepancy. This is motivated in part by pre-vious studies, which showed that a hybrid AME/PTREmodel works better than the pure AME in explaining thelinewidth broadening phenomenon observed in a tunnel-ing spectroscopy experiment [44, 60]. The PTRE holdsunder intermediate to strong coupling to the environ-ment [44]. Unlike the AME, wherein decoherence is be-tween energy eigenstates, decoherence in the PTRE is be-tween computational basis states (eigenbasis of σz). But,unlike the singular coupling limit (SCL) [55], where de-coherence is also between computational basis states, thePTRE is governed by a non-flat (and hence non-trivial)spectral density. This means that, unlike the SCL, thePTRE is sensitive to the particulars of the bath model,a fact we take advantage of below when we combine lowand high frequency components into a single noise spec-trum; see Eqs. (30) and (31) below. This type of hybridspectrum approach has been used successfully [61] to ex-plain a 16-qubit quantum annealing experiment with asmall gap [62].

To be explicit, the PTRE is given by Eq. (15) with thepolaron-frame system Hamiltonian

H(t) =1

2B(t)HT , (26)

the polaron-frame dissipator

D (ρ) =∑

α∈{+,−}

∑ω,i

γp(ω)(Lαi,ω(t)ρLα†i,ω(t)

− 1

2

{Lα†i,ω(t)Lαi,ω(t), ρ

}), (27)

Page 12: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

12

and the Lamb shift term

HLS(t) =∑i,α,ω

Lα†i,ω(t)Lαi,ω(t)Sp(ω) . (28)

Here γp (ω) and Sp(ω) denote the polaron-frame noisespectrum and the corresponding principal value integral(discussed below), and the Lindblad operators are

Lαi,ω(t) =A(t)

2

∑εb−εa=ω

〈a|σαi |b〉 |a〉〈b| , (29)

where |a〉 is the eigenstate of H(t) with eigenenergy εa.Because H(t) is diagonal in the σz basis, the eigenstates|a〉 are classical spin states. This should be contrastedwith the AME, for which the Lindblad operators involvethe instantaneous eigenstates of the system Hamiltonianthat includes the transverse field, as in Eq. (21).

Because the effective Hamiltonian H(t) + HLS(t)[Eqs. (26) and (28)] and the Lindblad operators inEq. (29) are diagonal in the σz-basis, the PTRE can besimplified into an equation involving only the diagonalcomponents of the density matrix, as shown in App. E.We use this form henceforth.

In contrast to the Ohmic noise spectrum we used forthe AME [Eq. (20)], here we choose a hybrid model witha high-frequency Ohmic bath and a low-frequency com-ponent [61, 63]. Namely, γp(ω) has a convolutional form

γp(ω) =1

∫ ∞−∞

GL(ω − x)GH(x)dx , (30)

where

GL(ω) =

√π

2W 2exp

[− (ω − 4εL)

8W 2

](31a)

GH(ω) =4γ(ω)

ω2 + 4γ2(0)(31b)

which describe the low and high-frequency components,respectively. W and εL in Eq. (31a) are known as themacroscopic resonant tunneling (MRT) linewidth and re-organization energy. They are connected through thefluctuation-dissipation theorem: W 2 = 2εLT . The quan-tity γ(ω) in Eq. (31b) is the standard spectral functionfor an Ohmic bath, given by Eq. (20). We assume thatthe low-frequency and high-frequency components sharethe same temperature.

We note that the PTRE can be approximately thoughtof as a multi-level extension of the noninteracting-blipapproximation (NIBA) method [61, 64], which success-fully modeled the open system dynamics in a multiqubit-cluster tunneling experiment [37]. Because in the lowtemperature limit NIBA only works for stronger couplingthan PTRE [43, 65], the success of NIBA indicates theexistence of a strong coupling region during the anneal.

0.2 0.4 0.6 0.8sinv

0

0.2

0.4

0.6

0.8

Popula

tion

Initial state j"""#i, r = 1, PTRE

Exp: GS (all up)Exp: GS (all down)PTRE: GS (all up)PTRE: GS (all down)

(a)

0.2 0.4 0.6 0.8sinv

0

0.2

0.4

0.6

Popula

tion

Initial state j""##i, r = 1, PTRE

Exp: GS (all up)Exp: GS (all down)PTRE: GS (all up)PTRE: GS (all down)

(b)

FIG. 12. PTRE simulation results versus the empirical resultsfor the 4-qubit case. The initial states are (a) |↑↑↑↓〉 and(b) |↑↑↓↓〉. The best-fit simulation parameters are: ηg2 =2.5×10−3, T = 25 mK, τ = 5µ s, W = 8mK, and fc = 1THz.Here and in the subsequent figures the error bars represent 1σconfidence intervals. Overall the PTRE results are in goodqualitative agreement with the empirical results for all casesshown. Note the different scale of the vertical axes.

1. 4-qubit case.

We first calibrate the simulation parameters using theempirical data from the N = 4 case. We run the PTREsimulation with τ = 5 µs and fc = 1 THz, and varyT ∈ {6, 30} mK (step size of 1 mK) and W ∈ {6, 40} mK(step size of 2 mK), ηg2 ∈ {2.5× 10−i, 5× 10−i}5i=1. Wepick the optimal parameters from this set such that thesimulation results have the smallest distance from the D-Wave data in the following six cases: two cases where westart with |↑↑↑↓〉 and end up with the all-up and all-downstates, and four cases where we start with |↑↑↓↓〉 andend up with the all-up, all-down, one-up and one-downstates. Because all the numerical experiments are doneon the same grid of inversion points sinv, we denote themeasured population (corresponding to the grid-points)by a vector Pi, where i is the index for the six differentcases. Then our parameter estimation procedure can beformally written as:

minW,T,ηg2

∑i

‖Pi − Pi‖ , (32)

where Pi is the simulation curve corresponding to the em-pirical data. The simulations results using the optimalparameters thus obtained are compared to the D-Wavedata in Fig. 12. Panel (a) of this figure reproduces theempirical N = 4 data from Fig. 5(a); panel (b) corre-sponds to the m0 = 0 results shown in Figs. 2(a) and 7(c)(with r = 1), though the latter are for N = 20. Cru-cially, unlike the AME the PTRE correctly describes theasymmetry in the populations of the all-up and all-downground states when the initial state has one spin down,as is clearly visible in Fig. 12(a).

Page 13: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

13

0.2 0.4 0.6 0.8sinv

0

0.2

0.4

0.6

0.8

1Popula

tion

Initial state j"""""""#i, r = 1, PTRE

Exp: GS (all up)Exp: GS (all down)PTRE: GS (all up)PTRE: GS (all down)

(a)

0.2 0.4 0.6 0.8sinv

0

0.1

0.2

0.3

0.4

0.5

Popula

tion

Initial state j""""####i, r = 1, PTRE

Exp: GS (all up)Exp: GS (all down)PTRE: GS (all up)PTRE: GS (all down)

(b)

FIG. 13. PTRE simulation results versus the experimentalresults for the 8-qubit case. The initial states are (a) thefirst excited state |↑ · · · ↑↓〉 and (b) the maximally excitedstate |↑↑↑↑↓↓↓↓〉. The simulation parameters are identicalto the optimal ones obtained from the parameter estimationprocedure in Fig. 12. Note the different scale of the verticalaxes.

2. 8-qubit case.

Next, we simulate the 8-qubit case using the optimalparameters obtained above. Again, we consider the ex-periments with two different initial states and presentthe results in Fig. 13. Panel (a) of this figure reproducesthe empirical N = 8 data from Fig. 5(a); panel (b) corre-sponds to the m0 = 0 results shown in Figs. 2(a) and 7(c)(with r = 1), though the latter are for N = 20.

Similar trends are observed as in the 4-qubit case. ThePTRE works best when the initial state has only a singlespin flip from the ground state [Fig. 13(a)]. When anequal number of spins is up as down (the maximally ex-cited state), the PTRE is less accurate but the qualitativetrend is correct [Fig. 13(b)]. It is possible that fine-tuningof the PTRE parameters could further improve the quan-titative agreement. However, we did not pursue this sincethe main goal has already been achieved: our simulationresults demonstrate that the PTRE achieves a much bet-ter qualitative agreement with the experiments than theAME (Fig. 11). In particular, it correctly predicts theasymmetry in the populations of the two degenerate all-up and all-down ground states when the initial state isthe first excited state, for both N = 4 and N = 8. Thisis the case because in the PTRE decoherence happensbetween the classical spin states [Eq. (29)] while in theAME, decoherence happens between the eigenstates ofthe full Hamiltonian [Eq. (24)].

V. CONCLUSIONS

In this work, we presented a comprehensive study of re-verse annealing of the p-spin model with p = 2, and com-pared empirical results from the D-Wave 2000Q used asa quantum simulator of this model, with two quantummaster equations in the weak and strong coupling lim-its (the AME and PTRE, respectively). Results for oneclassical Monte Carlo based algorithm (SVMC-TF [50])

are presented in App. F; to keep the scope manage-able we did not consider other popular models such assimulated quantum annealing (SQA) [35, 66–70] (alsoknown as Path-Integral Monte Carlo Quantum Anneal-ing in Refs. [5, 71] or Quantum Monte Carlo Annealing inRef. [4]), or the recent Spin-Vector Langevin model [72];these are interesting topics for a future study.

Our main observation is a stark failure of the AME,which successfully captured the D-Wave annealers’ be-havior in a variety of previous studies [13, 34, 40, 50, 60,73], in correctly describing the empirical “partial” suc-cess probabilities, i.e., the probability of finding eitherthe all-up or all-down ground state, as opposed to thesum of the probabilities of these two degenerate groundstates of the p = 2 p-spin model. As illustrated in Fig. 11,the AME predicts equal partial success probabilities, butthe empirical data are strongly biased towards one or theother ground state, depending on the initial condition ofthe reverse anneal; see, e.g., Fig. 2 and all subsequent fig-ures displaying empirical partial success probability data.The reason for this failure of the AME is that in theweak coupling limit that it describes, the two degenerateground states are energy eigenstates that are equal super-positions of the all-up and all-down states with oppositesigns, and there is nothing about the dynamics describedby the AME that breaks the symmetry between these twostates for the p = 2 problem under consideration. Thefailure of the AME thus indicates that the weak couplinglimit itself is the problem in the setting of the D-Wave2000Q device subject to reverse annealing, and indeed,when we considered the PTRE (strong coupling) it ex-hibited the observed asymmetry; see, e.g., Fig. 12. Thisrepresents strong evidence of the breakdown of the weakcoupling limit in the D-Wave 2000Q device, at least forreverse annealing on timescales of a µsec or longer.

Our results demonstrate the importance of choosingthe correct decoherence model when analyzing real-worlddevices, and that even when agreement is observed (aswas the case for previous studies involving the AME),certain aspects that can reveal a way to distinguish be-tween different decoherence models may remain hidden.The case in point is the difference between the partialprobabilities of the two degenerate ground states, whichforced us to conclude that strong coupling is the moreappropriate decoherence model for the D-Wave 2000Qon the & 1µsec timescale. Had we considered only thetotal success probability, we would not have been ableto distinguish between decoherence models with weak vsstrong coupling.

Our work shows that the PTRE – a first principles,fully quantum dynamical model with strong decoherence– achieves the best agreement overall with the empiri-cal data among the various models we have tested. Thisdoes not necessarily imply that the D-Wave 2000Q be-haves as a classical device on the & 1µsec timescale. In-stead, our results should be interpreted to mean thatmodels with strong decoherence can be successful in pre-dicting the outcome of quantum annealing experiments

Page 14: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

14

when the chosen observable is not sensitive to quantumeffects. An earlier study using a previous generation ofthe D-Wave devices already established evidence for en-tanglement [74], which was verified using AME simula-tions [60], and this is a clear-cut example of the mea-surement of an observable that is sensitive to quantumeffects. We anticipate that new experiments based onmuch shorter anneal times than were available to us inthis work will provide further evidence of quantum ef-fects in experimental quantum annealing [75]. Our workhighlights the importance of testing such new evidenceusing models that critically evaluate the strength of anyclaimed quantum effects.

ACKNOWLEDGMENTS

We thank Mohammad Amin and Masayuki Ohzeki foruseful comments and Sigma-i Co., Ltd. for providingpart of the D-Wave 2000Q machine time. The researchis based partially upon work supported by the Office ofthe Director of National Intelligence (ODNI), IntelligenceAdvanced Research Projects Activity (IARPA) and theDefense Advanced Research Projects Agency (DARPA),via the U.S. Army Research Office contract W911NF-17-C-0050. The views and conclusions contained herein arethose of the authors and should not be interpreted asnecessarily representing the official policies or endorse-ments, either expressed or implied, of the ODNI, IARPA,DARPA, or the U.S. Government. The U.S. Governmentis authorized to reproduce and distribute reprints forGovernmental purposes notwithstanding any copyrightannotation thereon. Computation for some of the workdescribed in this paper was supported by the Univer-sity of Southern California Center for High-PerformanceComputing and Communications (hpcc.usc.edu).

Appendix A: Optimal Value of FerromagneticInteractions within a Logical Qubit

The ferromagnetic coupling strength between physicalqubits for a given logical qubit on the D-Wave device af-fects the final success probability. The appropriate valueof JF depends on the system size N [76]. We checked thesuccess probability for various system sizes of the p = 2p-spin model under traditional forward annealing. Theresult is shown in Fig. 14. It is clearly observed thatthe smallest value of coupling JF = −1.0 we tried yieldsthe best result for all N values we tried, and we adoptedthis value in all our experiments. The smallest allowedvalue is JF = −2.0, but saturation is already observedfor JF = −1.

0.0

0.2

0.4

0.6

0.8

1.0

0.0 0.2 0.4 0.6 0.8 1.0

Succ

ess

Pro

bab

ilit

y

JF

N = 8

N = 12

N = 16N = 20

N = 24N = 28

N = 32

FIG. 14. Success probability of forward annealing with an-nealing time τ = 1.0 µs as a function of the coupling betweenphysical qubits in a logical qubit for different system sizes.The solid lines are Bezier curves. The penalty value we usedis −JF, i.e., ferromagnetic.

Appendix B: Reverse annealing details

In our experiments for single-iteration (r = 1) re-verse annealing, we constructed 15 instances (differenthardware embeddings of HT on the Chimera graph ofthe device) and generated 10 random gauges for eachset of parameter values. A gauge (originally known asa spin inversion transformation [34]) is a given choiceof {hi, Jij} in the general Ising problem Hamiltonian

HT =∑Ni=1 hiσ

zi +∑Ni<j Jijσ

zi σ

zj ; a new gauge is realized

by randomly selecting ai = ±1 and performing the sub-stitution hi 7→ aihi and Jij 7→ aiajJij . Provided we alsoperform the substitution σzi 7→ aiσ

zi , we map the orig-

inal Hamiltonian to a gauge-transformed Hamiltonianwith the same energy spectrum but where the identityof each energy eigenstates is relabeled accordingly. In to-tal, there are 2N different gauges for an N -spin problem.See Ref. [77] for a review and more details.

1000 annealing runs were performed for a given ran-dom gauge. Thus, the total number of runs for each setof annealing schedule parameter values is 150,000. In it-erated reverse annealing with r > 1, we constructed 15instances and generated 90 random gauges for each r.Thus, the total number of samples at each r and eachsinv is 1350.

We computed 1σ error bars by cluster sampling over in-stances. Namely, we regarded each of the 15 instances asas a sample of size 10000 [the number of random gauges(10) × the number of iterations (1000)], and computedthe standard error of the mean for each data point shownin our experimental figures.

We remark that the reverse annealing protocol we haveadopted here for experiments on the D-Wave device is dif-ferent from the one used in Ref. [45] for numerical com-putation. In particular, when sinv ≈ 0, the second partof Ref. [45]’s protocol has a very sharp increase of s asa function of t, which seems to have led to a significantdrop of success probability. This is not the case in thepresent protocol, where the time derivative of s(t) is aconstant ±1/τ (or 0 during pausing) irrespective of sinv.

Page 15: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

15

0 0.2 0.4 0.6 0.8 1-30

-20

-10

0

10

20

[GHz]

FIG. 15. Full spectrum of four qubits for the p = 2 p-spinmodel, subject to the annealing schedules shown in Fig. 1.

Appendix C: Spectrum of the p-spin problem withp = 2 and scaling of the minimum gap with N

The spectrum of the p-spin problem and the order-ing of energies of different spin sectors can be found inRef. [24]. For the particular case of n = 4, p = 2 and theD-Wave 2000Q annealing schedule, we plot the spectrumin Fig. 15.

We denote the energy gap between the instantaneouseigenenergies εi(s) and εj(s) by ∆ij(s) = εi(s) − εj(s),and the corresponding minimum energy gap by ∆ij =mins ∆ij(s). For p = 2, we are interested instead in theminimum energy gap between the ground state ε0(s) andthe second excited state ε2(s), denoted by

∆ = ∆20 = mins

∆20(s) = minsε2(s)− ε0(s) , (C1)

since as can be seen from Fig. 15, the instantaneous firstexcited state ε1(s) and the instantaneous ground stateε0(s) converge at s = 1. This, of course, is true for everyN due to the double degeneracy of the ground state ofthe p = 2 p-spin model, which exhibits Z2 symmetry.

Figure 6 shows the value of ∆ for N ∈ {4, · · · , 22},along with the position s of the minimum gap, i.e.,

s∆ = argmin∆20(s) . (C2)

Appendix D: Upper bound on the successprobabilities for collective dephasing

If the dynamics preserve spin symmetry (for exam-ple in the closed system Schrodinger equation and inopen-system simulations with collective system-bath cou-pling), then there exists a natural upper bound on themaximal success probabilities achievable in reverse an-nealing. The upper bound is the population of the initialstate in the maximum-spin sector.

For the example of N = 4, there are two degenerateground states (|0000〉 and |1111〉) of the problem Hamil-

tonian HT and they both belong to the maximum-spinsubspace of S = Smax = N/2 = 2. While the computa-tional basis state with one spin down, for example, |0001〉(|0〉 ≡ |↑〉, |1〉 ≡ |↓〉) is a first excited state of HT , it doesnot belong to the maximum-spin subspace S = 2. A uni-form superposition of the computational basis states withone spin down, i.e., (|0001〉+ |0010〉+ |0100〉+ |1000〉)/2,however, does belong to the maximum-spin subspaceS = 2. Unfortunately, the D-Wave device does not allowsuch an initialization.

In general, suppose that the initial computational basisstate is |ψ(t = 0)〉comp, with a particular magnetization

m0 [Eq. (7)]. It can be represented as a linear combi-nation of states with a fixed value of total spin S andmagnetization m0:

|ψ(t = 0)〉comp =

Smax∑S=Smin

aS |S,mS = m0〉 . (D1)

From angular momentum addition theory [78], we knowthat Smin = N/2 − bN/2c and Smax = N/2. Thetotal spin (integer or half-integer) S ∈ {Smin, Smin +1, . . . , Smax}. The (unnormalized) magnetization isMS ∈ {−S,−S + 1, . . . , S}, but we can also label thestates using the normalized magnetization mS = 2

NMS ;

then |S,mS = m0〉 is a simultaneous eigenstate of S2 andSz with eigenvalues

S2 |S,mS = m0〉 = S(S + 1) |S,mS = m0〉 , (D2)

Sz |S,mS = m0〉 =(N

2m0

)|S,mS = m0〉 . (D3)

In Eq. (D1), |aS |2 is the initial state’s population in thatspin subspace.

For the p-spin Hamiltonian, in a closed system thestate in each spin subspace evolves independently dueto the preservation of spin symmetry of the Hamilto-nian [26]. At the end of a single cycle of reverse annealing(r = 1), we have from Eq. (D1)) that

U(2tinv, 0) |S,mS = m0〉 =

S∑MS=−S

cM |S,mS = 2MS/N〉 ,

(D4)where cM is the amplitude developed by the other basiselements of the spin S subspace at time t = 2tinv.

Therefore, the final state of 1-cycle reverse annealingcan be expressed as:

|ψ〉fin = U(2tinv, 0) |ψ(t = 0)〉comp

=

Smax∑S=Smin

aS

(S∑

MS=−ScM |S,mS = 2MS/N〉

).

(D5)

The all-up state (|0〉⊗N = |↑〉⊗N = |up〉) and all-down

state (|1〉⊗N = |↓〉⊗N = |down〉) are both ground states

Page 16: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

16

of HT , and moreover they lie in the maximum-spin sub-space S = Smax. In particular, |up〉 = |Smax, 1〉 and|down〉 = |Smax,−1〉. Therefore, projecting |ψ〉fin onto|up〉 gives:

〈up|ψ〉fin

=

Smax∑S=Smin

aS

(S∑

MS=−ScM 〈up|S,mS = 2MS/N〉

)(D6)

= aSmaxcSmax

. (D7)

Similarly, 〈down|ψ〉fin = aSmaxc−Smax

.The total success probability is thus bounded by:

p(r = 1) = | 〈up|ψ〉fin |2 + | 〈down|ψ〉fin |

2

= |aSmax |2(|cSmax |2 + |c−Smax |2)

≤ |aSmax |2 , (D8)

with the bound saturated when (|cSmax|2+|c−Smax

|2) = 1.�

The upper bound |aSmax|2 is, as claimed above, the

population of the initial state in the maximum-spin sub-space. For the initial state |0001〉 we have aSmax

= 1/2since |S = Smax = 2,mS = 0.5〉 = (|0001〉 + |0010〉 +|0100〉 + |1000〉)/2. Therefore, the total success prob-ability is bounded by |aSmax

|2 = 1/4. For the otherexamples in the main text, the initial state |0011〉 has

aSmax = 1/√(

42

)= 1/

√6; while the initial state of

|00000001〉 has aSmax= 1/

√8.

This conclusion is directly generalized to the open-system case under collective dephasing, where the pop-ulation of each spin-sector is also preserved during thedynamics. This is illustrated in the main text for N = 4in Fig. 10(b), and also in Fig. 16, which shows the anal-ogous result for N = 8, for which the initial state is|00000001〉. In this case the maximum success probabil-ity of collective dephasing is 1/8, which is the populationof the initial state in the maximal spin subspace. Weremark that this case is different from Ref. [45], wherethe initial state (a superposition of computational basisstate) lies completely inside the maximal spin subspace,in which case the collective dephasing model has a max-imum success probability of 1.

Appendix E: The PTRE equation involves only thediagonal components of the density matrix

As argued in Sec. IV D, the PTRE equation involvesonly the diagonal components of the density matrix. Toshow this we first consider

M = [He, ρ] , (E1)

where He ≡ H + HLS is a diagonal matrix, i.e., Habe =

δabHaae . The explicit form of M can be written as

Mab =(Haa

e −Hbbe

)ρab, which implies that the diago-

0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

FIG. 16. Success probability as a function of sinv for indepen-dent and collective dephasing, for N = 8 spins. The initialstate has a single spin flipped and τ = 1 µs.

nal elements of M (Maa) vanish. Next, we consider

Nω,αi = Lαi,ωρL

α†i,ω −

1

2

{Lα†i,ωL

αi,ω, ρ

}, (E2)

where the Lindblad operators Lαi,ω(t) were defined inEq. (29). To simplify the notation, we omit the indicesω, α and i in the following discussion. Let us denote

Aab ≡ A(t)2 〈a|σ

αi |b〉 = A∗ab. The first term in Eq. (E2)

can be written as

LρL =∑

a,b,a′,b′

AabAa′b′ρbb′ |a〉〈a′| , (E3)

where ρbb′

= 〈b| ρ |b′〉. The second term in Eq. (E2) is

−1

2

{L†L, ρ

}= −

∑a,b,b′

1

2Aab′Aab{|b′〉〈b|, ρ} (E4a)

= −∑a,b,b′

1

2AabAab′ (|b′〉〈b|ρ+ ρ|b′〉〈b|) .

(E4b)

If we assume ρ is diagonal, then all the diagonal elementsof ρ depend only on the diagonal elements of ρ, with anexplicit form

ρaa =∑b6=a

γp(ωba)Zabρbb −

∑b6=a

γp(ωab)Zbaρaa , (E5)

where Zab = A2(t)∑α,i | 〈a|σαi |b〉 |2/4, and ωab = ωa −

ωb. Thus, if the initial state is diagonal (a classical spinstate), we can consider the Pauli master equation [79]for the diagonal elements only [Eq. (E5)] to speed up the

Page 17: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

17

computation. The matrix form of this equation isρ00

ρ11

...

= T

ρ00

ρ11

...

(E6a)

T ≡

−∑b 6=0 γp(ω0b)Zb0 γp(ω10)Z01 · · ·γp(ω01)Z10 −

∑b 6=1 γp(ω1b)Zb1 · · ·

......

. . .

.

(E6b)

Before proceeding, we note that the sparsity of T isdetermined by the sparsity of Zab, whose full size is 2N ×2N . To calculate the number non-zero elements in Zab,we first notice that for each σαi operator, 2N−1 elementsare non-zero (recall that α ∈ {+,−}). The number of σαioperators is 2N . So there are N2N non-zero elements inZab. If we add the number of diagonal elements in T , thetotal number of non-zero elements in T is (N+1)2N . As aresult, the sparsity of the transfer matrix T is (N+1)/2N .

The transfer matrix T in Eq. (E6) provides the in-coherent “tunneling” rate between classical spin states.Because the total Hamiltonian is symmetric under per-mutations of qubits, we may also want to calculate thetunneling rate between the spin coherent states [80]

|θ, φ〉 =

n⊗i=1

[cos

2

)|0〉i + sin

2

)eiϕ|1〉i

]. (E7)

However, because in the current form of the PTRE thecoherence between computational basis decays exponen-tially, the spin coherent state cannot survive.

Complementing the results shown in the main text,the differences (in absolute value) between the simulationand experimental results are shown in Fig. 17.

Appendix F: Fully classical simulations using thespin-vector Monte Carlo algorithm

In an effort to better understand the asymmetric par-tial success probability observed in our experiments, wealso performed fully classical simulations of the sameproblem using the spin-vector Monte Carlo (SVMC) [39]algorithm and a new variant with transverse-field-dependent updates (SVMC-TF) [50], where this variantwas successfully used to explain empirical D-Wave resultsfor a particular 12-qubit instance. For the p-spin prob-lem, we replace the Hamiltonian of Eq. (1) by a classicalHamiltonian:

H(s) = −A(s)

2

(N∑i

sin θi

)− B(s)N

2

(1

N

N∑i

cos θi

)p.

(F1)

Each qubit i is replaced by a classical O(2) spin ~Mi =(sin θi, 0, cos θi), θi ∈ [0, π]. For the purpose of reverseannealing, we also need to specify the t dependence of

0.2 0.4 0.6 0.8sinv

0

0.05

0.1

0.15

jPD

Wav

e!

PPTR

Ej

Initial state: j"""#i""""####

(a)

0.2 0.4 0.6 0.8sinv

0

0.1

0.2

0.3

jPD

Wav

e!

PPTR

Ej

Initial state: j""##i""""####one upone down

(b)

0.2 0.4 0.6 0.8sinv

0

0.1

0.2

0.3

jPD

Wav

e!

PPTR

Ej

Initial state: j"" " " " #iall upall down

(c)

0.2 0.4 0.6 0.8sinv

0

0.1

0.2

0.3

0.4

jPD

Wav

e!

PPTR

Ej

Initial state: j""""####iall upall down

(d)

FIG. 17. (a) and (b): The population differences between theexperimental and PTRE simulation results shown in Fig. 12for N = 4 qubits. (c) and (d): The population differencesbetween the experimental and simulation results shown inFig. 13 for N = 8 qubits. Note the different scale of thevertical axes.

s(t). The concept of time is here replaced by the num-ber of Monte Carlo sweeps: we replace τ by a specifiednumber of total sweeps. The total number of sweeps isthen 2τ(1− sinv), in analogy to the total annealing timein Eq. (6).

To simulate the effect of thermal hopping through thissemi-classical landscape with inverse temperature β, weperform at each time step a spin update according to theMetropolis rule. In SVMC, a random angle θ

i ∈ [0, π] is

picked for each spin i. Updates of the spin angles θi to θ′

i

are accepted according to the standard Metropolis ruleassociated with the change in energy (∆E) of the classi-cal Hamiltonian. For the p-spin problem, ∆E cannot beexpressed in a simple form as in case of the Ising problemHamiltonian [39].

In SVMC-TF, the random angle θ′

i = θi+εi(s) is pickedin a restricted range

εi(s) ∈[−min

(1,A(s)

B(s)

)π,min

(1,A(s)

B(s)

]. (F2)

The goal of SVMC-TF is to restrict the angle update forA(s) < B(s), and imitate the freeze-out effect discussedin Sec. III A 3. The full SVMC and SVMC-TF algorithmsfor reverse annealing are summarized in App. I, includingthe expression for ∆E.

A simple intuitive way to visualize the semiclassicaldynamics described by the SMVC and SVMC-TF algo-rithms is to consider the energy landscape defined by

Page 18: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

18

FIG. 18. The semiclassical potential landscape correspondingto the Hamiltonian of Eq. (F1) for N = 1. Ground statescorrespond to θ = 0, π. The two saddle points are at s =0.3948 and θ = π/2± 0.31518π.

H(s) [Eq. (F1)] when equating all angles (θi ≡ θ). Weplot the resulting surface in Fig. 18 for the simple caseof N = 1. For sinv < 0.3948, Metropolis updates toeither spin direction happen frequently and are equallylikely since there is no potential barrier separating them.For 1 � sinv > 0.3948, under Metropolis updates thesystem prefers staying in the original well and escapingto the opposite well is unlikely, due to the potential bar-rier. This preference explains the asymmetry part of par-tial success probabilities in the experimental results. For1 > sinv � 0.3948, Metropolis updates are very rare dueto the rate suppression in SVMC-TF.

1. Total success probability

We report on SVMC and SVMC-TF simulations forN = 4 and 8. We choose the initial condition with asingle spin down. In terms of angles, for N = 4 the initialangles are {0, 0, 0, π}. We again use the temperature T =12.1 mK. The classical analogue of the annealing time ischosen to be τ = 103 and 104 sweeps.

In Fig. 19 we display the simulation results for thetotal success probability using SVMC and SVMC-TF.The number of samples is 104. SVMC gives high totalsuccess probabilities even with large inversion points sinv.A large sinv value means that during the whole reverseannealing process the ratio A(s)/B(s) is small. In the D-Wave device, it is expected that for small A(s)/B(s) thedynamics freezes, which makes it difficult to reach theground state(s) when the initial state is excited. Withthe number of sweeps increased from τ = 103 to 104,we observe that even for very large inversion points sinv

the total success probability of SVMC can be as highas 1. This is because the angle updates in SVMC arecompletely random and thus with a sufficient number ofsweeps, it is possible for the state to flip to the correct

0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

(a)

0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

(b)

FIG. 19. Total success probabilities as computed by SVMCand SVMC-TF. (a) τ = 103 sweeps, (b) τ = 104 sweeps.Error bars are 2σ over 104 samples.

0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

(a)

0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

(b)

0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

(c)

0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

(d)

FIG. 20. Partial success probabilities for the one-down initialstate as computed by (a) SVMC for N = 4, (b) SVMC-TF forN = 4. Panels (c) and (d) show the partial ground state (GS)success probabilities for SVMC-TF (“Sim.”) at optimizedsweeps numbers of (c) τ = 450 for N = 4 and (d) τ = 150 forN = 8, which yields a qualitatively good agreement with theexperimental data (“Exp.”). In panels (a) and (b) τ = 103

sweeps and error bars are 2σ over 104 samples. In panels (c)and (d) error bars are 1σ over 104 samples.

solutions.

However, in SVMC-TF, the range of angle updates isrestricted for A(s)/B(s) < 1. The restricted angle up-dates (freeze-out effect) prevent the state from flipping tothe correct solutions. Therefore the total success prob-ability for large inversion points sinv is basically zero inSVMC-TF, regardless of the number of sweeps. This isalso what we observe in the empirical data and in theadiabatic master equation simulations, as discussed inSec. IV C 1.

Page 19: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

19

0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

(a)

0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

(b)

0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

(c)

0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

(d)

FIG. 21. Partial success probabilities as computed by SVMC-TF. (a) N = 16, τ = 103 sweeps, (b) N = 16, τ = 104 sweeps,(c) N = 32, τ = 103 sweeps, (d) N = 32, τ = 104 sweeps.Error bars are 2σ over 104 samples.

2. Partial success probability

We compare the partial success probability obtainedfrom SVMC and SVMC-TF in Figs. 20(a) and (b), re-spectively, for N = 4 and τ = 103. It is again seen thatSVMC-TF more accurately captures the early freezingthan SVMC. The empirical data from Fig. 5(a) is re-produced in Fig. 20(c) and (d), where is it compared toSVMC-TF at an optimized number of sweeps (explainedin App. H). This yields a semi-quantitative agreement, inparticular the correct trend and transition locations forthe unequal up and down partial success probabilities.However, the agreement is clearly better for N = 4 thanfor N = 8 despite the optimization, which suggests theinteresting possibility that SVMC-TF becomes less accu-rate at higher numbers of spins. Also noteworthy is thatfor N = 8 we observe a deviation from the empirical datafor sinv . 0.3, where there exists a small but clear dif-ference in the probabilities of all-up and all-down states,whereas the SVMC-TF data do not show such a trend.As discussed in Sec. III A 4, we attribute the differencefor sinv . 0.3 to the spin-bath polarization effect, whichis not modeled in our SVMC-TF simulations.

In Fig. 21 we display SVMC-TF reverse annealing sim-ulation results of partial success probabilities for N =16, 32 with τ = 103, 104 sweeps. For both sizes shown,the regime of high partial success probability for the all-up state is shifted slightly to higher sinv for τ = 104

sweeps than for τ = 103. This is consistent with thetrend in the experimental results observed in Fig. 3(a).

However, the trend with system size is inconsistent withthe empirical partial success probabilities for the up caseshown in Fig. 5(a): the numerical results for N = 16and N = 32 are virtually indistinguishable (apart fromstatistical fluctuations), while the empirical data showsthat N = 16 is not yet large enough for convergence.This (small) failure of the SVMC-TF model may hintat an interesting way in which to identify a “quantumsignature” in experimental quantum annealing [34, 40].However, we did not pursue this direction further sincewe cannot rule out that further fine-tuning of the SVMCparameters will result in a closer match with the empir-ical data. To further explore ways in which a classicalmodel such as SVMC-TF, or a model with strong cou-pling to the bath such as PTRE, might fail in describingthe empirical data, we focus on excited state populationsin the next Appendix.

Appendix G: Excited states

In this Appendix we present results comparing D-Wavedata to simulations for the population in low-lying ex-cited states. Our goal is not to be comprehensive, butrather to highlight agreements and discrepancies betweenthe empirical and the numerical results.

The overall conclusion of this Appendix is that be-cause of a persistence of small discrepancies even for thePTRE and SVMC-TF, in particular their failure to ac-curately predict the excited-state populations as shownbelow, further work is needed in order to improve open-system models.

In both the empirical and the simulation data, the pop-ulation of the i’th excited up (down) state is obtainedafter summation over all permutations of computationalbasis states with N − i and i up, or i and N − i downspins, where N is the number of spins. We chose τ = 5µsfor the D-Wave experiments.

1. N = 4 with a maximally excited initial state

We compare in Fig. 22 the AME, PTRE, and SVMC-TF simulation results to the empirical data for the ini-tial state |0011〉. The open-system parameter settingsare the same as in the ground state simulations reportedabove, for all three simulation methods, but the numberof sweeps used in SVMC-TF is optimized for the closestagreement with the D-Wave data, as explained in App. H.

Since the initial state has no bias toward spin up ordown, it is surprising that the D-Wave data exhibits anasymmetry between probability of ending in states withone up or one down spin. Since the same anomaly is notobserved for N = 8 (see Fig. 24), we attribute it to anunexplained peculiarity associated with the embeddingof the N = 4 problem. All three simulation methods cor-rectly predict no distinction between the up and downstates. The predicted position of the peak in the first

Page 20: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

20

0.4 0.5 0.6 0.7 0.80

0.1

0.2

0.3

0.4Exp: 1st excited upExp: 1st excited downSim: 1st excited upSim: 1st excited down

(a)

0.4 0.5 0.6 0.7 0.8sinv

0

0.1

0.2

0.3

0.4

Popula

tion

Initial state j""##i, r = 1, PTRE

Exp: 1st excited upExp: 1st excited downPTRE: 1st excited upPTRE: 1st excited down

(b)

0.4 0.5 0.6 0.7 0.80

0.05

0.1

0.15

0.2

0.25

Exp: 1st excited upExp: 1st excited downSim: 1st excited upSim: 1st excited down

(c)

FIG. 22. Population in the first excited state as a function of sinv for D-Wave (Exp) vs simulations (Sim): (a) the AME, (b)the PTRE, and (c) SVMC-TF with τ = 450 sweeps. The initial state is |↑↑↓↓〉. Error bars denote 1σ over 104 samples. Theslight difference in SVMC-TF peak heights in panel (c) is a numerical artifact. Note the different scale of the vertical axis ofpanel (c).

0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

Exp: GS (all up)Exp: GS (all down)Exp: 2nd excitedSim: GS (all up)Sim: GS (all down)Sim: 2nd excited

(a)

0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

Exp: GS (all up)Exp: GS (all down)Exp: 2nd excitedSim: GS (all up)Sim: GS (all down)Sim: 2nd excited

(b)

0 0.2 0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

Exp: GS (all up)Exp: GS (all down)Exp: 4th excitedSim: GS (all up)Sim: GS (all down)Sim: 4th excited

(c)

FIG. 23. Population in the ground and second excited states as a function of sinv for D-Wave (Exp) vs simulations (Sim): (a)the AME and (b) SVMC-TF with τ = 450 sweeps, for the initial state |↑↑↓↓〉. (c) Ground and fourth excited states for D-Wavevs SVMC-TF with τ = 150 sweeps, for the initial state |↑↑↑↑↓↓↓↓〉. Error bars denote 1σ over 104 samples.

excited state population is different for the three simu-lation methods; the AME [panel (a)] predicts a positionthat is roughly the average of the empirically observedpeaks for the case where the final state has a down oran up spin, while the PTRE [panel (b)] and SVMC-TF[panel (c)] are shifted to the right and left, respectively.It is not possible to say which prediction is correct dueto the aforementioned anomaly.

The partial success probability results are shown inFig. 23 for the second excited state as well as the groundstate, for the AME and SVMC-TF. From (a) we see thatthe AME is qualitatively but not quantitatively in agree-ment with the empirical results for the second excitedstate, where it predicts an sinv value that is too largefor the onset of the rise in the population of this state,and this rise is also somewhat too steep. The same istrue for SVMC-TF [panel (b)] but it is in slightly closeragreement than AME for the second excited state. Wealso show results for N = 8 [panel (c)], where SVMC-TFcontinues to exhibit good qualitative agreement with the

empirical results.

2. N = 8 with a maximally excited initial state

In Fig. 24 we display the results for N = 8 with|↑↑↑↑↓↓↓↓〉 as the initial state, for the probability of end-ing in the first, second, or third excited state, using thePTRE and SVMC-TF.4 This time the empirically ob-served population in the states with i spins up or i spinsdown (1 ≤ i ≤ 3) is identical, as expected, i.e., we do notobserve the anomaly mentioned above for N = 4.

The top row shows the results for the PTRE, with thesame set of optimal parameters as explained in Sec. IV D.

4 We do not present AME results since the computational cost ofN = 8 highly excited states using the AME is prohibitive: all256 eigenstates need to be taken into account.

Page 21: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

21

0.4 0.5 0.6 0.7 0.8sinv

0

0.05

0.1

0.15

0.2

0.25

Popula

tion

Initial state j""""####i, r = 1, PTRE

Exp: 1st excited upExp: 1st excited downPTRE: 1st excited upPTRE: 1st excited down

(a)

0.4 0.5 0.6 0.7 0.8sinv

0

0.05

0.1

0.15

0.2

0.25

Popula

tion

Initial state j""""####i, r = 1, PTRE

Exp: 2nd excited upExp: 2nd excited downPTRE: 2nd excited upPTRE: 2nd excited down

(b)

0.4 0.5 0.6 0.7 0.8sinv

0

0.1

0.2

0.3

Popula

tion

Initial state j""""####i, r = 1, PTRE

Exp: 3rd excited upExp: 3rd excited downPTRE: 3rd excited upPTRE: 3rd excited down

(c)

0.4 0.5 0.6 0.7 0.80

0.05

0.1

0.15

0.2

0.25

Exp: 1st excited upExp: 1st excited downSim: 1st excited upSim: 1st excited down

(d)

0.4 0.5 0.6 0.7 0.80

0.05

0.1

0.15

Exp: 2nd excited upExp: 2nd excited downSim: 2nd excited upSim: 2nd excited down

(e)

0.4 0.5 0.6 0.7 0.80

0.05

0.1

0.15

0.2

0.25

Exp: 3rd excited upExp: 3rd excited downSim: 3rd excited upSim: 3rd excited down

(f)

FIG. 24. Population in the first, second, and third excited states vs sinv for D-Wave vs the PTRE [(a)-(c)] and SVMC-TF withτ = 150 sweeps [(d)-(f)]. The initial state is |↑↑↑↑↓↓↓↓〉. Error bars denote 1σ over 104 samples. For the PTRE the simulationparameters are identical to the optimal ones obtained from the parameter estimation procedure in Fig. 12. Note the differentscale of the vertical axes.

The agreement is relatively poor, in that both the mag-nitude and the position of the population peak is missed,both being systematically overestimated.

The bottom row shows the results for SVMC-TF, withthe optimal number of sweeps as determined in App. H.The agreement is somewhat better than for the PTRE,in that both the peak’s position and magnitude are closerto the empirical data, though the agreement in the peak’sposition deteriorates with increasing excitation level.

3. First excited state as initial state

As a final test, we checked the PTRE and SVMC-TFfor initial states with one spin down, i.e., the first excitedstate. The results for N = 4 and N = 8 are shown inFig. 25, and are in reasonable qualitative agreement. Theagreement is overall somewhat better for the PTRE, es-pecially for N = 8. For N = 4 the PTRE predicts a smallnon-zero probability for the first excited down state nearthe minimum gap point, which is absent in the empiricaldata and in the SVMC-TF results. This suggests thatthe PTRE slightly overestimates the incoherent tunnel-ing rates for small N .

Appendix H: SVMC-TF’s dependence on thenumber of sweeps

For the i’th eigenstate population, the `2 norm betweenthe experiment and simulation data series (over a rangeof sinv) is:

`i2 =

√√√√ 44∑k=1

(xi,expk − xi,simk )2 , (H1)

where for the k’th data point, xik is the population of thei’th eigenstate at sinv = 0.02k at the end of the reverseanneal (recall that the experimental data is sampled atevery point from {0.02, 0.04, . . . , 0.88}).

To rigorously evaluate the differences between the ex-periment data and simulation data, we consider the sumof such `2 norms over all eigenstates of the Hamiltonian,i.e., `2 =

∑i `i2. We plot `2 vs sweeps in Fig. 26. We

observe that for the initial states shown, with the excep-tion of |↑ · · · ↑↓〉, we can find an optimal sweep number.Considering both initial states at the respective systemsize, the optimal number of sweeps is 450 for N = 4 and150 for N = 8.

Page 22: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

22

0.4 0.5 0.6 0.7 0.8sinv

0

0.2

0.4

0.6

0.8

1Popula

tion

Initial state j"""#i, r = 1, PTRE

Exp: 1st excited upExp: 1st excited downPTRE: 1st excited upPTRE: 1st excited down

(a)

0.4 0.5 0.6 0.7 0.8sinv

0

0.2

0.4

0.6

0.8

1

Popula

tion

Initial state j"""""""#i, r = 1, PTRE

Exp: 1st excited upExp: 1st excited downPTRE: 1st excited upPTRE: 1st excited down

(b)

0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

Exp: 1st excited upExp: 1st excited downSim: 1st excited upSim: 1st excited down

(c)

0.4 0.6 0.8 10

0.2

0.4

0.6

0.8

1

Exp: 1st excited upExp: 1st excited downSim: 1st excited upSim: 1st excited down

(d)

FIG. 25. Comparison of experimental data (Exp) and simu-lation data (Sim) for the initial state |↑↑↑↓〉. (a) PTRE, (b)SVMC-TF with 450 sweeps. (c) PTRE, (d) SVMC-TF with150 sweeps, for the initial state |↑↑↑↑↑↑↑↓〉 . The populationshown is for the state with all but one spin up (“1st excitedup”) and all but one spin down (“1st excited down”). Errorbars denote 1σ over 104 samples.

0 500 1000 1500 20000

2

4

6

8

10

12

14

(a)

0 500 1000 1500 20000

2

4

6

8

10

12

14

(b)

FIG. 26. Sum of `2-norms (between simulation and experi-ment data series) over all the eigenstates vs the number ofsweeps used in SVMC-TF simulations. (a) Subplot legendshows the initial states. (b) For N = 4, SVMC-TF producessimulation data best matched with the empirical date at 450sweeps, while for N = 8 the optimal number is 150.

Page 23: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

23

Appendix I: Pseudocode for SVMC and SVMC-TF

Algorithm 1 SVMC and SVMC-TF (p-spin reverse annealing)

H(s) = −A(s)2

(∑Ni=1 sin θi) −

B(s)N2

(1N

∑Ni=1 cos θi

)p, with a specified reverse annealing dependence s(t).

procedure SVMCfor k = 1 to K (number of samples) do

Initial computational basis state: |0〉i → θki,t : 0, |1〉i → θki,t : π.

for t = 1 to T (total number of sweeps) dofor i = 1 to n (number of qubits) do

Randomly choose a new angle θ′ki,t ∈ [0, π].

Calculate ∆E.if ∆E ≤ 0 then

θki,t → θ′ki,t

else if p < exp(−β∆E) where p ∈ [0, 1] is drawn with uniform probability. then

θki,t → θ′ki,t

end ifend for

end for

Take the mean of K samples: θi,t = (∑Kk=1 θ

′ki,t)/K.

end forend procedurereturn θi,t.

procedure SVMC-TFfor k = 1 to K (number of samples) do

Initial computational basis state: |0〉i → θki,t : 0, |1〉i → θki,t : π.

for t = 1 to T (total number of sweeps) dofor i = 1 to n (number of qubits) do

θ′ki,t = θki,t + εi(s(t/T )),

a random εi(s(t/T ) ∈ [−min

(1,A(s(t/T )B(s(t/T )

)π,min

(1,A(s(t/T )B(s(t/T )

)π]

Calculate ∆E.if ∆E ≤ 0 then

θki,t → θ′ki,t

else if p < exp(−β∆E) where p ∈ [0, 1] is drawn with uniform probability. then

θki,t → θ′ki,t

end ifend for

end for

Take the mean of K samples: θi,t = (∑Kk=1 θ

′ki,t)/K.

end forend procedurereturn θi,t.

procedure Projection onto computational basis

if 0 ≤ θki,t ≤ π/2 then

|ψk(t)〉i = |0〉else if π/2 ≤ θki,t ≤ π then

|ψk(t)〉i = |1〉end if

end procedureRemark: ∆E due to the update of qubit k, can be expressed in the following form:

∆Ek =

−A(s)

2

N∑i=1i6=k

sin θi + sin θ′k

−B(s)N

2

1

N

N∑i=1i6=k

cos θi + cos θ′k

2 −H(s) .

[1] T. Kadowaki and H. Nishimori, Quantum annealing inthe transverse Ising model, Phys. Rev. E 58, 5355 (1998).

[2] E. Farhi, J. Goldstone, S. Gutmann, J. Lapan, A. Lund-gren, and D. Preda, A quantum adiabatic evolution al-gorithm applied to random instances of an NP-completeproblem., Science 292, 472 (2001).

[3] G. E. Santoro, R. Martonak, E. Tosatti, and R. Car, The-ory of quantum annealing of an Ising spin glass, Science295, 2427 (2002).

[4] A. Das and B. K. Chakrabarti, Colloquium: Quantumannealing and analog quantum computation, Rev. Mod.Phys. 80, 1061 (2008).

[5] S. Morita and H. Nishimori, Mathematical foundation ofquantum annealing, J. Math. Phys. 49, 125210 (2008).

[6] T. Albash and D. A. Lidar, Adiabatic quantum compu-

tation, Rev. Mod. Phys. 90, 015002 (2018).[7] P. Hauke, H. G. Katzgraber, W. Lechner, H. Nishi-

mori, and W. D. Oliver, Perspectives of quantum anneal-ing: Methods and implementations, Rep. Prog. Phys. 83,054401 (2020).

[8] E. J. Crosson and D. A. Lidar, Prospects for quantumenhancement with diabatic quantum annealing, NatureReviews Physics 3, 466 (2021).

[9] M. W. Johnson, M. H. S. Amin, S. Gildert, T. Lant-ing, F. Hamze, N. Dickson, R. Harris, A. J. Berkley,J. Johansson, P. Bunyk, E. M. Chapple, C. Enderud,J. P. Hilton, K. Karimi, E. Ladizinsky, N. Ladizinsky,T. Oh, I. Perminov, C. Rich, M. C. Thom, E. Tolkacheva,C. J. S. Truncik, S. Uchaikin, J. Wang, B. Wilson, andG. Rose, Quantum annealing with manufactured spins,

Page 24: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

24

Nature 473, 194 (2011).[10] R. Harris, Y. Sato, A. J. Berkley, M. Reis, F. Al-

tomare, M. H. Amin, K. Boothby, P. Bunyk, C. Deng,C. Enderud, S. Huang, E. Hoskinson, M. W. Johnson,E. Ladizinsky, N. Ladizinsky, T. Lanting, R. Li, T. Med-ina, R. Molavi, R. Neufeld, T. Oh, I. Pavlov, I. Perminov,G. Poulin-Lamarre, C. Rich, A. Smirnov, L. Swenson,N. Tsai, M. Volkmann, J. Whittaker, and J. Yao, Phasetransitions in a programmable quantum spin glass simu-lator, Science 361, 162 (2018).

[11] A. D. King, J. Carrasquilla, J. Raymond, I. Ozfidan,E. Andriyash, A. Berkley, M. Reis, T. Lanting, R. Har-ris, F. Altomare, K. Boothby, P. I. Bunyk, C. En-derud, A. Frechette, E. Hoskinson, N. Ladizinsky, T. Oh,G. Poulin-Lamarre, C. Rich, Y. Sato, A. Y. Smirnov,L. J. Swenson, M. H. Volkmann, J. Whittaker, J. Yao,E. Ladizinsky, M. W. Johnson, J. Hilton, and M. H.Amin, Observation of topological phenomena in a pro-grammable lattice of 1,800 qubits, Nature 560, 456(2018).

[12] A. D. King, J. Raymond, T. Lanting, S. V.Isakov, M. Mohseni, G. Poulin-Lamarre, S. Ejtemaee,W. Bernoudy, I. Ozfidan, A. Y. Smirnov, M. Reis,F. Altomare, M. Babcock, C. Baron, A. J. Berkley,K. Boothby, P. I. Bunyk, H. Christiani, C. Enderud,B. Evert, R. Harris, E. Hoskinson, S. Huang, K. Jooya,A. Khodabandelou, N. Ladizinsky, R. Li, P. A. Lott,A. J. R. MacDonald, D. Marsden, G. Marsden, T. Med-ina, R. Molavi, R. Neufeld, M. Norouzpour, T. Oh,I. Pavlov, I. Perminov, T. Prescott, C. Rich, Y. Sato,B. Sheldan, G. Sterling, L. J. Swenson, N. Tsai, M. H.Volkmann, J. D. Whittaker, W. Wilkinson, J. Yao,H. Neven, J. P. Hilton, E. Ladizinsky, M. W. John-son, and M. H. Amin, Scaling advantage over path-integral monte carlo in quantum simulation of geomet-rically frustrated magnets, Nature Communications 12,1113 (2021).

[13] A. Mishra, T. Albash, and D. A. Lidar, Finite tempera-ture quantum annealing solving exponentially small gapproblem with non-monotonic success probability, NatureCommunications 9, 2917 (2018).

[14] B. Gardas, J. Dziarmaga, W. H. Zurek, and M. Zwolak,Defects in Quantum Computers, Sci. Rep. 8, 4539 (2018).

[15] P. Weinberg, M. Tylutki, J. M. Ronkko, J. Westerholm,J. A. Astrom, P. Manninen, P. Torma, and A. W. Sand-vik, Scaling and Diabatic Effects in Quantum Annealingwith a D-Wave Device, Phys. Rev. Lett. 124, 090502(2020).

[16] Y. Bando, Y. Susa, H. Oshiyama, N. Shibata, M. Ohzeki,F. J. Gomez-Ruiz, D. A. Lidar, S. Suzuki, A. del Campo,and H. Nishimori, Probing the universality of topologicaldefect formation in a quantum annealer: Kibble-zurekmechanism and beyond, Phys. Rev. Research 2, 033369(2020).

[17] K. Nishimura, H. Nishimori, and H. G. Katzgraber,Griffiths-McCoy singularity on the diluted Chimeragraph: Monte Carlo simulations and experiments onquantum hardware, Phys. Rev. A 102, 042403 (2020).

[18] A. D. King, C. Nisoli, E. D. Dahl, G. Poulin-Lamarre,and A. Lopez-Bezanilla, Qubit Spin Ice, Science 373, 576(2021).

[19] P. Kairys, A. D. King, I. Ozfidan, K. Boothby, J. Ray-mond, A. Banerjee, and T. S. Humble, Simulating theshastry-sutherland ising model using quantum annealing,

PRX Quantum 1, 020320 (2020).[20] S. Abel and M. Spannowsky, Quantum-field-theoretic

simulation platform for observing the fate of the falsevacuum, PRX Quantum 2, 010349 (2021).

[21] S. Zhou, D. Green, E. D. Dahl, and C. Chamon, Exper-imental realization of classical z2 spin liquids in a pro-grammable quantum device, Phys. Rev. B 104, L081107(2021).

[22] B. Derrida, Random-energy model: An exactly solvablemodel of disordered systems, Phys. Rev. B 24, 2613(1981).

[23] D. J. Gross and M. Mezard, The simplest spin glass, Nucl.Phys. B 240, 431 (1984).

[24] V. Bapst and G. Semerjian, On quantum mean-field mod-els and their quantum annealing, J. Stat. Mech.: TheoryExp. 2012 (6), P06007.

[25] A. Perdomo-Ortiz, S. E. Venegas-Andraca, andA. Aspuru-Guzik, A study of heuristic guesses foradiabatic quantum computation, Quantum Inf. Process.10, 33 (2011).

[26] Y. Yamashiro, M. Ohkuwa, H. Nishimori, and D. A. Li-dar, Dynamics of reverse annealing for the fully con-nected p-spin model, Physical Review A 100, 052321(2019).

[27] N. Chancellor, Modernizing quantum annealing using lo-cal searches, New J. Phys. 19, 023024 (2017), 1606.06833.

[28] M. Ohkuwa, H. Nishimori, and D. A. Lidar, Reverse an-nealing for the fully connected p-spin model, PhysicalReview A 98, 022314 (2018).

[29] D. Ottaviani and A. Amendola, Low Rank Non-Negative Matrix Factorization with D-Wave 2000Q,arXiv:1808.08721 (2018), arXiv:1808.08721.

[30] D. Venturelli and A. Kondratyev, Reverse quantum an-nealing approach to portfolio optimization problems,Quantum Mach. Intell. 1, 17 (2019).

[31] J. Marshall, D. Venturelli, I. Hen, and E. G. Rieffel,Power of Pausing: Advancing Understanding of Ther-malization in Experimental Quantum Annealers, Phys.Rev. App. 11, 044083 (2019).

[32] E. Pelofske, G. Hahn, and H. N. Djidjev, Advanced an-neal paths for improved quantum annealing, in 2020IEEE International Conference on Quantum Computingand Engineering (QCE) (IEEE Computer Society, LosAlamitos, CA, USA, 2020) pp. 256–266.

[33] L. Rocutto, C. Destri, and E. Prati, Quantum semanticlearning by reverse annealing of an adiabatic quantumcomputer, Adv. Quantum Technol. 4, 2000133 (2021).

[34] S. Boixo, T. Albash, F. M. Spedalieri, N. Chancellor, andD. A. Lidar, Experimental signature of programmablequantum annealing, Nat. Commun. 4, 2067 (2013).

[35] S. Boixo, T. F. Ronnow, S. V. Isakov, Z. Wang,D. Wecker, D. A. Lidar, J. M. Martinis, and M. Troyer,Evidence for quantum annealing with more than one hun-dred qubits, Nat. Phys. 10, 218 (2014).

[36] J. A. Smolin and G. Smith, Classical signature of quan-tum annealing, Frontiers in Physics 2, 52 (2014).

[37] S. Boixo, V. N. Smelyanskiy, A. Shabani, S. V. Isakov,M. Dykman, V. S. Denchev, M. H. Amin, A. Y. Smirnov,M. Mohseni, and H. Neven, Computational multiqubittunnelling in programmable quantum annealers, NatCommun 7 (2016).

[38] T. Albash, T. F. Rønnow, M. Troyer, and D. A. Lidar,Reexamining classical and quantum models for the D-Wave One processor, Eur. Phys. J. Spec. Top. 224, 111

Page 25: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

25

(2015).[39] S. W. Shin, G. Smith, J. A. Smolin, and U. Vazirani, How

“quantum” is the D-Wave machine?, arXiv:1401.7087(2014).

[40] T. Albash, W. Vinci, A. Mishra, P. A. Warburton, andD. A. Lidar, Consistency tests of classical and quantummodels for a quantum annealer, Phys. Rev. A 91, 042314(2015).

[41] V. S. Denchev, S. Boixo, S. V. Isakov, N. Ding, R. Bab-bush, V. Smelyanskiy, J. Martinis, and H. Neven, What isthe computational value of finite-range tunneling?, Phys.Rev. X 6, 031015 (2016).

[42] T. Albash, S. Boixo, D. A. Lidar, and P. Zanardi, Quan-tum adiabatic Markovian master equations, New J. Phys.14, 123016 (2012).

[43] D. Xu and J. Cao, Non-canonical distribution and non-equilibrium transport beyond weak system-bath couplingregime: A polaron transformation approach, Frontiers ofPhysics 11, 110308 (2016).

[44] H. Chen and D. A. Lidar, HOQST: Hamiltonian OpenQuantum System Toolkit, arXiv:2011.14046 (2020).

[45] G. Passarelli, K.-W. Yip, D. A. Lidar, H. Nishimori, andP. Lucignano, Reverse quantum annealing of the p-spinmodel with relaxation, Phys. Rev. A 101, 022331 (2020).

[46] R. Babbush, B. O’Gorman, and A. Aspuru-Guzik, Re-source efficient gadgets for compiling adiabatic quantumoptimization problems, Ann. Physik 525, 877 (2013).

[47] D-Wave System Documentation – QPU-Specific Charac-teristics.

[48] V. Choi, Minor-embedding in adiabatic quantum compu-tation: I. the parameter setting problem, Quantum Inf.Process. 7, 193 (2008).

[49] T. Albash, W. Vinci, and D. A. Lidar, Simulated-quantum-annealing comparison between all-to-all con-nectivity schemes, Phys. Rev. A 94, 022327 (2016).

[50] T. Albash and J. Marshall, Comparing relaxation mech-anisms in quantum and classical transverse-field anneal-ing, Phys. Rev. Appl. 15, 014029 (2021).

[51] M. H. Amin, Searching for quantum speedup in qua-sistatic quantum annealers, Phys. Rev. A 92, 052323(2015).

[52] D-Wave System Documentation – Spin-Bath Polariza-tion Effect.

[53] T. Lanting, M. H. Amin, C. Baron, M. Babcock,J. Boschee, S. Boixo, V. N. Smelyanskiy, M. Foygel, andA. G. Petukhov, Probing environmental spin polarizationwith superconducting flux qubits, arXiv:2003.14244.

[54] H. Chen and D. A. Lidar, Why and when pausing is bene-ficial in quantum annealing, Phys. Rev. Appl. 14, 014100(2020).

[55] T. Albash and D. A. Lidar, Decoherence in adiabaticquantum computation, Phys. Rev. A 91, 062320 (2015).

[56] K. W. Yip, T. Albash, and D. A. Lidar, Quantum tra-jectories for time-dependent adiabatic master equations,Phys. Rev. A 97, 022116 (2018).

[57] D. A. Lidar and K. B. Whaley, in Irreversible QuantumDynamics, Lecture Notes in Physics, Vol. 622 (Springer,Berlin, 2003) p. 83.

[58] R. Kubo, Statistical-mechanical theory of irreversibleprocesses. i. general theory and simple applications tomagnetic and conduction problems, J. Phys. Soc. Jpn.12, 570 (1957).

[59] P. C. Martin and J. Schwinger, Theory of many-particlesystems. I, Phys. Rev. 115, 1342 (1959).

[60] T. Albash, I. Hen, F. M. Spedalieri, and D. A. Lidar, Re-examination of the evidence for entanglement in a quan-tum annealer, Phys. Rev. A 92, 062328 (2015).

[61] A. Y. Smirnov and M. H. Amin, Theory of open quantumdynamics with hybrid noise, New J. Phys. 20, 103037(2018).

[62] N. G. Dickson, M. W. Johnson, M. H. Amin, R. Harris,F. Altomare, A. J. Berkley, P. Bunyk, J. Cai, E. M. Chap-ple, P. Chavez, F. Cioata, T. Cirip, P. deBuen, M. Drew-Brook, C. Enderud, S. Gildert, F. Hamze, J. P. Hilton,E. Hoskinson, K. Karimi, E. Ladizinsky, N. Ladizinsky,T. Lanting, T. Mahon, R. Neufeld, T. Oh, I. Perminov,C. Petroff, A. Przybysz, C. Rich, P. Spear, A. Tcaciuc,M. C. Thom, E. Tolkacheva, S. Uchaikin, J. Wang, A. B.Wilson, Z. Merali, and G. Rose, Thermally assisted quan-tum annealing of a 16-qubit problem, Nat. Commun. 4,1903 (2013).

[63] M. H. S. Amin and D. V. Averin, Macroscopic ResonantTunneling in the Presence of Low Frequency Noise, Phys.Rev. Lett. 100, 197001 (2008).

[64] H. Dekker, Noninteracting-blip approximation for a two-level system coupled to a heat bath, Phys. Rev. A 35,1436 (1987).

[65] M. Grifoni and P. Hanggi, Driven quantum tunneling,Phys. Rep. 304, 229 (1998).

[66] H. Rieger and N. Kawashima, Application of a continu-ous time cluster algorithm to the two-dimensional ran-dom quantum Ising ferromagnet, Eur. Phys. J. B 9, 233(1999).

[67] R. Martonak, G. E. Santoro, and E. Tosatti, Quantumannealing by the path-integral Monte Carlo method: Thetwo-dimensional random Ising model, Phys. Rev. B 66,094203 (2002).

[68] V. Bapst and G. Semerjian, Thermal, quantum and sim-ulated quantum annealing: analytical comparisons forsimple models, J. Phys.: Conf. Ser. 473, 012011 (2013).

[69] T. F. Rønnow, Z. Wang, J. Job, S. Boixo, S. V. Isakov,D. Wecker, J. M. Martinis, D. A. Lidar, and M. Troyer,Defining and detecting quantum speedup, Science 345,420 (2014).

[70] Y. Bando and H. Nishimori, Simulated quantum anneal-ing as a simulator of nonequilibrium quantum dynamics,Phys. Rev. A 104, 022607 (2021).

[71] G. E. Santoro and E. Tosatti, Optimization using quan-tum mechanics: quantum annealing through adiabaticevolution, J. Phys. A 39, R393 (2006).

[72] D. Subires, F. J. Gomez-Ruiz, A. Ruiz-Garcıa, D. Alonso,and A. del Campo, Benchmarking quantum anneal-ing dynamics: the spin-vector langevin model (2021),arXiv:2109.09750.

[73] T. Albash, D. A. Lidar, M. Marvian, and P. Zanardi,Fluctuation theorems for quantum processes, Phys. Rev.E 88, 032146 (2013).

[74] T. Lanting, A. J. Przybysz, A. Y. Smirnov, F. M.Spedalieri, M. H. Amin, A. J. Berkley, R. Harris, F. Al-tomare, S. Boixo, P. Bunyk, N. Dickson, C. Enderud,J. P. Hilton, E. Hoskinson, M. W. Johnson, E. Ladizin-sky, N. Ladizinsky, R. Neufeld, T. Oh, I. Perminov,C. Rich, M. C. Thom, E. Tolkacheva, S. Uchaikin, A. B.Wilson, and G. Rose, Entanglement in a quantum an-nealing processor, Phys. Rev. X 4, 021041 (2014).

[75] A. D. King, S. Suzuki, J. Raymond, A. Zucca, T. Lanting,F. Altomare, A. J. Berkley, S. Ejtemaee, E. Hoskinson,S. Huang, E. Ladizinsky, A. MacDonald, G. Marsden,

Page 26: 1, 2,3,4,5 and Hidetoshi Nishimori1,6,7

26

T. Oh, G. Poulin-Lamarre, M. Reis, C. Rich, Y. Sato,J. D. Whittaker, J. Yao, R. Harris, D. A. Lidar, H. Nishi-mori, and M. H. Amin, Coherent quantum annealing in aprogrammable 2000-qubit Ising chain, arXiv:2202.05847(2022).

[76] D. Venturelli, S. Mandra, S. Knysh, B. O’Gorman,R. Biswas, and V. Smelyanskiy, Quantum optimizationof fully connected spin glasses, Phys. Rev. X 5, 031040(2015).

[77] J. Job and D. Lidar, Test-driving 1000 qubits, Quantum

Sci. Tech. 3, 030501 (2018).[78] M.E. Rose, Elementary Theory of Angular Momentum

(Dover, New York, 1995).[79] D. A. Lidar, Lecture notes on the theory of open quantum

systems, arXiv preprint arXiv:1902.00967 (2019).[80] S. Muthukrishnan, T. Albash, and D. A. Lidar, Tunneling

and Speedup in Quantum Optimization for Permutation-Symmetric Problems, Phys. Rev. X 6, 031010 (2016).