TRANSCRIPT
Strong converse exponent for classical-quantum channel coding
Milán Mosonyi 1,2 and Tomohiro Ogawa 3
1 Física Teòrica: Informació i Fenòmens Quàntics, Universitat Autònoma de Barcelona
2 Mathematical Institute, Budapest University of Technology and Economics
3 Graduate School of Information Systems, University of Electro-Communications, Tokyo
Beyond I.I.D. Banff 2015
Main result
• When coding with a rate R above the Holevo capacity of a classical-quantum channel W : X → S(H), the optimal asymptotics of the success probability is

  P_s ∼ e^{−n H_{R,c}(W)},   H_{R,c}(W) := sup_{α>1} (α−1)/α [ R − sup_P inf_σ Σ_{x∈X} P(x) D*_α(W(x)‖σ) ],

  where D*_α(·‖·) is the sandwiched Rényi divergence.
• Operational interpretation of a Rényi quantity outside the context of hypothesis testing.
• Utilizes a new family of quantum Rényi divergences:

  D♭_α(ϱ‖σ) := 1/(α−1) log Tr e^{α log ϱ + (1−α) log σ}
Strong converse things
• Achievable rate: coding below it, the error probability can be made to go to zero asymptotically.
• Strong converse rate: coding above it, the error probability goes to one asymptotically.
• Strong converse property: the smallest strong converse rate coincides with the largest achievable rate.
• Strong converse exponent: the exact exponent of the decaying success probability at a given rate above the smallest strong converse rate.
Binary state discrimination
• Two candidates for the true state of a system: H_0 : ϱ vs. H_1 : σ
• Many identical copies are available: H_0 : ϱ^{⊗n} vs. H_1 : σ^{⊗n}
• The decision is based on a binary POVM (T, I − T) on H^{⊗n}.
• Error probabilities: α_n(T) := Tr ϱ^{⊗n}(I_n − T) (first kind), β_n(T) := Tr σ^{⊗n} T (second kind)
• Trade-off: min_{0≤T≤I} {α_n(T) + β_n(T)} > 0 unless ϱ^{⊗n} ⊥ σ^{⊗n}
• Quantum Stein's lemma:¹ α_n(T_n) → 0 ⟹ β_n(T_n) ∼ e^{−n D(ϱ‖σ)} is the optimal decay, where D(ϱ‖σ) := Tr ϱ(log ϱ − log σ) is the relative entropy.²

¹Hiai, Petz 1991; Ogawa, Nagaoka 2001. ²Umegaki 1962.
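The trade-off bullet can be checked numerically: by Helstrom's theorem, the minimal total error equals 1 − ½‖ϱ^{⊗n} − σ^{⊗n}‖₁, which is positive whenever the states are not orthogonal and decays exponentially in n. A small numpy sketch (illustrative; the example states are assumptions, not from the talk):

```python
import numpy as np

def min_total_error(rho, sigma):
    """Helstrom: min_{0 <= T <= I} {alpha(T) + beta(T)} = 1 - (1/2)||rho - sigma||_1.

    The optimal test T is the projection onto the positive part of rho - sigma.
    """
    eigvals = np.linalg.eigvalsh(rho - sigma)  # rho - sigma is Hermitian
    return 1.0 - 0.5 * np.sum(np.abs(eigvals))

# Two non-orthogonal qubit states: the minimal total error stays positive
# for every n, but shrinks as the number of copies grows.
zero = np.array([[1.0, 0.0], [0.0, 0.0]])  # |0><0|
plus = np.array([[0.5, 0.5], [0.5, 0.5]])  # |+><+|
rho_n, sigma_n = zero.copy(), plus.copy()
errors = []
for n in range(1, 4):
    errors.append(min_total_error(rho_n, sigma_n))
    rho_n = np.kron(rho_n, zero)
    sigma_n = np.kron(sigma_n, plus)
```

For n = 1 this gives 1 − √½ ≈ 0.293, and the sequence decreases toward zero without ever reaching it.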
Quantifying the trade-off
• Stein's lemma: α_n(T_n) → 0 ⟹ β_n(T_n) ∼ e^{−n D_1(ϱ‖σ)}
• Direct domain: quantum Hoeffding bound¹
  β_n(T_n) ∼ e^{−nr} ⟹ α_n(T_n) ∼ e^{−n H_r},  r < D_1(ϱ‖σ)
• Converse domain: quantum Han–Kobayashi bound²
  β_n(T_n) ∼ e^{−nr} ⟹ α_n(T_n) ∼ 1 − e^{−n H*_r},  r > D_1(ϱ‖σ)
• Hoeffding (anti-)divergences:
  H_r := sup_{0<α<1} (α−1)/α [r − D_α(ϱ‖σ)]
  H*_r := sup_{α>1} (α−1)/α [r − D*_α(ϱ‖σ)]

¹Hayashi; Nagaoka; Audenaert, Nussbaum, Szkoła, Verstraete; 2006. ²Mosonyi, Ogawa 2013.
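For commuting (classical) states the Hoeffding quantities reduce to optimizations of the classical Rényi divergence over α, which can be approximated on a grid. A sketch (the grid bounds and example distributions are illustrative assumptions, not from the talk):

```python
import numpy as np

def renyi_div(p, q, a):
    """Classical Renyi divergence D_a(p||q) = log(sum_i p_i^a q_i^(1-a)) / (a - 1)."""
    return np.log(np.sum(p**a * q**(1 - a))) / (a - 1)

def hoeffding(p, q, r, alphas=np.linspace(0.01, 0.99, 99)):
    """H_r = sup_{0<a<1} (a-1)/a [r - D_a(p||q)], approximated on an alpha grid."""
    return max((a - 1) / a * (r - renyi_div(p, q, a)) for a in alphas)

def anti_hoeffding(p, q, r, alphas=np.linspace(1.01, 50.0, 200)):
    """H*_r = sup_{a>1} (a-1)/a [r - D*_a(p||q)]; classically D*_a = D_a."""
    return max((a - 1) / a * (r - renyi_div(p, q, a)) for a in alphas)

p = np.array([0.7, 0.3])
q = np.array([0.4, 0.6])
D1 = float(np.sum(p * np.log(p / q)))  # relative entropy, the Stein exponent
# H_r is positive in the direct domain r < D1, H*_r in the converse domain r > D1.
```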
Quantum Rényi divergences
• Quantum Rényi divergences:¹
  D_α(ϱ‖σ) := 1/(α−1) log Tr ϱ^α σ^{1−α}
  D*_α(ϱ‖σ) := 1/(α−1) log Tr (ϱ^{1/2} σ^{(1−α)/α} ϱ^{1/2})^α
• The right quantum definition is
  D^q_α(ϱ‖σ) := D_α(ϱ‖σ) for α ∈ [0,1),  D*_α(ϱ‖σ) for α ∈ (1,+∞].
• Supported by further binary hypothesis testing results: Hayashi, Tomamichel 2014; Cooney, Mosonyi, Wilde 2014.

¹Petz 1986; Müller-Lennert, Dupuis, Szehr, Fehr, Tomamichel 2013; Wilde, Winter, Yang 2013.
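Both divergences are straightforward to evaluate numerically through eigendecompositions. A minimal sketch, assuming full-rank states so that negative matrix powers exist (illustrative, not from the talk):

```python
import numpy as np

def mpow(A, p):
    """Fractional power of a positive definite Hermitian matrix."""
    w, V = np.linalg.eigh(A)
    return (V * w**p) @ V.conj().T

def petz_renyi(rho, sigma, a):
    """D_a(rho||sigma) = log Tr[rho^a sigma^(1-a)] / (a - 1)."""
    return np.log(np.trace(mpow(rho, a) @ mpow(sigma, 1 - a)).real) / (a - 1)

def sandwiched_renyi(rho, sigma, a):
    """D*_a(rho||sigma) = log Tr[(rho^(1/2) sigma^((1-a)/a) rho^(1/2))^a] / (a - 1)."""
    r = mpow(rho, 0.5)
    inner = r @ mpow(sigma, (1 - a) / a) @ r
    return np.log(np.trace(mpow(inner, a)).real) / (a - 1)

# For commuting states both reduce to the classical Renyi divergence;
# in general D*_a <= D_a by the Araki-Lieb-Thirring inequality.
rho = np.array([[0.6, 0.2], [0.2, 0.4]])
sigma = np.array([[0.7, -0.1], [-0.1, 0.3]])
```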
Yet another quantum Rényi divergence
• New quantum Rényi divergence connected to classical variational formulas:
  D♭_α(ϱ‖σ) := 1/(α−1) log Tr e^{α log ϱ + (1−α) log σ}
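D♭_α is just as easy to evaluate numerically, since the matrix exponential and logarithm of a Hermitian matrix reduce to scalar functions of its eigenvalues. A sketch, assuming full-rank states (illustrative, not from the talk):

```python
import numpy as np

def hfun(A, f):
    """Apply a scalar function to a Hermitian matrix via its eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

def flat_renyi(rho, sigma, a):
    """D-flat_a(rho||sigma) = log Tr exp(a log rho + (1-a) log sigma) / (a - 1)."""
    M = a * hfun(rho, np.log) + (1 - a) * hfun(sigma, np.log)
    return np.log(np.trace(hfun(M, np.exp)).real) / (a - 1)
```

For commuting states this again reduces to the classical Rényi divergence.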
A new Rényi divergence
Hoeffding (anti-)divergences:
  H_r = sup_{0<α<1} (α−1)/α [r − D_α(ϱ‖σ)]
  H*_r = sup_{α>1} (α−1)/α [r − D*_α(ϱ‖σ)]
If ϱσ = σϱ, these admit the variational expressions
  H_r = inf_{D(τ‖σ)≤r} D(τ‖ϱ)
  H*_r = inf_{D(τ‖σ)≤r} {D(τ‖ϱ) + r − D(τ‖σ)}
If ϱσ ≠ σϱ, the equalities fail for D_α and D*_α:
  H_r ≠ inf_{D(τ‖σ)≤r} D(τ‖ϱ),   H*_r ≠ inf_{D(τ‖σ)≤r} {D(τ‖ϱ) + r − D(τ‖σ)}
but they are restored by the new divergence D♭_α:
  sup_{0<α<1} (α−1)/α [r − D♭_α(ϱ‖σ)] = inf_{D(τ‖σ)≤r} D(τ‖ϱ)
  sup_{α>1} (α−1)/α [r − D♭_α(ϱ‖σ)] = inf_{D(τ‖σ)≤r} {D(τ‖ϱ) + r − D(τ‖σ)}
Direct Rényi divergence
Q_α(ϱ‖σ) := Tr ϱ^α σ^{1−α}
"old", "Petz type", "WYD type", "f-divergence type", "non-sandwiched" — here: "direct Rényi divergence"
• Operational interpretation: α ∈ (0,1), Hoeffding bound¹
• Recovery from classical: Nussbaum–Szkoła distributions²
  ϱ = Σ_i r_i P_i,  σ = Σ_j s_j Q_j
  p(i,j) := r_i Tr P_i Q_j,  q(i,j) := s_j Tr P_i Q_j
  Q_α(ϱ‖σ) = Q_α(p‖q)
• Variational expression: ?
• Joint convexity/concavity and monotonicity:³ α ∈ [0,2]

¹Hayashi 2006; Nagaoka 2006. ²Nussbaum, Szkoła 2006. ³Lieb 1973; Ando 1979; Petz 1986.
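The Nussbaum–Szkoła reduction is easy to verify numerically: the classical pair (p, q) built from the spectral data of (ϱ, σ) reproduces Q_α exactly. A sketch (illustrative; the example states are assumptions, not from the talk):

```python
import numpy as np

def nussbaum_szkola(rho, sigma):
    """Classical pair p(i,j) = r_i Tr[P_i Q_j], q(i,j) = s_j Tr[P_i Q_j]."""
    r, V = np.linalg.eigh(rho)    # rho   = sum_i r_i |v_i><v_i|
    s, W = np.linalg.eigh(sigma)  # sigma = sum_j s_j |w_j><w_j|
    overlap = np.abs(V.conj().T @ W) ** 2  # Tr[P_i Q_j] = |<v_i|w_j>|^2
    return r[:, None] * overlap, s[None, :] * overlap

def petz_q(rho, sigma, a):
    """Q_a(rho||sigma) = Tr[rho^a sigma^(1-a)]."""
    r, V = np.linalg.eigh(rho)
    s, W = np.linalg.eigh(sigma)
    return np.trace((V * r**a) @ V.conj().T @ (W * s**(1 - a)) @ W.conj().T).real

rho = np.array([[0.6, 0.2], [0.2, 0.4]])
sigma = np.array([[0.7, -0.1], [-0.1, 0.3]])
p, q = nussbaum_szkola(rho, sigma)  # both are probability distributions on pairs (i,j)
```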
Converse Rényi divergence
Q*_α(ϱ‖σ) := Tr (ϱ^{1/2} σ^{(1−α)/α} ϱ^{1/2})^α
"new", "sandwiched", "minimal" — here: "converse Rényi divergence"
• Operational interpretation:¹ α ∈ (1,+∞), strong converse exponent
• Recovery from classical:¹,²
  σ = Σ_j s_j Q_j,  pinching: P_σ(X) := Σ_j Q_j X Q_j
  D*_α(ϱ‖σ) = lim_{n→+∞} (1/n) D_α(P_{σ^{⊗n}}(ϱ^{⊗n})‖σ^{⊗n})
• Variational expression:³ with s(α) = sign(α−1),
  Q*_α(ϱ‖σ) = s(α) sup_{H≥0} s(α){α Tr Hϱ + (1−α) Tr (H^{1/2} σ^{(α−1)/α} H^{1/2})^{α/(α−1)}}
• Joint convexity/concavity and monotonicity: α ∈ [1/2,+∞)

¹Mosonyi, Ogawa 2013. ²Hayashi, Tomamichel 2014. ³Frank, Lieb 2013.
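The pinching map used in the classical reduction can be implemented directly from σ's spectral projections: it is trace-preserving, and its output commutes with σ. A sketch (the tolerance for grouping degenerate eigenvalues is an illustrative assumption, not from the talk):

```python
import numpy as np

def pinch(X, sigma, tol=1e-10):
    """Pinching P_sigma(X) = sum_j Q_j X Q_j over sigma's spectral projections Q_j."""
    w, V = np.linalg.eigh(sigma)
    out = np.zeros_like(X, dtype=complex)
    done = np.zeros(len(w), dtype=bool)
    for i in range(len(w)):
        if done[i]:
            continue
        idx = np.abs(w - w[i]) < tol  # group (near-)degenerate eigenvalues
        done |= idx
        Q = V[:, idx] @ V[:, idx].conj().T  # spectral projection of sigma
        out += Q @ X @ Q
    return out

# With non-degenerate sigma, pinching fully dephases X in sigma's eigenbasis.
sigma = np.diag([0.5, 0.3, 0.2])
X = np.array([[1.0, 2.0, 0.5], [2.0, 0.0, 1.0], [0.5, 1.0, 3.0]])
PX = pinch(X, sigma)
```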
Variational Rényi divergence
Q♭_α(ϱ‖σ) := Tr e^{α log ϱ + (1−α) log σ}
• Operational interpretation: ?
• Recovery from classical: ?
• Variational expression:¹
  Q♭_α(ϱ‖σ) = max_{τ≥0} {Tr τ − α D(τ‖ϱ) − (1−α) D(τ‖σ)}
  log Q♭_α(ϱ‖σ) = max_{τ∈S(H)} {−α D(τ‖ϱ) − (1−α) D(τ‖σ)}
  The max is attained at τ_α = e^{α log ϱ + (1−α) log σ}/Tr e^{α log ϱ + (1−α) log σ}.
• Joint concavity and monotonicity: α ∈ [0,1]

¹Mosonyi, Ogawa 2013.
Variational Rényi divergence
Q♭_α(ϱ‖σ) := Tr e^{α log ϱ + (1−α) log σ}
• Variational expression:
  Q♭_α(ϱ‖σ) = max_{τ≥0} {Tr τ − α D(τ‖ϱ) − (1−α) D(τ‖σ)}
  log Q♭_α(ϱ‖σ) = max_{τ∈S(H)} {−α D(τ‖ϱ) − (1−α) D(τ‖σ)}
• Equivalent forms:¹
  Tr e^{H + log A} = max_{τ>0} {Tr τ + Tr τH − D(τ‖A)}
  log Tr e^{H + log A} = max_{τ∈S(H)} {Tr τH − D(τ‖A)}
• Here H = α log ϱ, A = σ^{1−α}.

¹Tropp 2011; Hiai, Petz 1993.
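The equivalent forms are the quantum Gibbs variational principle, and they can be spot-checked numerically: the Gibbs-type state e^{H+log A}/Tr e^{H+log A} attains log Tr e^{H+log A} exactly, while any other state does no better. A sketch (H and A below are arbitrary test matrices, not from the talk):

```python
import numpy as np

def hfun(A, f):
    """Scalar function of a Hermitian matrix via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

def rel_entropy(tau, A):
    """Umegaki relative entropy D(tau||A) = Tr tau (log tau - log A)."""
    return np.trace(tau @ (hfun(tau, np.log) - hfun(A, np.log))).real

H = np.array([[0.3, 0.1], [0.1, -0.2]])  # Hermitian
A = np.array([[0.6, 0.1], [0.1, 0.4]])   # positive definite

lhs = np.log(np.trace(hfun(H + hfun(A, np.log), np.exp)).real)
tau_star = hfun(H + hfun(A, np.log), np.exp)
tau_star /= np.trace(tau_star).real      # the claimed maximizer
val_star = np.trace(tau_star @ H).real - rel_entropy(tau_star, A)
tau_other = np.eye(2) / 2                # any other state gives a smaller value
val_other = np.trace(tau_other @ H).real - rel_entropy(tau_other, A)
```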
Ordering
  D*_α ≤ D_α ≤ D♭_α,  α ∈ [0,1)
  D♭_α ≤ D*_α ≤ D_α,  α > 1
The comparisons between D*_α and D_α follow from the Araki–Lieb–Thirring inequality; those involving D♭_α from the Golden–Thompson inequality.
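The ordering can be verified numerically on a non-commuting pair of full-rank states; the sketch below combines the three definitions from the previous slides (illustrative, not from the talk):

```python
import numpy as np

def mfun(A, f):
    """Scalar function of a Hermitian matrix via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

def d_petz(rho, sigma, a):
    return np.log(np.trace(mfun(rho, lambda w: w**a)
                           @ mfun(sigma, lambda w: w**(1 - a))).real) / (a - 1)

def d_sand(rho, sigma, a):
    r = mfun(rho, np.sqrt)
    inner = r @ mfun(sigma, lambda w: w**((1 - a) / a)) @ r
    return np.log(np.trace(mfun(inner, lambda w: w**a)).real) / (a - 1)

def d_flat(rho, sigma, a):
    M = a * mfun(rho, np.log) + (1 - a) * mfun(sigma, np.log)
    return np.log(np.trace(mfun(M, np.exp)).real) / (a - 1)

rho = np.array([[0.6, 0.2], [0.2, 0.4]])
sigma = np.array([[0.7, -0.1], [-0.1, 0.3]])
# alpha > 1: D_flat <= D* <= D;  alpha in (0,1): D* <= D <= D_flat
order_gt1 = [(d_flat(rho, sigma, a), d_sand(rho, sigma, a), d_petz(rho, sigma, a))
             for a in (1.5, 2.0, 3.0)]
order_lt1 = [(d_sand(rho, sigma, a), d_petz(rho, sigma, a), d_flat(rho, sigma, a))
             for a in (0.3, 0.6, 0.9)]
```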
Classical-quantum channels: definition
• Classical-quantum channel:
  W : X → S(H)
• X : an arbitrary set, the input alphabet.
  Special case: X = S(H_A) with W CPTP — a quantum channel.
• I.i.d. extensions:
  W^{⊗n} : x ↦ W(x_1) ⊗ … ⊗ W(x_n),  x ∈ X^n
  Special case: a quantum channel with product encoding.
Holevo capacity
• Lifted channel:
  W : X → S(H_X ⊗ H),  W(x) := |x⟩⟨x| ⊗ W(x),
  where |x⟩⟨x|, x ∈ X, is a set of orthogonal rank-1 projections on H_X.
• For P ∈ P_f(X), a finitely supported probability distribution,
  W(P) := Σ_{x∈X} P(x) |x⟩⟨x| ⊗ W(x)
  with marginals Tr_H W(P) = Σ_{x∈X} P(x)|x⟩⟨x| =: P and Tr_{H_X} W(P) = Σ_{x∈X} P(x) W(x) =: W(P).
• Holevo quantity:
  χ(W,P) := D(W(P)‖P ⊗ W(P))
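The Holevo quantity can be computed either from the relative-entropy definition above or from the equivalent entropy difference H(W(P)) − Σ_x P(x)H(W(x)). The sketch below checks that both agree on a binary channel with pure outputs (the example channel is an assumption, not from the talk):

```python
import numpy as np

def entropy(rho):
    """von Neumann entropy H(rho) = -Tr rho log rho (natural log)."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return float(-np.sum(w * np.log(w)))

def rel_entropy(rho, sigma):
    """D(rho||sigma) for sigma full rank (rho may be rank deficient)."""
    w, V = np.linalg.eigh(sigma)
    log_sigma = (V * np.log(w)) @ V.conj().T
    return -entropy(rho) - np.trace(rho @ log_sigma).real

# binary cq channel with two pure qubit outputs, uniform input distribution
W0 = np.array([[1.0, 0.0], [0.0, 0.0]])  # |0><0|
W1 = np.array([[0.5, 0.5], [0.5, 0.5]])  # |+><+|
P = [0.5, 0.5]
avg = P[0] * W0 + P[1] * W1              # output state W(P)

chi_entropy = entropy(avg) - (P[0] * entropy(W0) + P[1] * entropy(W1))

lifted = np.zeros((4, 4))                # lifted state: block-diagonal in x
lifted[:2, :2] = P[0] * W0
lifted[2:, 2:] = P[1] * W1
chi_rel = rel_entropy(lifted, np.kron(np.diag(P), avg))
```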
Rényi capacities
  W(P) := Σ_{x∈X} P(x)|x⟩⟨x| ⊗ W(x)
• Rényi mutual informations:
  χ^{(t)}_{α,1}(W,P) := inf_{σ∈S(H)} D^{(t)}_α(W(P)‖P ⊗ σ)
  χ^{(t)}_{α,2}(W,P) := inf_{σ∈S(H)} Σ_x P(x) D^{(t)}_α(W(x)‖σ)
• Rényi capacities:
  χ^{(t)}_α(W) := sup_{P∈P_f(X)} χ^{(t)}_{α,1}(W,P)
  = sup_{P∈P_f(X)} χ^{(t)}_{α,2}(W,P)
  = inf_{σ∈S(H)} sup_{x∈X} D^{(t)}_α(W(x)‖σ)
Classical-quantum channels: codes
• Code: C_n = (E_n, D_n), with E_n = (x_1, …, x_{M_n}) ∈ (X^n)^{M_n} and a POVM D_n(1), …, D_n(M_n).
• Size of the code: |C_n| = M_n
• Average success probability:
  P_s(C_n) := (1/|C_n|) Σ_{m=1}^{|C_n|} Tr W^{⊗n}(E_n(m)) D_n(m)
• Channel capacity:¹
  C(W) := sup_{{C_n}_{n∈N}} { liminf_{n→+∞} (1/n) log |C_n| : lim_{n→+∞} P_s(C_n) = 1 } = χ(W)
• Strong converse exponent:
  |C_n| ∼ e^{nr} ⟹ P_s(C_n) ∼ e^{−n·sc(r)}

¹Holevo 1997; Schumacher, Westmoreland 1997.
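As a toy illustration of the success-probability formula (not the talk's construction): for n = 1 and two messages, one can decode with the square-root ("pretty good") measurement built from the encoded states. The channel and measurement choice here are assumptions for illustration only.

```python
import numpy as np

def mfun(A, f):
    w, V = np.linalg.eigh(A)
    return (V * f(w)) @ V.conj().T

def pgm_success(states, probs):
    """Average success probability of the square-root measurement.

    POVM: D_m = S^(-1/2) p_m rho_m S^(-1/2) with S = sum_m p_m rho_m (full rank),
    so  P_s = sum_m p_m Tr[rho_m D_m].
    """
    S = sum(p * r for p, r in zip(probs, states))
    S_is = mfun(S, lambda w: w ** -0.5)  # S^(-1/2), assumes S full rank
    return sum(p * np.trace(r @ S_is @ (p * r) @ S_is).real
               for p, r in zip(probs, states))

# two equiprobable pure-state codewords |0> and |+>
W0 = np.array([[1.0, 0.0], [0.0, 0.0]])
W1 = np.array([[0.5, 0.5], [0.5, 0.5]])
ps = pgm_success([W0, W1], [0.5, 0.5])
```

The value lies strictly between random guessing (1/2) and can never exceed the Helstrom optimum ½ + ¼‖W0 − W1‖₁.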
Strong converse exponent: lower bound
Lemma:
  sc(r) ≥ sup_{α>1} (α−1)/α [r − χ*_α(W)]
This follows by a standard argument due to Nagaoka, using the monotonicity of D*_α for α > 1.
Dueck-Körner upper bound
Theorem:¹
  sc(r) ≤ inf_{P∈P_f(X)} inf_{V:X→S(H)} { D(V(P)‖W(P)) + |r − χ(V,P)|₊ }
Proof idea: For every message k,
  Tr (V^{⊗n}(E_n(k)) − e^{na} W^{⊗n}(E_n(k)))₊ ≥ Tr (V^{⊗n}(E_n(k)) − e^{na} W^{⊗n}(E_n(k))) D_n(k),
and hence
  P_s(W^{⊗n}, C_n) ≥ e^{−na} { P_s(V^{⊗n}, C_n) − (1/M_n) Σ_{k=1}^{M_n} Tr (V^{⊗n}(E_n(k)) − e^{na} W^{⊗n}(E_n(k)))₊ }.
Taking the expectation E over random coding with M_n = ⌈e^{nr}⌉:
  E[P_s(W^{⊗n}, C_n)] ≥ e^{−na} { E[P_s(V^{⊗n}, C_n)] − Σ_{x∈X^n} P^n(x) Tr (V^{⊗n}(x) − e^{na} W^{⊗n}(x))₊ }
  = e^{−na} { E[P_s(V^{⊗n}, C_n)] − Tr (V(P)^{⊗n} − e^{na} W(P)^{⊗n})₊ }.
If r < χ(V,P) and a > D(V(P)‖W(P)), then E[P_s(W^{⊗n}, C_n)] ≥ e^{−na} for n large enough, and so
  sc(r) ≤ inf_{V: χ(V,P)>r} D(V(P)‖W(P)),
  sc(r) ≤ inf_{V: χ(V,P)≤r} { D(V(P)‖W(P)) + r − χ(V,P) }.

¹Dueck, Körner 1979; Mosonyi, Ogawa 2014.
Dueck-Körner upper bound
• Theorem:
  sc(r) ≤ inf_{P∈P_f(X)} inf_{V:X→S(H)} { D(V(P)‖W(P)) + |r − χ(V,P)|₊ }
• Theorem:
  inf_{V:X→S(H)} { D(V(P)‖W(P)) + |r − χ(V,P)|₊ } = sup_{α>1} (α−1)/α [r − χ♭_{α,2}(W,P)]
• Theorem:
  inf_P sup_{α>1} (α−1)/α [r − χ♭_{α,2}(W,P)]
  = sup_{α>1} inf_P (α−1)/α [r − χ♭_{α,2}(W,P)]
  = sup_{α>1} (α−1)/α [r − sup_P χ♭_{α,2}(W,P)]
  = sup_{α>1} (α−1)/α [r − sup_P χ♭_{α,1}(W,P)]
Dueck-Körner upper bound
• ∀ W, ∀ r ∃ codes {C_k}_{k∈N} with rate r s.t.
  liminf_k (1/k) log P_s(W^{⊗k}, C_k) ≥ − sup_{α>1} (α−1)/α [r − sup_P χ♭_{α,1}(W,P)]
• Let σ_m be a universal symmetric state on H^{⊗m}:
  ∀ ω ∈ S_symm(H^{⊗m}) : ω ≤ v_{m,d} σ_m,  v_{m,d} ≤ (m+1)^{(d+2)(d−1)/2}
• Pinched channel:
  W_m : x ↦ E_{σ_m}(W(x_1) ⊗ … ⊗ W(x_m)),  x ∈ X^m
Dueck-Körner upper bound
• Applying this to the pinched channel W_m at rate rm: ∀ W, ∀ r ∃ codes {C_k}_{k∈N} with rate rm s.t.
  liminf_k (1/k) log P_s(W_m^{⊗k}, C_k) ≥ − sup_{α>1} (α−1)/α [rm − sup_P χ♭_{α,1}(W_m, P)]
• From these, construct codes {C̃_n}_{n∈N} with rate r s.t.
  liminf_n (1/n) log P_s(W^{⊗n}, C̃_n) = (1/m) liminf_k (1/k) log P_s(W_m^{⊗k}, C_k)
  ≥ − sup_{α>1} (α−1)/α [r − (1/m) sup_P χ♭_{α,1}(W_m, P)]
Dueck-Körner upper bound
Construct codes {C̃_n}_{n∈N} with rate r s.t.
  liminf_n (1/n) log P_s(W^{⊗n}, C̃_n) = (1/m) liminf_k (1/k) log P_s(W_m^{⊗k}, C_k)
  ≥ − sup_{α>1} (α−1)/α [r − (1/m) sup_P χ♭_{α,1}(W_m, P)]
  ≥ − sup_{α>1} (α−1)/α [r − sup_P χ*_{α,1}(W,P)] − f(m),
using
  χ♭_α(W_m) = sup_{P_m∈P_f(X^m)} χ♭_{α,1}(E_m W^{⊗m}, P_m)
  ≥ sup_{P∈P_f(X)} χ♭_{α,1}(E_m W^{⊗m}, P^{⊗m})
  ≥ sup_{P∈P_f(X)} χ*_{α,1}(W^{⊗m}, P^{⊗m}) − 3 log v_{m,d}
  = m sup_{P∈P_f(X)} χ*_{α,1}(W,P) − 3 log v_{m,d}
  = m χ*_α(W) − 3 log v_{m,d}.
Summary
• When coding with a rate R above the Holevo capacity of a classical-quantum channel W : X → S(H), the optimal asymptotics of the success probability is

  P_s ∼ e^{−n H_{R,c}(W)},   H_{R,c}(W) := sup_{α>1} (α−1)/α [R − χ*_α(W)],

  where D*_α(·‖·) is the sandwiched Rényi divergence.
• Operational interpretation of the Rényi capacity χ*_α(W), α > 1.
• Utilizes a new family of quantum Rényi divergences:

  D♭_α(ϱ‖σ) := 1/(α−1) log Tr e^{α log ϱ + (1−α) log σ}
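For a channel with commuting outputs every quantity above reduces to its classical counterpart, so the exponent H_{R,c} can be approximated by a grid search over α. A sketch for a binary symmetric channel (the channel, the symmetry argument for the optimal output state, and the grid are illustrative assumptions, not from the talk):

```python
import numpy as np

def chi_alpha(eps, a):
    """Renyi capacity of a binary symmetric channel with flip probability eps.

    By symmetry the optimal output state is uniform, giving
    chi_a = log 2 + log((1-eps)^a + eps^a) / (a - 1).
    """
    return np.log(2) + np.log((1 - eps)**a + eps**a) / (a - 1)

def sc_exponent(eps, R, alphas=np.linspace(1.001, 60.0, 400)):
    """H_{R,c} = sup_{a>1} (a-1)/a [R - chi_a], approximated on a grid.

    The supremum is clamped at 0, its limiting value as a -> 1+.
    """
    return max(0.0, max((a - 1) / a * (R - chi_alpha(eps, a)) for a in alphas))

eps = 0.1
h = -(eps * np.log(eps) + (1 - eps) * np.log(1 - eps))  # binary entropy
capacity = np.log(2) - h                                 # Holevo capacity
above = sc_exponent(eps, capacity + 0.2)  # rate above capacity: positive exponent
below = sc_exponent(eps, capacity - 0.2)  # rate below capacity: zero exponent
```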