TRANSCRIPT
Information Capacity of the BSC Permutation Channel
Anuran Makur
EECS Department, Massachusetts Institute of Technology
Allerton Conference 2018
A. Makur (MIT) Capacity of BSC Permutation Channel 5 October 2018 1 / 21
Outline

1. Introduction
   - Motivation: Coding for Communication Networks
   - The Permutation Channel Model
   - Capacity of the BSC Permutation Channel
2. Achievability
3. Converse
4. Conclusion
Motivation: Point-to-point Communication in Networks

[Figure: a sender and receiver communicating across a network]

Model the communication network as a channel:
- Alphabet symbols = all possible L-bit packets ⇒ 2^L input symbols
- Multipath routed network or evolving network topology ⇒ packets received with transpositions
- Packets are impaired (e.g. deletions, substitutions) ⇒ model using channel probabilities
Example: Coding for Random Deletion Network

Consider a communication network where packets can be dropped:

[Figure: sender and receiver connected by a network, modeled as a random deletion channel followed by a random permutation; equivalently, an erasure channel followed by a random permutation]

Abstraction:
- n-length codeword = sequence of n packets
- Random deletion channel: delete each symbol/packet of the codeword independently with probability p ∈ (0, 1); equivalently, an erasure channel that erases each packet independently with probability p
- Random permutation block: randomly permute the packets of the codeword
- Coding: add sequence numbers and use standard coding techniques (packet size = L + log(n) bits, alphabet size = n · 2^L)
- More refined coding techniques simulate sequence numbers, e.g. [Mitzenmacher 2006], [Metzner 2009]

How do you code in such channels without increasing the alphabet size?
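As a concrete sketch (not from the talk), the sequence-numbering workaround is easy to simulate; the `transmit` helper, packet contents, and deletion probability below are all illustrative choices:

```python
import random

def transmit(packets, p, rng):
    """Random deletion network: drop each packet independently with
    probability p, then deliver the survivors in a random order."""
    survivors = [pkt for pkt in packets if rng.random() >= p]
    rng.shuffle(survivors)
    return survivors

# Coding with sequence numbers: prepend each L-bit payload with its index,
# which costs log(n) extra bits per packet and lets the receiver undo the
# random permutation by sorting.
payloads = ["1010", "0011", "1111", "0000"]   # n = 4 packets, L = 4 bits
coded = list(enumerate(payloads))              # (sequence number, payload)

rng = random.Random(0)
received = transmit(coded, p=0.25, rng=rng)

# Decoder: sort by sequence number; the missing positions now look like
# erasures, which an outer erasure code would repair (omitted here).
recovered = dict(sorted(received))
print(recovered)
```

This is exactly the alphabet-size blow-up the slide flags: the coded symbols live in an alphabet of size n · 2^L instead of 2^L.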
The Permutation Channel Model

[Figure: M → ENCODER → X → CHANNEL → Z → RANDOM PERMUTATION → Y → DECODER → M̂]

- Sender sends message M ∼ Uniform(M)
- Possibly randomized encoder f_n : M → X^n produces codeword X_1^n = (X_1, ..., X_n) = f_n(M) (with block-length n)
- Discrete memoryless channel P_{Z|X} with input and output alphabets X and Y produces Z_1^n:

  P_{Z_1^n | X_1^n}(z_1^n | x_1^n) = ∏_{i=1}^{n} P_{Z|X}(z_i | x_i)

- Random permutation generates Y_1^n from Z_1^n
- Possibly randomized decoder g_n : Y^n → M produces estimate M̂ = g_n(Y_1^n) at the receiver
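A minimal simulation of this pipeline (illustrative Python, using a BSC as the memoryless channel) makes the key feature visible: after the uniform random permutation, only the empirical distribution of the channel output survives:

```python
import random

def permutation_channel(x, flip_prob, rng):
    """Memoryless channel (here a BSC) followed by a uniform random
    permutation of the output symbols."""
    z = [xi ^ (rng.random() < flip_prob) for xi in x]  # i.i.d. bit flips
    rng.shuffle(z)                                     # random permutation
    return z

rng = random.Random(1)
x = [1, 1, 1, 0, 0, 0, 0, 0]
y = permutation_channel(x, flip_prob=0.1, rng=rng)

# Ordering is destroyed, so the decoder can only use the empirical
# distribution of y -- for binary symbols, just the number of ones.
print(sum(y), "ones out of", len(y))
```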
Coding for the Permutation Channel

[Figure: M → ENCODER → X → CHANNEL → Z → RANDOM PERMUTATION → Y → DECODER → M̂]

General principle: "Encode the information in an object that is invariant under the [permutation] transformation." [Kovacevic-Vukobratovic 2013]

Multiset codes are studied in [Kovacevic-Vukobratovic 2013], [Kovacevic-Vukobratovic 2015], and [Kovacevic-Tan 2018].

What about the information theoretic aspects of this model?
Information Capacity of the Permutation Channel

[Figure: M → ENCODER → X → CHANNEL → Z → RANDOM PERMUTATION → Y → DECODER → M̂]

- Average probability of error P^n_error ≜ P(M̂ ≠ M)
- "Rate" of encoder-decoder pair (f_n, g_n):

  R ≜ log(|M|) / log(n),  i.e.  |M| = n^R,

  because the number of empirical distributions of Y_1^n is poly(n)
- Rate R ≥ 0 is achievable ⇔ ∃ {(f_n, g_n)}_{n∈N} such that lim_{n→∞} P^n_error = 0

Definition (Permutation Channel Capacity)

  C_perm(P_{Z|X}) ≜ sup{R ≥ 0 : R is achievable}
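The poly(n) remark is why the rate is normalized by log(n) rather than n. A quick computation (standard type counting, not from the talk) shows how few empirical distributions there are:

```python
from math import comb

def num_types(n, k):
    """Number of empirical distributions (types) of a length-n sequence
    over a size-k alphabet: C(n + k - 1, k - 1), polynomial in n."""
    return comb(n + k - 1, k - 1)

# Binary case: a type is determined by the number of ones, so n + 1 types.
assert num_types(10, 2) == 11

# Since the decoder effectively sees only a type, at most poly(n) messages
# are distinguishable -- hence |M| = n^R and R = log|M| / log(n).
print(num_types(100, 2), num_types(100, 4))
```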
Capacity of the BSC Permutation Channel

[Figure: M → ENCODER → BSC(p) → RANDOM PERMUTATION → DECODER → M̂]

- Channel is the binary symmetric channel, denoted BSC(p):

  ∀ z, x ∈ {0, 1}:  P_{Z|X}(z|x) = 1 − p for z = x, and p for z ≠ x

- Alphabets are X = Y = {0, 1}
- Assume crossover probability p ∈ (0, 1) and p ≠ 1/2

Main Question: What is the permutation channel capacity of the BSC?
Outline

1. Introduction
2. Achievability
   - Encoder and Decoder
   - Testing between Converging Hypotheses
   - Intuition via Central Limit Theorem
   - Second Moment Method for TV Distance
3. Converse
4. Conclusion
Warm-up: Sending Two Messages

[Figure: M → ENCODER → BSC(p) → RANDOM PERMUTATION → DECODER → M̂; the encoding parameters marked on the interval [0, 1], e.g. q_0 = 1/3 and q_1 = 2/3]

- Fix a message m ∈ {0, 1}, and encode m as f_n(m) = X_1^n i.i.d. ∼ Ber(q_m)
- Memoryless BSC(p) outputs Z_1^n i.i.d. ∼ Ber(p ∗ q_m), where p ∗ q_m ≜ p(1 − q_m) + q_m(1 − p) is the convolution of p and q_m
- Random permutation generates Y_1^n i.i.d. ∼ Ber(p ∗ q_m)
- Maximum likelihood (ML) decoder: M̂ = 1{(1/n) ∑_{i=1}^n Y_i ≥ 1/2}
- (1/n) ∑_{i=1}^n Y_i → p ∗ q_m in probability as n → ∞ [WLLN]
  ⇒ lim_{n→∞} P^n_error = 0, since p ∗ q_0 ≠ p ∗ q_1
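This two-message scheme is easy to check empirically. The Python sketch below (the choices q_0 = 1/3, q_1 = 2/3, p = 0.1, and the trial counts are illustrative) implements the Bernoulli encoder, the BSC, the permutation, and the threshold decoder:

```python
import random

def send(m, n, p, q, rng):
    """Encode message m as n i.i.d. Ber(q[m]) bits, pass them through a
    BSC(p), and randomly permute the result."""
    x = [rng.random() < q[m] for _ in range(n)]
    z = [xi ^ (rng.random() < p) for xi in x]
    rng.shuffle(z)
    return z

rng = random.Random(0)
p, q, n = 0.1, {0: 1/3, 1: 2/3}, 2000

# ML decoder: threshold the empirical mean at 1/2, which separates the
# output means p*q0 < 1/2 < p*q1 (here about 0.37 and 0.63).
errors, trials = 0, 50
for _ in range(trials):
    m = int(rng.random() < 0.5)
    y = send(m, n, p, q, rng)
    m_hat = int(sum(y) / n >= 0.5)
    errors += (m_hat != m)
print("empirical error rate:", errors / trials)
```

With a mean gap of about 0.27 against a standard deviation of about 0.01, errors are vanishingly rare at this block-length, matching the WLLN argument.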
Encoder and Decoder

- Suppose M = {1, ..., n^R} for some R > 0
- Randomized encoder: given m ∈ M, f_n(m) = X_1^n i.i.d. ∼ Ber(m/n^R)

[Figure: the n^R encoding parameters m/n^R spread over the interval [0, 1]]

- Given m ∈ M, Y_1^n i.i.d. ∼ Ber(p ∗ m/n^R) (as before)
- ML decoder: for y_1^n ∈ {0, 1}^n, g_n(y_1^n) = arg max_{m∈M} P_{Y_1^n|M}(y_1^n | m)
- Challenge: although (1/n) ∑_{i=1}^n Y_i → p ∗ m/n^R in probability as n → ∞, consecutive messages become indistinguishable, i.e. m/n^R − (m+1)/n^R → 0
- Fact: consecutive messages distinguishable ⇒ lim_{n→∞} P^n_error = 0

What is the largest R such that two consecutive messages can be distinguished?
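The challenge can be quantified with a back-of-the-envelope calculation (illustrative numbers): the gap between the output means of consecutive messages is (1 − 2p)/n^R, while the empirical mean fluctuates at scale Θ(1/√n):

```python
import math

def mean_gap(p, n, R):
    """Gap between the output means of consecutive messages:
    p * (m/n^R) and p * ((m+1)/n^R) differ by (1 - 2p) / n^R."""
    return abs(1 - 2 * p) / n ** R

def noise_scale(n):
    """The empirical mean of n Bernoulli samples has standard deviation
    at most 1 / (2 * sqrt(n))."""
    return 1 / (2 * math.sqrt(n))

p, n = 0.1, 10**6
for R in (0.4, 0.6):
    print(R, mean_gap(p, n, R) / noise_scale(n))  # signal-to-noise ratio
```

For R < 1/2 this ratio grows with n, so neighboring messages separate; for R > 1/2 it vanishes, previewing the CLT picture that follows.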
Testing between Converging Hypotheses

Binary Hypothesis Testing:
- Consider hypothesis H ∼ Ber(1/2) with uniform prior
- For any n ∈ N, q ∈ (0, 1), and R > 0, consider the likelihoods:

  Given H = 0:  X_1^n i.i.d. ∼ P_{X|H=0} = Ber(q)
  Given H = 1:  X_1^n i.i.d. ∼ P_{X|H=1} = Ber(q + 1/n^R)

- Define the zero-mean sufficient statistic of X_1^n for H:

  T_n ≜ (1/n) ∑_{i=1}^n X_i − q − 1/(2 n^R)

- Let Ĥ^n_ML(T_n) denote the ML decoder for H based on T_n, with minimum probability of error P^n_ML ≜ P(Ĥ^n_ML(T_n) ≠ H)

Want: What is the largest R > 0 such that lim_{n→∞} P^n_ML = 0?
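As a sanity check (illustrative Python, parameters chosen arbitrarily), one can verify that T_n is zero-mean under the uniform prior: the empirical mean concentrates at q under H = 0 and at q + 1/n^R under H = 1, so subtracting q + 1/(2 n^R) centers the mixture at zero:

```python
import random

def T(x, q, n, R):
    """Zero-mean sufficient statistic T_n = (1/n) * sum(x) - q - 1/(2 n^R)."""
    return sum(x) / n - q - 1 / (2 * n ** R)

rng = random.Random(7)
n, q, R, trials = 4000, 0.3, 0.4, 2000

total = 0.0
for _ in range(trials):
    h = rng.random() < 0.5                       # H ~ Ber(1/2)
    prob = q + h / n ** R                        # Ber(q) or Ber(q + 1/n^R)
    x = [rng.random() < prob for _ in range(n)]
    total += T(x, q, n, R)
print("average of T_n over trials:", total / trials)
```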
Intuition via Central Limit Theorem

- For large n, P_{T_n|H}(·|0) and P_{T_n|H}(·|1) are approximately Gaussian distributions [CLT]
- |E[T_n | H = 0] − E[T_n | H = 1]| = 1/n^R
- Standard deviations are Θ(1/√n)

[Figure: the two conditional densities P_{T_n|H}(t|0) and P_{T_n|H}(t|1), centered at ∓1/(2 n^R) around t = 0, each of width Θ(1/√n)]

- Case R < 1/2: the means separate by far more than the standard deviation, so decoding is possible
- Case R > 1/2: the two densities essentially overlap, so decoding is impossible
Second Moment Method for TV Distance

Lemma (2nd Moment Method [Evans-Kenyon-Peres-Schulman 2000])

‖P_{T_n|H=1} − P_{T_n|H=0}‖_TV ≥ (E[T_n|H = 1] − E[T_n|H = 0])² / (4 VAR(T_n))

where ‖P − Q‖_TV = (1/2)‖P − Q‖_{ℓ¹} is the total variation (TV) distance between the distributions P and Q.

Proof: Let T_n⁺ ∼ P_{T_n|H=1} and T_n⁻ ∼ P_{T_n|H=0}. By the Cauchy-Schwarz inequality (cf. the Hammersley-Chapman-Robbins bound), and recalling that T_n is zero-mean,

(E[T_n⁺] − E[T_n⁻])² = (Σ_t t √P_{T_n}(t) · (P_{T_n|H}(t|1) − P_{T_n|H}(t|0)) / √P_{T_n}(t))²
≤ (Σ_t t² P_{T_n}(t)) (Σ_t (P_{T_n|H}(t|1) − P_{T_n|H}(t|0))² / P_{T_n}(t))
= 4 VAR(T_n) ((1/4) Σ_t (P_{T_n|H}(t|1) − P_{T_n|H}(t|0))² / P_{T_n}(t))   [Vincze-Le Cam distance]
≤ 4 VAR(T_n) ‖P_{T_n|H=1} − P_{T_n|H=0}‖_TV   ∎
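As a sanity check, the lemma can be verified numerically on concrete distributions. A sketch with two binomial laws standing in for P_{T_n|H=0} and P_{T_n|H=1} (the parameters are illustrative, and VAR(T_n) is computed under the equal-weight mixture, matching H ∼ Ber(1/2)):

```python
import math

def binom_pmf(n, p):
    # pmf of bin(n, p) as a list over {0, ..., n}
    return [math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

n = 50
P1 = binom_pmf(n, 0.55)  # stands in for P_{T_n|H=1} (illustrative)
P0 = binom_pmf(n, 0.45)  # stands in for P_{T_n|H=0}
mix = [(a + b) / 2 for a, b in zip(P0, P1)]  # law of T_n when H ~ Ber(1/2)

mean = lambda P: sum(t * P[t] for t in range(n + 1))
gap_sq = (mean(P1) - mean(P0)) ** 2
var_mix = sum((t - mean(mix)) ** 2 * mix[t] for t in range(n + 1))
tv = 0.5 * sum(abs(a - b) for a, b in zip(P0, P1))  # ||P1 - P0||_TV

# The lemma: TV distance dominates gap^2 / (4 VAR)
assert tv >= gap_sq / (4 * var_mix)
```

Here the bound is applied without centering; both sides of the inequality are shift-invariant, so the check is unaffected.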
Achievability Proof

Theorem (Achievability)

For any 0 < R < 1/2, consider the binary hypothesis testing problem with H ∼ Ber(1/2), and X₁ⁿ i.i.d. ∼ Ber(q + h/n^R) given H = h ∈ {0, 1}. Then, lim_{n→∞} P_ML^n = 0. This implies that:

Cperm(BSC(p)) ≥ 1/2.

Proof: For any 0 < R < 1/2, starting from Le Cam's relation, applying the second moment method lemma, and then computing and simplifying explicitly,

P_ML^n = (1/2)(1 − ‖P_{T_n|H=1} − P_{T_n|H=0}‖_TV)
≤ (1/2)(1 − (E[T_n|H = 1] − E[T_n|H = 0])² / (4 VAR(T_n)))
≤ 3/(2 n^{1−2R}) → 0 as n → ∞   ∎
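The vanishing error probability can also be seen numerically. A simplified sketch, testing directly on the count X₁ + … + X_n and ignoring the BSC layer (q = 0.25 and R = 0.45 are illustrative choices, not from the talk), computes P_ML^n exactly via Le Cam's relation:

```python
import math

def p_ml(n, R, q=0.25):
    # Le Cam: P_ML = (1 - TV)/2, with TV computed exactly between the
    # two binomial laws of the count S = X_1 + ... + X_n under H = 0, 1.
    pmf = lambda p: [math.comb(n, k) * p**k * (1 - p)**(n - k)
                     for k in range(n + 1)]
    p0, p1 = q, q + 1.0 / n**R
    tv = 0.5 * sum(abs(a - b) for a, b in zip(pmf(p0), pmf(p1)))
    return 0.5 * (1 - tv)

# Error probability decays as n grows, since R = 0.45 < 1/2
print([round(p_ml(n, R=0.45), 4) for n in (100, 300, 1000)])
```

The mean gap n^{1−R} grows faster than the Θ(√n) spread of the count whenever R < 1/2, so the TV distance tends to 1 and P_ML^n tends to 0.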
Outline

1 Introduction
2 Achievability
3 Converse
  Fano's Inequality Argument
  CLT Approximation
4 Conclusion
Converse: Fano’s Inequality Argument
Consider the Markov chain M → X n1 → Zn
1 → Y n1 → Sn ,
∑ni=1 Yi ,
and a sequence of encoder-decoder pairs {(fn, gn)}n∈N such that|M| = nR and lim
n→∞Pnerror = 0
Standard argument, cf. [Cover-Thomas 2006]:
R log(n)
= H(M|M) + I (M; M)
≤ 1 + PnerrorR log(n) + I (M;Y n
1 )
= 1 + PnerrorR log(n) + I (M;Sn)
≤ 1 + PnerrorR log(n) + I (X n
1 ;Sn)
Divide by log(n)
and let n→∞:
R ≤ I (X n1 ;Sn)
log(n)
A. Makur (MIT) Capacity of BSC Permutation Channel 5 October 2018 17 / 21
Converse: Fano’s Inequality Argument
Consider the Markov chain M → X n1 → Zn
1 → Y n1 → Sn ,
∑ni=1 Yi ,
and a sequence of encoder-decoder pairs {(fn, gn)}n∈N such that|M| = nR and lim
n→∞Pnerror = 0
Standard argument, cf. [Cover-Thomas 2006]: M is uniform
R log(n) = H(M)
= H(M|M) + I (M; M)
≤ 1 + PnerrorR log(n) + I (M;Y n
1 )
= 1 + PnerrorR log(n) + I (M;Sn)
≤ 1 + PnerrorR log(n) + I (X n
1 ;Sn)
Divide by log(n)
and let n→∞:
R ≤ I (X n1 ;Sn)
log(n)
A. Makur (MIT) Capacity of BSC Permutation Channel 5 October 2018 17 / 21
Converse: Fano’s Inequality Argument
Consider the Markov chain M → X n1 → Zn
1 → Y n1 → Sn ,
∑ni=1 Yi ,
and a sequence of encoder-decoder pairs {(fn, gn)}n∈N such that|M| = nR and lim
n→∞Pnerror = 0
Standard argument, cf. [Cover-Thomas 2006]: Fano’s inequality, DPI
R log(n) = H(M|M) + I (M; M)
≤ 1 + PnerrorR log(n) + I (M;Y n
1 )
= 1 + PnerrorR log(n) + I (M;Sn)
≤ 1 + PnerrorR log(n) + I (X n
1 ;Sn)
Divide by log(n)
and let n→∞:
R ≤ I (X n1 ;Sn)
log(n)
A. Makur (MIT) Capacity of BSC Permutation Channel 5 October 2018 17 / 21
Converse: Fano’s Inequality Argument
Consider the Markov chain M → X n1 → Zn
1 → Y n1 → Sn ,
∑ni=1 Yi ,
and a sequence of encoder-decoder pairs {(fn, gn)}n∈N such that|M| = nR and lim
n→∞Pnerror = 0
Standard argument, cf. [Cover-Thomas 2006]: sufficiency
R log(n) = H(M|M) + I (M; M)
≤ 1 + PnerrorR log(n) + I (M;Y n
1 )
= 1 + PnerrorR log(n) + I (M; Sn)
≤ 1 + PnerrorR log(n) + I (X n
1 ;Sn)
Divide by log(n)
and let n→∞:
R ≤ I (X n1 ;Sn)
log(n)
A. Makur (MIT) Capacity of BSC Permutation Channel 5 October 2018 17 / 21
Converse: Fano’s Inequality Argument
Consider the Markov chain M → X n1 → Zn
1 → Y n1 → Sn ,
∑ni=1 Yi ,
and a sequence of encoder-decoder pairs {(fn, gn)}n∈N such that|M| = nR and lim
n→∞Pnerror = 0
Standard argument, cf. [Cover-Thomas 2006]: DPI
R log(n) = H(M|M) + I (M; M)
≤ 1 + PnerrorR log(n) + I (M;Y n
1 )
= 1 + PnerrorR log(n) + I (M; Sn)
≤ 1 + PnerrorR log(n) + I (X n
1 ;Sn)
Divide by log(n)
and let n→∞:
R ≤ I (X n1 ;Sn)
log(n)
A. Makur (MIT) Capacity of BSC Permutation Channel 5 October 2018 17 / 21
Converse: Fano’s Inequality Argument
Consider the Markov chain M → X n1 → Zn
1 → Y n1 → Sn ,
∑ni=1 Yi ,
and a sequence of encoder-decoder pairs {(fn, gn)}n∈N such that|M| = nR and lim
n→∞Pnerror = 0
Standard argument, cf. [Cover-Thomas 2006]:
R log(n) = H(M|M) + I (M; M)
≤ 1 + PnerrorR log(n) + I (M;Y n
1 )
= 1 + PnerrorR log(n) + I (M; Sn)
≤ 1 + PnerrorR log(n) + I (X n
1 ;Sn)
Divide by log(n)
and let n→∞:
R ≤ 1
log(n)+ Pn
errorR +I (X n
1 ;Sn)
log(n)
A. Makur (MIT) Capacity of BSC Permutation Channel 5 October 2018 17 / 21
Converse: Fano’s Inequality Argument
Consider the Markov chain M → X n1 → Zn
1 → Y n1 → Sn ,
∑ni=1 Yi ,
and a sequence of encoder-decoder pairs {(fn, gn)}n∈N such that|M| = nR and lim
n→∞Pnerror = 0
Standard argument, cf. [Cover-Thomas 2006]:
R log(n) = H(M|M) + I (M; M)
≤ 1 + PnerrorR log(n) + I (M;Y n
1 )
= 1 + PnerrorR log(n) + I (M; Sn)
≤ 1 + PnerrorR log(n) + I (X n
1 ;Sn)
Divide by log(n) and let n→∞:
R ≤ limn→∞
I (X n1 ; Sn)
log(n)
A. Makur (MIT) Capacity of BSC Permutation Channel 5 October 2018 17 / 21
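The quantity I(X₁ⁿ; S_n) in the final bound can be computed exactly for small n. A sketch with i.i.d. Ber(1/2) inputs through BSC(p) (an illustrative input law; the converse argument allows arbitrary encoders), confirming that I(X₁ⁿ; S_n) ≤ H(S_n) ≤ log(n + 1):

```python
import math

def mutual_info(n, p):
    # P(S_n = s | X_1^n has k ones): S_n = bin(k, 1-p) + bin(n-k, p)
    def cond_pmf(k):
        pmf = [0.0] * (n + 1)
        for a in range(k + 1):           # surviving ones  ~ bin(k, 1-p)
            for b in range(n - k + 1):   # flipped zeros   ~ bin(n-k, p)
                pmf[a + b] += (math.comb(k, a) * (1 - p)**a * p**(k - a)
                               * math.comb(n - k, b) * p**b * (1 - p)**(n - k - b))
        return pmf
    H = lambda q: -sum(v * math.log2(v) for v in q if v > 0)
    pk = [math.comb(n, k) / 2**n for k in range(n + 1)]  # i.i.d. Ber(1/2) inputs
    ps = [sum(pk[k] * cond_pmf(k)[s] for k in range(n + 1)) for s in range(n + 1)]
    h_cond = sum(pk[k] * H(cond_pmf(k)) for k in range(n + 1))
    return H(ps) - h_cond  # I(X_1^n; S_n) = H(S_n) - H(S_n|X_1^n), in bits

n, p = 8, 0.11
i_xs = mutual_info(n, p)
print(round(i_xs, 4), round(math.log2(n + 1), 4))
```

Since S_n depends on X₁ⁿ only through the number of ones, conditioning on that count suffices for the exact computation.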
Converse: CLT Approximation

Upper bound on I(X₁ⁿ; S_n): since S_n ∈ {0, . . . , n}, and since S_n = bin(k, 1 − p) + bin(n − k, p) given X₁ⁿ = x₁ⁿ with Σ_{i=1}^n x_i = k,

I(X₁ⁿ; S_n) = H(S_n) − H(S_n|X₁ⁿ)
≤ log(n + 1) − Σ_{x₁ⁿ ∈ {0,1}ⁿ} P_{X₁ⁿ}(x₁ⁿ) H(bin(k, 1 − p) + bin(n − k, p))
≤ log(n + 1) − Σ_{x₁ⁿ ∈ {0,1}ⁿ} P_{X₁ⁿ}(x₁ⁿ) H(bin(n/2, p))   [Problem 2.14 in [Cover-Thomas 2006]]
= log(n + 1) − (1/2) log(πe p(1 − p) n) + O(1/n)   [binomial entropy via CLT, cf. [Adell-Lekuona-Yu 2010]]

Hence, we have:

R ≤ lim_{n→∞} I(X₁ⁿ; S_n) / log(n) ≤ 1/2

Theorem (Converse)

Cperm(BSC(p)) ≤ 1/2
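The CLT entropy approximation in the last step can be checked numerically. A short sketch comparing the exact entropy of bin(n/2, p), in nats, against (1/2) log(πe p(1 − p) n), with p = 0.1 as an illustrative value:

```python
import math

def binom_entropy(m, p):
    # Exact Shannon entropy (in nats) of the binomial law bin(m, p)
    pmf = [math.comb(m, k) * p**k * (1 - p)**(m - k) for k in range(m + 1)]
    return -sum(q * math.log(q) for q in pmf if q > 0)

p = 0.1
for n in (100, 400, 1000):
    exact = binom_entropy(n // 2, p)
    approx = 0.5 * math.log(math.pi * math.e * p * (1 - p) * n)
    print(n, round(exact, 4), round(approx, 4), round(exact - approx, 5))
```

The discrepancy is the O(1/n) term from the slide, so the two values agree closely already at moderate n.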
Outline
1 Introduction
2 Achievability
3 Converse
4 Conclusion
Conclusion

Theorem (Permutation Channel Capacity of BSC)

Cperm(BSC(p)) = 1, for p ∈ {0, 1}
                1/2, for p ∈ (0, 1/2) ∪ (1/2, 1)
                0, for p = 1/2

[Figure: plot of Cperm(BSC(p)) versus p]

Remarks:

Cperm(·) is discontinuous and non-convex
Cperm(·) is generally agnostic to the channel parameters
Computationally tractable coding scheme in proof
Proof technique yields more general results
Thank You!