
Information Capacity of the BSC Permutation Channel

Anuran Makur

EECS Department, Massachusetts Institute of Technology

Allerton Conference 2018


Outline

1 Introduction
  Motivation: Coding for Communication Networks
  The Permutation Channel Model
  Capacity of the BSC Permutation Channel

2 Achievability

3 Converse

4 Conclusion

Motivation: Point-to-point Communication in Networks

[Figure: SENDER → NETWORK → RECEIVER]

Model communication network as a channel:
- Alphabet symbols = all possible L-bit packets ⇒ 2^L input symbols
- Multipath routed network or evolving network topology ⇒ packets received with transpositions
- Packets are impaired (e.g. deletions, substitutions) ⇒ model using channel probabilities

Example: Coding for Random Deletion Network

Consider a communication network where packets can be dropped:

[Figure: SENDER → NETWORK → RECEIVER, abstracted as a RANDOM DELETION channel followed by a RANDOM PERMUTATION; with sequence numbers attached, the pair becomes an ERASURE CHANNEL followed by a RANDOM PERMUTATION]

Abstraction:
- n-length codeword = sequence of n packets
- Random deletion channel: delete each symbol/packet of the codeword independently with probability p ∈ (0, 1)
- Random permutation block: randomly permute the packets of the codeword
- Coding: add sequence numbers (packet size = L + log(n) bits, alphabet size = n · 2^L), which turns deletions into erasures, and use standard coding techniques (a sketch follows below)
- More refined coding techniques simulate sequence numbers, e.g. [Mitzenmacher 2006], [Metzner 2009]

How do you code in such channels without increasing the alphabet size?
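
To make the sequence-numbering idea concrete, here is a minimal Python sketch (the function names and parameters are illustrative, not from the talk). Each packet is prefixed with its index before transmission; after random deletion and an arbitrary reordering, the receiver sorts by index, so every lost packet becomes an erasure at a known position:

```python
import random

def transmit(packets, p):
    """Prefix each packet with a sequence number, then apply random
    deletion (probability p per packet) and a random permutation."""
    numbered = list(enumerate(packets))
    survived = [item for item in numbered if random.random() >= p]
    random.shuffle(survived)  # the network reorders packets arbitrarily
    return survived

def receive(received, n):
    """Sort by sequence number; missing indices become erasures (None)."""
    slots = [None] * n
    for i, pkt in received:
        slots[i] = pkt
    return slots

packets = [format(m, '08b') for m in range(10)]  # ten 8-bit packets
print(receive(transmit(packets, p=0.3), n=len(packets)))
# e.g. ['00000000', None, '00000010', ..., None, '00001001']
```

An erasure code over the n slots then recovers the dropped packets; the cost is the log(n)-bit header per packet, i.e. exactly the alphabet growth the slide asks how to avoid.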

The Permutation Channel Model

[Diagram: M → ENCODER → X_1^n → CHANNEL → Z_1^n → RANDOM PERMUTATION → Y_1^n → DECODER → M̂]

- Sender sends message M ∼ Uniform(M)
- Possibly randomized encoder f_n : M → X^n produces codeword X_1^n = (X_1, …, X_n) = f_n(M) (with block-length n)
- Discrete memoryless channel P_{Z|X} with input and output alphabets X and Y produces Z_1^n:

  P_{Z_1^n | X_1^n}(z_1^n | x_1^n) = ∏_{i=1}^n P_{Z|X}(z_i | x_i)

- Random permutation generates Y_1^n from Z_1^n
- Possibly randomized decoder g_n : Y^n → M produces estimate M̂ = g_n(Y_1^n) at receiver

(A simulation sketch of this pipeline follows below.)
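
As a minimal sketch, the pipeline can be simulated for any DMC given as a conditional pmf table (the table representation and function names are assumptions of this sketch, chosen for illustration):

```python
import random

def dmc(x_seq, P):
    """Apply a discrete memoryless channel: P[x] is a dict mapping each
    output symbol z to P_{Z|X}(z|x); outputs are drawn independently."""
    return [random.choices(list(P[x]), weights=P[x].values())[0] for x in x_seq]

def permutation_channel(x_seq, P):
    """DMC followed by a uniformly random permutation of the outputs."""
    z = dmc(x_seq, P)
    random.shuffle(z)
    return z

# BSC(p) written as a conditional pmf table
p = 0.1
bsc = {0: {0: 1 - p, 1: p}, 1: {1: 1 - p, 0: p}}
x = [1, 1, 0, 1, 0, 0, 0, 1]
print(permutation_channel(x, bsc))
# same multiset statistics as the BSC output, but the order is destroyed
```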

Coding for the Permutation Channel

General Principle: "Encode the information in an object that is invariant under the [permutation] transformation." [Kovacevic-Vukobratovic 2013]

Multiset codes are studied in [Kovacevic-Vukobratovic 2013], [Kovacevic-Vukobratovic 2015], and [Kovacevic-Tan 2018]

What about the information theoretic aspects of this model?

Information Capacity of the Permutation Channel

- Average probability of error: P_error^n ≜ P(M̂ ≠ M)
- "Rate" of an encoder-decoder pair (f_n, g_n):

  R ≜ log(|M|) / log(n), i.e. |M| = n^R

  (the normalization is log(n) rather than n because the number of empirical distributions of Y_1^n is only poly(n))
- Rate R ≥ 0 is achievable ⇔ ∃ {(f_n, g_n)}_{n∈N} such that lim_{n→∞} P_error^n = 0

Definition (Permutation Channel Capacity)

C_perm(P_{Z|X}) ≜ sup{R ≥ 0 : R is achievable}

Capacity of the BSC Permutation Channel

- Channel is the binary symmetric channel, denoted BSC(p):

  ∀ z, x ∈ {0, 1}: P_{Z|X}(z|x) = 1 − p for z = x, and p for z ≠ x

- Alphabets are X = Y = {0, 1}
- Assume crossover probability p ∈ (0, 1) and p ≠ 1/2

Main Question: What is the permutation channel capacity of the BSC?

Outline

1 Introduction

2 Achievability
  Encoder and Decoder
  Testing between Converging Hypotheses
  Intuition via Central Limit Theorem
  Second Moment Method for TV Distance

3 Converse

4 Conclusion

Warm-up: Sending Two Messages

- Fix a message m ∈ {0, 1}, and encode m as f_n(m) = X_1^n i.i.d. ∼ Ber(q_m)

  [Figure: the two encoding parameters (apparently 1/3 and 2/3) marked on the interval [0, 1]]

- Memoryless BSC(p) outputs Z_1^n i.i.d. ∼ Ber(p ∗ q_m), where p ∗ q_m ≜ p(1 − q_m) + q_m(1 − p) is the convolution of p and q_m
- Random permutation generates Y_1^n i.i.d. ∼ Ber(p ∗ q_m)
- Maximum likelihood (ML) decoder: M̂ = 1{(1/n) ∑_{i=1}^n Y_i ≥ 1/2}
- (1/n) ∑_{i=1}^n Y_i → p ∗ q_m in probability as n → ∞ [WLLN]
- ⇒ lim_{n→∞} P_error^n = 0, since p ∗ q_0 ≠ p ∗ q_1

(A simulation of this two-message scheme follows below.)
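
A small Monte Carlo sketch of the warm-up scheme, assuming the illustrative choices q_0 = 1/3, q_1 = 2/3 and p < 1/2 (so the threshold-1/2 rule is the ML decoder). Since a permutation of an i.i.d. sequence is still i.i.d., Y_1^n can be sampled directly:

```python
import random

def conv(p, q):
    """Binary convolution p * q = p(1-q) + q(1-p)."""
    return p * (1 - q) + q * (1 - p)

def trial(m, n, p, q=(1/3, 2/3)):
    # Encoder: X_1^n i.i.d. Ber(q_m); the BSC plus random permutation
    # yield Y_1^n i.i.d. Ber(p * q_m), so Y can be sampled directly.
    theta = conv(p, q[m])
    y = [random.random() < theta for _ in range(n)]
    return int(sum(y) / n >= 0.5)  # ML decoder: threshold the mean at 1/2

n, p, trials = 1000, 0.1, 2000
errs = sum(trial(m, n, p) != m for _ in range(trials) for m in (0, 1))
print(errs / (2 * trials))  # the error probability vanishes as n grows
```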

Encoder and Decoder

- Suppose M = {1, …, n^R} for some R > 0
- Randomized encoder: given m ∈ M, f_n(m) = X_1^n i.i.d. ∼ Ber(m / n^R)

  [Figure: the message parameters m / n^R, m = 1, …, n^R, marked on the interval [0, 1]]

- Given m ∈ M, Y_1^n i.i.d. ∼ Ber(p ∗ m/n^R) (as before)
- ML decoder: for y_1^n ∈ {0, 1}^n, g_n(y_1^n) = arg max_{m ∈ M} P_{Y_1^n | M}(y_1^n | m)
- Challenge: although (1/n) ∑_{i=1}^n Y_i → p ∗ m/n^R in probability as n → ∞, consecutive messages become indistinguishable, i.e. the gap (m+1)/n^R − m/n^R = 1/n^R → 0
- Fact: consecutive messages distinguishable ⇒ lim_{n→∞} P_error^n = 0

What is the largest R such that two consecutive messages can be distinguished? (A sketch of this encoder-decoder pair follows below.)
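
A minimal sketch of this scheme in Python. The talk's decoder is exact ML; as an illustrative stand-in (an assumption of this sketch, not the talk's rule), the code picks the message whose output mean p ∗ (m/n^R) is closest to the observed empirical mean, which agrees with ML for this binomial family except possibly at boundaries between adjacent messages:

```python
import random

def conv(p, q):
    return p * (1 - q) + q * (1 - p)

def encode(m, n, R):
    """Randomized encoder: X_1^n i.i.d. Ber(m / n^R)."""
    theta = m / n**R
    return [random.random() < theta for _ in range(n)]

def bsc_permute(x, p):
    z = [xi ^ (random.random() < p) for xi in x]  # BSC(p)
    random.shuffle(z)                             # random permutation block
    return z

def decode(y, n, R, p):
    """Approximate ML: choose m whose output mean p*(m/n^R) is closest
    to the empirical mean of y (a sufficient statistic)."""
    mean = sum(y) / n
    M = int(n**R)
    return min(range(1, M + 1), key=lambda m: abs(conv(p, m / n**R) - mean))

n, R, p = 4000, 0.4, 0.1
m = random.randint(1, int(n**R))
m_hat = decode(bsc_permute(encode(m, n, R), p), n, R, p)
print(m, m_hat)  # typically equal for R < 1/2 and large n
```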

Testing between Converging Hypotheses

Binary Hypothesis Testing:
- Consider hypothesis H ∼ Ber(1/2) with uniform prior
- For any n ∈ N, q ∈ (0, 1), and R > 0, consider the likelihoods:

  Given H = 0: X_1^n i.i.d. ∼ P_{X|H=0} = Ber(q)
  Given H = 1: X_1^n i.i.d. ∼ P_{X|H=1} = Ber(q + 1/n^R)

- Define the zero-mean sufficient statistic of X_1^n for H:

  T_n ≜ (1/n) ∑_{i=1}^n X_i − q − 1/(2 n^R)

- Let Ĥ_ML^n(T_n) denote the ML decoder for H based on T_n, with minimum probability of error P_ML^n ≜ P(Ĥ_ML^n(T_n) ≠ H)

Want: the largest R > 0 such that lim_{n→∞} P_ML^n = 0?

Intuition via Central Limit Theorem

- For large n, P_{T_n|H}(·|0) and P_{T_n|H}(·|1) are approximately Gaussian distributions [CLT]
- |E[T_n | H = 0] − E[T_n | H = 1]| = 1/n^R
- Standard deviations are Θ(1/√n)

[Figure: the two conditional densities P_{T_n|H}(t|0) and P_{T_n|H}(t|1), centered ±1/(2n^R) around 0, each of width Θ(1/√n)]

- Case R < 1/2: the 1/n^R separation dominates the Θ(1/√n) spread ⇒ decoding is possible
- Case R > 1/2: the Θ(1/√n) spread dominates the separation ⇒ decoding is impossible

(This phase transition is checked numerically below.)
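
The transition at R = 1/2 can be checked exactly: conditioned on H = h, ∑_{i=1}^n X_i ∼ Bin(n, q + h/n^R), and by Le Cam's relation (used in the achievability proof) P_ML^n = (1/2)(1 − TV) between the two binomial laws. A stdlib-only Python sketch with illustrative parameters:

```python
from math import lgamma, log, exp

def binom_pmf(n, theta):
    """Binomial pmf computed in log-space to avoid overflow for large n."""
    return [exp(lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                + k * log(theta) + (n - k) * log(1 - theta))
            for k in range(n + 1)]

def p_ml(n, q, R):
    """Exact ML error for Bin(n, q) vs Bin(n, q + 1/n^R), uniform prior:
    P_ML = (1/2)(1 - TV), with TV = (1/2) * l1 distance of the pmfs."""
    p0, p1 = binom_pmf(n, q), binom_pmf(n, q + n**(-R))
    tv = 0.5 * sum(abs(a - b) for a, b in zip(p0, p1))
    return 0.5 * (1 - tv)

for R in (0.4, 0.6):
    print(R, [round(p_ml(n, 0.3, R), 3) for n in (100, 1000, 10000)])
# R = 0.4: the error decays toward 0; R = 0.6: it drifts up toward 1/2
```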

Second Moment Method for TV Distance

Lemma (2nd Moment Method [Evans-Kenyon-Peres-Schulman 2000])

‖P_{T_n|H=1} − P_{T_n|H=0}‖_TV ≥ (E[T_n | H = 1] − E[T_n | H = 0])² / (4 VAR(T_n)),

where ‖P − Q‖_TV = (1/2) ‖P − Q‖_{ℓ1} is the total variation (TV) distance between the distributions P and Q.

Proof: Let T_n^+ ∼ P_{T_n|H=1} and T_n^− ∼ P_{T_n|H=0}. Then:

(E[T_n^+] − E[T_n^−])² = ( ∑_t t √P_{T_n}(t) · (P_{T_n|H}(t|1) − P_{T_n|H}(t|0)) / √P_{T_n}(t) )²

≤ ( ∑_t t² P_{T_n}(t) ) ( ∑_t (P_{T_n|H}(t|1) − P_{T_n|H}(t|0))² / P_{T_n}(t) )    [Cauchy-Schwarz inequality]

= 4 VAR(T_n) · (1/4) ∑_t (P_{T_n|H}(t|1) − P_{T_n|H}(t|0))² / P_{T_n}(t)    [T_n is zero-mean, so ∑_t t² P_{T_n}(t) = VAR(T_n)]

≤ 4 VAR(T_n) ‖P_{T_n|H=1} − P_{T_n|H=0}‖_TV    [the (1/4) ∑_t (...)² / P_{T_n}(t) term is the Vincze-Le Cam distance, which is at most the TV distance]

Rearranging gives the lemma; this chain is the Hammersley-Chapman-Robbins bound. ∎
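
A quick numeric sanity check of the lemma, reusing the binomial setup above (parameters are illustrative; T_n is an affine function of ∑_{i=1}^n X_i, so both the TV distance and the moment ratio can be computed from the two binomial pmfs and their uniform mixture):

```python
from math import lgamma, log, exp

def binom_pmf(n, theta):
    return [exp(lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
                + k * log(theta) + (n - k) * log(1 - theta))
            for k in range(n + 1)]

n, q, R = 1000, 0.3, 0.4
d = n**(-R)
p0, p1 = binom_pmf(n, q), binom_pmf(n, q + d)

tv = 0.5 * sum(abs(a - b) for a, b in zip(p0, p1))

# T_n = S/n - q - d/2 under the uniform mixture of the two hypotheses
mix = [(a + b) / 2 for a, b in zip(p0, p1)]
t = [k / n - q - d / 2 for k in range(n + 1)]
var_t = sum(w * ti**2 for w, ti in zip(mix, t))  # E[T_n] = 0
lower = d**2 / (4 * var_t)  # (E[T|H=1] - E[T|H=0])^2 / (4 VAR(T_n))

print(tv, lower)
assert lower <= tv + 1e-12  # the second-moment bound never exceeds the TV
```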

Achievability Proof

Theorem (Achievability)

For any 0 < R < 1/2, consider the binary hypothesis testing problem with H ∼ Ber(1/2), and X_1^n i.i.d. ∼ Ber(q + h/n^R) given H = h ∈ {0, 1}. Then, lim_{n→∞} P_ML^n = 0. This implies that:

C_perm(BSC(p)) ≥ 1/2.

Proof: Start with Le Cam's relation, apply the second moment method lemma, and simplify via explicit computation: for any 0 < R < 1/2,

P_ML^n = (1/2) (1 − ‖P_{T_n|H=1} − P_{T_n|H=0}‖_TV)
       ≤ (1/2) (1 − (E[T_n | H = 1] − E[T_n | H = 0])² / (4 VAR(T_n)))
       ≤ 3 / (2 n^{1−2R}) → 0 as n → ∞.  ∎

Outline

1 Introduction

2 Achievability

3 Converse
  Fano's Inequality Argument
  CLT Approximation

4 Conclusion

Converse: Fano's Inequality Argument

- Consider the Markov chain M → X_1^n → Z_1^n → Y_1^n → S_n ≜ ∑_{i=1}^n Y_i, and a sequence of encoder-decoder pairs {(f_n, g_n)}_{n∈N} such that |M| = n^R and lim_{n→∞} P_error^n = 0
- Standard argument, cf. [Cover-Thomas 2006]:

  R log(n) = H(M)    [M is uniform]
           = H(M | M̂) + I(M; M̂)
           ≤ 1 + P_error^n R log(n) + I(M; Y_1^n)    [Fano's inequality, DPI]
           = 1 + P_error^n R log(n) + I(M; S_n)    [sufficiency of S_n]
           ≤ 1 + P_error^n R log(n) + I(X_1^n; S_n)    [DPI]

- Divide by log(n):

  R ≤ 1/log(n) + P_error^n R + I(X_1^n; S_n) / log(n)

  and let n → ∞:

  R ≤ lim_{n→∞} I(X_1^n; S_n) / log(n)

Converse: CLT Approximation

Upper bound on I(X_1^n; S_n): since S_n ∈ {0, …, n}, and since given X_1^n = x_1^n with ∑_{i=1}^n x_i = k we have S_n ∼ bin(k, 1 − p) + bin(n − k, p),

I(X_1^n; S_n) = H(S_n) − H(S_n | X_1^n)
≤ log(n + 1) − ∑_{x_1^n ∈ {0,1}^n} P_{X_1^n}(x_1^n) H(bin(k, 1 − p) + bin(n − k, p))
≤ log(n + 1) − ∑_{x_1^n ∈ {0,1}^n} P_{X_1^n}(x_1^n) H(bin(n/2, p))    [using Problem 2.14 in Cover-Thomas 2006]
= log(n + 1) − (1/2) log(πe p(1 − p) n) + O(1/n)    [binomial entropy approximated via CLT, cf. Adell-Lekuona-Yu 2010]

Hence, we have:

R ≤ lim_{n→∞} I(X_1^n; S_n) / log(n) = 1/2

Theorem (Converse)

C_perm(BSC(p)) ≤ 1/2

(The binomial entropy approximation is checked numerically below.)
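
The last approximation can be sanity-checked directly: the exact entropy of bin(n/2, p), computed in nats to match the natural-log form of the slide's expression, should approach (1/2) log(πe p(1 − p) n). A short sketch with illustrative values:

```python
from math import lgamma, log, exp, pi, e

def binom_entropy_nats(m, p):
    """Exact entropy (in nats) of bin(m, p), via a log-space pmf."""
    pmf = [exp(lgamma(m + 1) - lgamma(k + 1) - lgamma(m - k + 1)
               + k * log(p) + (m - k) * log(1 - p)) for k in range(m + 1)]
    return -sum(w * log(w) for w in pmf if w > 0)

p = 0.1
for n in (100, 1000, 10000):
    exact = binom_entropy_nats(n // 2, p)
    approx = 0.5 * log(pi * e * p * (1 - p) * n)
    print(n, round(exact, 4), round(approx, 4))
# the gap shrinks as n grows, consistent with the O(1/n) error term
```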

Outline

1 Introduction

2 Achievability

3 Converse

4 Conclusion


Conclusion

Theorem (Permutation Channel Capacity of BSC)

C_perm(BSC(p)) = 1,   for p = 0, 1;
               = 1/2, for p ∈ (0, 1/2) ∪ (1/2, 1);
               = 0,   for p = 1/2.

[Figure: plot of C_perm(BSC(p)) against p: value 1 at p = 0 and p = 1, value 1/2 on (0, 1/2) ∪ (1/2, 1), value 0 at p = 1/2]

Remarks:
- C_perm(·) is discontinuous and non-convex
- C_perm(·) is generally agnostic to the parameters of the channel
- The proof yields a computationally tractable coding scheme
- The proof technique yields more general results

Thank You!
