TRANSCRIPT

The connection between information theory and networked control

Anant Sahai, based in part on joint work with students:
Tunc Simsek, Hari Palaiyanur, and Pulkit Grover

Wireless Foundations, Department of Electrical Engineering and Computer Sciences
University of California at Berkeley

Tutorial Seminar at the Global COE Workshop on Networked Control Systems
Kyoto University
Anant Sahai (UC Berkeley) IT + Control Mar 26, 2008 1 / 54
Networked Control Systems
[Figure: systems, sensors, users, controllers, and a relay connected by communication links]
Systems, sensors, and users connected with network links over noisy channels.
Signals evolve in real time and the communication links carry ongoing and interacting streams of information.
Holistic approach: overall cost function.
Ho, Kastner, and Wong (1978)
“. . . sporadic and not too successful attempts have been made to relate Shannon’s information theory with feedback control system design.”
Shannon tells us

Separate source and channel coding. But delay is the price of reliability.

“[The duality between source and channel coding] can be pursued further and is related to a duality between past and future and the notions of control and knowledge. Thus we may have knowledge of the past and cannot control it; we may control the future but have no knowledge of it.” — Claude Shannon, 1959

What is this relationship, since delays hurt control?
Outline
1 A bridge to nowhere? From control to information theory.
    A simple control problem
    A connection to information theory
    Fixing information theory and filling in the gaps.
2 Coming back to the control problem
    What is wrong with random coding
    The role of noiseless feedback
3 Taking control thinking to the forefront of information theory.
    The “holy grail” problem
    Control thinking to the rescue!
A simple distributed control problem
[Figure: a designed observer O sees the state Xt of the unstable system (driven by disturbance Wt) and transmits over a noisy channel, possibly with channel feedback; a designed controller C receives the channel output and, possibly knowing past controls, applies the control signal Ut through a one-step delay]
Xt+1 = λXt + Ut + Wt
Unstable: λ > 1, bounded initial condition and disturbance W.
Goal: Performance = sup_{t>0} E[‖Xt‖^η] ≤ K for some target K < ∞.
Review: Entirely noiseless channel
[Figure: a window of width ∆t known to contain Xt; sending R bits cuts the window by a factor of 2^{−R}; the dynamics grow it by a factor of λ > 1 (to λ∆t) plus Ω/2 on each side; encoding which control Ut to apply recenters it, giving a new window ∆t+1 for Xt+1]

As long as R > log2 λ, we can have ∆ stay bounded forever.
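This window recursion is easy to check numerically. A minimal sketch (the names and the disturbance bound Ω = 1 are illustrative assumptions): each step the window is cut by 2^{−R}, grown by λ, and padded by the disturbance, so it stays bounded exactly when R > log2 λ.

```python
def window_width(lmbda, R, omega=1.0, delta0=1.0, horizon=200):
    """Evolve the uncertainty window: each step, R bits shrink it by 2**-R,
    the dynamics grow it by lmbda, and the disturbance adds omega."""
    d = delta0
    for _ in range(horizon):
        d = lmbda * 2 ** (-R) * d + omega
    return d

lmbda = 1.5  # needs log2(1.5) ~ 0.585 bits per step

# R above log2(lmbda): the window converges to omega / (1 - lmbda * 2**-R)
good = window_width(lmbda, R=1.0)   # converges to 4.0
# R below log2(lmbda): the window blows up
bad = window_width(lmbda, R=0.3)
print(good, bad)
```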
The separation-principle oriented program

Use entropy and mutual information.
    Tatikonda’s insight: directed mutual information captures causality.
Write out entropic inequalities.
    Key inequality: the directed data-processing inequality.
Set up a mapping between bits and performance.
    You probably don’t care about the entropy of the state.
Lower bound performance assuming nested information.
    Equivalent to estimation; rate-distortion theory can be developed.
Get tight upper bounds and architectures?
The rate-distortion part
[Plot: squared-error distortion vs. rate (bits), comparing the “sequential” rate-distortion curve (obeys causality), the rate-distortion curve (non-causal), and the stable counterpart (non-causal)]
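The gap between the causal and non-causal curves can be illustrated with the standard idealized model behind the sequential curve (an assumption of this sketch, not a computation from the plot above): for X_{t+1} = λX_t + W_t with Gaussian W, rate R per step cuts the prediction-error variance λ²D + σ²_W by the factor 2^{−2R}.

```python
def sequential_distortion(lmbda, R, w_var=1.0, iters=500):
    """Steady-state causal distortion for X_{t+1} = lmbda*X_t + W_t under the
    idealized per-step Gaussian model: each step, the prediction-error
    variance lmbda**2 * D + w_var is cut by the factor 2**(-2R) = 4**(-R)."""
    D = w_var
    for _ in range(iters):
        D = (lmbda ** 2 * D + w_var) / 4 ** R
    return D

# Closed form (finite only when 4**R > lmbda**2, i.e. R > log2(lmbda)):
#   D = w_var / (4**R - lmbda**2)
D = sequential_distortion(lmbda=2.0, R=1.5)  # 1 / (8 - 4) = 0.25
print(D)
```

Note the same R > log2 λ threshold as the noiseless-channel window argument: below it, the causal distortion is infinite even though the non-causal curve is finite.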
How bad can entropic bounds be?

Consider a system with λ = 2 for the dynamics and a real packet-drop channel (C = ∞):

Zt = Yt with probability 1/2, and Zt = 0 with probability 1/2.

No other constraints, so the design is obvious: Yt = Xt and Ut = −λZt. Then

Xt+1 = Wt with probability 1/2, and Xt+1 = 2Xt + Wt with probability 1/2.

Under stochastic disturbances, the variance of the state is asymptotically infinite. (St. Petersburg Lottery style)
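The divergence can be seen from the exact second-moment recursion (a sketch assuming a zero-mean disturbance with variance 1/12, e.g. uniform on [−1/2, 1/2]; the variance value is an illustrative choice):

```python
def second_moment(lmbda=2.0, drop_p=0.5, w_var=1.0 / 12.0, horizon=25):
    """Exact second-moment recursion for the packet-drop example: with
    probability drop_p the packet is lost and X_{t+1} = lmbda*X_t + W_t;
    otherwise the controller cancels the state and X_{t+1} = W_t, so
    E[X_{t+1}^2] = drop_p * lmbda**2 * E[X_t^2] + Var(W)."""
    m = [0.0]
    for _ in range(horizon):
        m.append(drop_p * lmbda ** 2 * m[-1] + w_var)
    return m

m = second_moment()
# With lmbda = 2 and drop_p = 1/2, the growth factor is lmbda**2/2 = 2 > 1:
# the variance roughly doubles every step, so it is asymptotically infinite,
# even though the channel's Shannon capacity is infinite.
print(m[5], m[15], m[25])
```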
Delay-universal (anytime) communication

[Figure: a causal encoder maps the bit stream B1, B2, …, B13 into channel inputs Y1, …, Y26; from the channel outputs Z1, …, Z26 the decoder produces estimates B̂1, …, B̂9 with fixed delay d = 7]

Fixed-delay reliability α is achievable if there exists a sequence of encoder/decoder pairs with increasing end-to-end delays dj such that

lim_{j→∞} −(1/dj) ln P(Bi ≠ B̂i(dj)) = α.

α is achievable delay-universally, or in an anytime fashion, if a single encoder works for all sufficiently large delays d.

The anytime capacity Cany(α) is the supremal rate at which reliability α is achievable in a delay-universal way.
Separation theorem for scalar control

Necessity: If a scalar system with parameter λ > 1 can be stabilized with finite η-moment across a noisy channel, then the channel with noiseless feedback must have

Cany(η ln λ) ≥ ln λ.

In general: if P(|X| > m) < f(m), then ∃K such that Perror(d) < f(Kλ^d).

Sufficiency: If there is an α > η ln λ for which the channel with noiseless feedback has

Cany(α) > ln λ,

then the scalar system with parameter λ ≥ 1 with a bounded disturbance can be stabilized across the noisy channel with finite η-moment.

Captures stabilization only.
Some easy implications

If we want P(|Xt| > m) ≤ f(m) = 0 for some finite m, we require zero-error reliability across the channel. This is also required (for DMCs) if we want the controller to have finite memory.

For generic DMCs, anytime reliability with feedback is upper-bounded: since f(Kλ^d) ≥ ζ^d for some ζ,

f(m) ≥ K′ m^{−log2(1/ζ)/log2 λ}.

A controlled state can have at best a power-law tail.

If we just want lim_{m→∞} f(m) = 0, then just Shannon capacity > log2 λ is required for DMCs.

Almost-sure stabilization for Wt = 0 follows by a simple time-varying transformation.
Asymptotic communication problem hierarchy

The easiest level: Shannon communication
    Asymptotically a single figure of merit: C.
    Equivalent to most estimation problems for stationary ergodic processes with bounded distortion measures.
    Feedback does not matter.
Intermediate families: Anytime communication
    Multiple figures of merit: the vectors (R, α).
    The feedback case is equivalent to stabilization problems.
    Related nonstationary estimation problems fall here also.
    Does feedback matter?
The hardest level: Zero-error communication
    A single figure of merit: C0.
    Feedback matters.
My favorite example: the BEC

[Figure: binary erasure channel — input 0 or 1 arrives intact with probability 1 − δ and is replaced by the erasure symbol e with probability δ]

Simple capacity: 1 − δ bits per channel use. With perfect feedback it is simple to achieve: retransmit until it gets through.
    Time till success: Geometric(1 − δ).
    Expected time to get through: 1/(1 − δ).

[Plot: error exponent (base 2) vs. rate (bits) for the classical bounds]

Classical bounds:
    Sphere-packing bound: D(1 − R ‖ δ).
    Random coding bound: max_{ρ∈[0,1]} E0(ρ) − ρR.

What happens with feedback?
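Both classical BEC exponents are easy to evaluate numerically. A sketch (base-2 units; for the BEC with a uniform input, the Gallager function reduces to E0(ρ) = −log2(δ + 2^{−ρ}(1 − δ))):

```python
import math

def D_bits(p, q):
    """Binary KL divergence D(p || q) in bits."""
    return p * math.log2(p / q) + (1 - p) * math.log2((1 - p) / (1 - q))

def E0_bec(rho, delta):
    """Gallager's E0 for the BEC with a uniform input, in bits."""
    return -math.log2(delta + 2 ** (-rho) * (1 - delta))

def Er_bec(R, delta, grid=10000):
    """Random coding exponent: max over rho in [0, 1] of E0(rho) - rho*R."""
    return max(E0_bec(i / grid, delta) - (i / grid) * R for i in range(grid + 1))

delta, R = 0.4, 0.5
Esp = D_bits(1 - R, delta)  # sphere-packing exponent, ~0.0294 here
Er = Er_bec(R, delta)
# At delta = 0.4, R = 0.5 is above the critical rate, so the two agree.
print(Esp, Er)
```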
BEC with feedback and fixed blocks

At rate R < 1, we have Rn bits to transmit in n channel uses. Typically (1 − δ)n code bits will be received; block errors are caused by atypical channel behavior. We are doomed if fewer than Rn bits arrive intact, and feedback cannot save us: the exponent is D(1 − R ‖ δ).

Dobrushin (1962) showed that this type of behavior is common: E+(R) = Esp(R) for symmetric channels.
BEC with feedback and fixed delay

R = 1/2 example (one new bit per two channel uses; L is the backlog of bits not yet through):

[Figure: birth-death chain on backlog states 0, 1, 2, 3, …, moving up with probability δ² and down with probability (1 − δ)²]

Birth-death chain: positive recurrent if δ < 1/2.

The delay exponent is easy to see:

P(D ≥ d) = P(L > d/2) = K (δ/(1 − δ))^d.

Exponent ≈ 0.584, vs. 0.0294 for block coding with δ = 0.4.

Block coding is misleading!
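The two numbers quoted above are quick to verify (a sketch; the fixed-delay feedback exponent log2((1 − δ)/δ) comes from the birth-death tail above, and the block exponent is the sphere-packing value D(1 − R ‖ δ)):

```python
import math

delta, R = 0.4, 0.5

# Fixed delay with feedback: P(D >= d) ~ (delta/(1-delta))**d, so the
# exponent is log2((1-delta)/delta) bits per unit of delay.
feedback_exponent = math.log2((1 - delta) / delta)

# Fixed block length (feedback cannot help): sphere packing D(1-R || delta).
p, q = 1 - R, delta
block_exponent = p * math.log2(p / q) + (1 - p) * math.log2((1 - p) / (1 - q))

print(feedback_exponent, block_exponent)  # ~0.585 vs ~0.0294
```

At the same rate, the fixed-delay scheme with feedback is roughly twenty times better in the exponent.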
Pinsker’s bounding construction explained

Without feedback, E+(R) continues to be a bound. Consider a code with target delay d:
    Use it to construct a block code with blocksize n ≫ d.
    Genie-aided decoder: has the truth of all bits before i.
    Error events for the genie-aided system depend only on the last d.
    Apply a change-of-measure argument.

[Figure: a causal encoder drives the DMC; the fixed-delay decoder’s estimates B̂ are corrected by XORing with the true bits B, fed forward through a delay, so each genie-aided bit estimate depends only on the last d channel outputs]
Using Esp to bound α∗ in general

[Figure: n channel uses split into “past behavior” (λn uses carrying the λR′n bits within the deadline) and “future” ((1 − λ)n uses whose bits we ignore); the deadline is d = (1 − λ)n]

The block error probability is like exp(−α(1 − λ)n), so the exponent α(1 − λ) cannot exceed the Haroutunian bound Esp(λR):

α∗(R) ≤ Esp(λR)/(1 − λ).

The error events involve both the past and the future.
Uncertainty-focusing bound for symmetric DMCs

Minimize over λ for symmetric DMCs to sweep out the frontier by varying ρ > 0:

R(ρ) = E0(ρ)/ρ,    E+a(ρ) = E0(ρ),

using the Gallager function

E0(ρ) = −max_q ln Σ_j ( Σ_i q_i p_ij^{1/(1+ρ)} )^{1+ρ}.

Same form as Viterbi’s “convolutional coding bound” for constraint lengths, but a lot more fundamental!
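For the BEC this frontier is easy to sweep numerically (a sketch in base-2 units, with E0(ρ) = −log2(δ + 2^{−ρ}(1 − δ)) for a uniform input). Solving R(ρ) = 1/2 at δ = 0.4 recovers exactly the log2((1 − δ)/δ) ≈ 0.585 exponent obtained earlier from the birth-death chain:

```python
import math

def E0(rho, delta):
    """Gallager's E0 for the BEC with a uniform input, in bits."""
    return -math.log2(delta + 2 ** (-rho) * (1 - delta))

def focusing_exponent(R, delta):
    """Solve R = E0(rho)/rho by bisection (E0(rho)/rho decreases in rho),
    then the uncertainty-focusing exponent is E0(rho) at that rho."""
    lo, hi = 1e-9, 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if E0(mid, delta) / mid > R:
            lo = mid
        else:
            hi = mid
    return E0(lo, delta)

alpha = focusing_exponent(R=0.5, delta=0.4)
print(alpha)  # matches log2((1-delta)/delta) ~ 0.585 from the birth-death chain
```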
Upper bound tight for the BEC with feedback
[Plot: error exponent (base 2) vs. rate (bits) — the achieved exponents meet the uncertainty-focusing bound for the BEC with feedback]
Implications for scalar moment stabilization

[Plot: error exponent (base e) vs. rate (nats)]

[Plot: number of moments stabilized vs. open-loop unstable gain]
Outline

1 A bridge to nowhere?
    A simple control problem
    A connection to information theory
    Fixing information theory and filling in the gaps.
2 Coming back to control
    What is wrong with random coding
    The role of noiseless feedback
3 Taking control thinking to the forefront of information theory.
    The “holy grail” problem
    Control thinking to the rescue!
Random coding bound is relatively easy to achieve

Randomly label the uniformly quantized state!
    The stable system state “renews” itself.
    It diverges locally whenever the channel misbehaves.
    Semi-reasonable implementation complexity.

[Figure: at times t = 0, 1, 2, 3, 4 the window of width ∆ is uniformly quantized into bins carrying random labels, forming a tree/trellis; R = log2 3]
Controller and Computations

All “false” disjoint paths through the trellis are pairwise independent of the true path.
    Bound the number of distinct paths by assuming no remerging.
    Gallager’s Er(Rbranch) emerges as the governing exponent.
Apply the control based on the current ML state.
    Computational nightmare: effort grows exponentially with time.
Use a “stack-based” greedy search algorithm instead.
    Log-likelihoods are additive, so the score of a path is a random walk with drift. Bias it so that the true path goes up and false ones go down.
    Classical results tell us that with the appropriate bias, we achieve Er(Rbranch) for the error probability, and hence a power law in the state, at the cost of only finite expected computation.
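The biased random-walk intuition can be illustrated with a toy sketch (an illustrative model only, not the implementation above: a rate-1/2 BEC with independent random branch labels, and a Fano-style bias of R per symbol, which are assumptions of this sketch):

```python
import random

random.seed(0)
R, delta = 0.5, 0.3   # rate in bits per channel use, erasure probability

def true_path_score(steps):
    """Fano-style score along the TRUE path on a BEC: an unerased symbol
    adds (1 - R), an erased one subtracts R. Drift: (1 - delta) - R = +0.2."""
    s = 0.0
    for _ in range(steps):
        s += (1 - R) if random.random() > delta else -R
    return s

def false_path_score(steps):
    """Along a FALSE path, an unerased symbol matches the independent random
    label only with probability 1/2; a mismatch kills the path outright."""
    s = 0.0
    for _ in range(steps):
        if random.random() > delta:        # symbol arrived unerased
            if random.random() < 0.5:      # label mismatch: impossible path
                return float("-inf")
            s += 1 - R
        else:                              # erasure: no evidence either way
            s -= R
    return s

trials = 2000
true_scores = [true_path_score(100) for _ in range(trials)]
surviving = [s for s in (false_path_score(100) for _ in range(trials))
             if s != float("-inf")]
print(sum(true_scores) / trials)   # near +20: the true path drifts upward
print(len(surviving) / trials)     # ~0: false paths die almost immediately
```

A stack decoder exploits exactly this separation: the true path keeps rising to the top of the priority queue, so only finitely many false extensions are explored in expectation.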
Catch up “all-at-once” phenomenon

Simulation parameters: λ = 1.1, ε = 0.05, Ω = 2.0, ∆ = 5000.0, bias = 0.55, T = 10; 100,000 blocks, 17 seconds to run. Rate = 0.317, capacity = 0.71.

[Plot: decoded bin and actual bin vs. block number — after falling behind, the decoded bin catches up to the actual bin all at once]
Truth in advertising: computation revisited

Although we are doing better than exponential growth, we still have power laws on both sides. What if we needed a finite-speed computer in the controller?

Bad news:
    Assume zero control is applied if we cannot decode yet.
    A power law for computation implies a power law for waiting.
    Exponentially rare, doubly exponentially bad states!
How to hit the higher bound?

[Plot: error exponent (base e) vs. rate (nats)]

[Plot: number of moments stabilized vs. open-loop unstable gain]
Fortified channels

[Figure: a stream of noisy forward channel uses interleaved with “fortification” slots S1, …, S6 of noiseless forward channel uses]

Some mix of noisy and noiseless channels. Is it all or nothing?
Noiseless channel can enable event-based sampling
[Diagram: noisy forward channel uses S1–S6 interleaved with “fortification”: noiseless forward channel uses.]
Need to allow for gradual progress during bad periods. Use the noiseless channel for supervisory information:
Have the observer do event-based “sampling” of the state.
The “quantization net” grows as needed, but has only e^{nR} boxes.
The noiseless channel tells the controller when it has “resampled.”
Use the noisy channel for variable-length block coding.
[Figure: quantization boxes labeled 000, 001, 010, 011, 100, 101, 110, 111; an inner catchment area to resample the state and an outer net to quantize and encode the state.]
Anant Sahai (UC Berkeley) IT + Control Mar 26, 2008 41 / 54
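The event-triggered loop described above can be sketched as a toy simulation. This is a sketch under assumptions, not the scheme from the talk: the values LAM, R_BITS, and CATCH are made up, and delivery of the quantized box index is idealized as instantaneous and noiseless, so the variable-length coding over the noisy channel is not modeled.

```python
import random

# Toy event-based "sampling" of a scalar unstable plant x_{t+1} = lam*x_t + w_t + u_t.
# The observer resamples whenever the controller's estimation error leaves the inner
# catchment area, then quantizes the state into one of 2**R_BITS boxes.

LAM = 1.5        # open-loop unstable gain (assumed value)
R_BITS = 4       # log2 of the number of quantization boxes
CATCH = 4.0      # half-width of the inner catchment area
random.seed(0)

def quantize(x, half_width, bits):
    """Map x in [-half_width, half_width] to the center of one of 2**bits boxes."""
    n_boxes = 2 ** bits
    box = min(n_boxes - 1, max(0, int((x + half_width) / (2 * half_width) * n_boxes)))
    return -half_width + (box + 0.5) * (2 * half_width / n_boxes)

x, x_hat = 0.0, 0.0               # true state and the controller's estimate
resamples = 0
for t in range(10_000):
    u = -LAM * x_hat              # controller cancels its estimate of the state
    w = random.uniform(-1, 1)     # bounded disturbance
    x = LAM * x + u + w
    x_hat = 0.0                   # predicted state after cancellation (w is unknown)
    if abs(x - x_hat) > CATCH:                   # event: error left the catchment area
        x_hat = quantize(x, 2 * CATCH, R_BITS)   # resample + quantize the state
        resamples += 1

print(f"final |x| = {abs(x):.2f}, resamples = {resamples}")
```

With these numbers the error between resamples grows by the factor LAM each step, so the state stays inside the outer quantization net (half-width 2·CATCH) and the closed loop remains bounded even though the plant is unstable.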
Why gradual progress is better: intuition
[Plot: progress (in bits) vs. time (in BEC uses).]
Anant Sahai (UC Berkeley) IT + Control Mar 26, 2008 42 / 54
Why this works: proof strategy
Lift the problem by using large nR.
Very few noiseless channel uses are required.
The stopping time for the variable-length channel is like n + T, where T is geometric with parameter exp(−E0(ρ)).
Interpret with ln λ < R = E0(ρ)/ρ < E0(η+ε)/(η+ε).
Behaves like a “virtual” packet-erasure channel:
Each packet carries n(R − ln λ) nats.
Disturbances grow by a factor O(λ^n).
Erasure probability exp(−E0(ρ)).
Anant Sahai (UC Berkeley) IT + Control Mar 26, 2008 43 / 54
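The virtual packet-erasure picture can be illustrated with a back-of-the-envelope simulation. This is purely illustrative: the numbers n, R, lam, and the exponent E0 = 0.2 are assumed values, and it tracks only a single uncertainty bookkeeping quantity rather than the moment bounds of the actual proof.

```python
import math, random

# Track the controller's uncertainty about the state, in nats.  Every epoch of
# n channel uses the state grows by lam**n, adding n*ln(lam) nats of uncertainty;
# with probability 1 - p_erase a variable-length packet gets through and removes
# up to n*R nats.

random.seed(1)
n, R, lam = 20, 0.5, 1.2          # epoch length, rate (nats/use), unstable gain (assumed)
p_erase = math.exp(-0.2 * n)      # erasure probability ~ exp(-n*E0); E0 = 0.2 assumed

growth = n * math.log(lam)        # new uncertainty per epoch, in nats
assert math.log(lam) < R          # the stability condition from the slide

U, peak = 0.0, 0.0
for epoch in range(100_000):
    U += growth
    if random.random() > p_erase:  # packet delivered
        U = max(0.0, U - n * R)    # packet resolves up to n*R nats
    peak = max(peak, U)

print(f"peak uncertainty = {peak:.1f} nats")
```

Because the erasure probability is exponentially small in n while each erased epoch only adds n·ln λ nats, the uncertainty repeatedly returns to zero and its excursions stay short, which is the intuition behind stability whenever ln λ < R.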
Outline
1 A bridge to nowhere?
A simple control problem
A connection to information theory
Fixing information theory and filling in the gaps
2 Coming back to control
What is wrong with random coding
The role of noiseless feedback
3 Taking control thinking to the forefront of information theory
The “holy grail” problem
Control thinking to the rescue!
Anant Sahai (UC Berkeley) IT + Control Mar 26, 2008 44 / 54
The “holy grail”: understanding complexity
Classical goal: arbitrarily low probability of error.
Classical assumption: not delay-sensitive at all.
New twist: minimize total power consumption.
Important technology trends:
“Moore’s law” allows billions of transistors, but only mildly reduces the power consumption per transistor.
New short-range applications: swarm behavior, in-home networks, dense meshes, personal-area networks, UWB, between-chip communication, etc.
Anant Sahai (UC Berkeley) IT + Control Mar 26, 2008 45 / 54
Decoding power vs communication range
[Plot: γ (dB) vs. distance, from 1 mm to 10 km.]
Anant Sahai (UC Berkeley) IT + Control Mar 26, 2008 46 / 54
Dense linear codes with brute-force decoding
[Plot: log10⟨Pe⟩ vs. power, showing total power, optimal transmit power, decoding power, and the Shannon waterfall.]
Decoding power: nR·2^{nR}. Error probability: 2^{−E_sp(R,P)n}.
Anant Sahai (UC Berkeley) IT + Control Mar 26, 2008 47 / 54
Convolutional codes with Viterbi decoding
[Plot: log10⟨Pe⟩ vs. power, showing total power, optimal transmit power, decoding power, and the Shannon waterfall.]
Decoding power: L_c·R·2^{L_c R}. Error probability: 2^{−E_conv(R,P)L_c}.
Anant Sahai (UC Berkeley) IT + Control Mar 26, 2008 48 / 54
Convolutional codes with “magical” sequential decoding
[Plot: log10⟨Pe⟩ vs. power, showing the Shannon waterfall, optimal transmit power, decoding power, and total power.]
Decoding power: L_c·R. Error probability: 2^{−E_conv(R,P)L_c}.
Anant Sahai (UC Berkeley) IT + Control Mar 26, 2008 49 / 54
Dense linear codes with “magical” syndrome decoding
[Plot: log10⟨Pe⟩ vs. power, showing the Shannon waterfall, optimal transmit power, total power, and decoding power.]
Decoding power: (1−R)nR. Error probability: 2^{−E_sp(R,P)n}.
Anant Sahai (UC Berkeley) IT + Control Mar 26, 2008 50 / 54
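The four decoding-power scalings quoted on the last four slides can be lined up numerically. This is illustrative only: the rate R = 0.5 is an assumed value, units and constants are suppressed, and the constraint length L_c is identified with the block length n so all four curves grow in one variable.

```python
# Decoding-power scalings from the four slides above, as functions of n
# (taking Lc = n purely for side-by-side comparison).

R = 0.5  # rate in bits per channel use (assumed)

def brute_force(n):   # dense linear code, brute-force decoding: n*R*2^(n*R)
    return n * R * 2 ** (n * R)

def viterbi(n):       # convolutional + Viterbi: Lc*R*2^(Lc*R), same form with Lc = n
    return n * R * 2 ** (n * R)

def sequential(n):    # "magical" sequential decoding: Lc*R
    return n * R

def syndrome(n):      # "magical" syndrome decoding: (1-R)*n*R
    return (1 - R) * n * R

for n in (16, 32, 64):
    print(n, f"{brute_force(n):.3g}", f"{sequential(n):.3g}", f"{syndrome(n):.3g}")
```

The point of the comparison: brute-force and Viterbi decoding pay exponentially in the block/constraint length for their exponentially small error probability, while the two “magical” decoders pay only linearly, which is what makes the waterfall curves on their slides look so different.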
A new hope: iterative decoding
Make assumptions about the decoder implementation rather than the code:
Massively parallel computational nodes, each connected to at most α + 1 other nodes.
Each consumes E_node energy per iteration and can send arbitrary messages to its neighbors.
Some nodes are initialized with a single received codeword symbol; some nodes are responsible for decoding a single message bit.
Run for a fixed number of iterations i.
Rich enough to capture LDPC, RA, Turbo, etc. codes.
Power consumption ≥ i·E_node per received sample.
Anant Sahai (UC Berkeley) IT + Control Mar 26, 2008 51 / 54
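The counting behind the power bound can be sketched directly from the node model (the values of alpha and E_node below are assumed purely for illustration):

```python
# After i iterations a bit's decoding neighborhood has size
# n <= 1 + (alpha+1)*alpha**(i-1), so reaching a target neighborhood size n
# requires roughly i >= log_alpha(n) iterations, and hence decoding power
# of at least i*E_node per received sample.

def neighborhood_size(alpha, i):
    """Max number of nodes within i hops when each node has <= alpha+1 neighbors."""
    return 1 + (alpha + 1) * alpha ** (i - 1)

def min_iterations(alpha, n):
    """Smallest i with neighborhood_size(alpha, i) >= n."""
    i = 1
    while neighborhood_size(alpha, i) < n:
        i += 1
    return i

alpha, E_node = 3, 1.0  # fan-out bound and per-iteration node energy (assumed units)
for n in (10, 100, 1000):
    i = min_iterations(alpha, n)
    print(n, i, i * E_node)  # target neighborhood size, iterations needed, power bound
```

Because the neighborhood grows exponentially in i, the required iterations (and hence power) grow only logarithmically in n; the next slides show how to turn a required n into a required error probability.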
How to lower-bound the number of iterations?
[Figure: a decoding neighborhood tree rooted at a bit node, fanning out through check nodes B and channel observations X, Y.]
Key concept: decoding neighborhoods = information patterns.
Decoding neighborhood size n ≤ 1 + (α + 1)α^{i−1} ≈ α^i.
Need to lower-bound the average probability of bit error in terms of n.
Key insight: n is playing a role analogous to delay.
Anant Sahai (UC Berkeley) IT + Control Mar 26, 2008 52 / 54
A local “sphere-packing” bound for the AWGN
Decoding neighborhood size n ≤ 1 + (α + 1)α^{i−1} ≈ α^i.

⟨Pe⟩ ≥ sup_{σ_G² > σ_P²: C(G) < R} μ(n) · (h_b^{−1}(δ(G))/2) · exp( −n D(σ_G² ‖ σ_P²) − (1/2) φ(n, h_b^{−1}(δ(G))) (σ_G²/σ_P² − 1) )

where
C(G) = (1/2) log₂(1 + P_T/σ_G²), δ(G) = 1 − C(G)/R
μ(n) = (1/2)(1 + 1/(T(n)+1) + (4T(n)+2)/(nT(n)(1+T(n))))
T(n) = −W_L(−e^{−1}(1/4)^{1/n})
W_L(x) solves x = W_L(x) e^{W_L(x)} (a real branch of the Lambert W function)
φ(n, y) = −n(W_L(−e^{−1}(y/2)^{2/n}) + 1)

Double-exponential potential return on investments in decoding power!
Anant Sahai (UC Berkeley) IT + Control Mar 26, 2008 53 / 54
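The helper functions in the bound can be evaluated numerically. This is a sketch under an assumption: the slide does not pin down the branch, so W_L is taken here to be the lower real branch W₋₁ of the Lambert W function, which makes T(n) ≥ 1 and φ(n, y) ≥ 0; lambert_w_lower is a hand-rolled bisection rather than a library call.

```python
import math

def lambert_w_lower(x, iters=200):
    """Lower branch W_{-1}(x) for x in (-1/e, 0), by bisection on w*e^w = x."""
    lo, hi = -700.0, -1.0             # W_{-1} lies in (-inf, -1]
    for _ in range(iters):
        mid = (lo + hi) / 2
        if mid * math.exp(mid) > x:   # w*e^w is decreasing in w on this branch
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def T(n):
    """T(n) = -W_L(-e^{-1} (1/4)^{1/n}) from the slide."""
    return -lambert_w_lower(-math.exp(-1) * 0.25 ** (1.0 / n))

def mu(n):
    """mu(n) = (1/2)(1 + 1/(T+1) + (4T+2)/(n T (1+T)))."""
    t = T(n)
    return 0.5 * (1 + 1 / (t + 1) + (4 * t + 2) / (n * t * (1 + t)))

def phi(n, y):
    """phi(n, y) = -n (W_L(-e^{-1} (y/2)^{2/n}) + 1)."""
    return -n * (lambert_w_lower(-math.exp(-1) * (y / 2) ** (2.0 / n)) + 1)

for n in (1, 10, 100):
    print(n, round(T(n), 4), round(mu(n), 4), round(phi(n, 0.5), 4))
```

Under this branch choice T(n) decreases toward 1 as n grows, so μ(n) stays between 1/2 and 1 for moderate n and φ(n, y) is a nonnegative correction, consistent with the bound being an exponentially decaying quantity scaled by mild polynomial factors.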
Waterslide curves for general AWGN case
[Plot: log10⟨Pe⟩ vs. power for γ = 0.2, 0.3, 0.4, with the Shannon limit.]
Anant Sahai (UC Berkeley) IT + Control Mar 26, 2008 54 / 54