The natural mathematics arising in information theory and investment
Thomas Cover
Stanford University
Page 1 of 40
Felicity of mathematics
We wish to maximize the growth rate of wealth.
There is a satisfactory theory. The strategy achieving this goal is controversial. (Probably because the strategy involves maximizing the expected logarithm.)
Why is $\pi$ fundamental? $\pi = C/D$, $\sum_n \frac{1}{n^2} = \frac{\pi^2}{6}$, $\phi(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}$.
Recall from physics the statement that the laws of physics have a strangely felicitous relation with mathematics. We shall try to establish the reasonableness of the theory of growth optimality by presenting the richness of the mathematics that describes it and by giving a number of problems having growth optimality as the answer.
A theory is natural if it fits and has few “moving parts”. Ideally, it should “predict” other properties.
The new or unpublished statements will be identified.
Outline
Setup
Mean variance theory
Growth optimal portfolios for stochastic markets
Properties:
Stability of optimal portfolio
Expected ratio optimality
Competitive optimality
$S_n/S_n^*$ martingale
$S_n^* \doteq e^{nW^*}$ (AEP)
Growth optimal portfolios for arbitrary markets
Universal portfolios
$S_n/S_n^* \ge \frac{1}{2\sqrt{n+1}}$ for all $x^n$
Amplification
Relationship of growth optimality to information theory
Portfolio Selection
Stock X: $X = (X_1, X_2, \ldots, X_m) \sim F(x)$, $X \ge 0$
$X_i$ = price-relative of stock i
Portfolio b: $b = (b_1, b_2, \ldots, b_m)$, $b_i \ge 0$, $\sum_i b_i = 1$
$b_i$ = proportion invested in stock i
Wealth Relative S: Factor by which wealth increases
$S = \sum_{i=1}^m b_i X_i = b^t X$
Find the “largest” S.
Mean-Variance Theory. Markowitz, Tobin, Sharpe, ...
Choose b so that $(\mathrm{Var}\, S, E S)$ is undominated, where $S = b^t X$.
Conflict of mean-variance theory and growth rate.
Portfolio selection:
Maximize growth rate of wealth.
$S_n(X_1, X_2, \ldots, X_n) \doteq 2^{nW}$
Efficient portfolio is not necessarily growth optimal (E. Thorp)
Consider the stock market process $\{X_i\}$: $X_i \in \mathbb{R}^m$.
Portfolios $b_i(\cdot)$: $\sum_{j=1}^m b_{ij}(x^{i-1}) = 1$ for each time $i = 1, 2, \ldots$ and for every past $x^{i-1} = (x_1, x_2, \ldots, x_{i-1})$.
Note: $b_{ij} < 0$ corresponds to shorting stock j on day i. Shorting cash is called buying on margin.
Goal: Given a stochastic process $\{X_i\}$ with known distribution, find a portfolio sequence $b_i(\cdot)$ that “maximizes”
$S_n = \prod_{i=1}^n b_i^t(X^{i-1}) X_i$.
1. Asymptotic Growth Rate of Wealth
$X_1, X_2, \ldots$ i.i.d. $\sim F(x)$
Wealth at time n:
$S_n = \prod_{i=1}^n b^t X_i = 2^{n \cdot \frac{1}{n} \sum_{i=1}^n \log b^t X_i} = 2^{n(E \log b^t X + o(1))}$, a.e.
Definition: Growth rate
$W(b, F) = \int \log b^t x \, dF(x)$
$W^* = \max_b W(b, F)$
$S_n^* \doteq 2^{nW^*}$
Example
Cash vs. Hot Stock
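$X = (1, 2)$ with probability $\frac{1}{2}$; $X = (1, \frac{1}{2})$ with probability $\frac{1}{2}$
$b = (b_1, b_2)$
$E \log S = \frac{1}{2} \log(b_1 + 2 b_2) + \frac{1}{2} \log(b_1 + \frac{1}{2} b_2)$
$b^* = (\frac{1}{2}, \frac{1}{2})$
$W^* = \frac{1}{2} \log \frac{9}{8}$
$S_n^* \doteq \left(\frac{9}{8}\right)^{n/2} \doteq (1.06)^n$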
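The numbers in this example can be reproduced with a small grid search. The sketch below is only an illustration: the function names `growth_rate` and `best_fraction` and the grid resolution are assumptions of this sketch, not part of the talk. Logarithms are taken base 2, matching the $2^{nW}$ notation.

```python
import math

# Market from the example: (cash, hot stock) price relatives, each with probability 1/2.
OUTCOMES = [((1.0, 2.0), 0.5), ((1.0, 0.5), 0.5)]

def growth_rate(b2, outcomes=OUTCOMES):
    """W(b, F) = E log2(b^t X) for the portfolio b = (1 - b2, b2)."""
    b1 = 1.0 - b2
    return sum(p * math.log2(b1 * x1 + b2 * x2) for (x1, x2), p in outcomes)

def best_fraction(steps=1000):
    """Grid search over b2 in [0, 1]; returns (b2*, W*)."""
    grid = [k / steps for k in range(steps + 1)]
    b2_star = max(grid, key=growth_rate)
    return b2_star, growth_rate(b2_star)

b2_star, w_star = best_fraction()
# Prints the optimal stock fraction, the growth rate W*, and the per-period factor 2^W*.
print(b2_star, w_star, 2 ** w_star)
```

Holding only cash (`b2 = 0`) or only the hot stock (`b2 = 1`) gives growth rate 0 in this market, while the rebalanced half-and-half portfolio grows by a factor of about 1.06 per period.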
Live off fluctuations
[Figure: wealth s versus time n for Cash, Hot stock, and $S_n^*$.]
Calculation of optimal portfolio
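$X \sim F(x)$
Log Optimal Portfolio $b^*$: $\max_b E \log b^t X = W^*$
Log Optimal Wealth: $S^* = b^{*t} X$
$\frac{\partial}{\partial b_i} E \ln b^t X = E \frac{X_i}{b^t X}$
Kuhn-Tucker conditions:
$b^*$: $E \frac{X_i}{b^{*t} X} = 1$ if $b_i^* > 0$, and $\le 1$ if $b_i^* = 0$.
Consequence: $E\, S/S^* \le 1$, for all S.
Theorem: $E \ln \frac{S}{S^*} \le 0, \ \forall S \iff E \frac{S}{S^*} \le 1, \ \forall S$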
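As a sanity check, the Kuhn-Tucker conditions can be evaluated exactly on the cash vs. hot stock example. This is a minimal sketch using exact rational arithmetic; the helper names (`wealth`, `kuhn_tucker_values`, `expected_ratio`) and the grid of competing portfolios are assumptions made for illustration.

```python
from fractions import Fraction as F

# (price relatives, probability) for the cash vs. hot stock market.
OUTCOMES = [((F(1), F(2)), F(1, 2)), ((F(1), F(1, 2)), F(1, 2))]

def wealth(b, x):
    """Wealth relative S = b^t x."""
    return sum(bi * xi for bi, xi in zip(b, x))

def kuhn_tucker_values(b_star, outcomes=OUTCOMES):
    """E[X_i / (b*^t X)] for each stock i."""
    m = len(b_star)
    return [sum(p * x[i] / wealth(b_star, x) for x, p in outcomes) for i in range(m)]

def expected_ratio(b, b_star, outcomes=OUTCOMES):
    """E[S / S*] for a competing portfolio b."""
    return sum(p * wealth(b, x) / wealth(b_star, x) for x, p in outcomes)

b_star = (F(1, 2), F(1, 2))
kt = kuhn_tucker_values(b_star)
# Competing long-only portfolios (1 - k/10, k/10), k = 0..10.
ratios = [expected_ratio((1 - F(k, 10), F(k, 10)), b_star) for k in range(11)]
print(kt, max(ratios))
```

Both components of $b^*$ are positive here, so both conditions hold with equality, and $E\,S/S^*$ equals 1 for every competing portfolio on the grid.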
Properties of growth rate W (b, F )
Theorem: $W(b, F)$ is concave in $b$ and linear in $F$.
Let $b_F$ maximize $W(b, F)$ over all portfolios $b$: $\sum_{i=1}^m b_i = 1$. $W^*(F) = W(b_F, F)$.
[Figure: $W(b, F)$ plotted against $b$, for $b$ from 0 to 1.]
Theorem: $W^*(F)$ is convex in $F$.
Question: Let $W(b) = \int \ln b^t x \, dF(x)$. Is $W(b)$ a transform?
2. Stability of b∗: Expected proportion remains constant
b∗ is a stable point
Let $b = (b_1, b_2, \ldots, b_m)$ denote the proportion of wealth in each stock.
The proportions held in each stock at the end of the trading day are
$\hat{b} = \left( \frac{b_1 X_1}{b^t X}, \frac{b_2 X_2}{b^t X}, \ldots, \frac{b_m X_m}{b^t X} \right)$
Then $b$ is log optimal if and only if
$b = E \hat{b}$,
i.e. $b_i = E \frac{b_i X_i}{b^t X}$, $i = 1, 2, \ldots, m$, i.e. the expected proportions remain unchanged.
This is the counterpart to Kelly gambling.
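The stability property can be checked exactly on the cash vs. hot stock market used earlier. In this sketch the function name `expected_end_of_day_proportions` and the comparison portfolio $(3/4, 1/4)$ are illustrative assumptions.

```python
from fractions import Fraction as F

# (price relatives, probability) for the cash vs. hot stock market.
OUTCOMES = [((F(1), F(2)), F(1, 2)), ((F(1), F(1, 2)), F(1, 2))]

def expected_end_of_day_proportions(b, outcomes=OUTCOMES):
    """E[b_i X_i / (b^t X)] for each stock i."""
    result = [F(0)] * len(b)
    for x, p in outcomes:
        s = sum(bi * xi for bi, xi in zip(b, x))  # wealth relative b^t x
        for i, (bi, xi) in enumerate(zip(b, x)):
            result[i] += p * bi * xi / s
    return result

log_optimal = [F(1, 2), F(1, 2)]
other = [F(3, 4), F(1, 4)]  # hypothetical non-optimal portfolio
print(expected_end_of_day_proportions(log_optimal))
print(expected_end_of_day_proportions(other))
```

For the log optimal portfolio the expected end-of-day proportions equal the starting proportions; for the other portfolio they drift.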
Generalization to arbitrary stochastic processes Xn
$\{X_n\}$: arbitrary stochastic process.
Wealth from $b_i(\cdot)$: $S_n = \prod_{i=1}^n b_i^t X_i$, $b_i = b_i(X^{i-1})$
Let $S_n^* = \prod_{i=1}^n b_i^{*t} X_i$, $b_i^* = b_i^*(X^{i-1})$,
where $b_i^*$ is conditionally log optimal. Thus
$b_i^*(X^{i-1})$: $\max_b E\{\ln b^t X_i \mid X^{i-1}\}$
Optimality for arbitrary stochastic processes Xn
Theorem: For any market process $\{X_i\}$,
$E\{S_{n+1}/S_{n+1}^* \mid X^n\} \le S_n/S_n^*$.
$S_n/S_n^*$ is a nonnegative supermartingale with respect to $X^n$.
$S_n/S_n^* \to Y$, a.e.
$EY \le 1$.
Corollary: $\Pr\{\sup_n \frac{S_n}{S_n^*} \ge t\} \le 1/t$, by Kolmogorov’s inequality. So $S_n$ cannot ever exceed $S_n^*$ by factor t with probability greater than 1/t. Same as fair gambling.
Theorem: If $\{X_i\}$ is ergodic, then $\frac{1}{n} \log S_n^* \to W$, a.e.
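The corollary can be illustrated on a finite horizon by exact enumeration. The sketch below assumes the i.i.d. cash vs. hot stock market from the earlier example, a hypothetical competitor that holds only the hot stock, and a horizon of 10 days; none of these choices come from the talk. The finite-horizon maximum is bounded by the supremum, so the same $1/t$ bound applies.

```python
from fractions import Fraction as F
from itertools import product

# Equally likely daily price relatives (cash, hot stock).
OUTCOMES = [(F(1), F(2)), (F(1), F(1, 2))]
B_STAR = (F(1, 2), F(1, 2))  # log optimal portfolio for this market

def dot(b, x):
    return sum(bi * xi for bi, xi in zip(b, x))

def prob_ratio_reaches(b, t, n):
    """Exact Pr{ max_{k <= n} S_k / S_k^* >= t } over all 2^n equally likely paths."""
    hits = 0
    for path in product(OUTCOMES, repeat=n):
        ratio = F(1)
        for x in path:
            ratio *= dot(b, x) / dot(B_STAR, x)
            if ratio >= t:
                hits += 1
                break
    return F(hits, 2 ** n)

all_in_stock = (F(0), F(1))  # hypothetical competing portfolio
for t in (2, 4):
    # Each printed probability should not exceed 1/t.
    print(t, prob_ratio_reaches(all_in_stock, t, n=10))
```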
3. Value of Side Information
Theorem: Believe that $X \sim g$, when in fact $X \sim f$. Loss in growth rate:
$\Delta(f \| g) = E_f \log \frac{b_f^t X}{b_g^t X} \le D(f \| g) = \int f \log \frac{f}{g}$.
Mutual information: $I(X; Y) = \sum p(x, y) \log \frac{p(x, y)}{p(x) p(y)}$
Value of side information:
$W(X) = \max_b E \ln b^t X$, $W(X|Y) = \max_{b(\cdot)} E \ln b^t(Y) X$
$W(X) \to W(X|Y)$
$b^* \to b^*(y)$
$\Delta(X; Y)$ = Increase in growth rate for market X.
Theorem (A. Barron, T.C.): $\Delta(X; Y) \le I(X; Y)$.
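Both bounds can be checked numerically in the special case of a horse race with m-for-1 odds, where the log optimal bet is proportional to the believed distribution and the bounds hold with equality. The sketch below covers only that special case, not a general market; the function names and the example distributions are assumptions made for illustration.

```python
import math

def doubling_rate(p, b):
    """E log2(b_i * m) when horse i wins with probability p_i and pays m-for-1."""
    m = len(p)
    return sum(pi * math.log2(bi * m) for pi, bi in zip(p, b) if pi > 0)

def relative_entropy(p, g):
    """D(p || g) in bits."""
    return sum(pi * math.log2(pi / gi) for pi, gi in zip(p, g) if pi > 0)

def mutual_information(joint):
    """I(X; Y) in bits for joint[x][y] = p(x, y)."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(pxy * math.log2(pxy / (px[i] * py[j]))
               for i, row in enumerate(joint) for j, pxy in enumerate(row) if pxy > 0)

def side_information_gain(joint):
    """W(X|Y) - W(X) when betting p(x|y) instead of p(x)."""
    m = len(joint)
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    w_plain = doubling_rate(px, px)
    w_side = 0.0
    for j, pyj in enumerate(py):
        cond = [joint[i][j] / pyj for i in range(m)]
        w_side += pyj * doubling_rate(cond, cond)
    return w_side - w_plain

p = [0.5, 0.25, 0.25]           # true win probabilities (hypothetical)
g = [1 / 3, 1 / 3, 1 / 3]       # believed probabilities (hypothetical)
loss = doubling_rate(p, p) - doubling_rate(p, g)
joint = [[0.4, 0.1], [0.1, 0.4]]  # hypothetical p(x, y)
print(loss, relative_entropy(p, g))
print(side_information_gain(joint), mutual_information(joint))
```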
4. Black-Scholes option pricing
Cash: 1
Stock: $X_i = 1 + u$ w.p. $p$; $X_i = 1 - d$ w.p. $q$
Option: Pay c dollars today for option to buy at time n the stock at price K.
$c \to (X_n - K)$ if $X_n \ge K$; $0$ if $X_n < K$
Black, Scholes idea: Replicate option by buying and selling $X_i$ at times $i = 1, 2, \ldots, n$.
Example: Option expiration date n = 1. Strike price K. Initial wealth = c.
$c_1 + c_2 X = (X - K)^+$. $c = c_1 + c_2$.
If it takes c dollars to replicate option, then c is a correct price for the option.
Black-Scholes option pricing
Growth optimal approach: consider the market $\left(1, X, \frac{(X - K)^+}{c}\right)$.
Best portfolio without option:
$\max_{b_1 + b_2 = 1} E \ln(b_1 + b_2 X)$
Growth optimal wealth: $X^* = b_1^* + b_2^* X$
Add option:
$\max_b E \ln\left((1 - b) X^* + b \frac{(X - K)^+}{c}\right)$
$\frac{d}{db} E \ln\left((1 - b) X^* + \frac{b (X - K)^+}{c}\right)\Big|_{b=0} = E \frac{\frac{(X - K)^+}{c} - X^*}{X^*} \ge 0$,
or $E \frac{(X - K)^+}{X^*} \ge c$.
Critical price:
$c^* = E \frac{(X - K)^+}{X^*}$.
But this is the same critical option price $c^*$ as the Black Scholes theory.
Note: $c^*$ does not depend on probabilities, only on u and d.
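For the one-period example (n = 1, stock price 1 today), the two prices can be compared exactly. The sketch below makes several assumptions that are not in the talk: shorting and margin are allowed so that the first-order condition has an interior solution, the closed form `b2 = (p*u - q*d) / (u*d)` is obtained here by setting the derivative of $p \ln(1 + b_2 u) + q \ln(1 - b_2 d)$ to zero, and the numerical values of u, d, K and p are arbitrary.

```python
from fractions import Fraction as F

def payoff(x, strike):
    """Option payoff (x - K)^+."""
    return max(x - strike, F(0))

def replication_price(u, d, strike):
    """Solve c1 + c2*X = (X - K)^+ at X = 1+u and X = 1-d; return c = c1 + c2."""
    up, down = 1 + u, 1 - d
    c2 = (payoff(up, strike) - payoff(down, strike)) / (up - down)
    c1 = payoff(up, strike) - c2 * up
    return c1 + c2

def growth_optimal_price(u, d, p, strike):
    """c* = E[(X - K)^+ / X*], with X* = (1 - b2) + b2*X the log optimal wealth."""
    q = 1 - p
    b2 = (p * u - q * d) / (u * d)  # stationary point of p*ln(1 + b2*u) + q*ln(1 - b2*d)
    up, down = 1 + u, 1 - d
    x_star_up, x_star_down = 1 + b2 * u, 1 - b2 * d
    return p * payoff(up, strike) / x_star_up + q * payoff(down, strike) / x_star_down

u, d, strike = F(1), F(1, 2), F(1)
print(replication_price(u, d, strike))
for p in (F(1, 2), F(3, 4)):
    # The growth optimal price should match the replication price for each p.
    print(p, growth_optimal_price(u, d, p, strike))
```

Under these assumptions the growth optimal price agrees with the replication cost for both values of p, which illustrates the remark that $c^*$ depends only on u and d.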
5. Asymptotic Equipartition Principle
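AEP: $X_1, X_2, \ldots, X_n$ i.i.d. $\sim p(x)$,
$\frac{1}{n} \log \frac{1}{p(X_1, X_2, \ldots, X_n)} \to H$.
AEP for markets. Wealth:
$S_n = \prod_{i=1}^n b^t X_i$.
$\frac{1}{n} \log S_n \to W$.
Proof: $\frac{1}{n} \log S_n = \frac{1}{n} \log \prod_{i=1}^n b^t X_i = \frac{1}{n} \sum_{i=1}^n \log b^t X_i \to W$.
$p(X_1, X_2, \ldots, X_n) \doteq 2^{-nH}$
$S_n(X_1, X_2, \ldots, X_n) \doteq 2^{nW}$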
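The convergence in the proof is just the law of large numbers, which a short simulation can illustrate. This sketch assumes the cash vs. hot stock market from the earlier example with the half-and-half portfolio; the function name, seed and sample sizes are arbitrary choices made here.

```python
import math
import random

def empirical_growth_rate(n, b=(0.5, 0.5), seed=0):
    """(1/n) log2 S_n for n i.i.d. draws from the cash vs. hot stock market."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = (1.0, 2.0) if rng.random() < 0.5 else (1.0, 0.5)
        total += math.log2(b[0] * x[0] + b[1] * x[1])
    return total / n

W = 0.5 * math.log2(9 / 8)  # growth rate of b = (1/2, 1/2) in this market
for n in (100, 10_000, 100_000):
    # The empirical rate should approach W as n grows.
    print(n, empirical_growth_rate(n), W)
```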
Asymptotic Equipartition Principle: Horse race
$b = (b_1, b_2, \ldots, b_m)$,
$X = (0, 0, \ldots, 0, m, 0, \ldots, 0)$ (with $m$ in position $i$), with probability $p_i$,
$b^* = (p_1, p_2, \ldots, p_m)$: Kelly gambling
Proof:
$W = E \log S = \sum_{i=1}^m p_i \log(b_i m) = \log m + \sum_i p_i \log \frac{b_i}{p_i} + \sum_i p_i \log p_i \le \log m - H(p_1, \ldots, p_m)$,
with equality if and only if $b_i = p_i$, for $i = 1, 2, \ldots, m$.
Conservation law
$W^* + H = \log m$
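The conservation law is easy to confirm numerically. The win probabilities below are a hypothetical example, and the function names are assumptions of this sketch.

```python
import math

def entropy(p):
    """H(p) in bits."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def horse_race_rate(p, b):
    """W = sum_i p_i log2(b_i * m) for an m-horse race paying m-for-1."""
    m = len(p)
    return sum(pi * math.log2(bi * m) for pi, bi in zip(p, b) if pi > 0)

p = [0.5, 0.25, 0.25]  # hypothetical win probabilities
kelly = horse_race_rate(p, p)                      # proportional (Kelly) betting
uniform = horse_race_rate(p, [1 / 3, 1 / 3, 1 / 3])  # ignores the probabilities
print(kelly, uniform, kelly + entropy(p), math.log2(len(p)))
```

Betting uniformly just returns the stake in this race, while proportional betting achieves $W^* = \log m - H$.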
Comparisons
Information Theory | Investment
Entropy Rate: $H = -\sum p_i \log p_i$ | Doubling Rate: $W^* = \max_b E \log b^t X$
AEP: $p(X_1, X_2, \ldots, X_n) \doteq 2^{-nH}$ | $S^*(X_1, X_2, \ldots, X_n) \doteq 2^{nW^*}$
Universal Data Compression: $l^{**}(X_1, X_2, \ldots, X_n) \doteq nH$ | Universal Portfolio Selection: $S^{**}(X_1, X_2, \ldots, X_n) \doteq 2^{nW^*}$
$W^* + H \le \log m$
6. Competitive optimality
$X \sim F(x)$. Consider the two-person zero sum game:
Player 1: Portfolio $b_1$. Wealth $S_1 = W_1 b_1^t X$.
Player 2: Portfolio $b_2$. Wealth $S_2 = W_2 b_2^t X$.
Fair randomization: $E W_1 = E W_2 = 1$, $W_i \ge 0$.
Payoff: $\Pr\{S_1 \ge S_2\}$
$V = \max_{b_1, W_1} \min_{b_2, W_2} \Pr\{S_1 \ge S_2\}$
Theorem (R. Bell, T.C.): The value V of the game is 1/2. Optimal strategy for player 1 is $b_1 = b^*$, where $b^*$ is the log optimal portfolio, and $W_1 \sim \mathrm{unif}[0, 2]$.
Comment: b∗ is both long run and short run optimal.
7. Universal portfolio selection
Market sequence: $x_1, x_2, \ldots, x_n$
$S_n(b) = \prod_{i=1}^n b^t x_i$
$S_n^* = \max_b S_n(b) = \prod_{i=1}^n b^{*t} x_i$.
Investor: $b_i(x_1, x_2, \ldots, x_{i-1})$
$\hat{S}_n = \prod_{i=1}^n b_i^t x_i$
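To make the benchmark $S_n^*$ concrete, the sketch below finds the best constant rebalanced portfolio in hindsight for two assets by grid search. The alternating market sequence is a hypothetical example, and the function names and grid resolution are assumptions of this sketch.

```python
def crp_wealth(b2, market):
    """S_n(b) for the constant rebalanced portfolio b = (1 - b2, b2)."""
    wealth = 1.0
    for x1, x2 in market:
        wealth *= (1.0 - b2) * x1 + b2 * x2
    return wealth

def best_crp(market, steps=1000):
    """Grid search for the best constant rebalanced portfolio in hindsight."""
    grid = [k / steps for k in range(steps + 1)]
    b2 = max(grid, key=lambda b: crp_wealth(b, market))
    return b2, crp_wealth(b2, market)

# Hypothetical sequence: the second asset alternately doubles and halves.
market = [(1.0, 2.0), (1.0, 0.5)] * 10
b2_best, s_star = best_crp(market)
print(b2_best, s_star, crp_wealth(0.0, market), crp_wealth(1.0, market))
```

On this sequence either asset held alone ends where it started, while the best rebalanced portfolio multiplies wealth by $(9/8)^{10}$.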
Minimax regret universal portfolio
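Minimax regret for horizon n is defined as
$R_n^* = \min_{\hat{b}(\cdot)} \max_{x^n, b} \frac{\prod_{i=1}^n b^t x_i}{\prod_{i=1}^n \hat{b}_i^t(x^{i-1}) x_i} = \min_{\hat{b}(\cdot)} \max_{x^n} \frac{S_n^*}{\hat{S}_n}$
Theorem (Erik Ordentlich, T.C.):
$R_n^* = \frac{1}{V_n}$, where $\frac{1}{V_n} = \sum_{n_1 + \cdots + n_m = n} \binom{n}{n_1, \ldots, n_m} 2^{-nH(\frac{n_1}{n}, \ldots, \frac{n_m}{n})}$
Note: For m = 2 stocks,
$\frac{1}{V_n} = \sum_{k=0}^n \binom{n}{k} 2^{-nH(\frac{k}{n})}$, $V_n \sim \sqrt{\frac{2}{\pi n}}$, $\frac{1}{V_n} \le 2\sqrt{n+1}$
Corollary: For m = 2 stocks, there exists $\hat{b}_i(x^{i-1})$ such that
$\hat{S}_n \ge \frac{S_n^*}{2\sqrt{n+1}}$, for every sequence $x_1, \ldots, x_n$.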
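The two-stock sum is cheap to evaluate directly, using $2^{-nH(k/n)} = (k/n)^k ((n-k)/n)^{n-k}$. The sketch below assumes the convention that $V_n$ is the reciprocal of this sum, so that $V_n \sim \sqrt{2/(\pi n)}$; the function name `inverse_v` is illustrative.

```python
import math

def inverse_v(n):
    """sum_k C(n, k) (k/n)^k ((n-k)/n)^(n-k), i.e. sum_k C(n, k) 2^{-n H(k/n)}."""
    return sum(math.comb(n, k) * (k / n) ** k * ((n - k) / n) ** (n - k)
               for k in range(n + 1))

for n in (1, 2, 10, 100):
    v_n = 1 / inverse_v(n)
    # Columns: n, V_n, the asymptotic form sqrt(2/(pi n)), the bound 1/(2 sqrt(n+1)).
    print(n, v_n, math.sqrt(2 / (math.pi * n)), 1 / (2 * math.sqrt(n + 1)))
```

Python evaluates `0.0 ** 0` as 1.0, which matches the convention $0^0 = 1$ needed for the k = 0 and k = n terms.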
Achieving R∗n: Universal Portfolio for horizon n
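Portfolio $\hat{b}_i(X^{i-1})$: Invest
$b(j^n) = V_n \left(\frac{n_1(j^n)}{n}\right)^{n_1(j^n)} \left(\frac{n_2(j^n)}{n}\right)^{n_2(j^n)} \cdots \left(\frac{n_m(j^n)}{n}\right)^{n_m(j^n)}$
in “plunging” strategy $j^n$ and let it ride, where $j^n \in \{1, 2, \ldots, m\}^n$. Here $n_i(j^n)$ is the number of times stock i appears in $j^n$, and $V_n$ is the reciprocal of the sum of these products over all $j^n$, so the weights sum to 1.
Example: For horizon n = 2. For m = 2.
$X_1 = (X_{11}, X_{12})$
$\hat{b}_1 = (\frac{1}{2}, \frac{1}{2})$
$\hat{b}_2(X_1) = \left( \frac{\frac{4}{5} X_{11} + \frac{1}{5} X_{12}}{X_{11} + X_{12}}, \frac{\frac{1}{5} X_{11} + \frac{4}{5} X_{12}}{X_{11} + X_{12}} \right)$
$b(11) = 4/10$
$b(12) = 1/10$
$b(21) = 1/10$
$b(22) = 4/10$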
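The example can be reproduced by enumerating the plunging strategies directly. This is a minimal sketch in exact arithmetic: stocks are indexed 0 and 1 rather than 1 and 2, the function names are illustrative, and the first-day price relatives passed to `portfolio` are a hypothetical choice.

```python
from fractions import Fraction as F
from itertools import product

def plunging_weights(n, m=2):
    """Initial wealth fraction b(j^n) placed on each sequence j^n in {0..m-1}^n."""
    raw = {}
    for seq in product(range(m), repeat=n):
        w = F(1)
        for stock in range(m):
            count = seq.count(stock)
            w *= F(count, n) ** count  # (n_i / n)^(n_i), with 0^0 = 1
        raw[seq] = w
    total = sum(raw.values())  # normalizer, equal to 1/V_n
    return {seq: w / total for seq, w in raw.items()}

def portfolio(weights, past, m=2):
    """Fraction of current wealth in each stock on day len(past) + 1."""
    day = len(past)
    held = [F(0)] * m
    for seq, w in weights.items():
        for t, x in enumerate(past):
            w *= x[seq[t]]  # strategy seq held only stock seq[t] on day t + 1
        held[seq[day]] += w
    total = sum(held)
    return [h / total for h in held]

weights = plunging_weights(2)
print(weights)
print(portfolio(weights, []))
print(portfolio(weights, [(F(2), F(1))]))  # hypothetical first-day price relatives
```

With first-day price relatives (2, 1), the second-day portfolio from the formula above is $((\frac{4}{5} \cdot 2 + \frac{1}{5}) / 3, (\frac{1}{5} \cdot 2 + \frac{4}{5}) / 3) = (\frac{3}{5}, \frac{2}{5})$, which is what the enumeration returns.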
8. Accelerated Performance
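Stock $x \in \mathbb{R}^m_+$ requires $b \in \mathbb{R}^m_+$, so that $b^t x \ge 0$.
Let $\mathcal{X}(\alpha) = \{x \in \mathbb{R}^m : x_i \ge \alpha, \ \sum_{i=1}^m x_i = 1\}$
$\mathcal{B}(\alpha) = \{b \in \mathbb{R}^m : \sum_{i=1}^m b_i = 1, \ b^t x \ge 0, \ \forall x \in \mathcal{X}(\alpha)\}$
$\mathcal{B}(\alpha)$ is polar cone to $\mathcal{X}(\alpha)$: $\mathcal{B}(\alpha) = \mathcal{X}^{\perp}(\alpha)$.
$\mathcal{B}(\alpha)$ allows short selling and buying on margin.
Thus $x \in \mathcal{X}(\alpha)$, $b \in \mathcal{B}(\alpha)$ yields $S = b^t x \ge 0$.
Let $\Omega = \mathbb{R}^m_+$, $\mathcal{X}(\alpha) = A\Omega$, $\mathcal{B}(\alpha) = A^{-1}\Omega$.
$A = \begin{pmatrix} \alpha & 1 - \alpha \\ 1 - \alpha & \alpha \end{pmatrix}$, $A^{-1} = \frac{1}{2\alpha - 1} \begin{pmatrix} \alpha & -(1 - \alpha) \\ -(1 - \alpha) & \alpha \end{pmatrix}$
$b \in \Omega$, $X \in \Omega$. $\tilde{b} = A^{-1} b \in \mathcal{B}(\alpha)$, $\tilde{X} = A X \in \mathcal{X}(\alpha)$.
$\tilde{b}^t \tilde{X} = b^t (A^{-1})^t A X = b^t X$
[Figure: the sets $\mathcal{X}(\alpha)$ and $\mathcal{B}(\alpha)$, with the points $\alpha$ and $1 - \alpha$ marked.]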
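The change of variables can be verified with exact arithmetic for m = 2. In this sketch the value $\alpha = 1/4$ and the vectors b and x are hypothetical, and the helper names are assumptions made for illustration.

```python
from fractions import Fraction as F

def matrices(alpha):
    """Return A and A^{-1} for the 2x2 mixing matrix with parameter alpha != 1/2."""
    a = [[alpha, 1 - alpha], [1 - alpha, alpha]]
    det = 2 * alpha - 1
    a_inv = [[alpha / det, -(1 - alpha) / det],
             [-(1 - alpha) / det, alpha / det]]
    return a, a_inv

def apply(mat, v):
    """Matrix-vector product for a 2x2 matrix."""
    return [sum(mat[i][j] * v[j] for j in range(2)) for i in range(2)]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

alpha = F(1, 4)
a, a_inv = matrices(alpha)
b = [F(1), F(0)]        # long-only portfolio in Omega (hypothetical)
x = [F(1, 3), F(2, 3)]  # market vector in Omega with components summing to 1 (hypothetical)
b_tilde = apply(a_inv, b)  # portfolio in B(alpha); may contain a short position
x_tilde = apply(a, x)      # market vector in X(alpha)
print(b_tilde, x_tilde, dot(b_tilde, x_tilde), dot(b, x))
```

The transformed portfolio shorts the first stock, yet it earns exactly the same wealth relative on the transformed market as the long-only portfolio earns on the original one.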
Accelerated Performance
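Theorem (Acceleration (Erik Ordentlich, T.C., to appear))
m = 2 stocks. The short selling investor can come within factor $V_n(\alpha)$ of the best long-only investor given hindsight:
$\max_{\hat{b}_i(\cdot) \in \mathcal{B}(\alpha)} \ \min_{x^n \in \mathcal{X}^n(\alpha), \ b \in \mathcal{B}(0)} \frac{\prod_{i=1}^n \hat{b}_i^t x_i}{\prod_{i=1}^n b^t x_i} = V_n(\alpha)$,
$V_n(\alpha) = \left( \sum_{k=0}^n \binom{n}{k} \left[\frac{k}{n}\right]^k \left[\frac{n - k}{n}\right]^{n-k} \right)^{-1}$,
where $[x]$ = x rounded off to interval $[\alpha, 1 - \alpha]$.
Note: $V_n(\alpha)$ is increasing in $\alpha$. $V_n(0) \sim \sqrt{\frac{2}{\pi}} \frac{1}{\sqrt{n}}$. $V_n(\frac{1}{2}) = 1$.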
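The stated properties of $V_n(\alpha)$ can be checked numerically. The sketch below rests on two assumptions: the rounding interval is $[\alpha, 1 - \alpha]$, and $V_n(\alpha)$ is the reciprocal of the binomial sum, which is the reading consistent with $V_n(\frac{1}{2}) = 1$ and $V_n(0) \sim \sqrt{2/(\pi n)}$. The function name `v_alpha` is illustrative.

```python
import math

def v_alpha(n, alpha):
    """Reciprocal of sum_k C(n, k) [k/n]^k [(n-k)/n]^(n-k), with [x] clipped to [alpha, 1 - alpha]."""
    def clip(x):
        return min(max(x, alpha), 1 - alpha)
    total = sum(math.comb(n, k) * clip(k / n) ** k * clip((n - k) / n) ** (n - k)
                for k in range(n + 1))
    return 1 / total

for alpha in (0.0, 0.25, 0.45, 0.5):
    # V_n(alpha) should increase with alpha and reach 1 at alpha = 1/2.
    print(alpha, v_alpha(10, alpha))
```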
Accelerated Performance
[Figure: S&P 500 versus time (0 to 300), 28-Sep-07 till 14-Oct-08; vertical axis from 900 to 1600.]
Accelerated Performance
[Figure: wealth $S_n$ (0 to 2.5) versus portfolio b (−6 to 6), with $S_n^*$ and $S_n^{**}$ marked.]
9/28/07 – 10/14/08, n = 263.
$S_n^*$: Wealth of best long-only constant rebalanced portfolio in hindsight.
$S_n^{**}$: Wealth of best short selling and margin constant rebalanced portfolio in hindsight.
Accelerated Performance
[Figure: wealth $S_n$ (0 to 2.5) versus portfolio b (−6 to 6), with $S_n^*$, $S_n^{**}$ and $\hat{S}_n$ marked; $\alpha = 0.45$, $\hat{S}_n = 1.0475$.]
9/28/07 – 10/14/08, n = 263.
$S_n^*$: Wealth of best long-only constant rebalanced portfolio in hindsight.
$S_n^{**}$: Wealth of best short selling and margin constant rebalanced portfolio in hindsight.
$\hat{S}_n$: Wealth of universal portfolio.
Comparisons with Information Theory
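General Market | Horse Race Market
$X \sim F(x)$ | $X = m e_i$ with probability $p_i$
$b^*$: $E \frac{b_i^* X_i}{b^{*t} X} = b_i^*$ | $b_i = p_i$, Kelly gambling
$W^* = E \ln b^{*t} X$ | $W^* = \log m - H(p)$, H = entropy
Wrong distribution G(x):
$\Delta(F \| G) = \int \ln \frac{b_F^t x}{b_G^t x} \, dF(x)$ | $\Delta = \sum p_i \ln \frac{p_i}{g_i} = D(p \| g)$, relative entropy
Side information $(X, Y) \sim f(x, y)$:
$\Delta = \int \ln \frac{b_{f(x|y)}^t x}{b_{f(x)}^t x} f(x, y) \, dx \, dy$ | $\Delta = \sum p(x, y) \ln \frac{p(x, y)}{p(x) p(y)} = I(X; Y)$, mutual information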
Comparisons
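General Market | Horse Race Market
Asymptotic growth rate, $X_i$ stationary:
$W^* = \max_b E\{\ln b^t X_0 \mid X_{-\infty}^{-1}\}$ | $W^* = \log m - H(X_0 \mid X_{-\infty}^{-1}) = \log m - H(\mathcal{X})$, $H(\mathcal{X})$ = entropy rate
AEP for ergodic processes:
$\frac{1}{n} \log S_n^* \to W^*$, a.e. | $-\frac{1}{n} \log p(X^n) \to H(\mathcal{X})$, a.e.
$S_n^* \doteq 2^{nW^*}$ | $p(X^n) \doteq 2^{-nH}$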
Comparisons
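Universal portfolio (individual sequences):
General Market | Horse Race Market
$x_1, x_2, \ldots, x_n \in \mathbb{R}^m_+$ | $x_1, x_2, \ldots, x_n \in \{e_1, \ldots, e_m\}$
$S_n(b, x^n) = \prod_{i=1}^n b^t x_i$ | $S_n(b, x^n) = \prod_{i=1}^m b_i^{n_i(x^n)}$
$S_n(\hat{b}, x^n) = \prod_{i=1}^n \hat{b}^t(x^{i-1}) x_i$ | $S_n(\hat{b}, x^n) = \hat{b}(x^n)$
$V_n$ | $V_n$
Same cost of universality for both.
$\frac{1}{V_n} = \min_{\hat{b}(\cdot)} \max_{b, x^n} \frac{S_n(b, x^n)}{S_n(\hat{b}, x^n)} = \sum_{n_1 + \cdots + n_m = n} \binom{n}{n_1, \ldots, n_m} 2^{-nH(\frac{n_1}{n}, \ldots, \frac{n_m}{n})}$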
Concluding remarks
Growth optimal portfolios have many properties:
Long run optimality
Martingale property
Competitive optimality
Asymptotic equipartition property
Universal achievability
Black-Scholes
Amplification
Relationship with information theory
References
Algoet, Barron, Bell, Borodin, Cover, Erkip, Gluss, Gyorfi, Hakansson, Iyengar, Jamshidian, Lugosi, Mathis, Merton, Ordentlich, Platen, Samuelson, Shannon, Thorp, Vajda, Warmuth, Ziemba, Markowitz, Sharpe, Duffie
References
R. Bell and T. Cover, “Game-Theoretic Optimal Portfolios,” Management Science, 34(6):724-733, June 1988.
T. Cover, “Universal Portfolios,” Mathematical Finance, 1(1):1-29, January 1991.
T. Cover and E. Ordentlich, “Universal Portfolios with Side Information,” IEEE Transactions on Information Theory, 42(2):348-363, March 1996.
E. Ordentlich and T. Cover, “The Cost of Achieving the Best Portfolio in Hindsight,” Mathematics of Operations Research, 23(4):960-982, November 1998.