André Seznec Caps Team
IRISA/INRIA
Looking for limits in branch prediction with the GTL predictor
André Seznec
IRISA/INRIA/HIPEAC
Motivations
Geometric history length predictors, introduced in 2004-2006:
  O-GEHL, CBP-1, Dec. 2004
  TAGE, JILP ’06, Feb. 2006
• Storage effective
• Exploit very long global histories
• Were defined with possible implementation in mind
What are the limits of accuracy that can be captured with these schemes?
How do they compare with unconstrained prediction schemes?
Geometric history length predictors: global history + multiple lengths
[Figure: tables T0 to T4 indexed with global history lengths L(0) to L(4); how their outputs are combined is left open ("?").]
GEometric History Length predictor
$L(0) = 0, \qquad L(i) = \alpha^{i-1} L(1)$
The set of history lengths forms a geometric series, e.g. {0, 2, 4, 8, 16, 32, 64, 128}
What is important: L(i)-L(i-1) increases drastically with i
• Most of the storage goes to short histories!
• Correlation on very long histories is still captured
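The geometric series of history lengths can be reproduced with a few lines of Python. This is a minimal sketch: the function name and the rounding rule are assumptions made for illustration, and the parameters L(1) = 2 and α = 2 are chosen only because they give back the example set from the slide.

```python
def geometric_history_lengths(n_tables, l1, alpha):
    """Return [L(0), ..., L(n_tables - 1)] with L(0) = 0 and L(i) = alpha**(i-1) * L(1)."""
    lengths = [0]
    for i in range(1, n_tables):
        # Rounding to the nearest integer is an assumption of this sketch.
        lengths.append(int(alpha ** (i - 1) * l1 + 0.5))
    return lengths


lengths = geometric_history_lengths(8, l1=2, alpha=2.0)
gaps = [b - a for a, b in zip(lengths, lengths[1:])]
print(lengths)  # the example set from the slide
print(gaps)     # L(i) - L(i-1): the gap grows with i
```

Most tables end up with short histories, while the last few lengths reach far back.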
Combining multiple predictions
• Neural-inspired predictors: use a (multiply-)add tree (O-GEHL, CBP-1)
• Partial matching: use tagged tables and the longest matching history (TAGE, JILP ’06)
CBP-1 (2004): O-GEHL
[Figure: tables T0 to T4 indexed with history lengths L(0) to L(4); their outputs feed an adder (∑).]
Final computation through a sum
Prediction = Sign
256 Kbits: 12 components, 3.670 misp/KI
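The "sum then sign" computation can be sketched as follows. This is a toy illustration, not the O-GEHL submitted to CBP-1: the class name, the index hash, the counter width, the table size and the fixed update threshold are all assumptions.

```python
class SumPredictor:
    """Toy GEHL-style predictor: one table of signed counters per history length."""

    def __init__(self, history_lengths, log_entries=10, counter_bits=4):
        self.lengths = list(history_lengths)
        self.log_entries = log_entries
        self.mask = (1 << log_entries) - 1
        self.cmax = (1 << (counter_bits - 1)) - 1
        self.cmin = -(1 << (counter_bits - 1))
        self.tables = [[0] * (1 << log_entries) for _ in self.lengths]
        self.threshold = len(self.lengths)           # assumed update threshold
        self.history = 0                             # newest outcome in bit 0
        self.hist_mask = (1 << max(self.lengths)) - 1

    def _indices(self, pc):
        out = []
        for length in self.lengths:
            h = self.history & ((1 << length) - 1)   # keep the L(i) newest bits
            folded = 0
            while h:                                 # XOR-fold down to log_entries bits
                folded ^= h & self.mask
                h >>= self.log_entries
            out.append((pc ^ folded) & self.mask)    # illustrative hash only
        return out

    def predict(self, pc):
        s = sum(t[i] for t, i in zip(self.tables, self._indices(pc)))
        return s >= 0, s                             # prediction = sign of the sum

    def update(self, pc, taken):
        pred, s = self.predict(pc)
        # Train on a misprediction or when the sum is not confident enough.
        if pred != taken or abs(s) < self.threshold:
            for t, i in zip(self.tables, self._indices(pc)):
                t[i] = min(t[i] + 1, self.cmax) if taken else max(t[i] - 1, self.cmin)
        self.history = ((self.history << 1) | int(taken)) & self.hist_mask
        return pred


p = SumPredictor([0, 2, 4, 8])
outcomes = [i % 2 == 0 for i in range(100)]          # alternating taken / not taken
correct = [p.update(0x40, t) == t for t in outcomes]
print(sum(correct), "correct out of", len(correct))
```

The table indexed with history length 0 cannot separate the two cases of the alternating branch; the tables indexed with longer histories can, and they outweigh it in the sum.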
JILP ’06: TAGE, longest matching history
[Figure: TAGE block diagram; each tagged table has a tag comparison (=?).]
256 Kbits: 3.358 misp/KI
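Selection by longest matching history can be sketched as below. This is a simplified illustration rather than the published TAGE: useful bits, aging and the real allocation policy are left out, and the class name, hash functions, table sizes and counter ranges are assumptions.

```python
class TinyTage:
    """Toy TAGE-style lookup: the longest matching tagged table provides the prediction."""

    def __init__(self, history_lengths, log_entries=8, tag_bits=9):
        self.lengths = sorted(history_lengths)        # shortest to longest
        self.idx_mask = (1 << log_entries) - 1
        self.tag_mask = (1 << tag_bits) - 1
        self.base = {}                                # pc -> signed counter in [-2, 1]
        self.tables = [dict() for _ in self.lengths]  # index -> [tag, counter]
        self.history = 0                              # newest outcome in bit 0
        self.hist_mask = (1 << self.lengths[-1]) - 1

    def _slot(self, pc, t):
        h = self.history & ((1 << self.lengths[t]) - 1)
        index = (pc ^ h ^ (h >> 3)) & self.idx_mask   # illustrative hashes only
        tag = (pc ^ (h * 5)) & self.tag_mask
        return index, tag

    def predict(self, pc):
        # Search from the longest history down; the first tag hit is the provider.
        for t in reversed(range(len(self.tables))):
            index, tag = self._slot(pc, t)
            entry = self.tables[t].get(index)
            if entry is not None and entry[0] == tag:
                return entry[1] >= 0, t
        return self.base.get(pc, 0) >= 0, -1          # -1: base predictor provides

    def update(self, pc, taken):
        pred, provider = self.predict(pc)
        step = 1 if taken else -1
        if provider >= 0:
            index, _ = self._slot(pc, provider)
            entry = self.tables[provider][index]
            entry[1] = max(-2, min(1, entry[1] + step))
        else:
            self.base[pc] = max(-2, min(1, self.base.get(pc, 0) + step))
        # On a misprediction, allocate in the next longer table (simplified policy).
        if pred != taken and provider + 1 < len(self.tables):
            index, tag = self._slot(pc, provider + 1)
            self.tables[provider + 1][index] = [tag, 0 if taken else -1]
        self.history = ((self.history << 1) | int(taken)) & self.hist_mask
        return pred


tage = TinyTage([2, 4, 8])
correct = [tage.update(0x40, i % 2 == 0) == (i % 2 == 0) for i in range(41)]
print(sum(correct), "correct out of", len(correct))
print(tage.predict(0x40))  # (prediction, index of the providing table)
```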
What is global history?
• Conditional branch history: path confusion on short histories
• Path history: direct hashing leads to path confusion
1. Represent all branches in branch history
2. Use path AND direction history
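One possible way to combine the two points is sketched below. The encoding is an assumption made for illustration (two bits per branch: one address bit and the direction); the slides do not specify how path and direction are merged. The function is meant to be called for every branch, with unconditional branches recorded as taken.

```python
def push_branch(history, pc, taken, max_len=64):
    """Shift one branch into a combined path/direction history (2 bits per branch)."""
    path_bit = (pc >> 2) & 1  # assumption: a single low-order address bit
    history = (history << 2) | (path_bit << 1) | int(taken)
    return history & ((1 << max_len) - 1)


# Two paths with identical directions (taken, taken) but different branch addresses.
path_a = [(0x1000, True), (0x1008, True)]
path_b = [(0x1004, True), (0x1008, True)]

hist_a = hist_b = 0
for pc, taken in path_a:
    hist_a = push_branch(hist_a, pc, taken)
for pc, taken in path_b:
    hist_b = push_branch(hist_b, pc, taken)

print(bin(hist_a), bin(hist_b))  # a direction-only history could not tell them apart
```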
Using a kernel history and a user history
Traces mix user and kernel activities: kernel activity after an exception
• Global history pollution
Solution: use two separate global histories
• User history is updated only in user mode
• Kernel history is updated in both modes
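The two update rules can be sketched as follows. The class name is hypothetical, and the choice of which history feeds the predictor in each mode (`current`) is an assumption; the slide only states the update rules.

```python
class DualHistory:
    """Two global histories: one for user mode, one for kernel mode."""

    def __init__(self, max_len=64):
        self.mask = (1 << max_len) - 1
        self.user = 0
        self.kernel = 0

    def update(self, taken, kernel_mode):
        bit = int(taken)
        # Kernel history is updated in both modes.
        self.kernel = ((self.kernel << 1) | bit) & self.mask
        # User history is updated only in user mode, so kernel activity
        # after an exception does not pollute it.
        if not kernel_mode:
            self.user = ((self.user << 1) | bit) & self.mask

    def current(self, kernel_mode):
        # Assumption: each mode predicts with its own history.
        return self.kernel if kernel_mode else self.user


h = DualHistory()
h.update(True, kernel_mode=False)
h.update(False, kernel_mode=True)   # exception handler runs
h.update(True, kernel_mode=True)
h.update(True, kernel_mode=False)   # back in user code
print(bin(h.user), bin(h.kernel))
```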
Accuracy limits for TAGE
Varying the predictor size, the number of components, the tag width, the history length.
Allowing multiple allocations
The best accuracy on distributed traces: 3.054 misp/KI
• History length around 1,000
• 15-20 components
• No need for tags wider than 16 bits
Accuracy limits for GEHL
Varying the predictor size, the number of components, the history length, the counter width
(Slightly) improving the update policy and fitting in the two-hour simulation rule
On the distributed traces: 2.842 misp/KI
• 97 components
• 8-bit counters
• 2,000 bits of global history
GEHL vs TAGE
Realistic implementation parameters (storage budget, number of components): TAGE is more accurate than (O-)GEHL
Unlimited budget, huge number of components: GEHL is more accurate than TAGE
Will it be sufficient to win the Championship?
GEHL history length: 2,000
97 components
2.842 misp/KI
A step further: hybrid GEHL-TAGE
On a few benchmarks, TAGE is more accurate than GEHL.
Let us try a hybrid GEHL-TAGE predictor
Hybrid GEHL-TAGE
[Figure: branch/path history + PC feed GEHL, TAGE and Meta (= egskew); a mux selects the prediction.]
Inherit from: Agree/bimode, YAGS, 2bcgskew,
GEHL+TAGE
GEHL provides the main prediction; it is also used as the base predictor for TAGE (YAGS inspired)
TAGE records when GEHL fails: {prediction, address, history} (agree/bimode, YAGS inspired)
Meta selects between GEHL and TAGE (2bcgskew inspired)
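The role of the meta-predictor can be sketched with a much simpler stand-in. The slides use an egskew predictor as Meta; the sketch below swaps in a single table of 2-bit counters indexed by PC, purely for illustration. The class name, table size and initial bias toward GEHL are assumptions.

```python
class MetaChooser:
    """Stand-in meta-predictor: a 2-bit counter per entry picks GEHL or TAGE."""

    def __init__(self, log_entries=10):
        self.mask = (1 << log_entries) - 1
        # Counter >= 0: use GEHL (the main prediction); counter < 0: use TAGE.
        self.ctr = [0] * (1 << log_entries)

    def select(self, pc, gehl_pred, tage_pred):
        return gehl_pred if self.ctr[pc & self.mask] >= 0 else tage_pred

    def update(self, pc, gehl_pred, tage_pred, taken):
        if gehl_pred == tage_pred:
            return  # nothing to learn when both components agree
        i = pc & self.mask
        if gehl_pred == taken:
            self.ctr[i] = min(self.ctr[i] + 1, 1)
        else:
            self.ctr[i] = max(self.ctr[i] - 1, -2)


meta = MetaChooser()
first = meta.select(0x40, gehl_pred=True, tage_pred=False)     # GEHL is trusted first
meta.update(0x40, gehl_pred=True, tage_pred=False, taken=False)
second = meta.select(0x40, gehl_pred=True, tage_pred=False)    # now TAGE is trusted
print(first, second)
```

Training only on disagreements follows the usual chooser policy of hybrid predictors; whether GTL's Meta does exactly this is not stated in the slides.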
Let us have fun !!
GEHL history length: 400
TAGE history length: 100,000
2.774 misp/KI
Might still be insufficient
Adding a loop predictor
The loop predictor captures the number of iterations of a loop.
When it encounters the same number of iterations 8 times in succession, the loop predictor provides the prediction.
Advantage: very reliable
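The confidence rule can be sketched as below. This is a minimal illustration with assumed details: the class name is hypothetical, entries are kept in an unbounded dictionary, and the loop branch is assumed to be taken on every iteration and not taken on exit.

```python
class LoopPredictor:
    """Toy loop predictor: predicts only after 8 successive identical trip counts."""

    CONFIDENT = 8

    def __init__(self):
        self.entries = {}  # pc -> [trip count, current iteration, confidence]

    def predict(self, pc):
        e = self.entries.get(pc)
        if e is None or e[2] < self.CONFIDENT:
            return None            # not confident: let the other predictors decide
        trip, current, _ = e
        return current < trip      # taken until the learned trip count is reached

    def update(self, pc, taken):
        e = self.entries.setdefault(pc, [0, 0, 0])
        if taken:
            e[1] += 1              # one more iteration of the current execution
            return
        # Loop exit: compare this execution's iteration count with the learned one.
        if e[2] > 0 and e[1] == e[0]:
            e[2] = min(e[2] + 1, self.CONFIDENT)
        else:
            e[0], e[2] = e[1], 1   # new trip count, seen once so far
        e[1] = 0


lp = LoopPredictor()
body = [True] * 5 + [False]        # 5 iterations, then the exit
for _ in range(8):
    for outcome in body:
        lp.update(0x80, outcome)
predictions = []
for outcome in body:
    predictions.append(lp.predict(0x80))
    lp.update(0x80, outcome)
print(predictions)
```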
GTL predictor
[Figure: GTL block diagram. Branch/path history + PC feed GEHL, TAGE, Meta (= egskew) and the loop predictor; two muxes and a confidence signal select the final prediction.]
+ static prediction on first occurrence
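The final selection suggested by the block diagram can be sketched as one function. The priority order, the argument names and the direction of the static prediction are assumptions; the slides only name the blocks.

```python
def gtl_select(gehl_pred, tage_pred, meta_uses_tage, loop_pred,
               first_occurrence=False, static_pred=True):
    """Pick the final prediction from the component predictions.

    loop_pred is None when the loop predictor is not confident.
    static_pred is a hypothetical default direction for a branch never seen before.
    """
    if first_occurrence:
        return static_pred
    if loop_pred is not None:      # second mux: a confident loop predictor wins
        return loop_pred
    return tage_pred if meta_uses_tage else gehl_pred  # first mux: Meta's choice


print(gtl_select(True, False, meta_uses_tage=False, loop_pred=None))
print(gtl_select(True, False, meta_uses_tage=True, loop_pred=None))
print(gtl_select(True, True, meta_uses_tage=False, loop_pred=False))
```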
Hope this will be sufficient to win the Championship!!
GTL
GEHL, 97 comp., 400 hist. + TAGE, 19 comp., 100,000 hist
+ loop predictor
2.717 misp/KI
Geometric History Length predictors and limits on branch prediction
• Unlimited budget, huge number of components: GEHL is more accurate than TAGE
• Very old correlation can be captured: on two benchmarks, using a history length of 10,000 really helps
• There does not seem to be much potential extra benefit from local history
• We did not find any interesting extra scheme apart from loop prediction
• Loop prediction: very marginal apart from gzip
The End