Optimized Hybrid Scaled Neural Analog Predictor
Daniel A. Jiménez
Department of Computer Science, The University of Texas at San Antonio
Branch Prediction with Perceptrons
Branch Prediction with Perceptrons cont.
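The two perceptron slides above rely on figures that did not survive the transcript. As a minimal sketch of the underlying idea from the cited perceptron predictor [Jiménez & Lin 2001] — table size, indexing, and names here are illustrative assumptions, not the paper's exact design:

```python
# Hedged sketch of basic perceptron branch prediction: the prediction is
# the sign of a dot product of a per-branch weight vector with the global
# history, with history bits mapped to +1/-1. Table size n is assumed.
def perceptron_predict(pc, weights, history, n=1024):
    w = weights[pc % n]                # select this branch's weight row
    out = w[0]                         # bias weight
    for i, bit in enumerate(history, start=1):
        out += w[i] * (1 if bit else -1)
    return out >= 0, out               # predict taken iff non-negative
```

Training (shown later for the SNP variant) nudges each weight toward the observed correlation between its history bit and the branch outcome.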
SNP/SNAP [St. Amant et al. 2008]
A version of piecewise linear neural prediction [Jiménez 2005]
Based on perceptron prediction
SNAP is a mixed digital/analog version of SNP
Uses an analog circuit for the costly dot-product operation
Enables interesting tricks, e.g. weight scaling
Weight Scaling
Scaling weights by coefficients
Different history positions have different importance!
The Algorithm: Parameters and Variables
C – array of scaling coefficients
h – the global history length
H – a global history shift register
A – a global array of previous branch addresses
W – an n × (h + 1) array of small integer weights (h = global history length, GHL)
θ – a threshold to decide when to train
The Algorithm: Making a Prediction
Weights are selected based on the current branch and the ith most recent branch
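Using the parameters C, H, A, and W defined above, the prediction step can be sketched as follows. The XOR-then-modulo indexing is an assumption for illustration, not the paper's exact hash:

```python
# Hedged sketch of SNP-style scaled prediction. The weight row for the
# i-th position is chosen by hashing the current branch PC with the i-th
# most recent branch address A[i-1] (assumed hash: XOR, then modulo n).
def predict(pc, W, C, H, A, h, n):
    output = W[pc % n][0]                  # bias weight, chosen by this PC
    for i in range(1, h + 1):
        row = (pc ^ A[i - 1]) % n          # weight row for position i
        x = 1 if H[i - 1] else -1          # i-th history bit as +1/-1
        output += C[i] * W[row][i] * x     # scale by coefficient C[i]
    return output >= 0, output             # predict taken iff non-negative
```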
The Algorithm: Training
If the prediction is wrong or |output| ≤ θ then
For the ith correlating weight used to predict this branch:
Increment it if the branch outcome equals the outcome of the ith branch in the history
Decrement it otherwise
Increment the bias weight if branch is taken
Decrement otherwise
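The training rule above can be sketched directly; the saturating weight bounds (wmin, wmax) are an assumption for illustration, matching the "small integers" in W:

```python
# Hedged sketch of the training rule: update only on a misprediction or
# when the output magnitude is at or below the threshold theta. Indexing
# matches the prediction sketch (assumed XOR-then-modulo hash).
def train(pc, W, H, A, h, n, taken, output, theta, wmax=127, wmin=-128):
    predicted_taken = output >= 0
    if predicted_taken != taken or abs(output) <= theta:
        row = pc % n
        # Bias weight: increment if taken, decrement otherwise.
        if taken:
            W[row][0] = min(wmax, W[row][0] + 1)
        else:
            W[row][0] = max(wmin, W[row][0] - 1)
        for i in range(1, h + 1):
            r = (pc ^ A[i - 1]) % n
            # Increment if the outcome agrees with the i-th history bit.
            if H[i - 1] == taken:
                W[r][i] = min(wmax, W[r][i] + 1)
            else:
                W[r][i] = max(wmin, W[r][i] - 1)
```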
SNP/SNAP Datapath
Tricks
Use alloyed [Skadron 2000] global and per-branch history
Separate table of local perceptrons
Output from this stage is multiplied by an empirically determined coefficient
Training coefficients vector(s)
Multiple vectors, each initialized to f(i) = 1 / (A + B × i)
Minimum coefficient value determined empirically
Indexed by branch PC
Each vector trained on-line with perceptron-like learning
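A sketch of the coefficient-vector idea above. The constants A, B, the floor C_MIN, and the learning rate are illustrative assumptions, not the tuned values; the update rule shown is an assumed perceptron-like form:

```python
# Coefficient i scales the weight for the i-th most recent branch;
# recent history matters more, so f(i) = 1/(A + B*i) decays with i.
A_PARAM, B_PARAM, C_MIN = 1.0, 0.5, 0.05   # assumed, not the paper's values

def init_coefficients(h):
    return [max(C_MIN, 1.0 / (A_PARAM + B_PARAM * i)) for i in range(h + 1)]

def train_coefficients(C, agreed, lr=0.001):
    """Assumed on-line update: grow a coefficient whose weight agreed
    with the branch outcome, shrink it otherwise, clamped at C_MIN."""
    for i, ok in enumerate(agreed):
        C[i] = max(C_MIN, C[i] + (lr if ok else -lr))
```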
Tricks(2)
Branch cache
Highly associative cache with entries for branch information
Each entry contains:
A partial tag for this branch PC
The bias weight for this branch
An “ever taken” bit
A “never taken” bit
The “ever/never” bits avoid needless use of weight resources
The bias weight is protected from destructive interference
LRU replacement; >99% hit rate
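The entry layout above can be sketched as a record plus a partial-tag lookup; the tag width and lookup shape are assumptions for illustration:

```python
from dataclasses import dataclass

# Hedged sketch of a branch-cache entry as listed above; field widths
# and the 8-bit partial tag are assumptions.
@dataclass
class BranchCacheEntry:
    partial_tag: int       # low-order bits of the branch PC
    bias_weight: int       # kept here, protected from aliasing in W
    ever_taken: bool       # set once the branch is observed taken
    never_taken: bool      # set while the branch has never been taken

def lookup(cache, pc, tag_bits=8):
    """Associative lookup by partial tag (assumed tag width)."""
    tag = pc & ((1 << tag_bits) - 1)
    for e in cache:
        if e.partial_tag == tag:
            return e
    return None
```

The ever/never bits let the predictor answer for branches that have only ever gone one way without allocating correlating weights.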
Tricks(3)
Hybrid predictor
When the perceptron output is below some threshold:
If a 2-bit-counter gshare predictor has high confidence, use it
Else use a 1-bit-counter PAs predictor
Multiple θs indexed by branch PC
Each trained adaptively [Seznec 2005]
Ragged array
Not all rows of the matrix are the same size
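The hybrid fallback described above can be sketched as follows; the confidence test on the 2-bit counter (saturated states only) is an assumption, not the paper's exact rule:

```python
# Hedged sketch of the hybrid selection: trust the perceptron when its
# output magnitude exceeds theta; otherwise fall back to gshare if its
# 2-bit counter is confident, and finally to the 1-bit PAs predictor.
def hybrid_predict(perceptron_out, theta, gshare_ctr, pas_bit):
    if abs(perceptron_out) > theta:
        return perceptron_out >= 0
    # 2-bit counter: 0,1 -> not taken; 2,3 -> taken.
    # Treat saturated states (0 or 3) as high confidence (assumption).
    if gshare_ctr in (0, 3):
        return gshare_ctr >= 2
    return bool(pas_bit)               # 1-bit per-address (PAs) predictor
```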
Benefit of Tricks
Graph shows effect of one trick in isolation
Training coefficients yields most benefit
References
Jiménez & Lin, HPCA 2001 (perceptron predictor)
Jiménez & Lin, TOCS 2002 (global/local perceptron)
Jiménez ISCA 2005 (piecewise linear branch predictor)
Skadron, Martonosi & Clark, PACT 2000 (alloyed history)
Seznec 2005 (adaptively trained threshold)
St. Amant, Jiménez & Burger, MICRO 2008 (SNP/SNAP)
McFarling 1993 (gshare)
Yeh & Patt 1991 (PAs)
The End