29 september 2005 dynamic voting schemes to enhance evolutionary repair in reconfigurable logic...
TRANSCRIPT
29 September 2005
Dynamic Voting Schemes to Enhance Evolutionary Repair in Reconfigurable Logic Devices
C. Milliord, C. A. Sharma, and R. F. DeMara
University of Central Florida
Technical Objective: Autonomous FPGA Regeneration
Redundancy vs. Regeneration
• Overhead from Unutilized Spares (weight, size, power)
  - Redundancy: increases with amount of spare capacity
  - Regeneration: weakly-related to number recovery capacity
• Granularity of Fault Coverage (resolution where fault handled)
  - Redundancy: restricted at design-time
  - Regeneration: variable at recovery-time
• Fault-Resolution Latency (availability via downtime required to handle fault)
  - Redundancy: based on time required to select spare resource
  - Regeneration: based on time required to find suitable recovery
• Quality of Repair (likelihood and completeness)
  - Redundancy: determined by adequacy of spares available (?)
  - Regeneration: affected by multiple characteristics (+ or -)
• Autonomous Operation (recover without outside intervention)
  - Redundancy: yes
  - Regeneration: yes
• Increased availability without pre-configured spares …
  - Everyday example: spare tire vs. can of fix-a-flat
• NASA Moon, Mars, and Beyond: realize 10's of years of service life ???
• Reconfiguration allows new fault-handling paradigm
Problem Statement
• FPGAs in Space: harsh conditions lead to faults in hardware
  - Radiation
  - Extreme temperatures
  - Mechanical stress
  - Long mission duration
• Experiment with several combinations of GAs and voting schemes
  - Population of FPGA configurations that are physically distinct, but functionally equivalent
  - Voting involves 3 or more configurations, with a majority output
• Hypothesis
  - The added space and computation associated with a voting scheme are justified by a quicker and more complete repair
EHW Environments
• Evolvable Hardware (EHW) Environments enable experimental methods to research soft computing intelligent search techniques
• EHW operates by repetitive reprogramming of real-world physical devices using an iterative refinement process:
[Figure: Two modes of Evolvable Hardware]
• Extrinsic Evolution: genetic algorithm with a software model (simulation in the loop); device “design-time” refinement; when done, build it
• Intrinsic Evolution: genetic algorithm with hardware in the loop; device “run-time” refinement; new approach to Autonomous Repair of failed devices
Application
• Stardust Satellite
  - >100 FPGAs onboard
  - Hostile environment: radiation, thermal stress
  - How to achieve reliability to avoid mission failure???
Genetic Algorithms (GAs)
[Figure: GA cycle with the labels start, population of candidate solutions, evaluate fitness of individuals (fitness function), goal reached, selection of parents, parents, crossover, mutation, offspring, replacement]
• Initial population of configurations
  - Functionally equivalent, physically distinct
• Fitness level
  - Based on number of correct outputs for all possible inputs
• Creating a new generation
  - Mutation: “100011101” -> “101011101”
  - Crossover: “101100” & “011110” -> “101110”
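The two operators above can be reproduced in a few lines of Python. This is a minimal sketch, not the authors' C++ simulator: the names `flip_bit`, `mutate`, and `single_point_crossover` are hypothetical, single-point crossover is an assumption (the slide does not name the crossover type), and the flipped position (index 2) and cut point (3) are chosen only because they reproduce the slide's examples.

```python
import random


def flip_bit(bits: str, index: int) -> str:
    """Return a copy of `bits` with the bit at `index` inverted."""
    flipped = "1" if bits[index] == "0" else "0"
    return bits[:index] + flipped + bits[index + 1:]


def mutate(bits: str, rate: float, rng: random.Random) -> str:
    """Invert each bit independently with probability `rate` (assumed scheme)."""
    return "".join(
        ("1" if b == "0" else "0") if rng.random() < rate else b
        for b in bits
    )


def single_point_crossover(a: str, b: str, point: int) -> str:
    """Take the first `point` bits from parent `a` and the rest from parent `b`."""
    return a[:point] + b[point:]


# Reproduce the examples from the slide.
print(flip_bit("100011101", 2))                       # 101011101
print(single_point_crossover("101100", "011110", 3))  # 101110
```

With `mutate`, the mutation rate plays the role of the per-bit flip probability; a cut point of 2 would reproduce the crossover example equally well.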
Previous Work
• [1] Re-routing scheme replaces faulty CLB
  - Time-saving method with low overhead
• [2] TMR fault-detection
  - On-line approach
  - High overhead and power consumption
• [3] On-line technique using a BIST
  - Limited power consumption
  - Spare resources
• [4] GA repair of integer multiplier
  - Voting system may not always outperform individual with the highest fitness
  - Initialized GA with copies of one hand-designed configuration
[1] Xu, J., Si, P., Huang, W., and Lombardi, F., “A novel fault tolerant approach for SRAM-based FPGAs”, Proceedings of the Pacific Rim Int’l Symposium, Dec. 1999, pp. 40-44.
[2] Li, Y., Li, D., and Wang, Z., “A new approach to detect-mitigate-correct radiation-induced faults for SRAM-based FPGAs in aerospace application”, Proceedings of the IEEE National Aerospace and Electronics Conference, Oct. 2000, pp. 588-594.
[3] Abramovici, M., Emmert, J., and Stroud, C., “Roving STARs: an integrated approach to on-line testing, diagnosis, and fault tolerance for FPGAs in adaptive computing systems”, Proceedings of The Third NASA DoD Workshop, July 2001, pp. 73-92.
[4] Vigander, S., “Evolutionary fault repair of electronics in space applications”, Dissertation, University of Sussex, Brighton, UK, 2001.
Experimental Setups
• C++ program that simulates FPGA circuit design/repair
  - Input files
    - GA parameters
    - Logic function truth table (input/output pairs)
    - FPGA parameters
    - Configuration properties of perfect individuals (simulate repair in voting experiments)
  - Output files
    - Configuration properties at selected generations
    - Data showing fitness level at each generation (produce graphs)
[Figure: Avnet FPGA Development Board with PCI interface, Virtex-II Pro FPGA, and off-chip RAM; control hosted on PC; signals labeled bit file, input data, and FPGA output]
• Loosely Coupled (LC) Virtex System
  - PC workstation running Xilinx EDK and ISE with AVNET V2Pro PCI card
  - (SoC) version using PowerPC embedded in FPGA fabric now operational … results reported on previous environment
Experimental Inputs
• GA parameters
  - Population size
  - Offspring population size
  - Mutation rate
  - Tournament size (2)
  - Maximum number of generations
• FPGA parameters
  - Number of inputs (6)
  - Number of outputs (6)
  - Number of CLBs
  - Number of look-up tables (LUTs) per CLB (SW only)
  - Number of LUT select lines (SW only)
• Truth table (Ideal Fitness = 60)
I1 I2 I3 I4 I5 I6 O1 O2 O3 O4 O5 O6
0  0  0  0  0  0  0  0  0  0  0  0
0  1  0  0  0  0  0  0  0  0  0  0
0  1  0  0  0  1  0  0  0  0  1  0
0  1  0  0  1  0  0  0  0  1  0  0
0  1  0  0  1  1  0  0  0  1  1  0
0  1  0  1  0  0  0  0  1  0  0  0
0  1  0  1  0  1  0  0  1  0  1  0
0  1  0  1  1  0  0  0  1  1  0  0
0  1  0  1  1  1  0  0  1  1  1  0
0  1  1  0  0  0  0  0  0  0  0  0
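One way to read "Ideal Fitness = 60" is one point per correct output bit over the ten rows shown (10 rows x 6 outputs). The sketch below scores a candidate circuit under that assumption. The `fitness` function and both stand-in circuits are hypothetical; the listed rows happen to be consistent with a 3-bit by 3-bit multiplier, which is used here only as a convenient perfect circuit, not as a statement of the authors' target function.

```python
# Rows copied from the truth table above: (I1..I6, O1..O6).
TRUTH_TABLE = [
    ("000000", "000000"),
    ("010000", "000000"),
    ("010001", "000010"),
    ("010010", "000100"),
    ("010011", "000110"),
    ("010100", "001000"),
    ("010101", "001010"),
    ("010110", "001100"),
    ("010111", "001110"),
    ("011000", "000000"),
]


def fitness(circuit, table=TRUTH_TABLE):
    """Count output bits that match the expected outputs over all rows."""
    score = 0
    for inputs, expected in table:
        actual = circuit(inputs)
        score += sum(a == e for a, e in zip(actual, expected))
    return score


def multiplier_3x3(inputs):
    """Stand-in perfect circuit (assumption): I1..I3 times I4..I6, 6-bit result."""
    a = int(inputs[:3], 2)
    b = int(inputs[3:], 2)
    return format(a * b, "06b")


def all_zero(inputs):
    """Stand-in broken circuit: every output stuck at 0."""
    return "000000"


print(fitness(multiplier_3x3))  # 60
print(fitness(all_zero))        # 48
```

The all-zero circuit still scores 48 of 60 because most expected output bits are 0, so fitness values in the high 40s and 50s can hide substantial functional damage.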
Experiment #1
• Circuit evolution, no repair
• Maximize GA performance before voting (tweak parameters)
  - Used 200 for max number of generations
  - Varied the mutation rate from .001 to .097 with a step of .004
  - Population sizes of 15, 40, and 50
  - 6, 9, 12, 16, and 36 for number of CLBs
• Evolve several perfect configurations
  - Repeated the most successful runs for 1000 generations
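The parameter ranges above can be enumerated as follows. This is only a sketch of the search space: the slides do not say whether every combination was run, and the variable names are hypothetical.

```python
from itertools import product

# Mutation rates from .001 to .097 in steps of .004 (25 values).
mutation_rates = [round(0.001 + 0.004 * i, 3) for i in range(25)]
population_sizes = [15, 40, 50]
clb_counts = [6, 9, 12, 16, 36]

# Full cross product of the three parameters (assumed, not confirmed by the slides).
grid = list(product(mutation_rates, population_sizes, clb_counts))

print(len(mutation_rates), mutation_rates[0], mutation_rates[-1])  # 25 0.001 0.097
print(len(grid))  # 375
```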
FPGA Genetic Representations
• Chromosome Goals:
  - Allow all possible LUT configurations
  - Allow all possible CLB interconnections given constraints of routing support
  - Disallow illegal FPGA configurations and non-coding introns (junk DNA)
  - Facilitate crossover operator
• Bitstring representation is natural choice, though may not scale well (investigating generative reps)
• Representation shown here is sample specific to Xilinx Virtex FPGA
[Figure: chromosome layout. The bitstring is divided into CLB 0, CLB 1, ..., CLB n; each CLB holds LUT0 through LUT3; each LUT is encoded by its LUT bits and its LUT inputs, with each input identified by an R-CLB/R-LUT pair. R-CLB = remote CLB; R-LUT = remote LUT.]
Experiment #1 Results
Generations = 200, pop size = 50, CLBs = 9 (MR = mutation rate, F = fitness)
MR   F  MR   F  MR   F  MR   F  MR   F
.001 55 .021 54 .041 53 .061 55 .081 53
.005 57 .025 53 .045 52 .065 55 .085 52
.009 54 .029 54 .049 54 .069 54 .089 59
.013 54 .033 56 .053 52 .073 53 .093 55
.017 54 .037 53 .057 52 .077 54 .097 53
Perfect Individuals
• Parameters used in evolving perfect individuals (fitness of 60)
  - Maximum Number of Generations: 1000
  - Mutation Rate: .002
  - Population Size: 50
  - Number of CLBs: 9
• These create a diverse initial population for TMR style voting in Experiment #2
…Perfect Individuals
Config. Generations AND OR NOR XOR NAND
1 150 11 8 4 5 8
2 382 8 10 3 8 7
3 473 13 8 5 7 3
4 582 7 6 5 10 8
5 881 8 7 10 6 5
Avg. 493.6 9.4 7.8 5.4 7.2 6.2
Three-plex Experiments
• Six injected stuck-at faults on LUT inputs
  - Resulting fitness of perfect individuals: 38, 40, 47
• Parameters
  - Number of Generations: 400
  - Mutation Rate: .089
  - Population Size: 50
  - Number of CLBs: 9
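A stuck-at fault on a LUT input can be modeled by forcing one address bit before the table lookup. The sketch below illustrates the idea on a single 2-input LUT; the `lut_output` function, the `stuck` mapping, and the XOR example are all hypothetical and do not reflect the simulator's actual data structures.

```python
def lut_output(lut_bits: str, inputs: tuple, stuck=None) -> int:
    """Evaluate a LUT stored as a bit string indexed by the input address.

    `stuck` maps an input index to the value it is stuck at (0 or 1).
    """
    values = list(inputs)
    for index, forced in (stuck or {}).items():
        values[index] = forced  # the faulty input ignores its real signal
    address = 0
    for v in values:
        address = (address << 1) | v
    return int(lut_bits[address])


XOR2 = "0110"  # 2-input XOR: entries for addresses 00, 01, 10, 11

healthy = [lut_output(XOR2, (a, b)) for a in (0, 1) for b in (0, 1)]
faulty = [lut_output(XOR2, (a, b), stuck={0: 1}) for a in (0, 1) for b in (0, 1)]

print(healthy)  # [0, 1, 1, 0]
print(faulty)   # [1, 0, 1, 0]
```

With input 0 stuck at 1, only half of the LUT entries remain reachable, so a repair has to re-express the function using the reachable entries or route around the faulty input.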
Experiment #2
• Simulating repair
• Implement voting schemes
  - Injected stuck-at faults
  - Implemented 3-plex and 5-plex voting schemes
  - Chose GA/FPGA parameters according to Experiment #1
  - For each voting run, graphed the fitness of best fit individual vs. number of generations for voting elements and system
  - Repeated 3-plex experiment with a single element (no voting) for 3X number of generations
[Figure: 3-plex voting arrangement. FPGA input data feeds three configurations, one each from GA #1, GA #2, and GA #3; their outputs go to a voter, which produces the FPGA output data.]
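The voter in this arrangement can be sketched as a bitwise majority over the outputs of the voting elements. Voting independently on each output bit is an assumption (the slides only say "majority output"), and `majority_vote` is a hypothetical name.

```python
def majority_vote(outputs):
    """Bitwise majority over an odd number of equal-length bit strings."""
    if len(outputs) % 2 == 0:
        raise ValueError("need an odd number of voting elements")
    width = len(outputs[0])
    threshold = len(outputs) // 2
    return "".join(
        "1" if sum(o[i] == "1" for o in outputs) > threshold else "0"
        for i in range(width)
    )


# Three imperfect elements, each wrong in a different output bit.
expected = "001110"
elements = ["101110", "000110", "001111"]

print(majority_vote(elements))  # 001110
```

Because each element here is wrong in a different bit, the voted word is correct even though no single element is. This is the mechanism by which a voting fitness can exceed the fitness of every individual element.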
Three-plex Voting Results
[Figures: four plots of fitness of best individual (y-axis, 47 to 59) vs. generation (x-axis, 1 to 351), each with curves for GA #1, GA #2, GA #3, and Voting Result]
• Partial Repair: Max Fitness = 58 at generation 68
• Complete Repair achieved at generation 302
• Complete Repair at generation 33
• Perfect fitness is temporarily reached at generation 17
Three-plex Voting Summary
Columns: Rank; Highest Voting Fitness Reached; Earliest Generation of Highest Fitness; GA #1 (voting fitness/final fitness); GA #2 (voting fitness/final fitness); GA #3 (voting fitness/final fitness); Final Vote Fitness (Generation 400)
10 56 2 56/56 48/54 56/56 56
9 57 2 48/54 55/55 52/55 56
8 58 68 56/57 55/56 55/57 58
7 60 302 55/55 56/56 58/58 60
6 60 261 58/58 56/56 58/58 60
5 60 179 57/57 56/56 55/55 60
4 60 33 51/56 53/58 59/59 60
3 60 17 58/58 56/56 52/54 59
2 60 3 51/56 52/56 54/56 60
1 60 2 55/55 52/54 56/56 59
Compare: Single GA Run
• 1200 generations
  - Total GA computation equivalent to a 3-plex run for 400 generations
• 3 runs
  - Max fitness of 56 at 934 generations
  - Max fitness of 56 at 852 generations
  - Max fitness of 57 at 274 generations
• N-plex Voting advantageous
  - Improved the likelihood of obtaining a complete repair significantly with fewer total number of circuit evaluations
  - $n \times g_v \ll g_o$ for n-plex voting with $g_v$ voting generations vs. $g_o$ evolutionary generations without voting
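The comparison of n x gv against go can be checked against the 3-plex summary table. The sketch below takes gv as the earliest generation at which each successful 3-plex run reached fitness 60 and compares n x gv with the 1200-generation single-GA budget. Measuring cost in GA generations rather than individual circuit evaluations is a simplifying assumption, and the variable names are hypothetical.

```python
N_ELEMENTS = 3            # 3-plex voting
SINGLE_GA_BUDGET = 1200   # generations given to the single GA, which peaked at 56 or 57

# Earliest generation of perfect voting fitness for the seven successful 3-plex runs.
repair_generations = [2, 3, 17, 33, 179, 261, 302]

costs = [N_ELEMENTS * g for g in repair_generations]

print(costs)                                     # [6, 9, 51, 99, 537, 783, 906]
print(all(c < SINGLE_GA_BUDGET for c in costs))  # True
```

Even the slowest successful voting run (906 element-generations) finished inside the budget within which none of the three single-GA runs reached a fitness of 60.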
Experiment #3: 5-plex
• Six injected stuck-at faults on LUT inputs
  - Resulting fitness of perfect individuals: 38, 40, 47
• Parameters
  - Number of Generations: 300
  - Mutation Rate: .089
  - Population Size: 50
  - Number of CLBs: 9
Five-plex Voting Results
[Figures: three plots of fitness of best individual (y-axis, 47 to 59) vs. generation (x-axis, 1 to 251), each with curves for GA #1 through GA #5 and Voting Result]
• Complete Repair at generation 48
• Complete Repair at generation 34
• Perfect fitness at generation 2
Five-plex Voting Summary
Columns: Rank; Highest Voting Fitness Reached; Earliest Generation of Highest Fitness; GA #1 through GA #5 (each as voting fitness/final fitness); Final Vote Fitness (Generation 300)
10 59 68 55/58 54/55 55/55 54/55 53/55 57
9 60 154 54/56 56/56 52/52 56/58 55/55 60
8 60 108 56/56 54/54 55/55 55/55 53/53 60
7 60 55 53/55 56/56 57/57 56/56 56/56 60
6 60 48 55/55 56/56 53/56 55/57 59/59 60
5 60 34 56/56 55/58 51/55 57/57 55/55 60
4 60 27 56/56 53/55 52/57 55/55 52/56 58
3 60 4 56/56 53/55 51/56 56/56 52/56 60
2 60 3 49/54 51/55 55/55 53/57 52/56 55
1 60 2 50/54 56/56 54/54 50/55 54/56 60
3-plex vs. 5-plex
• 3-plex scheme
  - 7 out of 10 runs reached perfect fitness
  - Average of 113.86 generations to do so
  - 5 out of 10 runs exhibited perfect fitness upon completion (400 generations)
• 5-plex scheme
  - 9 out of 10 reached perfect fitness
  - Average of 48.33 generations needed
  - 7 out of 10 exhibited perfect fitness at completion (300 generations)
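These figures can be recomputed from the two summary tables. The sketch below copies three columns from each table (earliest generation of highest fitness, highest voting fitness reached, final vote fitness); the `summarize` helper and the tuple layout are hypothetical.

```python
from statistics import mean

# (earliest generation of highest fitness, highest voting fitness, final vote fitness),
# listed from rank 10 down to rank 1 as in the summary tables.
THREE_PLEX = [
    (2, 56, 56), (2, 57, 56), (68, 58, 58), (302, 60, 60), (261, 60, 60),
    (179, 60, 60), (33, 60, 60), (17, 60, 59), (3, 60, 60), (2, 60, 59),
]
FIVE_PLEX = [
    (68, 59, 57), (154, 60, 60), (108, 60, 60), (55, 60, 60), (48, 60, 60),
    (34, 60, 60), (27, 60, 58), (4, 60, 60), (3, 60, 55), (2, 60, 60),
]


def summarize(runs, perfect=60):
    """Return (runs reaching perfect fitness, mean generation to do so, runs perfect at the end)."""
    reached = [gen for gen, best, _ in runs if best == perfect]
    finished = sum(1 for _, _, final in runs if final == perfect)
    return len(reached), round(mean(reached), 2), finished


print(summarize(THREE_PLEX))  # (7, 113.86, 5)
print(summarize(FIVE_PLEX))   # (9, 48.33, 7)
```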
Conclusion
• Autonomous FPGA Repair
  - Strategy combining dynamic redundancy with online evolution
• TMR Style Voting beneficial in presence of partial refurbishment
  - Complete repair can be quickly obtained with three/five imperfectly repaired individuals
• Improvement of fitness in an individual GA can outperform voting fitness
• Stabilization of a complete repair is more important than how quickly it is achieved
  - In all six runs where a perfect fitness was obtained after 50 generations, the fitness was maintained
  - Only 5 of 10 runs which obtained a perfect fitness before 50 generations maintained that fitness for remainder of run
Development Board to Self-Contained FPGA
[Figure: roadmap across Year 1, Year 2, and Year 3, from the Avnet FPGA Development Board (PCI interface, Virtex-II Pro FPGA, off-chip RAM, control hosted on PC; bit file, input data, output) to CRR on a Chip (Xilinx Virtex-II Pro: control via on-chip PowerPC, configurations in on-chip RAM blocks, functional CLBs, reconfiguration through ICAP; device fault)]
• Qualitative Analysis of CRR model
  - Number of iterations and completeness of regeneration repair
  - Percentage of time the device remains online despite physical resource fault (availability)
• Hardware Resource Management
  - Optimization of hardware profile for Xilinx Virtex II Pro
• Field Testing on SRAM-based FPGA in a Cubesat mission
For further info … EH Website: http://cal.ucf.edu
Backup Slides
• On following pages …
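Previous Work on Fault Recovery
Fault Recovery Characteristics of Selected Approaches
• TMR with Jiggling [Garvie, Thompson]
  - Online recovery: Yes
  - Basis for recovery: Requires 2 datapaths are operational
  - Test vectors: Pseudo-Exhaustive
  - Availability: 100% for single fault, 0% thereafter
  - Externally-supplied elements: 2 of 3 Majority Voter
  - Resource recycling: Yes
  - Pre-determined limits: Single datapath
  - Power consumption: 3n+v
• [Vigander01]
  - Online recovery: No
  - Basis for recovery: Design complexity
  - Test vectors: Exhaustive
  - Availability: Non-deterministic
  - Externally-supplied elements: GA Controller, function test vectors
  - Resource recycling: Yes
  - Pre-determined limits: None
  - Power consumption: 3n+v+r
• [Lohn, Larchev, DeMara03]
  - Online recovery: No
  - Basis for recovery: Design complexity
  - Test vectors: Pseudo-Exhaustive Functional Test
  - Availability: Non-deterministic
  - Externally-supplied elements: GA Controller, function test vectors
  - Resource recycling: Yes
  - Pre-determined limits: None
  - Power consumption: 2n+r
• [Lach98]
  - Online recovery: No
  - Basis for recovery: Available spares
  - Test vectors: Not Addressed
  - Availability: Either complete or none
  - Externally-supplied elements: Device test vectors and controller
  - Resource recycling: No
  - Pre-determined limits: Only one faulty CLB per tile
  - Power consumption: 2n+r
• STARS [Abramovici01]
  - Online recovery: Yes
  - Basis for recovery: Available spares
  - Test vectors: Exhaustive Resource Test
  - Availability: Only ~93% regardless of fault occurrence
  - Externally-supplied elements: Test Reconfiguration Controller + device test vectors
  - Resource recycling: Yes
  - Pre-determined limits: Available spares within routing chokepoints
  - Power consumption: s • (c+r)
• [Keymeulen, Stoica, Zebulum00]
  - Online recovery: No
  - Basis for recovery: Depends on characteristics at design time
  - Test vectors: Exhaustive during or after evolution
  - Availability: Non-deterministic
  - Externally-supplied elements: None at runtime
  - Resource recycling: No
  - Pre-determined limits: Depends on redundancy during design
  - Power consumption: n • (1 + f(g))
• Competitive Runtime Reconfiguration (CRR) [DeMara05]
  - Online recovery: Yes
  - Basis for recovery: Recovery complexity
  - Test vectors: None
  - Availability: Adaptable
  - Externally-supplied elements: Optional RAM … RAM coverage is intrinsic; no test vectors
  - Resource recycling: Yes
  - Pre-determined limits: None
  - Power consumption: 2n+r
Normalized Power Consumption (Energy per Operation):
• n: n-plex solution using n redundant devices
• r: reconfiguration cost
• g: gate-level redundancy
• s, c: updated with scan rate s on c CLBs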
Previous Work - Tool Level
• Moraes, Mesquita, Palma, Moller
  - FPGA supported: Virtex XCV300 devices
  - On-chip system: No
  - Bit stream reuse: N
  - System coupling degree: Loose
  - Potential limitations: Lack of Area Relocation Capability
• Raghavan, Sutton
  - FPGA supported: Xilinx Virtex devices
  - On-chip system: No
  - Bit stream reuse: N
  - System coupling degree: Loose
  - Potential limitations: Cumbersome CAD flow
• Blodget, McMillan
  - FPGA supported: Virtex II devices
  - On-chip system: Partial
  - Bit stream reuse: Y
  - System coupling degree: Medium
  - Potential limitations: Limited hardware speed and capacity. Lack of information for bit stream reuse