![Page 1: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/1.jpg)
Approximate On-Chip Communication
Davide Patti, Ph.D.
University of Catania, Italy
![Page 2: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/2.jpg)
…in the Previous Episodes
1. The goal of computing was to be the fastest
2. The push to maximize MHz hit the “power wall” in the mid-2000s
3. Initial solution: “ok, no problem, let’s optimise for speed and power…”
4. …but, eventually, the dramatically increasing workloads ruined the party…
![Page 3: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/3.jpg)
Why?
• Ever-increasing amount of information
• Industry reports: from 2010 to 2020 the amount of information will expand by 50x
  – ...while the number of servers will only grow by a factor of 10!
![Page 4: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/4.jpg)
Emerging RMS Applications
![Page 5: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/5.jpg)
Error-Resilience Property
Forgiving workloads (multimedia, recognition, search) can tolerate imperfect computing. Examples:
• Inexact inputs, derived from noisy and redundant sources (e.g. sensors)
• The human consumer of the results may not discern small variations
• Data/algorithms including statistical/probabilistic computations
• Computations which may be refined over multiple iterations
![Page 6: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/6.jpg)
![Page 7: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/7.jpg)
Approximate Computing: A Third Dimension for Optimization
![Page 8: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/8.jpg)
“Error” or “Feature”?
• Approximation not as a “problem” to deal with, not as a “limitation”, but part of the game
• A neuron spikes when the combination of all the excitation and inhibition it receives makes it reach threshold (around -50 mV)
![Page 9: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/9.jpg)
Approximating at Multiple Levels of the Stack
Hardware level
• Less accurate yet energy-efficient circuits (e.g., simplified adder)
• Tuning the supply voltage
Software level
• Ignore some computations (skip loop iterations, relaxing control dependences)
• Data structures, e.g., reducing vector sizes
• Ignore certain memory accesses replacing them by estimated values
![Page 10: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/10.jpg)
Current Applications
• Database Querying/Visualization:
  • BlinkDB, Facebook’s Presto, M4 from SAP
  • 2B points (70 mins) vs 1M points (3 mins)
![Page 11: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/11.jpg)
Current Applications
• Neural Networks
• Using a NN to replace some expensive computation or algorithm
• Approximate NN implementations for inference (e.g., fewer bits to represent weights)
• SqueezeNet, Google’s Neural Machine Translation
![Page 12: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/12.jpg)
Approximate Communication: the NOC Case Study
■ Shared bus
  ➔ Low area
  ➔ Poor scalability
  ➔ High energy consumption
■ Network-on-Chip
  ➔ Mesh of Routers (in red)
  ➔ Each Processing Element connected to a Router
  ➔ Scalability and modularity
  ➔ Low energy consumption
  ➔ Increase of design complexity
![Page 13: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/13.jpg)
Communication Overhead
• Interconnection networks consume 10% to 20% of the power in current HPC systems
  • Majority due to the network's links
• NoC-based design
  • More than one-third of the chip's power consumption
![Page 14: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/14.jpg)
Example
for (i=0; i<n; i++)
    v[i] = f(w[i]);
[Figure: CPU, MI, Memory]
![Page 15: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/15.jpg)
Example – Load w[i]
for (i=0; i<n; i++)
    v[i] = f(w[i]);
[Figure: CPU, MI, Memory; Address and Data transfers]
![Page 16: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/16.jpg)
Example – Store v[i]
for (i=0; i<n; i++)
    v[i] = f(w[i]);
[Figure: CPU, MI, Memory; Data transfer]
![Page 17: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/17.jpg)
Approximate Communication
• Send(data, destination)
• Send(data, destination, reliability_level)
[Figure: Communication Energy versus Reliability Level]
Communication system “aware” of error-resilience, acting on two knobs:
• Voltage Swing (wired)
• Transmission Power (wireless)
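The difference between the two Send signatures can be sketched in C as follows. This is only an illustration under stated assumptions: the two-level enum, the `send_request_t` descriptor and the function names are hypothetical, and only the wired knob (voltage swing) is modeled, using the 1.1 V and 0.6 V values from the Experiments slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-level scale; the slide only names a generic reliability_level. */
typedef enum { RL_NOMINAL = 0, RL_APPROX = 1, RL_COUNT } reliability_level_t;

/* Voltage swing per level, in volts (values taken from the Experiments slides). */
static const double SWING_VOLTS[RL_COUNT] = { 1.1, 0.6 };

/* Map a reliability level to the link voltage swing to apply. */
double swing_for_level(reliability_level_t rl) {
    assert((int)rl >= 0 && (int)rl < RL_COUNT);
    return SWING_VOLTS[rl];
}

/* Hypothetical descriptor handed to the network interface. */
typedef struct {
    const uint8_t *data;
    size_t len;
    int destination;
    double swing_volts;
} send_request_t;

/* Reliability-aware send: records the swing chosen for this message.
 * A real implementation would inject the packet into the NoC here. */
send_request_t send_with_level(const uint8_t *data, size_t len,
                               int destination, reliability_level_t rl) {
    send_request_t req = { data, len, destination, swing_for_level(rl) };
    return req;
}

/* Plain send keeps the conventional behaviour: always nominal reliability. */
send_request_t send_plain(const uint8_t *data, size_t len, int destination) {
    return send_with_level(data, len, destination, RL_NOMINAL);
}
```

A wireless variant would select a transmission power instead of a voltage swing in the same way.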
![Page 18: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/18.jpg)
Tuning the Link Voltage Swing
• Reliability vs. Energy (1 mm bit-line):
  – Nominal voltage swing → low BER, high energy
  – Low voltage swing → high BER, low energy
![Page 19: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/19.jpg)
Reconfigurable Link
[Figure: 4×4 mesh of tiles joined by physical links. Legend: core = IP core, NI = network interface, R = router; a tile is a core with its NI and router]
![Page 20: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/20.jpg)
Reconfigurable Link
[Figure: NoC mesh of tiles (core, NI, router R)]
![Page 21: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/21.jpg)
Reconfigurable Link
[Figure: NoC mesh of tiles (core, NI, router R)]
![Page 22: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/22.jpg)
HSPICE Link Simulation
• 45 nm CMOS technology (NanGate's Open Cell Library):
• 10 metal layers
• 3 mm link line using the seventh metal layer
• 2 GHz target frequency
Improving energy efficiency in wireless network-on-chip architectures, V Catania, A Mineo, S Monteleone, M Palesi, D Patti, ACM Journal on Emerging Technologies in Computing Systems (JETC) 14 (1), 2018
![Page 23: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/23.jpg)
HSPICE Link Simulation
70% saving
3% overhead
![Page 24: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/24.jpg)
HSPICE Link Simulation
![Page 25: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/25.jpg)
Implementation
Packet: Header | Data | Data | Data | Tail
Header: Reliability Level | Destination | Other control info
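One possible way to carry the reliability level in the header is sketched below. The field widths (8-bit destination, 2-bit reliability level, 22 bits of other control information), the field order inside the word and all names are assumptions chosen for illustration; the slide only lists the fields.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field widths; the slide does not specify them. */
#define RL_BITS   2u
#define CTRL_BITS 22u

typedef struct {
    uint8_t  destination;        /* destination node id */
    uint8_t  reliability_level;  /* read by each router to configure its link */
    uint32_t control;            /* other control info */
} header_t;

/* Pack the header fields into one 32-bit header flit. */
uint32_t header_pack(header_t h) {
    assert(h.reliability_level < (1u << RL_BITS));
    assert(h.control < (1u << CTRL_BITS));
    return ((uint32_t)h.destination << (RL_BITS + CTRL_BITS))
         | ((uint32_t)h.reliability_level << CTRL_BITS)
         | h.control;
}

/* Recover the fields from a 32-bit header flit. */
header_t header_unpack(uint32_t word) {
    header_t h;
    h.destination = (uint8_t)(word >> (RL_BITS + CTRL_BITS));
    h.reliability_level = (uint8_t)((word >> CTRL_BITS) & ((1u << RL_BITS) - 1u));
    h.control = word & ((1u << CTRL_BITS) - 1u);
    return h;
}
```

Keeping the level in the header means the data and tail flits of the same packet need no extra bits: a router can set the link once when the header passes.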
![Page 26: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/26.jpg)
Annotation Example
• Data coming from/delivered to w[i] travel with a reliability level rl
#pragma resilient(w, rl)
for (i=0; i<n; i++)
    v[i] = f(w[i]);
![Page 27: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/27.jpg)
![Page 28: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/28.jpg)
Application Characterization
• How does the imprecision on inputs and internal data reflect on the outputs?
• Classify data structures according to their impact on the outputs
• Exploitation
  – Storing less sensitive data on energy-efficient memories (low voltage, low refresh rate, ...)
  – Optimizing communication of less sensitive data (unreliable communications, lossy compression, ...)
![Page 29: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/29.jpg)
Experiments
• Two voltage swing levels
  – Nominal 1.1 V → BER: 10^-17, Ebit: 512 fJ
  – Low 0.6 V → BER: 10^-6, Ebit: 152 fJ
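The trade-off between the two swing levels can be worked out with a few lines of C. This is a minimal sketch: the per-bit energies and BERs are the ones listed above, while the function names and the idea of an "approximated fraction" of the traffic are assumptions introduced for the example.

```c
#include <assert.h>

/* Per-bit link energy (fJ) and bit error rate for the two swing levels. */
#define EBIT_NOMINAL_FJ 512.0
#define EBIT_LOW_FJ     152.0
#define BER_NOMINAL     1e-17
#define BER_LOW         1e-6

/* Link energy in fJ for `bits` transmitted bits, of which
 * `approx_fraction` (0..1) travel at the low voltage swing. */
double link_energy_fj(double bits, double approx_fraction) {
    assert(approx_fraction >= 0.0 && approx_fraction <= 1.0);
    return bits * (approx_fraction * EBIT_LOW_FJ
                   + (1.0 - approx_fraction) * EBIT_NOMINAL_FJ);
}

/* Expected number of flipped bits for the same traffic mix. */
double expected_bit_errors(double bits, double approx_fraction) {
    assert(approx_fraction >= 0.0 && approx_fraction <= 1.0);
    return bits * (approx_fraction * BER_LOW
                   + (1.0 - approx_fraction) * BER_NOMINAL);
}
```

For traffic sent entirely at low swing, the per-bit link energy is 152/512 ≈ 0.30 of nominal, and 8×10^6 transmitted bits are expected to contain about 8 bit errors.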
![Page 30: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/30.jpg)
Experiments
• JPEG encoding pipeline (AxBench)
[Figure: Level Shift, DCT, Quantize, Entropy Encode; Quantizer Table; Huffman Table]
UINT8* encodeMcu(UINT32 imageFormat, UINT8 *outputBuffer) {
    levelShift(Y1);
    dct(Y1);
    quantization(Y1, ILqt);
    outputBuffer = huffman(1, outputBuffer);
    return outputBuffer;
}
![Page 31: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/31.jpg)
Experiments
UINT8* encodeMcu(UINT32 imageFormat, UINT8 *outputBuffer) {
    #pragma resilient_load(Y1, rl_load)
    levelShift(Y1);
    ...
}
[Figure: JPEG pipeline (Level Shift, DCT, Quantize, Entropy Encode, Quantizer Table, Huffman Table) and Memory; transfer labeled rl_load]
![Page 32: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/32.jpg)
Experiments
UINT8* encodeMcu(UINT32 imageFormat, UINT8 *outputBuffer) {
    #pragma resilient_store(Y1, rl_store)
    levelShift(Y1);
    ...
}
[Figure: JPEG pipeline (Level Shift, DCT, Quantize, Entropy Encode, Quantizer Table, Huffman Table) and Memory; transfer labeled rl_store]
![Page 33: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/33.jpg)
Experiments
UINT8* encodeMcu(UINT32 imageFormat, UINT8 *outputBuffer) {
    #pragma resilient(Y1, rl)
    levelShift(Y1);
    ...
}
[Figure: JPEG pipeline (Level Shift, DCT, Quantize, Entropy Encode, Quantizer Table, Huffman Table) and Memory; both transfers labeled rl]
![Page 34: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/34.jpg)
Approximation Profiles
![Page 35: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/35.jpg)
Experiments
[Figure: JPEG pipeline (Level Shift, DCT, Quantize, Entropy Encode, Quantizer Table, Huffman Table) with two memories, Mem 1 and Mem 2]
![Page 36: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/36.jpg)
Experiments
[Figure: NoC mapping of the pipeline: Level Shifter, DCT, Quantizer, Entropy Encoder and two MC blocks, each attached to a router (R), with Mem 1 and Mem 2]
![Page 37: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/37.jpg)
Evaluation Flow
[Figure: flow chart with the following blocks: Application; Resilient data selection; Annotated application; Resilience level selection; Full Simulation (MIT Graphite); Memory reference trace; NoC architecture; Energy estimation (Noxim); Communication energy; Error injection; Perturbed application; Execution (twice), producing Imprecise results and Exact results; Comparison; Quality metric]
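The error-injection and comparison steps of such a flow can be sketched in C as below. This is not the authors' tool: the function names, the xorshift64 generator and the choice of flipping each bit independently with probability equal to the BER are assumptions made for illustration. The comparison returns the mean squared error; the RMSE used as quality metric is its square root.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small deterministic generator (xorshift64); the state must be nonzero. */
static uint64_t xorshift64(uint64_t *state) {
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    return x;
}

/* Flip each bit of buf independently with probability ber.
 * Returns the number of flipped bits. */
size_t inject_bit_errors(uint8_t *buf, size_t len, double ber, uint64_t seed) {
    assert(ber >= 0.0 && ber <= 1.0);
    assert(seed != 0);
    uint64_t state = seed;
    size_t flips = 0;
    for (size_t i = 0; i < len; i++) {
        for (unsigned b = 0; b < 8; b++) {
            /* Uniform value in [0, 1) built from the top 53 random bits. */
            double u = (double)(xorshift64(&state) >> 11) / 9007199254740992.0;
            if (u < ber) {
                buf[i] ^= (uint8_t)(1u << b);
                flips++;
            }
        }
    }
    return flips;
}

/* Mean squared error between exact and perturbed outputs. */
double mean_squared_error(const uint8_t *exact, const uint8_t *approx, size_t len) {
    assert(len > 0);
    double sum = 0.0;
    for (size_t i = 0; i < len; i++) {
        double d = (double)exact[i] - (double)approx[i];
        sum += d * d;
    }
    return sum / (double)len;
}
```

In a trace-driven setting, the injection would be applied only to the transfers annotated as resilient, with the BER of the selected reliability level.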
![Page 38: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/38.jpg)
Experiments
![Page 39: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/39.jpg)
Experiments: Conf 0 (gold)
[Figure: JPEG pipeline with Mem 1 and Mem 2. Legend: Nominal (high energy, high reliability), Approx (low energy, low reliability)]
![Page 40: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/40.jpg)
Experiments: Conf 1
[Figure: JPEG pipeline with Mem 1 and Mem 2. Legend: Nominal (high energy, high reliability), Approx (low energy, low reliability)]
![Page 41: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/41.jpg)
Experiments: Conf 2
[Figure: JPEG pipeline with Mem 1 and Mem 2. Legend: Nominal (high energy, high reliability), Approx (low energy, low reliability)]
![Page 42: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/42.jpg)
![Page 43: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/43.jpg)
Experiments: Conf 3
[Figure: JPEG pipeline with Mem 1 and Mem 2. Legend: Nominal (high energy, high reliability), Approx (low energy, low reliability)]
![Page 44: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/44.jpg)
Experiments: Conf 4
[Figure: JPEG pipeline with Mem 1 and Mem 2. Legend: Nominal (high energy, high reliability), Approx (low energy, low reliability)]
![Page 45: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/45.jpg)
Experiments: Conf 5
[Figure: JPEG pipeline with Mem 1 and Mem 2. Legend: Nominal (high energy, high reliability), Approx (low energy, low reliability)]
![Page 46: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/46.jpg)
![Page 47: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/47.jpg)
Experiments
![Page 48: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/48.jpg)
Experiments: Conf 6
[Figure: JPEG pipeline with Mem 1 and Mem 2. Legend: Nominal (high energy, high reliability), Approx (low energy, low reliability)]
![Page 49: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/49.jpg)
![Page 50: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/50.jpg)
Experiments: Conf 7
[Figure: JPEG pipeline with Mem 1 and Mem 2. Legend: Nominal (high energy, high reliability), Approx (low energy, low reliability)]
![Page 51: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/51.jpg)
![Page 52: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/52.jpg)
Experiments: Conf 8
[Figure: JPEG pipeline with Mem 1 and Mem 2. Legend: Nominal (high energy, high reliability), Approx (low energy, low reliability)]
![Page 53: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/53.jpg)
![Page 54: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/54.jpg)
Experiments: Conf 9
[Figure: JPEG pipeline with Mem 1 and Mem 2. Legend: Nominal (high energy, high reliability), Approx (low energy, low reliability)]
![Page 55: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/55.jpg)
![Page 56: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/56.jpg)
Experiments
![Page 57: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/57.jpg)
Experiments
[Figure: image diff (RMSE, axis 0 to 0.0006) and normalized energy (axis 0 to 1) for configurations 0 to 9]
![Page 58: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/58.jpg)
Sensitivity Analysis
[Figure: sensitivity (axis 0 to 0.00030) of each transfer: in Y1/levelShift, out Y1/levelShift, in Y1/dct, out Y1/dct, in Y1/quantization, in ILqt/quantization, out Temp/quantization, in Temp/huffman, out outputBuffer/huffman]
![Page 59: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/59.jpg)
Experiments: Conf 9
[Figure: JPEG pipeline with Mem 1 and Mem 2. Legend: Nominal (high energy, high reliability), Approx (low energy, low reliability)]
![Page 60: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/60.jpg)
Experiments
[Figure: normalized energy (axis 0 to 1.20) versus image diff (RMSE, axis 0 to 6.0E-4)]
![Page 61: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/61.jpg)
Next Step: On-Chip Wireless Communications
V. Catania, A. Mineo, S. Monteleone, M. Palesi, and D. Patti, “Improving energy efficiency in wireless network-on-chip architectures,” ACM Journal on Emerging Technologies in Computing Systems, vol. 14, no. 1, 2017.
![Page 62: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/62.jpg)
Tuning Transmitting Power
• High BER as compared to wired NoC
  – 10^-9 vs. 10^-14
• General approach
  – Increasing the transmitting power to compensate for the attenuation introduced by the wireless medium
• Proposed approach
  – Tuning the transmitting power based on the reliability level of the data currently being transmitted
![Page 63: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/63.jpg)
Tunable Transmitting Power
Zigzag antenna modeled with Ansoft HFSS to compute attenuation (16 Gbps)
Variable Power Amplifier
• S. Kaushik, M. Agrawal, H. K. Mondal, S. H. Gade, and S. Deb, “Path loss-aware adaptive transmission power control scheme for energy-efficient wireless NoC,” in International Midwest Symposium on Circuits and Systems (MWSCAS), Aug. 2017, pp. 132–135.
• A. Mineo, M. Palesi, G. Ascia, and V. Catania, “Exploiting antenna directivity in wireless NoC architectures,” Microprocessors and Microsystems, vol. 43, pp. 59–66, 2016.
![Page 64: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/64.jpg)
Simulation Setup
• Two transmission profiles:
  • (normal) BER 10e-12 → 1.47 pJ/bit
  • (approximate) BER 10e-6 → 1 pJ/bit
• Wireless interfaces placed at the same positions as the memory controllers (mesh corners)
• 8 × 8 mesh-based NoC architecture simulated using the Graphite Multicore Simulator with the following parameters:
![Page 65: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/65.jpg)
Representative Applications

| Application | Description | Approximated Regions |
|---|---|---|
| streamcluster | An RMS kernel developed by Princeton University that solves the online clustering problem | Regions of 256 bytes required for storing the 64 dimensions of each point, each encoded as a 4-byte floating point value, for a total of 8192 regions. |
| canneal | Developed by Princeton University, it uses cache-aware simulated annealing (SA) to minimize the routing cost of a chip design | The annotation has been performed on the netlist element, for a total of 160,000 instances of 64-byte netlist elements. |
| blackscholes | An Intel RMS benchmark that calculates prices for a portfolio of European options analytically with the Black-Scholes partial differential equation | Two data structures have been annotated: optiondata, a 36-byte floating point structure, and prices (4-byte floating point), for a total of 147,456 bytes and 16,384 bytes, respectively. |
| radiosity | Computes the equilibrium distribution of light in a scene using the hierarchical diffuse radiosity method | elemvertex buf.col, a data structure encoding the three RGB components as 4-byte floating point values, and elemvertex buf.vertex, a data structure encoding the 3-dimensional coordinates of each vertex of the polygons describing the 3D model of the scene. Each of these two structures occupies 12 bytes, for a total of 65,535 regions and 786,420 annotated bytes each. |
![Page 66: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/66.jpg)
Evaluation Flow
Four scenarios:
1. NoC
2. WiNoC
3. Approx. NoC
4. Approx. WiNoC
![Page 67: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/67.jpg)
Results
∗ All energy values are normalized with respect to the wired NoC energy consumption.
![Page 68: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/68.jpg)
Results – Performance Metrics
![Page 69: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/69.jpg)
Conclusions
• Approximate communication technique for improving the energy efficiency of WiNoC architectures.
  • Dynamic link voltage swing (NoC links)
  • Dynamic transmitting power modulation (wireless communications)
• Pragma-based annotation of the application code
  • Load and store induced communications related to error-tolerant data
• Assessment on a set of representative benchmarks
  • Energy saving versus application accuracy trade-off.
  • Up to 30% of total communication energy saving has been observed without any appreciable impact on the accuracy metrics
![Page 70: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/70.jpg)
Future Developments
• Generalize & automate in order to reduce the required knowledge about the application
• A methodology to identify approximable communication flows
• Automated choice of the most efficient approximation technique (reduced-bit representation, reduced iterations, etc.)
• Automatic exploration loop
![Page 71: Approximate On-Chip Communication2](https://reader033.vdocuments.mx/reader033/viewer/2022041701/625365976681ba3f3a79bd94/html5/thumbnails/71.jpg)
Bibliography
• Vincenzo Catania, Andrea Mineo, Salvatore Monteleone, Maurizio Palesi, and Davide Patti. 2016. Cycle-Accurate Network on Chip Simulation with Noxim. ACM Trans. Model. Comput. Simul. 27, 1, Article 4 (August 2016), 25 pages. DOI: https://doi.org/10.1145/2953878
• V. Catania, A. Mineo, S. Monteleone, M. Palesi, and D. Patti, “Improving energy efficiency in wireless network-on-chip architectures,” ACM Journal on Emerging Technologies in Computing Systems (JETC), vol. 14, no. 1, art. 9.
• S. Kaushik, M. Agrawal, H. K. Mondal, S. H. Gade, and S. Deb, “Path loss-aware adaptive transmission power control scheme for energy-efficient wireless NoC,” in International Midwest Symposium on Circuits and Systems (MWSCAS), Aug. 2017, pp. 132–135.
• C. Roth, H. Bucher, S. Reder, F. Buciuman, O. Sander, and J. Becker. 2013. A SystemC modeling and simulation methodology for fast and accurate parallel MPSoC simulation. In Integrated Circuits and Systems Design (SBCCI), 2013 26th Symposium on. 1–6. DOI:http://dx.doi.org/10.1109/SBCCI.2013.6644853
• S. Deb, K. Chang, M. Cosic, A. Ganguly, P. P. Pande, D. Heo, and B. Belzer, “Enhancing performance of network-on-chip architectures with millimeter-wave wireless interconnects,” in IEEE International Conference on Application-specific Systems Architectures and Processors, 2010, pp. 73–80.
• E. Miller, H. Kasture, G. Kurian, C. Gruenwald, N. Beckmann, C. Celio, J. Eastep, and A. Agarwal, “Graphite: A distributed parallel simulator for multicores,” in High Performance Computer Architecture (HPCA), 2010 IEEE 16th International Symposium on. IEEE, 2010, pp. 1–12.