Sonic Millip3De: Massively Parallel 3D Stacked Accelerator for 3D Ultrasound
Richard Sampson* Ming Yang† Siyuan Wei† Chaitali Chakrabarti† Thomas F. Wenisch*
*University of Michigan †Arizona State University
Portable Medical Imaging Devices
• Medical imaging moving towards portability
– MEDICS (X-Ray CT) [Dasika ‘10]
– Handheld 2D Ultrasound [Fuller ‘09]
• Not just a matter of convenience
– Improved patient health [Gunnarsson ‘00, Weinreb ‘08]
– Access in developing countries
• Why ultrasound?
– Low transmit power [Nelson ‘10]
– No dangers or side-effects
Handheld 3D Ultrasound
• 3D has numerous benefits over 2D
– Easier to interpret images
– Greater volumetric accuracy
• … as well as many challenges
– 12k transducers, 10M image points (10-20x beyond state of the art)
– High raw data bandwidth (6Tb/s) (major bottleneck in state of the art)
– Tight handheld power budget (5W)
Why a Custom Accelerator?
• Software algorithms load/store intensive
– von Neumann designs inefficient
• Large system would require over 700 DSPs
– General purpose CPUs even less efficient

| Architecture | Energy/Scanline (1 fps) | Single Core Time/Scanline |
| --- | --- | --- |
| Intel Core i7-2670 | 25.08 J | 4.46 s |
| ARM Cortex-A8 | 33.04 J | 132.18 s |
| TI C6678 DSP | 2.84 J | 2.27 s |
Contributions
• Iterative delay calculation algorithm
– Reduces storage by over 400x
– Enables streaming data flow
• Sonic Millip3De design
– Leverages 3D die stacking technology
– Transform-select-reduce accelerator framework
• Power and image analysis of Sonic Millip3De
– Negligible change in image quality
– Able to meet 5W power budget by 11nm node
Outline
• Introduction
• Ultrasound background
• Algorithm design
• System design
– Sonic Millip3De
– Select Sub-Unit
• Results and analysis
• Conclusions
Ultrasound: Transmit and Receive
[Figure: transmit and receive geometry. Labels: Receive Raw Channel Data, Image Space, Focal Points, Receive Transducer, Transmit Transducer, 𝜏]
Each transducer stores an array of raw receive data
Ultrasound: Image Reconstruction
Image reconstructed from data based on round trip delay
Ultrasound: Image Reconstruction
Images from each transducer combined to produce full frame
Delay Index Calculation
• Iterate through all image points for each transducer and calculate delay index
• Often done with lookup tables (LUTs) instead
• 50 GB LUT required for target 3D system
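The per-point delay index calculation can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes straight-line propagation, a single transmit source, and rounding to the nearest sample, and it borrows the interpolated sampling frequency (160 MHz) and speed of sound (1,540 m/s) from the System Parameters slide. The function name `delay_index` and the coordinates are hypothetical.

```python
import math

FS_HZ = 160e6        # interpolated sampling frequency (System Parameters slide)
C_M_PER_S = 1540.0   # speed of sound in tissue (System Parameters slide)


def delay_index(tx, rx, p, fs=FS_HZ, c=C_M_PER_S):
    """Sample index of the echo from focal point p.

    tx, rx and p are (x, y, z) positions in metres. The round trip is the
    path from the transmit source to p plus the path from p back to the
    receive transducer (assumed straight lines).
    """
    round_trip_m = math.dist(tx, p) + math.dist(p, rx)
    return round(fs * round_trip_m / c)


# Hypothetical geometry: transmit and receive at the origin, focal point 3 cm deep.
print(delay_index((0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.03)))
```

Evaluating this for every transducer and every image point is what a lookup table would otherwise store, which is why the table grows so large for a 3D system.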
Challenges of Handheld 3D Ultrasound
• Delay index LUT requires too much storage
– New iterative algorithm reduces necessary constant storage by 400x
• Peak raw data bandwidth (6Tb/s) infeasible
– Sub-aperture multiplexing reduces peak data rate, but requires more transmits
• Handheld power budget very tight (5W)
– 3D stacked, highly parallel data streaming design reconstructs images efficiently
Iterative Delay Index Calculation
• Deltas between adjacent focal points on a scanline form smooth curve
• Fit piecewise quadratic approx. to delta function
• Two sections sufficient for negligible error
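The idea of approximating the delta curve with two quadratic sections can be sketched as below. Everything about the setup is an assumption made for illustration: the scanline runs straight down from the transmit source, the receive transducer sits 1 cm to the side, the section boundary is placed at focal point 1,024, and each quadratic is simply passed through the first, middle, and last delta of its section (a stand-in for whatever fitting procedure the authors used). The depth, focal point count, and sampling values come from the System Parameters slide.

```python
import math

FS_HZ = 160e6        # interpolated sampling frequency
C_M_PER_S = 1540.0   # speed of sound in tissue
N_POINTS = 4096      # focal points per scanline
DEPTH_M = 0.06       # image depth


def delta_curve(lateral_offset_m):
    """Exact delay deltas (in samples) between adjacent focal points."""
    delays = []
    for i in range(N_POINTS):
        z = DEPTH_M * i / (N_POINTS - 1)
        round_trip_m = z + math.hypot(z, lateral_offset_m)
        delays.append(FS_HZ * round_trip_m / C_M_PER_S)
    return [b - a for a, b in zip(delays, delays[1:])]


def quadratic_through(ys):
    """A, B, C of the quadratic through the first, middle and last value of ys."""
    x1, x2 = (len(ys) - 1) // 2, len(ys) - 1
    y0, y1, y2 = ys[0], ys[x1], ys[x2]
    a = ((y2 - y0) / x2 - (y1 - y0) / x1) / (x2 - x1)
    b = (y1 - y0) / x1 - a * x1
    return a, b, y0


def piecewise_approx(deltas, split):
    """Approximate deltas with one quadratic before `split` and one after."""
    approx = []
    for section in (deltas[:split], deltas[split:]):
        a, b, c = quadratic_through(section)
        approx.extend(a * n * n + b * n + c for n in range(len(section)))
    return approx


deltas = delta_curve(lateral_offset_m=0.01)   # hypothetical transducer offset
approx = piecewise_approx(deltas, split=1024)  # hypothetical section boundary
max_err = max(abs(x - y) for x, y in zip(approx, deltas))
print(f"max delta error: {max_err:.4f} samples per step")
```

The printed error applies only to this toy geometry; the slides' claim of negligible error rests on the authors' own fit and image-quality simulations.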
Sub-aperture Multiplexing
• Peak raw data bandwidth (6Tb/s) infeasible
• Solution: sub-aperture multiplexing
– Transmit multiple times from same location
– Receive with subset of transducers (sub-aperture)
– Sum images together
• Prior work: reduce data rate
• Our design: also reduces HW and power requirements
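The transmit, receive-with-a-subset, and sum steps can be illustrated with a toy model. It assumes that each transducer's contribution to the image is already known, that contributions add linearly, and that the scene does not change between transmits; `beamform` and `beamform_multiplexed` are hypothetical names.

```python
def beamform(contributions):
    """Sum per-transducer contributions into one image (one value per focal point)."""
    return [sum(column) for column in zip(*contributions)]


def beamform_multiplexed(contributions, subaperture_size):
    """Build the same image one sub-aperture at a time.

    Each loop iteration models one transmit: only `subaperture_size`
    transducers are read out, a partial image is formed, and the partial
    image is accumulated into the frame.
    """
    frame = [0] * len(contributions[0])
    for start in range(0, len(contributions), subaperture_size):
        partial = beamform(contributions[start:start + subaperture_size])
        frame = [f + p for f, p in zip(frame, partial)]
    return frame


# Toy data: 6 transducers, 4 focal points, integer values for exact comparison.
contributions = [[t + 10 * p for p in range(4)] for t in range(6)]
print(beamform_multiplexed(contributions, subaperture_size=2) == beamform(contributions))
```

Under these assumptions the summed partial images match the full-aperture image, while only one sub-aperture's worth of data has to be received and processed per transmit.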
System Design
Sonic Millip3De comprises 1,024 parallel pipelines
System Design: Transducers
Interchangeable CMOS transducer layer; can use older process
System Design: ADC/Storage
Separate storage layer to reduce wire lengths
System Design: Transform-Select-Reduce
Accelerator units in fast, low power process
Select Sub-Unit Design
Selects sample closest to each focal point using our algorithm
All delays for a scanline estimated using 9 constants
Adders calculate next iteration of quadratic approximation
$A(n+1)^2 + B(n+1) + C = (An^2 + Bn + C) + 2An + (A + B)$
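The recurrence means a quadratic can be stepped from n to n+1 using additions only. A minimal sketch, with the hypothetical helper `quadratic_by_adders` and integer constants chosen so the result can be checked exactly:

```python
def quadratic_by_adders(a, b, c, count):
    """Return f(0), ..., f(count - 1) for f(n) = a*n^2 + b*n + c using only adds."""
    value = c        # f(0)
    step = a + b     # f(1) - f(0), i.e. 2*a*n + (a + b) at n = 0
    values = []
    for _ in range(count):
        values.append(value)
        value += step    # f(n+1) = f(n) + 2*a*n + (a + b)
        step += 2 * a    # advance the increment to 2*a*(n+1) + (a + b)
    return values


direct = [3 * n * n - 2 * n + 5 for n in range(10)]
print(quadratic_by_adders(3, -2, 5, 10) == direct)
```

Here `2 * a` is a constant that would be precomputed once, so each iteration costs two additions and no multiplications.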
Decrementor selects sample for next image focal point
Section decrementor indicates when to change constants
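Taken together, the adders, the sample decrementor, and the section decrementor can be modelled in software roughly as below. This is a behavioural sketch under several assumptions, not a description of the actual hardware: deltas are kept as floats and rounded to the nearest sample, and each section is described by hypothetical constants `(a, b, c, n_deltas)` alongside a starting index. That parameterization happens to total nine values for two sections, but the slides do not itemize the nine constants.

```python
def select_samples(samples, start_index, sections):
    """Pick one sample per focal point from a stream of received samples.

    sections is a list of (a, b, c, n_deltas) tuples; within a section the
    delta to the next focal point is a*n^2 + b*n + c, stepped with adders.
    """
    sections = list(sections)
    selected = []
    position = float(start_index)   # running delay, in samples
    countdown = start_index         # sample decrementor
    sec = 0
    a, b, c, remaining = sections[sec]   # remaining: section decrementor
    delta, step = c, a + b
    done = False
    for sample in samples:
        while not done and countdown == 0:
            selected.append(sample)
            if remaining == 0:           # section exhausted: change constants
                sec += 1
                if sec == len(sections):
                    done = True
                    break
                a, b, c, remaining = sections[sec]
                delta, step = c, a + b
            new_position = position + delta
            countdown = round(new_position) - round(position)
            position = new_position
            delta += step                # adder-only quadratic update
            step += 2 * a
            remaining -= 1
        if done:
            break
        countdown -= 1
    return selected


# Toy stream where each sample equals its own index, so the output shows
# which indices were selected: constant deltas of 2, then constant deltas of 3.
print(select_samples(range(20), 0, [(0, 0, 2, 2), (0, 0, 3, 2)]))
```

Because the model only ever looks at the current sample and a few counters, it processes the input as a stream, which is the property the iterative algorithm is meant to enable.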
System Parameters

| Parameter | Value |
| --- | --- |
| Sub-apertures | 12 |
| Transmit Sources | 16 |
| Transmits per Frame | 192 |
| Transducers per Sub-aperture | 1,024 |
| Total Transducers | 12,288 |
| Storage per Transducer | 4,096 x 12 bits |
| Focal Points per Scanline | 4,096 |
| Image Depth | 6 cm |
| Image Angular Width | π/4 |
| Sampling Frequency | 40 MHz |
| Interpolation Factor | 4x |
| Interpolated Sampling Frequency (fs) | 160 MHz |
| Speed of Sound (tissue) | 1,540 m/s |
| Target Frame Rate | 1 fps |
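Several of these parameters are related by simple arithmetic, which the following sketch checks. Two of the calculations rest on assumptions that the slides do not state: that the 6 Tb/s peak figure quoted earlier corresponds to digitizing every transducer at the raw sampling rate simultaneously, and that the deepest echo travels straight down and back.

```python
# Values copied from the System Parameters slide.
SUB_APERTURES = 12
TRANSMIT_SOURCES = 16
TRANSDUCERS_PER_SUB_APERTURE = 1024
BITS_PER_SAMPLE = 12
SAMPLES_PER_TRANSDUCER = 4096
SAMPLING_HZ = 40e6
INTERPOLATION_FACTOR = 4
IMAGE_DEPTH_M = 0.06
SPEED_OF_SOUND_M_PER_S = 1540.0

transmits_per_frame = SUB_APERTURES * TRANSMIT_SOURCES              # 192
total_transducers = SUB_APERTURES * TRANSDUCERS_PER_SUB_APERTURE    # 12,288
interpolated_fs_hz = SAMPLING_HZ * INTERPOLATION_FACTOR             # 160 MHz

# Assumption: peak rate means all transducers are read out at once.
peak_bits_per_s = total_transducers * BITS_PER_SAMPLE * SAMPLING_HZ
# With multiplexing, only one sub-aperture is read out per transmit.
multiplexed_bits_per_s = TRANSDUCERS_PER_SUB_APERTURE * BITS_PER_SAMPLE * SAMPLING_HZ

# Assumption: straight-down round trip to the full image depth.
round_trip_samples = 2 * IMAGE_DEPTH_M / SPEED_OF_SOUND_M_PER_S * SAMPLING_HZ

print(f"transmits per frame: {transmits_per_frame}")
print(f"total transducers:   {total_transducers}")
print(f"peak data rate:      {peak_bits_per_s / 1e12:.2f} Tb/s")
print(f"per sub-aperture:    {multiplexed_bits_per_s / 1e12:.2f} Tb/s")
print(f"round-trip samples:  {round_trip_samples:.0f} of {SAMPLES_PER_TRANSDUCER} stored")
```

Under these assumptions the peak rate comes out just under 6 Tb/s, the per-transmit rate drops by the number of sub-apertures, and a straight-down round trip fits within the per-transducer storage.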
Image Quality Comparison
[Figure panels: Ideal, Our Design (12 bit), 11 bit]
Our design has negligible difference from ideal system

| Bits | Ideal | 14 | 13 | 12 | 11 | 10 |
| --- | --- | --- | --- | --- | --- | --- |
| CNR | 2.972 | 2.942 | 2.960 | 2.942 | 2.536 | 2.233 |
Simulations using Field II [Jensen ‘92, ‘95]
Power Analysis and Scaling
[Figure: power (W, axis 0 to 20) versus technology node (45, 32, 22, 16, 11), broken down into DRAM, Memory Interface, Network Wires, Accelerator, SRAM, ADC, Transducers]
Can meet 5W by 11nm node
Conclusions
• 3D die stacked Sonic Millip3De design is able to meet 5W power budget by 11nm
• Algorithm/HW co-design enables order-of-magnitude gains
– Power and output quality goals often in conflict
– Need guidance from domain experts to balance
• Architects have much to offer for application-specific system designs
Questions?
Special thanks to:
Brian Fowlkes, Oliver Kripfgans, Ron Dreslinski