research presentation
DESCRIPTION
Covers a summary of the research work I have done to date and what I am working on right now.
TRANSCRIPT
Nirav A. Desai [email protected]
MM-Wave Active Sensor: BPSK Spectrum can be seen in the Spectrum Analyzer
Nirav Desai
EE 5323: VLSI DESIGN 1 PROJECT
Course Instructor: Prof. Chris Kim
16-bit BRENT KUNG ADDER DESIGN in 45 nm CMOS
Nirav Desai
ID: 4280229
Department of Electrical and Computer Engineering
University of Minnesota
Brent Kung Adder Gate Level Diagram
1. Input Block with Pre Computation
[Figure: gate-level diagram of the input block. Gate sizes along each labeled chain, ending at the output buffers that drive the capacitive loads:]
Input Adder Chain 1: 1X, 1.224X, 1.097X, 3.883X, 10.1683X, load 36X
Input Adder Chain 2: 1X, 1.562X, 1.553X, 3.043X
Input Adder Chain 3: 1X, 1.23X, 1.108X
Input Adder Chain 4: 1X, 1.274X, 1.034X, 2.943X, 10.8506X, load 40X
Block outputs: Pi*Pi-1 and Gi + Pi*Gi-1
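The dot operator labeled in the figure, (Gi + Pi*Gi-1, Pi*Pi-1), can be exercised in software. The following is a minimal Python sketch of a Brent-Kung prefix tree built from that operator, not the project's netlist. The function name `brent_kung_add`, the `carry_in` argument and the power-of-two width restriction are assumptions made for illustration.

```python
def brent_kung_add(a, b, width=16, carry_in=0):
    """Return (sum, carry_out) using a Brent-Kung prefix tree.

    Illustrative sketch: width must be a power of two, carry_in must be 0 or 1.
    """
    assert width >= 1 and width & (width - 1) == 0, "width must be a power of two"
    # Pre-computation: per-bit generate (a AND b) and propagate (a XOR b).
    g = [((a >> i) & (b >> i)) & 1 for i in range(width)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]
    G, P = g[:], p[:]
    G[0] |= P[0] & carry_in  # fold the carry-in into bit 0

    def dot(i, j):
        # Dot operator: (G, P) at i becomes (Gi + Pi*Gj, Pi*Pj), with j below i.
        G[i] |= P[i] & G[j]
        P[i] &= P[j]

    # Up-sweep: combine spans of 2, 4, 8, ... bits.
    d = 1
    while d < width:
        for i in range(2 * d - 1, width, 2 * d):
            dot(i, i - d)
        d *= 2
    # Down-sweep: fill in the remaining prefixes.
    d = width // 4
    while d >= 1:
        for i in range(3 * d - 1, width, 2 * d):
            dot(i, i - d)
        d //= 2

    # Post-computation: Si = Pi XOR C(i-1), where C(i-1) is the prefix generate.
    carries = [carry_in] + G[:-1]
    total = 0
    for i in range(width):
        total |= (p[i] ^ carries[i]) << i
    return total, G[width - 1]
```

For example, `brent_kung_add(0xFFFF, 0x0001)` exercises the full carry chain, since the carry generated at bit 0 has to propagate through every higher bit.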
Brent Kung Adder Gate Level Diagram
2. Intermediate Dot Product Blocks
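[Figure: gate-level diagram of an intermediate dot product block, with Intermediate Adder Chain 1 and Intermediate Adder Chain 2 labeled. Gate size labels in the figure: 1X, 1X, 1X, 1X, 1.72X, 6X, 4X, 16X, 16X. Output buffers drive the capacitive loads.]
Block outputs: Pi*Pi-1 and Gi + Pi*Gi-1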
Brent Kung Adder Gate Level Diagram
3. Output Block for Post Computation
[Figure: gate-level diagram of the output block. Inputs: Ci-1 and Pi. Gate size labels: 1.182X, 1.117X. Output buffers drive the capacitive loads. Output: Si]
Brent Kung Adder Transistor Level Design
Inverter Design Optimization
• NMOS Width = 90 nm
• PMOS / NMOS Length = 50 nm
• Vdd = 1.1 V
• Current Averaged Over One Period of 2 ns
• Optimal PMOS Width = 165 nm
• βinverter = 165/90 = 1.833
• Sizing for NAND, NOR and XOR changed accordingly
Brent Kung Adder Transistor Level Design
1. Input Block with Pre Computation
Input Adder Block Chain 1
Gate Number   1        2          3        4          5          LOAD
Gate Name     BUFFER   INVERTER   NOR      INVERTER   NAND       LOAD
g value       1.000    1.000      1.646    1.000      1.352      36.000
f value       3.540    3.540      2.151    3.540      2.618648
b value       2.893    2.400      1.000    1.000      1.000      1.000
S Value       1.000    1.224      1.097    3.883      10.16831   36.000
Stage G = 2.225, Stage F = 36.000, Stage B = 6.943, Stage H = 556.248, Gate H = 3.540
Input Adder Block Chain 2
Gate Number   1        2          3        4        LOAD
Gate Name     BUFFER   INVERTER   XOR      NAND     LOAD
g value       1.000    1.000      1.893    1.295    13.748
f value       4.518    4.518      2.386    3.488
b value       2.893    2.400      1.780    1.000    1.000
S Value       1.000    1.562      1.553    3.043    13.748
Stage G = 2.451, Stage F = 13.748, Stage B = 12.359, Stage H = 416.510, Gate H = 4.518
Input Adder Block Chain 3
Gate Number   1        2          3        LOAD
Gate Name     BUFFER   INVERTER   NOR      LOAD
g value       1.000    1.000      1.646    3.941
f value       3.558    3.558      2.162
b value       2.893    2.400      1.000
S Value       1.000    1.230      1.108    3.941
Stage G = 1.646, Stage F = 3.941, Stage B = 6.943, Stage H = 45.038, Gate H = 3.558
Input Adder Block Chain 4
Gate Number   1        2          3        4        5          LOAD
Gate Name     BUFFER   INVERTER   XOR      NAND     INVERTER   LOAD
g value       1.000    1.000      1.893    1.295    1.000      40.000
f value       3.686    3.686      1.947    2.847    3.686447
b value       2.893    2.400      1.000    1.000    1.000      1.000
S Value       1.000    1.274      1.034    2.943    10.85056   40.000
Stage G = 2.451, Stage F = 40.000, Stage B = 6.943, Stage H = 680.832, Gate H = 3.686
Logical Effort Design for Signal Chains labeled in previous slide #2
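The numbers in the chain tables can be reproduced with a short logical effort calculation. The sketch below is an illustration rather than the author's spreadsheet: the function name `size_chain` is hypothetical, and it assumes the first gate is sized 1X, that "Stage H" is the product of the stage G, F and B entries, and that each S value is the size multiple of the gate.

```python
import math


def size_chain(g, b, load):
    """Size a gate chain with logical effort (illustrative sketch).

    g: logical effort of each gate, b: branching effort at each gate output,
    load: final load in multiples of the 1X input capacitance.
    """
    n = len(g)
    path_g = math.prod(g)               # "Stage G" in the tables
    path_b = math.prod(b)               # "Stage B"
    path_effort = path_g * path_b * load  # "Stage H" (input gate assumed 1X)
    stage_effort = path_effort ** (1.0 / n)  # "Gate H"
    sizes = [0.0] * n
    next_cin = load  # capacitance presented by the following stage
    for i in reversed(range(n)):
        # Each gate is sized so that it bears the same stage effort.
        sizes[i] = next_cin * b[i] / stage_effort
        next_cin = g[i] * sizes[i]
    return {"path_effort": path_effort, "stage_effort": stage_effort, "sizes": sizes}


# Input Adder Block Chain 1: BUFFER, INVERTER, NOR, INVERTER, NAND driving 36X.
chain1 = size_chain(
    g=[1.000, 1.000, 1.646, 1.000, 1.352],
    b=[2.893, 2.400, 1.000, 1.000, 1.000],
    load=36.0,
)
```

With the Chain 1 inputs this returns a stage effort close to 3.54 and sizes close to the S Value row of the first table.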
Brent Kung Adder Transistor Level Design
2. Intermediate Dot Product Blocks
Logical Effort Design for Signal Chains labeled in previous slide #3
Intermediate Adder Block Chain 1
Gate Number   1          2        LOAD
Gate Name     INVERTER   NAND     LOAD
g value       1.000      1.352    1.000
f value       2.848      2.107    2.848
b value       1.000      1.000    1.000
S Value       1.000      2.107    6.000
Stage G = 1.352, Stage F = 6.000, Stage B = 1.000, Stage H = 8.112, Gate H = 2.848
Intermediate Adder Block Chain 2
Gate Number   1        2        LOAD
Gate Name     BUFFER   NAND     LOAD
g value       1.000    1.352    2.848
f value       2.775    2.053
b value       2.000    1.000
S Value       1.000    1.026
Stage G = 1.352, Stage F = 2.848, Stage B = 2.000, Stage H = 7.701, Gate H = 2.775
Brent Kung Adder Simulated Performance
Voltage (V)   Delay Max-C14 (ns)   Power Max (mW)   Power-Delay Product (pJ)
1.1           0.359                6.73             2.41
0.9           0.503                2.95             1.483
0.7           0.937                0.924            0.865
Simulations with maximally sized 1 stage buffers as determined by Logical Effort Design of individual chains
Voltage (V)   Delay Max-C14 (ns)   Power Max (mW)   Power-Delay Product (pJ)
1.1           0.403                5.186            2.089
0.9           0.569                2.277            1.295
0.7           1.069                0.692            0.739
Simulations with minimally sized 1 stage buffers
Without parasitic extraction and interconnect parasitics, buffering does not improve performance significantly.
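The power-delay products in both tables follow directly from the delay and power columns (ns times mW gives pJ). The short Python check below recomputes them and compares the two buffer options. The constant and function names are hypothetical, and the data are copied from the tables above.

```python
# (Vdd in V, delay in ns, power in mW, reported power-delay product in pJ)
MAX_BUFFERS = [(1.1, 0.359, 6.73, 2.41), (0.9, 0.503, 2.95, 1.483), (0.7, 0.937, 0.924, 0.865)]
MIN_BUFFERS = [(1.1, 0.403, 5.186, 2.089), (0.9, 0.569, 2.277, 1.295), (0.7, 1.069, 0.692, 0.739)]


def power_delay_product(delay_ns, power_mw):
    """ns * mW = 1e-12 J, so the result is in pJ."""
    return delay_ns * power_mw


def compare(max_rows, min_rows):
    """Return (Vdd, delay saved in %, extra power in %) for the larger buffers."""
    rows = []
    for (vdd, d_max, p_max, _), (_, d_min, p_min, _) in zip(max_rows, min_rows):
        delay_saved = 100.0 * (d_min - d_max) / d_min
        extra_power = 100.0 * (p_max - p_min) / p_min
        rows.append((vdd, delay_saved, extra_power))
    return rows


if __name__ == "__main__":
    for vdd, delay_saved, extra_power in compare(MAX_BUFFERS, MIN_BUFFERS):
        print(f"{vdd} V: {delay_saved:.1f}% faster for {extra_power:.1f}% more power")
```

On these figures the larger buffers reduce the delay by roughly a tenth while raising power by a noticeably larger fraction, which is consistent with the power-delay product being lower for the minimally sized buffers at every supply voltage.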
Brent Kung Adder Worst Case Delay
Input Pattern: A: FFFF B: 0000 -> 0001
Dotted Lines show Carry Bits 15 and 14
Carry Bit 15 Carry Bit 14
Brent Kung Adder Layout
Input Block with Pre Computation
Input Inverters for Bit 0 and Bit 1
Output Buffers: PEX waveforms show larger size may be needed
XOR
NAND 10X
Brent Kung Adder Layout
NAND 10.57X Layout with interdigitated fingers to reduce parasitics
Brent Kung Adder Layout
Intermediate Dot Product Generator
Output Buffers: PEX waveforms show larger size may be necessary here
Future Design Modifications
• The design uses large buffers at the output of every stage to drive large capacitances.
• The buffers are not needed at nodes with low fanouts and can be eliminated.
• The buffers at input nodes right now cause more power consumption and add to the delay.
• Thus the overall performance can be improved with fewer buffers.
References:
Course Slides from Prof. Kia Bazargan’s Course on VLSI
A Taxonomy of Parallel Prefix Networks (David Harris), reference paper on the course website
Digital Integrated Circuits by Jan Rabaey
SRAM DESIGN PROJECT PHASE 2
Nirav Desai
4280229
VLSI DESIGN 2: Prof. Kia Bazargan
Dept. of ECE
College of Science and Engineering
University of Minnesota, Twin Cities
SRAM CELL READ AND WRITE MARGIN FROM BUTTERFLY CURVE
• NMOS inverter = 110 nm, PMOS inverter = 220 nm, NMOS Access = 90 nm
• NMOSinv/NMOSaccess = 1.2, PMOSinv/NMOSaccess = 2.4
• Cbitline = 0.747 fF for 512 cell array (Interconnect Parasitics from ASU PTM Website)
SRAM CELL READ AND WRITE MARGIN FROM BUTTERFLY CURVE
• NMOS inverter = 150 nm, PMOS inverter = 555 nm, NMOS Access = 180 nm
• NMOSinv/NMOSaccess = 1.2, PMOSinv/NMOSaccess = 3, Cbitline = 0.747 fF
• Curve shows SRAM cell is close to write failure.
• Bitline precharge to less than 1.1 V could be explored to increase SNM.
Simulation Setup
• M0, M1, M3, M4 form the cross coupled inverter pair
• M5, M6 are access transistors
• C1, C2 are the bitline capacitances
• M7 is the precharge switch for bitline (bit); V3 precharges the bitline to 0.8V
• V6 precharges bitbar and writes a 0 to the cell
[Figure: simulation schematic with labeled nodes V(write), V(ic), V(word), V(qbar), V(q), V(bitbar), V(bit)]
Timing Waveforms for Characterization
V(write) – Applied to source of M7 (precharge switch)
V(word) – Wordline Voltage
V(qbar)
V(q)
V(ic) – Enables the precharge switch M7
V(bitbar)
V(bit)
• V(write) precharges Cbit to 0.8V via M7
• V(word) disables access transistors M5 and M6 during precharge.
• V(qbar) and V(q) are used to generate the butterfly curves.
• V(ic) enables M7 during precharge. It could be implemented as NOT(V(word)).
• V(bitbar) precharges to 0.8V, shows charge pumping when M7 turns off and follows V(qbar) when wordline is enabled.
• V(bit) follows V(q) after word line is enabled.
• V(bit) precharged to Vdd by V6
PASS TRANSISTOR BASED TREE DESIGN
1:8 Row Decoder Tree
Similar Tree Decoder for 16 LSB Bits
PASS TRANSISTOR BASED TREE DESIGN
[Figure: pass gate between IN and OUT with two CK clock inputs; W = 880, L = 50]
Identical Sizing for NMOS and PMOS to minimize charge injection effects
• Delay drops by ~40ps/2 for every doubling of transistor widths
• Delay drop saturates around 1000 nm, at 89 ps
• Used W/L of 880/50 for final tree
TREE DECODER TIMING DIAGRAMS
The following waveforms were applied to the row and column selection inputs of the tree decoder
TREE DECODER TIMING DIAGRAMS
It takes one cycle for initializing the tree decoder after which we get clean pulses for each row output
LSB pulse is wider than MSB pulse in bottom figure to allow the tree to clear present state before next
TREE DECODER TIMING DIAGRAMS
The top waveform shows the matrix point output where the row and column select inputs are high.
The output node discharges when the input goes low.
READ WRITE CIRCUIT ( Design by Bong Jin )
Sense Amplifier Write Driver
Precharge Circuit
READ WRITE CIRCUIT TEST SETUP
Bitline Capacitance estimate from ASU PTM Website
Cbit estimate for 512 rows
NMOS Switches to allow read without disabling write circuit
Single SRAM Cell for simulations
READ / WRITE TIMING WAVEFORMS
Precharge Pulse ( Active Low )
Data Meant to be written to cell
Write Enable Pulse
Read Enable Pulse
Output of Write Buffer
Disable output buffer ( tristate logic )
Bitline
Bitline Bar
Data Output
Data Out Bar
2X2 SRAM Array Layout
VDD
GND
GND
WORD 1
WORD 0
B0 B0BAR B1 B1BAR
This unit can be replicated in all directions without any changes. LVS check remaining.
Array Size = 3.7975 um x 2.4725 um
References
Digital Integrated Circuits
Jan Rabaey, Anantha Chandrakasan, Borivoje Nikolic
( SRAM Cell Design, Decoders, Read Write Circuits )
CMOS VLSI Design by Weste and Harris
( Butterfly Curves )
CMOS Circuit Design, Layout and Simulation
Baker, Li, Boyce (Decoder Design)
Course slides of Prof. Kia Bazargan
( Precharge Techniques, Decoders, SRAM Cell Design )
System Diagram for developing LMS Algorithm for Channel Estimation ( H(z) )
Errors e1 and e2 (e2 being the quantized error) could have the same convergence if the channel model H(z) is adapted using an LMS model
Next few slides show regular LMS and modified LMS Error Convergence
Adaptive DSP Course by Prof. Keshab Parhi
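As a point of reference for the regular LMS case, the following is a minimal Python sketch of LMS channel estimation. It is not the simulation behind the slides: the channel taps, step size and function names are assumptions chosen for illustration, and the quantized error path e2 is not modeled.

```python
import random


def channel_output(x, h):
    """Pass the input x through an FIR channel with taps h."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]


def lms_identify(x, d, num_taps, mu):
    """Adapt FIR weights w so that the filter output tracks d (regular LMS)."""
    w = [0.0] * num_taps
    errors = []
    for n in range(len(x)):
        # Most recent num_taps input samples, zero padded at the start.
        u = [x[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        y = sum(wk * uk for wk, uk in zip(w, u))
        e = d[n] - y  # estimation error
        w = [wk + mu * e * uk for wk, uk in zip(w, u)]  # LMS weight update
        errors.append(e)
    return w, errors


if __name__ == "__main__":
    rng = random.Random(1)
    x = [rng.choice((-1.0, 1.0)) for _ in range(2000)]
    h = [0.5, -0.3, 0.1]  # hypothetical channel H(z)
    w, errors = lms_identify(x, channel_output(x, h), num_taps=3, mu=0.05)
    print(w, errors[-1])
```

The single error `e` drives every tap weight here, which is the property the modified LMS on the later slides changes.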
Error Convergence for regular LMS takes more time than the modified LMS
Modified LMS adapts all tap weights using different errors, computed using as many filter output estimates as the filter order. The assumption is that the optimum gradient direction for each tap weight is different and is given by the corresponding error.
Lattice predictors would be a more efficient way to do this than LMS, since each stage of a predictor is optimum for that order, unlike modified LMS where each tap weight is adapted in a suboptimal manner.
EEG Spectral Estimates for Pre-Ictal, Ictal and Post-Ictal Signal Sequences
Spectral Estimation for a low pass filtered impulse sequence using different techniques
Correlograms provide best Spectral Estimates for Low Pass Filtered Impulse Trains
EE 5364 / CS 5204: Advanced Computer Architecture
Final Course Project on Design of a Branch Predictor
Prepared by:
Nirav Desai 4280229
Amanda Skinner 3749048
Course Instructor: Prof. Pen-Chung Yew
Department of ECE
University of Minnesota, Twin Cities
Nirav Desai 4280229 ECE
Amanda Skinner 3749048 CS
Why Branch Predictor?
• Branch predictors improve the flow of the instruction pipeline
• As branch predictor accuracy increases, cache misses decrease for both data and instruction caches
Why Prefetching?
• As branch predictor accuracy increases, cache misses go down
• Prefetching and increasing cache size decrease cache misses
[Figure: Miss rate for the Mesa benchmark. Both the L1-Data and L2 cache associativities were changed] [4]
Reference Prediction Table [1]
• LA-PC (look-ahead program counter) runs ahead of PC and keeps track of load and store instructions
• RPT keeps track of previous reference addresses and strides for load and store instructions
• L2 Cache prefetching can be done by storing spill over data and instructions from L1 Cache blocks.
• Intel Core 2 Duo uses RPT for L1 Cache Prefetching and Loop Counter Local Branch Predictor
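The stride tracking described above can be sketched in a few lines of Python. This is a simplified illustration, assuming a two-state entry (stride confirmed or not) instead of the full state machine of the referenced paper. The class and field names are hypothetical.

```python
class ReferencePredictionTable:
    """Simplified stride prefetcher keyed by the load/store instruction address."""

    def __init__(self):
        # pc -> {"prev_addr": last address, "stride": last stride, "steady": bool}
        self.entries = {}

    def access(self, pc, addr):
        """Record a memory access and return an address to prefetch, or None."""
        entry = self.entries.get(pc)
        if entry is None:
            # First time this instruction is seen: no stride is known yet.
            self.entries[pc] = {"prev_addr": addr, "stride": 0, "steady": False}
            return None
        stride = addr - entry["prev_addr"]
        # The stride is trusted only after it repeats (assumed two-state rule).
        entry["steady"] = stride != 0 and stride == entry["stride"]
        entry["stride"] = stride
        entry["prev_addr"] = addr
        return addr + stride if entry["steady"] else None
```

A load that walks an array with a fixed stride of 8 bytes starts producing prefetch addresses on its third access, once the stride has been seen twice.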
Design of Branch Predictor
• Loop counter would give high accuracy on matrix multiplication
• Track all registers for the loop counter, since different interleaved threads may use different registers
• A loop counter error would imply a dynamic update of registers based on non-local values
• Tag registers giving repeated conditional branch errors in the Branch Decision Table
• Use the O-GEHL predictor for all tagged branches
• Using the loop counter and duplicate ALU will allow indexing long histories with limited geometric length
Branch Decision Table
Field                      Entered by
Branch Address             LA-PC
Predicted Direction        Loop Counter or O-GEHL
Predicted Branch Target    Duplicate ALU
Actual Direction           PC
Actual Branch Target       PC
Counters Used C(i)(j)      O-GEHL
Tag                        O-GEHL
if prediction != actual decision:
  Prediction computed by Loop Counter?
    Yes - Incorrect Duplicate Register Values
    Re-initialize Duplicate Register Stack
    Set LA-PC to PC
    After 2 successive errors make an entry in O-GEHL. Also tag the branch address in the Branch Decision Table to be used with O-GEHL.
  Prediction computed by O-GEHL?
    Yes - Run the update equation on counters listed in table
    Set LA-PC to PC
Loop Counter Branch Predictor
Op-Code: Bits 31:26, rs: Bits 25:21, rt: Bits 20:16
The loop counter looks at only the conditional branches: Op-Code = 4 (beq) OR Op-Code = 5 (bne). Can be extended to bgtz, blez.
Duplicate Register Flag == 0 ?
  Yes (First Conditional Branch): Copy Register Stack to Duplicate Register Stack (equivalent to initializing the duplicate register stack)
  No: Duplicate Register Stack Initialized
Set Register Flag for rs and rt = 1. These registers will be tracked by the Duplicate ALU.
Do addition and subtraction for all instructions having rs and rt with register flags set to 1.
Proceed to Branch Prediction Computation:
  Op code == 4 ? Check rs == rt ?
  Op code == 5 ? Check rs != rt ?
  Yes: Execute. Copy Off-Set from bits 15 to bit 0, Sign Extend Off-Set to bit 31 (Total 32 bits), Left Shift by 2 (to get Word Address), Add to PC+4 to get Branch Target Address
  No: Inc LA-PC by 4
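The branch prediction computation above can be written out as a short Python sketch. It assumes the duplicate register values are available as a plain list `regs` indexed by register number. The function names are hypothetical and only beq (4) and bne (5) are handled, as on the slide.

```python
def sign_extend16(value):
    """Sign extend a 16-bit field to a Python integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def decode_branch(instr):
    """Split a 32-bit MIPS I-type word into (opcode, rs, rt, offset)."""
    opcode = (instr >> 26) & 0x3F  # bits 31:26
    rs = (instr >> 21) & 0x1F      # bits 25:21
    rt = (instr >> 16) & 0x1F      # bits 20:16
    offset = instr & 0xFFFF        # bits 15:0
    return opcode, rs, rt, offset


def predict_branch(instr, pc, regs):
    """Return (taken, next_pc) for beq/bne using the tracked register values."""
    opcode, rs, rt, offset = decode_branch(instr)
    if opcode == 4:        # beq: taken when rs == rt
        taken = regs[rs] == regs[rt]
    elif opcode == 5:      # bne: taken when rs != rt
        taken = regs[rs] != regs[rt]
    else:
        raise ValueError("not a beq/bne instruction")
    if taken:
        # Sign extend, shift left by 2 and add to PC+4.
        next_pc = (pc + 4 + (sign_extend16(offset) << 2)) & 0xFFFFFFFF
    else:
        next_pc = (pc + 4) & 0xFFFFFFFF
    return taken, next_pc
```

For instance, the word `0x1109FFFE` encodes `beq $8, $9, -2`. With equal values in registers 8 and 9 and `pc = 0x00400000`, the sketch predicts taken with a target of `0x003FFFFC`.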
O-GEHL Branch Predictor[2]
[Figure: ten tables of counters C(i)(j). Table 1 holds C(1)(1) to C(1)(2), table 2 holds C(2)(1) to C(2)(4), table 3 holds C(3)(1) to C(3)(9), and so on up to table 10 with C(10)(1) to C(10)(266)]
History Lengths go in Geometric Progression given by L(i) = α^(i-1) L(1) + constant
Best Series found from experiments: 2, 4, 9, 12, 18, 31, 54, 114, 145, 266
Dynamic History length fitting with variable α also possible.
Sum = C(i)(j) + C(i+1)(k) + … + C(i+9)(l)
• j, k, l, ... are incremented on every unconditional branch.
• j increments are modulo 2, k increments are modulo 4, l increments are modulo 266.
• Each C(i)(j) is a 4 bit saturating counter that counts -8 to 7.
• Counter Update given by:
    if (p != out)
        if (branch == taken) c(i)(j)++
        if (branch != taken) c(i)(j)--
• Dynamic Threshold (θ) Fitting possible
• Threshold (θ) by default is 0.
    Sum > θ then p = taken
    Sum < θ then p = not taken
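The sum-and-threshold rule and the saturating counter update can be illustrated with a small Python sketch. It is a simplification and not the predictor from [2]: the class name, the table size and the index function (program counter XOR folded global history) are assumptions, and a sum equal to θ is treated as not taken because the slide leaves that case open.

```python
HISTORY_LENGTHS = [2, 4, 9, 12, 18, 31, 54, 114, 145, 266]  # series quoted above


def fold(value, bits):
    """XOR-fold an arbitrarily long history into `bits` bits."""
    out = 0
    while value:
        out ^= value & ((1 << bits) - 1)
        value >>= bits
    return out


class GehlSketch:
    def __init__(self, index_bits=6, theta=0):
        self.index_bits = index_bits  # hypothetical table size of 2**index_bits
        self.theta = theta
        self.tables = [[0] * (1 << index_bits) for _ in HISTORY_LENGTHS]
        self.history = 0  # global branch history, newest outcome in bit 0

    def _indices(self, pc):
        mask = (1 << self.index_bits) - 1
        indices = []
        for length in HISTORY_LENGTHS:
            h = self.history & ((1 << length) - 1)  # last `length` outcomes
            indices.append(((pc >> 2) ^ fold(h, self.index_bits)) & mask)
        return indices

    def predict(self, pc):
        total = sum(t[i] for t, i in zip(self.tables, self._indices(pc)))
        return total > self.theta  # Sum > θ means taken

    def update(self, pc, taken):
        """Train on the actual outcome and return the prediction that was made."""
        predicted = self.predict(pc)
        if predicted != taken:
            for table, i in zip(self.tables, self._indices(pc)):
                # 4-bit saturating counters in the range -8 to 7.
                table[i] = min(7, table[i] + 1) if taken else max(-8, table[i] - 1)
        self.history = ((self.history << 1) | int(taken)) & ((1 << 266) - 1)
        return predicted
```

Training the sketch on a branch that is always taken makes it predict taken after a few mispredictions, once the counters indexed by the saturated history become positive.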
Duplicate ALU ( for MIPS )[3]
[Figure: the LA-PC address fetches an instruction into the Duplicate Instruction Queue. The Decode Unit splits it into Op Code (bits 31-26), Reg 1 (bits 25-21), Reg 2 (bits 20-16) and Reg 3 (bits 15-11), then compares the Op-Code]
Compare Op-Code:
• Op-Code == 4 OR 5 (beq, bne): Use Loop Counter
• Op-Code == 2 OR 3 (jump, jal): Always take
• Op-Code == 0 & FUNCT == 8 OR 9 (jr, jalr): Always take
Branch Target for Jump (32 bits):
• bits 31:28: 4 MSB bits of current PC+4
• bits 27:2: Jump Target from instruction
• bits 1:0: 00 (Word Addresses)
Branch Target for Branch (32 bits): Current PC + 4 + bits 15:0 left shifted by 2 to give word addresses
Compare Register Flags for reg1, reg2, reg3. If register flags set, do the computation for:
• Op-Code: 0, bits(5:0) 32: add r1, r2, r3
• Op-Code: 0, bits(5:0) 34: sub r1, r2, r3
• Op-Code: 0, bits(5:0) 33: addu r1, r2, r3
• Op-Code: 0, bits(5:0) 35: subu r1, r2, r3
• Op-Code: 8: addi r1, constant
• Op-Code: 9: addiu r1, constant
• Set LA-PC Busy bit on instruction read
• When LA-PC updated by branch predictors, busy bit reset
• For arithmetic, reset busy bit after 2 cycles
• Instruction read when busy bit reset
• LA-PC different from that used in RPT
This branch predictor can be used on Multi Threaded CPUs
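The op-code comparison and the jump target assembly for the duplicate ALU can be sketched in Python as follows. The function names and the returned labels are hypothetical. The op-code and FUNCT values are the ones listed on the slide.

```python
def classify(instr):
    """Decide how the look-ahead logic treats a 32-bit MIPS instruction word."""
    opcode = (instr >> 26) & 0x3F  # bits 31:26
    funct = instr & 0x3F           # bits 5:0
    if opcode in (4, 5):           # beq, bne
        return "loop_counter"
    if opcode in (2, 3):           # j, jal
        return "always_taken"
    if opcode == 0 and funct in (8, 9):  # jr, jalr
        return "always_taken"
    return "not_branch"


def jump_target(instr, pc):
    """Assemble the 32-bit target of a j/jal instruction."""
    upper = (pc + 4) & 0xF0000000        # bits 31:28 from PC+4
    lower = (instr & 0x03FFFFFF) << 2    # bits 27:2 from the instruction, bits 1:0 = 00
    return upper | lower
```

As a usage example, the word `0x08100008` is a jump whose 26-bit field is `0x100008`. At `pc = 0x00400000` the assembled target is `0x00400020`.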
Test results on O-GEHL Branch Predictor[5]
References
1. An Effective On-Chip Preloading Scheme to Reduce Data Access Penalty. Jean-Loup Baer, Tien-Fu Chen. Department of Computer Science and Engineering, University of Washington, Seattle, WA 98195. Supercomputing '91: Proceedings of the 1991 ACM/IEEE Conference on Supercomputing.
2. The O-GEHL Branch Predictor. Andre Seznec. The 1st JILP Championship Branch Prediction Competition, CBP1 (2004). Available from www.jilp.org
3. Computer Organisation and Design: The Hardware-Software Interface. David Patterson and John Hennessy.
4. http://en.wikipedia.org/wiki/CPU_cache
5. Analysis of the Optimized GEHL Predictor Andre Seznec Available from: http://www.irisa.fr/caps/people/seznec/ISCA05.pdf
Strained Silicon on SiGe Solar Cell
• Requires Chemical Vapor Deposition or MBE techniques for fabrication
• Tandem Solar Cell design gives a wide band of absorbable frequencies with different band gaps.
• Optimal thickness at quarter wavelength will give maximum absorption at designed frequency
• Back plate metal contacts and top plate fingered contacts
• Economically viable for charging battery packs in electric vehicles and for replacing LPG cooking gas cylinders.
• Long term viability for power generation feasible due to low operating costs and low distribution costs in a distributed model.
• Reference: Usami, N.; Takahashi, T.; Fujiwara, K.; Ujihara, T.; Sazaki, G.; Murakami, Y.; Nakajima, K. (Inst. for Mater. Res., Tohoku Univ., Sendai, Japan). Si/multicrystalline-SiGe heterostructure as a candidate for solar cells with high conversion efficiency. Conference Record of the Twenty-Ninth IEEE Photovoltaic Specialists Conference, 19-24 May 2002, pp. 247-249.
Rake Receiver with MDS Codes
• Rake receivers could be used to identify strongest multi path component from a received signal.
• This could be achieved by correlating the received signal with itself over different delays and finding the strongest delay component.
• This does not involve maximal ratio combining.
• It could be combined with MDS codes for wireless communications where given any d bits corrupted by channel noise or multi path effects, the signal could still be recovered uniquely.
• Reference: Lectures of Prof. Cutter on iTunesU under the course on Digital Communications 2 taught at MIT.
• Reference: W-CDMA Rake Receiver implementation in DSP: EE Times: Link: http://www.eetimes.com/electronics-news/4139933/W-CDMA-RAKE-Receiver-Comes-to-Life-in-DSP
• Reference: A Rake Receiver for Maximal Ratio Combining without Channel Estimation for UWB Communications: http://digitalcommons.unf.edu/cgi/viewcontent.cgi?article=1044&context=ojii_volumes
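The idea of correlating the received signal with itself over different delays can be tried on a synthetic two-path signal. The Python sketch below is only an illustration under assumed conditions: a random ±1 symbol stream, one echo with a hypothetical delay and gain, and no noise. The function names are not from any referenced implementation.

```python
import random


def autocorrelation(r, lag):
    """Correlate the received sequence with a copy of itself delayed by `lag`."""
    return sum(r[n] * r[n + lag] for n in range(len(r) - lag))


def strongest_delay(r, max_lag):
    """Return the non-zero lag with the largest correlation magnitude."""
    return max(range(1, max_lag + 1), key=lambda lag: abs(autocorrelation(r, lag)))


def make_multipath(num_samples, delay, gain, seed=0):
    """Direct path plus one delayed echo of a random +/-1 sequence (test signal)."""
    rng = random.Random(seed)
    x = [rng.choice((-1.0, 1.0)) for _ in range(num_samples)]
    return [x[n] + (gain * x[n - delay] if n >= delay else 0.0) for n in range(num_samples)]


if __name__ == "__main__":
    received = make_multipath(2000, delay=5, gain=0.6)
    print(strongest_delay(received, max_lag=20))
```

Because the symbols are uncorrelated with each other, the correlation is large only at the lag that matches the echo delay, so the search picks out that multipath component without any combining step.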
Class S RF Power Amplifiers on GaN HEMTs
• Class S RF Power Amplifiers with fully differential H-Bridge topology could give a theoretical 100% efficiency.
• GaN HEMTs give the best high frequency switching characteristics.
• The 2 features could be combined to give a high efficiency RF power amplifier topology.
• Reference: Ph.D. Dissertation of Stephan Maroldt, University of Freiburg
Microprocessor Design
• The attached slides describe the design of a 16-bit Brent Kung Adder and a 1024x16 asynchronous SRAM in 45 nm CMOS, along with the design of a branch predictor and cache prefetch unit for a MIPS microprocessor.
• These design ideas could be combined with other ideas for pipeline design, ALU design and interconnect circuit design to give a full physical layer design of a MIPS microprocessor in 45 nm CMOS.
• Various power reduction and clock gating techniques could be applied at a higher level of the hierarchy.
mm-wave MIMO OFDM
• mm-wave MIMO OFDM could be used for wireless backhaul networks due to its high capacity
• mm-wave MIMO systems could be extended to 2x2, 4x4, 8x8, etc topologies to exploit spatial diversity and get higher data rate.
• Reference: Sheldon, C.; Munkyo Seo; Torkildson, E.; Rodwell, M.; Madhow, U. (Dept. of Electr. & Comput. Eng., Univ. of California, Santa Barbara, CA, USA). 4 channel spatial multiplexing over a mm-wave line of sight link. IEEE MTT-S International Microwave Symposium Digest (MTT '09), 7-12 June 2009, pp. 389-392.
Routing algorithm to reduce congestion
• The routing algorithm to reduce congestion could be based on the idea of sparsity.
• High congestion nodes could be dropped from the network map till congestion on the node drops.
• The underlying packet streams would be using a flow control based routing protocol.
• Each node would store a map of the network which would be updated periodically using ping back messages.
• Could be applied to packet switched networks, traffic control and wireless sensor networks.
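One way to picture the node-dropping idea is a shortest-path search over the stored network map with congested nodes temporarily removed. The Python sketch below is an assumption-laden illustration: the function name `route`, the congestion values between 0 and 1 and the fixed threshold are all hypothetical, and breadth-first search stands in for whatever path metric a real protocol would use.

```python
from collections import deque


def route(graph, congestion, src, dst, threshold):
    """Shortest hop-count path from src to dst that avoids congested nodes.

    graph: node -> list of neighbours (the stored network map)
    congestion: node -> last reported load, e.g. from periodic ping back messages
    """
    # Drop nodes whose congestion exceeds the threshold (endpoints are kept).
    blocked = {
        node for node, load in congestion.items()
        if load > threshold and node not in (src, dst)
    }
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in parent and neighbour not in blocked:
                parent[neighbour] = node
                queue.append(neighbour)
    return None  # no path once the congested nodes are removed
```

When the reported load on a dropped node falls below the threshold again, the next call simply sees it in the map and may route through it.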
Photonic Computers
• These could use multiplexer based logic gates.
• Photonic multiplexers have been widely researched and developed for optical communications.
• Phase detectors could be used to identify the phase and thus the value of the stored signal.
• These would use electronic charge storage and high speed electro-optic conversion.
• Reference: Prior research on this has been carried out at UCSB.
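The claim that multiplexers are enough to build logic gates can be checked with a small truth-table sketch in Python. It models only the Boolean behavior of a 2:1 multiplexer, not any photonic device, and the function names are chosen for illustration.

```python
def mux(sel, a, b):
    """2:1 multiplexer: output a when sel is 0, b when sel is 1."""
    return b if sel else a


def not_gate(x):
    return mux(x, 1, 0)


def and_gate(x, y):
    return mux(x, 0, y)


def or_gate(x, y):
    return mux(x, y, 1)


def xor_gate(x, y):
    return mux(x, y, not_gate(y))
```

Since NOT, AND and OR are all expressible this way, any combinational function can in principle be composed from multiplexers alone.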