Jason Manley
Internal presentation:
Operation overview and drill-down
October 2007
System overview
Achievements to date
iBOB “F Engine” in detail
BEE2 “X Engine” in detail
Backend System in detail
Future developments
Discussion
8, 16 and 32 antenna dual-pol designs
Bandwidth <200MHz using iBOBs and BEEs
Full Stokes
2048 frequency channels
Control, initialisation and monitoring using Python scripts
100 Mbps Ethernet output with integration times > 16 seconds
Capture UDP output packets using Python code into Miriad format
Web interface for near real-time visualisation
FX architecture
Diagram: F Engine 0, F Engine 1, ... F Engine N-1 → 10GbE Switch → X Engine 0, X Engine 1, ... X Engine N-1
Diagram: four iBOBs, each hosting two F engines, connect through the 10GbE Switch to a BEE2 with four user FPGAs, each hosting two X engines.
F engine:
“Known good”: mirrors pocket correlator
Two F engines per iBOB
Dual polarization design
Currently uses combination of ASTRO and CASPER libraries
Major data flow components: ADC → DDC → Channelizer → Equalization → Reformat → X Engine
X engine:
Two X engines per BEE user FPGA
Uses CASPER library only
Major data flow components: F Engine → Pktize → 10GbE → Buffer → X Eng → Accum
Clocks:
X engines each run off independent clock
Sampling synchronized at F engines, but clock not distributed to X engines
Synchronized using global 1pps signal at ADCs
Propagated to X engines using out-of-band signaling on XAUI links
Headers labeling 10GbE Ethernet packet data
System control: separate 100Mbps Ethernet network
F engines configured through out-of-band signals on XAUI links
Control packets: UDP to Python server on BEE2 control FPGAs
Python scripts for configuration
ADC
Analogue input: 600 MHz, but up to 800 MHz
Input sample order: t7, t6, t5, t4, t3, t2, t1, t0
Output (four parallel lanes): t4, t0 | t5, t1 | t6, t2 | t7, t3
fout = fs/4 (normally 150 MHz)
8 bits × 4
Signed fixed point: 8.7
Numeric range: -1 to 1
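The ADC output format above can be sketched in a few lines of Python. This is only an illustration, assuming round-to-nearest quantisation with saturation; the function names `to_fix_8_7` and `demux4` are hypothetical and do not come from the design.

```python
import math

def to_fix_8_7(x):
    """Quantise a float to a signed 8-bit integer with 7 fractional bits.

    Assumption: round-to-nearest, saturating at the ends of the range.
    """
    raw = math.floor(x * 128 + 0.5)
    return max(-128, min(127, raw))

def demux4(samples):
    """Split a serial sample stream into four parallel lanes.

    Lane k carries samples k, k+4, k+8, ... so each lane runs at fs/4.
    """
    return [samples[k::4] for k in range(4)]

lanes = demux4(["t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7"])
```

Each lane then carries the sample pairs shown on the slide (t0 and t4, t1 and t5, and so on).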
DDC
Extracts a frequency band from the input signal
Input: Data: signed fix 8.7; Path: 32 bits
Decimation filter
Output: Data: 8 bits “I”, 8 bits “Q”; Path: 16 bits
Current setup: for fs = 600 MHz, selects output band = 75 to 225 MHz
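A digital down-converter of this kind can be sketched as a mixer followed by a decimating filter. The example below assumes the band centre sits at fs/4 (150 MHz for fs = 600 MHz, which is consistent with the 75 to 225 MHz band), so the local oscillator reduces to the sequence 1, -j, -1, j. The boxcar taps are a placeholder, since the slides do not give the real decimation filter coefficients, and `ddc_fs4` is a hypothetical name.

```python
def ddc_fs4(samples, taps, decim=4):
    """Mix a real signal down by fs/4, low-pass filter and decimate.

    Assumption: the local oscillator is exp(-j*pi*n/2), i.e. 1, -j, -1, j.
    """
    lo = (1, -1j, -1, 1j)
    mixed = [s * lo[n % 4] for n, s in enumerate(samples)]
    out = []
    # Evaluate the FIR only at every decim-th position (decimating filter).
    for start in range(0, len(mixed) - len(taps) + 1, decim):
        out.append(sum(t * mixed[start + k] for k, t in enumerate(taps)))
    return out

# A tone at exactly fs/4 should land at DC after mixing.
tone = [1, 0, -1, 0] * 16
baseband = ddc_fs4(tone, taps=[0.25] * 4)
```

With this input every output sample is the constant 0.5, which is the expected DC level of a unit-amplitude cosine after complex mixing.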
Channelizer
Stages (diagram): PFB FIR → downshift → FFT
Input: Data: 8 bits “I”, 8 bits “Q”; Path: 16 bits
PFB FIR: improves out-of-band rejection ratio
Current setup: 2048 channel, 4 tap PFB, Hamming window
Data: signed fix 18.17, complex; Path: 36 bits
Downshift: prevents overflow in first stage of FFT
Data: signed fix 18.15, complex; Path: 36 bits
Non-detrimental: effective signal resolution from PFB is 8 bits.
FFT: runtime configurable downshifting through each stage
Output: Data: signed fix 18.17, complex; Path: 36 bits
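The polyphase filter bank idea can be illustrated with a small floating-point model. This sketch assumes Hamming-windowed sinc coefficients and uses a naive DFT in place of the FFT; it ignores the fixed-point formats and the per-stage downshifting, and the names `pfb_coeffs` and `pfb_spectrum` are hypothetical. It is run here with 8 channels rather than 2048 to keep it quick.

```python
import cmath
import math

def pfb_coeffs(n_chan, n_taps):
    """Hamming-windowed sinc prototype filter of length n_chan * n_taps."""
    length = n_chan * n_taps
    coeffs = []
    for k in range(length):
        window = 0.54 - 0.46 * math.cos(2 * math.pi * k / (length - 1))
        u = (k - (length - 1) / 2) / n_chan
        sinc = 1.0 if u == 0 else math.sin(math.pi * u) / (math.pi * u)
        coeffs.append(window * sinc)
    return coeffs

def pfb_spectrum(block, n_chan, n_taps):
    """Turn n_chan * n_taps input samples into n_chan frequency bins."""
    h = pfb_coeffs(n_chan, n_taps)
    # FIR stage: weight the samples and sum the taps for each branch.
    summed = [
        sum(h[t * n_chan + c] * block[t * n_chan + c] for t in range(n_taps))
        for c in range(n_chan)
    ]
    # Transform stage: naive DFT standing in for the FFT.
    return [
        sum(summed[c] * cmath.exp(-2j * math.pi * b * c / n_chan)
            for c in range(n_chan))
        for b in range(n_chan)
    ]

spectrum = pfb_spectrum([1 + 0j] * 32, n_chan=8, n_taps=4)
```

For a constant input almost all of the power ends up in bin 0, which is the out-of-band rejection behaviour the FIR stage is there to improve.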
Equalizer
Input (four signals: Ax, Ay, Bx, By): Data: signed fix 18.17, complex; Path: 36 bits (x4)
Multiplies each frequency by a 17.3 bit scale factor (BRAM lookup table). Can be used to correct system frequency response irregularities at runtime.
After multiplication: Data: signed fix 35.20, complex
Decimation: selects 4 bits, with saturating rounding
Numeric range: -0.875 to +0.875
Output: Data: signed fix 4.3, complex; Total path: 32 bits
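The scale-then-requantise step can be modelled as follows. This is a sketch under assumptions: the gain is passed as a plain float rather than a 17.3 fixed-point value, rounding is to nearest, and saturation is symmetric at ±7/8 to match the stated numeric range. The name `equalize` is hypothetical.

```python
import math

def equalize(value, gain):
    """Scale one complex channel sample and requantise it to 4.3 format."""
    def quantise(x):
        raw = math.floor(x * 8 + 0.5)        # 3 fractional bits
        return max(-7, min(7, raw)) / 8      # saturate to -0.875 .. +0.875
    return complex(quantise(value.real * gain), quantise(value.imag * gain))

sample = equalize(0.1 + 0.5j, 2.0)
```

Here the real part rounds to 0.25 and the imaginary part, which would be 1.0, saturates at 0.875.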
Reformat (corner turner)
Input: Data: signed fix 4.3, complex; Total path: 32 bits
Input order (diagram): Ax, Ay, Bx, By for Ch 0, then Ch 1, ... Ch N-1
After corner turner: Data: signed fix 4.3, complex, dual pol; Total path: 32 bits
Divide by 2
Output to XAUI: Data: signed fix 4.3, complex, dual pol, four frequency chans; Total path: 64 bits
Output order (diagram): per channel, one word of Ax, Ay, Ax, Ay followed by one word of Bx, By, Bx, By (Ch 0, then Ch 1, ...); time labels t384, t256, t128, t0
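A corner turn is a transpose: spectra arrive one time step at a time, and the packet format in this deck carries many time samples of a single frequency. The sketch below shows only that reordering, with a hypothetical function name and no attempt to model the antenna or polarization interleaving.

```python
def corner_turn(spectra):
    """Transpose spectra[time][channel] into per_channel[channel][time]."""
    return [list(column) for column in zip(*spectra)]

# Three spectra of two channels each become two time series of three samples.
per_channel = corner_turn([[0, 1], [2, 3], [4, 5]])
```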
Packetizer
Functions: header generation, processing allocation, sync control, system reset, ant decode
Packet size: 32 x 64 bits payload + 64 bit hdr = 264 bytes (Jumbo packet: 1120 bytes)
Packet layout (each row is one 64-bit word, MSb on the left, LSb on the right):
Hdr: MCNT | ANT
Data (32 x 64 b):
f0t3 f0t2 f0t1 f0t0
f0t7 f0t6 f0t5 f0t4
… … … …
f0t127 f0t126 f0t125 f0t124
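The packet layout can be mocked up with Python's `struct` module. Several details here are assumptions, because the slide does not state them: the split of the header word into a 48-bit MCNT and a 16-bit ANT field, big-endian byte order, and treating each time sample as an opaque 16-bit value (dual pol, 4-bit complex). The name `pack_packet` is hypothetical.

```python
import struct

MCNT_BITS = 48  # assumed width; not stated on the slide
ANT_BITS = 16   # assumed width; not stated on the slide

def pack_packet(mcnt, ant, samples):
    """Build one 264-byte packet: a header word plus 32 data words."""
    if len(samples) != 128:
        raise ValueError("expected 128 time samples for one frequency")
    header = ((mcnt & ((1 << MCNT_BITS) - 1)) << ANT_BITS) | (
        ant & ((1 << ANT_BITS) - 1))
    words = [header]
    for i in range(0, 128, 4):
        word = 0
        # Earliest sample goes in the least significant 16 bits.
        for lane, sample in enumerate(samples[i:i + 4]):
            word |= (sample & 0xFFFF) << (16 * lane)
        words.append(word)
    return struct.pack(">33Q", *words)  # assumed big-endian on the wire

packet = pack_packet(mcnt=5, ant=3, samples=list(range(128)))
```

The result is 33 words of 8 bytes, which matches the 264-byte figure on the slide.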
Frequency distribution
Diagram: F Engine 0, F Engine 1, ... F Engine N-1 → 10GbE Switch → X Engine 0, X Engine 1, ... X Engine N-1
“N” antennas, “n” frequency channels
F engine packet stream:
t0: f0, f1, ... fN-1
t1: fN, fN+1, ... f2N-1
…
t(n/N - 1): fn-N, fn-N+1, ... fn-1
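The packet stream above suggests a round-robin mapping of frequency channels onto X engines. A minimal sketch, assuming channel k goes to X engine k mod N in time slot k div N (the slide implies this but does not state it outright); both function names are hypothetical.

```python
def route_channel(channel, n_xengines):
    """Return (x_engine, time_slot) for one frequency channel."""
    return channel % n_xengines, channel // n_xengines

def schedule(n_channels, n_xengines):
    """Group channels by time slot as lists of (channel, x_engine)."""
    slots = {}
    for ch in range(n_channels):
        xeng, slot = route_channel(ch, n_xengines)
        slots.setdefault(slot, []).append((ch, xeng))
    return slots

plan = schedule(n_channels=2048, n_xengines=8)
```

For 2048 channels and 8 X engines this gives 256 time slots, with each X engine processing 256 channels.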
10GbE
Blocks (diagram): F engine packet stream, 10GbE transceiver, 10GbE Ethernet, data unpack, loopback mux
Total packet size: 32 x 64 bit words + 64 bit hdr = 2112 bits, or 264 bytes (Giant packet: 1120 bytes)
Circular buffer
Diagram: eight buffer windows, numbered 0 to 7
1 packet: 1 freq or 128 words
Data inserted into position in buffer determined by MCNT in packet header.
Timeout if no packet received for 220 clocks.
Ship out a window when the first packet of the window ½ buffer ahead is received (i.e. ship window 1 when the first packet of window 5 is received).
Only accept packets with MCNT from ½ buffer size back to ¼ buffer size ahead (i.e. if already received up to packet 5, accept MCNTs for windows 2, 3, 4, 5 or 6). This prevents spurious locks.
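The acceptance and ship-out rules can be written as two small predicates. This sketch assumes an eight-window buffer, works on window indices rather than raw MCNT values, treats the ½ and ¼ limits as exclusive so that the result matches the slide's worked example, and ignores counter wrap-around. The function names are hypothetical.

```python
def accepts(newest, window, n_windows=8):
    """True if a packet for `window` is accepted when `newest` is current.

    Exclusive limits: less than half a buffer back, less than a quarter ahead.
    """
    delta = window - newest
    return -(n_windows // 2) < delta < n_windows // 4

def window_to_ship(window, n_windows=8):
    """Window released when the first packet of `window` arrives."""
    return window - n_windows // 2

accepted = [w for w in range(10) if accepts(5, w)]
```

With window 5 as the newest, windows 2 to 6 are accepted and window 1 is shipped, as in the example on the slide.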
X engine (streaming)
Streaming architecture assumes data valid on every clock.
Integration occurs
Each antenna input must thus be valid for integration_period clock cycles
Output must be filtered as duplicates occur
X engine (streaming): data flow
Input: Data: 4.3 bits, dual pol, complex; Path: 16 bits
Input order (diagram): antenna A (Ax, Ay) from t0, antenna B (Bx, By) from t128, …
Delay line: Z^-128, Z^-128, Z^-128
Accumulation for 128 clocks
Output: Data: 16.6 bits, complex, 4 terms; Path: 128 bits
Baseline table:
5 4 3 2 1
AA X X X X
BB AB X X X
CC BC AC X X
DD CD BD AD X
EE DE CE BE AE
FF EF DF CF BF
GG FG EG DG CG
HH GH FH EH DH
O AH AG AF AE
O O BH BG BF
O O O CH CG
O O O O DH
Output terms (read out direction):
t128: AxAx, AyAy, AxAy, AyAx
t256: AxBx, AyBy, AxBy, AyBx
t257: BxBx, ByBy, BxBy, ByBx
Simplification! See detail on last slide
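Functionally, the X engine forms four polarization products per baseline and accumulates them over 128 samples. The sketch below computes the same quantities directly instead of modelling the streaming tap pipeline. It assumes the second input of each product is conjugated, which the slide does not spell out, and `correlate` is a hypothetical name.

```python
from itertools import combinations_with_replacement

def correlate(antennas, n_acc=128):
    """Accumulate xx, yy, xy, yx products for every antenna pair.

    antennas maps a name to a list of (x, y) complex sample pairs.
    Assumption: the second antenna's sample is conjugated.
    """
    visibilities = {}
    for a, b in combinations_with_replacement(sorted(antennas), 2):
        acc = [0j, 0j, 0j, 0j]
        for (ax, ay), (bx, by) in zip(antennas[a][:n_acc], antennas[b][:n_acc]):
            acc[0] += ax * bx.conjugate()  # xx
            acc[1] += ay * by.conjugate()  # yy
            acc[2] += ax * by.conjugate()  # xy
            acc[3] += ay * bx.conjugate()  # yx
        visibilities[a + b] = tuple(acc)
    return visibilities

vis = correlate({"A": [(1 + 0j, 1j)] * 128, "B": [(1j, 1 + 0j)] * 128})
```

For eight antennas A to H this yields 36 unique baselines including autocorrelations, which is the set that remains once the duplicate entries in the baseline table are filtered out.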
X engine: re-order
Data: 16.6 bits, complex, 4 terms; Path: 128 bits
Stages: windowed buffering, data throttling
Baseline table: same as on the streaming X engine slide
Windowed baselines fed out every second clock:
t0: AxAx, AyAy, AxAy, AyAx
t2: AxBx, AyBy, AxBy, AyBx
t4: BxBx, ByBy, BxBy, ByBx
…
t71: CxHx, CyHy, CxHy, CyHx
Accumulator
Blocks (diagram): DRAM reformat → DRAM accumulator → shared BRAM
DRAM reformat input: Data: 16.6 bits, complex, 4 terms; Path: 128 bits
Increase number space to 32 bits
DRAM reformat output: Data: 32.6 bits, complex, 2 terms; Path: 128 bits
Integration length run-time configurable
Shared BRAM: Data: 32.6 bits, complex; Path: 32 bits
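A vector accumulator with a configurable integration length can be modelled in software as below. This is a behavioural sketch only: it says nothing about how the DRAM is addressed, and the class name and interface are hypothetical.

```python
class VectorAccumulator:
    """Sum fixed-length vectors and release the total every N additions."""

    def __init__(self, vector_len, integration_len):
        self.integration_len = integration_len
        self.total = [0] * vector_len
        self.count = 0

    def add(self, vector):
        """Add one vector; return the integrated vector when complete."""
        self.total = [t + v for t, v in zip(self.total, vector)]
        self.count += 1
        if self.count < self.integration_len:
            return None
        result = self.total
        self.total = [0] * len(result)  # start the next integration
        self.count = 0
        return result

acc = VectorAccumulator(vector_len=2, integration_len=3)
outputs = [acc.add([1, 2]) for _ in range(4)]
```

Only the third call returns data; the fourth starts a fresh integration.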
Backend system stages: Listen → Config → Start tx → Start rx → Display
Software registers on User FPGAs addressable by BORPH on Control FPGA
UDP Listener on BEE2 Control FPGA processor
Automated Python scripts for writing these registers
Special command “Start TX” begins dumping Shared BRAM output on separate UDP port
Receiver collects UDP output packets, buffers and writes to disk
Multiple files generated for storage, display and debugging
Web interface for plotting output data (useful for debugging)
Confirmed working using simulated correlator output data, generated on BEE processor
Listen
Python script executed on BEEs.
Allows programming of any software register on the BEE by name
Includes special functions, which can start/stop programs or scripts on the BEE
Start or Stop transmitting data
Globally program gains on all connected iBOBs
Config
Command-line parameterized
Sends packetized commands to listener on BEE
Arms iBOBs, sets FFT shifting schedule, sets iBOB EQ gains to defaults, sets accumulation length, sets antenna indices, IP addresses and ports.
Reads debug registers and snap blocks to confirm correct dataflow.
Attempts recovery through block reset and/or reprogram
Start tx
Receiving computer requests dump start
BEE2 control FPGA monitors shared BRAMs and reads out when full
Data is enclosed in a timestamped packet (determined using BORPH’s system time)
Header: 21 bytes: time, X engine number, 4B vector num, 4B flags, 4B payload length (historical)
Transmitted via UDP packet to pre-determined receiver
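The 21-byte output header can be expressed as a `struct` format. The slide fixes only the three 4-byte fields, so this sketch assumes the remaining 9 bytes are an 8-byte floating-point timestamp followed by a 1-byte X engine number, in big-endian order. Those choices and the function names are hypothetical.

```python
import struct
import time

# Assumed layout: 8B time (double), 1B X engine, 4B vector num,
# 4B flags, 4B payload length. The ">" prefix disables padding.
HEADER_FORMAT = ">dBIII"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)

def pack_output(xeng, vector_num, flags, payload, timestamp=None):
    """Prefix a payload with the timestamped header."""
    if timestamp is None:
        timestamp = time.time()
    header = struct.pack(HEADER_FORMAT, timestamp, xeng, vector_num,
                         flags, len(payload))
    return header + payload

def unpack_output(packet):
    """Split a packet into its header fields and payload."""
    fields = struct.unpack(HEADER_FORMAT, packet[:HEADER_SIZE])
    return fields, packet[HEADER_SIZE:]

fields, body = unpack_output(pack_output(2, 7, 0, b"\x01\x02", timestamp=1.5))
```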
Start rx
Receives packets, decodes header and appends to buffer (so packets must arrive in order)
If a header is out of order, the data is dumped as invalid
To be corrected with new C code
Requires system parameters passed on command line when executed
Ability to read source from UDP packets, files or std-in pipe (untested)
Generates 4 files:
Miriad UV,
Info file (n_chans, integration length , system gain etc),
Numpy database (Python) of last integration for plotting
Numpy database of raw data (for debugging)
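The in-order requirement of the receiver can be illustrated with a small generator. This is a sketch of the behaviour described, not the actual receiver: it assumes each integration consists of vectors numbered from 0, and the name `assemble` is hypothetical.

```python
def assemble(vectors, n_vectors):
    """Yield complete integrations from (vector_num, data) pairs.

    Any integration with an out-of-order vector is dumped as invalid.
    """
    buffer = []
    expected = 0
    for num, data in vectors:
        if num != expected:
            buffer, expected = [], 0   # dump the partial integration
            if num != 0:
                continue               # wait for the start of the next one
        buffer.append(data)
        expected += 1
        if expected == n_vectors:
            yield buffer
            buffer, expected = [], 0

stream = [(0, "a"), (1, "b"), (0, "c"), (2, "d"), (0, "e"), (1, "f")]
integrations = list(assemble(stream, n_vectors=2))
```

The integration that starts with "c" is discarded because vector 2 arrives where vector 1 was expected.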
Display
cgi script
10Gbps output gives sub 1 second integration times
High speed, scalable, distributed data capture software
Walsh codes and phase switching
64 antenna design
Upgrade to 4096 channels
ROACH hardware:
<550MHz bandwidth
16 384 channels
128 antennas with no architectural changes
casper_n/cn_i…
Currently in revision 3.02 testing, using ASTRO lib
Revision 4 will use CASPER library
casper_n/cn_b…
Revision 3.08 testing <- DEBUG!
Revision 3.07d stable
Revision 4 will have 10GbE output