FeeCom software during TPC commissioning (Benchmarks)
22-01-2007
Sebastian Bablok, Dag Toppe Larsen, Matthias Richter
Benjamin Schockert
Department of Physics and Technology, University of Bergen, Norway
Center for Telecommunication and Technology Transfer, University of Applied Science Worms, Germany
TOC
TPC commissioning DCS – FEE part
Setup overview
Observations
Conclusion
Benchmarks during commissioning
results
remarks
Future plans
Front-End-Electronics in DCS
Control and monitor channels
Cmd / ACK Channel
Service Channel
Message Channel
FED Server
FEE Client
InterComLayer
FeeServer
PVSS II (FED - Client)
FeeServer
FeeServer
Supervisory Layer
Control Layer
Field Layer
Front-End Device Interface (FED)
Front-End Electronics Interface (FEE)
HardwareDevice
HardwareDevice
HardwareDevice
Internal BusSystems
Load configuration data from file OR database
Config.DB
Config.File
Schematic layout for commissioning
Switch
Switch
tpcfee01 (ICL)
tpcfee02 (Test-FedClient)
PVSS (incl. FedClient)
6 DCS boards (FeeServer incl. TPC CE)
100MBit/s
10MBit/s
100MBit/s
External network
Internal network
DCS network setup
Based on standard protocols/tools: DHCP, DNS, NFS
DCS boards on private network 10.x.x.x
.feenet used as local TLD
Board number used for MAC and IP addresses (24 LSB) and hostname-alias (dcs<board#>.feenet)
Gateway running ICL provides communication with outside world
Hostname in format tpc-fee_x_yy_z.feenet, dcs<board#>.feenet as alias
FeeServer name set from hostname
FeeServer stored on and run from external NFS share
Logs written to NFS share
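The addressing scheme on this slide can be sketched as follows. This is only an illustration under stated assumptions: the slide says that the board number fills the 24 least significant bits of the MAC and IP addresses and forms the alias dcs<board#>.feenet, but the upper MAC bytes (MAC_PREFIX), the 10.0.0.0 base and the unpadded alias format used here are placeholders, not taken from the slides.

```python
import ipaddress

# Assumptions (not from the slides): the upper MAC bytes and the IP base
# are placeholders chosen for illustration only.
MAC_PREFIX = 0x020000  # hypothetical locally administered prefix
IP_BASE = int(ipaddress.IPv4Address("10.0.0.0"))  # private 10.x.x.x network


def board_addresses(board: int) -> dict:
    """Derive MAC, IP and hostname alias from a DCS board number."""
    if not 0 < board < 2**24:
        raise ValueError("board number must fit into 24 bits")
    # Board number occupies the 24 least significant bits of the MAC.
    mac_int = (MAC_PREFIX << 24) | board
    mac = ":".join(f"{(mac_int >> s) & 0xFF:02x}" for s in range(40, -8, -8))
    # Same 24 bits fill the host part of the 10.x.x.x address.
    ip = ipaddress.IPv4Address(IP_BASE | board)
    return {"mac": mac, "ip": str(ip), "alias": f"dcs{board}.feenet"}
```

With these placeholder values, board 258 would map to 10.0.1.2 and the alias dcs258.feenet.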
DCS bootup
MAC address set to board number
DCS board sends MAC address to DHCP server, requesting IP address and hostname
DHCP server looks up IP address for MAC address, then queries Domain Name Server for hostname matching IP-address
DHCP server returns IP configuration and hostname to DCS board
DCS board mounts two NFS shares – one RO and one RW
Boot-script run from RO shared directory
May start update scripts
Starts FeeServer with hostname as FeeServer name and log output written to RW share
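The lookup chain of the boot sequence (MAC address to IP address, IP address to hostname, hostname to FeeServer name) can be sketched with two tables standing in for the DHCP and DNS servers. Everything concrete here is hypothetical: the table contents, the log path, and the choice to strip the .feenet domain when forming the FeeServer name are assumptions for illustration.

```python
# Hypothetical lookup tables standing in for the DHCP and DNS servers.
DHCP_TABLE = {"02:00:00:00:01:02": "10.0.1.2"}
DNS_REVERSE = {"10.0.1.2": "tpc-fee_x_yy_z.feenet"}  # placeholder hostname
RW_SHARE = "/mnt/rw"  # hypothetical mount point of the read-write NFS share


def boot_config(mac: str) -> dict:
    """Resolve the configuration a DCS board would receive at boot."""
    ip = DHCP_TABLE[mac]           # DHCP server looks up IP for the MAC
    hostname = DNS_REVERSE[ip]     # then queries DNS for the matching hostname
    # Assumption: the FeeServer name is the hostname without the local TLD.
    server_name = hostname.split(".")[0]
    return {
        "ip": ip,
        "hostname": hostname,
        "feeserver_name": server_name,
        "log_file": f"{RW_SHARE}/{server_name}.log",
    }
```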
Cables
DCS-side:
Uses non-standard connector without any locking
May easily fall out
Connectors are glued together, cable attached to cooling plate using cable ties
Switch-side:
Standard ethernet connector
Connectors not well made/attached, bad contact
Had to be re-crimped
Are still sensitive to twisting when plugged into switch/patch panel
Network problems during commissioning
Some boards were unreachable via the network: 90% packet drop
Switch indicated 100Mb/s – not 10 as expected
Most boards affected, but some always, some rarely
However: a short power cycle seemed to help?
Turned out there was a bug in the kernel driver: autonegotiation not always enabled on boot
Ethernet interface switched to 100Mb/s operation
The electronics between ethernet chip and cable on DCS board does not support this because of modifications due to the strong magnetic field
Only a few packets got through
After kernel update, problems gone
Temperature measurements
• All FECs have temperature sensors
– If temperature too high, electronics may be damaged
– The FeeServer will export temperatures to higher layers
– High temperatures will cause electronics to be switched off
• During commissioning temperature was written continuously to log files
– A temperature cross section for each partition was plotted for every 12th hour
– No alarming temperatures were seen
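The log evaluation described above could look roughly like the sketch below. The slides do not give the log format or the alarm limit, so the record layout (hour, partition, fec, temperature) and the LIMIT_C value are purely hypothetical.

```python
from collections import defaultdict

LIMIT_C = 50.0          # hypothetical alarm threshold, not from the slides
SNAPSHOT_PERIOD_H = 12  # one cross section for every 12th hour


def cross_sections(records):
    """Group (hour, partition, fec, temp) records into 12-hour snapshots.

    Returns {(hour, partition): {fec: temp}} for readings taken at whole
    multiples of the snapshot period; other readings are skipped.
    """
    sections = defaultdict(dict)
    for hour, partition, fec, temp in records:
        if hour % SNAPSHOT_PERIOD_H == 0:
            sections[(hour, partition)][fec] = temp
    return dict(sections)


def alarms(sections, limit=LIMIT_C):
    """List ((hour, partition), fec) entries whose temperature exceeds limit."""
    return sorted(
        (key, fec)
        for key, temps in sections.items()
        for fec, temp in temps.items()
        if temp > limit
    )
```

An empty result from alarms() corresponds to the observation that no alarming temperatures were seen.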
Software
Mostly OK
InterComLayer/FeeServers interplay is working
FeeServers sometimes “disappear” from DID, but not from ICL. It seems like they are running, but not in a working state
FeeServers sometimes do not publish services – registration timeout
FeeServers crash (and restart) when FECs are turned on and off via DDL
The kernel update took care of most other problems (“impossible” to get all DCS boards running without “dirty tricks”)
Commissioning conclusion
Network based configuration worked as planned
Some initial network problems, OK after kernel update
No alarming electronics temperatures seen
Some minor FeeServer issues
Ethernet cables must be handled with care
Benchmarks during TPC commissioning
Benchmark done with one patch and a complete slice of the TPC
Benchmark test performed on TPC side 0 (a), slice 13 (single cast on patch 0)
Setup:
6 FeeServers with TPC ControlEngine (CE)
Switch: NETGEAR 7300S Series Layer 3 Managed Switch
InterComLayer on P4 (3.4GHz, dual core, 512 MB RAM, SLC 3)
FedClient implementation for testing purposes on a different machine
Setup during commissioning and benchmark tests
Switch
Switch
tpcfee01 (ICL)
tpcfee02 (Test-FedClient)
PVSS (incl. FedClient)
6 DCS boards (FeeServer incl. TPC CE)
100MBit/s
10MBit/s
100MBit/s
Components used during benchmark
Cmd / ACK Channel
FED Server
FEE Client
InterComLayer
PVSS II (FED - Client)
FeeServer/ CE
Supervisory Layer
Control Layer
Field Layer
Front-End Device Interface (FED)
Front-End Electronics Interface (FEE)
Load configuration data from file
Config.File
FeeServer/ CE
FeeServer/ CE
Benchmarks layout
Issued command:
Switching on / off of all Front-End-Cards of the patch
command size: 12 Byte (+ 12 Byte of FeePacket header = 24 Byte)
CE was emulating the execution of “switch on/off FEC” command
Sent as:
Singlecast and Broadcast for a complete slice
from Test-FedClient and from PVSS
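A measurement loop of the kind described on this slide (issue the same command repeatedly, record the time until the ACK arrives, then report average, max and min) might be sketched as below. The send_and_wait callable is a hypothetical stand-in for the Test-FedClient call; it is not part of the FeeCom software.

```python
import time


def benchmark(send_and_wait, repetitions=100):
    """Time repeated command round trips.

    send_and_wait is a hypothetical callable that issues one command and
    returns True once the ACK arrived, or False if the ACK was lost.
    Returns (average, max, min, lost) with times in seconds.
    """
    durations, lost = [], 0
    for _ in range(repetitions):
        start = time.perf_counter()
        acked = send_and_wait()
        elapsed = time.perf_counter() - start
        if acked:
            durations.append(elapsed)
        else:
            lost += 1
    if not durations:
        return None, None, None, lost
    average = sum(durations) / len(durations)
    return average, max(durations), min(durations), lost
```

Lost ACKs are counted separately so that they do not distort the timing statistics.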
Benchmark results during TPC commissioning
SingleCast ControlFero command:
time period [sec]                          average     max        min
Command in FedServer – ACK in FeeClient    0.358162    1.092122   0.243506
SEND – ACK in FeeClient                    0.3574644   1.091613   0.243026
Process time in ICL                        0.000698    0.000999   0.00048
FeeServer computing                        0.1118      0.84       0.02
Annotations:
command issued 100 times
no lost ACKs
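As a consistency check on the singlecast averages, the short calculation below compares the reported ICL processing time with the difference of the two round-trip rows and works out how the total splits up. Only the four input values come from the slide; reading the remainder as transport time outside the ICL and the FeeServer is an assumption.

```python
# Average values in seconds, copied from the singlecast results.
total = 0.358162         # Command in FedServer - ACK in FeeClient
send_to_ack = 0.3574644  # SEND - ACK in FeeClient
icl = 0.000698           # Process time in ICL
feeserver = 0.1118       # FeeServer computing

# The two round-trip rows differ by the time spent inside the ICL.
icl_from_difference = total - send_to_ack

# Shares of the total round trip.
icl_share = icl / total
feeserver_share = feeserver / total

# Assumption: whatever is left is spent outside ICL and FeeServer computing.
remaining = send_to_ack - feeserver

print(f"ICL time from difference: {icl_from_difference:.7f} s")
print(f"ICL share: {icl_share:.2%}, FeeServer share: {feeserver_share:.2%}")
print(f"remaining: {remaining:.4f} s")
```

The difference of the two round-trip rows reproduces the reported ICL processing time to within rounding, and the ICL accounts for well under one percent of the average round trip.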
Benchmark results during TPC commissioning
BroadCast ControlFero command (FedServer – Ack in FeeClient):
[sec]     all       patch0    patch1    patch2    patch3    patch4    patch5
average   0.404874  0.267716  0.275715  0.303979  0.290279  0.313083  0.32129
max       1.012536  0.619624  0.847929  0.775591  1.011102  0.902006  0.848276
min       0.249206  0.235348  0.032372  0.236584  0.236367  0.064199  0.228168
count     96        84        92        91        95        92        90
Annotations:
command issued 96 times,
lost ACKs: 21 (no command had been issued to FeeServers that were already missing)
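The per-patch counts of the broadcast table can be turned into response rates with a few lines. The counts and the number of issued commands are from the slide; treating "issued minus counted" as the shortfall per patch is an interpretation. The total shortfall computed this way is larger than the 21 lost ACKs reported; the slides do not break down the remainder, which may correspond to commands that were never issued to FeeServers already missing.

```python
ISSUED = 96  # broadcast command issued 96 times

# ACK counts per patch, copied from the broadcast results.
counts = {
    "patch0": 84, "patch1": 92, "patch2": 91,
    "patch3": 95, "patch4": 92, "patch5": 90,
}

# Shortfall and response rate per patch.
missing = {patch: ISSUED - n for patch, n in counts.items()}
response_rate = {patch: n / ISSUED for patch, n in counts.items()}
total_missing = sum(missing.values())

for patch in counts:
    print(f"{patch}: missing {missing[patch]:2d}, rate {response_rate[patch]:.1%}")
print("total missing responses:", total_missing)
```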
Benchmark results during TPC commissioning
FeeServer/CE benchmark (receive command – send ACK):
                  patch0    patch1    patch2    patch3    patch4    patch5
average [sec]     0.028901  0.041023  0.031837  0.042708  0.028316  0.027245
max [sec]         0.22      1.11      0.61      0.62      0.66      0.44
min [sec]         0.02      0.02      0.02      0.02      0.02      0.02
seg faults        3         4         0         0         1         1
duplicated ACKs   6         15        4         8         10        4
counts            91        88        98        96        95        98
Annotations:
command issued 100 times,
duplicated ACKs may indicate temporarily lost links to ICL and/or DIM-DNS
Remarks to Benchmark tests
Heavily delayed ACKs
a very few ACKs reached the FeeClient only after the ACK of the following command had already been received
overtaking of ACKs is not possible within the FeeServer and DIM framework
most likely the packet was temporarily stuck in the switch
duplicated ACKs
most likely due to lost link to FeeServer, DIM-DNS
should not disturb the system, filtered out by InterComLayer
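How duplicated and delayed ACKs could be told apart on the receiving side is sketched below. The slides only state that the InterComLayer filters duplicates out; the mechanism shown here, which assumes each ACK carries an increasing numeric id, is a hypothetical illustration and not the InterComLayer implementation.

```python
def filter_acks(ack_ids):
    """Classify a stream of ACK ids (assumed to increase per command).

    Returns (accepted, duplicates, late):
      accepted   - first occurrence of each id, in arrival order
      duplicates - ids that had already been seen and are dropped
      late       - accepted ids that arrived after a higher id
    """
    seen = set()
    accepted, duplicates, late = [], [], []
    highest = None
    for ack_id in ack_ids:
        if ack_id in seen:
            duplicates.append(ack_id)  # repeated ACK, filtered out
            continue
        seen.add(ack_id)
        if highest is not None and ack_id < highest:
            late.append(ack_id)        # overtaken by a later command's ACK
        else:
            highest = ack_id
        accepted.append(ack_id)
    return accepted, duplicates, late
```

For the arrival order 1, 2, 2, 4, 3, 4, 5 this accepts 1, 2, 4, 3, 5, drops the second 2 and the second 4 as duplicates, and marks 3 as late.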
Future Tests
Extended tests with more slices: 2, 9, 18 (one side), 36 (whole TPC, both sides)
preparing a complete set of benchmark tests for when the TPC is available again in May 2007
Test with real commands, real configuration data and real execution in CE
Benchmarks of the Service Channels (fast triggered update of temp, etc.)
(usage of the CommandCoder during tests)
further investigation of delayed ACKs
verify that duplicated ACKs will not disturb the system