interconnection networks 4: hardware/software interaction
Network software/hardware interaction -- a case study
• V. Karamcheti and A. A. Chien, “Software Overhead in Messaging Layers: Where Does the Time Go?” ACM ASPLOS-VI, 1994.
  – A case study of communication performance issues on the CM-5 machine.
Background
• Network requirements of a typical high-performance computing user
  – In-order message delivery
  – Reliable delivery
    • Error control
    • Flow control
  – Deadlock freedom
• Where should these functions be realized?
  – Network hardware? Network systems? Or a combined hardware/systems/software approach?
Background
• Where should these functions be realized?
  – How does the Internet realize these functions?
    • No deadlock issue
    • Reliability, flow control, and in-order delivery are handled at the TCP layer.
    • The network layer (IP) provides best-effort service.
      – IP is also implemented in the OS (software).
  – Drawbacks:
    • Too many layers of software
    • Users must go through the OS to access the communication hardware (system calls can cause context switches).
Background
• Where should these functions be realized?
  – High-performance networking
    • Most functionality below the network layer is done by the hardware (or almost hardware)
      – This provides the APIs for network transactions
    • If there is a mismatch between what the network provides and what users want, a software messaging layer is created to bridge the gap.
Software messaging level
[Figure: layered communication stack. On each node: application, Message Layer, NI; the two NIs are connected by the Network.
 Network: routing, switching, link-level flow control, etc.
 Message Layer: in-order message delivery, reliable delivery (error control, flow control), deadlock freedom.]
Messaging Layer
• Bridge between the hardware functionality and the user communication requirements
  – Typical network hardware features
    • Arbitrary delivery order (adaptive/multipath routing)
    • Finite buffering
    • Limited fault handling
  – Typical user communication requirements
    • In-order delivery
    • End-to-end flow control
    • Reliable transmission
Communication cost
• Communication cost = hardware cost + software cost
  – Hardware message time: msize/bandwidth, routing, switching, etc.
  – Software time:
    • Buffer management
    • End-to-end flow control
    • Running protocols
  – Which one dominates?
    • It depends on how much the software has to do.
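The cost decomposition above can be sketched as a small model. The function name, the units, and every number in the usage line are hypothetical values chosen for illustration; they are not measurements from the study.

```python
def communication_cost(msize_words, bandwidth_words_per_us, routing_us, software_us):
    """Return (total, hardware) cost in microseconds.

    hardware = msize / bandwidth + routing/switching time
    total    = hardware + software
    All parameter names and units are assumptions made for this sketch.
    """
    hardware_us = msize_words / bandwidth_words_per_us + routing_us
    return hardware_us + software_us, hardware_us


# Hypothetical inputs: a 16-word message, 4 words/us, 1 us routing, 10 us software.
total, hardware = communication_cost(16, 4.0, 1.0, 10.0)
software_share = (total - hardware) / total
print(f"total={total:.1f} us, software share={software_share:.0%}")
```

With inputs like these the software term dominates. Raising the bandwidth or enlarging the message changes only the hardware term, which is why the question of which cost dominates depends on how much work the software has to do.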
What this study did:
• Analyzed the software cost on the CM-5 machine.
• Investigated which overheads can be reduced if the underlying network provides a higher level of service.
CM-5 Network hardware
• Send: store the destination node number and data in the NI send buffer
• Receive: load from the NI receive buffer
• NI status is queried by loading the control registers
CM-5 Network hardware
• Out-of-order delivery (adaptive routing)
• Nodes, NI, and the network have finite buffering
• Error detection, but not error correction
• Fixed packet size of five 32-bit words
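A fixed five-word packet means any longer message must be split in software. The sketch below shows one way to do that. It assumes that one word of each packet is reserved for control information (for example a handler address), leaving four data words; that split and the `packetize` helper are assumptions for illustration, not a description of the CM-5 library.

```python
WORDS_PER_PACKET = 5   # fixed packet size of five 32-bit words (from the slide)
HEADER_WORDS = 1       # assumption: one word per packet is reserved for control
PAYLOAD_WORDS = WORDS_PER_PACKET - HEADER_WORDS


def packetize(dest, words):
    """Split a list of data words into (dest, payload) packets.

    `dest` stands in for the destination node number stored in the NI send buffer.
    """
    packets = []
    for start in range(0, len(words), PAYLOAD_WORDS):
        packets.append((dest, words[start:start + PAYLOAD_WORDS]))
    return packets


small = packetize(3, list(range(16)))
large = packetize(3, list(range(1024)))
print(len(small), len(large))
```

Under this assumption, any per-packet software cost is paid once for every four data words, so small packets multiply the protocol overhead.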
CM-5 active message layer (CMAM)
• Active message
  – A message with a small amount of computation at the receiving end.
  – Each message contains the address of a user-level handler, which is executed on message arrival with the message body as an argument.
  – The handler extracts data from the network and integrates it into the ongoing computation.
• CMAM vs. Send/Recv
  – Users access the network directly with no OS involvement; the data go directly to user space.
  – CMAM can be considered a kind of software RDMA.
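The active-message idea can be illustrated with a toy model in which a message carries a reference to its handler and the receiver runs that handler when it polls. The `Node` class, `am_send`, and `poll` are hypothetical names for this sketch and are not the CMAM API; a Python deque stands in for the NI receive buffer.

```python
from collections import deque


class Node:
    """Toy receiver: a deque stands in for the NI receive buffer."""

    def __init__(self):
        self.ni_recv = deque()
        self.total = 0  # state of the "ongoing computation"

    def add_handler(self, a, b):
        # User-level handler: folds the message body into the computation.
        self.total += a + b

    def poll(self):
        # Drain the receive buffer, running each message's handler on its body.
        handled = 0
        while self.ni_recv:
            handler, args = self.ni_recv.popleft()
            handler(*args)
            handled += 1
        return handled


def am_send(dest, handler, *args):
    # The message carries the handler reference plus its arguments.
    dest.ni_recv.append((handler, args))


node = Node()
am_send(node, node.add_handler, 2, 3)
handled = node.poll()
print(handled, node.total)
```

No intermediate receive queue or OS call sits between arrival and the handler in this model, which is the property the CMAM vs. Send/Recv comparison points at.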
Software overhead cost analysis
• Consider the implementation of 3 protocols
  – Single-packet delivery
  – Finite sequence, multi-packet delivery
  – Indefinite sequence, multi-packet delivery
• Use instruction counts for measurement
Single-packet delivery
Description        Source   Destination
Call/Return           3         10
NI Setup              -          -
Write to NI           2          -
Read from NI          -          3
Check NI status       7         12
Control Flow          3          2
Total                20         27
Finite sequence, multi-packet delivery
Packet transfer (4), buffer management (1, 2, 3, 5)
Fault tolerance (6), in-order delivery (extra instructions in 4)
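The step numbers above refer to a protocol diagram that is not reproduced here. As a rough illustration of why a finite sequence needs buffer management plus a little extra work per packet, the sketch below assumes the destination allocates a buffer of known size up front and that each packet carries its offset into that buffer, so arrival order does not matter. The offset scheme, the helper names, and the four-word payload are assumptions, not the protocol measured in the study.

```python
import random

PAYLOAD_WORDS = 4  # assumption: data words carried per packet


def make_packets(message):
    """Tag each chunk of the message with its offset in the destination buffer."""
    return [(off, message[off:off + PAYLOAD_WORDS])
            for off in range(0, len(message), PAYLOAD_WORDS)]


def receive_finite(packets, total_words):
    """Place packets by offset into a buffer allocated before any data arrive."""
    buffer = [None] * total_words          # buffer management: allocate once
    words_received = 0
    for offset, payload in packets:        # packet transfer, in any order
        buffer[offset:offset + len(payload)] = payload
        words_received += len(payload)
    complete = words_received == total_words   # point at which a final ack could be sent
    return buffer, complete


message = list(range(16))
packets = make_packets(message)
random.shuffle(packets)                    # model out-of-order delivery
received, complete = receive_finite(packets, len(message))
print(received == message, complete)
```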
Indefinite sequence, multi-packet delivery
In-order delivery: sequence number (store/send/read)
Fault tolerance: (1) and acks
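When the sequence length is not known in advance, the receiver cannot preallocate by size, so ordering has to come from sequence numbers. A minimal sketch of that idea follows: the receiver stores early arrivals, releases packets only in sequence order, and returns the next expected number as a cumulative acknowledgement. The class and method names are hypothetical, and the acknowledgement scheme is an assumption for illustration.

```python
class InOrderReceiver:
    """Reorders packets that may arrive out of order, using sequence numbers."""

    def __init__(self):
        self.next_seq = 0     # next sequence number to deliver
        self.pending = {}     # early arrivals, keyed by sequence number
        self.delivered = []   # payloads handed to the application, in order

    def on_packet(self, seq, payload):
        self.pending[seq] = payload                # store
        while self.next_seq in self.pending:       # release any in-order run
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return self.next_seq                       # cumulative ack value


receiver = InOrderReceiver()
acks = [receiver.on_packet(seq, f"p{seq}") for seq in [2, 0, 1, 3]]
print(receiver.delivered, acks)
```

Every packet pays for storing, sending, and reading a sequence number, and the sender must keep data until it is acknowledged, which is consistent with this protocol carrying the largest per-packet software cost of the three.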
Message size = 16 words
Message size = 1024 words
What do we see in the study?
• The mismatch between user requirements and network functionality can introduce significant software overhead (50%-70%).
Messaging layer with high-level network feature
• Given that CMAM is considered very efficient, there are 2 ways to reduce software overhead
  – Lower the user requirements
  – Raise the level of service provided by the network
• Compressionless Routing (CR)
  – Order-preserving transmission
  – Deadlock freedom independent of packet acceptance guarantees
  – Fault-tolerant transmission at the packet level
Single-packet delivery
• Has the same cost as the previous CMAM case
• However, delivery is guaranteed to be fault-free, with no deadlock or buffer overflow
Finite sequence, multi-packet delivery
1. No buffer allocation messages
2. No overhead for in-order delivery
3. No end-to-end acks
Indefinite sequence, multi-packet delivery
Message size = 16 words
Message size = 1024 words
Discussion
• Larger packet sizes
  – Reduce overhead, especially for the indefinite-sequence protocol
• Improved network interfaces and DMA hardware
  – Network interface:
    • Only makes the basic cost faster, not the protocol cost in the messaging layer
    • Makes it more important for the messaging layer to be effective
  – DMA:
    • Only reduces the cost of moving large amounts of data
Discussion (Cont.)
• Implications for network design
  – Improving routing performance may increase software cost, e.g. through out-of-order delivery
• Providing low-level features to applications
  – Puts the burden on parallel software programmers
  – Problematic
Conclusion
• In designing a communication system, a holistic understanding is needed:
  – Focusing on network hardware may not be sufficient. Software overhead is much larger than routing time.
• Ideally, the network would directly provide high-level services.