x86 hardware for packet processing
Post on 27-Jun-2015
Hardware/Software for Packet Processing
Hisaki Ohara
Thursday, October 11, 2012
Today’s Agenda
• Basic Requirement
• DCA (Direct Cache Access)
• Multiqueue (VMDq and RSS)
Basic Requirement for Packet Processing
• 14.88 Mpps (packets per second) for 10GbE at minimum frame size
• 10G / {8 * (64+8+12)}: 64-byte frame + 8-byte preamble + 12-byte inter-frame gap
• Processing budget: about 67 nsec per packet
• About 134 cycles on a 2 GHz Xeon
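The arithmetic on this slide can be reproduced with a short script. This is only a sketch: the function name `packet_budget` and the constants are illustrative and not part of the original slides.

```python
# Per-packet budget at line rate, following the formula on the slide:
# 10G / {8 * (64 + 8 + 12)}

PREAMBLE_BYTES = 8      # preamble + start-of-frame delimiter
IFG_BYTES = 12          # inter-frame gap


def packet_budget(link_bps: int, frame_bytes: int, cpu_hz: int):
    """Return (packets per second, nanoseconds per packet, cycles per packet)."""
    wire_bits = 8 * (frame_bytes + PREAMBLE_BYTES + IFG_BYTES)
    pps = link_bps / wire_bits
    ns_per_packet = 1e9 / pps
    cycles_per_packet = cpu_hz / pps
    return pps, ns_per_packet, cycles_per_packet


if __name__ == "__main__":
    pps, ns, cycles = packet_budget(10_000_000_000, 64, 2_000_000_000)
    print(f"{pps / 1e6:.2f} Mpps, {ns:.1f} ns/packet, {cycles:.1f} cycles/packet")
```

Larger frames relax the budget proportionally, so the 64-byte case is the worst case to design for.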
Hardware and Software
[Figure: block diagram of a CPU (multiple cores, cache), memory, and a NIC, annotated with the related features: Multi Queue, RSS, VMDq, DCA, DMA, Multi Core, MSI-X, TSO, LRO, Interrupt Coalescing, IOMMU, Full APIC Virtualization, Posted Interrupt]
DCA (Direct Cache Access)
• Feature: Put the data directly into the cache
• Reduce memory traffic
• Improve latency
• VT-c ....
• Hard to determine which CPU/chipset/NIC/firmware supports the feature
• First platform: Xeon 5100, 7300
• Now, Intel Data Direct I/O Technology..
• TPH (TLP Processing Hints)
• PCI Express 2.1 Protocol Extensions
• Steering Tags (8 bits)
recap: DMA [Device Write]
[Figure: NIC, PCIe RC, Memory Controller, Memory, and Cache, with steps ① to ⑥ marked; cache line state M→E→I→E]
Maintain cache coherency!
① Memory write by NIC
② Snoop system cache
③ If Modified state, writeback by CPU (transitions to Exclusive state)
④ Device write to memory (transitions to Invalid state)
⑤ Interrupt
⑥ Software reads DMA data (transitions to Exclusive state)
DCA [Device Write]
[Figure: NIC, PCIe RC, Memory Controller, Memory, and Cache, with steps ① to ⑤ marked; cache line state M→E→M]
① Memory write by NIC
② Snoop system cache
③ If Modified state, writeback by CPU (transitions to Exclusive state)
④ Device write to cache (transitions to Modified state)
⑤ Software reads DMA data
Keeps Modified state as much as possible
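The two device-write sequences (plain DMA and DCA) can be compared with a toy model of a single cache line. This is a sketch under simplifying assumptions: one line, one cache, and a memory-access count that is my own bookkeeping rather than anything measured. The function name `device_write` is hypothetical.

```python
def device_write(initial_state: str, use_dca: bool):
    """Trace the MESI state of one cache line through a device write.

    Returns (list of states visited, number of memory accesses in this toy model).
    """
    state = initial_state
    trace = [state]
    memory_accesses = 0

    # Snoop: if the CPU holds the line Modified, it writes it back first.
    if state == "M":
        memory_accesses += 1          # writeback of the dirty line
        state = "E"
        trace.append(state)

    if use_dca:
        # Device write lands in the cache; the line becomes Modified.
        state = "M"
        trace.append(state)
        # The software read then hits in the cache, so no state change.
    else:
        # Device write goes to memory; the cached copy is invalidated.
        memory_accesses += 1
        state = "I"
        trace.append(state)
        # After the interrupt, the software read misses and refills the line.
        memory_accesses += 1
        state = "E"
        trace.append(state)

    return trace, memory_accesses


if __name__ == "__main__":
    for dca in (False, True):
        trace, mem = device_write("M", use_dca=dca)
        label = "DCA" if dca else "DMA"
        print(label, "->".join(trace), "memory accesses:", mem)
```

In this model the DCA path skips both the device write to memory and the later cache refill, which is where the reduced memory traffic and latency come from.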
MESI protocol
[Figure: state diagram with the four states M (Modified), E (Exclusive), S (Shared), I (Invalid)]
Workload for DCA
• A cache line written by DCA may be evicted (written back) before software reads it
• Whether this happens depends on the workload
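To see why the workload matters, here is a toy simulation. It assumes an LRU cache in which every packet occupies exactly one line, and a consumer that lags the NIC by a fixed number of packets. None of these parameters come from the slides, and the name `dca_hit_rate` is hypothetical.

```python
from collections import OrderedDict


def dca_hit_rate(cache_lines: int, backlog: int, packets: int) -> float:
    """Fraction of packets still cached when software finally reads them.

    cache_lines: capacity of the toy LRU cache, in packets
    backlog:     how many packets the software lags behind the NIC
    packets:     total packets written by the NIC
    """
    cache = OrderedDict()
    hits = 0
    reads = 0
    for step in range(packets + backlog):
        if step < packets:
            cache[step] = True                 # NIC writes packet `step` via DCA
            if len(cache) > cache_lines:
                cache.popitem(last=False)      # oldest line is evicted (written back)
        consumed = step - backlog
        if consumed >= 0:
            reads += 1
            if consumed in cache:
                hits += 1
                del cache[consumed]            # software is done with this line
    return hits / reads


if __name__ == "__main__":
    print("small backlog:", dca_hit_rate(cache_lines=4, backlog=2, packets=100))
    print("large backlog:", dca_hit_rate(cache_lines=4, backlog=8, packets=100))
```

When the backlog fits in the cache, every read hits. Once the backlog exceeds the capacity, almost every line is gone before it is read and DCA buys little.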
Virtualization for Networking
[Figure: Physical NIC → Physical Driver → Virtual Switch → per-VM Virtual I/F, Virtual HW, and Virtual Driver, shown for two VMs]
Virtual Switch tasks:
• Forward Ethernet frame
• Resource reservation
• Header inspection
Virtualization for Networking
[Figure: Physical NIC → Physical Driver → Virtual Switch → per-VM Virtual I/F, Virtual HW, and Virtual Driver, shown for two VMs, now with VMDq annotated]
Virtual Switch tasks:
• Forward Ethernet frame
• Resource reservation
• Header inspection
VMDq:
- Packet sorting
- Moving data to VM
- Routing packets to the proper CPU for receive
Multiqueue (VMDq, RSS)
• When reading the ixgbe source code, the relationship between VMDq, RSS, DCA, and multiqueue is not clear (to me)
• VT-c, again...
• Feature names and marketing terms are mixed together
• Let’s clarify with the datasheet
• Focus only on the Intel 82599 (Niantic)
Queues in 82599: Non-Virtualization
• 128 receive queues
• 128 transmit queues
• 16 RSS queues
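RSS picks a receive queue by hashing header fields of each packet. The sketch below assumes a Toeplitz hash over the IPv4/TCP 4-tuple and a 128-entry redirection table whose entries select one of 16 queues, which matches the queue count above; the exact registers and table layout should be checked against the 82599 datasheet. The key is the commonly published 40-byte verification key; the table contents and all function names are hypothetical.

```python
import socket
import struct

# Commonly published 40-byte RSS verification key (any 40-byte key works here).
RSS_KEY = bytes.fromhex(
    "6d5a56da255b0ec24167253d43a38fb0"
    "d0ca2bcbae7b30b477cb2da38030f20c"
    "6a42b73bbeac01fa"
)

# Hypothetical redirection table: 128 entries, filled round-robin over 16 queues.
REDIRECTION_TABLE = [i % 16 for i in range(128)]


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """32-bit Toeplitz hash: XOR the 32-bit key window at every set input bit."""
    key_bits = len(key) * 8
    if len(data) * 8 > key_bits - 32:
        raise ValueError("input too long for this key")
    key_int = int.from_bytes(key, "big")
    result = 0
    for i in range(len(data) * 8):
        if data[i // 8] & (0x80 >> (i % 8)):
            # Key bits i .. i+31, counted from the most significant bit.
            result ^= (key_int >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return result


def rss_queue(src_ip: str, dst_ip: str, src_port: int, dst_port: int) -> int:
    """Map an IPv4/TCP 4-tuple to a queue index in 0..15."""
    data = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack(">HH", src_port, dst_port))
    h = toeplitz_hash(RSS_KEY, data)
    return REDIRECTION_TABLE[h & 0x7F]   # low 7 bits index the 128-entry table


if __name__ == "__main__":
    print(rss_queue("66.9.149.187", "161.142.100.80", 2794, 1766))
```

Because the hash depends only on the 4-tuple, all packets of one flow land on the same queue, which preserves per-flow ordering while spreading flows across cores.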
Queues in 82599: Virtualization
[Figure: RX/TX queue pairs (QP) grouped into pools; an L2 Sorter/Classifier Switch steers frames to Pool 0, Pool 1, Pool 2, ... Pool 63 (2 QPs each), which serve VM0, VM1, VM2, ... VM63]
• QP (Queue Pair): 128 queue pairs in total
• #Pools * #Queue_Pair = 128
VMDq
• Without RSS: 16 pools x 1 queue, 32 pools x 1 queue, 64 pools x 1 queue
• With RSS: 32 pools x 4 RSS, 64 pools x 2 RSS
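The relation #Pools * #Queue_Pair = 128 can be checked with a few lines. This sketch assumes that each pool owns a contiguous block of queue-pair indices, which is an assumption about the numbering and not something stated on the slide; the function names are hypothetical.

```python
TOTAL_QUEUE_PAIRS = 128
POOL_COUNTS = (16, 32, 64)   # pool counts listed on the slide


def queue_pairs_per_pool(pools: int) -> int:
    """Queue pairs available to each pool so that pools * QPs = 128."""
    if pools <= 0 or TOTAL_QUEUE_PAIRS % pools:
        raise ValueError("pool count must evenly divide 128")
    return TOTAL_QUEUE_PAIRS // pools


def pool_queue_range(pool: int, pools: int) -> range:
    """Queue-pair indices owned by one pool (contiguous numbering assumed)."""
    per_pool = queue_pairs_per_pool(pools)
    if not 0 <= pool < pools:
        raise ValueError("pool index out of range")
    return range(pool * per_pool, (pool + 1) * per_pool)


if __name__ == "__main__":
    for pools in POOL_COUNTS:
        print(pools, "pools x", queue_pairs_per_pool(pools), "queue pairs")
    print("pool 63 of 64 owns", list(pool_queue_range(63, 64)))
```

With 64 pools each pool gets 2 queue pairs, so at most 2 RSS queues per pool; with 32 pools each pool gets 4, matching the "With RSS" line above.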
VMDq and RSS
• RSS is not supported in IOV mode (in the case of the 82599)
• RSS is supported in VMDq mode
• NetQueue in VMware ESX