IMPLEMENTING SCALABLE AND COST-EFFECTIVE SESSION BORDER
CONTROL ON GENERIC SERVER HARDWARE
METASWITCH NETWORKS | PROPRIETARY AND CONFIDENTIAL | METASWITCH.COM | © 2013 | SLIDE 1
David Reekie, SVP Engineering
ETSI Future Network Workshop, 10th April 2013
AGENDA
• Why is this an interesting case study?
• The route to virtualisation
• How most people build SBCs today
• Aspects of performance
  • Packet throughput
  • Media transcoding
• Remaining challenges
THE ROUTE TO COTS HARDWARE AND VIRTUALISATION
• Implement all SBC functions in software on generic server
  • Signaling plane functions
  • Media plane functions
• Achieve sensible levels of scaling
  • 200k – 1M+ registered endpoints per server
  • > 20k concurrent media sessions per server
• Get all this working in a virtualised environment
  • Preserve capacity and performance
  • Preserve failover capabilities
TODAY'S SBCS ARE BUILT ON PROPRIETARY HARDWARE
[Diagram: hardware components of a proprietary SBC and the functions they perform]
• Network processor: RTP media
• General-purpose CPU: SIP signaling
• Crypto processor: accelerate SRTP, TLS, IMS AKA
• DSPs: perform transcoding
• TCAM: discard packets from blacklisted IP addresses; look up RTP flows to manage media relay
SOFTWARE SOLUTIONS FOR THESE CHALLENGES
• Spread load across multiple cores
  • A 16-core server can do a lot of heavy lifting
• Make intelligent use of cache memory
  • Fast look-up for media flows and IP address blacklists
• Do performance-critical jobs close to the metal
  • Custom kernel modules for low-level packet handling
• Leverage built-in crypto instruction set
  • TLS, IPsec, SRTP
• Do the best you can with software-based transcoding
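As an illustration of the first two points (spreading load across cores, and fast look-up for media flows and blacklists), here is a minimal Python sketch. It is not the Metaswitch implementation: the names `FlowKey`, `RelayTarget`, `MediaPlane` and `core_for_flow` are hypothetical, CRC32 stands in for whatever hash a real NIC or kernel module would use, and a production media plane would do this in kernel or DPDK code rather than in Python.

```python
import ipaddress
import zlib
from typing import Dict, NamedTuple, Optional, Set


class FlowKey(NamedTuple):
    """Hypothetical 4-tuple identifying one inbound RTP flow."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


class RelayTarget(NamedTuple):
    """Hypothetical destination to which a matched flow is relayed."""
    out_ip: str
    out_port: int


NUM_CORES = 16  # the 16-core server mentioned on the slide


def core_for_flow(key: FlowKey, num_cores: int = NUM_CORES) -> int:
    """Map a flow to one core by hashing its 4-tuple.

    All packets of a flow hash to the same core, so per-flow state can
    stay local to that core. CRC32 is an assumption chosen for brevity.
    """
    data = f"{key.src_ip}:{key.src_port}>{key.dst_ip}:{key.dst_port}".encode()
    return zlib.crc32(data) % num_cores


class MediaPlane:
    """Hash-based blacklist and flow table with O(1) average look-up."""

    def __init__(self) -> None:
        self.blacklist: Set[int] = set()             # source IPs as integers
        self.flows: Dict[FlowKey, RelayTarget] = {}  # active media flows

    def block(self, ip: str) -> None:
        self.blacklist.add(int(ipaddress.ip_address(ip)))

    def add_flow(self, key: FlowKey, target: RelayTarget) -> None:
        self.flows[key] = target

    def handle(self, key: FlowKey) -> Optional[RelayTarget]:
        """Return the relay target, or None if the packet should be dropped."""
        if int(ipaddress.ip_address(key.src_ip)) in self.blacklist:
            return None  # blacklisted source: discard
        return self.flows.get(key)  # unknown flow also yields None


plane = MediaPlane()
plane.block("203.0.113.9")
flow = FlowKey("198.51.100.7", 40000, "192.0.2.1", 20000)
plane.add_flow(flow, RelayTarget("10.0.0.5", 30000))
```

Storing addresses as integers and keys as small tuples keeps the tables compact, which is the kind of choice that helps a look-up structure stay cache-resident.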
PERFORMANCE OF SOFTWARE-BASED SBC
• Standard off-the-shelf server, 1U, $9k
  • NICs need to support Receive Side Scaling (RSS)
  • If NICs support Intel DPDK integration, performance ↑ 2.5x
• Performing both signaling and media functions
  • "Integrated SBC" configuration
• Signaling throughput > 12,000 SIP messages per second
  • 6M BHCA (based on 7 SIP messages per call)
• Media handling: up to 18,000 concurrent media streams, or 48,000 with DPDK
  • RTP-to-RTP relay, any codec, no transcoding
Intel claims 80M pps forwarding of 64-byte frames with 8-core Xeon and DPDK (on bare metal)
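The BHCA figure follows directly from the signaling throughput. The short Python sketch below reproduces the arithmetic from the numbers on the slide; the function name `busy_hour_call_attempts` is introduced here for illustration only.

```python
SECONDS_PER_HOUR = 3_600


def busy_hour_call_attempts(sip_msgs_per_second: float,
                            sip_msgs_per_call: float) -> float:
    """Calls per hour sustainable at a given SIP message rate."""
    return sip_msgs_per_second * SECONDS_PER_HOUR / sip_msgs_per_call


# 12,000 SIP messages/second at 7 messages per call (figures from the slide)
bhca = busy_hour_call_attempts(12_000, 7)
print(f"{bhca / 1e6:.2f}M BHCA")             # prints "6.17M BHCA"

# Ratio of the two media-stream figures quoted on the slide
dpdk_gain = 48_000 / 18_000
print(f"DPDK media gain: {dpdk_gain:.2f}x")  # prints "DPDK media gain: 2.67x"
```

The result rounds to the 6M BHCA quoted, and the ratio of the two stream counts is in line with the roughly 2.5x improvement attributed to DPDK.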
WHAT HAPPENS WHEN SBC RUNS OVER HYPERVISOR
• Signaling function performs as expected
  • Hypervisor overhead in the range 5-20%
  • Cross-core locking a major impact
  • Typical of network-intensive virtualised applications
  • Can be detailed issues to resolve, e.g. high-res system clock on KVM
• Media function suffers substantial throughput reduction
  • RTP relay requires high throughput of small UDP packets
  • Bare metal performance of 4M packets/second is achievable
  • Hypervisor can introduce order-of-magnitude reduction
  • Highly dependent on hypervisor and vNIC driver choice
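To see why packet rate dominates media capacity, the sketch below turns a packets-per-second budget into a rough stream count. The 20 ms packetisation interval (50 packets per second per stream and direction) is an assumption, common for narrowband codecs but not stated on the slides, and the model counts only received packets. The function names are hypothetical.

```python
def packets_per_second_per_stream(ptime_ms: float = 20.0) -> float:
    """Packets per second for one RTP stream in one direction."""
    return 1000.0 / ptime_ms


def max_streams(pps_budget: float, ptime_ms: float = 20.0) -> int:
    """Rough upper bound on one-way streams for a given packet budget."""
    return int(pps_budget // packets_per_second_per_stream(ptime_ms))


BARE_METAL_PPS = 4_000_000  # bare-metal figure from the slide

bare_metal = max_streams(BARE_METAL_PPS)        # 80000
virtualised = max_streams(BARE_METAL_PPS / 10)  # 8000, after a 10x reduction
print(bare_metal, virtualised)
```

Under these assumptions, an order-of-magnitude drop in packet throughput translates one-for-one into an order-of-magnitude drop in relayed streams, which is why the hypervisor and vNIC driver choice matters so much for the media function.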
HYPERVISOR UDP PACKET THROUGHPUT VS RAW
[Chart: UDP throughput in Mbps (axis 0 to 1000) for 64-byte packets, hypervisor vs raw]
Acknowledgement: "Performance Comparison of Common Server Hardware Virtualization Solutions Regarding the Network Throughput of Virtualized Systems", Daniel Schlosser, Michael Duelli, and Sebastian Goll, University of Würzburg, Germany, March 2011
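The chart reports throughput in Mbps, while the neighbouring slides talk in packets per second. The following Python sketch converts between the two for 64-byte packets. Whether the chart's 64 bytes refers to the frame or the payload is not recoverable from the transcript, so the optional 20 bytes of Ethernet preamble and inter-frame gap is offered as an assumption rather than a correction; the function names are illustrative.

```python
ETHERNET_OVERHEAD_BYTES = 20  # preamble + SFD (8 bytes) and inter-frame gap (12 bytes)


def mbps_to_pps(mbps: float, packet_bytes: int = 64,
                overhead_bytes: int = 0) -> float:
    """Packets per second carried by a given bit rate."""
    return mbps * 1_000_000 / ((packet_bytes + overhead_bytes) * 8)


def pps_to_mbps(pps: float, packet_bytes: int = 64) -> float:
    """Bit rate in Mbps needed for a given packet rate."""
    return pps * packet_bytes * 8 / 1_000_000


# 1000 Mbps of 64-byte packets, ignoring framing overhead
print(mbps_to_pps(1000))                    # 1953125.0

# The same link counting preamble and inter-frame gap
print(round(mbps_to_pps(1000, overhead_bytes=ETHERNET_OVERHEAD_BYTES)))  # 1488095

# The 4M packets/second bare-metal figure expressed in Mbps
print(pps_to_mbps(4_000_000))               # 2048.0
```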
HYPERVISOR PACKET THROUGHPUT CHALLENGE
• Quick fix by making use of passthrough mode
  • Nailed-up connection between SBC app and physical NIC(s)
  • Reduces flexibility of virtualised SBC
• Preferable to address hypervisor limitations
  • What's needed: improved packet throughput of vSwitch component
  • Intel very active in this space
  • SR-IOV – Single Root I/O Virtualisation: virtualises the NIC's resources for sharing between cores / guests
  • Also ARM + DSP SoC
  • But none currently compatible with virtual routing
TRANSCODING IN SOFTWARE
• Historically, transcoding has been performed by DSPs
• Transcoding in software on x86 is more costly
  • Hardware costs, power consumption costs
  • Over five years, software-based transcoding ~ 2X cost of DSP-based
BUT
• DSP resources can only be used for transcoding
• x86 resources can be used for any kind of processing
• Modern codecs (e.g. SILK) optimised for CPU not DSP
• Flexibility of software-based transcoding compensates for higher cost
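One way to reason about the flexibility argument is a toy cost model, sketched below in Python. Only the 2X ratio comes from the slide; the model itself is an assumption made for illustration. It treats DSP cost as fixed regardless of use (the hardware can do nothing else) and charges x86 cost to transcoding only for the fraction of time the cores actually transcode. The function name and the cost units are hypothetical.

```python
from typing import Tuple


def transcoding_cost(dsp_cost: float, sw_ratio: float = 2.0,
                     utilisation: float = 1.0) -> Tuple[float, float]:
    """Return (DSP cost, x86 cost attributable to transcoding).

    dsp_cost    -- five-year cost of DSP capacity sized for peak load
    sw_ratio    -- x86 cost relative to DSP for the same peak (2X per the slide)
    utilisation -- fraction of time the capacity is used for transcoding
    """
    dsp = dsp_cost                           # paid in full even when idle
    x86 = dsp_cost * sw_ratio * utilisation  # idle cores do other work
    return dsp, x86


# Hypothetical cost units, chosen only to make the comparison readable
print(transcoding_cost(100.0, utilisation=1.0))   # (100.0, 200.0)
print(transcoding_cost(100.0, utilisation=0.5))   # (100.0, 100.0)
print(transcoding_cost(100.0, utilisation=0.25))  # (100.0, 50.0)
```

In this simplified model the two approaches break even when transcoding capacity is busy half the time, and software comes out ahead below that. Real deployments would need measured utilisation and cost data.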
REMAINING CHALLENGES
• Cohabitation of fast packet processing and virtual routing
  • Impacts NICs, servers, hypervisors and OSs
  • Needs support across that vendor ecosystem to resolve
• Plus
  • 1:1 FT needs control of instance location
  • Networking control:
    • Multiple redundant interfaces? Bonded?
    • Separate signalling and management networks / VLANs?
  • Cloud owner vs service owner management and monitoring
  • Reliability: do you depend on any cloud services (obvious or hidden)?
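The point about 1:1 fault tolerance needing control of instance location can be made concrete with a small placement check, sketched below in Python. The instance and host names, the `placement` mapping and the `colocated_pairs` function are all hypothetical; how placement information is obtained depends on the cloud platform and is not covered here.

```python
from typing import Dict, List, Tuple


def colocated_pairs(placement: Dict[str, str],
                    pairs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return active/standby pairs whose two instances share a physical host.

    placement -- maps instance name to the host it currently runs on
    pairs     -- (active, standby) instance names forming 1:1 FT pairs
    """
    return [(active, standby) for active, standby in pairs
            if placement[active] == placement[standby]]


# Hypothetical placement reported by the cloud platform
placement = {
    "sbc-1-active": "host-a",
    "sbc-1-standby": "host-b",
    "sbc-2-active": "host-c",
    "sbc-2-standby": "host-c",  # both halves on one host: no real redundancy
}
pairs = [("sbc-1-active", "sbc-1-standby"), ("sbc-2-active", "sbc-2-standby")]

print(colocated_pairs(placement, pairs))  # [('sbc-2-active', 'sbc-2-standby')]
```

A pair that shares a host fails together when that host fails, so a check of this kind (or an equivalent anti-affinity rule enforced by the platform) is needed before the pair can be relied on for failover.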