john mailhot -- cto infrastructure imagine communications · 2017-01-17 · magellan nms manager...
TRANSCRIPT
Magellan NMS Manager
SD/HD/3G SDI
UHD SQD/2SI
External IP
COTS IP Switches 48x10G 576x40G
Cisco, Arista, Brocade, HP, Huawei
Multiviewer Monitoring
Magellan SDN Orchestrator
SDN Orchestrator
Network Manager
Audio Production
Integrated via AES-67
Video Production
2022-6 2110
Timing SMPTE 2059
What does an IP-Based System Look Like?
SDI-IP
UHD-IP
IP-IP
IP-based Signal Processors
Integrated Playout Devices External IP Gateway
Redundancy Redundancy Redundancy
What if a Switch Fails? What if the Optics Fail? What if a cable Fails?
ST 2022-7 Works Really Well
• Send Two Copies
• On Two Interfaces
• To Two Switches
• Join and Receive from both
• Packet-by-packet merge
• Can be applied to AES67 audio
• Can be applied to TR03 video
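The packet-by-packet merge can be sketched in a few lines. This is a minimal illustration only, assuming each arriving packet has already been reduced to a (sequence number, payload) pair; the function name and the window size are hypothetical, and a real ST 2022-7 receiver also re-orders packets and handles sequence-number wrap, which this sketch does not.

```python
from collections import deque

def hitless_merge(arrivals, window=1024):
    """Yield each sequence number once, from whichever leg delivers it first.

    arrivals: iterable of (sequence_number, payload) pairs, in arrival order,
    with packets from both network legs interleaved.
    window: how many recent sequence numbers to remember (assumed value).
    """
    seen = set()
    recent = deque()
    for seq, payload in arrivals:
        if seq in seen:
            continue  # the other leg already delivered this packet
        seen.add(seq)
        recent.append(seq)
        if len(recent) > window:
            seen.discard(recent.popleft())  # forget the oldest entry
        yield seq, payload
```

If leg A drops packet 2 and leg B drops packet 3, the merged output still contains packets 1 through 4 exactly once, which is why a single failure is invisible at the output.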
SDI –vs- IP: Different Issues?
SDI Routers
• Cost –vs- #ports is very non-linear
• HD(3G) coax or fiber
• One Signal Per Port
Ethernet “L3 Switches”
• Cost –vs- #ports is non-linear (but different)
• SFP+, QSFP+, 10GBASE-T
• Multiple Signals Per Port
• Underutilized Ports
How Many signals does an endpoint put in a 10GBE port?
• Between 1 and 12* Uncompressed HDs
• Between 1 and 6000 audios
• FULL DUPLEX (like it or not)
The desktop IT world has the same problem, but the answer is not as critical.
Port Utilization is the Hard Part
Typical “IT” Network Architecture
Core Switch (1:1)
• File Servers / NAS
• Compute Servers
Distribution Switches (5:1)
User Edge Switches (25:1)
http://www.excitingip.net/
Dual-Central-Switches
• Everything home-runs back to main and protect switches
• Obviously Non-Blocking
• 1:1 protect (2022-7)
• The core ports are moving to 100GBE
Engineered Aggregation
• Lightly-loaded links go to aggregator switches
• Bulk uplink to Centrals (no overload)
• Can leverage the 100GBE core ports
• Engineered to be non-blocking
Network Architectures for Television
Ports with small numbers of signals always go through aggregator
Ports with 50% or better utilization: Wire directly to core
YMMV: pencil it out based on the real switch configurations available and the devices you are designing for
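"Pencil it out" can also be done in a few lines of code. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the 50% threshold comes from the rule of thumb above, and the 1.5 Gb/s per uncompressed HD signal is a nominal figure used only for the example.

```python
def port_utilization(signal_gbps, port_gbps=10.0):
    """Fraction of a port's capacity used by the signals placed on it."""
    return sum(signal_gbps) / port_gbps

def wire_to(signal_gbps, port_gbps=10.0, threshold=0.5):
    """Apply the rule of thumb: busy ports go to the core, the rest aggregate."""
    if port_utilization(signal_gbps, port_gbps) >= threshold:
        return "core"
    return "aggregator"

def aggregator_is_non_blocking(ports, uplink_gbps):
    """True if the uplink can carry every signal on every aggregated port.

    ports: one list of signal rates (Gb/s) per aggregated port.
    """
    return sum(sum(signals) for signals in ports) <= uplink_gbps
```

For example, a camera port carrying one HD signal (`wire_to([1.5])`) lands on an aggregator, while a gateway port carrying four (`wire_to([1.5] * 4)`) is wired straight to the core. Twenty single-HD ports fit behind a 40G uplink; thirty do not.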
When to do Which? – Depends on Details
Typical Shapes and Sizes
Fixed: 48x10 + 4x40 (A, B, C, HP, HU, E, J)
Fixed: 32/36 x40 QSFP+ (most)
Fixed: 64/72 x40 QSFP+ (most)
Fixed: 32/64 x100 QSFP100 (new)
Modular: Large and Larger
• Mix (by card) of 10G, 40G, & 100G ports
• Up to 2000x2000 10GBE equivalent ports
Best Network Architecture?
Reliability: Dual Switches, Dual Paths
Cost: Engineered Aggregation
Simplicity: Control System/SDN
Functionality: Control System/SDN
Scalability: Extra Ports or Slots
Flexibility: Open-Marketplace Feature Set
“Straight to Core” 1:1 & Non-Blocking
Equipment Nx40G 4N x 10G
Core Switches are 1:1 redundant
All signals wire back to the core
Modular Core supports a variety of wire formats (identical for redundant)
High Availability Switches
• Redundant Fabric
• Redundant Power
• Redundant Control
• Flexible Ports
• Non-Blocking Line Rate
• ISSU
High Availability Architecture
• Dual Switches 1:1
• Dual Transmit
• Hitless Merge
• 1:1 cables & optics
Plan for Growth
• Empty slots in core
“Leaf-Spine” 1:1 & Non-Blocking
Core/Spine Switches
100G links to top-of-racks
Cores available in many sizes
High Availability Switches
• Redundant Power
• Redundant Control
• Flexible Ports
• Non-Blocking Line Rate
• ISSU
High Availability Architecture
• Dual Switches 1:1
• Dual Transmit
• Hitless Merge
• 1:1 cables & optics
Plan for Growth
• Empty slots/ports @ core
• Add more leafs
Top-Of-Rack / “Leaf”
100G uplinks to cores
10G/25G/40G/50G ports
TOR “leaf”
Redundancy: 2022-7 dual path
Scale/Utilization: Leaf-Spine/Top Of Rack
Blocking? Non-Blocking by Design
Standards/Interop: SMPTE 2110
Network Architecture Key Points
The Trivial SDI System
Sender A
Sender B
Sender C
Receiver X
Receiver Y
Receiver Z
Controller
Receivers are Blind
Controller Crosspoint
Senders Just Send
Even if nobody is listening
The Trivial IP System
Controller
10GBE
Ethernet Switch
10GBE
10GBE
Sender A
Sender B
Sender C
Receiver X
Receiver Y
Receiver Z
Endpoints are Smart
Controller Endpoints
Senders Just Send
Senders always transmit (onto the IP network)
Even if Nobody is currently Listening
Each Sender/Stream has a unique IP Multicast Address
Senders provide Information to the controller about their signals (so the controller knows what is being sent)
What Must a Sender Do?
Receivers get instructions from the controller
• “Switch to” this new signal
Receivers ask the Network for the Signals
• IGMP Multicast “Join”
Some Receivers will switch cleanly between signals
• While others will not – market forces will drive this
Receivers tell the Network when they are done with a signal
• IGMP Multicast “Leave”
Network figures it out anyway just in case
• IGMP Multicast Querier
What Must a Receiver Do?
Receiver X
Switch-To 239.5.4.3
IGMP join 239.5.4.3
(then) IGMP leave 239.8.7.9
done
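On an ordinary host, the join-then-leave sequence above maps onto two standard socket options. The sketch below is illustrative: `membership_request` and `switch_to` are hypothetical helper names, and the default interface address of 0.0.0.0 (let the OS choose) is an assumption. The operating system emits the actual IGMP messages when the options are set.

```python
import socket

def membership_request(group, interface_ip="0.0.0.0"):
    """Build the 8-byte ip_mreq structure: group address, then local interface."""
    return socket.inet_aton(group) + socket.inet_aton(interface_ip)

def switch_to(sock, new_group, old_group=None, interface_ip="0.0.0.0"):
    """Join the new multicast group first, then leave the old one."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(new_group, interface_ip))
    if old_group is not None:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP,
                        membership_request(old_group, interface_ip))
```

Receiver X would call `switch_to(sock, "239.5.4.3", "239.8.7.9")` on a UDP socket already bound to the stream's port. Joining before leaving briefly doubles the bandwidth arriving on the port, which is one of the things to account for when budgeting port utilization.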
Recognize & Keep Track of Incoming Multicast Groups
Send (only) the group(s) a Receiver asks for (“Joins”)
Stop Sending the group(s) when a Receiver “Leaves”
Query Receivers about their continued interest
Provide Statistics and Information to the Controller
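The switch's bookkeeping for these duties can be modelled as a table from multicast group to the set of ports that joined it. This is a toy model for illustration, not how any particular switch implements IGMP snooping; the class and method names are hypothetical.

```python
from collections import defaultdict

class SnoopingSwitch:
    """Toy model: track which ports have joined which multicast groups."""

    def __init__(self):
        self.members = defaultdict(set)  # group address -> set of port numbers

    def join(self, port, group):
        self.members[group].add(port)

    def leave(self, port, group):
        self.members[group].discard(port)
        if not self.members[group]:
            del self.members[group]  # nobody left: stop forwarding entirely

    def egress_ports(self, group):
        """Ports that should receive packets for this group (none if no joins)."""
        return sorted(self.members.get(group, ()))

    def expire(self, confirmed):
        """After a query, keep only the (port, group) pairs that answered."""
        for group in list(self.members):
            kept = {p for p in self.members[group] if (p, group) in confirmed}
            if kept:
                self.members[group] = kept
            else:
                del self.members[group]

    def stats(self):
        """Per-group listener counts, the kind of data a controller could poll."""
        return {group: len(ports) for group, ports in self.members.items()}
```

The `expire` step is the querier at work: a receiver that was unplugged without sending a Leave stops answering queries, and its membership is dropped.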
What Must the Switch/Network Do?
How did you do it in an SDI System?
• Automated discovery, maybe, within the same vendor
• Manual entries when integrating others’ equipment
How will it work in an IP Environment?
• Automated Discovery by a Standard Protocol
• Stream Switching requests, too
• Let’s talk about AMWA-NMOS
How do I Find all the Parts of the System?
Advanced Media Workflow Association (AMWA)
Devices (things that make or eat signals)
• Look in DNS (or mDNS) to find the “registrar”
• Tell the “registrar” who they are and what they have
• Keep the “registrar” informed if things change
Anybody (who cares) can:
• Ask the registrar about the devices and their streams
• Ask the registrar to update them if things change
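The register / query / subscribe pattern described above can be shown with a toy in-memory registrar. This is deliberately not the NMOS API: the real specification defines HTTP/REST endpoints and JSON schemas, and the class, method, and field names below are hypothetical stand-ins chosen only to show the shape of the interaction.

```python
import json

class ToyRegistry:
    """In-memory stand-in for a registrar: devices register, clients query."""

    def __init__(self):
        self.resources = {}    # resource id -> description dict
        self.subscribers = []  # callbacks that want change notifications

    def register(self, resource_id, description):
        """Called by a device to announce (or update) something it has."""
        self.resources[resource_id] = description
        self._notify("registered", resource_id)

    def unregister(self, resource_id):
        if self.resources.pop(resource_id, None) is not None:
            self._notify("removed", resource_id)

    def query(self, kind):
        """Called by anybody who cares: list resources of one type."""
        return {rid: d for rid, d in self.resources.items()
                if d.get("type") == kind}

    def subscribe(self, callback):
        """Ask to be told when things change."""
        self.subscribers.append(callback)

    def _notify(self, event, resource_id):
        message = json.dumps({"event": event, "id": resource_id})
        for callback in self.subscribers:
            callback(message)
```

A control system would subscribe once, then build its list of sources from `query("sender")` and keep it current from the notifications, with no manual entry per device.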
AMWA “Networked Media Open Specifications”
Built on very standard internet/web technology
• Domain Name System (DNS)
• HTTP / REST communication style
• JSON (JavaScript Object Notation) syntax
Technologies that are fully implemented in all major OS’s and Platforms already
Dozens of companies have tested NMOS in lab environments and interop events already
AMWA “Networked Media Open Specifications”
Here is the spec – FREE (not like SMPTE specs)
• https://github.com/AMWA-TV/nmos
Here is an open-source implementation of it:
• https://github.com/Streampunk/ledger
• There are others reported to be work-in-progress
How “Open” is AMWA-NMOS?
It solves two problems we care a lot about:
Registration & Discovery
• Finding the parts of the system in a vendor-neutral way
• Cataloguing the streams being generated
Minimal Endpoint Control
• Telling a Receiver to join a new stream
Does AMWA-NMOS Solve All Problems?
SDN == “Software Defined Network”
• Means a Different Thing to Everybody who says it
• IT Industry uses “SDN Controllers” to create custom networks on the fly, often coupled with NFV
• Most IT SDN Controllers have little/no support for multicast routing (which we use a lot)
Television “Router Control Systems” designed to support the transition to IP equipment use some SDN techniques to optimize the configuration & operations workflow
• Switch Configuration & Management via SDN interfaces
• Managing some special cases that don’t do IGMP well
What is SDN and do I Need One?
How do we know what is going on in the plant today?
• People Staring at PIPs
How will we know what is going on in the future?
• People Staring at PIPs?
T&M: Telemetry and Monitoring
Content Monitoring
Cue the Local insert in a network program
Preview the Camera shot before taking it
Time the switch to the talent’s banter
“Feel Good” that the air return looks OK
The backup PSU on the router has failed
There are some errors on the primary leg of CAM3 (secondary OK)
Optical Receive power on CAM5 CCU is low
Audio channel 2 on ENG2 is very quiet
Plant Monitoring
Content Monitoring Human Operators
Human Operators need to see the content to make content-based decisions
• When to roll the interstitial promo
• When to switch from network to local
• Is the camera shot in focus and framed well
• Is the talent ready (for me to take the newsroom)
PIPs on the wall are a great way to interface with Human Operators
Path-Step Monitoring Debugging Tool
PIPs for each step along the way
Maps to the hardware step by step
See main and protect sides
Give the Human Operator visual information to make a good decision about taking a backup signal or path
• See what the backup path looks like before using it
• Humans are good at seeing differences between two similar scenes next to each other
But what if there are 50 channels? 100? 200? 500?
Multiviewer - Monitoring By Exception
Many classes of problems can be detected well by non-human methods analyzing the picture (without it being on a PIP in a multiviewer)
• Black, Freeze, Blockiness
• Audio mutes, Audio saturates
• Underlying Errors (EDH, CRC, TSCC, IPSEQ, …)
What is the workflow after detecting an error?
• Call Human Attention to the signal
• “Penalty Box” approach
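A rough sketch of monitoring by exception: check every channel automatically and surface only the ones that fail. Everything here is an illustrative assumption, including the function names, the representation of a frame as a flat list of 8-bit luma samples, the black level of 16, and the exact-match freeze test (real detectors tolerate noise and look across several frames).

```python
def classify_frame(luma, previous=None, black_level=16, tolerance=2):
    """Return "black", "freeze", or "ok" for one frame of luma samples."""
    if max(luma) <= black_level + tolerance:
        return "black"
    if previous is not None and luma == previous:
        return "freeze"
    return "ok"

def penalty_box(latest, previous):
    """Return only the channels that need human attention, with the reason.

    latest, previous: dicts mapping channel name -> list of luma samples.
    """
    flagged = {}
    for name, luma in latest.items():
        status = classify_frame(luma, previous.get(name))
        if status != "ok":
            flagged[name] = status
    return flagged
```

With 500 channels, the operator's wall shows whatever `penalty_box` returns rather than 500 PIPs; healthy channels never take up screen space.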
Telemetry – the New Frontier
Every Piece of equipment in the modern plant generates a ton of dynamic information
• Frame sync: 500+ parameters & 100+ alarm-ables
• MPEG encoder: 500+ parameters & 100+ alarm-ables
• SDI Routers: dozens of status & alarm-ables per simple port
• IP Routers: even more status and alarm-ables available
In most Television plants today, all this telemetry goes onto the floor. It’s not even logged.
To debug the plant, most people stare at the PIPs
Why Telemetry will matter more
Channel counts going up (but # of operators is not)
In single-threaded, non-redundant systems, most failures cause an obvious visible flaw
• In redundant systems, the first failure ONLY causes a telemetry event. The PIP looks OK until something else fails
In remote/virtual environments, telemetry might be all that you get besides the output signal. If the output is bad, telemetry is your only clue
What Can I Learn from Telemetry? (1)
Is the device happy with its input? How happy?
• Bit errors, CRC errors, optical power, missed packets, corrected packets
Is the device itself happy?
• Temperature, internal alarms/alerts, redundancy status, configuration changes (logged)
Is the device generating output?
• Status of output signals
What can I learn from Telemetry (2a)
In SDI, the wiring diagram tells you everything.
• Debug by following the wires
• Except if something is patched
• Or if the drawing doesn’t match reality
• Or it goes into a router/switcher/etc
In IP, many signals per wire, into a big switch
• How do I know where each signal is really going?
• How do I know what signals are on the wire right now?
What can I learn from Telemetry (2b)
IP Switches tell you a TON about what is going on
• LLDP: what is connected to every port
• Interface Statistics: TX/RX packets, bytes, rates, errors, broken out by many different types of error
Flow Data (sFlow, NetFlow)
• Samples of the headers from every port in the switch (includes some bytes of the payload)
• Much can be learned through this simple means
What can sFlow samples tell me?
List (and Approximate rate) of every incoming flow
5-Tuple [IP-SRC, IP-DEST, SRC-PRT, DEST-PRT, Proto]
What kind of RTP flow it is
• SMPTE 2022-6, SMPTE 2022-2, TSUDP, AES67, …
Interface Statistics (counters, errors, etc) also
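How a list of flows with approximate rates falls out of packet samples can be sketched as follows. This is a simplified illustration that assumes the samples have already been decoded into (5-tuple, frame size) pairs and that the switch samples one packet in every `sampling_rate`; the function name and the example addresses are hypothetical.

```python
from collections import defaultdict

def estimate_flows(samples, sampling_rate, interval_s):
    """Estimate bits per second for each 5-tuple seen in the samples.

    samples: iterable of (five_tuple, frame_bytes) pairs, where five_tuple is
             (ip_src, ip_dest, src_port, dest_port, proto).
    sampling_rate: N, for 1-in-N packet sampling.
    interval_s: length of the collection window in seconds.
    """
    sampled_bytes = defaultdict(int)
    for five_tuple, frame_bytes in samples:
        sampled_bytes[five_tuple] += frame_bytes
    # Each sampled packet stands in for roughly N packets of the same flow.
    return {flow: total * 8 * sampling_rate / interval_s
            for flow, total in sampled_bytes.items()}
```

Ten samples of 1400-byte packets in one second at 1-in-1000 sampling suggest a flow of about 112 Mb/s. The estimate is statistical, so it is good for answering "what is on this wire right now" and less good for exact accounting.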
How can I analyze this sFlow data?
The IT Marketplace has created dozens of nice analysis tools driven by the data from the switches
About half of these tools are FREE
Turn the Telemetry ON and set it up
Establish a logging environment (choose one)
• Logstash
• Graylog
• Nagios
• Fluentd
• Splunk
Invest some time on building import filters for the equipment that you have already
Telemetry & Monitoring “Sensible” Practices
Without Standards, things can’t be designed repeatably
• Nuts and Bolts, ASTM standards for concrete, Dimensional Lumber, etc.
Examples of (Television Industry) Standards that work:
• NTSC & PAL Composite
• AES3 and AES10
• HDMI & DisplayPort
• SDI, HDSDI, 3GSDI
Examples of (Television Industry) Non-Standards that hurt:
• RS422 Control Protocols
• 20 different file formats
• Some tape formats (What is your favorite?)
Technical Standards – Do they Matter?
Physical Layer Standards (Electrical, Optical, Connectors)
• 10G, 40G, and 100G (10x10) Ethernet (IEEE 802.3-2012 is current)
• 25G, 50G, and 100G (4x25) Ethernet (IEEE 802.3ba, IEEE 802.3bj)
• SFP+ and QSFP+ connectors (SFF-8431, SFF-8635, SFF-8665)
RTP/UDP/IP Standards (IETF RFC 3550, etc)
• How to send time-sensitive streams of things over IP
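Every stream type discussed here rides on the same 12-byte fixed RTP header defined in RFC 3550. The sketch below unpacks that header; the function name and the dictionary keys are illustrative choices, and CSRC entries and header extensions are not parsed.

```python
import struct

def parse_rtp_header(packet):
    """Decode the fixed 12-byte RTP header at the start of a UDP payload."""
    if len(packet) < 12:
        raise ValueError("packet is shorter than the fixed RTP header")
    first, second, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": first >> 6,            # 2 for RFC 3550 RTP
        "padding": bool(first & 0x20),
        "extension": bool(first & 0x10),
        "csrc_count": first & 0x0F,
        "marker": bool(second & 0x80),
        "payload_type": second & 0x7F,
        "sequence": sequence,             # 16 bits, increments per packet
        "timestamp": timestamp,           # 32 bits, media clock units
        "ssrc": ssrc,                     # 32-bit stream source identifier
    }
```

The sequence number is what a redundant receiver merges on, and the timestamp is what ties the packets back to the media clock.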
SMPTE 2022-2 (TS), 2022-6 (SDI), 2022-7 (redundancy)
AES67
[VSF TR03/04] SMPTE 2110
[Video & Audio] Over IP Standards
32NF-60 ST 2110
Video/Audio over IP Standards Universe
SDI
RFC 4175: pixels on IP
VSF TR03: RTP of V, A, M
VSF TR04: 2022-6 + AES67
ST 2022-6: SDI on IP
AES67: audio on IP
RFC 3190: L24 audio
RFC 3550: RTP/UDP/IP
UHD Formats
3G Formats
HD Formats
SD Formats
Audio Samples
Metadata(VANC)
RFC (WIP): VANC on IP
Standards Bodies
Industry Associations
Technology-Specific Alliances/Communities
SFP/QSFP Multi-Source Agreement
Video Services Forum (VSF)
• Video Networking Trade Association
• Technology groups recommendations
• ABC, CBS, NBC, FOX are all members
European Broadcasting Union (EBU)
• 73 members, 56 countries, public media
Advanced Media Workflow Association
• Software APIs, File Formats, etc.
Alliance for IP Media Solutions (AIMS)
• Making the Standards Work, Together
• Encouraging the Standards to be Used
VSF, EBU, AMWA, and AIMS