john mailhot -- cto infrastructure imagine communications · 2017-01-17 · magellan nms manager...


John Mailhot -- CTO Infrastructure, Imagine Communications

Magellan NMS Manager

SD/HD/3G SDI

UHD SQD/2SI

External IP

COTS IP Switches: 48x10G, 576x40G

Cisco, Arista, Brocade, HP, Huawei

Multiviewer Monitoring

Magellan SDN Orchestrator

SDN Orchestrator

Network Manager

Audio Production

Integrated via AES-67

Video Production

2022-6 2110

Timing: SMPTE 2059

What does an IP-Based System Look Like?

SDI / IP

UHD / IP

IP / IP

IP-based Signal Processors

Integrated Playout Devices External IP Gateway

Redundancy Redundancy Redundancy

What if a Switch Fails? What if the Optics Fail? What if a Cable Fails?

ST 2022-7 Works Really Well
• Send Two Copies
• On Two Interfaces
• To Two Switches
• Join and Receive from both
• Packet-by-packet merge
• Can be applied to AES67 audio
• Can be applied to TR03 video
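The packet-by-packet merge can be sketched as follows. This is a minimal illustration, not the ST 2022-7 text itself: it assumes packets from both legs arrive as one interleaved stream of (RTP sequence number, payload) pairs, and the names `hitless_merge` and `window` are hypothetical. A real receiver also buffers and reorders, which is omitted here.

```python
from collections import deque

def hitless_merge(arrivals, window=1024):
    """Yield the first copy of each sequence number; drop the later duplicate.

    arrivals: iterable of (sequence_number, payload) from BOTH legs, in arrival order.
    window:   how many recent sequence numbers to remember (assumed value).
    """
    recent = deque()   # remembers insertion order so old entries can be forgotten
    seen = set()       # fast duplicate lookup
    for seq, payload in arrivals:
        if seq in seen:
            continue   # the other leg already delivered this packet
        seen.add(seq)
        recent.append(seq)
        if len(recent) > window:
            seen.discard(recent.popleft())
        yield seq, payload

# Leg A lost packet 2, leg B lost packet 3; the merged output has no gap.
arrivals = [(1, b"A"), (1, b"B"), (2, b"B"), (3, b"A"), (4, b"A"), (4, b"B")]
merged = list(hitless_merge(arrivals))
```

Because either leg can supply any given packet, a single switch, optic, or cable failure leaves the merged stream intact.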

SDI vs. IP: Different Issues?

SDI Routers
• Cost vs. #ports is very non-linear
• HD (3G) coax or fiber
• One Signal Per Port

Ethernet “L3 Switches”
• Cost vs. #ports is non-linear (but different)
• SFP+, QSFP+, 10GBASE-T
• Multiple Signals Per Port
• Underutilized Ports

How many signals does an endpoint put in a 10GBE port?
• Between 1 and 12* Uncompressed HDs
• Between 1 and 6000 audios
• FULL DUPLEX (like it or not)

The desktop IT world has the same problem, but the answer is not as critical.

Port Utilization is the Hard Part
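A back-of-the-envelope sketch of the port utilization question. The 1.485 Gb/s figure is the nominal HD-SDI line rate, used here only as an illustrative per-stream rate; actual per-stream IP rates depend on video format and encapsulation, so this does not attempt to reproduce the "12*" figure above. The function names are hypothetical.

```python
import math

def streams_per_port(port_gbps, stream_gbps, headroom=0.0):
    """How many streams of a given rate fit in one direction of a port."""
    usable = port_gbps * (1.0 - headroom)
    return math.floor(usable / stream_gbps)

def utilization(n_streams, stream_gbps, port_gbps):
    """Fraction of the port (one direction) actually carrying signal."""
    return n_streams * stream_gbps / port_gbps

HD_SDI_GBPS = 1.485  # nominal HD-SDI line rate, illustrative only

full = streams_per_port(10.0, HD_SDI_GBPS)   # streams that fit in 10GBE
single = utilization(1, HD_SDI_GBPS, 10.0)   # one camera on a 10GBE port
```

A device that sends a single HD signal uses under 15% of its port in one direction, which is why so many ports end up underutilized.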

Typical “IT” Network Architecture

Core Switch (1:1)
• File Servers / NAS
• Compute Servers

Distribution Switches (5:1)

User Edge Switches (25:1)

http://www.excitingip.net/

DataCenter Network Architectures

MLAG/Spine-Leaf, Spine-Leaf (Clos), Spline

Dual-Central-Switches
• Everything home-runs back to main and protect switches
• Obviously Non-Blocking
• 1:1 protect (2022-7)
• The core ports are moving to 100GBE

Engineered Aggregation
• Lightly-loaded links go to aggregator switches
• Bulk uplink to Centrals (no overload)
• Can leverage the 100GBE core ports
• Engineered to be non-blocking

Network Architectures for Television
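One way to pencil out "engineered to be non-blocking" for an aggregator: the port capacity may be oversubscribed on paper, as long as the signal load actually engineered onto the edge ports fits in the uplink. The numbers and function names below are assumptions for illustration.

```python
def uplink_headroom_gbps(edge_loads_gbps, uplink_gbps):
    """Spare uplink capacity; a negative result means the uplink is overloaded."""
    return uplink_gbps - sum(edge_loads_gbps)

def oversubscription_ratio(n_ports, port_gbps, uplink_gbps):
    """Paper ratio of edge port capacity to uplink capacity."""
    return (n_ports * port_gbps) / uplink_gbps

# Hypothetical aggregator: twenty 10GBE edge ports, each carrying 3 Gb/s of signal,
# feeding one 100GBE uplink to the core.
headroom = uplink_headroom_gbps([3.0] * 20, 100.0)
ratio = oversubscription_ratio(48, 10, 100)
```

Here the paper ratio for a 48-port box is well above 1:1, yet the engineered load leaves spare uplink capacity, so no signal is ever blocked.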

Ports with small numbers of signals always go through aggregator

Ports with 50% or better utilization: Wire directly to core

YMMV: pencil it out based on the real switch configurations available and the devices you are designing for

When to do Which? – Depends on Details
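The rule of thumb above reduces to a one-line decision. This is a sketch only: the 50% threshold comes from the slide, while the per-signal rate and the function name `placement` are assumptions.

```python
def placement(signals_on_port, signal_gbps, port_gbps=10.0, threshold=0.5):
    """Return where to wire a device port: straight to core, or via an aggregator."""
    utilization = signals_on_port * signal_gbps / port_gbps
    return "core" if utilization >= threshold else "aggregator"

# Assumed 1.5 Gb/s per HD signal, purely for illustration.
camera = placement(1, 1.5)    # one signal on a 10GBE port
gateway = placement(4, 1.5)   # four signals on a 10GBE port
```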

Typical Shapes and Sizes

Fixed: 48x10 + 4x40 (A, B, C, HP, HU, E, J)

Fixed: 32/36 x40 QSFP+ (most)

Fixed: 64/72 x40 QSFP+ (most)

Fixed: 32/64 x100 QSFP100 (new)

Modular: Large and Larger
• Mix (by card) of 10G, 40G, & 100G ports
• Up to 2000x2000 10GBE equivalent ports

Best Network Architecture?

Reliability Dual Switches, Dual Paths

Cost Engineered Aggregation

Simplicity Control System/SDN

Functionality Control System/SDN

Scalability Extra Ports or Slots

Flexibility Open-Marketplace Feature Set

“Straight to Core” 1:1 & Non-Blocking

Equipment
Nx40G, 4N x 10G

Core Switches are 1:1 redundant
All signals wire back to the core
Modular Core supports a variety of wire formats (identical for redundant)

High Availability Switches
• Redundant Fabric
• Redundant Power
• Redundant Control
• Flexible Ports
• Non-Blocking Line Rate
• ISSU

High Availability Architecture
• Dual Switches 1:1
• Dual Transmit
• Hitless Merge
• 1:1 cables & optics

Plan for Growth
• Empty slots in core

“Leaf-Spine” 1:1 & Non-Blocking

Core/Spine Switches
100G links to top-of-racks
Cores available in many sizes

High Availability Switches
• Redundant Power
• Redundant Control
• Flexible Ports
• Non-Blocking Line Rate
• ISSU

High Availability Architecture
• Dual Switches 1:1
• Dual Transmit
• Hitless Merge
• 1:1 cables & optics

Plan for Growth
• Empty slots/ports @ core
• Add more leafs

Top-Of-Rack / “Leaf”
100G uplinks to cores
10G/25G/40G/50G ports

TOR “leaf”

Network Design – “only” 3K x 3k SDI equivalent

Nx100 Nx100

Nx100 Nx100

Nx100 Nx100

Redundancy 2022-7 dual path

Scale/Utilization Leaf-Spine/Top Of Rack

Blocking? Non-Blocking by Design

Standards/Interop SMPTE 2110

Network Architecture Key Points

Control Systems are your Friend

The Trivial SDI System

Sender A

Sender B

Sender C

Receiver X

Receiver Y

Receiver Z

Receivers are Blind

Controller to Crosspoint

Senders Just Send, even if nobody is listening

The Trivial IP System

Controller

10GBE

Ethernet Switch

10GBE

10GBE

Sender A

Sender B

Sender C

Receiver X

Receiver Y

Receiver Z

Endpoints are Smart

Controller to Endpoints

Senders Just Send

Senders always transmit (onto the IP network), even if nobody is currently listening

Each Sender/Stream has a unique IP Multicast Address

Senders provide information to the controller about their signals (so the controller knows what is being sent)

What Must a Sender Do?
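A sketch of giving each sender stream its own multicast group. The address block `239.5.0.0/16` and the stream names are assumptions for illustration (239.0.0.0/8 is the administratively scoped multicast range); a real plant would follow its own address plan.

```python
import ipaddress

def allocate_groups(stream_names, block="239.5.0.0/16"):
    """Map each stream name to a unique multicast group taken from one block."""
    network = ipaddress.ip_network(block)
    if not network.is_multicast:
        raise ValueError(f"{block} is not a multicast range")
    if len(stream_names) > network.num_addresses - 2:
        raise ValueError("not enough addresses in block")
    # hosts() skips the first and last address of the block
    return {name: str(addr) for name, addr in zip(stream_names, network.hosts())}

groups = allocate_groups(["CAM1", "CAM2", "ENG2"])
```

The resulting table is the kind of information a sender would report to the controller, so the controller knows which group carries which signal.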

Receivers get instructions from the controller
• “Switch to” this new signal

Receivers ask the Network for the Signals
• IGMP Multicast “Join”

Some Receivers will switch cleanly between signals
• While others will not – market forces will drive this

Receivers tell the Network when they are done with a signal
• IGMP Multicast “Leave”

Network figures it out anyway just in case
• IGMP Multicast Querier

What Must a Receiver Do?

Receiver X

Switch-To 239.5.4.3

IGMP join 239.5.4.3

(then) IGMP leave 239.8.7.9

done
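The join-then-leave sequence in the diagram can be sketched with the standard socket options. The function names are hypothetical; on a real UDP socket, the operating system sends the IGMP join and leave messages when these options are set. The interface address `0.0.0.0` (let the OS choose) is an assumption.

```python
import socket

def membership_request(group, interface="0.0.0.0"):
    """Build the ip_mreq value: group address followed by local interface address."""
    return socket.inet_aton(group) + socket.inet_aton(interface)

def switch_to(sock, new_group, old_group=None):
    """Join the new group first, then leave the old one (as in the diagram)."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(new_group))
    if old_group is not None:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP,
                        membership_request(old_group))

class RecordingSocket:
    """Stand-in for a UDP socket so the sketch runs without a network."""
    def __init__(self):
        self.calls = []
    def setsockopt(self, level, option, value):
        self.calls.append((option, value))

receiver_x = RecordingSocket()
switch_to(receiver_x, "239.5.4.3", old_group="239.8.7.9")
```

Joining before leaving briefly doubles the bandwidth into the receiver's port, which is one reason port headroom matters when clean switching is required.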

Recognize & Keep Track of Incoming Multicast Groups

Send (only) the group(s) a Receiver asks for (“Joins”)

Stop Sending the group(s) when a Receiver “Leaves”

Query Receivers about their continued interest

Provide Statistics and Information to the Controller

What Must the Switch/Network Do?
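The switch-side bookkeeping listed above can be modeled as a small table. This is a toy model, not any vendor's implementation; the class name is hypothetical, and the 260-second timeout is the IGMPv2 default group membership interval, assumed here.

```python
from collections import defaultdict

class SnoopingTable:
    """Track which ports asked for which multicast group."""

    def __init__(self, timeout_s=260.0):
        self.timeout_s = timeout_s
        self.members = defaultdict(dict)   # group -> {port: time of last report}

    def join(self, group, port, now):
        """A receiver joined, or answered a query: (re)start its timer."""
        self.members[group][port] = now

    def leave(self, group, port):
        """A receiver said it is done with the group."""
        self.members[group].pop(port, None)

    def expire(self, now):
        """Drop receivers that stopped answering queries."""
        for ports in self.members.values():
            for port in [p for p, t in ports.items() if now - t > self.timeout_s]:
                del ports[port]

    def egress_ports(self, group):
        """Ports that should receive this group (and no others)."""
        return sorted(self.members.get(group, {}))

table = SnoopingTable()
table.join("239.5.4.3", port=7, now=0.0)
table.join("239.5.4.3", port=9, now=100.0)
```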

How did you do it in an SDI System?
• Automated discovery maybe within the same vendor
• Manual entries when integrating others’ equipment

How will it work in an IP Environment?
• Automated Discovery by a Standard Protocol
• Stream Switching requests, too
• Let’s talk about AMWA-NMOS

How do I Find all the Parts of the System?

Advanced Media Workflow Association (AMWA)

Devices (things that make or eat signals)
• Look in DNS (or mDNS) to find the “registrar”
• Tell the “registrar” who they are and what they have
• Keep the “registrar” informed if things change

Anybody (who cares) can:
• Ask the registrar about the devices and their streams
• Ask the registrar to update them if things change

AMWA “Networked Media Open Specifications” (NMOS)
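A rough sketch of what "tell the registrar who they are" can look like over HTTP and JSON. The URL path and the `{"type": ..., "data": ...}` body follow the NMOS registration API as best understood here and should be checked against the published spec; the registry hostname, API version, and the abbreviated node fields are assumptions, and the request is only built, not sent.

```python
import json
import urllib.request

def registration_request(registry_url, resource_type, data, api_version="v1.0"):
    """Build (but do not send) an HTTP POST that registers one resource."""
    url = f"{registry_url}/x-nmos/registration/{api_version}/resource"
    body = json.dumps({"type": resource_type, "data": data}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"},
    )

# Abbreviated, hypothetical node description; a real one carries more fields.
node = {"id": "3b8be755-08ff-452b-b217-c9151eb21193", "label": "CAM1 CCU"}
request = registration_request("http://registry.example", "node", node)
```

Sending it would be a single `urllib.request.urlopen(request)` call once the registrar's address has been found through DNS or mDNS.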

Built on very standard internet/web technology
• Domain Name Service (DNS)
• HTTP / REST communication style
• JSON (JavaScript Object Notation) syntax

Technologies that are fully implemented in all major OS’s and Platforms already

Dozens of companies have tested NMOS in lab environments and interop events already

AMWA “Networked Media Open Specifications” (NMOS)

Here is the spec – FREE (not like SMPTE specs)
• https://github.com/AMWA-TV/nmos

Here is an open-source implementation of it:
• https://github.com/Streampunk/ledger
• There are others reported to be work-in-progress

How “Open” is AMWA-NMOS?

It solves two problems we care a lot about:

Registration & Discovery
• Finding the parts of the system in a vendor-neutral way
• Cataloguing the streams being generated

Minimal Endpoint Control
• Telling a Receiver to join a new stream

Does AMWA-NMOS Solve All Problems?

SDN == “Software Defined Network”
• Means a Different Thing to Everybody who says it
• IT Industry uses “SDN Controllers” to create custom networks on the fly, often coupled with NFV
• Most IT SDN Controllers have little/no support for multicast routing (which we use a lot)

Television “Router Control Systems” designed to support the transition to IP equipment use some SDN techniques to optimize the configuration & operations workflow
• Switch Configuration & Management via SDN interfaces
• Managing some special cases that don’t do IGMP well

What is SDN and do I Need One?

John Mailhot, Imagine Communications

How do we know what is going on in the plant today?
• People Staring at PIPs

How will we know what is going on in the future?
• People Staring at PIPs ?

T&M: Telemetry and Monitoring

Content Monitoring

Cue the Local insert in a network program

Preview the Camera shot before taking it

Time the switch to the talent’s banter

“Feel Good” that the air return looks OK

The backup PSU on the router has failed

There are some errors on the primary leg of CAM3 (secondary OK)

Optical Receive power on CAM5 CCU is low

Audio channel 2 on ENG2 is very quiet

Plant Monitoring

Content Monitoring Human Operators

Human Operators need to see the content to make content-based decisions
• When to roll the interstitial promo
• When to switch from network to local
• Is the camera shot in focus and framed well
• Is the talent ready (for me to take the newsroom)

PIPs on the wall are a great way to interface with Human Operators

Path-Step Monitoring Debugging Tool

Pips for each step along the way

Maps to the hardware step by step

See main and protect sides

Give the Human Operator visual information to make a good decision about taking a backup signal or path
• See what the backup path looks like before using it
• Humans are good at seeing differences between two similar scenes next to each other

But what if there are 50 channels? 100? 200? 500?

Multiviewer - Monitoring By Exception

Many classes of problems can be detected well by non-human methods analyzing the picture (without it being on a PIP in a multiviewer)
• Black, Freeze, Blockiness
• Audio mutes, Audio saturates
• Underlying Errors (EDH, CRC, TSCC, IPSEQ, …)

What is the workflow after detecting an error?
• Call Human Attention to the signal
• “Penalty Box” approach
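A toy sketch of black and freeze detection, to show how little is needed to flag a signal for human attention. Frames are modeled as flat lists of 8-bit luma values; the thresholds and the function name are assumptions, not values from any product.

```python
def find_exceptions(frames, black_level=20, freeze_after=3):
    """Return (frame index, problem) pairs for black and frozen pictures."""
    alerts = []
    previous = None
    repeats = 0
    for index, frame in enumerate(frames):
        # Black: average luma below the threshold
        if sum(frame) / len(frame) < black_level:
            alerts.append((index, "black"))
        # Freeze: the same picture repeated freeze_after times in a row
        repeats = repeats + 1 if frame == previous else 0
        if repeats == freeze_after:
            alerts.append((index, "freeze"))
        previous = frame
    return alerts

still = [100, 120, 110, 90]
frames = [still, still, still, still, [0, 5, 3, 2]]
alerts = find_exceptions(frames)
```

Each alert is a candidate for the "penalty box": only the flagged signals need to be put in front of an operator.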

Telemetry – the New Frontier

Every Piece of equipment in the modern plant generates a ton of dynamic information
• Frame sync: 500+ parameters & 100+ alarm-ables
• MPEG encoder: 500+ parameters & 100+ alarm-ables
• SDI Routers: dozens of status & alarm-ables per simple port
• IP Routers: even more status and alarm-ables available

In most Television plants today, all this telemetry goes onto the floor. It’s not even logged.

To debug the plant, most people stare at the PIPs

Why Telemetry will matter more

Channel counts going up (but # of operators is not)

In single-threaded, non-redundant systems, most failures cause an obvious visible flaw
• In redundant systems, the first failure ONLY causes a telemetry event. The PIP looks OK until something else fails

In remote/virtual environments, telemetry might be all that you get besides the output signal. If the output is bad, telemetry is your only clue
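The point about redundant systems can be made concrete with a tiny classifier: with two legs, the first failure changes the telemetry state but not the picture. The state names and the function are hypothetical.

```python
def classify(primary_ok, secondary_ok):
    """Return (plant state, whether the fault is visible on the PIP)."""
    if primary_ok and secondary_ok:
        return ("healthy", False)
    if primary_ok or secondary_ok:
        # Output is still fine, so only telemetry reveals the lost redundancy
        return ("degraded", False)
    return ("failed", True)

# Errors on the primary leg of CAM3, secondary OK: nothing to see on the wall.
cam3 = classify(primary_ok=False, secondary_ok=True)
```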

What Can I Learn from Telemetry? (1)

Is the device happy with its input? How happy?
• Bit errors, CRC errors, optical power, missed packets, corrected packets

Is the device itself happy?
• Temperature, internal alarms/alerts, redundancy status, configuration changes (logged)

Is the device generating output?
• Status of output signals

What can I learn from Telemetry (2a)

In SDI, the wiring diagram tells you everything.
• Debug by following the wires
• Except if something is patched
• Or if the drawing doesn’t match reality
• Or it goes into a router/switcher/etc.

In IP, many signals per wire, into a big switch
• How do I know where each signal is really going?
• How do I know what signals are on the wire right now?

What can I learn from Telemetry (2b)

IP Switches tell you a TON about what is going on
• LLDP: what is connected to every port
• Interface Statistics: TX/RX packets, bytes, rates, errors, broken out by many different types of error

Flow Data (sFlow, NetFlow)
• Samples of the headers from every port in the switch (includes some bytes of the payload)
• Much can be learned through this simple means

What can sFlow samples tell me?

List (and approximate rate) of every incoming flow

5-Tuple [IP-SRC, IP-DEST, SRC-PRT, DEST-PRT, Proto]

What kind of RTP flow it is
• SMPTE 2022-6, SMPTE 2022-2, TSUDP, AES67, …

Interface Statistics (counters, errors, etc.) also
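A sketch of how sampled headers turn into a flow list with approximate rates. It assumes the samples have already been decoded into a 5-tuple plus frame size (parsing real sFlow datagrams is not shown), and that the switch samples 1 in `sampling_rate` packets; all names and numbers are illustrative.

```python
from collections import Counter, namedtuple

FiveTuple = namedtuple("FiveTuple", "src dst sport dport proto")

def estimate_rates(samples, sampling_rate, interval_s):
    """Estimate bits per second for each flow from 1-in-N packet samples."""
    sampled_bytes = Counter()
    for flow, frame_bytes in samples:
        sampled_bytes[flow] += frame_bytes
    # Each sample stands in for sampling_rate packets of similar size
    return {flow: total * sampling_rate * 8 / interval_s
            for flow, total in sampled_bytes.items()}

cam1 = FiveTuple("10.1.1.10", "239.5.4.3", 5004, 5004, "UDP")
samples = [(cam1, 1400), (cam1, 1400)]
rates = estimate_rates(samples, sampling_rate=1000, interval_s=1.0)
```

The estimate is statistical, so it gets better with more samples; it is good enough to answer "what signals are on this wire right now?"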

How can I analyze this sFlow data?

The IT Marketplace has created dozens of nice analysis tools driven by the data from the switches

About half of these tools are FREE

Turn the Telemetry ON and set it up

Establish a logging environment (choose one)
• Logstash
• Graylog
• Nagios
• Fluentd
• Splunk

Invest some time on building import filters for the equipment that you have already

Telemetry & Monitoring “Sensible” Practices
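An import filter can be as small as the sketch below, which turns one line of device text into a structured event that a logging environment can index. The log line format is entirely hypothetical (real equipment varies), as are the field names.

```python
import re

# Hypothetical format: "<time> <device> <subsystem> key=value key=value ..."
LINE = re.compile(r"(?P<time>\S+)\s+(?P<device>\S+)\s+(?P<subsystem>\S+)\s+(?P<fields>.*)")

def to_event(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LINE.match(line.strip())
    if match is None:
        return None
    event = {key: match.group(key) for key in ("time", "device", "subsystem")}
    for pair in match.group("fields").split():
        key, separator, value = pair.partition("=")
        if separator:
            # Keep numbers as numbers so they can be graphed and alarmed on
            event[key] = int(value) if value.isdigit() else value
    return event

event = to_event("2017-01-17T10:15:00Z CAM3 primary_leg crc_errors=12 state=degraded")
```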

Standards Groups, Trade Associations, and Technology-Specific Alliances

Without Standards, things can’t be designed repeatably

• Nuts and Bolts, ASTM standards for concrete, Dimensional Lumber, etc.

Examples of (Television Industry) Standards that work:
• NTSC & PAL Composite
• AES3 and AES10
• HDMI & DisplayPort
• SDI, HDSDI, 3GSDI

Examples of (Television Industry) Non-Standards that hurt:
• RS422 Control Protocols
• 20 different file formats
• Some tape formats (What is your favorite?)

Technical Standards – Do they Matter?

Physical Layer Standards (Electrical, Optical, Connectors)
• 10G, 40G, and 100G (10x10) Ethernet (IEEE 802.3-2012 is current)
• 25G, 50G, and 100G (4x25) Ethernet (IEEE 802.3ba, IEEE 802.3bj)
• SFP+ and QSFP+ connectors (SFF-8431, SFF-8635, SFF-8665)

RTP/UDP/IP Standards (IETF RFC 3550, etc.)
• How to send time-sensitive streams of things over IP
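For reference, the fixed 12-byte RTP header from RFC 3550 can be unpacked as below. The function name and the example values (payload type 96, the SSRC) are illustrative; header extensions and CSRC lists are not handled.

```python
import struct

def parse_rtp_header(packet):
    """Decode the fixed 12-byte RTP header at the start of a UDP payload."""
    if len(packet) < 12:
        raise ValueError("too short to be an RTP packet")
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": byte0 >> 6,
        "padding": bool(byte0 & 0x20),
        "extension": bool(byte0 & 0x10),
        "csrc_count": byte0 & 0x0F,
        "marker": bool(byte1 & 0x80),
        "payload_type": byte1 & 0x7F,
        "sequence": sequence,       # used for loss detection and 2022-7 merging
        "timestamp": timestamp,     # media clock, ties the packet to a sample time
        "ssrc": ssrc,
    }

example = struct.pack("!BBHII", 0x80, 0x80 | 96, 1234, 90000, 0xDEADBEEF)
header = parse_rtp_header(example)
```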

SMPTE 2022-2 (TS), 2022-6 (SDI), 2022-7 (redundancy)

AES67

[VSF TR03/04] SMPTE 2110

[Video & Audio] Over IP Standards

32NF-60 ST 2110

Video/Audio over IP Standards Universe

SDI

RFC 4175: pixels on IP

VSF TR03: RTP of V, A, M

VSF TR04: 2022-6 + AES67

ST 2022-6: SDI on IP

AES67: audio on IP

RFC 3190: L24 audio

RFC 3550: RTP/UDP/IP

UHD Formats

3G Formats

HD Formats

SD Formats

Audio Samples

Metadata (VANC)

RFC (WIP): VANC on IP

Standards Bodies
Industry Associations
Technology-Specific Alliances/Communities

SFP/QSFP Multi-Source Agreement

Video Services Forum (VSF)
• Video Networking Trade Association
• Technology groups recommendations
• ABC, CBS, NBC, FOX are all members

European Broadcasting Union (EBU)
• 73 members, 56 countries, public media

Advanced Media Workflow Association (AMWA)
• Software APIs, File Formats, etc.

Alliance for IP Media Solutions (AIMS)
• Making the Standards Work, Together
• Encouraging the Standards to be Used

VSF, EBU, AMWA, and AIMS

Chuck Meyer

Paul Briscoe

Michael Mueller

Karl Kuhn

John Mailhot

Thank You