watching video over the web - georgia institute of … more than 40 million unique viewers in the...

27
1 Watching Video over the Web, Part I Streaming Protocols Ali C. Begen, Tankut Akgul and Mark Baugher, Cisco A U.S. consumer watches TV for about five hours a day on average. While the majority of viewed content is still the broadcast TV programming, the share of the timeshifted content has been ever increasing. One third of the U.S. consumers currently use a digital video recorder (DVR)like device for timeshifting, however, the trends are showing that more and more consumers are going to the Web to watch their favorite shows and movies on a computer or mobile device. Increasingly, the Web is coming to the digital TV, which incorporates movie downloads and streaming using Web protocols. In this first part of a twopart article, the authors describe both conventional and emerging streaming solutions using Web and nonWeb protocols and provide a detailed comparison. Summer 2010, time for the 19 th FIFA World Cup. It was an exhilarating whole month with total 64 matches played by 32 national teams. It is no surprise that the World Cup has been one of the most watched events worldwide throughout history. This year, there was even more interest as it was the first time in the history of the tournament that some 25 games were broadcast in 3D. But more importantly, many more viewers than ever watched the games over the Web. With the recent developments in video streaming technologies and increase in the broadband Internet access penetration, fans enjoyed watching the games in HD or near‐HD quality on their computers, smartphones and other connected devices including TVs. In the U.S., 45% of the daily World Cup TV audience watched the matches in a non‐home environment or on a non‐TV platform 1 . ESPN3.com reached over seven million unique viewers and delivered over 15 million hours of content. Just a few months ago, we witnessed the same phenomenon in the Vancouver 2010 Winter Games. In Canada where the population is about 34 million, there were almost four million unique viewers who watched the games on the Internet. CTV, the national TV channel that broadcast the games, made 300 events available on the Web. Canadian Internet users consumed a total of 6.3 million hours of live and 0.9 million hours of on‐demand content, resulting in a total 6.2 petabytes delivered in about two weeks 2 . Two years ago, the Beijing Summer Games also attracted many online viewers ‐ both for live and on‐demand viewing. In the U.S., NBC delivered more than 1100 years of video content to about 52 million unique online viewers during the games. 1 http://www.espnmediazone3.com/us/2010/07/espn‐xp‐world‐cup‐dispatch‐4‐through‐742010/ 2 http://www.microsoft.com/casestudies/Case_Study_Detail.aspx?casestudyid=4000007347

Upload: dinhkhanh

Post on 20-May-2018

213 views

Category:

Documents


0 download

TRANSCRIPT

1

Watching Video over the Web, Part I Streaming Protocols  

AliC.Begen,TankutAkgulandMarkBaugher,Cisco

AU.S. consumerwatches TV for about five hours a day on average.While themajority of viewedcontent is still thebroadcastTVprogramming, the shareof the time‐shifted contenthasbeen everincreasing.OnethirdoftheU.S.consumerscurrentlyuseadigitalvideorecorder(DVR)‐likedevicefortime‐shifting,however,thetrendsareshowingthatmoreandmoreconsumersaregoingtotheWebtowatch their favorite shows andmovies on a computer ormobile device. Increasingly, theWeb iscomingtothedigitalTV,whichincorporatesmoviedownloadsandstreamingusingWebprotocols.Inthis firstpartofatwo‐partarticle,theauthorsdescribebothconventionalandemergingstreamingsolutionsusingWebandnon‐Webprotocolsandprovideadetailedcomparison.

Summer2010,timeforthe19thFIFAWorldCup.Itwasanexhilaratingwholemonthwithtotal64matchesplayedby32nationalteams.ItisnosurprisethattheWorldCuphasbeenoneofthemost

watchedeventsworldwide throughouthistory.Thisyear, therewasevenmore interestas itwas

thefirsttimeinthehistoryofthetournamentthatsome25gameswerebroadcastin3D.Butmoreimportantly, many more viewers than ever watched the games over the Web. With the recent

developments in video streaming technologies and increase in the broadband Internet access

penetration, fans enjoyed watching the games in HD or near‐HD quality on their computers,smartphonesandotherconnecteddevicesincludingTVs.IntheU.S.,45%ofthedailyWorldCupTV

audiencewatchedthematchesinanon‐homeenvironmentoronanon‐TVplatform1.ESPN3.com

reachedoversevenmillionuniqueviewersanddeliveredover15millionhoursofcontent.

Justafewmonthsago,wewitnessedthesamephenomenonintheVancouver2010WinterGames.

InCanadawherethepopulationisabout34million,therewerealmostfourmillionuniqueviewers

whowatched thegameson the Internet.CTV, thenationalTVchannel thatbroadcast thegames,made 300 events available on theWeb. Canadian Internet users consumed a total of 6.3million

hours of live and 0.9 million hours of on‐demand content, resulting in a total 6.2 petabytes

delivered in about two weeks2. Two years ago, the Beijing Summer Games also attracted manyonlineviewers‐bothfor liveandon‐demandviewing.IntheU.S.,NBCdeliveredmorethan1100

yearsofvideocontenttoabout52millionuniqueonlineviewersduringthegames.

1http://www.espnmediazone3.com/us/2010/07/espn‐xp‐world‐cup‐dispatch‐4‐through‐742010/2http://www.microsoft.com/casestudies/Case_Study_Detail.aspx?casestudyid=4000007347

2

Thesporteventsarenotthesoletypeofcontentattractivetoonlineviewers, though. Inthepast

fewyears,manycontentprovidershavestartedmakingtheirregularandpremiumcontentsuchasnews, series, shows and movies available on their Web sites. Despite the certain geographical

restrictionsonmanyoftheseWebsites,theconsumerssawagreatproliferationintheamountof

content they had access to. Some content providers or TV channels actually did not impose anyrestrictionsontheviewerlocation,andthusweretransformedfrombeingaregionalTVchannelto

becomeaglobalone,extendingtheirreachforadrevenuesinanunforeseenmanner.

Theconsumers’desireforbeingabletoaccesspracticallylimitlessamountofcontentanytimetheywanted to and the drop in the delivery costs further hastened the deployment of streaming

services.NewWebsites thatdocontentaggregationsuchasHulucamealive. InMay2010,Hulu

hadmore than 40million unique viewers in the U.S., streamingmore than one billion videos amonth3. Remarkably, these numbers have been steadily increasing. Netflix, who is the largest

subscriptionserviceforDVDrentalandstreamingvideo,currentlyhas13millionmembersinthe

U.S.,55%ofwhomuseitsstreamingservicesonavarietyofdevices4.

Internetvideo,alsoknownasover‐the‐top(OTT)services,canbedividedintodistinctcategoriesof

(i)user‐generatedcontentfrommostlyamateurs(e.g.,majorityofthecontentservedbyYouTube),

(ii) professionally generated content from studios and networks to promote their commercialofferingsandprogramming(e.g.,thecontentservedbyABC.com,hulu.com),and(iii)directmovie

salestoconsumersovertheInternet(alsoreferredtoasElectronicSell‐Through–EST).Inthelast

category, Netflix, Apple TV and new undertakings such as Ultraviolet will greatly increase theamountofvideoofferingsontheInternet.

Cisco’s Visual Networking Index (VNI)5 suggests that traffic volumes in the order of tens and

hundredsofexabytes(onebilliongigabytes)andzettabytes(onethousandexabytes)arenotthatremote.Overthenextfewyears,about90%ofthebitscarriedintheInternetwillbevideorelated

andconsumedbyoverabillionusers.Whilesomeportionofthesevideobitswillbeformanaged

servicessuchascableTVandIPTV,wecannotignoretheamountofbitsfortheunmanaged(OTT)services. Actually, by the end of 2010, the Internet video traffic will surpass peer‐to‐peer file

sharingforthefirsttimeeverandemergeasthetopInternettrafficcontributor.

Cable TV and IPTV services offered by a service provider run over managed networks fordistribution since these services usemulticast transport and require certainQoS features (1). In

contrast, conventional streaming technologies such as Microsoft’s Windows Media, Apple’s

3http://www.comscore.com/Press_Events/Press_Releases/2010/6/comScore_Releases_May_2010_U.S._Online_Video_Rankings4http://www.netflix.com/MediaCenter?id=53795http://www.cisco.com/web/go/vni

3

QuickTime and Adobe’s Flash as well as the emerging adaptive streaming technologies such as

Microsoft’sSmoothStreaming,Apple’sHTTPLiveStreamingandAdobe’sHTTPDynamicStreamingrun over mostly unmanaged, i.e., best‐effort, networks. These streaming technologies send the

contenttothevieweroveraunicastconnection(fromaserverorcontentdeliverynetwork(CDN)

facilitatingcontentdistribution)througheitheraproprietarystreamingprotocolrunningontopofan existing transportprotocol,mostlyTCPandoccasionallyUDP, or the standardHTTPprotocol

thatisoverTCP.

Historically,progressivedownload,whichusesHTTPoverTCP,hasbeenquitepopularforonlinecontent viewing due to its simplicity. In progressive download, the playout can start as soon as

enough and necessary data is retrieved and buffered. Today, YouTube delivers over two billion

videosadaywiththisapproach.However,progressivedownloaddoesnotoffertheflexibilityandrich features of streaming. Before the download starts, the viewer needs to choose the most

appropriate version if there aremultiple offeringswith different resolutions and qualities of the

samecontent.Ifthereisnotenoughbandwidthfortheselectedversion,theviewermayexperiencefrequent freezesandre‐buffering.The trickmodessuchas fast‐forwardseek/playorrewind,are

often unavailable or limited. These limitations are likely to inhibit the growth of large volume

(includingHD)moviedistributionontheInternet.Anewapproach,whichwerefertoasAdaptiveStreaming, is emerging to address these shortcomings while preserving the simplicity of

progressivedownload.

Adaptive streaming is a hybrid of progressive download and streaming. On one hand, it is pull‐basedasisprogressivedownload:theadaptivestreamingclientsendsHTTPrequestmessagesto

retrieveparticularsegmentsofthecontentfromanHTTPserver,andthenrendersthemediawhile

the file is being transferred.On the other hand, these segments are short duration, enabling theclient to download only whatever is necessary and employ trick modes much more efficiently,

which gives the impression that the client is streaming. However, more importantly, the short‐

durationsegments(e.g.,MPEG4filefragments)areavailableatmultiplebitrates,correspondingtodifferentresolutionsandqualitylevels,andtheclientmayswitchbetweendifferentbitratesateach

request. The client strives to always retrieve the next best segment after examining a variety of

parametersrelatedtotheavailablenetworkresourcessuchasavailablebandwidthandthestateoftheTCPconnection(s),devicecapabilitiessuchasdisplayresolutionandavailableCPU,andcurrent

streamingconditionssuchassizeoftheplaybackbuffer.Thegoalistoprovidethebestqualityof

experiencebyenablingdisplayof thehighest achievablequality, faster start‐up,quicker seeking,andreducingskips,freezesandstutters.

As adaptive streaming usesHTTP, it benefits from the ubiquitous connectivity thatHTTPhas to

offer.Today,anyconnecteddevicesupportsHTTP insome form.Since it isapull‐basedprotocolthateasily traverses firewalls andNATdevices, it keepsminimal state informationon the server

side,whichmakesHTTPserverspotentiallymorescalablethanconventionalpush‐basedstreaming

4

servers.To theexistingHTTPcaching infrastructure, adaptive streaming isnodifferent thanany

other HTTP application. Individual segments of any content are separately cacheable as regularWeb objects using HTTP or any RESTful (conforming to the Representational State Transfer

constraints) Web protocol. This allows distributed CDNs to greatly enhance the scalability of

contentdistribution.

In this first installment of a two‐part series,we describe several streaming, both push and pull‐

based,solutionsthatexisttoday,motivatetheneedforadaptivestreaming,andprovideadetailed

comparisonbetweendifferentmethods.

Media Streaming 

Transmissionof contentbetweendifferentnodeson anetwork canbeperformed in a variety of

ways. The type of the content being transferred and the underlying network conditions usually

determine the methods used for communication. In case of a simple file transfer over a lossynetwork, the emphasis is on reliable delivery. The packets can be protected against losseswith

added redundancy or the lost packets can be recovered by retransmissions. When it comes to

audio/videomediadeliverywith real‐timeviewingrequirements, theemphasis ison low latencyand efficient transmission in order to enable the best possible viewing experience. Occasional

lossesmaybetoleratedinthiscase.Thestructureofthepacketsandthealgorithmsusedforreal‐

time media transmission on a given network collectively define the media streaming protocol.Althoughvariousmediastreamingprotocolsavailabletodaydifferinimplementationdetails,they

canbeclassifiedintotwomaincategories:push‐basedprotocolsandpull‐basedprotocols.

Push‐Based Media Streaming Protocols 

Inpush‐basedstreamingprotocols,onceaconnectionisestablishedbetweenaserverandaclient,

the server is always on and streams packets to the client until the session is torn down or

interruptedbytheclient.Consequently,inpush‐basedstreaming,theservermaintainsaconnectionstatewiththeclientand listens forcommandssentbytheclientregardingsessionstatechanges.

Real‐timeStreamingProtocol(RTSP),specified inRFC23266, isoneof themostcommonsession

controlprotocolsusedinpush‐basedstreaming.

Push‐basedstreamingprotocolsgenerallyutilizeReal‐timeTransportProtocol(RTP),specifiedin

RFC3550,orequivalentpacketformatsfordatatransmission.RTPusuallyrunsonUserDatagram

Protocol(UDP),which isastatelessprotocolwithoutany inherentrate‐controlmechanisms.Thisallows theserver topushpackets to theclientatabitrate that isdependenton theclient/server

6AllRFCdocumentsarefreelyaccessiblefromhttp://www.ietf.org/rfc.html.

5

implementationattheapplicationlevelratherthantheunderlyingtransportprotocol.Thismakes

RTPanicefitforlow‐latencyandbest‐effortmediatransmission.

In conventional push‐based streaming, the content is usually transmitted at themedia encoding

bitratetomatchtheclient’smediaconsumptionrate.Innormalcircumstances,thisensuresthatthe

clientbufferlevelsstaystableovertime.Italsooptimizestheuseofnetworkresourcesbecausetheclientusuallycannotconsumeat anyrateabove theencodingbitrate; consequently, transmitting

abovetheencodingbitratewouldunnecessarilyloadthenetwork.Moreover,thismaynotbeeven

possibleforlivestreamswherethestreamisbeingencodedon‐the‐fly.However,ifpacketlossortransmissiondelaysoccurover thenetwork, theclient’spacketretrievalratemaydropbelowits

consumptionrateandtheclient’sbuffermaystartdraining,whichmayeventuallyresultinabuffer

underflowandinterruptthemediaplayback.Thisiswherebitrateadaptationcomesintoplay.

Topreventabufferunderflow,themediaservermaydynamicallyswitchtoalowerbitratestream.

This,inturn,reducesthemediaconsumptionrateattheclient‐sideandthereforecounteractsthe

effectofnetworkbandwidthcapacity loss.Sincea suddendrop inencodingbitratemayresult innoticeablevisualqualitydegradation,theencodingbitratereductionmaybeperformedinmultiple

intermediatestepsuntiltheclient’sconsumptionratematchesordropsbelowitsavailablereceive

bandwidth.Whenthenetworkconditionsimprove,theserverdoestheoppositeandswitchestoahigherbitratestreaminstead.Again,thisprocessmaybeperformedinmultipleintermediatesteps

nottocauseasuddenoverloadonthenetwork.Providedthatthestreamisnotalivestreamand

thenetworkcapacityallowsforahigherbitratetransmission,theservermayalsochoosetosendover packets at a higher rate than the media encoding rate to fill up the client’s buffer. By

dynamicallymonitoringtheavailablebandwidthandbuffer levelsandadjustingthetransmission

rateviastreamswitching,push‐basedadaptivestreamingmayachievesmoothplaybackexperienceatthebestpossiblequalitylevelwithoutpausesorstuttering.

Thebandwidthmonitoringisusuallyperformedontheclient.Theclientcomputesnetworkmetrics

suchasround‐triptime(RTT),packetlossandnetworkjitterperiodically.Theclientmayusethisinformationdirectly tomakedecisions forwhen to switch to ahigheror a lowerbitrate stream.

Alternatively,theclientmaycommunicatethisinformationalongwithitsbufferlevelstotheserver

viareceiverreportsandlettheservermakethosedecisions.SuchreportsareusuallytransmittedviaReal‐timeControlProtocol(RTCP).

In the following section,we study themechanismsdefined in3rdGenerationPartnershipProject

(3GPP) that constituteanexample forhowbitrateadaptationcanbe implemented inpush‐basedstreamingservers.

Bitrate Ad

3GPP istechnical

(GSM) ne

popularQuickTim

In its Re

dynamicastatistics

areimple

Figure1.

Figure1:E

During sbitrate an

response

withinth

Then,the

daptation in 3

collaboratiospecification

etworks. Th

push‐basedmeStreaming

lease 6, 3GP

ally switchinabout availa

ementations

Example3GPP

ession setupnd codec to

toclient’sR

heavailablea

eclientsend

3GPP 

on betweennsfor3Gmo

he bitrate ad

streaminggServerbyA

PP definedm

ng the transmablebandwi

specific.How

streamingses

p, the serverthe client v

RTSPdescri

audioandvid

sRTSPsetu

telecommuobilesystem

daptation m

servers suchApple.

mechanisms

mission bitraidth and clie

wever,inaty

ssionsetupwit

r sends a lisvia SessionD

ibemessag

deostreamst

upmessages

6

unications assbasedonG

echanisms d

h as Helix

for themed

ate (2) (3).ent buffer lev

ypicalimplem

thbitrateadap

st of availabDescription P

e.Receiving

thatbestfitt

s(oneforea

ssociationsGlobalSystem

defined in 3

streaming s

dia client to

The algorithvels and ada

mentation,th

ptation.

ble streamsProtocol (SD

sessiondesc

toitslinkspe

chuniqueau

whose scopmforMobile

3GPP standa

server by R

inform the

hms used toapting the bi

heprocessgo

and their prDP), specified

cription,the

eedanddeco

udioandvid

pe is to proeCommunica

ards are use

Real Network

media serve

collect realitrate accord

oesasdepict

roperties sud in RFC 456

clientpicks

odingcapabi

deostream)t

oduceations

ed by

ks or

er for

l‐timedingly

tedin

ch as66, in

from

ilities.

tothe

7

server to prepare the server for sending out the selected streams. The RTSP message header

contains a3GPP-Adaptation parameter that informs the server about the client’s buffer size

(size attribute) andminimum required buffering (target‐time attribute) to ensure interrupt‐free

playback.Italsocontainsa3GPP-Link-Charparameterthatprovidesinformationaboutclient’s

guaranteedreceivebandwidth(GBWattribute).

After the serverobtains informationabout the client’sbuffer and receivebandwidth, it setsupa

buffermodeltosimulatethechangesinclient’sreceivebufferlevels.Inthisbuffermodel,theservertries to satisfy the minimum required buffer level while at the same time it tries to prevent

overflows. Since the network conditions and buffering requirementsmay change over time, the

clientkeepsnotifying the serverviaperiodicRTCP receiver reports.TheseRTCPmessages carryinformationsuchasavailablefreespaceinclient’sreceivebuffer,playoutdelay(thetimedifference

between the presentation time of the next frame in the decoder buffer and the time the RTCP

message isbeing sent)andestimatednetworkbandwidth.Using this information, the server canmaintainanaccuratemodelofclient’sreceivebufferandcanmakeinformeddecisionsaboutwhen

toswitchtoahigheroralowerbitratestreamdynamically.

Theupshiftanddownshiftbuffer level thresholds for thestreamareusuallypre‐programmedonthe server. For instance, if the server detects that the client’s buffer level is below the next

downshift threshold, it switches to the next lower encoding rate available to prevent the buffer

fromdraininganyfurther.Ontheotherhand,ifthebufferlevelexceedsthenextupshiftthreshold,the server switches to the next higher encoding rate provided that the network bandwidth can

support that rate. Ifnot, theservermaychoose toslowdown the transmissionrate topreventa

bufferoverflow.

Pull‐Based Media Streaming Protocols 

In pull‐based streaming protocols, themedia client is the active entity that requests the content

fromthemediaserver.Therefore,theserverresponsedependsontheclient’srequestswheretheserver is otherwise idle or blocked for that client. Consequently, the bitrate at which the client

receives the content is dependent upon the client and the available network bandwidth. As the

primarydownloadprotocoloftheInternet,HTTPisacommoncommunicationprotocolthatpull‐basedmediadeliveryisbasedon.

Progressive download is one of the most widely used pull‐based media streaming methods

availableonIPnetworkstoday.Inprogressivedownload,themediaclientissuesanHTTPrequestto the server and startspulling the content from the server as fast aspossible.Onceaminimum

required buffer level is obtained, the client starts playing the media while at the same time it

continuestodownloadthecontentfromtheserverinthebackground.Aslongasthedownloadrateisnotsmaller than theplaybackrate, theclientbuffer iskeptatasufficient level tocontinue the

playback

mayfallb

Similarto

bufferun

bitrate adfragment

independ

reconstruthe right

fragment

according

Figure2:E

Although

forfragm

audiofraframeisu

stuffed in

fragmentindepend

video, th

withoutany

behindthepl

omethodsus

nderflowinp

daptation.Thts) each of

dently. When

uctedseamlet encoding b

t from the m

gtotheavail

Examplebitrat

hthestructur

mentconstruc

meusuallycusuallydeco

nto a media

t duration. Fdently due to

he partitionin

yinterruptio

laybackrate

sedinpush‐b

pull‐basedstr

hemedia cowhich is e

n the fragme

essly.Duringbitrate that m

media serve

lablereceive

teadaptationi

reoftheme

ctionisthes

consistsofcodableonitso

a fragment b

For video, oo common t

ng for fragm

n.However,

andaneven

basedstream

reamingprot

ntent isdiviencoded at

ents are play

gdownload,tmatches or

er. This way

bandwidth.

npull‐baseda

diafragmen

same.Inthe

onstantduratownforcom

by combinin

on the othetemporal pre

ment constru

8

ifthenetwo

ntualbufferu

ming,bitrate

tocols.Figur

ided into shovarious bit

yed back to

themediaclis below the

y the client

adaptivestream

tsdiffersam

casewhere

tionaudiosammonaudioc

ng sufficient

er hand, theediction app

uction is per

orkcondition

underflowma

adaptationi

e2showsan

ort‐durationtrates. Each

back, the o

lientdynamie available b

can adapt

ming.

mongimplem

audioandvi

amplesinthecodecs.There

t number of

e frames arplied betwee

rformed at

nsdegrade,t

ayresult.

isameasure

nexampleim

nmedia segmh fragment

original med

icallypickstbandwidth a

its media c

mentations,th

ideoarenot

emillisecondefore,audio

f audio fram

e not necesen the frame

the Group o

hedownload

etakentopr

mplementatio

ments (also ccan be dec

ia stream ca

thefragmentand requests

consumption

hebasicprin

interleaved,

dsrangeanddatacaneas

mes to match

ssarily decoes. Therefor

of Pictures (

drate

event

onfor

calledcoded

an be

twiths that

n rate

nciple

,each

deachilybe

h the

dablee, for

(GoP)

boundary

frame (I‐thatdepe

the same

closedGofragment

beadjust

duration.preferabl

In the fo

streaminMicrosoft

Microsoft

Microsoft(PIFF) (4

implemen

there istypically

Figure3:F

Figure 3

hierarchi

QuickTimEachbox

y instead. In

‐frame) thatendonother

e GoP, the Go

oPisself‐contfromaclose

tedsuchthat

. However, iletopackmo

ollowing sec

g protocolstandHTTPL

t Smooth Stre

t’s Smooth S4), which is

ntation, all f

a separate fcontainasin

FragmentedMP

shows the s

icaldatastru

me).Thereartypeisdesig

n video codin

can be decoframes.Ifth

oP is called

ntained(i.e.,iedGoPstraig

tthetotaldu

if the fragmeorethanone

ctions, we lo

that are basLiveStreamin

eaming 

Streaming ims an extens

fragmentsha

file for eachngleGoP.

P4fileformat

structure of

ucture.Theb

reboxesthatgnatedbyaf

ng, a GoP is

oded indepehepredictedf

a closedGoP

itcanbedecghtforwardly

urationofth

ent durationGoPintoafr

ook into tw

sed on multngbyApple.

mplementatioion to MPE

aving the sam

available b

(Reproducedf

a fragmente

uildingblock

tcontainaudfour‐letterid

9

a sequence

ndently folloframeswithi

P. Otherwise

codedindepey.Duringenc

heframesin

n is large coragment.

o different

i‐bitrate frag

on is basedEG4 (MP4)

mebitrate a

itrate. Fragm

from(4)).

edMP4 file.

kofthishiera

dio/videodatdentifier.The

of frames th

owed by preinaGoPdepe

e, the GoP is

endentofothcoding,then

theGoPise

ompared to

implementa

gments. The

on Protectespecification

are stored in

ments are u

A fragmente

archyiscalle

taaswellasefilestartsw

hat startwit

edicted framendonlyont

s called ano

herGoPs),onnumberoffra

equaltothe

a typical Go

ations of pul

ese are Smoo

ed Interopern (5). In Sm

na singleMP

sually two s

edMP4 file

edabox(kno

sboxesthatcwithanftypb

th an intra‐c

mes (P‐/B‐fratheframesw

openGoP. Sin

necanconstramesinaGo

desiredfrag

oP size, then

ll‐based ada

oth Streamin

rable File Fomooth Strea

P4 file.There

seconds long

is composed

ownasanat

containmetaboxthatdesc

coded

ames)within

nce, a

ructaPcan

gment

n it is

aptive

ng by

ormataming

efore,

g and

d of a

tomin

adata.cribes

theversio

that descfragment

mdatbox

Themoofragment

Inorder

knowwhclientatt

sideman

theavaila

Figure4:

Oncethe

for the inpair.The

and the t

Streamin

Whenthe

fragment

informatifilemaps

informati

correspo

Afterthe

achieved

contain tinformati

theserve

box,andt

GET/s

oninformati

cribes the mtiscontained

xisimmedia

fboxmaycot,thenumber

fortheclien

hether that fthebeginnin

nifestfiledes

ablestreams

AnexampleH

clienthasdo

ndividual fraHTTPreque

timeoffseto

ngTransport

emediaserv

t, it first ne

ioniscontainsMP4 files t

ion it receiv

ndingMP4fi

correctfile

via indexing

the locationionitreceive

er locates the

thesemakeu

sample/v_72

ionforthesp

media tracksdwithinabo

telyprecede

ontainmetadrofsamples

nttorequest

fragment is agofthesessi

scribescodec

.

HTTPrequestfo

ownloadedt

agments. Eacestmessage

of thereques

clientreque

vergetsanH

eds to dete

nedinanotho thebitrate

ves from the

iletosearch.

isfound,the

gdata.Asse

and presentesfromthec

e fragment i

upthefragm

0p.ism/Quali

pecification(

available inoxoftypem

edbyaboxo

dataandsignwithinthefr

afragment

availableonionviaaclie

cs,videores

oradownload

themanifest

ch fragmentheadercont

sted fragmen

st.

HTTP‐encaps

rmine which

ermanifestfesof the frag

e client (as s

enextstepis

een inFigure

tation time oclienttofind

in the file, it

mentonthew

ityLevels(150

10

s)thefileco

n the file. Tmdat.Inafra

oftypemoof

nalinginformragmentand

withaspeci

the server.Tnt‐sidemani

olutions,cap

dingafragmen

file,itstarts

is downloadtainstwopie

ntas shown

sulatedrequ

h MP4 file

file,whichisgments they

seen in Figu

stolocateth

e3,anMP4

of a fragmendamatchwi

sends thec

wire.

00000)/Frag

omplieswith

The audio/viagmentedMP

thatcontain

mationsuchadurationof

ificbitratean

This informaifestfile.Ina

ptionsandot

nt.

makingHTT

ded via a unecesof infor

inFigure4,

est fromthe

to search to

calledserveycontain.Th

ure 4) in the

herequested

filealsocon

nt. Themediithafragmen

ontainedmo

gments(video

h.Itisfollowe

ideo mediaP4container

nsmetadataf

asasequenceachsample

ndstarttime

ation is comadditiontob

therauxiliar

TPGETrequ

niqueHTTPrmation.The

which isaM

eclientfora

o locate tha

er‐sidemanifhe server loo

e manifest f

dfragmentin

ntainsboxes

ia server usentindexint

oofbox follo

o=160577243

edbyamoo

data for a srstructure(5

forthatfragm

cecounterfoe.

e, itfirstnee

mmunicated titrates,thec

ryinformatio

ueststothes

request‐respsearetheb

MicrosoftSm

aparticularm

at fragment.

fest.Thismanoksup theb

file and find

nthatfile.T

of typemfra

es the time otheMP4file.

owedby the

3)HTTP/1.1

vbox

single5),an

ment.

orthe

edsto

to theclient‐

onfor

erver

ponseitrate

mooth

media

This

nifestitrate

ds the

Thisis

a that

offsetOnce

mdat

Apple HTT

Apple’s H(referred

ISO/IEC1

mediasewhich is

intoequa

comparedbetterco

number

decisionstodynam

Figure5:E

Similarto

oftheme

playlist f

organized

#EXTM3#EXT‐X

http://e

#EXT‐X

http://e

#EXT‐X

TP Live Strea

HTTP Lived to asmedia

13818‐1MPE

gment is stotypically a s

aldurationse

d to twosecmpressions

of fragment

scanbemadmicbandwidt

Exampleshowi

oMicrosoft’s

ediasegment

file standard

d ina two‐le

3U‐STREAM‐INF:P

example.com/lq

‐STREAM‐INF:P

example.com/h

‐ENDLIST

ming 

Streaming ia segment in

EG2Transpo

ored ina sepsoftware com

egments.The

condsused iinceitprovid

ts. However

de.Thereforethchanges.

ingHTTPLive

sSmoothStr

tsavailablef

d and is def

evelhierarch

PROGRAM‐ID=1

q/prog1.m3u8

PROGRAM‐ID=1

q/prog1.m3u8

mplementatn the implem

ortStreamfil

parate transpmponent, rea

edefaultseg

nMicrosoftdesbetterte

r, it also red

e,longfragm

Streamingpla

reamingimpl

fortheclient

fined in (6)

hy.Each file

1,BANDWIDTH

1,BANDWIDTH

11

ion (6) follmentation) st

leformat.As

port streamads a contin

gmentduratio

SmoothStreemporalredu

duces the g

mentduration

aylists.

lementation,

t.Themanif

. Figure 5 s

startswitha

H=100000

H=2000000

ows a diffetorage.HTTP

sopposedto

file.Aproceuous transp

onis10seco

eaming.Longundancyand

granularity a

nmaynotbe

,aclient‐sid

festfileform

shows an ex

anEXTM3U

#EXTM3

#EXT‐X‐M

#EXT‐X‐T

#EXTINF

http://ex#EXTINF

http://ex

#EXT‐X‐E

#EXTM3#EXT‐X‐M

#EXT‐X‐T

#EXTINF

http://ex

#EXTINF

http://ex…

#EXT‐X‐E

erent approaPLiveStrea

SmoothStre

essnamedSort stream f

onds.Thisis

ger fragmendatthesame

at which fra

easgoodint

emanifestfi

matisanexte

xample of t

tag thatdist

U

MEDIA‐SEQUEN

TARGETDURAT

F:10,

xample.com/seF:10,

xample.com/se

ENDLIST

UMEDIA‐SEQUEN

TARGETDURAT

F:10,

xample.com/se

F:10,

xample.com/se

ENDLIST

ach for fragaming isbase

eaming,here

StreamSegmefile anddivid

alongerdur

ntdurationaetimereduce

agment‐swit

termsofada

ilekeepsare

ensiontothe

the manifest

tinguishes th

NCE:0

TION:10

gment‐HQ1.ts

gment‐HQ2.ts

NCE:0

TION:10

gment‐LQ1.ts

gment‐LQ2.ts

gmentedon

eeach

enter,des it

ration

allowsesthe

tching

apting

ecord

eMP3

t files

he file

fromano

level file

Identifier

has an a

manifest

alternate

quality(2

Thelowe

download

number o

sequencesequence

EXTINF

correspo

Example

Aclient‐s

theserve

identifiesfrom frag

othercon

Figure6:A

Thebasic

theclient

ordinaryMP3

s on the rig

r(URI)thati

ttribute nam

for an alter

encodingsf

2Mbps)enco

er‐levelfiles

d.Following

of the first s

e number, ae number. F

tag.Thistag

ndingmedia

e Functiona

sidepull‐bas

erprovidess

s files that cogments of a

nditions.

Aclient‐sidepu

cclient/serv

t‐sideofFigu

3playlist.Th

ght. The lin

salwayspre

med BANDWI

rnate encodi

foreachsegm

oding.

ontheright

theEXTM3U

segment in t

nd sequenceFinally, the U

ghasanattr

asegment.In

al Diagram 

edadaptive

standardres

ontainmediafile over on

ull‐basedadap

veradaptive

ure6.

hefileonthe

nk for each

ecededbyan

IDTH that in

ing of the se

ment, that is

providethe

Utag,isanE

the playlist.

e numbers iURIs for the

ibutesepara

theexample

for Client‐S

streamingim

ponsestoHT

apresentatioeormore co

ptivestreaming

streamingco

12

leftisahigh

lower‐level

nEXT-X-STR

ndicates that

egments it li

s,onelow‐qu

linksforthe

EXT-X-MEDI

Each segme

increment be individual

atedbyasem

e,eachsegme

Side Pull‐Ba

mplementati

TTPGETreq

onsat alternonnections a

gimplementat

onfiguration

her‐levelfile

file is spec

REAM-INFt

t the corres

ists. In the e

uality(100K

eindividual

IA-SEQUENC

ent in the p

by one startsegments f

micolonthat

entis10seco

ased Adapt

ionisillustra

quests.Thec

nativebitrataccording to

tion.

requiresthe

thatlinksto

cified in a U

tag.EXT-X-S

sponding low

example, the

Kbps)encodi

mediasegm

CEtagthatl

playlist typica

ting from thfollow each

t indicatesth

ondslong.

tive Stream

atedinFigur

clientgetsa

es.Theclieno the playout

egeneralfun

otwootherlo

Uniform Reso

STREAM-IN

wer‐level file

e stream has

ingandone

mentsavailab

liststhesequ

ally has a un

he first segmmarked wit

hedurationo

ming 

re6.Atminim

manifestfile

nt acquiresmt buffer state

nctionsshow

ower‐

ource

NFtag

e is a

s two

high‐

lefor

uence

nique

ment’sth an

ofthe

mum,

ethat

mediae and

wnon

13

Theclientneedsafile,calledaclient‐sidemanifestoraplaylisttomapfragmentrequeststo

specific files or byte ranges into files. In some adaptive streaming schemes, there is asimilarfileontheservertotranslateclientrequests.

Managing a playout buffer is also a basic function that is always needed on an adaptive

streamingclient,whichusesadaptationlogictoselectfragmentsfromafileataparticular

bitrate in response to buffer state and potentially other variables. Typically, an adaptivestreamingclientkeepsaplayoutbufferofseveralseconds,e.g.,5–30seconds.

Atransportisneededtocommunicaterequestsfromtheclienttotheserver;thepull‐based

adaptivestreamingschemesthatwesurveyinthisarticleuseHTTPGETrequests,eithertoa standardHTTPWeb serveror to a specializedWeb servicesAPI that is supportedby a

specialserviceontheserver(suchasaSmoothStreamingTransportapplicationrunningon

Microsoft’sInternetInformationServicesserver).

The GET requests use a single TCP connection by default, but some adaptive streaming

implementations support using multiple concurrent TCP connections for requesting

multiplefragmentsatthesametimeorforpullingaudioandvideoinparallel.

Theclientispre‐configuredtorequestamovieatacertainbitrate,orprofile,basedontheresultof

networktestsorsimpleconfigurationscript.WhenaprofileisselectedandtheURIassociatedwith

theprofile is found in themanifest, the clientestablishesoneormore connections to the server.Wehaveobservedthatdifferentproductsemploydifferentstrategies.Forexample,someproducts

useonlyoneTCPconnectionforGETrequeststoafilewhereasothersopenandclosemultipleTCP

connectionsduringanadaptivestreamingsession.Astheclientmonitorsitsbuffer,itmaychoose

toupshifttoahigher‐bitrateprofileordownshifttoaloweronedependingonhowmuchtherateofvideo transport does or does not exceed the rate of media playout. Some adaptive streaming

productsopenanewconnectionwhenupshiftingordownshiftingtoanewprofile.

Alternative Methods for Bitrate Adaptation 

We havementioned adaptive streamingmethods thatwork based on the principle of switching

betweenalternateencodingsofastreamasawholeorswitchingbetweenindividualfragmentsofitencoded at several bitrates. There is also an alternative technology called Scalable Video Coding

(SVC). In SVC‐based systems, a client can choosemedia streams appropriate for the underlying

network conditions and its decoding capabilities. SVC has been standardized as an extension toH.264/MPEG4 AVC video compression specification. SVC in its current form applies to video

streamsonly.

InSVC,avideobitstreamismadeupofahierarchicalstructureoflayers.Thebaselayerprovidesthelowest levelofquality intermsof framerate,resolutionandsignal‐to‐noiseratio(SNR).Each

enhancement layer on top of the base layer provides an improvement for one ormore of these

scalable quality parameters. Enhancement layers can be independently stored or sent over the

14

network.Therefore,theoverallstreambitratecanbemodifiedbyselectivelyaddingorsubtracting

enhancementlayersto/fromastream.

The quality knobs in SVC are referred to as temporal scalability, spatial scalability and SNR

scalability.Temporalscalabilityprovidesawaytoaddordropcompletepicturesto/fromastream.

Temporal scalability is not something newly introduced by SVC. Temporal scalability in itsmostbasicformhasbeenusedintraditionalpush‐basedstreamingserversintermsofamethodknown

as stream thinning. In spatial scalability, on the other hand, video signal is encoded at multiple

resolutions. The reconstructed lower‐resolution frames can be used to predict and reconstructhigher‐resolution frameswithadded informationsent in theenhancement layers.Finally, inSNR

scalability,videosignalisencodedatasingleresolution,butatdifferentqualitylevelsbymodifying

theprecisionofencoding.Eachenhancementlayerincreasestheprecisionofthelowerlayers.

A key advantage of SVC lies in its ability to distribute information among various layers with

minimaladdedredundancy.Inotherwords,whileastreamthatistraditionallyencodedatdifferent

quality levels has significant redundancy between the encodings; each layer in an SVC‐encodedstreamhasminimalcommoninformationbetweenthelayers.ThismakesSVCefficientforstorage

ofmediaatvariousqualitylevels.AnotheradvantageofSVCisthegracefuldegradationofstream

quality without client or server intervention when packets from enhancement layers cannot bedeliveredduetoabruptchangesinnetworkconditions.Thisisincontrasttomulti‐bitrateencoding

whereitrequiresswitchingfromoneencodingtoanotherviaadecisionbytheclientortheserver

toadjusttothechangesinnetworkconditions.SVCstreams,ontheotherhand,aretypicallymorecomplex tobegeneratedand imposecodecrestrictionscompared tomulti‐bitratestreams.Thus,

theadoptionrateforSVChasbeenratherlow.

Comparison of Push‐Based vs. Pull‐Based Streaming Protocols 

Above,wedescribedpushandpull‐basedstreamingprotocols,mentionedtheirbitrateadaptation

mechanisms and provided case studies. In this section, we present a qualitative comparison

betweenthesetwoclassesofmediastreamingprotocols.

Table 1 summarizes the differences between push and pull‐based streaming7. One of the main

differencesbetweenpushandpull‐basedstreamingisthecomplexityoftheserverarchitecture.As

mentionedbefore, inpull‐basedstreaming, thebitratemanagement isusuallyataskof theclient,which significantly simplifies the server implementation. Furthermore, pull‐based streaming can

runon topof simpleHTTP.Therefore,withminorprovisions, anordinaryWeb server can serve

mediacontentinpull‐basedstreaming,althoughcomplexitiesmayariseinstreaminglivecontentto

7Inthiscomparison,pull‐basedstreamingexclusivelyreferstostreamingoverHTTP.Albeitunlikely,apull‐basedstreamingmethodrunningoveradifferenttransportcanbelaterdevelopedandadopted.

15

a largenumberofclients.Push‐basedstreaming,ontheotherhand,requiresaspecializedserver

thatimplementsRTSPorasimilarpurpose‐builtprotocolandhasbuilt‐inalgorithmsfortaskssuchasbitratemanagement,retransmissionandcontentcaching.Thismakespull‐basedstreamingcost

effectivecomparedtopush‐basedstreaming.

Despite the lower server costs, pull‐based streaming is usually less efficient overhead‐wise thanpush‐based streaming due to the underlying transport protocol being used. Compared to HTTP

overTCP,RTPimposesalowertransmissionoverhead.Moreover,sinceRTPusuallyrunsontopof

UDP, the retransmission dynamics and congestion controlmechanisms of TCP do not inherentlyexistinRTP.

Both protocols allow client buffering both at the beginning of a session and also after trick‐play

transitionssuchasfastforwardtoplay.Thisisperformedtopreventbufferunderflowsandachievesmoothplayback experience. In adaptive streamingmethods, the initial client bufferingduration

canbesubstantiallylessthannon‐adaptivestreamingmethodsduetothefacttheclientcanpicka

lower‐bitratestreamtobeginwith.Thisallowsfaststartuptimesandincreasesresponsiveness.

Oneofthekeybenefitsofpush‐basedstreamingismulticastsupport.Multicastallowsaserverto

sendapacketonlyoncetoagroupofclientswaitingtoreceivethatpacket.Thepacketisduplicated

alongthenetworkpathintheoptimalway.Aclientcanjoinorleaveamulticastgroupondemand.Thisway,aclientcanreceiveonly thepackets itwants.Pull‐basedstreaming,ontheotherhand,

worksbasedontheunicastdeliveryscheme.Inunicastdelivery,thereisaone‐to‐onepathbetween

theserverandeachclient.Therefore,intheworstcase,aserverhastosendapacketasmanytimesasthenumberofclientsrequestingthatpacket,and,similarly,anetworknodeinthemiddlehasto

passalongthesamepacketmultipletimes.Thisreducestheefficiencyofthenetwork.

The routing and packet delivery over the network can bemademore efficient by using contentcaching.Thecontentiscachedalongthenetworkondedicatedcacheservers.Thiswayaclientmay

obtaincontentfromanearbycacheserverinnetworktopologyinsteadofgoingallthewayupto

theoriginserver.Probably, themostefficientuseofcaching is in thecaseofpull‐basedadaptivestreaming.Inthiscase,eachfragmentcanbecachedinthenetworkindependently.

Table1:Comparisonbetweenpush‐basedandpull‐basedstreamingprotocols.

Push‐BasedStreaming Pull‐BasedStreaming

ServerType Streamingserver

WindowsMedia,AdobeFlash,AppleQuickTime,CiscoCDS

Webserver

LAMP,MicrosoftIIS,CiscoCDS

URLFormat rtsp://…,rmtp://… http://…

Protocols RTSP,RTMP,RTP,UDP HTTP

16

BandwidthUsage (Likely)Moreefficient (Likely)Lessefficient

ClientBuffering Yes Yes

VideoMonitoringand

UserTracking

Standardtools (Currently)Proprietary

MulticastSupport Yes (Currently)No

Contentcaching Requiresspecialservers Yes

Concluding Remarks 

PopularityofwatchingtraditionalbroadcastTVprogrammingisweakeningeverydayagainstthe

rapidlyincreasinginterestinwatchingthecontentontheWeb.ConsumersareaccessingtheWebcontentnotjustfromaTVinthelivingroombutfromavarietyofdevicesfromavarietyofplaces

connected throughdifferent typesofaccessnetworks.Adaptivestreamingmethods that candeal

withthechallengespresentedbythevarietyofsuchdevicesandnetworksaswellasthescalabilityofcontentdistributiontolargeaudiencesarefurtheracceleratingthistrendtransition,whichwill

clearly impact theexistingbusinessmodelsandcreatenewrevenueopportunities. In thesecond

partofthisarticle,wewilllookintoapplicationsforstreamingincludingend‐to‐endmobileandin‐home streaming, contrasting adaptive approaches to other videodeliveryparadigms, discuss the

current standardization efforts and highlight the areas that still require further research and

investigation.

References 

1.  IPTV: Reinventing  Television  in  the  Internet Age.  Thompson, Greg  and Chen, Yih‐Farn Robin. May 

2009, IEEE Internet Computing, Vol. 13, pp. 11‐14 . 

2.  Transparent  End‐to‐end  Packet‐switched  Streaming  Service  (PSS);  Protocols  and  Codecs.  3GPP  TS 

26.234. [Online] Oct. 2010. http://ftp.3gpp.org/specs/html‐info/26234.htm. 

3. Adaptive  Streaming within  the 3GPP Packet‐Switched  Streaming  Service.  Fröjdh, Per  , et al., et al. 

Mar. 2006, IEEE Network, Vol. 20, pp. 34‐40. 

4. Protected Interoperable File Format. [Online] May 2010. http://go.microsoft.com/?linkid=9682897. 

5.  ISO/IEC  14496‐12  Base  Media  File  Format.  [Online]  2008. 

http://www.iso.org/iso/catalogue_detail.htm?csnumber=51533. 

17

6. HTTP Live Streaming. IETF Internet Draft. [Online] June 2010. http://tools.ietf.org/html/draft‐pantos‐

http‐live‐streaming. 

 

Biographies 

AliC.Begen([email protected])iswiththeVideoandContentPlatformsResearchandAdvanced

DevelopmentGroupatCisco.Hisinterestsincludenetworkedentertainment,Internetmultimedia,transport protocols, and content distribution. He is currentlyworking on architectures for next‐

generationvideotransportanddistributionoverIPnetworks,andheisanactivecontributorinthe

IETFintheseareas.HeholdsaPh.D.degree inelectricalandcomputerengineeringfromGeorgiaTech. He received the Best Student‐Paper Award at IEEE ICIP 2003, and the Most Cited Paper

AwardfromElsevierSignalProcessing:ImageCommunicationin2008.Heisalsoamemberofthe

ACM.

TankutAkgul([email protected])isasoftwareengineerintheServiceProviderVideoTechnology

Group at Cisco. His research interests are embedded software design, video compression and

multimedia streaming.He currently designs and develops software for the next‐generation IPTVset‐topboxes.PriortoCisco,heworkedasaseniorscientistatSoftNetworks,LLCanddeveloped

several video codecs for Intel embedded SoCs. He received his Ph.D. degree in electrical and

computerengineeringfromGeorgiaTech.HisPh.D.thesiswasondebuggingandreverseexecutionofprogramsforembeddedsystems.Hehasseveralconferenceandjournalpublicationsinthearea

ofembeddedsoftware.

Mark Baugher ([email protected]) is a technical leader in the Research and AdvancedDevelopmentgroupatCisco.Prior toCisco,Markco‐foundedandservedas theCTOofPassedge

Inc.,whichdevelopedamulticastmediasecurityproduct.Overtheyears,Markhasledteamsthat

deliveredinnovativetechnologiessuchasIntel'sPC‐RSVP,thefirstnetworkreservationsystemona PC, andLANServerUltimedia, IBM's firstmedia server product.Markhas co‐authored several

widely used international standards in the IETF and ISMA, and he is currently co‐chairing the

Internet GatewayDeviceWorking Committee in the UPnP Forum.Mark holds anM.A. degree incomputersciencefromtheUniversityofTexasatAustin.

1

Watching Video over the Web, Part II Applications, Standardization and Open Issues  

AliC.Begen,TankutAkgulandMarkBaugher,Cisco

In this secondpartofa two‐partarticle, theauthors look intoapplications for streaming includingend‐to‐endmobileand in‐home streaming, contrastingadaptiveapproaches toothervideodelivery

paradigms, discuss the current standardization efforts and highlight the areas that still require

furtherresearchandinvestigation.

Applications: End‐to‐End Mobile and Home Streaming 

There aremany video applications and services on the Internet, and the first part of this article

mentionedseveralofthem.Themostimportantoftheseapplicationsarguablyfallintothemobile

and home entertainment categories. These two categories span the range of video presentationformatsforportable,standardandhigh‐definitionvideo. Theyalsooperateondiversenetworks

and use both managed and unmanaged end‐to‐end network services. Mobile and home video

applications are among the most popular uses of video and include both walled‐garden andInternetapplications.This section identifiesmobileandhome‐streamingusecasesandconsiders

thesuitabilityofpull‐basedversuspush‐basedadaptivestreamingforeachusecase.

Mobile Streaming 

Peoplestreammovies,YouTubevideosandothermediaon theroad, inhotels,oncampusesand

while inmotion to theirmobile devices. Portable‐definition (PD) encoding is a video format for

mobile devices, which is a lower resolution and lower‐bitrate encoding than standard (SD) andhigh‐definition(HD)encoding,whicharecommoninhomes.PDvideodeliveryscalesdowntothe

lowerspeedsandhigher‐lossfoundonmostmobilenetworks:Variationsinloadonnetworkcells,

changes in radio‐network conditions and fluctuations in available throughput force videoapplications to adapt to the network rather than vice versa in GPRS, EDGE, UMTS and CDMA 1x

networks. Thus, it is no surprise that mobile handset vendors and operators have been in the

forefrontindefininganddevelopingadaptivestreamingstandards.

When the mobile connection is to a walled‐garden server, the network provider can manage

networkvideodelivery toa service‐compliant,walled‐gardenclient. This isoneuse case. If the

end‐to‐end connection is to a server on the Internet, the network provider cannot provide QoS

2

management,and theservice isnotmanaged. This isanotherusecase.There isalsoa thirduse

casethatcombinesmobileandhomestreaming.

Walled Garden 

In this case, the end‐to‐end path is contained in the cellular provider’s network. The network

operator can specialize service for video delivery. The provider can choose to use push‐basedadaptivestreamingservicesfromspecializedstreamerstogetoptimaluseofthenetworkresources

and improved video quality over a range of network conditions. Alternatively, the provider can

choose to reuse their data network services such asHTTP servers and caches, and employ pull‐basedadaptivestreamingmethodssuchasApple’siPhoneHTTPLiveStreaming.

Mobile Internet 

The cellular provider cannotmanage the service quality on an end‐to‐end basiswhen the video

connectiontraversesthepublicInternet.Inthiscase,pull‐basedadaptivestreaminghasadvantagesoverpush‐basedmethods.

Hybrid Mobile‐Home Use Case 

Thecellularprovidercannotguaranteeaservicewhenthefirstnetworkhop(s)orthelastnetworkhop(s)areontheclient’shomenetwork.Femtocellnetworksandhome‐networkservicestomobile

clients are examples. When an unmanaged homenetwork is a hop on the end‐to‐end path, and

particularlywhen it is awireless hop, pull‐based adaptive streaming is desired instead of push‐basedmethods.

Home‐Network Streaming 

The home network use cases are crucial to video streaming since video is so widely used byconsumersathome.Althoughinmanyresidences,thehomenetworkconsistsofasingleEthernet

cable connected to a broadband modem and the home PC, more complex home networks are

becomingcommontodayinmanycountries.Increasingly,theself‐installedhomeLANsarecapableoftransportingvideo.

802.11(WiFi)LANsarewidespreadinmanyregionsoftheworld.Onewell‐knownvideo‐delivery

problem in WiFi networks is the divergence of network speed and reliability due to networktopology,load,andothervariables. Thepresenceofmetalobjects,commonhomeappliancesand

other sources of interferencemean that a television in one room often cannot sustain the same

quality of presentation as a device in another. The reality of diverse network capacity amongreceivers coupledwith thedesignof the802.11mediumaccess control also causeaproblem for

multicaststreamingoverwirelessLANs(WLAN)(1).

Video‐on‐demand, not broadcast TV, is likely to be the dominant video application on the homenetworks.Unfortunately,wirelessandother typesofcommonhomeLANssuchaspowerline,are

3

vulnerable to interferenceandoverload. AlthoughthemigrationofWiFiproducts to5GHz from

2.4GHzreducessomeoftheinterferenceproblems,even802.11nisvulnerabletointerferencethatresultsinwidelydifferinglossandthroughputcharacteristicsontheWLAN(2).AnotherhomeLAN

technology that isvulnerable to interference ispowerline communications (PLC) in thehome. In

fact,withtheexceptionsofwell‐laidEthernetoverCat5orovercable(asdefinedbytheMultimediaover Coax Alliance ‐ MoCA), it is hard to imagine a home networking technology that can offer

uniform,high‐speedservicesthroughoutthehome.

Whereas HTTP adaptive streaming is productively used on the public Internet and unmanagednetworks in general, adaptive streaming is arguably crucial for unicast, point‐to‐point video

deliveryonhomeWLANs.Homenetworksaremostlyunmanagedandexhibitgreatdifferencesin

networkspeedsandfeedsavailabletoadiversesetofdevices.Itisunreasonabletoexpectthatatypical home userwillmanage and select particular resolutions of amovie title based upon the

qualityofnetworkservicethataparticulardevicecanobtain fromaparticularspotonthehome

network.

Itisthereforeunlikelythatmanagedvideowilleverbecommonforin‐homestreaming.Managed

videowill likely continue to provide a superior, trouble‐free and gold‐level video service to the

home, however, which unmanaged services try to replicate. In the use cases that follow, bothmanaged andunmanagedmethods are considered, andpush‐based streaming is compared to its

pull‐basedcounterpart.

Streaming to a Home Client from a Home Server 

Thisusecaseisarguablythemostinterestingbuttheleastcommoncase:Foryears,hobbyistsand

digitalmediaenthusiastshaveconnected theirhomeTVs to theirhomenetworks toplaymovies

fromaPC,gamingconsoleorotherservers,butthisisnotyetamainstreampractice.IntheU.S.,theWiFinetworkcanautomaticallyconnectanoff‐the‐shelfhomenetworkattachedservertoaSony

PlayStation and a host of other devices implementing Digital Living Network Alliance (DLNA)

specifications1.However,thiscapabilityisnotwidelyusedbynon‐technicalconsumers.Settingupa home media server requires sufficient knowledge of how to load a server with commercial

movies, which are usually encrypted, match formats with players, manage Digital Rights

Management(DRM)whenitisexposedornotcircumvented,andperformothertasksthatpeopleoutside of the computer industry do not routinely do. Nonetheless, home servers can be quite

usefultopeoplewhowanttoplaytheirdigitalmovieswithoutexperiencingthebottlenecksofthe

Internetorserviceprovider’snetwork.Thehomeusershouldbeabletoplayafileevenwhentheirbroadbandnetworkconnectionisdown,andplaytheirmoviesonalltheTVsorotherdisplaysin

1http://www.dlna.org

4

thehomebesides justone. iTuneshasachieved in‐homemusicsharing;Appleandothervendors

willundoubtedlydevelopvideosharingamonghomedevicesaswell.Ontheotherhand,standardsolutions such as DLNA can open the market to a greater diversity of products and are likely

importanttofuturein‐homevideousecase.Streamingtoahomeclientfromahomeserverisause

casethatisarguablymoreimportanttothefutureofhomevideothanitistoday.

Whether using proprietary or standard players, home wireless networks are limited by the

available bandwidth of the radio spectrumand various types of noise and interference;wireless

networks,ingeneral,arearguablytheweakestlinkinthevideotransmissionchain.Thequalityofthe video presentation on a wireless home network is subject to all of the problems of an

unmanagednetworkplusproblemswithsignalloss,interference,andcontentionwithothertraffic

ontheWLAN(3).

Whenpush‐basedadaptivestreamingisused,thehome‐networkownermustinstall,configureand

manageaspecializedpush‐basedstreamingserveronthehomenetworkthatisnotmanagedbya

businesssuchastheInternetserviceprovider. Moreover,themostpopularprotocolforvideoonthe home network is HTTP followed by RTP/RTSP. HTTP is used on the Web and in home

networking protocols such as DLNA. It will be used bymany new services such as Ultraviolet.

TherearemanyturnkeyHTTPvideoapplicationsandcommodityserversavailable inthemarket,andHTTPworks throughhome firewalls. Thus, pull‐based streaming overHTTPhas advantages

overpush‐basedstreamingoverRTP.

Streaming to a Home Client from an Internet Server 

This use case includes,Webcasts, YouTube, Hulu, iTunes and other Internet video services to a

homePCordigitalTV.Asexplainedabove,apotentiallytroublesomehomenetworkhop,suchasa

homeWLAN,willoftenbepartoftheend‐to‐endvideopathwheneverthenetworksupportsmorethanasingledeviceconnected toamodem;aconfiguration that isbecoming increasinglyrare in

developed countries. Managed videowithin the home is unlikely to become common since the

home network is usually unmanaged. The public Internet,moreover, only supports unmanagedvideodelivery.Thus,pull‐basedadaptivestreamingistheonlyfeasibleapproachforthisusecase.

Streaming to a Home Client from a Managed Server 

Most digital television services today, including video‐on‐demand, originates from video service

providers.Aprovider’svideoservicemightbeanATSC,IPTVorHTTPservice,whichinallcasescanbeoptimizedformanagingvideoovertheprovider’snetworktoahomedevice.Managedvideo

generally achieves a level of quality that unmanaged video attempts to meet – often without

success.Thus,amanagedservercanuseapush‐basedadaptivestreamingmethod.

5

HTTP progressive download or adaptive streaming can of course run over a managed video

network andevenuseamanaged server.Thus, thisuse case includesbothpush‐basedandpull‐basedstreaming.

Streaming to a Home Client via Peer‐to‐Peer (P2P) Delivery 

In certain parts of theworld, particularly Asia and some parts of Europe, a popularmethod forstreamingvideoistouseP2Pnetworks.ResearchhasshownthatP2Ptransportcouldreducethe

load on the source servers and provide a better scalability for large streaming populations.

However, with the advances in server and caching infrastructure designs, large streamingcapacities are now easily achievable without needing P2P transport. On the business side,

uncontrolledandunmanagedP2Psystemsoftenfailtoprovideafairrevenuesharingbetweenthe

content and Internet serviceproviders.While thereare industry‐wideefforts toaddress someofthespecificissuesrelatedtonetworkandprovider‐friendlyP2Ptransport,thebusinessmodelsthat

willbeprofitableforboththecontentowners/providersandInternetserviceprovidersarestillnot

inplace.Thus,inthisarticle,wewillnotgointofurtherdetails;interestedreadersmayrefertotheDecember 2007 issue of the IEEE Journal on Selected Areas in Communications for a detailed

coverageofvarioustopicsinP2Pstreaming.

Summary 

ThetwomajortrendsinInternetvideo,therefore,arestreamingtomobiledevicesandtoaTVor

other in‐home screen. Of the six use cases of mobile and home‐network streaming, most have

unmanagednetworksalongtheirend‐to‐endpaths.

Two of the cases shown in Table 1 can benefit from push‐based adaptive streaming using

specializedserversandclients,i.e.,the(walled‐garden)managedserverusecasesinwhichtheend‐

to‐endpathisoveraprovider‐managednetwork.ManagednetworksareincreasinglyoptimizedforHD video delivery; managed network devices comply with the technical protection measures

required by major video producers. And video service providers have the strategic business

relationships to allow them an early spot in the windowed movie release schedules as well asspecialpay‐per‐viewvideoevents.Thisgivesserviceprovidersaroleforpremiumcontentdelivery

ofmajortitles,particularlyexpensivemovies.Asstatedabove,managedvideoisthestandardthat

Internetvideomustmeet.

6

Table1:End‐to‐endpathsandpreferredstreamingmethods.

InternetServer ManagedServer HomeServer

MobileClient Pull‐basedstreaming Pushandpull‐based

streaming

Pull‐basedstreaming

HomeClient Pull‐basedstreaming Pushandpull‐basedstreaming

Pull‐basedstreaming

Howclosecanunmanagedvideocometomanaged‐videoquality?Andhowwellcansuchservices

scale? These questions are crucial to the future of Internet video services that use pull‐basedadaptivestreaming.Arguably,theInternetserverpathinTable1offersusersthegreatestaccessto

themostcontent–includingbroadcasters’sites,iTunes,Netflix,Hulu,andevensocialnetworking

sites. Ifpull‐basedadaptivestreamingcanmatchthequalityofmanagedvideowellenough,thenthisusewillfavorthefuturesuccessofthisvideostreamingtechnology.

Ofthesixusecases,streamingfromahomeservertoahomeclientismuchlesspopularcompared

to Netflix, Hulu, iTunes, and managed video from service providers/retailers. This use casebecomesmoreimportant,however,asdigitalfilesreplacedigitaldisksinthehome.Andwhenthe

broadband connection becomes unavailable, as broadband connections sometimes do, then the

usershouldat leasthaveaccess to their titleswithin theboundsofa functioninghomenetwork.Streamingtoahomeclientisalsointerestingbecausethisusecaseoperatesonthehomenetwork,

andthevideo‐deliveryperformancecharacteristicsofhomenetworkssuchas802.11areimportant

totheInternetandmanagedservercasesaswell.Wethereforeconcludethatpull‐basedadaptivestreaming methods have considerable advantages over push‐based adaptive methods for most

Internetvideoapplications.

Standardization Efforts 

MicrosoftPIFF (Protected InteroperableFileFormat) (4) andSST (SmoothStreamingTransport)(5)arecompletespecifications foranadaptive‐streamingcontainerandstreamingprotocol.PIFF

and SST use the standard MP4 fragmented file structure. Microsoft has also opened up these

specifications forwidespread use by offering theMicrosoft Community Promise,which providesintellectual property rights guarantees for organizations that choose to use their adaptive

streamingprotocols.ApplehassimilarlyoffereditsHTTPLiveStreamingontheWebandinanIETF

draft (6). Both the Microsoft and Apple specifications reflect the current adaptive streamingproductsthatareinthemarketandofferedbythetwocompanies.

Theseandotherspecificationeffortsareappearing invariousvenues.These include3GPP,MPEG

and industry consortia such as the Digital Entertainment Content Ecosystem (DECE) LLC,which

7

announced a new service calledUltraviolet in July 2010.DECEhas notmade public its technical

specification at the time of thewriting of this article. It has been announced, however, that theDECE Ultraviolet service would support a common container format, movie downloads and

streamingwithsupportformultipleDRMs2.Itseemslikelythatthefinalspecificationswillsupport

at leastsomeof thestate‐of‐the‐artmediadistributiontechnologiesthathavebeenconsidered inthisarticle.

Although DECE LLC’s membership includes dozens of companies who are involved in media

distribution, it isaprivateeffortbyaconsortiumof companies.Adaptivestreaming technologieshavemadetheirwayintopublicstandardsdevelopmentorganizationsincluding3GPPandMPEG.

3GPPisthefirstorganizationthatreleasedaspecificationrelatedtoadaptivestreamingoverHTTP.

3GPPRelease9(7)waspublishedinMarch2010,andtheSA4WorkingGroupiscurrently intheprocess of bug‐fixing this specification taking into account experience from the first

implementations. In addition, 3GPP is preparing an extended version for Release 10, which is

scheduledtobepublishedlater in2011.Thisreleasewill includeanumberofclarifications,offerimprovementsandaddnewfeatures.OntheMPEGside,theStreamingofMPEGMediaoverHTTP

AdHocGrouphasreceived15submissionstotheircallforproposalspublishedinMay2010.After

an initialevaluationperiod in July2010, thisadhocgroup,whichhasbeenrenamedasDynamicAdaptiveStreamingoverHTTP(DASH)AdHocGroup,hasadoptedthe3GPPRelease9asabaseline

specification and started running several evaluation experiments. The DASH Ad Hoc Group will

workonstandardizingthemanifest file,delivery format,conversion from/toexisting file formatsand the use ofMPEG2 Transport Streams as amedia format. TheDASHAdHoc Groupwill also

coordinatecloselywith3GPPSA4WorkingGrouptobetteraligntherespectivespecifications.

Open Issues and Future Directions 

Atthewritingofthisarticle,therearestillmanyquestionsaboutadaptivestreamingthathavenotbeenadequatelyansweredyet.Whilethereareearlyinvestigationssuchas(8)andthereferences

therein,most of the questions still remain openmostly due to the lack of field data required to

conductarigidanalysis.Todate,TCPandHTTPhavebeenindividuallystudiedingreatdetail forconventionalapplications.Proposalshavebeenmadeforbothprotocolstomakethemabetterfit

for streaming‐like applications supported by simulations and experiments. However, making

generalizationsanddrawingconclusionsforlarge‐scalestreamingdeploymentsbasedonnarrowlyscopedstudiesisnottrivialandmaybeeasilymisleading.

2http://www.homemediamagazine.com/digital‐copy/dece‐becomes‐ultraviolet‐20060

8

Historically,TCPhasbeendesignedandoptimizedforFTP‐likeapplications,andimplicationsofits

congestion control algorithm and currently adopted best practices such as Nagle’s algorithm,delayedacknowledgments,slow‐startrestart(SeeRFC5634)havetobeexaminedinthecontextof

adaptivestreaming.EffectsofrunningmultipleTCPconnectionssimultaneouslyforovercomingthe

head‐of‐line blocking problem or another transport protocol that inherently provides a bettersupport for multiple streams such as SCTP have also to be studied carefully. Most importantly,

thesestudiesmustrepresentlarge‐scaledeploymentswithmanyoriginservers,cachesandclient

streamingavarietyofcontentoverdiverselycharacterizednetworkpaths.

A niche area is to develop instrumentation tools to assess the effectiveness and performance of

adaptivemediatransport.Thesetoolsmustbeuser‐friendly,abletoprovideadequateinformation

fordiagnosticsandfault isolation.Suchinformationbecomesquitehandyforserviceprovidersinfixingproblemsandmakingthenecessaryprovisionsandenhancementsintheirnetworks.

An interesting future research direction is to understand how network elements can help a

providerachieveabetterperformance,whichcouldbesupportingmoreclientsordeliveringtheirclientsabetterandmoreconsistentstreamingquality.Inotherwords,ratherthanconsideringthe

underlyingnetworkasablackboxorjustbunchofpipesthatcarrythebits,havingtheendpoints

(e.g.,originserversandclients)andmiddleboxes(e.g.,cacheservers)talktothenetwork,exchangeinformationandactaccordinglymaypotentiallyprovidemanybenefits.Forexample,thenetwork

itselfisfirsttoknowaboutacongestionorfailure.Suchinformationcertainlyhelpstheserversand

clientsadaptfasterandmoreaccurately.

Lastbutnotleastimportantopenissueiswhetherthenetworkcapacityinamulti‐accessnetwork

can sufficiently supportmany clients concurrently. In sucha scenario, each client competeswith

othersformorebandwidthandislikelyexperienceacontinualbitratefluctuation,whichadverselyimpacts quality of experience. Prepositioning the content or using multicast (especially for live

streaming)mayprovetobeuseful.

Inafuturearticle,wehopetoprovidetheresultsfromourongoingstudiesinvestigatingtheseopenissues.

Acknowledgments

WethankourcolleaguesatCiscofortheirreviewsandfeedbackforthisarticle.

9

References 

1.  Multicast  in  802.11  WLANs:  an  experimental  study.  Dujovne,  Diego  and  Turletti,  Thierry  . 

Torremolinos, Malaga, Spain : ACM, 2006. ACM MSWiM. 

2.  Understanding  and  Mitigating  the  Impact  of  RF  Interference  on  802.11  Networks.  Gummadi, 

Ramakrishna , et al., et al. Kyoto, Japan : ACM, 2007. ACM SIGCOMM. 

3.  Performance  Evaluation  of  Video  Streaming  with  Background  Traffic  over  IEEE  802.11  WLAN 

Networks. Cranley, Nicola and Davis, Mark . Montreal, QC, Canada : ACM, 2005. ACM WMuNeP. 

4. Protected Interoperable File Format. [Online] May 2010. http://go.microsoft.com/?linkid=9682897. 

5. Smooth Streaming Transport Protocol. [Online] Sept. 2009. http://go.microsoft.com/?linkid=9682896. 

6. HTTP Live Streaming. IETF Internet Draft. [Online] June 2010. http://tools.ietf.org/html/draft‐pantos‐

http‐live‐streaming. 

7.  Transparent  End‐to‐end  Packet‐switched  Streaming  Service  (PSS);  Protocols  and  Codecs.  3GPP  TS 

26.234. [Online] Oct. 2010. http://ftp.3gpp.org/specs/html‐info/26234.htm. 

8. An experimental evaluation of rate‐adaptation algorithms in adaptive streaming over HTTP. Akhshabi, 

Saamer, Begen, Ali C. and Dovrolis, Constantine. San Jose, CA : ACM, 2011. ACM MMSys. 

Biographies 

AliC.Begen([email protected])iswiththeVideoandContentPlatformsResearchandAdvanced

DevelopmentGroupatCisco.Hisinterestsincludenetworkedentertainment,Internetmultimedia,transport protocols, and content distribution. He is currentlyworking on architectures for next‐

generationvideotransportanddistributionoverIPnetworks,andheisanactivecontributorinthe

IETFintheseareas.HeholdsaPh.D.degree inelectricalandcomputerengineeringfromGeorgiaTech. He received the Best Student‐Paper Award at IEEE ICIP 2003, and the Most Cited Paper

AwardfromElsevierSignalProcessing:ImageCommunicationin2008.Heisalsoamemberofthe

ACM.

TankutAkgul([email protected])isasoftwareengineerintheServiceProviderVideoTechnology

Group at Cisco. His research interests are embedded software design, video compression and

multimedia streaming.He currently designs and develops software for the next‐generation IPTV

10

set‐topboxes.PriortoCisco,heworkedasaseniorscientistatSoftNetworks,LLCanddeveloped

several video codecs for Intel embedded SoCs. He received his Ph.D. degree in electrical andcomputerengineeringfromGeorgiaTech.HisPh.D.thesiswasondebuggingandreverseexecution

ofprogramsforembeddedsystems.Hehasseveralconferenceandjournalpublicationsinthearea

ofembeddedsoftware.

Mark Baugher ([email protected]) is a technical leader in the Research and Advanced

DevelopmentgroupatCisco.Prior toCisco,Markco‐foundedandservedas theCTOofPassedge

Inc.,whichdevelopedamulticastmediasecurityproduct.Overtheyears,MarkhasledteamsthatdeliveredinnovativetechnologiessuchasIntel'sPC‐RSVP,thefirstnetworkreservationsystemon

a PC, andLANServerUltimedia, IBM's firstmedia server product.Markhas co‐authored several

widely used international standards in the IETF and ISMA, and he is currently co‐chairing theInternet GatewayDeviceWorking Committee in the UPnP Forum.Mark holds anM.A. degree in

computersciencefromtheUniversityofTexasatAustin.