Post on 20-Dec-2015
Sections 14.1 - 14.4
Streaming Media on Demand and Live Broadcast
Multimedia over IP and wireless networks: compression, networking, and systems
Mihaela van der Schaar & Philip A. Chou
Presented by
H. Mark Okada, CMPT 820
February 18, 2009
Streaming Media
Media on demand: a user scenario characterised by audio or video playback, as when playing locally from a CD or DVD
  interactive controls: fast forward, pause, seek, etc.
Live broadcast: a user scenario characterised by tuning into a radio or television program
  the user can only join or leave a session
Both are prevalent on the Internet today
  e.g. interactive music and video playback, Internet radio
Chapter 14 looks at how these services are made available
Sections 14.2-14.4 cover only media on demand
Overview
Section 14.2 Overview of
  Architectures
  Protocols
  Format issues
Section 14.3 Buffering and timing fundamentals
Section 14.4 How media data is communicated for streaming on demand
NOT COVERED - Section 14.5 Live broadcast
Architectures - 14.2.1
Streaming media on demand and live broadcast require different architectures
Figure 14.1
Streaming media on demand
  the source media is encoded off line to a media file
  the file is streamed using different protocols (Section 14.2.2)
  the media file may be specialized to support various modes of streaming (discussed in Section 14.2.3)
  the client temporarily buffers encoded media in a decoder buffer
  the client temporarily buffers decoded media in a render buffer
    fairly short (a frame or two), since decoded frames are large
  playback commands enable the interactive experience: play, FF, stop, seek
  communication between server and client is tailored to the client's resources and network connection
Figure 14.1a
Progressive downloading
  a type of streaming in which media can be streamed faster than playback, i.e. the entire file is downloaded
  requires network bandwidth > media content bit rate (the source coding rate)
  if the file can be decoded sequentially, progressive downloading can be done through simple file transfer protocols
    e.g. FTP or HTTP, both over TCP/IP (i.e. over FTP or through a web server)
  if the client has a limited buffer, progressive downloading can be done using simple TCP flow control
    the client accepts data from TCP only if there is space in the media buffer
    popularised by SHOUTcast, an early music streaming service
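A minimal sketch of the limited-buffer case, assuming a byte stream with a `read` method stands in for the TCP socket. The names `fill_buffer`, `media_buffer`, and the capacity are hypothetical. The point is only that the client stops reading when the media buffer is full; with a real socket, unread data then backs up in the TCP receive window and the sender slows down.

```python
import io

def fill_buffer(source, buffer, capacity, chunk=4096):
    """Read from `source` only while `buffer` has free space; return bytes read."""
    total = 0
    while len(buffer) < capacity:
        data = source.read(min(chunk, capacity - len(buffer)))
        if not data:  # end of stream
            break
        buffer.extend(data)
        total += len(data)
    return total

source = io.BytesIO(b"x" * 10_000)   # stand-in for a TCP socket (assumption)
media_buffer = bytearray()

first = fill_buffer(source, media_buffer, capacity=6000)   # stops when the buffer is full
del media_buffer[:2500]                                    # the decoder consumes 2500 bytes
second = fill_buffer(source, media_buffer, capacity=6000)  # reads only what now fits
```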
Progressive downloading
  need to account for network jitter and temporary interference
  want the highest possible source coding rate (not less than worst case network bandwidth)
  these are most of the issues for media on demand and for the communication protocol between client and server
Live broadcast
  the encoder may be directly connected to the server through an encoder buffer
  the encoder buffer contains limited data, to maintain a fixed and short end-to-end delay
  the server accesses data only at the playback point, not at arbitrary points in a file
    restricts adaptivity, which is important with multiple receivers
    interactive access to the media is not possible
  difficult to adapt the transmission rate to varying clients**
  difficult for the server to use retransmission-based error control, because of the negative acknowledgement (NAK) implosion problem
  error control becomes a delicate issue for live broadcast
**Receiver-driven layered multicast (RLM) allows adaptation of the transmission rate. Also see:
S. R. McCanne. Scalable Compression and Transmission of Internet Multicast Video. Ph.D. thesis, The University of California, Berkeley, CA, December 1996.
S. R. McCanne, V. Jacobson, and M. Vetterli. “Receiver-Driven Layered Multicast,” in Proc. SIGCOMM, pages 117–130, Stanford, CA, August 1996. ACM.
Protocols - 14.2.2
Streaming on demand requires many protocols at different levels
This section covers a subset of the protocols described in week 2 of this class
  RTP: Real-time Transport Protocol
  RTSP: Real-Time Streaming Protocol
  RTCP: RTP Control Protocol (also called Real-Time Control Protocol)
  SIP: Session Initiation Protocol
Real-time streaming protocol (RTSP)
RFC 2326
At the topmost level:
  application level protocol
  protocols for content discovery
  connection to a specific streaming media server
Content discovery is done “out of band”, e.g.
  http://www.microsoft.com/directory/contentname.asx
  http://www.realnetworks.com/directory/contentname.ram
  http://www.apple.com/directory/contentname.mov
  each URL points to metadata that references a separate file on a web server
  the metadata format is different for each type: asx, ram, mov
The client contacts the server using the URL for the content, e.g.
  rtsp://wms.microsoft.com/directory/contentname.wmv
  rtsp://helixserver.example.com/audio1.rm?start=55&end=1:25
  rtsp://qtserver.apple.com/directory/contentname.mov
Prefix: indicates the streaming protocol used
Suffix: information for the server, e.g. seek position, play speed, etc.
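The prefix/suffix split can be illustrated with Python's standard `urllib.parse`. This is only a sketch: the helper name `describe_stream_url` is hypothetical, and treating the suffix as ordinary `key=value` query options is an assumption that happens to fit the RealNetworks example URL above.

```python
from urllib.parse import urlsplit, parse_qs

def describe_stream_url(url):
    """Split a streaming URL into protocol prefix, server, path, and suffix options."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,    # prefix: which streaming protocol to use
        "server": parts.hostname,
        "path": parts.path,
        # suffix: options passed to the server (first value of each key)
        "options": {k: v[0] for k, v in parse_qs(parts.query).items()},
    }

info = describe_stream_url("rtsp://helixserver.example.com/audio1.rm?start=55&end=1:25")
print(info)
```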
Example of auxiliary file
Microsoft ASX file
<ASX Version="3.0">
<ENTRY>
<REF HREF="mms://streamingmedia/studios/0505/24721/MTV_XBOX_preview_160k.wmv" />
</ENTRY>
<ENTRY>
<REF HREF="mms://winmedianw/studios/0505/24721/MTV_XBOX_preview_160k.wmv" />
</ENTRY>
</ASX>
RealNetworks RAM file
# First URL that opens a related info pane.
rtsp://helixserver.example.com/video3.rm?rpcontextheight=350
&rpcontextwidth=300&rpcontexturl="http://www.example.com/relatedinfo2.html"
&rpcontexttime=5.5&rpvideofillcolor=rgb(30,60,200)
#
# Second URL that keeps the same related info pane,
# but changes the media playback pane’s background color.
rtsp://helixserver.example.com/video4.rm?rpcontexturl=_keep
&rpvideofillcolor=red
Figure 14.2
Streaming protocol
  commands are typically sent reliably over a TCP connection (many forms)
  Real Time Streaming Protocol (RTSP) is widely adopted (RFC 2326)
  the idea is simple, but SET_PARAMETER can be complicated
    a media file may have multiple streams for audio and video, for different languages, subtitles, source coding rates, etc.
Real-time Transport Protocol (RTP)
  the client is able to specify which lower level data transport protocol to use
  data transport is usually either RTP over UDP or RTP over TCP
  both are preferred for bandwidth efficiency
  with RTP over UDP there must be some means of transmission rate control and error control
    RTP itself defines no standard means of transmission rate control or error control
  HTTP over TCP may be used to avoid firewall issues
RTP Control Protocol (RTCP)
RFC 3550
  often used with RTP
  receivers often provide statistical feedback to the sender (reports)
  the mix of interoperable and proprietary features limits its use as a standard
Windows Media system
  normally RTP over UDP
  transmission rate control is based on the source coding rate of the content
    the client can detect congestion and signal the server to lower or increase the source coding rate
Alternative methods of transmission rate control
  1) TFRC: TCP-friendly rate control
  2) TCP-like congestion control algorithm
  both are being standardised as two profiles in the Datagram Congestion Control Protocol (DCCP)
  must be paired with a source coding rate control algorithm so that the coding rate is the same as the transmission rate
    e.g. the rate-distortion optimised (RaDiO) scheduling algorithm
Error control in Windows Media uses selective retransmission
  on detecting gaps, the client sends a NAK (negative acknowledgement) to the server, causing retransmission
  audio has higher priority than video
  Windows Media Player stalls if audio packets are missing and waits for their arrival
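A minimal sketch of gap detection for selective retransmission. It assumes packets carry integer sequence numbers with no wraparound; the function name and parameters are hypothetical and do not describe the actual Windows Media wire protocol.

```python
def missing_sequence_numbers(received, last_in_order):
    """Return the sequence numbers to NAK: gaps between the last in-order
    packet and the highest sequence number seen so far."""
    seen = set(received)
    if not seen:
        return []
    return [s for s in range(last_in_order + 1, max(seen)) if s not in seen]

# Packets 7, 8 and 11 never arrived, so the client would NAK exactly those.
naks = missing_sequence_numbers([5, 6, 9, 10, 12], last_in_order=4)
print(naks)
```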
File formats - 14.2.3
Challenging to adapt a fixed media file to various network and client conditions
  encoding must be done before streaming (with no knowledge of context)
  so flexibility must be built into the media file
Unrealistic to compress or transcode to the needs of every client
  the best way is to allow the server to select which parts of the file to stream
Some streaming formats
The major players
  MPEG-4 format
  QuickTime format (on which the MPEG-4 format is based)
  RealMedia format
  Microsoft Advanced Streaming Format (ASF)
All have the ability to contain/multiplex multiple media and versions of each medium
  each is recorded into a track (MPEG-4/QT) or stream (ASF)
  data units are made of chunks (MPEG-4/QT) or packets (ASF)
Streaming formats
Each has a header containing metadata relating to the overall file and to specific tracks or streams
  title, author, date, encryption, rights management, table of contents, track/stream enumeration and their descriptions
Information on individual track/stream properties
  start time, duration, bit rate, buffer size, sampling rate, picture size, scalability capabilities
Time-varying metadata can be associated with each track/stream
  network packetisation, decoding and presentation timestamps, SMPTE time codes, key frame, switch frame
Two types of metadata
  static metadata: size independent of the length of the data, inexpensive to transmit over the network
  time-varying metadata: size grows with the data, expensive to transmit
Streaming formats
… provide a structure that allows the server to select which parts of the data to transmit
Either
  coarse grained: the server streams only a particular subset of streams to the client
  fine grained: in addition, allows a fraction of the data to be chosen
    can set a Lagrange multiplier parameter, which determines which data units are not transmitted
Encoding media into a stream
Two methods
  1) Multibit rate (MBR)
    multiple independent encodings (each with a different coding rate) are stored in separate streams (in the same file)
    the choice is which streams to play
  2) Scalable coding
    covered later, in Section 14.3.3
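With MBR, coarse-grained adaptation reduces to picking one of the stored encodings. The sketch below assumes the simplest policy, the highest coding rate that fits the available bandwidth; the function name and the rates are hypothetical, and a real server may use a different rule.

```python
def pick_stream(coding_rates_bps, bandwidth_bps):
    """Return the highest coding rate that does not exceed the bandwidth,
    falling back to the lowest rate when none fits."""
    fitting = [r for r in coding_rates_bps if r <= bandwidth_bps]
    return max(fitting) if fitting else min(coding_rates_bps)

rates = [160_000, 300_000, 700_000]   # hypothetical encodings stored in one file
choice = pick_stream(rates, bandwidth_bps=450_000)
print(choice)
```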
Data units use packets
  e.g. H.264/AVC uses the Network Abstraction Layer (NAL)
In general, formats for local playback/storage are not suitable for streaming
  hard for the server to choose the right portions of the file to stream
  difficult to randomly access (seek) arbitrary points in the stream
Fundamental abstractions - 14.3
Fundamental abstractions of streaming media on demand (Section 14.3)
Section covers
  leaky bucket models of bit streams
  constant bit rate (CBR) vs. variable bit rate (VBR)
  compound (multiple media) streams
  preroll delay
  playback speed
  timing
  timing clocks
  decoder and presentation timestamps
The client should know when it is safe to begin playback
Buffering and leaky bucket models
Scenario 1 - constant bit rate (CBR)
  isochronous** noiseless communication channel
  encoder buffer between the encoder and the channel
  decoder buffer between the channel and the decoder
  schedule: the sequence of times at which successive bits in an encoded bit stream pass a given point in the pipeline
Figure 14.4
  B bits = encoder buffer + decoder buffer
**isochronous: equal amounts of data are communicated in equal amounts of time
Figure 14.3
Buffer tube
Can view the previous figure as a buffer tube, characterised by 3 parameters
  R - slope
  B - height in bits
  Fe - offset/fullness from the bottom of the tube
  or, instead of Fe, Fd - offset from the top of the tube, with Fd = B - Fe
From a buffer point of view
  overflow of the encoder buffer => underflow of the decoder buffer
  underflow of the encoder buffer => overflow of the decoder buffer
  B = encoder buffer + decoder buffer
  Fe - initial fullness of the encoder buffer
Managed by a rate control algorithm
  assigns a number of bits b(n) to each frame n
Delays
  De = Fe/R: initial delay before data enters the channel
  Dd = Fd/R: delay after data is extracted by the decoder from the channel
  aim to keep the decoder buffer delay Dd = Fd/R low
Figure 14.5
(R,B,F) tube
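A small worked example of the tube parameters, using made-up numbers (1 Mbit/s, a 500 kbit tube, 200 kbit initial encoder fullness). The function name is hypothetical; the formulas are the ones on the slide: Fd = B - Fe, De = Fe/R, Dd = Fd/R.

```python
def tube_delays(R, B, Fe):
    """Return (De, Dd) in seconds for a buffer tube (R, B, Fe)."""
    Fd = B - Fe          # offset from the top of the tube
    return Fe / R, Fd / R

De, Dd = tube_delays(R=1_000_000, B=500_000, Fe=200_000)
print(De, Dd)            # encoder-side and decoder-side delays
print(De + Dd)           # total buffering delay, which equals B / R
```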
Variable bit rate stream (VBR)
Scenario 2 - variable bit rate (VBR) stream
  unlike CBR, VBR has a variable amount of data per time segment
    higher bit rate for complex segments
    lower bit rate for less complex segments
  tends to need a wider buffer tube => larger start-up delay
  part of an overall problem: it is difficult to determine the average bit rate of the system
Recall the (R,B,F) tube
  the parameters are not unique for a given bit stream
Defining the average rate is non-trivial
  fit the closest slope along the staircase, or
  number of bits in the stream / duration of the stream
Variable bit rate
  the encoder does not use the channel continuously
  the channel has a peak transmission rate R higher than the average stream bit rate
    when needed, packets are sent at rate R; otherwise at rate 0
  typical of packet networks and shared channels
  best modelled by a leaky bucket
Leaky bucket, defined by (R, B, Fe)
  n: frame number
  b(n): number of bits placed in the leaky bucket for frame n
  τ(n): time at which frame n is processed
  R: bit rate at which data leaks out of the bucket
  Fe(n): fullness of the encoder buffer before frame n is added
  Be(n): fullness of the encoder buffer after frame n is added
  the stream has a schedule
Leaky bucket
  Be(n): fullness of the encoder buffer after frame n is added to the bucket
  Fe(n): fullness of the encoder buffer before frame n is added to the bucket
  $B^e(n) = F^e(n) + b(n)$
  $F^e(n+1) = \max\{0,\; B^e(n) - R[\tau(n+1) - \tau(n)]\}$
  the bucket must not overflow: $B^e(n) \le B$ for all n = 0, 1, …, N
Aim is to find the smallest decoder buffer size and the smallest decoder buffer delay
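The two update equations can be stepped through directly. This is a sketch with hypothetical frame sizes and timestamps; `simulate_bucket` is an illustrative name, not something from the book.

```python
def simulate_bucket(bits, times, R, Fe0):
    """Return (F, Bf): bucket fullness before and after each frame is added."""
    F, Bf = [], []
    fullness = Fe0
    for n, b in enumerate(bits):
        F.append(fullness)
        after = fullness + b                    # B^e(n) = F^e(n) + b(n)
        Bf.append(after)
        if n + 1 < len(bits):
            drained = R * (times[n + 1] - times[n])
            fullness = max(0, after - drained)  # F^e(n+1), floored at empty
    return F, Bf

bits = [8000, 2000, 2000, 6000]   # b(n), hypothetical frame sizes in bits
times = [0, 1, 2, 3]              # tau(n), in seconds
F, Bf = simulate_bucket(bits, times, R=4000, Fe0=0)
print(F, Bf)
```

A bucket of capacity B contains this stream only if every value in `Bf` is at most B.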
Leaky bucket
For a given stream, define:
  minimum bucket capacity with leak rate R and given initial fullness Fe
    $B_{\min}(R, F^e) = \max_n B^e(n)$
  corresponding initial decoder buffer fullness
    $F^d_{\min}(R, F^e) = B_{\min}(R, F^e) - F^e$
  it follows that there is a minimum capacity B as well as a minimum decoder buffer delay Dd = Fd / R, provided the bucket starts with initial fullness $F^e = F^e_{\min}(R)$
Source coding rate (Rc): maximum leak rate R such that a leaky bucket (R, B, Fe) does not underflow with initial fullness $F^e = F^e_{\min}(R)$
  larger leak rates R => smaller required capacity
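A sketch of how the minimum capacity and the initial decoder fullness could be computed off line for one leak rate, by running the bucket recursion and recording the peak fullness. The function names and frame data are hypothetical.

```python
def b_min(bits, times, R, Fe0):
    """Minimum bucket capacity: the peak fullness after any frame is added."""
    fullness, peak = Fe0, 0
    for n, b in enumerate(bits):
        after = fullness + b
        peak = max(peak, after)
        if n + 1 < len(bits):
            fullness = max(0, after - R * (times[n + 1] - times[n]))
    return peak

def fd_min(bits, times, R, Fe0):
    """Initial decoder buffer fullness: minimum capacity minus initial fullness."""
    return b_min(bits, times, R, Fe0) - Fe0

bits = [4000, 8000, 2000, 6000]   # hypothetical frame sizes in bits
times = [0, 1, 2, 3]              # seconds
print(b_min(bits, times, R=2000, Fe0=0))   # slower leak, larger bucket
print(b_min(bits, times, R=4000, Fe0=0))   # faster leak, smaller bucket
```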
Leaky bucket
If transmission rate R > source coding rate Rc
  decoder buffer size is reduced
  decoder buffer delay is also reduced
The client can determine the required buffer size and preroll delay
  uses the functions $B_{\min}(R)$ and $F^d_{\min}(R)$
  computed off line at a set of transmission rates $R_1 < R_2 < \cdots < R_L$
  stored in the bit stream header as a set of leaky bucket parameters $(R_i, B_i, F_i)$, where $B_i = B_{\min}(R_i)$ and $F_i = F^d_{\min}(R_i)$
  each i = 1, …, L is a breakpoint of the piecewise linear functions $B_{\min}(R)$ and $F^d_{\min}(R)$
  at any other rate R, $B_{\min}(R)$ and $F^d_{\min}(R)$ can be estimated by linear interpolation (and extrapolation at the ends)
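A sketch of the client-side estimate, assuming the header carries at least two (Ri, Bi, Fi) triples sorted by rate. The parameter values are made up, and plain linear extrapolation past the end points is an assumption; it can go negative far outside the stored range, so a real client would presumably clamp it.

```python
from bisect import bisect_right

def interpolate_bucket(params, rate):
    """Estimate (Bmin, Fdmin) at `rate` from sorted (R_i, B_i, F_i) breakpoints."""
    rates = [r for r, _, _ in params]
    # Right endpoint of the segment to use; clamped so that rates outside
    # the stored range extrapolate along the first or last segment.
    hi = min(max(bisect_right(rates, rate), 1), len(params) - 1)
    (r0, b0, f0), (r1, b1, f1) = params[hi - 1], params[hi]
    w = (rate - r0) / (r1 - r0)
    return b0 + w * (b1 - b0), f0 + w * (f1 - f0)

# Hypothetical header contents: rate in kbit/s, B and F in kbit.
params = [(1000, 4000, 3000), (2000, 2000, 1000), (4000, 1000, 500)]
print(interpolate_bucket(params, 1500))   # between the first two breakpoints
print(interpolate_bucket(params, 5000))   # extrapolated past the last breakpoint
```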
Figure 14.7
Compound streams (Section 14.3.2)
Compound streams
  encapsulate many streams meant to be played and streamed concurrently
  view as a single compound stream and a set of leaky buckets
  a compound leaky bucket (R,B,F) is the sum of its component leaky buckets
    e.g. if audio has bucket (Ra,Ba,Fa) and video has bucket (Rv,Bv,Fv), then the parameters sum:
    R = Ra + Rv
    B = Ba + Bv
    F = Fa + Fv
  find a combination of component leaky buckets such that the combined leaky bucket won't overflow
  combination of i in La and j in Lv
  minimising with a Lagrangian shows that at most La + Lv index pairs lie on the set
  can extend this to M concurrent media streams
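The component-wise sum of leaky buckets is easy to state in code. A minimal sketch with hypothetical audio and video parameters; `Bucket` and `combine` are illustrative names.

```python
from typing import NamedTuple

class Bucket(NamedTuple):
    R: float   # leak rate, bits/s
    B: float   # capacity, bits
    F: float   # initial fullness, bits

def combine(*buckets):
    """Compound leaky bucket: the component-wise sum of its component buckets."""
    return Bucket(
        sum(b.R for b in buckets),
        sum(b.B for b in buckets),
        sum(b.F for b in buckets),
    )

audio = Bucket(R=128_000, B=64_000, F=16_000)
video = Bucket(R=1_000_000, B=500_000, F=200_000)
compound = combine(audio, video)
print(compound)
```

Because `combine` takes any number of buckets, the same sum covers M concurrent media streams.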
Multibit rate (MBR)
  multiple independent encodings (each with a different coding rate) are stored in separate streams (in the same file)
  the choice is which streams to play
  streams are mutually independent, each at a different source coding rate
  combining all possible mutually exclusive streams (e.g. Na audio and Nv video streams), each with a different leaky bucket
    most of the Na × Nv combinations are not likely; typically about Na + Nv are
    use a distortion-rate approach
Distortion-rate approach
Decide which streams to pair
  assign a distortion $D^a_i$ and source coding rate $R^a_i$ to each audio stream i = 0, …, Na
  assign a distortion $D^v_j$ and source coding rate $R^v_j$ to each video stream j = 0, …, Nv
  for each combined stream (i,j), define a distortion and a source coding rate
    where α is an arbitrary weight relative to video distortion
  using a Lagrangian again, can find the lowest total distortion among all combinations with the same or lower total bit rate
  can extend this to other sets of media
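A sketch of the Lagrangian pairing. The combined-stream definitions are not preserved in this transcript, so the code assumes the natural forms D(i,j) = α·Da(i) + Dv(j) and R(i,j) = Ra(i) + Rv(j); these forms, the function name, and all numbers are assumptions. Under them, minimising D + λR separates into an independent choice per medium, which is why only about Na + Nv pairs ever get selected as λ varies.

```python
def best_pair(audio, video, lam, alpha=1.0):
    """Pick (i, j) minimising alpha*Da_i + Dv_j + lam*(Ra_i + Rv_j).

    `audio` and `video` are lists of (rate, distortion) tuples. The cost is
    separable, so each medium is minimised on its own.
    """
    i = min(range(len(audio)), key=lambda k: alpha * audio[k][1] + lam * audio[k][0])
    j = min(range(len(video)), key=lambda k: video[k][1] + lam * video[k][0])
    return i, j

# Hypothetical (rate in kbit/s, distortion) pairs.
audio = [(32, 9.0), (64, 4.0), (128, 1.0)]
video = [(200, 40.0), (500, 15.0), (1000, 5.0)]

for lam in (0.01, 0.05, 0.1):
    print(lam, best_pair(audio, video, lam))
```

A larger λ penalises rate more heavily, so the selected pair moves toward the lower-rate streams.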
Temporal coordinate systems and timestamps (Section 14.3.4)
Each frame has a decoder timestamp (DTS, in MPEG terminology)
  instructs the client when to decode it
  also acts as a decoding deadline
A presentation buffer holds decoded frames before the renderer
  each frame is assigned a presentation timestamp (PTS), which instructs when to play it
  critical in synchronising different streams
  PTS are a layer above the DTS
Note that presentation order ≠ decoding order
  e.g. I0, B1, B2, P3, B4, B5, P6, ... (presentation order)
       I0, P3, B1, B2, P6, B4, B5, ... (decoding order)
Assumed that frames are time stamped with DTS and PTS
  the book will only use DTS
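The reordering in the example can be reproduced by sorting on presentation time. The sketch assumes the digit in each frame label (I0, B1, ...) is its presentation index, which stands in for a real PTS value.

```python
def presentation_order(frames):
    """Sort frame labels such as 'P3' by their numeric presentation index."""
    return sorted(frames, key=lambda name: int(name[1:]))

decode_order = ["I0", "P3", "B1", "B2", "P6", "B4", "B5"]   # order of arrival and decoding
print(presentation_order(decode_order))
```

P3 is decoded before B1 and B2 because both B frames reference it, yet it is displayed after them.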
Clocks (temporal coordinate systems)
  media time τ: clock of the device used to capture and timestamp the original content (real time)
  client time t: clock of the device playing the content
  e.g. τDTS(0), τDTS(1), etc. and tDTS(0), tDTS(1), etc.
Converting is done by
  $t = t_0 + \dfrac{\tau - \tau_0}{\nu}$
where
  ν (also written v) is the playback rate (ν = 2 => playing at 2x the speed)
  t0 and τ0 are the times of a common initial event (the first frame after seeking/rebuffering)
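The conversion as a one-line function, with a hypothetical example: at 2x speed, six seconds of media time elapse in three seconds of client time.

```python
def media_to_client_time(tau, tau0, t0, v=1.0):
    """Map media time tau to client time: t = t0 + (tau - tau0) / v."""
    return t0 + (tau - tau0) / v

# The frame stamped tau = 10 s, when playback started at tau0 = 4 s and t0 = 100 s.
t = media_to_client_time(tau=10.0, tau0=4.0, t0=100.0, v=2.0)
print(t)
```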
Leaky bucket update
In client time, the leaky bucket update is expressed with
  R´ = Rv: the arrival rate of bits into the client (unit: bits per unit of client time)
  R = R´/v: the rate that must be used to compute the required buffer size Bmin(R) and the initial decoder buffer fullness Fdmin(R)
  preroll delay is Fdmin(R)/R´ = Fdmin(R)/(Rv)
  for a given R, larger playback speed => smaller preroll delay
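The slide's own update equation is not preserved in this transcript. The following is a sketch of how it follows from the earlier formulas, assuming the same symbols as the leaky bucket recursion and the media-to-client time conversion:

```latex
% From t = t_0 + (\tau - \tau_0)/v, successive frames satisfy
\tau(n+1) - \tau(n) = v\,[\,t(n+1) - t(n)\,]
% Substituting into F^e(n+1) = \max\{0,\ B^e(n) - R[\tau(n+1) - \tau(n)]\}:
F^e(n+1) = \max\{\,0,\ B^e(n) - R'\,[\,t(n+1) - t(n)\,]\,\}, \qquad R' = R\,v
```

The bucket drains at R´ bits per unit of client time, while the stored parameters are still looked up at the media-time rate R = R´/v.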
Packet networks - 14.4
  Rc: source coding rate
  Rs: sending rate - rate at which data is injected into the transport layer
    measured in bits/s of client time
  Rx: transmission rate - rate at which data is injected into the network layer (TCP or UDP)
    Rx - Rs = error control overhead
    Rs / Rx = channel coding rate
  Ra: arrival rate, assumed to be Rs
    usually set to Ra = vRc
Decoupling Rc and Ra has advantages
Figure 14.8a
Decoupling Ra = vRc
  adjusting the source coding rate is the problem of source coding rate control
    choose Rc as a function of Ra
  changes the client buffer duration and history
  have a variety of average bit rates R(1), R(2), …
    each with a tight buffer tube (R(i), B(i), Fe(i))
  can delay playback to guarantee continuous playback
Control theoretic model - 14.4.2.1
Client buffer - gap between a frame's arrival time ta(n) and its playback deadline td(n)
  overflow when the gap is too large
  underflow when the gap is too small
If the gap shrinks, must reduce Rc to adjust tb(n)
Figure 14.9
Control objective - 14.4.2.2
  underflow is prevented by the approach of the previous section
  quality fluctuates with the complexity of the content
  the target schedule has a margin of safety
  introduces penalties into the cost function for
    deviation of the buffer tube from the target schedule
    coding rate difference between successive frames
Target schedule design - 14.4.2.3
  want the smallest client buffer duration
  start with a small delay, and increase the gap
  the slope is the ratio of the average source coding rate to the average arrival rate
  if the upper bound aligns with the target schedule, tb(n) = tT(n)
  eventually want logarithmic growth of the buffer
Figure 14.10
  $s(n) = \dfrac{t_T(n+1) - t_T(n)}{\tau(n+1) - \tau(n)}$
  $s(n) = \dfrac{R_c(n+1)}{R_a}$
Controller design - 14.4.2.4
  adjusts the source coding rate
  the controller needs to change frame n+2 at time n
  uses the notion of an error e(n) and a feedback gain vector G
  the optimal G* is solved for