Diploma Thesis
A Java-Based
Streaming Media Server
carried out at the
Institut für Interaktive Medien Systeme
of the Technische Universität Wien
under the supervision of
ao.Univ.Prof. Dr. Horst Eidenberger
by
Harald Psaier
Matr.Nr. 9826727
Bergsteiggasse 7/1/20
Wien, 19th October 2005
Abstract
The distribution of multimedia content has become a promising sector of the
Internet business. Multimedia content is transmitted by media streaming servers to
stream-controlling clients. These multimedia architectures are available for many
different operating systems. This diploma thesis describes the implementation of
such a multimedia client/server architecture.
The main task of a media streaming server is to provide transparent network
services for a collection of media objects. The server acts as a proxy server and
allows different access methods to the provided multimedia content. Accessing
the multimedia content includes streaming media objects, accounting media ob-
jects and their access rights and, furthermore, media stream-control during media
content transmission.
In the early stages of the work, existing solutions and methods of multimedia
implementations were examined. The resulting software project consists of
a client component and a server component based on the Java programming lan-
guage and its media extension, the Java Media Framework. Java and the frame-
work provide all the means to implement a multi-threaded listening server and a
controlling and accounting client. The communication between the counterparts is
established by the Real Time Streaming Protocol. It has been extended to support
media information and user profile accounting. A relational database manages
the media information and user profiles. Media streaming is implemented by seg-
menting the media objects into network packets using the approach described by
the Real-Time Transport Protocol.
The resulting software project comprises a server and a client package. The
server package contains facilities for streaming, device capturing and caching of
media streams. The client provides a user interface for both control of media
transmissions and administration of media information.
The final implementation is a multimedia client/server architecture written in
Java, thus making it independent of the operating system. Equipped with
real-time media stream control and real-time media object transport capabilities,
it is compatible with modern multimedia streaming solutions such as the
RealNetworks and QuickTime clients.
Contents
1 Introduction 5
2 Related solutions and background 8
2.1 Streaming protocols . . . . . . . . . . . . . . . . . . . . . . . . 8
2.1.1 RFC 2326 - The Real Time Streaming Protocol (RTSP) . . 9
2.1.2 RFC 1889 - The Real Time Transport Protocol (RTP) . . . 9
2.2 Multimedia frameworks . . . . . . . . . . . . . . . . . . . . . . 10
2.2.1 OpenML . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2.2 Apple QuickTime . . . . . . . . . . . . . . . . . . . . . 12
2.2.3 GStreamer . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2.4 Java Media Framework . . . . . . . . . . . . . . . . . . 13
2.3 Streaming media overview . . . . . . . . . . . . . . . . . . . . . 14
2.3.1 Apple streaming solutions . . . . . . . . . . . . . . . . . 15
2.3.2 RealNetworks streaming solutions . . . . . . . . . . . . . 16
2.3.3 VideoLAN streaming solutions . . . . . . . . . . . . . . 16
2.4 Caching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.4.1 Caching objectives . . . . . . . . . . . . . . . . . . . . . 18
2.4.2 Memory caching policies . . . . . . . . . . . . . . . . . 18
3 Design and architecture 22
3.1 Requirement analysis . . . . . . . . . . . . . . . . . . . . . . . . 22
3.1.1 Media management and control . . . . . . . . . . . . . . 23
3.1.2 RTSP/RTP server . . . . . . . . . . . . . . . . . . . . . 24
3.1.3 Interaction with RealNetworks and QuickTime clients . . 25
3.1.4 JMF MPEG-4 RTP integration . . . . . . . . . . . . . . . 25
3.1.5 Caching . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
3.1.6 UI enhancement . . . . . . . . . . . . . . . . . . . . . . 26
3.1.7 Media capturing timer . . . . . . . . . . . . . . . . . . . 27
3.2 Deployment strategy . . . . . . . . . . . . . . . . . . . . . . . . 27
3.3 Static structure . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3.3.1 The server package . . . . . . . . . . . . . . . . . . . . . 28
3.3.2 The client package . . . . . . . . . . . . . . . . . . . . . 32
3.3.3 Database tables . . . . . . . . . . . . . . . . . . . . . . . 35
3.4 Dynamic behavior . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.4.1 RTSP/RTP stream setup . . . . . . . . . . . . . . . . . . 37
3.4.2 Extended RTSP . . . . . . . . . . . . . . . . . . . . . . 39
3.4.3 Authentication process . . . . . . . . . . . . . . . . . . . 40
3.4.4 VCR timer sequence . . . . . . . . . . . . . . . . . . . . 43
3.4.5 Multi-unicast session . . . . . . . . . . . . . . . . . . . . 44
3.4.6 Patterns - singletons and factories . . . . . . . . . . . . . 44
3.5 Implementation challenges . . . . . . . . . . . . . . . . . . . . . 46
3.5.1 JMF creation . . . . . . . . . . . . . . . . . . . . . . . . 46
3.5.2 JMF RTP engine . . . . . . . . . . . . . . . . . . . . . . 47
3.5.3 JMF MPEG-4 RTP stream setup . . . . . . . . . . . . . . 47
3.5.4 JMF AVI Container with audio/video streams . . . . . . . 48
3.5.5 Interval Caching . . . . . . . . . . . . . . . . . . . . . . 48
4 Implementation 49
4.1 Implementing with the Java Media Framework . . . . . . . . . . 50
4.2 Project description . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.2.1 System overview . . . . . . . . . . . . . . . . . . . . . . 51
4.2.2 Server package . . . . . . . . . . . . . . . . . . . . . . . 51
4.2.3 Sub-packages in the server package . . . . . . . . . . . . 56
4.2.4 Client package . . . . . . . . . . . . . . . . . . . . . . . 59
4.2.5 Mp4 package . . . . . . . . . . . . . . . . . . . . . . . . 60
4.2.6 Util package . . . . . . . . . . . . . . . . . . . . . . . . 61
4.2.7 MediaDB database tables . . . . . . . . . . . . . . . . . 61
4.3 User Interfaces . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5 Usage description 64
5.1 System requirements . . . . . . . . . . . . . . . . . . . . . . . . 65
5.1.1 Hardware requirements . . . . . . . . . . . . . . . . . . . 65
5.1.2 Software requirements . . . . . . . . . . . . . . . . . . . 65
5.1.3 Building requirements (ANT) . . . . . . . . . . . . . . . 66
5.2 System usage description . . . . . . . . . . . . . . . . . . . . . . 66
5.2.1 Running the projects server and client . . . . . . . . . . . 66
5.2.2 Transmitting and controlling an RTP stream . . . . . . . . 66
5.2.3 Adding and deleting media . . . . . . . . . . . . . . . . . 67
5.2.4 Editing media object properties . . . . . . . . . . . . . . 67
5.2.5 Starting a multi-unicast transmission . . . . . . . . . . . 67
5.2.6 Adding and deleting a recorder timer . . . . . . . . . . . 67
5.2.7 Running other clients . . . . . . . . . . . . . . . . . . . . 68
5.3 Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.4 Limitations & known bugs . . . . . . . . . . . . . . . . . . . . . 70
6 Conclusions 71
6.1 About the project . . . . . . . . . . . . . . . . . . . . . . . . . . 71
6.2 Outlook . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
1 Introduction
This diploma thesis describes my work in the field of multimedia streaming archi-
tectures. A multimedia streaming architecture is characterized by server components,
client components and a component-connecting network.
With the growth of the Internet and of local networks, with their manifold possibilities,
the streaming of multimedia content has become a new challenge and a profitable
business. Current computer networks were initially built to handle text-based information
only. In order to integrate a multimedia server into these networks, new procedures and
problem-specific solutions have to be defined.
The initial documents, implementations and suppliers of media streams appeared in
the mid-nineties, when the Real-Time Transport Protocol (RTP) and the Real Time
Streaming Protocol (RTSP) were composed and RealNetworks was founded. Nevertheless,
transmitting reliable and continuous multimedia streams is still a complex task.
Many obstacles have to be overcome before an undisturbed delivery can be achieved. Huge
amounts of data need to be delivered. Hence, large network bandwidths are required.
Moreover, the layers below the network transport layer do not guarantee reliable
delivery. Appropriate methods have to be implemented by the upper layers. Finally, from
the marketing point of view, consumers still need to be attracted to such services. One
solution to the aforementioned bandwidth problems would be to "throw bandwidth at
congestion problems", improving the network's reliability. This strategy would solve the
problem on local networks. However, streaming through public networks worldwide,
such as the Internet, is only of interest with a cheap and stable connection at a flat
rate, as provided by DSL technology. Public streaming services started with radio
streams at different compression and quality rates, followed by download and stream-
ing portals provided amongst others by such well-known companies as RealNetworks
(Listen.com), Microsoft (MSN Music) or Apple (iTunes Music Store). The broadcast
of audio-visual content is just starting to establish itself, with only a few hesitant
providers. The newest addition to streaming implementations for the Internet is Internet
telephony, also known as Voice over IP.
The purpose of the project is to develop a streaming and accounting media server which
can handle multiple client requests simultaneously. The media server’s main function is
to host a streaming and accounting service for a collection of media objects. The
services are supported by a relational database that stores records of media content
information. Furthermore, the server tries to balance the streaming request load by caching
intervals of media streams. The client offers access to the server’s services through a
user interface. Along with media stream capturing, the client’s main function is to
coordinate media content presentation with media stream flow-control. Both server and
client are written in the Java programming language and include the Java Media Framework
(JMF). The framework adds streaming and presentation functions to the components.
Figure 1 shows the main components of the media server implementation of this thesis
work.
Figure 1: Project overview. CT1, CT2 represent the different communication types.
The project’s main components (the database, the multimedia server and the clients)
interact using two communication types (CT1 and CT2). The database holds media
information and user authentication data. The connection between server and database (CT1)
requires a database-dependent protocol for communication. It is implemented using a
library interface. The link to the interface allows the server to submit standard queries
to the database tables and to fetch the returned query results.
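As a concrete illustration of the CT1 link, the sketch below submits a standard query through a library interface (JDBC). The class name, table and column names ("media", "title", "uri") are illustrative assumptions, not the thesis code:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

// Hypothetical sketch of the server-to-database link: a standard SQL
// query is submitted through the JDBC library interface and the result
// rows are fetched back. Names are assumptions for illustration only.
public class MediaLookup {

    // Pure helper: builds the parameterized query text.
    static String mediaByTitleQuery() {
        return "SELECT id, title, uri FROM media WHERE title = ?";
    }

    // Sketch of the actual round trip; requires a running database and
    // a vendor JDBC driver on the class path.
    static String lookupUri(String jdbcUrl, String title) throws Exception {
        try (Connection c = DriverManager.getConnection(jdbcUrl);
             PreparedStatement ps = c.prepareStatement(mediaByTitleQuery())) {
            ps.setString(1, title);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("uri") : null;
            }
        }
    }
}
```

Only the query-building helper runs without a database; the round trip additionally needs a reachable database and driver.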
In order to satisfy the user, the client requires real-time communication with the server.
Hence, CT2 actually consists of two communication protocols supporting real-time
demands. These are the Real-Time Transport Protocol (RTP) and the Real Time
Streaming Protocol (RTSP). RTP is the state-of-the-art method to transmit multimedia
data in real-time over networks. Its functions are embedded into the project’s software
with the help of the Java Media Framework. RTSP provides control over multimedia
streams. This protocol was implemented to put remote-control methods over media
content streams at the client’s disposal. RTSP was extended to allow authenticated
media object accounting.
In addition to the main objectives just stated, my work comprises the following
aspects. On the server’s side, not only the processing of multimedia streams and protocol
communication are of interest; some proposals about clever resource handling have
also been made. Several caching policies for multimedia streaming were considered to
support the server in load balancing and in better utilization of free primary and fast
memory, hence increasing server capacity and resource usage. Depending on the system’s
architecture, caching strategies use distributed data balancing on virtual server
networks as well as statistics on frequently accessed media and client behavior in the time
domain. The project contains only one server, making the virtual server method
irrelevant. Therefore, the caching method implemented at the server is a modified Interval
Caching policy.
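The idea behind Interval Caching can be sketched as follows (a simplified illustration under assumed names, not the modified policy of the thesis): two consecutive streams of the same media object form an interval, and the shorter the interval, the less memory is needed to serve the follower from cache, so the shortest intervals are cached first.

```java
import java.util.TreeSet;

// Simplified Interval Caching sketch: intervals are kept ordered by
// length; when memory runs short, the longest cached intervals are
// evicted first in favor of shorter (cheaper) ones.
public class IntervalCache {
    static class Interval implements Comparable<Interval> {
        final String media; final long lengthBytes;
        Interval(String media, long lengthBytes) {
            this.media = media; this.lengthBytes = lengthBytes;
        }
        public int compareTo(Interval o) {
            int c = Long.compare(lengthBytes, o.lengthBytes);
            return c != 0 ? c : media.compareTo(o.media);  // tie-break by name
        }
    }

    private final long capacity;                   // bytes of fast memory
    private long used = 0;
    private final TreeSet<Interval> cached = new TreeSet<>();

    IntervalCache(long capacity) { this.capacity = capacity; }

    // Offer a new interval; evict strictly longer intervals if that frees
    // enough room. Returns true if the interval ends up cached.
    boolean offer(Interval iv) {
        while (used + iv.lengthBytes > capacity && !cached.isEmpty()
               && cached.last().lengthBytes > iv.lengthBytes) {
            used -= cached.pollLast().lengthBytes;  // evict longest first
        }
        if (used + iv.lengthBytes <= capacity) {
            cached.add(iv);
            used += iv.lengthBytes;
            return true;
        }
        return false;
    }

    int size() { return cached.size(); }
}
```

With a capacity of 100, caching intervals of 60 and 30 bytes and then offering one of 50 bytes evicts the 60-byte interval in favor of the shorter newcomer.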
The integration of MPEG-4-compliant codecs into the RTP packing procedure is of
great general interest. A specification was created in 2003, but only few
implementations exist, and almost all of them are proprietary. With MPEG-4 (DivX, XviD, etc.)
becoming a very popular codec for video material, the importance of this particular
codec definition is growing. MPEG-4 RTP streaming was integrated with the help of
the JMF packetizer and depacketizer model.
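The core job of a packetizer can be illustrated in isolation (a sketch of the general technique, not the JMF plug-in itself): one encoded frame is split into payloads no larger than the path MTU, and the RTP marker bit is set on the last packet of the frame so the depacketizer can detect frame boundaries.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative packetizer sketch: splits an encoded frame into
// MTU-sized payload chunks and flags the frame boundary with the
// RTP marker bit on the final chunk.
public class FramePacketizer {
    static class Packet {
        final byte[] payload; final boolean marker;
        Packet(byte[] payload, boolean marker) {
            this.payload = payload; this.marker = marker;
        }
    }

    static List<Packet> packetize(byte[] frame, int maxPayload) {
        List<Packet> out = new ArrayList<>();
        for (int off = 0; off < frame.length; off += maxPayload) {
            int len = Math.min(maxPayload, frame.length - off);
            byte[] chunk = new byte[len];
            System.arraycopy(frame, off, chunk, 0, len);
            boolean last = off + len == frame.length;
            out.add(new Packet(chunk, last));  // marker = end of frame
        }
        return out;
    }
}
```

A 2500-byte frame with a 1000-byte payload limit yields three packets, the last carrying 500 bytes and the marker bit.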
An interesting feature for CT2, and for online media stream providers in general,
is to take advantage of the multicast property of IP. With only one multimedia
stream transmitted by the server, the stream can be received by all clients participating
in the multicast session. However, IP multicast requires additional addressing and
routing for the participants. A less sophisticated solution is multi-unicast. Multi-unicast
allows the simultaneous transmission to all registered clients of an RTSP media
session, thereby producing more load on the server: one stream per connection is opened.
Multi-unicast is a feature of the JMF RTPManager. It was used to add multi-unicast
transmission mode to the software implementation.
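A minimal model makes the cost of multi-unicast explicit (a sketch of the idea only; the project delegates the actual fan-out to the JMF RTPManager, which accepts one unicast target per registered client): every packet is sent once per client, so server load grows linearly with the audience.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal multi-unicast model: each registered client receives its own
// copy of every packet ("one stream per connection"). A real sender
// would write the packet to each target address instead of counting.
public class MultiUnicast {
    private final List<String> targets = new ArrayList<>();  // "host:port"

    void addTarget(String hostAndPort) { targets.add(hostAndPort); }

    // Returns the number of per-client transmissions one packet causes.
    int send(byte[] packet) {
        int sent = 0;
        for (String target : targets) {
            // here a real sender would transmit `packet` to `target`
            sent++;
        }
        return sent;
    }
}
```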
Device capturing also became of interest for the server. Device capturing sets up
a processing queue that is the inverse of network streaming: the stream from a server’s
device is captured into a file sink. The client allows setting device capture intervals, and
the database provides the means of storing those intervals. The capturing queue was set up
with the help of suitable JMF components. The extended version of CT2 allows the
client to issue interval capture requests.
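The timer arithmetic behind such a recorder feature can be sketched as follows (names are illustrative, not the thesis classes): given a stored capture interval, the server computes how long to wait before starting the capture queue and how long to keep it running.

```java
import java.time.Duration;
import java.time.Instant;

// Sketch of recorder-timer arithmetic: a capture interval [start, end]
// stored by the client is turned into a start delay and a duration.
public class CaptureTimer {

    // Milliseconds to wait before the capture queue must start
    // (0 if the interval has already begun).
    static long startDelayMillis(Instant now, Instant start) {
        return Math.max(0, Duration.between(now, start).toMillis());
    }

    // Length of the capture, clipped at zero for inverted intervals.
    static long captureMillis(Instant start, Instant end) {
        return Math.max(0, Duration.between(start, end).toMillis());
    }
}
```

The computed delay would then feed an ordinary scheduler (e.g. `java.util.Timer`) that starts and stops the JMF capture queue.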
The diploma thesis is organized as follows. Section 2 presents the research work on
recent multimedia streaming products and types of implementation. Section 3 explains
the project design phase: starting from the concept, a requirement analysis leads to the
static structure and dynamic behavior of the software. Finally, implementation
challenges are considered. Sections 4 and 5 contain the project’s implementation and
the usage description. Section 6 concludes this thesis and offers an outlook on issues of
present-day multimedia streaming.
2 Related solutions and background
This section provides an overview of recent streaming server products. In Subsection
2.1 the RTSP and RTP suites are explained. In Subsection 2.2 multimedia frameworks
and their functionality are analyzed. Subsection 2.3 presents reference implementations
from several providers and their features.
2.1 Streaming protocols
Streaming protocols are deployed for the transmission of real-time streams. Two
representatives, RTSP and RTP, have been chosen for the purpose of this work. Both
protocols are so fundamental to the transmission of real-time streams that all major
streaming server products support the combination of the two.
The RTSP suite makes it possible to deliver media and transport information in a
streaming client/server architecture. It provides the means required to set up sessions for
video-on-demand and video control operations. RTP defines a standardized packet format.
This format is a model for audio and video delivery over the network. It solves the
media frame packing, the timestamp and the synchronization issues by adding a spe-
cial RTP header to the packed media data. The format is suitable for both unicast and
multicast transport. Figure 2 below shows RTSP and RTP in the OSI seven-layer
model.
Figure 2: OSI view of the RTSP and RTP protocols. (RTSP handles session control and media information at the upper layers on both server and client; RTP streams over a layer-4 transport such as UDP.)
2.1.1 RFC 2326 - The Real Time Streaming Protocol (RTSP)
The Real Time Streaming Protocol (RTSP) is defined in [1]. It was developed by the
IETF and published in 1998 as RFC 2326. RTSP is a protocol which allows a client
to remotely control a streaming media server. The client controls the server by issuing
remote-control-like commands such as "play" and "pause".
The main function of RTSP is to establish and control one or more time-synchronized
streams. RTSP is an application layer protocol. It does not transmit the streams by itself
but helps low-level streaming protocols, such as RTP, to establish connections between
the streaming partners. Furthermore, it provides a method to transmit media object
descriptions.
RTSP must provide states which indicate the status of the streaming process. In RTSP
there are two states, session-less and in session. In session means that the client
requested the setup of a stream transmission, thus requiring an RTSP session for further
stream control. An RTSP session identifier is transmitted from the server to the client.
Only the session identifier allows the client to issue control commands on the session’s
streams.
RTSP has an HTTP-like syntax, making it extensible and adaptable. RTSP comprises
commands called methods, such as SETUP, PLAY, PAUSE, etc., which indicate the
requested action. There are essential methods which define how to create and leave
an RTSP session. In addition, an RTSP session strictly defines its RTSP method flow.
The Session Description Protocol [3] describes multimedia content. It adds a means
of multimedia information exchange, session announcement or session invitation to the
RTSP suite.
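The HTTP-like syntax of the methods described above is easy to show on the wire. The sketch below only formats a SETUP request (the URL and port numbers are made up for illustration) and does not open a connection; the server's reply would carry the Session identifier required for subsequent PLAY and PAUSE requests.

```java
// Formats an RTSP SETUP request in the wire syntax of RFC 2326.
// URL and ports are illustrative; no network I/O is performed.
public class RtspRequest {
    static String setup(String url, int cseq, int rtpPort) {
        return "SETUP " + url + " RTSP/1.0\r\n"
             + "CSeq: " + cseq + "\r\n"
             // the client proposes an RTP port and, by convention,
             // the next higher (odd) port for RTCP
             + "Transport: RTP/AVP;unicast;client_port="
             + rtpPort + "-" + (rtpPort + 1) + "\r\n"
             + "\r\n";
    }
}
```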
2.1.2 RFC 1889 - The Real Time Transport Protocol (RTP)
The Real Time Transport Protocol (RTP) [4] defines a model for transmission of real-
time media. It was developed by the Audio-Video Transport Working Group of the
IETF and published in 1996 as RFC 1889. The model describes packaging methods
for data with real-time characteristics, such as interactive audio and video. Moreover,
the RTP suite defines a control protocol. The RTP Control Protocol (RTCP) allows
monitoring of the media data delivery. It adds minimal control and identification
functionality to the RTP stream.
RTP neither guarantees timely delivery nor a continuous stream. On this matter it
entirely depends on lower-layer services. However, a Sequence Number field is included
in the RTP packet header in order to reconstruct the sender’s packet sequence. But this
does not prevent the lower network layers from sending out-of-sequence packets. In
addition, it does not interfere with the transmitting logic of upper layers. The stream
processing queues at the server often deliver media object parts out-of-sequence in
order to force the clients to prebuffer the media presentation. Another important part of the
RTP packet header is the Payload Type. It identifies the data format of the RTP payload.
In the RTP Profile for Audio and Video Conferences with Minimal Control, RFC 1890
[2], the payload types for audio and video RTP streams have been specified.
Furthermore, a Timestamp field in the RTP header reflects the sampling instant of the first
octet in the RTP data packet.
RTCP was defined to add minimal control to the RTP. This control is obtained by trans-
mitting periodic control packets to all participants of a streaming session. The control
packets are distributed using the same mechanism as the RTP data packets. RTCP adds
four functions to RTP transmissions:
1. It provides feedback about the quality of the data distribution (diagnoses network
problems).
2. It carries a persistent transport-level identifier calledCNAME.
3. It helps to adjust the RTP packet sending rate.
4. It conveys minimal session information, for example session participant
identification.
The functions 1-3 are mandatory for an RTP transmission in an IP multicast environ-
ment. Additionally, they are recommended for all environments. For UDP and simi-
lar protocols, the RTP stream uses an even port number and the corresponding RTCP
stream uses the next higher (odd) port number.
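The header fields and the port convention described above can be read directly from the first twelve octets of a packet. The sketch below follows the fixed header layout of RFC 1889 (it is an illustration, not the JMF engine; the SSRC in octets 8-11 is omitted):

```java
// Reads the fixed RTP header fields per RFC 1889: version (2 bits),
// marker (1 bit), payload type (7 bits), sequence number (16 bits)
// and timestamp (32 bits). SSRC (octets 8-11) is omitted here.
public class RtpHeader {
    final int version, payloadType, sequenceNumber;
    final long timestamp;
    final boolean marker;

    RtpHeader(byte[] p) {
        version        = (p[0] >> 6) & 0x03;           // should be 2
        marker         = ((p[1] >> 7) & 0x01) == 1;
        payloadType    = p[1] & 0x7F;                  // e.g. 0 = PCMU per RFC 1890
        sequenceNumber = ((p[2] & 0xFF) << 8) | (p[3] & 0xFF);
        timestamp      = ((long) (p[4] & 0xFF) << 24) | ((p[5] & 0xFF) << 16)
                       | ((p[6] & 0xFF) << 8)         |  (p[7] & 0xFF);
    }

    // Companion RTCP stream: next higher (odd) port number.
    static int rtcpPort(int rtpPort) { return rtpPort + 1; }
}
```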
RFC 1889 only defines the standard properties of the RTP suite. Several other
protocols extend this basic concept with new ideas. The aforementioned RFC 1890 [2]
both lists the assigned payload types for media codecs and provides guidelines for
correct media encoding for RTP transmissions. In addition, most assigned payload types
are specified in an RFC which describes how to adjust the original RTP
packet model to their special requirements.
2.2 Multimedia frameworks
A multimedia framework is a software structure composed of a set of software libraries.
The primary function of such a framework is to handle media objects. A good multime-
dia framework offers an intuitive API and a plug-in architecture. Thereby, new codecs,
new formats, new capture devices, new communication procedures, etc. can comfort-
ably be added. Its main purpose is to be integrated into multimedia applications such as
multimedia players, multimedia servers or multimedia editing software. Furthermore,
the specialized libraries and interfaces of a multimedia framework make it possible to
combine new and customized multimedia solutions. Multimedia frameworks process
media with various handlers for formats, streams and contents. They are equipped with
up-to-date codecs, formats, multiplexers, readers, writers, etc. Additionally, frameworks try
to automate the media handling by setting up appropriate media processing queues. A
modern multimedia framework ought to be extensible because of the rapid development
concerning all parts of multimedia processing.
Figure 3 shows a model with the main components of a multimedia framework. Subse-
quently, the multimedia frameworks of four well-known suppliers are described.
Figure 3: Model of a multimedia framework. (Codec components such as encoders, demultiplexers, multiplexers, parsers, analyzers, editors, effects and converters, together with audio/video/file/network sources and sinks, are connected through capture, presentation, transcode and network transmission queues; developer-defined filters and queues extend the model.)
2.2.1 OpenML
The OpenML library, provided by Khronos [6], supplies a cross-platform standard pro-
gramming environment for capturing, transporting, processing, displaying and synchro-
nizing digital media. In order to increase efficiency, the OpenML API is designed to
provide support for audio, video and graphics at the lowest possible hardware level.
Whenever possible, OpenML aims at the utilization of existing standards. The main
modules implemented by OpenML are MLdc for display control, ML as media library
and a means of synchronization between the hardware devices.
MLdc is an API which allows applications to control the display of a video stream and
video output devices. The ML media library API provides an interface for high-level
utility libraries and tool kits. It offers a means of communication, processing, buffering
or control between the multimedia resources (e.g. input/output devices, files, codecs,
processing queues, etc.). The Unadjusted System Time (UST) allows synchronization
throughout the system.
Although not open source, the OpenML package is free of charge. It provides interfaces
through modules and tools. Unfortunately, the OpenML library is not equipped with any
network media streaming properties.
2.2.2 Apple QuickTime
The QuickTime framework [7] is Apple’s multi-platform multimedia software architec-
ture. It is composed of a set of multimedia operating-system extensions, a comprehen-
sive API, a file format, and a set of user applications such as the QuickTime Player, the
QuickTime ActiveX control and the QuickTime browser plug-in.
The QuickTime framework considers itself as a complete multimedia architecture. It
supports creating, producing and delivering of media. The architecture provides an
end-to-end support for multimedia processes. These processes include the capturing
of media in real-time, the importing and the exporting of existing media and further-
more, the editing, the composing and the compressing of media content. Less complex
processes, such as the playback of multimedia, are supported as well.
Apple provides an API reference for its framework containing descriptions of the pro-
gramming interface elements. The documentation includes a huge collection of sample
code. The QuickTime architecture is a collection of tools and sets such as the Movie
Toolbox or the QuickTime streaming API. It is organized with components making it
modular, flexible, and extensible. A QuickTime component is a shared code resource
with a defined API. Therefore, it is possible to add a new component to the QuickTime
architecture and have existing applications automatically find it. Apple provides a
free-of-charge software development kit to access the API of the QuickTime architecture.
Moreover, it is possible to access the QuickTime architecture directly using code written
in C, C++, Objective C or Java.
2.2.3 GStreamer
The GStreamer project [8] is a framework for multimedia applications. It has a pipeline
design for composing processing queues. The pipeline design was chosen to decrease the
queuing overhead. This supports applications with strict latency demands.
The framework contains plug-ins that provide various libraries and tools. Parsers,
codecs, sinks, multiplexers and demultiplexers are considered plug-ins. The pipeline
structure defines the flow of the media data. A GUI editor is available for pipeline editing
and design. Pipeline patterns can be saved in the XML format in order to create pipeline
libraries which later can be extended. The project has been split into modules. The core
module provides a library of functions which define a framework for plug-ins, data flow
and media type handling. A documented API and sample applications are available. A
number of applications use the GStreamer framework including applications developed
for the market of mobile communication. Still, there is no streaming project registered
with the GStreamer framework which is equipped with RTSP/RTP streaming support.
The GStreamer project is released under the LGPL and is open for development and
free for download.
2.2.4 Java Media Framework
The Java Media Framework (JMF) [9] is another media framework that specifies a
simple, unified architecture in order to synchronize and control audio, video and other
time-based data within Java applications and applets. Furthermore, it is enhanced with a
toolkit for integrating, developing and enhancing the framework’s libraries. The JMF
extends the Java 2 Platform, Standard Edition with multimedia properties.
The latest version of the JMF API was developed by Sun Microsystems together with
IBM and some other supporters. JMF contains methods of building players, capturing
streams live or reading from file input. Other features include streaming and processing
of media content. It is very easy to extend and integrate. The JMF code can be changed
and enhanced. New codecs and parsers are added as plug-ins. The JMF classes’ API
documentation is available online. There are tutorials with reference implementations
to almost all features of the framework. An included framework registry provides an
overview of all available processing properties and understood MIME types. An RTP
engine is already integrated and an RTSP client plug-in allows communication to RTSP
servers. Sun Microsystems decided to pass its product free to the user and developer.
The source code can be downloaded after a free registration.
2.3 Streaming media overview
Streaming media is media which is consumed (read, heard, viewed) while it is being
delivered. It is a flow of continuous data on a transport medium such as air, wire, etc.
Media delivery is closely connected to constant data consumption. Thus, its delivery
should be at a constant rate and timely. In other words, the objective of streaming
media is real-time delivery to numerous clients at high quality rates. The available
quality of the delivered media is closely related to the delivery system used. A number
of streaming media systems are currently available. All try to solve the delivery issue
quite differently. Solutions to the issue are based on the use of various codecs such
as DivX, QuickTime, RealAudio and RealVideo, formats such as AVI, MPEG and
QuickTime, or protocols such as HTTP, RTP and Microsoft’s MMS. Furthermore, there
exist at least two options to deliver media streams: multicast and unicast transmission.
In a unicast transmission every client starts its own transmission at the server.
In contrast, in a multicast transmission only one stream leaves the server and
reaches all requesting clients. The second transmission type saves resources at the
server and can thus improve the quality of the media delivery.
A technique related to streaming media content is Voice over IP (VoIP). The VoIP
service provides routing of voice conversations through an IP network. A majority of VoIP
implementations use RTP to transmit the voice data. The challenges of Voice over IP
include latency and integrity issues which arise from the IP protocol. These issues are
also relevant for multimedia streams. However, the constraints on the real-time behavior
of VoIP are much stricter than for streaming media objects. An interruption lasting
more than 200 ms (milliseconds) is unacceptable in a VoIP conversation. Furthermore,
caching or buffering cannot be integrated. One possible VoIP implementation is the
combination of the Session Initiation Protocol (SIP) [5] and RTP. SIP, equipped with a
syntax similar to HTTP, takes the role of the session control protocol and initiates calls.
RTP transmits the voice data. This combination is very similar to the use of RTSP and
RTP in multimedia streaming.
Figure 4: Model of a multimedia client/server architecture.
Figure 4 shows the parts of frequently available multimedia client/server architectures.
As the figure illustrates, the main challenge of a multimedia server is to provide de-
coders, encoders and transmission formats that fit all the requirements of the served
clients.
2.3.1 Apple streaming solutions
The QuickTime Streaming Server (QTSS) is Apple’s commercial streaming server,
delivered as part of Mac OS X Server. The QTSS provides users with enhanced
administration and media management tools. As a result of the tight integration with Mac OS
X Server, these tools are not available as part of the open source project, the Darwin
Streaming Server (DSS). However, both the QTSS and the DSS are built on the same
server core and provide almost the same features.
Both support the streaming protocol combination of RTSP over TCP, RTP over UDP or
TCP. Furthermore, it is possible to bypass firewalls by tunneling RTSP and RTP over
HTTP port 80. Apple’s servers stream most well-known formats, especially Apple’s
own, the QuickTime format. The servers’ streaming options range from live and simulated
live1 to on-demand, in multicast and unicast sessions.
DSS is freely available for most platforms. Additionally, Apple hosts a number of email
discussion lists and detailed documentation for DSS users and developers.
1simulated live: A property that allows streaming a prerecorded clip or a broadcast archive as if it were a live event. Comparable to switching channels on a TV.
2.3.2 RealNetworks streaming solutions
The ninth generation of RealNetworks’ streaming products introduced the Helix prod-
uct line [10] which today includes all their streaming media products. To these solutions
belong servers, gateways and players. One reason for starting the Helix product line was
to collaborate with other competitors and independent developers. Therefore, some of
their software is available as open source. This includes a client, the Helix DNA Client,
and a server, the Helix DNA Server.
The Helix Universal Server is RealNetworks' professional streaming server solution. Some
streaming protocols available for this server are RTSP, the Progressive Networks Au-
dio (PNA), the Microsoft Media Services (MMS) and HTTP. Supported formats are
RealAudio, RealVideo, Flash, some Windows Media formats, QuickTime and all other
well-known formats. The server can also be seen as an agent between a media content
processing server farm and the clients. The server allows data transmission in multicast
and unicast sessions. The media content can be served on demand, live or using
simulated live. Furthermore, a virtual server architecture for load balancing is available
together with a server farm. Helix Universal Server supports RTP transmission, and
shifts to RTP transmission automatically when streaming to an RTP-based client.
The free port of the Universal Server, the DNA Server, is well documented. It has
a plug-in structure which allows adding new extensions. However, it lacks many
features of the commercial Universal Server, making it incompatible with
numerous proprietary formats, recent codecs and protocols.
2.3.3 VideoLAN streaming solutions
The VideoLAN project [11] offers multimedia streaming for all well-known audio and
video formats. The streaming methods include unicast and multicast network
transmissions. Furthermore, VideoLAN software runs on most operating
systems such as GNU/Linux systems, all BSD systems, Windows, Mac OS X, BeOS,
Solaris, QNX, etc.
The VideoLAN project hosts more than one solution for multimedia streams. The
project provides the source of the VLC media player, previously called VideoLAN
Client, and the VLS, also known as VideoLAN Server. The naming is a bit confusing,
because VLC can act as a streaming server and, moreover, capture or present streams
as a client. However, VLC was initially coded as a universal media client capable
of playing streams from multimedia files, multimedia discs and, later, the network.
VLS is a multimedia streaming server. Furthermore, it provides software interfaces to a
number of multimedia input devices such as digital satellite and terrestrial TV cards or
encoding hardware. Yet, the VideoLAN project advises using VLC as the server, mainly
because VLS development is currently stalled. The VideoLAN Manager (VLM), a
module for VLC, is a small media manager designed to control multiple streams with
VLC. Equipped with the VLM module, VLC can act as an RTSP server with streaming
functionality. The streaming features of an enhanced VLC include streaming over several
protocols such as HTTP, MMS, UDP and RTP. The combination of the last two
supports multicast transmission. The VideoLAN project hosts free software, released
under the GNU General Public License. I would like to point out that the EU's recent
patent legislation plans pose a threat to many open source multimedia projects,
including the VideoLAN project.
2.4 Caching
Caching [20] is the part of a system that tries to optimize the use of the available
resources. It decreases system load and increases system efficiency. This is
generally performed by caching hot data, that is, by copying recently accessed data into
primary and faster memory. The idea is that a future use of the cached data results in an
access to the copy in fast memory, rather than gathering it from secondary memory
with its relatively long access times. Average access time and system load can thus be reduced.
In a multimedia system, storage and bandwidth are critical resources. Any presenta-
tion requires a large volume of data compared to traditional applications. Caching is
implemented in such an architecture in many different ways and parts. Nevertheless,
the overall objective remains to improve the total utilization of the system's resources and
to provide higher effective throughput.
A closely related method for data prefetching is buffering. However, there is an
important difference between caching and buffering. The difference concerns the perfor-
mance objectives, the application requirements and the usage of primary storage.
Buffering is used to avoid access delay. Caching is used to avoid access overhead
and/or delay. If data blocks are prefetched on behalf of the currently consuming
data stream, the process is regarded as buffering. Caching also prefetches data blocks,
but retains them in memory for future data streams, even after they are delivered to the
current stream.
2.4.1 Caching objectives
As explained previously, caching tries to optimize the use of the available resources,
decrement the system load and increase system efficiency. This leads to the following
objectives:
1. The first caching objective is to increase server capacity. Capacity is measured
by the number of concurrent requests that can be served. Even if the CPU or the
network interfaces do not become the bottleneck, the retrieval path from the storage
devices can. Therefore, by storing parts of frequently accessed multimedia
objects in the server's primary memory, capacity is gained. This leads to the next
objective.
2. The second objective is to reduce access latency. Access time changes with the
location of the data in the storage hierarchy. Data in main memory can be deliv-
ered almost instantly.
3. The third objective is to balance load across storage devices. If there is more
than one server in a multimedia architecture, the initial placement of current and
popular multimedia objects can lead to a bottleneck. Caching is introduced in
such an architecture to share these popular objects among a group of servers.
2.4.2 Memory caching policies
Multimedia caching systems handle three issues: the dynamically changing workload
which arises from large and small multimedia files, the remote control operations, and
the integration with other resource optimization policies.
Well-known caching policies used in operating systems or databases retain unrelated
but frequently accessed data blocks in the cache. However, these methods would require
the media objects to be cached in their entirety, since only this could guarantee continuous
delivery of the media objects. But the relatively large size of most media objects makes
add and replace operations very expensive. Therefore, a multimedia object caching
attempt must take advantage of the sequential access patterns in long media data and
of the knowledge of the relationships across delivered streams. A relatively small cache can
be used by retaining only a short duration and the relevant fragments of a media object.
Taking advantage of the sequential access pattern is common to multimedia caching
policies.
Multimedia streams are controlled and prefetched in order to guarantee continuous and
jitter-free delivery. The prefetching requires the buffers to be refreshed in a periodic
manner. Typically, a small number of buffers is allocated for each stream. This obser-
vation leads to the idea of serving several streams from the same buffer, especially if
they read from the same multimedia object. In other words, if multiple streams access
the same multimedia object and a buffer is allocated for the preceding stream, the sub-
sequent streams can be served from the same buffer. No further prefetching is necessary
for the following streams. Therefore, the number of readers from the cache is controlled
so that the buffer refreshment of the preceding streams does not overwrite any blocks
still to be consumed by the succeeding streams. However, the buffer boundaries are limited.
The prefetching for the first stream must continue by replacing data at the beginning of the
buffer once the buffer end is reached. Thus, a time window is created between the first
stream and the last following stream. This time window is called the enrollment win-
dow. Further enrollment into this window can only be accepted if the buffer refreshment
does not overwrite blocks that are read by followers. The policy can be improved by
partitioning the buffer and allocating each buffer partition to a group of streams. This
results in multiple enrollment windows. Each group is served from the same buffer
partition. The groups contain streams that are simultaneous or very close in time. Still,
not all requests can result in a cache hit; some need to be served directly from secondary
memory. Therefore, there ought to be cache replacement policies which ensure a satisfying
utilization of all the server's available memory resources. Two methods shall be listed.
1. The buffer containing the block that would not be accessed for the longest period
of time by the existing progressing clients is replaced. This replacement policy
is called BASIC.
2. Between two consecutive clients, a distance in data blocks can be calculated. A
client with no successor is considered to have a very large distance. The buffer of
the client with the largest distance is selected for replacement. DISTANCE is the
name of this replacement policy.
While the BASIC policy requires a global view of all buffer usage in order to
keep a buffer access table up to date, the DISTANCE policy has a disadvantage in
each service cycle: when a client stops or pauses, or a new stream arrives, all buffers are
freed, the clients are reordered, and the buffers are subsequently reallocated.
Interval Caching (IC) tries to get rid of this global view of all streams. The responsibil-
ity for buffer replacement is moved to the individual streams. In Interval Caching,
the data blocks between a pair of consecutive streams accessing the same multimedia
object are referred to as an interval. The two streams associated with an interval are called
the preceding and the following stream. By caching a running interval, the follow-
ing stream can be served from the cache using the blocks brought in by the preceding
stream. The size of the interval is estimated as the time difference between the two
streams reading the same block. The number of blocks needed to store a running inter-
val is defined to be the cache requirement of an interval. In order to maximize the cache
hit ratio and minimize the access to slow memory, the interval caching policy sorts the
intervals in terms of interval sizes in the time domain and not in terms of memory re-
quirements. It caches only the shortest time intervals. IC is regarded as the optimal
policy for multimedia caching, where optimality is defined as retaining the
data blocks to be accessed the earliest. Changes in the cache interval arrangement only
occur due to the arrival of a new stream or the termination of a current stream.
Figure 5: Illustration of the Interval Caching policy. Sxy represents stream x on media object y, bxy the buffer requested by Sxy.
Figure 5 shows the following IC situation: there are seven streams reading from three
different media objects. Some of the streams request the same media object, such as
S12 and S22. Therefore, intervals can be identified and their buffer requirements bxy can
be estimated. The question is which intervals can read from the cache. In the displayed
example, all preceding streams must read directly from the media object. Streams S11,
S12 and S21 buffer for the following stream. The dashed arrows indicate streams that
can read from the cache. S13 cannot buffer for its follower because the interval and/or
buffer requirement between S13 and S14 was considered too large.
Two important functions are defined in Interval Caching. The open request is called on
the arrival of a new stream. First, the size of the new interval and its cache requirement
are computed. Next, the cache management checks if the new interval can be served
from the cache. Finally, an algorithm determines if the interval is desirable to cache. That
is, open computes the total replaceable cache space from the free pool and the less
desirable intervals with larger interval sizes. If the cache requirement of the new interval
can be satisfied, the cache management rearranges the available buffer. However, if the
interval is worth caching but there is not enough free memory, it is marked predicted
and added to the ordered interval list, waiting for free space.
Algorithm 1 open for an interval caching policy.
Form new interval with previous stream;
Compute interval size and cache requirement;
Reorder interval list;
If not already cached
    If space available
        Cache the new interval;
    else if this interval is smaller than existing cached intervals
            and sufficient cache space can be released
        Release cache space from larger intervals;
        Cache the new interval;
The close request is issued by an ending multimedia stream. Provided that the
closed stream is not leading the group, the interval is deleted from the reordered list
and the associated allocated cache space is freed. Finally, the next smallest uncached and
predicted interval is considered for caching by the allocation algorithm described with
the open request.
Algorithm 2 close for an interval caching policy.
If following stream of a real interval   // real means the interval resides in cache and is not predicted
    Delete the interval;
    Free allocated cache;
    If next largest interval can be cached
        Cache next largest interval;
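Algorithms 1 and 2 can be condensed into a small Java sketch. This is an illustration only: all names are invented, the interval list is kept sorted by interval size (shortest first, i.e. most desirable), and the synchronization with the streaming threads that a real implementation needs is omitted.

```java
import java.util.*;

public final class IntervalCache {
    public static final class Interval {
        public final String id;
        public final int size;   // blocks between follower and predecessor
                                 // = cache requirement
        public boolean cached;   // false while merely "predicted"
        public Interval(String id, int size) { this.id = id; this.size = size; }
    }

    private final int capacity;  // total cache blocks
    private int used;
    private final List<Interval> intervals = new ArrayList<>(); // sorted by size

    public IntervalCache(int capacity) { this.capacity = capacity; }

    /** Algorithm 1: a new stream forms an interval with its predecessor. */
    public void open(Interval iv) {
        intervals.add(iv);
        intervals.sort(Comparator.comparingInt(i -> i.size)); // reorder list
        if (used + iv.size <= capacity) {                     // space available
            iv.cached = true;
            used += iv.size;
            return;
        }
        // space releasable from larger (less desirable) cached intervals
        int releasable = 0;
        for (Interval cand : intervals) {
            if (cand.cached && cand.size > iv.size) releasable += cand.size;
        }
        if (capacity - used + releasable >= iv.size) {
            for (int i = intervals.size() - 1;
                 i >= 0 && used + iv.size > capacity; i--) {
                Interval big = intervals.get(i);
                if (big.cached && big.size > iv.size) {
                    big.cached = false;
                    used -= big.size;
                }
            }
            iv.cached = true;
            used += iv.size;
        }
        // otherwise iv stays uncached ("predicted"), waiting for free space
    }

    /** Algorithm 2: the following stream of an interval terminates. */
    public void close(Interval iv) {
        intervals.remove(iv);
        if (iv.cached) {
            used -= iv.size;
            iv.cached = false;
        }
        for (Interval cand : intervals) { // admit next smallest predicted one
            if (!cand.cached && used + cand.size <= capacity) {
                cand.cached = true;
                used += cand.size;
                break;
            }
        }
    }

    public int usedBlocks() { return used; }
}
```

For example, with a 10-block cache, opening intervals of sizes 4, 6 and 3 caches the first two, then evicts the 6-block interval in favor of the shorter 3-block one; closing that interval re-admits the 6-block interval.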
3 Design and architecture
Using the results of the research, the upcoming section describes the project plan-
ning phase. Starting with the project's concept, a requirement analysis is made. A deploy-
ment strategy is created and the static structures, including the packages with their classes
and the database, are defined. Afterward, the dynamic behavior is analyzed. This in-
cludes the server's stream setup, the authentication process and the programming patterns.
Considerations on implementation challenges conclude the section.
Figure 6: Project's concept diagram.
Figure 6 shows all parts involved in the project. The project is based on a client/server
architecture. There is only one server holding media files. It is linked to a database
holding accounting information. The server provides multimedia streams and multime-
dia information for different clients with various connection profiles. The streams are
transmitted and controlled using an RTSP/RTP streaming server. The whole concept is
analyzed in detail in the following sections.
3.1 Requirement analysis
The requirement analysis lists all the features of the project and characterizes their
individual requirements. Some diagrams shall help the reader to understand the content
of the sections.
3.1.1 Media management and control
Figure 7: Use case for usage of system, management and live control by different users.
In a multimedia architecture, there is always a demand for management of and control over
the stored and presented media objects. Therefore, the features listed below are added to
the server of the project:
1. The multimedia server hosts a file system with media files that can be updated by
the client. The update operations include adding, updating and removing media
information.
2. A database stores important media properties. These include a unique media
name, the file location of the media on the server, important caching information and
other media information such as date, artist, genre, etc.
3. In order to provide different network profiles, network connection speed estima-
tion is implemented. On connection, a client is assigned to a profile and can only access
appropriate media objects. Furthermore, all media objects belong to a certain pro-
file. This adds additional media object access control to the server.
4. An authentication system with user management is necessary. A normal user
should not be able to remove, add or alter media objects and, especially, the
streaming-relevant data properties of a media object.
5. A means of control over the multimedia streams is required. This concerns fea-
tures such as querying media properties, playing (streaming live) media or issuing
VCR commands.
Figure 7 shows the features of the project and their interfaces to the different user
groups. As usual, only the user group called "Administrator" is able to access all fea-
tures of the system explained in 1-5. The "Editor" group is allowed to authenticate,
update media and control streams. The normal user "User" can only control media con-
tent streams. External clients such as the RealNetworks or QuickTime clients are regarded
as normal users.
3.1.2 RTSP/RTP server
Using the RTSP explained in Subsection 2.1.1, an RTSP-enhanced communication server
is created. Using the Java technologies for binding, listening and accepting, the server
expects a client request at a socket. The client request must be properly parsed. Fake,
wrong and out-of-version requests are detected and answered with the RTSP-defined
error messages. A correct call results in a media information response, the setup of a
session or a control action on the current stream. Streams are packed according to
the packet format model of the RTP described in Subsection 2.1.2. The JMF includes
libraries for RTP communication. It provides RTP packetizers for some well-known
codecs. All media is streamed with the help of RTP transmissions. RTCP packets are
used for streaming statistics and end-of-communication detection.
3.1.3 Interaction with RealNetworks and QuickTime clients
Figure 8: Use case for RealNetworks, QuickTime and custom client interaction.
It is required that the server transmits streams not only to the project's client, but can as
well open streams to the recent clients from RealNetworks and QuickTime. Both clients
can use RTSP for a stream setup. Therefore, RTSP is the standard communication
protocol in the project's client/server architecture. RTSP communication logs between
RealNetworks' client and server, described in Subsection 2.3.2, and Apple's QuickTime
client and server, described in Subsection 2.3.1, must be intercepted, recorded and ana-
lyzed. The differences are listed and bypassed if possible. The objective is an RTSP
server that can communicate with all mentioned clients. No extra effort is made to
integrate the RealNetworks and QuickTime clients with all the features of the server, in
particular the features described by 1-4 in the previous Subsection 3.1.1.
3.1.4 JMF MPEG-4 RTP integration
Another goal of this project is packing MPEG-4 data into RTP packets. The payload
type list in [2] shows that MPEG-4 has no defined static RTP payload type. Hence, based
on the mentioned RFC, MPEG-4 belongs to the dynamic payload types beginning
with payload type 96. However, MPEG-4 is so popular that an RFC description [13]
is available on how to pack the codec into RTP packets.
3.1.5 Caching
The integration of the IC policy is a result of the research on caching in Subsec-
tion 2.4. No effort is made to implement a distributed caching policy, since
only one server runs in this client/server architecture. Clients accessing the same
media object are served, whenever possible, from primary memory. The IC policy re-
quires the implementation of replacement strategies similar to Algorithms 1 and 2. These
strategies require exact synchronization between the different server worker threads, the
cache and the buffer management. Furthermore, the caching interface needs to be pre-
cisely integrated into the server's RTP streaming queue.
3.1.6 UI enhancement
The JMF contains the JMStudio, a multifunctional multimedia client with a UI. The JM-
Studio provides all the standard media operations such as opening, playing or closing
a media object. It is equipped with streaming functionality and, more importantly, it
implements communication with servers that speak the RTSP. It can connect to the
Helix Server mentioned in Subsection 2.3.2 and the Darwin Streaming Server of Sub-
section 2.3.1. The code of the JMStudio is available and already offers many features
the client of the project requires. It is integrated into the project by extending it with
the list of functions below:
1. Simulated live: The client presents a list of media streams to the user. These
media streams can then be selected and set up for a streaming transmission from
server to client. The original JMStudio provides the display of visual content and
the output of audio.
2. User administration: Another group of UIs is created for user administration.
These UIs provide operations such as adding, updating and deleting a user. This
also demands different user groups and access rights.
3. User authentication: A login dialog authenticates the user of the enhanced JM-
Studio to the server. Gathered user information is stored by the client
and used whenever authentication or encrypted information interchange is re-
quested by the server.
4. Media editing: A group of UIs allows changing the media object properties
stored in the database tables. There exist two classes of media object properties:
media properties which hold transmission-critical information and media proper-
ties which hold accounting information. Transmission-critical media properties
are regarded as properties which influence the transmission process. To this class of
properties belong the unique media object name, the file location on the server
and the caching information. Accounting media properties are properties
such as genre, artist, profile, etc.
5. Adding media: Moreover, some UIs enable updating the database with newly
added media objects residing on the server's file system. A selection and insertion
dialog is implemented.
3.1.7 Media capturing timer
A Java timer loops periodically in order to provide the possibility of capturing from pre-
defined devices connected to the server. Timers are set using a client UI. The
idea is to connect the server to a satellite TV set and capture a stream from the preset
channel to a file sink. This requires the setup of a processing queue. The queue starts
with a source of a continuous stream of data. Thereafter, a processor packs the stream
into a container format. Finally, the queue ends with this formatted stream being written
to a file in the server's media file directory. Furthermore, an entry for the new media
file is created in the server's database. This allows instant access to the newly registered
media object.
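The timer's core decision, namely which capture entries are due at a given check, can be sketched as follows. The field names mirror the vcrentry table description in Subsection 3.3.3 but are otherwise assumptions; the JMF processing queue setup itself is not shown.

```java
import java.util.*;

public final class CaptureTimer {
    /** One row of the vcrentry table (field names assumed). */
    public static final class VcrEntry {
        public final long startTime, stopTime;  // epoch milliseconds
        public final String channel;
        public VcrEntry(long startTime, long stopTime, String channel) {
            this.startTime = startTime;
            this.stopTime = stopTime;
            this.channel = channel;
        }
    }

    /** Entries whose capture window contains 'now'. In the server, each
     *  hit would trigger the DataSource -> Processor -> DataSink
     *  processing queue setup for the preset channel. */
    public static List<VcrEntry> due(List<VcrEntry> entries, long now) {
        List<VcrEntry> hits = new ArrayList<>();
        for (VcrEntry e : entries) {
            if (e.startTime <= now && now < e.stopTime) {
                hits.add(e);
            }
        }
        return hits;
    }
}
```

Such a check would run periodically, for instance from a java.util.Timer scheduled at a fixed rate, with each hit starting the capture queue for its channel.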
3.2 Deployment strategy
Figure 9: Project's deployment diagram.
The deployment diagram in Figure 9 represents a client/server architecture. The com-
munication from client to server and vice versa is based solely on the RTSP. Recognizable
also are the libraries bound to the project. There is the jmf.jar required by both the client
and the server software. The database tables are managed and served by a MySQL database
running on the server's hardware. The connection to the database uses the Java JDBC in-
terface (mysql-connector.jar). The server's configuration file is in XML format.
The necessary client configuration requires the existing JMStudio initialization file to
be adjusted. Furthermore, separately attached to the project's client is the mp3plugin,
allowing limited MPEG-1 Audio Layer 3 decoding.
3.3 Static structure
The following section presents to the reader the two main packages that can easily be
identified in the deployment diagram shown in Figure 9. These packages are a server
package and a client or player package. The implementation Section 4 explains the
missing packages. This concerns generally packages which contain interfaces to both
server and client, such as the MPEG-4 packetizer package or the util package.
3.3.1 The server package
Figure 10: Server package model.
The server package includes all the server-relevant parts. These are the server itself with
the listening threads, a verifier for the RFC 2326 protocol and the extended RTSP methods.
Moreover, the package contains the RTP streaming system, the VCR timer thread for
capturing and the RTSP session management.
Figure 11: The Server class model.
Server: On start, the server parses the configuration file and sets the default values. The
main() function of the server is to listen in a loop on a TCP socket. This socket is bound
to an initialized port number and the host's IP address, expecting client calls. A call
results in the dispatching of a worker thread. Therefore, the server has properties that
describe the maximum number of threads available to the clients, a connection timeout value
and a default RTSP port number.
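The bind/listen/dispatch cycle described above can be sketched with standard Java sockets. This is a minimal illustration, not the project's Server class: the reply is hard-coded, and a real worker would hand the request line over to the RTSP parser instead.

```java
import java.io.*;
import java.net.*;
import java.util.concurrent.*;

public final class MiniServer {
    private final ServerSocket socket;
    private final ExecutorService workers;

    public MiniServer(int port, int maxThreads) throws IOException {
        socket = new ServerSocket(port);     // bind; port 0 picks a free port
        workers = Executors.newFixedThreadPool(maxThreads, r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);               // workers must not block JVM exit
            return t;
        });
    }

    public int port() { return socket.getLocalPort(); }

    /** One accept: dispatches a worker thread for the next client call. */
    public void acceptOnce() throws IOException {
        final Socket client = socket.accept();
        workers.submit(() -> {
            try (client;
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(client.getInputStream()));
                 PrintWriter out = new PrintWriter(client.getOutputStream(), true)) {
                client.setSoTimeout(5_000);  // predefined connection timeout
                String requestLine = in.readLine(); // a real worker parses RTSP here
                if (requestLine != null) {
                    out.println("RTSP/1.0 200 OK"); // hard-coded demo reply
                }
            } catch (IOException e) {
                // a client closing early must not interrupt the server daemon
            }
        });
    }
}
```

The thread pool bounds the number of concurrent client workers, corresponding to the maximum-threads server property mentioned above.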
Figure 12: The ServerThread class model.
ServerThread: The ServerThread represents the worker thread dispatched by the server
daemon. The function of a thread is to parse the RTSP message. The message is split
into the relevant parts: the request line, the header fields and the
message body. All parsed lines are validated in an instance created by ServerThread,
namely the RTSPProtocol. The ServerThread keeps the connection to the requesting
client until a predefined timeout elapses. If still open, the RTSP response is sent back
through the connection to the client. If the client shuts the socket before a response
could be sent, exception handling prevents any interruption of the server daemon.
Figure 13: The RTSPProtocol class model.
RTSPProtocol: A client request is verified against the RTSP message syntax. This
verification is the main purpose of the RTSPProtocol. A client request always
starts with an RTSP method, followed by a space, a correct RTSP URL, another space
and the RTSP version:
Request-Line = Method SP Request-URI SP
RTSP-Version CRLF
SP stands for space, and CRLF for a carriage return (CR) followed by a line feed (LF). The
header part of the request must contain a sequence number according to the RTSP suite,
such as "CSeq: 1". Not all other defined header fields are known to the RTSPProtocol.
Only the understood fields are evaluated; the others are ignored. A response is assem-
bled for any request. The response to the client is composed using the RTSP response
syntax. An RTSP response consists of a Status-Line, headers and a message body. The
message body is optional. A status line consists of the RTSP-Version, a space, the Status-Code,
a space and the Reason-Phrase:
Status-Line = RTSP-Version SP Status-Code SP
Reason-Phrase CRLF
The response header must return the received sequence number. Other header fields are
a date/time field and an identification field containing the server's name. The message
body is empty unless an RTSP DESCRIBE method has been received. In the event of
an RTSP DESCRIBE request, an SDP-compliant message is returned to the client. The
protocols implemented by the RTSPProtocol are explained in Subsection 2.1.1.
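The Request-Line and Status-Line grammar quoted above lends itself to a compact verifier. The sketch below is illustrative; the set of accepted methods and the helper signatures are assumptions, not the thesis's RTSPProtocol code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public final class RequestLineVerifier {
    // Request-Line = Method SP Request-URI SP RTSP-Version CRLF
    // (the trailing CRLF is assumed to be stripped by the line reader)
    private static final Pattern REQUEST_LINE = Pattern.compile(
            "(OPTIONS|DESCRIBE|SETUP|PLAY|PAUSE|TEARDOWN)\\s" // Method SP
            + "(rtsp://\\S+|\\*)\\s"                           // Request-URI SP
            + "RTSP/1\\.0");                                   // RTSP-Version

    /** Returns the method, or null for malformed or out-of-version lines. */
    public static String method(String requestLine) {
        Matcher m = REQUEST_LINE.matcher(requestLine);
        return m.matches() ? m.group(1) : null;
    }

    /** Status-Line = RTSP-Version SP Status-Code SP Reason-Phrase CRLF */
    public static String statusLine(int code, String reason) {
        return "RTSP/1.0 " + code + " " + reason + "\r\n";
    }
}
```

A request for RTSP/2.0 or an unknown method fails the match, which corresponds to answering fake, wrong and out-of-version requests with an RTSP error status.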
Figure 14: The RTPTransmitter class model.
RTPTransmitter: The main function of the RTPTransmitter is to provide an RTP trans-
mission. The RTPTransmitter is invoked by the RTSPProtocol if the client request
contains a stream control request. A stream control request can lead to the setup of an
RTP processing queue by the RTPTransmitter. Furthermore, RTCP stream control feed-
back must be collected and analyzed by the RTPTransmitter in order to adjust its RTP
transmission. The RTPTransmitter also provides methods to stop, play, and skip forward or
backward in the transmitted RTP streams. The RTP suite is introduced in Subsection
2.1.2.
Figure 15: The Session class model.
Session: Subsection 2.1.1 explained that a session is necessary for RTSP state
management. The RTP transmission-relevant data structures, like the packetizer or stream
manager, are stored in a session instance joined with RTSP session information such as
the session identifier or the last RTSP command (method) identifier. Furthermore, if a session
state change is requested by the client, it must be verified.
Figure 16: The VCRTimer class model.
VCRTimer: This is a periodically called timer which checks database entries for a times-
tamp (startTime) in order to set up the processing queue described in Subsec-
tion 3.1.7. The VCRTimer holds the queue's DataSource, Processor and DataSink.
3.3.2 The client package
Figure 17: The client package model.
The client is composed of the updated JMStudio code and newly integrated dialogs.
Furthermore, it has a communication interface to the server for extended RTSP com-
munication. The newly integrated UIs are bundled in a separate menu embedded into
JMStudio's main window menu list. The new UIs handle the authentication, the editing
of media and users, and the setting of VCR timers. Additionally, a table with media browsing,
searching, sorting and on-demand play ability is integrated.
Figure 18: The AuthenticationUI classes model.
AuthenticationUI: There is a number of UIs which require authentication. To this group
belong a UI for the login process, a UI that allows additions and changes to the
stored user list, and a UI that enables adding, altering and deleting media information.
Encryption must be implemented to allow secure transactions.
Figure 19: The MediaBrowserUI class model.
MediaBrowserUI: A table is used to present the media information stored in
the server's database to the user. The table content is adjusted by a database query
result. Therefore, searching and sorting operations on the table's content are available.
To avoid congestion of the server's resources, only the first five entries matching the
query are returned.
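The five-entry cap could, for instance, be pushed into the query itself with MySQL's LIMIT clause. The statement below is a sketch: the table and column names follow the database description in Subsection 3.3.3, while the sort column and the use of a PreparedStatement placeholder are assumptions.

```java
public final class MediaQuery {
    /** The SELECT used to fill the browser table. MySQL's LIMIT clause
     *  enforces the five-result cap; the '?' placeholder is meant for a
     *  JDBC PreparedStatement, so no manual escaping is needed here. */
    public static String byGenre() {
        return "SELECT title, artist, year FROM mediafile "
             + "WHERE genre = ? ORDER BY title LIMIT 5";
    }
}
```

Limiting on the database side means the server never materializes more rows than the UI will display.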
Figure 20: The RTSPFunctions class model.
RTSPFunctions: This class implements the communication interface between the client's
administrative and accounting extensions and the server. RTSP is extended with com-
mands that allow database queries and updates. A communication starts with a client
UI event, such as a button press, which RTSPFunctions translates into an RTSP statement
and finally transmits over a socket connection.
RTSPFunctions waits for the server's RTSP response until a certain timeout elapses. It
parses and checks the result and triggers feedback or a UI update if required.
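As an illustration of such a translation, the sketch below assembles an extended RTSP request from a UI search event. GET_PARAMETER is a standard RTSP method, but the body format for database queries is purely hypothetical; the thesis does not specify its extension syntax.

```java
public final class RtspRequestBuilder {
    /** Translates a UI search event into an extended RTSP request.
     *  GET_PARAMETER is standard RTSP; the "search:" body line is a
     *  hypothetical query format. Content-Length assumes a single-byte
     *  character encoding. */
    public static String mediaQuery(String serverUrl, int cseq, String search) {
        String body = "search: " + search + "\r\n";
        return "GET_PARAMETER " + serverUrl + " RTSP/1.0\r\n"
             + "CSeq: " + cseq + "\r\n"
             + "Content-Length: " + body.length() + "\r\n"
             + "\r\n"
             + body;
    }
}
```

The CSeq header carries the sequence number that the server must echo in its response, which lets the client match timed-out requests against late replies.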
Figure 21: The AuthenticationData class model.
AuthenticationData: This class stores the user credentials necessary for authenticated
and encrypted communication. This supports the aforementioned AuthenticationUIs
as follows. Firstly, AuthenticationData stores authentication data in order to allow re-
stricted UI handling, such as disabling some menu entries, buttons or entire UIs. Sec-
ondly, RTSPFunctions uses AuthenticationData to get a session key for extended and
encrypted RTSP communication.
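One plausible way to derive such a session key, in the style of HTTP/RTSP digest authentication, is hashing the credentials together with a server nonce. The scheme below, MD5 over "username:nonce:password", is an assumption for illustration; the thesis does not specify the exact algorithm.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public final class AuthenticationData {
    private final String username;
    private final String password;

    public AuthenticationData(String username, String password) {
        this.username = username;
        this.password = password;
    }

    /** Derives a hex session key from the credentials and a server
     *  nonce. The MD5 "username:nonce:password" scheme is an assumed
     *  illustration, not the thesis's actual algorithm. */
    public String sessionKey(String serverNonce) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(
                    (username + ":" + serverNonce + ":" + password).getBytes());
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 unavailable", e);
        }
    }
}
```

Because the nonce is chosen by the server per session, the same credentials yield a different key for every session, so the password itself never crosses the wire.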
3.3.3 Database tables
Figure 22: The database tables.
The database supplies the server with information about the media objects. This media
information includes the location of the objects on the file system and other accounting in-
formation such as year, genre or type. The rest of the information saved in the database
is necessary for user authentication and accounting. The vcrentry table holds capture
timer entries.
mediafile: This table stores the unique media object identification (title) as a primary key and the stream location on the file system. Moreover, it stores the editable media information. It is accessed on any media object information query. The media information is essential for RTSP session description and setup.
user: This table contains the user information necessary to authenticate at the server. Three fields are used: a user is identified by username, the password field authenticates the user and, finally, the rights field assigns the user to a group.
usertype: This table lists all user groups, such as administrator or editor, and allows new groups to be defined.
genre: This table provides predefined genres for audio and video streams. Additionally, new genres can be added to this table.
vcrentry: The last table holds capturing timer entries. It provides the information for the device capturing explained in Subsection 3.1.7. A row in this table stores the timer's start and stop times.
3.4 Dynamic behavior
After presenting the static structure of the project, comprising the server and client packages and the database tables, this section focuses on the dynamic behavior of the system.
Figure 23: Dynamic behavior overview (without error feedback).
Figure 23 shows the dynamic process flows of the system in detail. Most process flows depend on user interaction. The figure clearly indicates that the server and client side are two independent environments. Therefore, interaction only works if both are able to maintain synchronized communication over the network. Both sides need to be started on their system before communication can be established. The server and the client have their own start and final states. The server is in its start state when it listens on a socket; the client waiting for user interaction indicates the client-side start state. Once user interaction, such as a clicked button, is captured by the client, it transforms the user request into a corresponding RTSP message. This is only the case if the request is directed to the server. In that case, a connection is opened to the server side and the RTSP message is transmitted. The listening server parses the incoming socket stream. Next, a decision is made whether the message belongs to an administration task or a standard RTSP stream request. The first case results in a database query or update. In the second case, a further decision is made whether the request gathers media information, such as an RTSP DESCRIBE method, or controls the media stream, such as an RTSP PLAY or TEARDOWN method. A response for the client is created and sent if the client is still waiting for one. The client processes the server's response and decides how the result is presented to the user.
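The server's branching decision can be sketched as a simple dispatch on the RTSP method name. The method sets below are taken from Subsections 2.1.1 and 3.4.2; the grouping into three kinds is an illustrative simplification of the decision in the figure:

```java
public class RequestDispatcher {
    public enum Kind { ADMIN, INFO, STREAM_CONTROL, UNKNOWN }

    // Mirrors the decision described in the text: extended methods are
    // administration tasks, DESCRIBE/OPTIONS gather information, the
    // remaining standard methods control the stream.
    public static Kind classify(String method) {
        switch (method) {
            case "ADMINREQUEST":
            case "AUTHENTICATE":
            case "SESSIONKEY":
            case "SQLREQUEST":
            case "MULTIUNICAST":
                return Kind.ADMIN;
            case "DESCRIBE":
            case "OPTIONS":
                return Kind.INFO;
            case "SETUP":
            case "PLAY":
            case "PAUSE":
            case "TEARDOWN":
                return Kind.STREAM_CONTROL;
            default:
                return Kind.UNKNOWN;
        }
    }
}
```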
The model above does not contain the error and exception cases on either side. On the server side these cases usually lead to an error response to the client, which is presented to the user in an error feedback dialog.
3.4.1 RTSP/RTP stream setup
Figure 24: Streaming setup sequence flow diagram.
The sequence flow diagram in Figure 24 shows the sequence from the beginning to
the end of a media object stream setup. This sequence is executed on a client’s RTSP
SETUP request.
When a client connects to the server, a worker thread, ServerThread, is dispatched. After parsing the request, the protocol handler must verify that the request is correct and an implemented RTSP request. If the client desires to initiate a session, the protocol handler RTSPProtocol will start a new session and return the session identification sessionid to the client via the connection. A session is created by the protocol handler through a session interface. The function of this interface is to create a new sessionid and to allocate memory for all required communication instances. The new sessionid must be random and unique in the SessionContainer. Therefore, a control method must be called before inserting the new session. A successful session setup leads to the initialization of all RTP instances.
Figure 24 above represents a single RTSP SETUP request. The first SETUP request leads to a session creation on the server, identified by a unique sessionid. The sessionid is transmitted to the client in the response to the first SETUP request. Henceforth, the client must add the sessionid to all following session-dependent RTSP requests. Session-dependent requests included in the project are PLAY, PAUSE, TEARDOWN and SETUP. A second SETUP request might be necessary in order to set up more than one media track. A second SETUP expects the same sessionid. This allows the RTP streams that originate from the media object's tracks to start synchronized. This means a session does not contain only one stream but belongs to a whole media object with possibly more than one stream. An RTSP SETUP request requires an RTSP sequence number, a session identification, a stream identifier and transport information. The transport information tells the server which ports have been allocated at the client. Prior to stream transmission, the client allocates an RTP stream port and an RTCP feedback port. Furthermore, the setup request contains transport preferences such as unicast or multicast. The server must also respond with transport information. This information tells the client at which port the RTP packets will leave the server. Moreover, it informs the client about all available transport options. The flow diagram in Figure 24 corresponds to the request presented below:
SETUP rtsp://example.com/foo/bar/baz.rm RTSP/1.0
CSeq: 302
Transport: RTP/AVP;unicast;client_port=4588-4589
And a response:
RTSP/1.0 200 OK
CSeq: 302
Date: 23 Jan 1997 15:35:06 GMT
Session: 47112344
Transport: RTP/AVP;unicast;client_port=4588-4589;
server_port=6256-6257
As explained in Subsection 2.1.2, the next higher (odd) port indicates the RTCP communication port. In the sample communication above, the server will stream data packets from port 6256 and listen for RTCP feedback on port 6257.
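Extracting the port pair from such a Transport header is a small parsing step. A sketch following the header layout shown in the example above; the class and method names are hypothetical:

```java
public class TransportHeader {
    // Extracts the client_port range from a Transport header line, e.g.
    // "Transport: RTP/AVP;unicast;client_port=4588-4589".
    // Returns {rtpPort, rtcpPort}; by convention the odd port carries RTCP.
    public static int[] clientPorts(String header) {
        for (String part : header.split(";")) {
            part = part.trim();
            if (part.startsWith("client_port=")) {
                String[] range = part.substring("client_port=".length()).split("-");
                int rtp = Integer.parseInt(range[0]);
                int rtcp = range.length > 1 ? Integer.parseInt(range[1]) : rtp + 1;
                return new int[] { rtp, rtcp };
            }
        }
        throw new IllegalArgumentException("no client_port in header");
    }
}
```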
3.4.2 Extended RTSP
In addition to the RTSP methods OPTIONS, DESCRIBE, SETUP, PLAY, TEARDOWN and PAUSE defined in [1], an extended RTSP suite enhances the project with authentication and media object accounting services. Therefore, new RTSP methods need to be introduced and appropriate procedures written. The extended RTSP is implemented in the RTSPProtocol class on the server's side and in RTSPFunctions on the client's side. The newly introduced RTSP methods are listed below:
AUTHENTICATE: This method allows the user to authenticate to the system. On the client's side, the username and the encrypted password are marshaled into an authentication RTSP message. The server extracts the username, decrypts the password and compares the pair to the user information stored in its database. A correct authentication message results in a key exchange for further authenticated communication. See Subsection 3.4.3 for more information.
SESSIONKEY: After authentication, any further authenticated request first results in a session key exchange. This adds encrypted communication for administrative and accounting tasks to the system. The encryption is performed by encrypting the relevant RTSP header properties with the session key.
ADMINREQUEST: All administrative tasks are identified by the ADMINREQUEST method. This request can only be issued by authenticated users. Therefore it requires a session key exchange first, and encryption of the relevant message parts afterward. Administrative tasks are the adding, deleting or updating of database information.
MULTIUNICAST: This method can be called by all users. It returns a result set of all running or waiting multi-unicast sessions in the system. This allows the project's client to join a multi-unicast session.
SQLREQUEST: The SQLREQUEST method is also available to all users running the project's client. It is generally used to submit database queries that do not include update operations, such as querying or selecting media information or reading entire table contents, e.g. the genre table. No part of this message is encrypted.
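An extended-method request follows the same wire format as the standard RTSP messages shown earlier. A hypothetical sketch of assembling an ADMINREQUEST; the "key" and "task" header names are assumptions for illustration, not the thesis' actual field names:

```java
public class ExtendedRtspRequest {
    // Assembles an ADMINREQUEST message in the same shape as the standard
    // RTSP messages: request line, headers, blank line. The "key" and
    // "task" header names are illustrative assumptions.
    public static String adminRequest(String url, int cseq,
                                      String sessionKeyId, String encryptedTask) {
        return "ADMINREQUEST " + url + " RTSP/1.0\r\n"
             + "CSeq: " + cseq + "\r\n"
             + "key: " + sessionKeyId + "\r\n"
             + "task: " + encryptedTask + "\r\n"
             + "\r\n";
    }
}
```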
3.4.3 Authentication process
Figure 25: Activity flow on authentication. Please note that the AUTHENTICATE method is a newly introduced RTSP method that enables authentication over RTSP. See Subsection 3.4.2.
Figure 25 above shows all process communication involved in the authentication process. On the client's side, the RTSPFunctions class introduced in Subsection 3.3.2 handles the extended RTSP communication. The client presents an AuthenticationUI, namely the LoginUI, to the user. The login button event leads to an encryption of the password. It is embedded in an envelope which pads short passwords with random ASCII symbols. Next, an RTSP message containing the AUTHENTICATE method is put together.
An AUTHENTICATE message from client to server looks like this:
AUTHENTICATE rtsp://192.168.1.2:1234/path/file RTSP/1.0
CSeq: 1
user: Administrator
auth: oqtiIrBVECw9/rsjMKKdQSx79gQg14X9jvMV73QeF...
The fictional URL "rtsp://192.168.1.2:1234/path/file" is used to make sure the host information is correct. The "auth" property holds the encrypted and padded password. A socket connection to the server is opened and the message is sent. The server starts a worker thread that reads from the socket and validates the message. A correct AUTHENTICATE message leads to a database query that verifies the existence of the user. If verification succeeds, the server provides the client with the individual rights and adds the user to a group. An encrypted key for further authenticated communication is exchanged. Any error on the server side leads to an RTSP error message. The client parses the server message, informs the user and updates the UI menus to match the user's group. The decrypted key, together with the rights, is saved in AuthenticationData. Any further action that requires authentication uses the key for secure communication.
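The padding envelope mentioned above can be sketched as follows. The envelope length, the terminator byte and the printable-ASCII range are all assumptions for illustration; the thesis does not specify the exact envelope format:

```java
import java.security.SecureRandom;

public class PasswordEnvelope {
    static final int ENVELOPE_LENGTH = 32; // assumed fixed envelope size

    // Pads a short password with random printable ASCII before encryption.
    // The '\0' terminator separating password and padding is an assumption.
    public static String pad(String password) {
        if (password.length() >= ENVELOPE_LENGTH) return password;
        SecureRandom rnd = new SecureRandom();
        StringBuilder sb = new StringBuilder(password).append('\0');
        while (sb.length() < ENVELOPE_LENGTH) {
            sb.append((char) (33 + rnd.nextInt(94))); // printable ASCII 33..126
        }
        return sb.toString();
    }
}
```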
3.4.4 VCR timer sequence
Figure 26: VCR timer sequence model.
Figure 26 presents all states implemented in the server's VCR module. The module is separated into two tasks. Firstly, there is the timer thread checking the database for timer entries. Secondly, there is the socket listening thread of the main server. A triggered timer leads to the following ordered list of events:
1. First the thread checks against the system's timestamp whether a currently running capture operation needs to be stopped and its resources freed.
2. Next, the database table vcrentry is checked for a new timer entry.
(a) If a new capturing request matches the current system timestamp and the capturing device is not occupied, a new capturing thread can start immediately. A capturing queue is set up.
(b) Regardless of whether a processing queue was set up in (a), a last query deletes old timer entries from vcrentry.
From time to time, the listening server thread receives new timer information marshaled in an ADMINREQUEST RTSP message; see Subsection 3.4.2 for an explanation of that RTSP method. This message is partly encrypted. Therefore, only registered users are allowed to enter or modify timers. A correct timer RTSP request leads to an entry in the server's vcrentry table, followed by an acknowledgment RTSP response to the client. This entry is then periodically checked by the previously presented timer thread.
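The timer thread's per-tick decision reduces to a few timestamp comparisons. A sketch of the check logic with hypothetical helper names; the real VCRTimer additionally manages the capturing queue and device state:

```java
public class VcrEntryCheck {
    // A capture should start when the start time has been reached, the
    // stop time has not yet passed, and the capturing device is free.
    public static boolean captureDue(long startMs, long stopMs,
                                     long nowMs, boolean deviceBusy) {
        return !deviceBusy && nowMs >= startMs && nowMs < stopMs;
    }

    // Expired rows are deleted from vcrentry by the cleanup query.
    public static boolean entryExpired(long stopMs, long nowMs) {
        return nowMs >= stopMs;
    }
}
```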
3.4.5 Multi-unicast session
JMF supports RTP transmission in unicast and multicast mode. Moreover, JMF includes another method for transmitting a stream simultaneously to a group of clients. This method is called multi-unicast. In multi-unicast, no multicast IP addressing is required. Instead, as in a unicast session, each client opens a stream at the server. However, there is a difference between a unicast and a multi-unicast session: a multi-unicast session streaming queue is set up for more than one client. All clients interested in this session just need to propagate their connection properties (an [IP, port] pair). For details on the processing queue setup see Subsection 3.4.1. The connection properties of every client are bundled in the JMF SessionAddress. Subsequently, they are added to the RTPManagers by the manager's addTarget() method. All clients are served by the same processing queue. However, there are some constraints. Only one client is allowed to control the streams: the session-initiating client that sets up the media object processing queue. Moreover, it is necessary that interested clients are aware of the available multi-unicast sessions. The multi-unicast transmission requires special control, thus it is only available to the system's client.
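The bookkeeping behind a multi-unicast session can be sketched with a plain target list. In the real implementation the targets are JMF SessionAddress objects handed to RTPManager.addTarget(); the "ip:port" strings here are a stand-in, and the class itself is hypothetical:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class MultiUnicastSession {
    private final String owner;                // only the initiating client may control the stream
    private final List<String> targets = new ArrayList<>(); // "ip:port" stand-ins for SessionAddress

    public MultiUnicastSession(String ownerAddress) {
        this.owner = ownerAddress;
        targets.add(ownerAddress);
    }

    // In JMF the equivalent step is RTPManager.addTarget(new SessionAddress(...)).
    public void join(String address) {
        targets.add(address);
    }

    public boolean mayControl(String address) {
        return owner.equals(address);
    }

    public List<String> targets() {
        return Collections.unmodifiableList(targets);
    }
}
```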
3.4.6 Patterns - singletons and factories
The following subsection first lists the singletons and then the factories identified in the implementation phase. It shall be pointed out that not all listed classes have been presented yet. Proceed to their explanation in Section 4 if anything is unclear.
This is a multi-threaded software project. Therefore, special care is taken to make the singletons in this project thread-safe. Thread-safe lazy instantiation was chosen, as shown in the following example code:
Algorithm 3 Example of a thread-safe Singleton.

public class MySingletonClass {

    /** Singleton instance, created when the class is loaded. */
    private static MySingletonClass instance = new MySingletonClass();

    /**
     * Get a handle to the singleton.
     * @return the singleton instance of this class.
     */
    public static MySingletonClass getInstance() {
        return instance;
    }

    /** Private constructor, prevents direct instantiation of this class. */
    private MySingletonClass() {}
}
The preceding singleton implementation of MySingletonClass is thread-safe because static member variables, initialized in their declaration, are guaranteed to be created when the class is first accessed. A list of classes in the project takes advantage of this concept. It saves some memory by instantiating only a single class instance and also provides access control.
XMLParser: With the help of Java's XML language extensions, this parser sets global values on server start-up. It parses the values from an XML configuration file passed to the server. Only one instance of XMLParser is required.
VCRTimer: It is instantiated by the server. One instance of VCRTimer is enough to dispatch, at a regular interval, threads that query the database. Capturing can also be handled by this instance in an autonomous way.
DBFunctions: This class is used to connect to, disconnect from and submit queries to the database. One instance of DBFunctions is enough to offer a database interface for all packages.
Utils: This class is a collection of helper functions for the whole project, e.g. unified number conversions or global constants. Only one instance of the class is required.
SessionRegistry: In order to provide a synchronized interface for all RTSP session operations, a singleton is required. SessionRegistry holds the session information in a map. A maximum of one instance is allowed in order to gain synchronized access to the session operations. The critical operations are inserting a session into the registry and removing one from it.
In the course of the implementation, some factory classes have been identified, especially where inheritance took place. The identified families of classes are listed below:
SessionRegistry: This is not only a singleton but also a factory. It constructs the appropriate Session instance when adding a new session to the static session map. The two available session types are SessionRfc2326, a Session object also holding the RTP transport system, and a simple implementation of the abstract Session parent class called SessionExt.
RTSPFactory: This factory extends RTSPProtocol in order to provide RTSP communication validation. Once the RTSP method has been identified, the factory instance creates the suitable RTSP worker. The available workers are RTSPrfc2326 and RTSPExtProtocol. Both interpret the RTSP message and execute appropriate actions.
FeedbackDialog: The class FeedbackDialog extends Java's JDialog and is a factory for the different feedback dialogs used by the augmented JMStudio UI. The invocation of the appropriate constructor defines the message type, the message title and the message itself. The default constructor represents an error message.
AuthenticationUI: All authentication-related UIs descend from AuthenticationUI. It is another UI family added to the standard JMStudio. Sub-classes include ChangePasswordUI, LoginUI and NewUserUI. The extended JMStudio decides which sub-instance of AuthenticationUI is created.
3.5 Implementation challenges
This subsection confronts the reader with considerations concerning the implementation challenges. The first four subsections are about JMF-specific requirements. The fifth subsection provides information on the integration of Interval Caching into the system.
3.5.1 JMF creation
The JMF source code is available for download from Sun Microsystems. It gives an insight into the processes behind the framework and helps to understand the relations between the classes. A successful JMF build results in an assembled jmf.jar library containing the whole framework and its interfaces. However, building the JMF from code implies some challenges. The provided build script contained in the source documentation tree cannot be executed successfully without the intervention of the developer. Depending on the method of creation, the library paths must be updated and some source code files fixed.
3.5.2 JMF RTP engine
RTP implementations have become very important in modern media processing and other applications. It seems only rational that Sun Microsystems, the provider of JMF, has decided not to include the source code of their RTP processing engine. The compiled classes can be found in the derived directory tree of the JMF source package. This especially concerns the low-level RTP packing functions, where the final RTP packets are put together. The RTP packet header is created by this engine with an assigned RTP Payload Type or RTP Marker. However, the developer has no direct access to the RTP header's Sequence Number or Timestamp fields. Moreover, the whole RTCP communication processing remains hidden, and the results of this communication are only available in callback queues. Yet, the interfaces for implementing special packetizers are clearly defined, and the payload portion of the RTP packet can be specified. Still, the development process is hampered by the lack of customized header field editing.
3.5.3 JMF MPEG-4 RTP stream setup
Unfortunately, the RTSP handler and especially the SDP protocol parser part of JMF (version 2.1.1e) is not able to parse an MPEG-4 session description entirely. Therefore, the JMF client-side setup is not initiated automatically. The presenting process queue on the client's side is not aware of which codec to use or which picture size to present. The QuickTime and RealNetworks clients both initialize the appropriate processing chain and picture size using the SDP description. The solution is to extend the server's SDP creator with appropriate parameters and to enhance the JMF client's SDP parser. The updated protocol sends the correct codec in an "X-MPV4-ES-_CODECNAME_/RTP" message line. _CODECNAME_ represents the name of the codec, e.g. DIVX or XVID. Furthermore, borrowed from Apple's DSS SDP implementation, the submission of the correct picture size is available in a line like "a=cliprect:0,0,_HEIGHT_,_WIDTH_". _HEIGHT_ equals the picture height value and _WIDTH_ the picture width value as an integer. Leading zeros are not used. The QuickTime client uses the two coordinates to estimate the placement of the player window on the screen. The two newly added SDP protocol lines are essential for Microsoft's Video Compression Manager (VCM) wrapper. The wrapper is required by the JMF client to present MPEG-4 multimedia data on a Windows system. The VCM wrapper crashes on a wrong video picture size initialization.
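The two added SDP lines can be produced by simple formatters following the quoted formats; the class and method names are illustrative, not the thesis' actual SDP creator:

```java
public class Mpeg4SdpLines {
    // Builds the codec line "X-MPV4-ES-<codec>/RTP" described in the text.
    public static String codecLine(String codecName) {
        return "X-MPV4-ES-" + codecName + "/RTP";
    }

    // Builds "a=cliprect:0,0,<height>,<width>" as borrowed from Apple's
    // DSS SDP implementation; note that height precedes width.
    public static String cliprectLine(int width, int height) {
        return "a=cliprect:0,0," + height + "," + width;
    }
}
```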
3.5.4 JMF AVI Container with audio/video streams
A problem arises while parsing the AVI container format. The AVI format is one of the most popular container formats. It usually holds bound tracks, such as an MPEG-4 video track combined with an MPEG-1 Audio Layer-3 audio track. The codecs of both tracks are supported by the JMF processing queue. Furthermore, an AVI format parser that extracts the tracks is integrated into JMF. Unluckily, the AVI parser does not work correctly. Presented with the track setting of the previous example, the JMF AVI parser provides the following processing instance with a complete MPEG-4 track. However, the bits per sample property of an MPEG-1 Audio Layer 3 audio track is always set to zero. The bits per sample field of the patched JMF AVI parser simply returns a fixed value.
3.5.5 Interval Caching
A suitable position for Interval Caching in the JMF media processing is the JMF DataSource. The appropriate data source type is a file-parsing data source. This JMF DataSource is situated at the root of the JMF media processing queue. Thus, it provides access to the raw file data required by the buffer implementation of the caching policy. Interval Caching, described in Subsection 2.4.2, is a time-aware caching method. The interval length is not defined by the amount of data required for the interval but by its duration in time. This is impossible to implement with a non-time-aware JMF DataSource. Its main function is to provide file access through a file pointer to the upper layer in the processing queue. The first time-aware component in the JMF processing queue is the Processor. However, the JMF Processor is only a model for the several format and codec handlers in the JMF processing system. It would be a huge effort to change all implicated Processor implementations of JMF. Therefore, the Interval Caching idea is adjusted to the requirements of the file DataSource: the interval's length is defined in amount of data.
Figure 27: Caching of a container format.
Figure 27 shows another caching issue. The problem arises when caching a container format such as AVI, MOV, etc. In that case, irregular reading from the file is to be expected. In the example above, the gray fields of the container file indicate parts of the file which are not considered queue data; therefore the read pointer skips to the beginning of the next data block. The stream sent to client1 is read from a container file. Client2 has opened a transmission requesting a stream from the same file. The implemented Interval Caching policy permits the JMF DataSource to write data (d1 and d2) to a cache interval (bracket arrows). Unfortunately, the packetizer (JMF Processor) of client2 issues the same parsing commands as that of client1. It also issues the skip s1 after having consumed data d1. The dotted arrow indicates that the DataSource executes skip s1 on the cached interval. Therefore, the DataSource returns data from within d2 and, consequently, returns corrupt data. The problem is that the Processor always issues the same read pointer positioning commands to the DataSource. It is unaware whether the data is gathered from a file or from the cache. A solution to the problem splits all media containers into the contained tracks.
4 Implementation
The reader has been introduced to the design and architecture issues. This implementation section presents details on how the final software product was created, what the main properties and functions of the final classes are and how these functions interact. First, some information about implementing with the Java Media Framework, which provides the basis of the implementation.
4.1 Implementing with the Java Media Framework
Implementation using the Java Media Framework is very close to plain Java programming. It is very intuitive once the production chain of the JMF engine has been understood. Here is a sample JMF Processor setup:
Algorithm 4 Simple JMF processing queue.

...
MediaLocator ml = new MediaLocator(_URL_);
DataSource ds = Manager.createDataSource(ml);
Processor processor = Manager.createProcessor(ds);
...
A JMF Processor specialized for _URL_ is almost ready after these three lines of code. Afterward, it is only necessary to wait for its realized state. This is very useful for small projects, but can result in a loss of orientation when a larger project must be developed.
There is a JMF code documentation [9] available. Furthermore, there is a JMF mailing list provided by Sun Microsystems. Finally, a good source for troubleshooting are the books available on implementing with JMF; see [24], [25] and [26] in the reference list.
Apart from the available sample code, there are not many open-source JMF implementations available. Moreover, most additions to the JMF are libraries in other programming languages that just bind to the JMF through appropriate interfaces; see, for example, [15].
4.2 Project description
The main project is a client/server architecture containing an RTSP/RTP streaming server and an appropriate client. Multimedia content is streamed from the server to the client. In return, the client controls the stream with VCR-like commands. Additionally, accounting of media information and user groups is available. The accounting feature includes media object and user accounting. The accounting information is stored in a database on the server's side.
4.2.1 System overview
Figure 28: Project's deployment diagram.
The deployment diagram above shows all parts that make up the final project. Linked to supporting libraries such as the JMF or JDBC, the packages are defined in the sub-package "mediaserver" and integrated into the overall package "org.vizir". Below, the involved packages are described in detail.
4.2.2 Server package
Figure 29: The server package.
The server package's overall task is to provide RTSP server functionality. Thus, the server classes were implemented first. A small socket listener was soon established, followed by the implementation of the RTSP processing part. Finally, the server package was linked to the JMF libraries, making extensive use of the RTP creation and transmission process queue.
Figure 30: The Server class.
Server: This class, with only a main() method, starts the server side of the project. After parsing initialization data from the configuration file, a port number bound to the host address provides a listening server socket. SessionRegistry prepares a session container and VCRTimer starts executing worker threads at every defined interval.
Figure 31: The ServerThread class.
ServerThread: A socket connection to the server leads to the instantiation of a worker thread called ServerThread. It extends Java's Thread superclass and overrides the run() method. In the run() method the socket's input stream, a Java InputStream, is obtained. Next, in a loop, the socket input stream is split into Java Strings until the RTSP end sequence is found. Then the request is prepared for further examination by the RTSPProtocol family. Defined parts are extracted from the input stream. If preparation was successful, the appropriate RTSPProtocol, either RTSPExtProtocol or RTSPrfc2326, is instantiated to validate the request for essential and optional information. The sendResponse() method finally pushes the server response through the socket output stream.
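Reading a request until the RTSP end sequence can be sketched with a line loop that stops at the empty line terminating the header block. The helper name is hypothetical; the real ServerThread works on the raw socket InputStream:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class RtspRequestReader {
    // Collects header lines from the input stream until the empty line
    // that terminates an RTSP request.
    public static String readRequest(BufferedReader in) throws IOException {
        StringBuilder request = new StringBuilder();
        String line;
        while ((line = in.readLine()) != null && !line.isEmpty()) {
            request.append(line).append("\r\n");
        }
        return request.toString();
    }

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(
            new StringReader("PLAY rtsp://host/file RTSP/1.0\r\nCSeq: 3\r\n\r\n"));
        System.out.print(readRequest(in));
    }
}
```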
Figure 32: The RTSPProtocol family classes.
RTSPProtocol, RTSPFactory, RTSPrfc2326, RTSPExtProtocol: This family with the abstract superclass RTSPProtocol checks the request. If it fulfills all the RTSP message requirements, it also executes the demanded action that matches the request's method (workCommand()). Finally, it puts together an RTSP response message. The first created instance of this family is the factory class, namely RTSPFactory. It extends the abstract RTSPProtocol and gives access to the primary RTSP request examination functions such as checkCommand() and checkUrl(). If the primary examination result is positive, a protocol handler corresponding to the RTSP method is instantiated by the factory. This is done by a table lookup. Both RTSPrfc2326 and RTSPExtProtocol have their own implementation of the abstract workCommand() method. The method's code is adapted to the tasks it has to execute. RTSPrfc2326 is specialized for the RFC 2326 protocol as described in Subsection 2.1.1. RTSPExtProtocol implements all methods to accomplish the tasks defined in Subsection 3.4.2. Finally, workCommand() executes the method- and protocol-dependent task. The comSpecificString and the responseStatus supply a String response which is formed through all stages of the process. Its content is finally collected by the server worker thread and added to the server's RTSP response.
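The factory's table lookup can be sketched as a map from method name to worker. The returned names mirror the workers RTSPrfc2326 and RTSPExtProtocol from the text, while the map-based dispatch itself is an assumption about the implementation detail:

```java
import java.util.HashMap;
import java.util.Map;

public class RtspHandlerTable {
    private static final Map<String, String> WORKERS = new HashMap<>();
    static {
        // Standard methods from RFC 2326 go to the RFC worker...
        for (String m : new String[] {"OPTIONS", "DESCRIBE", "SETUP",
                                      "PLAY", "PAUSE", "TEARDOWN"})
            WORKERS.put(m, "RTSPrfc2326");
        // ...extended methods (Subsection 3.4.2) to the extension worker.
        for (String m : new String[] {"AUTHENTICATE", "SESSIONKEY",
                                      "ADMINREQUEST", "MULTIUNICAST", "SQLREQUEST"})
            WORKERS.put(m, "RTSPExtProtocol");
    }

    // Returns the name of the worker class responsible for the method,
    // or throws for requests the server does not implement.
    public static String workerFor(String method) {
        String worker = WORKERS.get(method);
        if (worker == null)
            throw new IllegalArgumentException("unimplemented RTSP method: " + method);
        return worker;
    }
}
```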
Figure 33: The RTPProcessorTransmitter class.
RTPProcessorTransmitter: Derived from a JMF sample implementation of an RTP transmitter setup, the RTPProcessorTransmitter provides a means of control for the RTP processor setup process. Furthermore, it controls the media object stream flow as expected by RTSP. The class for creating, maintaining and closing an RTP session is the RTPManager interface provided by JMF. All RTP instances have to be known to the RTPManager.
Figure 34: RTP streaming queue from file source to RTP packet.
The following description of the RTP streaming queue refers to Figure 34. The queue starts with the DataSource providing a file or cache pointer. The DataSource gets connected to a Processor via the createProcessor() method of the DataSource. A Processor can handle one or more tracks. After the RTPManager has been initialized with the local network properties, namely the local JMF SessionAddress, a SendStream instance is created by the manager's createSendStream() method. This method takes an output stream, a PushBufferDataSource (PBDS) grabbed from the processor, and a stream identification streamid, of a recognized track only, as arguments. This means one RTPManager per stream is required. All this work is done by the setupCall() method, which can be invoked for each requested stream once in the process of an RTSP SETUP call. Prior to SendStream creation, usually a ReceiveStreamListener, a RemoteListener and a SessionListener have been added to the RTPManager for stream statistics (RTCP messages). After that, the addTarget() method is called. This method adds the client's IP address and port number information, bundled in a SessionAddress, to the RTPManager. This is also the point where the multi-unicast feature has been implemented; see Subsection 3.4.5 for a multi-unicast explanation. Multi-unicast allows more than one receiver to be added to an RTPManager. When the RTPManager finally calls its SendStream's start() method, an RTP transmission is initialized between the communication partners. Still, no media data is transmitted unless the RTPProcessorTransmitter's play() method is invoked. It starts the previously realized processors. The stop() method's main function is to free all resources of the RTP transportation. There are two stop() methods in RTPProcessorTransmitter because the RTSP TEARDOWN can ask to tear down a whole transmission or only one stream at a time. In the second case, only the RTPManager matching the RTSP stream identifier is freed, whilst the other queues continue to process media until the last stream has been torn down. The RTPProcessorTransmitter's createProcessor() method is also called by sdpRequest() to gather recent media and track information of a media object for an SDP response. This is necessary if there is no prepared SDP information available from the file system.
Figure 35: The VCRTimer class.
VCRTimer: Furthermore, the VCRTimer is contained in the server package. A Java Timer task polls the database for saved timers and initiates a new capturing queue. Additionally, every Timer task also checks the session registry for old session entries. If the time stamp of a session's last update is older than a predefined interval value, the session is removed. The data capturing queue drives the data source stream into a selected format by setting up the appropriate JMF processor.
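The stale-session check can be sketched with Java's own java.util timer facilities. This is an illustrative sketch only; the class, method and constant names (SessionPurger, purgeStaleSessions, SESSION_TIMEOUT_MS) and the map-based registry are assumptions, not the project's actual code:

```java
import java.util.Iterator;
import java.util.Map;
import java.util.Timer;
import java.util.TimerTask;

// Sketch of the VCRTimer's periodic maintenance: drop sessions whose
// last update is older than a predefined interval.
public class SessionPurger {
    static final long SESSION_TIMEOUT_MS = 60_000; // illustrative value

    // Removes entries whose last-update time stamp is too old.
    public static int purgeStaleSessions(Map<String, Long> lastUpdate, long now) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = lastUpdate.entrySet().iterator();
        while (it.hasNext()) {
            if (now - it.next().getValue() > SESSION_TIMEOUT_MS) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    // Periodic task, scheduled like the server's database poll.
    public static Timer schedule(Map<String, Long> registry, long periodMs) {
        Timer timer = new Timer(true); // daemon thread
        timer.scheduleAtFixedRate(new TimerTask() {
            @Override public void run() {
                purgeStaleSessions(registry, System.currentTimeMillis());
            }
        }, periodMs, periodMs);
        return timer;
    }
}
```

In the server the same check would run inside the periodic Timer task that also polls the database for saved timers.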
4.2.3 Sub-packages in the server package
Cache package
Figure 36: The cache package.
CachedProcessorRegistry, CachedProcessor: The CachedProcessorRegistry is the accounting class for the stream interval mapping implemented by the CachedProcessor. The CachedProcessor represents the interface between the modified DataSource of the cache package and the processor held by the RTPProcessorTransmitter of the server package. The CachedProcessor initializes two IntervalManager instances, reader and writer, and passes them to the DataSource in order to allow direct read or write access.
The integration of a policy similar to Interval Caching (IC) into this project was a very complex task. The following Figure 37 guides the reader through the cache instances and their interactions and interfaces with the rest of the software.
Figure 37: A cache setup example. Interval 1 is the shortest and therefore first in the cache array. Interval 2 is longer than another interval and therefore third in the array.
IntervalManager: The IntervalManager represents the managing layer above the Interval class. It has methods for setting up its own Interval instance and preparing it for the caching process. As explained earlier, two instances of this class are passed to the DataSource. A DataSource read invocation can cause a read call at the IntervalManager's reader side and a write call at the IntervalManager's cacher side. The IntervalManager also needs to keep track of the present read and write positions of its Interval, namely cacherPos and readerPos. Each read and write access must be preceded by a check whether the interval is still in cache or direct file access is necessary. Moreover, a reader stop or a writer stop must be handled by removing the interval from the cache.
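The in-cache check described above can be sketched as follows. Apart from the property names cacherPos, readerPos, intervalBegin and intervalSize, which are taken from the text, the class name and exact semantics are assumptions:

```java
// Sketch of the IntervalManager's bookkeeping: before each access it must
// decide whether a stream position still falls inside the cached interval
// or whether direct file access is necessary. Illustrative only.
public class IntervalWindow {
    long intervalBegin;  // first cached byte position of the media object
    long intervalSize;   // number of cached bytes
    long readerPos;      // current position of the trailing (reading) stream
    long cacherPos;      // current position of the leading (writing) stream

    public IntervalWindow(long begin, long size) {
        this.intervalBegin = begin;
        this.intervalSize = size;
        this.readerPos = begin;
        this.cacherPos = begin;
    }

    // True if the position can be served from cache; false means file access.
    public boolean inCache(long pos) {
        return pos >= intervalBegin && pos < intervalBegin + intervalSize;
    }

    // Advance the reader; returns false when it ran past the cached data.
    public boolean read(long bytes) {
        readerPos += bytes;
        return inCache(readerPos);
    }
}
```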
Interval: The Interval represents a model of an IC interval in the cache byte array of the Cache instance. In order to provide the manager with transparent reading and writing to the byte array, the Interval holds its byte array boundaries, defined by intervalBegin and intervalSize. This also allows the cache interval to be shifted. The vectorPos property keeps track of the position of the Interval in the cache's sorted interval list.
Cache: This all-static class contains the reserved byte array cache and allows new intervals to join the cache. It provides all necessary functions to implement the open and close IC functions introduced by Algorithms 1 and 2. The IC functions require methods that check the new interval size or determine the final place in the cache's sorted interval list sortedIntervalList.
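The maintenance of sortedIntervalList can be sketched as an ordered insert by interval length, shortest first, as illustrated by Figure 37. This is an illustrative reconstruction, not the project's code:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the Cache's sorted interval list: new intervals are admitted
// in ascending order of their length, so the interval caching policy can
// prefer short intervals when cache space is scarce. Illustrative only.
public class SortedIntervals {
    private final List<Long> lengths = new ArrayList<>();

    // Insert keeping ascending order; returns the final position in the list.
    public int add(long length) {
        int pos = 0;
        while (pos < lengths.size() && lengths.get(pos) <= length) {
            pos++;
        }
        lengths.add(pos, length);
        return pos;
    }

    public List<Long> view() {
        return lengths;
    }
}
```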
Synchronizer: This class adds a means of synchronization to all cache package instances. As the reader can see in Figure 36, all important instances of the cache package require synchronization. The read, write and shift operations on the cached intervals have to run in sequence and thus require synchronization. In the event of a shift operation, for example, both actions, read and write, must be prohibited until the interval has been shifted to its final position in the byte array.
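The mutual exclusion requirement can be sketched with a single shared monitor: read, write and shift all synchronize on one lock, so a running shift keeps readers and writers out. The class below is a minimal illustration, not the actual Synchronizer:

```java
// Sketch of the Synchronizer idea: read, write and shift operations on a
// cached interval must run strictly in sequence, so all three enter the
// same monitor. Names and the trace field are illustrative only.
public class CacheSync {
    private final Object lock = new Object();
    private final StringBuilder trace = new StringBuilder(); // records call order

    public void read()  { synchronized (lock) { trace.append("R"); } }
    public void write() { synchronized (lock) { trace.append("W"); } }

    // While a shift is running, no reader or writer can enter the monitor.
    public void shift() { synchronized (lock) { trace.append("S"); } }

    public String trace() { return trace.toString(); }
}
```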
DataSource: This is a modified DataSource copied from the JMF source in order to connect DataSource and Processor to the caching system. The DataSource is of type file reader with some additional methods.
Session package
Figure 38: The session package.
SessionRegistry: As explained in Subsection 3.4.6, this class represents a singleton and, moreover, a factory for the session family. The SessionRegistry has a container for the server's RTSP sessions. It provides methods to insert, delete and search sessions.
Session, SessionExt, SessionRfc2326: The Session family hosts the two different models of a session used by the project. The SessionRfc2326 implements the session necessary for RTSP-enhanced streaming. Thus, it holds an RTPProcessorTransmitter instance to reference the step-by-step built connection until it is streamed. The SessionExt belongs to the implementation of the extended part of RTSP explained in Subsection 3.4.2. It holds temporary authentication information such as the user key and the user rights. This is similar to the use of the AuthenticationData class in the client package.
4.2.4 Client package
Figure 39: The client package.
As Figure 39 shows, the client package consists of a major group of classes extending Java's JDialog and hence presenting themselves as dialogs to the user. The ExtJMStudio is a copy of JMStudio adapted and extended for the project's several purposes. The RTSPFunctions class stands on its own as the only communication interface for the dialogs and ExtJMStudio to the server.
ExtJMStudio, JMFRegistry: The ExtJMStudio is the main class in the player package. It is derived from the JMStudio source code. The ExtJMStudio allows media content to be presented to the user and adds controlling functions to it. The integration of a copy of the JMFRegistry became necessary because, starting with Java version 1.4, classes not included in a package cannot be added to another package. The copied JMFRegistry code is embedded into the player package.
RTSPFunctions: This represents the communication interface from the client to the server. The methods invoked marshal the passed data into RTSP messages understood by RTSPExtProtocol on the server side. As Figure 23 shows, the client-server communication over RTSP in the common case results in a response to the client. RTSPFunctions has to decide whether the result must be forwarded to the UI or can be ignored.
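The marshalling step can be sketched for a plain RTSP DESCRIBE request using the RFC 2326 message syntax; the project's extended RTSP methods and headers are not reproduced here, and the class name is illustrative:

```java
// Sketch of how an RTSPFunctions-style class could marshal a request
// into an RTSP message. Standard RTSP 1.0 syntax: request line, headers
// terminated by CRLF, and an empty line ending the message.
public class RtspMarshal {
    public static String describe(String url, int cseq) {
        return "DESCRIBE " + url + " RTSP/1.0\r\n"
             + "CSeq: " + cseq + "\r\n"
             + "Accept: application/sdp\r\n"
             + "\r\n";
    }
}
```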
AuthenticationUI, NewUserUI, LoginUI, ChangePasswordUI: These classes are responsible for user authentication and user accounting. Their naming is a clear hint to their function.
TableUI, MediaTableUI, UserTableUI, MultiUnicastTableUI: This is a family of classes holding a JTable component. The TableUI defines methods for the common table setup, search operations on the table and the table data selection. Again, the naming is a clear hint to their function.
SetHostUI, MediaStreamUI, MediaEditorUI, RecorderUI, FeedbackUI: The last group of listed UIs is not connected in a family. Their task is also hinted at by their names. More information on the concepts behind the UIs is available in the UI descriptions of Section 4.3.
4.2.5 Mp4 package
Figure 40: The mp4 package.
RTPMpeg4Packetizer, RTPMpeg4DePacketizer, BasicCodec: This family is the implementation of the MPEG-4 RTP packet handler. They all extend the JMF BasicCodec. The RTPMpeg4Packetizer's core method process() packs the MPEG-4 stream according to the RTP packet format. The RTPMpeg4DePacketizer does the inverse operation. The packing approach was chosen following the instructions of the MPEG-4 RTP packing methods explained in [13]. Additional help is provided by FFmpeg's [16] MPEG-4 header parser functions. The approach is based on parsing the MPEG-4 video object plane (VOP) and video object layer (VOL) start codes explained in the ISO MPEG-4 standard draft [21]. The depacketizer is adjusted to the VCM codec wrapper. For initialization, the VCM wrapper requires the output frame dimensions of an MPEG-4 video; see Subsection 3.5.3 for an explanation. Furthermore, the VCM is very sensitive to corrupt data and only works with complete data. Therefore, the unpacking instance grabs the RTP packet data and submits only fully assembled stream material to the following processing queue. This is done by checking for start codes at the beginning of packets and looking for the RTP marker flag, which indicates the end of an MPEG-4 media packet.
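The start-code check can be sketched in plain Java. In MPEG-4 Visual, a video object plane begins with the byte sequence 00 00 01 B6; the helper below only illustrates that test and is not the project's depacketizer code:

```java
// Sketch of the depacketizer's start-code check: an MPEG-4 video object
// plane (VOP) begins with 00 00 01 B6. The depacketizer uses such checks,
// together with the RTP marker bit, to assemble only complete frames.
public class Mpeg4StartCode {
    static final int VOP_START = 0xB6;

    // True if the buffer begins with the VOP start code 00 00 01 B6.
    public static boolean startsVop(byte[] buf) {
        return buf.length >= 4
            && buf[0] == 0x00 && buf[1] == 0x00 && buf[2] == 0x01
            && (buf[3] & 0xFF) == VOP_START;
    }

    // Index of the first VOP start code in the buffer, or -1 if none.
    public static int findVop(byte[] buf) {
        for (int i = 0; i + 3 < buf.length; i++) {
            if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1
                    && (buf[i + 3] & 0xFF) == VOP_START) {
                return i;
            }
        }
        return -1;
    }
}
```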
4.2.6 Util package
The idea of the util package, contained in "org.vizir.mediaserver.util", is to provide access to common operations, such as database queries, encryption or fixed RTSP communication, to the rest of the packages contained in the project. The most important classes are now explained in detail.
RtspUtil, RtspHandler: The original Java Media Framework classes RtspUtil and RtspHandler could not parse SDP MPEG-4 descriptions correctly; Subsection 3.5.3 explains the issue. Patched versions of RtspUtil and RtspHandler are available to the project's packages. The fixes include correct SDP parsing of MPEG-4 stream dimensions and capturing of the correct codec type. Another fix helps to gather stream start information supporting multi-unicast connections. Thus, a simultaneous start of all participating clients is available.
DBFunctions: This implements the Java MySQL connection wrapper, including connect, disconnect, query and update methods.
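A DBFunctions-style wrapper can be sketched with plain JDBC. The table and column names in the query are assumptions; only the JDBC MySQL URL format is standard:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Sketch of a DBFunctions-style JDBC wrapper. The "mediafile" table and
// its columns are illustrative placeholders for the project's schema.
public class DbSketch {
    // Standard JDBC MySQL connection URL format.
    public static String buildUrl(String host, String database) {
        return "jdbc:mysql://" + host + "/" + database;
    }

    // Connect, run a parameterized query, and return the first column
    // of the first row (or null). Requires the MySQL JDBC driver.
    public static String queryTitle(String url, String user, String pw, int id)
            throws SQLException {
        try (Connection con = DriverManager.getConnection(url, user, pw);
             PreparedStatement ps = con.prepareStatement(
                     "SELECT title FROM mediafile WHERE id = ?")) {
            ps.setInt(1, id);
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString(1) : null;
            }
        }
    }
}
```

The query method is only a sketch of the wrapper's shape; running it requires a reachable MySQL server and the JDBC MySQL driver on the classpath.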
DesEncrypter: The DesEncrypter provides the project with decrypting and encrypting methods using Java's own javax.crypto package. Decryption and encryption are necessary for the authentication and administrative requests implemented in the extended RTSP communication protocol.
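A minimal sketch of such a DES round trip with javax.crypto follows; the project's actual key management and message formats are not reproduced, and the class name is illustrative:

```java
import java.nio.charset.StandardCharsets;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Minimal DesEncrypter-style sketch: encrypt and decrypt with DES from
// the standard javax.crypto package. Illustrative only.
public class DesSketch {
    static byte[] crypt(int mode, SecretKey key, byte[] data) throws Exception {
        Cipher cipher = Cipher.getInstance("DES/ECB/PKCS5Padding");
        cipher.init(mode, key);
        return cipher.doFinal(data);
    }

    // Encrypts a string with a fresh DES key and decrypts it again.
    public static String roundTrip(String plain) {
        try {
            SecretKey key = KeyGenerator.getInstance("DES").generateKey();
            byte[] enc = crypt(Cipher.ENCRYPT_MODE, key,
                    plain.getBytes(StandardCharsets.UTF_8));
            byte[] dec = crypt(Cipher.DECRYPT_MODE, key, enc);
            return new String(dec, StandardCharsets.UTF_8);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```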
4.2.7 MediaDB database tables
A MySQL database engine was chosen to provide the project's database management. The reasons for the use of MySQL are:
1. The MySQL database engine is free of charge for non-profit use and very easy to set up and configure.
2. It provides everything required regarding table entry data types, interoperability with the Java programming language and processing speed.
3. It does not require extra hardware but can run together with the server on the same system.
Next, a figure presents the entity relationships, followed by a description of the entities.
Figure 41: The MediaDB tables.
mediafile: This table contains the main media object attributes. The stored information is used for media object accounting. This table is always involved when media information selection or update queries are issued. It needs to be readable by all user system groups.
stream: The stream table has a 1..n relation to the mediafile table. It stores the location of the media objects' streams on the file system and their properties. The stored properties are streamid, necessary for RTSP, and cacheable, a Boolean telling the caching engine whether to cache the stream or not.
sdpinfo: If a media object has a row entry in this table, it provides the path to the SDP information file on the file system.
genre: This table holds all known audio and video genres.
user: The user table is necessary for user authentication. The properties stored in this table are username, password and rights. Rights matches the group identification. The password is hidden by encryption.
usertype: This table holds all known user groups.
vcrentry: The recording information for device capturing is stored here. This information includes start and stop time. Further properties of this table are the username for identification and a filename prefix.
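The 1..n relation between mediafile and stream could, for illustration, be expressed by DDL like the following. All column names beyond those described above (streamid, cacheable) are assumptions:

```sql
-- Illustrative sketch only; column names other than streamid and
-- cacheable are assumptions, not the project's actual schema.
CREATE TABLE mediafile (
    id INT PRIMARY KEY,
    title VARCHAR(255) NOT NULL
);

CREATE TABLE stream (
    streamid INT,                   -- stream identifier used by RTSP
    mediafile_id INT NOT NULL,      -- 1..n relation to mediafile
    location VARCHAR(255) NOT NULL, -- path of the stream on the file system
    cacheable BOOLEAN NOT NULL,     -- whether the caching engine may cache it
    PRIMARY KEY (streamid, mediafile_id),
    FOREIGN KEY (mediafile_id) REFERENCES mediafile(id)
);
```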
4.3 User Interfaces
Figure 39 shows that all UIs of the client package extend JDialog. This Java class, contained in the "javax.swing" package, allows custom dialogs to be created. Furthermore, the UIs all implement ActionListener from Java's "java.awt.event" package in order to capture button and other click events. Only implemented events are processed. Very important to the project is the concept of the caller. A caller instantiates and prepares a JDialog before invoking the show() method, which presents the dialog in modal mode. Moreover, the caller is responsible for grabbing the event hint left by the dialog after it was closed with dispose(). Finally, the caller must execute the task matching the event hint. Examples of the different UIs presented by the client follow.
Figure 42: The ExtJMStudio window.
This is the ExtJMStudio UI. The new Java menu Media DB contains all extensions added to the original JMStudio. Newly integrated dialogs are called from this menu. Therefore, the ExtJMStudio is the caller for almost all dialogs the UI provides. Many of the events require RTSP communication; for that purpose the ExtJMStudio uses an RTSPFunctions instance.
Figure 43: An authentication family window.
The login mask is an example of a simple dialog class used for the UI. It is called by its caller, the ExtJMStudio. On button click, it disposes itself. The ExtJMStudio grabs the event hint. If a login is requested by the user, both fields next to User and Password are grabbed and passed by the ExtJMStudio to the appropriate method in the RTSPFunctions instance.
Figure 44: A table family window.
This dialog represents a UI with an additional table. The table allows user interaction and shows a feedback dialog when a table row is selected. Not all options on the feedback dialog must result in a sub-dialog. But if a sub-dialog is opened, the table dialog becomes a caller and must provide code for event handling.
Figure 45: A feedback family dialog.
The last example of the UI types is a custom feedback dialog provided by the FeedbackUI. The constructor of the FeedbackUI decides on the type and button layout of the feedback. A FeedbackUI instance can never become a caller, nor does it require that its own caller adheres to the caller concept.
5 Usage description
The usage of the project's software is defined by its streaming client/server architecture. The main function is a streaming-on-demand service over RTP. RTSP enables stream control by the client. Additionally, the system allows media objects to be added to and removed from the library. This media object library consists of a directory containing the media files and their media information stored in the database.
The next section lists the system requirements of the software. Subsequently, the actual system usage description is provided. Another section gives insight into the server performance. Finally, limitations and known bugs are listed.
5.1 System requirements
This is a Java software project. In order to run the system successfully, Java's hardware and software requirements in particular need to be considered.
5.1.1 Hardware requirements
The hardware setup must be adjusted to Java’s memory and processor requirements.
Please, consult therefore their information sources [17]. The development and testing
took place on an AMD AthlonXP 1900+ (1500 MHz), 256 MB DDR266 Ram with
a common graphic card and a common sound setup. The described setup meets the
hardware requirements just fine.
5.1.2 Software requirements
A list of software components that need to be installed before the client and server can run successfully:
1. Java and JMF: The project's client and server run on any platform that can host a Java 2 Runtime Environment (version 1.4.2). Mentioned platforms must host a Java-compatible operating system with IP network functionality. Additionally, a copy of the Java Media Framework (version 2.1.1e) from Sun is required.
2. MySQL, JDBC: The server needs a MySQL database. To interact with the database, the server must also be extended with the JDBC MySQL database wrapper.
3. MediaServerClient.jar: This is the essential project library. It contains all the software implemented for the final system. It must be available on the server system as well as on the client system.
4. Codecs: The JMF provides some codecs to process media. However, if the VCM wrapper is used to present MPEG-4-compliant media, all desired MPEG-4 codecs need to be installed separately.
5.1.3 Building requirements (ANT)
The project's software is available both as a JAR library containing all required binaries and as a directory structure holding the whole source code. If the source code is used, the next paragraph presents a convenient way of building the binaries.
The Ant tool provided by The Apache Ant Project [12] makes it very easy and comfortable to build Java projects. Furthermore, Apache Ant is available for different platforms. Ant is considered the building tool that best fits the project's needs. Apache Ant creates the required binaries, libraries and documentation from source. The project's own Ant build file is included in the source.
5.2 System usage description
Next, all system features are described. An expression of the kind "[menu->selection]" means that in order to use the described feature, ExtJMStudio's menu "menu" and the selection "selection" need to be clicked. This only applies to Subsection 5.2.
5.2.1 Running the project's server and client
The server is invoked like any other Java program. The java command is started with the correct classpath and a configuration file. The configuration file needs to be adjusted to the system. The client is also started with the java command. No configuration file is required by the client, as the JMStudio provides one. Before the client can successfully connect to a running server, the network properties need to be set up. This can be done using the UI in [MediaDB->set rtsp host].
5.2.2 Transmitting and controlling an RTP stream
There are two methods of initiating a transmission session at the server. The first and most comfortable is realized with the project's client. A UI [MediaDB->database] presents a table with all available media objects. By selecting the desired row, the transmission can be initiated immediately. The other option, applying especially to the other supported clients, is to enter the media object's RTSP URL into the client's URL selector. Stream control is accessible using the usual media playback controls provided by all players suitable for the server.
5.2.3 Adding and deleting media
A new media file which was added to the server's file system can be registered with the database. After that, it is available to all clients. In order to register the new media file with the database, administrator rights are required [MediaDB->login]. Administrator rights are gained by logging into the Administrator group. To select media files, a UI [MediaDB->update media library] with a list of all files next to check-boxes is available. A new file is selected by checking the check-box next to it. The selected files are entered into the database with default values. To remove a media object, administrator rights are required as well. Deleting media information from the system means deleting all its entries from the database tables.
5.2.4 Editing media object properties
The ExtJMStudio provides UIs for media object editing [MediaDB->database]. There are two kinds of media properties. Transmission-critical properties are properties that influence the transmission process. Accounting properties are considered non-critical to streaming. Therefore, only an Administrator is allowed to change transmission-critical properties, which are presented in a separate UI. Furthermore, this UI is hidden from the other user groups. The accounting properties can be edited by the Editor user group. A UI is available to allow changes to the accounting properties. All changes affect only the database tables. No changes are made to the media file itself.
5.2.5 Starting a multi-unicast transmission
Starting a multi-unicast transmission involves more than one client. There are two types of clients in a multi-unicast transmission: the session leader, initiating and controlling the multi-unicast session, and the passive followers. The UI [MediaDB->database] provides an option to start a multi-unicast session. Furthermore, a list of all available multi-unicast transmissions is presented by the UI [MediaDB->enter multiunicast]. Passive session followers use this list to join a session. In order to support a simultaneous start of all clients, feedback is delivered by the UI. The session leader controls the transmission the same way as a usual media content transmission.
5.2.6 Adding and deleting a recorder timer
The Editor group is allowed to set timers for the integrated recording feature of the server. Recording means capturing data from a predefined device to a media file that is added automatically to the database media collection. The server periodically checks the database for new timers. There is a separate UI [MediaDB->recorder] that sets new timers.
5.2.7 Running other clients
The QuickTime and the RealNetworks clients, equipped with RTSP/RTP extensions, can connect to the server. In order to start a transmission, the RTSP URL must be known to the client. The server also allows RTSP control operations for these clients. All other features are not available to this group of clients.
5.3 Performance
Server performance investigations have been conducted by confronting the server with automated and concurrent streaming requests described in batch files. The tests give an insight into reliability, load behavior, concurrent handling of streams and memory usage, all of which contribute to the overall server performance.
The test environment includes a single server instance running on a LAN (100 MBit) with a well-known file archive. The command-line client openRTSP from the live package [18] is used to support automated testing. The project's client was not involved in the performance testing. A shell script, which is executed concurrently several times, randomly chooses from a collection of media objects known to the server and streams each for a random duration. Below is the batch file content:
Algorithm 5 Bash script for performance testing.
#!/bin/bash
# Array containing all media objects
media="test.mp3 test.mpg Fantastic_Four Fantastic_Four-vid.avi Fantastic_Four-aud.mp3 BatmanBegins"
# Array containing durations
time="30 40 50 200"            # 200 ... play the whole stream
suite=($media)                 # Read into array variable.
num_suites=${#suite[*]}        # Count how many elements.
timeArr=($time)
num_time=${#timeArr[*]}        # Count how many elements.
number=0
# loop 50 times
while [ $number -lt 50 ]; do
    rtime=${timeArr[$((RANDOM%num_time))]}
    if [ "$rtime" -eq "200" ]; then
        command="openRTSP -V"
    else
        command="openRTSP -V -e $rtime"
    fi
    # execute openRTSP
    ./${command} -F HERE \
        rtsp://192.168.1.2:1234/${suite[$((RANDOM%num_suites))]}
    number=$((number + 1))
done
exit 0
The batch script executes the openRTSP client 50 times. An average test calls the script three times, so a total of 150 requests have to be handled by the server. During testing, in the worst case three clients have to be served at once, which means up to six streams run in parallel. Caching is also switched on for the appropriate streams.
Additionally, sampling inspection is conducted during automated testing to estimate the quality of the stream transmission. MPlayer and the RealNetworks client were used for inspection.
date      test duration  scripts  clients total  memory usage   server state    caching notes
30.03.05  24 min.        1        50             262 MB         up and running  not impl.
27.04.05  1 min.         2        3              n.a.           hung up         still buggy
30.05.05  5 min.         3        ~40            out of memory  crashed         memory bug
19.06.05  36 min.        3        150            420 MB         up and running  enabled
01.07.05  32 min.        3        150            419 MB         up and running  enabled
10.09.05  40 min.        3        150            422 MB         up and running  enabled
16.09.05  37 min.        3        150            420 MB         up and running  enabled
The table lists some performance testing results. It shows that between 27.04.05 and 19.06.05 some major software bugs caused the server to malfunction. The first problem arose from synchronization issues; thus, only 3 clients were executed before the system hung up. The second mistake was eliminated when the server's RTSP thread management was updated and client worker threads were released after the end of a request. The final server could withstand all the artificially produced load and kept running stably after the test period. The quality of playback during sampling inspection depends not only on the server's load, but also on the client's stream buffering capability before playback. Whilst MPlayer tries to start playback immediately, the RealNetworks client prebuffers for about three to five seconds before playback. As expected, with a mean of four concurrently running RTP streams, problems were encountered especially in the picture quality and in the synchronization between an audio and a video stream. However, the differences between the RealNetworks client and MPlayer are remarkable. The RealNetworks client does not show corrupted pictures or audio synchronization problems, but stops the presentation for an instant when media content is missing or corrupt or the buffer is empty. MPlayer tries to keep up with the running stream by letting the audio precede the video and then resynchronizing, causing corrupt pictures on some occasions.
5.4 Limitations & known bugs
The project comes with some flaws regarding codecs, caching, container parsing and memory usage. Some of them are connected to the limitations of Java and the Java Media Framework.
Codec problems arise with the VCM wrapper. The VCM wrapper is very sensitive to corrupt media data; yet, it provides MPEG-4 playback to the system. Playback crashes sometimes happen as a result of bad MPEG-4 stream information. Refer to Subsection 3.5.3 for a detailed description. Furthermore, the JMF codec processing queue is limited to the JMF codec registration. Only recognized and correctly placed codecs or plug-ins can be added and used.
Caching is only implemented with the JMF DataSource. The parsing of container formats with the JMF engine has some flaws. First, there is the caching problem that comes with irregular reading from the DataSource. Then there is the problem with MPEG-1 Audio Layer 3 tracks combined with an MPEG-4 track in an AVI container format. For more information see Subsection 3.5.5. The flaws have been patched as explained in Subsection 3.5.4. Moreover, cache synchronization is the most complex part of the software. Hopefully, all bugs regarding this topic have been eliminated.
During performance testing it became obvious that the Java Runtime Environment running the server is a very memory-hungry process. It allocates a lot of main memory at start-up (~209 MB) and keeps adding memory at an almost constant rate when clients start streaming sessions. A system with enough main memory should be considered. See the previous Subsection 5.3 for performance testing results.
6 Conclusions
6.1 About the project
The two main services offered by the server are the streaming of media content and the accounting of media objects. Media streams can be initiated and controlled by all suitable clients. In order to support useful media object accounting, the stored media information is protected by access rights. Access rights are a property of the existing user groups and can be gained by logging in at the server. The media object information, together with the user group information, is stored in the server's RDBMS. Another restriction allows the client to start only streams matching its transmission profile. A client is added to a profile at runtime, with a ping timeout estimating the connection quality. The client component contains a number of UIs. These let the user interact with the server. Interaction includes both accounting of the database content and requesting or controlling media streams. The server software comprises a multi-threaded daemon which listens for client requests on a predefined socket. It implements an RTSP parser that validates the client requests. All the communication between client and server is processed using RTSP and RTP. RTSP provides streaming control functions such as remote control of media objects. In addition, an extended RTSP suite enables accounting of media information. Client software from other providers, namely RealNetworks and Apple, that includes RTSP/RTP capabilities can also request media transmissions from the server and control initiated streams. RTP provides the server with a model for media content network packing, allowing real-time streams. Moreover, the RTP packet format is used to create customized MPEG-4 RTP packets. The communication between the server and the database engine is database dependent and solved by an interface library. Another enhancement to the server is an adapted Interval Caching policy. This increases the server capacity and reduces access latency by also delivering cached streams from primary, fast memory. Finally, the server offers a device capturing thread which polls timers set by the client. The thread evaluates the timers and, if requested, grabs input streams from capture devices.
The software is implemented using the Java programming language and the Java Media Framework (JMF). JMF supports the project in multimedia content streaming. More precisely, it provides both the RTP/RTCP engine at the server and the basis for the client's UIs and the client's RTSP communication. Regarding the client's RTSP communication, only the extended RTSP is implemented separately.
Implementing a customized RTSP/RTP multimedia server with JMF turned out to be challenging. Even though JMF provides most of the required mechanisms, such as RTP packing and the setup of stream queues, integrating it into a specific project requires some patching. Almost all of the implementation difficulties, such as the JMF building problems or the JMF SDP parsing flaws, have been eliminated.
6.2 Outlook
Having introduced the reader to the various possibilities of efficient multimedia streaming and all the ideas and protocols attached to the issue, it should not be forgotten that, apart from the numerous radio streams, Internet multimedia streams in today's applications are usually only available as short clips like trailers or advertisement videos. When talking to someone about audio-visual content and streaming, I realized that people assume an average home setup. On a local area network, with capacities of up to 1 GBit of transfer rate, enjoying compressed or raw multimedia data is straightforward. Publishing a multimedia library on the local network can simply be achieved by sharing directories. On most systems it is sufficient to set the directory attributes correctly. Shared directories make file access transparent. Hence, playing a file from the local file system or streaming it from a remote shared directory appears to the user to be the same. This is only possible because recent local networks are equipped with enough spare capacity in addition to the common data flow, making real-time multimedia streaming simple.
New developments such as Universal Plug and Play [19] focus on easy integration of domestic PCs and on a new generation of home entertainment electronics. Universal Plug and Play provides services for media discovery (Simple Service Discovery Protocol, SSDP), media information exchange over HTTP/SOAP using XML documents, registered device event notification (General Event Notification Architecture, GENA) and media streaming.
So why bother with all these complicated protocols and complex transmission methods
when things seem so easy? First, it should not be forgotten that the previous example
takes place only on a local network. With today's Internet connections offering transfer
rates of only a fraction of those on a local network (about 1-2 MBit/s download at
most), it is clear that other mechanisms had to be implemented to enable real-time
streaming. With radio programs already available for a while, broadcasting TV over
the Internet (e.g. by the BBC) will certainly establish itself soon. However, there are
even more possibilities: by combining caching, multicast and real-time transmission
protocols, the cornerstones for on-demand streaming services of multimedia content
are already in place. The controversies that slow down this development are not only
the slow Internet connections or the huge investments required. Most evident is the
conflict between the user's right of use and enjoyment and the companies' media
copyrights.
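To hint at what the multicast part of such a combination involves, the sketch below joins a multicast group and reads a single RTP packet from it. The group address and port are placeholders that a real client would obtain from a session description (SDP [3]); the version check follows the RTP header layout [4].

```java
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.MulticastSocket;

// Sketch of a multicast RTP receiver. Group address and port are
// examples only; a real client would take them from an SDP description.
public class MulticastRtpReceiver {

    // The two most significant bits of the first RTP header octet
    // carry the protocol version, which is 2 for RTP as of RFC 1889.
    static int rtpVersion(byte firstOctet) {
        return (firstOctet & 0xC0) >> 6;
    }

    public static void main(String[] args) throws Exception {
        InetAddress group = InetAddress.getByName("224.2.0.1"); // example group
        MulticastSocket socket = new MulticastSocket(5004);     // even port for RTP data
        socket.joinGroup(group);
        byte[] buf = new byte[1500];                 // one MTU-sized packet
        DatagramPacket packet = new DatagramPacket(buf, buf.length);
        socket.receive(packet);                      // blocks until a packet arrives
        System.out.println("RTP version: " + rtpVersion(buf[0]));
        socket.leaveGroup(group);
        socket.close();
    }
}
```

The appeal of multicast is that the server sends each packet once, no matter how many receivers have joined the group, which is exactly what makes Internet-scale broadcasting economically feasible.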
List of Algorithms
1 open for an interval caching policy.
2 close for an interval caching policy.
3 Example of a thread-safe Singleton.
4 Simple JMF processing queue.
5 Bash script for performance testing.
List of Figures
1 Project overview. CT1, CT2 represent the different communication types.
2 OSI view of the RTSP and RTP protocols.
3 Model of a multimedia framework.
4 Model of a multimedia client/server architecture.
5 Illustration of the Interval Caching policy. Sxy represents stream x on media object y, bxy the buffer requested by Sxy.
6 Project's concept diagram.
7 Use case for system usage, management and live control by different users.
8 Use case for RealNetworks, QuickTime and custom client interaction.
9 Project's deployment diagram.
10 Server package model.
11 The Server class model.
12 The ServerThread class model.
13 The RTSPProtocol class model.
14 The RTPTransmitter class model.
15 The Session class model.
16 The VCRTimer class model.
17 The client package model.
18 The AuthenticationUI classes model.
19 The MediaBrowserUI class model.
20 The RTSPFunctions class model.
21 The AuthenticationData class model.
22 The database tables.
23 Dynamic behavior overview (without error feedback).
24 Streaming setup sequence flow diagram.
25 Activity flow on authentication. Note that AUTHENTICATE is a newly introduced RTSP method enabling authentication over RTSP. See Subsection 3.4.2.
26 VCR timer sequence model.
27 Caching of a container format.
28 Project's deployment diagram.
29 The server package.
30 The Server class.
31 The ServerThread class.
32 The RTSPProtocol family classes.
33 The RTPProcessorTransmitter class.
34 RTP streaming queue from file source to RTP packet.
35 The VCRTimer class.
36 The cache package.
37 A cache setup example. Interval 1 is the shortest and therefore first in the cache array; Interval 2 is longer than another interval and therefore third in the array.
38 The session package.
39 The client package.
40 The mp4 package.
41 The MediaDB tables.
42 The ExtJMStudio window.
43 An authentication family window.
44 A table family window.
45 A feedback family dialog.
References
[1] RFC 2326: Real Time Streaming Protocol (RTSP), http://www.faqs.org/rfcs/rfc2326.html Last visited: 17.10.2005
[2] RFC 1890: RTP Profile for Audio and Video Conferences with Minimal Control, http://www.faqs.org/rfcs/rfc1890.html Last visited: 17.10.2005
[3] RFC 2327: SDP: Session Description Protocol, http://www.faqs.org/rfcs/rfc2327.html Last visited: 17.10.2005
[4] RFC 1889: RTP: A Transport Protocol for Real-Time Applications, http://www.faqs.org/rfcs/rfc1889.html Last visited: 17.10.2005
[5] RFC 3261: SIP: Session Initiation Protocol, http://www.faqs.org/rfcs/rfc3261.html Last visited: 17.10.2005
[6] OpenML, http://www.khronos.org/openml/ Last visited: 17.10.2005
[7] Apple QuickTime, http://www.apple.com/quicktime/ Last visited: 17.10.2005
[8] GStreamer, http://gstreamer.freedesktop.org/ Last visited: 17.10.2005
[9] Java Media Framework (JMF), http://java.sun.com/products/java-media/jmf/ Last visited: 17.10.2005
[10] Helix Community, https://helixcommunity.org/ Last visited: 17.10.2005
[11] VideoLAN, http://www.videolan.org/ Last visited: 17.10.2005
[12] Apache Ant, http://ant.apache.org/ Last visited: 17.10.2005
[13] RFC 3016: RTP Payload Format for MPEG-4 Audio/Visual Streams, http://www.faqs.org/rfcs/rfc3016.html Last visited: 17.10.2005
[14] MySQL, http://www.mysql.com/ Last visited: 17.10.2005
[15] Jffmpeg, http://jffmpeg.sourceforge.net/ Last visited: 17.10.2005
[16] http://jffmpeg.sourceforge.net/ Last visited: 17.10.2005
[17] Sun Java, http://java.sun.com/ Last visited: 17.10.2005
[18] LIVE.COM liveMedia, http://www.live.com/liveMedia/ Last visited: 17.10.2005
[19] UPnP Implementers Corporation, http://www.upnp-ic.org/ Last visited: 17.10.2005
[20] Sitaram, Dinkar and Dan, Asit: Multimedia Servers: Applications, Environments,
and Design, Morgan Kaufmann Publishers 2000, ISBN 1558604308
[21] ISO/IEC International Standard 14496 (MPEG-4): "Information technology -
Coding of audio-visual objects", January 2000
[22] Hitz, Martin and Kappel, Gerti: UML @ Work, Dpunkt Verlag 2005, ISBN
3898641945
[23] Randerath, Detlef and Neumann, Christian: Streaming Media, Galileo Press 2001,
ISBN 3898421368
[24] Eidenberger, Horst and Divotkey, Roman: Medienverarbeitung in Java. Audio und
Video mit Java Media Framework & Mobile Media API, Dpunkt Verlag 2003, ISBN
3-89864-184-8
[25] Gordon, Rob and Talley, Stephen: Essential JMF: Java Media
Framework, Prentice Hall PTR 1998, ISBN 0130801046
[26] DeCarmo, Linden: Core Java Media Framework, Prentice Hall PTR 2002, ISBN
0130115193
[27] Dashti, Ali, Kim, Seon Ho and Shahabi, Cyrus: Streaming Media Server Design,
Prentice Hall PTR 2003, ISBN 0130670383