Assessing Objective and Subjective Quality of Audio/Video for
Internet based Telemedicine Applications
Dissertation Proposal
Submitted to School of Information Science
Claremont Graduate University
Bengisu Tulu
November 8, 2004
Table of Contents
Chapter 1: Introduction
Chapter 2: Problem Statement
Chapter 3: Literature Review
3.1 Telemedicine, Telehealth, and e-Health
3.2 Audio/Speech Quality Measures
3.2.1 Objective Measures
3.2.2 Subjective Measures
3.3 Video Quality Measures
3.3.1 Objective Measures
3.3.2 Subjective Measures
3.4 Quality Measurement in Telemedicine applications
3.5 Session Initiation Protocol (SIP) for Internet-based Videoconferencing
Chapter 4: Proposed Study
4.1 Stage 1 – Development of Telemedicine Taxonomy
4.1.1 Definition of Taxonomy Dimensions
4.1.2 Interaction of Proposed Dimensions
4.2 Stage 2 – Assessment of Objective and Subjective Quality Measures for Telemedicine
4.2.1 Technological Factors Affecting Quality in Telemedicine over IP Networks
4.2.2 Experimental Test-bed
4.2.3 Experimental Procedures
4.3 Stage 3 – Development of SIP-based Videoconferencing Tool with Real-time Telemedicine Capability Index
4.4 Research Methodology
4.5 Contributions and Potential Implications
4.6 Timeline
References
Chapter 1: Introduction
Health expenditures as a share of Gross Domestic Product (GDP) have been rising in the
United States and other member countries of the Organisation for Economic Co-operation and
Development (OECD) since the 1960s. A study conducted to examine the reasons for this increase
concluded that information technology (IT) can play an important role in reducing these rising costs
[1]. However, experts around the world believe that new demands in providing healthcare will
require fundamental changes in the structure of the industry. Besides the failure to disseminate
medical knowledge quickly enough or use it in a methodical manner, there is another shortfall:
medical practitioners with scarce, specialized knowledge cannot bring it to bear beyond their
geographical confines [2-4]. Telemedicine’s effort to bridge this gap has been reported
repeatedly [2, 3].
Telemedicine and associated technologies are touted as critical to solving the above-mentioned
problems. As a result of growing interest in telemedicine during the last decade [5], many
telemedicine applications have been developed and deployed during recent years [6].
Telemedicine and related healthcare technologies aim to provide efficient healthcare to improve
the well-being of patients and bring medical expertise at a lower cost to the right people at the
right time. The Internet is moving towards becoming the most widely used communication
medium around the world. Despite its objective, telemedicine has been quite slow in utilizing
Internet technologies to provide medical expertise to a larger audience. One of the reasons for
this slow adoption is the quality that can be provided through varying Internet connections.
In telemedicine, the quality of the data obtained at the receiving end of the connection is critical
for medical decisions. The final outcome of the medical session, and hence the success of the
whole process, depends on the amount and quality of the data received. Lack of necessary
information may increase frustration and dissatisfaction among the parties involved and may lead to
erroneous decisions, which can severely affect the overall outcome of a telemedicine event. To
prevent this frustration and the negative effects of an unsuccessful telemedicine experience,
current telemedicine applications are conducted with high quality equipment over high-speed
connections that are not IP based. Hence, the spread of telemedicine to areas that are
underdeveloped and lack the telecommunications infrastructure to support such
expensive connection lines and equipment has been limited. The Internet is more or less
accessible from many locations in the world for a very low cost compared to the current
alternatives used in telemedicine. Figure 1.1 illustrates an end-to-end Internet-based telemedicine
network implementing several telemedicine scenarios. For each scenario, user requirements and
available technical infrastructure may vary and this variation affects the quality of medical
decisions made in a session.
IP-based telemedicine systems are vulnerable to various impairments that can occur at the
physical, network, and application levels. Unlike circuit switched networks, the Internet is a
packet switched network where packet loss, delay, and delay variation can occur easily. In
addition, available bandwidth can vary from one location to another. On the application level,
different coding and compression techniques have been developed to enable the transmission of
audio/video data, which can consume large amounts of bandwidth. On the network level, to
provide a more reliable and stable connection, many service providers are offering Quality of
Service (QoS) features that provide some guarantee of performance such as traffic delivery
priority, speed, delay, or delay variation by prioritizing and guaranteeing bandwidth for selected
applications to achieve optimal service performance. However, implementation of these features
is still very limited.
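As a rough illustration of how the network-level impairments named above can be quantified at a receiver, the sketch below computes a loss ratio from sequence numbers and a smoothed delay-variation estimate in the style of the RTP interarrival jitter calculation (RFC 3550). The function names, the packet tuple layout, and the sample values are assumptions made for this example; they are not part of the proposed study.

```python
def loss_ratio(received_seq, first_seq, last_seq):
    """Fraction of packets in [first_seq, last_seq] that never arrived."""
    expected = last_seq - first_seq + 1
    got = len(set(received_seq))  # duplicates count once
    return (expected - got) / expected


def interarrival_jitter(packets):
    """Smoothed delay variation, following the RFC 3550 update rule.

    packets: list of (send_time, arrival_time) pairs in seconds, in
    arrival order. This layout is a hypothetical choice for the sketch.
    """
    jitter = 0.0
    prev_transit = None
    for send_time, arrival_time in packets:
        transit = arrival_time - send_time
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            # Exponential smoothing with gain 1/16, as in RFC 3550.
            jitter += (d - jitter) / 16.0
        prev_transit = transit
    return jitter
```

For example, `loss_ratio([1, 2, 4, 5], 1, 5)` reports that one packet in five was lost, and a stream whose transit time never changes yields a jitter of zero.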
Figure 1.1 End-to-End IP-based Telemedicine Networks
Even though information quality is a critical factor in medical decision making, some decisions
may not require the highest quality of information possible. Certain decisions can be made with
limited quality. An examination of current knowledge reveals that there is no study focusing on
answering the following question: “For any given communication channel and a given
telemedicine purpose, what is the right amount of information to transfer given the limitations of
communication technology and the devices in use for specific scenarios?” Knowing the
boundary for the minimum information required can help us utilize limited channels and prevent
unnecessary information load on larger channels.
Today, there is an increased awareness of the errors that may occur in medicine and the lack of
decision support systems and tools that can help physicians. The problem is particularly acute
when it comes to telemedicine. To support physicians who are involved in
telemedicine events, this study proposes to develop and utilize objective and subjective
audio/video quality measures to calculate a real-time quality index given a specific telemedicine
setting for IP-based networks. Study participants will use this index when deciding
whether a telemedicine setting can provide the quality required
to complete a specific task in a given application domain. Early evaluation of the existing setting
can prevent frustration and lost time while improving response time and satisfaction in
telemedicine events. First, this study will investigate the factors that affect the quality of
information required in a telemedicine event, which will lead to a taxonomy of telemedicine
applications. Second, because audio and video are the two data formats most
affected by these factors, objective and subjective quality measures for providing
real-time feedback to the participants of a telemedicine event will be collected in an experimental
test-bed. Finally, a videoconferencing tool will be developed that can predict and present the perceived quality of a
telemedicine session based on objective values collected in real time.
Contributions of this study are threefold. First, it will provide a telemedicine taxonomy for
classifying different telemedicine events based on factors affecting quality and outcome. Second,
it will develop a subjective measure database for telemedicine in two application domains and
investigate a heuristic measure to predict perceived quality of a telemedicine event in real time.
Third, it will implement this heuristic method in a new artifact, a videoconferencing tool with a
telemedicine capability indicator, capable of measuring objective quality and providing
subjective quality feedback to users in real time.
One limitation of this study is that the subjective measures will be developed only for two
specific telemedicine events (listening to heart beats and viewing an eye image). The
generalizability of these measures to other areas of telemedicine is unknown; hence future
research will be needed to test and expand these quality measures to other application domains
and telemedicine events. Another limitation may arise from the number of subjects that will be
recruited for subjective tests. To minimize the effects of this limitation, ITU’s minimum
requirements for subjective tests will be the criteria for recruiting subjects.
Chapter 2: Problem Statement
Since the introduction of the term “telemedicine”, various studies have outlined how one can
utilize this new technology and reap its benefits. However, telemedicine has remained a “black box”
for the public because even the authorities have not yet reached a consensus on
a clear and precise definition of telemedicine content and boundaries [7]. What is telemedicine
and how does it differ from traditional medicine? What are the necessary new laws and
regulations to bring this technology across the globe?
Despite government support and continued reductions
in equipment and transmission costs, there have been relatively few actual
implementations of telemedicine. Among the many barriers listed in the literature, the lack of
information about the effect of telemedicine on cost, quality, and access has been a significant
one [8]. It is important to analyze and predict the success of future applications to make
better decisions. In telemedicine, the quality of the data obtained at the receiving end of the
connection is critical for the medical decision. The final outcome of the telemedicine session
and hence the success of the entire process depends on the amount and quality of the data
received. If the information necessary to make a decision cannot be retrieved during the
telemedicine event, both parties involved in the process will feel frustration and dissatisfaction,
which can severely affect the results of the telemedicine visit. To prevent this frustration and the
negative effects of an unsuccessful telemedicine experience, current telemedicine applications
are conducted over high-speed connections that are not IP based. One drawback of this approach
is the barrier it poses to the spread of telemedicine to areas that are
underdeveloped and that lack the telecommunications infrastructure to support such
expensive connection lines. On the other hand, the Internet is more or less accessible from many
locations worldwide for a very low cost compared to the current alternatives used in
telemedicine.
Use of the Internet for telemedicine has been studied for the last ten years, and some of these
studies have demonstrated that even on Internet and IP-based connections it is feasible to
conduct telemedicine sessions, but no study has been able to provide convincing evidence to the
telemedicine community for its widespread adoption. The unreliable connection properties of
IP-based systems prevent the spread of these applications. However, if one can measure the
predicted quality that can be obtained for a specific telemedicine setting and compare the
predicted quality with the requirements outlined by the parties involved, then a feasibility and
capability value can be presented to the parties involved and a decision to either continue with
the telemedicine event or switch to an alternative method can be made. Early evaluation of the
existing setting can prevent frustration and time loss while enhancing the response time and
satisfaction. The evaluation method described here addresses the existing gap in the
literature on predicting the quality of telemedicine event settings.
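The comparison described in this paragraph, predicted quality against the quality the parties require, can be sketched in a few lines. The function name, the use of a MOS-like 1 to 5 scale for both values, and the example numbers are hypothetical; the proposal does not fix a particular scale or threshold at this point.

```python
def capability_decision(predicted_quality, required_quality):
    """Compare predicted quality with the stated requirement.

    Both values are assumed to share one scale (here a MOS-like
    1-5 scale, which is an assumption of this sketch).
    Returns (capable, margin): whether the setting meets the
    requirement, and by how much it exceeds or misses it.
    """
    margin = predicted_quality - required_quality
    return margin >= 0, margin


# Hypothetical session: predicted 3.6 against a requirement of 4.0.
capable, margin = capability_decision(3.6, 4.0)
if capable:
    print("Continue with the telemedicine event")
else:
    print("Consider switching to an alternative method")
```

Returning the margin alongside the yes/no answer lets a tool show how close a setting is to the requirement instead of only a pass or fail.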
Existing studies [9] indicate the importance of studying the quality of existing or future
applications.
“Use of quality improvement process not only results in improved output quality but it
also makes the production process sensitive to changes in input, output and the
environment.” [10]
Quality is important; however, the channel used to deliver this quality information is limited.
Thus, there is a need to understand the requirements of each application in its own domain and
define the amount and quality of information required to provide telemedicine services. It is
important to classify telemedicine applications based on their potential use by taking the medical
domains they serve into consideration. Then one can identify the IT infrastructure needs and
requirements for each of these applications in order to provide a satisfactory telemedicine
experience to end users. There are a variety of applications, devices, and communication
technologies that are used in telemedicine. The reasons for this variety are: (1) the diversity of
telemedicine locations and physical limitations of each location; (2) the application areas that
utilize the telemedicine applications; and (3) the purpose for the use of telemedicine.
Communication infrastructure technologies, such as telephone lines or leased lines, also have a
critical impact on the applications utilized in telemedicine, and hence, on the outcome. A
telediagnosis case in the psychiatry domain and a teleconsultation case in the telecardiology domain
are expected to have different requirements since the information that is necessary to make a
clinical decision differs based on the application domain. Therefore, it is not reasonable to expect
similar results from the same technology when it is being used in different domains and/or for
different purposes.
With the goal of solving some of the problems introduced above, this proposed study will try
to answer the following research questions:
RQ1. What are the factors that affect the quality of information required in a telemedicine
event?
RQ2. Given the current set of objective and subjective audio and video quality
measurements, which ones – if any – provide the appropriate quality assessment for IP-based
telemedicine systems that can aid decision makers?
RQ3. Using the appropriate objective and subjective quality assessment measures for
telemedicine, is it possible to integrate them as a capability index in a SIP-based desktop
videoconferencing tool to provide real time feedback for decision makers regarding quality?
Chapter 3: Literature Review
In this chapter, an overview of literature in telemedicine as a concept, voice and video quality
measurement techniques in general, and quality measurement in telemedicine is presented. The
first section introduces a brief summary of telemedicine literature including various definitions
of telemedicine and related terminology, as well as how technological changes affected the
evolution of telemedicine applications. The next two sections provide a summary of objective and
subjective quality measures developed and utilized in prior research and practice for voice and
video, respectively. The fourth section summarizes the quality measurement literature in
telemedicine and provides a list of factors used in these quality measurement studies. The last
section introduces the Session Initiation Protocol, which will be utilized in this study to develop
a videoconferencing application for telemedicine.
3.1 Telemedicine, Telehealth, and e-Health
Telemedicine has various potential uses such as clinical, educational and administrative. The
promising potential of bringing high quality service to under-served areas via telemedicine is an
example of how IT can reduce the quality-adjusted cost. Bashshur [7] notes that telemedicine
provides a solution to problems such as access to care for large segments of the population,
continuing healthcare cost inflation, and uneven geographic distribution of quality by: (1)
enhancing accessibility to care for underserved populations, (2) containing cost inflation as a
result of providing appropriate care to remote patients in their home communities, and (3)
improving quality as a result of providing coordinated and continuous care for patients, targeted
and highly effective continuous education for providers, and highly effective tools for decision
support.
The evolution and growth of telemedicine is highly correlated with the developments in
communication technology and IT software development. This dependency is evident if we
quickly browse through the history of telemedicine technologies, which was categorized into
three eras [11]. All the definitions during the first era of telemedicine focused on medical care as
the only function of telemedicine. The first era can be called the telecommunications era of the
1970s [11]. Telemedicine programs during the first era ended as the government terminated
funding before these programs matured. It is important to note that “telemedicine is a product of
the information age, just as the assembly line was the product of the industrial age” [11]. The
applications in this era depended on broadcast and television technologies, and
telemedicine applications were not integrated with any other clinical data.
The second era of telemedicine, the dedicated era, started during the late 1980s as a result of
digitization in telecommunications, and it grew during the 1990s [11]. The transmission of data was
supported by various communication means ranging from telephone lines to Integrated Services
Digital Network (ISDN) lines. The high costs attached to the communications infrastructure that
can provide higher bandwidth became an important bottleneck for telemedicine.
The dedicated era turned into the Internet era, in which more complex and ubiquitous networks
support telemedicine. This third era of telemedicine is supported by technology that is
cheaper and accessible to an increasing user population [11]. The enhanced speed and quality
offered by Internet2 is providing new opportunities in telemedicine as well. In this new era of
telemedicine, the research strategies should include “…an understanding of the functional
relationships between telemedicine technology and the outcomes of cost, quality, and access”
beyond the assessment of technical sufficiency [11].
During the evolution of telemedicine, new terminologies were developed as the applications
and delivery options increased in variety, and the application areas expanded to almost all the
fields medicine can cover. This resulted in confusion and misidentification of what could be
termed telemedicine and what could be termed telehealth or e-health. This became even more
complicated as these fields advanced. Cybermedicine is yet another term introduced lately into
the literature.
Since the first formal definition of telemedicine by Bird in 1971, many researchers tried to
define this term in order to clarify the boundaries of telemedicine and its use. Even though the
core of these definitions is the same, telemedicine, and hence its definition, evolved dramatically
as a result of the tremendous changes experienced in the telecommunication and information
technologies. These changes were so significant that new terminologies like telehealth, e-health,
and others were introduced, and explaining the difference between telemedicine and these new
terms became important. Studies defined telehealth as a big umbrella that encompasses more
applications than the definition of telemedicine can cover [12, 13]. Table 3.1 presents a selected
list of definitions proposed in the literature for telemedicine, telehealth, and e-health.
This list of definitions gives an indication of the competing terminologies; more terminologies
may be introduced in the future as further technological advances are achieved. Therefore, it is
important to understand that the purpose of research in this field is to support the “ultimate
quest” which is to cure disease, prevent it if possible, reduce infirmity, and enhance quality of
life, as stated by Bashshur [7].
“Some may question whether this is telemedicine, telehealth, e-health, health informatics,
or biohealth informatics. It does not really matter what we call it or where we draw boundaries.
…collective and collaborative efforts from various fields of science, including what we call now
telemedicine is necessary. [7]”
Table 3.1. Definition of terms

| Definition | Ref. |
|---|---|
| Telemedicine is the practice of medicine without the usual physician-patient confrontation …via an interactive audio-video communications system. | 1971 Bird [11] |
| Telemedicine is a system of care composed of six elements: (1) geographic separation between provider and recipient of information, (2) use of information technology as a substitute for personal or face-to-face interaction, (3) staffing to perform necessary functions (including physicians, assistants, and technicians), (4) an organizational structure suitable for system or network development and implementation, (5) clinical protocols for treating and triaging patients, and (6) normative standards of behavior in terms of physician and administrator regard for quality of care, confidentiality, and the like. | 1975 Bashshur [11] |
| Telemedicine is the use of electronic information and communications technologies to provide and support healthcare when distance separates the participants. | 1996 Committee on Evaluating Clinical Applications of Telemedicine [14] |
| Telemedicine is the delivery of health services when there is geographical separation between healthcare provider and patient, or between healthcare professionals. | 2001 Miller [8] |
| Telemedicine is the provision of healthcare services, clinical information, and education over a distance using telecommunication technology. | 2001 Maheu [13] |
| Telehealth is the removal of time and distance barriers for the delivery of healthcare services or related healthcare activities. (In this study, telemedicine is a subset of telehealth) | 2001 American Nurses' Association [12] |
| E-health refers to all forms of electronic healthcare delivered over the Internet, ranging from informational, educational, and commercial “products” to direct services offered by professionals, non-professionals, businesses or consumer themselves. | 2001 Maheu [13] |
It is important to note that the ultimate goal of any telemedicine effort is to improve the
well-being of patients. However, since the first definition of the term, uncertainty about the
meaning of telemedicine has become evident. This uncertainty is hindering efforts to
develop a clear definition and a classification method for telemedicine. Previous attempts [14]
to classify telemedicine were motivated by the demand for evidence of its effectiveness and
therefore focused on developing a strategy to evaluate telemedicine applications and
their effects on the quality, accessibility, or cost of healthcare. In 1996, the Committee on Evaluating
Clinical Applications of Telemedicine published a report [14] that classified clinical applications
of telemedicine under six categories (p. 30): (1) initial urgent evaluation, (2) supervision of
primary care, (3) provision of specialty care, (4) consultation, (5) monitoring, and (6) use of
remote information and decision analysis resources to support or guide care for specific patients.
The broad classification that will be developed for this study, which is more focused on
identifying different dimensions of telemedicine and telehealth, and which can then be used to
identify user requirements for different categories in an organized manner, is expected to have a
positive impact on the use and development of current and future applications.
3.2 Audio/Speech Quality Measures
Speech quality measurement has been researched for many years, and its results have been utilized in
public switched telephone networks (PSTN). The goal of many studies was to find an objective
measure that can be used to predict the subjective quality perceived by a human subject. Many
objective and subjective speech (audio) quality measures were developed. However, most of
these measures, if not all, were originally developed for the PSTN, which is a
circuit-switched network, and recent research indicates that these measures may not work well for
packet-switched networks, such as those carrying Internet telephony [15]. In his work, Hall [15] compared three
different measures: (1) perceptually weighted distortion measures such as enhanced
modified Bark spectral distance (EMBSD) and measuring normalizing blocks (MNB), (2)
word-error rates of continuous speech recognizers, and (3) the ITU E-model, under conditions of a
typical VoIP system. The results of his study indicate that E-model provides the highest
correlation with Mean Opinion Score (MOS) for VoIP systems. This section summarizes the
most popular quality measures with a brief description of each one.
3.2.1 Objective Measures
One of the most widely adopted objective speech quality measures is the Signal-to-Noise Ratio (SNR),
which compares original and processed speech signals sample by sample. The SNR is the
simplest measure possible as it measures the distortion of the waveform coders that reproduce
the input waveform [16]. The SNR is also defined as “ratio of the energy of the original target
source to the energy of the difference between original and reconstruction – that is, the energy of
a signal which, when linearly added to the original, would give the reconstruction”[17]. A
modified version of the SNR is called segmental Signal-to-Noise Ratio (SNRseg), which
decomposes the entire signal into segments and calculates the average SNR of these short
segments [16]. These measures are easy to compute; however, their disadvantages limit their use
in various scenarios. First of all, SNR measures require access to the original signal, which
rules them out for real-time measurements. Other drawbacks of these time-domain
measures are reported in [16, 17].
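The two time-domain measures described above can be sketched as follows. This is a minimal illustration using plain Python lists; the function names, the fixed non-overlapping segmentation, and the decision to skip segments with zero signal or zero noise energy are assumptions of this sketch and are not taken from [16] or [17].

```python
import math


def snr_db(original, processed):
    """SNR in dB: energy of the original over energy of the difference."""
    signal_energy = sum(x * x for x in original)
    noise_energy = sum((x - y) ** 2 for x, y in zip(original, processed))
    if noise_energy == 0:
        return math.inf   # processed signal is identical to the original
    if signal_energy == 0:
        return -math.inf  # silent original with a noisy reconstruction
    return 10.0 * math.log10(signal_energy / noise_energy)


def segmental_snr_db(original, processed, segment_len):
    """SNRseg: mean of the per-segment SNR values, in dB."""
    values = []
    for start in range(0, len(original) - segment_len + 1, segment_len):
        seg_x = original[start:start + segment_len]
        seg_y = processed[start:start + segment_len]
        sig = sum(x * x for x in seg_x)
        noise = sum((x - y) ** 2 for x, y in zip(seg_x, seg_y))
        # Skip degenerate segments (a simplification of this sketch).
        if sig > 0 and noise > 0:
            values.append(10.0 * math.log10(sig / noise))
    return sum(values) / len(values) if values else math.inf
```

Both functions take the original samples as an argument, which makes concrete why these measures cannot be computed at a receiver that only sees the processed signal.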
When speech quality must satisfy human listeners, there is no better way than performing
subjective tests. However, due to the cost of such evaluations, researchers often utilize
algorithms that can estimate the outcomes of these tests. These algorithms can be grouped under
perceptual models, whose measures are based on human auditory perception models [16]. One
example for these perceptual models is Bark Spectral Distortion (BSD) [18], which is based on
the assumption that speech quality is directly related to speech loudness (the magnitude of
auditory sensation). It works well when the distortion in voiced regions represents the overall
distortion, and hence identifying the voiced regions is required [16]. An Enhanced Modified
BSD, which consists of a perceptual transform followed by a distance measure that incorporates
a cognition model, was also proposed. Based on the tests in [16], its correlation with subjective
results is relatively good for encoding impairments but poor for network impairments.
Another example of perceptual models for estimating subjective quality is the ITU
Recommendation P.861, Perceptual Speech Quality Measure (PSQM). The PSQM algorithm
measures the distortion experienced by a speech signal in an internal psychoacoustic domain
when transmitted through various codecs and transmission media. The transformation from the
physical domain to the loudness domain is used to mimic the sound perception of human subjects in
real-life situations. An extension of PSQM, named PSQM+, improves the performance of its
predecessor by adopting a simple algorithm in the cognition module. This improves the poor
performance of PSQM for temporal clipping distortions, but the performance of PSQM+ for other
types of distortion is questionable [16]. There are other examples of perceptual models in the
literature, such as Measuring Normalizing Blocks (MNB) [19]. However, each of these
measures has its limitations for certain impairment types.
The most commonly used objective speech quality measure is the ITU’s E-model, which was
originally developed to evaluate the speech quality for PSTN. It takes into account multiple
variables such as encoding distortion, delay, jitter, echo, etc. As mentioned above, the E-model
provides the closest correlation to MOS results among other measures discussed in this section
[15]. One important advantage of this model is that it does not require access to the original
speech signal and hence it can be used for real-time quality assessments [16]. The E-model
generates the rating R and the formula for its computation is provided below:
$$R = R_0 - I_s - I_d - I_e + A$$
$R_0$, the highest possible rating for this system with no distortion [15, 16], is the basic
signal-to-noise ratio based on send, receive loudness, electrical, and background noise [20]. $I_s$ is the
impairment of the speech signal itself [16] and captures impairments that happen simultaneously
with the voice signal, such as sidetone and PCM quantizing distortion [20]. These two values do
not depend on the transmission over the network. $I_d$ is the impairment level caused by delay,
jitter, and echo. $I_e$, also known as the “equipment factor” [20], is the level of impairments caused
by encoding and hence captures the degradation in quality due to compression and loss during
transmission. $A$ stands for the advantage factor that captures the willingness of users to accept
some degradation of quality in return for the other benefits the system may provide such as
mobility in the case of cellular phones. E-model values can be directly matched with MOS
values by using a simple table provided in the standard.
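As a minimal sketch of how the rating is combined and interpreted, the following Python fragment computes R from its components and maps it to an estimated MOS. The mapping uses the closed-form conversion given in ITU-T G.107 rather than the lookup table mentioned above, and the impairment values in the example are hypothetical, chosen only for illustration.

```python
def e_model_rating(r0, i_s, i_d, i_e, a=0.0):
    """Transmission rating R = R0 - Is - Id - Ie + A."""
    return r0 - i_s - i_d - i_e + a


def rating_to_mos(r):
    """Estimate MOS from R with the closed-form G.107 conversion."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


# Hypothetical component values (not taken from any measurement).
r = e_model_rating(r0=93.2, i_s=0.0, i_d=10.0, i_e=11.0, a=0.0)
estimated_mos = rating_to_mos(r)
print(round(r, 1), round(estimated_mos, 2))
```

Because none of the inputs requires the original speech signal, a calculation of this kind can run continuously during a call.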
3.2.2 Subjective Measures
Measuring subjective quality of speech has been an important issue since the transmission of
audio over telephone networks began. Over the years, standards emerged based on the results of
various studies carried out in various laboratories. Today, ITU recommendations are the most
widely used standards utilized by researchers while working on quality assessment methods. The
ITU-T P.800 provides numerous methods for the subjective assessment of transmission quality.
Scales for these methods are provided in Table 3.2. Results from 5-point category scales are
averaged across participant responses to provide a Mean Opinion Score (MOS).
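The averaging step can be sketched in a few lines of Python. The ratings below are hypothetical listener responses on the 5-point listening quality scale; the standard deviation is reported alongside the mean only as an illustrative measure of spread.

```python
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Average 5-point category ratings (1 = Bad ... 5 = Excellent)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    return mean(ratings)


ratings = [4, 5, 3, 4, 4, 2, 5, 4]  # hypothetical participant responses
mos = mean_opinion_score(ratings)
spread = stdev(ratings)
print(mos, round(spread, 2))
```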
Table 3.2. Speech Quality Measurement Scales provided by ITU-T recommendations

Listening Quality Scale
  Quality of the speech/connection                        Score
  Excellent                                               5
  Good                                                    4
  Fair                                                    3
  Poor                                                    2
  Bad                                                     1

Conversation Difficulty Scale
  Did you or your partner have any difficulty in
  talking or hearing over the connection?                 Score
  Yes                                                     1
  No                                                      2

Listening Effort Scale
  Effort required to understand the meaning of
  the sentences                                           Score
  Complete relaxation possible; no effort required        5
  Attention necessary; no appreciable effort required     4
  Moderate effort required                                3
  Considerable effort required                            2
  No meaning understood with any feasible effort          1

Loudness Preference Scale
  Loudness Preference                                     Score
  Much louder than preferred                              5
  Louder than preferred                                   4
  Preferred                                               3
  Quieter than preferred                                  2
  Much quieter than preferred                             1

Comparison Category Rating Scale
  The quality of the second compared to the first is      Score
  Much better                                             3
  Better                                                  2
  Slightly better                                         1
  About the same                                          0
  Slightly worse                                          -1
  Worse                                                   -2
  Much worse                                              -3

Degradation Opinion Scale
  Degradation                                             Score
  Degradation is inaudible                                5
  Degradation is audible but not annoying                 4
  Degradation is slightly annoying                        3
  Degradation is annoying                                 2
  Degradation is very annoying                            1
Watson and Sasse [21] criticized these recommended scales, with respect to speech in real time
multimedia communications, in three main areas: (1) vocabulary of the scale labels, (2) length of
the recommended test material, and (3) conversation difficulty scale. They note that transmission
of speech in real time over IP-based networks may be carried on low bandwidth connections and
is subject to various network impairments. Their first criticism, regarding scale labels, is
therefore that even with training, responses are likely to be skewed towards the
lower end of the scale. Regarding their second criticism, they note that the recommended test
length of 10 seconds is too short in duration to understand the rapid and unpredictable changes
that can occur in speech quality due to changes in network conditions. And finally, they criticize
the binary scale by arguing that even a small amount of packet loss is likely to cause difficulty in
hearing or talking, even if it is short-lived. In one study [22] they proposed a new subjective
quality scale termed polar continuous quality scale, which was shown to be a reliable means of
measuring perceived quality. During their experiments, users were consistent in their use of it
and the rating trend followed the same slope obtained with MOS. One other important finding of
their study was that the perceived quality of speech is not affected by network impairments as
much as it is affected by factors such as volume discrepancies, poor quality microphones, or
echo.
3.3 Video Quality Measures
Video quality first became an important issue in television broadcasting applications.
Various measures have been developed for analog video systems to evaluate the effects of
transmission on the original video signal. However, today, digital video systems are replacing
these analog systems and are becoming an essential part of the U.S. and world economy [23].
Wolf and Pinson [23] state that, “To be accurate, digital video quality measurements must be
based on the perceived quality of the actual video being received by the users of the digital video
system rather than the measured quality of traditional video test signals (e.g., color bar)”. Hence,
new measurement techniques for measuring quality of digital video signals are being developed
by various researchers and organizations. This section presents a survey of existing objective and
subjective video quality measurement techniques utilized in the literature.
3.3.1 Objective Measures
Peak Signal-to-Noise Ratio (PSNR) is the most commonly used metric for measuring video
and image quality. It measures how close a processed sequence is to the original one [16]. The
calculation of the PSNR for a video sequence of K frames each having NxM pixels with m-bit
depth is explained below [16]. First, the Root Mean Square Error (RMSE) is calculated according to the
following formula:
$$\mathrm{RMSE} = \sqrt{\frac{1}{K \cdot M \cdot N} \sum_{k=1}^{K} \sum_{i=1}^{N} \sum_{j=1}^{M} \left[ x(i,j,k) - \hat{x}(i,j,k) \right]^{2}}$$
where $x(i,j,k)$ and $\hat{x}(i,j,k)$ are the pixel luminance values at location $(i,j)$ in frame $k$ for
the original and distorted sequences respectively. Once the RMSE is calculated, the PSNR can be
calculated using the following formula:
$$\mathrm{PSNR} = 10 \cdot \log_{10} \frac{m^{2}}{\mathrm{RMSE}^{2}}$$
The PSNR is usually reported in decibels (dB) [24]. An image with a PSNR of 25 dB or below
is usually unacceptable. Between 25 dB and 30 dB, perceived quality usually improves, and
above 30 dB, images are often perceived to be as good as the original image. Markopoulou [25] notes
that the PSNR is exclusively used as a quality measure, partly because of its mathematical
tractability and partly because of the lack of better alternatives. It has also been noted [16] that
the PSNR does not always correlate well with subjective measures.
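The calculation can be sketched in plain Python as follows. This is an illustration rather than the procedure used in [16]: it takes the numerator to be the peak pixel value, 2^m − 1 for m-bit samples, which is the conventional definition and an assumption about how m is meant in the formula above. Sequences are represented as nested lists indexed as frames[k][i][j], and the function names are hypothetical.

```python
import math


def rmse(original, distorted):
    """Root mean square error over all luminance values frames[k][i][j]."""
    total, count = 0.0, 0
    for frame_o, frame_d in zip(original, distorted):
        for row_o, row_d in zip(frame_o, frame_d):
            for x, x_hat in zip(row_o, row_d):
                total += (x - x_hat) ** 2
                count += 1
    if count == 0:
        raise ValueError("sequences must contain at least one pixel")
    return math.sqrt(total / count)


def psnr(original, distorted, bits=8):
    """PSNR in dB, assuming a peak value of 2**bits - 1."""
    peak = 2 ** bits - 1
    err = rmse(original, distorted)
    if err == 0:
        return math.inf  # identical sequences
    return 10 * math.log10(peak ** 2 / err ** 2)


# One 2x2 frame in which every pixel is off by 10 luminance levels.
reference = [[[100, 100], [100, 100]]]
processed = [[[110, 110], [110, 110]]]
print(round(psnr(reference, processed), 2))
```

Both sequences must be available to the function, which is the practical limitation of full-reference metrics such as the PSNR.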
One other commonly used metric is the “Video Quality Metric (VQM)”[23], which was
developed by the Institute for Telecommunication Sciences (ITS). It requires the extraction and
classification of features from both the original and processed video sequences, similar to the
other measurement techniques. Once these features are extracted, the distance between the
original and processed video sequences is computed based on these features, and this distance
is mapped to a subjective score [23]. Compared to the PSNR, this metric offers different models
for various transmission types, such as videoconferencing or TV models. It is also possible to
identify the nature of an impairment using the VQM, which the PSNR does not provide [25].
There are other standard and proprietary measurement techniques that have been developed
and reported in the literature that are not mentioned here. One commonality between these
objective measures, however, is that they require access to both original and processed video
sequences. One recent study [16] proposed a new measure, which does not require access to the
original video sequence, similar to the idea of the E-model for voice quality. In this new method,
artificial neural networks (ANN) are used to predict perceived voice and video quality using a
trained engine based on previous objective and subjective tests. This type of measurement
technique enables real-time measurement of video quality and is an open area for research.
3.3.2 Subjective Measures
ITU-R BT.500 is the standard for subjective assessment of image quality and has evolved over
the years to include measures for digital video transmissions as well. This standard provides
scales for single and double stimulus methods. The Absolute Category Rating (ACR) is a single
stimulus method where test sequences are presented one at a time and are rated on a category
scale after they are viewed. Usually a 5-point category scale is used as illustrated in Table 3.3.
The Single Stimulus Continuous Quality Evaluation (SSCQE) is different from the ACR in terms
of the scale it uses and the assessment process. The scale used in the SSCQE is a continuous
quality scale, illustrated in Figure 3.1, and assessment takes place in a continuous manner during
the presentation of the video sequence.
Table 3.3 ITU Video Quality Assessment Scales

5-point Quality Scale            5-point Impairment Scale
Estimated Quality     Score      Estimated Impairment Level     Score
Excellent             5          Imperceptible                  5
Good                  4          Perceptible                    4
Fair                  3          Slightly Annoying              3
Poor                  2          Annoying                       2
Bad                   1          Very Annoying                  1
Figure 3.1 Continuous 5-point Quality Scale
Figure 3.2 Continuous 5-point Quality Scale for DSCQS
Among the double stimulus methods, the Double Stimulus Impairment Scale (DSIS), also
known as the Degradation Category Rating (DCR), presents pairs of video sequences during the
test, with the original sequence followed by the impaired one. In this case, subjects are asked to rate the impairment of
the second stimulus with respect to the reference (first stimulus) using the 5-point impairment
scale illustrated in Table 3.3. In the Double Stimulus Continuous Quality Scale (DSCQS)
method, the sequences are presented in pairs as in the DSIS and subjects are asked to evaluate
the quality of both sequences. The original sequence is included for reference; however, the
observers are not told which one is the reference sequence and the order of appearance changes
for each test. The scale used in this method is illustrated in Figure 3.2. There are other test
methods where the two sequences are shown simultaneously and the observers are asked to make
a comparison of the two based on a stimulus comparison scale.
3.4 Quality Measurement in Telemedicine applications
Quality in telemedicine has been studied from different perspectives in the literature. As a
common way of assessing quality of a telemedicine event, user satisfaction was used in a large
number of articles. Another approach common in the literature is to study the quality of the
transmitted media (image, audio, etc.). These studies have usually been limited to the
compression techniques and their effects on the perceived quality of the users. For example,
Eikelboom [26] investigated image compression of digital retinal images and the effect of
various levels of compression on the quality of the images. They compared JPEG and Wavelet
image compression techniques and concluded that “for situations where digital image
transmission time and costs should be minimized, Wavelet image compression to 15 KB is
recommended, although there is a slight cost of computational time. Where computational time
should be minimized, and to remain compatible with other imaging systems, the use of JPEG
compression to 29 KB is an excellent alternative”.
To answer the question of which compression technique is better in a generic way, some
studies focused on quality measures. Cosman et al. [27] studied an interesting question, “How
does one decide if an image is good enough for a specific application, such as diagnosis, recall
archival, or educational use?”, and compared and contrasted three approaches to the
measurement of medical image quality: the signal-to-noise ratio (SNR), a subjective rating, and
diagnostic accuracy. They concluded that there is a need for computable measures of image
quality that can accurately predict the outcomes of image quality evaluation studies. Another
recent article on image quality by Przelaskowski [28] states that, “A numerical measure, which is
able to predict diagnostic accuracy rather than subjective quality, is required for compressed
medical image assessment.” A new vector measure for image quality, reflecting diagnostic
accuracy, was developed in that study.
A recent study by Rosenthal [24] focused on understanding the impact of certain variables
affecting the transmission of video over IP networks. This study is one of the few studies that
investigated the effects of network impairments and the codec bit rate on the quality of video on
IP networks for telemedicine purposes. This study used the PSNR and a proprietary objective
measurement technique, the Picture Quality Rating (PQR). His findings suggest that increases
in codec bit rate and network bandwidth have positive effects on the PQR and the PSNR levels
for sequences subjected to delay and jitter impairments, but not for those in which periodic
packet drops were introduced. He concludes that with or without the existence of selected
packet-specific impairments, increases in bandwidth and codec bit rate improve the objective
quality of video transmitted over IP networks. Another study by Dev et al. [29] presented a
method to obtain an end-to-end characterization of the performance of an application over a
network by taking into account network impairments and application constraints. The
applications selected for testing were two medical education tools: (1) an image serving
application that delivers a sequence of linked images based on user movement of the mouse
cursor, and (2) an application intended to train students remotely in various surgical procedures.
They were tested on four different types of networks. They propose that the subjective
evaluations used in their study can be utilized to predict the conditions under which the
application will be running based on predefined requirements.
3.5 Session Initiation Protocol (SIP) for Internet-based Videoconferencing
Session Initiation Protocol (SIP), a recent Voice over IP signaling standard approved by the IETF,
is attracting telemedicine application developers because it can handle voice, video, and other
multimedia communications over IP-based networks and has a native security
mechanism built in. Until the introduction of SIP, the only standard available for
videoconferencing applications was the H.323 family of ITU standards. However, the H.323
standard does not lend itself to integration with web and messaging, and does not have a native
security mechanism. With the increasing importance of security in the medical field, additional
effort and integration with other security mechanisms are necessary to provide authentication and
authorization. A brief technical summary of SIP is provided in this section.
Session Initiation Protocol (SIP) is the Internet Engineering Task Force (IETF) standard for IP
Telephony [30]. It is an application layer control protocol that can create, modify, and terminate
multimedia sessions. Different types of entities are defined in SIP: user agents, proxy servers,
redirect servers, and registrar servers. Figure 3.3 shows a typical SIP session including these
entities.
Figure 3.3 Entities in Session Initiation Protocol
In a typical SIP session, user agents first register with a registrar and forward their requests to a
SIP proxy, which is responsible for discovering the location of the requested destination so that
two user agents can negotiate their session description [31]. Figure 3.4 illustrates a simple call
flow with a single proxy server.
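To make the message exchange concrete, the sketch below builds a minimal INVITE request and lists one plausible message sequence through a single proxy. It follows the general request format and header set of RFC 3261, but it is not a reproduction of Figure 3.4: the addresses, host names, and identifiers are hypothetical, no session description body is attached, and the sequence assumes the proxy does not stay in the signaling path after the session is established.

```python
def build_invite(caller, callee, caller_host, call_id, branch, tag):
    """Return a minimal SIP INVITE request as text (no SDP body)."""
    user = caller.split("@")[0]
    lines = [
        f"INVITE sip:{callee} SIP/2.0",
        f"Via: SIP/2.0/UDP {caller_host};branch={branch}",
        "Max-Forwards: 70",
        f"To: <sip:{callee}>",
        f"From: <sip:{caller}>;tag={tag}",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{user}@{caller_host}>",
        "Content-Length: 0",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


# (sender, receiver, message) for a simple call through one proxy.
CALL_FLOW = [
    ("caller", "proxy", "INVITE"),
    ("proxy", "callee", "INVITE"),
    ("proxy", "caller", "100 Trying"),
    ("callee", "proxy", "180 Ringing"),
    ("proxy", "caller", "180 Ringing"),
    ("callee", "proxy", "200 OK"),
    ("proxy", "caller", "200 OK"),
    ("caller", "callee", "ACK"),
    ("caller", "callee", "BYE"),
    ("callee", "caller", "200 OK"),
]

# Hypothetical endpoints used only for illustration.
invite = build_invite(
    caller="physician@hospital.example",
    callee="specialist@clinic.example",
    caller_host="ws1.hospital.example",
    call_id="a84b4c76e66710@ws1.hospital.example",
    branch="z9hG4bK776asdhds",
    tag="1928301774",
)
print(invite)
```

The audio and video streams themselves are not carried in these messages; SIP only sets up, modifies, and tears down the session.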
Using SIP for telemedicine is relatively new compared to the previous H.323 standard that has
been the dominant protocol since the mid-1990s. SIP was first introduced in 1999 and in 2002 a
revised version of the protocol was published. Since then, it has become the commonly used protocol
for Internet telephony and videoconferencing applications. However, adoption of SIP for
telemedicine has been slow. A small number of studies [32, 33] have mentioned how SIP can
be used in telemedicine applications.
Figure 3.4 SIP Call Flow
Chapter 4: Proposed Study
The proposed study consists of three stages. During the first stage of this study, a telemedicine
taxonomy will be developed to classify telemedicine applications based on their potential use
and by taking the medical domains they serve into consideration. The goal is to identify the IT
infrastructure needs and requirements for each of these applications in order to provide a
satisfactory telemedicine experience to end-users. The second stage of this study will first review
the available objective and subjective audio/video quality measurements in the literature and
select appropriate measures for telemedicine environments, keeping in mind the proposed
taxonomy dimensions, which, as has been discussed, can play an important role in the
decision making process during telemedicine events. Experiments will then be conducted with
physicians to identify the subjective audio/video quality required to make a decision in a
telemedicine event for a specific application area and purpose over IP-based telemedicine networks.
The findings from these experiments will be used to define indices for telemedicine event
capability. The last stage will be the development and testing of a videoconferencing tool in a
telemedicine environment incorporating the developed index. The next three sections explain
each stage in more detail. Later, the research methodology, potential implications, and a timeline
for this study are presented.
4.1 Stage 1 – Development of Telemedicine Taxonomy
This study proposes five dimensions that will help to categorize different telemedicine efforts.
These dimensions were derived from a survey of literature and reflect a combination of various
classification schemes proposed in early studies. The first subsection will provide a description
of these five dimensions: Application Purpose, Application Area (Domain), Environmental
Setting, Communication Infrastructure, and Delivery Options. The next subsection will explain
how these dimensions are related in the taxonomy.
4.1.1 Definition of Taxonomy Dimensions
Application Purpose refers to the purpose of communication and is categorized under two
main groups: Clinical and Non-clinical [34]. In addition to the six categories proposed in [14],
clinical purpose is stated to cover the diagnostic and treatment (surgical and non-surgical)
components of patient care. Telemedicine not only provides a tool that can be utilized by
professional medical technicians, but it is slowly moving in the direction where a patient can be
treated through electronic channels without the intervention of a local professional. Hence Table
4.1 extends the previous classification and presents a list of clinical telemedicine application
purposes.
Non-clinical purpose includes medical education and administrative meetings and does not
involve decisions about care for particular patients. Table 4.2 shows non-clinical purposes that
will be utilized in this taxonomy. This study will not focus on the non-clinical applications of
telemedicine.
Table 4.1. Clinical application purpose
Clinical
  Triage
  Diagnostic
  Non-Surgical Treatment
  Surgical Treatment
  Consultation
  Monitoring
  Provision of specialty care
  Supervision of primary care

Table 4.2. Non-Clinical application purpose
Non-Clinical
  Professional Medical Education
  Patient Education
  Research
  Public Health
  Administrative
Application Area refers to the domains in the medical field. The domains listed in Table 4.3
represent a high-level example list of medical domains and can be expanded as necessary. The
reason for including medical domains as a dimension in this taxonomy is to point out the domain
specific differences that affect the information required and gathered through communication
channels. For example, the information required to make a diagnostic decision may differ
significantly in the cardiology domain compared to the psychiatry domain. Information can be in
various formats, such as text, audio, and video, and the application purpose and application area
define the amount and type of information required to make a clinical decision. Based on a
review of the current literature, no studies have identified the application domain as a
classification criterion for telemedicine efforts.
Table 4.3. An example list of application areas
Neurology     Home Care        Microbiology and Immunology
Cardiology    Ophthalmology    Mental Health
Pathology     Dermatology      Otolaryngology
Radiology     Rheumatology     Emergency Room
Pediatrics    Surgery          Obstetrics and Gynecology
Environmental Setting refers to the type of physical environment that the physician or the
patient will be using during the telemedicine event. These settings can be dramatically different
and can range from a patient at a primary care hospital to a mobile patient, or a professional at a
fully equipped hospital to a professional being reached at home. Considering the physical
environment attributes of medical videoconferencing identified in [35], a difference in the
quality of the information transferred between two ends is inevitable regardless of the
communication channel, as long as the two sites involved are not identical in terms of
environmental setting. These physical attributes are usually related to the characteristics of the
physical location. Therefore, environmental setting was included in this taxonomy as the third
dimension. Table 4.4 illustrates some possible telemedicine settings that can be encountered
during a telemedicine event.
Table 4.4. Environmental settings
Location 1 Location 2
Large Hospital Large Hospital
Small Hospital Small Hospital
Outreach Clinic Outreach Clinic
Health Center Health Center
Home Home
Mobile Mobile
LeRouge et al. [35] have provided a list of physical environment attributes for
videoconferencing. These attributes are facilitating décor, quiet/soundproof environment, privacy
of the exam room, space and room size, and room lighting. Some of these attributes are very
specific to videoconferencing. However, some of them can be generalized to various delivery
options. The main idea is to be able to provide a meaningful description of the physical setting
and environmental values with regard to the telemedicine event. The personal preferences and
skills of patients and physicians should also be taken into account in order to assess the
feasibility of telemedicine system use by the parties involved. Some patients may be capable of
performing related tasks only through the help of others as noted by Kaufman et al. [36].
Therefore, setting attributes should also include the presence of assistive personnel and their
relevant skills.
Communication Infrastructure refers to the channels that are available for the transmission,
emission, or reception of data or information in any format. The communications infrastructure
can be based on wired networks, radio waves, fiber optic lines, and many other forms of
telecommunication technologies. Each of these technologies comes with its own limitations
and advantages. These need to be considered carefully before a telemedicine event occurs in
order to understand the possible limitations, available resources, and how these various factors
can affect the event. Table 4.5 illustrates communication infrastructure possibilities as a function
of the telecommunication technologies that can be used in a telemedicine event and the
bandwidth they provide.
Table 4.5. Telecommunication technologies and their bandwidth capabilities
Type        Technology      Bandwidth
Wired       Dial-up         33.6kbps
Wired       DSL             64kbps – 1.544Mbps up, 128kbps – 1.544Mbps down
Wired       Cable Modem     200kbps – 2Mbps
Wired       High Speed      10/100Mbps to 1Gbps
Wireless    802.11b         11Mbps
Wireless    802.11g         54Mbps
Wireless    802.16a         70Mbps
Wireless    3G              144kbps – 1Mbps
Wireless    2G              >128kbps
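One way such a table could be used before a telemedicine event is sketched below: given the bit rate a session is expected to need, list the technologies that can carry it. The figures are the nominal values transcribed from Table 4.5 (in kbps), the 384 kbps requirement is a hypothetical example, and nominal link rates generally overstate the throughput actually available to an application.

```python
# Nominal (lowest, highest) bandwidth in kbps, transcribed from Table 4.5.
# DSL uses the upstream range because conferencing sends media both ways.
# 2G is omitted because the table gives only a bound, not a range.
BANDWIDTH_KBPS = {
    "Dial-up": (33.6, 33.6),
    "DSL (upstream)": (64, 1544),
    "Cable Modem": (200, 2000),
    "High Speed": (10_000, 1_000_000),
    "802.11b": (11_000, 11_000),
    "802.11g": (54_000, 54_000),
    "802.16a": (70_000, 70_000),
    "3G": (144, 1000),
}


def candidates(required_kbps):
    """Technologies whose best-case bandwidth meets the requirement."""
    return [name for name, (_, high) in BANDWIDTH_KBPS.items()
            if high >= required_kbps]


def guaranteed(required_kbps):
    """Technologies whose lowest listed bandwidth meets the requirement."""
    return [name for name, (low, _) in BANDWIDTH_KBPS.items()
            if low >= required_kbps]


required = 384  # hypothetical bit rate for a videoconferencing session
print(candidates(required))
print(guaranteed(required))
```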
Delivery Options is the final dimension of the taxonomy and it refers to the applications
provided to conduct a telemedicine event by fully complying with the requirements generated
based on the other dimensions explained above, as well as the requirements posed by the
professionals and patients. Even though various delivery options exist in today’s world of
advanced technological innovations, delivery options in telemedicine can be categorized under
two main groups [13, 37]: (1) synchronous and (2) asynchronous. Synchronous communication
refers to information transactions that occur among two or more participants simultaneously,
and asynchronous communication to transactions that occur at different points in time [38]. Table 4.6
presents some examples of these delivery options based on these two main categories. The
chosen delivery options can have an important effect on the final quality of the telemedicine
event and the outcome.
Table 4.6. Delivery options
         Synchronous                              Asynchronous
Audio    Telephone, Audioconferencing             Voicemail
Video    Videoconferencing                        Video/Audiostreaming
Data     Instant Messaging, Shared Electronic     Paging, Fax, Email, Web Pages,
         white boards                             Store and Forward, Web Forums
4.1.2 Interaction of Proposed Dimensions
These five dimensions can be grouped under two main themes. The first two are dimensions
strictly related to the medical field. Therefore, they are in the medical dimensions group. The rest
form a group of various dimensions (environmental setting, infrastructure, delivery options),
which are related to the way healthcare is delivered. This group is termed “delivery dimensions”
since all the dimensions have a common goal, that is, to support the medical dimensions’ needs
in order to deliver health services. A simple picture of the taxonomy is presented in Figure 4.1.
As Figure 4.1 illustrates, there is an additional group termed organizational dimension in the
taxonomy that is pervasive to all healthcare organizations and their activities. This group consists
of various aspects of the organization such as human resources and IT management. These issues
will not be addressed in this study since the main focus is on the higher levels of the taxonomy.
However, future studies will be conducted to more fully understand the effects of the
organizational dimension on the final outcome of the telemedicine event.
Figure 4.1. Telemedicine taxonomy
Two other important dimensions that were excluded from this study, but have significant
importance for future telemedicine efforts, are the cost dimension and the legal issues dimension,
which we grouped under the organizational dimension. The taxonomy excluded these two
dimensions so as to concentrate on the core dimensions of telemedicine and to provide a simple
way of identifying varying efforts. These core dimensions will eventually affect the cost and
legality of the telemedicine applications.
Legal issues and cost have been discussed and are very important in the healthcare industry.
One study [39] reported how laws regarding telemedicine are being enacted by different states
and how the cost of telemedicine applications is affecting the decision making process. Further
studies are needed to understand how the core dimensions can make a difference in the
decision-making processes of lawmakers and payers.
4.2 Stage 2 – Assessment of Objective and Subjective Quality Measures for Telemedicine
Based on the five core dimensions identified in the previous stage as factors affecting
telemedicine events, this phase of the study proposes to integrate specific technical factors into a
measurement technique, which can then be used to predict telemedicine capability within a
specific setting, potentially in real time. While doing so, further dimensions will be treated as
constants in laboratory experiments to identify the effects of the selected technical factors on the
telemedicine capability of a setting. This study will focus on two application areas
(ophthalmology and cardiology) and one application purpose within each application area. The
delivery mechanisms will be videoconferencing and audioconferencing over IP networks using
SIP. The next subsection provides a brief overview regarding the technical factors that will be
included in this study. The experimental test-bed details are presented next. This section
concludes with a discussion on objective and subjective tests that will be conducted.
4.2.1 Technological Factors Affecting Quality in Telemedicine over IP Networks
The Internet Protocol (IP) is a packet-based network protocol that enables the transmission of
data packets from one end system to another based on address information carried in the data
message. It can be used with two different transport layer protocols: Transmission Control
Protocol (TCP) and User Datagram Protocol (UDP). TCP is a connection-oriented, reliable
transport protocol designed for data transmission. However, it is not suitable for real-time
applications because the retransmission of packets may cause high delay and increase delay
variation, which can significantly affect the quality of real-time applications. There are other
problems associated with using TCP for real time applications, which are not mentioned here.
Hence, real-time applications use UDP, a connectionless transport layer protocol that does not
guarantee the arrival of a packet.
Real-time multimedia applications use two protocols that run over UDP: the real-time transport
protocol (RTP) and the RTP control protocol (RTCP) [40]. RTP is designed to carry data that
has real-time properties. RTCP is designed to monitor the quality of service and to convey
information about the participants in an on-going session. Even though RTP is the commonly
used protocol for real-time applications, it does not, by design, provide any mechanism to
ensure timely delivery or provide other quality-of-service guarantees, but relies on lower-layer
services to do so. Therefore, real-time multimedia applications are vulnerable to any
impairment that can happen in the lower layers of the network. The next subsection presents
these impairments followed by two additional subsections that provide an overview of audio and
video codecs available for multimedia applications.
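As an illustration of what RTP does provide, the following sketch packs and unpacks the 12-byte fixed RTP header defined in RFC 3550. The header carries a sequence number and a timestamp, which let a receiver detect loss and reorder or schedule packets, but nothing in it requests retransmission or reserves resources. The function names and the example field values are hypothetical; the optional CSRC list, padding, and header extension are left out.

```python
import struct

RTP_VERSION = 2


def pack_rtp_header(payload_type, sequence, timestamp, ssrc, marker=False):
    """Pack the 12-byte fixed RTP header (no CSRC, padding, or extension)."""
    byte0 = RTP_VERSION << 6  # V=2, P=0, X=0, CC=0
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1, sequence & 0xFFFF,
                       timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)


def unpack_rtp_header(data):
    """Decode the fixed header fields from the first 12 bytes of a packet."""
    byte0, byte1, sequence, timestamp, ssrc = struct.unpack("!BBHII", data[:12])
    return {
        "version": byte0 >> 6,
        "marker": bool(byte1 >> 7),
        "payload_type": byte1 & 0x7F,
        "sequence": sequence,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


# Payload type 0 is PCMU audio in the standard RTP audio/video profile;
# the remaining values are arbitrary example numbers.
header = pack_rtp_header(payload_type=0, sequence=1, timestamp=160,
                         ssrc=0x12345678)
fields = unpack_rtp_header(header)
print(len(header), fields)
```

In practice the header is followed by the codec payload and sent as a single UDP datagram.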
4.2.1.1 Network Impairments
Since the Internet was not designed for real-time applications and only provides best effort
service, carrying real-time applications over the Internet presents a number of challenges. These
include lack of guarantee in terms of bandwidth, packet loss, delay, and jitter, all of which affect
the quality of voice and video over the Internet as reported in various studies [20, 24, 41].
Packet loss – Unlike circuit-switched networks, in packet-switched networks no physical
end-to-end circuit is established [41]. Packets are transmitted from the source to the destination over
the Internet with the help of routers. Packets arriving at a router are first queued and then
transmitted one by one, usually under a first in, first out (FIFO) policy. However, if the queue
(buffer) of a router is already full when a packet arrives, then this packet is dropped and
consequently is not transmitted to its destination. Routers start dropping packets in this way
when the network is congested. The effects of packet loss on real-time multimedia applications are critical.
During a voice conversation, human cognition can handle only a certain amount of packet loss. If
too many packets are lost, the voice becomes incomprehensible. For video the effect of extensive
packet loss is more acute. If packet loss happens, some parts of the video cannot be decoded and
displayed. It is easy to understand the effects of packet loss on the perceived quality of voice and
video applications. Researchers have developed various techniques to overcome, or at least ease,
the effects of packet loss on applications; some of these techniques are discussed in [16, 41].
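The queuing behavior described above can be sketched as a small simulation. It models a single router with a finite FIFO queue that forwards a fixed number of packets per time step and drops any packet that arrives while the queue is full. The function name, the time-stepped model, and the arrival pattern are simplifying assumptions made for illustration.

```python
from collections import deque


def simulate_drop_tail(arrivals_per_tick, capacity, service_per_tick=1):
    """Simulate a FIFO router queue that drops arrivals when it is full.

    Returns (forwarded, dropped) as lists of packet ids. Packets still
    queued when the simulation ends appear in neither list.
    """
    queue, forwarded, dropped = deque(), [], []
    next_id = 0
    for arrivals in arrivals_per_tick:
        for _ in range(arrivals):
            if len(queue) < capacity:
                queue.append(next_id)
            else:
                dropped.append(next_id)  # queue full: packet is lost
            next_id += 1
        for _ in range(min(service_per_tick, len(queue))):
            forwarded.append(queue.popleft())  # first in, first out
    return forwarded, dropped


# A burst of five packets arrives in the third time step.
forwarded, dropped = simulate_drop_tail([1, 1, 5, 1, 0, 0], capacity=3)
loss_rate = len(dropped) / (len(forwarded) + len(dropped))
print(forwarded, dropped, loss_rate)
```

Even though the average arrival rate here is close to the service rate, the burst alone is enough to overflow the queue, which is why loss on the Internet tends to be bursty rather than evenly spread.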
Packet Delay – End-to-end packet delay is typically caused by a number of components [41]:
(1) codec delay is the time it takes to convert analog voice to digital data and vice versa, (2)
serialization delay is the time it takes to place a packet on the transmission line, and is
determined by the speed of the line, (3) queuing delay occurs at the various switching and
transmission points of the network, such as routers and gateways, where voice packets wait
behind other packets waiting to be transmitted over the same outgoing link, and (4) propagation
delay is the time required by signals to travel from one point to another, which is fixed as
determined by the speed of light. The effects of large packet delay become even more severe for
voice communications, as timing is an important characteristic of voice. This is especially true
when an interactive conversation is being transmitted on the network; delay effects can turn the
conversation into a half-duplex mode where one party speaks and the other listens and pauses to make
sure the first is done. Echo is another unwanted effect of packet delay. Various techniques were
also developed to overcome these problems in packet-based networks, since in current
circuit-switched networks the primary source of delay is propagation delay.
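As a rough worked example of how the four components add up, the sketch below sums them for a single packet. The signal speed of 200,000 km/s (roughly two thirds of the speed of light, typical for cable or fiber) and all input values are assumptions chosen for illustration.

```python
def serialization_delay_ms(packet_bytes, link_bps):
    """Time to place the packet on the line, determined by the line speed."""
    return packet_bytes * 8 * 1000 / link_bps

def propagation_delay_ms(distance_km, speed_km_per_s=200_000):
    """Time for the signal to travel the distance (speed is an assumed value)."""
    return distance_km * 1000 / speed_km_per_s

def end_to_end_delay_ms(codec_ms, packet_bytes, link_bps, queuing_ms, distance_km):
    """Sum of codec, serialization, queuing, and propagation delay."""
    return (codec_ms
            + serialization_delay_ms(packet_bytes, link_bps)
            + queuing_ms
            + propagation_delay_ms(distance_km))
```

For instance, a 200-byte packet on a 64 Kbits/s line takes 25 ms to serialize, and 1000 km of cable adds about 5 ms; only the queuing term varies from packet to packet.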
Packet Delay Variation (Jitter) – Packet delay variation refers to the variation or gaps
between packet arrival times at the receiving buffer. This occurs due to the variability in queuing
and propagation delays. To eliminate the effects of this variation, usually a playout buffer is
used. The receiver holds the first packet in the buffer for a specific amount of time before
playing it out. A small amount of jitter is therefore tolerable, but large fluctuations make
decoding and playback difficult and degrade quality. The effects of delay variation are similar
to the effects of packet loss. Large variation in delay will result in some packets arriving long
after the playout time scheduled for them based on the buffer size. The receiver will discard
these packets since they arrive too late to be played out.
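A minimal sketch of a fixed playout buffer follows, assuming each packet carries a sender timestamp and that the receiver schedules playout a fixed delay after that timestamp. The tuple layout and the function name are illustrative assumptions.

```python
def playout_schedule(packets, playout_delay_ms):
    """Decide which packets can be played with a fixed playout delay.

    packets: list of (sequence_number, send_time_ms, arrival_time_ms).
    A packet is played if it arrives no later than send_time + playout_delay.
    """
    played, discarded = [], []
    for seq, send_ms, arrival_ms in packets:
        deadline = send_ms + playout_delay_ms
        if arrival_ms <= deadline:
            played.append(seq)
        else:
            discarded.append(seq)   # too late: treated like a lost packet
    return played, discarded
```

Increasing the playout delay rescues more late packets but adds directly to the end-to-end delay, which is the trade-off behind the choice of buffer size.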
4.2.1.2 Audio Codecs
Audio data does not contain as much redundant data as video data and hence, it is harder to
compress. Speech coding techniques can be categorized in three groups: (1) waveform coding,
(2) source coding, and (3) hybrid coding. They are used at high, low, and moderate bit rates
respectively.
Waveform encoding is almost a lossless coding scheme since the resultant signal is very close
to the original one. The simplest form of this coding is Pulse Code Modulation (PCM). Many
codecs try to predict the value of the next sample from the previous ones and an error signal is
computed from the original and predicted signals. Another method that utilizes this error signal
for encoding is called Differential Pulse Code Modulation (DPCM). Other examples of
waveform coding are sub-band coding (SBC) and the discrete cosine transform (DCT). Source
codecs implement the idea of understanding how the speech signal is produced and sending
certain parameters of the signal to the decoder. Hybrid coding is a mix of these two techniques.
Analysis-by-Synthesis (AbS) coding is the most famous type of hybrid coding. Using these
coding techniques, a large number of audio codecs have been developed over time and below is
an overview of some of these codecs.
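To make the prediction-and-error idea behind DPCM concrete, the following is a minimal sketch that predicts each sample from the previously reconstructed one and transmits only the quantized error. The fixed step size and the previous-sample predictor are simplifying assumptions; real codecs use more elaborate predictors and quantizers.

```python
def dpcm_encode(samples, step=4):
    """Encode samples as quantized prediction errors (step is an assumed quantizer size)."""
    prediction = 0
    codes = []
    for s in samples:
        error = s - prediction          # difference between original and predicted value
        code = round(error / step)      # quantized error signal that is transmitted
        codes.append(code)
        prediction += code * step       # track what the decoder will reconstruct
    return codes

def dpcm_decode(codes, step=4):
    """Rebuild the signal by accumulating the de-quantized errors."""
    prediction = 0
    out = []
    for code in codes:
        prediction += code * step
        out.append(prediction)
    return out
```

Because the encoder predicts from its own reconstruction, the quantization error does not accumulate: every decoded sample stays within half a step of the original.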
The ITU-T G.711 codec (PCM at 64 Kbits/s) in its µ-law form is a variant of PCM that is
commonly used in North America and Japan for digital telephony. It does not require
much CPU power and provides good quality with little complexity. However, its
bit rate is higher than that of the other codecs described here. Two other public standards by the
ITU-T for compressing voice data are G.721 (ADPCM at 32 Kbits/s) and G.723 (ADPCM at 24
and 40 Kbits/s). They both use Adaptive DPCM (ADPCM), which utilizes an adaptive prediction
and quantization scheme to increase the performance of DPCM coding. Another application of
ADPCM is the DVI codec, a recommendation from the Interactive Multimedia Association
(IMA) Digital Audio Technical Working Group. It compresses 16-bit linear PCM samples into
4-bit samples, yielding a compression ratio of 4:1.
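The bit rates quoted above follow from simple arithmetic once a sampling rate is fixed. The sketch below assumes the usual telephony rate of 8000 samples per second, which the text does not state explicitly; the helper names are illustrative.

```python
def bit_rate_kbps(bits_per_sample, sample_rate_hz=8000):
    """Bit rate of a fixed-rate codec; 8000 Hz is an assumed telephony sampling rate."""
    return bits_per_sample * sample_rate_hz / 1000

def compression_ratio(input_bits, output_bits):
    """Ratio between input and output sample sizes."""
    return input_bits / output_bits

rates = {
    "16-bit linear PCM": bit_rate_kbps(16),   # uncompressed input to the DVI codec
    "G.711 (8-bit PCM)": bit_rate_kbps(8),
    "ADPCM, 5-bit": bit_rate_kbps(5),
    "ADPCM, 4-bit": bit_rate_kbps(4),
    "ADPCM, 3-bit": bit_rate_kbps(3),
}
```

Under this assumption, 8-bit samples give the 64 Kbits/s of G.711, 4-bit ADPCM samples give 32 Kbits/s, and 3-bit and 5-bit samples give 24 and 40 Kbits/s.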
Finally, GSM stands for Global System for Mobile Communications and is a variant of LPC
called RPE-LPC (Regular Pulse Excited - Linear Predictive Coder). It is a European standard
originally for use in encoding speech for satellite distribution to mobile phones. Its use results in
very good compression with good quality output but is computationally expensive.
4.2.1.3 Video Codecs
Video streaming is a resource and bandwidth intensive application type [24] that requires the
video to be compressed before transmission to utilize the existing resources efficiently without
saturating them. The goal of video compression is to remove the redundancy in the original
source signal, which will eventually reduce the amount of bandwidth required for transmission
[16]. There are three types of video compression coding: (1) lossless coding, (2) lossy
coding, and (3) hybrid coding.
Lossless coding (e.g. Huffman coding) is a reversible process with the perfect recovery of
original data. Therefore no quality degradation exists due to lossless coding. Lossy coding (e.g.
source coding) is an irreversible process in which the recovered data is degraded. Hybrid coding
(e.g. JPEG) is the one used by most multimedia systems and it combines both lossy and
lossless coding. H.261, H.263, MPEG-1, MPEG-2, and MPEG-4 are the most popular video
codec standards. In this study, H.261 and H.263 will be used as video codecs during the
experiments.
H.261 is an ITU video-coding standard originally designed for ISDN lines. Its output bit rates
are multiples of 64 Kbits/s. It uses constant-bit-rate encoding rather than constant-quality,
variable-bit-rate encoding, meaning that the encoding algorithm trades picture quality against motion.
Therefore, to obtain higher quality, it is suitable to use this codec for scenes having a small
amount of motion. It supports only two resolutions: (1) Common Interchange Format (CIF),
which is 352x288 pixels and, (2) Quarter CIF (QCIF), which is 176x144 pixels.
H.263 is also an ITU video-coding standard originally designed for low bit rate
communications (less than 64Kbits/s – this limitation has now been removed). It uses a similar
coding algorithm to H.261, with some changes to improve performance and error recovery.
As a result of these improvements, H.263 output stream is more resilient to packet loss, which
makes it very attractive for real-time communications over the Internet. It supports five
resolutions. In addition to CIF and QCIF, it provides resolution at SQCIF (128x96 pixels), 4CIF
(704x576 pixels), and 16CIF (1408x1152 pixels).
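To give a sense of why compression is needed at these resolutions, the sketch below computes the uncompressed bit rate for each picture format. The 12 bits per pixel (4:2:0 chroma subsampling) and the 30 frames per second are assumptions for illustration, not values taken from this study.

```python
# Picture formats named in the text, as (width, height) in pixels.
RESOLUTIONS = {
    "SQCIF": (128, 96),
    "QCIF": (176, 144),
    "CIF": (352, 288),
    "4CIF": (704, 576),
    "16CIF": (1408, 1152),
}

def raw_bit_rate_mbps(name, fps=30, bits_per_pixel=12):
    """Uncompressed bit rate in Mbits/s (fps and bits_per_pixel are assumed values)."""
    width, height = RESOLUTIONS[name]
    return width * height * bits_per_pixel * fps / 1e6
```

Under these assumptions, raw QCIF video needs about 9.1 Mbits/s, so fitting it into a single 64 Kbits/s channel implies a compression ratio on the order of 140:1.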
4.2.1.4 Working Around Impairments: Application and Network Level Quality of Service
Previous subsections summarized how certain network impairments can affect real-time
applications on IP-based networks. Currently two approaches exist to provide Quality of Service
(QoS) for real-time applications: (1) QoS at the application level and, (2) QoS at the network
level. Application-level QoS provides quality improvements without requiring changes to the
network infrastructure. In initial implementations of real-time applications, incoming data was
played out either immediately upon arrival or after a fixed delay. Since both methods lead to
significant signal degradation under high delay variance conditions, adaptive playout techniques
were introduced to make real-time applications more tolerant of delays and delay jitter and to
dynamically adjust the playback point [25]. Researchers have also studied reconstruction
methods at the receiver to compensate for packet loss in real-time applications. Various error
concealment methods for audio are summarized in Table 4.7.
Table 4.7 Error Concealment Techniques for Audio [41]
Name | Technique
Silence Substitution | Substitutes lost packet with silence. Causes voice clipping. Deteriorates voice quality when packet size is large and loss rate is high.
Noise Substitution | Substitutes lost packet with background noise. Better than silence substitution. Relies on the ability of human brain to repair the received message if there is background noise.
Packet Repetition | Substitutes the lost packet with the replays of the last correctly received packet.
Packet Interpolation | Substitutes lost packet with a replacement packet produced based on the characteristics of the packets in the neighborhood of the lost one (a.k.a. waveform substitution).
Frame Interleaving | Reduces the effect of packet loss by interleaving voice frames across different packets.
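Two of the simpler techniques in Table 4.7, silence substitution and packet repetition, can be sketched as follows. The representation of a lost packet as None, the function name, and the frame length are assumptions made for illustration.

```python
def conceal(frames, method="repetition", frame_len=2):
    """Fill in lost audio frames (None) by silence substitution or packet repetition."""
    out = []
    last_good = [0] * frame_len          # nothing received yet: fall back to silence
    for frame in frames:
        if frame is not None:
            last_good = frame
            out.append(list(frame))
        elif method == "silence":
            out.append([0] * frame_len)  # replace the lost packet with silence
        elif method == "repetition":
            out.append(list(last_good))  # replay the last correctly received packet
        else:
            raise ValueError(f"unknown method: {method}")
    return out
```

Noise substitution and packet interpolation would follow the same structure, differing only in how the replacement frame is produced.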
Error concealment techniques for video try to recover the corrupted data by exploiting the
spatial and temporal redundancies of the video data [43]. Spatial-domain error concealment
algorithms interpolate the lost area using spatially neighboring image data; they
provide good performance when recovering an isolated lost macroblock (MB).
On the other hand, temporal-domain error concealment schemes
utilize the previously decoded image data to recover the lost MBs where they estimate motion
vectors (MVs) for the lost MBs, and compensate for the lost MBs with the estimated MVs. Some
error concealment techniques are provided in Table 4.8.
Table 4.8 Error Concealment Techniques for Video [16]
Name | Technique
Block Replacement | Replaces the lost areas with the corresponding areas of the previous frame or field. Works quite well in still parts of the picture but fails in areas where there is a lot of motion.
Linear Interpolation | Replaces the lost areas with the linearly interpolated values calculated from the neighboring areas of the same frame. Assumes that surrounding areas are correctly received and works well in a uniform surface.
Motion Vector | Replaces the lost areas with pixel blocks of the previous frame shifted by the average motion vector of the neighboring blocks. Performance drops when the blocks have different motion vectors.
Hybrid Technique | Uses both spatial and temporal redundancies to predict the lost MBs.
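The block replacement technique in Table 4.8 can be sketched in a few lines, assuming frames are stored as two-dimensional lists of pixel values and lost areas are identified by block coordinates. The block size of 2 and the function name are illustrative assumptions (a real macroblock is 16x16 pixels).

```python
def block_replacement(current, previous, lost_blocks, block=2):
    """Copy each lost block from the same position in the previous frame.

    current, previous: 2-D lists of pixel values with identical dimensions.
    lost_blocks: list of (block_row, block_col) coordinates of lost areas.
    """
    repaired = [row[:] for row in current]       # leave the input frame untouched
    for by, bx in lost_blocks:
        for y in range(by * block, (by + 1) * block):
            for x in range(bx * block, (bx + 1) * block):
                repaired[y][x] = previous[y][x]  # temporal substitution
    return repaired
```

The motion vector technique differs only in that the copied block is shifted by the average motion vector of the neighboring blocks before substitution.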
Development of network Quality of Service (QoS) features was partially motivated by the fact
that real-time traffic (as well as other applications) may sometimes require priority treatment to
achieve good performance on the Internet [44]. QoS can be achieved by managing router queues
and by routing traffic around congested parts of the network. The IETF proposed two models to
provide Internet QoS: Integrated Services (IntServ) [45] and Differentiated Services (DiffServ)
[46].
In IntServ, resources are reserved for each flow through the network using the Resource
ReSerVation Protocol (RSVP) [47]. When an application requests a specific QoS for its data
stream, RSVP can be used to deliver the request to each router along the path and to maintain
router state to provide the requested service [44]. Current implementations of IntServ allow a
choice of Guaranteed Service [48] or Controlled-Load Service [49]. In Guaranteed Service
agreements, peak traffic is limited by a certain rate and packet size is restricted to be in a specific
range at all times. Based on these limitations and restrictions, a bandwidth requirement is
declared, and sufficient bandwidth is reserved on each hop to satisfy all the requirements of the
flow. If each node and hop can accept the service request, the flow should be lossless.
Controlled-Load Service [49], on the other hand, uses only traffic specifications and does not
define any service request specifications. Hence, flows using this service should experience the
same performance as they would in a lightly loaded “best-effort” network.
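The per-hop reservation idea behind Guaranteed Service can be sketched as a simple admission check: the flow is accepted only if every hop along the path can set aside the requested bandwidth. This is a simplified illustration of the admission principle, not an implementation of RSVP; the function name and the per-hop bandwidth list are assumptions.

```python
def reserve_path(available_bps, requested_bps):
    """Try to reserve bandwidth on every hop of a path.

    available_bps: free bandwidth on each hop, in order along the path.
    Returns the remaining free bandwidth per hop, or None if any hop refuses.
    """
    if any(free < requested_bps for free in available_bps):
        return None                               # one hop cannot accept the request
    return [free - requested_bps for free in available_bps]
```

The need to keep such per-flow state on every router along the path is one source of the scalability concerns associated with this model.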
Several reasons, including scalability problems, were reported in [44] for not using IntServ for
IP-based real-time applications. To overcome these problems, a simpler framework and
architecture to support DiffServ was developed [46]. The primary goal of differentiated services
is to allow different levels of service to be provided for traffic streams on a common network
infrastructure [44]. In the DiffServ model, the QoS information is carried in-band within the
packet, in the Type of Service (TOS) field of the IPv4 header or the Differentiated Services (DS)
field in IPv6 [50]. The TOS or the DS field is used to indicate the need for low-delay,
high-throughput, or low-loss-rate service. Backbone routers provide per-hop differential treatments to
different service classes as defined by Per Hop Behavior (PHB) that describes the forwarding
behavior a packet receives at a given network node. Despite the fact that DiffServ is a simpler
mechanism that provides performance improvements compared to “best effort” IP networks, it
has some shortcomings; it relies on ample network capacity for expedited forwarding traffic and
makes use of standard routing protocols that make no attempt to use the network efficiently [44].
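As an illustration of in-band marking, the sketch below converts a DiffServ code point into the value of the IPv4 TOS byte and sets it on a UDP socket. The choice of the Expedited Forwarding code point (46) and the helper names are assumptions for illustration; the `IP_TOS` socket option is platform dependent, and whether routers honor the marking depends entirely on network configuration.

```python
import socket

EF_DSCP = 46  # Expedited Forwarding code point, used here only as an example

def dscp_to_tos(dscp):
    """Place the 6-bit DSCP in the upper bits of the 8-bit TOS/DS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must fit in 6 bits")
    return dscp << 2

def open_marked_udp_socket(dscp=EF_DSCP):
    """Open a UDP socket whose outgoing packets carry the given DSCP (where supported)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock
```

A media sender could mark its RTP packets this way, leaving the per-hop treatment to the routers along the path.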
One other type of network level QoS technique is provided by the Multiprotocol Label
Switching (MPLS) architecture offering IP networks the capability to provide traffic engineering
as well as a differentiated services approach to voice quality [44]. In IP networks, as packets
travel from one router to another, each router independently chooses a next hop for the packet,
based on its analysis of the packet's header and the results of running the routing algorithm [51].
Analysis of the packet header identifies the forwarding equivalence class (FEC) of a packet and
the routing algorithm maps this FEC to a next hop. This is repeated at each hop until the packets
reach their destination. Notice that no distinction can be made between the packets with the same
FEC value in conventional IP networks. In MPLS, the assignment of a particular packet to a
particular FEC is done just once, as the packet enters the network [51], and hence MPLS
separates routing from forwarding [44]. This FEC value is encoded as a short fixed-length value
known as a "label", and when a packet is forwarded to its next hop, the label is sent along as well.
DiffServ and MPLS can be combined to provide better QoS for real-time applications.
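The separation of routing from forwarding can be sketched as two steps: the ingress router analyzes the header once to pick a label, and every later hop forwards by label lookup alone. The prefixes, label values, and router names below are hypothetical, and the table structures are simplified assumptions rather than an actual MPLS implementation.

```python
import ipaddress

def assign_label(dst_ip, fec_to_label):
    """Ingress step: map the destination to a FEC (longest matching prefix) and its label."""
    addr = ipaddress.ip_address(dst_ip)
    best = None
    for prefix, label in fec_to_label.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0]):
            best = (net.prefixlen, label)
    if best is None:
        raise LookupError(f"no FEC for {dst_ip}")
    return best[1]

def forward(label, label_table):
    """Core step: swap the label and pick the next hop without reading the IP header."""
    out_label, next_hop = label_table[label]
    return out_label, next_hop
```

For example, with the hypothetical table `{"10.0.0.0/8": 100, "10.1.0.0/16": 200}`, a packet to 10.1.2.3 receives label 200 at the ingress, and a core router holding `{200: (300, "R2")}` forwards it to R2 with label 300.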
Regardless of the techniques developed for network QoS, the implementation of these
techniques is limited and available to only a small group of users. The reasons for this slow
adoption of network QoS techniques are discussed extensively in [52], and the conclusions of that
study suggest that the QoS community and researchers need to reach out and include business,
systems control, and marketing expertise in their efforts to get IP QoS meaningfully deployed
and used.
4.2.2 Experimental Test-bed
Previous sections identified the important factors that play a role in the perceived and
measured quality of voice and video over the Internet. This study will set up a test-bed where
these factors will be individually defined as variables within the experiments. Objective and
subjective evaluations will measure the effects of variance in these variables on the decisions
made for the selected telemedicine purpose. Figure 4.2 illustrates a summary of the application
and network components and how the identified factors can be positioned between these
components.
Figure 4.2 Application and Network Components
Figure 4.3 illustrates the simple test-bed setup for the experiments. There will be two
computers with video cameras to capture voice and video for an ophthalmology scenario and a
stethoscope for the cardiology scenario. Several software tools will be utilized in this test-bed to
capture audio and video, to transmit the captured information on the network, and to manipulate
and measure network impairments during the transmission.
Figure 4.3 Experimental Testbed
Hardware and equipment available for the experiments of this study are listed in Table 4.9. Table
4.10 presents a list of tools that will be utilized during the experiments for manipulating network
parameters and for capturing and storing video and audio sequences. Network monitoring tools
presented in this table are selected from [53] where a comprehensive list of products can be
found.
Table 4.9 List of Available Hardware for the Testbed
Network Equipment
Device | Make | Model
Hub 1 | SMC | EtherEZ Hub 3605T 10Mbps
Hub 2 | D-Link | Hub DSH-5 100Mbps
Router 1 | D-Link | 2.4GHz Wireless Router DI-614+
Router 2 | Linksys | EtherFast Cable/DSL Router BEFSR41 ver.3
Computers, Laptops, and Servers
Machine | Operating System | CPU | RAM
Laptop 1 | Windows XP Home Edition | 2.0 GHz | 256 MB
PC 1 | Windows XP Professional | 1.8 GHz | 256 MB
PC 2 | Windows 2000 Professional | 1.8 GHz | 256 MB
PC 3 | Windows 2000 Professional | 1.8 GHz | 256 MB
Proxy Server | Linux Red Hat 7.3 | 1.8 GHz | 256 MB
Table 4.10 Software List for the Testbed
Name | Type | Description
JMStudio | Java-based Media Player | The Java Media Framework API (JMF) enables audio, video and other time-based media to be added to applications and applets built on Java technology. JMStudio is an application developed based on JMF which can capture, play, and record audio and video files. It can also receive and play RTP media streams.
Ethereal | Packet Capture Tool | It is a free network protocol analyzer for Unix and Windows that provides features to examine data from a live network or from a capture file on disk.
Distributed Internet Traffic Generator (D-ITG) | Network Monitoring Tool | D-ITG is a platform capable of producing traffic (network, transport and application layer) and accurately replicating appropriate stochastic processes for both IDT (Inter Departure Time) and PS (Packet Size) random variables (exponential, uniform, Cauchy, normal, Pareto, etc.).
Netperf | Throughput Tool | It provides general measures of performance of a network such as latency between request and response of generic transactions across a TCP/IP network. It is maintained by HP.
Bing, Pathrate, Pipechar | Bandwidth Estimation Tool | Bing is a point-to-point bandwidth measurement tool (hence the 'b'), based on ping. Pathrate measures end-to-end capacity. Pipechar is a tool for reporting dynamic network characteristics, in particular the bottleneck bandwidth.
Traceping | Ping | It measures the packet loss to nodes along a route.
4.2.3 Experimental Procedures
The initial step for the experiments is to identify the telemedicine application area and purpose
under consideration. The two telemedicine application areas selected for this study are
ophthalmology and cardiology. The application purpose is currently restricted to diagnosis. The
next step is to obtain sample exam sequences for the selected application area and purpose. For
the experiments of diagnosis in the ophthalmology application area, video sequences of an eye
examination session will be necessary. There are two possible ways of obtaining this video
sequence. One way is to request a readily available video sequence from the National Library of
Medicine (NLM) and feed this video sequence into the experimental test-bed for objective and
subjective measurement collection. Another way is to record a live session using the listed
devices in the previous section and use this self-obtained video sequence for testing purposes.
For the experiments of diagnosis in the cardiology application area, audio sequences of heartbeats
will be required. The first possible way of obtaining this audio sequence is to request it from a
source like NLM. Another possible way is to use electronic stethoscopes to capture the
sounds of heartbeats and directly feed this to the computer as audio input.
Once the audio and video files are captured and ready for use in the test-bed, the next step will
be to feed these files into the test-bed while manipulating the factors identified to have an effect
on the quality degradation of voice and video over IP-based networks. Factors that will be
manipulated during the experiments are audio/video codecs, packet loss, packet delay, packet
delay variation, and bandwidth. As a result of these experiments, a set of distorted signals will be
collected and stored for future use in subjective tests. During the transmission of the original
signal over the test-bed, data for objective measurements will be collected. Ethereal will be used
as the main tool to monitor network traffic and to capture traffic on the network for further
packet and traffic analysis. In this test-bed, the sender (patient-end) will control the selection of
the codec; the router will control the loss rate, loss pattern, delay, and delay variation; and the
receiver (physician-end) will store the received signals, decode them, and use concealment
methods selected by the application in use to recover lost packets.
At this point, all values necessary to evaluate objective measures will be collected and stored.
The next step is to measure objective quality. Among the several objective quality measures
introduced in Section 3.2.1, the ITU E-model will be used for measuring audio quality for the test
sequences. This measure was chosen based on evidence that it is the only available measure that
does not require the original signal for calculations and it correlates well with the MOS values.
As mentioned in section 3.3.1, there are no objective measurements available other than the ones
proposed in [16] that can measure objective quality of a video sequence in the absence of the
original sequence. However, the VQM explained in section 3.3.1 can be used for this study to
measure the video quality of the distorted signals since the original signal will be available.
This will not affect the results of the final real-time tool for quality measurement because the
new tool will rely on the previously collected values in the database for assessment. The
objective measurement values calculated in this step will be stored in a database with the values
of the impairment factors. Table 4.11 illustrates the predicted fields of the quality database for
audio and video quality excluding objective measurement value field and MOS field.
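Since the E-model produces a transmission rating factor R rather than a MOS value directly, a conversion step is needed before objective and subjective values can be compared. The sketch below uses the R-to-MOS mapping given in ITU-T Recommendation G.107; computing R itself from the impairment factors is not shown, and the function name is an illustrative assumption.

```python
def r_to_mos(r):
    """Map an E-model rating factor R to an estimated MOS (ITU-T G.107 mapping)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6
```

For example, a rating of R = 60 maps to an estimated MOS of 3.1, and ratings above 90 map to values of roughly 4.3 and higher.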
Table 4.11 Fields of the proposed Quality Database
For Audio Quality | Packet Loss Rate | Consecutive Lost Packets | Max. Jitter | Max. Packet Delay | Available Bandwidth | Audio Codec
For Video Quality | Packet Loss Rate | Consecutive Lost Packets | Max. Jitter | Max. Packet Delay | Available Bandwidth | Video Codec | Bit Rate | Frame Rate
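One possible realization of the quality database is sketched below using SQLite, with one table per media type and the fields of Table 4.11 plus the objective measurement and MOS fields. The table names, column names, units, and the choice of SQLite are all assumptions made for illustration; the study does not prescribe a storage technology.

```python
import sqlite3

def create_quality_db(path=":memory:"):
    """Create the audio and video quality tables (hypothetical schema and units)."""
    conn = sqlite3.connect(path)
    common = """
        id INTEGER PRIMARY KEY,
        packet_loss_rate REAL,
        consecutive_lost_packets INTEGER,
        max_jitter_ms REAL,
        max_packet_delay_ms REAL,
        available_bandwidth_kbps REAL,
        objective_score REAL,
        mos REAL"""
    conn.execute(f"CREATE TABLE audio_quality ({common}, audio_codec TEXT)")
    conn.execute(
        f"CREATE TABLE video_quality ({common}, video_codec TEXT, "
        "bit_rate_kbps REAL, frame_rate_fps REAL)"
    )
    return conn
```

Each experimental run would then add one row holding the impairment values together with the objective measurement, with the MOS column filled in after the subjective tests.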
Subjective measurements will follow the objective measurements. Selection of test subjects
will be based on availability. First, invitations to physicians familiar with telemedicine
applications at Loma Linda University Medical School will be sent. Based on the response rate,
if further recruitment of subjects is required, final year medical students will be recruited for
subjective tests. The ITU recommends that 4 to 40 test subjects be used for completing
subjective quality tests. Subjective tests will involve at least the minimum required number of
subjects. Since the use of subjective measurements for telemedicine related to voice and video
is still an immature area of research, this study will utilize different subjective measurement
techniques discussed in 3.2.2 and 3.3.2. Test subjects will be asked to view the recorded sessions
and provide their opinion for the questions asked in the standard. MOS scores of the subjective
test results will also be calculated and added as a new field to the quality database illustrated in
Table 4.11. The last step in this stage is to find a correspondence between objective and
subjective measures in the database. The quality database will be the final outcome of the second
stage in this study. A summary of the second stage is illustrated in Figure 4.4 below.
Figure 4.4 Process flow for the second stage of the study
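The last two steps of this stage, computing MOS values and relating them to the objective measures, can be sketched as follows. The Pearson correlation coefficient is used here as one plausible way to quantify the correspondence; the text does not commit to a specific statistic, so this choice and the function names are assumptions.

```python
from math import sqrt
from statistics import mean

def mean_opinion_score(ratings):
    """MOS for one test sequence: the average of the subjects' ratings."""
    return mean(ratings)

def pearson(xs, ys):
    """Pearson correlation between objective scores and MOS values."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)
```

A coefficient close to 1 across the test sequences would indicate that the objective measure tracks the subjects' opinions closely enough to stand in for them in the real-time tool.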
4.3 Stage 3 – Development of SIP-based Videoconferencing Tool with Real-time
Telemedicine Capability Index
In this last stage, an existing SIP videoconferencing client, the CGUsipClient, will be enhanced
with a simple quality indicator based on the results obtained in the previous stage of this study.
The CGUsipClient was developed by the Network Convergence Laboratory (NCL) to provide
low-cost, low-bandwidth videoconferencing. It is a Java-based client that utilizes the Java Media
Framework (JMF) Sun libraries for voice and video handling. The video codecs supported by
this client are H.261 and H.263, the latter being the default codec for video communications. The
audio codecs supported are G.723, DVI, GSM, and G.711 (µ-law); the user can change the
default audio codec. Detailed information regarding the CGUsipClient architecture can be found
at [54]. Another study [32] reported the many useful features of this client for use in
telemedicine and how it can add value in the telemedicine setting.
The CGUsipClient will feature new user interface windows that will provide real-time quality
information, derived from the objective measures that will be collected in real-time during a
telemedicine session and the calculation of their correspondence to subjective measures using the
quality database. In order to achieve this goal, several improvements are required on the client.
First, a real-time objective measure collection module will be incorporated with the existing
client. This module will collect packet loss, delay, bit rate, and frames per second information
from the network. Second, a new module for calculating a correspondence to these objective
measures in terms of a subjective MOS value will be developed and incorporated into the
CGUsipClient. Finally, two graphical user interfaces (GUI) will be developed. The Session
information GUI will collect information regarding application area, purpose, and delivery
option (only audio, audio and video) before the session begins as part of objective measures.
Based on this information, the relevant quality database will be used for calculations. The
Telemedicine Capability Index GUI will provide the outcomes of the correspondence
calculations in real-time to the user. A snapshot of the predicted Telemedicine Capability Index
GUI is provided in Figure 4.5. One final improvement can be to add a module to obtain instant
evaluations from the users and add these values to the relevant session database for future use.
Figure 4.5 GUI for Telemedicine Capability Index Indicator
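One simple way the correspondence module could map real-time measurements to a MOS value is a nearest-neighbor lookup in the quality database, sketched below. The record layout, the scaling constants, and the nearest-neighbor rule itself are hypothetical; the actual calculation will depend on the results of the second stage.

```python
# Hypothetical scales that make loss, delay, and jitter comparable in one distance.
SCALES = {"loss_rate": 0.01, "delay_ms": 50.0, "jitter_ms": 10.0}

def estimate_mos(measured, rows):
    """Return the MOS of the stored record closest to the measured conditions.

    measured: dict with 'codec', 'loss_rate', 'delay_ms', 'jitter_ms'.
    rows: quality database records with the same keys plus 'mos'.
    Returns None when no record exists for the codec in use.
    """
    candidates = [r for r in rows if r["codec"] == measured["codec"]]
    if not candidates:
        return None

    def distance(row):
        return sum(((row[k] - measured[k]) / SCALES[k]) ** 2 for k in SCALES)

    return min(candidates, key=distance)["mos"]
```

The value returned would then be what the Telemedicine Capability Index GUI displays, refreshed as new measurements arrive during the session.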
4.4 Research Methodology
This study focuses on three research objectives. First, provide a telemedicine taxonomy as a
method to classify different telemedicine events while defining them based on five dimensions.
Second, evaluate the quality of information necessary to make medical decisions under
fluctuating conditions of network and application parameters. Third, develop an artifact that
provides a real-time quality and capability index for users based on evaluation results. To meet
these research objectives, a hybrid research methodology is utilized.
This study will first define a new taxonomy based on the existing definitions and theories for
telemedicine after an extensive literature review. Later, an evaluation study will be conducted to
complete the second stage. There are two types of evaluation – formative and summative. As
described in [55] (p.208) “Formative evaluation is intended to help in the development of the
programme, innovation or whatever is the focus of evaluation. Summative evaluation
concentrates on assessing the effects and effectiveness of the programme.” In the context of this
study, the formative evaluation will address the effects of impairments caused by the network or
application parameters on perceived quality and hence, the medical decision making capability
of a physician. The evaluation will utilize two different data collection techniques – objective
data collection (quantitative) and subjective data collection (qualitative). Quantitative methods
will be used to analyze the results of the experiments.
The results of the formative evaluation will then be used to build an artifact. As stated by
Hevner et al. in [56] “Design science,…, creates and evaluates IT artifacts intended to solve
identified organizational problems.” They also state that the goal of behavioral science research
is truth whereas the goal of design science research is utility. “Truth informs design and utility
informs theory [56].” Utility is also one of the four features of evaluation research design as
reported in [55] (p.209). Hence, both research methodologies are expected to produce an
outcome that is useful to its intended audience. The audience of this study is users of
telemedicine systems. Using formative evaluation and design research methodologies together to
build an artifact will provide utility for this audience.
4.5 Contributions and Potential Implications
The first research contribution of this study will be the telemedicine taxonomy, which
addresses multiple dimensions of telemedicine environments that need to be considered while
planning such systems or operating them. Implications of using this taxonomy may be important
for physicians, patients, medical organizations, and researchers. Using this taxonomy can also
help medical providers in understanding the building blocks of telemedicine systems and provide
them with possible explanations as to why their telemedicine system is a success or a failure.
Moreover, providers can outline their current status under each dimension, learn where they fit in
the taxonomy, and utilize this positioning to initiate new services that they are capable of
providing. This taxonomy may also help patients grasp the unfamiliar world of telemedicine,
inform their telemedicine expectations, and evaluate their own telemedicine capabilities at home
or in their local communities. This may eventually improve the acceptance and adoption of
telemedicine applications among patients. Organizations, such as HMOs and hospitals, can make
use of this taxonomy while identifying which dimensions are most critical for the services they
provide currently or in the future. It can be used as a guideline for planning or evaluating existing
or new services by hospital management. Finally, for the researchers, this taxonomy presents an
original effort to put all important telemedicine dimensions and their interactions together in
order to develop a comprehensive taxonomy and provides a method to compare and contrast
different efforts and studies in the field.
The channel used to deliver telemedicine services is always limited. It should be used wisely to
allocate enough capacity based on the priority of data required on each end. Unfortunately, every
channel and setting can support only a limited variety of medical dimensions. Before starting a
telemedicine event, it is useful to understand the capabilities of the support dimensions in hand
and what types of scenarios under specific medical dimensions can utilize that capability. The
measurement results of this study will help to further understand the acceptable quality levels for
confidently making medical decisions in a given telemedicine channel. The subjective results of
the experiment will be a first step in understanding the effects of impairments on telemedicine
events. Even though many studies have been conducted to measure user acceptance of
telemedicine systems, very few studies consider quality of information and its effects on decision
making. Using standards to measure perceived quality, this study will extend the telemedicine
research within the information systems field.
The final contribution of this study will be a videoconferencing client with a capability index
indicator. This new tool will fill a technology gap by providing a low-cost, low-bandwidth
telemedicine tool that can be applied in several settings. The quality indicator will help users
make decisions regarding the sessions they are planning through the existing channels. Even
though the telemedicine capability index will be limited to a very small subset of possible
telemedicine settings (cardiology and ophthalmology diagnosis), it can be improved by further
research since the procedures and tests necessary to conduct the experiments are selected from
standards available today. Moreover, the results of this study can be a starting point for
understanding how the objective/subjective audio/video quality assessment should be carried out
to extend the results to a larger subset by further experiments.
4.6 Timeline
The timeline for this study is provided in Figure 4.6 below.
Figure 4.6 Study Timeline
References
[1] C. I. Jones, "Why have health expenditures as a share of GDP risen so much?," National
Bureau of Economic Research, Working Paper 9325, 2002.
[2] S. K. Moore, "Extending Healthcare's Reach: Telemedicine can help spread medical
expertise around the globe," IEEE Spectrum, vol. 39, pp. 66 - 71, 2002.
[3] J.-M. Ho, J.-C. Hu, and P. Steenkiste, "A conference gateway supporting interoperability
between SIP and H.323," presented at the ninth ACM International Multimedia
Conference, Ottawa, Canada, 2001.
[4] PricewaterhouseCoopers, "HealthCast 2010: Smaller World, Bigger Expectations,"
PricewaterhouseCoopers, 1999.
[5] H. C. J. Linderoth, "Implementation and Evaluation of Telemedicine - a Catch 22?,"
presented at 35th Hawaii International Conference on Systems Sciences, Hawaii, USA,
2002.
[6] K. Hung and Y. T. Zhang, "On the feasibility of the usage of WAP devices in
telemedicine," presented at IEEE EMBS International Conference on Information
Technology Applications in Biomedicine, Arlington, Virginia - USA, 2000.
[7] R. L. Bashshur, "Telemedicine and Health Care," Telemedicine Journal and e-Health,
vol. 8, pp. 5-12, 2002.
[8] E. A. Miller, "Telemedicine and doctor-patient communications: an analytical survey of
the literature," Journal of Telemedicine and Telecare, vol. 7, pp. 1-17, 2001.
[9] W. H. DeLone and E. R. McLean, "Information systems success revisited," presented at
35th Hawaii International Conference on Systems Sciences, Hawaii, USA, 2002.
[10] J. G. McDaniel, "Improving system quality through software evaluation," Computers in
Biology and Medicine, vol. 32, pp. 127-140, 2002.
[11] R. L. Bashshur, T. G. Reardon, and G. W. Shannon, "Telemedicine: A New Health Care
Delivery System," Annual Review of Public Health, vol. 21, pp. 613-37, 2000.
[12] American Nurses Association, Developing Telehealth Protocols: A Blueprint for Success.
Washington, DC: American Nurses Association, 2001.
[13] M. M. Maheu, P. Whitten, and A. Allen, E-Health, Telehealth, and Telemedicine: A
Guide to Start-Up and Success, First ed. San Francisco: Jossey-Bass Inc., 2001.
[14] Committee on Evaluating Clinical Applications of Telemedicine, Telemedicine: A Guide
to Assessing Telecommunications in Health Care. Washington, D.C.: National Academy
Press, 1996.
[15] T. A. Hall, "Objective Speech Quality Measures for Internet Telephony," presented at
Proceedings of SPIE on Voice Over IP (VoIP) Technology, 2001.
[16] S. Mohamed, "Automatic Evaluation of Real-Time Multimedia Quality: a Neural
Network Approach." Rennes: University of Rennes I, 2003.
[17] D. P. W. Ellis, "Evaluating Speech Separation Systems," in Perspectives on Speech
Separation, P. Divenyi, Ed. New York: Kluwer Academic Publishers, 2004.
[18] S. Wang, A. Sekey, and A. Gersho, "An objective measure for predicting subjective
quality of speech coders," IEEE Journal on Selected Areas in Communications, vol. 10,
pp. 819 - 829, 1992.
[19] S. Voran, "Objective Estimation of Perceived Speech Quality-Part I: Development of the
Measuring Normalizing Block Technique," IEEE Transactions on Speech and Audio
Processing, vol. 7, pp. 371-382, 1999.
[20] A. P. Markopoulou, F. A. Tobagi, and M. J. Karam, "Assessing the quality of voice
communications over internet backbones," IEEE/ACM Transactions on Networking, vol.
11, pp. 747-760, 2003.
[21] A. Watson and M. A. Sasse, "Measuring perceived quality of speech and video in
multimedia conferencing applications," presented at The Sixth ACM International
Conference on Multimedia, Bristol, United Kingdom, 1998.
[22] A. Watson, "Assessing the Quality of Audio and Video Components in Desktop
Multimedia Conferencing," in Department of Computer Science. London, UK: University
of London, 2001.
[23] S. Wolf and M. Pinson, "Video Quality Measurement Techniques," U.S. Department
of Commerce, National Telecommunications and Information Administration, NTIA
Report 02-392, June 2002.
[24] D. A. Rosenthal, "Analyses of selected variables effecting video streamed over IP,"
International Journal of Network Management, vol. 14, pp. 193-211, 2004.
[25] A. P. Markopoulou, "Assessing the Quality of Multimedia Communications Over
Internet Backbone Networks," in Department of Electrical Engineering. Stanford, CA:
Stanford University, 2002.
[26] R. H. Eikelboom, K. Yogesan, C. J. Barry, I. J. Constable, L. Jitskaia, P. H. House, and
M. L. Tay-Kearney, "Methods and limits of digital image compression of retinal images
for telemedicine," Investigative Ophthalmology and Visual Science, vol. 41, pp. 1916-24,
2000.
[27] P. C. Cosman, R. M. Gray, and R. A. Olshen, "Evaluating quality of compressed medical
images: SNR, subjective rating, and diagnostic accuracy," Proceedings of the IEEE, vol.
82, pp. 919-932, 1994.
[28] A. Przelaskowski, "Vector quality measure of lossy compressed medical images,"
Computers in Biology and Medicine, vol. 34, pp. 193-207, 2004.
[29] P. Dev, D. Harris, D. Gutierrez, A. Shah, and S. Senger, "End-to-End Performance
Measurement of Internet Based Medical Applications," presented at American Medical
Informatics Association (AMIA) Symposium, 2002.
[30] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. Peterson, R. Sparks, M.
Handley, and E. Schooler, "SIP: Session Initiation Protocol," Internet Engineering Task
Force RFC 3261, June 2002.
[31] H. Schulzrinne and J. Rosenberg, "The Session Initiation Protocol: Internet-centric
signaling," IEEE Communications Magazine, vol. 38, pp. 134 - 141, 2000.
[32] B. Tulu, S. Chatterjee, T. Abhichandani, and H. Li, "Secured video conferencing desktop
client for telemedicine," presented at 5th International Workshop on Enterprise
Networking and Computing in Healthcare Industry (Healthcom), Santa Monica, CA,
2003.
[33] K. Arabshian and H. Schulzrinne, "A SIP-based medical event monitoring system,"
presented at 5th International Workshop on Enterprise Networking and Computing in
Healthcare Industry (Healthcom), Santa Monica, CA, 2003.
[34] M. J. Field, Telemedicine: A Guide to Assessing Telecommunications in Health Care.
Washington, D.C.: National Academy Press, 1996.
[35] C. LeRouge, M. J. Garfield, and A. R. Hevner, "Quality attributes in Telemedicine Video
Conferencing," presented at 35th Hawaii International Conference on Systems Sciences,
Hawaii, USA, 2002.
[36] D. R. Kaufman, V. L. Patel, C. Hilliman, P. C. Morin, J. Pevzner, R. S. Weinstock, R.
Goland, S. Shea, and J. Starren, "Usability in the real world: assessing medical
information technologies in patients’ homes," Journal of Biomedical Informatics, vol. 36,
pp. 45-60, 2003.
[37] E. Coiera, Guide to Medical Informatics, The Internet and Telemedicine, First ed.
London, UK: Chapman & Hall, 1997.
[38] R. L. Glueckauf, J. D. Whitton, and D. W. Nickelson, "Telehealth: The New Frontier in
Rehabilitation and Health Care," in Assistive Technology: Matching Device and
Consumer for Successful Rehabilitation, M. J. Scherer, Ed., 1st ed. Washington D.C.:
American Psychological Association, 2002.
[39] T. L. Huston and J. L. Huston, "Is Telemedicine a Practical Reality?," Communications
of the ACM, vol. 43, pp. 91-95, 2000.
[40] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: A Transport Protocol
for Real-Time Applications," Internet Engineering Task Force (IETF) RFC 3550, July
2003.
[41] M. Hassan, A. Nayandoro, and M. Atiquzzaman, "Internet telephony: services, technical
challenges, and products," IEEE Communications Magazine, vol. 38, pp. 96 - 103, 2000.
[42] C. Demichelis and P. Chimento, "IP Packet Delay Variation Metric for IP Performance
Metrics (IPPM)," Internet Engineering Task Force (IETF), RFC 3393, November 2002.
[43] J.-W. Suh and Y.-S. Ho, "Error Concealment Techniques for Digital TV," IEEE
Transactions on Broadcasting, vol. 48, pp. 299-306, 2002.
[44] B. Goode, "Voice Over Internet Protocol (VoIP)," Proceedings of the IEEE, vol. 90, pp.
1495-1517, 2002.
[45] S. Shenker and J. Wroclawski, "General Characterization Parameters for Integrated
Service Network Elements," Internet Engineering Task Force (IETF), RFC 2215,
September 1997.
[46] S. Blake, D. Black, M. Carlson, E. Davies, Z. Wang, and W. Weiss, "An Architecture for
Differentiated Services," Internet Engineering Task Force (IETF), RFC 2475, December
1998.
[47] R. Braden, L. Zhang, S. Berson, S. Herzog, and S. Jamin, "Resource Reservation
Protocol (RSVP) - Version 1 Functional Specification," Internet Engineering Task Force
(IETF), RFC 2205, September 1997.
[48] S. Shenker, C. Partridge, and R. Guerin, "Specification of Guaranteed Quality of
Service," Internet Engineering Task Force (IETF), RFC 2212, September 1997.
[49] J. Wroclawski, "Specification of the Controlled-Load Network Element Service," Internet
Engineering Task Force (IETF), RFC 2211, September 1997.
[50] K. Nichols, S. Blake, F. Baker, and D. Black, "Definition of the Differentiated Services
Field (DS Field) in the IPv4 and IPv6 Headers," Internet Engineering Task Force (IETF),
RFC 2474, December 1998.
[51] E. Rosen, A. Viswanathan, and R. Callon, "Multiprotocol Label Switching Architecture,"
Internet Engineering Task Force (IETF), RFC 3031, January 2001.
[52] G. J. Armitage, "Revisiting IP QoS: why do we care, what have we learned? ACM
SIGCOMM 2003 RIPQOS workshop report," ACM SIGCOMM Computer
Communication Review, vol. 33, pp. 81 - 88, 2003.
[53] L. Cottrell, "Network Monitoring Tools,"
http://www.slac.stanford.edu/xorg/nmtf/nmtf-tools.html#public, accessed on: November 4, 2004.
[54] B. Tulu, T. Abhichandani, S. Chatterjee, and H. Li, "Design and Development of a SIP-
based Video-Conferencing Application," presented at IEEE 6th International Conference
on High Speed Networks and Multimedia Communications, Estoril, Portugal, 2003.
[55] C. Robson, Real World Research, Second ed. Malden, Massachusetts: Blackwell
Publishers Inc., 2002.
[56] A. R. Hevner, S. T. March, J. Park, and S. Ram, "Design Science in Information Systems
Research," MIS Quarterly, vol. 28, pp. 75-105, 2004.