Voice in Packets: RTP, RTCP, Header Compression, Playout Algorithms, Terminal Requirements and Implementations
Jani Lakkakorpi, S-38.130 Licentiate Course on Telecommunications Technology, April 6, 2001
Problems with Voice over IP
The received packet stream has to be buffered before playout in order to restore the original packet spacing (and possibly packet order).
Small VoIP packets carry a large overhead. For example: 24 bytes of payload (G.723.1) and 60 bytes of overhead (RTP/UDP/IPv6).
Header compression is necessary to reduce the overhead, and thus the transmission delay, on slow links.
Terminal requirements usually grow with the voice compression ratio.
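The overhead figures above can be checked with a short calculation. This is only a sketch: the header sizes are the fixed header lengths without options or CSRC entries, and the 30 ms frame interval and one-frame-per-packet packetization are assumptions made for the illustration, as are the function names.

```python
# Fixed header sizes in bytes (no IP options, no CSRC entries).
RTP_HEADER = 12
UDP_HEADER = 8
IPV6_HEADER = 40


def overhead_bytes(ip_header_bytes: int) -> int:
    """Total RTP/UDP/IP header bytes in one packet."""
    return RTP_HEADER + UDP_HEADER + ip_header_bytes


def overhead_ratio(payload_bytes: int, ip_header_bytes: int) -> float:
    """Fraction of each packet that is header rather than voice payload."""
    header = overhead_bytes(ip_header_bytes)
    return header / (header + payload_bytes)


def bit_rate_on_wire(payload_bytes: int, ip_header_bytes: int, frame_ms: int) -> float:
    """Bit rate in bit/s when one packet is sent every frame_ms milliseconds.

    frame_ms is an assumption of this sketch (one codec frame per packet).
    """
    total = overhead_bytes(ip_header_bytes) + payload_bytes
    return total * 8 * 1000 / frame_ms


if __name__ == "__main__":
    print(overhead_bytes(IPV6_HEADER))            # header bytes per packet
    print(overhead_ratio(24, IPV6_HEADER))        # share of header bytes
    print(bit_rate_on_wire(24, IPV6_HEADER, 30))  # bit/s including headers
```

With 24 bytes of payload, roughly 71 % of every RTP/UDP/IPv6 packet is header, which is why compression matters on slow links.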
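[Figure: sent, received, and synchronized packet streams on a time axis, with an IP cloud between sender and receiver; packet illustration labeled 60 and 24 (bytes of overhead and payload)]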
RTP and RTCP
Real-Time Transport Protocol (RTP) provides end-to-end transport functions for applications that transmit real-time data, such as VoIP.
RTP does not provide any Quality of Service guarantees; it only carries the information needed to synchronize the received packets: timestamps and sequence numbers.
RTP Control Protocol (RTCP) gives feedback on the quality of data transmission and information about the participants of the session.
Both are specified in RFC 1889.
RTP Header (1)
Version (V, 2 bits): RFC 1889: V=2.
Padding (P, 1 bit): If P=1, there are padding octets at the end of the payload. The last octet of the padding contains the number of padding octets.
Extension (X, 1 bit): If X=1, the fixed header is followed by a header extension (RFC 1889).
CSRC Count (CC, 4 bits): The number of contributing source identifiers.
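[Figure: RTP header layout, one 32-bit word per row]
  V (2) | P (1) | X (1) | CC (4) | M (1) | PT (7) | Sequence Number (16)
  Timestamp (32)
  Synchronization Source (SSRC) Identifier (32)
  Contributing Source (CSRC) Identifiers (0…15 items, 32 bits each)
  ...
  Profile-specific Extensions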
RTP Header (2)
Marker (M, 1 bit): Marks significant events such as first packets in talkspurts.
Payload Type (PT, 7 bits): The format of RTP payload.
Sequence Number (16 bits): Starts from a random value and is incremented by one for each sent packet. Used by the receiver to detect packet losses and to restore original packet sequence.
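Because the sequence number is only 16 bits wide, a receiver has to compare sequence numbers modulo 65536. The following is a minimal sketch of such a comparison; the function names and the three-way classification are hypothetical and chosen only for illustration.

```python
def seq_delta(new_seq: int, last_seq: int) -> int:
    """Signed distance from last_seq to new_seq in 16-bit modular arithmetic.

    A positive result means new_seq is ahead of last_seq, a negative
    result means it is behind (late or reordered packet).
    """
    diff = (new_seq - last_seq) & 0xFFFF
    return diff - 0x10000 if diff >= 0x8000 else diff


def classify(new_seq: int, last_seq: int):
    """Classify a packet relative to the last in-order sequence number.

    Returns a (label, missing) pair, where missing is the number of
    sequence numbers skipped between last_seq and new_seq.
    """
    delta = seq_delta(new_seq, last_seq)
    if delta == 1:
        return ("in_order", 0)
    if delta > 1:
        return ("gap", delta - 1)
    return ("late_or_duplicate", 0)


if __name__ == "__main__":
    print(classify(0, 65535))   # wrap-around, still in order
    print(classify(3, 65535))   # 0, 1 and 2 are missing
    print(classify(65535, 0))   # arrived after its successor
```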
RTP Header (3)
Timestamp (32 bits): The sampling instant of the first payload octet. Clock frequency is defined for each payload type, and the clock is initialized with a random value.
SSRC (32 bits): Identifies the synchronization source. Randomly chosen.
CSRC list (0…15 items, 32 bits each): Identifies the contributing sources for the payload of this packet. Inserted by mixers.
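The field widths described on the RTP header slides can be turned into a small parser. This is a sketch using Python's standard struct module; the function name and the dictionary keys are hypothetical, and header extensions and padding are not processed.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Parse the fixed RTP header and the CSRC list from a raw packet."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the 12-byte fixed RTP header")
    # Two flag octets, 16-bit sequence number, 32-bit timestamp, 32-bit SSRC,
    # all in network byte order.
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    cc = b0 & 0x0F                      # CSRC count, lowest 4 bits
    csrc_end = 12 + 4 * cc
    if len(packet) < csrc_end:
        raise ValueError("packet too short for the announced CSRC list")
    csrc = list(struct.unpack("!%dI" % cc, packet[12:csrc_end]))
    return {
        "version": b0 >> 6,             # V, 2 bits
        "padding": (b0 >> 5) & 1,       # P, 1 bit
        "extension": (b0 >> 4) & 1,     # X, 1 bit
        "csrc_count": cc,               # CC, 4 bits
        "marker": b1 >> 7,              # M, 1 bit
        "payload_type": b1 & 0x7F,      # PT, 7 bits
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "csrc": csrc,
        "payload_offset": csrc_end,     # valid only when X=0
    }


if __name__ == "__main__":
    # V=2, CC=1, M=1, PT=4, followed by one CSRC and 24 payload bytes.
    raw = struct.pack("!BBHIII", 0x81, 0x84, 1000, 160, 0xDEADBEEF, 0x01020304)
    print(parse_rtp_header(raw + bytes(24)))
```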
RTCP
RTCP provides feedback on the quality of data distribution.
RTCP packet types:
  Sender Report (SR): contains transmission and reception statistics for active senders.
  Receiver Report (RR): contains reception statistics for participants that are not active senders.
  Source Description Items (SDES): describe various parameters about the source.
  BYE: sent when a participant leaves the session.
  APP: application-specific functions.
RTCP Header: Sender Report (1)
Version (V, 2 bits): RFC 1889: V=2.
Padding (P, 1 bit): If P=1, there are padding octets at the end.
Reception Report Count (RC, 5 bits): The number of report blocks in this report.
Packet Type (PT, 8 bits): Sender Report: 200.
Length (16 bits): Length of this packet in 32-bit words minus one, including header and padding.
SSRC (32 bits): Synchronization source ID of the originator of this report.
[Figure: RTCP Sender Report layout, one 32-bit word per row]
  V (2) | P (1) | RC (5) | PT=SR=200 (8) | Length (16)
  SSRC of Sender
  NTP Timestamp, Most Significant Word
  NTP Timestamp, Least Significant Word
  RTP Timestamp
  Sender's Packet Count
  Sender's Octet Count
  SSRC_n
  Fraction Lost (8) | Cumulative Number of Packets Lost (24)
  Extended Highest Sequence Number Received
  Interarrival Jitter
  Last SR Timestamp
  Delay Since Last SR
  …
  Profile-specific Extensions
RTCP Header: Sender Report (2), Sender Information Section (Only present in SRs)
NTP Timestamp (64 bits): The wallclock time when this report was sent.
RTP Timestamp (32 bits): Represents the same time as the NTP timestamp, but with the same units and random offset as in the timestamps of RTP packets. May be used for synchronization.
Sender's Packet Count (32 bits): From the start of transmission until the time this report was generated.
Sender's Octet Count (32 bits): Only payload included.
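The 64-bit NTP timestamp consists of a 32-bit count of seconds since 1 January 1900 and a 32-bit binary fraction of a second. The sketch below converts a Unix time to the two words of the sender information section and extracts the middle 32 bits, which are used in the reception report blocks. The function names are hypothetical.

```python
# Seconds between the NTP epoch (1 Jan 1900) and the Unix epoch (1 Jan 1970).
NTP_UNIX_OFFSET = 2208988800


def unix_to_ntp(unix_seconds: float):
    """Return (most significant word, least significant word) of an NTP timestamp."""
    whole = int(unix_seconds)
    fraction = unix_seconds - whole
    msw = (whole + NTP_UNIX_OFFSET) & 0xFFFFFFFF  # seconds since 1900
    lsw = int(fraction * 2**32) & 0xFFFFFFFF      # binary fraction of a second
    return msw, lsw


def ntp_middle32(msw: int, lsw: int) -> int:
    """Middle 32 bits of the 64-bit timestamp, in units of 1/65536 s."""
    return ((msw & 0xFFFF) << 16) | (lsw >> 16)


if __name__ == "__main__":
    msw, lsw = unix_to_ntp(1.5)
    print(hex(msw), hex(lsw), hex(ntp_middle32(msw, lsw)))
```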
RTCP Header: Sender Report (3), Reception Report Blocks
One block for each source that we have heard from since the last SR/RR.
SSRC_n (32 bits): Synchronization source ID of the source that we are reporting about.
Fraction Lost (8 bits): The fraction of packets lost since the previous SR/RR was sent.
Cumulative Number of Packets Lost (24 bits): Since the beginning of reception.
Extended Highest Sequence Number Received (32 bits): The highest sequence number received in an RTP packet (low 16 bits) and the corresponding count of sequence number cycles (high 16 bits).
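The extended highest sequence number and the cumulative loss count can be maintained with a few lines of receiver state. This is a simplified sketch: the class and attribute names are hypothetical, duplicates are counted as received packets, and no validation of wildly out-of-range sequence numbers is done.

```python
class ReceptionStats:
    """Tracks sequence number cycles and packet loss for one source."""

    def __init__(self, first_seq: int):
        self.base_seq = first_seq   # first sequence number seen
        self.max_seq = first_seq    # highest sequence number seen (16 bits)
        self.cycles = 0             # number of 16-bit wrap-arounds
        self.received = 1           # packets received so far

    def update(self, seq: int) -> None:
        self.received += 1
        ahead = (seq - self.max_seq) & 0xFFFF
        if 0 < ahead < 0x8000:      # seq is newer than max_seq
            if seq < self.max_seq:  # numerically smaller: wrapped around
                self.cycles += 1
            self.max_seq = seq

    def extended_highest_seq(self) -> int:
        """Cycle count in the high 16 bits, highest sequence number in the low 16."""
        return (self.cycles << 16) | self.max_seq

    def cumulative_lost(self) -> int:
        """Packets expected so far minus packets actually received."""
        expected = self.extended_highest_seq() - self.base_seq + 1
        return expected - self.received


if __name__ == "__main__":
    stats = ReceptionStats(65534)
    for seq in (65535, 0, 2):       # sequence number 1 never arrives
        stats.update(seq)
    print(stats.extended_highest_seq(), stats.cumulative_lost())
```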
RTCP Header: Sender Report (4), Reception Report Blocks
Interarrival Jitter (32 bits): An estimate of the variance of the RTP packet interarrival time.
Last SR Timestamp (LSR, 32 bits): The middle 32 bits of the NTP timestamp from the most recent RTCP sender report issued by SSRC_n. If no sender report has been received yet, this field is set to zero.
Delay Since Last SR (DLSR, 32 bits): The delay between receiving the last SR from SSRC_n and sending this report. The sender of that SR can use it, together with the last SR timestamp, to compute the round trip time.
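The jitter and round trip time computations can be sketched as follows. The jitter update uses the running estimator given in RFC 1889 (gain 1/16), and the round trip time is the arrival time of the report minus LSR minus DLSR, with all three values in units of 1/65536 seconds. The function and parameter names are hypothetical.

```python
def update_jitter(jitter: float, prev_arrival: int, prev_ts: int,
                  arrival: int, ts: int) -> float:
    """One step of the interarrival jitter estimate.

    arrival and ts must be in the same units (RTP timestamp units).
    d is the change in relative transit time between two packets.
    """
    d = (arrival - prev_arrival) - (ts - prev_ts)
    return jitter + (abs(d) - jitter) / 16.0


def round_trip_seconds(arrival_mid32: int, lsr: int, dlsr: int) -> float:
    """Round trip time seen by the sender of the SR.

    arrival_mid32 is the middle 32 bits of the NTP time at which the
    reception report arrived; lsr and dlsr come from the report block.
    """
    rtt_units = (arrival_mid32 - lsr - dlsr) & 0xFFFFFFFF
    return rtt_units / 65536.0


if __name__ == "__main__":
    print(update_jitter(0.0, 0, 10, 35, 40))
    print(round_trip_seconds(0xB7108000, 0xB7052000, 0x00054000))
```

The second call uses a report arriving at 46864.500 s, with LSR 46853.125 s and DLSR 5.250 s, which leaves 6.125 s of round trip time.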
Playout Algorithms (1)
In most packet audio applications, the receiving host has to buffer packets in order to compensate for variable network delay.
Playout delay can be constant or adaptively adjusted.
Adaptive playout delay can be either per-talkspurt or per-packet based:
  In the former approach, playout delay remains constant throughout the talkspurt and the adjustments are done between talkspurts.
  The latter approach introduces gaps in speech and is therefore not suitable for VoIP.
There is a tradeoff between packet playout delay and packet playout loss.
  If a constant playout delay is too short, or an adaptive algorithm reacts slowly to delay "spikes", packets are lost.
Playout Algorithms (2)
Here we present a simple algorithm for adaptive playout delay adjustment:
For each received packet (except the first one), the waiting time in the playout buffer is calculated with the following formula:
$T_{wait,i} = (TimeStamp_i - TimeStamp_{i-1}) - (ReceivedAt_i - PlayAt_{i-1})$
If the result is negative, the packet has arrived too late and it is discarded. Otherwise, the packet is played out at:
$PlayAt_i = ReceivedAt_i + T_{wait,i}$
Whenever the playout delay is adjusted, the new delay is the maximum of the initial playout delay and the current playout delay minus the minimum $T_{wait}$ of the latest measurement period.
The following events trigger the delay adjustment process:
  If N or more packets among the last M packets (measurement period) arrive late, playout delay is adjusted upwards at the next talkspurt.
  Similarly, if M successive packets have all been received in time, playout delay is adjusted downwards at the next talkspurt.
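The waiting time formula can be sketched as a small playout buffer class. The class and method names are hypothetical. One detail is an assumption of this sketch: when a packet is discarded as late, the previously played packet stays the reference for the next calculation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PlayoutBuffer:
    """Computes per-packet playout times; all times are in milliseconds."""

    initial_wait_ms: float
    last_ts: Optional[float] = None       # timestamp of the last played packet
    last_play_at: Optional[float] = None  # playout time of the last played packet

    def on_packet(self, timestamp_ms: float, received_at_ms: float) -> Optional[float]:
        """Return the playout time of the packet, or None if it arrived too late."""
        if self.last_ts is None:
            # First packet: no reference yet, use the configured initial wait.
            wait = self.initial_wait_ms
        else:
            wait = (timestamp_ms - self.last_ts) - (received_at_ms - self.last_play_at)
        if wait < 0:
            return None  # too late, discard (reference is left unchanged)
        self.last_ts = timestamp_ms
        self.last_play_at = received_at_ms + wait
        return self.last_play_at


if __name__ == "__main__":
    buf = PlayoutBuffer(initial_wait_ms=30)
    for ts, arrived in [(10, 0), (40, 35), (70, 80), (100, 125), (130, 140)]:
        print(ts, arrived, buf.on_packet(ts, arrived))
```

Since the playout time always equals the previous playout time plus the timestamp difference, the original packet spacing is restored for every packet that is not discarded.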
Playout Algorithms (3)
Example:
First packet of the connection arrives at 0 ms. It has a timestamp of 10 ms. Waiting time for the first packet is set to, for example, 30 ms, so it is played out at 30 ms. (We do not assume that the sender and receiver clocks are synchronized.)
Second packet of the connection arrives at 35 ms. It has a timestamp of 40 ms. Waiting time is calculated in the following way:
$T_{wait,2} = (40 - 10) - (35 - 30) = 25$ ms, so the packet is played out at 60 ms.
Third packet of the connection arrives at 80 ms. It has a timestamp of 70 ms. Waiting time is calculated in the following way:
$T_{wait,3} = (70 - 40) - (80 - 60) = 10$ ms, so the packet is played out at 90 ms.
[Figure: sent packets (TS=10, TS=40, TS=70), received packets (T=0, T=35, T=80), and synchronized packets (T=30, T=60, T=90)]
Playout Algorithms (4)
Example (continued):
Too many packets have been lost (arrived too late) during the last measurement period, so it is time to adjust the delay.
Let's assume that the minimum $T_{wait}$ of the latest measurement period is -5 ms. We subtract this value from the waiting time of the first packet of the next talkspurt:
$T_{wait,i} = (TimeStamp_i - TimeStamp_{i-1}) - (ReceivedAt_i - PlayAt_{i-1}) - (-5)$ ms
Example values:
$T_{wait} = (3020 - 2000) - (3030 - 2020) - (-5) = 10 + 5 = 15$ ms.
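The adjustment rule and its two triggers can be sketched as follows. The slides leave some details open, so this sketch makes several assumptions: the "current playout delay" is tracked as a running value that starts at the initial delay, the measurement period is a sliding window of the last M waiting times, and the window is cleared after each adjustment. All names are hypothetical.

```python
from collections import deque


class DelayAdjuster:
    """Decides, between talkspurts, how much to shift the playout delay."""

    def __init__(self, initial_delay_ms: float, n_late: int, m_window: int):
        self.initial_delay_ms = initial_delay_ms
        self.delay_ms = initial_delay_ms       # current playout delay
        self.n_late = n_late                   # N: late packets that trigger an increase
        self.waits = deque(maxlen=m_window)    # last M computed waiting times

    def record(self, wait_ms: float) -> None:
        """Store the waiting time computed for one packet (negative = late)."""
        self.waits.append(wait_ms)

    def correction_ms(self) -> float:
        """Value to add to the wait of the first packet of the next talkspurt."""
        if not self.waits:
            return 0.0
        late = sum(1 for w in self.waits if w < 0)
        all_in_time = len(self.waits) == self.waits.maxlen and late == 0
        if late < self.n_late and not all_in_time:
            return 0.0                         # neither trigger fired
        # New delay: current delay minus the minimum wait, never below the initial delay.
        new_delay = max(self.initial_delay_ms, self.delay_ms - min(self.waits))
        correction = new_delay - self.delay_ms
        self.delay_ms = new_delay
        self.waits.clear()
        return correction


if __name__ == "__main__":
    adj = DelayAdjuster(initial_delay_ms=30, n_late=2, m_window=5)
    for w in (4, -5, 3, -2):
        adj.record(w)
    extra = adj.correction_ms()
    print(extra, (3020 - 2000) - (3030 - 2020) + extra)
```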
Header Compression
TCP header compression: RFC 1144.
RTP header compression: RFC 2508.
Basic idea: Since the difference between the header fields of successive RTP packets is often constant, it is enough to convey an indication that the second-order difference was zero. The next packet header can thus be constructed from the previous one by adding the first-order differences.
Other proposals: ROCCO (Ericsson), ACE (Nokia). These should perform slightly better than the mechanism described in RFC 2508.
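The basic idea can be illustrated with a toy encoder for a single header field. This is not the RFC 2508 wire format: the token names and the list-of-tuples representation are hypothetical, and a real compressor handles several fields, context identifiers, and error recovery.

```python
def compress(values):
    """Encode a field sequence as 'full', 'delta' and 'same' tokens.

    'same' means the second-order difference was zero, i.e. the field
    changed by the same amount as last time, so no value is sent.
    """
    tokens = []
    prev = None
    prev_delta = None
    for value in values:
        if prev is None:
            tokens.append(("full", value))       # first packet: send the value
        else:
            delta = value - prev
            if delta == prev_delta:
                tokens.append(("same",))         # second-order difference is zero
            else:
                tokens.append(("delta", delta))  # first-order difference changed
            prev_delta = delta
        prev = value
    return tokens


def decompress(tokens):
    """Rebuild the field sequence by adding the stored first-order difference."""
    values = []
    current = None
    delta = None
    for token in tokens:
        if token[0] == "full":
            current = token[1]
        elif token[0] == "delta":
            delta = token[1]
            current += delta
        else:  # "same"
            current += delta
        values.append(current)
    return values


if __name__ == "__main__":
    # RTP timestamps of 30 ms packets at 8 kHz, with one silence gap.
    timestamps = [1000, 1240, 1480, 1720, 2200, 2440]
    tokens = compress(timestamps)
    print(tokens)
    print(decompress(tokens) == timestamps)
```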
RTP/UDP/IP Header Compression (1)
In the IPv4 header, only the total length, packet ID, and header checksum fields typically change.
  Total length can be excluded (provided by the link layer).
  Header checksum can be dropped, too, since the link layer is relied on for good error detection.
  Changes in packet ID are transmitted. Usually packet ID is incremented by one for each packet.
In the IPv6 base header, only the payload length field changes.
[Figure: IPv4 header layout, one 32-bit word per row]
  Version | IHL | Type of Service | Total Length
  Packet ID | Flags | Fragment Offset
  Time to Live | Protocol | Header Checksum
  Source Address
  Destination Address
RTP/UDP/IP Header Compression (2)
In the UDP header, port numbers are not likely to change during a VoIP connection.
The length field is redundant with the IP total length field and the length indicated by the link layer.
If the source generates UDP checksums, they must be sent intact in order to preserve lossless compression.
[Figure: UDP header layout, one 32-bit word per row]
  Source Port | Destination Port
  Length | Checksum
RTP/UDP/IP Header Compression (3)
In most RTP headers, only the sequence number and timestamp change from packet to packet.
  If packets are not lost and they arrive in correct order, the sequence number is incremented by one for each packet.
  For VoIP packets of constant duration, the timestamp is incremented by the number of sample periods conveyed in each packet.
One bit in the compressed header is reserved for the marker bit.
  If the marker bit were treated as a constant field, every change in it would require sending the full RTP header, and the compression would become inefficient.
Some Terminal Requirements and Implementations
All terminals that support real-time voice must have considerable processing capacity.
  The computational requirements of voice codecs increase with the compression ratio.
Microsoft NetMeeting is a popular video conferencing tool.
  Requires a Pentium 90 processor or higher and 24 MB of RAM.
VocalTec Internet Phone Lite is mainly targeted at pure voice connections.
  Requires a Pentium 75 processor or higher.
Conclusions
The RTP/RTCP protocol suite provides the means for sending packetized voice by introducing timestamps and sequence numbers.
Playout buffering is needed to re-synchronize the received voice stream.
The RTP/UDP/IP overhead problem can be solved by efficient header compression.
Terminals that support real-time interactive voice must have considerable processing power. The computational requirements of the voice codecs typically increase with the voice compression ratio.