taking on webrtc in an enterprise

7
IEEE Communications Magazine • April 2013 48 0163-6804/13/$25.00 © 2013 IEEE INTRODUCTION WebRTC [1–3], the industry effort to add real- time voice and video communication capabilities to browsers, is receiving much attention and hype these days. As browser vendors announce timelines for support and developers show off demos, it is clear that these new standards, application programming interfaces (APIs), and protocols will have a major impact on the World Wide Web. Less often discussed is the impact on enterprises and comparisons to existing enter- prise communications. Typically, enterprise communication systems enable two-party phone calls or multiparty con- ference scenarios. The equipment enabling these session-centric client-server capabilities usually resides within the guarded enterprise network (e.g., behind the enterprise firewall or within a virtual private network, VPN). In contrast, WebRTC opens up several compelling opportu- nities that go beyond the classical enterprise communication views of sessions and guarded networks. For example, team members visiting the same internal project web page could auto- join a video conferencing application embedded in that page. As another example, customers vis- iting an enterprise website could initiate interac- tive conversations with its customer service agents — via voice or video — without asking the customers to call the enterprise using a phone. The agents, using WebRTC technology, participate in media flows from the browser rather than conventional session-centric telepho- ny gear, making it easier for the agents to live outside the enterprise communication system boundaries. Numerous other examples related to collaboration or other enterprise value proposi- tions that benefit from WebRTC are likely [4]. Some of these web-centric interactions will likely not have direct analogies in existing communica- tions systems, given the ease of building rich user interfaces and experiences with HTML5. Web developers working on consumer appli- cations and websites will likely be able to use WebRTC directly as initially specified. Using fairly simple JavaScript code will result in new multimedia communication capabilities being embedded in their application or site in new and interesting ways. However, there are some open issues related to enterprise adoption and use of WebRTC. This article illustrates and discusses a number of these issues. Some of these relate to security: firewall traversal, access control, and peer-to-peer data flows. Others relate to compli- ance: recording and logging. Additional issues relate to integration and interoperation with existing communication infrastructure and ses- sion-centric telephony equipment. Traditionally, the enterprise network enforces strict security requirements resulting in very restrictive firewalls and network border ele- ments. On the other hand, WebRTC strives to create an end-to-end secure media path between the two browser instances with little or no inter- ference from any intermediate entity. This fun- damental design principle poses several challenges to the traditional enterprise informa- tion technology (IT) mindset. Existing voice- over-IP (VoIP) tools and techniques used by enterprises are not enough to deal with the chal- lenges. Although we propose potential directions to solve them, ultimately the industry will decide how it adapts this emerging technology in enter- prises. ENTERPRISES AND FIREWALLS Enterprises use firewalls to enforce Internet Pro- tocol (IP) access policies at the edge of their networks. These policies relate to who is allowed to access which sites and resources. Firewalls are often implemented using 5-tuple rules (source and destination IP address, source and destina- tion ports, and transport protocol). Other fire- walls utilize deep packet inspection to determine the application using the transport connection and its characteristics. Typically, firewall devices include both filtering and address mapping tech- niques — the latter is known as Network Address (and port) Translation (NAT). Conven- tional firewalls were developed mainly to handle client-server protocols such as web browsing, email, and file transfer. A client-server packet ABSTRACT WebRTC, Web Real-Time Communications, will have a major impact on enterprise communi- cations, as well as consumer communications and their interactions with enterprises. This article illustrates and discusses a number of issues that are specific to WebRTC enterprise usage. Some of these relate to security: firewall traversal, access control, and peer-to-peer data flows. Oth- ers relate to compliance: recording, logging, and enforcing enterprise policies. Additional enter- prise considerations relate to integration and interoperation with existing communication infra- structure and session-centric telephony systems. WEB-BASED COMMUNICATIONS Alan Johnston, John Yoakum, and Kundan Singh, Avaya Inc Taking on WebRTC in an Enterprise

Upload: kundan

Post on 27-Jan-2017

215 views

Category:

Documents


3 download

TRANSCRIPT

IEEE Communications Magazine • April 201348 0163-6804/13/$25.00 © 2013 IEEE

INTRODUCTION

WebRTC [1–3], the industry effort to add real-time voice and video communication capabilitiesto browsers, is receiving much attention andhype these days. As browser vendors announcetimelines for support and developers show offdemos, it is clear that these new standards,application programming interfaces (APIs), andprotocols will have a major impact on the WorldWide Web. Less often discussed is the impact onenterprises and comparisons to existing enter-prise communications.

Typically, enterprise communication systemsenable two-party phone calls or multiparty con-ference scenarios. The equipment enabling thesesession-centric client-server capabilities usuallyresides within the guarded enterprise network(e.g., behind the enterprise firewall or within avirtual private network, VPN). In contrast,WebRTC opens up several compelling opportu-nities that go beyond the classical enterprisecommunication views of sessions and guardednetworks. For example, team members visitingthe same internal project web page could auto-join a video conferencing application embeddedin that page. As another example, customers vis-iting an enterprise website could initiate interac-tive conversations with its customer serviceagents — via voice or video — without askingthe customers to call the enterprise using aphone. The agents, using WebRTC technology,participate in media flows from the browserrather than conventional session-centric telepho-ny gear, making it easier for the agents to liveoutside the enterprise communication systemboundaries. Numerous other examples related tocollaboration or other enterprise value proposi-tions that benefit from WebRTC are likely [4].

Some of these web-centric interactions will likelynot have direct analogies in existing communica-tions systems, given the ease of building richuser interfaces and experiences with HTML5.

Web developers working on consumer appli-cations and websites will likely be able to useWebRTC directly as initially specified. Usingfairly simple JavaScript code will result in newmultimedia communication capabilities beingembedded in their application or site in new andinteresting ways. However, there are some openissues related to enterprise adoption and use ofWebRTC. This article illustrates and discusses anumber of these issues. Some of these relate tosecurity: firewall traversal, access control, andpeer-to-peer data flows. Others relate to compli-ance: recording and logging. Additional issuesrelate to integration and interoperation withexisting communication infrastructure and ses-sion-centric telephony equipment.

Traditionally, the enterprise network enforcesstrict security requirements resulting in veryrestrictive firewalls and network border ele-ments. On the other hand, WebRTC strives tocreate an end-to-end secure media path betweenthe two browser instances with little or no inter-ference from any intermediate entity. This fun-damental design principle poses severalchallenges to the traditional enterprise informa-tion technology (IT) mindset. Existing voice-over-IP (VoIP) tools and techniques used byenterprises are not enough to deal with the chal-lenges. Although we propose potential directionsto solve them, ultimately the industry will decidehow it adapts this emerging technology in enter-prises.

ENTERPRISES AND FIREWALLSEnterprises use firewalls to enforce Internet Pro-tocol (IP) access policies at the edge of theirnetworks. These policies relate to who is allowedto access which sites and resources. Firewalls areoften implemented using 5-tuple rules (sourceand destination IP address, source and destina-tion ports, and transport protocol). Other fire-walls utilize deep packet inspection to determinethe application using the transport connectionand its characteristics. Typically, firewall devicesinclude both filtering and address mapping tech-niques — the latter is known as NetworkAddress (and port) Translation (NAT). Conven-tional firewalls were developed mainly to handleclient-server protocols such as web browsing,email, and file transfer. A client-server packet

ABSTRACT

WebRTC, Web Real-Time Communications,will have a major impact on enterprise communi-cations, as well as consumer communications andtheir interactions with enterprises. This articleillustrates and discusses a number of issues thatare specific to WebRTC enterprise usage. Someof these relate to security: firewall traversal,access control, and peer-to-peer data flows. Oth-ers relate to compliance: recording, logging, andenforcing enterprise policies. Additional enter-prise considerations relate to integration andinteroperation with existing communication infra-structure and session-centric telephony systems.

WEB-BASED COMMUNICATIONS

Alan Johnston, John Yoakum, and Kundan Singh, Avaya Inc

Taking on WebRTC in an Enterprise

YOAKUM LAYOUT_Layout 1 4/1/13 1:22 PM Page 48

IEEE Communications Magazine • April 2013 49

sent from inside the firewall to the outside typi-cally creates a pinhole at the firewall so that thepackets in the reverse 5-tuple flow are notblocked. Peer-to-peer communication systemsand protocols are a bigger challenge to firewallsand other policy enforcement devices.

Real-time communication flows of Real-TimeTransport Protocol (RTP) [5] packets are typi-cally described as peer-to-peer flows. That is,they are often established directly between thetwo communicating devices. Servers are oftenused to help establish these flows, but routingthe resulting media session through these serversis often undesirable for a number of reasons:1. Real-time media flows are extremely sensi-

tive to latency or delay. Peer-to-peer flowstypically result in the lowest possible laten-cy.

2. Direct peer-to-peer media flows often havefewer IP hops than relayed traffic, whichresults in a lower chance of packet loss.

3. Servers used for signaling are often not dis-tributed geographically; nor do they haveenough bandwidth to be used successfullyas media relays.Peer-to-peer flows have been historically dif-

ficult to get through enterprise firewalls. Duringthe early days of Session Initiation Protocol(SIP) [6] VoIP deployment and testing, this wasfrequently encountered in the form of the “one-way media problem,” where media could be sentout from inside the firewall, but the reverse flowmedia was blocked. This occurred in cases wherethe signaling was able to traverse the firewall,due to its similarities to client-server protocols.

A number of approaches were developed tomake firewall traversal easier. They include:• Symmetric RTP [7]. A bidirectional media

session is actually two unidirectional RTPflows. In particular, the user agent uses thesame User Datagram Protocol (UDP) portfor sending and receiving the RTP stream.Using symmetric RTP, it is possible to makethese two RTP flows seem more like a sin-gle bidirectional flow, which more easilytraverses firewalls.

• Interactive Connectivity Establishment(ICE) [8], which formalizes the “hole-punching” approaches developed by peer-to-peer gamers. This approach uses testpackets sent by both participants in a ses-sion to establish filter rules in firewalls andalso NAT devices.

For enterprise firewall traversal of communica-tion, the most used approach today involves asession border controller (SBC).

SESSION BORDER CONTROLLERSAND FIREWALL TRAVERSAL

An SBC [9] is essentially an application layerfirewall with a signaling and media applicationlayer gateway (ALG) built in. The SBC is usuallyconnected in an enterprise demilitarized zone(DMZ) as a trusted enterprise network element.It blocks all unauthorized signaling and mediaflows, and provides a point of policy enforce-ment as shown in Fig. 1. Today’s SBCs supportSIP and RTP, including Secure RTP (SRTP).

SIP is a signaling protocol used for VoIP andvideo communication to establish RTP (orSRTP) media sessions. By parsing SIP messages,the SBC is able to discover the transport address-es (5-tuple) to be used for the media session. Ifthe SIP traffic is authenticated and authorized,the SBC either opens a pinhole (filter rule per-mitting the RTP traffic) or activates an RTPrelay. In both cases, the resulting RTP mediasession is able to traverse the firewall.

In addition to firewall traversal, SBCs alsoprovide a number of other services, includingprotocol normalization, media transcoding, andprotection against malicious packets and pay-loads. These features are not directly related tothe firewall traversal problem discussed here, butare highly useful in enterprises.

Since enterprises have widely deployed SBCsfor communication firewall traversal, it wouldseem logical to reuse them for WebRTC. How-ever, there are a number of problems with thisapproach. First, it relies on using the signalingchannel to authenticate the media channel. WithWebRTC, there is no standard signaling chan-nel. Second, SBCs rely on inserting themselvesinto the control path to learn when media flowsare beginning and ending. With WebRTC, thecontrol path is the HTTP or WebSocket channelbetween the browser and web server, and will berunning over Transport Layer Security (TLS)and hence be encrypted and not available toobserve, as shown in Fig. 1. Finally, SBCs use anidentity determined from the signaling channelto authenticate the media channel. In WebRTC,there is no standard way to indicate identity. Thefew identity mechanisms that have been dis-cussed in standards bodies are related to theidentity in the media path, and this identity is ofa different nature than that with which SBCs areaccustomed to dealing.

In addition, WebRTC has no concept of ses-sions; instead, it has a concept of streams .Streams have media sources and sinks that gen-erate and consume media flows. They are creat-ed and manipulated using JavaScript, resultingin peer connections being established. Streams

Figure 1. The SBC intercepts SIP/SDP signaling on port 5060, applies policies,and opens firewall pinholes to allow an RTP media path for enterprise com-munication, but WebRTC traffic uses HTTP Secure (HTTPS), which is notintercepted, and hence the media path is blocked.

PublicInternet

Enterpriseboundary

Innerfirewall

Outerfirewall

Enterprisenetwork

DMZ

SIPsystem

SIPphone

Webserver

Webbrowser

Webbrowser

SBCSIP and SDP SIP and SDP

HTTP or HTTPS HTTP or HTTPS

RTP or SRTP RTP or SRTP

SRTP and ICE(WebRTC)

SRTP and ICE(WebRTC)

YOAKUM LAYOUT_Layout 1 4/1/13 1:22 PM Page 49

IEEE Communications Magazine • April 201350

can be created for media to flow point-to-pointor between a browser and a media server ormixer. There is no direct correspondencebetween peer connections and participants in amultiparty session. Once a peer connection hasbeen established between a browser and a mediaselector, additional participants can be added atany time. For example, media from multiple par-ticipants can be mixed or offered by the mediamixer or selector.

One possible approach would be for an enter-prise to attempt to convert every WebRTC ses-sion that crosses enterprise boundaries into acommunication session that its existing infra-structure could handle in a session-centric man-ner. This could, for example, mean converting aWebRTC session into a SIP session, applyingpolicy and authentication, then converting backagain to provide it to the other browser (Fig. 2).This type of man-in-the-middle approach is notpractical, especially since WebRTC capabilitiescan be embedded in many types of web applica-tions and sites, and many of them follow a verydifferent paradigm of real-time communicationapplications and services. In particular, a one-to-one translation from a WebRTC stream to a SIPsession is not trivial without breaking existingSIP implementations. In addition, all control andmedia in WebRTC are encrypted.

These problems do not mean that WebRTCwill never cross enterprise firewalls, but rather thatnew approaches must be used. Some of theseapproaches will likely be novel, and discoveredthrough actual deployment and rollout ofWebRTC. However, there are some indications ofdirections this may take in today’s standards andapproaches. The rest of this article presents ideasto answer three important questions: How can theenterprise firewall adapt to WebRTC? How canthe enterprise detect and apply policy to WebRTCflows? And how can the emerging WebRTC appli-cations integrate and interoperate with existingenterprise communication equipment?

WEBRTC FIREWALL TRAVERSALFrom the previous discussion, it is clear thatWebRTC firewall traversal must work without astandardized signaling protocol, without a con-

ventional signaling identity, and without a con-cept of sessions that can be managed or con-trolled. This might seem like a daunting task,but there are some potential options.

It should be clear that the term session bor-der controller is not applicable to the elementthat assists WebRTC in traversing enterprisefirewalls. While the border part may be accurate,the session part is not applicable, and any notionof controlling WebRTC streams without accessto a signaling channel is also not realistic. In thefollowing discussion, we refer to the network ele-ment enabling enterprise edge traversal as asecure edge, just so we can have a label. Thereare a number of ways in which this secure edgemight be designed.

DETECT ICE EXCHANGEFor one, there is a type of standard signalingprotocol used for establishing media flowsdefined as part of WebRTC. It is built into theICE protocol used for hole-punching. Beforeany media data flows, ICE is run between thetwo browsers. The secure edge could detect anICE exchange starting up across the enterpriseborder. This could be used to distinguish aWebRTC media flow from a random packetflow across the border, allowing policy to beapplied.

In addition, there is a username/passwordfragment in ICE Session Traversal Utilities forNAT (STUN) messages. Normally, this informa-tion is randomly generated. If this informationwere generated in a particular way or coordinat-ed with an enterprise authentication system, thesecure edge could authenticate the ICE exchangeand hence the resulting media flow. One propos-al for how to do this is described in [10]. InWebRTC, ICE is run by the browser. While theJavaScript has access to the username/passwordfragment, there is currently no standard way inthe JavaScript to set this media flow identity. Itis possible a browser plug-in, utilized by theenterprise, could be used to set this identity.

SRTP KEY NEGOTIATIONAnother approach might be to use the SRTP keynegotiation along with Datagram TransportLayer Security (DTLS) to authenticate the mediaflow. For example, if DTLS-SRTP is used forkey management, the secure edge could act as aman-in-the-middle and hence validate the publickey in the fingerprint. If the enterprise deployeda Public Key Infrastructure (PKI), then the cer-tificate could be checked and validated. A self-signed certificate could also be uploaded fromthe browser and stored in an enterprise key serv-er. This could allow the Secure Edge to authen-ticate the browser inside the enterprise andapply appropriate policy. However, if some formof end-to-end identity is used in WebRTC, thenthis man-in-the-middle approach would look likean attack and would be detected and the useralerted.

If the media is being relayed by a man-in-the-middle, then this also provides a place whererecording could take place. Note, however, thatthe context of the media flow would not be avail-able. That is, the identity of the remote party inthe media flow or the application or website

Figure 2. Converting a WebRTC media flow into a SIP/SRTP to allow policyenforcement by an SBC is possible for a simple call or conference, but doesnot work in many web-centric scenarios where a one-to-one translationbetween a WebRTC media flow and SIP session is nontrivial.

Webbrowser

HTTP orHTTPS HTTP or

HTTPS

SRTP and ICE(WebRTC)

SRTP and ICE(WebRTC)

SRTP SRTP

SIPSDP

SIPSDP

Webserver

Webbrowser

WebserverSBC

YOAKUM LAYOUT_Layout 1 4/1/13 1:22 PM Page 50

IEEE Communications Magazine • April 2013 51

which enabled this media flow would not beknown from this insight alone.

MEDIA RELAYAnother approach would require a media relayto be used for WebRTC media sessions crossingthe enterprise boundary. There is a standardprotocol for this media relay, known as TraversalUsing Relays around NAT (TURN) [11]. ICEhole-punching begins with each browser gather-ing candidate addresses: addresses that might beuseful in routing incoming media packets to thebrowser. Typically, this includes private IPaddresses determined by reading the networkinterface card (NIC) or equivalent on the userdevice, and public IP addresses determined by aSTUN response packet from a STUN server. Inaddition, a TURN candidate address can also beprovided to be used as a last resort (i.e., the low-est relative priority of all the candidates).

The enterprise firewall would be configuredto block non-relayed WebRTC media flows. Theenterprise would deploy a TURN server in theDMZ, and permit media flows that go throughthis server, as shown in Fig. 3. The enterprisewould issue individual credentials to use theTURN server. A user would enter their TURNcredentials into the browser, which would thenuse them to gain a TURN candidate address.During media flow setup, the TURN servercould authenticate the user and also learn whattype of media flow is to be set up based on thebandwidth requested. Some policy could then beapplied to this media flow.

Note, however, that the exact context of themedia flow is inherently unknown. This approachalso relies on the browser being able to be con-figured with the user’s TURN credentials. Insome ways, this is similar to the configuration ofa web proxy, used by some enterprises to moni-tor and control web browsing.

The TURN server is also an ideal place toperform recording according to an enterprise

policy. Again, without the cooperation of thewebsite, no media flow context could be deter-mined. Also, since the media is encrypted end-to-end, the browser or web application wouldneed to share the encryption key in order for themedia to be played back.

FIREWALL-CONSCIOUS APPLICATIONSThe previous approaches deal with WebRTCtransparently to the application. Another path ofevolution might be that the enterprise web appli-cations consciously work in conjunction with thesecure edge. This is similar to how some gamingapplications use universal plug and play (uPnP)to configure firewall holes. The secure edgemight have a web interface, which is used by theenterprise application developer to explicitlyauthenticate the end user and request permis-sion from the secure edge. Thus, the recording ishandled by the application developer in thebrowser, and later uploaded to the secure edgeor other enterprise-specific storage, instead oftransparently recording the conversation at theborder device. The application developer mightalso deliver all the control messages of JavaScriptto the secure edge device so that it can authenti-cate, install session context, and open pinholes,as shown in Fig. 4. Such a device would block allWebRTC traffic unless the application hasexplicitly used its API to install a media flowcontext.

The main advantage of this approach is thatthe application developer voluntarily delivers themedia flow context to the secure edge. Thisapproach would require standardization of asecure edge API.

POLICY COMPLIANCEEnterprises often require policy compliance suchas recording and logging all conversations. Somerequire selective authorization of certain calldestinations or websites based on who is involved

Figure 3. Flows using a media relay in the DMZ to establish a media path with WebRTC.

Enterprise DMZ PublicInternet

User configuresrelay and credentials

BrowserBrowser Mediarelay

Webserver

Session offer with P over HTTPS

Session answer with Q over HTTPS

TURN allocateOpenport P

Createport Q

TURN response (port P)

ICE connectivity check to Q

ICE connectivity check to PRelayed connectivity check

Relayed connectivity check responseConnectivity check response

Media flows Media flows

The enterprise

firewall would be

configured to block

non-relayed WebRTC

media flows. The

enterprise would

deploy a TURN

server in the DMZ,

and permit media

flows which go

through this server.

YOAKUM LAYOUT_Layout 1 4/1/13 1:22 PM Page 51

IEEE Communications Magazine • April 201352

in the interactions. In particular, records of pastemails, memos, instant messaging, calls, andeven logs of visited websites are useful duringauditing and sometimes in legal proceedings.The usefulness of such insight often requiresenterprises to co-relate records with the personwho sent or received a piece of information orplaced or received a call.

Various examples of where policy could beapplied have been mentioned in the examplesabove. For all of the approaches described inthe previous section except the last one, thereis no known way for the secure edge to knowthe full context of a media flow to apply policy.For example, even knowing which web site orapplication is originating the flow is difficult, asthe user may have multiple open tabs andbrowsers running, and any of the sites could beWebRTC enabled. If per-site WebRTC block-ing is desired, the entire site might need to beblocked using, for example, the enterpriseproxy.

TRANSPARENT AT THE BORDERFrom the previous discussion, it is clear thatdue to the end-to-end encryption and securityof a WebRTC media path, transparently record-ing media conversations at the secure edge isnot feasible without acting as a man-in-the-mid-dle. Unfortunately, this could prompt the enduser with security warnings and cause annoy-ance.

MANGLE WEB JAVASCRIPTA web proxy could intercept and filter out anyWebRTC-related JavaScript code, or enforceuse of a media recording API. This would likelyinterfere with proper operation of the website,and could perhaps render the entire site unus-able or unpredictable in behavior. Second, thisapproach is either too difficult or impossible toimplement in practice, as it involves detectingrelated code in obfuscated or dynamically gener-ated JavaScript, and intercepting web pagesdelivered over TLS.

INTEGRATION AND INTEROPERATION

WebRTC deviates from traditional enterprisecommunication in two ways: it opens up commu-nications from every web application instead ofcontrolled software pieces installed by the ITdepartment, and it bypasses the existing enter-prise communication infrastructure typicallyenabled in its modern form by the SIP family ofstandards. The integration and interoperation ofWebRTC with legacy SIP devices could happenin the end user’s browser (client) or via a trans-lation gateway (server).

INTEROPERATIONTo connect with a SIP device from the browser,one could implement the SIP stack in JavaScriptrunning in the client, use SIP-over-WebSockettransport, and use an end-to-end media pathenabled by WebRTC. However, WebRTCincludes several new profiles and extensions inthe media path that are typically not implement-ed in existing SIP devices, for example, multi-plexing RTP and Real-Time Transport ControlProtocol (RTCP), audio and video streams onthe same UDP port, mandatory ICE negotiationattributes, mandatory SRTP, and the proposedmandatory audio and video codecs. Interoperat-ing with existing SIP devices that do not supportsome or all of these features requires an inter-mediate gateway that has access to the signalingas well as media paths. Alternatively, the SIPdevices need to be “upgraded” to support all thenew features.

Upgrading existing SIP phones, both hard-ware devices and softphones, is not trivial. Thereare several potential ways in which this couldevolve. For example, existing SIP phones couldbe reprogrammed to be able to participate inWebRTC media flows, remain unchanged butinteroperate with WebRTC via a gateway, or besuperseded by a new generation of phones thatemploy various web-centric enablements such asHTML5 and WebRTC. The benefit and costanalysis for such transitions is an open issue for

Figure 4. A firewall conscious web application voluntarily delivers the media flow context to the secureedge, which opens the firewall pinholes.

BrowserWebserver

Secureedge

Enterprise PublicInternet

Web appand browser

ICE address gathering

Media flows

Create session S with new offer

Update S with answer

Open ports

Session offer over HTTPS

Session answer over HTTPS

WebRTC deviates

from traditional

enterprise communi-

cation in two ways:

it opens up commu-

nications from every

web application

instead of controlled

software pieces

installed by the IT

department, and it

bypasses the existing

enterprise communi-

cation infrastructure

typically enabled in

its modern form by

the SIP family of

standards.

YOAKUM LAYOUT_Layout 1 4/1/13 1:22 PM Page 52

IEEE Communications Magazine • April 2013 53

many enterprises and device vendors that areconsidering WebRTC adoption. Besides thetechnical differences listed above, there is a cru-cial behavior difference in the way a SIP phoneor WebRTC enabled application is expected towork. In particular, people primarily expect aphone to make or receive calls, which is a singleprimary application. However, people expect touse WebRTC to enable real-time communica-tions inside whatever they are already doing orwant to do on the web. It remains to be seenhow the industry shapes the future of existingSIP devices while adopting WebRTC.

Translating between SIP/RTP and WebRTCat a gateway appears to be a viable short-termsolution for enterprises and device vendors alike.Moreover, such a gateway server can work withthe proposed secure edge to integrate authenti-cation and firewall traversal between the twoprotocols. Due to differences in offer-answerstate machines, it is nontrivial to blindly forwardsession descriptions between SIP and WebRTCendpoints. Nevertheless, the gateway can termi-nate and originate media flows on each side toperform translation as necessary, similar in spiritto a back-to-back user agent.

INTEGRATIONIntegration of WebRTC enabled applications inan enterprise is a broader issue, of which interop-eration with existing communication system is onlyone part. The larger perspective deals with howWebRTC will change the way enterprises do busi-ness — either within the organization or outsidewith customers or other organizations. The emerg-ing trend to bring your own device to work hasopened the enterprise network to some extent,and such devices may not be subject to the sameset of strict policy enforcement. As the trend ofmoving everything to the cloud continues,WebRTC can bring the communication primitivesto these cloud-hosted web applications (problemtracking and resolution systems, informationrepositories, customer relations tools, corporatedirectories, social media interactions, blogs, etc.).Initially, enterprises will likely take measures toprimarily interoperate between existing communi-cations systems and WebRTC-centric applications,while keeping core communications functionalityin SIP. Eventually, a few powerful web-centricenterprise applications will likely emerge that havethe potential to move WebRTC technology to amore central position in enterprises.

WebRTC not only brings voice and videoflows to web applications, but also allows sharinggeneric data interactively. For example, a brows-er could expose the user’s desktop as a localmedia stream so that any web application canenable desktop sharing. These applicationsdiminish the differences between a communica-tion-specific device and a generic web and com-puting device. Such interactive data sharingbrings new threats in relation to enterprises.

PEER-TO-PEER DATA FLOWS ANDFILE TRANSFERS

WebRTC also includes enablers for interactivelyexchanging data and files in conjunction withreal-time media flows. This capability is highly

attractive to gaming applications and possiblymany web applications. Such transfers will likelybe perceived by enterprises as introducing signif-icant threats. In contrast to media flows that areprocessed by codec technology that has definedexpectations and can ignore or discard unexpect-ed content, data flows can be unstructured andused in many ways. In general, threats related tomedia flows involve exploiting some flaw in themedia processing technology. In contrast, inter-active data flows can more easily contain virusesor other malware since they are not processed ina specific way.

Requests for media flows are polite in thatthey always ask the user for permission to use aresource like the microphone or camera in aflow. The current WebRTC specifications do notask or alert the users in any way that a data flowis going to happen. This characteristic alone like-ly makes such data flows as part of WebRTCinteractions appear as more of a threat in anenterprise context than the media flow itself forboth network protection as well as intellectualproperty reasons. Beyond network threat vec-tors, enterprises have concerns about what intel-lectual property leaves or enters the enterprise.Undetectable leakage of valuable intellectualproperty owned by an enterprise is an easilyunderstandable concern. Less obvious but alsoimportant are possible allegations or claims thatmay arise over intellectual property owned bysomeone else that enters an enterprise, especial-ly if the entry is not well documented.

It is likely interactive data flows will have tobe detected and subjected to enterprise policy,possibly in a different manner than interactivemedia flows. At least initially, until it is clearexactly how to handle interactive data flowsacross enterprise network boundaries, someenterprises may want to simply restrict WebRTCto media flows only. At this point in the maturityof WebRTC, detailed discussion of this topic isbeyond the scope of this introductory article.

CONCLUSION AND FUTURE WORKThis article has begun to look at enterpriserequirements for permitting WebRTC mediaflows to cross enterprise boundaries. The exist-ing state-of-the-art session border controllerswill not work with WebRTC, and many of theirprinciples do not really apply to WebRTC. Anumber of potential partial solution approacheshave been outlined and discussed. The analysisin this article shows that while there are somepromising potential approaches, the design of asecure edge to permit enterprise authorizationand application of policy to WebRTC traffic isfar from solved today.

In the future, if an enterprise treats web-based rich media interactions differently thanexisting VoIP, it may not apply the same set ofstrict policies to WebRTC. In such cases, website level filtering may be enough. On the otherhand, the IT department may treat it as a threatto enterprise security and compliance. Ultimate-ly, the benefits and desirability of WebRTCinteractions across the enterprise boundary willresult in solutions to this problem.

We are currently investigating various ideas

It is likely interactive

data flows will have

to be detected and

subjected to enter-

prise policy, possibly

in a different manner

than interactive

media flows. At least

initially, until it is

clear exactly how to

handle interactive

data flows across

enterprise network

boundaries, some

enterprises may want

to simply restrict

WebRTC to media

flows only.

YOAKUM LAYOUT_Layout 1 4/1/13 1:22 PM Page 53

IEEE Communications Magazine • April 201354

in the application of enterprise policy in aWebRTC context in ways that are significantlydifferent from how policy is applied in session-centric communications technologies. Publica-tion of the concepts related to our currententerprise policy research is presently under con-sideration, placing discussion of that researchbeyond the scope of this article.

REFERENCES[1] A. B. Johnston and D. C. Burnett, WebRTC: APIs and

RTCWEB Protocols of the HTML5 Real-Time Web, DigitalCodex, 2012, ISBN 978-0985978808.

[2] “WebRTC 1.0: Real-Time Communication betweenBrowsers,” W3C Working Draft, Aug. 2012, http://www.w3.org/TR/webrtc/.

[3] Real-Time Communication in WEB-browsers (RTCWEB)IETF Working Group, http://tools.ietf.org/wg/rtcweb.

[4] A. Lepofsky, “Social Business 2013: Less Talking, MoreDoing,” Constellation Research blog, Dec. 2012,http://www.constellationrg.com/blog.

[5] H. Schulzrinne et al., “RTP: A Transport Protocol forReal-Time Applications,” IETF RFC 3550, 2003

[6] J. Rosenberg et al., “SIP: Session Initiation Protocol,”IETF RFC 3261, 2002.

[7] D.Wing, “Symmetric RTP / RTP Control Protocol (RTCP),”IETF RFC 4961, 2007.

[8] J. Rosenberg, “Interactive Connectivity Establishment(ICE): A Protocol for Network Address Translator Traver-sal for Offer/Answer Protocols,” IETF RFC 5245, 2010.

[9] J. Hautakorpi et al., “Requirements from Session Initia-tion Protocol (SIP) Session Border Control (SBC) Deploy-ments,” IETF RFC 5853, 2010.

[10] T.Reddy et al., “STUN Extensions for AuthenticatedFirewall Traversal”, IETF-Draft, draft-reddy-rtcweb-stun-auth-fw-traversal.

[11] R. Mahy et al., “Traversal Using Relays around NAT(TURN): Relay Extensions to Session Traversal Utilitiesfor NAT (STUN),” IETF RFC 5766, 2010.

BIOGRAPHIESALAN JOHNSTON ([email protected]) is a distinguishedengineer at Avaya, and a contributing author to more thana dozen IETF standard specifications, including the SIPspecification. He has over 13 years of experience in SIP,VoIP, and Internet communications. Besides serving as anadjunct instructor at Washington University in St Louis,Missouri, he has authored several best-selling technicalbooks, including WebRTC: APIs and RTCWEB Protocols ofthe HTML5 Real-Time Web.

JOHN YOAKUM ([email protected]) is a consulting engi-neer at Avaya who champions emerging innovation anddisruptive technologies. He serves as an evangelist forkeeping the “future in focus.” He has distinguished himselfas an innovator and visionary with numerous patents, pub-lications, presentations, and standards contributions overhis professional career at Motorola, Nortel, and Avaya.

KUNDAN SINGH ([email protected]) is a consultingresearch scientist at Avaya Labs. He received his Ph.D. fromColumbia University with a focus on Internet telephony,and has worked at Motorola, Bell Labs, Adobe, Tokbox,Twilio, and Avaya on a variety of Internet communicationsystems using SIP, web, and cloud platforms. He has con-tributed to several open source projects in peer-to-peerInternet telephony, and voice and video communicationsover the web.

We are currently

investigating various

ideas in the

application of

enterprise policy in a

WebRTC context in

ways that are

significantly different

from how policy is

applied in

session-centric

communications

technologies.

YOAKUM LAYOUT_Layout 1 4/1/13 1:22 PM Page 54