Using identities to achieve enhanced privacy in future content delivery networks

Pedro Martinez-Julia (corresponding author), Antonio F. Gomez-Skarmeta
Department of Communication and Information Engineering, University of Murcia, 30100 Murcia, Spain
E-mail addresses: [email protected] (P. Martinez-Julia), [email protected] (A.F. Gomez-Skarmeta)

Computers and Electrical Engineering 38 (2012) 346–355
Journal homepage: www.elsevier.com/locate/compeleceng
doi:10.1016/j.compeleceng.2011.11.021
Article history: Received 13 February 2011. Received in revised form 24 November 2011. Accepted 25 November 2011. Available online 19 December 2011.
Reviews processed and approved for publication by Editor-in-Chief Dr. Manu Malek.
0045-7906/$ - see front matter © 2011 Elsevier Ltd. All rights reserved.



Abstract

In this paper we show how to bring enhanced privacy to future content delivery network architectures, with special attention to Content-Centric Networking (CCN). This objective is primarily achieved by preventing operation traceability while making the communication parties a central element of the network. Thus, we propose an identity-based architecture that provides these capabilities in an integrated way with the underlay network architecture, which here is CCN. The architecture we propose also adds to CCN the possibility of performing secure end-to-end communications with transparent authentication capabilities. To support the proposed architecture, we discuss a successful security verification of its protocol and, finally, a proof-of-concept implementation working on top of CCN, which we use to perform the tests needed to demonstrate its feasibility and scalability.

© 2011 Elsevier Ltd. All rights reserved.

1. Introduction

At its inception, the Internet was designed as a common infrastructure that lets distant computers communicate through a packet-based model, which avoids the exclusive occupation of communication lines and gains robustness thanks to its routing/switching mechanisms. This view culminated in a communication model based on the conversation of two machines (peers) identified by their location-dependent addresses. Today, this scenario still represents the model followed by most communications performed through the Internet.

Over time, driven by the emergence and consolidation of services with a strongly asymmetric nature, such as the Web, most Internet traffic has evolved to reflect an environment in which a relatively low number of services provide content to a large number of clients. This behavior has led to the appearance and development of content delivery networks (CDNs), which maintain geographically distributed points of presence with replicated content near the final client [1]. The main objective of CDNs is to improve the performance of Internet-based content delivery, reducing the response time and increasing the bandwidth and availability of services deployed on the Internet, particularly web-based services.

The expected huge growth in the amount of content to be delivered over the Internet has promoted the search for new approaches and paradigms for content delivery in the Next Generation Internet (NGI). Most of them address the problem at the core of the architecture rather than on top of the current architecture, which, as noted above, is based on end-to-end host addressing.

Although current proposals for future CDN architectures successfully fulfill their mission, they fail when dealing with client (user) privacy and do not provide clear and native mechanisms to protect client transactions. We consider enhanced privacy a key feature for the NGI, so in this paper we propose an architecture to overcome traceability issues in current and next generation networks.

Among the different content delivery network approaches focused on the NGI that we discuss in Section 2, we highlight the architecture proposed by Content-Centric Networking (CCN) [2]. It is a complete communications architecture built around the named data concept instead of end-host and location-based addressing, while preserving the design decisions behind TCP/IP. As also discussed in Section 2, we decided to bring enhanced privacy to CCN, with special attention to traceability prevention. Thus, we propose an identity-based architecture that is prepared to work with future CDN architectures, including CCN, and gives them valuable security capabilities such as user identification, identity management, inherent authentication, and encryption.

We covered a similar approach in a previous work [3], which presents an identity-based architecture built on the overlay network of a Distributed Hash Table (DHT), a common technology behind P2P networks. The present work also loosely follows the SWIFT project [4], which defines a framework for identity management and privacy protection for users of multiple identity providers.

The remainder of this paper is organized as follows. First, as mentioned above, we discuss the most interesting approaches for future content delivery in Section 2. Then, in Section 3 we describe the architecture and in Section 4 we present its protocol. In Section 5 we discuss how to implement a proof-of-concept of the architecture for evaluation, and in Section 6 we show the results obtained from the tests we performed. Finally, we discuss our conclusions in Section 7, with a brief introduction to future work.

2. Next generation CDNs

As introduced above, the content distribution mechanisms of the Internet have evolved from a centralized model to a modern and massively distributed model that is reflected in CDNs and Peer-to-Peer (P2P) networks. In addition, the definition of new approaches is motivated by the spread of high-quality content on the Web and the increase of the bandwidth provisioned at the edge network [5]. Our work here is focused on CDNs, so we leave P2P approaches for future work. Below we describe some interesting approaches for the future of CDNs.

The Publish/Subscribe Internet Routing Paradigm (PSIRP) [6] project proposes to use rendezvous as a network primitive to move control from the sender to the receiver, changing the network architecture from push to pull while transferring the control over data reception from the network endpoints to the network itself. In PSIRP, each piece of data has both a public and a private label, which identify the publisher and are used to make routing decisions.

Networking of Information (NetInf) [7], developed within the 4WARD project, is an architecture for future Internet content delivery. It provides a publish/subscribe mechanism that also abstracts the communication aspects presented to applications, offering the transfer of application data objects instead of end-to-end reliable byte streams. This architecture is also extended beyond pure content to conversational services (VoIP) and store-and-forward services (email). Although its objectives are ambitious, its achievements are at a very early stage, so a well-founded assessment of this approach is not yet possible.

The Data-Oriented Network Architecture (DONA) [8] is a communications architecture that replaces DNS names with self-certifying flat names and introduces a name-based anycast primitive above the current Internet. Instead of certifying the content, DONA certifies the publishers and labels the data. Also, the data cannot be dynamically generated; it must first be registered in the trusted resolution handlers (RHs). Once the content requested by a client is found, it is delivered using IP routing. Finally, security in DONA is achieved by content and provider validation.

Many interesting content delivery approaches are based on the flat layout of DHTs instead of hierarchical organizations [9,10]. We do not detail them because they are generally considered a type of P2P content delivery and are therefore out of the scope of this work, but we take them into account in the design of our proposal.

Finally, we introduce the architecture that inspired this work, Content-Centric Networking (CCN) [2], also called Networking Named Content (NNC). The idea behind CCN is that it does not matter where the data comes from as long as it is valid, secure, and authentic. It proposes an architecture where intermediate elements (routers) keep a copy of requested content to serve it to all clients asking for it, or to other intermediate elements nearer the clients. The whole system works with a publish/subscribe mechanism in which clients declare their interest in certain content. The interest is forwarded among the intermediate elements until it reaches the content source, which in turn sends the content back to the intermediate elements that asked for it, so that it is finally provided to the clients. The security of this architecture, as in other architectures, is based on the contextual protection and trust of the served content through validation.
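The request path just described can be illustrated with a minimal sketch. It is not the CCN implementation: the class name `Node`, the method `request`, and the content name are hypothetical, and only the caching behaviour of intermediate elements is modelled (no validation, no interest aggregation).

```python
class Node:
    """Hypothetical CCN element: a caching router, or the content source itself."""

    def __init__(self, name, upstream=None, store=None):
        self.name = name
        self.upstream = upstream        # next element towards the source (None at the source)
        self.store = dict(store or {})  # content name -> data held locally

    def request(self, content_name):
        """Express interest in content_name; return (data, name of the element that answered)."""
        if content_name in self.store:
            return self.store[content_name], self.name
        if self.upstream is None:
            raise KeyError(content_name)  # nobody on the path holds the content
        data, answered_by = self.upstream.request(content_name)
        self.store[content_name] = data   # keep a copy for later requests
        return data, answered_by


# Assumed topology: client -> edge router -> core router -> source.
source = Node("source", store={"/videos/intro": b"payload"})
core = Node("core-router", upstream=source)
edge = Node("edge-router", upstream=core)

first = edge.request("/videos/intro")   # forwarded up to the source
second = edge.request("/videos/intro")  # answered from the edge router's copy
```

In this sketch the second request never leaves the edge router, which is the property that makes it hard for the source to observe individual clients but easy for intermediate elements to do so.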

Although all the architectures described above accomplish the content delivery task and, at one level or another, try to overcome the challenges of the NGI [11,12], all of them lack some security capabilities regarding how the entities behind communications are treated. We consider these capabilities not merely important but essential for the future of the Internet, and this is where our architecture comes into play. These essential capabilities are:

• Clear protection of privacy with particular focus on traceability prevention.
• Secure identification of the entities that take part in a communication.
• Utilization of roles as communication endpoints, instead of hosts or persons.
• Dynamic management and provisioning of identity attributes in security negotiations.

These capabilities are desirable in CDNs because, even though a piece of data can be secured through cryptographic mechanisms, the source and the piece of information itself can be easily followed during a communication. An attacker (or even a non-malicious observer) may therefore learn what information is accessed by a client, thus violating its privacy. In the following section we describe our architecture proposal to bring the aforementioned capabilities to future content delivery network architectures, with a special focus on CCN.

3. Proposed architecture

As noted above, the main goal of this work is to provide an identity-based architecture that prevents the traceability issues found in many current and future network architectures, with special focus on content delivery networks. We also take advantage of this to bring secure end-to-end communications to them. To achieve this objective we propose to build an identity overlay network where entities are addressed by their digital identities, instead of the logical address of the device (or host) they use to access the network, the content provider names used in most future CDNs, or even content names. This overlay network is then divided into many domains of trust that are independent of the actual networks. Each entity is associated with a domain and can have different devices connected to different physical or logical networks at the same time.

To achieve these capabilities, the architecture incorporates many elements and mechanisms. Fig. 1 shows an overview of the architecture with its main elements, leaving out the lower layer networking infrastructure used by the devices of the communication parties, which here is CCN. The most important elements of the architecture are the entities participating in the communication, which can be people, software (services), hardware (machines), things (Internet of Things [13]), etc. One special element is the Domain Trusted Entity (DTE), which manages the association of entities and identifiers for its domain and lets communication parties be sure they are talking to whom they intend without revealing identity information. It can also be used by other elements to obtain certain identity attributes if allowed by policies. The DTEs of different domains are connected, forming an infrastructure that supports and protects the identity of the communication parties. Finally, the underlying network infrastructure is used to transmit low-level messages among communication parties.

In this architecture, communications are established through endpoints that are used in message exchanges and are identified by location-independent identifiers. If the underlying network is based on addresses, our architecture requires allocating many addresses to be associated with the different identifiers, which can be dynamically negotiated through the DTE infrastructure. It also permits changing any endpoint identifier at any time, so mobility support is inherent and privacy can be enhanced with arbitrary identifier renegotiation.
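As a rough illustration of the relationship between location-independent identifiers and underlying addresses, the following sketch keeps a table that supports both operations mentioned above: moving an endpoint to a new address and renegotiating its identifier. The class `EndpointTable`, its methods, and the use of random hexadecimal identifiers are assumptions made for this example; the paper builds identifiers with XRI and negotiates them through the DTE infrastructure.

```python
import secrets


class EndpointTable:
    """Hypothetical mapping from endpoint identifiers to underlying network addresses."""

    def __init__(self):
        self._address_of = {}

    def allocate(self, address):
        """Create a fresh, location-independent identifier for an address."""
        identifier = secrets.token_hex(16)  # assumption: random identifiers, not XRI
        self._address_of[identifier] = address
        return identifier

    def resolve(self, identifier):
        """Return the current address of an identifier, or None if it is unknown."""
        return self._address_of.get(identifier)

    def move(self, identifier, new_address):
        """Mobility: the identifier stays the same while the address changes."""
        if identifier not in self._address_of:
            raise KeyError(identifier)
        self._address_of[identifier] = new_address

    def renegotiate(self, identifier):
        """Privacy: retire an identifier and hand out a new one for the same endpoint."""
        address = self._address_of.pop(identifier)
        return self.allocate(address)


table = EndpointTable()
old_id = table.allocate("10.0.0.5")
table.move(old_id, "192.0.2.7")       # the endpoint changes network
new_id = table.renegotiate(old_id)    # the endpoint changes identifier
```

After renegotiation the old identifier no longer resolves, so an observer who recorded it cannot follow later exchanges through it.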

At the identity level, our architecture proposes to manage identities and build identifiers with the eXtensible Resource Identifier (XRI) and eXtensible Resource Descriptor Sequence (XRDS) [14]. XRI is used to build the identifiers and XRDS is used to describe the identifiers of the endpoints owned by an entity. Thus, our architecture has a consistent identifier scheme that can be coupled with current identity management architectures, such as OpenID [15].

When this architecture is instantiated on top of CCN, we provide both privacy, through the traceability prevention of client activities, and the ability to perform end-to-end communications using the identity concept as described in the following subsection.

3.1. Identity and identifier

This architecture emphasizes the distinction between identity and identifier. Here we adopt the ITU-T definition of identity given in its X.1250 [16] recommendation: The representation of an entity in the form of one or more information elements which allow the entity(s) to be sufficiently distinguished within context. For identity management purposes, the term identity is understood as contextual identity (subset of attributes), i.e. the variety of attributes is limited by a framework with defined boundary conditions (the context) in which the entity exists and interacts. Thus, each entity is represented by one holistic identity, which comprises all possible information elements characterizing such entity (the attributes). However, this holistic identity is a theoretical issue and eludes any description and practical usage because the number of all possible attributes is indefinite.

Fig. 1. Architecture overview.

On the other hand, also following its ITU-T definition, we consider an identifier to be a piece of fixed-size data that identifies something. In a general sense, this architecture uses identifiers to determine the endpoints of communication parties, as well as to obtain information from the identity if permitted. Nevertheless, an identifier is not used to unambiguously associate an identity with an object over time, but only at a certain moment and for a certain communication event.

3.2. Domain Trusted Entity

The aforementioned Domain Trusted Entity (DTE) is a special entity that manages and protects the communication aspects of the identities in its domain. Moreover, if allowed by the policies, it can reveal some identity attributes to other entities. Thus, the DTE collaborates with other identity management technologies like SAML [17] and Shibboleth [18]. In this architecture, the DTE stores the XRDS documents that belong to its entities and that other entities may request. An XRDS document describes the services offered by an entity and how they can be contacted, that is, their service endpoints. Therefore, the DTE plays the role of the XRI/XRDS resolution infrastructure in OpenID.

The DTEs are also used to validate that the identifier (or identifiers) used by an entity belongs to that entity. Thus, any communication party can be sure that it is talking with the entities it intends to talk to without knowing any attribute of the actual identities behind them. Again, this functionality is also controlled by policies, so some entities may decide to forbid the validation. Furthermore, when an entity requires anonymity, it may request an anonymous identity whose identifiers can be validated and whose attributes can be requested, but the actual information of the real entity is not disclosed in any manner.
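The two policy-controlled functions of the DTE described above, identifier validation and attribute disclosure, can be sketched as follows. All names (`Entity`, `DomainTrustedEntity`, `allow_validation`, `disclosable`, the opaque handle passed to `validate`) are hypothetical and chosen for illustration only; the paper does not define a policy language or a DTE interface.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    attributes: dict                                  # full identity information, kept inside the DTE
    identifiers: set = field(default_factory=set)     # identifiers currently owned by the entity
    allow_validation: bool = True                     # policy: may others validate its identifiers?
    disclosable: set = field(default_factory=set)     # policy: attribute names that may be revealed


class DomainTrustedEntity:
    """Hypothetical DTE for one domain; entities are registered under an opaque handle."""

    def __init__(self):
        self._entities = {}

    def register(self, handle, entity):
        self._entities[handle] = entity

    def validate(self, handle, identifier):
        """True/False if the identifier does/does not belong to the entity, None if policy forbids."""
        entity = self._entities.get(handle)
        if entity is None or not entity.allow_validation:
            return None
        return identifier in entity.identifiers

    def attribute(self, identifier, name):
        """Reveal one attribute of the owner of an identifier, only if the policy allows it."""
        for entity in self._entities.values():
            if identifier in entity.identifiers and name in entity.disclosable:
                return entity.attributes.get(name)
        return None


dte = DomainTrustedEntity()
dte.register("facet-owner-1", Entity(
    attributes={"email": "bob@example.org", "age": 30},
    identifiers={"id-1"},
    disclosable={"age"},
))
dte.register("facet-owner-2", Entity(attributes={}, identifiers={"id-2"}, allow_validation=False))
```

A peer can thus confirm that `id-1` belongs to the party it intends to reach, and learn its age, without ever seeing the e-mail address held by the DTE.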

Due to the high number of interactions and the volume of traffic that each DTE is expected to support, it should be built in a distributed manner. For instance, it can be built using technologies found in DHTs [19].
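One common DHT technique that could serve this purpose is consistent hashing, where each identifier is assigned to the first node that follows it on a hash ring. The sketch below is only an assumption about how a distributed DTE might partition its identifiers; the node names, the `Ring` class, and the choice of SHA-256 are illustrative and do not come from the paper.

```python
import bisect
import hashlib


def ring_position(key: str) -> int:
    """Map a string to a position on the ring (first 8 bytes of its SHA-256 digest)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class Ring:
    """Hypothetical set of DTE nodes sharing the identifiers of one domain."""

    def __init__(self, nodes):
        self._points = sorted((ring_position(node), node) for node in nodes)
        self._positions = [position for position, _ in self._points]

    def node_for(self, identifier: str) -> str:
        """Return the node responsible for an identifier: its successor on the ring."""
        index = bisect.bisect_left(self._positions, ring_position(identifier))
        return self._points[index % len(self._points)][1]  # wrap around past the last node


nodes = ["dte-node-a", "dte-node-b", "dte-node-c"]
ring = Ring(nodes)
owner = ring.node_for("example-facet-identifier")

# Removing a node that is not the owner leaves the assignment of this identifier unchanged.
other = next(node for node in nodes if node != owner)
smaller_ring = Ring([node for node in nodes if node != other])
```

The property checked at the end is what makes this layout attractive for a DTE: nodes can join or leave while most identifiers stay where they were.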

3.3. Underlying network

This architecture needs a special underlying network infrastructure that is capable of delivering messages using identifiers instead of network addresses, but it can be instantiated on top of current IP networks using multiple addresses to achieve its capabilities. Moreover, the protocol we propose with the architecture, discussed in Section 4, can be instantiated over different underlying networks and, where needed, allow the coexistence of multiple networking architectures.

Since this architecture is better suited to an underlay network architecture that is capable of delivering messages without needing location addresses, it fits the capabilities of CCN very well, so both architectures benefit. We discuss how to integrate both architectures in Section 5.

Apart from content delivery networks, other network architectures are also suited to our architecture. First, there are the overlay network protocols used in many DHT infrastructures, like Chord [20], which provides network simplicity and decentralized management and has been improved with performance-oriented variants such as Relaxed-2-chord [21] and LPRS [22]. We can also consider other DHT architectures like Kademlia [23] or Cyclone [24].

3.4. Security

Instead of hiding the identity information of an entity, this architecture offers the possibility of accessing it in a controlled manner. The DTE is responsible for managing the identity information, so others may ask it to confirm that an entity is who it claims to be. Also, we consider an entity authenticated just by validating that the identifier (or identifiers) it is using belongs to it and by ensuring the integrity of the messages exchanged with it, which is done through a signature (or token) field included in the messages. Therefore, our architecture and protocol provide integrated authentication of all communication ends as well as message integrity.
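The role of the signature (or token) field can be illustrated with a short sketch. It deliberately uses a keyed hash (HMAC-SHA256) with a shared key as a stand-in token, because that is available in the Python standard library; it is not the signature scheme of the paper, which does not specify one here, and the field names and JSON canonicalisation are assumptions of this example.

```python
import hashlib
import hmac
import json


def _canonical(fields: dict) -> bytes:
    """Serialize the fields in a stable order so both ends hash the same bytes."""
    return json.dumps(fields, sort_keys=True).encode()


def sign(message: dict, key: bytes) -> dict:
    """Return a copy of the message with a 'signature' field (stand-in HMAC token)."""
    token = hmac.new(key, _canonical(message), hashlib.sha256).hexdigest()
    return {**message, "signature": token}


def verify(signed: dict, key: bytes) -> bool:
    """Recompute the token over every field except 'signature' and compare in constant time."""
    fields = {name: value for name, value in signed.items() if name != "signature"}
    expected = hmac.new(key, _canonical(fields), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed.get("signature", ""))


key = b"hypothetical-shared-key"
signed = sign({"destination": "id-1", "payload": "hello"}, key)
tampered = {**signed, "payload": "bye"}
```

A receiver that accepts only messages for which `verify` succeeds gets both properties mentioned above: the sender holds the key bound to the identifier, and the content was not modified in transit.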

In addition, our architecture recommends using an asymmetric encryption mechanism to provide confidentiality when needed. It may be inefficient and processor hungry, but it has clear benefits over weaker encryption mechanisms: (1) transmitted information will be kept secret for longer; (2) there is no need to negotiate the security terms, which speeds up communication; (3) it fits and performs much better in publisher/subscriber underlying networks. In the future, processor performance improvements may make these methods much more feasible. This does not prevent our architecture from adopting symmetric mechanisms and key exchange protocols such as IKEv2 [25], but they are out of the scope of the current work.

3.5. Application message exchanges

Since our architecture provides endpoint semantics and permits services to have their own identifiers, it may be used directly by applications and services to exchange their messages, reducing the number of layers used in communication. For instance, in a SOAP [26] based application, each layer introduces its own headers and message format, so it is difficult to take full control of the communication from the application layer: applications cannot inspect or change the communication parameters set by the underlying network (such as security and privacy), nor even make assumptions about them. In addition, it is also difficult to apply traffic engineering because communication semantics are hidden in upper protocol layers. In contrast, with our protocol the messages are delivered directly through the network, so it is simpler and traffic engineering can consider some application-layer details when making decisions. When substituting SOAP, in order to interoperate with legacy web services tied to SOAP, our architecture proposes to use a simple adapter or special gateway that converts messages from our protocol to SOAP messages and vice versa.

3.6. Message format

Considering that exchanged messages must support at least the capabilities described above and, preferably, other value-added features, the message format of this architecture needs to be quite extensible. The best alternative is to allow an arbitrary number of fields while keeping the essential fixed-position fields mandatory, such as the source and destination identifiers, the signature, and the content (payload).

Once a flexible message format is defined, applications may include specific headers in network messages so that the identity infrastructure can deliver them correctly, securely, and efficiently. Other information to be used by endpoints may also be introduced, so applications get fine control over their messages. Furthermore, messages can be instantiated in many low-level message representations that may need specific headers, such as name/value/field-separator, JSON, XML, and binary.
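A message format with a few mandatory fields and an open set of additional headers could look like the following sketch, shown here in the JSON representation. The class `Message`, the field names, and the `headers` container are assumptions made for illustration; the paper does not fix a concrete encoding.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Message:
    # Mandatory fields named in the text.
    source: str
    destination: str
    signature: str
    payload: str
    # Arbitrary application- or infrastructure-specific headers (hypothetical container).
    headers: dict = field(default_factory=dict)

    def to_json(self) -> str:
        """Instantiate the message in a JSON low-level representation."""
        return json.dumps({
            "source": self.source,
            "destination": self.destination,
            "signature": self.signature,
            "payload": self.payload,
            "headers": self.headers,
        }, sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "Message":
        """Parse a message; a missing mandatory field raises KeyError."""
        data = json.loads(text)
        return cls(
            source=data["source"],
            destination=data["destination"],
            signature=data["signature"],
            payload=data["payload"],
            headers=data.get("headers", {}),
        )


original = Message("id-a", "id-b", "token", "hello", headers={"priority": "low"})
restored = Message.from_json(original.to_json())
```

The same `Message` object could be given further serializers (XML, binary) without changing the mandatory fields, which is the extensibility argument made above.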

4. Protocol

To achieve the objectives of the architecture proposed in this work we define a protocol that supports all the capabilities discussed in the previous section. As discussed above, this protocol is not tied to a particular underlying network architecture. In other words, both the architecture and the protocol are generic enough to be instantiated on top of many network architectures. To describe the protocol we use a scenario in which two entities start a conversation.

Fig. 2 shows the main scenario of the protocol, in which an entity (Alice) that belongs to a domain (Domain 1) starts a conversation (session) with another entity (Bob) from another domain (Domain 2). This process is divided into four stages: registration, entity search, session establishment, and message exchanges. Below we discuss the whole process, but first we briefly describe each message.

The integrity and confidentiality of all messages are ensured, respectively, with a signature and an encryption mechanism that are independent of the message content. The authentication message (AUTHENTICATE) has the client identifier of the entity that is being authenticated, encoded in XRI, together with the XRDS data that describes the services (facets) offered by the entity. The authentication response (OK) only contains the verification result of the identifier (AUTHENTICATION SUCCESS). The GET_XRDS message contains the XRI-encoded identifier of the service (facet) used by the initiator (@D1=ALICE_FACET), the destination domain also encoded in XRI (@D2), and a query to search for Bob in terms of its identity attributes (e.g. And(Equals(Attribute("email"), "[email protected]"), GreaterThan(Attribute("Age"), 27))). The query language definition is outside the scope of the current paper. The RESPONSE message has a subset of the XRDS document of Bob with the services (facets) allowed to the facet used by Alice to resolve the query. The START_SESSION message contains the facet used by Alice, as in the previous message, the session identifier that is going to be used to identify Alice in the session, and the selected service (facet) from the previously received XRDS document. The response (OK) message includes the session identifier to be used by the counterpart (responder), which here is Bob. Finally, the DATA message contains the destination identifier and the payload. It does not contain the source identifier because each identifier is associated with only one session, so the responder can derive the source identifier from the destination identifier.

Fig. 2. Identity-based network protocol.
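Since the query language is left undefined, the following sketch only shows one possible way to evaluate a query shaped like the example in the GET_XRDS description against a set of identity attributes. The four functions mirror the names used in that example, but their semantics, the attribute dictionary, and the placeholder address are assumptions of this illustration.

```python
def Attribute(name):
    """Return a getter that reads one attribute from an identity (None if absent)."""
    return lambda attributes: attributes.get(name)


def Equals(getter, value):
    """Condition: the attribute equals the given value."""
    return lambda attributes: getter(attributes) == value


def GreaterThan(getter, value):
    """Condition: the attribute is present and strictly greater than the given value."""
    def check(attributes):
        found = getter(attributes)
        return found is not None and found > value
    return check


def And(*conditions):
    """Condition: every sub-condition holds."""
    return lambda attributes: all(condition(attributes) for condition in conditions)


# Same shape as the example query, with a placeholder address.
query = And(
    Equals(Attribute("email"), "bob@example.org"),
    GreaterThan(Attribute("Age"), 27),
)

matching = {"email": "bob@example.org", "Age": 30}
too_young = {"email": "bob@example.org", "Age": 27}
no_age = {"email": "bob@example.org"}
```

In the architecture such a query would be evaluated inside the DTE of the destination domain, so the attribute values themselves never reach the party that issued the query.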

During the registration, each entity contacts the DTE of the domain it belongs to in order to authenticate itself and register the XRDS document that describes its exposed facets, each one with its own XRI-based identifier. Those facets, also known as virtual identities, represent the entities during communication acts to protect their actual identity. In the next step, Alice sends a request to its DTE with a query to get an XRDS document that describes some facets of Bob, providing the facet Alice wants to use.

To start a session, Alice selects the identifier of the facet of Bob that it wants to contact, allocates a new session identifier, and communicates it to Bob, using the selected facet identifier, through the DTE infrastructure. Once Bob has confirmed the session by communicating its own session identifier, Alice starts sending the data it wants using only this identifier, because Bob has associated it with the session identifier allocated by Alice. Bob then uses the session identifier already provided by Alice to respond.
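The session stage can be sketched as follows, focusing on why a DATA message needs only the destination identifier. The sketch omits the DTE indirection, encryption, and signatures; the class `Party`, the functions `establish` and `deliver`, and the random session identifiers are hypothetical names introduced for this example.

```python
import secrets


class Party:
    """Hypothetical communication party holding its session table and received data."""

    def __init__(self, name):
        self.name = name
        self.peer_of = {}  # own session identifier -> session identifier of the counterpart
        self.inbox = []    # (source session identifier, payload) pairs

    def new_session_id(self):
        return secrets.token_hex(8)


def establish(initiator, responder):
    """START_SESSION / OK exchange, reduced to the two session identifiers it negotiates."""
    initiator_sid = initiator.new_session_id()  # carried by START_SESSION
    responder_sid = responder.new_session_id()  # carried by the OK response
    initiator.peer_of[initiator_sid] = responder_sid
    responder.peer_of[responder_sid] = initiator_sid
    return initiator_sid, responder_sid


def deliver(receiver, destination_sid, payload):
    """DATA message: only destination identifier and payload; the source is looked up."""
    source_sid = receiver.peer_of[destination_sid]  # KeyError for unknown sessions
    receiver.inbox.append((source_sid, payload))
    return source_sid


alice, bob = Party("Alice"), Party("Bob")
alice_sid, bob_sid = establish(alice, bob)
seen_by_bob = deliver(bob, bob_sid, "hi")         # Alice -> Bob
seen_by_alice = deliver(alice, alice_sid, "hello")  # Bob -> Alice
```

Because each session identifier belongs to exactly one session, the receiver recovers the sender from its own table, and an observer of DATA messages sees a single opaque identifier per direction.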

Note that both Alice and Bob may be instantiated by many elements. For instance, when certain content is to be delivered to many clients (broadcast), the role of Bob is played by the content source and the role of Alice is played by the multiple clients that will receive the content. In this case, the negotiation is performed by the DTEs of the involved entities. When instantiated on top of architectures like CCN, the role of Bob is also played by the intermediate equipment involved in the content delivery.

4.1. Security analysis

As the main purpose of our architecture is to provide inherent authentication and enhanced privacy to underlying network architectures, which are security-related aspects, we need to be sure that the message exchanges of the proposed protocol are secure. Below we discuss the security requirements and how they are covered by the proposed protocol. The main security requirements to ensure communication privacy are defined as follows:

1. The session identifiers negotiated during session establishment must serve to authenticate the communication participants, which we call Alice and Bob but which represent any pair of abstract entities, each formed by one or more real entities.

2. The service identifiers, which we call facets but are also found as virtual identities in the literature, are used by the entities to start a session and must only be seen by the entities involved in a communication and their DTEs.

The proposed architecture and protocol meets with these requirements by the introduction of the indirection mecha-nisms through the DTE infrastructure. On the one hand, as the entities are authenticated in their corresponding DTE, thenegotiation of session identifiers by a pair of DTEs can be used to authenticate the entities associated to them. This operationis secure because the DTEs communicate through a channel with assured confidentiality by using message encryption andassured integrity by using signature mechanisms. On the other hand, using the DTE infrastructure to query an entity andcommunicate the service identifiers to be used in a communication prevents any third party entity to know them and violatethe privacy of the communication participants.

To strengthen our claims about the security of the protocol, we perform an automated validation of its security aspects by means of the AVISPA tool [27]. We selected this tool because of its simplicity and strength when analyzing network protocols.

First, we formalize the protocol model shown in Fig. 2 using Alice–Bob (A–B) notation, which can later be used to perform its analysis and validation. We are interested in analyzing the portion of the protocol started by Alice to communicate with Bob. The resulting notation is shown in Fig. 3. From this notation we create a full description in the High-Level Protocol Specification Language (HLPSL) used by the AVISPA tool. In it we define a different role for each entity taking part in the communication and assign to each role the specific responsibilities defined in the notation. Then, we indicate that the analyzer tool should check that AsID and BsID can be used to authenticate Alice and Bob respectively, and that it should check the secrecy of AfID and BfID to be sure that there is a secure channel between Alice and Bob through the DTE infrastructure, so the session identifiers cannot be publicly associated with their owners.

Using the HLPSL file obtained from the Alice–Bob notation, we run the AVISPA tool using the On-the-Fly Model Checker (OFMC) as the analysis backend. The tool reported a SAFE result with 10 nodes and 9 plies. We also run the AVISPA tool using the Constraint-Logic-based Attack Searcher (CL-AtSe) backend and also got a SAFE output with 22 states analyzed and 7 states reached. With these results we can be confident that the protocol is secure within the analyzed model. We thus demonstrated that the proposed protocol covers the requirements raised above, so our architecture can successfully prevent the traceability of network operations and thus enhance the privacy of communication parties.

Fig. 3. Protocol represented in Alice and Bob notation. It describes the protocol used to start a session between A (Alice) and B (Bob) through their corresponding DTEs (DTE1 corresponds to A and DTE2 corresponds to B). The parameters AfID and BfID are the service (facet) identifiers, the parameters AsID and BsID are the session identifiers, and KA, KB, KDTE1, and KDTE2 are the public keys of the involved entities. The functions {d}_(k) and {d}_inv(k) are respectively used to encrypt the message d with the public key k or with the private key that corresponds to the public key k. The encryption is used to prevent an attacker from seeing the session or facet identifiers if it intercepts a message, so facet identifiers cannot be linked to session identifiers. Finally, it shows an unencrypted message exchange, which is included to be sure that, after the session is established, an attacker cannot link the session identifiers with the entities participating in the communication.

5. Architecture evaluation over CCN

To demonstrate its feasibility, we discuss how to implement the architecture we propose in this paper on top of CCN, chosen because of its novelty and outstanding capabilities. Since CCN does not directly provide the possibility to exchange end-to-end messages, we need to build an adaptation layer to allow direct communications between endpoint pairs.

In CCN, a subscriber declares its interest in some content, which is identified by a URI-like identifier, and waits until that content is available. Then, a publisher updates the content with that identifier, sending that content to the intermediate elements that deliver it to all interested subscribers. Our adaptation layer exploits this behavior: when instantiated over CCN, each communication party makes its identifier known to the lower-layer network by declaring its interest in the content identified by its own endpoint identifier. Then, when another party wants to send a message, it just updates the content identified by the destination identifier. This update causes the message to be delivered to the destination party.
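The interest-based delivery just described can be sketched with a small in-memory model. This is only an illustration under stated assumptions: `ToyContentNetwork` is a hypothetical stand-in and not the CCN API, the `Endpoint` class and the identifier strings are invented for the example, and interests are kept standing rather than being consumed by each delivery.

```python
from collections import defaultdict
from typing import Callable


class ToyContentNetwork:
    """In-memory stand-in for a content-centric network (not the CCN API)."""

    def __init__(self) -> None:
        self._interests = defaultdict(list)  # content name -> callbacks

    def express_interest(self, name: str, on_content: Callable[[bytes], None]) -> None:
        # A subscriber declares its interest in the content called `name`.
        self._interests[name].append(on_content)

    def publish(self, name: str, content: bytes) -> int:
        # A publisher updates the content; every interested party receives it.
        callbacks = list(self._interests.get(name, []))
        for callback in callbacks:
            callback(content)
        return len(callbacks)


class Endpoint:
    """Adaptation-layer endpoint: listens on its own identifier."""

    def __init__(self, network: ToyContentNetwork, identifier: str) -> None:
        self.network = network
        self.identifier = identifier
        self.inbox = []
        # Receiving = declaring interest in content named by our own identifier.
        network.express_interest(identifier, self.inbox.append)

    def send(self, destination: str, message: bytes) -> bool:
        # Sending = updating the content named by the destination identifier.
        return self.network.publish(destination, message) > 0


net = ToyContentNetwork()
alice = Endpoint(net, "/example/session/a1")  # hypothetical identifiers
bob = Endpoint(net, "/example/session/b1")
delivered = alice.send(bob.identifier, b"hello bob")
```

In this model a message addressed to an identifier nobody has expressed interest in is simply not delivered, so `send` returns `False` in that case.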

After building the adaptation layer, we define the message format that will be inserted in CCN content elements. As JSON is supported by CCN, we can directly use the message format described in the previous section without any change. Also, because CCN identifiers are URI-like, we can directly use the XRI identifiers as proposed in our architecture, so it fits well on top of CCN.

Finally, once the message and content formats are defined, we implement the DTE logic responsible for receiving requests and sending back responses for authentication, XRDS, and validation. Then, we instantiate a different DTE for each domain with its own configuration. Lastly, we build the clients, Alice and Bob, that send and receive the messages defined in the scenario described above.
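As a rough sketch of the DTE request/response logic, the following Python fragment dispatches JSON requests of the three kinds named above. The field names (`type`, `body`, `status`), the handler functions, and the `ISSUED_SESSIONS` set are assumptions made for illustration; they are not the message format defined earlier in the paper, and the handlers are placeholders rather than real authentication, XRDS, or validation procedures.

```python
import json

# Hypothetical state for this sketch: session identifiers this DTE has issued.
ISSUED_SESSIONS = {"sid-1234"}


def handle_authentication(body: dict) -> dict:
    # Placeholder check; a real DTE would verify the presented credentials.
    return {"authenticated": bool(body.get("credential"))}


def handle_xrds(body: dict) -> dict:
    # Placeholder: echo the queried identifier with an empty service list.
    return {"identifier": body.get("identifier"), "services": []}


def handle_validation(body: dict) -> dict:
    # An identifier is valid only if this DTE issued it.
    return {"valid": body.get("session_id") in ISSUED_SESSIONS}


HANDLERS = {
    "authentication": handle_authentication,
    "xrds": handle_xrds,
    "validation": handle_validation,
}


def dte_handle(raw: str) -> str:
    """Parse one JSON request and return the JSON-encoded response."""
    try:
        request = json.loads(raw)
    except json.JSONDecodeError:
        return json.dumps({"status": "error", "reason": "malformed JSON"})
    if not isinstance(request, dict):
        return json.dumps({"status": "error", "reason": "request must be an object"})
    kind = request.get("type")
    handler = HANDLERS.get(kind) if isinstance(kind, str) else None
    if handler is None:
        return json.dumps({"status": "error", "reason": "unknown request type"})
    body = request.get("body")
    if not isinstance(body, dict):
        body = {}
    return json.dumps({"status": "ok", "type": kind, "body": handler(body)})


request = json.dumps({"type": "validation", "body": {"session_id": "sid-1234"}})
reply = json.loads(dte_handle(request))
```

Each domain would run its own instance of such logic with its own configuration, which in this sketch corresponds to its own `ISSUED_SESSIONS` state.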

6. Results

In this section we show the results of the tests we performed with each instantiation to exercise the architecture and protocol working over CCN. We compared these results with those obtained from the execution of the raw protocol, without our architecture, to get a notion of the performance penalties that our architecture can introduce. In the tests we measure both the time spent in each message exchange and the total time spent in the whole test. With the former measures we calculate the average time spent for each message exchange, and with the latter we calculate the time spent by each message in terms of the whole application.

First, Fig. 4a shows the results of the tests performed with CCN as well as with our architecture instantiated on top of it. The plot shows the average time spent on each message exchange, displayed as “One Way Avg” and “Two Ways Avg”. It also shows the total time taken by the whole execution divided by the number of exchanges, including the extra processing, displayed as “One Way Total” and “Two Ways Total”. One-way results are obtained by measuring the time spent sending messages only from an emitter to a receiver, while two-way results are obtained by measuring the time spent sending requests and receiving responses. The two-way test includes the messages exchanged with the other elements of our architecture.
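The two kinds of measures (the “Avg” and “Total” series) can be computed along the following lines. This is a sketch under assumptions, not the test harness used for the figures: `run_exchanges` and its result keys are hypothetical names, and the `exchange` callable stands for whatever one-way send or two-way request/response is being timed.

```python
import time
from statistics import mean
from typing import Callable


def run_exchanges(exchange: Callable[[int], None], count: int) -> dict:
    """Time `count` exchanges and return both per-exchange measures in seconds."""
    if count < 1:
        raise ValueError("count must be at least 1")
    durations = []
    start = time.perf_counter()
    for i in range(count):
        t0 = time.perf_counter()
        exchange(i)
        durations.append(time.perf_counter() - t0)
    total = time.perf_counter() - start
    return {
        # Average of the individually timed exchanges.
        "avg": mean(durations),
        # Whole-run time divided by the number of exchanges; this also
        # includes whatever extra processing happens between exchanges.
        "total_per_exchange": total / count,
    }


# Example with a dummy workload standing in for a message exchange.
stats = run_exchanges(lambda i: sum(range(1000)), 50)
```

The difference between the two values reflects the processing that happens outside the timed exchanges themselves.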

Fig. 4b shows the evolution of the overhead as the number of messages increases. The overhead is the main outcome of all tests with respect to our architecture. It lets us see the additional time incurred by using our identity-based architecture and protocol on top of the lower-layer network architecture. We can see that the overhead is larger when there is a small number of exchanges but decreases quickly as the number of exchanges increases, stabilizing below 10 ms.


Fig. 4. Performance and scalability test results. (a) Comparison of the performance of our approach and raw CCN for different communication behaviors. (b) The overhead of our approach over raw CCN. (c) Time spent in the following communication steps: authentication (AuthN), identity search (Query), session establishment (Session), and exchanging (request + response) 10 messages (Data × 10). (d) The overhead evolution for different communication situations: base message exchanges with session establishment (Session); base message exchanges with session establishment and identity query (Query); and base message exchanges with session establishment, identity query, and authentication (AuthN).


Fig. 4c shows the time spent in the different communication steps. It shows that authentication, identity search, and session establishment take more time because the first message exchanges in CCN need to establish the interest and the corresponding data structures. Still, it takes less than 1800 ms to negotiate a session, a cost that long sessions amortize but that penalizes short sessions and individual out-of-session messages. Although this is not very different from the behavior of the current Internet, which also favors long sessions, in the future we should investigate how to treat short sessions and individual messages differently to improve their performance.

Fig. 4d shows the overhead of each communication step for an increasing number of messages sent in the session. This demonstrates that for short sessions the architecture adds a very noticeable overhead but for long sessions it is almost negligible. For instance, for sessions with more than 40 message exchanges, the overhead is less than 100 ms. Thus, as the overhead quickly decreases, we confirm the behavior introduced above. We can also see that, as the three plots are parallel, the new communication steps do not imply a large increase in overhead relative to each other; they only add overhead to the base case (raw message exchanges).

From the results described above, and especially the overhead comparison, we conclude that our architecture takes only a few milliseconds (ms) more than raw CCN, which corresponds to the extra time needed to process JSON-formatted messages and certain specific operations of the adaptation layer. The worst case is the exchange time of the request/response messages because of two extra steps (JSON parsing and encoding) but, as shown by the overhead evolution chart, it only takes around 6.5 ms more than raw CCN. Finally, the extra total time observed in the tests is due to the authentication, XRDS exchanges, and identifier validation. Although this extra time is negligible for communications with several exchanges, it must be minimized for those with few exchanges.
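The amortization effect can be made concrete with a simple model, offered here as an assumption and not as a fit to the measured curves: a fixed setup cost is spread over the exchanges of a session and a constant per-exchange cost is added. The function name is hypothetical; the inputs used below are the figures quoted in the text (the 1800 ms upper bound for negotiating a session and the roughly 6.5 ms request/response overhead).

```python
def per_exchange_overhead_ms(setup_ms: float, per_message_ms: float, exchanges: int) -> float:
    """Assumed model: fixed setup cost amortized over a session plus a constant cost."""
    if exchanges < 1:
        raise ValueError("a session needs at least one exchange")
    return setup_ms / exchanges + per_message_ms


# Illustrative inputs taken from the text above (1800 ms is an upper bound).
SETUP_MS = 1800.0
PER_MESSAGE_MS = 6.5

for n in (1, 10, 40, 100, 1000):
    overhead = per_exchange_overhead_ms(SETUP_MS, PER_MESSAGE_MS, n)
    print(f"{n:5d} exchanges: {overhead:8.1f} ms per exchange")
```

Under this model a session of 40 exchanges carries 51.5 ms of overhead per exchange, which is consistent with the sub-100 ms figure reported for sessions of that length, and the value approaches the per-message cost as sessions grow.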

7. Conclusions and future work

In this paper we discussed how to enhance the privacy of Content Delivery Networks (CDNs), with special focus on CCN, using an architecture and protocol that places digital identities in the middle of communications. The desirable essential capabilities we enumerated in Section 2 are provided by our architecture as follows:

• Traceability prevention is achieved by the dynamic negotiation of the identifiers used as communication endpoints. This is performed through the DTE infrastructure so the identifiers are only known by the entities involved in the communication. In CDNs, this is used to dynamically negotiate the content identifier used by content servers and clients.

• The DTE infrastructure mediates during communication initiation to securely identify the identities taking part in it without needing to reveal the actual entity.

• Since the architecture proposed here abstracts entities to virtual identities and places them as communication endpoints, any entity of any type can take part in a communication, including persons, machines, services, groups of entities, etc. These virtual identities are used to establish role-based endpoints.

• During security negotiations, e.g. during session establishment, any party is able to request identity attributes through the DTE infrastructure, and they will be dynamically provided if permitted by policies.

Apart from the enhanced privacy, this architecture can also be used to bring secure identity-based end-to-end communications (identity-to-identity) to any CDN architecture, thus making them appropriate as general-purpose network architectures for the Internet of the future.

Once the architecture was defined, we verified the security of our approach using an automated protocol security verification tool. We checked that the session identifiers used by communication parties are not disclosed to other entities and that they cannot be mapped to the entity they correspond to. Thus, we demonstrated that the proposed protocol satisfies the necessary security requirements to prevent the traceability of network operations and thus enhance the privacy of communications. We also demonstrated the feasibility of the architecture by building a proof-of-concept implementation on top of CCN and evaluated it through performance tests. The obtained results demonstrate the good scalability of our architecture, which adds less than 10 ms to each message exchange on top of CCN; expressed as a percentage, this is less than 30% overhead in all of its operations.

For future work, we plan to investigate the decentralization of identity validation to gain a certain level of independence from the DTE infrastructure. This may improve the performance of transactions involving only a few messages (short sessions). We also plan to investigate a direct adaptation of XRI. Furthermore, while this paper shows the implementation of our architecture over CCN as the underlying network, we plan to study the behavior of the architecture over other infrastructures, mainly overlay networks based on DHTs, to bring the capabilities of our architecture to them.

Acknowledgments

This work is partially supported by the European Commission’s Seventh Framework Programme (FP7/2007–2013) project GN3 and by the Program for Research Groups of Excellence of the Séneca Foundation under Grant 04552/GERM/06.

References

[1] Pallis G, Vakali A. Insight and perspectives for content delivery networks. Commun ACM 2006;49(1):101–6.
[2] Jacobson V, Smetters DK, Thornton JD, Plass MF, Briggs NH, Braynard RL. Networking named content. In: Proceedings of the 5th international conference on emerging networking experiments and technologies (CoNEXT ’09). New York, NY, USA: ACM; 2009. p. 1–12.
[3] Gomez-Skarmeta AF, Martinez-Julia P, Girao J, Sarma A. Identity based architecture for secure communication in future internet. In: Proceedings of the 6th ACM workshop on digital identity management. New York, NY, USA: ACM; 2010. p. 45–8.
[4] López G, Cánovas O, Gómez-Skarmeta AF, Girao J. A swift take on identity management. IEEE Comput 2009;42(5):58–65.
[5] Leighton T. Improving performance on the internet. Commun ACM 2009;52(2):44–51.
[6] Dimitrov V, Koptchev V. PSIRP project – publish-subscribe internet routing paradigm: new ideas for future internet. In: Proceedings of the 11th international conference on computer systems and technologies and workshop for Ph.D. students in computing on international conference on computer systems and technologies. New York, NY, USA: ACM; 2010. p. 167–71.
[7] Brunner M, Abramowicz H, Niebert N, Correia LM. 4WARD: a European perspective towards the future internet. IEICE Trans Commun 2010;E93–B(3):442–5.
[8] Koponen T, Chawla M, Chun BG, Ermolinskiy A, Kim KH, Shenker S, et al. A data-oriented (and beyond) network architecture. SIGCOMM Comput Commun Rev 2007;37(4):181–92.
[9] Caesar M, Condie T, Kannan J, Lakshminarayanan K, Stoica I. Rofl: routing on flat labels. SIGCOMM Comput Commun Rev 2006;36(4):363–74.
[10] Balakrishnan H, Lakshminarayanan K, Ratnasamy S, Shenker S, Stoica I, Walfish M. A layered naming architecture for the internet. SIGCOMM Comput Commun Rev 2004;34(4):343–52.
[11] Li T. Design goals for scalable internet routing. Internet-Draft; Internet Research Task Force; 2007.
[12] Jain R. Internet 3.0: ten problems with current internet architecture and solutions for the next generation. In: Proceedings of military communications conference. Los Alamitos, CA, USA: IEEE Computer Society; 2006. p. 1–9.
[13] Sarma AC, Girao J. Identities in the future internet of things. Wirel Pers Commun 2009;49(3):353–63.
[14] Reed D, Chasen L, Tan W. OpenID identity discovery with XRI and XRDS. In: Proceedings of the 7th symposium on identity and trust on the internet (IDtrust ’08). New York, NY, USA: ACM; 2008. p. 19–25.
[15] Recordon D, Reed D. OpenID 2.0: a platform for user-centric identity management. In: Proceedings of the second ACM workshop on digital identity management. New York, NY, USA: ACM; 2006. p. 11–6.
[16] International Telecommunication Union, Telecommunication Standardization Sector, Series X: Data Networks, Open system communications and security. Cyberspace security – identity management. Baseline capabilities for enhancing global identity management and interoperability. Recommendation ITU-T X.1250. 2010.
[17] SAML. Security assertion markup language (SAML); 2010. Available from: <http://saml.xml.org>.
[18] Shibboleth. Shibboleth; 2010. Available from: <http://shibboleth.internet2.edu>.
[19] Sanchez Artigas M, Garcia Lopez P, Gomez Skarmeta AF. A comparative study of hierarchical DHT systems. In: Proceedings of the 32nd conference on local computer networks. Washington, DC, USA: IEEE Computer Society; 2007. p. 325–33.
[20] Stoica I, Morris R, Karger D, Kaashoek MF, Balakrishnan H. Chord: a scalable peer-to-peer lookup service for internet applications. In: Proceedings of the 2001 conference on applications, technologies, architectures, and protocols for computer communications. New York, NY, USA: ACM; 2001. p. 149–60.
[21] Cordasco G, Della Corte F, Negro A, Sala A, Scarano V. Relaxed-2-chord: efficiency, flexibility and provable stretch. In: Proceedings of the international parallel and distributed processing symposium. Los Alamitos, CA, USA: IEEE Computer Society; 2009. p. 1–8.
[22] Zhang H, Goel A, Govindan R. Incrementally improving lookup latency in distributed hash table systems. In: Proceedings of the 2003 ACM SIGMETRICS international conference on measurement and modeling of computer systems. New York, NY, USA: ACM; 2003. p. 114–25.
[23] Maymounkov P, Mazières D. Kademlia: a peer-to-peer information system based on the XOR metric. In: Proceedings of the first international workshop on peer-to-peer systems. London, UK: Springer-Verlag; 2002. p. 53–65.
[24] Sánchez Artigas M, García López P, Pujol Ahulló J, Gómez Skarmeta AF. Cyclone: a novel design schema for hierarchical DHTs. In: Proceedings of the IEEE international conference on peer-to-peer computing. Los Alamitos, CA, USA: IEEE Computer Society; 2005. p. 49–56.
[25] Kaufman C et al. Internet Key Exchange (IKEv2) Protocol; 2005. Available from: <http://www.ietf.org/rfc/rfc4306.txt>.
[26] Lafon Y et al. Simple Object Access Protocol; 2007. Available from: <http://www.w3.org/TR/soap>.
[27] Viganò L. Automated security protocol analysis with the AVISPA tool. Electr Notes Theor Comput Sci 2006;155:61–86.

Pedro Martinez-Julia received the B.S. degree in Computer Science from the Open University of Catalonia in 2009 and the M.S. degree in Advanced Information Technology and Telematics from the University of Murcia in 2010. Since 2009 he has been a research fellow and Ph.D. candidate in the Department of Communication and Information Engineering at the University of Murcia. He is the task leader of GEMBus (JRA3-T3), as part of the GN3 project (FP7/2007–2013). His main interests are overlay networks, security, and distributed systems. He is a reviewer for many international journals and an active associate member of ACM and IEEE.

Antonio F. Gomez-Skarmeta received the M.S. degree in Computer Science from the University of Granada and the B.S. (Hons.) and Ph.D. degrees in Computer Science from the University of Murcia, where he has been a Full Professor since 2009. He has worked on different national and international research projects, like Euro6IX, 6Power, Positif, Seinit, Deserec, Enable, and Daidalos. His main interest is the integration of security services at different layers like networking, management, and services. He is an editor of the IEEE SMC-Part-B and a reviewer for several international journals. He has published over 90 international papers and is a member of several program committees.