

Replacing RADIUS in Large-Scale Roaming Environments

No Author Given

No Institute Given

Abstract. RADIUS[1] is a widely deployed protocol in many environments where users of different administrative domains are to be given the ability to roam between those domains. While RADIUS is an excellent protocol for the uses it was designed for – authenticating dial-up users in point-to-point connections – it has some inherent shortcomings when used for roaming over untrusted networks. The larger the deployment scenario, the bigger the role of these shortcomings. This paper describes the limitations of RADIUS when the roaming infrastructure is realized as a hierarchy. Since no versatile implementations of the next-generation roaming protocol, Diameter[2], exist, an intermediate solution named RadSec[3] that is compatible with existing RADIUS deployments is evaluated. Among the goals of RadSec is replacing the static trust relation between RADIUS servers with a dynamic, PKI-based server trust model with peer discovery and the ability to exchange user attributes in a privacy-conserving way and perform attribute-based authorization.

1 Introduction

There are currently several protocols for AAA (Authentication, Authorization and Accounting) that are in widespread use. The most widely deployed protocol is RADIUS[1] (for Remote Authentication Dial-In User Service), a protocol standardized in 1997. Another protocol that is deployed widely is the TACACS+ protocol[4], a proprietary protocol by Cisco Systems that has great similarities with RADIUS.

Over time, the requirements for AAA servers have changed significantly, and the protocol specifications for RADIUS have started to show shortcomings under conditions that were not foreseen when the protocol was designed. One of the most notable changes in requirements is to enable roaming for users, particularly for Wireless LAN scenarios where more than one administrative domain is involved and the authentication and authorization data has to be transported over untrusted networks.

To address the new requirements for AAA, the AAA working group of the Internet Engineering Task Force[5] (IETF) has specified an advanced AAA protocol named Diameter. Unfortunately, there are currently not many implementations of this protocol. A large-scale roaming initiative such as the world-wide roaming federation for the educational and research sector named "eduroam"[6], described in section 2, could benefit greatly from the new protocol, but since no suitable implementation exists, an intermediate solution has to be found.

In order to address some of the shortcomings of RADIUS that are inherent in large-scale roaming scenarios, which are outlined below in section 3, several modifications of the existing RADIUS protocol are currently being specified and tested with the goal of having a flexible roaming infrastructure at hand that can be used with existing hardware and the well-tested existing RADIUS software. Section 4 gives a short overview of the Diameter protocol and shows its approaches to solving several of the shortcomings of RADIUS. Section 5 details the modifications to the RADIUS protocol that are known as the RadSec protocol[3]. After describing the protocol, an extensive test of the RadSec implementation within the Radiator RADIUS software is described in section 6, which shows the benefits as well as some shortcomings that even RadSec cannot resolve.

2 Overview of the eduroam RADIUS hierarchy

eduroam[6][7] (for educational roaming) is an academic roaming environment that aims to enable academic users to use research networks all over the world with the login credentials from their home institution. eduroam is developed by the Joint Research Activity 5[8] of the IST project GN2[9], and maintained and operated by the TERENA Task Force "Mobility"[10]. Since the roots of eduroam lie in Europe, most participants come from the European Research and Education Networks (NRENs), but the movement is already spreading to other parts of the world. Figure 1 shows the current participant map of eduroam.

The infrastructure interconnects hundreds of institutions, which makes a fully meshed RADIUS server approach very inconvenient: whenever a new institution somewhere in the world gets connected to eduroam, all other institutions would need to update their client list. Such an update is nearly impossible to synchronize between all participants, and outages would occur whenever it is necessary.

The approach that is taken instead in eduroam is a hierarchical one, as shown in figure 2. Two layers of RADIUS proxy servers interact with a well-defined set of clients: the root server has the per-TLD servers as clients, and the TLD servers are connected to the RADIUS servers of the institutions that reside under the respective TLD. Large institutions may either authenticate their own users themselves or add further levels of proxying as they see fit: if an institution delegates its user management to departments, it can itself act as a proxy server for its departments.

When a user is roaming at a remote site, his home institution or department is determined by examining his user name. User names are in e-mail style and end in a DNS-like hierarchy indicator that includes all levels of the hierarchy. As an example, the user name of the first author of this paper is [email protected] (blinded for review). The proxy servers in the hierarchy evaluate the realm expression @y.tld: authentication requests are first proxied upward until the .tld TLD server is reached and are then proxied by the .tld server to the institutional server of y.tld, which then authenticates the user.

Fig. 1. eduroam participants as of December 2005 (see also: [6])

Fig. 2. eduroam RADIUS server hierarchy
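The realm-based proxying described above can be illustrated with a minimal sketch. It is not taken from any eduroam configuration: the server names, the `SERVERS` table, the realms other than y.tld and the user name `alice@y.tld` are hypothetical, and real RADIUS proxies express these rules in their own configuration syntax.

```python
def next_hop(realm, local_realms, children, parent):
    """Pick the next hop for a realm as seen from one proxy in the hierarchy.

    local_realms: realms this server authenticates itself
    children:     mapping of realm suffix -> name of the child server
    parent:       name of the server one level up (None at the root)
    """
    realm = realm.lower()
    if realm in local_realms:
        return "local"
    # Prefer the longest matching suffix so a department wins over its institution.
    for suffix in sorted(children, key=len, reverse=True):
        if realm == suffix or realm.endswith("." + suffix):
            return children[suffix]
    return parent


def route(user_name, start, servers):
    """Follow next_hop decisions from `start` until a server handles the realm."""
    realm = user_name.rsplit("@", 1)[1]
    path, current = [start], start
    while True:
        hop = next_hop(realm, **servers[current])
        if hop == "local":
            return path
        if hop is None:
            raise LookupError(f"no route for realm {realm}")
        path.append(hop)
        current = hop


# Hypothetical two-TLD hierarchy used only for illustration.
SERVERS = {
    "root": {"local_realms": set(),
             "children": {"tld": "tld-server", "tld2": "tld2-server"},
             "parent": None},
    "tld-server": {"local_realms": set(),
                   "children": {"y.tld": "y-inst"}, "parent": "root"},
    "tld2-server": {"local_realms": set(),
                    "children": {"z.tld2": "z-inst"}, "parent": "root"},
    "y-inst": {"local_realms": {"y.tld"}, "children": {}, "parent": "tld-server"},
    "z-inst": {"local_realms": {"z.tld2"}, "children": {}, "parent": "tld2-server"},
}

print(route("alice@y.tld", "z-inst", SERVERS))
```

In this sketch a request from a visitor at the z.tld2 institution climbs to the root and then descends to y-inst, which mirrors the up-then-down path described in the text.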

The eduroam infrastructure does not implement different authorization levels – it is up to the visited institution what services are given to the roaming user. Different authorization levels would be an extra feature for eduroam, but they are not implemented because this would require disclosing personal information about the user to intermediate hops while traversing the hierarchy, a limitation described in section 3.3.

3 Shortcomings of the Current Approach

Using a hierarchy of RADIUS servers is the only way of interconnecting a RADIUS roaming infrastructure that scales to a worldwide level. Unfortunately, this design has some inherent shortcomings, which were originally summarized in [11] and are explained in detail in the following sections. These limitations are:

– Every piece of information about the user is transferred through the intermediate RADIUS hops, and almost all information is visible in clear-text form to each of these hops. Only when common EAP methods that establish TLS tunnels are used is the most critical information (true user name and password) encrypted end-to-end. This is what eduroam recommends and what in fact is technically required for proper wireless 802.1X authentication.

– Due to the use of UDP as a transport protocol it is very difficult to detect whether one of the servers in the hierarchy is not responding, and which of the servers in the chain it is.

– The information for roaming between the RADIUS leaves is aggregated towards the root servers, leading to a potentially high traffic volume on the root server and making it a possible single point of failure. This is, however, only traffic for roaming users; in most cases there will be more local or national authenticating users than international roaming guest users.

– New participants have to be statically configured in the hierarchy, which involves exchanging a shared secret and binding the server to a specific IP address. All configuration changes involve human interaction on both ends of the statically configured link.

Overcoming these limitations is one of the goals of the GEANT2 Joint Research Activity 5, which laid out the requirements for a next-generation roaming infrastructure in chapter 3 of [12].

3.1 Static Trust Relationship Between Servers

The RADIUS protocol uses a pair of (IP address, shared secret) to identify known peers and there is no mechanism for dynamic peer discovery. All incoming connections from peers that are not configured statically are silently discarded.

The need to actively manage new connections makes this approach scale badly, and a means of aggregating entire groups of RADIUS clients is important when the number of clients per RADIUS server grows. Recent implementations of RADIUS servers allow for simple means of aggregating clients into groups, mostly by allowing an entire subnet to be specified as a client. The drawback of such a simple approach is that the shared secret must be identical on all RADIUS clients in the subnet, thereby weakening overall security.

A second problem with static configurations comes into play when the RADIUS client's IP address is not static. An example of such a scenario is where domestic wireless access points behind DSL connections that time out are connected to a central RADIUS server. In this case, only specifying the entire IP range that the client could possibly receive upon a new log-in would make a connection to the RADIUS server possible. Again, the shared secret would then be spread over a large number of hosts.

3.2 Data Flow Aggregation Towards the Root Server

Grouping RADIUS clients in hierarchical levels can ease the problem of needing statically configured clients, because only the children of one particular node have to be configured together with their parent, leaving the rest of the hierarchy untouched. Unfortunately, the hierarchy must be traversed for every piece of information that is to be transmitted from one leaf of the hierarchy to another. In the case of eduroam, the national RADIUS proxy servers aggregate all user traffic from and to their country, and the root server aggregates all traffic where cross-country authentication takes place. This can quickly lead to high load conditions on the root server, which implies a higher likelihood of failure. Furthermore, the RADIUS protocol is limited to a maximum of 256 simultaneous outstanding requests per client, which directly translates into the following: at most 256 users per country can attempt to authenticate abroad at the same time.
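The limit of 256 follows from the one-octet Identifier field in the RADIUS packet header, which a client uses to match responses to requests. The following sketch models that bookkeeping for a single client; the class and method names are illustrative assumptions and do not come from any RADIUS implementation. It also ignores that a real client may open several source ports, each with its own Identifier space.

```python
class IdentifierPool:
    """Tracks the one-octet RADIUS Identifier values a client has in flight."""

    SIZE = 256  # the Identifier field is a single octet: values 0..255

    def __init__(self):
        self._free = list(range(self.SIZE))
        self._in_flight = set()

    def acquire(self):
        """Reserve an Identifier for a new request, or return None if all are used."""
        if not self._free:
            return None
        ident = self._free.pop(0)
        self._in_flight.add(ident)
        return ident

    def release(self, ident):
        """Free an Identifier once its response arrived or the request timed out."""
        if ident in self._in_flight:
            self._in_flight.remove(ident)
            self._free.append(ident)


pool = IdentifierPool()
pending = [pool.acquire() for _ in range(IdentifierPool.SIZE)]
blocked = pool.acquire()  # the next request has no Identifier left
print(len(pending), blocked)
```

Under these assumptions, any further request has to wait until an earlier one is answered or times out, which is the bottleneck a national proxy faces towards the root server.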

3.3 Limited Privacy for User Attributes

Transporting RADIUS attributes through a hierarchy also has the drawback that the user information is in large parts visible to intermediate hops. Most of the RADIUS attributes travel through the Internet in clear text and can thus be read by any IP hop between the leaf RADIUS servers. When transporting RADIUS packets over untrusted networks there is only one option to achieve end-to-end encryption: using EAP methods that utilize TLS tunnels to transmit sensitive information. Examples of TLS-protected EAP methods are EAP-TLS[13], EAP-TTLS and PEAP. However, using these methods typically only encrypts the user name and password (with the notable exception of EAP-TTLS, which can in principle encapsulate arbitrary Diameter attributes; but this feature is currently unimplemented in EAP-TTLS clients and thus not usable). So, clear-text transmission is an inherent problem when custom user attributes are to be transported in the Vendor-Specific attribute namespace. The standard for the transmission of Vendor-Specific attributes does not foresee encryption. Recent proprietary extensions made by several vendors do introduce encryption for these attributes, but there is no standardized way of doing so, and specific vendor extensions are not guaranteed to be compatible. Furthermore, only few RADIUS servers actually implement support for these proprietary extensions. When the attributes contain sensitive information about the user, like his role (e.g. professor or student) or his current location, it is imperative that this information is not disclosed to intermediate parties.

It would be desirable to have end-to-end encryption for all kinds of attributes, but the only safe place to store data encrypted end-to-end is within the TLS tunnels that are encapsulated in RADIUS packets when using one of the EAP types mentioned above. However, this would require modifications on the end user's client device, since the TLS tunnels are built directly from the device to the target RADIUS server, and attributes could only be injected directly by the client. Ultimately, new EAP payload protocols would need to be defined to provide a standardized transmission of attributes in this way, making it more of a long-term extension than an instantly available solution.

3.4 Insufficient Server Failure Detection

Due to the transport protocol in use by RADIUS, UDP, it is impossible to determine whether a particular RADIUS server has failed or if merely the packet in question got lost on the network. Almost all RADIUS clients have the possibility of marking a server as dead when connection attempts fail. In cases where a single RADIUS server is used, this can lead to an unnecessary service degradation when a server is marked as dead even though the packet loss was caused by network congestion.

While the feature of marking servers as unresponsive is in principle desirable when a failover server is configured, it will bring down large parts of a RADIUS infrastructure when used in a hierarchical way such as in the case of eduroam. Consider the example in figure 3 of two branches of the eduroam RADIUS hierarchy, where a user from the domain showcase.surfnet.nl currently resides in Luxembourg and wants to get Internet access, but his home authentication server does not respond to requests. The figure on the left-hand side shows the intact infrastructure before the client attempts to authenticate.

Because his home RADIUS server for the domain showcase.surfnet.nl is offline, his authentication request will be transported up to surfnet.nl, and all the RADIUS servers in the chain (including surfnet.nl) will wait for an answer, which will never be generated. Since none of the RADIUS clients and servers involved knows about the chain and each can only see its next hop, every RADIUS server will assume it is the next hop that is unreachable. This leads to the network view on the right-hand side of figure 3 as seen from the participating devices, assuming that they are configured for dead peer detection. Altogether, six links are wrongly detected as broken by the various intermediate hops, while only one real outage at showcase.surfnet.nl took place.

The wrongly detected outages lead to a huge service degradation for users all across the hierarchy, not only for those of the domain showcase.surfnet.nl.

Fig. 3. Incorrect failure handling when servers are marked dead (left: before failure; right: after failure)

Even a failover setup with a spare server on the country and top level of the hierarchy will not do any good in this case – if the user tries to authenticate a second time, all the secondary servers are marked dead too.
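The cascade can be reproduced with a small simulation. This is a sketch under stated assumptions: the chain below is hypothetical except for the two surfnet.nl names taken from the text, every hop is assumed to use the same timeout and to mark its next hop dead after a single unanswered request, and the resulting numbers are not meant to reproduce the count shown in figure 3.

```python
def simulate_timeout(chain, failed):
    """Return the (hop, next_hop) links marked dead after one request times out.

    chain:  servers in the order the request travels, from NAS to home server
    failed: the server that is really offline
    """
    reached = chain[: chain.index(failed)]  # hops that actually got the request
    marked = []
    for i, hop in enumerate(reached):
        # Each hop only observes that its own next hop never answered.
        marked.append((hop, chain[i + 1]))
    return marked


# Hypothetical path of the roaming request; only the last two names are from the text.
CHAIN = ["access-point", "visited-inst", "lu-server", "root",
         "nl-server", "surfnet.nl", "showcase.surfnet.nl"]

marked = simulate_timeout(CHAIN, failed="showcase.surfnet.nl")
false_alarms = [link for link in marked if link[1] != "showcase.surfnet.nl"]
print(len(marked), len(false_alarms))
```

In this assumed chain every link is marked dead although only the last one leads to a server that is actually down, so later requests for unrelated realms that share any of these links are affected as well.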

This issue is currently avoided by turning the feature for disabling non-responsive servers off entirely throughout the hierarchy. This, however, has the drawback that even in case a server really is unreachable it will still be queried.

4 The Diameter Protocol

The Diameter protocol[2] is a replacement protocol for RADIUS. It was designed by the IETF AAA working group[5] to address a list of shortcomings in RADIUS. This new protocol uses a different paradigm: it is not client-server oriented like RADIUS, but follows a peer-to-peer approach. This enables new usage scenarios, because a Diameter node can initiate new connections and trigger actions on its own (contrary to RADIUS, where the server can only react to messages sent by a client), making the communication between the network access devices (NASes) more flexible. This paradigm change requires completely new protocol implementations, both on the NAS side and on the Diameter node side.

Unfortunately, Diameter implementations on the NAS side are uncommon at the moment, and implementations of Diameter servers are currently only commercially available and are only suited for specialized scenarios, like cell phone roaming. In an attempt to address a large subset of the problems of RADIUS but still use existing implementations, the RadSec protocol was defined as a simple, but powerful extension to plain RADIUS.

5 The RadSec Protocol

RadSec, originally described in [3], is a modification of the traditional RADIUS protocol. It preserves the RADIUS packet format and thus provides good backward compatibility with pure RADIUS clients and servers in mixed environments. The differences between RADIUS and RadSec lie in the transport mechanisms used for packet delivery and peer discovery. The following sections provide an in-depth explanation of the differences between the two protocols and give an overview of the current implementation status.

5.1 Operational Differences to plain RADIUS

There are several tiers of RadSec extensions that can be used depending on the needs of the concrete authentication infrastructure.

Transport Protocol

The first – and most basic – extension is that the transport protocol used is TCP or SCTP instead of UDP. The RADIUS protocol specified UDP as a transport protocol, but specified mechanisms that duplicated several features of TCP in a custom manner. As an example, positive acknowledgments in the form of RADIUS response messages were specified for the accounting messages in RADIUS ("Accounting-Response") to reduce the probability of information loss. Also, periodic re-sending of packets was specified for the cases where no reply was received. This functionality, along with a lot of other improvements (like, for example, the three-way handshake to detect if the server is up and running), comes built-in in TCP. So, using TCP actually makes packet handling for RadSec servers simpler because a lot of the specialties of RADIUS packet handling are handled automatically by the TCP stack of the operating system.
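One thing a stream transport does not provide is packet boundaries. Because RadSec preserves the RADIUS packet format, a receiver can presumably recover them from the two-octet Length field in the RADIUS header. The following sketch shows that idea; the function name and the buffering strategy are assumptions and are not taken from Radiator or the RadSec whitepaper.

```python
import struct

HEADER_LEN = 20  # Code (1) + Identifier (1) + Length (2) + Authenticator (16)


def split_packets(buffer):
    """Split a byte stream into complete RADIUS packets.

    Returns (packets, remainder), where remainder holds the bytes of a
    packet that has not been received completely yet.
    """
    packets = []
    offset = 0
    while len(buffer) - offset >= HEADER_LEN:
        # The Length field sits in octets 2-3, in network byte order.
        (length,) = struct.unpack_from("!H", buffer, offset + 2)
        if length < HEADER_LEN:
            raise ValueError("invalid RADIUS length field")
        if len(buffer) - offset < length:
            break  # wait for more data from the socket
        packets.append(buffer[offset:offset + length])
        offset += length
    return packets, buffer[offset:]
```

A receiver would call `split_packets` on everything read from the socket so far and keep the returned remainder for the next read.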

Packet Encryption and Peer Authentication

The second extension in RadSec, which is deployed on top of the TCP extension, is the use of TLS tunnels for communication between servers. Using TLS provides a much more elegant and secure way to prove the authenticity of the communicating entities. First, authentication is no longer bound to the IP address of a node, but instead to the Common Name or subjectAltName fields of the X.509 certificate presented by the peers. Furthermore, the entire RADIUS packet is transported within the TLS tunnel so that no information about an ongoing user authentication is revealed to intermediate IP hops. As a side-effect of the strong encryption that protects the entire packet and the peer authenticity verification with certificates, the RADIUS way of protecting user passwords and peer authenticity with the shared secret is obsolete.
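Binding a peer to certificate names rather than to an IP address can be sketched as follows. The sketch assumes the certificate chain has already been validated by the TLS library and that the certificate is available in the dictionary layout returned by Python's `ssl.SSLSocket.getpeercert()`; the function names and host names are hypothetical, and wildcard matching is deliberately left out.

```python
def certificate_names(cert):
    """Collect DNS subjectAltName entries and Common Names from a certificate dict."""
    names = {
        value.lower()
        for kind, value in cert.get("subjectAltName", ())
        if kind == "DNS"
    }
    for rdn in cert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName":
                names.add(value.lower())
    return names


def peer_is_known(cert, allowed_names):
    """Accept a peer by certificate name instead of by source IP address."""
    allowed = {name.lower() for name in allowed_names}
    return bool(certificate_names(cert) & allowed)


# Hypothetical certificate of a RadSec peer.
example_cert = {
    "subject": ((("countryName", "NL"),), (("commonName", "radsec.inst.example"),)),
    "subjectAltName": (("DNS", "Proxy.Inst.Example"),),
}
print(peer_is_known(example_cert, ["proxy.inst.example"]))
```

Since the decision no longer involves the source address, the same check would keep working for a peer whose IP address changes between connections.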

Making the shared secret obsolete and becoming independent of IP addresses also has another positive side-effect: even RadSec servers that suffer from constantly changing IP addresses (for example because of being behind a DSL or dial-up connection) can be integrated into the infrastructure without problems.

Because it only replaces the transport protocol of RADIUS, this setup still depends on a hierarchy for roaming setups. Therefore the problem of exposing information to intermediate RadSec nodes still remains. This can only be solved by either encrypting that data too, or by sending the request directly to the final peer.

Dynamic Peer Discovery

It is also possible to have a RadSec server dynamically discover other peers. Enabling dynamic peer discovery makes hardwired client configurations obsolete and allows for direct communication between the authentication servers. Provided that both endpoints are using the same techniques, the traditional hierarchy (with its root server and other potential points of failure) can be bypassed.

This solves both security and scalability issues: only the necessary authentication servers are used for a certain request, and all user information is only exposed to the two servers that are dealing with the request.

The peer can be verified by the RadSec client using the previously described authentication mechanisms (TLS). Finding another peer requires a lookup service that, in the current implementations, is based on DNS. When using DNS without DNSSEC it is not possible to prove the authenticity of the server that was contacted, because no reliable source of information about the server's identity exists. It is nevertheless possible to prove that the server contacted is authorized to handle the request by checking the attributes of the certificate that is presented in the TLS handshake after the lookup is done. A variety of attributes can be used to check the authorization, for example whether the certificate was issued by a CA that is dedicated to issuing certificates for roaming purposes, or a specific OID for certificate usage.

When blending this dynamic discovery in with an existing infrastructure, the technique can also be used to discover paths in only parts or leaves of the infrastructure. For instance, an important and potentially heavily loaded part of the infrastructure, the root server, can be kept out if cross-country requests are handled with dynamic discovery. While meshing only a smaller part of the infrastructure can be considered less desirable, it might be a feasible improvement to an existing infrastructure when requiring all of the already implemented parts to use a new technique is not an option or unrealistic. Since a RadSec server can also act as a standard RADIUS server for legacy authentication servers, it could be used as a migration path too.

Bypassing hierarchies and sending requests directly to other authentication servers participating in a federation has some security drawbacks. Whereas with a statically configured hierarchy the number of possible connecting peers is limited and can be restricted with firewalls, setting up peer-to-peer connections makes this kind of restriction impossible. This makes the validation of peers (both authentication and authorization) and the use of robust server implementations even more important.

Lookup services other than DNS can be considered. Some lookup services provide reliable information (like LDAP or DNSSEC) and thus provide a secure mechanism to prove a server's authenticity, or can even allow for exchanging more information than just an authentication server's address. Examples are fingerprints of certificates that are allowed to answer for a specific realm in a specific federation, or information about the CA / CRL data that can be used to validate the peer. If this complexity is added to a lookup service, this also changes the view on the security of the setup: a lookup service that provides more information should have a higher level of trust, but it does allow for a lower level on the actual authentication servers, since those do not necessarily have to prove their membership of the federation themselves anymore.

5.2 Current Implementation Status

The Radiator RADIUS server by Open System Consultants currently supports RadSec, implementing the extensions for reliable transport (TCP or SCTP) as well as packet encryption and peer authentication (TLS). There is also an implementation of dynamic peer discovery that can be considered beta quality.

Developers of other implementations, like FreeRADIUS[14] and OpenRADIUS[15], have shown interest in RadSec (as implemented within Radiator) since it solves the most basic shortcomings of RADIUS in a simple way and without adding the complexities of Diameter. However, not having an official proposed standard for RadSec to refer to (only a whitepaper from Open System Consultants [3]) could hold back some implementors.

The peer discovery implementation in Radiator, named DNSROAM, is based on the Domain Name System (DNS). It uses the DNS to find an authentication server for a given (RADIUS) realm. The result can either point to a traditional RADIUS server (with UDP) or to a RadSec server that enables a reliable transport and validation of the peer.

If a request flowing through DNSROAM does not have a hardwired route in the Radiator configuration, a NAPTR record [16] is looked up for the realm in DNS (after optional rewriting of the realm to look up). The NAPTR record allows the DNS to be used as a lookup service for many purposes, even if resources are not in the domain name syntax (like URIs). It also offers means for indicating what kind of service a site has to offer: RadSec with or without TLS, RADIUS, TCP or SCTP transport, and whether a regular A-, AAAA-, A6-record or a SRV-record should be used for the authentication server's address. The SRV record can be used to define IP port numbers for the specific servers, and allows indication of a weight. While both SRV records and NAPTR records add the flexibility of defining weight, preference and order of results, the fallback mechanism to use these details within Radiator is not yet in place but is on the roadmap and considered trivial to add.

If no NAPTR record is found while using DNSROAM, a lookup is done for regular A-/AAAA-records. This offers neither options for order or preference nor for configuration of the ports used (Radiator reverts to the default ports), so using NAPTR records is preferred. There is also a configurable fallback for situations where no record is available after a lookup: a DEFAULT route.
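The lookup order described above (hardwired route, NAPTR record, A-/AAAA-record, DEFAULT route) can be summarized in a short sketch. It does not use Radiator's configuration or a real resolver: the `dns` dictionary is an in-memory stand-in for DNS, the service tag `radsec+tls` is a made-up placeholder, and all realm and host names are hypothetical. Like the implementation status described in the text, the sketch does not evaluate order, preference or weight.

```python
def resolve_realm(realm, dns, static_routes, default_route):
    """Return (how, target) for a realm, following the lookup order in the text."""
    if realm in static_routes:  # a hardwired route wins
        return ("static", static_routes[realm])
    naptr = dns.get(("NAPTR", realm), [])
    if naptr:
        # Take the first record as returned; ordering is not evaluated here.
        _order, _preference, service, replacement = naptr[0]
        return ("naptr:" + service, replacement)
    addresses = dns.get(("A", realm), []) + dns.get(("AAAA", realm), [])
    if addresses:
        return ("address", addresses[0])  # default ports apply in this case
    return ("default", default_route)


# Hypothetical records: (order, preference, service tag, replacement name).
FAKE_DNS = {
    ("NAPTR", "inst-a.example"): [(50, 50, "radsec+tls", "_radsec._tcp.inst-a.example")],
    ("A", "inst-b.example"): ["192.0.2.10"],
}

for realm in ("inst-a.example", "inst-b.example", "inst-c.example"):
    print(realm, resolve_realm(realm, FAKE_DNS, {}, "top-level-server"))
```

In a real deployment the replacement name from the NAPTR record would itself be resolved further, for example through a SRV record that supplies the port.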

6 RadSec Testing Experiences

In order to test the new technologies available within Radiator and see how they would fit in the current eduroam infrastructure, a testing group was created within the Geant2-JRA5 group[8]. Three possible scenarios that could improve the traditional hierarchy were sketched.

In the tests three levels of authentication servers were introduced, replicating the traditional eduroam hierarchy as described in section 2.

The tests were conducted in three subsequent phases, each eliminating bigger parts of the hierarchy and using more of the new Radiator features like RadSec and DNSROAM. The first test was a simple duplication of the hierarchy, where the RADIUS servers were replaced with RadSec servers. In the second phase, the TLD-level servers used dynamic peer discovery for the other TLD-level servers, which eliminated the need for a central root server. In phase three, institutional servers were set up to discover the home server directly, which also made the TLD-level servers obsolete. This phase also included a fallback behaviour if one of the home servers could not be dynamically discovered. The following sections describe the three phases of the tests in detail.

6.1 Blank test: Duplication of the Classical Hierarchy

The simplest test is to replace the RADIUS (UDP) protocol with RadSec (using TCP and TLS). The intention was to verify that all functionality in use with RADIUS is also available with RadSec and to create a setup similar to the situation with RADIUS.

Peers were statically configured; unknown realms were forwarded using static definitions to a server higher in the hierarchy, up to the point where the realm or a part of it is known.

In order to keep the administrative and trust domains the same as in the traditional setup, a node has to trust the adjacent nodes in the hierarchy. For an NREN this means there are two trust domains: between the NREN itself and the other NRENs, and between the NREN and its institutions. For trusting entities the NREN often has its own PKI in place. It makes sense to use such a PKI setup in the trust domain between the institutions and the NREN. But we do not want this PKI (yet) between the NRENs: institutions of one NREN have nothing to do with the institutions of another NREN, or at least have no direct existing relationship. Therefore a small CA was created for the tests in order to not only authenticate the NREN servers with the top level server, but also to authorize them and their institutions as members of the federation.

There are two ways to split these PKI domains in the RadSec implementation. We can either instantiate different RadSec servers on different ports, each with its own TLS settings, or configure the RadSec server process on the NREN servers to allow multiple CAs.

Using multiple CAs was the most challenging approach and worked fine, but the institutions had to include the top level CA certificate in their client component if the NREN server is using the certificate signed by the top level CA. As an alternative, the top level server can configure the NREN CA in its RadSec client definitions for forwarding traffic to the NREN. The former (being again the most challenging solution) was tested – the latter seems to be the more sensible one because it restricts the security domains the most.

Apart from these design choices and issues there was a negligible number of bugs and technical problems during the first test scenario.

One of them was related to the validation of the remote host identity (FQDN or IP address) with the information provided in the certificates (subjectAltName or CN). This validation of attributes takes place after the peer certificate is validated using the issuer's root certificate and available revocation lists (CRL). The SSL modules used by Radiator and Radiator itself needed some modifications in order to perform this validation in conformance with other implementations and specifications.

Although these problems were raised during the first setup, the validation of these attributes was actually not critical in this scenario since the values are easily statically configured in the RadSec server and routing definitions.

The first scenario was successfully tested in the first week of testing among a handful of institutions and NRENs.

6.2 Using peer discovery on the top half

The first step in using a lookup service in the experimental setup was to use only cc-TLDs to determine the next route, first on the top-level server and later on the NREN servers themselves, thus eliminating the top-level server. Appropriate cc-TLD entries were created in a centralized zone, and the DNSROAM component of Radiator was configured to look up the TLD part of the realm under <tld>.test.eduroam.org.
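As an illustration of this first step, the following Python sketch derives the name that would be looked up for a given user name. It only shows the string handling, not the DNS query itself; the function name and the example realm are hypothetical, and only the zone test.eduroam.org is taken from the test setup.

```python
def tld_lookup_name(user_name, zone="test.eduroam.org"):
    """Map a user name such as 'alice@uni.example.nl' to the DNS name
    under which the route for its top-level domain is published."""
    if "@" not in user_name:
        raise ValueError("user name carries no realm: %r" % user_name)
    realm = user_name.rsplit("@", 1)[1]
    # The TLD is the last label of the realm; ignore a trailing root dot.
    tld = realm.rstrip(".").rsplit(".", 1)[-1].lower()
    return "%s.%s" % (tld, zone)
```

For example, `tld_lookup_name("alice@uni.example.nl")` yields `nl.test.eduroam.org`, so all realms of one country resolve to the same next hop.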

This change simplified the configuration on the top-level server a lot. Trust issues remained the same (different PKIs between NRENs and institutions), but the issue with the peer certificate attribute validation was even more important than in the first scenario, since there was no room for configuring this per realm.

After using this on the top-level server for a couple of days, the NREN servers were configured to use DNS for routing decisions too. Only requests from NREN setups not yet configured for DNSROAM, or to NRENs with no configured DNS, were proxied through the top-level server.

A number of bugs showed up with DNSROAM. In particular, whenever one of the NREN servers was unavailable (for various reasons), the other authentication servers started to show instabilities.

Besides the techniques used within the authentication servers, a lookup service introduces additional complexities and dependencies. During the tests DNS-specific problems showed up that were not foreseen and not related to the authentication servers. These were due to incorrect zone transfers of SRV records on some server implementations, and could be sorted out by using properly working nameservers.

During the second week multiple issues were reported to Open System Consultants regarding DNSROAM, and the issues found were swiftly patched.

6.3 Completely Meshed setup

In the third setup (also the third week of testing) a total of six NRENs participated in the tests and a total of 18 (institution) servers were available for testing the meshed setup. In this scenario the goal was to have direct connections from one authentication server to another, without ever unintentionally proxying the request through a third server. Instead of using a dedicated zone for finding a peer, the realm was looked up directly in DNS. If no records were found for the realm, the request was forwarded to a default route, which in the test setup was the top-level server, still configured as in the previous scenario.
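The routing decision of the meshed setup can be summarized in a short Python sketch. The DNS query is replaced by an injected `lookup` callable so that the example stays self-contained; this callable, the function name and all host names are hypothetical stand-ins and not part of DNSROAM.

```python
def route_request(realm, lookup, default_route):
    """Choose the next hop for a realm in the meshed setup.

    `lookup` stands in for the DNS query: it takes a realm and returns
    a (possibly empty) list of peer host names found for it.
    """
    peers = lookup(realm.lower())
    if peers:
        # Direct connection to the peer that is responsible for the realm.
        return peers[0]
    # Nothing published for this realm: fall back to the default route,
    # which in the test setup was the top-level server.
    return default_route


# Hypothetical records, standing in for what would be published in DNS.
records = {"uni.example.nl": ["radsec.uni.example.nl"]}

def lookup(realm):
    return records.get(realm, [])
```

With these records, a request for `uni.example.nl` goes directly to `radsec.uni.example.nl`, while a request for any unpublished realm is handed to the default route.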

One of the difficulties in this completely meshed setup was the use of PKI. When using NREN CAs for the institution servers, a bag with all involved root certificates had to be created. Updating the bag requires some distribution. The use of CRLs appeared to be even more complicated, since at the time CRL validation was only possible as long as CRL files for all involved CAs were available. Both CRLs and a proper distribution mechanism are of course required for daily use of the tested scenario.

Questions were raised regarding the use of NREN CAs in this setup. For an NREN it is hard to keep multiple CAs running, if only because of the costs involved. Because the use of a general-purpose CA does not tell what federation one is a member of, there should be other means for providing this information. This information could for instance be published in DNS too, if that is properly secured with DNSSEC, or put into a different lookup service. Another way is to put some attributes (OID) in the certificate while signing. This allows the CA server to sign a certificate specifically for a federation (e.g. eduroam), but it requires additional changes in the software (along with a definition of these extensions) in order to do this.

Some system administrators are (likely to be) reluctant to have their server ports wide open to the world, without any firewall or filtering mechanism in place. While static and hierarchical configurations only communicate with a limited number of hosts, dynamic setups have to send and receive traffic to and from unpredictable addresses. This is not necessarily a problem, given that peers are validated "at the doorstep" using TLS sessions. But it does have implications for firewalls.

7 Conclusions

The RadSec protocol solves several problems that arise when deploying hierarchical roaming environments. The tests that were conducted show that the protocol is usable already and can provide a working intermediate solution until versatile implementations of the Diameter protocol are available. Replacing the static RADIUS hierarchy with an identical RadSec hierarchy is already in a fairly bug-free state, and so it is possible to replace the infrastructure instantly.

However, before bringing RadSec into full production use with all the new features it offers – especially the peer discovery mechanism "DNSROAM" – a more thorough stress test of the reference implementation is necessary to eliminate the bugs that are currently still present in this subsystem.

The rather simple extensions that RadSec makes to the RADIUS protocol already enable several of the new features that are specified in the Diameter design. Using these new features in practice and gathering experience with them can already provide insight into the new Diameter concepts and will probably make later implementations of Diameter easier.

This especially holds true for the peer lookup service – this is foreseen in the Diameter specification, and providing several implementations (based on different directory services like LDAP, DNSSEC or DNS-NAPTR with certificate validation) for that service right now in RadSec produces valuable results that can flow directly into Diameter implementations.

8 Future Work

Although RadSec is ready to be used as a replacement for RADIUS authentication, the lack of independent RadSec implementations still limits widespread use. The latest Radiator versions have good and stable support for it, but no other RADIUS server software has the same functionality available yet.

Apart from the server implementations, it would be good if some of the traditional RADIUS clients implemented RadSec too. They could benefit from better dead-server detection, but also from the improved security and the more reliable TCP or SCTP transport between the authenticator (AP or switch) and the authentication server.

Work has to be done on the authorization level. It is desirable to use existing PKIs between authentication servers, but then no authorization component is yet in place. This could be an extension or attribute in the X.509 certificates in the case of TLS with RadSec. An extra step in the lookup service can also be considered.

In order to make a more structural change to the existing infrastructures, more work has to be done on lookup services too. Lookup services other than DNSROAM could be designed or evaluated, such as services with a more centralized authorization role. Mechanisms other than the insecure DNS could enhance the possibilities for authorization on this level, for instance LDAP or DNSSEC. For deployment of the current DNSROAM technology, more testing and debugging is required.

A way to prevent authentication requests from looping with DNSROAM has yet to be implemented. Adding a visited-peer attribute (like Diameter's Route-Record) or a hop counter (like the TTL field in IP) is suggested. Fallback mechanisms are not yet in place in DNSROAM either, but both features are considered trivial to add by Radiator's authors.
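The two suggested loop-prevention measures can be combined in one small Python sketch. This is only an illustration under stated assumptions: the request is modelled as a plain dictionary, and the `visited_peers` key, the `MAX_HOPS` limit and the function name are hypothetical and not defined by RadSec or Radiator.

```python
MAX_HOPS = 8  # assumed upper bound on the number of servers a request may pass


class LoopDetected(Exception):
    """Raised when a request must not be forwarded any further."""


def prepare_forward(request, own_name, max_hops=MAX_HOPS):
    """Return a copy of `request` that is safe to forward, or raise.

    The list under 'visited_peers' plays the role of a visited-peer
    attribute; its length doubles as the hop counter.
    """
    visited = list(request.get("visited_peers", []))
    if own_name in visited:
        raise LoopDetected("%s has already handled this request" % own_name)
    if len(visited) >= max_hops:
        raise LoopDetected("hop limit of %d reached" % max_hops)
    forwarded = dict(request)
    forwarded["visited_peers"] = visited + [own_name]
    return forwarded
```

Each server would call `prepare_forward` with its own name before proxying; a request that comes back to a server it has already passed is rejected instead of being proxied again.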

References

1. Rigney, C., Willens, S., Rubens, A., Simpson, W.: Remote Authentication Dial In User Service (RADIUS). RFC 2865 (Draft Standard) (2000) Updated by RFCs 2868, 3575. http://www.ietf.org/rfc/rfc2865.txt

2. Calhoun, P., Loughney, J., Guttman, E., Zorn, G., Arkko, J.: Diameter Base Protocol. RFC 3588 (Proposed Standard) (2003) http://www.ietf.org/rfc/rfc3588.txt

3. Open System Consultants Pty. Ltd.: Radsec – a secure, reliable radius protocol, http://www.open.com.au./radiator/radsec-whitepaper.pdf (2005)

4. Carrel, D., Grant, L.: The TACACS+ Protocol, http://bgp.potaroo.net./ietf/all-ids/draft-grant-tacacs-00.txt (1996)

5. IETF: Authentication, authorization and accounting (aaa) working group homepage (2004) http://www.ietf.org./html.charters/aaa-charter.html

6. ”Mobility”, TERENA Task Force: eduroam – educational roaming infrastructure (2005) http://www.eduroam.org.

7. Wierenga, K., Florio, L.: Eduroam – past, present and future. In: TERENA Networking Conference 2005 Proceedings, http://www.terena.nl/conferences/tnc2005/core/getfile.php?file id=630, Trans-European Research and Education Networking Association (2005)

8. GN2-JRA5: Joint Research Activity 5 – Roaming and Authorisation (2005) http://www.geant2.net./server/show/nav.00d00a005

9. GN2 project: GEANT2 home (2004) http://www.geant2.net.

10. TERENA: The TERENA Task Force Mobility (2005) http://www.terena.nl./tech/task-forces/tf-mobility/

11. Eertink, H., Peddemors, A., Arends, R., Wierenga, K.: Combining RADIUS with Secure DNS for Dynamic Trust Establishment Between Domains. In: TERENA Networking Conference 2005 Proceedings, http://www.terena.nl/conferences/tnc2005/core/getfile.php?file id=113, Trans-European Research and Education Networking Association (2005)

12. Rauschenbach, J., Wierenga, K., et al.: Deliverable DJ5.1.2: Documentation on GEANT2 Roaming Requirements, http://www.geant2.net/upload/pdf/GN2-05-71v6.pdf (2005)

13. Aboba, B., Simon, D.: PPP EAP TLS Authentication Protocol. RFC 2716 (Experimental) (1999) http://www.ietf.org/rfc/rfc2716.txt

14. DeKok, A., et al.: The FreeRADIUS server project (2005) http://www.freeradius.org.

15. van Bergen, E.: OpenRADIUS (2005) http://www.xs4all.nl/~evbergen/openradius/

16. Mealling, M.: Dynamic Delegation Discovery System (DDDS) Part Three: The Domain Name System (DNS) Database. RFC 3403 (Proposed Standard) (2002) http://www.ietf.org/rfc/rfc3403.txt