AAA Transport Issues
Draft-ietf-aaa-transport-00.txt
http://www.drizzle.com/~aboba/AAA/AAA_transport.ppt
Bernard Aboba
Barney Wolff
Dave Mitton
Outline
• Goals and objectives
• Introduction
• AAA proxy bestiary
• Congestion control principles
• Summary
Goals and Objectives
• To understand how AAA protocols interact with the transport layer
• To understand the transport behavior of AAA protocols running over UDP, TCP and SCTP
– Useful for understanding the behavior of existing protocols (RADIUS, TACACS+) as well as DIAMETER
• To understand the transport behavior of proxy systems
– Transport layer proxies, Store & Forward proxies, Routing proxies and Re-direct proxies exhibit different transport behavior
– Implications for other proxied protocols (SIP, DNS) as well as AAA
Introduction
• AAA protocol exchanges
• Transport connection usage
• Firewall issues
• The “Mice” problem
• Application driven vs. Network driven
• Reliable versus unreliable transport traces
AAA Protocol Exchanges
• Single request/response
– Simple authentication/authorization exchanges (NAS initiated)
– Accounting exchanges (NAS initiated)
• Multiple request/response
– EAP exchanges (NAS initiated)
• Unsolicited server messages
– Request/response initiated by server
Transport Connection Usage
• Implementation experience (RADIUS)
– Some implementations use a single socket for NAS-AAA server communication
– Some implementations use a socket per port!
• Implications
– Possible for AAA to use congestion-friendly transport in a non-congestion-friendly way
– Pipelining desirable
  • No need for NAS to wait for a Response before sending another Request
  • May make use of a single connection more palatable
– Congestion Manager support desirable
  • Enables separate connections to share information with each other and with application layer
Firewall Issues
• Designing a firewall for an AAA server is not hard
– Only allow DNS and AAA traffic between NASes and the AAA server on the AAA port
– Typically don’t need layer 7 filtering
• What about denial of service attacks?
– RADIUS vulnerable to DoS attacks
– Bogus client can send large number of Requests
– Server must validate User-Password or CHAP-Password attribute, and Message-Authenticator attribute if present
– Strong cryptography support increases DoS vulnerability
• Solution
– Need port-specific rate limiting on router or built-in TCP/SCTP DoS protection
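The rate-limiting idea can be illustrated with a token bucket. This is a minimal sketch, not a router configuration: the class name `TokenBucket`, the rate and burst values, and the explicit `now` argument are all assumptions made for illustration.

```python
class TokenBucket:
    """Token-bucket limiter for Requests arriving on one AAA port (illustrative)."""

    def __init__(self, rate_per_s: float, burst: float) -> None:
        self.rate = rate_per_s   # tokens added per second (assumed value)
        self.burst = burst       # maximum tokens the bucket can hold
        self.tokens = burst      # start with a full bucket
        self.last = 0.0          # time of the previous update, in seconds

    def allow(self, now: float) -> bool:
        """Return True if a Request arriving at time `now` may be processed."""
        elapsed = now - self.last
        self.last = now
        # Refill in proportion to elapsed time, never beyond the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

A bogus client that floods the port exhausts the bucket and is dropped before the server spends cycles validating attributes, while a NAS sending at its normal rate is unaffected.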
The “Mice” Problem
• Many NASes (thousands) can converse with a single AAA server or proxy
• Traffic from a single NAS may be light, but traffic close to the server/proxy may be substantial
• Result can be packet loss in router near server, or buffer overflow within the server itself
• Data traffic can also compete with AAA traffic near the NAS
[Figure: eight NASes, the Internet, a Router and a single AAA Server]
The Proxy Congestion Problem
• Bottleneck may be between AAA proxy and a particular AAA server
– DinkyLink.org AAA server has trans-oceanic 56 Kbps Internet connection
• Proxy may be overloaded at the application layer (many NAS requests)
• NAS can’t sense proxy-AAA server bottleneck since it has no transport layer connection to the AAA server
– Result: NAS sends more requests than proxy can forward, proxy send buffer fills up
• NAS can’t differentiate reasons for poor application-layer response experienced with the proxy
– Result: NAS switches to another proxy inappropriately, re-transmits request at the application layer, etc.
• Proxy needs a way to communicate status to the NAS (Unable to forward, No Response, Busy)
[Figure: RADIUS proxy, Routers and the Internet, AAA Server (local realm) and AAA Server (DinkyLink.org realm); link labels: 56 Kbps, 10 Mbps, 10 Mbps, 10 Mbps]
Application Driven vs. Network Driven
• AAA protocol exchanges typically application driven
– Definition: time between exchanges larger than RTT
– Examples
  • 48 port NAS, session time of 20 minutes
    – Authentication & Accounting Request every 25 seconds
    – Traffic assuming 1500 octet packets: 480 bps
    – Total traffic, assuming 4 Kbps per port: 192 Kbps
  • 2048 port NAS, session time of 10 minutes
    – Authentication & Accounting Request every 293 ms
    – Traffic assuming 1500 octet packets: 41 Kbps
    – Total traffic, assuming 4 Kbps per port: 8.2 Mbps
• AAA exchanges can also be network driven
– Employees come to work in the morning, log on to the network
– NAS reboots, users log on again
– After network partition, NAS sends stored accounting records
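The example figures can be reproduced with a few lines of arithmetic. The sketch below assumes a fully loaded NAS in which every port turns over one session per session time, and one 1500-octet AAA packet per resulting interval; the function name `aaa_load` and its parameter names are illustrative, not taken from the draft.

```python
def aaa_load(ports: int, session_s: float, packet_octets: int = 1500,
             per_port_bps: float = 4000.0) -> dict:
    """Estimate AAA request spacing and traffic for a fully loaded NAS."""
    interval_s = session_s / ports             # time between AAA Requests
    aaa_bps = packet_octets * 8 / interval_s   # one packet per interval
    total_bps = ports * per_port_bps           # user traffic across all ports
    return {"interval_s": interval_s, "aaa_bps": aaa_bps, "total_bps": total_bps}


small = aaa_load(ports=48, session_s=20 * 60)
large = aaa_load(ports=2048, session_s=10 * 60)
print(small)  # 25 s between Requests, 480 bps of AAA traffic, 192 Kbps total
print(large)  # about 293 ms between Requests, about 41 Kbps, about 8.2 Mbps
```

In both cases AAA traffic is a small fraction of the total, and the spacing between Requests exceeds a typical RTT, which is what makes the exchange application driven.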
Transport Parameter Validation Issues
• CWND, RTT/RTO(NAS-proxy) and RTT/RTO(proxy-AAA server) estimates not valid for application-driven scenarios typical of AAA
– Multiple RTTs may elapse between packets
  • 48 port NAS ~ 200 RTT
  • 2048 port NAS ~ 2 RTT
– CWND can open without being fully utilized
• CWND validation
– RFC 2581 recommends slow-start after an idle interval larger than the RTO
– RFC 2861 recommends increasing the congestion window only if it was full when the ACK arrived; during idle periods the congestion window is reduced by half once per RTO
– Ssthresh not reduced
• Remaining issue: RTT/RTO validation
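The idle-period decay can be sketched as follows. This is a simplified reading of the RFC 2861 rule, not a TCP implementation: it ignores the receiver's advertised window, and the function name, the byte units and the default MSS of 1460 are assumptions.

```python
def decay_idle_cwnd(cwnd: int, ssthresh: int, idle_s: float, rto_s: float,
                    mss: int = 1460) -> tuple[int, int]:
    """Return (cwnd, ssthresh) in bytes after the sender has been idle for idle_s."""
    if idle_s <= rto_s:
        return cwnd, ssthresh              # not idle long enough to decay
    # Remember the old window in ssthresh; ssthresh is never reduced here.
    ssthresh = max(ssthresh, 3 * cwnd // 4)
    for _ in range(int(idle_s // rto_s)):
        cwnd = max(cwnd // 2, mss)         # halve once per elapsed RTO
        if cwnd == mss:
            break                          # already at the one-segment floor
    return cwnd, ssthresh
```

With a 25 second gap between Requests and an RTO of about one second, a window of 16 segments decays to a single segment before the next Request is sent, so the previous CWND contributes nothing.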
Reliable Transport Protocol Trace
[Figure: message sequence between NAS and AAA Server, grouped into Auth and Actng phases. The NAS sends Auth Request, Accounting Start and Accounting Stop; the labels show three Response/ACK and three ACK messages]
Notes:
• 8 packets if same port used for auth and accounting, 9 otherwise
• Server typically piggybacks ACK with Auth Response unless it’s really overloaded
• ACK of Auth Response can’t piggyback on Accounting Start if Accounting uses a different port
• Long delay between auths means that previous RTT/CWND estimates are not valid between transactions
• Long delay between Accounting Start/Stop means RTT/CWND estimates are typically no longer valid within a single transaction
• Since Responses are ACK’d at the transport layer, an app layer Response ACK would not add additional packets due to piggybacking
Reliable Transport Protocol Trace (Routing Proxy)
[Figure: message sequence between NAS and Routing Proxy, grouped into Auth and Actng phases. The NAS sends Auth Request, Accounting Start and Accounting Stop; the labels show three Response/ACK and six ACK messages]
Notes:
• Proxy may send delayed ACK if home server is sufficiently far away (RTT(proxy-server) > 100 ms)
• 11 packets in worst case if same port used for auth and accounting, 12 otherwise
UDP (RADIUS) Protocol Trace
[Figure: message sequence between NAS and AAA Server/Routing Proxy, grouped into Auth and Actng phases. The NAS sends Auth Request, Accounting Start and Accounting Stop; the labels show one Response and two Accounting Response messages]
Notes:
• 6 packets in worst case
• Retransmission behavior undefined (no RTT/RTO measurement)
• Failover/failback behavior undefined
• Transport doesn’t self-clock hop-by-hop OR end-to-end
• Accounting response represents a transport layer ACK, not an app layer ACK (no error messages)
AAA Proxy Bestiary
• Routing proxies
• Re-direct proxies
• Store and Forward proxies
• Transport layer proxies
Routing Proxy (Auth Only)
[Figure: message sequence among NAS, Routing Proxy and AAA Server. Labels: Auth Request, ACK, Auth Request, Response/ACK, Response/ACK, ACK, ACK]
Notes:
• Routing proxy means that transport dynamics are hop-by-hop
• Transport self-clocks hop-by-hop but not necessarily end-to-end
• End-to-end transport dynamics depend on details of proxy buffer management (back pressure)
• AAA server can often piggyback ACK with Response if it is not overloaded
• Proxy may send delayed ACK to Auth Request if AAA server is sufficiently far away
Store & Forward Proxy (Actng Only)
[Figure: message sequence among NAS, S&F Proxy and Actng Server. Labels: Actng. Start, Response/ACK, Actng. Start, Response/ACK, ACK, ACK]
Notes:
• Store and Forward proxies only used for accounting
• S&F proxy means that transport dynamics are completely hop-by-hop
• No issues with end-to-end self-clocking
• Store and Forward proxy can often piggyback ACK with Response if it is not overloaded, since no forwarding need occur before responding
• Store and Forward proxies are a bad idea since the NAS is fooled into believing that it has received an App layer ACK when this is not the case; NAS may delete accounting record from non-volatile storage
• If Store and Forward proxy stores accounting messages in memory or has moving parts while NAS does not, result can be lower reliability
Re-Direct Proxy (Auth Only)
[Figure: message sequence among NAS, Re-Direct proxy and AAA Server. Labels: Auth Request, Redirect/ACK, ACK, Auth Request, Response/ACK, ACK]
Notes:
• Redirect means that transport dynamics are end-to-end
• Implication: TCP/SCTP transport self-clocks both hop-by-hop and end-to-end
• Redirect proxy can typically piggyback ACK with Redirect if Redirect table kept in memory
• Redirect ACK cannot be piggybacked with second Auth Request since they go to different destinations
• AAA server can often piggyback ACK with Response if not overloaded
• Since the Response will be ACK’d anyway, an application layer ACK of the Response will not add to the packet count
Transport Layer Proxy (Auth Only)
[Figure: message sequence among NAS, Transport Proxy and AAA Server. Labels: Request, Request, Response/ACK, ACK, Response/ACK, ACK]
Notes:
• NAS has separate transport connection for each realm, must know about realms
• Several types of transport proxy; type shown is “transparent”
• With “transparent” Transport layer proxy, transport layer dynamics are end-to-end
• Result: End-to-end self-clocking
• If AAA server sends piggyback’d Response/ACK, so will proxy (no proxy-originated delayed ACKs)
• Resembles behavior of RADIUS proxies (minus the final ACK)!
Congestion Control Principles
• Conservation of packets
• Failover and failback
• Self-clocking
“Conservation of Packets”
• Once you’ve reached the end of the window, don’t send more packets until you have evidence that original packets are no longer transiting the network
– Packets received by destination (ACK)
– Packets lost (Timeout, triplicate ACK)
• Self-clocking occurs when sending rate is limited to rate at which ACKs are received
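The conservation rule amounts to a counter of packets still in the network. A minimal sketch, assuming a fixed window and a hypothetical `WindowedSender` class (real transports also adjust the window itself):

```python
class WindowedSender:
    """Sender that obeys conservation of packets for a fixed window (illustrative)."""

    def __init__(self, window: int) -> None:
        self.window = window   # maximum packets allowed in the network at once
        self.in_flight = 0     # packets sent and not yet known to have left

    def try_send(self) -> bool:
        """Send one packet only if the window is not yet full."""
        if self.in_flight >= self.window:
            return False
        self.in_flight += 1
        return True

    def on_ack(self) -> None:
        """An ACK is evidence that one packet reached the destination."""
        self.in_flight = max(0, self.in_flight - 1)

    def on_loss(self) -> None:
        """A timeout or triplicate ACK is evidence that one packet was lost."""
        self.in_flight = max(0, self.in_flight - 1)
```

Because `try_send` succeeds only after `on_ack` or `on_loss` frees a slot, the sending rate follows the rate at which ACKs arrive, which is the self-clocking property.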
“Conservation of Packets” Applied to AAA: Failover/Failback
[Figure: five NASes, Routers, the Internet, AAA Proxy 1, AAA Proxy 2, AAA Server 1 and AAA Server 2, with a region labelled “Control Volume”]
Notes:
• NAS should not re-transmit to Proxy 1 until RTO(NAS-proxy1) has elapsed, or triplicate ACKs received
• NAS should not failover to Proxy 2 until n × RTO(NAS-proxy1) has elapsed
• NAS cannot handle failover from AAA server 1 to 2 because it does not estimate RTO(NAS-AAAserver1)
• AAA proxy 1 should not failover to AAA Server 2 until n × RTO(proxy1-server1) has elapsed
• Not easy to implement failover/failback with TCP
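The retransmit and failover rules in these notes can be sketched with a per-hop RTO estimate. The code below is an illustration under assumptions: it uses the smoothing constants commonly used for TCP retransmission timers (1/8 and 1/4, with RTO = SRTT + 4 × RTTVAR), omits any minimum or maximum clamp, and picks n = 3 and an initial RTO of 3 seconds arbitrarily; the slides do not give values for either.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RtoEstimator:
    """Smoothed RTT/RTO estimate for one hop, e.g. NAS to Proxy 1 (illustrative)."""
    srtt: Optional[float] = None
    rttvar: float = 0.0
    initial_rto: float = 3.0   # assumed value used before any RTT sample exists

    def update(self, sample_s: float) -> None:
        """Fold one RTT measurement (seconds) into the estimate."""
        if self.srtt is None:
            self.srtt = sample_s
            self.rttvar = sample_s / 2
        else:
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - sample_s)
            self.srtt = 0.875 * self.srtt + 0.125 * sample_s

    @property
    def rto(self) -> float:
        if self.srtt is None:
            return self.initial_rto
        return self.srtt + 4 * self.rttvar


def next_action(waited_s: float, rto_s: float, n: int = 3) -> str:
    """Decide what the NAS may do after waiting waited_s for a response."""
    if waited_s < rto_s:
        return "wait"          # too early to conclude the packet left the network
    if waited_s < n * rto_s:
        return "retransmit"    # same proxy, at least one RTO has elapsed
    return "failover"          # n RTOs without a response: try the other proxy
```

The estimator only covers the hop the NAS can measure, which is why the NAS cannot make the same decision for the proxy-to-server hop.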
Self-Clocking
Source: V. Jacobson, “Congestion Avoidance and Control”, ACM SIGCOMM ’88, Vol. 18, No. 4, August 1988
Self-Clocking w/Proxies
[Figure: NAS, Proxy and AAA Server, with two Receive buffer/Send buffer pairs shown]
Goal: NAS can’t advance window until it receives an application layer ACK from the AAA server
Unless send and receive buffers are coupled, no self-clocking!
Hop-by-Hop vs. End-to-End Self Clocking
• Proxy systems consist of two transport connections
– NAS-proxy transport connection
– Proxy-server transport connection
• TCP/SCTP provides hop-by-hop self-clocking
– NAS will only advance the window as it receives ACKs from proxy
– Proxy will only advance the window as it receives ACKs from the AAA Server
• Only transport and re-direct proxy types guarantee end-to-end self-clocking
– Transport proxies: splice together two hop-by-hop connections to simulate end-to-end transport dynamics
– Re-direct proxies: connection is end-to-end after initial re-direct
Hop-by-Hop vs. End-to-End (cont’d)
• Hop-by-hop congestion avoidance does not prevent proxy congestion in other proxy types
– Store & Forward proxies completely decouple the NAS-Proxy connection from the Proxy-Server connection (BAD!)
– Routing proxies do not automatically propagate congestion signals between receive and send buffers
  • Micro level self-clocking not possible
  • Macro level coupling via “back pressure” requires multiple NAS-proxy connections for proper granularity
  • Uber-macro level: application-layer error messages
• Conclusion
– Transport dynamics with proxies at best equal to end-to-end case
– TCP/SCTP transport not sufficient for end-to-end self-clocking with routing or store & forward proxies
Solutions
• Don’t use proxies
– Can we ban S&F proxies altogether?
• Use re-directs
• Use transport proxies
– Looks like a single transport connection with micro scale coupling (individual ACKs)
– Requires extensive application/transport integration
• Routing proxies
– Application layer error messages
  • Simplest solution
  • Does this completely address the proxy congestion issue?
– Macro scale coupling (window) between receive and send buffers (“backpressure”)
  • Only empty receive buffer as fast as send buffer empties
  • Requires separate connections/streams for each realm
  • Without individual connections/streams, no way to enable self-clocking on a per-path basis
  • Problem: with n connections, initial slow-start window is effectively n or 2n
  • Too complex to implement?
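The macro scale coupling for one realm can be sketched with two bounded queues. This is an illustration of the idea only; the class `RealmPath`, its method names and the capacity value are hypothetical, and a real proxy would couple to the transport's socket buffers rather than in-memory queues.

```python
from collections import deque


class RealmPath:
    """One realm's NAS-proxy and proxy-server buffers, coupled by back pressure."""

    def __init__(self, send_capacity: int) -> None:
        self.receive_buffer = deque()   # requests received from the NAS
        self.send_buffer = deque()      # requests awaiting ACK from the AAA server
        self.send_capacity = send_capacity

    def pump(self) -> int:
        """Move requests from receive to send buffer only while there is room."""
        moved = 0
        while self.receive_buffer and len(self.send_buffer) < self.send_capacity:
            self.send_buffer.append(self.receive_buffer.popleft())
            moved += 1
        return moved

    def server_acked(self) -> None:
        """The AAA server acknowledged one request, freeing one send slot."""
        if self.send_buffer:
            self.send_buffer.popleft()
```

Requests that stay in the receive buffer keep the NAS-proxy transport window from advancing, so the NAS slows down to the pace of the proxy-server path. One `RealmPath` per realm is what gives the per-path granularity noted above.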
Application Layer Error Messages
• “Busy”: Proxy/Server too busy to handle additional requests; NAS should failover requests to another proxy/server
• “Forwarding”: Proxy has located AAA server, but timely response is not forthcoming; NAS should wait for final response
• “Can’t Locate”: Proxy can’t locate the AAA server for the indicated realm; NAS should reject access
• “Failover”: Proxy has tried primary server, is failing over to secondary server; NAS should reset app layer timers, not attempt failover to secondary proxy
• “Can’t Forward”: Proxy has tried both primary and secondary AAA servers with no response; NAS should reject access
• “Processing”: Server cannot provide an immediate response to the request; NAS should wait for final response
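The six messages and the NAS reaction to each can be written as a lookup table. The sketch below only restates the list above; the names `ProxyStatus`, `NAS_ACTION` and `nas_action` are hypothetical, and no wire encoding is implied.

```python
from enum import Enum


class ProxyStatus(Enum):
    """Application layer status messages a proxy/server may return (illustrative)."""
    BUSY = "Busy"
    FORWARDING = "Forwarding"
    CANT_LOCATE = "Can't Locate"
    FAILOVER = "Failover"
    CANT_FORWARD = "Can't Forward"
    PROCESSING = "Processing"


# What the NAS should do on receiving each status, per the list above.
NAS_ACTION = {
    ProxyStatus.BUSY: "fail over to another proxy/server",
    ProxyStatus.FORWARDING: "wait for final response",
    ProxyStatus.CANT_LOCATE: "reject access",
    ProxyStatus.FAILOVER: "reset app layer timers; do not fail over to secondary proxy",
    ProxyStatus.CANT_FORWARD: "reject access",
    ProxyStatus.PROCESSING: "wait for final response",
}


def nas_action(message: str) -> str:
    """Map a status message string to the recommended NAS behavior."""
    return NAS_ACTION[ProxyStatus(message)]
```

Only “Busy” triggers a failover at the NAS; every other message tells the NAS either to wait or to give up, which avoids the inappropriate proxy switching described earlier.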
AAA Reliable Transport “Profile”
• What is a transport profile? A recommendation on how to use transport within AAA
• Efficiency
– Persistent connections/pipelining
• Nagle algorithm enabled
– AAA packets often smaller than MSS
– Useful for transport layer batching when packets are spaced close together, but…
– Typically no additional packets for NAS to send in response to AAA server/proxy ACK
• CWND validation
– RFC 2861
– With high inter-packet spacings, RTT measurements are made so infrequently that network conditions may change between measurements
– Don’t let CWND build as a result of (now) invalid measurements, decay it instead
– Result: CWND=1 or 2 most of the time, AAA operates in perpetual “slow start”
• Congestion Manager
– Draft-ietf-ecm-cm-03.txt
– Enables multiple AAA connections to share state with each other, and possibly with the application as well
– May be helpful for failover/failback
Preliminary Recommendations
• TCP: Feasible
– Recommended practice: Nagle algorithm enabled, Congestion window validation, Congestion Manager
– More work needed on failover/failback
• SCTP: Feasible
– Recommended practice: Nagle algorithm enabled, Congestion window validation, Congestion Manager
– Failover features, built-in support for multiple streams
– More work needed on failback
• UDP: More investigation needed
– Only marginally fewer packets than TCP/SCTP, except where RTT(proxy-server) > delayed ACK timer
– Probably can only offer simple windowing (CWND=1, 2) without heading down a slippery slope
– Would require per-realm RTT, RTO logic for failover/failback (congestion manager)
Summary
• TCP/SCTP feasible for use with AAA
– More work needed on failover/failback
– Which transport(s) should be mandatory?
• Reliable Transport “profile” recommended
– Persistent connections/pipelining
– Nagle algorithm enabled
– Congestion window validation
– RTO validation
– Congestion Manager
• UDP transport needs more investigation
• Proxies complicate analysis of AAA transport behavior
– End-to-end congestion avoidance not guaranteed in proxy environments, even when reliable transport is utilized
– Microscopic self-clocking difficult in routing proxies
– Application layer error messages recommended
– Use of re-direct proxies encouraged