ripe52 plenary kroot anycast

Upload: john-thomas-rogan

Post on 04-Apr-2018

221 views

Category:

Documents


0 download

TRANSCRIPT

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    1/42

    1RIPE 52, 26 April 2006 http://www.ripe.netLorenzo Colitti

    Effects of anycast on K-rootperformance

    Status update

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    2/42

    RIPE 52, 26 April 2006 http://www.ripe.net 2Lorenzo Colitti

    Work presented @ RIPE 51

    Evaluated anycast goals:

    - Latency

    Measured by querying from TTM

    - Load balancing

    Looked at activity logs

    - Stability Looked at instance switches seen by servers

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    3/42

    RIPE 52, 26 April 2006 http://www.ripe.net 3Lorenzo Colitti

    RIPE 51 results

    Anycast is good for latency

    - TTM saw very good performance

    - BGP almost always picked the right node

    Although local nodes seem to confuse things

    Not so good for load balancing

    - Wide variation in node load

    Instance switches are infrequent

    - But there are pathological switchers

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    4/42

    RIPE 52, 26 April 2006 http://www.ripe.net 4Lorenzo Colitti

    Unanswered Questions

    Only 2 global nodes measured, and only on 2 occasions

    - Do the same results hold for the current 5 nodes?

    - Are the results consistent over time?

    Did measurement point bias affect the results?

    - TTM boxes are mostly based in Europe

    Pathological instance switchers- What causes this?

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    5/42

    5RIPE 52, 26 April 2006 http://www.ripe.netLorenzo Colitti

    Latency measurements using 5 nodes

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    6/42

    RIPE 52, 26 April 2006 http://www.ripe.net 6Lorenzo Colitti

    Latency with TTM

    Ideally, BGP should choose the node with the lowest RTT.Does it?

    Measure RTTs from the TTM boxes to:

    - Anycasted IP address (193.0.14.129)

    - Service interfaces of global nodes (not anycasted)

    Compare results

    To make sure this is apples to apples:- Are paths to service interfaces the same as to production IP, if picked?

    - According to the RIS, mostly yes

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    7/42

    RIPE 52, 26 April 2006 http://www.ripe.net 7Lorenzo Colitti

    Latency with TTM: methodology

    Send DNS queries from all test-boxes

    - For each K-root IP:

    Do a dig hostname.bind Extract RTT

    Take minimum value of 5 queries

    - Compare results of anycast IP with those of service interfaces

    = RTTK/ min(RTTi)

    - 1: BGP picks the right node- > 1: BGP picks the wrong node

    - < 1: local node?

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    8/42

    RIPE 52, 26 April 2006 http://www.ripe.net 8Lorenzo Colitti

    Latency with TTM: results (5 nodes)

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    9/42

    RIPE 52, 26 April 2006 http://www.ripe.net 9Lorenzo Colitti

    Whats up with tt103?

    tt103 is in Yokohama

    - Tokyo is 2ms away

    But it goes to Delhi ... through Tokyo, Los Angeles and Hong Kong

    RTT = 416 ms, = 208

    results/200604120000 $ cat tt103.ripe.net193.0.14.129 k1.delhi 422 k1.delhi 416 k1.delhi 423 k1.delhi 428 k1.delhi 419[...]203.119.22.1 k1.tokyo 2 k1.tokyo 2 k1.tokyo 2 k1.tokyo 2 k1.tokyo 2

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    10/42

    RIPE 52, 26 April 2006 http://www.ripe.net 10Lorenzo Colitti

    Problem: different prepending lengths

    Got BGP paths from AS2497

    - Thanks to Matsuzaki and Randy Bush

    Problem: bad interaction of different prepending lengths

    - Tokyo:

    2914 25152 25152 25152 25152 4713 25152 25152 25152 25152

    6461 25152 25152 25152 25152

    - Delhi:

    2200 9430 25152 25152

    We need to fix prepending on Tokyo node

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    11/42

    RIPE 52, 26 April 2006 http://www.ripe.net 11Lorenzo Colitti

    5-node vs 2-node results

    2 nodes 5 nodes

    Essentially no different

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    12/42

    12RIPE 52, 26 April 2006 http://www.ripe.netLorenzo Colitti

    Consistency of results over time

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    13/42

    RIPE 52, 26 April 2006 http://www.ripe.net 13Lorenzo Colitti

    Consistency of over time

    Is this a chance event or is this behaviour consistent?

    Plot average over time

    - Collect for all test-boxes every hour

    - Take average (excluding tt103)- Plot over time

    Results:

    - Average: 1.25, median: 1.22

    - BGP is fairly consistent

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    14/42

    RIPE 52, 26 April 2006 http://www.ripe.net 14Lorenzo Colitti

    Average value of over time

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    15/42

    15RIPE 52, 26 April 2006 http://www.ripe.netLorenzo Colitti

    Is TTM data meaningful?

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    16/42

    RIPE 52, 26 April 2006 http://www.ripe.net 16Lorenzo Colitti

    TTM: probe locations

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    17/42

    RIPE 52, 26 April 2006 http://www.ripe.net 17Lorenzo Colitti

    Measuring from TTM and from servers

    TTM latency measurements not optimal

    - Locations biased towards Europe

    - Only limited number of probes (~100)

    - Do not necessarily reflect K client distribution

    How do we fix this?

    Ping servers from clients

    - Much larger data set (~100 -> ~ 1M)

    - Measures the effect Ks actual clients

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    18/42

    RIPE 52, 26 April 2006 http://www.ripe.net 18Lorenzo Colitti

    Methodology

    Methodology:

    - Analyse packet traces on K global nodes

    - Extract list of IP addresses, merge lists

    - Ping all addresses from all servers

    - Plot distribution of

    Results:

    - 6 hours of data

    - 246,769,005 queries

    - 845,328 IP addresses

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    19/42

    RIPE 52, 26 April 2006 http://www.ripe.net 19Lorenzo Colitti

    CDF of seen from servers

    Results not as good as seen by TTM

    - Only 50% of clients have = 1

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    20/42

    RIPE 52, 26 April 2006 http://www.ripe.net 20Lorenzo Colitti

    Latency: conclusions

    5-node results comparable to 2-node results

    TTM clients (= Europe) very well served by K

    If we look at total K client population, things not so rosy

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    21/42

    21RIPE 52, 26 April 2006 http://www.ripe.netLorenzo Colitti

    Incremental benefit of nodes

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    22/42

    RIPE 52, 26 April 2006 http://www.ripe.net 22Lorenzo Colitti

    How many nodes are enough?

    Does it make sense to deploy more instances?

    - Have we reached the point of diminishing returns?

    Evaluate benefit of existing instances

    - Hope this will tell us at what point in the curve were on

    How do we measure the benefit of an instance?

    - We can quantify how much performance would worsen if thatinstance did not exist

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    23/42

    RIPE 52, 26 April 2006 http://www.ripe.net 23Lorenzo Colitti

    Methodology

    Assume optimal instance selection

    - That is, every client sees closest instance

    - This is an upper bound to benefit

    Consistent with our aim of seeing whether we have reached the point ofdiminishing returns

    For every client, see how much its performance wouldsuffer if a given instance did not exist

    - We can do this because we ping all clients from all instances

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    24/42

    RIPE 52, 26 April 2006 http://www.ripe.net 24Lorenzo Colitti

    Loss factor

    Loss factor determines how much a client wouldsuffer if an instance were knocked out

    If = 1, the client would see no loss in performance If = 2, the client sees double RTT

    Plot CCDF of for every node

    This gives us an idea of how important a node is

    RTTknockoutRTTbest

    =

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    25/42

    RIPE 52, 26 April 2006 http://www.ripe.net 25Lorenzo Colitti

    Results: LINX

    Not much benefit on its own

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    26/42

    RIPE 52, 26 April 2006 http://www.ripe.net 26Lorenzo Colitti

    Results: AMS-IX

    Not much benefit on its own

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    27/42

    RIPE 52, 26 April 2006 http://www.ripe.net 27Lorenzo Colitti

    Results: LINX and AMS-IX

    But wait, LINX and AMS-IX are important taken together

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    28/42

    RIPE 52, 26 April 2006 http://www.ripe.net 28Lorenzo Colitti

    Results: Tokyo

    Few clients, but very badly served by other nodes

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    29/42

    RIPE 52, 26 April 2006 http://www.ripe.net 29Lorenzo Colitti

    Results: NAP

    Moderately better for some clients

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    30/42

    RIPE 52, 26 April 2006 http://www.ripe.net 30Lorenzo Colitti

    Results: Delhi

    Not very effective

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    31/42

    RIPE 52, 26 April 2006 http://www.ripe.net 31Lorenzo Colitti

    Incremental benefit of a node

    Take values for all clients

    Take the weighted average, where the weights are the

    number of queries seen by each client

    iiQi

    iQiB =

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    32/42

    RIPE 52, 26 April 2006 http://www.ripe.net 32Lorenzo Colitti

    Europe

    NAP Delhi

    Tokyo

    Values of B

    B = 23.1 B = 14.1

    B = 2.5 B = 1.01

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    33/42

    RIPE 52, 26 April 2006 http://www.ripe.net 33Lorenzo Colitti

    Does anycast provide any benefit?

    What if we didnt do anycast at all?

    Knock out all

    except LINX:dark red curve

    B = 18.8

    For K, anycastworks well

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    34/42

    34RIPE 52, 26 April 2006 http://www.ripe.netLorenzo Colitti

    Stability

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    35/42

    RIPE 52, 26 April 2006 http://www.ripe.net 35Lorenzo Colitti

    Stability

    RIPE 51 presentation concluded that instance switchesare not a problem

    Is this still the case with 5 nodes?

    - The more nodes, the more routes in BGP and the more churn

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    36/42

    RIPE 52, 26 April 2006 http://www.ripe.net 36Lorenzo Colitti

    Stability results

    2 nodes (RIPE 51)

    24 hours of data:

    - 527,376,619 queries

    - 30,993 switches (~0.006%)

    - 884,010 IPs seen

    - 10,557 switchers (~1.1%)

    5 nodes

    ~5 hours of data:

    - 246,769,005 queries

    -150,938 switches (0.06%)

    - 845,328 IPs seen

    - 2,830 switchers (0.33%)

    Still does not seem a serious problem

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    37/42

    37RIPE 52, 26 April 2006 http://www.ripe.netLorenzo Colitti

    Questions?

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    38/42

    RIPE 52, 26 April 2006 http://www.ripe.net 38Lorenzo Colitti

    Oh K can you see

    Problem pointed out by Randy Bush

    http://www.merit.edu/mail.archives/nanog/2005-10/msg01226.html

    Nasty interaction of no-export with anycast

    - We use no-export to prevent local nodes from leaking- If we have a customer AS

    Whose providers all peer with a local node

    - And honour no-export

    They might see no route at all!

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    39/42

    RIPE 52, 26 April 2006 http://www.ripe.net 39Lorenzo Colitti

    ISP1

    ISP2

    Customer

    25152 i(no-export)

    AS25152

    25152 i(no-export)

    RK

    RK announces 193.0.14.0/24 with no-export ISP1 and ISP2 honour no-export Customer has no route to 193.0.14.0/24

    Oh K can you see (2)

    Klocal

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    40/42

    RIPE 52, 26 April 2006 http://www.ripe.net 40Lorenzo Colitti

    Extent of the problem

    Solution: announce 193.0.14.0/23 without no-export @ams-ix

    Was this a problem?

    See what happened when prefix was announced

    Red: AMS-IX queries per second Green: BGP activity

    Nothing here

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    41/42

    RIPE 52, 26 April 2006 http://www.ripe.net 41Lorenzo Colitti

    RIPE 51 results (2 nodes)

    24 hours of data:

    - 527,376,619 queries- 30,993 node switches (~0.006%)

    - 884,010 IPs seen- 10,557 switching IPs (~1.1%)

    Is this still the situation with 5 nodes?

    - The more routes competing in BGP, the more churn

  • 7/30/2019 Ripe52 Plenary Kroot Anycast

    42/42

    42RIPE 52, 26 April 2006 http://www.ripe.netLorenzo Colitti

    Covering prefix announcement