Impact of Configuration Errors on DNS Robustness. V. Pappas *, Z. Xu *, S. Lu *, D. Massey **, A. Terzis ***, L. Zhang *. * UCLA, ** Colorado State, *** Johns Hopkins


Impact of Configuration Errors on DNS Robustness

V. Pappas *

Z. Xu *, S. Lu *, D. Massey **, A. Terzis ***, L. Zhang *

* UCLA, ** Colorado State, *** Johns Hopkins


Motivation

• DNS: part of the core Internet infrastructure
  – Applications: web, e-mail, E.164, CDNs …
• DNS: considered a very reliable system
  – Works almost always
• Question: is DNS a robust system?
  – User-perceived robustness
  – System robustness
  – Are they the same?


Motivation: Short Answer

“Microsoft's websites were offline for up to 23 hours -- the most dramatic snafu to date on the Internet -- because of an equipment misconfiguration”
-- Wired News, Jan 2001

– Thousands or even millions of users affected
– All due to a single DNS configuration error


Related Work

• Traffic & implementation error studies:
  – Danzig et al. [SIGCOMM92]: bugs
  – CAIDA: traffic & bugs
• Performance studies:
  – Jung et al. [IMW01]: caching
  – Cohen et al. [SAINT01]: proactive caching
  – Liston et al. [IMW02]: diversity
• Server availability:
  – To appear [OSDI04, IMC04]


Our Work: Study DNS Robustness

• Classify DNS operational errors:
  – Study known errors
  – Identify new types of errors
• Measure their pervasiveness
• Quantify their impact on DNS
  – Availability
  – Performance


Outline

• DNS Overview

• Measurement Methodology

• DNS Configuration Errors
  – Example Cases
  – Measurement Results

• Discussion & Summary


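Background

[Figure: DNS name tree. Top-level domains net, com, uk, ca, jp; foo below com; buz and bar below foo; bar1, bar2, bar3 below bar. The bar zone is marked together with its name servers and resource records.]

Zone:
– Occupies a contiguous subspace of the name tree
– Served by the same name servers

Resource records of the bar zone:

bar.foo.com. NS ns1.bar.foo.com.
bar.foo.com. NS ns2.bar.foo.com.
bar.foo.com. NS ns3.bar.foo.com.
bar.foo.com. MX mail.bar.foo.com.
www.bar.foo.com. A 10.10.10.10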


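[Figure: resolution of www.bar.foo.com. A client asks its caching server, which queries the root, com, foo, and bar zones in turn.]

client to caching server: asking for www.bar.foo.com
root zone: referral: com NS RRs, com A RRs
com zone: referral: foo NS RRs, foo A RRs
foo zone: referral: bar NS RRs, bar A RRs
bar zone: answer: www.bar.foo.com A 10.10.10.10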
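The referral chain on this slide can be sketched in a few lines of Python. This is a toy model, not a real resolver: the `ZONES` table, its layout, and the `resolve` function are assumptions made for illustration, and only the names and the address come from the slide.

```python
# Toy in-memory view of the zones on the slide (hypothetical data layout).
# Each zone lists the child zones it delegates to and the A records it holds.
ZONES = {
    ".": {"delegations": ["com."], "records": {}},
    "com.": {"delegations": ["foo.com."], "records": {}},
    "foo.com.": {"delegations": ["bar.foo.com."], "records": {}},
    "bar.foo.com.": {
        "delegations": [],
        "records": {"www.bar.foo.com.": "10.10.10.10"},
    },
}


def resolve(name, zones=ZONES):
    """Follow referrals from the root until a zone answers or no referral fits.

    Returns (address_or_None, list_of_zones_visited).
    """
    zone = "."
    path = [zone]
    while True:
        data = zones.get(zone)
        if data is None:
            # Delegated zone is not served at all in this toy model.
            return None, path
        if name in data["records"]:
            return data["records"][name], path
        # A referral applies when the query name falls under a delegated child.
        matches = [d for d in data["delegations"]
                   if name == d or name.endswith("." + d)]
        if not matches:
            return None, path
        zone = matches[0]
        path.append(zone)


if __name__ == "__main__":
    print(resolve("www.bar.foo.com."))
```

Each loop iteration corresponds to one referral in the figure; a real caching server would also cache the NS and A records it learns along the way.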


Infrastructure RRs

[Figure: the same infrastructure RRs appear at the parent zone (com) and at the child zone (foo.com).]

foo.com. NS ns1.foo.com.
foo.com. NS ns2.foo.com.
foo.com. NS ns3.foo.com.
ns1.foo.com. A 1.1.1.1
ns2.foo.com. A 2.2.2.2
ns3.foo.com. A 3.3.3.3

• NS Resource Record:
  – Provides the names of a zone’s authoritative servers
  – Stored both at the parent and at the child zone
• A Resource Record:
  – Associated with an NS resource record
  – Stored at the parent zone (glue A record)


What Affects DNS Availability

• Name Servers:
  – Software failures
  – Network failures
  – Scheduled maintenance tasks
• Infrastructure Resource Records:
  – Availability of these records
  – Configuration errors (focus of our work)


Classification of Measured Errors

Inconsistency: the configuration of infrastructure RRs does not correspond to the actual authoritative name servers.
– Lame Delegation
– Delegation Inconsistency

Dependency: two or more name servers share a common point of failure.
– Diminished Redundancy
– Cyclic Dependency
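As an illustration of the inconsistency class, the sketch below compares the NS set stored at the parent zone with the NS set stored at the child zone. The function name `delegation_mismatch` and the normalization rules are assumptions for this example; the slides do not describe how the comparison was implemented.

```python
def delegation_mismatch(parent_ns, child_ns):
    """Compare the NS names held at the parent with those held at the child.

    Names are compared case-insensitively and without the trailing dot
    (an assumption of this sketch). Returns the names found on one side only.
    """
    parent = {name.lower().rstrip(".") for name in parent_ns}
    child = {name.lower().rstrip(".") for name in child_ns}
    return {
        "only_at_parent": sorted(parent - child),
        "only_at_child": sorted(child - parent),
    }


if __name__ == "__main__":
    # Hypothetical data: the child zone dropped ns3 but the parent was not updated.
    parent = ["ns1.foo.com.", "ns2.foo.com.", "ns3.foo.com."]
    child = ["NS1.foo.com.", "ns2.foo.com."]
    print(delegation_mismatch(parent, child))
```

Two empty lists mean the parent and child agree; any name listed under `only_at_parent` is a candidate for a lame delegation and would need a follow-up query to confirm.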


What is Measured?

• Frequency of configuration errors:
  – System parameters: TLDs, DNS level, zone size (i.e., the number of delegations)
• Impact on availability:
  – Number of servers lost due to these errors
  – Zone’s availability: probability of resolving a name
• Impact on performance:
  – Total time to resolve a query
    • Starting from the query issuing time
    • Finishing at the query final answer time
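The performance metric above, time from the first query to the final answer, can be sketched as follows. The event format `(timestamp_seconds, kind, query_name)` is a hypothetical simplification of a packet trace, not the format used in the study.

```python
def resolution_times(events):
    """Compute total resolution time per query name.

    events: iterable of (timestamp_seconds, kind, query_name) tuples, where
    kind is "query" or "answer" (a hypothetical trace format).
    The time runs from the earliest query to the latest answer for a name.
    """
    first_query = {}
    final_answer = {}
    for ts, kind, name in events:
        if kind == "query":
            first_query[name] = min(ts, first_query.get(name, ts))
        elif kind == "answer":
            final_answer[name] = max(ts, final_answer.get(name, ts))
    # Names that never received an answer are left out of the result.
    return {name: final_answer[name] - first_query[name]
            for name in first_query if name in final_answer}


if __name__ == "__main__":
    trace = [
        (10.0, "query", "www.bar.foo.com."),   # sent to a server that never replies
        (13.0, "query", "www.bar.foo.com."),   # retry at another server
        (13.5, "answer", "www.bar.foo.com."),
    ]
    print(resolution_times(trace))
```

In the usage example the retry after a silent server dominates the total, which is the kind of delay a lame delegation introduces.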


Measurement Methodology

• Error frequency and availability impact:
  – 3 sets of active measurements
    • Random set of 50K zones
    • 20K zones that allow zone transfers
    • 500 popular zones
• Performance impact:
  – 2 sets of passive measurements: 1-week DNS packet traces


Lame Delegation

[Figure: the com zone delegates foo.com to the servers A.foo.com and B.foo.com.]

foo.com. NS A.foo.com.
foo.com. NS B.foo.com.
A.foo.com. A 1.1.1.1
B.foo.com. A 2.2.2.2

Types of lame delegation:
1) Non-existing server -- 3 seconds perf. penalty
2) DNS error code -- 1 RTT perf. penalty
3) Useless referral -- 1 RTT perf. penalty
4) Non-authoritative answer (cached)
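The four lame-delegation cases and their penalties can be written down as a small classifier. This is a sketch under stated assumptions: the boolean inputs, the names `LameKind`, `classify`, and `penalty_seconds`, and the zero penalty for case 4 are illustrative choices; only the 3-second and 1-RTT figures come from the slide.

```python
from enum import Enum


class LameKind(Enum):
    NON_EXISTING_SERVER = 1
    ERROR_CODE = 2
    USELESS_REFERRAL = 3
    NON_AUTHORITATIVE_ANSWER = 4


TIMEOUT_SECONDS = 3.0  # penalty for a non-existing server, as stated on the slide


def classify(responded, error_code, authoritative, has_answer):
    """Map a simplified view of a server's reply to one of the four cases.

    Returns None when the reply looks like a normal authoritative response.
    """
    if not responded:
        return LameKind.NON_EXISTING_SERVER
    if error_code:
        return LameKind.ERROR_CODE
    if authoritative:
        return None
    if has_answer:
        return LameKind.NON_AUTHORITATIVE_ANSWER
    return LameKind.USELESS_REFERRAL


def penalty_seconds(kind, rtt_seconds):
    """Extra delay caused by contacting a lame server once."""
    if kind is LameKind.NON_EXISTING_SERVER:
        return TIMEOUT_SECONDS
    if kind in (LameKind.ERROR_CODE, LameKind.USELESS_REFERRAL):
        return rtt_seconds
    # Assumption: the slide gives no penalty for a cached, non-authoritative
    # answer (or for a healthy reply), so this sketch charges none.
    return 0.0


if __name__ == "__main__":
    kind = classify(responded=False, error_code=False,
                    authoritative=False, has_answer=False)
    print(kind, penalty_seconds(kind, rtt_seconds=0.06))
```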


Lame Delegation Results

[Figures not recovered. Surviving plot annotations: 0.06 sec, 0.4 sec, 3 sec, 50%.]

• Error Frequency:
  – 15% of the zones
  – 8% for the 500 most popular zones
  – Independent of the zone’s size, varies a lot per TLD
• Impact:
  – 70% of the zones with errors lose half or more of the authoritative servers
  – 8% of the queries experience increased response times (up to an order of magnitude) due to lame delegation


Diminished Server Redundancy

[Figure: the com zone delegates foo.com to the servers A.foo.com and B.foo.com.]

foo.com. NS A.foo.com.
foo.com. NS B.foo.com.
A.foo.com. A 1.1.1.1
B.foo.com. A 2.2.2.2

Levels at which the servers can share a point of failure:
A) Network level: belong to the same subnet
B) Autonomous system level: belong to the same AS
C) Geographic location level: belong to the same city
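The network-level check (A) is easy to sketch with Python's standard `ipaddress` module. The function name `share_prefix` and the default /24 prefix length are choices made for this example; the AS-level and city-level checks need external mapping data and are not covered here.

```python
import ipaddress


def share_prefix(addresses, prefix_len=24):
    """Return True when every address falls inside the same /prefix_len network."""
    networks = {
        # strict=False lets us pass a host address and get its enclosing network.
        ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        for addr in addresses
    }
    return len(networks) == 1


if __name__ == "__main__":
    print(share_prefix(["1.1.1.1", "1.1.1.200"]))  # both in 1.1.1.0/24
    print(share_prefix(["1.1.1.1", "2.2.2.2"]))    # addresses from the slide
```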


Diminished Server Redundancy Results

• Error Frequency:
  – 45% of all zones have all servers in the same /24 subnet
  – 75% of all zones have servers in the same AS
  – Large & popular zones: better AS and geo diversity
• Impact:
  – Less than 99.9% availability: all servers in the same /24 subnet
  – More than 99.99% availability: 3 servers at different ASs or different cities
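One way to see why a shared subnet caps availability is a simple probabilistic model. The sketch below is an assumption-laden illustration, not the method behind the measured figures: it assumes servers fail independently except for one shared component (for example, the common subnet), and the input probabilities are made up.

```python
def zone_availability(server_availability, n_servers, shared_availability=1.0):
    """Probability that at least one server of a zone is reachable.

    Assumptions of this sketch: each server is up independently with
    probability server_availability, and all servers additionally depend on
    one shared component that is up with probability shared_availability.
    """
    at_least_one_up = 1.0 - (1.0 - server_availability) ** n_servers
    return shared_availability * at_least_one_up


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from the study.
    print(zone_availability(0.99, 3))                             # no shared component
    print(zone_availability(0.99, 3, shared_availability=0.999))  # shared subnet
```

Under this model, adding servers behind the same shared component cannot lift the zone above that component's own availability, which matches the direction of the results above.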


Cyclic Zone Dependency (1)

[Figure: the com zone delegates foo.com to the servers A.foo.com and B.foo.com.]

At the parent zone (com):
foo.com. NS A.foo.com.
foo.com. NS B.foo.com.
A.foo.com. A 1.1.1.1
(the A glue RR for B.foo.com is missing)

At the foo.com zone:
B.foo.com. A 2.2.2.2

– B.foo.com depends on A.foo.com
– If A.foo.com is unavailable, then B.foo.com is too


Cyclic Zone Dependency (2)

[Figure: the com zone delegates foo.com and bar.com; each zone names one server that lives in the other zone.]

foo.com. NS A.foo.com.
foo.com. NS B.bar.com.
A.foo.com. A 1.1.1.1

bar.com. NS A.bar.com.
bar.com. NS B.foo.com.
A.bar.com. A 2.2.2.2

– The foo.com zone seems correctly configured
– The combination of the foo.com and bar.com zones is wrongly configured
– The B servers depend on the A servers
– If A.foo and A.bar are unavailable, the B addresses are unresolvable
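A cyclic zone dependency can be found by treating zones as nodes of a directed graph and searching for a cycle. The sketch below is one possible approach, not the tool used in the study: the `deps` mapping (zone to the zones that must be resolved to reach its glue-less servers) and the function `find_cycle` are assumptions for illustration.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of zones, or None if there is none.

    deps maps a zone to the set of zones it depends on, i.e. the zones that
    hold the addresses of its name servers when no glue A record is available.
    Uses a depth-first search with the usual three-state marking.
    """
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {}
    stack = []

    def visit(zone):
        state[zone] = IN_PROGRESS
        stack.append(zone)
        for nxt in sorted(deps.get(zone, ())):
            s = state.get(nxt, UNSEEN)
            if s == IN_PROGRESS:
                # Found a back edge: the cycle is the stack from nxt onward.
                return stack[stack.index(nxt):] + [nxt]
            if s == UNSEEN:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        state[zone] = DONE
        return None

    for zone in sorted(deps):
        if state.get(zone, UNSEEN) == UNSEEN:
            found = visit(zone)
            if found:
                return found
    return None


if __name__ == "__main__":
    # The two-zone example from the slide: B.bar.com serves foo.com and
    # B.foo.com serves bar.com, and neither B server has a glue A record.
    print(find_cycle({"foo.com": {"bar.com"}, "bar.com": {"foo.com"}}))
```

The single-zone case from the previous slide shows up as a self-loop, for example `{"foo.com": {"foo.com"}}`.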


Cyclic Zone Dependency Results

• Error Frequency:
  – 2% of the zones
  – None of the 500 most popular zones
• Impact:
  – 90% of the zones with cyclic dependency errors lose 25% (or even more) of their servers
  – 2 or 4 zones are involved in most errors


Discussion: User-Perceived != System Robustness

• User-perceived robustness:
  – Data replication: only one server is needed
  – Data caching: temporarily masks infrastructure failures
  – Popular zones: fewer configuration errors
• System robustness:
  – Fewer available servers: due to inconsistency errors
  – Fewer redundant servers: due to dependency errors


Discussion: Why so many errors?

• Superficially: due to operators:
  – Unaware of these errors
  – Lack of coordination
    • Parent-child zone, secondary servers hosting
• Fundamentally: due to protocol design:
  – Lack of mechanisms to handle these errors
    • Proactively or reactively
  – Design choices that embrace some of them:
    • Name servers are identified by names
    • Glue NS & A records necessary to set up the DNS tree


Summary

• DNS operational errors are widespread
• DNS operational errors affect availability:
  – 50% of the servers lost
  – Less than 99.9% availability
• DNS operational errors affect performance:
  – 1 or even 2 orders of magnitude
• DNS system robustness is lower than user perception
  – Due to protocol design, not just operator errors


Ongoing Work

• Reactive mechanisms:
  – DNS Troubleshooting [NetTs 04]
• Proactive mechanisms:
  – Enhancing DNS replication & caching


Thank You!!!
