AWS re:Invent 2016: Moving Mountains: Netflix's Migration into VPC (NET304)
TRANSCRIPT
© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Andrew Braham, Manager - Cloud Network Engineering, Netflix
Laurie Ferioli, Senior Program Manager, Netflix
December 1, 2016
Moving Mountains: Netflix’s Migration into VPC
NET304
2008
2010 2011 2012 2013 2014
Why.
Learnings.
How.
What.
Security.
Networking.
Configurability.
Diagnostics.
VPC Advantages.
Netflix Ecosystem.
Lots and lots.
10s of critical tools.
Delivery.
Monitoring.
100s of databases & ELBs.
1,000s of services.
10,000s of instances.
Migration Management.
If “plan A” didn’t work,
the alphabet has 25 more letters.
Guiding Principles.
Seamless to engineers.
Velocity of innovation.
Opportunistic improvements.
Service Classification.
Small non-critical
Large non-critical
Small critical
Large critical
Critical to members
Size
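The two axes above (size, and criticality to members) give four buckets. A minimal sketch of such a classifier follows; the `classify` function, the instance-count measure of size, and the threshold of 100 are hypothetical and are not taken from the talk.

```python
def classify(instance_count: int, critical_to_members: bool,
             large_threshold: int = 100) -> str:
    """Place a service in one of the four quadrants.

    Assumption: "size" is measured by instance count and the threshold
    is arbitrary; the slides do not define either.
    """
    size = "Large" if instance_count >= large_threshold else "Small"
    criticality = "critical" if critical_to_members else "non-critical"
    return f"{size} {criticality}"


print(classify(500, True))   # Large critical
print(classify(3, False))    # Small non-critical
```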
2016 – The Migration.
Jan Feb Mar Apr May June Jul Aug Sept Oct
Infrastructure
Large non-critical apps
Small non-critical apps
The long poles – services with long migrations
Large critical apps
Small critical apps
Cleanup
VPC Exploration.
EXPECTATIONS.
SURPRISES.
Primary Goals.
Identify environmental differences.
Alignment on desired VPC end state.
Develop migration strategy.
Technical Challenges.
Network Routing.
DNS.
Security Groups.
Account Rationale.
Security compartmentalization.
Administrative Domain.
Rate Limit Restrictions.
Capacity Constraints.
Account Classifiers.
Business Purpose.
Operational Model.
User Access.
Regional Routing.
PACKET KUNG-FU.
ClassicLink.
ClassicLink is a feature that allows EC2-Classic instances to communicate directly with instances in a single VPC in the same region.
IP Addressing Allocation.
AWS EC2-Classic utilizes 10.0.0.0/8.
Globally non-overlapping IP addresses.
Network Size.
RFC 6598.
The 100.64.0.0/10 network is a reserved block to facilitate Carrier-Grade Network Address Translation (CGN).
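The addressing choice above can be checked with Python's standard `ipaddress` module. This is only a sketch: the helper name `overlaps_classic` and the sample /16 blocks are illustrative assumptions, not values from the talk.

```python
import ipaddress

CLASSIC = ipaddress.ip_network("10.0.0.0/8")   # range the slides attribute to AWS
CGN = ipaddress.ip_network("100.64.0.0/10")    # RFC 6598 block


def overlaps_classic(cidr: str) -> bool:
    """Return True if the candidate block collides with 10.0.0.0/8."""
    return ipaddress.ip_network(cidr).overlaps(CLASSIC)


print(CGN.num_addresses)                       # 4194304
print(len(list(CGN.subnets(new_prefix=16))))   # 64 possible /16 blocks
print(overlaps_classic("100.64.0.0/16"))       # False
print(overlaps_classic("10.20.0.0/16"))        # True
```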
IP Addressing Reservation.
Cloud IP (VPC EIP API).
ENI Auto-attach.
VPC Subnet Layout.
External Subnets.
Internal Subnets.
Partner Subnets.
VPC Subnet Layout.
/16
External Subnets.
Internal Subnets.
Partner Subnets.
VPC Subnet Layout.
External Subnets.
Internal Subnets.
Partner Subnets.
/18 /18
/18 /18
VPC Subnet Layout.
/18 /18
/18 /20 /20
/20 /20
External Subnets.
Internal Subnets.
Partner Subnets.
VPC Subnet Layout.
/18 /18
/18 /20
/20 /20
/22 /22
/22 /22
External Subnets.
Internal Subnets.
Partner Subnets.
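The carving shown in these builds (a /16 split into /18s, one /18 split into /20s, one /20 split into /22s) can be sketched with the standard `ipaddress` module. The VPC CIDR `100.64.0.0/16`, the function name `carve_vpc`, and the assignment of the larger blocks to internal subnets and the smaller ones to external and partner subnets are assumptions made for illustration.

```python
import ipaddress


def carve_vpc(cidr: str) -> dict:
    """Split a /16 into three /18s, three /20s, and /22s from the last /20."""
    vpc = ipaddress.ip_network(cidr)
    quarters = list(vpc.subnets(new_prefix=18))          # four /18 blocks
    twenties = list(quarters[3].subnets(new_prefix=20))  # last /18 -> four /20
    twentytwos = list(twenties[3].subnets(new_prefix=22))  # last /20 -> four /22
    return {
        "internal": quarters[:3],    # assumed: one /18 per Availability Zone
        "external": twenties[:3],    # assumed: one /20 per Availability Zone
        "partner": twentytwos[:3],   # assumed: one /22 per Availability Zone
        "spare": twentytwos[3:],     # one /22 left unallocated
    }


layout = carve_vpc("100.64.0.0/16")  # hypothetical VPC CIDR
for tier, blocks in layout.items():
    print(tier, [str(b) for b in blocks])
```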
VPC Subnet Layout.
Availability Zones A, B, and C each contain:
/20 subnet: 0/0 => IGW
/18 subnet: 0/0 => NGW
/22 subnet: 0/0 => NGW
Internet Gateway (IGW)
VPN Gateway (VGW)
NAT Gateway (NGW)
Scaling NAT Gateways.
NAT Gateway (NGW)
Availability Zone A
/18
0.0.0.0/0 => NGW
Scaling NAT Gateways.
Availability Zone A
/18
0.0.0.0/1 => NGW #1
128.0.0.0/1 => NGW #2
NAT Gateway (NGW #1)
NAT Gateway (NGW #2)
Scaling NAT Gateways.
Availability Zone A
/18
0.0.0.0/2 => NGW #1
64.0.0.0/2 => NGW #2
128.0.0.0/2 => NGW #3
192.0.0.0/2 => NGW #4
NAT Gateway (NGW #1)
NAT Gateway (NGW #2)
NAT Gateway (NGW #3)
NAT Gateway (NGW #4)
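The slides above spread outbound traffic across NAT gateways by splitting the default route into equal destination prefixes. A minimal sketch of generating those route entries follows; the function name `split_default_route` and the gateway labels are hypothetical, and the sketch assumes a power-of-two number of gateways.

```python
import ipaddress


def split_default_route(gateways: list) -> dict:
    """Map equal slices of 0.0.0.0/0 to gateways (count must be a power of two)."""
    n = len(gateways)
    if n == 0 or n & (n - 1):
        raise ValueError("number of gateways must be a power of two")
    bits = n.bit_length() - 1  # 2 gateways -> /1 routes, 4 gateways -> /2 routes
    prefixes = ipaddress.ip_network("0.0.0.0/0").subnets(prefixlen_diff=bits)
    return {str(prefix): gw for prefix, gw in zip(prefixes, gateways)}


routes = split_default_route(["NGW #1", "NGW #2", "NGW #3", "NGW #4"])
for destination, gateway in routes.items():
    print(f"{destination} => {gateway}")
# 0.0.0.0/2 => NGW #1
# 64.0.0.0/2 => NGW #2
# 128.0.0.0/2 => NGW #3
# 192.0.0.0/2 => NGW #4
```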
Scaling NAT Gateways.
NAT Gateway (NGW)
Availability Zone A
/18
0.0.0.0/0 => NGW
Scaling NAT Gateways.
Availability Zone A
NAT Gateway (NGW #1)
NAT Gateway (NGW #2)
/19
0.0.0.0/0 => NGW #1
/19
0.0.0.0/0 => NGW #2
Scaling NAT Gateways.
Availability Zone A
NAT Gateway (NGW #1)
NAT Gateway (NGW #2)
NAT Gateway (NGW #3)
NAT Gateway (NGW #4)
/20
0.0.0.0/0 => NGW #3
/20
0.0.0.0/0 => NGW #4
/20
0.0.0.0/0 => NGW #1
/20
0.0.0.0/0 => NGW #2
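The alternative shown in these builds keeps a single default route per subnet but splits the source subnet itself, so each smaller subnet points at its own NAT gateway. A rough sketch follows; the /18 block `100.64.0.0/18`, the function name `routes_per_subnet`, and the order in which gateways are paired with subnets are assumptions for illustration.

```python
import ipaddress


def routes_per_subnet(cidr: str, gateways: list) -> dict:
    """Split one subnet into equal parts, each with its own default route."""
    n = len(gateways)
    if n == 0 or n & (n - 1):
        raise ValueError("number of gateways must be a power of two")
    bits = n.bit_length() - 1  # 2 gateways: /18 -> /19, 4 gateways: /18 -> /20
    parts = ipaddress.ip_network(cidr).subnets(prefixlen_diff=bits)
    return {str(part): {"0.0.0.0/0": gw} for part, gw in zip(parts, gateways)}


tables = routes_per_subnet("100.64.0.0/18", ["NGW #1", "NGW #2", "NGW #3", "NGW #4"])
for subnet, table in tables.items():
    print(subnet, table)
```

Compared with splitting the destination space, this variant balances by source subnet, so how evenly load spreads depends on how instances are placed across the smaller subnets.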
ClassicLink.
EC2-Classic
VPC
ClassicLink.
Golf
Zulu
Alpha
gethostname(zulu.public)
10.0.0.100
gethostname(alpha.public)
10.0.0.200
ClassicLink.
Golf
Zulu
Alpha
Issue:
gethostname(zulu.public)
54.aaa.bbb.ccc
Service Discovery.
Registration
• hostname.public.
• hostname.private.
• ipaddress.public.
• ipaddress.private.
Zulu
ClassicLink.
Golf
Zulu
Alpha
Issue:
gethostname(zulu.public)
54.aaa.bbb.ccc
Resolution:
DNS over ClassicLink
gethostname(zulu.public)
100.64.0.100
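The issue and resolution above can be illustrated with a toy resolver: a public hostname resolves to the private address only when the requester is in the same network or DNS resolution is enabled across the link, and to the public address otherwise. This is a simplified model, not AWS's implementation; the `Host` class, the `resolve_public_name` function, and the network labels are hypothetical, and the addresses are the placeholders used on the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    network: str      # e.g. "classic" or "vpc" (labels are illustrative)
    private_ip: str
    public_ip: str


def resolve_public_name(target: Host, requester_network: str, dns_links: set) -> str:
    """Return the address a requester would get for the target's public name."""
    if requester_network == target.network:
        return target.private_ip
    # A link is modeled as an unordered pair of networks with DNS support on.
    if frozenset({requester_network, target.network}) in dns_links:
        return target.private_ip
    return target.public_ip


zulu = Host("zulu.public", "vpc", "100.64.0.100", "54.aaa.bbb.ccc")

# Without DNS support on the link, a Classic caller gets the public address.
print(resolve_public_name(zulu, "classic", set()))
# With DNS support on the link, the same lookup returns the private address.
print(resolve_public_name(zulu, "classic", {frozenset({"classic", "vpc"})}))
```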
ClassicLink over Peering.
Golf
Zulu Alpha
Expectation:
gethostname(alpha.public)
100.64.128.200
ClassicLink over Peering.
Golf
Zulu Alpha
Issue:
gethostname(alpha.public)
54.xxx.yyy.zzz
ClassicLink over Peering.
Golf
Zulu Alpha
Issue:
gethostname(alpha.public)
54.xxx.yyy.zzz
Resolution:
ClassicLink over Peering
DNS over Peering
gethostname(alpha.public)
100.64.128.200
ClassicLink Everywhere.
Golf
Zulu Alpha
Romeo
ClassicLink Everywhere.
Golf
Zulu Alpha
Romeo
ClassicLink at Scale.
launch config1
1 2 3 N
. . . . .
Service Classification.
Small non-critical
Large non-critical
Small critical
Large critical
Critical to members
Size
Dependency Mappings.
Flow Collection.
IP Metadata.
Flow Analysis.
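The three steps above (collect flows, attach IP metadata, analyze) can be sketched as a join between flow records and an IP-to-service table. The record layout, the function name `dependency_edges`, and the sample data are hypothetical; the talk does not describe the actual tooling.

```python
from collections import Counter


def dependency_edges(flows, ip_to_service):
    """Aggregate bytes per (source service, destination service) pair.

    flows: iterable of (src_ip, dst_ip, byte_count) tuples.
    ip_to_service: dict mapping an IP address to the service that owns it.
    """
    edges = Counter()
    for src_ip, dst_ip, byte_count in flows:
        src = ip_to_service.get(src_ip, "unknown")
        dst = ip_to_service.get(dst_ip, "unknown")
        if src != dst:  # ignore traffic within a single service
            edges[(src, dst)] += byte_count
    return edges


# Illustrative sample data only.
ip_metadata = {"10.0.0.50": "golf", "10.0.0.100": "zulu", "10.0.0.200": "alpha"}
sample_flows = [
    ("10.0.0.50", "10.0.0.100", 1200),
    ("10.0.0.50", "10.0.0.100", 800),
    ("10.0.0.100", "10.0.0.200", 300),
]
for (src, dst), total in dependency_edges(sample_flows, ip_metadata).items():
    print(f"{src} -> {dst}: {total} bytes")
```

An edge list like this shows which services must be able to reach each other at every stage, which helps decide what can be migrated together.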
Global Routing.
MORE PACKET KUNG-FU.
AWS Direct Connect.
Omega Bravo
Delta
Netflix Backbone
Global Backbone.
Direct Connect.
Netflix Backbone
us-west-1 us-east-1
eu-west-1
Backbone Traffic.
100.64.0.0/10
Netflix Backbone
10.0.0.0/8 10.0.0.0/8
Backbone Traffic.
100.64.0.0/10
Netflix Backbone
10.0.0.0/8 10.0.0.0/8
Backbone Traffic.
100.64.0.0/10 10.0.0.0/8
DNS
Netflix Backbone
Global Infrastructure.
Netflix Backbone
Region 1
Classic
Classic
VPC
VPC
Corp
Region 2
Classic
Classic
VPC
VPC
Corp
3rd Party
3rd Party
Retrospective.
MULLIGANS.
SECOND CHANCES.
Lessons Learned.
IP address scheme.
Traffic patterns.
Partner engagement.
Technical debt.
Features.
ClassicLink.
ClassicLink over Peering.
DNS over Peering.
EC2 DNS for non-RFC 1918.
Delivery.
Thank you!
Remember to complete your evaluations!
Related Sessions
• NET201 – Creating Your Virtual Data Center: VPC Fundamentals and Connectivity Options
• NET303 – NextGen Networking: New Capabilities for Amazon’s Virtual Private Cloud
• NET402 – Deep Dive: AWS Direct Connect and VPNs
• NET403 – Elastic Load Balancing Deep Dive and Best Practices
• NET404 – Making Every Packet Count
Questions?
ClassicLink.
Golf
Zulu
Alpha
Expectation:
gethostname(zulu.public)
100.64.0.100