TRANSCRIPT
Neutron: peeking behind the curtains (that is to say how Neutron builds virtual networks)
Salvatore Orlando VMware
Twitter: @taturiello
IRC: salv-orlando Email: sorlando(at)vmware.com
Before we start • Make your voice heard!
Audience interaction is very valuable.
• This will be a fairly technical deep-dive on Neutron internals. – Please try not to fall asleep! – We will focus exclusively on Neutron’s built-in, purely FOSS components. – Questions regarding other plugins are however welcome
• By the way, who is this chap talking to us? – Professional procrastinator, SSC Napoli supporter – Breaking OpenStack since 2010 – Founder and core team member of the OpenStack Networking project (Neutron)
Stuff we will talk about
• Neutron and its place in the OpenStack universe
• Neutron’s architecture overview
• The ML2 plugin
• OVS agent: layer-2 network virtualization and VIF security
• L3 Agent: Routing, gateway, and floating IPs
• Configuration agents: DHCP and Metadata
• WIPs, limitations and alternatives
The OpenStack universe (and Neutron’s place in it)
21 official programs | 17 integrated, 4 in incubation
Neutron’s logical resources…
[Diagram: Tenant “A” owns networks A1 and A2 (instances A11, A12, A21) behind Router A; Tenant “B” owns network B1 (instances B11, B12) behind Router B. Each network has a DHCP port; routers attach to tenant networks via internal gateways and to the shared External Network via external gateways.]
…and their physical mappings
[Diagram: three compute nodes (C1, C2, C3) and a network node, each running br-int and br-tun connected by GRE tunnels; instances A11, A12, A21, B11 and B12 attach to br-int on the compute nodes, while the network node hosts the DHCP and L3 agents plus br-ex. Local VLAN tags are converted into GRE keys (and vice versa) at br-tun.]
Neutron architecture overview
[Diagram: the Neutron Server exposes a REST API and runs the ML2 plugin (with the GRE type driver and the L2-population mechanism driver) and the L3 base plugin; it communicates over RPC on the AMQP bus with the OVS agents on each host, the DHCP agent, the L3 agents, and the metadata agent. Reference stack: ML2 plugin, Open vSwitch edges, GRE tunnels as transport.]
The modular layer-2 plugin
• Driver-based vs. monolithic
 – Type drivers for the network transport type
 – Mechanism drivers for:
  • Interacting with backends (e.g.: Arista, Cisco N1kV)
  • Providing additional features or optimizing scale/performance (e.g.: L2 population)
 – More information: https://wiki.openstack.org/wiki/Neutron/ML2
• Modular, driver-based approach not (yet) available for layer-3 plugins
 – Reference plugin implements all supported capabilities
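The split between type and mechanism drivers is visible directly in the ML2 configuration. As an illustration only (not from the slides; values are assumptions for a GRE deployment with l2-population enabled), /etc/neutron/plugins/ml2/ml2_conf.ini might contain:
[ml2]
type_drivers = gre,vlan
tenant_network_types = gre
mechanism_drivers = openvswitch,l2population
[ml2_type_gre]
tunnel_id_ranges = 1:1000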
The OVS Agent: responsibilities
• Builds the GRE tunnel mesh
 – Alternatively VxLAN, or no mesh if VLANs are used as network transport
• Populates the MAC forwarding table for quicker instance lookups (l2-population)
• Wires instance VIFs to the appropriate virtual network
• Secures virtual interfaces
 – Basic anti-spoofing rules (ARP, MAC, IP)
 – Security group rules
• Runs on the hypervisor
• Communicates with the server via RPC over AMQP
The OVS Agent: architecture
[Diagram: conceptual representation of the agent architecture (does not map to actual components/processes): the agent main loop drives an OVS monitor, a security manager, and an FDB manager, and talks to the server via the AMQP bus; Open vSwitch (br-int) connects to instance VIFs through qvb/qvo veth pairs and an iptables-enabled Linux bridge, with tunnels to other hosts.]
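Not on the slide, but a quick way to see what the agent has actually built on a host is to query Open vSwitch directly (standard OVS tooling; bridge names assume the default br-int/br-tun):
ovs-vsctl show                    # bridges, ports and local VLAN tags
ovs-vsctl list-ports br-int       # one qvo... port per wired VIF
ovs-ofctl dump-flows br-tun       # flows implementing the tunnel overlay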
The GRE tunnel overlay mesh
• Full mesh between hosts
 – n(n-1) total tunnels
• L2-in-L3 tunnels
 – Abstracts away the data center’s network complexity
 – Only needs IP connectivity to the destination host
• Nothing is free
 – About 3% GRE header overhead
 – No hardware TSO: TSO done in software uses CPU and reduces overall throughput
• VxLAN is (kind of) similar, but not discussed due to time constraints
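For illustration only (the OVS agent creates these automatically; the port name and IP addresses below are assumptions), a single leg of the mesh corresponds to an OVS GRE port along these lines:
ovs-vsctl add-port br-tun gre-1 -- set Interface gre-1 type=gre options:local_ip=192.168.0.11 options:remote_ip=192.168.0.12 options:key=flow
With key=flow the GRE key is chosen per flow, which is what lets one tunnel carry traffic for many virtual networks.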
Handling BUMs: layer-2 population
• Broadcast, unknown unicast, and multicast (BUM) traffic can be a serious problem in large deployments
• However, Neutron knows where target VM instances are
 – Pre-populate forwarding tables
 – Optimizes both GRE and VxLAN overlays
• L2-population implemented by:
 – A server-side driver
 – On each host, a local ARP responder and forwarding table population
 – For more info: https://wiki.openstack.org/wiki/L2population
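To make this concrete (a sketch, not from the slides): with the OVS agent, the net effect of l2-population is a pre-installed unicast flow on br-tun that steers traffic for a known remote MAC straight to the right tunnel instead of flooding. The table number, VLAN tag, tunnel key, MAC, and output port below are all assumed values:
ovs-ofctl add-flow br-tun "table=20,priority=2,dl_vlan=1,dl_dst=fa:16:3e:aa:bb:cc,actions=strip_vlan,set_tunnel:1001,output:2"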
Tunnel mesh with L2-population
[Diagram: Hosts 1–5 connected by the tunnel mesh, hosting VMs A–H; a proxy ARP responder answers locally on behalf of remote VMs.]
• The ARP request from “VM A” for “VM G” is intercepted and answered using a pre-populated neighbour entry
• Traffic from VM A to VM G is encapsulated and sent to Host 4 according to the bridge forwarding table entry
• Without L2-population, ARP broadcasts would have flooded the tunnel mesh
Wiring and securing interfaces
[Diagram: the agent main loop with OVS monitor, security manager, and FDB manager, connected to the server via the AMQP bus.]
1. OVS monitor detects new interfaces on the integration bridge
2. Agent loop collects new, updated, and removed interfaces
3. Agent calls the server for interface details
4. Port is wired by assigning a “local” VLAN tag
5. Security groups are translated into iptables rules
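As an illustration of step 5 (not taken from the slides; the chain names, addresses, and MAC are assumptions), the per-port rules on the “hybrid” Linux bridge look roughly like this:
# Anti-spoofing: only the port's own MAC/IP pair may send
iptables -A neutron-openvswi-s1234abcd -s 10.0.0.3/32 -m mac --mac-source FA:16:3E:AA:BB:CC -j RETURN
iptables -A neutron-openvswi-s1234abcd -j DROP
# Security group rule "allow TCP port 22 ingress"
iptables -A neutron-openvswi-i1234abcd -p tcp -m tcp --dport 22 -j RETURN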
Neutron’s layer-3 services
• East/west routing
• Source NAT (instances’ external gateway)
• Destination NAT (floating IPs)
• Static routes
[Diagram: two tenant networks with VMs attached to a router; the router provides east/west routing between them and an external gateway to the External Network, with a floating IP mapped to one of the VMs.]
Network namespaces
• Isolated copy of the network stack
 – Cloned from the ‘root’ namespace
 – Scope limited to the namespace
 – veths can connect devices in a child namespace to the root namespace
 – Ability to reuse IP addresses, routes, iptables rules
• Processes can be spawned in a namespace
 – And access that namespace’s network stack
• L3, DHCP and metadata agents rely on them
• Available in the upstream Linux kernel since the 2.6 series
• For more info: http://pastebin.com/ruR70tH4
Network namespaces example
• veth pair:
ip netns add A
ip link add tapA-root type veth peer name tapA-ns
ip link set tapA-ns netns A
ovs-vsctl add-port br-int tapA-root
• OVS internal interface:
ip netns add B
ovs-vsctl add-port br-int tapB-int -- set Interface tapB-int type=internal
ip link set tapB-int netns B
[Diagram: the root namespace (HOST) holds lo, eth0, br-int and tapA-root; namespace A holds lo and tapA-ns; namespace B holds lo and tapB-int.]
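A possible follow-up (not on the slide; the addresses are made up) to use example B’s interface from inside the namespace:
ip netns exec B ip link set lo up
ip netns exec B ip link set tapB-int up
ip netns exec B ip addr add 192.168.100.10/24 dev tapB-int
ip netns exec B ping -c 1 192.168.100.1    # assumes another port on br-int answers at 192.168.100.1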
Layer-3 agent: responsibilities
• Handle server RPC notifications for routers’ state changes
 – E.g.: new interface, new floating IP
• Query server for current router state
 – Ensures agent-side state is consistent with the server
 – Coalesces multiple changes in a short timespan
• Apply configuration on the host
 – Add router interfaces into the namespace
 – Set the default SNAT rule for the external gateway
 – Reconfigure DNAT/SNAT rules for floating IPs
 – Apply extra static routes to the network namespace
• Distinct namespace for each logical router
• Internal interfaces for tenants’ networks
 – Creates the interface and configures the IP, but does not wire the port
• Gateway interfaces for the uplink to the external network
• iptables rules in the NAT table for the default gateway and floating IPs
• Additional routes in the namespaces’ routing table
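To make the NAT part concrete (a sketch only, not lifted from the slides; the namespace name, the qg- device, and all addresses are assumptions):
# Default SNAT via the external gateway
ip netns exec qrouter-example iptables -t nat -A POSTROUTING -s 10.0.0.0/24 -o qg-example -j SNAT --to-source 172.24.4.2
# Floating IP 172.24.4.5 mapped to instance 10.0.0.3: DNAT inbound, SNAT outbound
ip netns exec qrouter-example iptables -t nat -A PREROUTING -d 172.24.4.5/32 -j DNAT --to-destination 10.0.0.3
ip netns exec qrouter-example iptables -t nat -A POSTROUTING -s 10.0.0.3/32 -j SNAT --to-source 172.24.4.5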
Layer-3 agent: how it works
[Diagram: one network namespace per logical router (NS-Rtr-A, NS-Rtr-B, NS-Rtr-C), each performing L3 forwarding and SNAT/DNAT; internal interfaces plug into br-int, external gateway interfaces into br-ex.]
Configuration agents: DHCP
• Server RPC notifications for: – Subnet state changes – IP/MAC address pair changes
• Addresses distributed via dnsmasq – Code allows for implementing more drivers (e.g.: ISC DHCP)
• Isolation ensured via network namespaces – Overlapping IPs, anyone?
• Multiple instances for load distribution and HA
DHCP agent: how it works
[Diagram: the DHCP network namespace, separate from the root network namespace, contains the DHCP interface plugged into br-int and a dnsmasq process with its options and hosts files.]
• dnsmasq is spawned upon server notification, if not yet configured
• DHCP options are set at startup and updated through server extensions
• The hosts file is updated when the active port set changes
• dnsmasq is reloaded with kill -HUP on every change
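Not from the slides, but a handy way to inspect what the DHCP agent created (the namespace follows the qdhcp-<network-id> naming scheme; the file path is the usual default and may differ per deployment):
ip netns list                                   # qdhcp-... and qrouter-... namespaces
ip netns exec qdhcp-<network-id> ip addr show   # the dnsmasq interface plugged into br-int
cat /var/lib/neutron/dhcp/<network-id>/host     # per-port MAC/IP entries served by dnsmasq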
Configuration agents: Metadata
• Proxies requests to the nova metadata server
 – Used by services such as cloud-init
 – A namespace proxy captures requests for 169.254.169.254
 – The metadata agent forwards requests to nova
 – “Bridge” between tenant and management networks
• Routed networks: proxy embedded in the router namespace
 – Instances reach 169.254.169.254 via their default route
• Non-routed networks: proxy running in the DHCP namespace
 – Static route injected into the instance via DHCP option 121
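For illustration (assumed values, not from the slides), in the non-routed case the injected route shows up in the dnsmasq options file roughly as:
tag:subnet-example,option:classless-static-route,169.254.169.254/32,10.0.0.2
i.e. instances are told to reach 169.254.169.254 via the DHCP port’s own address (10.0.0.2 here).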
Metadata agent: how it works
Cloud-init (from instance 10.0.0.3): curl http://169.254.169.254/openstack/latest/meta_data.json
[Diagram: the metadata proxy listens on 169.254.169.254:80 inside the router (or DHCP) network namespace and relays requests over a Unix socket to the metadata agent on the management network; the agent adds HTTP headers (X-Router-ID: <uuid>, X-Forwarded-For: <instance_ip>, X-Instance-ID: <instance_id>) and forwards the request to the nova metadata server over HTTP.]
More stuff: high-level services
• Available via “service plugins”
 – Every service plugin is driver-based, to allow implementations different from the reference one
• Load balancing
 – Reference implementation based on HAProxy
• VPN
 – IPsec only at the moment
 – Reference implementation uses Openswan
 – Requires the L3 agent
• Edge firewall
 – Reference implementation based on iptables
 – Requires the L3 agent
 – Experimental
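As a usage illustration (not on the slide; names are placeholders and the commands assume the LBaaS v1 CLI of that release):
neutron lb-pool-create --name web-pool --protocol HTTP --lb-method ROUND_ROBIN --subnet-id <subnet-id>
neutron lb-member-create --address 10.0.0.3 --protocol-port 80 web-pool
neutron lb-vip-create --name web-vip --protocol HTTP --protocol-port 80 --subnet-id <subnet-id> web-pool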
WIPs and limitations
• OVS agent scale:
 – Better scalability needed with a high number of VIFs and security group rules per host
• L3 agent: – Traffic distribution (east/west, gateway) not yet supported – HA/failover in progress
• IPAM logic baked into the DB logic
• Security group rules as OVS flow entries – Will avoid the need for bridge/OVS “hybrid” plugging
• IPv6 support
• General reliability and failure condition handling
Beyond the ‘built-in’ solution
• Neutron’s plugin mechanism allows for choosing among a wide range of solutions
• Open source
 – OpenDaylight: ML2 driver
  • Leverages the L3/DHCP/metadata agents
 – OpenContrail: standalone plugin providing basic network services & NFV
• Commercial
 – Available either as standalone plugins or ML2 drivers
 – VMware (NSX/vCNS), Cisco (UCS/Nexus 1kV), NEC, MidoNet, etc.
Summary
• Overview of the “reference” Neutron network virtualization solution.
• OVS agent builds L2 networks and secures VIFs.
• L3 agent provides routing/NAT.
• DHCP/Metadata agent for instance configuration.
• Current suite is not (yet) perfect.
 – If it does not suit your needs, Neutron supports a plethora of backends to choose from