open stack advanced_part
Post on 13-Jul-2015
Network internals (advanced parts)
Giuliano Santandrea – CIRI ICT
University of Bologna
● Internal-external VLAN translation
● Packet captures
● Security groups
● Routing
During VM creation, these elements are created on the compute node:
◦ qbrZZZ: Linux bridge (LB) and its management interface
◦ qvbZZZ: veth pair end connected to the LB
◦ qvoZZZ: veth pair end connected to the OVS bridge "br-int"
◦ tapZZZ: tap interface, connected to the LB
ZZZ: first 11 characters of the Neutron port ID for the VM interface
Subnet creation (network node):
◦ tapYYY: tap interface connected to br-int, inside a network namespace (YYY: first 11 characters of the port ID of the DHCP server)
Router creation (network node):
◦ qg-AAA: interface connected to br-ex, inside a network namespace (AAA: first 11 characters of the port ID of the router gateway)
◦ qr-BBB: interface connected to br-int, inside a network namespace (BBB: first 11 characters of the port ID of the router internal port)
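The naming convention for the compute node can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the example port ID is made up except for its first 11 characters, which match the qvb71cbe0bd-6f device seen in the captures.

```python
# Device-name prefixes listed for the compute node.
COMPUTE_PREFIXES = ("qbr", "qvb", "qvo", "tap")


def port_suffix(port_id: str) -> str:
    """Return the first 11 characters of a Neutron port ID."""
    return port_id[:11]


def compute_node_devices(port_id: str) -> dict:
    """Map each prefix to the device name derived from the port ID."""
    suffix = port_suffix(port_id)
    return {prefix: prefix + suffix for prefix in COMPUTE_PREFIXES}


# Hypothetical port ID (only the first 11 characters come from the slides).
EXAMPLE_PORT_ID = "71cbe0bd-6f3a-4c2e-9b1d-0123456789ab"
```

Calling compute_node_devices(EXAMPLE_PORT_ID) gives the four names to look for with ip link or ovs-vsctl show on the compute node.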
On the physical data network many network virtualization technologies are possible (VLAN, VXLAN, GRE, ...).
Internally, OpenStack (OS) maps each virtual network to an internal VLAN.
The Cesena cluster uses VLANs. The bridges in the VNI are configured to translate between external and internal VLAN tags.
Other example: GRE encapsulation
◦ for packets directed to the data network, the bridges remove the internal VLAN tag and encapsulate the packets with a tunnel_id
[Figure: cluster topology with the controller, the network node and compute node 1, attached to the management net and the data net; br-ex on the network node leads to the external/public net, its gateway and the Internet. Labels: VM eth0 and the Linux bridge are untagged; br-int uses port-based VLAN access ports (internal VLAN tag); trunks carry all VLANs; br-data and the data net use the external VLAN tag; the DHCP servers run in network namespaces with their own routing tables; one link is marked "no traffic here".]
TCAM (OpenFlow rules):
priority=4,in_port=8,dl_vlan=1 actions=mod_vlan_vid:1000,NORMAL
priority=2,in_port=8 actions=drop
priority=1 actions=NORMAL
• For all packets coming from phy-br-data with tag=1: change the tag to 1000, then do classic MAC Learning Switching (MLS)
• Discard all other packets coming from phy-br-data
• Otherwise MLS (lowest priority)
VLAN 1 => 1000
The reverse translation:
priority=3,in_port=17,dl_vlan=1000 actions=mod_vlan_vid:1,NORMAL
priority=2,in_port=17 actions=drop
priority=1 actions=NORMAL
VLAN 1000 => 1
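How the priorities interact can be checked with a small simulation. The sketch below is not OVS code: it is a simplified model, with hypothetical class and function names, that evaluates the first rule set (in_port=8, VLAN 1 to 1000) from the highest priority down and reports the action and the resulting VLAN tag.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Rule:
    priority: int
    in_port: Optional[int]   # None means "any port"
    dl_vlan: Optional[int]   # None means "any VLAN"
    new_vlan: Optional[int]  # VLAN written by mod_vlan_vid, if present
    forward: bool            # True = NORMAL, False = drop

    def matches(self, in_port: int, vlan: Optional[int]) -> bool:
        port_ok = self.in_port is None or self.in_port == in_port
        vlan_ok = self.dl_vlan is None or self.dl_vlan == vlan
        return port_ok and vlan_ok


# The three rules of the first table (port 8 is phy-br-data there).
RULES = [
    Rule(priority=4, in_port=8, dl_vlan=1, new_vlan=1000, forward=True),
    Rule(priority=2, in_port=8, dl_vlan=None, new_vlan=None, forward=False),
    Rule(priority=1, in_port=None, dl_vlan=None, new_vlan=None, forward=True),
]


def apply_flow_table(rules, in_port: int, vlan: Optional[int]) -> Tuple[str, Optional[int]]:
    """Return (action, vlan) for the highest-priority matching rule."""
    for rule in sorted(rules, key=lambda r: r.priority, reverse=True):
        if rule.matches(in_port, vlan):
            if not rule.forward:
                return ("drop", None)
            out_vlan = rule.new_vlan if rule.new_vlan is not None else vlan
            return ("NORMAL", out_vlan)
    return ("drop", None)  # no rule matched (cannot happen with a catch-all rule)
```

With these rules a frame entering port 8 with VLAN 1 leaves retagged as VLAN 1000, any other frame from port 8 is dropped, and frames from other ports are switched with their tag unchanged.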
[Figure: tagging along the path between two VMs: untagged at each VM eth0, internal VLAN tag inside each node, external VLAN tag on the data network; two links are marked "No traffic here!".]
On the VM we send an Ethernet frame (ARP) in broadcast:
◦ sudo arping -bI eth0 10.0.0.9
Broadcasting bypasses the MAC learning of the bridges: each bridge forwards the frame to every port!
On the cluster node:
◦ tcpdump -nnvei interface
Or, if the interface is in a netns:
◦ sudo ip netns exec <netns> bash   # enter the netns
◦ tcpdump -nnvlei interface   # -l makes the output line-buffered
stack@hc01:~/devstack$ sudo tcpdump -nnvei qvb71cbe0bd-6f
tcpdump: WARNING: qvb71cbe0bd-6f: no IPv4 address assigned
tcpdump: listening on qvb71cbe0bd-6f, link-type EN10MB (Ethernet), capture
size 65535 bytes
18:44:23.752905 fa:16:3e:e6:9e:f8 > ff:ff:ff:ff:ff:ff, ethertype ARP
(0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has
10.0.0.9 (ff:ff:ff:ff:ff:ff) tell 10.0.0.66, length 28
18:44:24.752998 fa:16:3e:e6:9e:f8 > ff:ff:ff:ff:ff:ff, ethertype ARP
(0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has
10.0.0.9 (ff:ff:ff:ff:ff:ff) tell 10.0.0.66, length 28
^C
2 packets captured
2 packets received by filter
0 packets dropped by kernel
NO VLAN TAG!!!
root@hc01:/opt/stack# sudo tcpdump -nnvei int-br-data
tcpdump: WARNING: int-br-data: no IPv4 address assigned
tcpdump: listening on int-br-data, link-type EN10MB (Ethernet), capture size 65535 bytes
18:46:41.212436 fa:16:3e:e6:9e:f8 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.0.9 (ff:ff:ff:ff:ff:ff) tell 10.0.0.66, length 28
18:46:42.212633 fa:16:3e:e6:9e:f8 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.0.9 (ff:ff:ff:ff:ff:ff) tell 10.0.0.66, length 28
^C
2 packets captured
2 packets received by filter
0 packets dropped by kernel
root@hc01:/opt/stack# sudo tcpdump -nnvei eth0
tcpdump: WARNING: eth0: no IPv4 address assigned
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
18:49:57.241431 fa:16:3e:e6:9e:f8 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1000, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.0.9 (ff:ff:ff:ff:ff:ff) tell 10.0.0.66, length 28
18:49:58.020910 d0:7e:28:90:d9:4b > 01:80:c2:00:00:00, 802.3, length 64: LLC, dsap STP (0x42) Individual, ssap STP (0x42) Command, ctrl 0x03: STP 802.1w, Rapid STP, Flags [Forward], bridge-id 8000.d0:7e:28:90:d9:3d.800d, length 47
message-age 0.00s, max-age 20.00s, hello-time 2.00s, forwarding-delay 15.00s
root-id 8000.d0:7e:28:90:d9:3d, root-pathcost 0, port-role Designated
18:49:58.241620 fa:16:3e:e6:9e:f8 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 1000, p 0, ethertype ARP, Ethernet (len 6), IPv4 (len 4), Request who-has 10.0.0.9 (ff:ff:ff:ff:ff:ff) tell 10.0.0.66, length 28
^C
3 packets captured
3 packets received by filter
0 packets dropped by kernel
root@hc01:~# tcpdump -nnvei tap8356e24c-67
tcpdump: listening on tap8356e24c-67, link-type EN10MB (Ethernet), capture
size 65535 bytes
^C11:39:28.424480 fa:16:3e:e6:9e:f8 > ff:ff:ff:ff:ff:ff, ethertype ARP
(0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has
10.0.0.9 (ff:ff:ff:ff:ff:ff) tell 10.0.0.66, length 28
11:39:29.424638 fa:16:3e:e6:9e:f8 > ff:ff:ff:ff:ff:ff, ethertype ARP
(0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has
10.0.0.9 (ff:ff:ff:ff:ff:ff) tell 10.0.0.66, length 28
11:39:30.424733 fa:16:3e:e6:9e:f8 > ff:ff:ff:ff:ff:ff, ethertype ARP
(0x0806), length 42: Ethernet (len 6), IPv4 (len 4), Request who-has
10.0.0.9 (ff:ff:ff:ff:ff:ff) tell 10.0.0.66, length 28
3 packets captured
3 packets received by filter
0 packets dropped by kernel
NO VLAN TAG!!!
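Comparing the tag at each capture point by eye gets tedious with longer captures. As a rough aid, and assuming the "vlan N," field appears exactly as in the tcpdump -e output above, a helper like the following (the function name is hypothetical) extracts the tag from one capture line:

```python
import re
from typing import Optional

# Matches the 802.1Q field as printed by "tcpdump -e", e.g. "vlan 1000, p 0".
VLAN_RE = re.compile(r"\bvlan (\d+),")


def vlan_of(line: str) -> Optional[int]:
    """Return the VLAN ID found in a tcpdump line, or None if untagged."""
    match = VLAN_RE.search(line)
    return int(match.group(1)) if match else None
```

Applied to the captures above, it returns None on qvb71cbe0bd-6f and tap8356e24c-67, 1 on int-br-data and 1000 on eth0, which is the untagged / internal / external sequence described earlier.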
Secgroup: contains firewall rules configured by the user (at the cloud platform level)
◦ During the VM creation we associate one or more secgroups
It is «default deny»: we can add rules to allow ingress traffic
In the default secgroup there are already rules allowing egress traffic
Implementation: iptables rules on the compute node
Note: Neutron implements it by applying the native kernel filtering functions (netfilter) to bridged tap interfaces, and this works only with LBs. For this reason an additional LB is needed as an intermediate element to connect the tap interface to the integration bridge.
Iptables rules (global namespace) on the Linux bridge port
We have enabled SSH and ping in ingress
For all packets entering the LB, passing through the tap (outbound VM traffic, EGRESS), use the following chains (iptables filter table in the global netns of the compute node):
neutron-openvswi-sg-chain
neutron-openvswi-oXXX
neutron-openvswi-FORWARD
FORWARD
Source: http://goo.gl/lD30Vl
For all packets exiting the LB through the tap (inbound VM traffic, INGRESS), the chains are:
neutron-openvswi-sg-chain
neutron-openvswi-iXXX
neutron-openvswi-FORWARD
FORWARD
Source: http://goo.gl/lD30Vl
We enabled SSH (TCP port 22) and ping (ICMP); we can see these rules:
-A neutron-openvswi-sg-chain -m physdev --physdev-out tapb5d4535b-8f --physdev-is-bridged -j neutron-openvswi-ib5d4535b-8
-A neutron-openvswi-ib5d4535b-8 -m state --state INVALID -j DROP
-A neutron-openvswi-ib5d4535b-8 -m state --state RELATED,ESTABLISHED -j RETURN
-A neutron-openvswi-ib5d4535b-8 -p tcp -m tcp --dport 22 -j RETURN
-A neutron-openvswi-ib5d4535b-8 -p icmp -j RETURN
-A neutron-openvswi-ib5d4535b-8 -s 192.168.101.2/32 -p udp -m udp --sport 67 --dport 68 -j RETURN
-A neutron-openvswi-ib5d4535b-8 -j neutron-openvswi-sg-fallback
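The order of the rules in the per-port ingress chain matters, since the first match wins. The following Python sketch mirrors that order for the rules listed above. It is a simplified model, not Neutron code: the Packet class and function name are hypothetical, and packets that match nothing are reported as handed to neutron-openvswi-sg-fallback, which under the «default deny» policy is expected to drop them.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass
class Packet:
    proto: str                   # "tcp", "udp" or "icmp"
    src: str                     # source IPv4 address
    state: str = "NEW"           # connection-tracking state
    sport: Optional[int] = None
    dport: Optional[int] = None


def ingress_verdict(pkt: Packet) -> str:
    """Walk the ingress chain rules in order; the first match decides."""
    if pkt.state == "INVALID":
        return "DROP"
    if pkt.state in ("RELATED", "ESTABLISHED"):
        return "RETURN"
    if pkt.proto == "tcp" and pkt.dport == 22:
        return "RETURN"          # SSH allowed by the secgroup
    if pkt.proto == "icmp":
        return "RETURN"          # ping allowed by the secgroup
    if (pkt.proto == "udp"
            and ip_address(pkt.src) in ip_network("192.168.101.2/32")
            and pkt.sport == 67 and pkt.dport == 68):
        return "RETURN"          # DHCP replies from the DHCP server
    return "neutron-openvswi-sg-fallback"
```

For example, a new TCP connection to port 80 ends in the fallback chain, while the same packet belonging to an ESTABLISHED connection is accepted by the second rule.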
The admin creates a provider network with the allocation pool 10.250.0.50-10.250.0.70 (21 addresses, both endpoints included)
It is attached to a virtual router
The virtual router is attached to a user private network
The router:
◦ has an address on the provider network (10.250.0.50)
◦ has an address on the user network (192.168.101.1)
◦ acts as a NAT
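The size of an inclusive allocation pool is easy to get off by one. A quick check with Python's ipaddress module (the function name is hypothetical):

```python
from ipaddress import ip_address


def pool_size(first: str, last: str) -> int:
    """Number of addresses in an allocation pool, endpoints included."""
    return int(ip_address(last)) - int(ip_address(first)) + 1


print(pool_size("10.250.0.50", "10.250.0.70"))  # prints 21
```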
sudo ip netns exec qrouter-XXX bash
ip address show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
27: qr-8110d0f8-64: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
link/ether fa:16:3e:98:f7:dd brd ff:ff:ff:ff:ff:ff
inet 192.168.101.1/24 brd 192.168.101.255 scope global qr-8110d0f8-64
inet6 fe80::f816:3eff:fe98:f7dd/64 scope link
valid_lft forever preferred_lft forever
29: qg-64643e7a-3e: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
link/ether fa:16:3e:e9:46:1a brd ff:ff:ff:ff:ff:ff
inet 10.250.0.50/24 brd 10.250.0.255 scope global qg-64643e7a-3e
inet6 fe80::f816:3eff:fee9:461a/64 scope link
valid_lft forever preferred_lft forever
ip route show
default via 10.250.0.3 dev qg-64643e7a-3e
10.250.0.0/24 dev qg-64643e7a-3e proto kernel scope link src 10.250.0.50
192.168.101.0/24 dev qr-8110d0f8-64 proto kernel scope link src 192.168.101.1
sudo iptables -t nat -nvL
…
Chain neutron-l3-agent-snat (1 references)
pkts bytes target prot opt in out source destination
12 882 neutron-l3-agent-float-snat all -- * * 0.0.0.0/0 0.0.0.0/0
6 426 SNAT all -- * * 192.168.101.0/24 0.0.0.0/0 to:10.250.0.50
…
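Taken together, the routing table and the SNAT rule determine what happens to a packet leaving the private network. The sketch below models only those two pieces under simplifying assumptions: the function names are hypothetical, the route lookup is a plain longest-prefix match over the three routes shown by ip route, and the neutron-l3-agent-float-snat chain (floating IPs) is ignored.

```python
from ipaddress import ip_address, ip_network

# (destination prefix, next hop or None if directly connected, device)
ROUTES = [
    (ip_network("0.0.0.0/0"), "10.250.0.3", "qg-64643e7a-3e"),
    (ip_network("10.250.0.0/24"), None, "qg-64643e7a-3e"),
    (ip_network("192.168.101.0/24"), None, "qr-8110d0f8-64"),
]


def lookup(dst: str):
    """Return the route with the longest prefix containing dst."""
    addr = ip_address(dst)
    candidates = [route for route in ROUTES if addr in route[0]]
    return max(candidates, key=lambda route: route[0].prefixlen)


def snat_source(src: str) -> str:
    """Source address after the SNAT rule for 192.168.101.0/24."""
    if ip_address(src) in ip_network("192.168.101.0/24"):
        return "10.250.0.50"
    return src
```

A packet from 192.168.101.7 to an Internet address matches only the default route, so it leaves through qg-64643e7a-3e towards 10.250.0.3 with its source rewritten to 10.250.0.50.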
In some old OS docs «br-data» is called «br-ethX»
When GRE tunnels are used, the bridge br-data is called br-tun
A provider network can currently be created only via CLI
◦ Creating a provider network requires specifying the physical network (mapped to a virtual bridge, connected to a physical network)
The netns and DHCP server are not created when the network is defined, but only when a VM on that network is created
To ensure connectivity to a VM:
● The tenant user that booted the VM must have enabled the access by inserting the appropriate rules in the secgroups and then attaching the secgroup to the VM
● neutron-plugin must have inserted the correct OpenFlow rules in the OVS bridges (br-int, br-data, br-ex)
● The “dnsmasq” linux process (managed by neutron-dhcp) must be working properly as DHCP server for the VM