Tips, Tricks and Tactics with Cells and Scaling OpenStack - May 2015
TRANSCRIPT
OpenStack Summit - Paris 2014
Multi-Cell Openstack: How to Evolve your Cloud to Scale
https://www.openstack.org/summit/openstack-paris-summit-2014/session-videos/presentation/multi-cell-openstack-how-to-evolve-your-cloud-to-scale
Australian Research Cloud
● Started in 2011
● Funded by the Australian Government
● 8 institutions around the country
● Production early 2012 - OpenStack Diablo
● Now running a mix of Juno and Icehouse
● Use Ubuntu 14.04 and KVM
● 100Gbps network connecting most sites (AARNET)
Reasons for using cells
● Single API endpoint, with compute cells dispersed around Australia - simpler from the user's perspective
● Single set of security groups, keypairs etc.
● Less OpenStack expertise needed, as there is only one instance of some core OpenStack services
Size
● 8 sites
● 14 cells
● ~6000 registered users
● ~700 hypervisors
● 30,000+ cores
People
● Core team of 3 devops
● 1-2 operators per site
http://status.rc.nectar.org.au/growth/infrastructure/
Interaction with other services
Each cell also has one or more:
● cinder-volume hosts, using Ceph, LVM and NetApp backends
● A globally replicated swift region per site
● glance-api pointing to the local swift proxy for images
● Ceilometer collectors at each cell push up to a central mongo
● Private L3 network spanning all cells - will be useful for neutron
Cells Infrastructure
Each compute cell has:
● MariaDB Galera cluster
● RabbitMQ cluster
● nova scheduler/cells/vnc/compute/network/api-metadata
● glance-api
● swift region - proxy and storage nodes
● ceilometer-collectors to forward up to a global collector
● cinder-volume

API cell:
● nova-api, nova-cells
● keystone
● glance-registry
● cinder-api, scheduler
● heat, designate, ceilometer-api
Scheduling
● Have some "private" cells only available to certain tenants, usually determined by funding source
● Global flavours and cell-local flavours
● Cell-level aggregates for intra-cell scheduling
● Some sites have GPUs and fast I/O for their own use
  ○ Introduced compute- and RAM-optimised flavours
  ○ Not all cells support all flavours
● Each cell advertises one or more availability zones to use in scheduling
  ○ Ties in with cinder availability zones
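As a hedged illustration of cell-local flavours (the names and data shapes below are hypothetical, not NeCTAR's actual code), flavour availability can be thought of as a simple candidate filter: only cells that advertise the requested flavour remain schedulable.

```python
# Hypothetical sketch: restrict scheduling to cells that advertise a flavour.
# Cell names, flavour names and data shapes are illustrative only.

def cells_for_flavor(cells, flavor_name):
    """Return only the cells whose advertised flavours include flavor_name.

    Global flavours would be advertised by every cell; cell-local flavours
    (e.g. compute- or RAM-optimised) appear only in some cells.
    """
    return [c for c in cells if flavor_name in c["flavors"]]

cells = [
    {"name": "melbourne", "flavors": {"m1.small", "m1.large", "r3.xlarge"}},
    {"name": "brisbane", "flavors": {"m1.small", "m1.large"}},
]

# Only the hypothetical Melbourne cell advertises the RAM-optimised flavour.
print([c["name"] for c in cells_for_flavor(cells, "r3.xlarge")])  # ['melbourne']
```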
Bringing on new cells
Want to test in production before opening to the public. Don't want to flood brand-new cells.
Scheduler filters
● Role-based access to cells - cells advertise which roles can schedule to them
● Direct only - allow the public to select that cell directly, but the global scheduler doesn't count it
Operating cells
Have a small OpenStack cluster to manage all global infrastructure.
Standard environment - use puppet.
Upgrade cells one at a time - live upgrades:
● upgrade compute conductors
● upgrade API cell
● upgrade compute cells
● upgrade compute nodes
Read access to the compute cells' RabbitMQs for troubleshooting and monitoring - the only real interface into each of the cells. Console-log is a good test of cells functionality - have one in each cell and monitor it.
Future plans
Move to Neutron - in the planning and testing stage.
● Currently have a single public network per cell; want to provide tenant networks and higher-level services
● Start off with a global neutron and simple shared flat provider networks per cell
● All hypervisors talking to the same rabbit - scale issues?
Also looking at other higher-level OpenStack services (of which there are many!)
Rackspace
• Managed Cloud company offering a suite of dedicated and cloud hosting products
• Founded in 1998 in San Antonio, TX
• Home of Fanatical Support
• More than 200,000 customers in 120 countries
www.rackspace.com
Rackspace – Cloud Infrastructure
• In production since August 2012
– Currently running: Nova, Glance, Neutron, Ironic, Swift, Cinder
• Regular upgrades from trunk
– Package built on a trunk pull from mid-March is in testing now
• Compute nodes are Debian based
– Run as VMs on hypervisors and managed via XAPI
• 6 geographic regions around the globe
– DFW, ORD, IAD, LON, SYD, HKG
• Numbers
– 10's of 1000's of hypervisors (over 340,000 cores, just over 1.2 petabytes of RAM), all XenServer
– Over 170,000 virtual machines
– API per region, with multiple compute cells (3 – 35+) each
Rackspace – Cloud Infrastructure – Cells
• Cells infrastructure
– Size between ~100 and ~600 hosts per cell
– Different flavor types (General Purpose, High I/O, Compute Optimized, etc.)
– Working on exposing maintenance zones or near/far scheduling (host, shared IP space, network aggregation)
– Separate DB cluster for each cell
• Run our cells infrastructure in cells
– Control plane exists as instances in a small OpenStack deployment
– Multiple hardware types
– Separate tenants: control plane instances separated from other internal users
Cell Scheduling
• Multiple cells within each flavor class
– Hardware profile: additionally grouped by vendor, since live migration needs matching CPUs
– Range of flavor sizes within each cell (e.g. General Purpose 1, 2, 4 and 8 GB)
• Tenant scheduling
– Custom filter schedules by flavor class first (all General Purpose cells, for example)
– Scheduled by available RAM afterwards
– Enhancements for spreading out tenant load and max IOPS per host
– In some cases, filters can bind a cell to specific tenants (testing and internal use)
• Work in Cells V2 to enhance scheduling
– https://review.openstack.org/#/c/141486/ as one example
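The two-stage selection described above (filter by flavor class, then prefer free RAM) could be sketched like this; the field names and the plain-dict cell representation are assumptions for illustration, not Rackspace's actual scheduler code:

```python
# Sketch of flavor-class-first cell scheduling: filter candidate cells by
# flavor class, then pick the one with the most available RAM.
# Data shapes are illustrative only.

def pick_cell(cells, flavor_class):
    """Return the matching cell with the most free RAM, or None."""
    candidates = [c for c in cells if c["flavor_class"] == flavor_class]
    if not candidates:
        return None
    return max(candidates, key=lambda c: c["free_ram_mb"])

cells = [
    {"name": "gp-cell-1", "flavor_class": "general_purpose", "free_ram_mb": 4096},
    {"name": "gp-cell-2", "flavor_class": "general_purpose", "free_ram_mb": 8192},
    {"name": "io-cell-1", "flavor_class": "high_io", "free_ram_mb": 16384},
]

print(pick_cell(cells, "general_purpose")["name"])  # gp-cell-2
```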
Deploying a Cell
• Common control plane nodes deployed by an ansible playbook
– DB pair
– Cells service
– Scheduler
– Rabbit
• Playbook populates flavor info based on hardware type
• Hypervisors bootstrapped once the control plane exists
– Create compute node VM
– Deploy code and configure
– Update routes, etc.
• Provision IP blocks
• Test
• Link via playbook
Rackspace – Purge Nova DBs
• Larger regions have a run rate of around 50,000 VMs
• 1000's of VMs created/deleted per hour in the busiest regions
• Downstream BI and revenue assurance teams require deleted instance records be kept for 90 days
• Current deleted instance counts range between 132,000 and 900,000
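A minimal sketch of the 90-day retention purge, using an in-memory SQLite table as a stand-in for Nova's MySQL `instances` table (the columns follow Nova's soft-delete convention of `deleted` non-zero plus a `deleted_at` timestamp; everything else here is illustrative):

```python
import sqlite3
from datetime import datetime, timedelta

# Stand-in SQLite table in place of Nova's instances table; this is a
# sketch of the retention policy, not Rackspace's actual purge tooling.
RETENTION_DAYS = 90

def purge_deleted_instances(conn, now):
    """Hard-delete soft-deleted rows older than the retention window."""
    cutoff = now - timedelta(days=RETENTION_DAYS)
    cur = conn.execute(
        "DELETE FROM instances WHERE deleted != 0 AND deleted_at < ?",
        (cutoff.isoformat(),),
    )
    conn.commit()
    return cur.rowcount

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instances (uuid TEXT, deleted INTEGER, deleted_at TEXT)")
now = datetime(2015, 5, 1)
rows = [
    ("old", 1, (now - timedelta(days=120)).isoformat()),    # past retention: purged
    ("recent", 1, (now - timedelta(days=30)).isoformat()),  # within retention: kept
    ("live", 0, None),                                      # not deleted: kept
]
conn.executemany("INSERT INTO instances VALUES (?, ?, ?)", rows)
print(purge_deleted_instances(conn, now))  # 1
```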
Testing Cells
• Bypass URL prior to linking a cell up
– Test API endpoint: http://nova-admin-api01.memory1-0002.XXXX.XXXXXX.XXXX:8774/v2
• Full set of tests
– Instance creates, deletes, resizes
– Overlay network creation
– Volume provisioning
– Integration with other RS products
• Trickier to test hosts being added to an existing cell
– Hosts are either enabled or disabled
– Targeting helps
• --hint target_cell='<cellname>'
• --hint 0z0ne_target_host=<host_name>
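A `target_cell` hint like the one above could be honoured by a cell filter along these lines (a sketch modelled on the cell-filter pattern, not Rackspace's actual filter; plain dicts stand in for cell objects):

```python
# Sketch: honour a target_cell scheduler hint by dropping every other cell.
# Modelled on the cell-filter pattern; dicts stand in for real cell objects.

def filter_by_target_cell(cells, filter_properties):
    """If the request carries a target_cell hint, keep only that cell."""
    hints = filter_properties.get("scheduler_hints") or {}
    target = hints.get("target_cell")
    if not target:
        return cells  # no hint: leave the candidate list untouched
    return [c for c in cells if c["name"] == target]

cells = [{"name": "cell01"}, {"name": "cell02"}]
props = {"scheduler_hints": {"target_cell": "cell02"}}
print([c["name"] for c in filter_by_target_cell(cells, props)])  # ['cell02']
```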
Managing Cells – "Disable"
• No formal way of disabling a cell
• Weighting helps - but is not absolute
– A down-weighted cell can still "win" the scheduler calculation based on available RAM
• Solution: a custom filter uses a specific weight offset value (-42) to avoid scheduling
from nova.cells import filters
from nova.openstack.common import log as logging

LOG = logging.getLogger(__name__)


class DisableCellFilter(filters.BaseCellFilter):
    """Disable cell filter.

    Drop cell if weight is -42.
    """

    def filter_all(self, cells, filter_properties):
        """Override filter_all(), which operates on the full list of cells."""
        output_cells = []
        for cell in cells:
            if cell.db_info.get('weight_offset', 0) == -42:
                LOG.debug("cell disabled: %s" % cell)
            else:
                output_cells.append(cell)
        return output_cells
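The filter's effect can be exercised in isolation with stub cell objects - a standalone re-statement of the same drop-if-minus-42 logic, since importing nova is out of scope here:

```python
# Standalone re-statement of the disable-filter logic above, using stub
# objects so it runs without a nova installation.
DISABLE_WEIGHT = -42

class StubCell:
    def __init__(self, name, weight_offset=0):
        self.name = name
        self.db_info = {"weight_offset": weight_offset}

def filter_all(cells):
    """Drop any cell whose weight_offset is the sentinel disable value."""
    return [c for c in cells if c.db_info.get("weight_offset", 0) != DISABLE_WEIGHT]

cells = [StubCell("cell01"), StubCell("cell02", weight_offset=DISABLE_WEIGHT)]
print([c.name for c in filter_all(cells)])  # ['cell01']
```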
Neutron and Cells
• Rackspace uses the Quark plugin
– https://github.com/rackerlabs/quark
• Borrowed an old idea from the Quantum/Melange days
– Default tenant for each cell
– Each cell is a segment
– Provider subnets are scoped to a segment
– Nova requests ports on the provider network for the segment
• Public
• Private
• MAC addresses too
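A hedged sketch of the segment scoping described above (data shapes are illustrative, not Quark's actual models): each cell maps to a segment, and a port request for that cell only considers provider subnets belonging to the matching segment.

```python
# Sketch of segment-scoped provider subnets: each cell is a segment, and a
# port request for a cell only draws from subnets in that segment.
# Names and shapes are illustrative, not Quark's actual data model.

def subnets_for_cell(subnets, cell_segment):
    """Return the provider subnets scoped to this cell's segment."""
    return [s for s in subnets if s["segment"] == cell_segment]

subnets = [
    {"cidr": "10.0.0.0/20", "segment": "cell01", "network": "public"},
    {"cidr": "10.0.16.0/20", "segment": "cell02", "network": "public"},
    {"cidr": "192.168.0.0/20", "segment": "cell01", "network": "private"},
]

print([s["cidr"] for s in subnets_for_cell(subnets, "cell01")])
# ['10.0.0.0/20', '192.168.0.0/20']
```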