Considerations for Operating an OpenStack Cloud
![Page 1: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/1.jpg)
© 2010 Cisco and/or its affiliates. All rights reserved. Cisco Confidential 1
Mark T. Voelker, Technical Leader @ Cisco
OpenStack ATC/StackForge Puppet Core/Foundation Member #54
All Things Open 2014
![Page 2: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/2.jpg)
@marktvoelker
• Tech Lead at Cisco, StackForge Puppet core developer, OS Foundation Member #54
• Fact: can be bribed with doughnuts
• Currently works in Cisco’s Cloud & Virtualization Group
• In copious (hah!) spare time: OpenStack solutions, Big Data, Massively Scalable Data Centers, Devops, making sawdust with extreme prejudice
![Page 3: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/3.jpg)
• Tech lead, manager, software developer, architect
• Started in OpenStack in 2011 at the Diablo Design Summit
![Page 4: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/4.jpg)
The great thing about my job is that I get to have fun exploring a lot of new things…
![Page 5: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/5.jpg)
…and I get to help build a LOT of clouds.
![Page 6: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/6.jpg)
Today’s talk won’t be overly formal…
![Page 7: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/7.jpg)
…because I tend to get excited by this stuff.
![Page 8: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/8.jpg)
![Page 9: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/9.jpg)
![Page 10: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/10.jpg)
…then you know how to get to Day 1.
Now let’s talk about getting to Day 30…
![Page 11: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/11.jpg)
• Architecture
• Components
• High Availability
• Bare metal bring-up
• Config management
• CI/CD
• Packaging
• Automated test
• Monitoring
• Up/down alerting
• Trending data
• Logging and log search
![Page 12: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/12.jpg)
High Availability? Sounds great--I’ll take two!
![Page 13: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/13.jpg)
• Consider whether you want active/active or active/passive
• Setup and tooling differ a bit, but I generally like active/active
• Note that docs.openstack.org has an HA Guide
• A bit dated…patches welcome!
• Prioritize HA for the control plane
• That also means thinking about your database, network, and RPC bus
• Instance-level HA: there be dragons
• But yes, it’s being looked at
• Pets vs cattle
• Note: HA == more hardware
• Some components need at least 3 nodes
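A rough illustration of why "at least 3 nodes" comes up: a minimal sketch, assuming the component in question relies on a strict-majority quorum (as clustered databases and brokers typically do). The function names are made up for this example.

```python
def quorum_size(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes can be lost while a majority is still reachable."""
    return nodes - quorum_size(nodes)


# Print cluster size, quorum size, and survivable failures side by side.
for cluster_size in (1, 2, 3, 5):
    print(cluster_size, quorum_size(cluster_size), tolerated_failures(cluster_size))
```

Under that assumption a two-node cluster survives no failures at all, which is why three is the practical minimum.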
![Page 14: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/14.jpg)
• Stuff OpenStack needs to run: message brokers
• Check out RabbitMQ clustering and mirrored queues
• Check out Galera for MySQL/MariaDB
• I usually see Percona XtraDB
• Frontend with an HAProxy/Keepalived pair
![Page 15: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/15.jpg)
• Don’t do RabbitMQ clustering over a WAN
• Be aware of the SELECT…FOR UPDATE issue
![Page 16: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/16.jpg)
• Long story short: Neutron and some parts of Nova invoke an SQL pattern known as “SELECT…FOR UPDATE”, which Galera doesn’t support due to issues with cross-node locking.
• Can cause deadlock symptoms.
• Neutron/Nova code is being refactored to remove it, but that likely won’t be done until at least Kilo.
• Meanwhile: use HAProxy to send writes to a single Galera node and you should be fine
• With the obvious scalability bottleneck
• More info here.
• Thank Jay Pipes & Peter Boros for the find!
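One way to picture the "send writes to a single Galera node" workaround: a minimal sketch that renders an HAProxy `listen` section in which only the first node is active and the others carry the `backup` keyword. The node names, the 192.0.2.x addresses, and the helper name `galera_backend` are assumptions for illustration, not taken from the talk.

```python
def galera_backend(nodes, port=3306):
    """Render an HAProxy 'listen' section that uses one Galera node at a time.

    The first node is the active writer. The rest are marked 'backup', so
    HAProxy only sends traffic to them if the active node fails its check.
    """
    lines = [
        "listen galera",
        f"    bind *:{port}",
        "    mode tcp",
        "    option tcpka",
    ]
    for index, (name, address) in enumerate(nodes):
        role = "" if index == 0 else " backup"
        lines.append(f"    server {name} {address}:{port} check{role}")
    return "\n".join(lines)


# Hypothetical three-node cluster using documentation-range addresses.
print(galera_backend([
    ("db1", "192.0.2.11"),
    ("db2", "192.0.2.12"),
    ("db3", "192.0.2.13"),
]))
```

All traffic lands on one node, so the scalability bottleneck mentioned above still applies.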
![Page 17: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/17.jpg)
• Use Swift, Ceph, or other highly available storage to back Glance
• Pick a highly available storage backend for Cinder too
• Use Keepalived/HAProxy to front-end multiple API servers
• Or another load balancer technology of your choice
• Can be deployed as dedicated nodes for scale, or cohabitate
• Network: DVR vs Provider Network Extensions
• Distributed Virtual Routers are a new experimental feature in Juno (not yet ready for production)
• Please go test it and report/fix bugs!
• Provider networks essentially punt the availability issue to your physical network
• Allows you to use standard tools like virtual port channels and VRRP
• Also highly performant
![Page 18: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/18.jpg)
• Architecture
• Components
• High Availability
• Bare metal bring-up
• Config management
• CI/CD
• Packaging
• Automated test
• Monitoring
• Up/down alerting
• Trending data
• Logging and log search
![Page 19: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/19.jpg)
We start with bare metal.
![Page 20: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/20.jpg)
• For a cloud of any real size, you don’t want to be installing operating systems by hand
• Remember that baremetal bringup actually isn’t something that just happens once…often recurs for upgrades, capacity expansion, etc.
• Baremetal bringup tools can also have other uses, like inventory or bootstrapping configuration management agents.
![Page 21: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/21.jpg)
• A simple (~15k lines of Python code) tool for managing baremetal deployments
• Flexible usage (API, CLI, GUI)
• Allows you to define systems (actual machines) and profiles (what you want to do with them)
• Provides hooks for Puppet so you can then do further automation once the OS is up and running
• Provides control for power (via IPMI or other means), DHCP/PXE (for netbooting machines), and more.
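The systems/profiles split can be sketched as two small data types. This is an illustrative model only, not the tool's real API: the class names, field names, and the template filename are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """What you want to do with a machine: which OS to install and how."""
    name: str
    distro: str
    install_template: str  # hypothetical unattended-install template name


@dataclass(frozen=True)
class System:
    """An actual machine, identified by the MAC address it netboots from."""
    hostname: str
    mac: str
    profile: Profile


def netboot_plan(systems):
    """Map each MAC address to the hostname and distro it should boot into."""
    return {s.mac.lower(): (s.hostname, s.profile.distro) for s in systems}


compute = Profile("compute", "ubuntu-14.04", "compute.seed")
plan = netboot_plan([
    System("compute01", "52:54:00:AA:BB:01", compute),
    System("compute02", "52:54:00:AA:BB:02", compute),
])
print(plan)
```

Keeping profiles separate from systems means adding capacity is mostly a matter of registering new machines against a profile that already exists.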
![Page 22: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/22.jpg)
![Page 23: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/23.jpg)
![Page 24: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/24.jpg)
• Razor
• Developed by EMC, managed by Puppet Labs (occasionally used with Chef too)
• Initial release in 2012
• Uses a “microkernel” loaded onto the machine to gather facts before provisioning
• Tag + Policy model
• Crowbar
• Originally written by Dell, now a community project
• Originally designed to deploy OpenStack all the way up from bare metal
• Now deploys other stuff too (namely, Hadoop)
• Uses Chef to handle everything after the OS install
• Foreman
• Used by Red Hat among others
• Does baremetal bringup and serves as a Puppet ENC
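A "tag + policy" model can be sketched roughly as follows: facts gathered from a machine are matched against tag rules, and the first policy whose required tags are all present decides what gets installed. This is a simplified illustration. The rule format, the first-match ordering, and every name below are assumptions, not Razor's actual implementation.

```python
def matching_tags(facts, tag_rules):
    """Return the names of all tags whose rule accepts the machine's facts."""
    return {tag for tag, rule in tag_rules.items() if rule(facts)}


def pick_policy(facts, tag_rules, policies):
    """Return the first policy whose required tags all match, else None."""
    tags = matching_tags(facts, tag_rules)
    for policy in policies:
        if set(policy["tags"]) <= tags:
            return policy["name"]
    return None


# Hypothetical rules keyed on facts a discovery image might report.
TAG_RULES = {
    "big_memory": lambda f: f.get("memory_gb", 0) >= 128,
    "has_ssd": lambda f: f.get("ssd_count", 0) > 0,
}
POLICIES = [
    {"name": "storage-node", "tags": ["has_ssd", "big_memory"]},
    {"name": "compute-node", "tags": ["big_memory"]},
]

print(pick_policy({"memory_gb": 256, "ssd_count": 0}, TAG_RULES, POLICIES))
```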
![Page 25: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/25.jpg)
• Architecture
• Components
• High Availability
• Bare metal bring-up
• Config management
• CI/CD
• Packaging
• Automated test
• Monitoring
• Up/down alerting
• Trending data
• Logging and log search
![Page 26: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/26.jpg)
![Page 27: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/27.jpg)
“Cloud isn’t just an infrastructure technology….it’s a new operations model. And with OpenStack in particular, it’s one that’s very well suited to a DevOps style of management. Many companies aren’t just adopting cloud, they’re changing how they operate.”
“Besides, logging into servers to mess with config files makes me sad.”
--That ranty guy in Raleigh again
![Page 28: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/28.jpg)
• Remember, OpenStack is a set of interoperating distributed systems
• That means you’re going to have a lot of software to configure on a lot of machines
• You’re probably going to want to make changes over time
• You’re probably going to have more than one person touching your cloud
• CM tools help you treat configuration as code, so you can collaborate more easily
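To make "configuration as code" concrete, here is a minimal sketch of the core idea CM tools build on: declare the desired value and apply it idempotently, so running it twice changes nothing the second time. It uses only Python's standard `configparser`. The helper name `ensure_option` and the sample option are illustrative, and real CM tools do far more than this.

```python
import configparser
import io


def ensure_option(ini_text, section, option, value):
    """Return (new_text, changed) with section/option set to value.

    Idempotent: if the option already has the desired value, the text is
    returned untouched and changed is False.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    if parser.get(section, option, fallback=None) == value:
        return ini_text, False
    if section != parser.default_section and not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, option, value)
    rendered = io.StringIO()
    parser.write(rendered)
    return rendered.getvalue(), True


text, changed = ensure_option("[DEFAULT]\ndebug = false\n", "DEFAULT", "debug", "true")
print(changed)
text, changed = ensure_option(text, "DEFAULT", "debug", "true")
print(changed)
```

Note that rewriting a file through `configparser` drops comments, which is one of many reasons to reach for a real CM tool rather than a homegrown script.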
![Page 29: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/29.jpg)
Pile of Bash Scripts
![Page 30: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/30.jpg)
![Page 31: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/31.jpg)
• An increasingly common pattern:
• Puppet or Chef for configuration management, PLUS
• Ansible or Salt for cross-node orchestration
• Recommendation: use the tools that work for you!
• But remember: you don’t have to do it alone.
• Several CM tools have thriving collaborators in the OpenStack community
• Links for later:
• Puppet for OpenStack
• Chef for OpenStack
• Ansible for OpenStack
• SaltStack for OpenStack
• Pile of bash scripts for OpenStack
![Page 32: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/32.jpg)
• Unit tests for your deployment code are a good idea
• ServerSpec tests to make sure your config management system did what it was supposed to are great
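ServerSpec itself is a Ruby tool. The same "did my config management do what it was supposed to?" idea can be sketched in Python as a post-deployment check that the expected service ports are accepting connections. The port numbers below are common API defaults and may differ in your deployment, and the helper names are made up for this example.

```python
import socket

# Assumed default API ports; adjust to match your own deployment.
EXPECTED_PORTS = {"keystone": 5000, "nova-api": 8774, "glance-api": 9292}


def port_is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def failed_checks(host, expected_ports):
    """Return the sorted names of services whose port is not reachable."""
    return sorted(
        name for name, port in expected_ports.items()
        if not port_is_listening(host, port)
    )
```

Calling `failed_checks("controller01.example.com", EXPECTED_PORTS)` (hostname hypothetical) after a CM run returns an empty list when every service came up.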
![Page 33: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/33.jpg)
• Architecture
• Components
• High Availability
• Bare metal bring-up
• Config management
• CI/CD
• Packaging
• Automated test
• Monitoring
• Up/down alerting
• Trending data
• Logging and log search
![Page 34: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/34.jpg)
…well, haven’t you always wanted a butler?
![Page 35: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/35.jpg)
• DevOps: actually pretty handy
• OpenStack change velocity (community’s and yours)
• Anecdote: the majority of deployments I work with have some customizations or backports from future releases
• It’s not just OpenStack, it’s all the underpinning components and your CM code too!
![Page 36: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/36.jpg)
• OpenStack itself uses CI/CD tools in its development process…you should consider using them in your cloud buildout too!
• The OpenStack Infra team has created some awesome tools: JJB, Zuul, etc
• They’re all open source and you can even see how OpenStack’s own CI is set up (check out Elizabeth Joseph’s slides from yesterday for more!).
• The basics:
• An integration server (Jenkins, Go, Travis, etc)
• A code review and repository tool (Gerrit, Cgit, GitHub, etc)
• A battery of automated tests (lint checks, rspec-puppet, Tempest, Rally, etc)
• Some form of packaging (rpmbuild/mock, sbuilder/pbuilder, etc)
• An artifact repository (Artifactory, yum/apt repos, etc)
• Optionally, some deployment jobs (usually powered by your CM tool)
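The basics above chain together as a gated pipeline: each stage runs only if the previous one passed. A toy sketch of that control flow follows. The stage names mirror the list, but the callables are placeholders, where a real integration server would run actual jobs.

```python
def run_pipeline(stages):
    """Run (name, callable) stages in order, stopping at the first failure.

    Returns a list of (name, passed) pairs for the stages that actually ran.
    """
    results = []
    for name, step in stages:
        passed = bool(step())
        results.append((name, passed))
        if not passed:
            break
    return results


# Placeholder steps: here the automated tests fail, so nothing is packaged.
stages = [
    ("lint", lambda: True),
    ("automated tests", lambda: False),
    ("package", lambda: True),
    ("publish artifact", lambda: True),
    ("deploy", lambda: True),
]
for name, passed in run_pipeline(stages):
    print(name, "passed" if passed else "FAILED")
```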
![Page 37: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/37.jpg)
• …you never intend to change the code yourself
• …building your own packages would violate a support contract with your distribution
• …you’ve never used a CI/CD pipeline before (but really: you should start learning)
• …you have a static environment that absolutely will not change, never needs added capacity, etc.
![Page 38: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/38.jpg)
• Architecture
• Components
• High Availability
• Bare metal bring-up
• Config management
• CI/CD
• Packaging
• Automated test
• Monitoring
• Up/down alerting
• Trending data
• Logging and log search
![Page 39: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/39.jpg)
• Now that you have a cloud, you’ll probably want to know that all its parts stay in good working order.
![Page 40: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/40.jpg)
![Page 41: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/41.jpg)
![Page 42: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/42.jpg)
• I’ve worked on a lot of OpenStack clouds and almost everyone has their own preferred monitoring toolset.
• One possible exception: almost everybody seems to love Graphite.
• The golden rule is: use the tools that work for you!
• Very often this will be whatever you’re using in the rest of your infrastructure.
• Break it down into at least two buckets:
• Up/down and alerting (ex: Nagios or its derivatives…yes, there are OpenStack plugins out there on Nagios Exchange)
• Trending data collection/plotting (ex: collectd/statsd feeding Graphite)
• Also: use your peers!
• Check out Tong Li’s Monitoring as a Service talk later today!
• Operators are often willing to share, so ask on the openstack-operators list.
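The two buckets can be sketched side by side: a Nagios-style check maps a measurement to an exit status (0 OK, 1 WARNING, 2 CRITICAL), while a trending collector formats the same measurement as a Graphite plaintext line (`path value timestamp`). The metric path and thresholds are invented for illustration.

```python
import time

# Conventional Nagios plugin exit codes.
NAGIOS_OK, NAGIOS_WARNING, NAGIOS_CRITICAL = 0, 1, 2


def check_threshold(value, warn, crit):
    """Up/down bucket: turn a measurement into a Nagios-style status code."""
    if value >= crit:
        return NAGIOS_CRITICAL
    if value >= warn:
        return NAGIOS_WARNING
    return NAGIOS_OK


def graphite_line(path, value, timestamp=None):
    """Trending bucket: format one sample in Graphite's plaintext protocol."""
    ts = int(time.time() if timestamp is None else timestamp)
    return f"{path} {value} {ts}\n"


latency_ms = 850  # hypothetical API response time
print(check_threshold(latency_ms, warn=500, crit=2000))
print(graphite_line("cloud.nova_api.latency_ms", latency_ms), end="")
```

The same number feeds both buckets: the status code answers "should someone be paged now?", the Graphite line answers "how has this moved over the last month?".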
![Page 43: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/43.jpg)
• Architecture
• Components
• High Availability
• Bare metal bring-up
• Config management
• CI/CD
• Packaging
• Automated test
• Monitoring
• Up/down alerting
• Trending data
• Logging and log search
![Page 44: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/44.jpg)
![Page 45: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/45.jpg)
• Distributed systems generate logs…all over the place.
• Finding the root of problems may mean correlating logs from different machines…but which?
• OpenStack in particular *can* be pretty verbose
• You may also be dealing with logs from other distributed tools in your cloud (RabbitMQ, databases, etc)
• Generally you want to get logs together, be able to search them, and be able to visualize them.
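What "getting logs together and searching them" buys you can be shown in miniature: merge per-host logs into one timeline and pull out every line that mentions a given request ID. This is a toy sketch with invented log lines. The `(timestamp, host, message)` tuple layout is an assumption, and a real deployment would use a log aggregation stack instead.

```python
import heapq


def trace_request(request_id, *streams):
    """Merge time-sorted log streams and keep entries mentioning request_id.

    Each stream is an iterable of (timestamp, host, message) tuples that is
    already sorted by timestamp, as a single host's log file would be.
    """
    merged = heapq.merge(*streams, key=lambda entry: entry[0])
    return [entry for entry in merged if request_id in entry[2]]


# Invented sample logs from two hypothetical hosts.
api_log = [
    ("2014-10-22T10:00:01", "api01", "req-abc123 POST /servers"),
    ("2014-10-22T10:00:05", "api01", "req-def456 GET /servers"),
]
compute_log = [
    ("2014-10-22T10:00:03", "compute07", "req-abc123 spawning instance"),
    ("2014-10-22T10:00:09", "compute07", "req-abc123 instance spawn failed"),
]
for timestamp, host, message in trace_request("req-abc123", api_log, compute_log):
    print(timestamp, host, message)
```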
![Page 46: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/46.jpg)
Unlike monitoring tools, there seems to be pretty broad consensus on good tools here in deployments I’ve worked with…
![Page 47: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/47.jpg)
http://www.elasticsearch.org/blog/openstack-elastic-recheck-powered-elk-stack/
• Kibana (visualization)
• Logstash (collection)
• Elasticsearch (search/analytics)
![Page 48: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/48.jpg)
Questions?
@marktvoelker
http://openstack.org/
http://cisco.com/go/openstack/
(yes, we’re hiring!)
![Page 49: Considerations for Operating an OpenStack Cloud](https://reader034.vdocuments.mx/reader034/viewer/2022051617/55a4dea01a28aba70e8b4591/html5/thumbnails/49.jpg)