anynines - Running Cloud Foundry for 12 months - An experience report
DESCRIPTION
anynines runs a public PaaS based on Cloud Foundry, hosted in a German datacenter. In more than 12 months of running a Cloud Foundry PaaS, many lessons about security, high availability, OpenStack and other exciting topics have been learned. See how Bosh can be used and how it shouldn't be used. Learn how to perform Cloud Foundry upgrades and read how to harden Cloud Foundry by adding more fault tolerance with Pacemaker.
TRANSCRIPT
![Page 1: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/1.jpg)
Running Cloud Foundry An Experience Report
![Page 2: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/2.jpg)
About this talk
![Page 3: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/3.jpg)
• Hear an opinion on running Cloud Foundry (CF)
• How to shoot yourself in the foot with CF and over-commitment settings
• How to perform CF updates
• How to harden CF
• Wise words about CF services
![Page 4: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/4.jpg)
Introduction
![Page 5: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/5.jpg)
about.me/fischerjulian
![Page 6: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/6.jpg)
Running a public Cloud Foundry
for more than a year.
![Page 7: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/7.jpg)
It works.
![Page 8: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/8.jpg)
In order to run Cloud Foundry smoothly …
![Page 9: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/9.jpg)
… refer to the package leaflet for risks and side effects and consult Pivotal, CloudCredo or anynines.
![Page 10: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/10.jpg)
The details
![Page 11: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/11.jpg)
The anynines Stack
![Page 12: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/12.jpg)
Hardware
OpenStack
Cloud Foundry
VMware
![Page 13: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/13.jpg)
We migrated from a rented VMware to a
self-hosted OpenStack.
![Page 15: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/15.jpg)
Proof point made…
![Page 16: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/16.jpg)
Cloud Foundry protects investments in software development
by being infrastructure-agnostic.
![Page 17: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/17.jpg)
Running Cloud Foundry. What happened.
![Page 18: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/18.jpg)
Security Issues
![Page 19: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/19.jpg)
• Pivotal informs partners early about issues
• Usually along with fixes
![Page 20: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/20.jpg)
OpenStack Issues
![Page 21: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/21.jpg)
• Ext4 vs. Ext3
• DEA MTU
• rsyslogd command not found
![Page 22: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/22.jpg)
CF Gotchas
![Page 23: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/23.jpg)
DEA evacuation & Bosh timeout race condition
![Page 24: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/24.jpg)
• Removing a DEA → Apps will be evacuated → DEA will be stopped
• Bosh deployment will fail when evacuation takes longer than the Bosh timeout
• Set your Bosh timeout accordingly!
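The race described above can be checked with a little arithmetic before a deployment. The following is only a rough sketch: the function names, the per-instance move time, the degree of parallelism and the safety factor are all assumptions for illustration, not values taken from Bosh or the DEA.

```python
import math


def estimated_evacuation_seconds(instances, seconds_per_instance, parallel_moves=1):
    """Rough upper bound for draining one DEA (all inputs are assumed values)."""
    batches = math.ceil(instances / parallel_moves)
    return batches * seconds_per_instance


def timeout_is_sufficient(timeout_seconds, evacuation_seconds, safety_factor=1.5):
    """True if the timeout leaves headroom over the estimated evacuation time."""
    return timeout_seconds >= evacuation_seconds * safety_factor


# Hypothetical DEA: 40 app instances, 15 s per move, 4 moves in parallel.
evac = estimated_evacuation_seconds(instances=40, seconds_per_instance=15, parallel_moves=4)
print(evac)                              # 150
print(timeout_is_sufficient(200, evac))  # False: 200 < 150 * 1.5
print(timeout_is_sufficient(300, evac))  # True
```

Measuring the real evacuation time on a staging system is the safer way to obtain the inputs.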
![Page 25: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/25.jpg)
DEA over-commitment
![Page 26: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/26.jpg)
Default overcommitment factor = 4
![Page 27: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/27.jpg)
RAM peaks may cause random errors
![Page 28: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/28.jpg)
• Failures during staging
• Random application crashes
• No meaningful log information
![Page 29: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/29.jpg)
Reducing over-commitment
![Page 30: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/30.jpg)
• Naive strategy
• Reduce over-commitment factor
• Bosh deploy
![Page 31: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/31.jpg)
![Page 32: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/32.jpg)
• 8 GB VM, OC factor 4 → Announces 32 GB (V)RAM
• 8 GB VM, OC factor 2 → Announces 16 GB (V)RAM
• When evacuating a 32 GB (V)RAM host, another 32 GB (V)RAM host will be preferred (more free space)
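The arithmetic behind these bullets can be sketched in a few lines. This is a simplified model, assuming that placement simply prefers the DEA with the most free advertised memory; the real placement logic considers more than that, and the class and function names are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Dea:
    name: str
    physical_gb: int
    overcommit_factor: int
    reserved_gb: int = 0  # memory already promised to running apps

    @property
    def advertised_gb(self):
        # A DEA announces its physical RAM multiplied by the over-commitment factor.
        return self.physical_gb * self.overcommit_factor

    @property
    def free_gb(self):
        return self.advertised_gb - self.reserved_gb


def preferred_target(candidates):
    """Simplified placement: pick the DEA with the most free advertised memory."""
    return max(candidates, key=lambda dea: dea.free_gb)


old = Dea("old-dea", physical_gb=8, overcommit_factor=4, reserved_gb=10)
new = Dea("new-dea", physical_gb=8, overcommit_factor=2)

print(old.advertised_gb, new.advertised_gb)  # 32 16
print(preferred_target([new, old]).name)     # old-dea (22 GB free beats 16 GB)
```

Even a partly filled old DEA can look emptier than a brand-new one, which is why evacuated apps tend to land on old DEAs first.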
![Page 33: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/33.jpg)
Evacuation Wave
![Page 34: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/34.jpg)
1 GB
1 GB
1 GB
1 GB
![Page 35: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/35.jpg)
= maximum impact on running apps!
![Page 36: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/36.jpg)
New DEAs (OC 2) will receive apps when old DEAs
(OC 4) have been stopped.
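The evacuation wave can be reproduced with a toy simulation. It rests on assumptions of its own: every app needs 1 GB, DEAs are replaced one at a time, and evacuated apps always go to the DEA with the most free advertised memory. None of this is the actual Cloud Controller or DEA code.

```python
from dataclasses import dataclass, field


@dataclass
class Dea:
    name: str
    advertised_gb: int
    apps: list = field(default_factory=list)

    @property
    def free_gb(self):
        return self.advertised_gb - len(self.apps)  # every app is assumed to use 1 GB


def rolling_update(deas, new_advertised_gb):
    """Replace DEAs one by one and count how often each app gets moved."""
    moves = {}
    for index, dea in enumerate(deas):
        targets = [d for d in deas if d is not dea]
        for app in dea.apps:
            target = max(targets, key=lambda d: d.free_gb)
            target.apps.append(app)
            moves[app] = moves.get(app, 0) + 1
        # The drained DEA comes back empty with the reduced over-commitment.
        deas[index] = Dea(dea.name, new_advertised_gb)
    return moves


# Three old DEAs (8 GB, OC 4 -> 32 GB advertised) with four 1 GB apps each.
deas = [Dea(f"dea{i}", 32, [f"app{i}-{j}" for j in range(4)]) for i in range(3)]
moves = rolling_update(deas, new_advertised_gb=16)

print(sum(moves.values()))  # 22 moves for 12 apps
print(max(moves.values()))  # 3: some apps are restarted on every step
```

In this model the apps keep piling up on the remaining old DEAs and are moved again each time one of those is drained.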
![Page 37: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/37.jpg)
Hints
![Page 38: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/38.jpg)
• Create 2nd resource pool for new DEAs
• Deploy the 2nd resource pool before starting to stop old DEAs
• (-) Needs more resources
• (+) Smoother transition
![Page 39: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/39.jpg)
Updating Cloud Foundry
![Page 40: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/40.jpg)
Required: Staging System
![Page 41: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/41.jpg)
• Structurally identical
• Fewer VMs
![Page 42: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/42.jpg)
1. Determine new features
since last release
![Page 43: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/43.jpg)
2. Study
deployment manifest changes
![Page 44: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/44.jpg)
3. Apply
deployment manifest changes
![Page 45: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/45.jpg)
4. First staging attempt
![Page 46: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/46.jpg)
5. Debug and Fix it!
![Page 47: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/47.jpg)
6. Simulate the live-upgrade
![Page 49: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/49.jpg)
8. Perform the upgrade
and cross fingers.
![Page 50: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/50.jpg)
CF Hardening
![Page 51: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/51.jpg)
Accept that VMs are ephemeral
![Page 52: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/52.jpg)
VM Failover Strategies
![Page 53: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/53.jpg)
Resurrect
![Page 54: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/54.jpg)
• Monitor VM
• Rebuild VMs automatically
• e.g. using Cloud Foundry Bosh
• + Easy
• - Takes a long time (minutes, not seconds)
• - OpenStack doesn’t release persistent disks automatically
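The monitoring half of the resurrect strategy boils down to spotting VMs whose heartbeat is overdue. A minimal sketch, assuming heartbeats are recorded as timestamps per VM; the function name, the data layout and the 60-second threshold are illustrative and not taken from Bosh.

```python
def vms_to_rebuild(last_heartbeat, now, timeout_seconds=60.0):
    """Return the VMs whose last heartbeat is older than the timeout."""
    return sorted(
        vm for vm, seen_at in last_heartbeat.items()
        if now - seen_at > timeout_seconds
    )


# Hypothetical heartbeat table: VM name -> time of last heartbeat in seconds.
heartbeats = {"uaa/0": 100.0, "cc/0": 30.0, "dea/0": 115.0}
print(vms_to_rebuild(heartbeats, now=120.0))  # ['cc/0']
```

Rebuilding the listed VMs is the slow part: the replacement has to be created and configured, which is where the minutes go.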
![Page 55: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/55.jpg)
Failover to Standby VM
![Page 56: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/56.jpg)
Distribute CF components across availability zones
![Page 57: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/57.jpg)
• Build disjoint networks, racks, etc.
• Each disjoint zone = availability zone
• Tell your IaaS about availability zones
• Choose the AZ when provisioning
• Build Bosh releases accordingly
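Spreading instances over zones can be sketched as a simple round-robin. The component names, instance counts and zone names below are placeholders, and the function only computes a placement plan; it does not talk to any IaaS or to Bosh.

```python
from __future__ import annotations


def spread_across_zones(components: dict[str, int], zones: list[str]) -> dict[str, list[str]]:
    """Assign each component's instances to availability zones round-robin."""
    return {
        name: [zones[i % len(zones)] for i in range(count)]
        for name, count in components.items()
    }


plan = spread_across_zones({"uaa": 2, "cc": 2, "dea": 4}, ["az1", "az2"])
print(plan["uaa"])  # ['az1', 'az2']
print(plan["dea"])  # ['az1', 'az2', 'az1', 'az2']
```

With at least two instances per component, losing one zone still leaves an instance of every component running in the other.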
![Page 58: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/58.jpg)
• Provide stand-by VM
• Monitor VM and perform failover
• IP failover using Pacemaker
• + Fast failover (seconds)
• - Pacemaker is not easy to use (or to boshify)
• - Increased resource usage by standby VM(s)
![Page 59: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/59.jpg)
• 2 * UAA
• 2 * CC
• 2 * n * DEAs
• 2 * Health Manager
• …
![Page 60: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/60.jpg)
UAA & CC DB =
SPOF
![Page 61: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/61.jpg)
HA Postgres
![Page 62: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/62.jpg)
• UAA and Cloud Controller database
• Single point of failure for Cloud Foundry
![Page 63: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/63.jpg)
• Postgres not inherently clusterable → failover with standby VM
• Master/slave replication
• Pacemaker/corosync
• IP failover using NIC reattachment
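The failover decision for such a master/standby pair can be modelled in a few lines. This sketch only captures the decision logic under simplified assumptions (one master, health known as a boolean); in the setup described above, Pacemaker/corosync would make this decision and the NIC reattachment would move the IP. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    role: str  # "master", "standby" or "failed"
    healthy: bool = True


def fail_over(nodes, ip_owner):
    """Return the node name that should hold the service IP after a health check."""
    master = next(n for n in nodes if n.role == "master")
    if master.healthy:
        return ip_owner  # nothing to do
    standby = next((n for n in nodes if n.role == "standby" and n.healthy), None)
    if standby is None:
        raise RuntimeError("no healthy standby available")
    master.role = "failed"
    standby.role = "master"  # promote the replica
    return standby.name      # the service IP follows the new master


cluster = [Node("pg0", "master"), Node("pg1", "standby")]
print(fail_over(cluster, ip_owner="pg0"))  # pg0
cluster[0].healthy = False
print(fail_over(cluster, ip_owner="pg0"))  # pg1
```

After a failover the cluster runs without a standby, so the failed node has to be rebuilt as a new standby to restore redundancy.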
![Page 64: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/64.jpg)
That’s halfway towards a PostgreSQL CF Service
![Page 65: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/65.jpg)
• Add a V2 Service Broker
• Add provisioning logic
• Provision a 2-node DB cluster on cf create-service postgres medium-cluster
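The provisioning logic behind such a broker can be sketched as a plain function. In the V2 Service Broker API, provisioning is a PUT on /v2/service_instances/:instance_id; the sketch below leaves out the HTTP layer and the actual VM creation. The plan catalog, the node naming and the in-memory store are assumptions for illustration, and the status codes follow the broker API as I understand it (201 created, 200 already exists with the same plan, 409 conflict).

```python
# Hypothetical plan catalog: plan name -> cluster size.
PLANS = {"medium-cluster": {"nodes": 2}}


def provision(instance_id, plan_name, store):
    """Handle a provisioning request and return (HTTP status, response body)."""
    plan = PLANS.get(plan_name)
    if plan is None:
        return 400, {"description": f"unknown plan {plan_name!r}"}

    existing = store.get(instance_id)
    if existing is not None:
        # Same request again is fine; a different plan for the same id is a conflict.
        return (200 if existing["plan"] == plan_name else 409), {}

    # A real broker would ask Bosh (or similar) to create these VMs.
    nodes = [f"{instance_id}-pg{i}" for i in range(plan["nodes"])]
    store[instance_id] = {"plan": plan_name, "master": nodes[0], "standbys": nodes[1:]}
    return 201, {}


instances = {}
print(provision("abc123", "medium-cluster", instances))  # (201, {})
print(instances["abc123"]["standbys"])                   # ['abc123-pg1']
print(provision("abc123", "medium-cluster", instances))  # (200, {})
```

Keeping the handler idempotent matters because the platform may repeat a provisioning request after a timeout.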
![Page 66: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/66.jpg)
Services
![Page 67: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/67.jpg)
“The best way to find yourself is to lose yourself in the service of others.”
― Mahatma Gandhi
![Page 68: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/68.jpg)
Wardenized Services (community services)
are cute for pet projects.
![Page 69: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/69.jpg)
Not suitable for production.
![Page 70: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/70.jpg)
• Implementations are outdated
• One size doesn’t fit all!
![Page 71: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/71.jpg)
No production CF without high-quality services.
![Page 72: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/72.jpg)
CF Service Design
![Page 73: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/73.jpg)
• Use clusterable services if possible
• Implement automatic failover if not
• Autoprovisioning using Bosh
• Organize self-healing
• (Semi-)Automatic recovery from degraded mode
![Page 74: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/74.jpg)
Summary
![Page 75: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/75.jpg)
• Bosh & the CF release are powerful, yet you can cut yourself on them.
• HA services are essential.
• CF is ready to be used in production.
![Page 76: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/76.jpg)
Questions?
![Page 77: Anynines - Running Cloud Foundry for 12 months - An experience report](https://reader033.vdocuments.mx/reader033/viewer/2022061123/54740e84b4af9f9d0a8b5595/html5/thumbnails/77.jpg)
Thank you!