Ganeti
Private Cloud as Google does it
Helga Velroyen
LinuxTag Berlin, May 9th, 2014
A Ganeti Cluster
· Instance: a virtualization guest
· Node: a virtualization host
· Nodegroup: a homogeneous set of nodes
· Cluster: a set of nodes, managed as a collective, partitioned by nodegroups
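The four terms of "A Ganeti Cluster" nest inside each other, which a small data model can make explicit. This is only an illustrative sketch: the class names, fields and example data are assumptions made for this example and are not Ganeti's internal objects.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Instance:
    """A virtualization guest."""
    name: str


@dataclass
class Node:
    """A virtualization host that runs instances."""
    name: str
    instances: List[Instance] = field(default_factory=list)


@dataclass
class NodeGroup:
    """A homogeneous set of nodes."""
    name: str
    nodes: List[Node] = field(default_factory=list)


@dataclass
class Cluster:
    """A set of nodes managed as a collective, partitioned by node groups."""
    name: str
    groups: Dict[str, NodeGroup] = field(default_factory=dict)

    def all_nodes(self) -> List[Node]:
        # Flatten the node groups into one list of nodes.
        return [node for group in self.groups.values() for node in group.nodes]

    def all_instances(self) -> List[Instance]:
        # Flatten further into the instances hosted on those nodes.
        return [inst for node in self.all_nodes() for inst in node.instances]


# Hypothetical example data: one cluster, two node groups, three nodes.
cluster = Cluster(
    name="cluster1",
    groups={
        "rack1": NodeGroup("rack1", [Node("node1", [Instance("vm1")]), Node("node2")]),
        "rack2": NodeGroup("rack2", [Node("node3", [Instance("vm2"), Instance("vm3")])]),
    },
)
```

Treating the cluster as the top-level object mirrors the idea of interacting with the cluster as an entity rather than with individual machines.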
What can it do?
· Manage clusters of physical machines
· Deploy virtual machines on them
  - Resiliency to failure (distributed storage)
  - Live migration
  - Ease of repairs and hardware swaps
  - Cluster balancing
Ideas
· Interact with the cluster as an entity, instead of the individual machines
· Keep the entry level for virtualization as low as possible
  - Easy to install/manage
  - Lightweight (no "expensive" dependencies)
  - No specialized hardware needed (e.g. SANs)
  - Start small, grow big
· Scale to enterprise ecosystems
  - Manage from 1 to ~200 host machines simultaneously
  - Access to advanced features (distributed storage, live migration, cluster balancing)
Technologies
· Linux and standard utils (iproute2, bridge-utils, ssh)
· Hypervisors:
  - Xen, KVM, LXC
· Storage:
  - DRBD, LVM, file, distributed storage, Ceph/Gluster
· Programming languages:
  - Python, Haskell
Controlling Ganeti
· Command line (*)
· RAPI (RESTful HTTP interface) (*)
· Web interfaces:
  - Ganeti Web Manager, aimed at admins, but includes "self-service management" for users
  - ganetimgr, a simplified multi-cluster web manager for end users
  - Synnefo, a complete cloud service solution, OpenStack API compatible
(*) Programmable interfaces
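As a rough illustration of what "programmable" means for RAPI, the sketch below builds resource URLs and fetches them as JSON. It assumes the default RAPI setup (HTTPS on port 5080, version-2 resources such as info, nodes and instances); the helper names rapi_url and get_json and the host name are made up for this example, and authentication is left out.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

RAPI_PORT = 5080  # assumed default RAPI port


def rapi_url(master: str, resource: str, name: str = "", port: int = RAPI_PORT) -> str:
    """Build the URL of a version-2 RAPI resource, e.g. "info", "nodes" or "instances"."""
    path = f"/2/{resource}"
    if name:
        # Percent-encode the item name so it is safe inside a URL path.
        path += "/" + quote(name, safe="")
    return f"https://{master}:{port}{path}"


def get_json(url: str, timeout: float = 10.0):
    """Fetch a RAPI resource and decode the JSON reply.

    Assumes the master's certificate is trusted by the local system and that
    the resource can be read without credentials.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

For example, rapi_url("cluster.example.com", "instances") gives https://cluster.example.com:5080/2/instances, which get_json could then fetch from a machine that can reach the master node.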
Production cluster
As we use it in a Google Datacentre
[Figure: a Ganeti cluster consisting of several Ganeti node groups, one per rack, each containing several Ganeti nodes and covered by per-machine monitoring. Further labels: current master node, Remote API, SSH access.]
Fleet at Google
[Figure: Ganeti clusters spread over Datacenter X, Datacenter Y, Datacenter Z and the Zurich office. Each datacenter cluster has a type (Dedicated, General or Ubiquity) and a maintenance window (A or B); Ubiquity clusters have no maintenance window. The Zurich office cluster is of type Office and has no maintenance window. Further labels: Fleet Management (Virgil, Euripides, Dradis), VM transfer.]
Instance provisioning at Google
[Figure: instance provisioning flow. Labels: Alloc request, Virgil, Machine DB, Monitoring, scan capacity, gather capacity, RAPI interface, and Ganeti clusters of type General, Ubiquity and Dedicated.]
Auto node repair at Google
[Figure: Euripides, Virgil, the Machine database and a Ganeti cluster of several Ganeti HW machines, one of which is broken HW. The numbered steps are:]
· Monitoring detects fault (1)
· Send machine to repairs (2)
· Mark machine broken (3)
· Tell cluster to evacuate the broken machine (4)
· Send to repairs (5)
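The numbered repair steps can be read as a small ordered workflow. The sketch below only models that ordering: the RepairStep enum, the run_repair function and the perform callback are hypothetical names for this example and say nothing about how Euripides or Virgil are actually implemented.

```python
from enum import Enum
from typing import Callable, List


class RepairStep(Enum):
    DETECT_FAULT = 1      # Monitoring detects fault
    REQUEST_REPAIR = 2    # Send machine to repairs
    MARK_BROKEN = 3       # Mark machine broken
    EVACUATE = 4          # Tell cluster to evacuate the broken machine
    SEND_TO_REPAIRS = 5   # Send to repairs


def run_repair(machine: str, perform: Callable[[RepairStep, str], bool]) -> List[RepairStep]:
    """Run the steps in numeric order and stop at the first one that fails.

    `perform` is a caller-supplied (hypothetical) hook that carries out one
    step for the given machine and reports success. Returns the completed steps.
    """
    done: List[RepairStep] = []
    for step in sorted(RepairStep, key=lambda s: s.value):
        if not perform(step, machine):
            break
        done.append(step)
    return done
```

Stopping at the first failed step keeps a machine from being sent to repairs while it still hosts instances, which is one plausible reason for evacuating (4) before sending to repairs (5).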
Auto node readd at Google
[Figure: Euripides, the Machine DB, Virgil, Dradis and several Ganeti HW machines, one of which is repaired HW. The numbered steps are:]
· Detects machine was repaired (1)
· Watches machine for 24 hours (2)
· Tells Virgil to reintegrate machine (3)
· Configure machine (4)
· Tell cluster to add it (5)
· Mark machine serving (6)
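Step (2), watching a repaired machine for 24 hours before reintegrating it, amounts to a simple gate. The sketch below is one possible reading: the slide only gives the 24 hour period, so the rule that no fault may occur during the watch, and the names WATCH_PERIOD and ready_for_readd, are assumptions for this example.

```python
from datetime import datetime, timedelta
from typing import Iterable

WATCH_PERIOD = timedelta(hours=24)  # from the slide: "Watches machine for 24hrs"


def ready_for_readd(watch_started: datetime,
                    fault_times: Iterable[datetime],
                    now: datetime) -> bool:
    """Return True once the machine has been watched for the full period
    without any fault being recorded (the fault-free rule is an assumption)."""
    if now - watch_started < WATCH_PERIOD:
        return False
    return not any(watch_started <= t <= now for t in fault_times)
```

Only when this gate passes would the remaining steps (3) to (6) run.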
Ganeti 2.8, 2.9
2.8.4
· Downgrading
· Autorepair tool
· Hroller
· Improvements on storage, monitoring
2.9.6
· DRBD 8.4 support
· Continued work on monitoring, storage, hroller
Ganeti 2.10
2.10.3, available in Debian wheezy-backports and Debian jessie
· Cross-cluster instance moves:
  - automatic node allocation on destination cluster
  - convert disk templates on the fly
· Cluster balancing based on CPU load
· KVM: hotplug support, direct access to RBD storage
· Ganeti upgrades!
Updates
In the past, updating Ganeti was a pain:
    /etc/init.d/ganeti stop                                  # on all nodes
    apt-get install ganeti2=2.7.1-1 ganeti-htools=2.7.1-1    # on all nodes
    /usr/lib/ganeti/tools/cfgupgrade                         # on master
    /etc/init.d/ganeti start                                 # on all nodes
    gnt-cluster redist-conf                                  # on master
    # ... lots of other steps, depending on the version
    # If something goes wrong, fix the mess manually.
From 2.10 on, Ganeti comes with a built-in upgrade mechanism:
    apt-get install ganeti-2.11      # on all nodes
    gnt-cluster upgrade --to 2.11    # on master
    gnt-cluster upgrade --to 2.10    # to roll back
Note that you still have to install the new packages and uninstall the old ones manually.
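Because the new procedure is just "install on all nodes, then upgrade from the master", it is easy to script. The sketch below only builds the list of commands from the slide and does not execute anything; the function name upgrade_plan and the node names are made up for this example, and how the commands reach each host (for instance over SSH) is left open.

```python
from typing import List, Tuple


def upgrade_plan(nodes: List[str], master: str, version: str) -> List[Tuple[str, str]]:
    """Return (host, command) pairs: install the new packages on every node,
    then run the built-in upgrade once on the master."""
    plan = [(node, f"apt-get install ganeti-{version}") for node in nodes]
    plan.append((master, f"gnt-cluster upgrade --to {version}"))
    return plan


# Hypothetical two-node cluster with node1 as master.
for host, command in upgrade_plan(["node1", "node2"], "node1", "2.11"):
    print(f"{host}: {command}")
```

Rolling back follows the same shape, with the master running the upgrade command against the older version, as in the slide's "--to 2.10" line.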
Ganeti 2.11
Current stable release, 2.11.0.
· RPC security: individual node certificates
· Compression for instance moves / backups / imports
· Configurable SSH ports per node group
· Gluster support (experimental)
Current and Future development
No guarantees!
· Network improvements (IPv6, more flexibility)
· Storage: more work on shared storage
· Heterogeneous clusters
· Improvements on cross-cluster instance moves
Google Summer of Code:
· Make LXC support production-ready
· Conversion between arbitrary disk templates
Open Source Events
Confirmed:
· Linuxcon Japan, Tokyo, May 20th 2014
· Ganeticon, Portland, Oregon, September
Not confirmed yet:
· Linuxcon North America, Chicago, August
· FrOSCon, St. Augustin, Germany, August
· LISA '14, Seattle, November
Thank You!
Questions?
· © 2010 - 2014 Google
· Use under GPLv2+ or CC-by-SA
· Some images borrowed / modified from Lance Albertson, Iustin Pop, and Guido Trotter
· Some slides were borrowed / modified from Tom Limoncelli