TechDay - Cambridge 2016 - OpenNebula at Harvard University


OpenNebula at Harvard University

FAS Research Computing
John Noss

April 22, 2016

Agenda

- About FAS RC
- Our OpenNebula setup: OpenNebula and Ceph hardware; network setup
- Our configuration with Puppet: opennebula-puppet-module; roles/profiles; config within OpenNebula
- Context scripts / load testing
- Use cases for OpenNebula at RC
- Things we'd love to see

Harvard FAS Research Computing

Overview of Odyssey

• 150 racks spanning 3 data centers across 100 miles, using 1 MW of power

• 60k CPU cores, 1M+ GPU cores

• 25 PB of storage (Lustre, NFS, Isilon, Gluster)

• 10 miles of Cat 5/6 + IB cabling

• 300k lines of Puppet code

• 300+ VMs

• 2015: 25.7 million jobs, 240 million CPU hours

Agenda

- About FAS RC
- Our OpenNebula setup: OpenNebula and Ceph hardware; network setup
- Our configuration with Puppet: opennebula-puppet-module; roles/profiles; config within OpenNebula
- Context scripts / load testing
- Use cases for OpenNebula at RC
- Things we'd love to see

Where we’re coming from

● Previous KVM infrastructure:
  ○ One datacenter
  ○ 4 C6145s (8 blades, 48-core / 64-core, 256GB RAM)
  ○ 2x 10GbE switches, but active-passive rather than 802.3ad LACP
  ○ 2x R515 with replicated Gluster

● VM provisioning process was very manual:
  ○ Add to DNS
  ○ Add to Cobbler for DHCP
  ○ Edit in the Cobbler web GUI if changing disk, RAM, or CPU
  ○ Run a virt-builder script to provision on a hypervisor (manually selected for load balancing)
    ■ Full OS install and Puppet run from scratch - takes a long time

● Issues:
  ○ Storage issues with Gluster: heals run client-side (on the KVM hypervisors), VMs going read-only
  ○ Management very manual - changing capacity is manual, etc.

Hardware Setup - OpenNebula

● Hypervisors (nodes):
  ○ 8 Dell R815 (4 each in 2 datacenters)
  ○ 64 cores, 256GB RAM
  ○ Intel X520 2-port 10GbE, LACP

● Controller:
  ○ Currently one node serves as controller as well as hypervisor, but the controller function can be moved to a different node manually if the DB is on replicated MySQL (tested using Galera)

Hardware Setup - Ceph

● OSDs:
  ○ 10 Dell R515 (5 each in the 2 primary datacenters)
  ○ 16 cores, 32GB RAM
  ○ 12x 4TB drives
  ○ Intel X520 2-port 10GbE, LACP

● Mons:
  ○ 5 Dell R415
    ■ 2 each in the 2 primary datacenters
    ■ 1 in a 3rd datacenter as a tie-breaker
  ○ 8 cores, 32GB RAM
  ○ 2x 120GB SSD, RAID1, for the mon data device
  ○ Intel X520 2-port 10GbE, LACP

● MDS:
  ○ Currently using CephFS for the OpenNebula system datastore mount
  ○ MDS running on one of the mons

Network Setup

2x Dell Force10 S4810 10GbE switches in each of the 2 primary datacenters (with 2x 10Gb between datacenters)

2x twinax (one from each switch) to each of the OpenNebula and Ceph nodes, bonded LACP (802.3ad)

Tagged 802.1Q VLANs for:

1. Admin (ssh, OpenNebula communication, Sunstone, Puppet, Nagios monitoring, etc.; MTU 1500)
2. Ceph-client network (used by clients, i.e. the OpenNebula hypervisors, to access Ceph; routes only to the ceph-client VLANs in other datacenters; MTU 9000)
3. Ceph-cluster network (backend Ceph network; routes only to the ceph-cluster VLANs in other datacenters; only on the Ceph OSDs; MTU 9000)
4. OpenNebula guest VM networks
   a. Some in one datacenter only, some span both datacenters

Note that VLAN (1) needs to be tagged so that it can keep a normal MTU of 1500: the bond itself must have MTU 9000 so that (2) and (3) can run at MTU 9000.
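Conceptually, the per-interface MTUs end up like this (a sketch using plain ip commands rather than our Puppet network configuration; bond0.100 is the datacenter1 ceph-client VLAN from the routes below, and the admin VLAN ID shown is illustrative):

ip link set dev bond0 mtu 9000       # the bond must carry jumbo frames
ip link set dev bond0.100 mtu 9000   # ceph-client VLAN
ip link set dev bond0.50 mtu 1500    # tagged admin VLAN stays at the normal MTU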

Network diagram: OpenNebula and Ceph networks

Network diagram: multiple datacenters

Network setup: static routes

profiles::network::datacenter_routes::routes_hash:
  datacenter1:
    ceph-client-datacenter2:
      network: 172.16.20.0/24
      gateway_ip: 172.16.10.1
      gateway_dev: bond0.100
      require: Network::Interface[bond0.100]
    ceph-client-datacenter3:
      network: 172.16.30.0/24
      gateway_ip: 172.16.10.1
      gateway_dev: bond0.100
      require: Network::Interface[bond0.100]
  datacenter2:
    ceph-client-datacenter1:
      network: 172.16.10.0/24
      gateway_ip: 172.16.20.1
      gateway_dev: bond0.200
      require: Network::Interface[bond0.200]
    ceph-client-datacenter3:
      network: 172.16.30.0/24
      gateway_ip: 172.16.20.1
      gateway_dev: bond0.200
      require: Network::Interface[bond0.200]

Resulting routes on a datacenter1 node:

172.16.20.0/24 via 172.16.10.1 dev bond0.100

172.16.30.0/24 via 172.16.10.1 dev bond0.100

Agenda

- About FAS RC
- Our OpenNebula setup: OpenNebula and Ceph hardware; network setup
- Our configuration with Puppet: opennebula-puppet-module; roles/profiles; config within OpenNebula
- Context scripts / load testing
- Use cases for OpenNebula at RC
- Things we'd love to see

Configuring OpenNebula with puppet

Installation:

● PXE boot - OS installation, runs Puppet
● Puppet - bond configuration, tagged VLANs, yum repos, OpenNebula and Sunstone (Passenger) installation and configuration
  ○ Combination of local modules and upstream (mysql, apache, galera, opennebula)
● PuppetDB - exported resources to add newly built hypervisors as onehosts on the controller and, if using NFS for the system datastore, to add them to /etc/exports on the controller and to pick up the mount of /one

Ongoing config management:

● Puppet - adding vnets, address ranges, security groups, datastores (for various Ceph pools, etc.)

● Can also create onetemplates and onevms

OpenNebula puppet module

Source: https://github.com/epost-dev/opennebula-puppet-module
Or: https://forge.puppet.com/epostdev/one (not currently up to date)

(Deutsche Post E-Post Development)

Puppet module to install and manage opennebula:

● Installs and configures the OpenNebula controller and hypervisors
  ○ Takes care of package installs
  ○ Takes care of adding each hypervisor as a onehost on the controller (using PuppetDB)

● Can also be used for ongoing configuration management of resources inside OpenNebula - lets you configure onevnets, onesecgroups, etc., within OpenNebula

Minimum code to set up OpenNebula with Puppet:

package {'rubygem-nokogiri':

ensure => installed,

} ->

class { '::one':

oned => true,

sunstone => true,

sunstone_listen_ip => '0.0.0.0',

one_version => '4.14',

ssh_priv_key_param => '-----BEGIN RSA PRIVATE KEY-----...',

ssh_pub_key => 'ssh-rsa...',

} ->

onehost { $::fqdn :

im_mad => 'kvm',

vm_mad => 'kvm',

vn_mad => '802.1Q',

}

The onehost resource is only needed if not using PuppetDB (with PuppetDB, hypervisors are added as onehosts via exported resources).

The ssh_priv_key_param value can be encrypted using eyaml if it is passed in via hiera.
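As a rough illustration (not from the slides), encrypting the oneadmin key with hiera-eyaml could look something like the following; the key path is an assumption:

eyaml encrypt -l 'one::ssh_priv_key_param' -f /var/lib/one/.ssh/id_rsa
# paste the resulting ENC[PKCS7,...] blob into hiera as the value of one::ssh_priv_key_param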

Hiera for opennebula config

one::one_version: '4.14.2'

one::enable_opennebula_repo: 'false'

one::ntype: '802.1Q'

one::vtype: 'kvm'

one::puppetdb: true

one::oneid: opennebula_cluster1

one::oned: true

one::oned_port: 2634

one::oneflow: true

one::sunstone: true

one::sunstone_passenger: true

one::sunstone_novnc: true

one::oned::sunstone_sessions: 'memcache'

one::oned::sunstone_logo_png: 'puppet:///modules/profiles/logo.png'

one::oned::sunstone_logo_small_png: 'puppet:///modules/profiles/logo.png'

one::ldap: true

one::backend: mysql

one::oned::db: opennebula

one::oned::db_user: oneadmin

...

one::sched_interval: 10

one::sched_max_host: 10

one::sched_live_rescheds: 1

one::inherit_datastore_attrs:

- DRIVER

one::vnc_proxy_support_wss: 'only'

one::vnc_proxy_cert: "/etc/pki/tls/certs/%{hiera('one::oneid')}_vnc.cer"

one::vnc_proxy_key: "/etc/pki/tls/private/%{hiera('one::oneid')}_vnc.key"

one::kvm_driver_emulator: '/usr/libexec/qemu-kvm'

one::kvm_driver_nic_attrs: '[ filter = "clean-traffic", model="virtio" ]'

...

Puppet Roles/Profiles

Puppet roles/profiles provide a framework to group technology-specific configuration (modules, groups of modules, etc) into profiles, and then combine profiles to make a role for each server or type of server.

- http://www.craigdunn.org/2012/05/239/
- http://garylarizza.com/blog/2014/02/17/puppet-workflow-part-2/
- https://puppet.com/podcasts/podcast-getting-organized-roles-and-profiles
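For context (not on the slides), each node gets exactly one role; with a plain site.pp mapping that might look roughly like this, using a hypothetical hostname:

node 'opennebula-node01.example.org' {
  include ::roles::opennebula::hypervisor
}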

OpenNebula roles

# opennebula base role

class roles::opennebula::base inherits roles::base {

include ::profiles::storage::ceph::client

include ::profiles::opennebula::base

}

# opennebula hypervisor node

class roles::opennebula::hypervisor inherits roles::opennebula::base {

include ::profiles::opennebula::hypervisor

include ::profiles::opennebula::hypervisor::nfs_mount

}

# opennebula controller node

class roles::opennebula::controller inherits roles::opennebula::base {

include ::profiles::opennebula::controller

include ::profiles::opennebula::controller::nfs_export

include ::profiles::opennebula::controller::local_mysql

include ::profiles::opennebula::controller::mysql_db

include ::profiles::opennebula::controller::sunstone_passenger

}

OpenNebula profiles

site/profiles/manifests/opennebula

├── base.pp

├── controller

│ ├── local_mysql.pp

│ ├── mysql_db.pp

│ ├── nfs_export.pp

│ └── sunstone_passenger.pp

├── controller.pp

├── hypervisor

│ ├── nfs_mount.pp

│ └── virsh_secret.pp

└── hypervisor.pp

OpenNebula profiles: NFS mount on hypervisors

class profiles::opennebula::hypervisor::nfs_mount (

$oneid = $::one::oneid,

$puppetdb = $::one::puppetdb,

) {

# exported resource to add myself to /etc/exports on the controller

@@concat::fragment { "export_${oneid}_to_${::fqdn}":

tag => $oneid,

target => '/etc/exports',

content => "/one ${::fqdn}(rw,sync,no_subtree_check,root_squash)\n",

}

# set up mount /one from head node

if $::one::oned == true {
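# on the controller (head node) itself /one is local, so there is nothing to mount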

} else {

# not on the head node so mount it

# pull in the mount that the head node exported

Mount <<| tag == $oneid and title == "${oneid}_one_mount" |>>

}

}

The Mount <<| |>> collector pulls in the /one mount that the controller exported (note: this has a 2-run dependency before completing successfully, but Puppet continues past the error on the first run). The exported @@concat::fragment, in turn, is exported to the controller to land in its /etc/exports.

OpenNebula profiles: NFS export on controller node

class profiles::opennebula::controller::nfs_export (

$oneid = $::one::oneid,

){

concat { '/etc/exports':

ensure => present,

owner => root,

group => root,

require => File['/one'],

notify => Exec['exportfs'],

}

# collect the fragments that have been exported by the hypervisors

Concat::Fragment <<| tag == $oneid and target == '/etc/exports' |>>

# export a mount that the hypervisors will pick up

@@mount { "${oneid}_one_mount":

ensure => 'mounted',

name => '/one',

tag => $oneid,

device => "${::fqdn}:/one",

fstype => 'nfs',

options => 'soft,intr,rsize=8192,wsize=8192',

atboot => true,

require => File['/one'],

}

}

The Concat::Fragment <<| |>> collector gathers the /etc/exports fragments exported by the hypervisors, while the @@mount resource is exported for the hypervisors to pick up.

OpenNebula profiles: Cephfs

class profiles::storage::ceph::client (

$fsid = hiera('profiles::storage::ceph::fsid',{}),

$keyrings = {},

$cephfs_keys = {},

$cephfs_kernel_mounts = {},

$mon_hash = hiera('profiles::storage::ceph::mon_hash',{}),

$network_hash = hiera('profiles::storage::ceph::network_hash', {}),

) inherits profiles::storage::ceph::base {

...

create_resources(profiles::storage::ceph::keyring, $keyrings)

create_resources(profiles::storage::ceph::cephfs_key, $cephfs_keys)

create_resources(profiles::storage::ceph::cephfs_kernel_mount, $cephfs_kernel_mounts )

}

[opennebula-node01]# df -h /one

Filesystem                                 Size  Used  Avail  Use%  Mounted on
172.16.10.10:6789,172.16.10.11:6789:/one   327T  910G   326T    1%  /one
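For reference, the kernel-client mount behind that df output is roughly equivalent to the following (a sketch, not the actual profiles::storage::ceph::cephfs_kernel_mount code; the client name and secret file path are assumptions):

mount -t ceph 172.16.10.10:6789,172.16.10.11:6789:/one /one \
  -o name=opennebula,secretfile=/etc/ceph/opennebula.secret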

OpenNebula profiles: Local mysql

class profiles::opennebula::controller::local_mysql (

) {

include ::mysql::server

# disable PrivateTmp - causes issues with OpenNebula

file_line { "${::mysql::server::service_name}_disable_privatetmp":

ensure => present,

path => "/usr/lib/systemd/system/${::mysql::server::service_name}.service",

line => 'PrivateTmp=false',

match => 'PrivateTmp=true',

notify => [

Exec['systemctl-daemon-reload'],

Service['mysqld']

]

}

}

class profiles::opennebula::controller::mysql_db (

$oned_db = hiera('one::oned::db', 'oned'),

$oned_db_user = hiera('one::oned::db_user', 'oned'),

$oned_db_password = hiera('one::oned::db_password', 'oned'),

$oned_db_host = hiera('one::oned::db_host', 'localhost'),

) {

# setup mysql server, local currently, on the master

mysql::db { $oned_db:

user => $oned_db_user,

password => $oned_db_password,

host => $oned_db_host,

grant => ['ALL'],

}

}

OpenNebula profiles: Sunstone passenger

class profiles::opennebula::controller::sunstone_passenger (

$web_ssl_key = 'undef',

$web_ssl_cert = 'undef',

$vnc_ssl_key = 'undef',

$vnc_ssl_cert = 'undef',

) inherits profiles::opennebula::base {

include ::profiles::web::apache

include ::apache::mod::passenger

include ::systemd

# disable PrivateTmp - causes issues with sunstone image uploads

file_line { "${::apache::params::service_name}_disable_privatetmp":

ensure => present,

path => "/usr/lib/systemd/system/${::apache::params::service_name}.service",

line => 'PrivateTmp=false',

match => 'PrivateTmp=true',

notify => [

Exec['systemctl-daemon-reload'],

Service['httpd'],

]

}

...

OpenNebula profiles: Sunstone passenger hiera

one::sunstone: true

one::sunstone_passenger: true

one::sunstone_novnc: true

one::oned::sunstone_sessions: 'memcache'

profiles::opennebula::percentliteral: '%'

profiles::web::apache::vhosts:
  opennebula01:
    vhost_name: <fqdn>
    custom_fragment: 'PassengerUser oneadmin'
    docroot: /usr/lib/one/sunstone/public/
    directories:
      - path: /usr/lib/one/sunstone/public/
        override: all
        options: '-MultiViews'
    port: 443
    ssl: true
    ssl_cert: "/etc/pki/tls/certs/%{hiera('one::oneid')}_web_cert.cer"
    ssl_key: "/etc/pki/tls/private/%{hiera('one::oneid')}_web.key"

...

OpenNebula profiles: Sunstone passenger hiera cont.

...

  opennebula01-80to443:
    vhost_name: <fqdn>
    docroot: /var/www/html
    port: 80
    rewrite_rule: "^.*$ https://%{hiera('profiles::opennebula::percentliteral')}{HTTP_HOST}%{hiera('profiles::opennebula::percentliteral')}{REQUEST_URI} [R=301,L]"

apache::mod::passenger::passenger_high_performance: 'on'
apache::mod::passenger::passenger_max_pool_size: 128
apache::mod::passenger::passenger_pool_idle_time: 600
apache::mod::passenger::passenger_max_requests: 1000
apache::mod::passenger::passenger_use_global_queue: 'on'

Other puppetized configs: XMLRPC SSL

one::oned_port: 2634

profiles::web::apache::vhosts:
  opennebula-xmlrpc-proxy:
    vhost_name: <fqdn>
    docroot: /var/www/html/  # doesn't matter, just needs to exist for the vhost
    port: 2633
    ssl: true
    ssl_cert: "/etc/pki/tls/certs/%{hiera('one::oneid')}_xmlrpc_cert.cer"
    ssl_key: "/etc/pki/tls/private/%{hiera('one::oneid')}_xmlrpc.key"
    proxy_pass:
      path: '/'
      url: 'http://localhost:2634/'

file { '/var/lib/one/.one/one_endpoint':

ensure => file,

owner => 'oneadmin',

group => 'oneadmin',

mode => '0644',

content => "http://localhost:${oned_port}/RPC2\n", # localhost doesn't use the ssl port

require => Package['opennebula-server'],

before => Class['one::oned::service'],

}

ONE_XMLRPC=https://<fqdn of controller>:2633/RPC2 # for end user CLI access
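For example, an end user pointing the CLI at the SSL-terminated XML-RPC proxy:

export ONE_XMLRPC=https://<fqdn of controller>:2633/RPC2
onevm list    # CLI traffic now goes through the Apache proxy on port 2633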

Agenda

- About FAS RC
- Our OpenNebula setup: OpenNebula and Ceph hardware; network setup
- Our configuration with Puppet: opennebula-puppet-module; roles/profiles; config within OpenNebula
- Context scripts / load testing
- Use cases for OpenNebula at RC
- Things we'd love to see

Configuration inside OpenNebula once it’s running

Types provided by opennebula-puppet-module:

onecluster

onedatastore

onehost

oneimage

onesecgroup

onetemplate

onevm

onevnet

onevnet_addressrange

Add vnets, datastores, etc:

profiles::opennebula::controller::onevnets:
  vlan100:
    ensure: present
    bridge: 'br101'
    phydev: 'bond0'
    dnsservers: ['172.16.99.10','172.16.99.11']
    gateway: '172.16.100.1'
    vlanid: '101'
    netmask: '255.255.255.0'
    network_address: '172.16.100.0'
    mtu: '1500'

profiles::opennebula::controller::onevnet_addressranges:
  vlan100iprange:
    ensure: present
    onevnet_name: 'vlan100'
    ar_id: '1'  # read-only value
    protocol: 'ip4'
    ip_size: '250'
    ip_start: '172.16.100.5'

profiles::opennebula::controller::onesecgroups:
  onesecgroup100:
    description: 'description'
    rules:
      - protocol: TCP
        rule_type: OUTBOUND
      - protocol: TCP
        rule_type: INBOUND
        ip: '172.16.100.0'
        size: '255'
        range: '22,1024:65535'

profiles::opennebula::controller::onedatastores:
  ceph_datastore:
    ensure: 'present'
    type: 'IMAGE_DS'
    ds_mad: 'ceph'
    tm_mad: 'ceph'
    driver: 'raw'
    disk_type: 'rbd'
    ceph_host: 'ceph-mon1 ceph-mon2'
    ceph_user: 'libvirt-opennebula'
    ceph_secret: '<uuid_name_for_libvirt_secret>'
    pool_name: 'opennebula_pool'
    bridge_list: 'opennebula_controller01'
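The ceph_secret value has to match a libvirt secret present on every hypervisor; roughly, the hypervisor/virsh_secret.pp profile automates something like the following (a sketch with a hypothetical XML file path, not the actual profile code):

virsh secret-define --file /etc/ceph/libvirt-secret.xml   # XML declaring the <uuid_name_for_libvirt_secret> UUID
virsh secret-set-value --secret <uuid_name_for_libvirt_secret> \
  --base64 "$(ceph auth get-key client.libvirt-opennebula)"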

Create_resources on controller

class profiles::opennebula::controller (

$onevnets = {},

$onevnet_addressranges = {},

$onesecgroups = {},

$onedatastores = {},

$oneid = $::one::oneid,

){

validate_hash($onevnets)

create_resources(onevnet, $onevnets)

validate_hash($onevnet_addressranges)

create_resources(onevnet_addressrange, $onevnet_addressranges)

validate_hash($onesecgroups)

create_resources(onesecgroup, $onesecgroups)

validate_hash($onedatastores)

create_resources(onedatastore, $onedatastores)

...

}
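After a Puppet run on the controller, the results can be checked with the normal CLI, e.g.:

onevnet list
onedatastore list
onesecgroup list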

Agenda

- About FAS RC
- Our OpenNebula setup: OpenNebula and Ceph hardware; network setup
- Our configuration with Puppet: opennebula-puppet-module; roles/profiles; config within OpenNebula
- Context scripts / load testing
- Use cases for OpenNebula at RC
- Things we'd love to see

Context scripts for load testing

Graphite / Grafana VM

Diamond, bonnie++, dd, etc. for load-test VMs:

Context script to configure diamond and load tests

#!/bin/bash

source /mnt/context.sh

cd /root

yum install -y puppet

puppet module install garethr-diamond

puppet module install stahnma-epel

...

cat > diamond.pp <<EOF

class { 'diamond':

graphite_host => "$GRAPHITE_HOST",

...

EOF

puppet apply diamond.pp

diamond

if echo "$LOAD_TESTS" | grep -q dd ; then

dd if=/dev/urandom of=/tmp/random_file bs=$DD_BLOCKSIZE count=$DD_COUNT

for i in $(seq 1 $DD_REPEATS); do

date >> ddlog

sync; { time { time dd if=/tmp/random_file of=/tmp/random_file_copy ; sync ; } ; } 2>> ddlog

...

Onetemplate context variables & instantiation

Onetemplate update (or in Sunstone):

CONTEXT=[ LOAD_TESTS="$LOAD_TESTS", GRAPHITE_HOST="$GRAPHITE_HOST"...

Instantiate with:

onetemplate instantiate 19 --raw "$( cat paramfile )" --name vmname-%i -m4

Using paramfile with newline-separated contents:

LOAD_TESTS=dd
GRAPHITE_HOST=172.16.100.12
VAR_NAME2=var_value2
...
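Inside the VM, the contextualization CD-ROM (mounted on /mnt in the scripts above) then carries those variables in context.sh, roughly like this (quoting details may differ):

# /mnt/context.sh (excerpt)
LOAD_TESTS='dd'
GRAPHITE_HOST='172.16.100.12'
ETH0_IP='...'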

Context script to install graphite and grafana

#!/bin/bash

source /mnt/context.sh

MY_HOSTNAME=$(nslookup $ETH0_IP | grep name|sed -e 's/.* //' -e 's/\.$//')

cd /root

yum install -y puppet

puppet module install dwerder-graphite

yum install -y git

git clone https://github.com/bfraser/puppet-grafana.git /etc/puppet/modules/grafana

puppet module install puppetlabs-apache

mkdir /opt/graphite

cat > grafana.pp <<EOF

class {'::apache':

default_vhost => false,

}

apache::vhost { '$MY_HOSTNAME-graphite':

port => '8080',

servername => '$MY_HOSTNAME',

docroot => '/opt/graphite/webapp',

wsgi_application_group => '%{GLOBAL}',

wsgi_daemon_process => 'graphite',

wsgi_daemon_process_options => {

processes => '5',

...

Agenda

- About FAS RC
- Our OpenNebula setup: OpenNebula and Ceph hardware; network setup
- Our configuration with Puppet: opennebula-puppet-module; roles/profiles; config within OpenNebula
- Context scripts / load testing
- Use cases for OpenNebula at RC
- Things we'd love to see

Use cases in RC

● Streamlining and centralizing management of VMs

● Creating test VMs: with OpenNebula it is much easier to create and manage the one-off VMs needed to test something out (which makes it less likely that something has to be tested in production)

● Automatically spinning up VMs to test code: when making a change in Puppet, have a git hook do a test run on each category of system we have, in temporary OpenNebula VMs first

● OneFlow templates, and HA for client applications by leveraging the two datacenters

● Elastic HPC: spin compute nodes up and down as needed

Agenda

- About FAS RC
- Our OpenNebula setup: OpenNebula and Ceph hardware; network setup
- Our configuration with Puppet: opennebula-puppet-module; roles/profiles; config within OpenNebula
- Context scripts / load testing
- Use cases for OpenNebula at RC
- Things we'd love to see

Things we’d love to see

● Confining certain VLANs to certain hosts without segmenting into clusters (VLANs and datastores can be in multiple clusters in 5.0)

● Folders or other groupings in the VM list, template list, security groups, etc., to organize large numbers of them in the Sunstone view (labels coming in 5.0)

● Image resize, not just when launching a VM (coming in 5.0)
● oneimage upload from the CLI, not just specifying a path local to the frontend
● onefile update from the CLI
● Dynamic security groups with auto commit (coming in 5.0)
● Private VLAN / router handling (with certain 802.1Q VLAN IDs trunked to hypervisors; coming in 5.0)
● Changelog on onetemplates, onevm actions, etc. (it's possible to see the user in oned.log, but not the changes)
● Sunstone: show the VM name, not just the ID, when taking an action such as shutdown
● Sunstone: change the name of "shutdown" to describe what will actually happen for non-persistent VMs
● Sunstone: show the eth0 IP on the VM info page, or add a copy button for the IP on the VM list page
● Move the Ctrl-Alt-Del button away from the X button that closes VNC (or prompt for confirmation)

Thank you! Questions?