OpenStack: The OpenSource Cloud’s Application in High Energy Physics
That Title’s Overstated
OpenStack: The OpenSource Cloud’s Potential Application in
Data Intensive Research
Not as Catchy...
Caveats
» I am not a storage or network engineer
» I am not a scientist
I am:
» a Technical Product Manager
» Dashboard Developer
» working for piston{cloud}computing
» Pragmatic
» despite illusions of grandeur
What is openstack?
» Founded by NASA and Rackspace
» The open source cloud computing platform
» Feature-rich and massively scalable
» Powers cloud storage, compute, and networking
» A world-wide open source collaboration
OpenStack as a Cloud OS
APPS
Creates Pools of Resources
Automates The Network
USERS
ADMINS
CLOUD OPERATING SYSTEM
Connects to apps via APIs
Self-service Portals for users
Benefits of OpenStack as a Common Platform
» Easy to migrate data and applications across clouds
Based on: security policies, economics, research needs
» No vendor lock-in
» Common Layer of Data Exchange
» Less exposed to security issues than public cloud, but still interoperable.
3 Major OpenStack Components
» OpenStack Compute/Nova: provision and manage large networks of virtual machines
» OpenStack Object Store/Swift: Create petabytes of reliable storage using standard servers
» OpenStack Image Service/Glance: Catalog and manage large libraries of server images
+
» Other components: Dashboard, Load Balancing, Authentication...
Compute/Nova Key Features
1. REST-based API
2. Horizontally and massively scalable
3. Hardware agnostic: supports a variety of standard commodity hardware.
4. Hypervisor Agnostic: support for Xen, Citrix XenServer, Microsoft Hyper-V, KVM, UML, LXC and ESX
HOST 1 HOST 2 HOST 3 HOST 4, ETC.
VMs
Hypervisor: Turns 1 server into many “virtual machines” (instances or VMs) (VMware ESX, Citrix XenServer, KVM, etc.)
» Hypervisors provide abstraction layer between apps and hardware (SERVERS)
» OpenStack pools servers, you run operating systems and applications on VMs instead of physical computers
Nova close up
» nova-api daemon
  » endpoint for all OpenStack or EC2 API queries
» nova-scheduler process
  » takes a virtual machine instance request from the queue and determines which compute server host it should run on
  » a pluggable architecture allowing custom scheduling algorithms
» nova-compute process
  » worker daemon that creates and terminates virtual machine instances
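The scheduling step described above can be sketched in a few lines of Python. This is a toy illustration, not Nova's code: the `Host` class, the `least_loaded` policy, and the `schedule` function are hypothetical names, and a plain `collections.deque` stands in for the message queue. The point is only that the policy is a swappable function, which is roughly what "pluggable" means here.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Host:
    """Hypothetical compute host record (not a Nova class)."""
    name: str
    free_ram_mb: int


def least_loaded(hosts, ram_mb):
    """Example policy: pick the host with the most free RAM that fits the request."""
    candidates = [h for h in hosts if h.free_ram_mb >= ram_mb]
    if not candidates:
        raise RuntimeError("no host can fit the request")
    return max(candidates, key=lambda h: h.free_ram_mb)


def schedule(queue, hosts, policy=least_loaded):
    """Drain the request queue, assigning each request to a host via the policy."""
    placements = []
    while queue:
        instance_id, ram_mb = queue.popleft()
        host = policy(hosts, ram_mb)
        host.free_ram_mb -= ram_mb  # reserve the RAM on the chosen host
        placements.append((instance_id, host.name))
    return placements


hosts = [Host("host1", 4096), Host("host2", 8192)]
requests = deque([("vm-a", 2048), ("vm-b", 4096), ("vm-c", 2048)])
placements = schedule(requests, hosts)
print(placements)
```

Passing a different function as `policy` changes the placement strategy without touching the loop that consumes the queue.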
We mentioned Commodity. How Commodity?
Commodity Hardware
» Piston Silicon Mechanics
  » 2 Intel Xeon processors 5600 Series
  » 96GB of DDR3 RAM
  » 24TB of SATA storage
  » Redundant 1200W power supplies
  » 2U rackmount chassis
» That’s what our clients get, we’re on:
  » 32GB, 16TB, 2 Intel Xeon E5645 processors
  » DevOps borrowed the rest for other machines
Performance: 500 VM Spin Up
» Assuming:
  » 500 copies of one 8GB image
  » Image warm on the nodes
  » 50 VMs/Server
» Based on NASA’s experience in regular use, less than 30 seconds
» Worst case:
  » Image is still in Glance
  » VM image has to be copied via HTTP
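The numbers on this slide imply some simple arithmetic, worked through below. The VM count, VMs per server, and image size come from the slide; the two worst-case transfer scenarios are assumptions added for illustration, since the slide does not say whether a server reuses a fetched image for all of its VMs.

```python
import math

vm_count = 500        # from the slide
vms_per_server = 50   # from the slide
image_size_gb = 8     # from the slide

# Servers needed to host all the VMs at the stated density.
servers_needed = math.ceil(vm_count / vms_per_server)

# Assumption: each server fetches the image from Glance once and reuses
# its local copy for every VM it hosts.
transfer_once_per_server_gb = servers_needed * image_size_gb

# Assumption: nothing is reused, so every VM triggers its own copy.
transfer_once_per_vm_gb = vm_count * image_size_gb

print(servers_needed, transfer_once_per_server_gb, transfer_once_per_vm_gb)
```

Under these assumptions the cold-start cost ranges from 80 GB to 4000 GB moved over HTTP, which is why keeping the image warm on the nodes matters.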
Image Service/Glance
1. Store & retrieve VM images
2. REST-based API
3. Compatible with all common image formats
4. Storage agnostic: Store images locally, or use OpenStack Object Storage, HTTP, or S3
Storage/Swift Key Features
1. REST-based API
2. Data distributed evenly throughout system
3. Runs on commodity hardware
4. Scalable to multiple petabytes, billions of objects
5. No central database required
6. Account/Container/Object structure (not file system, no nesting) plus Replication (N copies of accounts, containers, objects)
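The Account/Container/Object structure shows up directly in the REST paths, which take the form `/v1/{account}/{container}/{object}`. The sketch below builds such a path; the `object_path` helper and the account, container, and object names are hypothetical. It also illustrates the "no nesting" point: a slash in an object name is just part of the name, not a directory.

```python
from urllib.parse import quote


def object_path(account, container, obj, api_version="v1"):
    """Build the path for an object; only the object name may contain slashes."""
    if "/" in container:
        raise ValueError("container names cannot contain '/'")
    return "/{}/{}/{}/{}".format(
        api_version,
        quote(account, safe=""),
        quote(container, safe=""),
        quote(obj),  # '/' stays unescaped: it is part of the name, not a directory
    )


# Hypothetical names, for illustration only.
path = object_path("AUTH_demo", "run-2011", "events/raw 001.dat")
print(path)  # /v1/AUTH_demo/run-2011/events/raw%20001.dat
```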
The Storage Story: Nova
» Nova/Compute has its own storage
» Block Storage or Nova-volume
  » an iSCSI solution
  » employs the use of Logical Volume Manager (LVM) for Linux
  » intended for read/write purposes (databases, logs, etc.)
  » basically is an LVM/iSCSI implementation to mount block devices in VM
The Storage Story: Swift
» Swift: Object Storage
  » Fully Distributed
  » Commodity Hardware (Linux/x86)
  » Data Protection in Software
  » Not a File System
  » Not SAN/NAS/DAS... or any attached storage
  » Optimized for Scale - Petabytes
Swift in Production
» Swift has been running in production at Rackspace for over a year with near 100% uptime.
» Rackspace’s swift clusters store billions of objects and petabytes of data.
» Internap, KT, SDSC, and HP are also running Swift in production
[Diagram: Swift, reached through the OS or EC2 API]
Sharing the Research
Location A: Private Cloud
Location B: Private Cloud
Common software platform making Federation possible, through a shared API.
To federate Swift across locations, you write a scheduler within OpenStack and drive it through the API.
Swift Components
Proxy Servers
Clients
Account Servers
Container Servers
Object Servers
Rings
Swift Components
» Proxy Server
  » Ties together the Swift architecture
  » Request routing
  » Exposes the public API
Swift Components
» The Ring: Maps the names of entities (accounts, containers, objects) to their location on disk.
  » Stores data based on zones, devices, partitions, and replicas
  » Weights can be used to balance the distribution of partitions
  » Used by the Proxy Server and many background processes
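A simplified sketch of hash-based partitioning may make the Ring idea more concrete. It hashes an entity name, keeps the top bits as a partition number, and looks up devices for each replica. The partition power, device names, and the round-robin placement in `devices_for` are all assumptions for illustration; Swift's actual ring is built ahead of time and also accounts for zones and weights, which this toy omits.

```python
import hashlib
import struct

PART_POWER = 4                 # 2**4 = 16 partitions (tiny, for illustration)
PART_SHIFT = 32 - PART_POWER
REPLICAS = 3
DEVICES = ["z1-sda", "z2-sdb", "z3-sdc", "z4-sdd"]  # hypothetical zone-device names


def partition_for(account, container, obj):
    """Hash the entity name and keep the top PART_POWER bits as the partition."""
    name = "/{}/{}/{}".format(account, container, obj).encode("utf-8")
    digest = hashlib.md5(name).digest()
    return struct.unpack(">I", digest[:4])[0] >> PART_SHIFT


def devices_for(partition):
    """Toy placement: consecutive devices, so replicas land on distinct devices."""
    return [DEVICES[(partition + r) % len(DEVICES)] for r in range(REPLICAS)]


part = partition_for("AUTH_demo", "run-2011", "events/raw-001.dat")
replica_devices = devices_for(part)
print(part, replica_devices)
```

Because the partition depends only on the name, any proxy or background process holding the same table reaches the same devices without consulting a central database.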
Swift Components...
» Object Server:
  » Blob storage server
  » metadata kept in xattrs
  » data in binary format
  » Object location based on name & timestamp hash
Swift & Large Object Storage
» default 5GB limit on the size of an uploaded object
» segmentation makes the download size of a single object virtually unlimited
» segments of the large object are uploaded and a special manifest file is created
» when downloaded, all segments are concatenated as a single object
» greater upload speed
  » possible parallel uploads of segments
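The segmentation idea can be sketched locally without a Swift cluster. The helpers `split_segments` and `segment_names` are hypothetical, and the 1000-byte segment size is chosen only to keep the example small; the sketch shows the split, a naming scheme whose sort order matches the upload order, and the concatenation a download would perform.

```python
def split_segments(data, segment_size):
    """Split a byte string into fixed-size segments (the last may be shorter)."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    return [data[i:i + segment_size] for i in range(0, len(data), segment_size)]


def segment_names(prefix, count):
    """Zero-padded names so that lexical order matches upload order."""
    return ["{}/{:08d}".format(prefix, i) for i in range(count)]


payload = bytes(range(256)) * 10          # 2560 bytes standing in for a large file
segments = split_segments(payload, 1000)  # 1000-byte segments for illustration
names = segment_names("bigfile", len(segments))

# Downloading via the manifest amounts to concatenating the segments in order.
reassembled = b"".join(segments)
print(names, [len(s) for s in segments], reassembled == payload)
```

Since each segment is an independent object, the uploads can run in parallel, which is where the upload speed gain comes from.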
But Wait, Swift...
» Doesn’t load balance for often requested objects.
  » throw Varnish Cache or Squid Proxy in front of Swift
» Has a “simple” ReSTful API
» Wasn't intended for storing unknown data
» Isn’t searchable
» Is like Amazon’s S3
Potential Solutions for Those Needing to Search Data
» Or wait...
  » Swift’s Blueprints Include Searchable MetaData
  » https://blueprints.launchpad.net/swift/+spec/future-searchable-metadata
  » Contribute to the greater community
What’s Piston Doing Different?
» Piston Enterprise OS:
» A hardened cloud operating system built on OpenStack™
» Optimized for secure and easy operation of enterprise private clouds
» Fully supports interoperability with other OpenStack™ powered public and private cloud solutions.
{pentOS}™ features
{CloudKey}™
» Two-factor capable physical authentication
» Minimizes security risk of administrative logins
» Hands-free install in under 5 minutes
Null-Tier [Architecture]™
» Storage, compute and networking on every node
» Massively scalable
» Automated scaling
Top of Rack Switch
{pentOS}™ Null-Tier [Architecture]™
Server<1>-Networking-Storage-Compute-Management
Server<N>-Networking-Storage-Compute-Management
Highly available {pentOS} controllers
Highly available Virtual Machines
Highly available Virtual Storage
Hands-Free OS Install and Configuration
…
{CloudKey}™
Contact
» Neil Johnston
  » email: [email protected]
  » twitter: @neiljohnston
Or my co-authors:
» Joshua McKenty
  » email: [email protected]
» Christopher MacGown
  » email: [email protected]