Cloud Operating System
Unit 15: Cloud Resource Management
M. C. Chiang
Department of Computer Science and Engineering National Sun Yat-sen University
Kaohsiung, Taiwan, ROC
Outline
What are resources?
Resources Requesting/Managing
  SOAP
  REST
Load Balancing
Auto-scaling
Shared-Nothing Architecture
Scheduling Crosses Nodes
Network Management
04/19/23 Cloud Operating System - Unit 15: Cloud Resource Management 1 U15-2
What Are Resources
Everything in a host:
  CPU
  Memory
  Disk storage
  Network bandwidth
Resources should be distributed to instances/users not only fairly, but also properly
Resources Requesting/Managing – URI
Uniform Resource Identifier
A URL is a type of URI, used for locating resources on the Internet.
A URI can represent more types of resources, including local resources.
A standard is needed for parsing the data in a page/resource.
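To make the URL/URI distinction concrete, here is a minimal sketch using Python's standard urllib.parse. The file and urn identifiers are illustrative assumptions, not taken from the slides; only the example.com URL appears later in this unit.

```python
from urllib.parse import urlparse

def describe(uri):
    """Split a URI into the parts a client would need to interpret it."""
    parts = urlparse(uri)
    return {"scheme": parts.scheme, "host": parts.netloc, "path": parts.path}

# A URL: a URI that also says where the resource is located on the network.
print(describe("http://example.com/resources/item17"))
# A URI naming a local resource (hypothetical path).
print(describe("file:///etc/hosts"))
# A URI that names something without locating it (hypothetical identifier).
print(describe("urn:isbn:0451450523"))
```

All three strings share the same scheme-first grammar, which is why one parser can handle them; only the first one names a network host.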
Resources Requesting/Managing – Data
For web services, standardizing information exchange helps simplify communication between services.
In 1998, Microsoft and UserLand Software created the XML-RPC protocol, which was later supported by other companies.
SOAP is based on XML-RPC, with more flexibility. SOAP 1.1 was submitted to the W3C in 2000, and SOAP 1.2 became a W3C Recommendation in 2003.
JSON is another data format; it describes data using JavaScript object syntax.
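As a rough illustration of how the same call looks in two of these formats, the sketch below encodes one request with Python's standard xmlrpc.client and json modules. The method name GetLastTradePrice and the JSON layout (a "method" and a "params" key) are assumptions chosen for illustration, not a standard mandated by the slides.

```python
import json
import xmlrpc.client

def encode_xmlrpc(method, *params):
    """Serialize a call as an XML-RPC <methodCall> document."""
    return xmlrpc.client.dumps(tuple(params), methodname=method)

def encode_json(method, *params):
    """Serialize the same call as JSON (layout is an illustrative assumption)."""
    return json.dumps({"method": method, "params": list(params)})

print(encode_xmlrpc("GetLastTradePrice", "Apple"))
print(encode_json("GetLastTradePrice", "Apple"))
```

Both outputs carry the same information; the receiver only needs to know which format to parse.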
Resources Requesting/Managing – REST
REpresentational State Transfer
It’s an architectural style, not a protocol
It captures the essence of the way the Web already works
For example:
http://example.com/resources represents a collection of resources, and an HTTP GET operation means “list” the resources.
http://example.com/resources/item17 represents a resource in the collection, and an HTTP GET means “retrieve” the resource.
Resources Requesting/Managing – REST
Resource: Collection URI, such as http://example.com/resources/
  GET: List the URIs and perhaps other details of the collection's members.
  PUT: Replace the entire collection with another collection.
  POST: Create a new entry in the collection. The new entry's URL is assigned automatically and is usually returned by the operation.
  DELETE: Delete the entire collection.
Resource: Element URI, such as http://example.com/resources/item17
  GET: Retrieve a representation of the addressed member of the collection, expressed in an appropriate Internet media type.
  PUT: Replace the addressed member of the collection, or if it doesn't exist, create it.
  POST: Treat the addressed member as a collection in its own right and create a new entry in it.
  DELETE: Delete the addressed member of the collection.
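The method/resource combinations above can be sketched as a small in-memory dispatcher. This is only an illustration under stated assumptions: the class name ResourceStore, the itemN naming scheme, and the choice to leave out POST on an element are all hypothetical, and no real HTTP server is involved.

```python
class ResourceStore:
    """In-memory model of a REST collection and its elements."""

    def __init__(self, base="/resources"):
        self.base = base
        self.items = {}
        self._next_id = 1

    def _new_name(self):
        # Pick the next unused automatically assigned name.
        while f"item{self._next_id}" in self.items:
            self._next_id += 1
        name = f"item{self._next_id}"
        self._next_id += 1
        return name

    def handle(self, method, path, body=None):
        name = path[len(self.base):].strip("/")  # "" means the collection URI
        if name == "":
            if method == "GET":      # list member URIs
                return sorted(f"{self.base}/{n}" for n in self.items)
            if method == "PUT":      # replace the entire collection
                self.items = dict(body)
                return None
            if method == "POST":     # create an entry, return its new URI
                new = self._new_name()
                self.items[new] = body
                return f"{self.base}/{new}"
            if method == "DELETE":   # delete the entire collection
                self.items.clear()
                return None
        else:
            if method == "GET":      # retrieve one member
                return self.items[name]
            if method == "PUT":      # replace the member, or create it
                self.items[name] = body
                return None
            if method == "DELETE":   # delete one member
                del self.items[name]
                return None
        raise ValueError(f"{method} on {path} is not covered by this sketch")

store = ResourceStore()
print(store.handle("POST", "/resources/", "first entry"))
store.handle("PUT", "/resources/item17", "seventeen")
print(store.handle("GET", "/resources/"))
```

The point of the sketch is that the URI identifies what is acted on, while the HTTP method alone identifies the action.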
Resources Requesting/Managing – REST
OpenStack and AWS both support RESTful APIs
Example for OpenStack:
PUT http://localhost/v1/AUTH_admin/mycontainer
  Create a container “mycontainer”
PUT http://localhost/v1/AUTH_admin/mycontainer/file1
  Upload a file “file1” to “mycontainer”
GET http://localhost/v1/AUTH_admin/mycontainer/?prefix=fi
  List files in “mycontainer” with prefix fi
GET http://localhost/v1/AUTH_admin/mycontainer/file1
  Download the file “file1”
The RESTful way gives a system a cleaner and more scalable interface
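The sketch below builds (but does not send) the four requests with Python's standard urllib. It assumes the OpenStack Object Storage convention in which PUT creates a container or uploads an object; the helper name build_requests and the payload are hypothetical, and a real call would also need an authentication token header.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE = "http://localhost/v1/AUTH_admin"  # endpoint used in the slide

def build_requests(container, object_name, payload, prefix):
    """Return the create, upload, list, and download requests without sending them."""
    container_url = f"{BASE}/{container}"
    object_url = f"{container_url}/{object_name}"
    query = urlencode({"prefix": prefix})
    return [
        Request(container_url, method="PUT"),             # create the container
        Request(object_url, data=payload, method="PUT"),  # upload the object
        Request(f"{container_url}/?{query}", method="GET"),  # list by prefix
        Request(object_url, method="GET"),                # download the object
    ]

for req in build_requests("mycontainer", "file1", b"hello", "fi"):
    print(req.get_method(), req.full_url)
```

Passing any of these objects to urllib.request.urlopen would perform the call against a running service.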
Resources Requesting/Managing - SOAP
Simple Object Access Protocol
It’s a protocol, not a style like REST
No new inventions involved
Uses XML for describing the content
Can be used over any transport protocol, such as HTTP, SMTP, or raw TCP
Resources Requesting/Managing - SOAP
Components of a SOAP message:
SOAP envelope: the wrapper
SOAP body: the application-specific message. The client and the server can each interpret it in their own way
Resources Requesting/Managing - SOAP
A price request from client:
1. <SOAP-ENV:Envelope
2.   xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
3.   SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
4.   <SOAP-ENV:Header>
5.   </SOAP-ENV:Header>
6.   <SOAP-ENV:Body>
7.     <m:GetLastTradePrice xmlns:m="http://foo.bar/prices">
8.       <m:Item>Apple</m:Item>
9.     </m:GetLastTradePrice>
10.  </SOAP-ENV:Body>
11. </SOAP-ENV:Envelope>
encodingStyle in line 3 is an attribute of the envelope that describes how the message is encoded
Lines 6 to 10 are the body, which wraps the application-specific information
Line 7 is the call to the server’s “GetLastTradePrice” operation
Resources Requesting/Managing - SOAP
Response from server:
1. <SOAP-ENV:Envelope
2.   xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"
3.   SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
4.   <SOAP-ENV:Body>
5.     <m:GetLastTradePriceResponse xmlns:m="http://foo.bar/prices">
6.       <m:Price>34.5</m:Price>
7.     </m:GetLastTradePriceResponse>
8.   </SOAP-ENV:Body>
9. </SOAP-ENV:Envelope>
As we can see, line 5 is a “GetLastTradePriceResponse”, and in line 6 the “Price” is 34.5
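A client could produce such a request and read such a response with Python's standard xml.etree.ElementTree, roughly as sketched below. The function names are hypothetical, the encodingStyle attribute is omitted for brevity, and nothing is sent over a network.

```python
import xml.etree.ElementTree as ET

NS = {
    "soap": "http://schemas.xmlsoap.org/soap/envelope/",
    "m": "http://foo.bar/prices",
}

def build_price_request(item):
    """Wrap a GetLastTradePrice call in a SOAP envelope and body."""
    envelope = ET.Element(f"{{{NS['soap']}}}Envelope")
    ET.SubElement(envelope, f"{{{NS['soap']}}}Header")
    body = ET.SubElement(envelope, f"{{{NS['soap']}}}Body")
    call = ET.SubElement(body, f"{{{NS['m']}}}GetLastTradePrice")
    ET.SubElement(call, f"{{{NS['m']}}}Item").text = item
    return ET.tostring(envelope, encoding="unicode")

def parse_price_response(xml_text):
    """Extract the price from a GetLastTradePriceResponse message."""
    root = ET.fromstring(xml_text)
    price = root.find("soap:Body/m:GetLastTradePriceResponse/m:Price", NS)
    return float(price.text)

RESPONSE = """<SOAP-ENV:Envelope
  xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
  <SOAP-ENV:Body>
    <m:GetLastTradePriceResponse xmlns:m="http://foo.bar/prices">
      <m:Price>34.5</m:Price>
    </m:GetLastTradePriceResponse>
  </SOAP-ENV:Body>
</SOAP-ENV:Envelope>"""

print(build_price_request("Apple"))
print(parse_price_response(RESPONSE))
```

The namespace prefixes in the generated request may differ from SOAP-ENV and m; a conforming parser only cares about the namespace URIs.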
Resources Requesting/Managing - SOAP
A run-instance SOAP request body example in AWS EC2:
<RunInstances xmlns="http://ec2.amazonaws.com/doc/2012-03-01/">
  <instancesSet>
    <item>
      <imageId>ami-60a54009</imageId>
      <minCount>1</minCount>
      <maxCount>3</maxCount>
    </item>
  </instancesSet>
  <groupSet/>
</RunInstances>
The AWS EC2 service parses the request, extracts the respective variables, and executes the RunInstances request
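The parsing step on the service side might look roughly like the sketch below. It is not AWS code: the function name and the returned dictionary keys are hypothetical, and only the XML body comes from the slide.

```python
import xml.etree.ElementTree as ET

EC2_NS = {"ec2": "http://ec2.amazonaws.com/doc/2012-03-01/"}

REQUEST_BODY = """<RunInstances xmlns="http://ec2.amazonaws.com/doc/2012-03-01/">
  <instancesSet>
    <item>
      <imageId>ami-60a54009</imageId>
      <minCount>1</minCount>
      <maxCount>3</maxCount>
    </item>
  </instancesSet>
  <groupSet/>
</RunInstances>"""

def parse_run_instances(xml_text):
    """Pull the image ID and instance counts out of a RunInstances body."""
    root = ET.fromstring(xml_text)
    item = root.find("ec2:instancesSet/ec2:item", EC2_NS)
    return {
        "image_id": item.findtext("ec2:imageId", namespaces=EC2_NS),
        "min_count": int(item.findtext("ec2:minCount", namespaces=EC2_NS)),
        "max_count": int(item.findtext("ec2:maxCount", namespaces=EC2_NS)),
    }

print(parse_run_instances(REQUEST_BODY))
```

Because the body declares a default namespace, every element lookup has to be namespace-qualified even though the tags carry no prefix.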
Resource Requesting/Managing
With the RESTful API and SOAP, we can manipulate cloud resources through web services.
After all, the web service is the final product of Cloud Computing for end users.
Load Balancing
Load balancing is very important in resource management
Another “as a Service” offering from many cloud providers:
  AWS Elastic Load Balancing (ELB)
  Rackspace Cloud Load Balancing
  Project Atlas of OpenStack, as a pluggable module
Load Balancing
Load balancing balances not only network bandwidth, but also system load
It also monitors the health of instances
  If an instance is unhealthy, the balancer should stop sending load to it.
Session persistence
  Connections in the same session should be bound to the same node
  This preserves node-local data and helps performance
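Health checking and session persistence can be combined in a few lines. The following is a minimal sketch, assuming round-robin selection among healthy nodes; the class and method names are hypothetical and do not correspond to any provider's API.

```python
class LoadBalancer:
    """Round-robin balancer with health tracking and sticky sessions."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.healthy = set(self.nodes)
        self.sessions = {}   # session id -> node bound to that session
        self._cursor = 0

    def mark_health(self, node, is_healthy):
        # Called by a health monitor after probing the node.
        if is_healthy:
            self.healthy.add(node)
        else:
            self.healthy.discard(node)

    def route(self, session_id):
        # Session persistence: reuse the bound node while it stays healthy.
        node = self.sessions.get(session_id)
        if node in self.healthy:
            return node
        candidates = [n for n in self.nodes if n in self.healthy]
        if not candidates:
            raise RuntimeError("no healthy nodes available")
        node = candidates[self._cursor % len(candidates)]
        self._cursor += 1
        self.sessions[session_id] = node
        return node

lb = LoadBalancer(["node-a", "node-b", "node-c"])
first = lb.route("session-1")
lb.mark_health(first, False)
print(first, "->", lb.route("session-1"))
```

When a bound node fails, the session is rebound to a healthy node, which is also where any node-local session data would be lost.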
Auto-scaling
The most important feature of Cloud Computing
  On-demand self-service
Works with the load balancer
  Load information is needed to determine when to scale.
Easy with virtualization
  A new physical machine cannot be installed automatically.
Scale up for more computing power, and scale down to save money
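A scaling decision driven by load information can be sketched as a simple threshold rule. The thresholds, the one-instance step, and the min_n/max_n bounds below are illustrative assumptions; max_n stands in for whatever cost ceiling an operator sets.

```python
def desired_instances(current, avg_load, low=0.3, high=0.7, min_n=1, max_n=10):
    """Return the instance count to run next, given average load in [0, 1]."""
    if avg_load > high:
        target = current + 1   # more computing power
    elif avg_load < low:
        target = current - 1   # save money
    else:
        target = current       # load is in the comfortable band
    # Never go below the minimum or above the configured ceiling.
    return max(min_n, min(max_n, target))

for load in (0.9, 0.5, 0.1):
    print(load, desired_instances(4, load))
```

Keeping a gap between the low and high thresholds avoids adding and removing instances on every small load fluctuation.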
Auto-scaling – Issues
Synchronizing between nodes
  Also an issue in load balancing
Scaling too large
  Budget constraints need to be set
  That is “Measured service”
Hard to scale down for databases
  With a shared-nothing architecture, it’s hard to merge the data sets from different hosts back together when shutting down nodes.
Shared-nothing Architecture
No memory or disks are shared between nodes
  Each node is self-sufficient
  Independence
No “Single Point of Failure”
  What will happen if an NFS server shared by 10 hosts for storage crashes?
Easier to scale than shared-disk/memory architectures
SN Architecture – Database
Partition the table by row, rather than by column
Distribute the rows to nodes
  Then the queries can be distributed
Google calls it a “database shard”
Original table:
id  data
1   data1
2   data2
3   data3
4   data4
Partition on the first node:
id  data
1   data1
2   data2
Partition on the second node:
id  data
3   data3
4   data4
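Row-wise partitioning like this can be sketched with a range-based router. The class name and the boundary value 3 are assumptions chosen to reproduce the two partitions shown; real systems often hash the key instead of using ranges.

```python
from bisect import bisect_right

class RangeShardedTable:
    """Rows are assigned to shards by comparing their id with range boundaries."""

    def __init__(self, boundaries):
        # boundaries=[3] means ids below 3 go to shard 0, ids 3 and up to shard 1.
        self.boundaries = sorted(boundaries)
        self.shards = [dict() for _ in range(len(self.boundaries) + 1)]

    def shard_for(self, row_id):
        return bisect_right(self.boundaries, row_id)

    def insert(self, row_id, data):
        self.shards[self.shard_for(row_id)][row_id] = data

    def get(self, row_id):
        # A single-row query touches only the node that owns the row.
        return self.shards[self.shard_for(row_id)][row_id]

table = RangeShardedTable(boundaries=[3])
for i in range(1, 5):
    table.insert(i, f"data{i}")
print(table.shards)
```

Each dictionary stands in for one node's local storage; no node needs to see another node's rows to answer a lookup by id.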
Shared-nothing vs. Shared resources
Shared-Nothing:
  Better availability
  Better scalability
  Easier to replace a faulty device, but complicates maintenance
  Needs redundant resources
Shared resources:
  Single point of contention problem
  Larger resource pool
  Centralized management, but leads to longer downtime in maintenance
  Complex synchronization and locking design
Scheduling Crosses Nodes
In a normal OS, the scheduler decides when processes run
In the Cloud, when a virtual machine instance request comes in, the scheduler is responsible for deciding which host in which zone the instance should run on
  The algorithm can be very simple or very complex
It’s the host OS’s responsibility to schedule when the VMs on that host run
Scheduling Crosses Nodes – Example
OpenStack nova-scheduler algorithms
Chance: the host is chosen randomly
Availability zone: the host is chosen randomly from within a specified availability zone
Simple: the host with the least load is chosen. Needs a load balancer
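The three policies can be illustrated in a few lines. This is a sketch of the ideas only, not nova-scheduler's code: the host records, their field names, and the function names are hypothetical.

```python
import random

# Hypothetical host inventory; "load" is the number of running instances.
HOSTS = [
    {"name": "host-a", "zone": "zone-1", "load": 12},
    {"name": "host-b", "zone": "zone-1", "load": 7},
    {"name": "host-c", "zone": "zone-2", "load": 3},
]

def chance(hosts):
    """Chance: pick any host at random."""
    return random.choice(hosts)

def availability_zone(hosts, zone):
    """Availability zone: pick at random among hosts in the requested zone."""
    in_zone = [h for h in hosts if h["zone"] == zone]
    if not in_zone:
        raise ValueError(f"no hosts in zone {zone}")
    return random.choice(in_zone)

def simple(hosts):
    """Simple: pick the least-loaded host."""
    return min(hosts, key=lambda h: h["load"])

print(chance(HOSTS)["name"])
print(availability_zone(HOSTS, "zone-1")["name"])
print(simple(HOSTS)["name"])
```

Only the last policy needs up-to-date load figures, which is why it depends on some component that collects them.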
Scheduling Crosses Nodes – Example
OpenStack nova-scheduler architecture
It uses a simple table of instance types (how much memory, etc.) on every host to determine how many resources are consumed. It ignores status updates.
Only one scheduler worker, which leads to a “Single Point of Failure”
“Current workload” is used as a parameter to avoid hammering a heavily loaded host
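Estimating consumption from an instance-type table might look like the sketch below. The type names, sizes, and function names are hypothetical, and the real scheduler's data model is not shown in the slides.

```python
# Hypothetical instance-type table: what each type reserves on a host.
INSTANCE_TYPES = {
    "small": {"memory_mb": 2048, "vcpus": 1},
    "large": {"memory_mb": 8192, "vcpus": 4},
}

def consumed(instance_types_on_host):
    """Sum the resources reserved by the instance types placed on one host."""
    total = {"memory_mb": 0, "vcpus": 0}
    for type_name in instance_types_on_host:
        for key in total:
            total[key] += INSTANCE_TYPES[type_name][key]
    return total

def can_host(capacity, instance_types_on_host, new_type):
    """Check whether adding one more instance would still fit the host."""
    used = consumed(list(instance_types_on_host) + [new_type])
    return all(used[key] <= capacity[key] for key in used)

placed = ["small", "small", "large"]
capacity = {"memory_mb": 16384, "vcpus": 8}
print(consumed(placed))
print(can_host(capacity, placed, "large"))
```

The estimate is based purely on what was reserved at placement time, so it can drift from what the instances actually use.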
Scheduling Crosses Nodes
Scheduling is a hot topic in Cloud Computing
In a very large deployment (500 hosts with 200 instances each, for example), reducing the scheduling latency is critical
In OpenStack the scheduling algorithm is very simple, but has some architectural problems. A better model was released in the newest implementation (April 5)
Network Management
The network is fundamental to accessing the Cloud
There is no physical interface for virtual machines, only vNICs, which are then bridged to the real NICs
In this model, Linux bridge control provides only basic management
  No advanced QoS, ACLs, or monitoring
  Simple and flat network topology, which may not suit all production environments
In OpenStack, nova-network leads to the Single Point of Failure problem again
Network Management – Virtual Switch
Improved network virtualization
  Not only NICs are virtualized, but also switches
  The virtual switch acts as a service layer for management
  The virtual switch provides virtual ports for connectivity
A virtual switch provides the functions a normal switch should have
  Layer 2 management
  QoS, port statistics
  Enables more complex network topologies
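To give a feel for what Layer 2 management and port statistics mean in software, here is a toy learning switch. It is an assumption-laden sketch: the class, the frame representation, and the counters are hypothetical and far simpler than any production virtual switch.

```python
class VirtualSwitch:
    """Toy Layer 2 switch: learns MAC addresses and counts frames per port."""

    def __init__(self, num_ports):
        self.mac_table = {}   # MAC address -> port where it was last seen
        self.stats = {p: {"rx": 0, "tx": 0} for p in range(num_ports)}

    def receive(self, in_port, src_mac, dst_mac):
        """Process one frame and return the list of ports it is sent out on."""
        self.stats[in_port]["rx"] += 1
        self.mac_table[src_mac] = in_port          # learn the sender's port
        if dst_mac in self.mac_table:
            out_ports = [self.mac_table[dst_mac]]  # known destination
        else:
            out_ports = list(self.stats)           # unknown: flood
        # Never send a frame back out of the port it arrived on.
        out_ports = [p for p in out_ports if p != in_port]
        for port in out_ports:
            self.stats[port]["tx"] += 1
        return out_ports

sw = VirtualSwitch(num_ports=3)
print(sw.receive(0, "aa:aa", "bb:bb"))  # destination unknown, flooded
print(sw.receive(1, "bb:bb", "aa:aa"))  # destination learned
print(sw.stats)
```

The per-port counters are the kind of data a management layer can expose for monitoring or billing.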
Network Management – Virtual Switch
Available implementations
Open vSwitch
Network Management - Example
OpenStack Quantum project
  Included in the Essex release
  Includes APIs, publicly available virtual switch implementations, and other plugins
Supports a RESTful API
  quantum.foo.com/tenant-a/network/net-1/port/17
  We can get the statistics of port 17 in net-1 of tenant-a
Through Quantum, the network service can be measured more easily
  Network-as-a-Service
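The example URI encodes a tenant, a network, and a port as a path. A client-side helper that takes it apart could look like this sketch; the function name and the returned keys are hypothetical, and the path layout is assumed to be exactly the one in the example.

```python
def parse_port_uri(uri):
    """Split host/tenant/network/<name>/port/<number> into named fields."""
    host, tenant, *rest = uri.split("/")
    # The remainder alternates between a resource kind and its identifier.
    pairs = dict(zip(rest[0::2], rest[1::2]))
    return {
        "host": host,
        "tenant": tenant,
        "network": pairs["network"],
        "port": int(pairs["port"]),
    }

print(parse_port_uri("quantum.foo.com/tenant-a/network/net-1/port/17"))
```

Because every resource has its own address, per-port figures such as statistics can be requested and metered individually.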
Summary
SOAP provides a standard method for exchanging data
REST architecture describes a more general method to manage resources
Load balancing can not only balance resources, but also be used for failover
Virtualization makes auto-scaling easy and possible
Summary
Shared-nothing is mainstream in Cloud, since it has higher availability and scalability
Scheduling in the Cloud decides on which host the new virtual machine should be created
Just as process scheduling is latency-critical in an OS, VM scheduling is latency-critical in the Cloud
Improved network virtualization will be the next step in the Cloud, enabling more capabilities