multi-cloud testing
DESCRIPTION
Testing in a multi-cloud environment presents unique challenges. This presentation reviews some of these challenges and includes some examples of how they can be addressed.
TRANSCRIPT
Testing in a Multi-Cloud Environment
Seema Jethani
Who am I?
Seema Jethani @seemaj
Senior Product Manager, Dell (formerly Enstratius / enStratus)
Blog ★ http://seemaj.wordpress.com ★ http://cloudfieldnotes.com/author/seema/
Slideshare http://www.slideshare.net/svjethani/presentations
CLOUDS ARE LIKE BABIES
Bringing joy
But also presenting...
UNIQUE CHALLENGES
Speaking their own language
Capability Terminology
Account: AWS = Account; Terremark = Organization
Region: AWS = Region; Terremark = Environment
Data center: AWS = Availability Zone; Terremark = Compute Pool
Machine Image: AWS = Machine Image; VMware = Template vApp; Joyent = Datasets; OpenStack = Image; CloudStack, Terremark = Template
Firewall: AWS, OpenStack, CloudStack = Security Group; Terremark = Firewall
Volume: AWS = Elastic Block Store; VMware, IBM, Terremark = Disk; OpenStack = Cinder or Nova Volume
Partial List
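One way a test framework might cope with this vocabulary is a lookup table that translates a neutral capability name into each cloud's own term. The sketch below is only an illustration built from the partial list above; the names `CLOUD_TERMS` and `cloud_term` are hypothetical and not part of any Enstratius code.

```python
# Hypothetical lookup table transcribed from the terminology slide (partial list).
CLOUD_TERMS = {
    "account": {"AWS": "Account", "Terremark": "Organization"},
    "region": {"AWS": "Region", "Terremark": "Environment"},
    "data center": {"AWS": "Availability Zone", "Terremark": "Compute Pool"},
    "machine image": {
        "AWS": "Machine Image",
        "VMware": "Template vApp",
        "Joyent": "Datasets",
        "OpenStack": "Image",
        "CloudStack": "Template",
        "Terremark": "Template",
    },
    "firewall": {
        "AWS": "Security Group",
        "OpenStack": "Security Group",
        "CloudStack": "Security Group",
        "Terremark": "Firewall",
    },
    "volume": {
        "AWS": "Elastic Block Store",
        "VMware": "Disk",
        "IBM": "Disk",
        "Terremark": "Disk",
        "OpenStack": "Cinder or Nova Volume",
    },
}


def cloud_term(capability, cloud):
    """Return the cloud's own word for a capability, or None if it is not listed."""
    return CLOUD_TERMS.get(capability.lower(), {}).get(cloud)
```

A test report could then use `cloud_term("Region", "Terremark")` to print "Environment" instead of the generic name.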
And are sometimes hard to work with
Examples of challenging cloud behavior
1. Requiring paperwork / manual processing
2. Requiring calling support to create private images
3. Requiring a new account for new API versions, thus forgetting backward compatibility
4. Using same API version, but continuing to make changes
5. Taking hours to launch a server or launching server requests serially
6. Having a ridiculously low cut-off for throttling API requests e.g. dozens of API calls per second
7. Not providing a way for the VM to discover its own ID
Each is able to do different things in different ways
Partial Cloud Feature Matrix (S = supported, NS = not supported)
Feature              AWS  Rackspace  Terremark  Joyent
Launch Server        S    S          S          S
Pause Server         S    NS         S          S
Make Image           S    S          S          NS
Create Snapshot      S    S          NS         NS
Create Volume        S    S          NS         NS
Create Loadbalancer  S    S          NS         NS
Create/Reserve IP    S    NS         S          NS
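The matrix above can be encoded as data so that a test run knows up front which features to skip on a given cloud. This is a minimal sketch that assumes S means supported and NS means not supported; `MATRIX`, `is_supported` and `unsupported_features` are illustrative names, not part of the framework described in the talk.

```python
# Columns of the partial feature matrix, in slide order.
CLOUDS = ("AWS", "Rackspace", "Terremark", "Joyent")

# One row per feature; values follow the CLOUDS column order.
MATRIX = {
    "Launch Server": ("S", "S", "S", "S"),
    "Pause Server": ("S", "NS", "S", "S"),
    "Make Image": ("S", "S", "S", "NS"),
    "Create Snapshot": ("S", "S", "NS", "NS"),
    "Create Volume": ("S", "S", "NS", "NS"),
    "Create Loadbalancer": ("S", "S", "NS", "NS"),
    "Create/Reserve IP": ("S", "NS", "S", "NS"),
}


def is_supported(feature, cloud):
    """True if the matrix marks the feature as S for the given cloud."""
    return MATRIX[feature][CLOUDS.index(cloud)] == "S"


def unsupported_features(cloud):
    """List the features a test run would have to skip on this cloud."""
    return [feature for feature in MATRIX if not is_supported(feature, cloud)]
```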
Examples of cloud nuances
1. Make Image: e.g. CloudSigma requires the server to be stopped before making an image
2. Attaching a volume: e.g. Terremark requires the server to be stopped for the volume to be attached
3. Creating a snapshot: e.g. making a snapshot from a volume fails in CloudCentral if the volume is not attached to a server
4. Connecting via ssh: e.g. CloudSigma requires users to access a CentOS server via VNC first to enable ssh
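Nuances like the stopped-server requirement can be captured as per-cloud preconditions rather than scattered through test code. The sketch below is an assumption about how that might look: the operation names and the `plan_steps` helper are hypothetical, and restarting the server afterwards is an added assumption, not something the slides state.

```python
# Hypothetical (cloud, operation) pairs that need the server stopped first,
# taken from the CloudSigma and Terremark examples above.
REQUIRES_STOPPED_SERVER = {
    ("CloudSigma", "make_image"),
    ("Terremark", "attach_volume"),
}


def plan_steps(cloud, operation):
    """Return the ordered steps a test would run for one operation on one cloud."""
    if (cloud, operation) in REQUIRES_STOPPED_SERVER:
        # Assumption: the test restarts the server once the operation is done.
        return ["stop_server", operation, "start_server"]
    return [operation]
```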
Cloud requires us to rethink our testing approach
We need to learn “cloud speak”
Understand differences in behavior
But also importantly
Plan for failure
How is this different?
Customers expect cloud failure
but also
Expect recovery from cloud behavior to be transparent. They don’t care about root cause infrastructure issues
Test for failure, auto-recovery, MTTR
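MTTR (mean time to recovery) is the average time between a failure being observed and service being restored. A minimal sketch of the arithmetic, using made-up timestamps and a hypothetical `mean_time_to_recovery` helper:

```python
from datetime import datetime


def mean_time_to_recovery(incidents):
    """Average recovery time in seconds over (failed_at, recovered_at) pairs."""
    if not incidents:
        raise ValueError("need at least one incident")
    downtime = sum(
        (recovered - failed).total_seconds() for failed, recovered in incidents
    )
    return downtime / len(incidents)


# Illustrative data only: two injected failures, recovered after 60 s and 120 s.
incidents = [
    (datetime(2013, 1, 1, 10, 0, 0), datetime(2013, 1, 1, 10, 1, 0)),
    (datetime(2013, 1, 1, 12, 0, 0), datetime(2013, 1, 1, 12, 2, 0)),
]
mttr = mean_time_to_recovery(incidents)  # (60 + 120) / 2 = 90.0 seconds
```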
How is this different?
Customers expect cloud failure
But don’t expect it to impact their business
Test for cloud failure and its impact on customer application performance.
Enstratius testing journey
Enstratius Cloud Management Platform
The release process
Today
[Diagram labels: Build, Artifacts, Installer, API tests, GitHub, Artifactory, Vagrant, AWS, Jenkins]
[Diagram labels: ReportNG Report, Automated Test Framework (TestNG), GitHub, Enstratius API Server, Cloud A, Cloud Z, estrov, ant]
API Test Framework
Cloud account access keys, test configs
HeaderConfig.xml
Cloud specific resource/method exclusions
JSON/XML inputs for REST calls common to all clouds
Set up test bed on cloud under test
Tear down test bed on cloud under test
Atomic tests: Update, Add, Delete, List
Cloud specific inputs for REST calls (if any)
estrov
API test workflow
<?xml version="1.0" encoding="UTF-8"?>
<cloudService name="AWS">
  <excluded>
    <resource name="kvdb" />
    <resource name="customer">
      <method name="add" />
      <method name="changeCurrency" />
      <method name="revertCurrency" />
      <method name="setTimeZone" />
      <method name="revertTimeZone" />
    </resource>
  </excluded>
</cloudService>
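An exclusion file like this one can be read into a simple lookup before the tests run. The sketch below uses Python's standard `xml.etree.ElementTree` rather than the TestNG/Java stack the talk describes, and it assumes that a `resource` element with no `method` children means the whole resource is excluded; `load_exclusions` and `is_excluded` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# The AWS exclusion list from the slide, with the extraction debris removed.
EXCLUSIONS_XML = """<?xml version="1.0" encoding="UTF-8"?>
<cloudService name="AWS">
  <excluded>
    <resource name="kvdb" />
    <resource name="customer">
      <method name="add" />
      <method name="changeCurrency" />
      <method name="revertCurrency" />
      <method name="setTimeZone" />
      <method name="revertTimeZone" />
    </resource>
  </excluded>
</cloudService>"""


def load_exclusions(xml_text):
    """Return {resource: set of excluded methods}; an empty set means all methods."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    exclusions = {}
    for resource in root.findall("./excluded/resource"):
        methods = {method.get("name") for method in resource.findall("method")}
        exclusions[resource.get("name")] = methods
    return exclusions


def is_excluded(exclusions, resource, method):
    """True if the resource/method pair should be skipped for this cloud."""
    if resource not in exclusions:
        return False
    methods = exclusions[resource]
    # Assumption: a resource listed without methods is excluded entirely.
    return not methods or method in methods
```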
{"addLoadBalancer": [
  {
    "region": {"regionId": REGION_ID},
    "budget": BILLINGCODE_ID,
    "description": "TestNG_Trial_LB",
    "name": DYNAMIC,
    "label": "red",
    "customer": {"customerId": CUSTOMER_ID},
    "dataCenters": [{"dataCenterId": DATACENTER_ID}],
    "owningGroups": [{"groupId": GROUP_ID}],
    "listeners": [{"protocol": "HTTP", "publicPort": 80, "privatePort": 80}]
  }
]}
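The upper-case tokens in this input (REGION_ID, BILLINGCODE_ID and so on) look like placeholders that are filled in per cloud account before the REST call is made. A rough sketch of that substitution, assuming the values come from the per-account config and that DYNAMIC stands for a name generated at run time; the `fill_template` helper, the region and data center IDs, and the generated name are all hypothetical.

```python
import json
import re

# The load balancer input from the slide, with placeholders left as bare tokens.
TEMPLATE = """{"addLoadBalancer": [
  {"region": {"regionId": REGION_ID},
   "budget": BILLINGCODE_ID,
   "description": "TestNG_Trial_LB",
   "name": DYNAMIC,
   "label": "red",
   "customer": {"customerId": CUSTOMER_ID},
   "dataCenters": [{"dataCenterId": DATACENTER_ID}],
   "owningGroups": [{"groupId": GROUP_ID}],
   "listeners": [{"protocol": "HTTP", "publicPort": 80, "privatePort": 80}]}
]}"""


def fill_template(template, values):
    """Replace each known placeholder token with its JSON-encoded value and parse."""
    pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in values) + r")\b")
    filled = pattern.sub(lambda m: json.dumps(values[m.group(1)]), template)
    return json.loads(filled)


values = {
    "REGION_ID": 1,            # hypothetical value
    "BILLINGCODE_ID": 403,     # from the CONFIG example in the slides
    "DYNAMIC": "lb-test-001",  # assumed: a name generated per test run
    "CUSTOMER_ID": 200,        # from the CONFIG example in the slides
    "DATACENTER_ID": 10,       # hypothetical value
    "GROUP_ID": 500,           # from the CONFIG example in the slides
}
body = fill_template(TEMPLATE, values)
```

Only tokens listed in `values` are replaced, so quoted strings such as "HTTP" are left alone.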
start
Read the cloud access keys & test configs
Iterate through the test resources
Exclude resource?
Run atomic tests
Generate report
Headerconfig.xml
AWS.xml
Read resource/method inputs
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<CONFIG>
  <BILLINGCODE_ID>403</BILLINGCODE_ID>
  <USER_ID>50950</USER_ID>
  <DUPLICATE_BILLING_ID>402</DUPLICATE_BILLING_ID>
  <DUPLICATE_GRP_ID>501</DUPLICATE_GRP_ID>
  <ROLE_ID>55501</ROLE_ID>
  <GROUP_ID>500</GROUP_ID>
  <ACCOUNT_ID>200</ACCOUNT_ID>
  <CLOUD_ID>1</CLOUD_ID>
  <CUSTOMER_ID>200</CUSTOMER_ID>
</CONFIG>
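A flat config file like this maps naturally onto a dictionary of IDs that the tests can look up by name. A small sketch using Python's standard library, assuming every element holds an integer ID; `load_config` is a hypothetical helper, not part of the framework shown in the slides.

```python
import xml.etree.ElementTree as ET

# The per-account config from the slide, with the extraction debris removed.
CONFIG_XML = """<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<CONFIG>
  <BILLINGCODE_ID>403</BILLINGCODE_ID>
  <USER_ID>50950</USER_ID>
  <DUPLICATE_BILLING_ID>402</DUPLICATE_BILLING_ID>
  <DUPLICATE_GRP_ID>501</DUPLICATE_GRP_ID>
  <ROLE_ID>55501</ROLE_ID>
  <GROUP_ID>500</GROUP_ID>
  <ACCOUNT_ID>200</ACCOUNT_ID>
  <CLOUD_ID>1</CLOUD_ID>
  <CUSTOMER_ID>200</CUSTOMER_ID>
</CONFIG>"""


def load_config(xml_text):
    """Return {element name: integer ID} for every child of <CONFIG>."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    # Assumption: all values in this file are integer IDs.
    return {child.tag: int(child.text) for child in root}


config = load_config(CONFIG_XML)
```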
Generate dependency resources
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<header>
  <csp>AWS</csp>
  <resource>all</resource>
  <accessKey>aws_accesskey</accessKey>
  <secretKey>aws_secretkey</secretKey>
  <apiServer>enstratius_apiserver_ip</apiServer>
  <apiServerPort>enstratius_apiserver_port</apiServerPort>
  <version>enstratius_apiserver_version</version>
  <testForFailure>false</testForFailure>
</header>
stop
Loadbalancer/add.json
Config_200.xml
Finished iterating
Yes
No
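The workflow on this slide (iterate through the resources, skip what is excluded, run the atomic tests, generate a report) can be sketched as a small loop. This is an interpretation of the flowchart labels, not the actual TestNG implementation; `run_suite`, the status strings, and the convention that an empty method set excludes a whole resource are assumptions.

```python
def run_suite(resources, exclusions, run_atomic_test):
    """Run every resource/method pair and return (resource, method, status) rows.

    resources: {resource name: list of method names to test}
    exclusions: {resource name: set of excluded methods}; an empty set is
        assumed to exclude the whole resource
    run_atomic_test: callable(resource, method) that raises AssertionError on failure
    """
    report = []
    for resource, methods in resources.items():
        excluded = exclusions.get(resource)
        for method in methods:
            if excluded is not None and (not excluded or method in excluded):
                report.append((resource, method, "SKIPPED"))
                continue
            try:
                run_atomic_test(resource, method)
                report.append((resource, method, "PASSED"))
            except AssertionError as err:
                report.append((resource, method, f"FAILED: {err}"))
    return report
```

Recording skipped pairs in the report, rather than dropping them silently, keeps the per-cloud exclusion list visible to whoever reads the results.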
The extra steps
1. Not all functions are supported by all clouds. An exclusion list needs to be maintained.
2. Cloud specific behavior needs to be coded, e.g. Terremark requires the server to be stopped for the volume to be attached
3. Sanity check: issues found must be verified directly using the cloud console / cloud API
Near-term updates
[Diagram labels: Build, Artifacts, Launch, API tests, GitHub, Artifactory, AWS, Jenkins, Installer, Checkout, Teardown]
Future
[Diagram labels: Build, Artifacts, Launch, API tests, GitHub, Artifactory, AWS, Jenkins, Installer, 1. Checkout, 6. Teardown, Selenium]
Other tools on the radar
Visualizing performance
http://twitter.github.io/zipkin/
Vulnerability scans