failsafe iaas

MICROSOFT CONFIDENTIAL – INTERNA
FailSafe IaaS
Ulrich Homann, Chief Architect, WW Services
Marc Mercuri, Sr. Director, MCS, Applied Incubation
Presented 2013

Upload: marc-mercuri

Post on 08-Apr-2017


Page 1: FailSafe IaaS

Ulrich Homann, Chief Architect, WW Services

FailSafe IaaS

Marc Mercuri, Sr. Director, MCS, Applied Incubation

Presented 2013

Page 2: FailSafe IaaS

Today's Platform

• Each layer “early bound” to layer below
• Must provision entire stack for each layer instance
• Difficult to balance isolation and utilization/efficiency

[Diagram: stack layers and the step that provisions each]

1. Purchase
2. Install (OS)
3. Install (Role)
4. Deploy (App)
5. Configure (Context)

Requests

Page 3: FailSafe IaaS

Today's Platform

• Virtualization breaks the tight coupling between hardware & software
• Software stack is still mostly statically bound though…

[Diagram: two software stacks (OS, Role, App, Context) running on a shared Virtualization layer]

Page 4: FailSafe IaaS

“Fabric Based” Computing Platform

[Diagram: many OS/Role instances hosted on a shared Infrastructure Fabric]

• Base infrastructure serves multiple workloads / roles
• Infrastructure is managed as one resource
• Provisioned to aggregate need rather than per project

Hardware becomes fungible

Page 5: FailSafe IaaS

What are the “9”s

• 90% ("one nine")
• 99% ("two nines")
• 99.9% ("three nines")
• 99.99% ("four nines")
• 99.999% ("five nines")
• 99.9999% ("six nines")
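To make the list above concrete, each availability level can be converted into the downtime it permits per year. The sketch below is plain arithmetic and assumes a 365-day year; the function name is illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Maximum downtime per year permitted at a given availability level."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


# The six levels listed on the slide.
NINES = [90.0, 99.0, 99.9, 99.99, 99.999, 99.9999]

if __name__ == "__main__":
    for pct in NINES:
        minutes = downtime_minutes_per_year(pct)
        print(f"{pct:>8}% -> {minutes:10.2f} minutes/year ({minutes / 60:8.2f} hours)")
```

Three nines allows roughly 8.8 hours of downtime per year, while five nines allows a little over 5 minutes.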

Page 6: FailSafe IaaS

The Truth About 9s
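The body of this slide is not in the transcript. One widely used piece of arithmetic about nines, offered here only as an assumption about what the slide may have covered, is that availabilities compound: components a request must pass through in series multiply together, while independent redundant copies fail only if all of them fail. A minimal sketch, assuming independent failures:

```python
from math import prod


def serial_availability(availabilities):
    """All components must be up: the availabilities multiply."""
    return prod(availabilities)


def redundant_availability(availabilities):
    """Any one copy is enough: the service is down only if every copy is down."""
    return 1 - prod(1 - a for a in availabilities)


if __name__ == "__main__":
    # Two 99.9% services in series fall below 99.9% overall.
    print(f"serial:    {serial_availability([0.999, 0.999]):.6f}")
    # Two independent 99% copies behave like 99.99% together.
    print(f"redundant: {redundant_availability([0.99, 0.99]):.6f}")
```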

Page 7: FailSafe IaaS

SLA Constraints and Throttling

Page 8: FailSafe IaaS

SendGrid

Not every service has an SLA

Page 9: FailSafe IaaS

TODO: IaaS is a Bridge Slide

Page 10: FailSafe IaaS

Define Lifecycle Model

[Diagram labels: Workload 1, Workload 2]

Page 11: FailSafe IaaS
Page 12: FailSafe IaaS

Scale

Resources

Demands

Unit of Scale

Workloads

Page 13: FailSafe IaaS

Workload 1

Workload 2

Bottom, Ramp, Peak

Page 14: FailSafe IaaS

Deployment Redundancy

Page 15: FailSafe IaaS

Auto-Scaling Compute in Windows Azure

Page 16: FailSafe IaaS

Fault Domains

Fault and upgrade domains

• Failed component can’t take down service

• Isolated infrastructure
• Physical hosts, racks
• Network equipment

• Two by default
• Role instances across 2+ fault domains

Upgrade Domains

• VM rolling upgrades, no availability impact

• Logical grouping of role instances

• Five by default

• Role instances spread over upgrade domains

• Deployment upgraded all at once or one upgrade domain at a time
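The effect of the two defaults above can be illustrated with a small placement sketch. The round-robin assignment is an assumption made for illustration only; the slide does not describe how the platform actually places instances.

```python
from collections import Counter

FAULT_DOMAINS = 2    # "Two by default" per the slide
UPGRADE_DOMAINS = 5  # "Five by default" per the slide


def place_instances(count, fault_domains=FAULT_DOMAINS, upgrade_domains=UPGRADE_DOMAINS):
    """Round-robin placement (illustrative assumption, not the platform's algorithm)."""
    return [
        {
            "instance": i,
            "fault_domain": i % fault_domains,
            "upgrade_domain": i % upgrade_domains,
        }
        for i in range(count)
    ]


def worst_case_offline(placement, key):
    """Largest number of instances that share a single domain of the given kind."""
    return max(Counter(p[key] for p in placement).values())


if __name__ == "__main__":
    placement = place_instances(10)
    print("lost if one fault domain fails:",
          worst_case_offline(placement, "fault_domain"))
    print("offline during one upgrade step:",
          worst_case_offline(placement, "upgrade_domain"))
```

With ten instances, a single fault-domain failure removes at most half of them, and a rolling upgrade takes at most two offline at any step.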

Page 17: FailSafe IaaS

[Diagram: two fault domains (racks). One rack hosts IIS1 and SQL1, the other hosts IIS2 and SQL2. IIS1 and IIS2 form the Web Availability Set; SQL1 and SQL2 form the SQL Availability Set.]

Make VMs Resilient to Failures with Availability Sets

• Get SLA by deploying multiple instances in availability sets

• Ensure availability during updates & maintenance

• Continue to architect availability into the application
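The availability-set guidance can be expressed as a simple check over an inventory of VMs. This is a minimal sketch: the tuple layout and function name are hypothetical, and the sample data mirrors the IIS/SQL layout on this slide.

```python
from collections import defaultdict


def check_availability_sets(vms):
    """vms: iterable of (vm_name, availability_set, fault_domain).

    Returns {availability_set: problem} for sets that would not survive
    the loss of a single instance or a single fault domain.
    """
    members = defaultdict(list)
    domains = defaultdict(set)
    for name, aset, fault_domain in vms:
        members[aset].append(name)
        domains[aset].add(fault_domain)

    problems = {}
    for aset, names in members.items():
        if len(names) < 2:
            problems[aset] = "fewer than two instances"
        elif len(domains[aset]) < 2:
            problems[aset] = "all instances share one fault domain"
    return problems


# Layout from the slide: one IIS and one SQL VM per rack.
SLIDE_LAYOUT = [
    ("IIS1", "Web Availability Set", 0),
    ("IIS2", "Web Availability Set", 1),
    ("SQL1", "SQL Availability Set", 0),
    ("SQL2", "SQL Availability Set", 1),
]

if __name__ == "__main__":
    print(check_availability_sets(SLIDE_LAYOUT))  # no problems expected
```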

Page 18: FailSafe IaaS

Custom Health Probes

[Diagram: two load-balanced (LB) deployments, each with two VMs. Labels: Your Application, Azure Agent, Customer Application, Role Status.]
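A custom health probe is typically an HTTP endpoint inside the application that the load balancer polls, with an instance kept in rotation only while the probe answers 200. The sketch below uses only the Python standard library; the `/health` path, port 8080, and the two named checks are hypothetical placeholders, not values from the deck.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def evaluate_health(checks):
    """Run each named check; return (HTTP status, body) for the probe response."""
    failed = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False  # a check that raises counts as a failure
        if not ok:
            failed.append(name)
    if failed:
        return 503, "unhealthy: " + ", ".join(failed)
    return 200, "healthy"


# Hypothetical dependency checks; replace with real ones.
CHECKS = {
    "database": lambda: True,
    "queue": lambda: True,
}


class ProbeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health":  # hypothetical probe path
            self.send_error(404)
            return
        status, body = evaluate_health(CHECKS)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Start the probe endpoint; blocks until interrupted."""
    HTTPServer(("0.0.0.0", port), ProbeHandler).serve_forever()
```

Calling `serve()` starts the endpoint. The point of probing the application itself, rather than relying on role status alone, is that a VM can be running while the application on it cannot serve requests.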

Page 19: FailSafe IaaS
Page 20: FailSafe IaaS

Understand Geo-Replication

Page 21: FailSafe IaaS

VM Disks: Built on Windows Azure Storage

Windows Azure Storage: asynchronous geo-replication

[Diagram: West DC and East DC, > 400 miles apart]
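Because the geo-replication is asynchronous, writes committed in the primary datacenter shortly before a failure may not yet exist in the secondary. The sketch below models that window; the lag value and timestamps are hypothetical, since the deck does not state an actual replication delay.

```python
def split_at_failover(writes, lag_seconds, failover_time):
    """Classify writes as replicated or lost when the primary fails.

    writes: list of (commit_time_seconds, key) committed on the primary.
    A write reaches the secondary lag_seconds after it commits.
    """
    replicated = [key for t, key in writes if t + lag_seconds <= failover_time]
    lost = [key for t, key in writes if t <= failover_time < t + lag_seconds]
    return replicated, lost


if __name__ == "__main__":
    writes = [(0, "w0"), (10, "w1"), (20, "w2"), (30, "w3")]
    replicated, lost = split_at_failover(writes, lag_seconds=15, failover_time=32)
    print("replicated:", replicated)
    print("lost:      ", lost)
```

Applications that cannot tolerate losing the most recent writes need their own replication strategy on top of the storage layer.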

Page 22: FailSafe IaaS

Hybrid solutions in Windows Azure

[Diagram: connectivity options between CLOUD and ENTERPRISE]

• Secure Site-to-Site Network Connectivity: Windows Azure Virtual Network
• Secure Point-to-Site Network Connectivity: Windows Azure Virtual Network
• Secure Machine-to-Machine Network Connectivity: Windows Azure Connect
• Application-Layer Connectivity & Messaging: Service Bus
• Data Synchronization: Multiple Options

Page 23: FailSafe IaaS

StorSimple: Extend your storage to Azure

Backup, Restore & DR with StorSimple: Automated, Optimized, Reliable

[Diagram labels: Primary Volume, Snapshots, Cloud Snapshots]

Cloud Snapshots
• Backup copy of data volume created in cloud
• Changes to local volume automatically transferred
• Cloud snapshots mountable for restore

Benefits
• Backup now as easy as snapshots
• Fast restores from off-site backups
• Integrated, easy to test disaster recovery
• Eliminates tape
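The idea that only changes to the local volume are transferred can be illustrated by comparing block hashes between snapshots. This is a generic sketch of incremental snapshotting under stated assumptions, not a description of how StorSimple implements it; the block size is deliberately tiny so the example is readable.

```python
import hashlib

BLOCK_SIZE = 4  # bytes; unrealistically small, for illustration only


def block_hashes(volume: bytes, block_size: int = BLOCK_SIZE):
    """Hash each fixed-size block of the volume."""
    return [
        hashlib.sha256(volume[i:i + block_size]).hexdigest()
        for i in range(0, len(volume), block_size)
    ]


def changed_blocks(previous_hashes, volume: bytes, block_size: int = BLOCK_SIZE):
    """Indices of blocks that are new or differ from the previous snapshot."""
    current = block_hashes(volume, block_size)
    return [
        i for i, digest in enumerate(current)
        if i >= len(previous_hashes) or digest != previous_hashes[i]
    ]


if __name__ == "__main__":
    first = b"AAAABBBBCCCC"
    second = b"AAAAXXXXCCCCDDDD"  # block 1 modified, block 3 appended
    print(changed_blocks(block_hashes(first), second))
```

Only the changed block indices would need to be uploaded for the second snapshot.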

Backup, Restore & DR Today: Inefficient, Complex, Laborious, and Risky

[Diagram labels: Primary Volume, Snapshot, Virtual Tape/Replication, Physical Tape, Offsite Tape Storage]

Page 24: FailSafe IaaS

…Enables Seamless Scalability and Rapid Recovery

• Connect Many Servers to Cloud Storage and Scale Data Sets with StorSimple Solution
• Rapidly Recover to Any Data Center, Location-Independent, via Mounting the Cloud

[Diagram: Enterprise Data Center 1 and Enterprise Data Center 2, each with Production Data, connected to Cloud Snapshots]

Page 25: FailSafe IaaS

Windows Azure Backup

• Back up datacenter data to Windows Azure using System Center Data Protection Manager
• Back up and recover files/folders from Windows Server 2012

Benefits
• Reliable offsite data protection
• Simple, familiar, integrated
• Efficient backup and recovery
• Easy set up

Your On-Premises Datacenter
• Windows Server 2012
• Windows Server 2012 Essentials
• Windows Server 2008 R2 (SP1)
• System Center 2012 DPM SP1

Page 26: FailSafe IaaS

SQL Server 2012 on IaaS: High Availability

High availability within regions using SQL Availability Groups

Page 27: FailSafe IaaS

SQL Server 2012 on IaaS: High Availability

High availability within regions using database mirroring

Two approaches:
• Use Domain Controller
• Use Certificates

Domain Controller Approach

Certificate-Based Approach

Page 28: FailSafe IaaS

SQL Server 2012 on IaaS: High Availability

High availability across regions using database mirroring and log shipping

Domain Controller Approach

Certificate-Based Approach

Page 29: FailSafe IaaS

SQL Server 2012 on IaaS: Disaster Recovery

High availability and Disaster Recovery with Availability Groups across on-prem and cloud

Page 30: FailSafe IaaS

Microsoft SQL Server backup and restore to the cloud

SQL Server Management Studio: Backup and restore database to the cloud
• Direct URL backup to Azure Storage
• Restore in Azure Virtual Machine

Benefits
• Reliable off-site data backup for SQL images
• Easily restore databases using VMs

Page 31: FailSafe IaaS

Failure Points - Virtualized DCs

• Background
• Common virtualization operations such as backing up/restoring VMs/VHDs can roll back the state of a virtual DC
• With Active Directory, this can introduce USN bubbles leading to permanently divergent state, causing:
  • lingering objects
  • inconsistent passwords
  • inconsistent attribute values
  • schema mismatches if the Schema FSMO is rolled back
• The potential also exists for security principals to be created with duplicate SIDs

Page 32: FailSafe IaaS

How Domain Controllers are Impacted

Timeline of events (DC1 and DC2)

• TIME: T1. Create snapshot. DC1 state: USN: 100, ID: A, RID Pool: 500 - 1000
• TIME: T2. +100 users added. DC1 state: USN: 200, ID: A, RID Pool: 600 - 1000. DC2 receives updates: USNs >100. DC2 records DC1(A) @USN = 200
• TIME: T3. T1 snapshot applied! DC1 state: USN: 100, ID: A, RID Pool: 500 - 1000
• TIME: T4. +150 more users created. DC1 state: USN: 250, ID: A, RID Pool: 650 - 1000. DC2 receives updates: USNs >200. DC2 records DC1(A) @USN = 250

USN rollback NOT detected:
• Only 50 users converge across the two DCs
• All others are either on one or the other DC
• 100 security principals (users in this example) with RIDs 500-599 have conflicting SIDs
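The numbers in this timeline can be reproduced with a small simulation. It is a deliberately simplified model that tracks only a USN counter, a RID pool, and the partner's high-watermark; the class and function names are illustrative, and real Active Directory replication involves considerably more state.

```python
class DomainController:
    """DC1: tracks a USN counter, the next free RID, and its own changes."""

    def __init__(self, usn, next_rid):
        self.usn = usn
        self.next_rid = next_rid
        self.changes = {}  # USN -> (user name, RID) for locally created users

    def create_users(self, prefix, count):
        for i in range(count):
            self.usn += 1
            self.changes[self.usn] = (f"{prefix}-{i}", self.next_rid)
            self.next_rid += 1

    def snapshot(self):
        return self.usn, self.next_rid, dict(self.changes)

    def restore(self, snap):
        self.usn, self.next_rid, changes = snap
        self.changes = dict(changes)


class Partner:
    """DC2: only requests changes above the highest USN it has already seen."""

    def __init__(self, watermark):
        self.watermark = watermark
        self.users = {}  # user name -> RID

    def pull_from(self, source):
        for usn, (name, rid) in sorted(source.changes.items()):
            if usn > self.watermark:
                self.users[name] = rid
        self.watermark = max(self.watermark, source.usn)


def simulate():
    dc1 = DomainController(usn=100, next_rid=500)
    dc2 = Partner(watermark=100)

    snap = dc1.snapshot()              # T1: snapshot at USN 100
    dc1.create_users("t2-user", 100)   # T2: USN 200, RIDs 500-599 used
    dc2.pull_from(dc1)                 # DC2 receives USNs > 100
    dc1.restore(snap)                  # T3: rollback, same ID, USN 100 again
    dc1.create_users("t4-user", 150)   # T4: USN 250, RIDs 500-649 reused
    dc2.pull_from(dc1)                 # DC2 only asks for USNs > 200

    dc1_users = dict(dc1.changes.values())  # user name -> RID
    converged = set(dc1_users) & set(dc2.users)
    dc1_by_rid = {rid: name for name, rid in dc1_users.items()}
    dc2_by_rid = {rid: name for name, rid in dc2.users.items()}
    conflicts = [
        rid for rid, name in dc1_by_rid.items()
        if rid in dc2_by_rid and dc2_by_rid[rid] != name
    ]
    return len(converged), len(conflicts)


if __name__ == "__main__":
    converged, conflicts = simulate()
    print(f"users on both DCs: {converged}, RIDs with conflicting owners: {conflicts}")
```

Because the restored DC1 keeps the same ID, DC2 has no reason to re-request USNs 101 through 200, which is why the divergence goes undetected in this model.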

Page 33: FailSafe IaaS

PowerShell for Automation and Advanced Management

Everything has an API – Automate, automate, automate

Automation
Query, manage and configure at scale:
• Virtual machines
• Storage across multiple subscriptions and storage accounts
• Tiered deployment workflows

Virtual Machines
• Configure storage and networking
• Domain join to AD DS on-premises
• Bring your own machine images or disks
• Use remote PowerShell

Virtual Network
• Configure virtual network
• Manage configuration and gateway
• Connect to on-premises networks

Storage
• Upload and download VHDs between storage accounts and on-premises
• Copy VHDs between storage accounts and subscriptions

Page 34: FailSafe IaaS

© 2013 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries. The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.