nicholas dritsas. represents the customer-facing resources from the server product groups. azure cat...

56
Largest SQL Server Projects in the world – On-Premises and Azure Nicholas Dritsas DBI332

Upload: charla-hampton

Post on 22-Dec-2015

213 views

Category:

Documents


0 download

TRANSCRIPT

Largest SQL Server Projects in the world – On-Premises and AzureNicholas Dritsas

DBI332

Windows Azure CAT (Customer Advisory Team)

• Achieving Customer Success− Very large, complex projects− Mission Critical projects

• Making a Better Product− Drives feedback and product requirements back into SQL Server

development teams from deep and strategic customer and ISV engagements

• Sharing with the Community− http://sqlcat.com− http://blogs.msdn.com/b/windowsazure

Represents the customer-facing resources from the Server Product Groups. Azure CAT is comprised of product and solution experts that regularly engage in the largest, most complex, and most unique customer deployments worldwide.

Session Objectives and Takeaways

• Objectives− Learn about the largest SQL Server and Azure SQL DB projects− See the techniques they use to scale− Fit for purpose− We divide the projects into on-premise and cloud and present a brief

architecture and numbers.

• Takeaways− SQL Server and Azure SQL DB will scale to handle very large projects− Both are enterprise ready− Very flexible architecture choices− DBAs more critical than ever

Top statistics – On Premises Category MetricLargest single database 350 TB

Largest table 1.5 trillion rows

Biggest total data 1 application 88 PB

Highest database transactions per second 1 server (from Perfmon)

130,000

Fastest I/O subsystem in production (SQLIO 64k buffer serial read test)

18 GB/sec

Fastest “real time” cube Millisecond latency

data load for 1TB 30 minutes*

Largest MOLAP cube 24 TB

Top statistics – Azure SQL DBCategory MetricLargest sharded database 20 TBLargest number of databases 1 app

14,000

Most concurrent users 1 app

3 M

Largest Deployment? *.cs = 14,832Web.config=149*.csproj=889*.sln = 210

Scaling techniques and features• Compression• Table Partitioning• Partitioned Views• Service Broker• File Group and File management• Caching• SODA – Services Oriented Database

Architecture• DDR – Data Dependent Routing

• Scale up or Scale out

On Premises Case Studies

On PremisesSQL Server Case Studies

DATA WAREHOUSE / BI• Building an 80TB data

warehouse• Very fast disk subsystem• Scale out AS and RS

Building an 80TB reporting system• A large electronics company wanted to

build a system to support 80TB of data for their reporting needs.

• The data is coming from sensors in their manufacturing line.

• They hold 36 months of data, one per filegroup, to manage compression and HA.

H/W Configurations

Storage

SAN

VMAX SE * 3 ea• Cache 128GB• 8Gbps FC * 16 Port• 300GB 15k * 88ea Raid-5 : U17TB for Data Area• 300GB 15k * 6ea : Hot Spare• 600GB 10k * 44ea Raid-5 : U17TB for BCV• 600GB 10k * 4ea : Hot Spare• Same Specs. For 3 VMAX SE• EMC Replication Manager, TimeFinder/Clone,

PowerPath

HP DL980 (Production)• MS SQL Server 2008 R2• Windows Server 2008 R2• 2 CPU * 10 Core = 20 Core• 512GB memory• 8Gbps FC * 20port

Production Server

Standby Server

24

20 20

PowerPath

HP DL980 (Standby)• MS SQL Server 2008 R2• Windows Server 2008 R2• 2 CPU * 10 Core = 20 Core• 512GB memory• 8Gbps FC * 20port

SAN Switch• DS5300B * 2ea• 8Gbps FC * 96port

Total

U17TB

BCV

U17TB

BCV

U17TB

BCV

VMAX SE Building Block

• VMAX-SE• 88 450GB 15K rpm Disks

−8 Hot Spares−80 “Data disks”

• Raid 5 3+1 (3RAID5)−20 RAID sets−Each “set” is a single 3+1 group

• 2,800 MB/sec target throughput

Hilton Hotels• Room forecasting system• Full suite of SQL products (SQL,

AS, IS, RS)• Scale out AS and RS

• If you need more capacity, just add another server

• Load Balanced Analysis Services reader machines

• 40 to 50 concurrent users per RS server

• Complex queries• Large data sets returned to many clients

Hilton Architecture

On Premises - SQL Server Case Studies

OLTP• Scale out one database• Scale up one database• No downtime allowed – ever• 16,000 + instances

Microsoft CTP Data Tier• The Commerce Transaction

Platform supports Billing and subscriptions (eCommerce) for Microsoft products such as adCenter, Xbox Live, Zune, Windows Live Hotmail Plus, and Azure.

• The Commerce Transaction Platform supports payments using 13 payment methods spanning 42 currencies across 65 localized markets.

• 5 DBAs

DB Infrastructure (PROD)• 2 Datacenters • 5 Webstore Clusters• 15 Replicas (Financial Reporting; Fraud etc)

• 220 SQL 2012 servers (no VM) • 736 databases (excluding Mirrors and Secondary's)• 121 TB of datafiles (excluding Mirrors and

Secondary's)• 420 TB of storage – DAS & SAN (EMC/HDS)• 12 TB monthly growth• 82 DB Mirror Pairs – in DC most asynch, some with

auto-failover• 70 Log Shipping pairs – cross DC• 400 Replication subscription streams, 6 distributors

Monitoring with SCOM

Large Health Care Provider

• Chose to scale up• 17 TB OLTP database• 10,000 concurrent user

connections• Replication for reporting

database• SAN hardware replication

used for Disaster Recovery

Primary Data Centre (Active)

Ap

plic

atio

n T

ow

er

1

Load Balancer

OLTPDB Server SQL2008

How do we scale the solution ?

• “Performance Engineering” is at the heart of our methodology

Private WAN

SQLReplication

ReportingDB Server SQL2008

CPU CPU

CPU CPU

• Scale DB by adding CPU’s MS support for large SMP hardware platforms is fundamental

• Segregation of OLTP and Reporting workloads this allows specific tuning of the workloads

• Scale Application layer by adding additional servers into the VIP

• World’s biggest publicly listed online gaming platform

• 15 million page views and up to 980,000 unique users a day

• Environment• 5 DBA’s & 1 Database Architect• 200+ SQL Server Instances• 150+ TB of data, • 4,000+ Databases• 2+ PB storage • 10+ TB RAM• 450,000+ SQL Statements per

second on a single server• 500+ Billion database

transactions per day• No downtime allowed

SQL Server 2012 – AlwaysOn Availability Groups

Hardware for main SQL Server• Fujitsu RX-600 S5• 64 Cores• 1TB RAM• Multiple 1 Gbit NIC interfaces

• Separate VLANs for Client Access, Cluster Intercom, Backup and Mirroring traffic

• External SAS Diskshelves for Data Files• Attached with 4x 8GB Fibre Channel

• FusionIO PCI-E Solid State Devices for Transaction Log

• Scale UP and Scale OUT

• Scale UP main financial transactions

• Scale OUT other application functions

• High Availability

• 3 data centers

• Synchronous Database Mirroring adds 1 ms per transaction• Backup 2 TB per hour over the network

• http://sqlcat.com/whitepapers/archive/2009/08/13/a-technical-case-study-fast-and-reliable-backup-and-restore-of-a-vldb-over-the-network.aspx

• Case study• http://www.microsoft.com/casestudies/Microsoft-SQL-Server-2012/b

win.party/Company-Cuts-Reporting-Time-by-up-to-99-Percent-to-3-Seconds-and-Boosts-Scalability/710000000087

How do these projects scale?

• Scale UP• Table Partitioning• NUMA • Worker Threads

• Scale Out• Table Partitioning• Partitioned Views• SODA• Data Dependent Routing

Scaling UP

• Windows Server 2012 limit is 640 Cores

• New concept: Kernel Groups• A bit like NUMA, but an extra layer in the

hierarchy

• SQL Server 2012 the limit is 640 cores

• 4 TB RAM, both Windows and SQL Server

And 256 Cores Looks Like This...

Scaling Out SQL Server• How do you:• Manage an 80 TB database• Back up a petabyte• Build an index on a trillion row table

• Answer−Break it into manageable size pieces

Scalability Feature: Table Partitioning

Scalability Feature: Local Partitioned View

Scalability Feature: Cross Database Partitioned View

PDW Appliance

Database Servers

Control Nodes

Active / Passive

Landing Zone

Backup Node

Storage Nodes

Spare Database Server

Du

al Fib

er

Ch

an

nel

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

SQL

Management Servers

Compute Rack Control Rack

Compute Nodes

SQL

SQL

SQL

Du

al In

fin

iban

d

Scalability Concept: SODA – Services Oriented Database Architecture

• Separate your data by business function− Example: HR, Payroll, Accounting, etc

• Or by user function− Example: Login, chat, email, pictures, etc

• Each function goes in a different database• A common database for shared tables• May replicate common tables to all

databases (to keep all joins local)

Azure Case Studies

Common Application Scalability Themes• Scale Out, Scale Out, Scale Out

• Front end and middle tier• Caching Layer• Messaging Layer

• When to Scale Out• At the start of the application• Difficult to scale out later, especially data layer

• How to auto-scale• Some built into Azure, extensible through:

• Code you write to monitor usage• Code you write to spin up new instances• Sample code available on Codeplex and GitHub

• Performance • Resource Reservation

Common Data Scalability Themes

• Scale Out, Scale Out, Scale Out• Database Size• Multi-Tenancy

• When to shard• At the start of the application• When your container nears the limit

• How to shard• Something with a near-even distribution• On a tenant key • Something that keeps the majority of queries on a single

node

Samsung ElectronicsWorldwide TV Management

36

• Smart TVs requires frequent updates with new apps and software changes for better support and compatibility

• Due to the high success of the Smart TV line (20m TVs live and counting), Samsung needs a more scalable and elastic system.

• Achieved 10x performance levels compared to On Premises system, from 500 web requests / sec to 5,000. Main optimizations:

• Dropped using ibatis, a .net library to access SQL. Now, use stored procedures and ado.net

• Used IIS cache for the lookup data. We have a web role that refreshes the cache every 5 minutes

• Moved heavy sequential log inserts to table storage• Moved from 20L to 10XL web roles, and we raised connection limit to 500 from

100 per web role. • Having XL, we have better bandwidth.

• Switched from wcf to mvc.• Host their SQL Azure DB in XL Resource Reservation to guarantee cores, IOPS

and threads for it.

Web Role – Firmware download System  

Administrator

Web Role – Firmware Upload System Website

  

Devices (Smart TV x N)

Device Validation Reporting Update Management

Administration

Blob Storage

Smart TV firmware

 

 SQL Azure

Device logs & update status

Worker Role

Task Automation – Firmware  encryption and batch update

 ASP .NET  HTML ASP .NET

User ID/Password

Single Tenant / No Federation

Single Tenant Single Tenant

The Solution/Architecture

CDN NODE…

CDN NODE 2 - TOKYO

CDN NODE 1 - LONDON

CDN CACHING SERVERS

Windows Azure CDN

MYOBBusiness Accounting

39

Company Profile

• Large Australian based ISV developing Accounting software for small businesses.

• User count in the thousands, coming from desktop era

MYOB Solution/Architecture• AccountRight Live is a rich client application that

communicates with a SQL database for each individual customer at the back end.  • The application is developed in C#/.NET using LINQ to SQL and Entity Framework• Databases on premises and Azure are kept in sync via Sync Framework

• A business logic layer sits in between that is running in Windows Azure cloud services via a number of small size web roles deployment.

• Each SQL database is exported on a daily basis and imported in a secondary datacentre using I/E Services to provide a DR environment. 

The Solution/Architecture

MYOB - Lessons Learned

• Cloud Platforms• Enable massive scalability• High Availability at lower costs• Expose rich cloud based APIs• Efficiencies enable Smart Clients deployment

• Sync Framework• A mature technology for bridging online/offline• Many physical and logical configurations achievable

• SQL Azure Resource Reservation in Premium SKU• Used it to provide the right level of resources to their 2 key databases

where all users connect upon login.

44

Workload Review: Cash ConvertersCompany Information

• Cash Converters is a worldwide pawn broking andpersonal finance franchise company

• 1,000+ stores in 19 countries; major presence inAustralia (home country), the UK, and Spain

Project Description• Store Point-Of-Sale (POS) built on Azure• Very low transaction volumes

• ~20 POS txns per minute per store• Requirement: No downtime during store hours

• Check-out transactions must go through• Up to 15 seconds is an acceptable response time

Evolved from on-premises in-store infrastructure

Implemented as ClickOnce application, backed by application and data services running in Azure PaaS - web and worker roles, storage, and SQL Azure DBs

Performs functions such as allowing customers to sell and purchase goods, or receive/transfer stock

45

Victoria (VIC)

Azure Subscription

Architecture: Cash Converters (Proposed)

Traffic Manager

. . . . . . Store N

On-premises

Windows ServiceAggregation

s

Monitoring

Store N-5

QLD Deployment (HK)

QLD DB (HK)

QLD Deployment (SGP)

QLD.trafficmgr.cloudapp.net

VIC Deployment (HK)

VIC DB (HK)

VIC Deployment (SGP)

Queensland (QLD)

Store 1 Store 2

VIC.trafficmgr.cloudapp.net

QLD DB (SGP)

GeoDR

. . . . . . . . . .

Failover

1

2 GeoDRVIC DB (SGP)3

4

46

Current/Next Steps…• Preparing application for multi-tenancy (in progress)

• Test the entire stack as a multi-tenanted application• Tooling for moving store data from one DB to another

• Intent is to migrate single-tenanted store data to multi-tenanted DBs• Multi-tenanted DBs to be initially put under RR and

GeoDR’ed• Gap here is the time to move a DB under RR - manual action

during preview. This is also required after a failover to the original target (now source) and re-seeding to a new target

Pottermore.comOnline Social Networking & Gaming

47

What is Pottermore?

Read, Explore and Re-discover the Story

Reflect and

Share your

experience

Play and participate in the Story

Buy and give the

Story

12

34

Pottermore Background• Harry Potter Online Experience• Why Cloud?

• Original Beta Launch July 2011 on-premises solution• 4.2M Page views < 2 minutes• 1M Registrations < 10 hours• Could not scale easily

• Target User Base• Unknown popularity, estimated ~20M users

• Since port to Azure - Open to all• 1 billion page views in first 2 weeks• ~15K Registrations/day• >5M signups• Peak ~84K concurrent users, now 25K• Silent Launch: on April 14, 2012 and for 2nd book in July, 2012

• Pottermore = Very Happy CTO “Overwhelmed by support from Microsoft”

Web *(Web Role)

App *(Web Role)

Distributed Cache(Worker Role) Cache Builder

(Worker Role)

Users Federated by UserID• Activities• Comments• Games (Wins)• Others….

Shared Feeds • Great Hall• Common Room • Others…

Cache Builder• OnStart: Content• OnTime – Shared Feeds

• 30 to 120 seconds

Stateless application serversserving to web site…

Email Sender(Worker Role)

Job Scheduler(Worker Role)

Email Proxy

RMF

Database (Sharded SQL)

(*) Web and App Server hosted together – different App Pools

All User Data• Profile• Friends• Inventory• Games• Others…

Local Cache

External SystemOn-Premise

Read/Write

Write Only

Read/Write

Content• Taxonomy• Resources

<user_id>Session State

(Cookie)

Pottermore Architecture

Pottermore - Lessons Learned• Azure Platform

• Enables scalability of Azure services through online provisioning

• High Availability through application partitioning• Compose services for scale-out• Caching to enable fast response & resiliency

• Telemetry• Leverage the platform Windows Azure Diagnostics• Service Health: built-in logging and diagnostics roll-up

Scale-out SQL Databases• Sharding is key• Partition data for different uses: User Profile, Activities

and Email Tasks

Azure Projects – Lessons Learned

52AZ-302-A| Largest Azure Projects

Lessons Learned• Shared environment

• Shared resources – query tuning• Engine throttling - Transient Fault Handling

• Database Scale-up is limited by• Commodity machines• 1GB to 150GB SKUs

• Database Scale-out is unlimited by• Elastic scale through simple provisioning• 150 databases per server (master + 149 user

databases)• Sharding app pattern is key (SQL Azure

Federations)• Take 1 X 50GB database and shard it into 50 X1GB

databases, 50 X HW resources, for same price

Developer Network

Resources for Developers

http://msdn.microsoft.com/en-au/

Learning

Virtual Academy

http://www.microsoftvirtualacademy.com/

TechNet

Resources

Sessions on Demand

http://channel9.msdn.com/Events/TechEd/Australia/2013

Resources for IT Professionals

http://technet.microsoft.com/en-au/

Track Resources • Download the CTP for SQL Server 2014 and accelerate your queries

using In-Memory OLTP - http://technet.microsoft.com/en-us/evalcenter/dn205290.aspx

• Get into the cloud with an Azure account - use SQL database in Windows Azure or take your workload into Azure VM - www.windowsazure.com

• Get big with big data – HDInsight on Azure and grab the latest Power BI featureshttp://www.windowsazure.com/en-us/documentation/services/hdinsight/?fb=en-us

Power BI - www.powerbi.com

© 2013 Microsoft Corporation. All rights reserved. Microsoft, Windows and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.