benefiting from a shared test system · 2017-07-19 · paypal in a box – stage2 2010 560 services...

Benefiting From a Shared Test System

Lakshminarayanan Vasudevan


PayPal Production Footprint

©2015 PayPal Inc. Confidential and proprietary. 3

• Operate our own datacenters in 3 locations

• Embarking on Public Cloud strategy

• Cloud environment is approximately 150,000 VMs

• Production environment is a 60/40 split between OpenStack & VMware

• 20K payments processed per minute

• 1.9 million web hits per minute

• Frameworks: Java, Node, & C++

• 2800+ services

The Problem

• Complex and excessive dependencies

• Rapidly growing code base

• 3 Payment stacks handling PayPal transactions

• Slow release cycles

• Inordinate amount of time required for test prep

How can a developer effectively operate in this scenario?

Silo Test Environments

PayPal in a Box – Stage2

• 2002: 30 Services, 1 MLOC

• 2010: 560 Services, 15 MLOC

• 2014: 1400 Services, 50 MLOC

• 2017: 2800 Services, 100 MLOC

• In 2015 there were 4100+ stage2s

• Hardware cost for 2014 was 20 million

• 2017 planned expenditure was 30 million

Hardware was NOT the largest cost!

Challenges with Silo Test Environments

[Workflow diagram: Code; Deploy on Stage2; Learn, Debug & Troubleshoot Failing Services; Up Rev Dependent Services + DB Schema; Test; Keep your Dependent Services UP; Secure Stage2; Configure Stage2]

30% - 50% of time wasted every sprint maintaining stages:

• Deploying all the components

• Managing environment

• Identifying transitive dependencies

• Test topology is not the same as prod

• Unhappy engineers

• Unproductive engineers

• Poor quality

• Longer TTM

• Integration testing was a nightmare


The Solution – Managed Stage

Our answer to the stage2 problem is Managed Stage:

• A Production Like Environment

• Cluster of machines running ALL services

• Ability to scale each service based on traffic

• Zero code drift, code refresh in minutes

• Easy to test against

• Centrally Managed


PayPal’s Shared and Integrated Test Environment

Managed Stage Architecture

[Architecture diagram: CC; one active and two standby Mesos masters; one active and two standby Aurora schedulers; DB; Mesos slaves hosting the Router, Front, Mid, Back, and other pools; Zookeeper nodes 1, 2, and 3]

Managed Stage Environments

[Diagram: three Managed Stage environments, MSMaster (Live), MS Release (N + 1), and MS (LnP), each drawn as three groups of components labeled N, R, S, H, C]

PDLC Using Managed Stage

Code

Deploy Your Service On User Stage

Test

o Happy engineers

o More time to

• Build

• Ship

• Think

• Play

Self-Service User Stage


User Stage A1, C1

Ex. Testing A1, C1

Flow Example: A -> B -> C -> D

Bidirectional routing via PPFE & haproxy

[Diagram: User Stage services A’ and C’; Managed Stage services A, B, C, and D; a DB; PPFE; groups of components labeled N, R, S, H, C]
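The slide's flow example (A -> B -> C -> D, with A and C under test on a User Stage) can be read as a per-service routing decision. The sketch below is only an illustration of that idea in Python: the names `resolve_route`, `trace_flow`, `MANAGED_STAGE`, and `USER_STAGE` are hypothetical, and the real routing is done by PPFE and haproxy, whose configuration the slides do not show.

```python
from typing import Dict, List

# Hypothetical labels for the two environments named on the slide.
MANAGED_STAGE = "managed-stage"
USER_STAGE = "user-stage"


def resolve_route(service: str, overrides: Dict[str, str]) -> str:
    """Pick the instance that serves `service` for one test session.

    `overrides` maps a service name to the build deployed on the User Stage.
    Anything not overridden falls through to the shared Managed Stage.
    """
    if service in overrides:
        return f"{USER_STAGE}/{overrides[service]}"
    return f"{MANAGED_STAGE}/{service}"


def trace_flow(flow: List[str], overrides: Dict[str, str]) -> List[str]:
    """Resolve every hop of a call chain such as A -> B -> C -> D."""
    return [resolve_route(service, overrides) for service in flow]


# The developer has deployed A' and C' on a User Stage; B and D are shared.
overrides = {"A": "A'", "C": "C'"}
path = trace_flow(["A", "B", "C", "D"], overrides)
print(" -> ".join(path))
# user-stage/A' -> managed-stage/B -> user-stage/C' -> managed-stage/D
```

The hop from B back to C' is why the slide calls the routing bidirectional: traffic has to leave the User Stage, reach the Managed Stage, and return.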


Developer Transformation

• Paradigm shift from silo testing to shared environment:

• Modified test frameworks and test cases

• New patterns for test execution and triage

• Education of PD teams to leverage centralized logging and monitoring

• Need to move away from anti-patterns

o Configurations in code

o Hard coded dependencies

o Custom test configurations; assumes silo test environment

o Flawed deployment patterns

• Move away from stage2 testing
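One way to picture the "hard coded dependencies" and "configurations in code" anti-patterns, and a way out of them, is sketched below in Python. Everything here is an assumption made for illustration: the host names, the `<NAME>_URL` environment-variable convention, and the `dependency_url` helper are hypothetical and do not come from the talk.

```python
import os

# Anti-pattern: an endpoint baked into the code. It only works on the one
# silo stage it was written for. (Hypothetical host name.)
HARD_CODED_RISK_URL = "http://my-stage2-box.example.internal:8080/risk"


def dependency_url(name: str,
                   default_host: str = "managed-stage.example.internal") -> str:
    """Resolve a dependency endpoint from the environment.

    Looks up `<NAME>_URL` (an assumed convention) and falls back to the
    shared environment, so the same test code runs unchanged anywhere.
    """
    env_key = f"{name.upper()}_URL"
    return os.environ.get(env_key, f"http://{default_host}/{name}")


# The test asks for the dependency by name instead of embedding a host.
print(dependency_url("risk"))
```

With the endpoint resolved at run time, pointing a test at a User Stage build becomes a deployment setting rather than a code change.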

Core Operating Principles

• Irrational optimism - whatever it takes to make PD teams successful

• Drive transformation - silo environment to an integrated and shared environment

• Engineering solutions to fix the problem for good - DRY - Automate everything, No-SSH policy

• Small incremental changes - Contain risk, quick restoration

• Restore first – rollback, wire off

• Operational Excellence

Five parts to the puzzle

Monitoring & Alerting

Self Healing

Empower Customer (Self Service)

Continuous Deployment

Standard Operating Procedures (SOP)

January 2016 Availability

January 2017 Availability

What Did We Gain

• Eliminated stage2 hardware cost (Stage2 count: 744 as compared to 4100)

• Improved developer productivity by 30%

• Improved application stability index

• Gained visibility into test case execution gaps

• Test case quality

• Created a path for improving engineering hygiene and product quality
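As a quick check on the stage2 figures, the short Python calculation below works out the reduction implied by the two counts. It assumes the earlier "4100+" figure is taken as exactly 4100, so the real reduction may be slightly larger.

```python
stage2_before = 4100  # count reported for 2015 (stated as "4100+")
stage2_after = 744    # count reported on this slide

retired = stage2_before - stage2_after
reduction_pct = round(100 * retired / stage2_before, 1)

print(retired)        # 3356
print(reduction_pct)  # 81.9
```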

Learnings

• Managed Stage availability issues impact ALL PayPal development teams

• Foundation changes require significant cultural shifts

• Cultural inertia was/is a persistent challenge

• Illusion of control

• Success requires tremendous tenacity and absolute resolve

• Test Environment is a direct reflection of Engineering hygiene

• Standardized automated operations are a MUST

o Monitoring

o Alerting

o Self healing

• Significant investment needed for education
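The monitoring, alerting, and self-healing items above fit together as a single loop. The Python sketch below shows one pass of such a loop under stated assumptions: `heal_once`, the callables it takes, and the service names are all hypothetical, and the slides do not describe how PayPal implemented this.

```python
from typing import Callable, Dict, List


def heal_once(health_checks: Dict[str, Callable[[], bool]],
              restart: Callable[[str], None],
              alert: Callable[[str], None]) -> List[str]:
    """Run one pass: check each service, restart the unhealthy ones,
    and alert only when a restart did not bring the service back."""
    restarted = []
    for name, is_healthy in health_checks.items():
        if is_healthy():
            continue              # monitoring: nothing to do
        restart(name)             # self healing: try the automated fix
        restarted.append(name)
        if not is_healthy():
            alert(name)           # alerting: escalate to a human
    return restarted


# Simulated environment: a restart fixes "mid-pool" but not "db-proxy".
state = {"router": True, "mid-pool": False, "db-proxy": False}
alerts: List[str] = []


def fake_restart(name: str) -> None:
    if name == "mid-pool":
        state[name] = True


checks = {name: (lambda n=name: state[n]) for name in state}
restarted = heal_once(checks, fake_restart, alerts.append)
print(restarted)  # ['mid-pool', 'db-proxy']
print(alerts)     # ['db-proxy']
```

Alerting only after the automated fix fails keeps humans out of the routine cases, which matters when one environment serves every development team.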

Developer Productivity is a Continuous Journey

• Altus – internally developed PaaS platform

• Docker

• ECD

• Parallel test execution

• MMI – Mother May I

• Production Auto-remediation

• Quality Guardrails

Q & A

www.modsummit.com

www.developersummit.com