microservices: organizing large teams for rapid delivery
TRANSCRIPT
Microservices: Organizing a Large Team for Rapid Delivery
By Jason Goth, Micah Blalock, Patricia Anderson
@jgothtx, @micah_blalock, @patricia_sooner
About Us
Jason Goth
@jgothtx
Micah Blalock
@micah_blalock
Patricia Anderson
@patricia_sooner
Microservices are hard
Managing teams developing microservices is much harder
Objectives
• Share lessons learned on a few real world microservices projects:
  • Several customer projects
  • One internal project
• Topics:
  • Organizational structure for rapid delivery
  • Design/Development practices to support these organizations
  • Communication across multiple teams
Case Study … Custom Analytics Platform
[Timeline, 2014 to 2016 (months marked: Aug, Oct): “We are here in the story”]
The Problem
• Suite of applications:
  • Analytics for hospitals
  • Multiple products/users
  • Products share core services (e.g. how much does X cost? how many of these did I use?)
• Multiple product owners
• Billions of rows of legacy data
• Huge roadmap of features and functionality (that they’d already promised…)
[Diagram labels: Lots of Legacy Data; Product Owner (x2); Customers (x3); App (x3); Shared Stuff; System]
They Promised This
[Timeline, 2014 to 2016 (months marked: Aug, Oct): “We are still here in the story”]
[Diagram labels: Product 1, Product 2, Product 3, Product 4, And So On]
We Decided To Do Something Like This
• Reuse of services?
• Add applications easily?
• Scale out forever?
• Cool factor?
• Multiple teams can deliver products and features super fast? Umm, we’ll come back to that...
[Architecture diagram: Clients; Gateway; App 1 to App 4; Svc 1 to Svc N; Service Registry; Configuration Server; data stores: Oracle (x2), Redis, Elasticsearch]
Let’s Use Spring!
[Same architecture diagram, annotated with the Spring projects used: Spring Boot, Spring Framework, Spring Security, Spring Data, Spring Batch, Spring AMQP, Spring Cloud]
It Worked Great!
• Built the first application in the suite and the first few services in a few months
• Spring-* provided a lot of the plumbing
• We had one team (8-10 people) for the application and the services
• Big success
[Same architecture diagram, labeled with one Dev Team]
This Is Easy … Let’s Do Another One
[Timeline, 2014 to 2016 (months marked: Aug, Oct, Mar, Apr): Start; Big Success; Phase 2 Starts]
It Worked Ok …
• Second team (8-10 people) brought on board to build the next application in the suite
• Did it all in 3 months this time
• Another big(-ish) success
• Teams still did their own thing
• More friction, but manageable
[Same architecture diagram: Clients; Gateway; Apps; Services; Service Registry; Configuration Server; data stores]
Let’s Do 2 More Apps … in Only 3 Months
[Timeline, 2014 to 2016 (months marked: Aug, Oct, Mar, Jun, July): Start; Big Success; Big(-ish) Success; “Phase 3” Starts]
We Just Need To Add Some People, Right?
• Continue trying to scale out as before:
  • Add new teams for new features and services
  • Letting teams do their own thing
[Same architecture diagram: Clients; Gateway; Apps; Services; Service Registry; Configuration Server; data stores]
Two Months In …
[Timeline, 2014 to 2016 (months marked: Aug, Oct, Mar, Jun, Sep): Start; Big Success; Big(-ish) Success; “We are here”]
It All Came to a Screeching Halt
Why?
It Really Looked Like This
• Dependency hell:
  • Each feature changed many services
  • Changes cascaded everywhere
  • Teams were stepping on each other
• Versioning helped; but versioning is its own special hell
[Architecture diagram, redrawn to show how it really looked: Clients; Gateway; App 1 to App 4; Svc 1 to Svc N; Service Registry; Configuration Server; data stores: Oracle (x2), Redis, Elasticsearch]
Team Health Suffering
• Unequal workloads
• Training issues:
  • Unfamiliar systems
  • Not always the right skillset
• We had sooooo many meetings:
  • Many to coordinate with each other
  • Up to 50% of people’s time
Some Things Outside Our Control
• “Legacy” processes added friction:
  • “Architecture Review Board”
  • “Change Control Board”
• Shared environments
  • QA, Load Testing, etc.
• Customer doesn’t use any cloud services :(
Isn’t this supposed to be faster?
It can be faster, if you align the solution and the organization
Main Point of This Talk
Alignment
• Conway’s Law is the starting point
• When each team “owns” their own services
  • Easy to change
  • Coordination costs low
• Sharing exposes stress points, coordination costs increase:
  • Adding features to existing services
  • Cascading changes to service contracts
• Coordination costs + partitioning of work = efficiency of change.
[Diagram: six services, with a stress point marked]
Partitioning … Time vs Workers
[Figure: three plots of time versus number of workers, for partitioned tasks, non-partitioned tasks, and partitioned tasks with high coordination]
Source: The Mythical Man-Month, Frederick P. Brooks, Jr., 1995
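One way to put a rough number on the high-coordination case, in the spirit of Brooks: if everyone on a team has to coordinate with everyone else, the number of pairwise communication paths grows as n(n-1)/2. The sketch below is only an illustration. The class and method names are made up for this example, and real coordination cost depends on much more than headcount.

```java
// Hypothetical helper: counts pairwise communication paths in a group.
final class CoordinationCost {
    private CoordinationCost() {
    }

    // Number of distinct pairs among n people: n(n-1)/2.
    static int communicationPaths(int people) {
        if (people < 0) {
            throw new IllegalArgumentException("people must be non-negative");
        }
        return people * (people - 1) / 2;
    }

    public static void main(String[] args) {
        System.out.println("10-person team: " + communicationPaths(10) + " paths");
        System.out.println("4-person team: " + communicationPaths(4) + " paths");
    }
}
```

For the team sizes mentioned in this talk, that works out to 45 paths for a 10-person team versus 6 for a 4-person team.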
Better Partitioning
• Refactor to create surface area
  • Similar to the open/closed principle
• Package code by features not layers
  • Add packages for new features
  • Grow code horizontally not vertically
// This is your clue to refactor
if (purchaseType.equals("Lab")) {
    savings = getLabSavings();
} else if (purchaseType.equals("Rx")) {
    savings = getRxSavings();
} else ...
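A minimal sketch of one possible refactoring, assuming each purchase type can be handled behind a small interface. Only the "Lab" and "Rx" type names come from the slide; SavingsCalculator, LabSavings, RxSavings, SavingsRegistry, and the amounts are hypothetical placeholders. The idea is that a new purchase type is added by registering a new class, not by editing the if/else chain.

```java
import java.math.BigDecimal;
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface: one implementation per purchase type.
interface SavingsCalculator {
    BigDecimal savings();
}

// Each feature lives in its own class (and could live in its own package).
final class LabSavings implements SavingsCalculator {
    @Override
    public BigDecimal savings() {
        return new BigDecimal("12.50"); // placeholder value, not real data
    }
}

final class RxSavings implements SavingsCalculator {
    @Override
    public BigDecimal savings() {
        return new BigDecimal("3.75"); // placeholder value, not real data
    }
}

// Looks up the calculator by purchase type instead of branching on it.
final class SavingsRegistry {
    private final Map<String, SavingsCalculator> calculators = new HashMap<>();

    void register(String purchaseType, SavingsCalculator calculator) {
        calculators.put(purchaseType, calculator);
    }

    BigDecimal savingsFor(String purchaseType) {
        SavingsCalculator calculator = calculators.get(purchaseType);
        if (calculator == null) {
            throw new IllegalArgumentException("Unknown purchase type: " + purchaseType);
        }
        return calculator.savings();
    }

    public static void main(String[] args) {
        SavingsRegistry registry = new SavingsRegistry();
        registry.register("Lab", new LabSavings());
        registry.register("Rx", new RxSavings());
        System.out.println(registry.savingsFor("Rx"));
    }
}
```

With this shape, two teams adding two different purchase types touch two different classes, which is the kind of extra surface area the slide is asking for.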
Handling Cascading Changes
• Remove semantic coupling
  • Shared concepts, data types or domain entities
  • Behavioral dependencies
• The more effects and side-effects a service has, the tighter the coupling with its consumers
See: http://www.michaelnygard.com/blog/2015/04/the-perils-of-semantic-coupling/
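One hedged illustration of loosening this kind of coupling: a consuming service keeps its own narrow view of a shared concept and reads only the fields it needs, so unrelated upstream changes are less likely to cascade into it. The names below (SpendCustomer, the "id" field, the map-shaped payload) are hypothetical and not taken from the project.

```java
import java.util.Map;

// Hypothetical: the Spend service's own, narrow view of a customer.
final class SpendCustomer {
    private final String id;

    private SpendCustomer(String id) {
        this.id = id;
    }

    String id() {
        return id;
    }

    // Reads only the field this service depends on and ignores the rest,
    // so new or renamed fields it never uses do not force a change here.
    static SpendCustomer fromPayload(Map<String, Object> payload) {
        Object id = payload.get("id");
        if (id == null) {
            throw new IllegalArgumentException("payload has no id");
        }
        return new SpendCustomer(id.toString());
    }
}
```

The trade-off is some duplication: each service that cares about customers carries its own small type instead of importing a shared entity.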
[Same architecture diagram: Clients; Gateway; Apps; Services; Service Registry; Configuration Server; data stores]
Example ... Semantic Coupling
[Diagram labels: Invoices; Customer; Spend; Finance App (x2)]
Eliminate Shared Dependencies
• Warning architecture purists … you will not like this…
• Shared dependencies increase coordination costs:
  • Some redundancies are ok
  • GARY (Go Ahead, Repeat Yourself)
Example … GARY
// Does every service need to include a shared component (JAR) to validate plan types?
import java.util.Arrays;

private static String[] PLAN_TYPES = {"HMO", "PPO"};

public static boolean isValidPlanType(String planType) {
    return Arrays.stream(PLAN_TYPES).anyMatch(planType::equals);
}
What if different services accept different types of plans? Are all services using Java 8?
Example … GARY
• Why not this?
// Service 1
if (planType.equals("HMO") || planType.equals("PPO")) {
    return true;
}
// Service 2
if (planType.equals("HMO")) {
    return true;
}
Great discussion here: http://blog.cognitect.com/blog/2016/6/16/the-new-normal-team-scale-autonomy
Your design will evolve constantly … refactor ruthlessly
Ok, now let’s address the team problems
Reorganizing Teams
• Refactored design allows us to organize around customer needs (applications/features):
  • Single Product Owner
  • Few overlapping changes
  • Much lower coupling
• Split into smaller teams (~4 people)
• Focused on the “full stack”… everyone you need
[Same architecture diagram: Clients; Gateway; Apps; Services; Service Registry; Configuration Server; data stores]
Addressing Our Training Problems
• Put some standards in place:
  • Just enough of them
  • Familiarity when switching between components
• No onerous review/governance process:
  • That’s an organizational dependency
  • Trust people to follow standards
• Train team on all components and standards:
  • Sample code/generators/snippets
  • “Real” documentation, not shelf-ware
Addressing Our Workload Problems
• Warning Scrum Masters ... you will not like this…
• We shift our resources all the time
• Our standards and consistency make this easy to do
• Do you want to predict velocity or do you want to have velocity?
• Over time it averages out
[Same architecture diagram: Clients; Gateway; Apps; Services; Service Registry; Configuration Server; data stores]
Addressing Our Meeting Problems
• Elect one person per team as “Team Lead”:
  • Could be anyone
  • Coordinate any cross-team questions
  • Facilitate inter-team questions
• External Team Communication:
  • Designate someone as the sacrificial offering to the Legacy Process Overlords
  • You may need more than one person
  • Let everyone else work!
[Image label: Legacy Process Overlord]
So Where Are We In The Story?
[Timeline, 2014 to 2016 (months marked: Aug, Oct, Mar, Jun, Sep, Feb): Start; Big Success; Big Success; Things Stopped; “We are here” (lots of refactoring, documentation, reorg, etc.)]
We Started Gaining Momentum
• Teams working (mostly) independently
• Meeting frequency greatly reduced
• Not as much cross-team coordination
• Legacy Process Overlords appeased
And Now We Are Here
[Timeline, 2014 to 2016 (months marked: Aug, Oct, Mar, Jun, Sep, Feb, Jun): Start; Big Success; Big(-ish) Success; Things Stopped; lots of refactoring, documentation, reorg, etc.; Back on Track; Tons Accomplished; Huge Success]
4 new applications in the suite in just over 4 months
80+ enhancements to existing applications
Have a Need for Speed?
So, have a need for speed?
Summary
• Align solution and the organization
• Improve partitioning
• Reduce coupling and dependencies
• GARY
• Create standalone teams with singular focus
• Have “just enough” standards/process
• Be flexible; it’s ok to move people around
• Have people responsible for communications
• Others? We’d love to hear from you
Learn More. Stay Connected.
@springcentral, spring.io/blog
@pivotal, pivotal.io/blog
@pivotalcf, http://engineering.pivotal.io