WTF is a Microservice - Rafael Schloming, Datawire

Posted on 18-Feb-2017

TRANSCRIPT

WTF is a microservice?
Rafael Schloming
Co-Founder & Chief Architect
datawire.io

History
  Datawire
    - Founded in 2014
    - Focused on microservices
  Me
    - Lots of distributed systems experience
    - Starting from zero with microservices

What is a microservice?
  Wikipedia:
    - ...no industry consensus
    - ...implementation approach for SoA
    - ...processes that communicate with each other to fulfill a goal
    - ...Naturally enforces a modular structure...
  Everything else:
    - Volumes of essays: good, bad, and ugly...

Three aspects of Microservices
  - Technology
  - Process
  - People

From Three Sources
  - Experts
  - Bootstrapping
  - Migrating

Starting Point
  Technical: An application composed of a network of small services
    - Building your application from microservices forces you to create clear boundaries, better abstractions, ...
  Process: ???
  People: ???

The Expert Source
  - Read just about every firsthand story out there
  - Went to conferences
  - Talked to everyone we could
  - Started the practitioner summit
  And armed with a little bit of knowledge, we started filling in our picture

People Picture
  Developer Happiness/Tooling/Platform Team
    - Builds the infrastructure
  Service teams
    - Build the features

Technical Picture (Reference Architecture)
  Control Plane
    - Service Discovery
    - Logging + Metrics
    - Configuration
    - Smart Endpoints
  Traffic Layer
    - HTTP
    - RPC
    - Messaging

First Picture
  Technical:
    - A network of small services
    - Connected via a control plane and traffic layer
  Process: ???
  People: Platform team and service teams

The Bootstrap Perspective
  Five engineers building an out of the box control plane...
  Ingest interesting application level events:
    - start, stop, heartbeat events
    - log messages
  Store them in an appropriate piece of infrastructure:
    - Service registry
    - Log store
  Transform and Present:
    - Realtime view of: routing table, service health
    - Historic view of: request traces, ...

Ubiquitous Data Processing Pipeline
  Ingest -> Source of Truth -> Transform -> Present
  Template for many data driven businesses

V1: Started with Discovery
  Requirements:
    - highly available
    - low throughput
    - low latency
    - low operational complexity
    - able to survive a complete restart
    - capable of handling spikes
  Initial Choices:
    - vert.x + hazelcast
    - websockets
    - smart clients
    - auth0 + python shim
  Total Services: 2

V2: Added Tracing (PoC)
  Requirements:
    - high throughput
    - highish latency ok
    - cannot impact application
  Initial choices:
    - vert.x, hazelcast (only retained transient buffer of last 1000 log messages)
    - websockets
    - smart circular buffer minimized impact on application
  Total Services: 3

V3: Added Persistence for Tracing
  Requirements:
    - keep extended history
    - provide full text search
    - filtering, sorting, etc
  Initial Choices:
    - elasticsearch for storage/search
    - query service
  Total Services: 4

First hint of pain...
  Rerouting data pathways:
    - touched multiple services
    - coupled changes
  Poor local dev experience:
    - manually fire up and wire the whole fabric
  Slow deployment pipeline:
    - bunched up changes
  All this resulted in a big scary cutover

V4: Adding Persistence for Discovery
  Requirements:
    - track errors associated with particular service nodes
    - store routing strategies
  Initial Choices:
    - postgres (RDS) for persistence
  Yet another big cutover: enough is enough!
  Let's fix our tooling once and for all...

Deployment Requirements
  Stuff we had tried:
    - Deliver everything as a docker image
      - Still too much wiring to bootstrap the system
    - Use kubernetes for everything
      - Nice dev experience with minikube, but we use amazon services
  Need to meet both dev & operational requirements
    - Fast dev cycle
    - Good visibility
    - Fast rollback
    - Ability to leverage commodity services

Deployment Redesign
  - Complete system definition in git
    - Contains all the information necessary to bootstrap the system from scratch in all of its operating environments
  - System definition is well factored with respect to its environments
    - Abstract definition: my service needs postgres and redis
    - Development: service -> docker image, postgres -> docker image, redis -> docker image
      - Use minikube to run the whole system
    - Test:
    - Production: service -> docker image, postgres -> RDS, redis -> elasticache
      - Kubernetes cluster for stateless services
  - Tooling caters to the needs of each environment
    - Development: fast feedback cycle
    - Test: repeatable environments
    - Production: quick and safe updates/rollbacks
  - Tooling helps maintain environment parity

DevOps?
  - DevOps is presented as a solution to an organizational problem, but we all sat in the same room
  - We were thinking about operational factors from day one:
    - throughput, latency, availability
    - building a service, not a server
  - This forced us to follow an incremental process:
    - tooling for this process was inadequate
    - when we thought about the process it helped us figure out the tooling

Process: Architecture vs Development (SoA vs SoD)
  Systems (their shape in particular) are traditionally architected
  Architecture
    - lots of up front thinking
    - slow feedback cycle
  Development
    - frequent small changes
    - quick feedback cycle
    - measure the impact at every step
  Microservices are about enabling a developmental methodology for systems

Methodology for Developing Systems
  Principles
    - small frequent changes
    - rapid feedback and good visibility
  Applied to codebases:
    - Tooling for rapid feedback: compilers, incremental builds, test suites
    - Tooling for good visibility: printf, logging, debuggers, profilers
  Applied to systems:
    - Key characteristics go beyond just logic and correctness
    - Performance within specified tolerance of the running system is a critical feature
  Tests don't cut it anymore...

Update the Dev Cycle
  Tests assess impact on correctness...
    - Build -> Test -> Deploy
  We need a way to assess impact on the system
    - Build -> Test -> Assess Impact -> Deploy
  How do you measure system level impact?
    - Measure impact against defined Service Level Objectives (SLOs): throughput, latency, and availability (error rate)

Back to the Experts...
  - Canary Testing
  - Circuit Breakers
  - Dark Launching
  - Tracing
  - Metrics
  - Deployment
  All ways to enable the dev cycle for running systems:
    - make small frequent changes
    - measure the impact on the running system
    - provide good visibility

Second Picture
  Technical:
    - A network of small services
    - Scaffolding to safely enable small frequent changes
  Process:
    - Service oriented Development
    - Small frequent changes with good visibility and feedback
  People: Platform team and service teams

The Migration Perspective
  Variety of stages...
    - Monolith: django, rails, ...
    - Monolith++: mothership + several little ducklings
    - SoA-ish: small flock of services (maybe 5-10)
    - Inbetweeners
  Some moving really slowly...
    - Months to create just one microservice
  Some moving much faster
    - What's the difference?

Migration is about people
  Starting point: team vs tech
    - Picking a tech stack for the entire eng org to adopt is slow: lots of organizational friction
    - Replatforming/refactoring an entire existing monolith is slow: lots of organizational and orchestrational friction
    - Creating a relatively autonomous team to tackle a particular problem in the form of a service
  Growing pains: stability vs progress
    - some orgs hit a sticking point, some didn't

The People Picture: Dividing up the Work
  The work has two aspects:
    - build the features (dev)
    - keep the system running (ops)
  You can't usefully divide up the work along these lines:
    - new features are the biggest source of instability (bugs)
    - separate roles create misaligned incentives (devops)
    - yet a big part of the work is keeping things running
  Microservices is about how to go about dividing up work:
    - break the big app into smaller ones
    - divide operational responsibility in a way that aligns incentives

Third Picture
  Technical:
    - A network of small services
    - Scaffolding to quickly and safely enable small frequent changes
  Process:
    - Service oriented Development
    - Small frequent changes with good visibility and feedback
  People:
    - Dividing up the work
    - Service teams deliver features to users
    - Platform team supports service teams

The Hard Way
  1. Start with Tech
  2. Reverse Engineer The Process + People
  3. Make lots of mistakes along the way
  4. Learn from them

The Easy Way
  1. Understand the principles of People and Process
  2. Use this as a framework to
     a. pick tech that fits
     b. learn from other people's mistakes

Microservices Cheat Sheet (What, Why & How)
  People
    - Microservices are a way to divide the work of building a cloud application
    - This work falls into two categories:
      - Keep the system running (ops)
      - Build new features (dev)
    - Dividing work along these categories creates conflicting incentives between progress and stability. New features from dev eventually become the biggest source of instability for ops.
    - Unifying these roles (devops) allows you to minimize the tradeoff between progress and stability, but you now need to divide up the work by dividing up the app. This results in a network of services.
    - Give your dev teams operational responsibility!
    - Define service level objectives & agreements for each service:
      - SLOs: throughput, latency, availability
      - SLAs: what happens when these aren't met
    - Commoditize common operational overhead.
  Process
    - Microservices are built from a process of frequent small changes with rapid feedback and good visibility
    - This is the application of the traditional dev cycle to systems rather than codebases, and for it to work, key system properties must become first-class features for developers.
    - Extend the dev cycle to include a stage to assess the impact on key system properties (SLOs)
      - Build -> Test -> Deploy
      - Build -> Test -> Assess Impact -> Deploy
  Technology
    - Microservices are an application that is made up of a network of small services
    - This requires dev tooling to support quickly and safely assessing system impact.
    - This requires fast deployment tooling and good visibility into key system level properties:
      - Throughput
      - Latency
      - Availability (error-rate)
    - Depending on your system, this may require tooling for:
      - Fancy request routing (for canary testing, dark launching)
    - Start with a fast deployment pipeline that incorporates basic system level metrics and monitoring for each service.

Questions?

DevOps: you can't split the work (along these lines)
  [Diagram: Dev, Ops, and User roles]

Features are the largest source of bugs
  [Diagram: Dev, Ops, and User roles]

Microservices: Divide the work by dividing the app
  [Diagram: Dev, Infra, Ops, and User roles]

Dividing up Work
  [Diagram: Dev, Infra, Ops, and User roles]
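
The "Assess Impact" stage of the extended dev cycle (Build -> Test -> Assess Impact -> Deploy) might be sketched as a simple gate that compares measured throughput, latency, and error rate against a service's SLOs. This is only an illustrative sketch and not tooling described in the talk: the names SLO, Measurement, assess_impact, and should_deploy, along with every threshold value, are hypothetical, and in practice the measurements would come from a canary or similar live traffic sample.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SLO:
    """Service level objectives for one service (hypothetical field names)."""
    min_throughput_rps: float  # requests per second the service must sustain
    max_latency_ms: float      # latency ceiling, e.g. for a chosen percentile
    max_error_rate: float      # allowed fraction of failed requests (0.0 to 1.0)


@dataclass(frozen=True)
class Measurement:
    """Metrics observed for a candidate build, e.g. from a canary."""
    throughput_rps: float
    latency_ms: float
    error_rate: float


def assess_impact(slo: SLO, observed: Measurement) -> List[str]:
    """Return a description of each violated SLO; empty means none violated."""
    violations = []
    if observed.throughput_rps < slo.min_throughput_rps:
        violations.append(
            f"throughput {observed.throughput_rps} rps is below "
            f"{slo.min_throughput_rps} rps"
        )
    if observed.latency_ms > slo.max_latency_ms:
        violations.append(
            f"latency {observed.latency_ms} ms exceeds {slo.max_latency_ms} ms"
        )
    if observed.error_rate > slo.max_error_rate:
        violations.append(
            f"error rate {observed.error_rate} exceeds {slo.max_error_rate}"
        )
    return violations


def should_deploy(slo: SLO, observed: Measurement) -> bool:
    """Gate between Test and Deploy: proceed only if no SLO is violated."""
    return not assess_impact(slo, observed)


if __name__ == "__main__":
    # Made-up numbers, purely for illustration.
    slo = SLO(min_throughput_rps=100.0, max_latency_ms=250.0, max_error_rate=0.01)
    canary = Measurement(throughput_rps=80.0, latency_ms=300.0, error_rate=0.002)
    for problem in assess_impact(slo, canary):
        print("SLO violation:", problem)
    print("deploy" if should_deploy(slo, canary) else "roll back")
```

Returning the list of violations rather than a bare boolean keeps the gate in line with the talk's emphasis on good visibility: a blocked deploy reports which system property regressed.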