[Dec 1 meetup] upgrading microservices

Post on 22-Jan-2018

TRANSCRIPT

  1. Upgrading Microservices (Continuously...). Presented by Rean Griffith.
  2. Summary. Upgrades should be boring, uneventful, predictable, frequent, and reversible (think compensating actions and redo vs. DB transactions). Good news: many layers of patterns to use (bad news: many anti-patterns). Some patterns are practice, others are structural (make it easy to do the right thing). The working definition of an upgrade includes changing how/where your microservices are deployed (new base image, new kernel, new physical/virtual machine, new configurations/ports, etc.) and changing what is deployed (new features, new feature variants).
  3. Agenda. Bio; microservice upgrade considerations; what customers want; upgrade challenges; options customers have; upgrade options for microservices running in containers (pros and cons of each approach); upgrade options for microservices running in unikernels (pros and cons of each approach); can we learn from the Docker upgrade experience and apply it to unikernel microservices?; summary.
  4. Bio. Operating systems + distributed systems + ML person: operating systems, resource management, cluster scheduling, machine learning. 5 years in the VMware CTO Office (network-aware DRS, network resource management, autoscaling systems, data-mining VM telemetry + anomaly detection). 2 years as a postdoc in the RAD Lab at UC Berkeley (machine learning + systems, OpenFlow, datacenter transport). Ph.D. in Computer Science, Columbia University. B.Sc. in Computer Science and Management, University of the West Indies (Barbados).
  5. Microservices Upgrade Considerations. It is important to consider upgrade logistics early. Successful upgrades are a mix of design/architecture and process. Using technology X won't compensate for poor design, fuzzy boundaries, or poor state management; you won't automatically get rolling upgrades, etc. without forethought! Upgrading a single application feature might require changes to multiple microservices. Upgrade differences from a monolith: online (perhaps temporarily degraded) vs. offline expectations; higher frequency of upgrades anticipated; more outbound dependencies (e.g., DNS, storage, other/external services). If using containers, factor in Docker registry dependencies, security, and latency.
  6. Typical Upgrade Experience? Kubernetes Operations (kops) anecdote.
  7. What Customers Want. A request for a Docker restart with an updated image; "I want to apply upgrades to production with zero downtime"; faster post-upgrade bring-up of new services/instances (batch DNS updates); a smooth cluster creation process; better interaction with networking.
  8. Continuous Upgrades: Do upgrade/update; Undo! (oops); Validate.
  9. Continuous Upgrades: (Some) Challenges. Undo! (oops). Config, component, or dependency mismatch; timeouts; failed upgrade, unable to roll back; post-upgrade version is buggy or a new CVE (vulnerability) is found; post-upgrade performance is worse.
  10. Agenda. Microservice upgrade considerations; what customers want; upgrade challenges; options customers have; upgrade options for microservices running in containers (pros and cons of each approach); upgrade options for microservices running in unikernels (pros and cons of each approach); can we learn from the Docker upgrade experience and apply it to unikernel microservices?; summary.
  11. Container Upgrade - Option 1: Manual. Demo. Pros: simple. Cons (pain points): prone to manual mistakes; could result in the server failing to come up after the upgrade; could result in dropped client connections and/or data loss in the case of a stateful application; not scalable.
  12. Container Upgrade - Option 2: Watchtower. Monitors running Docker containers; pulls new images when changes are detected and restarts the container using the new image. The container is restarted using "...the same options that were used when it was deployed initially". Demo. Pros: a new image push triggers the workflow; detects links between containers and starts/stops them "...in a way that won't break any of the links". Cons (pain points): assumes new start options = old/initial start options; no validation of the image and its config options post-upgrade.
  13. Container Upgrade - Option 3: Git hook (resin.io). Linux containers for IoT (Yocto Linux + Resin Container Engine). Pros: automates dev push to image build + deploy. Cons (pain points): root-causing upgrade interruptions; IoT devices only; device control via Yocto Linux + RCE; incompatible with Git submodules.
  14. Continuous Upgrades: Missing Upgrade Workflows. Do upgrade/update; Undo! (oops); Validate. Tool labels: Manual upgrade, Watchtower, Resin.io; Resin.io.
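
The do / validate / undo cycle can be sketched as a small wrapper that treats validation and rollback as part of every upgrade rather than as afterthoughts. This is a minimal illustration, not something from the talk: the function `upgrade_with_rollback`, the `state` dictionary, and the version strings are all hypothetical stand-ins for a real deployment.

```python
from typing import Callable


def upgrade_with_rollback(
    do: Callable[[], None],
    validate: Callable[[], bool],
    undo: Callable[[], None],
) -> str:
    """Run one do -> validate -> undo cycle and report the outcome."""
    try:
        do()
    except Exception:
        undo()  # a half-applied upgrade is rolled back too
        return "rolled back (upgrade failed)"
    if validate():
        return "upgraded"
    undo()
    return "rolled back (validation failed)"


# Toy state standing in for a real deployment (hypothetical version strings).
state = {"version": "1.0"}


def do_upgrade() -> None:
    state["version"] = "2.0"


def undo_upgrade() -> None:
    state["version"] = "1.0"


# A validation step that rejects the new version forces the undo path.
result = upgrade_with_rollback(do_upgrade, lambda: False, undo_upgrade)
print(result, state["version"])
```

Passing the undo action in up front means an upgrade cannot be started without a rollback story already in hand.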
  15. Ex: Patterns to Integrate to Capture Cont. Upgrade. Immutable Server: deployed instance is carved in stone; config changes => new deployed instance. Blue/Green Deployments: separate infrastructure for different versions/deployments. Canary Release: introduce new functionality incrementally (different from an A/B test). Monitoring (upgrade validation). Response Diffing (upgrade validation): validating old and new service versions via (automated) response comparisons (e.g., using Diffy).
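
One way to picture "introduce new functionality incrementally" is a router that sends a fixed, stable slice of clients to the new version and widens that slice over time. The sketch below is an assumption-laden illustration: `route`, `canary_percent`, and the client IDs are hypothetical, and hashing the client ID is just one common way to keep each client on the same side between requests.

```python
import hashlib


def route(client_id: str, canary_percent: int) -> str:
    """Return 'canary' for a stable subset of clients, else 'stable'."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # bucket in 0..99
    return "canary" if bucket < canary_percent else "stable"


# Hypothetical client population used only to show the ramp-up.
clients = [f"client-{i}" for i in range(1000)]
for percent in (0, 5, 25, 100):
    share = sum(route(c, percent) == "canary" for c in clients)
    print(f"{percent}% target -> {share} of {len(clients)} clients on canary")
```

Because the bucket depends only on the client ID, raising the percentage only ever adds clients to the canary; nobody flips back and forth between versions during the ramp.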
  16. Response Diffing with Diffy (Twitter). Primary and secondary run the last known good code; candidate runs the new code. Compare the number of primary-secondary differences with the number of primary-candidate differences. Noise example: candidate, primary, and secondary all disagree.
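
The comparison on this slide can be sketched in a few lines. This is not Diffy's code or API, only a toy model of the idea: count, per response field, how often primary and candidate disagree and how often primary and secondary disagree, then flag fields where the candidate disagrees more often than the two known-good instances do. The function names, the `min_excess` threshold, and the sample responses are all made up for illustration.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple

Response = Dict[str, object]


def differing_fields(a: Response, b: Response) -> Set[str]:
    """Fields whose values differ between two responses."""
    return {k for k in a.keys() | b.keys() if a.get(k) != b.get(k)}


def diff_report(
    samples: List[Tuple[Response, Response, Response]]
) -> Dict[str, Tuple[int, int]]:
    """Map each field to (primary-vs-candidate diffs, primary-vs-secondary diffs)."""
    raw: Counter = Counter()
    noise: Counter = Counter()
    for primary, secondary, candidate in samples:
        raw.update(differing_fields(primary, candidate))
        noise.update(differing_fields(primary, secondary))
    return {f: (raw[f], noise[f]) for f in raw.keys() | noise.keys()}


def suspicious_fields(report: Dict[str, Tuple[int, int]], min_excess: int = 1) -> List[str]:
    """Fields where the candidate differs more often than the noise baseline."""
    return sorted(f for f, (r, n) in report.items() if r - n >= min_excess)


# Hypothetical (primary, secondary, candidate) responses to the same requests.
# "ts" is a timestamp that differs everywhere (noise); "total" changes only
# in the candidate (a real behavioral difference).
samples = [
    ({"total": 10, "ts": 1}, {"total": 10, "ts": 2}, {"total": 11, "ts": 3}),
    ({"total": 20, "ts": 4}, {"total": 20, "ts": 5}, {"total": 21, "ts": 6}),
]
report = diff_report(samples)
print(report)
print(suspicious_fields(report))
```

In this toy data the timestamp field differs between all three instances, so it is treated as noise, while `total` differs only for the candidate and is flagged.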
  17. Revisit Watchtower Container Upgrade with Patterns. Immutable Server: each git push builds a new image. Blue/Green Deployments: new images deployed on a new set of instances. Canary Release: introduce new functionality incrementally into the newly active deployment. Response Diffing (upgrade validation): validating old and new service versions via (automated) response comparisons in the newly active deployment (e.g., using Diffy). Diagram: active deployment + canaries + previous version; inactive deployment w/ previous version.
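
The active/inactive arrangement can be modeled as two named deployments plus a pointer saying which one receives traffic. The sketch below is a simplified illustration under that assumption; `BlueGreenRouter`, its methods, and the version labels are hypothetical and do not correspond to Watchtower or any other tool.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class BlueGreenRouter:
    # Version currently installed on each side (hypothetical labels).
    deployments: Dict[str, str] = field(
        default_factory=lambda: {"blue": "v1", "green": "v1"}
    )
    active: str = "blue"

    @property
    def inactive(self) -> str:
        return "green" if self.active == "blue" else "blue"

    def deploy(self, version: str) -> None:
        """Install the new version on the inactive side; live traffic is untouched."""
        self.deployments[self.inactive] = version

    def switch(self) -> None:
        """Flip traffic to the other side; calling it again is the rollback."""
        self.active = self.inactive

    def serving(self) -> str:
        return self.deployments[self.active]


router = BlueGreenRouter()
router.deploy("v2")   # new image goes to the inactive deployment
router.switch()       # inactive becomes active
print(router.serving())
router.switch()       # rollback: the previous version is still in place
print(router.serving())
```

Keeping the previous version installed on the now-inactive side is what makes the undo step a pointer flip rather than a rebuild.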
  18. Revisit Resin.io Container Upgrade with Patterns. Immutable Server: each git push builds a new image. Blue/Green Deployments: may not be applicable unless new deployment = new set of drones (edge devices). Canary Release: likely more applicable; introduce new functionality incrementally into the newly active deployment. Should preserve the previous container image so there's a rollback story! Monitoring (upgrade validation). Diagram: active deployment + canaries + previous version.
  19. Agenda. Microservice upgrade considerations; what customers want; upgrade challenges; options customers have; upgrade options for microservices running in containers (pros and cons of each approach); upgrade options for microservices running in unikernels (pros and cons of each approach); can we learn from the Docker upgrade experience and apply it to unikernel microservices?; summary.
  20. Upgrading Microservices in Unikernels. Unikernel (working definition): a single-purpose (single-process) virtual appliance (multi-threading available); a statically linked image of your application and a hypervisor (no general OS or extra library code); no extraneous services, no (full-fledged) shell, no fork() facility to start a second process. Small form-factor deployments: well-suited for storage-constrained edge deployments. Some best practices are baked in: a unikernel is a purpose-built, targeted virtual appliance (immutable server); config is baked into the image (role also set in stone); Blue/Green-friendly (each instance is a new VM).
  21. Unikernel Upgrades with Patterns. Immutable Server: must build a new statically linked virtual appliance on each dev change. Blue/Green Deployments: new virtual appliance images launched as new VMs. Canary Release: slightly modified VM images launched in the newly active deployment. Response Diffing (upgrade validation): validating old and new service versions via (automated) response comparisons in the newly active deployment (e.g., using Diffy). Diagram: active deployment + canaries + previous version; inactive deployment w/ previous version.
  22. Unikernel Upgrade Story (Node.js + OSv). Node.js 4.1.1 + app + OSv. Upgrade to Node.js 4.6.1 (rebuild + run => worked). Upgrade to Node.js 6.9.1 (rebuild + run => worked most of the time). Upgrade to Node.js 7.0.0 (rebuild + run => worked most of the time, same issue). Short-term fix: roll back to the 4.6.1 image. Long-term fix: fix the OSv pthread_mutex_trylock wrapper.
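
An upgrade that "worked most of the time" is exactly the kind a single smoke test can miss. One hedged way to guard against that is to repeat the validation step and accept the new image only if every run passes. The sketch below is illustrative only: `passes_repeatedly` and `make_flaky_check` are hypothetical helpers, and the flaky check is a deterministic stand-in for an intermittent failure, not a reproduction of the OSv issue.

```python
from typing import Callable


def passes_repeatedly(check: Callable[[], bool], runs: int = 20) -> bool:
    """Return True only if `check` succeeds on every one of `runs` attempts."""
    return all(check() for _ in range(runs))


def make_flaky_check(fail_every: int) -> Callable[[], bool]:
    """Build a check that fails on every `fail_every`-th call (simulated flakiness)."""
    count = 0

    def check() -> bool:
        nonlocal count
        count += 1
        return count % fail_every != 0

    return check


# A single run looks healthy; repeated runs expose the intermittent failure.
print(passes_repeatedly(make_flaky_check(7), runs=1))
print(passes_repeatedly(make_flaky_check(7), runs=20))
```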
  23. Summary. We want upgrades to be routine (boring) and frequent. Working towards continuous upgrades requires a combination of design/architecture and process. Many tools capture upgrade steps but not the higher-level desirable workflows. Combining patterns/lessons from deploying containers can help capture these workflows. These patterns can be applied to container and unikernel deployments.
  24. Acknowledgements. Special thanks to: you (the audience) for your time and attention; Cisco (our meetup hosts); Jean-Paul Calderone, Erika Ghose, DJ Madhuri Yechuri. Contact info: rean@caa.columbia.edu