JavaOne 2016: "Java, Microservices, Cloud and Containers"
TRANSCRIPT
01/05/2023 @danielbryantuk | @spoole167 1
Java, Microservices, Cloud and Containers: Migrating without the Tiers (or Tears)
Daniel Bryant @danielbryantuk
Steve Poole @spoole167
The pitch
• Moving to the cloud requires a fundamental change in mindset
  • Technology
  • Skills (architectural, operational, QA)
  • Organisational design
• DevOps, container technology and microservices are complementary
• Migrating is non-trivial
• Learn from some of our successes (and mistakes)…
Who are we?
Steve Poole
IBM Developer
@spoole167
Daniel Bryant
Chief Scientist, OpenCredo
CTO SpectoLabs
@danielbryantuk
Making Java Real Since Version 0.9
Open Source Advocate
DevOps Practitioner (whatever that means!)
Driving Change
“Biz-dev-QA-ops”
Leading change in organisations
Experience of Docker, k8s, Go, Java
InfoQ, DZone, Voxxed contributor
Introduction
What ‘Cloud’ promises
A virtual, dynamic environment which maximizes use, is infinitely scalable, always available and needs minimal upfront investment or commitment
Take your code, host it on someone else's machine, pay only for the resources you use for the time you use them, AND be able to do that very quickly and repeatedly in parallel
https://www.flickr.com/photos/skohlmann/
The ability to have ‘cloud burst’ capacity is changing the way software is being designed, developed and supported
We’re moving to a more industrial scale:
Why buy one computer for a year when you can hire 365 computers for a day?
It’s a new development world
https://www.flickr.com/photos/vuhung/
“Compute on demand” – it’s what we always wanted
Cloud computing: compute == money
Money changes everything
With a measurable and direct relationship between $£€¥ and CPU, RAM, disk, etc., the financial success or failure of a project is even easier to see
And that means…
Even more focus on value for money.
American Society of Civil Engineers
Someone will be looking at your leaky app
Losing unnecessary baggage (you have loads)
Java applications have to get lighter.
Java 9 modularity will help but you have to consider footprint across the board.
Choose your dependencies wisely
Your choice of OS & distribution is important.
The aim is ‘carry on only’
Your application isn’t going on a long trip
https://www.flickr.com/photos/armydre2008/
Startup times
How long do you want to wait?
How long do you have to wait?
Do you need to preemptively start instances ‘just in case’ due to start up time? Too bad: that costs
If the unit of deployment and scaling is an instance of a service it needs to start FAST
https://www.flickr.com/photos/91295117@N08/
https://www.flickr.com/photos/isherwoodchris/
• Q: How much RAM does your application use?
• A: Too much
Runtime costs
Most cloud providers will charge you for your RAM usage over time: $ per GB/hr. (Sometimes the charge is $0)
Increasing -Xmx directly affects cost: something businesses can understand
Net effect: you’ll be tuning your application to fit into specific RAM sizes. Smaller than you use today.
You need to measure where the storage goes. You’ll be picking some components based on memory usage
Note that increasing the amount of memory for one service multiplies the increase in the bill by the number of concurrent instances
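The arithmetic behind that last point can be sketched in a few lines of Java. The price per GB-hour, the heap sizes and the instance count are made-up illustrative numbers, not any provider's real tariff, and `MemoryCostSketch` and `cost` are hypothetical names.

```java
// Minimal sketch: how a per-instance memory increase scales the bill.
// All prices and sizes are illustrative assumptions, not real provider rates.
class MemoryCostSketch {

    // Cost of running `instances` copies, each billed for `gbPerInstance` GB,
    // for `hours` hours at `pricePerGbHour` (currency units per GB per hour).
    static double cost(double gbPerInstance, int instances, double hours, double pricePerGbHour) {
        return gbPerInstance * instances * hours * pricePerGbHour;
    }

    public static void main(String[] args) {
        double price = 0.01;     // assumed price per GB-hour
        double hours = 24 * 30;  // one 30-day month
        int instances = 20;      // concurrent instances of one service

        double before = cost(1.0, instances, hours, price); // 1 GB per instance
        double after = cost(1.5, instances, hours, price);  // 1.5 GB per instance

        System.out.printf("before=%.2f after=%.2f extra=%.2f%n", before, after, after - before);
    }
}
```

With these assumed numbers, an extra 0.5 GB per instance becomes 10 GB of additional billed memory across the 20 instances, every hour the service runs.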
https://www.flickr.com/photos/erix/
Simply
Java applications are going to be running in a remote, constrained and metered environment
There will be precise limits on how much disk, CPU, RAM and bandwidth an application can use and for how long
Whether your application is large or small, granular or monolithic, someone will be paying for each unit used
That person will want to get the most out of that investment
https://www.flickr.com/photos/rvoegtli/
Where your code runs day-to-day and moment-to-moment will be driven by economics, legal requirements and how much risk your business wants to take.
Your code has to scale better, be more efficient, resilient, secure and work in constrained environments
You will have to design, code, deliver, support and debug code in new ways
It’s going to be scary
How scary?
design, coding, deployment, startup, execution, scaling, debugging, security, resilience …
Almost everything about your application is affected
https://www.flickr.com/photos/mjtmail/
Resilient applications
Design for short term failure: something fails all the time. Expect data and service outages regularly
Fail and recover: don’t diagnose problems in running systems. Kill it and move on
Every IO operation you perform may fail – do as few as possible
Every IO operation may stall, costing you GB/hrs and resources: time out everything quickly
Every piece of data you receive may be badly formed: check everything
Retry, compensation, backout strategies: these are your new friends
“Everything in the cloud fails all the time” : Werner Vogels
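As a rough illustration of "time out everything" combined with a bounded retry, here is a minimal Java sketch using only the standard library. The class and method names, the attempt count and the delays are assumptions chosen for the example; a production system would more likely rely on an existing resilience library and on the timeouts built into its HTTP or database client.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal sketch of a per-attempt timeout plus bounded retry with backoff.
// Names, attempt counts and delays are hypothetical.
class ResilientCall {

    // Runs `task` with a timeout on every attempt and retries up to `maxAttempts`
    // times, doubling the pause between attempts. Rethrows the last failure.
    static <T> T callWithRetry(Callable<T> task, Duration timeout, int maxAttempts,
                               Duration initialBackoff) throws Exception {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        // Cached pool: a stalled attempt that ignores interruption does not block the next one.
        ExecutorService executor = Executors.newCachedThreadPool();
        try {
            Exception last = null;
            long backoffMillis = initialBackoff.toMillis();
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                Future<T> future = executor.submit(task);
                try {
                    return future.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
                } catch (InterruptedException e) {
                    future.cancel(true);
                    Thread.currentThread().interrupt();
                    throw e;
                } catch (Exception e) { // TimeoutException or ExecutionException
                    future.cancel(true); // try to interrupt a stalled attempt
                    last = e;
                }
                if (attempt < maxAttempts) {
                    Thread.sleep(backoffMillis);
                    backoffMillis *= 2;
                }
            }
            throw last;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger calls = new AtomicInteger();
        // Simulated flaky dependency: fails twice, then succeeds.
        String result = callWithRetry(() -> {
            if (calls.incrementAndGet() < 3) throw new IllegalStateException("transient failure");
            return "ok";
        }, Duration.ofMillis(200), 5, Duration.ofMillis(10));
        System.out.println(result + " after " + calls.get() + " attempts");
    }
}
```

Retries are only safe for operations that can be repeated without side effects; anything else needs the compensation or backout strategies mentioned above.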
Debugging
Remote support for your family? Fancy having to do that for your own apps?
You have to assume:
You will never be able to log into a remote server.
You will never be able to attach a remote debugger to a failing app. Ever.
All problems must be resolved by local reproduction or logs and dumps (discuss)
https://www.flickr.com/photos/carbonnyc/
Debugging
It gets more challenging.
Failures during deployment or initial startup can be difficult or impossible to diagnose.
If your service instance didn’t start there is little chance of logs being kept!
Learn to love logs, dumps and traces.
Remote log stores and tools are going to be your best friends. BTW: they’ll cost too
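One way to make sure a failing instance leaves something behind is to write diagnostics to standard error, which most platforms collect, before the process dies. The sketch below is an assumption about how that might look, not a technique from the talk: it uses the standard `ThreadMXBean` API to produce a thread dump and installs a default uncaught-exception handler that prints it. The class and method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Minimal sketch: emit a thread dump to stderr when a thread dies unexpectedly,
// so the platform's log collector has something to keep.
class ThreadDumpSketch {

    // Builds a textual dump of all live threads.
    static String threadDump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip locked monitor and synchronizer details to keep the call cheap.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            out.append(info); // ThreadInfo.toString() gives name, state and a truncated stack
        }
        return out.toString();
    }

    // Installs a handler that logs the failure and a thread dump for any uncaught exception.
    static void installHandler() {
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> {
            System.err.println("Uncaught exception in thread " + thread.getName() + ": " + error);
            System.err.println(threadDump());
        });
    }

    public static void main(String[] args) {
        installHandler();
        System.out.println(threadDump());
    }
}
```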
https://www.flickr.com/photos/hinkelstone/
Security
When you deploy to public cloud your system will be attacked in minutes. Certainly in < 1hr
Your systems will always be under threat
https://www.flickr.com/photos/ahmadhammoudphotography/
It’s all change
How you design, code, deploy, debug, support etc will be affected by the metrics and limits imposed on you.
Financial metrics and limits always change behavior. They also create opportunity
Java applications have to get leaner and meaner
You have to learn new techniques and tools
https://www.flickr.com/photos/beigephotos/
Case studies
“Just make it do what the old one does (but better)”
• Case studies
  • ‘Teflon shouldered’ product owner
  • Rebuilding a service three times
• Problem
  • Performing migration without a clear definition of ‘done’
  • Accepting feature creep
“Just make it do what the old one does (but better)”
• Attempt to retrofit BDD/regression tests around application
  • Serenity BDD, Cucumber, JBehave
• Work incrementally with QA team
  • Manually test everything
  • Create tests for new functionality
• Compare input/output
  • Traffic: Twitter’s Diffy
  • Datastores: Reconciliator pattern
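The "compare input/output" idea can be sketched as a tiny harness that replays the same recorded inputs against the old and the new implementation and reports where they disagree. This is not Diffy's API; the class, the method and the two stand-in functions are hypothetical and only illustrate the principle.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

// Minimal sketch: replay identical inputs against legacy and candidate code
// and collect the inputs whose outputs differ.
class OutputComparison {

    // Returns the inputs for which the two implementations disagree.
    static <I, O> List<I> mismatches(List<I> inputs, Function<I, O> legacy, Function<I, O> candidate) {
        List<I> differing = new ArrayList<>();
        for (I input : inputs) {
            if (!Objects.equals(legacy.apply(input), candidate.apply(input))) {
                differing.add(input);
            }
        }
        return differing;
    }

    public static void main(String[] args) {
        // Stand-ins for the old service and the migrated one.
        Function<String, String> legacy = s -> s.trim().toUpperCase();
        Function<String, String> candidate = s -> s.toUpperCase(); // forgets to trim

        List<String> recorded = Arrays.asList("abc", " padded ", "XYZ");
        System.out.println("Inputs with differing output: " + mismatches(recorded, legacy, candidate));
    }
}
```

In a real migration the two functions would be calls to the running old and new services, and the inputs would come from captured production traffic.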
Twitter’s Diffy and mysqldbcompare
blog.twitter.com/2015/diffy-testing-services-without-writing-tests
dev.mysql.com/doc/mysql-utilities/1.5/en/mysqldbcompare.html
www.infoq.com/news/2015/04/raffi-krikorian-rearchitecting
My ‘re-architecting’ bible…
“Bounding the context”
• Case studies
  • Large business software provider thought they knew their domain
  • Small CRM company had let domain model entropy
• Problem
  • Development team lost sight of the application big picture
  • Lack of architectural awareness and ‘broken windows’
Context mapping (static) & event storming (dynamic)
www.infoq.com/articles/ddd-contextmapping
ziobrando.blogspot.co.uk/2013/11/introducing-event-storming.html
“Bounding the context”
• Create ‘seams’ within codebase
  • Natural domain boundaries
  • Single responsibility principle
  • Look for points of ‘friction’
• Extreme ownership
  • Seize (identify)
  • Clear (refactor logic / data)
  • Hold (metrics and ratchets)
  • Build (move code to service)
“How small is micro?”
• Case studies
  • UK retailer looking to migrate to cloud and microservices
  • Keen to minimise risk
• Problem
  • Previous attempts at gradual migration had failed
  • Integration issues: services either too big or too small
  • Spent a long time building a ‘microservice platform’
“How small is micro?”
• Understand microservice principles and Self-Contained Systems (SCS)
• Utilise the strangler pattern
• ‘Service Virtualisation’ is valuable for testing
• Don’t underestimate the value of PaaS
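The strangler pattern from the list above boils down to a routing facade in front of the monolith: paths that have been migrated go to the new service, everything else still goes to the old code. The sketch below is a deliberately simplified in-process Java illustration with hypothetical names and paths; in practice this routing usually lives in a reverse proxy or API gateway rather than in application code.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Minimal sketch of a strangler facade: route migrated path prefixes to new
// services and fall back to the monolith for everything else.
class StranglerFacade {
    private final Function<String, String> monolith;
    private final Map<String, Function<String, String>> migrated = new LinkedHashMap<>();

    StranglerFacade(Function<String, String> monolith) {
        this.monolith = monolith;
    }

    // Registers a path prefix that a new service has taken over.
    StranglerFacade migrate(String pathPrefix, Function<String, String> service) {
        migrated.put(pathPrefix, service);
        return this;
    }

    // Dispatches a request path to the first matching migrated service, else the monolith.
    String handle(String path) {
        for (Map.Entry<String, Function<String, String>> route : migrated.entrySet()) {
            if (path.startsWith(route.getKey())) {
                return route.getValue().apply(path);
            }
        }
        return monolith.apply(path);
    }

    public static void main(String[] args) {
        StranglerFacade facade = new StranglerFacade(p -> "monolith:" + p)
                .migrate("/basket", p -> "basket-service:" + p);

        System.out.println(facade.handle("/basket/42")); // prints "basket-service:/basket/42"
        System.out.println(facade.handle("/orders/7"));  // prints "monolith:/orders/7"
    }
}
```

Each further call to `migrate` moves one more slice of functionality out of the monolith without callers noticing the change.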
zeroturnaround.com/rebellabs/microservices-for-the-enterprise/
Self-contained systems (SCS)
http://scs-architecture.org/
UI / Biz / Repo
Monolith
Domains
Modules, components, frameworks, libraries
Self-contained systems (SCS)
SCS
Microservices
Strangling your software (not your manager!)
paulhammant.com/2013/07/14/legacy-application-strangulation-case-studies/
www.nginx.com/blog/refactoring-a-monolith-into-microservices/
Service Virtualisation (for Dev and Test)
• Existing tooling
  • Hoverfly
  • Wiremock
  • VCR/Betamax
  • Mountebank
  • mirage
Hoverfly
• Lightweight service virtualisation
  • Open source (Apache 2.0)
  • Go-based / single binary
  • Written by @Spectolabs
• Flexible API simulation
  • HTTP / HTTPS
  • More protocols to follow?
• Middleware
  • Remove PII
  • Rate limit
  • Add headers
• Middleware
  • Fault injection
  • Chaos monkey
The value of PaaS…
“Cloud native or ‘lift and shift’”
• Case studies
  • Price comparison website performance dipped upon a migration to the cloud
• Problems
  • Not coding for distributed or ephemeral nature of cloud
  • No reliable creation of cloud environment
  • Not testing in the cloud
“Cloud native or ‘lift and shift’”
• Push apps through to production as early as possible (CI/CD)
  • Build POCs appropriately
  • Include building infrastructure in the pipeline
• Include NFR testing in the build pipeline
• Dev, QA and Ops must cultivate ‘mechanical sympathy’
  • Everything in the cloud is networked
• Configure local development environments as appropriate
NFR testing in the (cloud) pipeline
NFR testing in the (container) pipeline
NFR testing resources
• Performance
  • JMeter
  • Gatling
• Fault-tolerance
  • Hoverfly
  • Wiremock/Saboteur
• Security
  • bdd-security (OWASP ZAP)
  • OWASP Dependency-Check
  • Docker Bench for Security
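Whatever tool produces the measurements, an NFR check in a pipeline usually ends in a simple gate: compare a measured percentile against a budget and fail the build if it is exceeded. The Java sketch below shows only that gating logic, with made-up latency samples and a made-up budget; it does not use the JMeter or Gatling APIs, and the class and method names are hypothetical.

```java
import java.util.Arrays;

// Minimal sketch of a latency budget gate for a build pipeline.
// Samples and budget are illustrative assumptions.
class LatencyBudget {

    // Nearest-rank percentile of the samples, for p in (0, 100].
    static long percentile(long[] samplesMillis, double p) {
        if (samplesMillis.length == 0) throw new IllegalArgumentException("no samples");
        long[] sorted = samplesMillis.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    // True when the p-th percentile latency is within the budget.
    static boolean withinBudget(long[] samplesMillis, double p, long budgetMillis) {
        return percentile(samplesMillis, p) <= budgetMillis;
    }

    public static void main(String[] args) {
        long[] samples = {12, 15, 11, 14, 250, 13, 12, 16, 15, 14}; // response times in ms
        long budgetMillis = 100;

        long p95 = percentile(samples, 95);
        if (withinBudget(samples, 95, budgetMillis)) {
            System.out.println("p95=" + p95 + "ms is within the budget");
        } else {
            // A real pipeline step would exit non-zero here to fail the build.
            System.out.println("p95=" + p95 + "ms exceeds the budget of " + budgetMillis + "ms");
        }
    }
}
```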
Security is vital (but often ignored)
www.youtube.com/watch?v=c9uvV4ChIXw
www.infoq.com/news/2016/08/secure-docker-microservices
Shameless plugs…
www.youtube.com/watch?v=A1982GdXXSA
“Containerise all the things”
• Problem
  • JVM respecting resource limits
  • OOM: Unable to create thread
  • Random application stalling
• Case studies
  • www.notonthehighstreet.com
“Containerise all the things”
• Set container memory appropriately
  • docker run --memory="Xg"
  • JVM requirements = Heap size (Xmx) + Metaspace + JVM overhead
  • Account for native thread requirements e.g. thread stack size (Xss)
  • Watch out for ulimits
• Entropy
  • Host entropy can soon be exhausted by crypto operations
  • -Djava.security.egd=file:/dev/urandom
  • Be aware of security ramifications
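The memory sizing rule above can be turned into a back-of-the-envelope calculation. The Java sketch below adds heap, metaspace, per-thread stacks and a lump sum for other JVM overhead; every number in it is an assumption for illustration, and the real values have to come from measuring your own application. `ContainerMemoryEstimate` and `estimateMb` are hypothetical names.

```java
// Minimal sketch: rough lower bound for a container memory limit.
// All sizes below are illustrative assumptions, not recommendations.
class ContainerMemoryEstimate {

    // heap (-Xmx) + metaspace + one stack (-Xss) per thread + other JVM overhead, in MB.
    static long estimateMb(long heapMb, long metaspaceMb, int threads,
                           long stackKbPerThread, long overheadMb) {
        long stacksMb = (threads * stackKbPerThread + 1023) / 1024; // round up to whole MB
        return heapMb + metaspaceMb + stacksMb + overheadMb;
    }

    public static void main(String[] args) {
        // Assumed settings: -Xmx512m, 128 MB metaspace, 200 threads with -Xss1m,
        // and 100 MB for code cache, GC structures, direct buffers and so on.
        long limitMb = estimateMb(512, 128, 200, 1024, 100);
        System.out.println("Container limit should be at least about " + limitMb + " MB");

        // Worth logging at startup: the maximum heap the JVM actually chose.
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("JVM max heap in this process: " + maxHeapMb + " MB");
    }
}
```

With these assumed values the estimate is 940 MB, noticeably more than the 512 MB heap alone, which is why setting the container limit equal to -Xmx tends to get the process killed.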
Containerising our knowledge
Key lessons learned
Lessons learned from the trenches
• Specify goals and targets of migration (and retrospect)
• Undertake just enough up front design (contexts, APIs, integration)
• Understand distributed systems (12 factors etc)
• Start pushing to production ASAP
• There is nothing wrong with PaaS
• Programmable infrastructure is a key enabler
• Don’t forget the NFRs
• Containers and microservices are complementary to cloud
Recommended reading
Thanks for listening
• Any questions?
• Daniel Bryant (@danielbryantuk)
• Steve Poole (@spoole167)