How to build more reliable, robust and scalable distributed systems

How to build more reliable, robust and scalable distributed systems Lars-Erik Kindblad Senior Consultant Twitter: @kindblad E-mail: [email protected]


Post on 19-Dec-2014


DESCRIPTION

Slides from my lightning talk at Teknologihuset in Oslo on how to build more reliable, robust and scalable distributed systems.

TRANSCRIPT

Page 1: How to build more reliable, robust and scalable distributed systems

How to build more reliable, robust and scalable distributed systems

Lars-Erik Kindblad, Senior Consultant

Twitter: @kindblad

E-mail: [email protected]

Page 2: How to build more reliable, robust and scalable distributed systems

The Book Shop

Place order should:

1. Save the order in a database
2. Charge the customer’s credit card
3. Send a confirmation e-mail to the customer

Page 3: How to build more reliable, robust and scalable distributed systems

The typical way – request/response

Also called synchronous remote procedure call

Diagram: the Browser sends Place order to OrderController, which calls in turn:

1. Add order (OrderService)
2. Charge credit card (PaymentService)
3. Send mail (NotificationService)
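The request/response flow above can be sketched roughly as follows. This is a minimal in-process illustration, not code from the talk: the class names follow the slide, but every method name, the order fields and the returned page names are assumptions, and the real services would be remote calls.

```python
class OrderService:
    def __init__(self):
        self.orders = []

    def add_order(self, order):
        # Stand-in for a remote call that saves the order in a database.
        self.orders.append(order)


class PaymentService:
    def __init__(self):
        self.charges = []

    def charge_credit_card(self, card, amount):
        # Stand-in for a remote call to a payment provider.
        self.charges.append((card, amount))


class NotificationService:
    def __init__(self):
        self.sent = []

    def send_mail(self, email, text):
        # Stand-in for a remote call to a mail server.
        self.sent.append((email, text))


class OrderController:
    def __init__(self, orders, payments, notifications):
        self.orders = orders
        self.payments = payments
        self.notifications = notifications

    def place_order(self, order):
        # Each call blocks until the service answers; the browser waits
        # for all three before it gets a response.
        self.orders.add_order(order)
        self.payments.charge_credit_card(order["card"], order["amount"])
        self.notifications.send_mail(order["email"], "Order confirmed")
        return "Confirmation page"


order_service = OrderService()
payment_service = PaymentService()
notification_service = NotificationService()
controller = OrderController(order_service, payment_service, notification_service)
response = controller.place_order(
    {"card": "4111-XXXX", "amount": 250, "email": "customer@example.com"}
)
```

The response time seen by the browser is the sum of the three calls, and any one of them failing fails the whole request.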

Page 4: How to build more reliable, robust and scalable distributed systems

Problem #1 – the order is lost

Order is lost

User will receive an error

User might leave – lost sale

User might retry

Diagram: the Browser sends Place order to OrderController. 1. Add order (OrderService) fails with an exception, and the Browser is shown an error page. The remaining steps are 2. Charge credit card (PaymentService) and 3. Send mail (NotificationService).

Failures: network, web service, database and more

Page 5: How to build more reliable, robust and scalable distributed systems

Problem #2 – no transactional management

Order is stored

Payment is processed

E-mail is not sent

User receives an error and might retry:

Order might be stored twice in the database

Credit card might be charged multiple times

Diagram: the Browser sends Place order to OrderController. 1. Add order (OrderService) and 2. Charge credit card (PaymentService) succeed, 3. Send mail (NotificationService) fails with an exception, and the Browser is shown an error page.
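The effect of a retry without transactional management can be reproduced with a small sketch. All names here (`MailServerDown`, `submit`, the order fields) are hypothetical, and the in-memory lists stand in for the database and the payment provider.

```python
class MailServerDown(Exception):
    """Hypothetical failure raised by the mail step."""


orders = []   # stands in for the order database
charges = []  # stands in for the payment provider's ledger


def add_order(order):
    orders.append(order)


def charge_credit_card(order):
    charges.append(order["amount"])


def send_mail(order):
    # Simulate the failure from the slide: the e-mail is never sent.
    raise MailServerDown("mail server unreachable")


def place_order(order):
    add_order(order)           # succeeds and is not rolled back
    charge_credit_card(order)  # succeeds and is not rolled back
    send_mail(order)           # fails


def submit(order):
    try:
        place_order(order)
        return "Confirmation page"
    except MailServerDown:
        return "Error page"


order = {"customer": "alice", "amount": 250}
first = submit(order)
second = submit(order)  # the user sees an error page and retries
```

After the retry, `orders` holds the same order twice and `charges` holds two charges, even though the user only ever saw error pages.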

Page 6: How to build more reliable, robust and scalable distributed systems

Solution – one-way messaging

Asynchronous

Diagram: the Browser sends Place order to OrderController, which adds a PlaceOrder message to the Queue.

The Queue is First-In, First-Out: PlaceOrder message 1 leaves the queue before PlaceOrder messages 2, 3 and 4.

The message:
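The message shown on the slide is not part of this transcript. As a rough illustration only, the sketch below uses a plain dictionary as a hypothetical PlaceOrder message and Python's in-memory `queue.Queue` as the queue. The field names are assumptions, and a real system would use a durable queue so that messages survive a restart.

```python
import queue

# In-memory FIFO queue standing in for a durable message queue.
place_order_queue = queue.Queue()


def place_order(order):
    # One-way messaging: the controller only enqueues and returns at once.
    place_order_queue.put({"type": "PlaceOrder", "order": order})
    return "Thank you page"


for n in (1, 2, 3, 4):
    place_order({"book": f"Book {n}"})

# First-In, First-Out: the message added first is received first.
first_out = place_order_queue.get()
```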

Page 7: How to build more reliable, robust and scalable distributed systems

The message must also be processed

Diagram: the Worker processes messages from the Queue:

1. Connect
2. Receive PlaceOrder message
3. PlaceOrderMessage Handler
4. Add order (OrderService)
5. Charge credit card (PaymentService)
6. Send mail (NotificationService)
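A worker along these lines might look like the sketch below. The handler table, the function names and the `log` list are assumptions made for illustration; the three appended strings stand in for the calls to OrderService, PaymentService and NotificationService.

```python
import queue


def handle_place_order(message, log):
    # Stand-ins for the three service calls made by the handler.
    log.append("add order")
    log.append("charge credit card")
    log.append("send mail")


# Maps a message type to the handler that processes it.
HANDLERS = {"PlaceOrder": handle_place_order}


def run_worker(q, log):
    """Drain the queue, dispatching each message to its handler."""
    while True:
        try:
            message = q.get_nowait()
        except queue.Empty:
            return
        HANDLERS[message["type"]](message, log)


q = queue.Queue()
q.put({"type": "PlaceOrder", "order_id": 1})
log = []
run_worker(q, log)
```

A long-running worker would block on `q.get()` instead of returning when the queue is empty; the sketch drains and stops so that it terminates.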

Page 8: How to build more reliable, robust and scalable distributed systems

Benefits

Very fast on the frontend – put the message on the queue

The order is never lost

Automatically retries during errors

Diagram: the Worker processes a message from the Queue:

1. Connect
2. Receive PlaceOrder message
3. PlaceOrderMessage Handler (fails with an error)
4. Rollback. Put the message back on the queue and retry
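The retry step can be sketched as below. The attempt counter, the `MAX_ATTEMPTS` limit and the dead-letter list are assumptions added so the example cannot loop forever; the slides only state that the message is put back on the queue and retried.

```python
import queue

MAX_ATTEMPTS = 5  # hypothetical limit, not from the slides


def run_worker(q, handler, dead_letters):
    while True:
        try:
            message = q.get_nowait()
        except queue.Empty:
            return
        try:
            handler(message)
        except Exception:
            # The handler failed: put the message back and retry later.
            message["attempts"] = message.get("attempts", 0) + 1
            if message["attempts"] < MAX_ATTEMPTS:
                q.put(message)
            else:
                dead_letters.append(message)  # give up and keep for inspection


calls = {"count": 0}


def flaky_handler(message):
    # Fails on the first two deliveries, succeeds on the third.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("database unavailable")


q = queue.Queue()
q.put({"type": "PlaceOrder"})
dead_letters = []
run_worker(q, flaky_handler, dead_letters)
```

Here the message is delivered three times before the handler succeeds, and nothing ends up in `dead_letters`.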

Page 9: How to build more reliable, robust and scalable distributed systems

We still have poor transactional management

Order is created

Credit card is charged

E-mail sending fails

The message is put back on the queue and will be retried

Order is duplicated and credit card will be charged twice

Diagram: Handle PlaceOrder runs 1. Add order (OrderService) and 2. Charge credit card (PaymentService) successfully, then 3. Send mail (NotificationService) fails with an exception.

Page 10: How to build more reliable, robust and scalable distributed systems

Solution

Split into many messages - one message per transactional boundary

Diagram:

1. The Browser sends Place order to OrderController, which adds a PlaceOrder message to the Queue
2. OrderService handles PlaceOrder and adds a PayOrder message
3. PaymentService handles PayOrder and adds a SendMail message
4. NotificationService handles SendMail

If this fails, the message is put back on the queue and will be retried
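One message per transactional boundary can be sketched as a chain of handlers, each doing one unit of work and then enqueueing the next message. The message names follow the slide; the handler functions, the `log` list and the simulated mail failure are assumptions for illustration.

```python
import queue

q = queue.Queue()
log = []
mail_attempts = {"count": 0}


def handle_place_order(msg):
    log.append(f"order {msg['order_id']} saved")
    q.put({"type": "PayOrder", "order_id": msg["order_id"]})


def handle_pay_order(msg):
    log.append(f"order {msg['order_id']} charged")
    q.put({"type": "SendMail", "order_id": msg["order_id"]})


def handle_send_mail(msg):
    mail_attempts["count"] += 1
    if mail_attempts["count"] == 1:
        raise ConnectionError("mail server down")  # simulated first failure
    log.append(f"mail for order {msg['order_id']} sent")


HANDLERS = {
    "PlaceOrder": handle_place_order,
    "PayOrder": handle_pay_order,
    "SendMail": handle_send_mail,
}


def run_worker():
    # Single-threaded sketch: process until the queue is empty.
    while not q.empty():
        msg = q.get()
        try:
            HANDLERS[msg["type"]](msg)
        except ConnectionError:
            q.put(msg)  # only the failed step is retried


q.put({"type": "PlaceOrder", "order_id": "A1"})
run_worker()
```

When the mail step fails, only the SendMail message goes back on the queue, so the order is saved once and the card is charged once.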

All the services must be idempotent to be 100% reliable
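Idempotent means that handling the same message twice has the same effect as handling it once. One common way to get there, shown here as a sketch under assumptions (the `order_id` key and the in-memory dictionary are hypothetical, and a real service would keep this record in durable storage), is to remember which orders have already been processed:

```python
class PaymentService:
    def __init__(self):
        self.charged = {}  # order_id -> amount already charged

    def charge(self, order_id, amount):
        """Charge an order at most once; return True if a charge was made."""
        if order_id in self.charged:
            return False  # duplicate delivery: do nothing
        self.charged[order_id] = amount
        return True


payments = PaymentService()
first = payments.charge("A1", 250)
second = payments.charge("A1", 250)  # same message delivered again
```

The second delivery is detected and ignored, so the total charged stays at 250.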

Page 11: How to build more reliable, robust and scalable distributed systems

Scaling

Request/Response: needs to process an unpredictable number of requests

Messaging: needs to store an unpredictable number of messages. The workers only process a predictable number of messages, even during peaks

Scale up: process multiple messages concurrently

Scale out: use multiple workers
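Concurrent processing can be sketched with several worker threads reading from one queue. This is an in-process illustration only: scaling out in practice means separate worker processes or machines reading from a shared queue. The function names and the `None` stop signal are assumptions.

```python
import queue
import threading


def worker(q, processed, lock):
    while True:
        message = q.get()
        if message is None:  # stop signal
            q.task_done()
            return
        with lock:
            processed.append(message["order_id"])
        q.task_done()


def process_all(messages, worker_count):
    q = queue.Queue()
    processed = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(q, processed, lock))
        for _ in range(worker_count)
    ]
    for t in threads:
        t.start()
    for message in messages:
        q.put(message)
    for _ in threads:
        q.put(None)  # one stop signal per worker
    q.join()
    for t in threads:
        t.join()
    return processed


processed = process_all([{"order_id": n} for n in range(100)], worker_count=4)
```

The number of messages in flight is bounded by `worker_count`, however many messages are waiting in the queue.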

Page 12: How to build more reliable, robust and scalable distributed systems

Messaging challenges

1. ID generation: the order ID is not available until the worker has processed the message

2. Eventual consistency: the PlaceOrder message might not have been picked up yet
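One possible way around the ID generation challenge, which is an assumption here and not something the slides prescribe, is to generate the order ID in the controller before the message is enqueued, for example as a random UUID. The message fields and function name are hypothetical.

```python
import queue
import uuid

q = queue.Queue()


def place_order(order):
    # Generate the ID up front so it can be shown to the user immediately.
    order_id = str(uuid.uuid4())
    q.put({"type": "PlaceOrder", "order_id": order_id, "order": order})
    return order_id


order_id = place_order({"book": "Some book"})
message = q.get()
```

The same ID also gives the worker a natural key for detecting duplicate deliveries. It does not remove eventual consistency: the order still does not exist in the database until the message has been handled.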

Page 13: How to build more reliable, robust and scalable distributed systems

NServiceBus

Lightweight messaging framework for .NET

Great choice for implementing one-way messaging + publish/subscribe

Open Source but not free

Available at http://particular.net/

Page 14: How to build more reliable, robust and scalable distributed systems

The information contained in this presentation is proprietary. © 2012 Capgemini. All rights reserved.

www.capgemini.com
