Gogolook SQS lesson learnt

Upload: kakashi-liu

Post on 19-Jul-2015

Category:

Technology


TRANSCRIPT

Page 1: Gogolook SQS lesson learnt

Gogolook SQS lesson learnt

Page 2: Gogolook SQS lesson learnt

Hello! I am Kakashi

You can find me at: Twitter: @kakashiliu

Page 3: Gogolook SQS lesson learnt

The story is ...

Page 4: Gogolook SQS lesson learnt
Page 5: Gogolook SQS lesson learnt

Previous Status

◎ Worker: c3.large * 30~50, spot instance
◎ CPU utilization: less than 20%
◎ Messages: more than 1 billion per day

Page 6: Gogolook SQS lesson learnt

SQS basic introduction

Page 7: Gogolook SQS lesson learnt

SQS basic settings 2

◎ First 1 million SQS requests per month are free
◎ $0.476 per 1 million SQS requests
◎ A single request can have from 1 to 10 messages, up to a maximum total payload of 256 KB
◎ Requests include: CreateQueue, ListQueues, DeleteQueue, SendMessage, SendMessageBatch, ReceiveMessage, ChangeMessageVisibility, ChangeMessageVisibilityBatch, DeleteMessage, DeleteMessageBatch, SetQueueAttributes, GetQueueAttributes, GetQueueUrl, AddPermission, and RemovePermission.

Page 8: Gogolook SQS lesson learnt

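SQS new policy 1: Process improvement

◎ Every worker was revived by crontab every 1 min and exited when it had nothing to do => low CPU utilization, high CPU load

◎ Solution: use a while True loop with SQS long polling (20 s), and sleep 60 s when there is nothing to do

◎ Result: load dropped, but CPU utilization did not improve much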
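
A minimal sketch of such a long-polling loop, not the code from the talk. It assumes `client` is a boto3-style SQS client exposing `receive_message` and `delete_message`; `run_worker`, `handle`, `max_loops`, and the injectable `sleep` are hypothetical names added so the loop can be exercised without AWS. With `max_loops=None` it behaves like the `while True` loop on the slide.

```python
import time


def run_worker(client, queue_url, handle, idle_sleep=60, max_loops=None,
               sleep=time.sleep):
    """Poll one SQS queue, sleeping only when a long poll comes back empty."""
    processed = 0
    loops = 0
    while max_loops is None or loops < max_loops:
        loops += 1
        # WaitTimeSeconds=20 asks SQS to hold the request open for up to 20 s
        # (long polling) instead of answering an empty queue immediately.
        response = client.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=20,
        )
        messages = response.get("Messages", [])
        if not messages:
            # Nothing to do: back off for 60 s, as on the slide.
            sleep(idle_sleep)
            continue
        for message in messages:
            handle(message["Body"])
            # Delete only after the handler returned without raising.
            client.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
            )
            processed += 1
    return processed
```

The process stays alive between polls, so there is no per-minute start-up cost from crontab, and an idle queue costs roughly one request per 80 s per worker.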

Page 9: Gogolook SQS lesson learnt

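SQS new policy 2: Breaking the chains

◎ Found a network bottleneck: the third-party loggly library did not use keep-alive connections

◎ Solution: 1) in the Python library, replace urllib2 with requests.session 2) send logs in batches

◎ Result: the queue drained twice as fast, and worker CPU utilization rose several-fold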
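
The two changes can be sketched roughly as follows. This is an illustration under assumptions, not the patched loggly library: `BatchLogSender`, the `batch_size` of 50, and the newline-delimited JSON body are hypothetical choices, and `url` is a placeholder. The `session` argument is expected to be something like `requests.Session()`, which reuses the underlying connection between calls to `post`.

```python
import json


class BatchLogSender:
    """Buffer log events and ship each batch in one POST on a reused session."""

    def __init__(self, session, url, batch_size=50):
        self.session = session        # e.g. requests.Session() for keep-alive
        self.url = url                # placeholder endpoint (assumption)
        self.batch_size = batch_size  # illustrative value, not from the talk
        self.buffer = []

    def log(self, event):
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        # One request per batch instead of one request per event.
        body = "\n".join(json.dumps(event) for event in self.buffer)
        self.session.post(self.url, data=body)
        self.buffer = []
```

With a real session this would be used as `BatchLogSender(requests.Session(), url)`; the worker then spends less time waiting on connection setup for each log line.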

Page 10: Gogolook SQS lesson learnt

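SQS new policy 3: Reducing the number of requests

◎ Send messages in batches: buffer up to 100 messages, or until the 256 KB payload is full, before sending

◎ Solution: patch the original function that sends to the queue, and use atexit so that no messages are left in the local queue when a worker is shut down

◎ Result: send, receive, and delete request counts dropped by 99%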
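
A rough sketch of this local buffering, under stated assumptions. The slide does not say how the 100 messages are packed; here they are joined into a single JSON array body, which is a guess. `BufferedSender` and `send_body` are hypothetical names; `send_body` stands in for whatever function actually sends one payload to the queue.

```python
import atexit
import json

MAX_MESSAGES = 100        # batch size from the slide
MAX_BYTES = 256 * 1024    # payload limit from the slide


class BufferedSender:
    """Collect messages locally and send them as one packed payload."""

    def __init__(self, send_body, max_messages=MAX_MESSAGES,
                 max_bytes=MAX_BYTES):
        self.send_body = send_body
        self.max_messages = max_messages
        self.max_bytes = max_bytes
        self.items = []
        self.size = 2  # the enclosing "[" and "]"
        # Flush whatever is still buffered when the interpreter exits normally.
        atexit.register(self.flush)

    def send(self, message):
        item = json.dumps(message)  # ASCII-only output, so len() equals bytes
        extra = len(item) + (1 if self.items else 0)  # +1 for the comma
        if self.items and self.size + extra > self.max_bytes:
            # Adding this message would overflow the payload: send first.
            self.flush()
            extra = len(item)
        self.items.append(item)
        self.size += extra
        if len(self.items) >= self.max_messages:
            self.flush()

    def flush(self):
        if not self.items:
            return
        self.send_body("[" + ",".join(self.items) + "]")
        self.items = []
        self.size = 2
```

Note that `atexit` handlers run on normal interpreter exit, not when the process is killed by an unhandled signal, and that a single message larger than the limit is still passed through unchanged in this sketch.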

Page 11: Gogolook SQS lesson learnt

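SQS new policy 4: Tuning the number of workers per machine

◎ For queues with more messages, start more threads to process them

◎ gevent can also be used to sidestep some network-bound problems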
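
One way to read "more threads for busier queues" is a proportional split of a fixed thread budget. The sketch below is an assumption, not the scheme used in the talk: `plan_threads`, `start_workers`, and the one-thread minimum are hypothetical, and the queue depths would have to come from somewhere such as the queue's approximate message count.

```python
import threading


def plan_threads(queue_depths, total_threads):
    """Split a thread budget across queues in proportion to their backlog."""
    total = sum(queue_depths.values())
    plan = {}
    for name, depth in queue_depths.items():
        share = depth / total if total else 0
        # Every queue keeps at least one thread so it is never starved.
        plan[name] = max(1, round(share * total_threads))
    return plan


def start_workers(plan, worker):
    """Start `count` daemon threads per queue, each running worker(name)."""
    threads = []
    for name, count in plan.items():
        for _ in range(count):
            thread = threading.Thread(target=worker, args=(name,), daemon=True)
            thread.start()
            threads.append(thread)
    return threads
```

Because of rounding and the one-thread minimum, the plan can slightly exceed the budget. For workers that mostly wait on the network, gevent greenlets could take the place of the threads here.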

Page 12: Gogolook SQS lesson learnt

Result

◎ Spot instance: 30~50 -> 2
◎ CPU utilization: 20% -> 80%
◎ Message counts: 1 billion -> 10 million

◎ Processing time

Page 13: Gogolook SQS lesson learnt

Result

Cost Reduction around 70%

Page 14: Gogolook SQS lesson learnt

Bonus: SQS alternative

Page 15: Gogolook SQS lesson learnt

Fluentd (Before)

Page 16: Gogolook SQS lesson learnt

Fluentd (After)

Page 17: Gogolook SQS lesson learnt

Traditional Flow

AP server -> SQS -> Worker -> Mongodb, Big Query

Page 18: Gogolook SQS lesson learnt

Fluentd Flow

AP server -> Fluentd Forwarder -> Fluentd Station -> Mongodb, Big Query

Page 19: Gogolook SQS lesson learnt

Fluentd Flow

AP server -> Fluentd Forwarder -> Fluentd Station ELB -> Fluentd Station * 3

Page 20: Gogolook SQS lesson learnt

Thanks! Any questions?

You can find me at: @kakashiliu