Performance metrics for a social network

Upload: thierry-schellenbach

Post on 28-May-2015


DESCRIPTION

Performance metrics for a social network. Presentation on Fashiolista's usage of New Relic, StatsD/Graphite and PgFouine to stay on top of load times. See the blog post at http://www.mellowmorning.com

TRANSCRIPT

Page 1: Performance metrics for a social network

Performance metrics for a social network

Page 2: Performance metrics for a social network

About Me

• Thierry Schellenbach
• Founder/CTO, Fashiolista
• Author of Django Facebook
• GitHub: tschellenbach
• Blog: mellowmorning.com
• @tschellenbach

Page 3: Performance metrics for a social network

Global Fashion Discovery

Page 4: Performance metrics for a social network
Page 5: Performance metrics for a social network
Page 6: Performance metrics for a social network
Page 7: Performance metrics for a social network

5.000.000+

8.000.000+

Page 8: Performance metrics for a social network

Growth

2nd largest fashion community

• 1mln members
• 17mln loves/month
• 94mln non-bot pageviews

Page 9: Performance metrics for a social network

Powered By

• Django/Python
• PostgreSQL
• Solr
• Redis
• Celery
• AWS/Ubuntu
• Nginx/Gunicorn/Supervisor

Page 10: Performance metrics for a social network

Sexy Metrics driven optimization

Hard Because

• All content is personalized

• Activity is clustered around a few users (>100k followers)

• Individual users are insanely active (7 hours in a day is normal)

• Social network, can’t easily shard data

Page 11: Performance metrics for a social network

Speed is a Feature

Page 12: Performance metrics for a social network

Metrics across the board

• Development
– Spot things early on, wrong usage of the ORM etc.

• System Health
– Is my DB healthy, my Redis cluster etc.

• Page level
– Why is my page slow?
– What is the average speed of the components (DB, Redis, Solr etc.)?

Page 13: Performance metrics for a social network

Tools we use

Development

• Debug toolbar
– Cache calls
– Graphite timings
– Queries and their explains
– Duplicate query detection

System Health

• Cloudwatch
• Munin
• Nagios
• DB slow log
• Redis slow log
• Integration Tests
• PgFouine

Page Level

• New Relic
• Graphite

Page 14: Performance metrics for a social network

Development

Duplicates

Cache Calls

StatsD

Page 15: Performance metrics for a social network

Today’s Presentation

New Relic

• Dashboard, high-level insights

Graphite

• Stash all data, query it any way you want

• Tool, not a dashboard

PgFouine

• Understand what keeps your DB busy

Page 16: Performance metrics for a social network

New Relic

• Frontend -> App -> Components (DB, Solr, etc.)

• Breaks page performance down into its components

• Tracks deploys and compares before and after

Page 17: Performance metrics for a social network

Are you Supported?

• Ruby
• Java
• .NET
• PHP
• Python

• pip install newrelic
• Edit the .ini
• Add the WSGI middleware
• Wait for magic

Page 18: Performance metrics for a social network

End user load times

• Drill down all the way to database calls
• The purple line is our app, the rest is frontend

App

Frontend (97%)

Page 19: Performance metrics for a social network

Global page loads

Page 20: Performance metrics for a social network

Page Level

• Average frontend performance per page
• Click to view App level breakdown

To App Level

Page. Not URL.

Page 21: Performance metrics for a social network

Drill down/ App overview

Memcached

DB Query

History

Page 22: Performance metrics for a social network

Database

• See which tables are under most load

• See which pages cause the load

• Development over time

Page 23: Performance metrics for a social network

Deploys

Page 24: Performance metrics for a social network

Deploys part Two

Response Time Pre & Post

Memory Utilization

Page 25: Performance metrics for a social network

Background Task

Number of Task calls (sample)

Page 26: Performance metrics for a social network

Graphite Insights

• NewRelic has the overview, Graphite the detail

• Open Source!

• Throw data at it via UDP

• Popularized by Etsy (see mellowmorning.com for link)
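
A note on "throw data at it via UDP": in a StatsD setup, a timing sample is a single plain-text datagram of the form <bucket>:<value>|ms. The sketch below is a minimal illustration using only the standard library. The function names are hypothetical, and the host and port (8125, the usual StatsD default) are assumptions about the deployment, not something the slides state.

```python
import socket


def format_timing(bucket, millis):
    """Encode one timer sample in the StatsD line format: <bucket>:<value>|ms."""
    return f"{bucket}:{millis}|ms".encode("ascii")


def send_timing(bucket, millis, host="127.0.0.1", port=8125):
    """Send one sample as a single UDP datagram; no reply is expected.

    host and port are assumptions; point them at your own StatsD daemon.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(format_timing(bucket, millis), (host, port))
    finally:
        sock.close()


if __name__ == "__main__":
    # Bucket name follows the get.<app>.<view>.<component> hierarchy.
    send_timing("get.user.profile_page.sql", 12)
```

Because UDP is fire-and-forget, a missing or slow metrics daemon does not block the request that is being measured, which is the main reason this transport is attractive for instrumentation.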

Page 27: Performance metrics for a social network

It’s Complicated

Page 28: Performance metrics for a social network

Tracks Everything

Page 29: Performance metrics for a social network

Setup

• Track using StatsD
– Support for PHP, Python, Ruby, Node, Java

• Hierarchy (Python example)
• get.<app>.<view>.<component>

with request.timings('get.user.profile_page.sql'):
    print('database query here')
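
The slides do not show how request.timings is implemented. As a rough sketch of the idea, a context manager can time the wrapped block and record the elapsed milliseconds under the hierarchical bucket name. The class RequestTimings and the helper bucket_name below are hypothetical stand-ins, and collecting samples in a list (rather than sending them to StatsD) is a simplification.

```python
import time
from contextlib import contextmanager


def bucket_name(method, app, view, component):
    """Build a name in the <method>.<app>.<view>.<component> hierarchy."""
    return ".".join([method, app, view, component])


class RequestTimings:
    """Hypothetical stand-in for the request object used on the slide."""

    def __init__(self):
        # (bucket, elapsed milliseconds) pairs; a real setup would
        # forward these to a StatsD client instead of keeping them.
        self.samples = []

    @contextmanager
    def timings(self, bucket):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the sample even if the wrapped block raises.
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            self.samples.append((bucket, elapsed_ms))


if __name__ == "__main__":
    request = RequestTimings()
    with request.timings(bucket_name("get", "user", "profile_page", "sql")):
        print("database query here")
    print(request.samples)
```

Recording in the finally clause means failed queries are still counted, so slow error paths show up in the graphs as well.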

Page 30: Performance metrics for a social network

Data tool/ Not a dashboard

• Wildcards

– get.<app>.<view>.*.upper_90

– get.<app>.*.redis.zadd.upper_90

– limit(sortByMaxima(get.<app>.<view>.*.upper_90),4)
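
To make the queries above concrete, here is a small sketch of what they compute, under stated assumptions. upper_percentile approximates how StatsD derives upper_90 (the largest sample after dropping the slowest 10%; exact rounding differs between StatsD versions). matches mimics a Graphite wildcard, where * stays within one dot-separated segment, and top_by_maxima mimics limit(sortByMaxima(...), n). All function names and sample numbers are made up for illustration; Graphite's real pattern syntax has more features than this.

```python
from fnmatch import fnmatchcase


def upper_percentile(samples, pct=90):
    """Largest value among the lowest pct percent of samples."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("no samples")
    keep = max(1, (pct * len(ordered) + 50) // 100)  # round half up
    return ordered[keep - 1]


def matches(pattern, path):
    """Segment-wise glob match: '*' never crosses a dot."""
    p_parts, m_parts = pattern.split("."), path.split(".")
    if len(p_parts) != len(m_parts):
        return False
    return all(fnmatchcase(m, p) for p, m in zip(p_parts, m_parts))


def top_by_maxima(series, pattern, n):
    """Names of the n matching series with the highest peak value."""
    selected = [name for name in series if matches(pattern, name)]
    selected.sort(key=lambda name: max(series[name]), reverse=True)
    return selected[:n]


if __name__ == "__main__":
    # Hypothetical data: three points per series, in milliseconds.
    series = {
        "get.user.profile_page.sql.upper_90": [12, 40, 18],
        "get.user.profile_page.redis.upper_90": [3, 5, 4],
        "get.user.profile_page.solr.upper_90": [25, 22, 30],
        "get.user.profile_page.sql.mean": [9, 20, 11],
    }
    print(top_by_maxima(series, "get.user.profile_page.*.upper_90", 2))
    print(upper_percentile(range(1, 11)))
```

Tracking the 90th percentile instead of the mean keeps a handful of extreme outliers from hiding or exaggerating what most users experience.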

Page 31: Performance metrics for a social network

/style/<user>/ performance

Memcached Slowdown

ZADD

Set Many

Page 32: Performance metrics for a social network

Including Functional parts of Pages

• The "More like this" part is tracked

• Solr & Redis Cache

Page 33: Performance metrics for a social network

What we Track

• Load time per bit of functionality
• Database calls per DB
• 90th percentile load times
• Task broker roundtrip times
• Facebook API calls

Page 34: Performance metrics for a social network

PgFouine

• Run on samples of all queries (say 5m)
• Not just slow queries
• Repeating a simple query many times is also wrong; PgFouine finds it
• See Instagram’s fabric snippet
• https://gist.github.com/2307647

Page 35: Performance metrics for a social network

PgFouine Continued

Queries that took up the most time (N)

• Spots issues with many small queries

Compare multiple reports

Normalized
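
Why normalization matters: once literal values are replaced by placeholders, thousands of cheap queries that differ only in their parameters collapse into one entry whose total time can be ranked against the slow queries. The sketch below illustrates the idea only; it is not PgFouine's actual algorithm, the regular expressions are deliberately naive, and the table names in the sample log are invented.

```python
import re
from collections import defaultdict

_STRING = re.compile(r"'(?:[^']|'')*'")    # quoted string literals
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")  # standalone numeric literals
_SPACE = re.compile(r"\s+")


def normalize(sql):
    """Replace literals with '?' and fold whitespace and case."""
    sql = _STRING.sub("?", sql)
    sql = _NUMBER.sub("?", sql)
    return _SPACE.sub(" ", sql).strip().lower()


def total_time_by_query(entries):
    """Group (sql, milliseconds) pairs by normalized query.

    Returns (query, count, total_ms) tuples, most expensive first.
    """
    totals = defaultdict(lambda: [0, 0.0])
    for sql, millis in entries:
        bucket = totals[normalize(sql)]
        bucket[0] += 1
        bucket[1] += millis
    rows = [(query, count, total) for query, (count, total) in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    # Hypothetical log sample: three fast lookups and one slower query.
    log = [
        ("SELECT * FROM love WHERE user_id = 1", 2.0),
        ("SELECT * FROM love WHERE user_id = 22", 2.0),
        ("select * from love  where user_id = 333", 2.0),
        ("SELECT * FROM entity WHERE title = 'red dress'", 5.0),
    ]
    for row in total_time_by_query(log):
        print(row)
```

In this sample the repeated 2 ms lookup ends up ahead of the single 5 ms query, which is exactly the kind of issue a slow-query log alone would never surface.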

Page 36: Performance metrics for a social network

PgFouine Tips

• My colleague wrote a fast C++ version
• github.com/WoLpH/pg_query_analyser

Also look at:
• pg_stat_statements
• pgBadger

Page 37: Performance metrics for a social network

Concluding

New Relic

• Dashboard, high-level insights

Graphite

• Stash all data, query it any way you want

• Tool, not a dashboard

PgFouine

• Understand what keeps your DB busy

Page 38: Performance metrics for a social network

Q&A

We’re Searching for Django Developers & Linux system administrators!

Fashiolista.com/jobs

Open source projects:

Github.com/tschellenbach

Try Django Facebook!