Performance metrics for a social network

Post on 28-May-2015

DESCRIPTION

Performance metrics for a social network. Presentation on Fashiolista's usage of New Relic, StatsD/Graphite and PgFouine to stay on top of load times. See the blog post at http://www.mellowmorning.com

TRANSCRIPT

Performance metrics for a social network

About Me

• Thierry Schellenbach
• Founder/CTO, Fashiolista
• Author of Django Facebook
• GitHub/tschellenbach
• Blog: mellowmorning.com
• @tschellenbach

Global Fashion Discovery

5.000.000+8.000.000+

Growth

2nd largest fashion community

• 1 mln members
• 17 mln loves/month
• 94 mln non-bot pageviews

Powered By

• Django/Python
• PostgreSQL
• Solr
• Redis
• Celery
• AWS/Ubuntu
• Nginx/Gunicorn/Supervisor

Sexy Metrics driven optimization

Hard Because

• All content is personalized

• Activity is clustered around a few users (>100k followers)

• Individual users are insanely active (7 hours in a day is normal)

• Social network, can’t easily shard data

Speed is a Feature

Metrics across the board

• Development
– Spot things early on, such as wrong usage of the ORM

• System Health
– Is my DB healthy? My Redis cluster?

• Page level
– Why is my page slow?
– What is the average speed of the components (DB, Redis, Solr, etc.)?

Tools we use

Development

• Debug toolbar
– Cache calls
– Graphite timings
– Queries and their explains
– Duplicate query detection

System Health

• CloudWatch
• Munin
• Nagios
• DB slow log
• Redis slow log
• Integration tests
• PgFouine
• New Relic
• Graphite

Page Level

Development

Duplicates

Cache Calls

StatsD

Today’s Presentation

New Relic

• Dashboard, High level insights

Graphite

• Understand what keeps your DB busy

PgFouine

• Stash all data, query it any way you want

• Tool, not a dashboard

New Relic

• Frontend -> App -> Components (DB, Solr, etc.)

• Breaks page performance down into its components

• Tracks deploys and compares before and after

Are you Supported?

• Ruby
• Java
• .NET
• PHP
• Python

• pip install newrelic
• Edit the .ini
• Add the WSGI middleware
• Wait for the magic

End user load times

• Drill down all the way to database calls
• The purple line is our app, the rest is frontend

App

Frontend (97%)

Global page loads

Page Level

• Average frontend performance per page
• Click to view the app-level breakdown

To App Level

Page. Not URL.

Drill down/ App overview

Memcached

DB Query

History

Database

• See which tables are under most load

• See which pages cause the load

• Development over time

Deploys

Deploys Part Two

Response Time Pre & Post

Memory Utilization

Background Task

Number of Task calls (sample)

Graphite Insights

• New Relic has the overview, Graphite the detail

• Open Source!

• Throw data at it via UDP

• Popularized by Etsy (see mellowmorning.com for link)
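To make "throw data at it via UDP" concrete, here is a minimal sketch of a StatsD timer sample sent over a plain socket. It assumes a StatsD daemon on localhost at its default port 8125; the helper names `format_timing` and `send_timing` are made up for illustration and are not from the talk.

```python
import socket


def format_timing(metric, millis):
    """Encode one timer sample in the StatsD line format: <name>:<value>|ms."""
    return f"{metric}:{millis}|ms".encode("ascii")


def send_timing(metric, millis, host="127.0.0.1", port=8125):
    """Fire-and-forget send: UDP needs no connection to the collector."""
    payload = format_timing(metric, millis)
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.sendto(payload, (host, port))
    finally:
        sock.close()
    return payload


if __name__ == "__main__":
    # Assumed host/port; adjust to wherever StatsD listens.
    send_timing("get.user.profile_page.sql", 12.5)
```

Because the packet is sent without waiting for a reply, a slow or missing metrics server should not slow down the page being measured.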

It’s Complicated

Tracks Everything

Setup

• Track using StatsD
– Client support for PHP, Python, Ruby, Node, Java

• Hierarchy (Python example)
– get.<app>.<view>.<component>

with request.timings('get.user.profile_page.sql'):
    print('database query here')
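The slides do not show how `request.timings` is built. As a rough sketch of the idea only, a timing context manager could look like the following; the `timings` function and its `sink` callback are hypothetical stand-ins, not Fashiolista's code.

```python
import time
from contextlib import contextmanager


@contextmanager
def timings(metric, sink):
    """Time the enclosed block and hand (metric, elapsed_ms) to `sink`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Report even when the block raises, so failures are timed too.
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        sink(metric, elapsed_ms)


recorded = []  # stand-in for a StatsD client

with timings("get.user.profile_page.sql", lambda m, ms: recorded.append((m, ms))):
    time.sleep(0.01)  # stand-in for the database query
```

In a real setup the sink would forward the sample to StatsD instead of appending to a list.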

Data tool/ Not a dashboard

• Wildcards

– get.<app>.<view>.*.upper_90

– get.<app>.*.redis.zadd.upper_90

– limit(sortByMaxima(get.<app>.<view>.*.upper_90),4)
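These wildcard targets can also be queried programmatically through Graphite's render API. The sketch below builds such a URL; the host `graphite.example.com` is a placeholder, and the `user`/`profile_page` values fill in `<app>` and `<view>` for illustration. Depending on the StatsD configuration, the real metric path may carry an extra prefix.

```python
from urllib.parse import urlencode


def render_url(base_url, target, since="-1h"):
    """Build a Graphite render API URL that returns the series as JSON."""
    query = urlencode({"target": target, "from": since, "format": "json"})
    return f"{base_url}/render?{query}"


# The four slowest components of one view, by 90th percentile timing.
target = "limit(sortByMaxima(get.user.profile_page.*.upper_90),4)"
url = render_url("http://graphite.example.com", target)
print(url)
```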

/style/<user>/ performance

Memcached Slowdown

ZADD

Set Many

Including Functional parts of Pages

• The “More like this” part is tracked

• Solr & Redis Cache

What we Track

• Load time per bit of functionality
• Database calls per DB
• 90th percentile load times
• Task broker roundtrip times
• Facebook API calls
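A small sketch of why the 90th percentile is worth tracking alongside the average. It uses the nearest-rank method, which is an assumption here and may differ in detail from how StatsD computes `upper_90`; the sample load times are invented.

```python
def upper_percentile(samples, pct=90):
    """Nearest-rank percentile: smallest value covering pct% of the samples."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceil(pct/100 * n)
    return ordered[rank - 1]


# Invented load times in milliseconds, with one extreme outlier.
load_times_ms = [120, 95, 110, 2400, 105, 98, 101, 130, 99, 115]

p90 = upper_percentile(load_times_ms)
mean = sum(load_times_ms) / len(load_times_ms)
print(p90, mean)
```

With these numbers the single 2400 ms request pulls the mean well above what most requests see, while the 90th percentile stays at 130 ms.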

PgFouine

• Run on samples of all queries (say 5m)
• Not just slow queries
• Repeating a simple query many times is also wrong; PgFouine finds it

• See Instagram’s Fabric snippet
• https://gist.github.com/2307647

PgFouine Continued

Queries that took up the most time (N)

• Spots issues with many small queries
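The idea behind ranking normalized queries by total time can be sketched in a few lines. This is an illustration of the technique only, not PgFouine's actual implementation; the regular expressions are deliberately simplistic and the table names and timings are invented.

```python
import re
from collections import defaultdict

_STRING = re.compile(r"'(?:[^']|'')*'")      # quoted string literals
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")   # standalone numeric literals
_SPACE = re.compile(r"\s+")


def normalize(sql):
    """Replace literals with '?' so queries differing only in values group together."""
    sql = _STRING.sub("?", sql)
    sql = _NUMBER.sub("?", sql)
    return _SPACE.sub(" ", sql).strip().lower()


def total_time_by_query(log):
    """Return (normalized query, [count, total_ms]) pairs, most expensive first."""
    totals = defaultdict(lambda: [0, 0.0])
    for sql, ms in log:
        entry = totals[normalize(sql)]
        entry[0] += 1
        entry[1] += ms
    return sorted(totals.items(), key=lambda kv: kv[1][1], reverse=True)


log = [
    ("SELECT * FROM auth_user WHERE id = 1", 0.8),
    ("SELECT * FROM auth_user WHERE id = 2", 0.7),
    ("SELECT * FROM auth_user WHERE id = 3", 0.9),
    ("SELECT count(*) FROM loves WHERE created > '2012-01-01'", 2.0),
]
report = total_time_by_query(log)
```

Here the three fast lookups together cost more than the one slower count query, which is the kind of pattern a slow-query log alone would miss.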

Compare multiple reports

Normalized

PgFouine Tips

• My colleague wrote a fast C++ version
• github.com/WoLpH/pg_query_analyser

Also look at:
• pg_stat_statements
• pgBadger

Concluding

New Relic

• Dashboard, High level insights

Graphite

• Understand what keeps your DB busy

PgFouine

• Stash all data, query it any way you want

• Tool, not a dashboard

Q&A

We’re Searching for Django Developers & Linux system administrators!

Fashiolista.com/jobs

Open source projects:

Github.com/tschellenbach

Try Django Facebook!
