challenges when building high profile editorial sites

BUILDING HIGH PROFILE EDITORIAL SITES

YANN MALET

2014.DJANGOCON.EU

MAY 2014

ABOUT THIS TALK

● It comes after

− Data Herding: How to Shepherd Your Flock Through Valleys of Darkness (2010)

− Breaking down the process of building a custom CMS (2010)

− Stop Tilting at Windmills - Spotting Bottlenecks (2011)

AGENDA

● Foreword

● Multi layer cache to protect your database

● Image management on responsive site

● Devops

HIGH PERFORMANCE

Django is web scale...

… AS ANYTHING ELSE ...

VARNISH CACHE

VARNISH

● Varnish Cache is a web application accelerator

− aka caching HTTP reverse proxy

− 10 – 1000 times faster

● This is hard stuff; don't try to reinvent this wheel

VARNISH: TIPS AND TRICKS

● Strip cookies

● Saint Mode

● Custom error page: better than Guru Meditation

STRIP COOKIES

● Increasing hit rate is all about reducing

− Vary: on parameters

● Accept-Language

● Cookie

STRIP COOKIES

    sub vcl_recv {
        # unless sessionid/csrftoken is in the request,
        # don't pass ANY cookies (referral_source, utm, etc)
        if (req.request == "GET" &&
            (req.url ~ "^/static" ||
             (req.http.cookie !~ "sessionid" && req.http.cookie !~ "csrftoken"))) {
            remove req.http.Cookie;
        }
        ...
    }

    sub vcl_fetch {
        # pass through for anything with a session/csrftoken set
        if (beresp.http.set-cookie ~ "sessionid" || beresp.http.set-cookie ~ "csrftoken") {
            return (pass);
        } else {
            return (deliver);
        }
        ...
    }
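
The vcl_recv condition above can be hard to reason about in one line. Below is a minimal Python sketch that mirrors it so the rule can be unit-tested outside Varnish. The function name `should_strip_cookies` is hypothetical, and treating a missing Cookie header as an empty string is an assumption of this sketch, not something the talk states.

```python
import re

# Cookie names taken from the VCL above; the rest is illustrative.
AUTH_COOKIE_RE = re.compile(r"sessionid|csrftoken")


def should_strip_cookies(method, url, cookie_header):
    """Return True when the request's Cookie header can be dropped."""
    if method != "GET":
        return False
    if url.startswith("/static"):
        # Static assets never need cookies, even for logged-in users.
        return True
    # Assumption: an absent header is treated like an empty string.
    return AUTH_COOKIE_RE.search(cookie_header or "") is None
```

Anonymous traffic carrying only tracking cookies then collapses onto a single cached object per URL, which is where the hit rate gain comes from.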

VARNISH: SAINT MODE

● Varnish Saint Mode lets you serve stale content from cache, even when your backend servers are unavailable.

− http://lincolnloop.com/blog/varnish-saint-mode/

VARNISH: SAINT MODE 1/2

    # /etc/varnish/default.vcl
    backend default {
        .host = "127.0.0.1";
        .port = "8000";
        .saintmode_threshold = 0;
        .probe = {
            .url = "/";
            .interval = 1s;
            .timeout = 1s;
            .window = 5;
            .threshold = 3;
        }
    }

    sub vcl_recv {
        if (req.backend.healthy) {
            set req.grace = 1h;
            set req.ttl = 5s;
        } else {
            # Accept serving stale object (extend TTL by 6h)
            set req.grace = 6h;
        }
    }

VARNISH: SAINT MODE 2/2

    sub vcl_fetch {
        # keep all objects for 6h beyond their TTL
        set beresp.grace = 6h;

        # If we fetch a 500, serve stale content instead
        if (beresp.status == 500 || beresp.status == 502 || beresp.status == 503) {
            set beresp.saintmode = 30s;
            return(restart);
        }
    }

VARNISH: SAINT MODE

.url: Format the default request with this URL.

.timeout: How fast the probe must finish, you must specify a time unit with the number, such as “0.1 s”, “1230 ms” or even “1 h”.

.interval: How long time to wait between polls, you must specify a time unit here also. Notice that this is not a ‘rate’ but an ‘interval’. The lowest poll rate is (.timeout + .interval).

.window: How many of the latest polls to consider when determining if the backend is healthy.

.threshold: How many of the .window last polls must be good for the backend to be declared healthy.
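
How .window and .threshold interact is easier to see in code. The sketch below is a simplified Python model of the rule described above (healthy when at least `threshold` of the last `window` polls were good). The class name `ProbeWindow` is hypothetical, and details of the real probe, such as its initial state, are deliberately left out.

```python
from collections import deque


class ProbeWindow:
    """Keep the last `window` poll results and derive a health flag."""

    def __init__(self, window=5, threshold=3):
        self.threshold = threshold
        # deque(maxlen=...) silently discards the oldest result when full.
        self.results = deque(maxlen=window)

    def record(self, ok):
        """Store the outcome of one poll (True for a good response)."""
        self.results.append(bool(ok))

    @property
    def healthy(self):
        """Healthy when enough of the remembered polls succeeded."""
        return sum(self.results) >= self.threshold
```

With the values from the backend definition (window 5, threshold 3), three failed polls in a row are enough to flip a fully healthy backend to sick.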

VARNISH: CUSTOM ERROR PAGE

    sub vcl_error {
        ...
        # Otherwise, return the custom error page
        set obj.http.Content-Type = "text/html; charset=utf-8";
        synthetic std.fileread("/var/www/example_com/varnish_error.html");
        return(deliver);
    }

● Use a nicely formatted error page instead of the default white Guru Meditation page

CACHING STRATEGY IN YOUR APP

INEVITABLE QUOTE

„THERE ARE ONLY TWO HARD THINGS IN COMPUTER SCIENCE: CACHE INVALIDATION AND NAMING THINGS, AND OFF-BY-ONE ERRORS.“

– PHIL KARLTON

CACHING STRATEGY

● Russian doll caching

● Randomize your cache invalidation for the HTML cache

● Cache buster URL for your HTML cache

● Cache database queries

● More resilient cache backend

RUSSIAN DOLL CACHING

● Nested cache with increasing TTL as you walk down

    {% cache MIDDLE_TTL "article_list" request.GET.page last_article.id last_article.last_modified %}
      {% include "includes/article/list_header.html" %}
      <div class="article-list">
        {% for article in article_list %}
          {% cache LONG_TTL "article_list_teaser_" article.id article.last_modified %}
            {% include "includes/article/article_teaser.html" %}
          {% endcache %}
        {% endfor %}
      </div>
    {% endcache %}
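
The nested tags above rely on key-based invalidation: the values listed after the fragment name become part of the cache key, so an edited article gets a new key rather than an explicit purge. A minimal Python sketch of that idea follows. The helper `fragment_key` and its key layout are hypothetical and are not the format Django uses internally.

```python
import hashlib


def fragment_key(name, *vary_on):
    """Build a cache key from a fragment name and the values it depends on."""
    joined = ":".join(str(value) for value in vary_on)
    digest = hashlib.md5(joined.encode("utf-8")).hexdigest()
    return "fragment.%s.%s" % (name, digest)


# Same article, same last_modified: same key, so the teaser is a cache hit.
unchanged = fragment_key("article_list_teaser_", 42, "2014-05-01T10:00")
# After an edit last_modified moves, so a new key is used and the old
# entry simply ages out with its TTL.
edited = fragment_key("article_list_teaser_", 42, "2014-05-02T09:30")
```

Only the edited teaser and the outer list are re-rendered; every other teaser in the list is still served from its own long-lived entry.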

RUSSIAN DOLL CACHING

It gets faster as traffic increases

RANDOMIZED CACHE TTL

● Do not invalidate all the `X_TTL` at the same time

− Modify cache templatetag: TTL +/- 20%

● Fork the {% cache … %} templatetag

    # inside the forked tag's render(); needs: from random import randint
    try:
        expire_time = int(expire_time)
        # randint() needs integer bounds
        expire_time = randint(int(expire_time * 0.8), int(expire_time * 1.2))
    except (ValueError, TypeError):
        raise TemplateSyntaxError(
            '"cache" tag got a non-integer timeout value: %r' % expire_time)
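
The +/- 20% jitter can also be written as a standalone function, which makes it easy to test before wiring it into a forked template tag. This is a sketch under assumptions: the name `jittered_ttl` and its `spread` and `rng` parameters are hypothetical and do not come from the talk.

```python
import random


def jittered_ttl(ttl, spread=0.2, rng=random):
    """Return an integer TTL picked at random within +/- `spread` of `ttl`."""
    ttl = int(ttl)
    # Round the bounds so randint() always receives integers.
    low = int(round(ttl * (1 - spread)))
    high = int(round(ttl * (1 + spread)))
    return rng.randint(low, high)
```

Fragments cached at the same moment with the same nominal TTL then expire spread over a window instead of all at once, which avoids a burst of re-rendering against the database.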

CENTRAL TTL DEFINITION

● Context processor to set TTL

− SHORT_TTL

− MIDDLE_TTL

− LONG_TTL

− FOR_EVER_TTL (* not really)
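
A context processor for these names could look roughly like the sketch below. The function name `cache_ttl` and every duration are assumptions chosen for illustration; the talk only gives the four names.

```python
def cache_ttl(request):
    """Expose shared TTL names (in seconds) to every template."""
    return {
        "SHORT_TTL": 60,                   # assumed: 1 minute
        "MIDDLE_TTL": 60 * 10,             # assumed: 10 minutes
        "LONG_TTL": 60 * 60,               # assumed: 1 hour
        "FOR_EVER_TTL": 60 * 60 * 24 * 7,  # assumed: 1 week, so not really forever
    }
```

Once the function is registered as a context processor in the project settings, templates can write `{% cache LONG_TTL ... %}` and the actual durations are tuned in one place.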

RESILIENT CACHE BACKEND

● Surviving node outages is not included

− Wrap the Django cache backend in try / except

− You might also want to report it in New Relic

● Fork Django cache backend
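
The try / except idea can be sketched without Django as a thin wrapper around any object with `get` and `set` methods. The class name `ResilientCache` is hypothetical, and the talk's actual fork of the Django backend may look quite different.

```python
import logging

logger = logging.getLogger(__name__)


class ResilientCache:
    """Wrap a cache backend so that backend errors degrade to cache misses."""

    def __init__(self, backend):
        self.backend = backend

    def get(self, key, default=None):
        try:
            return self.backend.get(key, default)
        except Exception:
            # A dead node should look like a miss, not a server error.
            logger.exception("cache get failed for %r", key)
            return default

    def set(self, key, value, timeout=None):
        try:
            self.backend.set(key, value, timeout)
            return True
        except Exception:
            logger.exception("cache set failed for %r", key)
            return False
```

The logging calls are the natural place to hook in error reporting, so an outage stays visible even though requests keep being served.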

CACHE BUSTER URL

● http://example.com/*/?PURGE_CACHE_HTML

● This URL

− traverses your stack

− purges the HTML cache fragment

− generates fresh one

● Fork the {% cache … %} templatetag
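
The purge flow can be sketched as a small function. Here a plain dict stands in for the cache backend (so TTLs are left out), and the helper name `cached_fragment` and its arguments are hypothetical; only the `PURGE_CACHE_HTML` parameter name comes from the slide.

```python
PURGE_PARAM = "PURGE_CACHE_HTML"  # query parameter name from the slide


def cached_fragment(cache, key, render, query_params):
    """Return the cached HTML for `key`, re-rendering when the purge flag is set."""
    if PURGE_PARAM not in query_params:
        html = cache.get(key)
        if html is not None:
            return html
    # Cache miss or explicit purge: render a fresh fragment and overwrite.
    html = render()
    cache[key] = html
    return html
```

Because the purge request overwrites the entry instead of deleting it, the next regular visitor already gets a warm cache.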

CACHING DB QUERIES

● Johnny cache

− It is a middleware, so there are surprising side effects if you change the DB outside the request / response cycle

    # johnny/cache.py
    def enable():
        """Enable johnny-cache, for use in scripts, management commands,
        async workers, or other code outside the Django request flow."""
        get_backend().patch()

MULTIPLE CACHE BACKENDS

    CACHES = {
        'default': {
            'BACKEND': 'myproject.apps.core.backends.cache.PyLibMCCache',
            'OPTIONS': cache_opts,
            'VERSION': 1,
        },
        'html': {
            'BACKEND': 'myproject.apps.core.backends.cache.PyLibMCCache',
            'TEMPLATETAG_CACHE': True,
            'VERSION': 1,
        },
        'session': {
            'BACKEND': 'myproject.apps.core.backends.cache.PyLibMCCache',
            'VERSION': 1,
            'OPTIONS': cache_opts,
        },
        'johnny': {
            'BACKEND': 'myproject.apps.core.backends.cache.JohnnyPyLibMCCache',
            'JOHNNY_CACHE': True,
            'VERSION': 1,
        },
    }

CACHED_DB SESSION

    SESSION_ENGINE = "django.contrib.sessions.backends.cached_db"
    SESSION_CACHE_ALIAS = "session"

RESPONSIVE DESIGN IMPACTS

● 3x more image sizes

− Desktop

− Tablet

− Mobile
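
With easy-thumbnails, the three sizes can be declared once as named aliases in the settings. The fragment below is a sketch: the alias names match the ones used in the talk's templates, but the pixel dimensions are assumptions picked for illustration.

```python
# settings.py (fragment): one alias per breakpoint.
# The "" key makes the aliases available project-wide.
THUMBNAIL_ALIASES = {
    "": {
        "small": {"size": (320, 240)},    # assumed mobile size
        "medium": {"size": (768, 576)},   # assumed tablet size
        "large": {"size": (1200, 900)},   # assumed desktop size
    },
}
```

Every uploaded image then needs three derived files, which is why thumbnail generation and storage latency matter so much on a responsive site.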

IMAGE MANAGEMENT

● Django-filer

● Easy-thumbnails

● Cloudfiles (cloud containers)

● Forget the assumption of a fast & reliable disk

− The software stack is not helping, a lot of work is left to you

● Forked

− Django-filer (fork)

− Easy-thumbnails (fork; very close to being able to drop it)

− Django-cumulus (81 Forks)

− Monkey patch pyrax

− ...

Heein!!!

DJANGO-CUMULUS

● The truth is much worse

− Log everything from the swiftclient

● Target 0 calls to the API and DB on a hot page

− The main repo is getting better ...

    'loggers': {
        ...
        'django.db': {
            'handlers': ['console'],
            'level': 'DEBUG',
            'propagate': True,
        },
        'swiftclient': {
            'handlers': ['console'],
            'level': 'DEBUG',
            'propagate': True,
        },

DJANGO-CUMULUS

● Django storage backend for Cloudfiles from Rackspace

− Be straight to the point when talking to a slow API

    diff --git a/cumulus/storage.py b/cumulus/storage.py
    @@ -201,6 +202,19 @@ class SwiftclientStorage(Storage):
    ...
    +    def save(self, name, content):
    +        """
    +        Don't check for an available name before saving, just overwrite.
    +        """
    +        # Get the proper name for the file, as it will actually be saved.
    +        if name is None:
    +            name = content.name
    +        name = self._save(name, content)
    +        # Store filenames with forward slashes, even on Windows
    +        return force_text(name.replace('\\', '/'))

DJANGO-CUMULUS

Trust unreliable API at scale

    diff --git a/cumulus/storage.py b/cumulus/storage.py
    @@ -150,8 +150,11 @@ class SwiftclientStorage(Storage):
         def _get_object(self, name):
             """
             Helper function to retrieve the requested Object.
             """
    -        if self.exists(name):
    +        try:
                 return self.container.get_object(name)
    +        except pyrax.exceptions.NoSuchObject as err:
    +            pass
    @@ -218,7 +221,7 @@ class SwiftclientStorage(Storage):
         def exists(self, name):
             """
             exists in the storage system, or False if the name is available for a new file.
             """
    -        return name in self.container.get_object_names()
    +        return bool(self._get_object(name))
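
The gain from this patch is fewer round trips: fetch the object directly and handle "not found", rather than listing every name first. The sketch below illustrates that with a hypothetical `FakeContainer` that counts calls. `KeyError` stands in for `pyrax.exceptions.NoSuchObject`, and none of these names are part of django-cumulus or pyrax.

```python
class FakeContainer:
    """Stand-in for a remote container that counts API round trips."""

    def __init__(self, objects):
        self.objects = dict(objects)
        self.api_calls = 0

    def get_object_names(self):
        self.api_calls += 1
        return list(self.objects)

    def get_object(self, name):
        self.api_calls += 1
        return self.objects[name]  # raises KeyError when the object is missing


def get_object_checked(container, name):
    """Before the patch: list every name, then fetch (two round trips)."""
    if name in container.get_object_names():
        return container.get_object(name)
    return None


def get_object_direct(container, name):
    """After the patch: fetch directly and map 'not found' to None (one round trip)."""
    try:
        return container.get_object(name)
    except KeyError:
        return None
```

On a container holding many thousands of images the listing call is also the expensive one, so the saving per lookup is larger than the call count alone suggests.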

PATCH PYRAX

● Assume the best

− Reduce the auth attempts

− Reduce the connection timeout

    def patch_things():
        # Automatically generate thumbnails for all aliases
        models.signals.post_save.connect(queue_thumbnail_generation)
        # Force the retries for pyrax to 1, to stop the request doubling
        pyrax.cf_wrapper.client.AUTH_ATTEMPTS = 1
        pyrax.cf_wrapper.client.CONNECTION_TIMEOUT = 2

GENERATE THE THUMBS

● Generate the thumbs as soon as possible

− post save signals that offload to a task

− easy-thumbnails

    def queue_thumbnail_generation(sender, instance, **kwargs):
        """
        Iterate over the sender's fields, and if there is a FileField instance
        (or a subclass like MultiStorageFileField) send the instance to a task
        to generate all the thumbnails defined in settings.THUMBNAIL_ALIASES.
        """
        …

PICTUREFILL.JS

… A Responsive Images approach that you can use today that mimics the proposed picture element using spans...

− Old API demonstrated 1.2.1

    <span data-picture data-alt="A giant stone face in Angkor Thom, Cambodia">
      <span data-src="small.jpg"></span>
      <span data-src="medium.jpg" data-media="(min-width: 400px)"></span>
      <span data-src="large.jpg" data-media="(min-width: 800px)"></span>
      <span data-src="extralarge.jpg" data-media="(min-width: 1000px)"></span>
      <noscript>
        <img src="small.jpg" alt="A giant stone face in Angkor Thom, Cambodia">
      </noscript>
    </span>

PUTTING IT ALL TOGETHER 1/2

    {% extends "base.html" %}
    {% load image_tags cache_tags pagination_tags %}
    {% block content %}
    {% cache MIDDLE_TTL "article_list_" category author tag request.GET.page all_pages %}
      <div class="article-list archive-list ">
        {% for article in object_list %}
          {% cache LONG_TTL "article_teaser_" article.id article.modified %}
            {% include "newsroom/includes/article_teaser.html" with columntype="categorylist" %}
          {% endcache %}
        {% endfor %}
      </div>
    {% endcache %}

● Iterate through article_list

● Nested cache

PUTTING IT ALL TOGETHER 2/2

    {% load image_tags %}
    <section class="blogArticleSection">
      {% if article.image %}
        <a href="{{ article.get_absolute_url }}" class="thumbnail">
          <span data-picture data-alt="{{ article.image.default_alt_text }}">
            <span data-src="{{ article.image|thumbnail_url:"large" }}"></span>
            <span data-src="{{ article.image|thumbnail_url:"medium" }}" data-media="(min-width: 480px)"></span>
            <span data-src="{{ article.image|thumbnail_url:"small" }}" data-media="(min-width: 600px)"></span>
            <noscript>
              <img src="{{ article.image|thumbnail_url:"small" }}" alt="{{ article.image.default_alt_text }}">
            </noscript>
          </span>
        </a>
      {% endif %}
      ...

Use Picturefill to render your images

DEVOPS

● Configuration management

● Single command deployment for all environments

● Settings parity

CONFIGURATION MANAGEMENT

● Pick one that fits your brain & skillset

− Puppet

− Chef

− Ansible

− Salt

● At Lincoln Loop we are using Salt

− One master per project

− Minion installed on all the cloud servers

SALT

● Provision & deploy a server role

● +X app servers to absorb a traffic spike

● Replace an unsupported OS

● Update a package

● Run a one-liner command

− Restart a service on all instances

● Varnish, memcached, ...

− Check the version

SINGLE COMMAND DEPLOYMENT

● One-liner or you will get it wrong

● Consistency for each role is critical

− Avoid endless debugging of pseudo-random issues

SETTINGS PARITY

● Is the utopia you want to tend toward, but …

− There are some differences

● Avoid logic in settings.py

● Fetch data from external sources: .env

SETTINGS.PY READS FROM .ENV

    import os
    import ConfigParser  # Python 2 module name

    from superproject.settings.base import *

    TEMPLATE_LOADERS = (
        ('django.template.loaders.cached.Loader', TEMPLATE_LOADERS),
    )

    config = ConfigParser.ConfigParser()
    config.read(os.path.abspath(VAR_ROOT + "/../.env"))

    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.mysql',
            'NAME': config.get("mysql", "mysql_name"),
            'USER': config.get("mysql", "mysql_user"),
            'PASSWORD': config.get("mysql", "mysql_password"),
            'HOST': config.get("mysql", "mysql_host"),
            'PORT': config.get("mysql", "mysql_port"),
        }
    }
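
The settings above imply an INI-style .env file with a [mysql] section. The sketch below shows what such a file might contain and how the same lookup reads on Python 3, where the module is named `configparser`. The sample values and the helper name `load_database_settings` are assumptions; only the section and key names come from the settings module.

```python
import configparser

# Hypothetical .env contents; section and key names follow the settings above.
SAMPLE_ENV = """
[mysql]
mysql_name = newsroom
mysql_user = app
mysql_password = secret
mysql_host = 127.0.0.1
mysql_port = 3306
"""


def load_database_settings(env_text):
    """Parse INI-style text and build a Django DATABASES dict from it."""
    config = configparser.ConfigParser()
    config.read_string(env_text)
    return {
        "default": {
            "ENGINE": "django.db.backends.mysql",
            "NAME": config.get("mysql", "mysql_name"),
            "USER": config.get("mysql", "mysql_user"),
            "PASSWORD": config.get("mysql", "mysql_password"),
            "HOST": config.get("mysql", "mysql_host"),
            "PORT": config.get("mysql", "mysql_port"),
        }
    }


DATABASES = load_database_settings(SAMPLE_ENV)
```

Keeping the file outside version control and letting configuration management write it per environment is what allows settings.py itself to stay identical everywhere.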

CONCLUSION

● Multi-layer Cache to protect your database

− Varnish

− Russian doll cache for the HTML fragments

● Smart key naming and invalidation condition

● Cache buster URL

● Image management

− Harder on high traffic responsive site

− Software stack not mature

● Devops

− Configuration management is a must

− Try to have settings parity between your environments

HIGH PERFORMANCE DJANGO

Kickstarter http://lloop.us/hpd

BACKUP SLIDES

A WORD ABOUT LEGACY MIGRATION

● This is often the hardest part to estimate

− Huge volume of data

− Often inconsistent

− Unknown implicit business logic

● At scale if something can go wrong it will

● It always takes longer

REUSING PUBLISHED APPLICATIONS

● Careful review before adding an external requirement

− Read the code

● Best practice

● Security audit

− Can operate at your targeted scale

− In line with the rest of your project

● It is not a binary choice; you can

− extract a very small part

− Write your own version based on what you learned
