challenges when building high profile editorial sites
DESCRIPTION
This talk is a walk through the challenges encountered when building high profile editorial sites. My goal is to present some of the common pitfalls we have encountered at Lincoln Loop and to explain how we solved them:
* Legacy migration always takes longer
* Devops
* Multiple environments
* Easy deployment
* Responsive design impacts the backend
* Journey of an image
* Picturefill.js
* Danger of reusing published Django applications
* Caching strategy
* HTML fragment
* Varnish
Audience
* Decision makers who are going to rebuild their magazine
* Developers bidding for this kind of project for the first time
TRANSCRIPT
BUILDING HIGH PROFILE EDITORIAL SITES
YANN MALET
2014.DJANGOCON.EU
MAY 2014
ABOUT THIS TALK
● It comes after
− Data Herding: How to Shepherd Your Flock Through Valleys of Darkness (2010)
− Breaking down the process of building a custom CMS (2010)
− Stop Tilting at Windmills - Spotting Bottlenecks (2011)
AGENDA
● Foreword
● Multi layer cache to protect your database
● Image management on responsive site
● Devops
HIGH PERFORMANCE
Django is web scale...
… AS ANYTHING ELSE ...
VARNISH CACHE
VARNISH
● Varnish Cache is a web application accelerator
− aka caching HTTP reverse proxy
− 10 – 1000 times faster
● This is hard stuff; don't try to reinvent this wheel
VARNISH: TIPS AND TRICKS
● Strip cookies
● Saint Mode
● Custom error page is better than Guru Meditation
STRIP COOKIES
● Increasing the hit rate is all about reducing
− the parameters the response varies on (Vary:)
● Accept-Language
● Cookie
STRIP COOKIES
sub vcl_recv {
    # unless sessionid/csrftoken is in the request,
    # don't pass ANY cookies (referral_source, utm, etc)
    if (req.request == "GET" &&
        (req.url ~ "^/static" ||
         (req.http.cookie !~ "sessionid" && req.http.cookie !~ "csrftoken"))) {
        remove req.http.Cookie;
    }
    ...
}

sub vcl_fetch {
    # pass through for anything with a session/csrftoken set
    if (beresp.http.set-cookie ~ "sessionid" || beresp.http.set-cookie ~ "csrftoken") {
        return (pass);
    } else {
        return (deliver);
    }
    ...
}
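The vcl_recv condition above is easier to reason about (and to unit test) when written out as a plain predicate. The following is only a Python sketch of the same decision, not something Varnish runs; the function name should_strip_cookies is hypothetical.

```python
import re

# Cookies that mark a request as personalised (same names as in the VCL).
SESSION_COOKIES = re.compile(r"sessionid|csrftoken")


def should_strip_cookies(method, path, cookie_header):
    """Return True when the Cookie header can be dropped before the cache lookup."""
    if method != "GET":
        return False
    # Static files never depend on cookies.
    if path.startswith("/static"):
        return True
    # Tracking cookies (utm, referral_source, ...) do not change the page.
    return not SESSION_COOKIES.search(cookie_header or "")


anonymous = should_strip_cookies("GET", "/news/", "utm_source=newsletter")
logged_in = should_strip_cookies("GET", "/news/", "sessionid=abc123")
static_file = should_strip_cookies("GET", "/static/app.css", "sessionid=abc123")
```

Only the logged-in page request keeps its cookies, so every other GET can share one cached copy.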
VARNISH: SAINT MODE
● Varnish Saint Mode lets you serve stale content from cache, even when your backend servers are unavailable.
− http://lincolnloop.com/blog/varnish-saint-mode/
VARNISH: SAINT MODE 1/2
# /etc/varnish/default.vcl
backend default {
    .host = "127.0.0.1";
    .port = "8000";
    .saintmode_threshold = 0;
    .probe = {
        .url = "/";
        .interval = 1s;
        .timeout = 1s;
        .window = 5;
        .threshold = 3;
    }
}

sub vcl_recv {
    if (req.backend.healthy) {
        set req.grace = 1h;
        set req.ttl = 5s;
    } else {
        # Accept serving stale object (extend TTL by 6h)
        set req.grace = 6h;
    }
}
VARNISH: SAINT MODE 2/2
sub vcl_fetch {
    # keep all objects for 6h beyond their TTL
    set beresp.grace = 6h;

    # If we fetch a 500, serve stale content instead
    if (beresp.status == 500 || beresp.status == 502 || beresp.status == 503) {
        set beresp.saintmode = 30s;
        return(restart);
    }
}
VARNISH: SAINT MODE
.url: Format the default request with this URL.
.timeout: How fast the probe must finish. You must specify a time unit with the number, such as “0.1 s”, “1230 ms” or even “1 h”.
.interval: How long to wait between polls. You must specify a time unit here also. Notice that this is not a ‘rate’ but an ‘interval’. The lowest poll rate is (.timeout + .interval).
.window: How many of the latest polls to consider when determining if the backend is healthy.
.threshold: How many of the .window last polls must be good for the backend to be declared healthy.
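To make the interplay of .window and .threshold concrete, here is a toy model in Python. It is a sketch of the rule as described above, not Varnish's implementation, and the ProbeWindow class is hypothetical.

```python
from collections import deque


class ProbeWindow:
    """Toy model of a health probe's sliding window (hypothetical helper)."""

    def __init__(self, window=5, threshold=3):
        self.threshold = threshold
        # Only the latest `window` poll results are remembered.
        self.results = deque(maxlen=window)

    def record(self, ok):
        """Store the outcome of one poll (True = good, False = bad)."""
        self.results.append(bool(ok))

    @property
    def healthy(self):
        """Healthy when at least `threshold` remembered polls were good."""
        return sum(self.results) >= self.threshold


probe = ProbeWindow(window=5, threshold=3)
for outcome in [True, True, True, False, False]:
    probe.record(outcome)
healthy_before = probe.healthy  # three good polls among the last five

for outcome in [False, False]:
    probe.record(outcome)
healthy_after = probe.healthy  # only one good poll is left in the window
```

With the 1s interval used in the backend definition, this means a dead backend is flagged within a few seconds rather than on the first failed poll.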
VARNISH: CUSTOM ERROR PAGE
sub vcl_error {
    ...
    # Otherwise, return the custom error page
    set obj.http.Content-Type = "text/html; charset=utf-8";
    synthetic std.fileread("/var/www/example_com/varnish_error.html");
    return(deliver);
}
● Use a nicely formatted error page instead of the default white Guru Meditation page
CACHING STRATEGY IN YOUR APP
INEVITABLE QUOTE
„THERE ARE ONLY TWO HARD THINGS IN
COMPUTER SCIENCE: CACHE INVALIDATION AND NAMING
THINGS, AND OFF-BY-ONE ERRORS.“
– PHIL KARLTON
CACHING STRATEGY
● Russian doll caching
● Randomize your cache invalidation for the HTML cache
● Cache buster URL for your HTML cache
● Cache database queries
● More resilient cache backend
RUSSIAN DOLL CACHING
● Nested cache with increasing TTL as you walk down
{% cache MIDDLE_TTL "article_list" request.GET.page last_article.id last_article.last_modified %}
  {% include "includes/article/list_header.html" %}
  <div class="article-list">
    {% for article in article_list %}
      {% cache LONG_TTL "article_list_teaser_" article.id article.last_modified %}
        {% include "includes/article/article_teaser.html" %}
      {% endcache %}
    {% endfor %}
  </div>
{% endcache %}
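Putting last_modified into the key is what spares you an explicit invalidation step. The sketch below illustrates the idea in plain Python, with a dict standing in for the cache; fragment_key and render_teaser are hypothetical names, and the key layout is an assumption rather than Django's exact format.

```python
import hashlib


def fragment_key(name, *vary_on):
    """Build a cache key from a fragment name and the values it varies on."""
    joined = ":".join(str(value) for value in vary_on)
    return "fragment.%s.%s" % (name, hashlib.md5(joined.encode("utf-8")).hexdigest())


cache = {}  # stand-in for a real cache backend


def render_teaser(article):
    """Return the cached teaser, rendering it only on a cache miss."""
    key = fragment_key("article_list_teaser_", article["id"], article["last_modified"])
    if key not in cache:
        cache[key] = "<li>%s</li>" % article["title"]
    return cache[key]


article = {"id": 1, "title": "Old title", "last_modified": "2014-05-01T10:00"}
first = render_teaser(article)

# Editing the article bumps last_modified, so a new key is used and the
# stale entry is never read again (it simply ages out with its TTL).
article.update(title="New title", last_modified="2014-05-02T09:30")
second = render_teaser(article)
```

The outer fragment works the same way: its key includes the newest article's id and last_modified, so it is rebuilt from the still-warm inner fragments.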
RUSSIAN DOLL CACHING
It gets faster as traffic increases
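from random import randint

# Inside the forked {% cache %} tag: jitter the TTL by +/- 20%
try:
    expire_time = int(expire_time)
    # randint() needs integer bounds, so truncate the scaled values
    expire_time = randint(int(expire_time * 0.8), int(expire_time * 1.2))
except (ValueError, TypeError):
    raise TemplateSyntaxError(
        '"cache" tag got a non-integer timeout value: %r' % expire_time)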
RANDOMIZED CACHE TTL
● Do not invalidate all the `X_TTL` at the same time
− Modify cache templatetag: TTL +/- 20%
● Fork the {% cache … %} templatetag
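As a standalone illustration of the +/- 20% rule, here is a small helper that can be tested outside of a template tag. It is a sketch only: jittered_ttl and its spread parameter are hypothetical names, not part of Django.

```python
import random


def jittered_ttl(ttl, spread=0.2, rng=random):
    """Return an integer TTL picked uniformly within +/- `spread` of `ttl`."""
    ttl = int(ttl)
    low = int(ttl * (1 - spread))
    high = int(ttl * (1 + spread))
    return rng.randint(low, high)


# Fragments cached at the same moment now expire at different times.
ttls = [jittered_ttl(300) for _ in range(5)]
```

Spreading the expiry times means a burst of fragments cached together after a deploy does not hit the database again all at once.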
CENTRAL TTL DEFINITION
● Context processor to set TTL
− SHORT_TTL
− MIDDLE_TTL
− LONG_TTL
− FOR_EVER_TTL (* not really)
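A Django context processor is just a callable that takes the request and returns a dict, so the central definition can be as small as the sketch below. The TTL values and the cache_ttl name are assumptions for illustration; the processor still has to be registered in the project's template settings.

```python
# Hypothetical values in seconds; tune them per project.
SHORT_TTL = 60
MIDDLE_TTL = 10 * 60
LONG_TTL = 60 * 60
FOR_EVER_TTL = 30 * 24 * 60 * 60  # "for ever" still expires eventually


def cache_ttl(request):
    """Context processor: expose the TTL names to every template."""
    return {
        "SHORT_TTL": SHORT_TTL,
        "MIDDLE_TTL": MIDDLE_TTL,
        "LONG_TTL": LONG_TTL,
        "FOR_EVER_TTL": FOR_EVER_TTL,
    }


context = cache_ttl(request=None)  # the request is unused in this sketch
```

Templates can then write {% cache MIDDLE_TTL ... %} and the actual numbers live in one place.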
RESILIENT CACHE BACKEND
● Surviving node outages is not included
− Wrap the Django cache backend in try / except
− You might also want to report it in New Relic
● Fork Django cache backend
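The try / except wrapping can be sketched without Django at all. In the example below, ResilientCache and DownBackend are hypothetical classes: the first treats any backend failure as a cache miss, the second simulates a dead node. A real fork would wrap the Django backend's methods in the same way and could report the error to a monitoring service in the except branch.

```python
import logging

logger = logging.getLogger(__name__)


class ResilientCache:
    """Wrap a cache client so backend failures behave like cache misses."""

    def __init__(self, backend):
        self.backend = backend

    def get(self, key, default=None):
        try:
            return self.backend.get(key, default)
        except Exception:
            # Log and carry on: the page is rendered as if nothing was cached.
            logger.exception("cache get failed for %r", key)
            return default

    def set(self, key, value, timeout=None):
        try:
            self.backend.set(key, value, timeout)
            return True
        except Exception:
            logger.exception("cache set failed for %r", key)
            return False


class DownBackend:
    """Fake backend that simulates an unreachable cache node."""

    def get(self, key, default=None):
        raise ConnectionError("node unreachable")

    def set(self, key, value, timeout=None):
        raise ConnectionError("node unreachable")


cache = ResilientCache(DownBackend())
value = cache.get("homepage", default="miss")
stored = cache.set("homepage", "<html>...</html>", timeout=60)
```

The request is slower while the node is down, but it no longer fails with a 500.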
CACHE BUSTER URL
● http://example.com/*/?PURGE_CACHE_HTML
● This URL
− traverses your stack
− purges the HTML cache fragment
− generates a fresh one
● Fork the {% cache … %} templatetag
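The behaviour of such a purge flag can be sketched in a few lines of Python. This is an illustration under assumptions, not the forked template tag: the dict stands in for the HTML cache, and wants_purge and cached_fragment are hypothetical helpers.

```python
from urllib.parse import parse_qs, urlsplit

PURGE_PARAM = "PURGE_CACHE_HTML"
fragment_cache = {}  # stand-in for the HTML fragment cache


def wants_purge(url):
    """True when the URL's query string carries the purge flag."""
    query = urlsplit(url).query
    return PURGE_PARAM in parse_qs(query, keep_blank_values=True)


def cached_fragment(key, render, url):
    """Return a fragment, re-rendering it on a miss or on a purge request."""
    if wants_purge(url) or key not in fragment_cache:
        fragment_cache[key] = render()
    return fragment_cache[key]


versions = iter(["v1", "v2"])
render = lambda: next(versions)

a = cached_fragment("teaser", render, "http://example.com/news/")
b = cached_fragment("teaser", render, "http://example.com/news/")
c = cached_fragment("teaser", render, "http://example.com/news/?PURGE_CACHE_HTML")
```

The second call is served from the cache; the third carries the flag, so the fragment is rendered again and the fresh copy replaces the old one.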
# johnny/cache.py
def enable():
    """Enable johnny-cache, for use in scripts, management commands,
    async workers, or other code outside the Django request flow."""
    get_backend().patch()
CACHING DB QUERIES
● Johnny cache
− It is a middleware, so there are surprising side effects
− when you change the DB outside the request / response cycle
MULTIPLE CACHE BACKENDS
CACHES = {
    'default': {
        'BACKEND': 'myproject.apps.core.backends.cache.PyLibMCCache',
        'OPTIONS': cache_opts,
        'VERSION': 1},
    'html': {
        'BACKEND': 'myproject.apps.core.backends.cache.PyLibMCCache',
        'TEMPLATETAG_CACHE': True,
        'VERSION': 1},
    'session': {
        'BACKEND': 'myproject.apps.core.backends.cache.PyLibMCCache',
        'VERSION': 1,
        'OPTIONS': cache_opts,},
    'johnny': {
        'BACKEND': 'myproject.apps.core.backends.cache.JohnnyPyLibMCCache',
        'JOHNNY_CACHE': True,
        'VERSION': 1}
}
CACHED_DB SESSION
SESSION_ENGINE = "django.contrib.sessions.backends.cached_db"
SESSION_CACHE_ALIAS = "session"
RESPONSIVE DESIGN IMPACTS
● 3x more image sizes
− Desktop
− Tablet
− Mobile
IMAGE MANAGEMENT
● Django-filer
● Easy-thumbnails
● Cloudfiles (cloud containers)
● Forget the assumption of a fast & reliable disk
− The software stack is not helping; a lot of work is left to you
● Forked
− Django-filer (fork)
− Easy-thumbnails (fork - very close to being able to drop it)
− Django-cumulus (81 Forks)
− Monkey patch pyrax
− ...
Heein!!!
DJANGO-CUMULUS
● The truth is much worse
− Log everything from the swiftclient
● Target 0 calls to the API and DB on a hot page
− The main repo is getting better ...
'loggers': {
    ...
    'django.db': {
        'handlers': ['console'],
        'level': 'DEBUG',
        'propagate': True,
    },
    'swiftclient': {
        'handlers': ['console'],
        'level': 'DEBUG',
        'propagate': True,
    },
DJANGO-CUMULUS
● Django storage backend for Cloudfiles from Rackspace
− Be straight to the point when talking to a slow API
diff --git a/cumulus/storage.py b/cumulus/storage.py
@@ -201,6 +202,19 @@ class SwiftclientStorage(Storage):
 ...
+    def save(self, name, content):
+        """
+        Don't check for an available name before saving, just overwrite.
+        """
+        # Get the proper name for the file, as it will actually be saved.
+        if name is None:
+            name = content.name
+        name = self._save(name, content)
+        # Store filenames with forward slashes, even on Windows
+        return force_text(name.replace('\\', '/'))
DJANGO-CUMULUS
Trust unreliable API at scale
diff --git a/cumulus/storage.py b/cumulus/storage.py
@@ -150,8 +150,11 @@ class SwiftclientStorage(Storage):
     def _get_object(self, name):
         """
         Helper function to retrieve the requested Object.
         """
-        if self.exists(name):
+        try:
             return self.container.get_object(name)
+        except pyrax.exceptions.NoSuchObject as err:
+            pass
@@ -218,7 +221,7 @@ class SwiftclientStorage(Storage):
     def exists(self, name):
         """
         exists in the storage system, or False if the name is available for a new file.
         """
-        return name in self.container.get_object_names()
+        return bool(self._get_object(name))
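The point of this patch is to replace "list every object name, then look" with a single direct fetch. The sketch below shows the same pattern against a fake container so the API calls can be counted; FakeContainer and the local NoSuchObject exception are hypothetical stand-ins, not the pyrax classes, and the check uses "is not None" where the patch uses bool().

```python
class NoSuchObject(Exception):
    """Stand-in for the storage client's 'not found' error."""


class FakeContainer:
    """Hypothetical container that counts how often the API is called."""

    def __init__(self, objects):
        self.objects = dict(objects)
        self.calls = 0

    def get_object(self, name):
        self.calls += 1
        try:
            return self.objects[name]
        except KeyError:
            raise NoSuchObject(name)


def get_object_or_none(container, name):
    """Ask for the object directly and treat 'not found' as None."""
    try:
        return container.get_object(name)
    except NoSuchObject:
        return None


def exists(container, name):
    """One API call per check, however many objects the container holds."""
    return get_object_or_none(container, name) is not None


container = FakeContainer({"covers/issue-42.jpg": b"..."})
found = exists(container, "covers/issue-42.jpg")
missing = exists(container, "covers/issue-43.jpg")
```

Each existence check costs exactly one request, which matters once the container holds many thousands of images.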
PATCH PYRAX
● Assume the best
− Reduce the auth attempts
− Reduce the connection timeout
def patch_things():
    # Automatically generate thumbnails for all aliases
    models.signals.post_save.connect(queue_thumbnail_generation)
    # Force the retries for pyrax to 1, to stop the request doubling
    pyrax.cf_wrapper.client.AUTH_ATTEMPTS = 1
    pyrax.cf_wrapper.client.CONNECTION_TIMEOUT = 2
GENERATE THE THUMBS
● Generate the thumbs as soon as possible
− post save signals that offload to a task
− easy-thumbnails
def queue_thumbnail_generation(sender, instance, **kwargs):
    """
    Iterate over the sender's fields, and if there is a FileField instance
    (or a subclass like MultiStorageFileField) send the instance to a task
    to generate all the thumbnails defined in settings.THUMBNAIL_ALIASES.
    """
    …
PICTUREFILL.JS
… A Responsive Images approach that you can use today that mimics the proposed picture element using spans...
− Old API demonstrated 1.2.1
<span data-picture data-alt="A giant stone face in Angkor Thom, Cambodia">
  <span data-src="small.jpg"></span>
  <span data-src="medium.jpg" data-media="(min-width: 400px)"></span>
  <span data-src="large.jpg" data-media="(min-width: 800px)"></span>
  <span data-src="extralarge.jpg" data-media="(min-width: 1000px)"></span>
  <!-- Fallback content for non-JS browsers. Same img src as the initial, unqualified source element. -->
  <noscript>
    <img src="small.jpg" alt="A giant stone face in Angkor Thom, Cambodia">
  </noscript>
</span>
PUTTING IT ALL TOGETHER 1/2
<!-- article_list.html -->
{% extends "base.html" %}
{% load image_tags cache_tags pagination_tags %}
{% block content %}
{% cache MIDDLE_TTL "article_list_" category author tag request.GET.page all_pages %}
  <div class="article-list archive-list ">
    {% for article in object_list %}
      {% cache LONG_TTL "article_teaser_" article.id article.modified %}
        {% include "newsroom/includes/article_teaser.html" with columntype="categorylist" %}
      {% endcache %}
    {% endfor %}
  </div>
{% endcache %}
● Iterate through article_list
● Nested cache
PUTTING IT ALL TOGETHER 2/2
<!-- article_teaser.html -->
{% load image_tags %}
<section class="blogArticleSection">
  {% if article.image %}
  <a href="{{ article.get_absolute_url }}" class="thumbnail">
    <span data-picture data-alt="{{ article.image.default_alt_text }}">
      <span data-src="{{ article.image|thumbnail_url:"large" }}"></span>
      <span data-src="{{ article.image|thumbnail_url:"medium" }}" data-media="(min-width: 480px)"></span>
      <span data-src="{{ article.image|thumbnail_url:"small" }}" data-media="(min-width: 600px)"></span>
      <noscript>
        <img src="{{ article.image|thumbnail_url:"small" }}" alt="{{ article.image.default_alt_text }}">
      </noscript>
    </span>
  </a>
  {% endif %}
  ...
Use Picturefill to render your images
DEVOPS
● Configuration management
● Single command deployment for all environments
● Settings parity
CONFIGURATION MANAGEMENT
● Pick one that fits your brain & skillset
− Puppet
− Chef
− Ansible
− Salt
● At Lincoln Loop we are using Salt
− One master per project
− Minion installed on all the cloud servers
SALT
● Provision & deploy a server role
● +X app servers to absorb a traffic spike
● Replace an unsupported OS
● Update a package
● Run a one-liner command
− Restart a service on all instances
● Varnish, memcached, ...
− Check the version
SINGLE COMMAND DEPLOYMENT
● One-liner or you will get it wrong
● Consistency for each role is critical
− Avoid endless debugging of pseudo-random issues
SETTINGS PARITY
● It is the utopia you want to tend toward, but …
− There are some differences
● Avoid logic in settings.py
● Fetch data from external sources: .env
SETTINGS.PY READS FROM .ENV
import os
import ConfigParser

from superproject.settings.base import *

TEMPLATE_LOADERS = (
    ('django.template.loaders.cached.Loader', TEMPLATE_LOADERS),
)

config = ConfigParser.ConfigParser()
config.read(os.path.abspath(VAR_ROOT + "/../.env"))

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': config.get("mysql", "mysql_name"),
        'USER': config.get("mysql", "mysql_user"),
        'PASSWORD': config.get("mysql", "mysql_password"),
        'HOST': config.get("mysql", "mysql_host"),
        'PORT': config.get("mysql", "mysql_port"),
    }
}
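The settings module above is Python 2 (the module is named configparser in Python 3). As a rough Python 3 sketch of the same idea, the example below parses an INI-style .env from a string so it can run anywhere; the sample values and the database_settings helper are hypothetical, and a real project would call config.read() on the file path instead.

```python
import configparser

# Hypothetical .env content; real credentials live outside version control.
ENV_SAMPLE = """
[mysql]
mysql_name = editorial
mysql_user = editor
mysql_password = not-a-real-password
mysql_host = 127.0.0.1
mysql_port = 3306
"""


def database_settings(config):
    """Build the DATABASES dict from the [mysql] section of the config."""
    return {
        "default": {
            "ENGINE": "django.db.backends.mysql",
            "NAME": config.get("mysql", "mysql_name"),
            "USER": config.get("mysql", "mysql_user"),
            "PASSWORD": config.get("mysql", "mysql_password"),
            "HOST": config.get("mysql", "mysql_host"),
            "PORT": config.get("mysql", "mysql_port"),
        }
    }


config = configparser.ConfigParser()
config.read_string(ENV_SAMPLE)
DATABASES = database_settings(config)
```

Because the settings file holds no per-environment logic, the only thing that differs between staging and production is the .env file that configuration management writes to each server.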
CONCLUSION
● Multi-layer cache to protect your database
− Varnish
− Russian doll cache for the HTML fragments
● Smart key naming and invalidation condition
● Cache buster URL
● Image management
− Harder on high traffic responsive site
− Software stack not mature
● Devops
− Configuration management is a must
− Try to have settings parity between your environments
HIGH PERFORMANCE DJANGO
Kickstarter http://lloop.us/hpd
BACKUP SLIDES
A WORD ABOUT LEGACY MIGRATION
● This is often the hardest part to estimate
− Huge volume of data
− Often inconsistent
− Unknown implicit business logic
● At scale if something can go wrong it will
● It always takes longer
REUSING PUBLISHED APPLICATIONS
● Careful review before adding an external requirement
− Read the code
● Best practice
● Security audit
− Can operate at your targeted scale
− In line with the rest of your project
● It is not a binary choice; you can
− extract a very small part
− write your own version based on what you learned