Varnish Cache Plus. Random notes for wise web developers

Carlos Abalde, Roberto Moreda {cabalde, moreda}@allenta.com October 2014

Agenda

1. Introduction

2. Varnish 101

3. Invalidations

4. HTTP headers

5. Content composition

6. VAC

7. VCS

8. Device detection

9. Varnish Plus 4.x

10. Q&A

1. Introduction

Disclaimer

๏ General understanding of ‘The Varnish Book’ is assumed

‣ This is not the official Varnish Cache training

‣ This is not a Varnish Cache internals course

‣ This is not a Varnish module development course

‣ This is a collection of random notes for web developers willing to make the most of Varnish Cache Plus

๏ OSS Varnish Cache vs. Varnish Cache Plus

‣ 3.x vs. 4.x

Varnish Cache 3.x

๏ The Varnish Book

‣ https://www.varnish-software.com/static/book/

๏ The Varnish Reference Manual

‣ https://www.varnish-cache.org/docs/.../index.html

๏ Default VCL

‣ https://www.varnish-cache.org/trac/.../default.vcl

What everybody should know

https://www.varnish-software.com/static/book/

https://www.varnish-cache.org/docs/3.0/reference/index.html

https://www.varnish-cache.org/trac/browser/bin/varnishd/default.vcl?rev=3.0

Varnish Cache Plus 3.x

๏ Support, advice & training

๏ Varnish Enhanced Cache Invalidation

‣ Hash Two, Hash Ninja…

๏ Varnish Administration Console (VAC)

๏ Varnish Custom Statistics (VCS)

๏ Device detection

Components I


๏ Varnish Tuner

๏ Enhanced HTTP streaming

๏ Packaged binary VMODs

๏ Varnish Paywall

๏ … and more to come shortly!

Components II


๏ 64 bits

๏ Distributions

‣ RedHat Enterprise Linux 5 & 6

‣ Ubuntu Linux 12.04 LTS (precise)

‣ Ubuntu Linux 14.04 LTS (trusty)

‣ Debian Linux 7 (wheezy)

Supported platforms

2. Varnish 101

Caching policy

๏ Varnish Cache Plus would require zero configuration in a perfect world with perfect HTTP citizens

‣ Correct HTTP caching headers

‣ Vary HTTP header used wisely

‣ HTTP cookies used conservatively

๏ By default Varnish Cache Plus will not cache anything marked as private, carrying a cookie or including a '*' Vary HTTP header
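The cookie and Vary rules above roughly correspond to the following excerpt. It is abridged from memory of the default VCL shipped with Varnish Cache 3.0 (see the default.vcl link earlier), so treat it as a sketch rather than a verbatim copy; the real file also handles unusual request methods and piping.

```vcl
sub vcl_recv {
    # Requests carrying credentials or cookies bypass the cache.
    if (req.http.Authorization || req.http.Cookie) {
        return (pass);
    }
    return (lookup);
}

sub vcl_fetch {
    # Responses that are uncacheable, set a cookie or vary on everything
    # are remembered for a while as hit-for-pass objects.
    if (beresp.ttl <= 0s ||
        beresp.http.Set-Cookie ||
        beresp.http.Vary == "*") {
        set beresp.ttl = 120s;
        return (hit_for_pass);
    }
    return (deliver);
}
```

This logic runs after any custom vcl_recv / vcl_fetch that does not return explicitly, so custom VCL usually only needs to handle the exceptions.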

VCL

๏ Varnish Configuration Language

‣ Domain specific state engine

‣ No loops, variables, functions…

‣ Command line configuration & Tunable parameters

๏ Translated to C code

๏ Loaded as a dynamically generated shared library

‣ Zero downtime & Blazingly fast

Overview

VCL

๏ Normalize client-input

๏ Pick a backend / director

๏ Re-write / extend client-input

๏ Decide caching policy based on client-input

๏ Access control

๏ Security barriers

vcl_recv I

VCL

vcl_recv II

sub vcl_recv {
    # Backend selection & URL normalization.
    if (req.http.host ~ "^blogs\.") {
        set req.backend = blogs;
        set req.http.host = regsub(req.http.host, "^blogs\.", "");
        set req.url = regsub(req.url, "^", "/blogs");
    } else {
        set req.backend = default;
    }

    # Poor man's device detection.
    if (req.http.User-Agent ~ "(iPad|iPhone|Android)") {
        set req.http.X-Device = "mobile";
    } else {
        set req.http.X-Device = "desktop";
    }
}

VCL

๏ Sanitize / extend backend response

๏ Override cache duration

‣ beresp.ttl

- s-maxage & max-age in Cache-Control HTTP header

- Expires HTTP header

- Default TTL

‣ Beware of the TTL of hitpass objects

vcl_fetch I

VCL

vcl_fetch II

sub vcl_fetch {
    # Override caching TTL.
    if (beresp.http.Cache-Control !~ "s-maxage") {
        set beresp.ttl = 0s;
        if (bereq.url ~ "\.jpg(\?|$)") {
            set beresp.ttl = 30s;
        }
    }

    # Never cache a Set-Cookie header.
    if (beresp.ttl > 0s) {
        unset beresp.http.Set-Cookie;
    }

    # Create ban-lurker friendly objects.
    set beresp.http.X-Url = bereq.url;
}

VCL

Request flow I

VCL

Request flow II

Process architecture

VMODs

๏ Shared libraries extending the VCL core

‣ std VMOD

- std.toupper(), std.log(), std.fileread()…

‣ ABI (Application Binary Interface) mismatches

๏ cookie, header, var, curl, digest, geoip, boltsort, memcached, redis, dns…

๏ https://www.varnish-cache.org/vmods
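As a minimal sketch of how a VMOD is used from VCL, the snippet below imports the bundled std VMOD and calls two of its functions. It uses std.tolower() (the counterpart of the std.toupper() mentioned above); the log key "normalized-host" is a made-up name for illustration.

```vcl
import std;

sub vcl_recv {
    # Lowercase the Host header so "Example.com" and "example.com"
    # end up sharing the same cached objects.
    set req.http.host = std.tolower(req.http.host);

    # Write a custom entry to the shared memory log.
    std.log("normalized-host:" + req.http.host);
}
```

Third-party VMODs are imported the same way, but they must be built against the exact Varnish version in use to avoid the ABI mismatches noted above.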

Backends

๏ Multiple backends

‣ Selected at request time based on any request property

๏ Probes

‣ Per-backend periodic health checks

- Interval, timeout, expected response…

๏ Directors

‣ Load balanced backend groups
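The three concepts above can be combined as in the following Varnish 3.x sketch. Backend names, addresses, the /health URL and all timing values are assumptions chosen for illustration.

```vcl
# Shared health check definition (hypothetical URL and timings).
probe healthcheck {
    .url = "/health";
    .interval = 5s;
    .timeout = 1s;
    .window = 5;
    .threshold = 3;
    .expected_response = 200;
}

backend web1 {
    .host = "192.168.55.11";
    .port = "8080";
    .probe = healthcheck;
}

backend web2 {
    .host = "192.168.55.12";
    .port = "8080";
    .probe = healthcheck;
}

# Load balanced group; sick backends are skipped automatically.
director pool round-robin {
    { .backend = web1; }
    { .backend = web2; }
}

sub vcl_recv {
    set req.backend = pool;
}
```

With .window = 5 and .threshold = 3, a backend is considered healthy while at least 3 of its last 5 probes succeeded.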

Error handling

๏ Some backend may be sick for a particular object

‣ Other objects from the same backend can still be accessed

- Unless more than a set number of objects are added to the saint mode blacklist for a specific backend

๏ Do not request the object from that backend again for a period of time

‣ Grace mode is used when all possible backends for the requested object have been blacklisted

๏ Complements backend probes

Saint mode
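A minimal Varnish 3.x sketch of saint mode follows. The 5xx trigger, the 20 second blacklist period, the threshold and the backend address are assumptions for illustration, not recommended values.

```vcl
backend default {
    .host = "127.0.0.1";
    .port = "8080";
    # Mark the whole backend as sick once this many objects are blacklisted.
    .saintmode_threshold = 10;
}

sub vcl_fetch {
    if (beresp.status >= 500) {
        # Do not ask this backend for this object during the next 20 seconds.
        set beresp.saintmode = 20s;
        # Retry: another backend or a graced object may be used instead.
        return (restart);
    }
    # Keep expired objects around so there is something to fall back to.
    set beresp.grace = 30m;
}
```

The number of retries is bounded by the max_restarts parameter, so a persistently failing object eventually yields an error instead of looping.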

Error handling

๏ A graced object is an object that has expired, but is still kept in cache

‣ beresp.ttl vs. beresp.grace

๏ Graced objects are used to

‣ Serve outdated content if the backend is down

- Probes or saint mode are required for this

‣ Serve slightly stale content while fresh versions are fetched

Grace mode
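The two uses of graced objects can be sketched in Varnish 3.x VCL as follows. The durations are illustrative assumptions.

```vcl
sub vcl_recv {
    if (req.backend.healthy) {
        # Backend is fine: accept objects up to 30 seconds past their TTL
        # while a fresh version is being fetched.
        set req.grace = 30s;
    } else {
        # Backend is sick: accept much older content rather than failing.
        set req.grace = 1h;
    }
}

sub vcl_fetch {
    # Upper bound: keep objects in cache for 1 hour beyond their TTL.
    set beresp.grace = 1h;
}
```

beresp.grace controls how long an expired object is kept, while req.grace controls how stale an object a given request is willing to accept.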

Beyond caching policy

๏ Why restrict VCL / VMODs to implementing the caching policy?

๏ Any logic modeled in VCL / VMODs is compiled, embedded & executed in the caching edge layer

‣ 1000x faster than typical Java / PHP apps

- Strong restrictions

‣ Accounting, paywalling, A/B testing…

varnishtest

๏ Powerful Varnish-specific testing tool

‣ Mocked clients & backends executing / processing HTTP requests against real Varnish Cache Plus instances

‣ http://www.clock.co.uk/...varnishtest

๏ Essential when implementing complex VCL logic

๏ Easily integrable in any CI infrastructure

http://www.clock.co.uk/blog/getting-started-with-varnishtest

FAQ

๏ When will SSL support be implemented?

‣ "[...] huge waste of time and effort to even think about it."

๏ When will SPDY support be implemented?

‣ "[...] Varnish is not speedy, Varnish is fast! [...]"

๏ What is the recommended value for this bizarre kernel / varnishd parameter I found in some random blog?

‣ Use Varnish Tuner + Fine tune based on necessity

‣ Pay attention to workspaces & syslog messages

https://www.varnish-cache.org/docs/trunk/phk/ssl.html

https://www.varnish-cache.org/docs/trunk/phk/spdy.html

3. Invalidations

Overview

๏ Updated objects may be available before TTL expiration

‣ Purges

‣ Forced misses

‣ Bans

‣ Hash Two / Hash Ninja / …

Purges

๏ VCL

๏ Eagerly discards an object along with all its variants

Overview

acl internal {
    "localhost";
    "192.168.55.0"/24;
}

sub vcl_recv {
    if (req.request == "PURGE") {
        if (client.ip !~ internal) {
            error 405 "Not allowed.";
        }
        return (lookup);
    }
}

sub vcl_hit {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}

sub vcl_miss {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}

Purges

๏ What if the new object cannot be fetched after the invalidation?

‣ Soft-purges VMOD

‣ Forced misses

๏ What if multiple objects need to be invalidated? What if objects need to be invalidated too frequently?

‣ Bans

‣ Hash Two

Downsides I

Purges

๏ How to invalidate hitpass objects?

‣ Not possible in Varnish Cache Plus 3.x

- Redesigned in Varnish Cache Plus 4.x

- https://www.varnish-cache.org/trac/.../1033

‣ return(pass); during vcl_recv is preferred when possible

Downsides II

https://www.varnish-cache.org/trac/ticket/1033

Forced misses

๏ VCL

๏ Forces a cache miss for the request

‣ Useful for cache priming scripts

Overview

sub vcl_recv {
    if (req.http.X-Priming-Script) {
        ...
        set req.hash_always_miss = true;
    }
    ...
}

Forced misses

๏ Object will always be (re)fetched from the backend

๏ New object is put into cache and used from that point onward

‣ Old object is not evicted until it’s safe to do so

‣ Controls who takes the penalty of waiting for an updated object

๏ Old objects are not freed up until expiration

‣ This is considered a flaw and a fix is expected

Behavior

Bans

๏ VCL or CLI

๏ Lazily discards multiple objects matching an expression

‣ Logical operators + Object attributes + Regular expressions

‣ Only works on objects already in the cache

๏ Ban lurker

‣ Frees up memory + Keeps the ban list at a manageable size

‣ obj.* based expressions

Overview

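Bans

Example

sub vcl_recv {
    if (req.request == "BAN") {
        ...
        if (!req.http.X-Ban-Url-Regexp) {
            error 400 "Empty URL regexp.";
        }
        # Ban-lurker friendly expression: only obj.* attributes are used.
        ban("obj.http.X-Url ~ " + req.http.X-Ban-Url-Regexp);
        # Answer here instead of forwarding the BAN request to the backend.
        error 200 "Ban added.";
    }
}

sub vcl_fetch {
    # Store the URL in the object so bans can match on obj.http.X-Url.
    set beresp.http.X-Url = req.url;
}

sub vcl_deliver {
    # Internal header: do not expose it to clients.
    unset resp.http.X-Url;
}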

Hash Two

๏ VCL + VMOD

๏ Works around the scalability limits of bans

Overview

HTTP/1.x 200 OK
Transfer-Encoding: chunked
...
X-Tags: C10 P42 P236 P857
...

ban obj.http.X-Tags ~ "(\s|^)P42(\s|$)"

Hash Two

Example

import hashtwo;

sub vcl_recv {
    if (req.request == "PURGE") {
        ...
        if (hashtwo.purge(req.http.X-Tag) != 0) {
            error 200 "Purged.";
        } else {
            error 404 "Not found.";
        }
    }
}

sub vcl_fetch {
    set beresp.http.X-HashTwo = beresp.http.X-Tags;
}

4. HTTP headers

Cache related headers

๏ Expires

๏ Cache-Control

๏ Last-Modified

๏ If-Modified-Since

๏ If-None-Match

๏ ETag

๏ Pragma

๏ Vary

๏ Age

Cache-Control

๏ Specifies directives that must be applied by all caching mechanisms (from Varnish Cache Plus to browser cache)

Overview

‣ public | private

‣ no-store

‣ no-cache

‣ max-age

‣ s-maxage

‣ must-revalidate

‣ no-transform

‣ …

Cache-Control

๏ Ignored in incoming client HTTP requests

๏ Only s-maxage & max-age used in backend HTTP responses to calculate default TTL

‣ Always overrides Expires header

‣ Beware of Age header in client responses

- Objects not cached client side

- https://www.varnish-cache.org/...Caching

beresp.ttl

https://www.varnish-cache.org/trac/wiki/VCLExampleLongerCaching
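The Age caveat above can be illustrated with a backend that sends Cache-Control: s-maxage=3600, max-age=60. Varnish keeps the object for an hour, but once it has been cached for more than 60 seconds the delivered Age header already exceeds max-age, so browsers treat the response as stale and never cache it. One common workaround, sketched below under that assumption, is to reset Age on delivery.

```vcl
sub vcl_deliver {
    # Without this, an object cached for 30 minutes is delivered with
    # "Age: 1800" and a client honoring max-age=60 considers it expired.
    set resp.http.Age = "0";
}
```

The trade-off is that downstream caches and clients no longer see the real age of the object, so each client may keep it for the full max-age on top of the time already spent in Varnish.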

Vary

๏ Indicates the response returned by the backend server may vary depending on headers received in the request

๏ Object variants & Hit ratio

‣ Vary: Accept-Encoding

- Normalization of Accept-Encoding header is not required

‣ Vary: User-Agent

5. Content composition

Overview

๏ Break objects into smaller fragments

‣ Separate cache policy for each fragment

‣ Increase hit ratio

๏ Tools

‣ Edge Side Includes (ESI)

‣ AJAX

- Beware of RTT & Cross domain policy

Edge Side Includes

๏ Subset of ESI Language Specification 1.0

‣ <esi:include src="<URL>" />

‣ <esi:remove>...</esi:remove>

‣ <!--esi ...-->

๏ set beresp.do_esi = true;

‣ Separate Varnish requests

๏ Testing ESI in dev environment

http://www.w3.org/TR/esi-lang
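A minimal Varnish 3.x sketch of enabling ESI processing with a separate policy per fragment. The /esi/ URL prefix and the 30 second TTL are hypothetical conventions chosen for illustration.

```vcl
sub vcl_fetch {
    # Only parse HTML responses for ESI tags; skip images, CSS, etc.
    if (beresp.http.Content-Type ~ "^text/html") {
        set beresp.do_esi = true;
    }

    # Each <esi:include> is a separate Varnish request, so fragments can
    # get their own TTL (hypothetical /esi/ prefix).
    if (req.url ~ "^/esi/") {
        set beresp.ttl = 30s;
    }
}
```

The page skeleton can then be cached for a long time while only the short-lived fragments are refetched from the backend.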

6. VAC

Overview

๏ Central control of Varnish Cache Plus servers

‣ Web UI + RESTful API

- Super Fast Purger

๏ Cache group management

‣ Real time statistics, VCL editor, ban submission…

๏ Varnish Agent 2

Super Fast Purger

๏ High performance intermediary distributing invalidation requests to groups of Varnish Cache Plus servers

‣ Leverages speed & flexibility of VCL

‣ Keep-alive workaround

๏ Part of the VAC RESTful API

‣ Trivially integrable in existing applications

Change management

๏ Easily integrable using the VAC RESTful API

‣ git, Mercurial… hooks

‣ Jenkins, Travis, GitLab… CI scripts

๏ Manual VCL bundle generation

๏ Orchestrated / programmed deployments, rollbacks, etc.

7. VCS

Overview

๏ Real-time aggregated statistics

‣ Multiple vstatdprobe daemons

‣ One vstatd daemon

‣ JSON + Time series API

๏ VSM log based

‣ Efficient circular in-memory data structure

‣ std.log("vcs-key:" + <key suffix>);

Some ideas

๏ Trending articles or sale products

๏ Cache hits and cache misses

๏ URLs with long load times

๏ URLs with the most 5xx response codes

๏ Where traffic is coming from

๏ …

Example

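import std;

sub vcl_deliver {
    # One key per host, one per URL and a global counter.
    std.log("vcs-key:" + req.http.host);
    std.log("vcs-key:" + req.http.host + req.url);
    std.log("vcs-key:TOTAL");
    # Objects delivered for the first time count as misses.
    if (obj.hits == 0) {
        std.log("vcs-key:MISS");
    }
}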

API I

๏ Stats (#requests, #misses, avg ttfb, acc body bytes, #2xx, #3xx…) for key named "example.com" during the last time windows

‣ GET /key/example.com

๏ Keys that produced the most 5xx responses during the last time window

‣ GET /all/top_5xx

๏ Top 5 requested keys during the last time window

‣ GET /all/top/5?verbose=1

API II

๏ Top 10 most requested keys ending with '.gif' during the last time window

‣ GET /match/(.*)%5C.gif$/top

๏ Top 50 slowest backend requests aggregating the last 20 time windows

‣ GET /all/top_ttfb/50?b=20

8. Device detection

Overview

๏ VMOD

๏ DeviceAtlas

‣ https://deviceatlas.com

‣ Locally deployed database, updated daily

๏ OSS alternatives

‣ https://github.com/serbanghita/Mobile-Detect

‣ …

Example

import deviceatlas;

sub vcl_recv {
    if (deviceatlas.lookup(req.http.User-Agent, "isMobilePhone") == "1") {
        set req.http.X-Device = "mobile";
    } elsif (deviceatlas.lookup(req.http.User-Agent, "isTablet") == "1") {
        set req.http.X-Device = "tablet";
    } else {
        set req.http.X-Device = "desktop";
    }
}

Some ideas

๏ Redirections based on device properties

๏ Backend selection based on device properties

๏ Normalization of the UA header

‣ Caching different versions (i.e. Vary header) of the same object based on normalized UAs

๏ …
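The normalized UA idea can be sketched as follows in Varnish 3.x VCL, assuming vcl_recv has already set the X-Device request header (as in the device detection example) and that the backend renders a different page per X-Device value.

```vcl
sub vcl_fetch {
    # Cache one variant per device class instead of one per User-Agent.
    if (req.http.X-Device) {
        if (!beresp.http.Vary) {
            set beresp.http.Vary = "X-Device";
        } elsif (beresp.http.Vary !~ "X-Device") {
            set beresp.http.Vary = beresp.http.Vary + ", X-Device";
        }
    }
}

sub vcl_deliver {
    # Clients never send X-Device, so tell downstream caches that the
    # response actually depends on User-Agent.
    if (resp.http.Vary) {
        set resp.http.Vary = regsub(resp.http.Vary, "X-Device", "User-Agent");
    }
}
```

With three device classes this yields at most three variants per URL, which keeps the hit ratio far higher than Vary: User-Agent would.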

9. Varnish Plus 4.x

Highlights

๏ Client / backend thread split

‣ Background content refreshing

๏ Redesigned purges

‣ return(purge); during vcl_recv

๏ Directors implemented as VMODs

‣ Consistent hashing director

๏ Distinction between error & synthetic responses
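For comparison with the 3.x purge example, here is a sketch of the redesigned purge in VCL 4.0. The backend address is an assumption; the ACL mirrors the one used earlier.

```vcl
vcl 4.0;

backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

acl internal {
    "localhost";
    "192.168.55.0"/24;
}

sub vcl_recv {
    # req.request was renamed to req.method in 4.x.
    if (req.method == "PURGE") {
        if (client.ip !~ internal) {
            # Synthetic responses replace the 3.x "error" statement.
            return (synth(405, "Not allowed."));
        }
        # No need to handle vcl_hit and vcl_miss separately anymore.
        return (purge);
    }
}
```

After the purge, the built-in vcl_purge subroutine answers the request with a synthetic 200 response.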

10. Q&A
