Varnish Cache Plus. Random notes for wise web developers
Collection of random notes for web developers willing to make the most of Varnish Cache Plus.
Carlos Abalde, Roberto Moreda {cabalde, moreda}@allenta.com October 2014
Agenda
1. Introduction
2. Varnish 101
3. Invalidations
4. HTTP headers
5. Content composition
6. VAC
7. VCS
8. Device detection
9. Varnish Plus 4.x
10. Q&A
Disclaimer
๏ General understanding of ‘The Varnish Book’ is assumed
‣ This is not the official Varnish Cache training
‣ This is not a Varnish Cache internals course
‣ This is not a Varnish module development course
‣ This is a collection of random notes for web developers willing to make the most of Varnish Cache Plus
๏ OSS Varnish Cache vs. Varnish Cache Plus
‣ 3.x vs. 4.x
Varnish Cache 3.x
๏ The Varnish Book
‣ https://www.varnish-software.com/static/book/
๏ The Varnish Reference Manual
‣ https://www.varnish-cache.org/docs/.../index.html
๏ Default VCL
‣ https://www.varnish-cache.org/trac/.../default.vcl
What everybody should know
Varnish Cache Plus 3.x
๏ Support, advice & training
๏ Varnish Enhanced Cache Invalidation
‣ Hash Two, Hash Ninja…
๏ Varnish Administration Console (VAC)
๏ Varnish Custom Statistics (VCS)
๏ Device detection
Components I
Varnish Cache Plus 3.x
๏ Varnish Tuner
๏ Enhanced HTTP streaming
๏ Packaged binary VMODs
๏ Varnish Paywall
๏ … and more to come shortly!
Components II
Varnish Cache Plus 3.x
๏ 64 bits
๏ Distributions
‣ RedHat Enterprise Linux 5 & 6
‣ Ubuntu Linux 12.04 LTS (precise)
‣ Ubuntu Linux 14.04 LTS (trusty)
‣ Debian Linux 7 (wheezy)
Supported platforms
Caching policy
๏ Varnish Cache Plus would require zero configuration in a perfect world with perfect HTTP citizens
‣ Correct HTTP caching headers
‣ Vary HTTP header used wisely
‣ HTTP cookies used conservatively
๏ By default Varnish Cache Plus will not cache anything marked as private, carrying a cookie or including a '*' Vary HTTP header
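As a minimal sketch of how the cookie rule above is commonly relaxed, the following VCL 3.x fragment drops cookies from requests for static assets so that the default policy can cache them. The list of file extensions is an assumption chosen for illustration, and the approach only makes sense when the backend does not personalize those responses.

```vcl
sub vcl_recv {
    # Assumption: these extensions identify static, non-personalized assets.
    if (req.url ~ "\.(css|js|png|gif|jpg|ico)(\?|$)") {
        # Without a Cookie header the default vcl_recv logic can do a cache lookup.
        unset req.http.Cookie;
    }
}
```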
VCL
๏ Varnish Configuration Language
‣ Domain specific state engine
‣ No loops, variables, functions…
‣ Command line configuration & Tunable parameters
๏ Translated to C code
๏ Loaded as a dynamically generated shared library
‣ Zero downtime & Blazingly fast
Overview
VCL
๏ Normalize client-input
๏ Pick a backend / director
๏ Re-write / extend client-input
๏ Decide caching policy based on client-input
๏ Access control
๏ Security barriers
vcl_recv I
VCL
vcl_recv II
sub vcl_recv {
    # Backend selection & URL normalization.
    if (req.http.host ~ "^blogs\.") {
        set req.backend = blogs;
        set req.http.host = regsub(req.http.host, "^blogs\.", "");
        set req.url = regsub(req.url, "^", "/blogs");
    } else {
        set req.backend = default;
    }

    # Poor man's device detection.
    if (req.http.User-Agent ~ "(iPad|iPhone|Android)") {
        set req.http.X-Device = "mobile";
    } else {
        set req.http.X-Device = "desktop";
    }
}
VCL
๏ Sanitize / extend backend response
๏ Override cache duration
‣ beresp.ttl
- s-maxage & max-age in Cache-Control HTTP header
- Expires HTTP header
- Default TTL
‣ Beware of the TTL of hitpass objects
vcl_fetch I
VCL
vcl_fetch II
sub vcl_fetch {
    # Override caching TTL.
    if (beresp.http.Cache-Control !~ "s-maxage") {
        # Durations need a unit in VCL, hence 0s rather than 0.
        set beresp.ttl = 0s;
        if (bereq.url ~ "\.jpg(\?|$)") {
            set beresp.ttl = 30s;
        }
    }

    # Never cache a Set-Cookie header.
    if (beresp.ttl > 0s) {
        unset beresp.http.Set-Cookie;
    }

    # Create ban-lurker friendly objects.
    set beresp.http.X-Url = bereq.url;
}
VMODs
๏ Shared libraries extending the VCL core
‣ std VMOD
- std.toupper(), std.log(), std.fileread()…
‣ ABI (Application Binary Interface) mismatches
๏ cookie, header, var, curl, digest, geoip, boltsort, memcached, redis, dns…
๏ https://www.varnish-cache.org/vmods
Backends
๏ Multiple backends
‣ Selected at request time based on any request property
๏ Probes
‣ Per-backend periodic health checks
- Interval, timeout, expected response…
๏ Directors
‣ Load balanced backend groups
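The three pieces above (backends, probes and directors) fit together roughly as in the following VCL 3.x sketch. Backend names, addresses, the probe URL and all timing values are assumptions chosen for illustration.

```vcl
# Health check shared by both backends (values are illustrative).
probe healthcheck {
    .url = "/";
    .interval = 5s;
    .timeout = 1s;
    .window = 5;
    .threshold = 3;
    .expected_response = 200;
}

backend www1 {
    .host = "192.168.55.10";
    .port = "8080";
    .probe = healthcheck;
}

backend www2 {
    .host = "192.168.55.11";
    .port = "8080";
    .probe = healthcheck;
}

# Load balanced group; sick backends are skipped.
director www round-robin {
    { .backend = www1; }
    { .backend = www2; }
}

sub vcl_recv {
    # Selected at request time; any request property could drive this choice.
    set req.backend = www;
}
```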
Error handling
๏ Some backend may be sick for a particular object
‣ Other objects from the same backend can still be accessed
- Unless more than a set number of objects are added to the saint mode blacklist for a specific backend
๏ Do not request the object from that backend again for a period of time
‣ Grace mode is used when all possible backends for the requested object have been blacklisted
๏ Complement backend probes
Saint mode
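A minimal VCL 3.x sketch of saint mode, assuming that any 5xx response should blacklist the object on that backend. The 10 second blacklist period is an illustrative value.

```vcl
sub vcl_fetch {
    if (beresp.status >= 500) {
        # Do not ask this backend for this object during the next 10 seconds.
        set beresp.saintmode = 10s;
        # Retry the request; another backend or a graced object may be used.
        return (restart);
    }
}
```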
Error handling
๏ A graced object is an object that has expired, but is still kept in cache
‣ beresp.ttl vs. beresp.grace
๏ Graced objects are used to
‣ Serve outdated content if the backend is down
- Probes or saint mode are required for this
‣ Serve slightly stale content while fresh versions are fetched
Grace mode
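Both uses of graced objects can be sketched in VCL 3.x as follows. The durations are assumptions chosen for illustration, and the health check relies on probes being configured for the selected backend.

```vcl
sub vcl_recv {
    if (req.backend.healthy) {
        # Accept slightly stale content while a fresh version is fetched.
        set req.grace = 30s;
    } else {
        # Backend is down: accept much older content.
        set req.grace = 1h;
    }
}

sub vcl_fetch {
    # Keep expired objects in cache for up to one hour beyond their TTL.
    set beresp.grace = 1h;
}
```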
Beyond caching policy
๏ Why restrict VCL / VMODs to implementing the caching policy?
๏ Any logic modeled in VCL / VMODs is compiled, embedded & executed in the caching edge layer
‣ 1000x faster than typical Java / PHP apps
- Strong restrictions
‣ Accounting, paywalling, A/B testing…
varnishtest
๏ Powerful Varnish-specific testing tool
‣ Mocked clients & backends executing / processing HTTP requests against real Varnish Cache Plus instances
‣ http://www.clock.co.uk/...varnishtest
๏ Essential when implementing complex VCL logic
๏ Easily integrable in any CI infrastructure
FAQ
๏ When will SSL support be implemented?
‣ "[...] huge waste of time and effort to even think about it."
๏ When will SPDY support be implemented?
‣ "[...] Varnish is not speedy, Varnish is fast! [...]"
๏ What is the recommended value for this bizarre kernel / varnishd parameter I found in some random blog?
‣ Use Varnish Tuner + Fine tune based on necessity
‣ Pay attention to workspaces & syslog messages
Overview
๏ Updated objects may be available before TTL expiration
‣ Purges
‣ Forced misses
‣ Bans
‣ Hash Two / Hash Ninja / …
Purges
๏ VCL
๏ Eagerly discards an object along with all its variants
Overview
acl internal {
    "localhost";
    "192.168.55.0"/24;
}
sub vcl_recv {
    if (req.request == "PURGE") {
        if (client.ip !~ internal) {
            error 405 "Not allowed.";
        }
        return (lookup);
    }
}
sub vcl_hit {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}
sub vcl_miss {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}
Purges
๏ What if the new object cannot be fetched after the invalidation?
‣ Soft purges (VMOD)
‣ Forced misses
๏ What if multiple objects need to be invalidated? What if objects need to be invalidated too frequently?
‣ Bans
‣ Hash Two
Downsides I
Purges
๏ How to invalidate hitpass objects?
‣ Not possible in Varnish Cache Plus 3.x
- Redesigned in Varnish Cache Plus 4.x
- https://www.varnish-cache.org/trac/.../1033
‣ return(pass); during vcl_recv is preferred when possible
Downsides II
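A sketch of the preferred approach in VCL 3.x: when uncacheable requests can be recognized up front, passing them in vcl_recv avoids creating hitpass objects at all. The URL pattern is a hypothetical example.

```vcl
sub vcl_recv {
    # Assumption: these paths are always personalized and never cacheable.
    if (req.url ~ "^/(account|checkout)(/|$)") {
        return (pass);
    }
}
```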
Forced misses
๏ VCL
๏ Forces a cache miss for the request
‣ Useful for cache priming scripts
Overview
sub vcl_recv {
    if (req.http.X-Priming-Script) {
        ...
        set req.hash_always_miss = true;
    }
    ...
}
Forced misses
๏ Object will always be (re)fetched from the backend
๏ New object is put into cache and used from that point onward
‣ Old object is not evicted until it’s safe to do so
‣ Controls who takes the penalty of waiting for an updated object
๏ Old objects are not freed up until expiration
‣ This is considered a flaw and a fix is expected
Behavior
Bans
๏ VCL or CLI
๏ Lazily discards multiple objects matching an expression
‣ Logical operators + Object attributes + Regular expressions
‣ Only works on objects already in the cache
๏ Ban lurker
‣ Frees up memory + Keeps the ban list at a manageable size
‣ obj.* based expressions
Overview
Bans
Example
sub vcl_recv {
    if (req.request == "BAN") {
        ...
        if (!req.http.X-Ban-Url-Regexp) {
            error 400 "Empty URL regexp.";
        }
        ban("obj.http.X-Url ~ " + req.http.X-Ban-Url-Regexp);
        # Answer the BAN request here instead of letting it reach the backend.
        error 200 "Ban added.";
    }
}
sub vcl_fetch {
    set beresp.http.X-Url = req.url;
}
sub vcl_deliver {
    unset resp.http.X-Url;
}
Hash Two
๏ VCL + VMOD
๏ Works around the scalability limits of bans
Overview
HTTP/1.x 200 OK
Transfer-Encoding: chunked
...
X-Tags: C10 P42 P236 P857
...
ban obj.http.X-Tags ~ "(\s|^)P42(\s|$)"
Hash Two
Example
import hashtwo;
sub vcl_recv {
    if (req.request == "PURGE") {
        ...
        if (hashtwo.purge(req.http.X-Tag) != 0) {
            error 200 "Purged.";
        } else {
            error 404 "Not found.";
        }
    }
}
sub vcl_fetch {
    set beresp.http.X-HashTwo = beresp.http.X-Tags;
}
Cache related headers
๏ Expires
๏ Cache-Control
๏ Last-Modified
๏ If-Modified-Since
๏ If-None-Match
๏ ETag
๏ Pragma
๏ Vary
๏ Age
Cache-Control
๏ Specifies directives that must be applied by all caching mechanisms (from Varnish Cache Plus to browser cache)
Overview
‣ public | private
‣ no-store
‣ no-cache
‣ max-age
‣ s-maxage
‣ must-revalidate
‣ no-transform
‣ …
Cache-Control
๏ Ignored in incoming client HTTP requests
๏ Only s-maxage & max-age are used in backend HTTP responses to calculate the default TTL
‣ Always overrides Expires header
‣ Beware of Age header in client responses
- Objects not cached client side
- https://www.varnish-cache.org/...Caching
beresp.ttl
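The Age caveat shows up when a backend sends a long s-maxage for Varnish and a short max-age for browsers: once the object has been cached for longer than max-age, the Age header makes browsers treat the response as already stale. One possible workaround, sketched here in VCL 3.x under the assumption that no downstream shared cache depends on Age, is to remove the header before delivery.

```vcl
sub vcl_deliver {
    # Assumption: only browsers sit downstream, so hiding the time spent
    # in Varnish is acceptable. Clients then count max-age from delivery.
    unset resp.http.Age;
}
```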
Vary
๏ Indicates the response returned by the backend server may vary depending on headers received in the request
๏ Object variants & Hit ratio
‣ Vary: Accept-Encoding
- Normalization of Accept-Encoding header is not required
‣ Vary: User-Agent
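To limit the impact of Vary: User-Agent on the hit ratio, one option is to vary on a normalized header instead. The VCL 3.x sketch below assumes a vcl_recv that has already set req.http.X-Device to a small set of values (for example "mobile" or "desktop"), and that the backend output really depends only on that device class.

```vcl
sub vcl_fetch {
    # Store one variant per device class instead of one per User-Agent string.
    if (beresp.http.Vary ~ "User-Agent") {
        set beresp.http.Vary = regsub(beresp.http.Vary, "User-Agent", "X-Device");
    }
}

sub vcl_deliver {
    # Downstream caches never see X-Device, so advertise User-Agent again.
    if (resp.http.Vary ~ "X-Device") {
        set resp.http.Vary = regsub(resp.http.Vary, "X-Device", "User-Agent");
    }
}
```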
Overview
๏ Break objects into smaller fragments
‣ Separate cache policy for each fragment
‣ Increase hit ratio
๏ Tools
‣ Edge Side Includes (ESI)
‣ AJAX
- Beware of RTT & Cross domain policy
Edge Side Includes
๏ Subset of ESI Language Specification 1.0
‣ <esi:include src="<URL>" />
‣ <esi:remove>...</esi:remove>
‣ <!--esi ... -->
๏ set beresp.do_esi = true;
‣ Separate Varnish requests
๏ Testing ESI in dev environment
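A minimal VCL 3.x sketch of enabling ESI processing, assuming that only HTML responses contain ESI tags. Each include is then fetched as a separate Varnish request with its own caching policy.

```vcl
sub vcl_fetch {
    # Assumption: ESI tags only appear in HTML documents.
    if (beresp.http.Content-Type ~ "^text/html") {
        set beresp.do_esi = true;
    }
}
```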
Overview
๏ Central control of Varnish Cache Plus servers
‣ Web UI + RESTful API
- Super Fast Purger
๏ Cache group management
‣ Real time statistics, VCL editor, ban submission…
๏ Varnish Agent 2
Super Fast Purger
๏ High performance intermediary distributing invalidation requests to groups of Varnish Cache Plus servers
‣ Leverages speed & flexibility of VCL
‣ Keep-alive workaround
๏ Part of the VAC RESTful API
‣ Trivially integrable in existing applications
Change management
๏ Easily integrable using the VAC RESTful API
‣ git, Mercurial… hooks
‣ Jenkins, Travis, GitLab… CI scripts
๏ Manual VCL bundle generation
๏ Orchestrated / programmed deployments, rollbacks, etc.
Overview
๏ Real-time aggregated statistics
‣ Multiple vstatdprobe daemons
‣ One vstatd daemon
‣ JSON + Time series API
๏ VSM log based
‣ Efficient circular in-memory data structure
‣ std.log("vcs-key:" + <key suffix>);
Some ideas
๏ Trending articles or sale products
๏ Cache hits and cache misses
๏ URLs with long load times
๏ URLs with the most 5xx response codes
๏ Where traffic is coming from
๏ …
Example
import std;
sub vcl_deliver {
    std.log("vcs-key:" + req.http.host);
    std.log("vcs-key:" + req.http.host + req.url);
    std.log("vcs-key:TOTAL");
    if (obj.hits == 0) {
        std.log("vcs-key:MISS");
    }
}
API I
๏ Stats (#requests, #misses, avg ttfb, acc body bytes, #2xx, #3xx…) for key named "example.com" during the last time window
‣ GET /key/example.com
๏ Keys that produced the most 5xx responses during the last time window
‣ GET /all/top_5xx
๏ Top 5 requested keys during the last time window
‣ GET /all/top/5?verbose=1
API II
๏ Top 10 most requested keys ending with '.gif' during the last time window
‣ GET /match/(.*)%5C.gif$/top
๏ Top 50 slowest backend requests aggregating the last 20 time windows
‣ GET /all/top_ttfb/50?b=20
Overview
๏ VMOD
๏ DeviceAtlas
‣ https://deviceatlas.com
‣ Database locally deployed & Daily updated
๏ OSS alternatives
‣ https://github.com/serbanghita/Mobile-Detect
‣ …
Example
import deviceatlas;
sub vcl_recv {
    if (deviceatlas.lookup(req.http.User-Agent, "isMobilePhone") == "1") {
        set req.http.X-Device = "mobile";
    } elsif (deviceatlas.lookup(req.http.User-Agent, "isTablet") == "1") {
        set req.http.X-Device = "tablet";
    } else {
        set req.http.X-Device = "desktop";
    }
}
Some ideas
๏ Redirections based on device properties
๏ Backend selection based on device properties
๏ Normalization of the UA header
‣ Caching different versions (i.e. Vary header) of the same object based on normalized UAs
๏ …
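As an illustration of the first idea, the VCL 3.x sketch below redirects mobile devices to a dedicated site. It assumes req.http.X-Device has already been set earlier in vcl_recv. The hostnames, the X-Redirect header and the private status code 750 are hypothetical conventions chosen for this example.

```vcl
sub vcl_recv {
    # Assumption: X-Device was set by earlier device detection logic.
    if (req.http.X-Device == "mobile" && req.http.host == "www.example.com") {
        set req.http.X-Redirect = "http://m.example.com" + req.url;
        # 750 is a private marker handled in vcl_error below.
        error 750 "Redirect";
    }
}

sub vcl_error {
    if (obj.status == 750) {
        # Turn the internal marker into a real HTTP redirect.
        set obj.http.Location = req.http.X-Redirect;
        set obj.status = 302;
        set obj.response = "Found";
        return (deliver);
    }
}
```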
Highlights
๏ Client / backend thread split
‣ Background content refreshing
๏ Redesigned purges
‣ return(purge); during vcl_recv
๏ Directors implemented as VMODs
‣ Consistent hashing director
๏ Distinction between error & synthetic responses
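For comparison with the 3.x purge example, here is a sketch of the redesigned purge and of synthetic responses in VCL 4.0. The backend definition is a placeholder added only so the fragment can be loaded on its own.

```vcl
vcl 4.0;

# Placeholder backend (assumption) so the fragment loads on its own.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

acl internal {
    "localhost";
    "192.168.55.0"/24;
}

sub vcl_recv {
    if (req.method == "PURGE") {
        if (client.ip !~ internal) {
            # Synthetic responses replace the 3.x "error" statement.
            return (synth(405, "Not allowed."));
        }
        # No vcl_hit / vcl_miss handling is needed any more.
        return (purge);
    }
}
```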