NY Times: so news doesn't break your server
@NYTDevs | developers.nytimes.com
Varnish: Linchpin of the NYTimes.com Re-architecture
Adam E. Falk
Software Architect, Web Products
Who I Am
A software architect focusing on server configuration and resiliency, with sidelines in DevOps, release engineering, and testing.
Started as a LAMP developer but has always been a generalist interested in all aspects of the data center.
Who We Are
Photo credit: Tony Cenicola/The New York Times
Scope of this Presentation
Everything that follows pertains to the use of Varnish to accelerate serving content on the www.nytimes.com hostname only.
There are several other Varnish clusters at NYTimes.com.
NYTimes.com: Size
15+ million page URLs (1851–present)
● Not all HTML; working on that
200+ new page URLs created each day
Millions more image URLs
NYTimes.com: Traffic
www.nytimes.com normal daily peak is ~75,000 requests/second – just this hostname.
● primarily APIs
● HTML traffic is ~4,000 req/sec
Traffic spikes up to 4x during a breaking news event
R.I.P. Leonard Nimoy
2013 Redesign of NYTimes.com
Mission Statement
“Leverage the latest technology in order to improve the user experience, enhance our journalism, and provide a more effective environment for our advertisers.”
Project document
Improve the User Experience
Technical goals:
1. 25% improvement in browser load time, minimum.
2. ...
Sounds like a job for page caching!
50% or better improvement in
● Time to first byte
● Time to paint
● Time to page ready
Achievement Unlocked
Brave New World
Exception to the Rule
A complete code rewrite (almost). Why?
● < insert usual suspects here >
● Deeply embedded server-side personalization (includes ads)
Output was simply uncacheable.
Never Let a Crisis Go To Waste
☒ (Test|Behavior) Driven Development
☒ Web performance was core from Day 0
☒ Async wherever, whenever
☒ New APIs
☒ CSS: LESS (then), SASS (now)
Can We Cache Pages Now?
Yes, Virginia.
</summary>
Spotlights for You
VCL file modular organization
Cache refresh instead of purge
Varnish cluster today
Changing Horses in Midstream
Site functionality that must not break:
● redirects (mobile, registration, et al.)
● user tracking
● web crawler detection
Best Practice (singular)
Easy Yet Powerful
Greatest Thing Since Sliced Bread
☒ Single responsibility principle
☒ Code readability (and understanding!)
☒ Time spent troubleshooting
☒ Coding standards
Intermission
There are only two hard things in Computer Science:
1. Cache invalidation
2. Naming things
3. Off-by-one errors
http://martinfowler.com/bliki/TwoHardThings.html
Cache Invalidation
Purge is not good enough (in Varnish 3).
PURGE causes cache misses on the highest-traffic content.
Needed cache re(set|build|prime).
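To make the difference concrete, here is a toy model in Python. It is not Varnish and not NYT code: `ToyCache`, `simulate`, and the `/article` URL are hypothetical names chosen for illustration. It only counts how many cache misses real visitors experience when an entry is purged versus rebuilt in place after a publish.

```python
class ToyCache:
    """A toy in-memory page cache; just enough to count visitor-facing misses."""

    def __init__(self, fetch):
        self.fetch = fetch          # callable(url) -> body, stands in for the backend
        self.store = {}
        self.client_misses = 0      # misses experienced by real visitors

    def get(self, url):
        """Serve a visitor; a missing entry costs a backend fetch."""
        if url not in self.store:
            self.client_misses += 1
            self.store[url] = self.fetch(url)
        return self.store[url]

    def purge(self, url):
        """Drop the entry; the next visitor pays for the backend fetch."""
        self.store.pop(url, None)

    def refresh(self, url):
        """Fetch first, then overwrite in place; visitors never see a gap."""
        self.store[url] = self.fetch(url)


def simulate(invalidate):
    """Warm the cache, publish an update, invalidate, then serve one visitor."""
    versions = {"/article": "v1"}
    cache = ToyCache(lambda url: versions[url])
    cache.get("/article")           # warm-up: one unavoidable miss
    versions["/article"] = "v2"     # publish event
    invalidate(cache, "/article")
    body = cache.get("/article")    # first visitor after the publish
    return body, cache.client_misses


purge_result = simulate(ToyCache.purge)
refresh_result = simulate(ToyCache.refresh)
print(purge_result)    # ('v2', 2)
print(refresh_result)  # ('v2', 1)
```

Both strategies serve the new version, but only the purge makes a visitor wait on the backend. On the highest-traffic pages that extra miss lands on many concurrent visitors at once.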
NYT Homepage
● Must always be in Varnish cache.
● Every article linked to on the homepage should already be in Varnish cache.
No cache misses = long TTL.
But...
Some content changes frequently.
Latest version served in real-time after every publish action.
Short TTL = more cache misses.
PURGE = more cache misses.
Cache Rules Everything Around Me
CREAM: an API to re(set|build|prime) a single cache entry.
Publish event calls API synchronously.
req.hash_always_miss = true
CREAM requests the just-updated article from every Varnish server, in parallel.
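A minimal sketch of that fan-out in Python, under stated assumptions: this is not the CREAM implementation, and `refresh_everywhere`, `http_refresh`, the `X-Cache-Refresh` header, and the server names are all hypothetical. It assumes the VCL on each server maps the header to `req.hash_always_miss = true` and restricts who may send it.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen

# Hypothetical header name; each server's VCL would have to translate it
# into req.hash_always_miss = true for trusted callers only.
REFRESH_HEADER = "X-Cache-Refresh"


def http_refresh(server, path, host="www.nytimes.com", timeout=5.0):
    """Ask one Varnish server to re-fetch `path`; returns the HTTP status."""
    request = Request(f"http://{server}{path}",
                      headers={"Host": host, REFRESH_HEADER: "1"})
    with urlopen(request, timeout=timeout) as response:
        return response.status


def refresh_everywhere(servers, path, refresh_one=http_refresh):
    """Refresh `path` on every server in parallel; returns {server: result}.

    Blocks until every server has answered or failed, so a publish
    action can call it synchronously.
    """
    def attempt(server):
        try:
            return refresh_one(server, path)
        except Exception as error:   # one bad server must not block the rest
            return error

    with ThreadPoolExecutor(max_workers=max(1, len(servers))) as pool:
        return dict(zip(servers, pool.map(attempt, servers)))


# Dry run with a stub instead of real HTTP (server names are made up).
results = refresh_everywhere(["varnish-a", "varnish-b"], "/2015/example.html",
                             refresh_one=lambda server, path: 200)
print(results)
```

Each server gets its own request because every Varnish instance holds its own copy of the object. A failure on one server is returned as a value rather than raised, so the caller can decide whether to retry that server.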
Where We Are Today: Software
~2,300 lines of VCL code
● Minimum of inline C
10 VMODs
● std, utils, crashhandler, wurfl, boltsort, queryfilter
● 4 custom
Where We Are Today: Traffic
Of the ~4,000 page requests/second to www.nytimes.com:
● ~1,500 now served by Varnish
● ~91% cache hit rate (down from ~96%)
Where We Are Today: Performance
Load test: ~3,000 requests/second/server with current configuration
We could handle a 4x spike with 2 servers
We run 8 servers per data center
8 Servers? Why?!
Because:
● Biggest spike ever was 10x (2012 Election Night)
● 2 hypervisors => even number of server instances
● Takes too long for us to dynamically provision
● We can afford to stay over-provisioned
Yes, this causes extra backend network traffic.
Scaled out for resilience, scaling up for performance.
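The sizing arithmetic can be checked in a few lines of Python. One assumption is not stated on the slides: the spike multiplier is applied to the ~1,500 page requests/second that Varnish serves today, which is the reading consistent with "a 4x spike with 2 servers". The function name `servers_needed` is illustrative.

```python
import math

PER_SERVER_RPS = 3_000   # load-test result per server
VARNISH_RPS = 1_500      # page requests/second currently served by Varnish (assumed baseline)
HYPERVISORS = 2          # instances are split evenly across hypervisors
SERVERS_RUN = 8          # servers per data center


def servers_needed(spike_factor):
    """Smallest server count that absorbs the spike, rounded up to an even split."""
    raw = math.ceil(VARNISH_RPS * spike_factor / PER_SERVER_RPS)
    return math.ceil(raw / HYPERVISORS) * HYPERVISORS


print(servers_needed(4))                            # 2
print(servers_needed(10))                           # 6
print(SERVERS_RUN * PER_SERVER_RPS // VARNISH_RPS)  # 16
```

Under that assumption, a repeat of the 10x Election Night spike would need 6 servers, and the 8 actually deployed cover roughly a 16x spike over today's Varnish traffic.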
Next Steps for Us
1. Install Varnish Cache Plus 4
2. Utilize the Varnish Plus tools for monitoring.
3. Replace CREAM with VHA
Thank You
Adam E. Falk
[email protected]
@xenograg
adamfalk.com
xenograg.com
We’re hiring
nytimes.com/careers
@NYTDevs | #timesopen | developers.nytimes.com