making performant sites
DESCRIPTION
How a developer and a hoster should work together to get good uptime, a performant site, and be prepared to scale
TRANSCRIPT
Performant sites
You and your hoster want your site to perform well
me
• Bernard Grymonpon - Wonko
• Partner in Openminds & Metatale
• Sysadmin - Web-engineer
• Openminds offers high-quality, high-performance internet solutions
Today's talk
What a good site needs
• Performance
• Availability
• Scaling (if needed)
• Bandwidth
Problems: 1.0 to 2.0
• Serving HTML was easy, but...
• A lot of hits (blame Web 2.0)
• Each hit: processing PHP/Rails/Django
• Each hit: reading and writing like crazy
• Server response speed is driving the User Experience
More problems!
Ajax requests
User content
RSS polls
Including content everywhere
Fast sites
• Why
• Basic system administration
• Some cases
• Working together on Uptime, Performance and scaling
• Larger scaling & some examples
One common goal
Your hoster wants what you want...
Getting the content where it should be, ASAP!
Different reasons
Your reasons
• You paid for it
• Your site is important
• You promised it to a client
• Sleep confident at night
Your hoster's reasons
• Offer a stable service, for everyone
• Clear the way for other requests
• Time to invest in other projects
• Sleep at night
• Profit
Hosting your site
Getting to the “performance” part...
We use servers
• Processors (php, rails, django, OS)
• Storage (files, database, logging)
• Memory (needed for speed)
• Casing, some fans, circuitry... (because putting it all in a cardboard box doesn’t work)
My site is slow
Put a faster processor in the server, ASAP
Guess again...
Stay focused, technical stuff coming up...
CPU can be a problem
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b   swpd   free   buff   cache   si   so    bi    bo   in   cs us sy id wa
 7  0  21092 165200  37316 1585388    0    0     1     0    0    1 50  5 45  0
 6  0  21092 163664  37316 1585388    0    0     0     0  142  279 73 26  1  0
 6  0  21092 168840  37316 1585392    0    0     0     0  148  330 75 25  0  0
 7  0  21092 165264  37316 1585392    0    0     0   316  235  245 75 25  0  0
 6  0  21092 160828  37316 1585392    0    0     0     0  153  277 73 27  0  0
 6  0  21092 168688  37316 1585396    0    0     0     0  149  383 78 22  0  0
 6  0  21092 165040  37316 1585396    0    0     0     0  141  179 76 24  0  0
 6  0  21092 169188  37316 1585396    0    0     0     0  143  264 77 23  0  0
 6  0  21092 168376  37316 1585396    0    0     0   360  264  221 75 25  0  0
• running (r column)
• working like crazy (cpu: us + sy near 100, id near 0)
• heaps of free memory (free column)
• no I/O (bi and bo columns)
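As a rough illustration of how a sample like the one above can be read programmatically, the sketch below parses one vmstat data row into named columns and applies a simple rule of thumb: processes waiting in the run queue and almost no idle CPU. The function names and the 5% idle threshold are assumptions made for this example, not something vmstat itself defines.

```python
# Column names as they appear in the vmstat header shown on the slide.
VMSTAT_COLUMNS = [
    "r", "b", "swpd", "free", "buff", "cache",
    "si", "so", "bi", "bo", "in", "cs",
    "us", "sy", "id", "wa",
]


def parse_vmstat_row(line):
    """Turn one whitespace-separated vmstat data row into a dict of ints."""
    values = [int(field) for field in line.split()]
    if len(values) != len(VMSTAT_COLUMNS):
        raise ValueError(
            "expected %d columns, got %d" % (len(VMSTAT_COLUMNS), len(values))
        )
    return dict(zip(VMSTAT_COLUMNS, values))


def looks_cpu_bound(row, idle_threshold=5):
    """Rule of thumb (an assumption): runnable processes and almost no idle CPU."""
    return row["r"] > 0 and row["id"] <= idle_threshold


# Second row of the CPU sample: 73% user + 26% system, 1% idle.
busy = parse_vmstat_row("6 0 21092 163664 37316 1585388 0 0 0 0 142 279 73 26 1 0")
# A mostly idle row (taken from the I/O sample in this talk): 99% idle.
quiet = parse_vmstat_row("0 0 2034316 204240 376 518220 0 0 448 0 532 3208 0 0 99 0")

print(looks_cpu_bound(busy))   # True
print(looks_cpu_bound(quiet))  # False
```

A single row says little; in practice one would look at a run of samples, as the slide does.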
I/O is slowing down
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
 r  b    swpd   free buff  cache   si   so    bi    bo   in   cs us sy id wa
 0  0 2034316 204240  376 518220    0    0   448     0  532 3208  0  0 99  0
 0  0 2034316 203792  376 518668    0    0   448    72  576 3214  0  1 99  0
 0  0 2034316 203172  376 519156    0    0   488     0  617 3328  0  1 99  0
 0  2 2034300 193644  376 520076    0    0   860     0  608 3452 29  3 67  0
 2  0 2034300 185996  376 521448    0    0  1032     4  624 3955 33  5 62  0
 1  0 2034300 185944  376 521476    0    0     0    24  296 3600 14  1 84  0
 1  0 2034300 185924  376 521496    0    0     0    72  559 3648 10  3 87  0
 0  0 2034300 185928  376 521496    0    0     0     0  233 3221  3  1 96  0
 1  1 2034300 177192  376 521560    0    0    64     8  253 3658 25  4 71  0
 0  0 2034300 177664  376 521572    0    0    12    12  249 3415 12  2 86  0
• running (r column)
• I/O (bi and bo columns)
• buffering in action (cache keeps growing)
• cpu is not that stressed (id stays high)
I/O is always slow
Keep this in mind
• Filesystem, reading and writing files
• Database, reading and writing data
• Services, logging and “doing their thing”
Performance
Getting there at last...
Main problems we see
• Bad database design, puts load on I/O
• Bad code, puts load on processor
• A lot of hits at once, puts load on both
• Insecure sites, contact forms, ...
A first example
Databases need love and care
Databases
• Main page took 30+ seconds to load
• Developer used unrealistically small datasets
• The index page was the “heaviest” page
Solution: Indexes
• Site on a single box, getting slow, using a database? CHECK THE INDEXES!
• Solved the problem (0.5 seconds load time)
• Easy concept, like an index in a book
• Only where needed...
Where?
SELECT text FROM articles WHERE category = 5
SELECT text FROM articles JOIN authors ON authors.id = articles.author_id WHERE articles.category = 5
mysql> explain select * from articles where status = 2;
+----+-------------+----------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table    | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
+----+-------------+----------+------+---------------+------+---------+------+-------+-------------+
|  1 | SIMPLE      | articles | ALL  | NULL          | NULL | NULL    | NULL | 50000 | Using where |
+----+-------------+----------+------+---------------+------+---------+------+-------+-------------+
1 row in set (0.00 sec)

mysql> create index status_idx on articles(status);
Query OK, 50000 rows affected (0.31 sec)
Records: 50000  Duplicates: 0  Warnings: 0

mysql> explain select * from articles where status = 2;
+----+-------------+----------+------+---------------+------------+---------+-------+------+-------------+
| id | select_type | table    | type | possible_keys | key        | key_len | ref   | rows | Extra       |
+----+-------------+----------+------+---------------+------------+---------+-------+------+-------------+
|  1 | SIMPLE      | articles | ref  | status_idx    | status_idx | 2       | const | 4757 | Using where |
+----+-------------+----------+------+---------------+------------+---------+-------+------+-------------+
1 row in set (0.00 sec)
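The same before/after check can be reproduced on any machine. The sketch below uses Python's built-in sqlite3 module instead of MySQL (an assumption made purely so the example is self-contained). The table and index names mirror the slide, and the 50,000 generated rows spread over 10 status values are made-up test data. SQLite reports its plan through EXPLAIN QUERY PLAN rather than MySQL's EXPLAIN, so the output looks different, but the idea is the same: a full scan before the index, an index search after it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, status INTEGER, text TEXT)"
)
# Made-up data: 50,000 rows spread over 10 status values.
conn.executemany(
    "INSERT INTO articles (status, text) VALUES (?, ?)",
    ((i % 10, "article %d" % i) for i in range(50000)),
)

QUERY = "SELECT * FROM articles WHERE status = 2"


def query_plan(connection, sql):
    """Return the human-readable plan lines SQLite gives for a statement."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


plan_before = query_plan(conn, QUERY)
conn.execute("CREATE INDEX status_idx ON articles(status)")
plan_after = query_plan(conn, QUERY)

print(plan_before)  # a full table scan
print(plan_after)   # a search that uses status_idx
```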
More to databases
• MySQL - PostgreSQL - Oracle - ...
• MySQL: storage engine choice
• Replication
• ...
A second case: The mystery of the spike in the graphs
The flow
[Diagram build; the surviving callouts, in order: “*beep* *beep*”, “wtf!”, “Fixed!”]
The problem
• Client has no knowledge of the impact this has on the equipment
• Massive hits on the servers (web and db)
• Needed fast response from us
What could be done?
• Mails could be sent out in batches
• Application could be tuned
• Server could be “prepared”
• Monitoring would be guaranteed
• We share the same goal, contact us
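Sending mails in batches, as suggested above, can be as simple as slicing the recipient list and pausing between slices, so that the visitors the mailing generates arrive in waves rather than all at once. A minimal sketch; the `send` callable, the batch size and the pause length are placeholders to be tuned together with the hoster.

```python
import time


def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def send_in_batches(recipients, send, batch_size=100, pause_seconds=60,
                    sleep=time.sleep):
    """Send to everyone, pausing between batches.

    `send` is a hypothetical callable that delivers one mail; `sleep` is
    injectable so the pacing can be tested without actually waiting.
    """
    sent = 0
    for number, batch in enumerate(batches(recipients, batch_size)):
        if number > 0:
            sleep(pause_seconds)  # give the web and db servers room to breathe
        for recipient in batch:
            send(recipient)
            sent += 1
    return sent


# Dry run with a fake sender and no real waiting.
outbox = []
pauses = []
total = send_in_batches(
    ["user%d@example.com" % i for i in range(250)],
    send=outbox.append,
    batch_size=100,
    pause_seconds=60,
    sleep=pauses.append,
)
print(total, len(pauses))  # 250 2
```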
Case 3: Blame the media
Advertisements on TV are killing servers
Good case
• Server was tuned before the hit
• LLMP stack (lighttpd instead of Apache)
• Nothing happened...
Another case: Bad code kills
“Dear support, I measured it, the query takes 0.001 seconds. Executing such a fast query 500,000 times in a row can’t be that hard on your server. I see no point in changing my scripts”...
*sigh*
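The numbers in that mail are worth multiplying out. A quick back-of-the-envelope calculation, using integer milliseconds to avoid rounding noise and assuming the queries really do run one after another as described:

```python
QUERY_TIME_MS = 1        # 0.001 seconds per query, as measured by the customer
QUERY_COUNT = 500_000    # executions "in a row"

total_ms = QUERY_TIME_MS * QUERY_COUNT
total_seconds = total_ms // 1000
minutes, seconds = divmod(total_seconds, 60)

print(total_seconds)     # 500
print(minutes, seconds)  # 8 20
```

So a "fast" query repeated that often keeps the database busy for more than eight minutes on every run of the script.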
The query:
SELECT child FROM menu WHERE parent = 21
Result: 21
A perfect loop
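A row whose child points back at its own parent turns any naive "fetch the children, then their children" walk into an endless loop. One defensive option, sketched below, is to remember which ids have already been visited. The in-memory `MENU` dict stands in for the `menu` table, and the function name is hypothetical.

```python
# parent id -> list of child ids; entry 21 is the broken row from the slide.
MENU = {
    1: [20, 21],
    20: [],
    21: [21],
}


def collect_menu_ids(menu, root):
    """Walk the menu tree from `root`, visiting every id at most once."""
    seen = set()
    order = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # already handled: this is what breaks the loop
        seen.add(node)
        order.append(node)
        # Reverse so children are visited in their stored order.
        stack.extend(reversed(menu.get(node, [])))
    return order


print(collect_menu_ids(MENU, 1))  # [1, 20, 21]
```

Fixing the data is still the real cure; the guard only keeps one bad row from taking the server down.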
What should we do?
A common goal, remember?
Uptime, your part
• Write secure code
• Don’t do include($_GET['p']);
• Develop locally
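The problem with `include($_GET['p'])` is that the visitor gets to choose which file is executed. The usual remedy is an allow-list: map the request parameter onto a fixed set of known pages and fall back to a default for anything else. The sketch below shows the idea in Python rather than PHP; the page names and file paths are invented for the example.

```python
# Only these pages can ever be served, whatever the request says.
ALLOWED_PAGES = {
    "home": "pages/home.html",
    "about": "pages/about.html",
    "contact": "pages/contact.html",
}
DEFAULT_PAGE = "home"


def resolve_page(requested):
    """Map an untrusted `p` parameter to a known template path."""
    key = requested if requested in ALLOWED_PAGES else DEFAULT_PAGE
    return ALLOWED_PAGES[key]


print(resolve_page("about"))             # pages/about.html
print(resolve_page("../../etc/passwd"))  # pages/home.html
```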
Uptime, your hoster
• Monitoring! Alerting! Spare parts! Spare servers!
• Redundancy, things should fail over easily
• Backups, and tested restore procedures
• Invest in new technology
• Invest in people
Performance & you
• Write good and efficient code
• Remember, I/O is slow
• Spend time on DB optimization
• Test your application before launch, with realistic datasets
Performance & hosting
• Monitoring & tuning systems
• An ongoing task!
• Activate caching where possible
• Filesystem level (memory)
• Database tuning ...
Prepare to scale
• If you start with a new project, split read/write operations
• Be prepared to partition your application
• Normalize like crazy, denormalize when needed
• Implement, or at least consider, caching
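Splitting reads from writes early mostly means never talking to "the database" directly, but always going through one place that decides where a statement goes. A minimal sketch, assuming one primary for writes and any number of read replicas; the class name and the strings standing in for real connections are hypothetical.

```python
import itertools


class ReadWriteRouter:
    """Send writes to the primary and spread reads over the replicas."""

    READ_PREFIXES = ("select", "show", "explain")

    def __init__(self, primary, replicas=()):
        self.primary = primary
        # With no replicas configured, reads simply fall back to the primary.
        self._readers = itertools.cycle(list(replicas) or [primary])

    def connection_for(self, sql):
        is_read = sql.lstrip().lower().startswith(self.READ_PREFIXES)
        return next(self._readers) if is_read else self.primary


router = ReadWriteRouter("primary-db", ["replica-1", "replica-2"])
print(router.connection_for("SELECT text FROM articles WHERE category = 5"))  # replica-1
print(router.connection_for("UPDATE articles SET status = 2 WHERE id = 7"))   # primary-db
print(router.connection_for("select 1"))                                      # replica-2
```

Replicas can lag slightly behind the primary, so reads that must see a row that was just written would still have to go to the primary.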
Scaling and hosting
• Have spare CPU/IO power available
• Invest time in testing setups
• Test setups, both on I/O and CPU-performance
• Be creative: shared storage, replication, storing sessions, special setups...
A summary for you
• Make good applications
• Test your applications
• Optimize your application in bad conditions
• Talk to your hoster
• Find bottlenecks before they show up
• Normalize first, denormalize only when needed
Check your hoster
• Knowledge
• Monitoring
• Redundancy
• Support
• Report back to the users
A quick word on perspective
Don’t go nuts
Don’t exaggerate
• Wikipedia, 200+ servers
• Twitter, 8 servers
• LifeBook, 100 servers
• Digg, approx 100 servers
• Google, 450000 servers, 5 sites (??)
Scaling in the real world
A quick overview
Common scaling
[Diagram build: a single Web + SQL pair grows step by step into several Web servers in front of several SQL servers; the last step adds nodes labelled X, Y, Y’ and Z]
Common solutions
• Partitioning the problem
• No single bottleneck (no “master server”)
• Cache like crazy (memcached)
• Redundancy
• Balancing
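Partitioning, in its simplest form, means deciding per user (or per article, per language, ...) which server owns the data, so that no single machine sees all the traffic. A minimal sketch, assuming a fixed list of shards and a stable hash of the user id; the shard names are made up.

```python
import zlib

SHARDS = ["db-0", "db-1", "db-2", "db-3"]


def shard_for(user_id, shards=SHARDS):
    """Pick the shard that owns `user_id`, the same one on every call."""
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    digest = zlib.crc32(str(user_id).encode("utf-8"))
    return shards[digest % len(shards)]


# Every web server computes the same answer without asking a central "master".
print(shard_for(42) == shard_for(42))  # True
```

Plain modulo hashing moves most keys to a different shard when the number of shards changes; lookup tables or consistent hashing are common ways around that.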
A big project?
• Find and hire people with knowledge
• Hire/buy the needed equipment
• Test-drive the application before going live
• Don’t be ashamed to ask for help
LiveJournal
• 90+ servers
• 50M+ hits per day, 1k+ per second at peak
• typical road
• partitioning
• made memcached
Twitter
• Made in Rails, running on 8 boxes
• Their observations:
• indexes
• denormalisation
• caching, caching, caching, caching
• your application should be partitionable
Flickr
• Massive storage (remember I/O is slow?)
• Wrote own FS, partitioned
• Made to scale, separate reading/writing
• Cache invalidation is hard
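The caching pattern behind setups like these is often "cache-aside": look in the cache first, fall back to the database on a miss, and drop the cached entry whenever the underlying data changes. That last step is where the difficulty lives. The sketch below uses plain dicts in place of memcached and a database; all names are hypothetical.

```python
class CachedStore:
    """Cache-aside reads with explicit invalidation on write."""

    def __init__(self, database):
        self.database = database  # stand-in for the slow, authoritative store
        self.cache = {}           # stand-in for memcached
        self.db_reads = 0

    def get(self, key):
        if key in self.cache:
            return self.cache[key]   # hit: no database I/O at all
        self.db_reads += 1
        value = self.database[key]   # miss: pay for the slow read once
        self.cache[key] = value
        return value

    def put(self, key, value):
        self.database[key] = value
        self.cache.pop(key, None)    # invalidate so the next read refetches


store = CachedStore({"photo:1": "sunset.jpg"})
store.get("photo:1")
store.get("photo:1")
print(store.db_reads)        # 1
store.put("photo:1", "sunrise.jpg")
print(store.get("photo:1"))  # sunrise.jpg
print(store.db_reads)        # 2
```

With a single process and a single cache this is easy; with several cache servers and several writers, making sure every stale copy is dropped is the hard part.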
Wikipedia
• Serving is the bottleneck
• Squid reverse proxies to the rescue
• Memcached to the rescue
• Partitioned (per language)
• Cache invalidation through multicast!
• They develop the MediaWiki software to their needs...
Conclusions
• Talk to your hoster; they should be a companion, not a distant enemy
• Performance and scaling demand effort and knowledge
• A good site is a combination of many factors (application, code, servers, OS settings, tuning...)
Q & A
Discussion
Bernard Grymonpon - www.openminds.be