FOWA Scaling The Lamp Stack Workshop


Post on 13-May-2015






Slides from the workshop "Scaling the LAMP Stack" at the Future of Web Apps on October 5, 2007


1. Scaling the LAMP Stack
- Future of Web Apps
- October 5, 2007

2. Introductions

3. Specific Problems, Challenges and Issues

4. About this workshop
- This is a broad topic
- Theory and application
- Real-world focus
- Interactive (please!)

5. About web apps and scaling
- Some different ways of looking at the problem

6. Things to think about
- Multi-server: locking and concurrency
- Running many: keep in mind what's expensive, sloppy or risky
- Code quality
- The law of truly large numbers

7. Elements of Scaling
- Split up different tasks
- Use more hardware (intelligently)
- Partition
- Replicate
- Cache
- Optimize (code and hardware)
- Identify and fix weaknesses
- Manage

8. Tools and Components
- Apache + PHP
- MySQL
- File System (local)
- Networked File System
- Load Balancers
- memcached

9. Contemplating Scaling
- Understand what your app does (and how much)
- Identify the bottlenecks
- Solve near-term problems
- Design well, but don't over-design

10. Web apps do lots of things
- Different operations have different scaling issues.

11. What does your app do?
- List the high level elements of what your application does.
- Separate out different functions that will have different scaling issues.

12. Common things that web apps do
- Manage connections/protocols
- Deliver static content
- Manage sessions
- Manage user data
- Render dynamic pages
- Access external APIs
- Process media

13. Update the list of things your app does
- Add anything you missed
- Note which items you do in quantity

14. Easy vs. Difficult Scaling
- What happens when you add hardware?
- Does it work?
- Does more hardware = more performance?

15. Things that break when you scale
- State that isn't properly shared (especially sessions)
- Updates/refreshes (caching and replication issues)

16. Things that don't improve when you add more servers
- Unpartitioned databases
- Anything that locks/blocks
- Inefficient code, especially big queries

17. Scaling Each Element (do easy separations first)

18. Managing Connections/Protocols
- No problem putting on multiple servers
- Apache is good
  - Not too far away out of the box
  - Moderately tunable
- Linux tuning
  - TCP stack (tune to handle unusual networking needs)

19. Key Apache Configuration Issues
- MaxClients (and ServerLimit, ThreadLimit and ThreadsPerChild)
- Avoid using PHP (or other) handler unnecessarily
- Use the worker MPM
- Maybe MaxRequestsPerChild

20. Delivering Static Content
- Don't process it unnecessarily
- Either cache or use no Apache handlers
- Caching can let you treat semi-static content as static
- Multiple servers complicates updates, but is otherwise easy

21. General Discussion: Multi-server, state and sessions
- Rethinking state for multi-server environments
- What is state?
  - Short-term state (sessions)
  - Long-term state (application data)
- Managing state is usually the hardest part of scaling

22. What happens with state
- Written (created/destroyed/changed)
- Read
- Stored

23. Requirements for managing state
- Depend on what it is and how it is used
- Perfect coherence
- Performance of different operations

24. Ways of scaling state
- Replication: make more copies
- Partitioning: split up the work
- Caching
- Should make different choices for different state/data elements

25. About Load Balancers
- What load balancers do
  - Spread load
  - Detect server failures
  - Stickiness/persistence
  - Acceleration (especially SSL)
- Fancy features (including good stickiness) are expensive

26. Why sticky sessions are not usually good in practice
- Servers fail
- Corner cases exist

27. Managing Sessions

28. Where session data can be stored
- Browser cookies
- Web server temporary files (not scalable)
- App server state
- Database
- Cache

29. PHP session management
- Default (files) method is not multi-server friendly, and thus not scalable (unless sticky)
- Can implement a different back-end easily

30. Designing a session back-end
- Requirements
- Data storage options
  - Cookies only (re-auth, let the browser take care of the logout but less secure)
  - Full-featured involves a combination of cookies and database and cache
- (discussion of session details)

31. Managing small user data
- Databases are more efficient, flexible and sharable than small files
- Frequently-read data should be cached

32. Managing large user data
- NFS has flaws but is almost inevitable
- Locking is usually not important, but can be
- Performance degradation can be sudden

33. About NFS
- NFS is usually transparent to your app
- NFS is easy to implement and gives you multiple-write access
- NFS locking is not to be trusted
- The Linux NFS client is slow for writes and can do bad things under stress

34. User data and locking
- Names based on hashes often mean no locking is needed
- Databases do locking better than file systems do
- Locking requires housekeeping

35. Disk Storage Hardware
- Disk performance can degrade suddenly
- If the ratio of access to storage is low, then even slow disk is usually fine
- Think about seek times and spindles

36. Rendering dynamic pages
- Depends heavily on application specifics (query, search, process, etc.)
- Watch out for:
  - Onerous queries (create and watch slow query log)
  - Locking of resources and/or incoherence if state changes
  - Heavy CPU and memory usage
- Cache both elements and complete pages

37. Processing media
- CPU intensive
- May be memory intensive
- Might be spiky
- Might need its own server pool

38. Hardware
- Start simple
- Observe performance and respond accordingly
- Get lots of memory

39. Hardware-driven behaviors
- Sudden degradation because demand exceeds supply (usually relieved unhappily)
- Get behind due to a spike, and recover
- Not enough resources for normal optimization

40. Specific hardware issues
- Not enough memory
  - Severe: paging/swapping
  - Mild: poor automatic caching; slowness due to fragmentation
- Disk seek (very common)
- CPU (but might really be memory)
- Disk throughput (rare for web apps)

41. Hardware decisions
- SCSI/SAS vs. SATA
- Resource ratios
- Combining vs. splitting functions
- Big vs. little boxes

42. Techniques
- Caching
- Partitioning
- Replication
- Data management middleware
- Queuing

43. Caching
- Turn expensive operations into cheap ones
- Reduce:
  - Database reads
  - Object and page calculation/rendering operations
- Cache objects and subobjects
- Add memory

44. Apache Caching
- Can be done with zero application modifications
- Complete pages/HTTP requests only
- Must use Apache 2.2
- Cache is not shared between servers

45. memcached
- Extremely useful
- Distributed caching system
- Requires new thinking and new coding
- Straightforward API

46. memcached URLs
- Home:
- Intro:
- PHP documentation:

47. Partitioning
- Mostly for data management
- Split load onto separate servers/pools
- Partition algorithm/mechanism must be lightweight
- Partition algorithm must anticipate the future

48. File Storage Partitioning
- Index/database gives the most flexibility
- Hash-based is simplest

49. Database partitioning
- You will need to do this, but perhaps later than you think
- Index vs. hash-based

50. Replication
- Used where data is read far more than written
- Consider caching first
- Also used for failure recovery

51. Types of Replication
- Replication: sync vs. async
- Synchronous is not usually scalable
- Asynchronous only works with certain kinds of data and use cases, because of coherence issues

52. Database Replication
- Simple but finicky
- Asynchronous (but not by much)
- Allows big queries and backups to be moved to separate servers

53. File System Replication
- Slow and very asynchronous
- Mostly for disaster recovery

54. Data Management Middleware
- Mostly for databases
- Can handle partitioning and replication, and do it well
- Big investment in coding to the API
- Sometimes easier to add functionality to app

55. Queuing
- Save work for later
- Useful for less urgent operations, especially messaging
- Can be used to wait for a pause, or to separate hardware

56. Dealing with lots of hardware (operations)
- Automation
- Process

57. Imaging/Provisioning
- Be consistent
- Use your distro's automation (Kickstart, AutoYaST, etc.)
- Use boring, meaningful hostnames
- Make re-imaging easy

58. Deployment Systems
- Content and code replication
- Coherence/atomic updates
- Managing pieces and processes
- Simple scripts are fine
- Create audit trail
- Include back-out
- Think 3AM
- Do it!

59. Monitoring systems
- A pain, but a lifesaver
- Start with built-in basics
- Add custom checks, especially end-to-end and communication between pieces
- Eliminate false alarms (ongoing)
- Nagios, usually

60. Coping with hardware failure
- Have extra servers/capacity
- Load balancers handle stateless layers
- Replication prepares you to handle data layers manually
- Use middleware or app-level multiple writes to get true data layer redundancy

61. Change management
- Part automation, part process
- Use version control on everything
- Stage changes with realistic data
- Know how to back out
- Consult the right people (internal and/or external)

62. Efficiency
- Access the smallest amount (DB, FS, etc.)
- Don't do complex stuff when simple will suffice

63. Using the database efficiently
- Keep it simple
- Know what queries you do
- Index every query key
- Cache to reduce demand
- Check slow query log
- Replicate if you need big queries

64. The messy real world

65. Security and abuse
- Mostly same issues, just magnified
- You will be a target
- Spam (coming and going)
- Abuse of file storage

66. Corner Cases
- Murphy's law enforcement
- Watch out for how different user activities relate
- Lock data, not functions
- Housekeeping

67. Performance and tuning
- Observation and responsiveness is more important than pre-optimizing
- Redesign as needed
- Collect the data to be able to analyze (both resource utilization and end-user performance)

68. Miscellaneous Warnings

69. Files and directories
- Most default file system configurations get really slow with lots of files in one directory
- Numerical limits on files and subdirectories
- Some programs don't like files over 2GB

70. AJAX
- Sequential round trips
- Make preloading invisible
- UI that waits for too many things

71. Other topics
- Multiple sites
- CDNs

72. Scaling the LAMP Stack
- Future of Web Apps
- October, 2007
- Daniel Lieberman
- [email_address]
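
Slides 34, 48 and 69 together suggest a common layout for stored files: derive the name from a hash, and fan files out across subdirectories so that no single directory grows huge. The slides do not give an implementation, so the following is only a minimal sketch in Python (the deck itself is PHP-oriented). The function name `hashed_path`, the choice of SHA-1, and the two-level, two-character fan-out are all assumptions made for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_path(root: str, key: str, levels: int = 2, width: int = 2) -> PurePosixPath:
    """Map a logical key to a storage path derived from its hash.

    Hypothetical helper: the hash prefix picks nested subdirectories,
    so files spread evenly and no directory collects every file.
    """
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    # First `levels` chunks of the digest become directory names.
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return PurePosixPath(root, *parts, digest)


if __name__ == "__main__":
    # The same key always maps to the same path, on any server.
    print(hashed_path("/data/uploads", "user42/avatar.png"))
```

Because the path is computed from the key alone, every web server arrives at the same location without consulting a shared index. That is the "hash-based is simplest" option; an index or database lookup would be needed to move individual files later.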
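
Slides 43 and 45 describe caching as turning expensive operations into cheap ones, with memcached as the shared store. The read path this implies is often called cache-aside: check the cache, fall back to the expensive source on a miss, then store the result. The sketch below uses a small in-process class, `TinyCache`, as a stand-in for memcached so that it runs without a server. `TinyCache`, `cached_read` and the TTL handling are hypothetical names invented here, not the memcached client API.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TinyCache:
    """In-process stand-in for a shared cache (illustration only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._store: Dict[str, Tuple[float, Any]] = {}
        self._clock = clock

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired entries are treated as misses and dropped.
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: Any, ttl: float) -> None:
        self._store[key] = (self._clock() + ttl, value)

    def delete(self, key: str) -> None:
        self._store.pop(key, None)


def cached_read(cache: TinyCache, key: str, load: Callable[[], Any], ttl: float = 60.0) -> Any:
    """Return the cached value, calling `load` only on a miss.

    Assumption: `load` never returns None, since None marks a miss here.
    """
    value = cache.get(key)
    if value is None:
        value = load()  # the expensive operation, e.g. a database query
        cache.set(key, value, ttl)
    return value
```

When the underlying data changes, the writer would call `cache.delete(key)` so the next read reloads it. That invalidation step is where the "updates/refreshes" problems from slide 15 tend to show up.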
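
Slide 55 describes queuing as saving work for later, for less urgent operations such as messaging. As a rough illustration of the idea, the sketch below hands jobs to a small pool of worker threads through Python's standard `queue.Queue`. A real web application would more likely use a persistent queue (a database table or a message broker) drained by separate worker processes or servers. The function name `run_deferred` is hypothetical, and the sketch assumes that `handler` does not raise and that `None` is never a real job.

```python
import queue
import threading
from typing import Any, Callable, Iterable, List


def run_deferred(jobs: Iterable[Any], handler: Callable[[Any], Any], workers: int = 2) -> List[Any]:
    """Process jobs on background workers and collect the outcomes.

    Results arrive in completion order, not submission order.
    """
    pending: "queue.Queue[Any]" = queue.Queue()
    results: List[Any] = []
    lock = threading.Lock()

    def worker() -> None:
        while True:
            job = pending.get()
            try:
                if job is None:  # sentinel: no more work for this worker
                    return
                outcome = handler(job)
                with lock:
                    results.append(outcome)
            finally:
                pending.task_done()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for job in jobs:
        pending.put(job)
    for _ in threads:
        pending.put(None)  # one sentinel per worker
    pending.join()  # wait until every queued item is handled
    for t in threads:
        t.join()
    return results
```

The request that enqueues a job can return immediately; only the workers pay the cost. That is how queuing lets slow work "wait for a pause" or run on separate hardware.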