MySQL and Search at Craigslist
Jeremy Zawodny
http://craigslist.org/
http://jeremy.zawodny.com/blog/
Who Am I?
● Creator and co-author of High Performance MySQL
● Creator of mytop
● Perl Hacker
● MySQL Geek
● Craigslist Engineer (as of July, 2008)
– MySQL, Data, Search, Perl
● Ex-Yahoo (Perl, MySQL, Search, Web Services)
What is Craigslist?
● Local Classifieds
– Jobs, Housing, Autos, Goods, Services
● ~500 cities world-wide
● Free
– Except for jobs in ~18 cities and brokered apartments in NYC
– Over 20B pageviews/month
– 50M monthly users
– 50+ countries, multiple languages
– 40+M ads/month, 10+M images
What is Craigslist?
● Forums
– 100M posts
– 100s of forums
Technical and other Challenges
● High ad churn rate
– Post half-life can be short
● Growth
● High traffic volume
● Back-end tools and data analysis needs
● Growth
● Need to archive postings... forever!
– 100s of millions, searchable
● Internationalization and UTF-8
Technical and other Challenges
● Small Team
– Fires take priority
– Infrastructure gets creaky
– Organic code and schema growth over years
● Growth
● Lack of abstractions
– Too much embedded SQL in code
● Documentation vs. Institutional Knowledge
– “Why do we have things configured like this?”
Goals
● Use Open Source
● Keep infrastructure small and simple
– Lower power is good!
– Efficiency all around
– Do more with less
● Keep site easy and approachable
– Don't overload with features
– People are easily confused
Craigslist Internals Overview
[Architecture diagram; component labels in slide order]
● Load Balancer
● Read Proxy Array, Write Proxy Array: Perl + memcached
● Web Read Array: Apache 1.3 + mod_perl
● Object Cache: Perl + memcached
● Read DB Cluster: MySQL 5.0.xx
● Search Cluster: Sphinx
● ...
Not Included:
– user db, image db
– async tasks, email
– accounting, internal tools
– and more!
Vertical Partitioning: Roles
[Diagram labels: Users, Classifieds, Forums, Stats, Archive; Write, Read, Long, Trash]
Vertical Partitioning
● Different roles have different access patterns
– Sub-roles based on query type
● Easier to manage and scale
● Logical, self-contained data
● Servers may not need to be as big/fast/expensive
● Difficult to do retroactively
● Various named db “handles” in code
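One way to picture the named db “handles” is a lookup keyed by role and sub-role. This is a minimal sketch, not the Craigslist code: the role and sub-role names come from the slides, while the host names and the get_handle() helper are hypothetical.

```python
# Hypothetical sketch of named db handles. Role and sub-role names follow
# the slides; the host names and get_handle() are invented for illustration.
HANDLES = {
    ("classifieds", "write"): "classifieds-write.example",
    ("classifieds", "read"): "classifieds-read.example",
    ("classifieds", "long"): "classifieds-long.example",
    ("forums", "read"): "forums-read.example",
    ("archive", "read"): "archive-read.example",
}


def get_handle(role, sub_role):
    """Return the host configured for a (role, sub-role) pair."""
    try:
        return HANDLES[(role, sub_role)]
    except KeyError:
        raise LookupError(f"no db handle named {role}/{sub_role}") from None


host = get_handle("classifieds", "long")  # slow queries go to their own boxes
```

Because callers ask for a handle by name, a role can move to different hardware by changing the table rather than the calling code.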
Horizontal Partitioning: Hydra
[Diagram: a client in front of cluster_01, cluster_02, cluster_03, ... cluster_N]
Horizontal Partitioning: Hydra
● Need to retrofit a lot of code
● Need non-blocking Perl MySQL client
● Wrapped http://code.google.com/p/perl-mysql-async/
● Eventually can size DB boxes based on price/power and adjust mapping function(s)
– Choose hardware first
– Make the db “fit”
● Archiving lets us age a cluster instead of migrating its data to a new one.
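The slides do not show Hydra's mapping function, so the following is only a sketch of the general idea: a stable hash of a key picks a cluster, and per-cluster weights let a bigger box take a larger share. The weights, the choice of CRC32, and the helper names are all assumptions.

```python
import zlib

# Hypothetical weights: a bigger box gets more slots. Cluster names follow
# the cluster_NN labels on the slide; the numbers are invented.
WEIGHTS = {"cluster_01": 1, "cluster_02": 1, "cluster_03": 2}


def build_slots(weights):
    """Expand {cluster: weight} into a flat, deterministically ordered slot list."""
    slots = []
    for name in sorted(weights):
        slots.extend([name] * weights[name])
    return slots


def cluster_for(key, slots):
    """Pick the cluster for a key (e.g. a posting ID) with a stable CRC32 hash."""
    return slots[zlib.crc32(str(key).encode("utf-8")) % len(slots)]


SLOTS = build_slots(WEIGHTS)
home = cluster_for(123456789, SLOTS)  # same answer in every process, every run
```

Changing the weights in this sketch remaps existing keys, which is the kind of data migration the last bullet hopes to avoid by letting old clusters age out.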
Search Evolution
● Problem: Users want to find stuff.
● Solution: Use MySQL Full Text.
● ...time passes...
● Problem: MySQL Full Text Doesn't Scale!
● Solution: Use Sphinx.
● ...time passes...
● Problem: Sphinx doesn't scale!
● Solution: Patch Sphinx.
MySQL Full-Text Problems
● Hitting invisible limits
– CPU not pegged, Memory available
– Disk I/O not unreasonable
– Locking / Mutex contention? Probably.
● MyISAM has occasional crashing / corruption
● 5 clusters of 5 machines
– Partitioning based on city and category
– All “hand balanced” and high-maintenance
● ~30M queries/day
– Close to limits
Sphinx: My First CL Project
● Sphinx is designed for text search
● Fast and lean C++ code
● Forking model scales well on multi-core
● Control over indexing, weighting, etc.
● Also spent some time looking at Apache Solr
Search Implementation Details
● Partitioning based on cities (each has a numeric id)
● Attributes vs. Keywords
● Persistent Connections
– Custom client and server modifications
● Minimal stopword list
● Partition into 2 clusters (1 master, 4 slaves)
Sphinx Incremental Indexing
● Re-index every N minutes
● Use main + delta strategy
– Adopted as: index + today + delta
– One set per city (~500 * 3)
● Slaves handle live queries, update via rsync
● Need lots of FDs
● Use all 4 cores to index
● Every night, perform “daily merge”
● Generate config files via Perl
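With roughly 500 cities and three indexes each, the config is too large to maintain by hand. The talk generates it with Perl; the sketch below shows the same idea in Python. The index naming scheme, the data directory, and the helper names are hypothetical, and each stanza carries only the `source` and `path` directives, so it is far from a complete Sphinx config.

```python
TIERS = ("index", "today", "delta")  # the three tiers named on the slide


def index_names(city_id):
    """Return the three index names for one city (naming scheme is assumed)."""
    return [f"city{city_id}_{tier}" for tier in TIERS]


def index_stanza(name, data_dir="/var/data/sphinx"):
    """Render a minimal index block; data_dir is a made-up path."""
    return (
        f"index {name}\n"
        "{\n"
        f"    source = {name}\n"
        f"    path   = {data_dir}/{name}\n"
        "}\n"
    )


def generate_config(city_ids):
    """Concatenate one stanza per (city, tier) pair."""
    return "\n".join(
        index_stanza(name) for city in city_ids for name in index_names(city)
    )


config_text = generate_config(range(1, 501))  # 500 cities * 3 tiers
```

A search for one city would then name all three of that city's indexes in the query, for example `index_names(7)`.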
Sphinx Issues
● Merge bugs [fixed]
● File descriptor corruption [fixed]
● Persistent connections [fixed]
– Overhead of fork() was substantial in our testing
– 200 queries/sec vs. 1,000 queries/sec per box
● Missing attribute updates [unreported]
● Bogus docids in responses
● We need to upgrade to latest Sphinx soon
● Andrew and team have been excellent!
Search Project Results
● From 25 MySQL Boxes to 10 Sphinx
● Lots more headroom!
● New Features
– Nearby Search
● No seizing or locking issues
● 1,000+ qps during peak w/room to grow
● 50M queries per day w/steady growth
● Cluster partitioning built but not needed (yet?)
● Better separation of code
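To put the daily totals next to the peak figure, the per-day numbers can be converted to an average rate. This is plain arithmetic on the figures quoted in the slides; it assumes load spread evenly over 24 hours, which real traffic is not.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_qps(queries_per_day):
    """Average queries per second if load were spread evenly over a day."""
    return queries_per_day / SECONDS_PER_DAY


mysql_avg = average_qps(30_000_000)   # old MySQL full-text setup: about 347 qps
sphinx_avg = average_qps(50_000_000)  # Sphinx setup: about 579 qps

print(f"MySQL full-text average: {mysql_avg:.0f} qps")
print(f"Sphinx average:          {sphinx_avg:.0f} qps")
```

The quoted 1,000+ qps peak is therefore a bit under twice the daily average.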
Sphinx Wishlist
● Efficient delete handling (kill lists)
● Non-fatal “missing” indexes
● Index dump tool
● Live document add/change/delete
● Built-in replication
● Stats and counters
● Text attributes
● Protocol checksum
Data Archiving, Replication, Indexes
● Problem: We want to keep everything.
● Solution: Archive to an archive cluster.
● Problem: Archiving is too painful. Index updates are expensive! Slaves affected.
● Solution: Archive with home-grown eventually consistent replication.
Data Archiving: OOB Replication
● Eventual Consistency
● Master process
– SET SQL_LOG_BIN=0
– Select expired IDs
– Export records from live master
– Import records into archive master
– Delete expired from live master
– Add IDs to list
Data Archiving: OOB Replication
● Slave process
– One per MySQL slave
– Throttled to minimize impact
– State kept on slave
    ● Clone friendly
– Simple logic
    ● Select expired IDs added since my sequence number
    ● Delete expired records
    ● Update local “last seen” sequence number
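The master and slave steps can be walked through end to end in a small simulation. This sketch uses in-memory SQLite databases as stand-ins for the MySQL servers, so SET SQL_LOG_BIN=0 has no equivalent here. The table and column names are hypothetical, the batch limit is one assumed way to throttle, and the slave reads the ID list straight from the live master where the real slave would presumably see a replicated copy.

```python
import sqlite3


def new_db():
    """Create an in-memory stand-in for one MySQL server (schema is invented)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE postings (id INTEGER PRIMARY KEY, body TEXT, expired INTEGER);
        CREATE TABLE expired_ids (seq INTEGER PRIMARY KEY AUTOINCREMENT,
                                  posting_id INTEGER);
        CREATE TABLE archive_state (last_seen INTEGER);
        INSERT INTO archive_state VALUES (0);
    """)
    return db


def master_pass(live, archive):
    """Select expired rows, copy them to the archive, delete them, log their IDs."""
    rows = live.execute(
        "SELECT id, body, expired FROM postings WHERE expired = 1 ORDER BY id"
    ).fetchall()
    archive.executemany("INSERT OR REPLACE INTO postings VALUES (?, ?, ?)", rows)
    archive.commit()
    ids = [(row[0],) for row in rows]
    live.executemany("DELETE FROM postings WHERE id = ?", ids)
    live.executemany("INSERT INTO expired_ids (posting_id) VALUES (?)", ids)
    live.commit()
    return len(rows)


def slave_pass(slave, id_list, batch=100):
    """Delete rows archived since this slave's last pass; batch caps the work."""
    (last_seen,) = slave.execute("SELECT last_seen FROM archive_state").fetchone()
    new = id_list.execute(
        "SELECT seq, posting_id FROM expired_ids WHERE seq > ? ORDER BY seq LIMIT ?",
        (last_seen, batch),
    ).fetchall()
    if not new:
        return 0
    slave.executemany("DELETE FROM postings WHERE id = ?", [(pid,) for _, pid in new])
    slave.execute("UPDATE archive_state SET last_seen = ?", (new[-1][0],))
    slave.commit()
    return len(new)


live, archive, slave = new_db(), new_db(), new_db()
for db in (live, slave):  # the slave starts as a copy of the live master
    db.executemany("INSERT INTO postings VALUES (?, ?, ?)",
                   [(1, "old couch", 1), (2, "bike", 0), (3, "old desk", 1)])
    db.commit()

archived = master_pass(live, archive)  # rows 1 and 3 move to the archive
deleted = slave_pass(slave, live)      # the slave catches up on its own schedule
```

Since the sequence number lives on the slave itself, a slave cloned from another one resumes from the right place, which is one reading of the “clone friendly” bullet.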
Long Term Data Archiving
● Schema coupling is bad
– ALTER TABLE takes forever
– Lots of NULLs flying around
● CouchDB or similar long-term?
– Schema-free feels like a good fit
● Tested some home grown solutions already
● Separate storage and indexing?
– Indexing with Sphinx?
Drizzle, XtraDB, Future Stuff
● CouchDB looks very interesting. Maybe for archive?
● XtraDB / InnoDB plugin
– Better concurrency
– Better tuning of InnoDB internals
● libdrizzle + Perl
– DBI/DBD may not fit an async model well
– Can talk to both MySQL and Drizzle!
● Oracle buying Sun?!?!
We're Hiring!
● Work in San Francisco
● Flexible, Small Company
● Excellent Benefits
● Help Millions of People Every Week
● We Need Perl/MySQL Hackers
● Come Help us Scale and Grow
Questions?