MySQL and Search at Craigslist

Jeremy Zawodny

http://craigslist.org/

http://jeremy.zawodny.com/blog/

Who Am I?

● Creator and co-author of High Performance MySQL

● Creator of mytop

● Perl Hacker

● MySQL Geek

● Craigslist Engineer (as of July, 2008)

– MySQL, Data, Search, Perl

● Ex-Yahoo (Perl, MySQL, Search, Web Services)

What is Craigslist?

● Local Classifieds

– Jobs, Housing, Autos, Goods, Services

● ~500 cities world-wide

● Free

– Except for jobs in ~18 cities and brokered apartments in NYC

– Over 20B pageviews/month

– 50M monthly users

– 50+ countries, multiple languages

– 40+M ads/month, 10+M images

What is Craigslist?

● Forums

– 100M posts

– 100s of forums

Technical and other Challenges

● High ad churn rate

– Post half-life can be short

● Growth

● High traffic volume

● Back-end tools and data analysis needs

● Growth

● Need to archive postings... forever!

– 100s of millions, searchable

● Internationalization and UTF-8

Technical and other Challenges

● Small Team

– Fires take priority

– Infrastructure gets creaky

– Organic code and schema growth over years

● Growth

● Lack of abstractions

– Too much embedded SQL in code

● Documentation vs. Institutional Knowledge

– “Why do we have things configured like this?”

Goals

● Use Open Source

● Keep infrastructure small and simple

– Lower power is good!

– Efficiency all around

– Do more with less

● Keep site easy and approachable

– Don't overload with features

– People are easily confused

Craigslist Internals Overview

[Architecture diagram. Components shown:]

● Load Balancer

● Read Proxy Array and Write Proxy Array (Perl + memcached)

● Web Read Array (Apache 1.3 + mod_perl)

● Object Cache (Perl + memcached)

● Read DB Cluster (MySQL 5.0.xx)

● Search Cluster (Sphinx)

● Not Included:

– user db, image db

– async tasks, email

– accounting, internal tools

– and more!

Vertical Partitioning: Roles

[Diagram. Role boxes: Users, Classifieds, Forums, Stats, Archive. Sub-role labels: Write, Read, Long, Trash.]

Vertical Partitioning

● Different roles have different access patterns

– Sub-roles based on query type

● Easier to manage and scale

● Logical, self-contained data

● Servers may not need to be as big/fast/expensive

● Difficult to do retroactively

● Various named db “handles” in code
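The named db “handles” idea can be sketched as a lookup from (role, sub-role) to a connection target. This is only an illustration: the role and sub-role names echo the slides, while the hostnames, the table of pairs, and the `handle_for` helper are hypothetical, and reading “long” as “long-running queries” is an assumption.

```python
# Hypothetical role/sub-role -> host map. The role names come from the slides;
# the hostnames are invented for illustration only.
HANDLES = {
    ("classifieds", "write"): "db-classifieds-master",
    ("classifieds", "read"): "db-classifieds-read",
    ("classifieds", "long"): "db-classifieds-long",  # assumed: long-running queries
    ("forums", "read"): "db-forums-read",
    ("archive", "read"): "db-archive-read",
}


def handle_for(role, sub_role):
    """Return the named handle for a role/sub-role pair, or raise KeyError."""
    try:
        return HANDLES[(role, sub_role)]
    except KeyError:
        raise KeyError("no db handle configured for %s/%s" % (role, sub_role)) from None


if __name__ == "__main__":
    # Application code asks for a handle by name instead of hard-coding a host.
    print(handle_for("classifieds", "long"))
```

Keeping the mapping in one place means a role can move to smaller or larger hardware without touching the calling code.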

Horizontal Partitioning: Hydra

[Diagram: one client in front of cluster_01, cluster_02, cluster_03, ..., cluster_N]

Horizontal Partitioning: Hydra

● Need to retrofit a lot of code

● Need non-blocking Perl MySQL client

● Wrapped http://code.google.com/p/perl-mysql-async/

● Eventually can size DB boxes based on price/power and adjust mapping function(s)

– Choose hardware first

– Make the db “fit”

● Archiving lets us age a cluster instead of migrating its data to a new one.
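One way to picture an adjustable mapping function is a weighted modulo: each cluster gets a share of ids proportional to a weight chosen from its price/power. The slides do not describe Hydra's actual mapping, so `cluster_for`, the weights, and the use of a numeric posting id are all assumptions made for this sketch.

```python
def cluster_for(posting_id, weights):
    """Map a numeric id to a cluster name, in proportion to each cluster's weight.

    weights is a dict such as {"cluster_01": 1, "cluster_02": 3}; a larger
    weight means the box is expected to carry a larger share of the data.
    """
    total = sum(weights.values())
    if total <= 0 or any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative and sum to a positive number")
    slot = posting_id % total
    # Walk clusters in a stable (sorted) order so the mapping is deterministic.
    for name in sorted(weights):
        if slot < weights[name]:
            return name
        slot -= weights[name]
    raise AssertionError("unreachable: slot always falls inside some weight")


if __name__ == "__main__":
    weights = {"cluster_01": 1, "cluster_02": 3}
    for posting_id in range(4):
        print(posting_id, cluster_for(posting_id, weights))
```

Changing the weights remaps existing ids, which is why a scheme like this pairs naturally with letting old clusters age out rather than migrating their data.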

Search Evolution

● Problem: Users want to find stuff.

● Solution: Use MySQL Full Text.

● ...time passes...

● Problem: MySQL Full Text doesn't scale!

● Solution: Use Sphinx.

● ...time passes...

● Problem: Sphinx doesn't scale!

● Solution: Patch Sphinx.

MySQL Full-Text Problems

● Hitting invisible limits

– CPU not pegged, Memory available

– Disk I/O not unreasonable

– Locking / Mutex contention? Probably.

● MyISAM has occasional crashing / corruption

● 5 clusters of 5 machines

– Partitioning based on city and category

– All “hand balanced” and high-maintenance

● ~30M queries/day

– Close to limits

Sphinx: My First CL Project

● Sphinx is designed for text search

● Fast and lean C++ code

● Forking model scales well on multi-core

● Control over indexing, weighting, etc.

● Also spent some time looking at Apache Solr

Search Implementation Details

● Partitioning based on cities (each has a numeric id)

● Attributes vs. Keywords

● Persistent Connections

– Custom client and server modifications

● Minimal stopword list

● Partition into 2 clusters (1 master, 4 slaves)

Sphinx Incremental Indexing

● Re-index every N minutes

● Use main + delta strategy

– Adopted as: index + today + delta

– One set per city (~500 * 3)

● Slaves handle live queries, update via rsync

● Need lots of FDs

● Use all 4 cores to index

● Every night, perform “daily merge”

● Generate config files via Perl
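The index + today + delta layout can be sketched as a naming scheme with one set of three indexes per city. The slides only say the config files are generated (via Perl); the `index_<id>` / `today_<id>` / `delta_<id>` names and the helpers below are hypothetical. The comma-separated target relies on Sphinx accepting several index names in a single query.

```python
INDEX_KINDS = ("index", "today", "delta")  # main index, today's merge target, recent delta


def index_names(city_id):
    """Return the three index names for one city (hypothetical naming scheme)."""
    return ["%s_%d" % (kind, city_id) for kind in INDEX_KINDS]


def search_target(city_id):
    """Build the index list a query for this city would search across."""
    return ",".join(index_names(city_id))


def all_index_names(city_ids):
    """Every index a generated config would need to declare."""
    return [name for city_id in city_ids for name in index_names(city_id)]


if __name__ == "__main__":
    print(search_target(7))
    # ~500 cities * 3 indexes each, which is why file descriptors add up quickly.
    print(len(all_index_names(range(1, 501))))
```

A frequent re-index only has to rebuild the small delta for each city, while the nightly merge folds accumulated changes back into the larger indexes.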

Sphinx Issues

● Merge bugs [fixed]

● File descriptor corruption [fixed]

● Persistent connections [fixed]

– Overhead of fork() was substantial in our testing

– 200 queries/sec vs. 1,000 queries/sec per box

● Missing attribute updates [unreported]

● Bogus docids in responses

● We need to upgrade to latest Sphinx soon

● Andrew and team have been excellent!

Search Project Results

● From 25 MySQL Boxes to 10 Sphinx

● Lots more headroom!

● New Features

– Nearby Search

● No seizing or locking issues

● 1,000+ qps during peak w/room to grow

● 50M queries per day w/steady growth

● Cluster partitioning built but not needed (yet?)

● Better separation of code
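To relate the per-day figures to the per-second ones, a quick conversion helps. This is plain arithmetic on numbers quoted in the slides (~30M queries/day on MySQL full-text, 50M queries/day on Sphinx); it gives the average rate only, so peaks such as the 1,000+ qps figure sit above it.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_qps(queries_per_day):
    """Average queries per second implied by a daily query count."""
    return queries_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    print(round(average_qps(30_000_000), 1))  # 347.2 (MySQL full-text era)
    print(round(average_qps(50_000_000), 1))  # 578.7 (Sphinx)
```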

Sphinx Wishlist

● Efficient delete handling (kill lists)

● Non-fatal “missing” indexes

● Index dump tool

● Live document add/change/delete

● Built-in replication

● Stats and counters

● Text attributes

● Protocol checksum

Data Archiving, Replication, Indexes

● Problem: We want to keep everything.

● Solution: Archive to an archive cluster.

● Problem: Archiving is too painful. Index updates are expensive! Slaves affected.

● Solution: Archive with home-grown eventually consistent replication.

Data Archiving: OOB Replication

● Eventual Consistency

● Master process

– SET SQL_LOG_BIN=0

– Select expired IDs

– Export records from live master

– Import records into archive master

– Delete expired from live master

– Add IDs to list
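The master-side steps can be sketched as a single function. This is a minimal illustration and not the Craigslist code: it uses in-memory SQLite databases as stand-ins for the live and archive MySQL masters, and the `postings` / `expired_ids` tables, their columns, and the choice to keep the ID list on the live master are all assumptions.

```python
import sqlite3


def make_db(with_list=False):
    """Create an in-memory stand-in database (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE postings (id INTEGER PRIMARY KEY, body TEXT, expires INTEGER)")
    if with_list:
        # seq is the sequence number slaves would later use to find new entries.
        db.execute(
            "CREATE TABLE expired_ids (seq INTEGER PRIMARY KEY AUTOINCREMENT, posting_id INTEGER)"
        )
    return db


def archive_expired(live, archive, now):
    """Move postings with expires <= now from live to archive; return moved ids."""
    # On MySQL the session would first run SET SQL_LOG_BIN=0 so these changes
    # stay out of the binary log. SQLite has no equivalent, so it is omitted.
    rows = live.execute(
        "SELECT id, body, expires FROM postings WHERE expires <= ? ORDER BY id", (now,)
    ).fetchall()                                   # select expired IDs + export
    if not rows:
        return []
    ids = [(row[0],) for row in rows]
    with archive:                                  # import into archive master
        archive.executemany("INSERT OR REPLACE INTO postings VALUES (?, ?, ?)", rows)
    with live:                                     # delete from live, then add IDs to list
        live.executemany("DELETE FROM postings WHERE id = ?", ids)
        live.executemany("INSERT INTO expired_ids (posting_id) VALUES (?)", ids)
    return [i[0] for i in ids]
```

Importing into the archive before deleting from the live master means a failure between the two steps leaves a duplicate rather than a lost record.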

Data Archiving: OOB Replication

● Slave process

– One per MySQL slave

– Throttled to minimize impact

– State kept on slave

  ● Clone friendly

– Simple logic

  ● Select expired IDs added since my sequence number

  ● Delete expired records

  ● Update local “last seen” sequence number
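The slave-side loop can be sketched the same way. Again this is an illustration under assumptions: SQLite stands in for a MySQL slave, the expired-ID list is passed in as (sequence number, posting id) pairs, and the `archive_state` table, `apply_expirations`, and the batch-size throttle are hypothetical.

```python
import sqlite3


def make_slave(posting_ids):
    """Create an in-memory stand-in for one slave (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE postings (id INTEGER PRIMARY KEY)")
    db.executemany("INSERT INTO postings VALUES (?)", [(i,) for i in posting_ids])
    # The "last seen" sequence number lives in the slave's own data, so a
    # cloned slave carries its position with it.
    db.execute("CREATE TABLE archive_state (last_seen INTEGER)")
    db.execute("INSERT INTO archive_state VALUES (0)")
    db.commit()
    return db


def apply_expirations(slave, expired_list, batch_size=100):
    """Delete one batch of expired postings newer than last_seen; return its size."""
    (last_seen,) = slave.execute("SELECT last_seen FROM archive_state").fetchone()
    pending = sorted((seq, pid) for seq, pid in expired_list if seq > last_seen)
    batch = pending[:batch_size]          # crude throttle: bounded work per call
    if not batch:
        return 0
    with slave:                           # delete and bookkeeping commit together
        slave.executemany("DELETE FROM postings WHERE id = ?", [(pid,) for _, pid in batch])
        slave.execute("UPDATE archive_state SET last_seen = ?", (batch[-1][0],))
    return len(batch)
```

A caller would invoke `apply_expirations` repeatedly with a pause between batches; because deletes are idempotent, re-running a batch after a crash is harmless.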

Long Term Data Archiving

● Schema coupling is bad

– ALTER TABLE takes forever

– Lots of NULLs flying around

● CouchDB or similar long-term?

– Schema-free feels like a good fit

● Tested some home grown solutions already

● Separate storage and indexing?

– Indexing with Sphinx?

Drizzle, XtraDB, Future Stuff

● CouchDB looks very interesting. Maybe for archive?

● XtraDB / InnoDB plugin

– Better concurrency

– Better tuning of InnoDB internals

● libdrizzle + Perl

– DBI/DBD may not fit an async model well

– Can talk to both MySQL and Drizzle!

● Oracle buying Sun?!?!

We're Hiring!

● Work in San Francisco

● Flexible, Small Company

● Excellent Benefits

● Help Millions of People Every Week

● We Need Perl/MySQL Hackers

● Come Help us Scale and Grow

Questions?
