Search at Twitter: Presented by Michael Busch, Twitter
Search @twitter
Michael Busch
Agenda
‣ Introduction
- Search Architecture
- Lucene Extensions
- Outlook
Introduction
• Twitter has more than 284 million monthly active users.
• 500 million tweets are sent per day.
• More than 300 billion tweets have been sent since the company was founded in 2006.
• Tweets-per-second record: one-second peak of 143,199 TPS.
• More than 2 billion search queries per day.
Search @twitter
Agenda
- Introduction
‣ Search Architecture
- Lucene Extensions
- Outlook
Search Architecture
[Diagram: raw tweets from the RT stream pass through the Analyzer/Partitioner and are written as analyzed tweets to the RT index (Earlybird). Raw tweets are also kept in the tweet archive on HDFS, where a MapReduce analyzer produces analyzed tweets for the archive index. Blender receives search requests and searches the RT index and the archive index.]
Search Architecture
[Diagram: the same layout with a queue added; the inputs are tweets and updates (deletes and engagement, e.g. retweets/favs).]
Search Architecture
[Diagram: Blender receives search requests and searches the RT index (Earlybird) and the archive index.]
• Blender is our Thrift service aggregator
• Queries multiple Earlybirds, merges results
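The talk does not say how Blender merges results, so the following is only a rough Java sketch of the scatter-gather idea. It assumes each Earlybird partition returns a list of tweet ids and that a larger id means a newer tweet; `ResultMerger` and `merge` are hypothetical names, not Twitter's API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical aggregator; the real Blender is a Thrift service with its own ranking.
final class ResultMerger {
    /** Merges per-partition hit lists into one newest-first list of at most n tweet ids. */
    static List<Long> merge(List<List<Long>> partitionHits, int n) {
        List<Long> all = new ArrayList<>();
        for (List<Long> hits : partitionHits) {
            all.addAll(hits);                      // gather the hits of every partition
        }
        all.sort(Collections.reverseOrder());      // assumption: larger id = newer tweet
        return new ArrayList<>(all.subList(0, Math.min(n, all.size())));
    }
}
```

A production aggregator would more likely merge already-sorted partial results with a heap instead of re-sorting everything; the full sort keeps the sketch short.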
[Diagram labels: social graph, user search]
Search Architecture
[Diagram: RT index (Earlybird), archive index, user search]
• For historical reasons, these used to be entirely different codebases, although they had similar features and technologies
• Over time, cross-dependencies were introduced to share code
[Diagram: RT index (Earlybird), archive index, and user search, all built on Lucene]
Search Architecture
[Diagram: the same three indexes and Lucene, with a Lucene Extensions layer added]
• New Lucene extension package
• This package is truly generic and has no dependency on an actual product/index
• It contains Twitter’s extensions for real-time search, a thin segment management layer and other features
Search @twitter
Agenda
- Introduction
- Search Architecture
‣ Lucene Extensions
- Outlook
Lucene Extensions
Lucene Extension Library
• Abstraction layer for Lucene index segments
• Real-time writer for in-memory index segments
• Schema-based Lucene document factory
• Real-time faceting
• API layer for Lucene segments
  • *IndexSegmentWriter
  • *IndexSegmentAtomicReader
• Two implementations
  • In-memory: RealtimeIndexSegmentWriter (and reader)
  • On-disk: LuceneIndexSegmentWriter (and reader)
Lucene Extension Library
• IndexSegments can be built ...
  • in realtime
  • on Mesos or Hadoop (Mapreduce)
  • locally on serving machines
• Cluster-management code that deals with IndexSegments
  • Share segments across serving machines using HDFS
  • Can rebuild segments (e.g. to upgrade Lucene version, change data schema, etc.)
Lucene Extension Library
[Diagram: HDFS connects three Earlybird serving machines with the segment builders: Mesos, Hadoop (MR), and the RT pipeline]
Lucene Extension Library
RealtimeIndexSegmentWriter
• Modified Lucene index implementation optimized for realtime search
• IndexWriter buffer is searchable (no need to flush to allow searching)
• In-memory
• Lock-free concurrency model for best performance
Concurrency - Definitions
• Pessimistic locking
  • A thread holds an exclusive lock on a resource while an action is performed [mutual exclusion]
  • Usually used when conflicts are expected to be likely
• Optimistic locking
  • Operations are attempted atomically without holding a lock; conflicts can be detected, and retry logic is often used when they occur
  • Usually used when conflicts are expected to be the exception
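As a minimal Java sketch of the optimistic approach, assuming a shared counter (`OptimisticCounter` and its methods are illustrative names, not from the talk): the update is attempted with compare-and-set and retried only when another thread changed the value first.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class; not part of Twitter's code.
final class OptimisticCounter {
    private final AtomicInteger value = new AtomicInteger();

    /** Adds delta without taking a lock; retries if another thread changed the value in between. */
    int addOptimistically(int delta) {
        while (true) {
            int current = value.get();              // read the current state
            int updated = current + delta;          // compute the new state privately
            if (value.compareAndSet(current, updated)) {
                return updated;                     // no conflict: the update is now visible
            }
            // Conflict detected: another thread won the race, so loop and retry.
        }
    }

    int get() {
        return value.get();
    }
}
```

No thread ever blocks here; a failed compare-and-set only means that some other thread succeeded, so the system as a whole keeps making progress.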
Concurrency - Definitions
• Non-blocking algorithm
Ensures that threads competing for shared resources do not have their execution indefinitely postponed by mutual exclusion.
• Lock-free algorithm
A non-blocking algorithm is lock-free if there is guaranteed system-wide progress.
• Wait-free algorithm
A non-blocking algorithm is wait-free if there is guaranteed per-thread progress.
* Source: Wikipedia
Concurrency
• Having a single writer thread simplifies our problem: no locks have to be used to protect data structures from corruption (only one thread modifies data)
• But: we have to make sure that all readers always see a consistent state of all data structures -> this is much harder than it sounds!
• In Java, there is no guarantee that one thread sees the changes made by another thread in program execution order, unless both threads cross the same memory barrier -> safe publication
• Safe publication can be achieved in different, subtle ways. Read the great book “Java Concurrency in Practice” by Brian Goetz for more information!
Java Memory Model
• Program order rule
Each action in a thread happens-before every action in that thread that comes later in the program order.
• Volatile variable rule
A write to a volatile field happens-before every subsequent read of that same field.
• Transitivity
If A happens-before B, and B happens-before C, then A happens-before C.
* Source: Brian Goetz: Java Concurrency in Practice
Concurrency
[Diagram: Thread 1 and Thread 2 on a timeline, with a cache between the threads and RAM. RAM holds int x with the value 0.]
Thread 1: x = 5;
Thread 1 writes x = 5 to the cache.
Thread 2: while(x != 5);
This condition will likely never become false!
Concurrency
[Diagram: the same timeline with a second field, volatile int b.]
Thread 1: x = 5; b = 1;
Thread 1 writes b = 1 to RAM, because b is volatile.
Thread 2: int dummy = b; while(x != 5);
Thread 2 reads the volatile b.
• Program order rule: Each action in a thread happens-before every action in that thread that comes later in the program order.
• Volatile variable rule: A write to a volatile field happens-before every subsequent read of that same field.
• Transitivity: If A happens-before B, and B happens-before C, then A happens-before C.
If Thread 2's read of b comes after Thread 1's write, then x = 5 happens-before b = 1 (program order), b = 1 happens-before int dummy = b (volatile variable rule), and int dummy = b happens-before while(x != 5) (program order); by transitivity, x = 5 happens-before the read of x.
This condition will be false, i.e. x==5. The volatile write and read of b act as the memory barrier.
• Note: x itself doesn’t have to be volatile. There can be many variables like x, but we need only a single volatile field.
Demo
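The x / b example can be written as a small runnable Java sketch. The class and method names (`VolatilePiggyback`, `writer`, `reader`) are hypothetical. As one deliberate difference from the slide, the reader spins on the volatile field b instead of reading it once, so the guarantee does not depend on timing.

```java
// Hypothetical demo class mirroring the x / b example.
final class VolatilePiggyback {
    int x;              // plain field, deliberately NOT volatile
    volatile int b;     // the single volatile field that acts as the memory barrier

    /** Thread 1: write x first, then publish it with a volatile write. */
    void writer() {
        x = 5;
        b = 1;
    }

    /** Thread 2: waits until the volatile write is visible, then reads x. */
    int reader() {
        while (b != 1) {
            Thread.onSpinWait();    // spin on the volatile read (Java 9+)
        }
        // Program order + volatile variable rule + transitivity guarantee x == 5 here.
        return x;
    }

    public static void main(String[] args) {
        VolatilePiggyback p = new VolatilePiggyback();
        new Thread(p::writer).start();
        System.out.println("x = " + p.reader());
    }
}
```

Swapping the two statements in `writer()` would break the guarantee: the write to x would no longer happen-before the volatile write to b.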
Concurrency
[Diagram: IndexWriter and IndexReader on a timeline]
IndexWriter: write 100 docs; maxDoc = 100; write more docs
IndexReader: in IR.open(): read maxDoc; search up to maxDoc
maxDoc is volatile, so the write of maxDoc = 100 happens-before the read of maxDoc in IR.open().
• Only maxDoc is volatile. All other fields that IW writes to and IR reads from don’t need to be!
Wait-free
• Not a single exclusive lock
• Writer thread can always make progress
• Optimistic locking (retry-logic) in a few places for searcher thread
• Retry logic very simple and guaranteed to always make progress
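The maxDoc pattern can be sketched outside Lucene as follows. This is an illustration under simplifying assumptions, not the Earlybird implementation: a fixed-capacity int array stands in for the index data, there is exactly one writer thread, and `AppendOnlyDocs` with its methods is a hypothetical name.

```java
// Hypothetical single-writer, multi-reader buffer; names are illustrative.
final class AppendOnlyDocs {
    private final int[] docs;        // plain array, written only by the single writer thread
    private volatile int maxDoc;     // the only volatile field

    AppendOnlyDocs(int capacity) {
        docs = new int[capacity];
    }

    /** Writer thread only: store the document, then publish it by advancing maxDoc. */
    void add(int doc) {
        int next = maxDoc;
        docs[next] = doc;            // plain write
        maxDoc = next + 1;           // volatile write makes all earlier writes visible
    }

    /** Reader threads: snapshot maxDoc once (as in IR.open()), then search below it. */
    int countMatches(int wanted) {
        int limit = maxDoc;          // volatile read
        int hits = 0;
        for (int i = 0; i < limit; i++) {
            if (docs[i] == wanted) {
                hits++;
            }
        }
        return hits;
    }
}
```

A reader never looks at slots at or beyond its snapshot of maxDoc, so documents the writer adds later cannot produce an inconsistent view.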
In-memory Real-time Index
• Highly optimized for GC - all data is stored in blocked native arrays
• v1: Optimized for tweets with a term position limit of 255
• v2: Support for 32 bit positions without performance degradation
• v2: Basic support for out-of-order posting list inserts
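One way to picture "blocked native arrays" is a pool of fixed-size int[] blocks addressed by a single integer. The sketch below is an assumption about the general technique, not Twitter's code: the block size, the class name `IntBlocks`, and the single-threaded design are all illustrative.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical block pool: a few large primitive arrays instead of many small objects.
final class IntBlocks {
    private static final int BLOCK_SHIFT = 15;                // assumed block size: 32,768 ints
    private static final int BLOCK_SIZE = 1 << BLOCK_SHIFT;
    private static final int BLOCK_MASK = BLOCK_SIZE - 1;

    private final List<int[]> blocks = new ArrayList<>();
    private int size;                                         // number of ints stored so far

    /** Appends a value and returns its address in the logical array. */
    int add(int value) {
        if ((size & BLOCK_MASK) == 0) {
            blocks.add(new int[BLOCK_SIZE]);                  // grow by one block; old data is never copied
        }
        blocks.get(size >>> BLOCK_SHIFT)[size & BLOCK_MASK] = value;
        return size++;
    }

    /** Reads the value stored at an address returned by add(). */
    int get(int address) {
        return blocks.get(address >>> BLOCK_SHIFT)[address & BLOCK_MASK];
    }
}
```

Because the garbage collector sees only a handful of large primitive arrays, it has very few objects to trace, which is the usual motivation for this layout.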
In-memory Real-time Index
• RT term dictionary
  • Term lookups using a lock-free hashtable in O(1)
  • v2: Additional probabilistic, lock-free skip list maintains ordering on terms
    • Perfect skip list not an option: out-of-order inserts would require rebalancing, which is impractical with our lock-free index
    • In a probabilistic skip list, the tower height of a new (out-of-order) item can be determined without knowing its insert position, simply by rolling a die
In-memory Real-time Index
• Perfect skip list
Inserting a new element in the middle of this skip list requires re-balancing the towers.
In-memory Real-time Index
• Probabilistic skip list
Tower height is determined by rolling a die BEFORE knowing the insert location; the tower height never has to change for an element, which simplifies memory allocation and concurrency.
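The tower-height choice of a probabilistic skip list can be sketched in a few lines of Java. The promotion probability of 1/2 and the height cap of 16 are assumptions made for illustration; the talk does not give Twitter's actual parameters, and `TowerHeights` is a hypothetical name.

```java
import java.util.Random;

// Hypothetical helper showing how a tower height is drawn.
final class TowerHeights {
    static final int MAX_HEIGHT = 16;   // assumed cap, not taken from the talk

    /** Flips a fair coin until it comes up tails; the result depends only on randomness. */
    static int randomHeight(Random rng) {
        int height = 1;
        while (height < MAX_HEIGHT && rng.nextBoolean()) {
            height++;
        }
        return height;
    }
}
```

The height is fixed before the insert position is known, so the memory for a tower can be allocated once and is never resized.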
Schema-based Document factory
• Apps provide one ThriftSchema per index and create a ThriftDocument for each document
• SchemaDocumentFactory translates ThriftDocument -> Lucene Document using the Schema
• Default field values
• Extended field settings
• Type-system on top of DocValues
• Validation
Schema-based Document factory
[Diagram: SchemaDocumentFactory takes a Thrift Document and the Schema and produces a Lucene Document]
• Validation
• Fill in default values
• Apply correct Lucene field settings
Decouples core package from specific product/index. Similar to Solr/ElasticSearch.
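The validation and default-value steps can be illustrated with a small, dependency-free Java sketch. Plain maps stand in for the ThriftDocument and the Lucene Document, and `SimpleSchema`, `field`, and `toDocument` are hypothetical names; the real SchemaDocumentFactory API is not shown in the talk.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a schema-driven document factory.
final class SimpleSchema {
    /** Field definition: expected Java type plus a default value (null means required). */
    private static final class Field {
        final Class<?> type;
        final Object defaultValue;

        Field(Class<?> type, Object defaultValue) {
            this.type = type;
            this.defaultValue = defaultValue;
        }
    }

    private final Map<String, Field> fields = new LinkedHashMap<>();

    SimpleSchema field(String name, Class<?> type, Object defaultValue) {
        fields.put(name, new Field(type, defaultValue));
        return this;
    }

    /** Validates the input against the schema and fills in default values. */
    Map<String, Object> toDocument(Map<String, Object> input) {
        for (String name : input.keySet()) {
            if (!fields.containsKey(name)) {
                throw new IllegalArgumentException("Unknown field: " + name);
            }
        }
        Map<String, Object> doc = new LinkedHashMap<>();
        for (Map.Entry<String, Field> e : fields.entrySet()) {
            Object value = input.getOrDefault(e.getKey(), e.getValue().defaultValue);
            if (value == null) {
                throw new IllegalArgumentException("Missing required field: " + e.getKey());
            }
            if (!e.getValue().type.isInstance(value)) {
                throw new IllegalArgumentException("Wrong type for field: " + e.getKey());
            }
            doc.put(e.getKey(), value);
        }
        return doc;
    }
}
```

For example, a schema with a required `text` field and a `lang` field defaulting to `"en"` turns the input `{text=hello}` into `{text=hello, lang=en}` and rejects an input without `text`.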
Search @twitter
Agenda
- Introduction
- Search Architecture
- Lucene Extensions
‣ Outlook
Outlook
• Support for parallel (sliced) segments to support partial segment rebuilds and other cool posting list update patterns
• Add remaining missing Lucene features to RT index
  • Index term statistics for ranking
  • Term vectors
  • Stored fields
Questions?
Michael Busch
Backup Slides
Searching for top entities within Tweets
• Task: Find the best photos in a subset of tweets
• We could use a Lucene index, where each photo is a document
• Problem: How to update existing documents when the same photos are tweeted again?
• In-place posting list updates are hard
• Lucene’s updateDocument() is a delete/add operation - expensive and not order-preserving
Searching for top entities within Tweets
• Task: Find the best photos in a subset of tweets
• Could we use our existing time-ordered tweet index?
• Facets!
[Diagram: inverted index (query -> doc ids), forward index (doc id -> document), metadata, facet index (term id -> term label, doc id -> term ids)]
Searching for top entities within Tweets
Storing tweet metadata
[Diagram: facet index mapping doc id -> term ids]
Searching for top entities within Tweets
[Diagram: the query produces matching doc ids (5, 15, 9000, 9002, 100000, 100090); for each matching doc id the facet index supplies its term ids, which are counted in a top-k heap of (id, count) pairs. Heap after the first matches: 48239: 8, 31241: 2. Heap later: 48239: 15, 31241: 12, 85932: 8, 6748: 3.]
Weighted counts (from engagement features) are used for relevance scoring.
All query operators can be used. E.g. find the best photos in San Francisco tweeted by people I follow.
Searching for top entities within Tweets
[Diagram: the inverted index (term id -> term label) translates the ids in the top-k heap into labels]
Id      Count   Label
48239   45      pic.twitter.com/jknui4w
31241   23      pic.twitter.com/dslkfj83
85932   15      pic.twitter.com/acm3ps
6748    11      pic.twitter.com/948jdsd
74294   8       pic.twitter.com/dsjkf15h
3728    5       pic.twitter.com/irnsoa32
Searching for top entities within Tweets
Summary
• Indexing tweet entities (e.g. photos) as facets makes it possible to search and rank top entities using a tweet index
• All query operators supported
• Documents don’t need to be reindexed
• Approach reusable for different use cases, e.g.: best vines, hashtags, @mentions, etc.
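The facet-counting idea can be sketched in Java as follows. It assumes an in-memory facet index from doc id to term ids and uses plain (unweighted) counts; `FacetTopK` and `topTerms` are hypothetical names, and the real system works on Lucene segments with weighted counts.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch of counting facet terms over the docs matched by a query.
final class FacetTopK {
    /**
     * facetIndex maps doc id -> facet term ids; matchingDocs are the doc ids the query matched.
     * Returns up to k term ids, ordered by descending count.
     */
    static List<Integer> topTerms(Map<Integer, int[]> facetIndex, int[] matchingDocs, int k) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int doc : matchingDocs) {
            for (int term : facetIndex.getOrDefault(doc, new int[0])) {
                counts.merge(term, 1, Integer::sum);      // one count per occurrence
            }
        }
        // Min-heap on count: the smallest of the current top k sits at the head.
        Comparator<Map.Entry<Integer, Integer>> byCount = Map.Entry.comparingByValue();
        PriorityQueue<Map.Entry<Integer, Integer>> heap = new PriorityQueue<>(byCount);
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            heap.offer(e);
            if (heap.size() > k) {
                heap.poll();                              // evict the current minimum
            }
        }
        List<Integer> result = new ArrayList<>();
        while (!heap.isEmpty()) {
            result.add(0, heap.poll().getKey());          // smallest first, so insert at the front
        }
        return result;
    }
}
```

Because the matching doc ids come from an ordinary query, any query operator can restrict the set of tweets whose entities are counted, and no document has to be reindexed when an entity is tweeted again.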