
Page 1: Data Infrastructure at LinkedIn - Stanford University

Data Infrastructure at LinkedIn

Shirshanka Das

XLDB 2011


Page 2: Data Infrastructure at LinkedIn - Stanford University

Me

UCLA Ph.D. 2005 (Distributed protocols in content delivery networks)

PayPal (Web frameworks and Session Stores)

Yahoo! (Serving Infrastructure, Graph Indexing, Real-time Bidding in Display Ad Exchanges)

@ LinkedIn (Distributed Data Systems team): Distributed data transport and storage technology (Kafka, Databus, Espresso, ...)

Page 3: Data Infrastructure at LinkedIn - Stanford University

Outline

LinkedIn Products

Data Ecosystem

LinkedIn Data Infrastructure Solutions

Next Play


Page 4: Data Infrastructure at LinkedIn - Stanford University

LinkedIn By The Numbers

120,000,000+ users in August 2011

2 new user registrations per second

4 billion People Searches expected in 2011

2+ million companies with LinkedIn Company Pages

81+ million unique visitors monthly*

150K domains feature the LinkedIn Share Button

7.1 billion page views in Q2 2011

1M LinkedIn Groups

* Based on comScore, Q2 2011


Page 5: Data Infrastructure at LinkedIn - Stanford University


Member Profiles

Page 6: Data Infrastructure at LinkedIn - Stanford University

Signal - faceted stream search


Page 7: Data Infrastructure at LinkedIn - Stanford University

People You May Know


Page 8: Data Infrastructure at LinkedIn - Stanford University

Outline

LinkedIn Products

Data Ecosystem

LinkedIn Data Infrastructure Solutions

Next Play


Page 9: Data Infrastructure at LinkedIn - Stanford University

Three Paradigms: Simplifying the Data Continuum

Online (activity that should be reflected immediately)

• Member Profiles
• Company Profiles
• Connections
• Communications

Nearline (activity that should be reflected soon)

• Signal
• Profile Standardization
• News
• Recommendations
• Search
• Communications

Offline (activity that can be reflected later)

• People You May Know
• Connection Strength
• News
• Recommendations
• Next best idea

Page 10: Data Infrastructure at LinkedIn - Stanford University

Data Infrastructure Toolbox (Online)

Capabilities

• Key-value access
• Rich structures (e.g. indexes)
• Change capture capability
• Search platform
• Graph engine


Page 11: Data Infrastructure at LinkedIn - Stanford University

Data Infrastructure Toolbox (Nearline)

Capabilities

• Change capture streams
• Messaging for site events, monitoring
• Nearline processing


Page 12: Data Infrastructure at LinkedIn - Stanford University

Data Infrastructure Toolbox (Offline)

Capabilities

• Machine learning, ranking, relevance
• Analytics on social gestures


Page 13: Data Infrastructure at LinkedIn - Stanford University

Laying out the tools


Page 14: Data Infrastructure at LinkedIn - Stanford University

Outline

LinkedIn Products

Data Ecosystem

LinkedIn Data Infrastructure Solutions

Next Play


Page 15: Data Infrastructure at LinkedIn - Stanford University

Focus on four systems in Online and Nearline

Data Transport

– Kafka

– Databus

Online Data Stores

– Voldemort

– Espresso


Page 16: Data Infrastructure at LinkedIn - Stanford University

Kafka: High-Volume Low-Latency Messaging System

LinkedIn Data Infrastructure Solutions


Page 17: Data Infrastructure at LinkedIn - Stanford University

Kafka: Architecture

[Architecture diagram: the web tier pushes events to a broker tier holding topics 1..N (written sequentially, served via sendfile); consumers pull events through the Kafka client library, one iterator per topic; ZooKeeper handles consumer offset management and topic/partition ownership. The diagram annotates data rates of 100 MB/sec and 200 MB/sec.]

Scale

• Billions of events, TBs per day
• Inter-colo: few seconds
• Typical retention: weeks

Guarantees

• At-least-once delivery
• Very high throughput
• Low latency
• Durability
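The pull model plus explicit offsets is what yields the at-least-once guarantee: a consumer advances and commits its offset only after it has processed the events, so a crash replays events rather than dropping them. A minimal sketch of that loop in plain Java; KafkaLikeClient is a hypothetical stand-in for the real consumer API:

import java.util.List;

// Hypothetical pull-based client: reads from an explicit offset and
// persists consumed offsets (e.g. in ZooKeeper, as the diagram shows).
interface KafkaLikeClient {
    List<byte[]> pull(String topic, long offset, int maxEvents);
    void commitOffset(String topic, long offset);
    long lastCommittedOffset(String topic);
}

class AtLeastOnceConsumer {
    private final KafkaLikeClient client;

    AtLeastOnceConsumer(KafkaLikeClient client) { this.client = client; }

    void run(String topic) throws InterruptedException {
        long offset = client.lastCommittedOffset(topic);   // resume point after a crash
        while (true) {
            List<byte[]> events = client.pull(topic, offset, 100);
            for (byte[] event : events) {
                process(event);   // a crash here means the event is pulled again: at-least-once
                offset++;
            }
            client.commitOffset(topic, offset);   // commit only after processing
            if (events.isEmpty()) Thread.sleep(100);   // back off when caught up
        }
    }

    void process(byte[] event) { /* application logic */ }
}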

Page 18: Data Infrastructure at LinkedIn - Stanford University

Databus: Timeline-Consistent Change Data Capture

LinkedIn Data Infrastructure Solutions

Page 19: Data Infrastructure at LinkedIn - Stanford University

Databus at LinkedIn

[Architecture diagram: a relay captures on-line changes from the source DB and serves them from an event window; a bootstrap service keeps a consistent snapshot at a point U for consumers starting from an older point; consumers 1..n receive changes through the Databus client library.]

Features

• Transport independent of data source: Oracle, MySQL, …
• Portable change event serialization and versioning
• Start consumption from arbitrary point

Guarantees

• Transactional semantics
• Timeline consistency with the data source
• Durability (by data source)
• At-least-once delivery
• Availability
• Low latency
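Timeline consistency means consumers see changes in the source database's commit order, tracked by a sequence number (SCN); consumption from an arbitrary point is served from the relay's event window, falling back to the bootstrap snapshot when that point has aged out. A sketch of the consumer side, using a hypothetical relay interface rather than the real Databus client API:

import java.util.List;

// Hypothetical relay interface: returns changes with SCN > sinceScn in
// commit order, or an empty batch when there is nothing new.
interface DatabusLikeRelay {
    List<Change> pullSince(long sinceScn, int max);
}

record Change(long scn, String table, byte[] payload) {}

class ChangeConsumer {
    private final DatabusLikeRelay relay;
    private long lastScn;   // checkpoint: everything up to lastScn is applied

    ChangeConsumer(DatabusLikeRelay relay, long startScn) {
        this.relay = relay;
        this.lastScn = startScn;
    }

    void poll() {
        List<Change> batch = relay.pullSince(lastScn, 100);
        for (Change c : batch) {
            apply(c);          // e.g. update a search index or a cache
            lastScn = c.scn;   // advance the checkpoint in commit order
        }
        // If lastScn has aged out of the relay's event window, a real client
        // switches to the bootstrap service for a consistent snapshot at some
        // point U, then resumes from the relay at U.
    }

    void apply(Change c) { /* consumer logic */ }
}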

Page 20: Data Infrastructure at LinkedIn - Stanford University

Voldemort: Highly-Available Distributed Data Store

LinkedIn Data Infrastructure Solutions


Page 21: Data Infrastructure at LinkedIn - Stanford University

Voldemort: Architecture

Highlights

• Open source
• Pluggable components
• Tunable consistency / availability
• Key/value model, server side "views"

In production

• Data products
• Network updates, sharing, page view tracking, rate-limiting, more…
• Future: SSDs, multi-tenancy
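"Tunable consistency / availability" is the Dynamo-style quorum knob: with N replicas per key, a write waits for W acknowledgements and a read consults R replicas; choosing R + W > N makes every read quorum overlap the latest write quorum. A small illustration of the arithmetic (not Voldemort's actual API, which configures these values per store):

// Dynamo-style quorum arithmetic behind "tunable consistency /
// availability". Illustrative only.
final class QuorumConfig {
    final int n, r, w;   // replicas, required reads, required write acks

    QuorumConfig(int n, int r, int w) {
        if (r > n || w > n) throw new IllegalArgumentException("need r <= n and w <= n");
        this.n = n; this.r = r; this.w = w;
    }

    // With r + w > n, any r replicas read must include at least one
    // replica that saw the latest committed write.
    boolean readsSeeLatestWrite() { return r + w > n; }

    int writeFaultTolerance() { return n - w; }   // writes survive this many down replicas
    int readFaultTolerance()  { return n - r; }   // reads survive this many down replicas

    public static void main(String[] args) {
        QuorumConfig strict  = new QuorumConfig(3, 2, 2);   // 2 + 2 > 3: consistent reads
        QuorumConfig relaxed = new QuorumConfig(3, 1, 1);   // more available, may read stale data
        System.out.println(strict.readsSeeLatestWrite());    // true
        System.out.println(relaxed.readsSeeLatestWrite());   // false
    }
}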

Page 22: Data Infrastructure at LinkedIn - Stanford University

Espresso: Indexed Timeline-Consistent Distributed Data Store

LinkedIn Data Infrastructure Solutions

Page 23: Data Infrastructure at LinkedIn - Stanford University

Espresso: Key Design Points

Hierarchical data model

– InMail, Forums, Groups, Companies

Native Change Data Capture Stream

– Timeline consistency

– Read after Write

Rich functionality within a hierarchy

– Local Secondary Indexes

– Transactions

– Full-text search

Modular and Pluggable

– Off-the-shelf: MySQL, Lucene, Avro


Page 24: Data Infrastructure at LinkedIn - Stanford University

Application View


Page 25: Data Infrastructure at LinkedIn - Stanford University

Partitioning


Page 26: Data Infrastructure at LinkedIn - Stanford University

Partition Layout: Master, Slave

[Diagram: the cluster manager maintains the routing table (e.g. partition P.1 -> node 1, partition P.12 -> node 3) and per-node state (e.g. node 1: M: P.1 active, S: P.5 active); partitions P.1–P.12 are spread across cluster nodes 1–3, each partition with one master (M) copy and one slave (S) copy on different nodes.]

3 Storage Engine nodes, 2-way replication
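One simple placement that yields a layout like the one pictured: give partition p its master on one node and its slave on the next node in the ring, so the two copies never share a node. Purely illustrative; Espresso's actual assignment is controlled by the cluster manager:

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative layout: 12 partitions, 3 nodes, 2-way replication.
public class PartitionLayout {
    public static void main(String[] args) {
        int partitions = 12, nodes = 3;
        Map<Integer, List<String>> byNode = new TreeMap<>();
        for (int p = 1; p <= partitions; p++) {
            int master = (p - 1) % nodes + 1;
            int slave  = master % nodes + 1;   // next node in the ring
            byNode.computeIfAbsent(master, k -> new ArrayList<>()).add("M:P." + p);
            byNode.computeIfAbsent(slave,  k -> new ArrayList<>()).add("S:P." + p);
        }
        // Each node ends up with 4 masters and 4 slaves, so load stays
        // balanced and no single node failure loses any partition outright.
        byNode.forEach((node, parts) -> System.out.println("Node " + node + " -> " + parts));
    }
}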

Page 27: Data Infrastructure at LinkedIn - Stanford University

Espresso: API

REST over HTTP

Get Messages for bob

– GET /MailboxDB/MessageMeta/bob

Get MsgId 3 for bob

– GET /MailboxDB/MessageMeta/bob/3

Get first page of Messages for bob that are unread and in the inbox

– GET /MailboxDB/MessageMeta/bob/?query="+isUnread:true +isInbox:true"&start=0&count=15

Page 28: Data Infrastructure at LinkedIn - Stanford University

Espresso: API Transactions

Add a message to bob's mailbox: transactionally update the mailbox aggregates and insert into metadata and details.

POST /MailboxDB/*/bob HTTP/1.1
Content-Type: multipart/binary; boundary=1299799120
Accept: application/json

--1299799120
Content-Type: application/json
Content-Location: /MailboxDB/MessageStats/bob
Content-Length: 50

{"total":"+1", "unread":"+1"}

--1299799120
Content-Type: application/json
Content-Location: /MailboxDB/MessageMeta/bob
Content-Length: 332

{"from":"…","subject":"…",…}

--1299799120
Content-Type: application/json
Content-Location: /MailboxDB/MessageDetails/bob
Content-Length: 542

{"body":"…"}

--1299799120--

Page 29: Data Infrastructure at LinkedIn - Stanford University

Espresso: System Components


Page 30: Data Infrastructure at LinkedIn - Stanford University

Espresso @ LinkedIn

First applications

– Company Profiles

– InMail

Next

– Unified Social Content Platform

– Member Profiles

– Many more…


Page 31: Data Infrastructure at LinkedIn - Stanford University

Espresso: Next steps

Launched first application Oct 2011

Open source 2012

Multi-Datacenter support

Log-structured storage

Time-partitioned data


Page 32: Data Infrastructure at LinkedIn - Stanford University

Outline

LinkedIn Products

Data Ecosystem

LinkedIn Data Infrastructure Solutions

Next Play


Page 33: Data Infrastructure at LinkedIn - Stanford University

The Specialization Paradox in Distributed Systems

Good: Build specialized systems so you can do each thing really well.

Bad: Rebuild distributed routing, failover, cluster management, monitoring, and tooling for each one.

Page 34: Data Infrastructure at LinkedIn - Stanford University

Generic Cluster Manager: Helix

• Generic Distributed State Model

• Centralized Config Management

• Automatic Load Balancing

• Fault tolerance

• Health monitoring

• Cluster expansion and rebalancing
• Open Source 2012
• Used by Espresso, Databus and Search
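A generic distributed state model can be pictured as a per-replica state machine whose legal transitions the cluster manager drives (e.g. OFFLINE -> SLAVE -> MASTER for Espresso's partitions). A minimal plain-Java sketch of the idea; Helix's real API is different:

import java.util.Map;
import java.util.Set;

enum ReplicaState { OFFLINE, SLAVE, MASTER }

// Sketch of a master/slave state model: the cluster manager computes a
// target state for each partition replica and invokes legal transitions.
class MasterSlaveStateModel {
    private static final Map<ReplicaState, Set<ReplicaState>> LEGAL = Map.of(
        ReplicaState.OFFLINE, Set.of(ReplicaState.SLAVE),
        ReplicaState.SLAVE,   Set.of(ReplicaState.MASTER, ReplicaState.OFFLINE),
        ReplicaState.MASTER,  Set.of(ReplicaState.SLAVE));

    private ReplicaState state = ReplicaState.OFFLINE;

    void transitionTo(ReplicaState target) {
        if (!LEGAL.get(state).contains(target))
            throw new IllegalStateException(state + " -> " + target);
        // Hooks where a storage node would do the real work:
        if (state == ReplicaState.OFFLINE && target == ReplicaState.SLAVE)
            startReplication();        // begin consuming the change stream
        if (state == ReplicaState.SLAVE && target == ReplicaState.MASTER)
            catchUpAndServeWrites();   // drain the backlog, then accept writes
        state = target;
    }

    void startReplication()      { /* ... */ }
    void catchUpAndServeWrites() { /* ... */ }
}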

Page 35: Data Infrastructure at LinkedIn - Stanford University

Stay tuned for

Innovation

– Nearline processing

– Espresso eco-system

– Storage / indexing

– Analytics engine

– Search

Convergence

– Building blocks for distributed data management systems

Page 36: Data Infrastructure at LinkedIn - Stanford University

Thanks!


Page 37: Data Infrastructure at LinkedIn - Stanford University

Appendix


Page 38: Data Infrastructure at LinkedIn - Stanford University

Espresso: Routing

Router is a high-performance HTTP proxy

Examines URL, extracts partition key

Per-db routing strategy

– Hash Based

– Route To Any (for schema access)

– Range (future)

Routing function maps partition key to partition

Cluster Manager maintains mapping of partition to hosts:

– Single Master

– Multiple Slaves

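Putting the hash-based strategy together: the router pulls the partition key out of the URL, hashes it to a partition, and forwards to that partition's current master from the cluster manager's table. A sketch with hypothetical names:

import java.util.Map;

// Sketch of hash-based routing for URLs of the form
// /{database}/{table}/{partitionKey}[/...]. Names are illustrative.
class EspressoLikeRouter {
    private final int numPartitions;
    private final Map<Integer, String> partitionToMaster;   // kept current by the cluster manager

    EspressoLikeRouter(int numPartitions, Map<Integer, String> partitionToMaster) {
        this.numPartitions = numPartitions;
        this.partitionToMaster = partitionToMaster;
    }

    String routeHashBased(String path) {
        String partitionKey = path.split("/")[3];   // e.g. "bob"
        int partition = Math.floorMod(partitionKey.hashCode(), numPartitions);
        return partitionToMaster.get(partition);    // host serving the master replica
    }

    public static void main(String[] args) {
        int bobsPartition = Math.floorMod("bob".hashCode(), 12);
        EspressoLikeRouter router =
            new EspressoLikeRouter(12, Map.of(bobsPartition, "node-1:12918"));
        System.out.println(router.routeHashBased("/MailboxDB/MessageMeta/bob"));   // node-1:12918
    }
}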

Page 39: Data Infrastructure at LinkedIn - Stanford University

Espresso: Storage Node

Data Store (MySQL)

– Stores document as Avro serialized blob
– Blob indexed by (partition key {, sub-key})
– Row also contains limited metadata: etag, last-modified time, Avro schema version

Document schema specifies per-field index constraints

Lucene index per partition key / resource
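To make "document as Avro serialized blob" concrete, the sketch below serializes a document with Avro's generic API. The MessageMeta schema here is hypothetical; storing the schema version alongside the blob (as noted above) is what lets old rows be deserialized after the schema evolves:

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;

public class AvroBlobSketch {
    // Hypothetical document schema; Espresso's real schemas are richer.
    static final String SCHEMA_JSON = """
        {"type":"record","name":"MessageMeta","fields":[
          {"name":"from","type":"string"},
          {"name":"subject","type":"string"},
          {"name":"isUnread","type":"boolean"}]}""";

    public static void main(String[] args) throws IOException {
        Schema schema = new Schema.Parser().parse(SCHEMA_JSON);
        GenericRecord doc = new GenericData.Record(schema);
        doc.put("from", "alice");
        doc.put("subject", "hello");
        doc.put("isUnread", true);

        // Serialize to the compact binary form stored as the row's blob,
        // keyed by (partition key {, sub-key}).
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(schema).write(doc, encoder);
        encoder.flush();
        byte[] blob = out.toByteArray();
        System.out.println(blob.length + " bytes");
    }
}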

Page 40: Data Infrastructure at LinkedIn - Stanford University

Espresso: Replication

MySQL replication of mastered partitions

MySQL "Slave" is a MySQL instance with a custom storage engine

– the custom storage engine just publishes to Databus

Per-database commit sequence number

Replication is Databus

– Supports existing downstream consumers

Storage node consumes from Databus to update secondary indexes and slave partitions
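The per-database commit sequence number is what makes replay harmless: a replica applies an event only if its sequence number is newer than the last one applied, so Databus's at-least-once delivery still converges to a consistent slave. A sketch with hypothetical names:

// Idempotent apply on a slave partition: duplicate deliveries from the
// change stream are detected by commit sequence number and skipped.
class SlavePartition {
    private long appliedScn = 0;   // highest commit sequence number applied so far

    // Returns true if the event changed state, false if it was a replay.
    boolean apply(long scn, byte[] change) {
        if (scn <= appliedScn) return false;   // duplicate or stale event: skip
        writeToLocalStore(change);             // update the row and its secondary indexes
        appliedScn = scn;                      // persisted together with the data
        return true;
    }

    private void writeToLocalStore(byte[] change) { /* MySQL row + Lucene index update */ }
}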