[rakutentechconf2013] [d-3_1] leofs - open the new door

Post on 07-Nov-2014


DESCRIPTION

Rakuten Technology Conference 2013 "LeoFS - Open the New Door" Yosuke Hara (Rakuten)

TRANSCRIPT

Open the New Door

Yosuke Hara Oct 26, 2013 (rev 2.2)

The Lion of Storage Systems

1

Started as an OSS project on July 4, 2012

www.leofs.org

LeoFS is "Unstructured Big Data Storage for the Web" and a highly available, distributed, eventually consistent storage system.

Organizations can use LeoFS to store lots of data efficiently, safely and inexpensively.

2

Motivation

3

Motivation

As of 2010

Expensive Storage Problems:

1. High Costs (Initial Costs, Running Costs)
2. Possibility of "SPOF"
3. Cannot Scale Easily

Storage expansion is difficult during periods of increasing data

Get Away From Using "Expensive H/W Based Storage"

4

REST-API / AWS S3-API

The Lion of Storage Systems

HIGH Availability

HIGH Cost Performance Ratio

HIGH Scalability

LeoFS Non Stop

Velocity: Low Latency

Minimum Resources

Volume: Petabyte / Exabyte

Variety: Photo, Movie, Unstructured-data

3 Vs in 3 HIGHs

5

Overview

6

LeoFS Overview

[Architecture diagram: LeoFS-Gateway, LeoFS-Storage, LeoFS-Manager]

Request from Web Applications / Browsers w/ REST-API / S3-API

Load Balancer

REST over HTTP (80/443)

RPC (4369)

LeoFS-Gateway

LeoFS-Storage: three nodes, each with Storage Engine/Router, metadata, Object Store

LeoFS-Manager: Monitor, GUI Console

Port labels: (4000, 4010, 4020), (10020, 10021)

No Master, No SPOF

7

Gateway

8

LeoFS Overview - Gateway

Stateless Proxy + Object Cache

REST-API / S3-API

Use Consistent Hashing to decide the primary node

[ Memory Cache, Disk Cache ]

Clients

Gateway(s)

Storage Cluster

Handle HTTP Requests and Responses

Built-in "Object Cache Mechanism"

Choosing Replica Target Node(s)

RING: 2^128 (MD5)

# of replicas = 3

KEY = “bucket/leofs.key”

Hash = md5(Filename)

Primary Node

Secondary-1

Secondary-2
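The replica choice on this slide can be sketched in a few lines of Python. This is an illustration of consistent hashing on a 2^128 MD5 ring, not LeoFS's actual code: the node names, the number of virtual nodes per node, and the helper names `ring_position`, `build_ring` and `choose_replicas` are all assumptions made for the example.

```python
import hashlib
from bisect import bisect_right

RING_SIZE = 2 ** 128  # MD5 produces a 128-bit value, so positions fall in [0, 2^128)


def ring_position(name: str) -> int:
    """Map a string to a point on the ring using MD5."""
    return int.from_bytes(hashlib.md5(name.encode("utf-8")).digest(), "big")


def build_ring(nodes, vnodes_per_node=64):
    """Place several virtual nodes per physical node on the ring.

    The virtual-node count is an assumption for this sketch.
    """
    return sorted(
        (ring_position(f"{node}#{i}"), node)
        for node in nodes
        for i in range(vnodes_per_node)
    )


def choose_replicas(ring, key, n_replicas=3):
    """Walk clockwise from the key's position and collect distinct nodes."""
    positions = [pos for pos, _ in ring]
    start = bisect_right(positions, ring_position(key))
    chosen = []
    for step in range(len(ring)):
        node = ring[(start + step) % len(ring)][1]
        if node not in chosen:
            chosen.append(node)
        if len(chosen) == n_replicas:
            break
    return chosen


# Hypothetical five-node cluster; the first node found acts as the primary.
ring = build_ring(["storage_0", "storage_1", "storage_2", "storage_3", "storage_4"])
primary, secondary_1, secondary_2 = choose_replicas(ring, "bucket/leofs.key")
```

Because the position depends only on the key and the ring, any gateway that holds the same ring arrives at the same primary and secondaries without consulting a master.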

9

Storage

10

LeoFS Overview - Storage

Gateway

Storage (Storage Cluster)

Use "Consistent Hashing" for Replication in the Storage Cluster

"P2P"

Automatically replicate an object and its metadata to remote node(s)

Choosing Replica Target Node(s)

RING: 2^128 (MD5)

# of replicas = 3

KEY = “bucket/leofs.key”

Hash = md5(Filename)

Primary Node

Secondary-1

Secondary-2

11

LeoFS Overview - Storage

Request From Gateway

Gateway

LeoFS Storage

Storage Engine Workers

...

Storage Engine: Metadata + Object Container

Metadata: Keeps an in-memory index of all data

Object Container: Manages "Log Structured File"

Replicator

Repairer w/Queue

The Storage Engine consists of "Object Storage" and "Metadata Storage"

Built-in "Replicator" and "Recoverer" w/Queue for Eventual Consistency

12

LeoFS Storage Engine - Retrieve an object from the storage

< META DATA > ID, Filename, Offset, Size, Checksum

Header

File

Footer

< META DATA > Id, Filename, Offset, Size, Checksum (MD5), Version#

Storage Engine Worker

Object Container

Metadata Storage

Storage Engine Worker

13

LeoFS Storage Engine - Insert an object into the storage

< META DATA > ID, Filename, Offset, Size, Checksum

Header

File

Footer

< META DATA > Id, Filename, Offset, Size, Checksum (MD5), Version#

Object Container

Metadata Storage

Storage Engine Worker

Insert metadata

Append an object into the object container

Storage Engine Worker

14

LeoFS Storage Engine - Remove unnecessary objects from the storage

Compact

Old Object Container/Metadata

Storage Engine Worker

New Object Container/Metadata

Storage Engine Worker
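The three storage engine slides (retrieve, append, compact) describe a log-structured container with a separate metadata index. A minimal Python sketch of that idea follows. It is not LeoFS's implementation: the class name `ObjectContainer`, its methods, and the in-memory `BytesIO` standing in for the container file are assumptions for illustration.

```python
import hashlib
import io


class ObjectContainer:
    """Append-only container plus an in-memory metadata index (hypothetical)."""

    def __init__(self):
        self.log = io.BytesIO()  # stands in for the log-structured file on disk
        self.index = {}          # key -> (offset, size, MD5 checksum)

    def put(self, key, data):
        """Append the object at the tail, then insert its metadata."""
        offset = self.log.seek(0, io.SEEK_END)
        self.log.write(data)
        self.index[key] = (offset, len(data), hashlib.md5(data).hexdigest())

    def get(self, key):
        """Look up offset and size in the metadata, then read the object."""
        offset, size, checksum = self.index[key]
        self.log.seek(offset)
        data = self.log.read(size)
        if hashlib.md5(data).hexdigest() != checksum:
            raise IOError(f"checksum mismatch for {key!r}")
        return data

    def delete(self, key):
        """Drop the metadata only; the bytes stay in the log until compaction."""
        del self.index[key]

    def compact(self):
        """Copy live objects into a new container, leaving stale bytes behind."""
        fresh = ObjectContainer()
        for key in self.index:
            fresh.put(key, self.get(key))
        return fresh
```

Overwrites and deletes never modify existing bytes, so writes stay sequential; the cost is that unreferenced data accumulates until `compact()` rewrites the live objects into a new container.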

15

LeoFS Overview - Storage - Data Structure/Relationship of an object

<Metadata>

Fields: {VNodeId, Key}, KeySize, Custom Meta Size, File Size, Offset, Version, Time-stamp, Checksum

Annotations: "for Retrieve a File (Object)", "for Sync"

<Needle>

Header (Metadata - Fixed length): Checksum, KeySize, DataSize, User-Meta Size, Offset, Version, Time-stamp

Body (Variable Length): {VNodeId, Key}, Actual File, User-Meta

Footer (8B)

<Object Container>

Super-block, Needle-1, Needle-2, Needle-3, Needle-4, Needle-5
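To make the needle layout concrete, here is a hedged Python sketch that packs and unpacks a record with a fixed-length header, a variable-length body and an 8-byte footer. The slides give only the field names and the footer size; every field width, the byte order, the order of the body parts and the footer contents below are assumptions, as are the function names `pack_needle` and `unpack_needle`.

```python
import hashlib
import struct
import time

# Assumed widths (not from the slides), big-endian, no padding:
# checksum(16s) key_size(H) data_size(I) user_meta_size(H)
# offset(Q) version(I) timestamp(Q)
HEADER = struct.Struct(">16sHIHQIQ")
FOOTER = b"\x00" * 8  # only the 8-byte size comes from the slides


def pack_needle(key, data, user_meta=b"", offset=0, version=1, timestamp=None):
    """Serialize one needle: fixed header, then key, file data, user metadata, footer."""
    ts = int(time.time()) if timestamp is None else timestamp
    header = HEADER.pack(
        hashlib.md5(data).digest(), len(key), len(data), len(user_meta),
        offset, version, ts,
    )
    return header + key + data + user_meta + FOOTER


def unpack_needle(blob):
    """Parse a needle and verify the MD5 checksum of the file data."""
    checksum, key_size, data_size, meta_size, offset, version, ts = (
        HEADER.unpack_from(blob)
    )
    pos = HEADER.size
    key = blob[pos:pos + key_size]
    pos += key_size
    data = blob[pos:pos + data_size]
    pos += data_size
    user_meta = blob[pos:pos + meta_size]
    if hashlib.md5(data).digest() != checksum:
        raise ValueError("needle checksum mismatch")
    return {
        "key": key, "data": data, "user_meta": user_meta,
        "offset": offset, "version": version, "timestamp": ts,
    }
```

Keeping the header fixed-length means a reader can learn the sizes of all variable parts with one small read before fetching the body.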

16

LeoFS Overview - Storage - Large Object Support

To equalize disk usage across every Storage Node

To realize high I/O efficiency and High Availability

[ WRITE Operation ]

Client(s)

Gateway

Storage Cluster

An Original Object’s Metadata: Original Object Name, Original Object Size, # of Chunks

Chunked Objects: chunk-0, chunk-1, chunk-2, chunk-3

Every chunked object and its metadata are replicated in the cluster
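A short Python sketch of the write path for a large object: split it into fixed-size chunks and record the original name, size and chunk count in a parent metadata entry. The chunk key format, the default chunk size and the function names are assumptions for illustration; the slides do not specify them.

```python
def chunk_object(name: str, data: bytes, chunk_size: int = 5 * 1024 * 1024):
    """Split an object into fixed-size chunks and build the parent metadata.

    The "<name>/chunk-<i>" key format and the 5 MiB default are hypothetical.
    """
    chunks = {
        f"{name}/chunk-{index}": data[start:start + chunk_size]
        for index, start in enumerate(range(0, len(data), chunk_size))
    }
    parent_metadata = {
        "original_object_name": name,
        "original_object_size": len(data),
        "number_of_chunks": len(chunks),
    }
    return parent_metadata, chunks


def reassemble(parent_metadata, chunks):
    """Rebuild the original object by reading the chunks in index order."""
    name = parent_metadata["original_object_name"]
    return b"".join(
        chunks[f"{name}/chunk-{index}"]
        for index in range(parent_metadata["number_of_chunks"])
    )
```

Since each chunk has its own key, each one hashes to its own position on the ring, which is how chunking spreads a single large object across many storage nodes.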

17

Manager

18

LeoFS Overview - Manager

Manager(s)

Gateway(s)

Storage Cluster

Monitor: RING, Node State

Operate: status, suspend, resume, detach, whereis, ...

Operate LeoFS - Gateway and Storage Cluster

"RING Monitor" and "NodeState Monitor"

19

New Features

20

"Insight"

21

New Features - LeoInsight (v1.0)

Give Insight into the State of LeoFS

1. To control requests from Clients to LeoFS
2. To check and see "Traffic info" and "State of Every Node"

for Keeping Availability

22

New Features - LeoInsight (v1.0)

[Diagram labels]

Gateway

Storage Cluster

Manager

Distributed Queue (ElkDB)

Traffic-Info from Gateway

Probes of a Node from Gateway/Storage/Manager

Consume MSG

TimeSeriesDB (Savannah)

Persistent calculated statistics-data

REST-API (JSON)

Retrieve

Notifier

Notify

Operate LeoFS

23

More Scalability & More Availability

24

New Features - Multi Data Center Data Replication (v1.0)

Tokyo, Singapore, Europe, US

HIGH-Scalability

HIGH-Availability

+ Easy Operation for Admins

NO SPOF

NO Performance Degradation

25

v1.0 - Multi Data Center Data Replication

"Leo Storage Platform"

[ 3 Regions & 5 Replicas ]

DC-1 [replicas:3], DC-2 [replicas:1], DC-3 [replicas:1]

Each DC: Storage cluster, Manager cluster

Client / Application(s): Request to the Target Region

Monitor and Replicate each “RING” and “System Configuration”

Method of MDC-Replication:
- Async: Bulked Transfer
- Sync+Tran: Consensus Algorithm

DC-1 Configuration:
- Method of Replication:
- Consistency Level:
  - local-quorum: [N=3, W=2, R=1, D=2]
  - # of target DC(s): 2
  - # of replicas a DC: 1
  >> Total of Replicas: 5
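The arithmetic behind "Total of Replicas: 5" and the local quorum can be written out as a small Python sketch. The class and method names are hypothetical, and reading W, R and D as the number of acknowledgements needed for a write, read and delete is an interpretation of the slide's [N=3, W=2, R=1, D=2] notation rather than something the slide states.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DcReplicationConfig:
    """Per-DC replication settings from the slide (field names are hypothetical)."""
    n: int                       # replicas kept in the local DC
    w: int                       # acknowledgements assumed necessary for a write
    r: int                       # acknowledgements assumed necessary for a read
    d: int                       # acknowledgements assumed necessary for a delete
    target_dcs: int              # number of remote DCs receiving replicas
    replicas_per_target_dc: int  # replicas written in each remote DC

    def total_replicas(self) -> int:
        """Local replicas plus the replicas placed in every target DC."""
        return self.n + self.target_dcs * self.replicas_per_target_dc

    def write_succeeds(self, local_acks: int) -> bool:
        """A write meets the local quorum once W local nodes acknowledge it."""
        return local_acks >= self.w

    def read_succeeds(self, local_acks: int) -> bool:
        """A read meets the local quorum once R local nodes answer."""
        return local_acks >= self.r


dc1 = DcReplicationConfig(n=3, w=2, r=1, d=2, target_dcs=2, replicas_per_target_dc=1)
total = dc1.total_replicas()  # 3 + 2 * 1
```

With these values a write can be acknowledged after two of the three local nodes respond, without waiting on the remote DCs.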

26

1) 3 replicas are written in "Local Region"

v1.0 - Multi Data Center Data Replication

[Diagram and DC-1 configuration unchanged from the previous slide: 3 Regions & 5 Replicas, DC-1 with 3 replicas, DC-2 and DC-3 with 1 replica each]

27

2) Sync (or Async) Replication to Other Region(s)

v1.0 - Multi Data Center Data Replication

"Leo Storage Platform"

[ 3 Regions & 5 Replicas ]

DC-1 [replicas:3], DC-2 [replicas:1], DC-3 [replicas:1]

Each DC: Storage cluster, Manager cluster

Client / Application(s): Request to the Target Region

Monitor and Replicate each “RING” and “System Configuration”

Leader: DC1.node_0 - Primary

Local-follower: DC1.node_1, DC1.node_2

Remote-follower: DC2.node_3, DC3.node_4

28

3) Replication for Geographical Optimization

v1.0 - Multi Data Center Data Replication

"Leo Storage Platform"

[ 3 Regions & 5 Replicas ]

DC-1: Tokyo, DC-2: Singapore, DC-3: US, DC-4: Europe

[replicas:3] [replicas:1] [replicas:1]

Each DC: Storage cluster, Manager cluster

Client / Application(s): Request to the Target Region

Monitor and Replicate each “RING” and “System Configuration”

Local Region    Remote-1     Remote-2
Tokyo           Singapore    US
Singapore       Tokyo        Europe
Europe          US           Singapore
US              Europe       Tokyo
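The region table can be read as a lookup from the local region to its two remote regions. The Python sketch below encodes the table and derives a replica placement using the "3 local + 1 per remote" counts from the earlier slides; the names `REMOTE_REGIONS` and `replica_placement` are hypothetical.

```python
# Local region -> (Remote-1, Remote-2), copied from the table on the slide.
REMOTE_REGIONS = {
    "Tokyo": ("Singapore", "US"),
    "Singapore": ("Tokyo", "Europe"),
    "Europe": ("US", "Singapore"),
    "US": ("Europe", "Tokyo"),
}


def replica_placement(local_region, local_replicas=3, replicas_per_remote=1):
    """Return how many replicas each region holds for a write to local_region."""
    placement = {local_region: local_replicas}
    for remote in REMOTE_REGIONS[local_region]:
        placement[remote] = replicas_per_remote
    return placement


tokyo_placement = replica_placement("Tokyo")
```

Each write touches three of the four regions, so every object still has five replicas while the fourth region carries no copy of it.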

29

"Center"

30

Web-based administrative console for inspecting and manipulating LeoFS Storage Clusters and LeoFS Gateway

Operate LeoFS

New Features - LeoCenter

Admin Tools

Access Log Analysis

31

Access Log Analysis (β)

32

REST-API / AWS S3-API

The Lion of Storage Systems

HIGH Availability

HIGH Cost Performance Ratio

HIGH Scalability

LeoFS Non Stop

Velocity: Low Latency

Minimum Resources

Volume: Petabyte / Exabyte

Variety: Photo, Movie, Unstructured-data

3 Vs in 3 HIGHs

33

Set Sail for “Cloud Storage”

Website: www.leofs.org

Twitter: @LeoFastStorage

Facebook: www.facebook.com/org.leofs

34
