![Page 1: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/1.jpg)
Reliable, Scaling and High Performance Storage System
Yosuke Hara - @yosukehara
A Researcher of R.I.T. and Tech Lead of LeoFS, with Masahiro Sanjo, Coordinator of R.I.T.
![Page 2: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/2.jpg)
LeoFS is "Unstructured Big Data Storage for the Web": a highly available, distributed, eventually consistent storage system.
Organizations can use LeoFS to store large amounts of data efficiently, safely and inexpensively.
LeoFS was released as open-source software (OSS) in July 2012.
leo-project.net/leofs
![Page 3: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/3.jpg)
Overview
Brief Benchmark Report
Multi Data Center Replication
LeoFS Administration at Rakuten
Future Plans
LeoFS QoS
NFS Support
![Page 4: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/4.jpg)
Overview
![Page 5: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/5.jpg)
The Lion of Storage Systems
HIGH Availability
HIGH Cost Performance Ratio
HIGH Scalability
LeoFS Non Stop
Velocity: Low Latency
Minimum Resources
Volume: Petabyte / Exabyte
Variety: Photo, Movie, Unstructured-data
3 Vs in 3 HIGHs
![Page 6: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/6.jpg)
LeoFS Overview
Keeping High Availability
Keeping High Performance
Easy Administration
[Architecture diagram: requests from web applications / browsers arrive over HTTP (REST-API / S3-API) through a Load Balancer. Components: Gateway, Storage (each node with a Storage Engine/Router, Metadata and Object Storage), Manager (Monitor), GUI Console. Link labels: Erlang RPC; TCP/IP, SNMP.]
![Page 7: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/7.jpg)
LeoFS Gateway
![Page 8: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/8.jpg)
LeoFS Overview - Gateway
Stateless Proxy + Object Cache
REST-API / S3-API
Uses Consistent Hashing to decide the primary node
Object cache: [ Memory Cache, Disk Cache ]
HTTP Request and Response
Built-in Object Cache Mechanism
Gateway internals: Fast HTTP Server (Cowboy), API Handler, Object Cache Mechanism
[Diagram: Clients, Gateway(s), Storage Cluster]
![Page 9: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/9.jpg)
LeoFS Storage
![Page 10: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/10.jpg)
LeoFS Overview - Storage
Uses "Consistent Hashing" for data operations in the Storage Cluster
Choosing Replica Target Node(s)
RING: 2 ^ 128 (MD5)
# of replicas = 3
KEY = “bucket/leofs.key”
Hash = md5(Filename)
Replica targets: Primary Node, Secondary-1, Secondary-2
"P2P"
WRITE: Auto Replication
READ: Asynchronous auto repair of an inconsistent object
[Diagram: Gateway, Storage (Storage Cluster)]
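The replica selection on this slide can be illustrated with a small sketch. It assumes a ring of 2^128 positions addressed by MD5 and picks the next three distinct nodes clockwise from the key's hash. The `Ring` class, the virtual-node count and the node names are hypothetical and are not taken from the LeoFS code base.

```python
import bisect
import hashlib


def md5_int(text: str) -> int:
    """Map a string to a position on the 2^128 ring using MD5."""
    return int.from_bytes(hashlib.md5(text.encode("utf-8")).digest(), "big")


class Ring:
    """Hypothetical consistent-hashing ring with virtual nodes."""

    def __init__(self, nodes, vnodes_per_node=64):
        # Each node owns several ring positions (an illustrative assumption).
        self._points = sorted(
            (md5_int(f"{node}#{index}"), node)
            for node in nodes
            for index in range(vnodes_per_node)
        )
        self._hashes = [position for position, _ in self._points]

    def replicas(self, key: str, n: int = 3):
        """Walk clockwise from the key's hash and collect n distinct nodes."""
        start = bisect.bisect_right(self._hashes, md5_int(key))
        chosen = []
        for offset in range(len(self._points)):
            _, node = self._points[(start + offset) % len(self._points)]
            if node not in chosen:
                chosen.append(node)
                if len(chosen) == n:
                    break
        return chosen


ring = Ring([f"storage-{i}" for i in range(1, 6)])
primary, secondary_1, secondary_2 = ring.replicas("bucket/leofs.key")
```

Because the lookup depends only on the key and the ring membership, any gateway can compute the same primary and secondary nodes without a central directory.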
![Page 11: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/11.jpg)
LeoFS Overview - Storage
Storage consists of Object Storage and Metadata Storage
Includes a Replicator and a Recoverer for eventual consistency
[Diagram: a request from the Gateway reaches LeoFS Storage, which contains the Replicator, the Recoverer, and the Storage Engine (Metadata Storage + Object Storage)]
![Page 12: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/12.jpg)
LeoFS Overview - Storage - Data Structure
Robust and High Performance
Necessary for GC
<Metadata> (Metadata Storage)
Fields: KeySize, Custom Meta Size, File Size, Offset, Version, Time-stamp, Key, Checksum
Annotations on the slide: "for Sync", "for retrieving an object"
<Needle> (Object Storage)
Header (Metadata - Fixed length): Checksum, KeySize, DataSize, User-Meta Size, Offset, Version, Time-stamp
Body (Variable Length): Key, User-Meta, Actual File
Footer (8B)
<Object Container>
Super-block, Needle-1, Needle-2, Needle-3, Needle-4, Needle-5
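To make the needle layout concrete, here is a minimal sketch of a fixed-length header, a variable-length body and an 8-byte footer. The slide does not give field widths, byte order or footer contents, so the `HEADER` format string, the zero-filled footer and the function names are all assumptions chosen for illustration.

```python
import hashlib
import struct

# Hypothetical fixed-length header (widths are assumptions, not LeoFS's):
# checksum (16-byte MD5 of the data), key size, data size, user-meta size,
# offset, version, timestamp.
HEADER = struct.Struct(">16sHIHQIQ")
FOOTER = b"\x00" * 8  # the slide only states the footer length (8B)


def pack_needle(key, user_meta, data, offset, version, timestamp):
    """Serialize one needle: header + key + user-meta + data + footer."""
    header = HEADER.pack(
        hashlib.md5(data).digest(), len(key), len(data), len(user_meta),
        offset, version, timestamp,
    )
    return header + key + user_meta + data + FOOTER


def unpack_needle(container, pos):
    """Read the needle starting at pos; return it and the next position."""
    checksum, key_size, data_size, meta_size, offset, version, timestamp = (
        HEADER.unpack_from(container, pos)
    )
    key_start = pos + HEADER.size
    meta_start = key_start + key_size
    data_start = meta_start + meta_size
    data = container[data_start:data_start + data_size]
    if hashlib.md5(data).digest() != checksum:
        raise ValueError("checksum mismatch")
    needle = {
        "key": container[key_start:meta_start],
        "user_meta": container[meta_start:data_start],
        "data": data,
        "offset": offset,
        "version": version,
        "timestamp": timestamp,
    }
    return needle, data_start + data_size + len(FOOTER)


# Append two needles to an (initially empty) object container.
container = b""
for index, body in enumerate([b"hello", b"world!"]):
    key = f"bucket/key-{index}".encode()
    container += pack_needle(key, b"", body, len(container), 1, 1400000000 + index)
```

Keeping the sizes in a fixed-length header means a reader can skip from one needle to the next without parsing the body, which is the kind of sequential scan a garbage-collection pass would rely on.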
![Page 13: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/13.jpg)
LeoFS Overview - Storage - Large Object Support
To equalize disk usage across every storage node
To realize high I/O efficiency and high availability
[ WRITE Operation ]
Client(s), Gateway, Storage Cluster
A large object is split into chunked objects: chunk-0, chunk-1, chunk-2, chunk-3
An original object’s metadata holds: Original Object Name, Original Object Size, # of Chunks
Every chunked object is replicated in the storage cluster
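The write path for large objects can be sketched as follows: the object is cut into fixed-size chunks, and a parent record keeps the original name, the original size and the number of chunks. The chunk key scheme (`name/chunk-N`), the dictionary layout and the chunk size are hypothetical; the slide does not specify them.

```python
def split_object(name: str, data: bytes, chunk_size: int):
    """Split data into chunks and build the original object's metadata."""
    chunks = {
        f"{name}/chunk-{index}": data[start:start + chunk_size]
        for index, start in enumerate(range(0, len(data), chunk_size))
    }
    metadata = {"name": name, "size": len(data), "num_chunks": len(chunks)}
    return metadata, chunks


def join_object(metadata, chunks) -> bytes:
    """Reassemble the original object from its metadata and chunks."""
    return b"".join(
        chunks[f"{metadata['name']}/chunk-{index}"]
        for index in range(metadata["num_chunks"])
    )


# A 10-byte object with 3-byte chunks yields chunk-0 .. chunk-3.
metadata, chunks = split_object("bucket/movie.mp4", b"0123456789", chunk_size=3)
```

Since each chunk has its own key, each one hashes to its own place on the ring, which is how chunking can spread one large object across many storage nodes.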
![Page 14: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/14.jpg)
LeoFS Manager
![Page 15: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/15.jpg)
LeoFS Overview - Manager
Operates LeoFS: the Gateway(s) and the Storage Cluster
Provides a "RING Monitor" and a "NodeState Monitor"
Monitor: RING, Node State
Operate: status, suspend, resume, detach, whereis, ...
[Diagram: Manager(s), Gateway(s), Storage Cluster]
![Page 16: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/16.jpg)
Brief Benchmark Report
![Page 17: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/17.jpg)
Brief Benchmark Report
Summary of the benchmark results
LeoFS kept stable performance throughout the benchmark
The bottleneck is disk I/O
The cache mechanism helped reduce network traffic between Gateway and Storage
![Page 18: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/18.jpg)
Brief Benchmark Report
1st Case: Group of Value Ranges - Storage:5, Gateway:1, Manager:2 - R:W = 9:1
source: https://github.com/leo-project/notes/tree/master/leofs/benchmark/leofs/20140605/tests/1m_r9w1_240min
2nd Case: Group of Value Ranges - Storage:5, Gateway:1, Manager:2 - R:W = 8:2
source: https://github.com/leo-project/notes/tree/master/leofs/benchmark/leofs/20140605/tests/1m_r8w2_120min
![Page 19: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/19.jpg)
Brief Benchmark Report
Server Spec - Gateway:
CPU: Intel(R) Xeon(R) CPU X5650 @ 2.67GHz * 2 (12 cores / 24 threads)
Memory: 96GB
Disk: HDD - 240GB RAID0
Network: 10G-Ether
Server Spec - Storage x5:
CPU: Intel(R) Xeon(R) CPU X5650 @ 2.67GHz * 2 (12 cores / 24 threads)
Memory: 96GB
Disk: HDD - 240GB RAID0 (System)
Disk: HDD - 2TB RAID0 (Data)
Network: 10G-Ether
![Page 20: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/20.jpg)
Brief Benchmark Report - 1st Case (R:W=9:1)
Environment:
Network: 10Gbps
OS: CentOS release 6.5 (Final)
Erlang: OTP R16B03-1
LeoFS: v1.0.2
System Consistency Level: [ N:3, W:2, R:1, D:2 ]
Benchmark Configuration:
Duration: 4.0h
R:W: 9:1
# of Concurrent Processes: 64
# of Keys: 100,000
Value Size:

| Range from (byte) | Range to (byte) | Percentage |
|---|---|---|
| 1024 | 10240 | 24.00% |
| 10241 | 102400 | 30.00% |
| 102401 | 819200 | 30.00% |
| 819201 | 1572864 | 16.00% |
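The consistency level [ N:3, W:2, R:1, D:2 ] used in this benchmark can be read as a set of quorum thresholds. The sketch below assumes that N is the number of replicas and that W, R and D are the numbers of acknowledgements required for a write, a read and a delete. The slide does not spell these meanings out, and the `ConsistencyLevel` class is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsistencyLevel:
    n: int  # number of replicas (assumed meaning)
    w: int  # acknowledgements required for a write (assumed meaning)
    r: int  # acknowledgements required for a read (assumed meaning)
    d: int  # acknowledgements required for a delete (assumed meaning)

    def succeeded(self, operation: str, acks: int) -> bool:
        """Return True when enough replicas acknowledged the operation."""
        required = {"write": self.w, "read": self.r, "delete": self.d}[operation]
        return acks >= required

    def read_overlaps_latest_write(self) -> bool:
        """Quorum rule: a read is sure to see the latest write only if R + W > N."""
        return self.r + self.w > self.n


level = ConsistencyLevel(n=3, w=2, r=1, d=2)
```

With these values R + W equals N, so a read is not guaranteed to hit a replica that holds the latest write. That matches the deck's description of LeoFS as eventually consistent, with repair on read.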
![Page 21: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/21.jpg)
Brief Benchmark Report - 1st Case (R:W=9:1)
source: https://github.com/leo-project/notes/tree/master/leofs/benchmark/leofs/20140601/tests/1m_r9w1_240min
[Charts: OPS and Latency over time]
OPS: 1,500ops
Latency: 50ms (annotated twice on the chart)
No Errors
![Page 22: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/22.jpg)
Brief Benchmark Report - 1st Case / Network Traffic
[Chart: network traffic over time (0s to 14000s), y-axis 0 to 1,500,000. Series: rxbyt/s and txbyt/s for gateway and storage-1 to storage-5. Annotations: 10.0Gbps, 7.0Gbps, 6.0Gbps, 5.0Gbps; Storage, Gateway; 60%]
![Page 23: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/23.jpg)
Brief Benchmark Report - 1st Case / Memory and CPU
[Two charts over time (0s to 14000s) for gateway and storage-1 to storage-5: CPU Load 5min and Memory Usage. Y-axes: 0 to 2.0 (with an annotation at 1.0) and 0 to 100]
![Page 24: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/24.jpg)
Brief Benchmark Report - 2nd Case (R:W=8:2)
Environment:
Network: 10Gbps
OS: CentOS release 6.5 (Final)
Erlang: OTP R16B03-1
LeoFS: v1.0.2
System Consistency Level: [ N:3, W:2, R:1, D:2 ]
Benchmark Configuration:
Duration: 2.0h
R:W: 8:2
# of Concurrent Processes: 64
# of Keys: 100,000
Value Size:

| Range from (byte) | Range to (byte) | Percentage |
|---|---|---|
| 1024 | 10240 | 24.00% |
| 10241 | 102400 | 30.00% |
| 102401 | 819200 | 30.00% |
| 819201 | 1572864 | 16.00% |
![Page 25: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/25.jpg)
Brief Benchmark Report - 2nd Case (R:W=8:2)
[Charts: OPS and Latency over time]
OPS: 1,000ops
Latency: 60-70ms, 80-90ms
No Errors
![Page 26: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/26.jpg)
Compare 1st case with 2nd case
![Page 27: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/27.jpg)
Brief Benchmark Report
1st Case - Network Traffic
2nd Case - Network Traffic
[Two charts of rxbyt/s and txbyt/s over time (0s to 7000s) for gateway and storage-1 to storage-5; y-axes 0 to 1,500,000 and 0 to 3,000,000. Annotations: 6.0Gbps and 7.0Gbps on each chart; "minus 0.7Gbps"]
![Page 28: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/28.jpg)
Brief Benchmark Report
1st Case - Disk util%
2nd Case - Disk util%
[Two charts of disk util% over time (0s to 7000s) for storage-1 to storage-5; y-axis 0 to 600.0, with annotations at 100 and 200 on each chart. Annotation: "1.8x high"]
![Page 29: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/29.jpg)
Brief Benchmark Report
1st Case - CPU Load 5min
2nd Case - CPU Load 5min
[Two charts of CPU load (5min) over time (0s to 7000s) for gateway and storage-1 to storage-5; y-axis 0 to 3.0, with an annotation at 1.00 on each chart. Annotation: "1.6x high"]
![Page 30: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/30.jpg)
Brief Benchmark Report
Conclusion:
LeoFS kept stable performance throughout the benchmark
The bottleneck is disk I/O
The cache mechanism helped reduce network traffic between Gateway and Storage
![Page 31: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/31.jpg)
Multi Data Center Replication
![Page 32: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/32.jpg)
Multi Data Center Replication
HIGH-Scalability
HIGH-Availability
+ Easy Operation for Admins
NO SPOF
NO Performance Degradation
[Map: Tokyo, Europe, US, Singapore]
![Page 33: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/33.jpg)
Multi Data Center Replication
Designed to be as simple as possible
1. Easy operation to build multiple clusters
2. Asynchronous data replication between clusters
   Stacked data is transferred to remote cluster(s)
3. Eventual consistency
![Page 34: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/34.jpg)
Multi Data Center Replication
Preparing the MDC Replication
Executing “Join Cluster” on the Manager Console
"join cluster DC-2 and DC-3"
Manager cluster: monitors and replicates each “RING” and “System Configuration”
[Diagram: "Leo Storage Platform" spanning DC-1 [# of replicas:3], DC-2 [# of replicas:1] and DC-3 [# of replicas:1]; each DC has a Storage cluster and a Manager cluster, linked by leo_rpc; Client]
![Page 35: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/35.jpg)
Multi Data Center Replication
Stacking objects
Application(s) send requests to the target region
Temporarily stacking objects
- One container's capacity is 32MB*
- When the container is full, it is sent to the remote cluster(s)
* 32MB is the default capacity; it can be set to another value
Manager cluster: monitors and replicates each “RING” and “System Configuration”
[Diagram: "Leo Storage Platform" spanning DC-1 [# of replicas:3], DC-2 [# of replicas:1] and DC-3 [# of replicas:1]; each DC has a Storage cluster and a Manager cluster, linked by leo_rpc; Client]
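The stacking behaviour on this slide (collect objects in a container and ship the container once it reaches its capacity) can be sketched as a small buffer with a flush callback. The `ObjectStack` class and its `send` callback are hypothetical. Only the 32MB default capacity comes from the slide.

```python
DEFAULT_CAPACITY = 32 * 1024 * 1024  # 32MB, the default stated on the slide


class ObjectStack:
    """Hypothetical container that stacks objects until it is full."""

    def __init__(self, send, capacity=DEFAULT_CAPACITY):
        self._send = send          # callback that ships a batch to remote cluster(s)
        self._capacity = capacity
        self._items = []
        self._size = 0

    def add(self, key: str, body: bytes) -> None:
        """Stack one object; flush when the container reaches its capacity."""
        self._items.append((key, body))
        self._size += len(body)
        if self._size >= self._capacity:
            self.flush()

    def flush(self) -> None:
        """Send the stacked objects (if any) and start a new container."""
        if self._items:
            self._send(self._items)
            self._items = []
            self._size = 0


# Example with a tiny capacity so that the flush is easy to observe.
sent_batches = []
stack = ObjectStack(sent_batches.append, capacity=10)
for index in range(3):
    stack.add(f"bucket/object-{index}", b"abcd")
```

Batching in this way trades a short replication delay for fewer, larger transfers between data centers, which fits the asynchronous, eventually consistent design described on the earlier slide.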
![Page 36: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/36.jpg)
Multi Data Center Replication
Transferring stacked objects
Application(s) send requests to the target region
Stacked objects: each object is stacked together with its metadata
The stack is compressed with LZ4
Objects are replicated to the remote cluster(s)
Manager cluster: monitors and replicates each “RING” and “System Configuration”
[Diagram: "Leo Storage Platform" spanning DC-1, DC-2 and DC-3; each DC has a Storage cluster and a Manager cluster, linked by leo_rpc; Client]
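As an illustration of the transfer step, the sketch below serializes each stacked object together with its metadata and compresses the whole stack before sending. The slide names LZ4. This example swaps in `zlib` from the Python standard library purely so that it runs without extra packages, and the length-prefixed framing and JSON metadata are assumptions, not the LeoFS wire format.

```python
import json
import struct
import zlib

LENGTHS = struct.Struct(">II")  # metadata length, body length (hypothetical framing)


def pack_stack(items) -> bytes:
    """Serialize (metadata, body) pairs and compress them as one blob."""
    parts = []
    for metadata, body in items:
        encoded = json.dumps(metadata, sort_keys=True).encode("utf-8")
        parts.append(LENGTHS.pack(len(encoded), len(body)) + encoded + body)
    # zlib is a stand-in here; the slide specifies LZ4.
    return zlib.compress(b"".join(parts))


def unpack_stack(blob: bytes):
    """Decompress a blob and recover the (metadata, body) pairs."""
    raw = zlib.decompress(blob)
    items, pos = [], 0
    while pos < len(raw):
        meta_len, body_len = LENGTHS.unpack_from(raw, pos)
        pos += LENGTHS.size
        metadata = json.loads(raw[pos:pos + meta_len].decode("utf-8"))
        pos += meta_len
        items.append((metadata, raw[pos:pos + body_len]))
        pos += body_len
    return items


stacked = [
    ({"key": "bucket/a.jpg", "timestamp": 1400000000}, b"\xff\xd8" * 100),
    ({"key": "bucket/b.txt", "timestamp": 1400000001}, b"hello"),
]
blob = pack_stack(stacked)
```

Shipping the metadata with each object lets the receiving cluster store the replica and later compare versions without a second round trip.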
![Page 37: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/37.jpg)
Multi Data Center Replication
Investigating stored objects
Application(s) send requests to the target region
1) Receive metadata of stored objects
2) Compare them at the local cluster
3) Fix inconsistent objects
Manager cluster: monitors and replicates each “RING” and “System Configuration”
[Diagram: "Leo Storage Platform" spanning DC-1, DC-2 and DC-3; each DC has a Storage cluster and a Manager cluster, linked by leo_rpc; Client]
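The three steps on this slide (receive metadata, compare it locally, fix inconsistent objects) can be sketched as a comparison of two metadata maps. The rule used here treats an object as needing repair when the local copy is missing, or when the checksums differ and the received metadata is newer. That rule and the `checksum` / `timestamp` field names are assumptions for illustration, not the documented LeoFS algorithm.

```python
def find_objects_to_fix(local: dict, received: dict) -> list:
    """Return keys whose local copy is missing or older than the received one."""
    to_fix = []
    for key, remote_meta in received.items():
        local_meta = local.get(key)
        if local_meta is None:
            to_fix.append(key)  # the object never arrived at this cluster
        elif (local_meta["checksum"] != remote_meta["checksum"]
              and remote_meta["timestamp"] > local_meta["timestamp"]):
            to_fix.append(key)  # a newer version exists on the other cluster
    return sorted(to_fix)


local_metadata = {
    "bucket/a": {"checksum": "aa", "timestamp": 10},
    "bucket/b": {"checksum": "b1", "timestamp": 10},
    "bucket/d": {"checksum": "d2", "timestamp": 30},
}
received_metadata = {
    "bucket/a": {"checksum": "aa", "timestamp": 10},  # identical
    "bucket/b": {"checksum": "b2", "timestamp": 20},  # newer remotely
    "bucket/c": {"checksum": "cc", "timestamp": 15},  # missing locally
    "bucket/d": {"checksum": "d1", "timestamp": 5},   # older remotely
}
inconsistent = find_objects_to_fix(local_metadata, received_metadata)
```

Comparing metadata rather than object bodies keeps the check cheap: only the keys returned here need their data transferred again.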
![Page 38: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/38.jpg)
NFS Support
![Page 39: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/39.jpg)
NFS Support
Future Plans
Data-HUB: Centralize unstructured data in LeoFS
Search / Analysis
PaaS / IaaS
Photo-Storage
Many Kinds of Data: Photo, Log / Event Data
Loading Data
Analysis Data
Stream Processing
![Page 40: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/40.jpg)
LeoFS Administration at Rakuten
Presented by Masahiro Sanjo, Rakuten Institute of Technology
![Page 41: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/41.jpg)
LeoFS Administration at Rakuten
Storage Platform
File Sharing Service
Others: Portal Site, Photo Storage, Background Storage of OpenStack
![Page 42: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/42.jpg)
Storage Platform
![Page 43: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/43.jpg)
Storage Platform - Scaling the Storage Platform
(Movie)
Reduce Costs
High Reliability
Easy to Scale
S3-API
![Page 44: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/44.jpg)
Storage Platform - Scaling the Storage Platform
Used by Various Services
Total Usage: 450TB/600TB
# of Files: 600 Million
Daily Growth: 100GB
Daily Reqs: 13 Million
Services: E-Commerce, Blog, Insurance, Calendar, Recruiting, Review, Photoshare, Portal & Contents, Bookmark
[Diagram: the services above connected to the Storage Platform] (Movie)
![Page 45: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/45.jpg)
Storage Platform - System Layout
Total disk space: 600TB
Number of Files: 600 Million
Access Stats: 800Mbps (MAX), 400Mbps (AVG)
[Architecture diagram: requests from web applications / browsers arrive over HTTP (S3-API) through Load Balancer / Cache Servers. Components: Gateway x 4, Storage x 14, Manager x 2 (Monitor), GUI Console. Link labels: Erlang RPC; TCP/IP, SNMP.]
![Page 46: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/46.jpg)
Storage Platform - Monitor
Status Collection (Ganglia)
Status Check (Nagios)
Port + Threshold Check
Send Mail Alert
Ganglia Agent
[Architecture diagram: Gateway x 4, Storage x 14, Manager x 2 (Monitor), GUI Console. Link labels: Erlang RPC; TCP/IP, SNMP.]
![Page 47: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/47.jpg)
Storage Platform - Spreading Globally
Covering All Services with Multi DC Replication
![Page 48: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/48.jpg)
File Sharing Service
+ https://owncloud.com/
![Page 49: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/49.jpg)
File Sharing Service - Required Targets
Reduce Costs
Handle Confidential Files
Store Large Files
Scale Easily
![Page 50: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/50.jpg)
File Sharing Service - Usage
Share Docs and Videos with Group Companies
Over 20 Companies, Over 10 Countries
Over 4,000 Users, Over 10,000 Teams
![Page 51: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/51.jpg)
File Sharing Service - System Layout
Web GUI File Browser
LDAP
Authenticate Users
Manage Configurations
Manage Login Session (KVS)
[Architecture diagram: Manager x 2 (Monitor), GUI Console. Link labels: Erlang RPC; TCP/IP, SNMP.]
![Page 52: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/52.jpg)
File Sharing Service - Future Plans
Cover 25 Countries/Regions
Over 20,000 Users
![Page 53: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/53.jpg)
Empowering the Services and the Users Through the Cloud Storage
![Page 54: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/54.jpg)
Future Plans
![Page 55: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/55.jpg)
Future Plans
LeoInsight
SavannaDB for Statistics Data
Retrieve metrics and stats from SavannaDB's Agents
REST-API (JSON)
Operate LeoFS
Send a notification when the # of requests exceeds a threshold
[Diagram: LeoFS, "The Lion of Storage Systems" (Storage Cluster, Manager, Gateway), with SavannaDB's Agent + LeoInsight (Insight LeoFS)]
![Page 56: Reliable, Scaling and High Performance Storage System...Reliable, Scaling and High Performance Storage System Yosuke Hara - @yosukehara A Researcher of R.I.T. and Tech Lead LeoFS with](https://reader030.vdocuments.mx/reader030/viewer/2022011922/603d3fd37cf0be17d235f500/html5/thumbnails/56.jpg)
Set Sail for “Cloud Storage”
Website: leo-project.net
Twitter: @LeoFastStorage