Distributed Data System by Random Network Coding
ASUSA Corporation
Hiroshi Nishida
General Distributed Storage System
● Traditional Distributed Storage System
– All servers have the same raw files
– General approach for most Content Delivery Networks (CDNs), such as Netflix and YouTube
General Distributed Storage System
● Slightly More Efficient System
– To save overall disk space, files are split into pieces and the pieces are sent to servers
– However, this is not always reliable
– Hadoop, etc.
Distributed Storage System using Random Network Coding
● More Advanced System by RNC or Erasure Coding
– Saves disk space and is more reliable
– Any two servers can fail without losing the file
– Each server stores only 1/3 of the original file size
Distributed Storage System using Random Network Coding
● Principle of Random Network Coding
– Split a file into three pieces – X1, X2, X3
– Randomly choose A1, A2, A3, and calculate B = A1 X1 + A2 X2 + A3 X3
– Repeat this for B1, B2, …, Bn, where n is the number of servers
– For instance, with five servers:
B1 = 3 X1 + 10 X2 + 7 X3
B2 = 8 X1 + 5 X2 + 2 X3
B3 = 1 X1 + 4 X2 + 23 X3
B4 = 11 X1 + 2 X2 + 9 X3
B5 = 4 X1 + 32 X2 + 11 X3
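The encoding step can be sketched in Python as follows. This is an illustration only, not the rnccdn implementation. The slides do not name the field, so GF(2^8) with the reduction polynomial 0x11B is assumed here, and the names `gf_mul`, `split_file`, `encode`, and `COEFFS` are hypothetical. The coefficient rows are the ones from the example.

```python
# Assumption: arithmetic in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B).
def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements (0..255) with shift-and-XOR."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= 0x11B           # reduce when the degree reaches 8
        b >>= 1
    return result


def split_file(data: bytes, k: int = 3) -> list:
    """Split data into k equal-sized pieces, zero-padding the tail."""
    size = -(-len(data) // k)    # ceiling division
    padded = data.ljust(size * k, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(k)]


def encode(pieces: list, coeffs: tuple) -> bytes:
    """Compute B = A1 X1 + A2 X2 + A3 X3 byte by byte in GF(2^8)."""
    coded = bytearray(len(pieces[0]))
    for a, piece in zip(coeffs, pieces):
        for i, byte in enumerate(piece):
            coded[i] ^= gf_mul(a, byte)
    return bytes(coded)


# Coefficient rows (A1, A2, A3) for B1..B5, taken from the example.
COEFFS = [(3, 10, 7), (8, 5, 2), (1, 4, 23), (11, 2, 9), (4, 32, 11)]

pieces = split_file(b"hello, network coding!")
coded = [encode(pieces, row) for row in COEFFS]
```

Each coded piece has the same length as one input piece, because every field element fits in a single byte.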
Distributed Storage System using Random Network Coding
● Distribute B1, B2, …, B5 to the servers, one per server
● Note: size of Bn (for all n) = size of Xk (for all k) = 1/3 of the original file, because the calculation is done in a Galois field (the actual size of Bn is slightly greater than 1/3)
(Figure: five servers, each storing one of B1, B2, B3, B4, B5, where B = A1 X1 + A2 X2 + A3 X3)
Distributed Storage System using Random Network Coding
● Restoring Original File
– With any three servers, for example those holding B2, B3, B5:
B2 = 8 X1 + 5 X2 + 2 X3
B3 = 1 X1 + 4 X2 + 23 X3
B5 = 4 X1 + 32 X2 + 11 X3
– We solve the linear equations and obtain X1, X2, X3
– Concatenate X1, X2, X3 to restore the file
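A minimal sketch of the restore step, assuming the same field as before: GF(2^8) with polynomial 0x11B, which the slides do not specify. It solves the three equations by Gauss-Jordan elimination and returns X1, X2, X3. All function names are hypothetical, and the tiny `encode` helper exists only to produce test input.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo 0x11B (assumed field)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse: a^254, since a^255 = 1 for a != 0."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse")
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def encode(pieces, coeffs):
    """B = A1 X1 + A2 X2 + A3 X3, byte by byte."""
    out = bytearray(len(pieces[0]))
    for a, piece in zip(coeffs, pieces):
        for i, byte in enumerate(piece):
            out[i] ^= gf_mul(a, byte)
    return bytes(out)


def decode(rows, coded):
    """Solve rows * X = coded for X by Gauss-Jordan elimination."""
    k = len(rows)
    a = [list(r) for r in rows]
    b = [bytearray(c) for c in coded]
    for col in range(k):
        pivot = next((r for r in range(col, k) if a[r][col]), None)
        if pivot is None:
            raise ValueError("coefficient rows are linearly dependent")
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        inv = gf_inv(a[col][col])
        a[col] = [gf_mul(inv, v) for v in a[col]]          # scale pivot row to 1
        b[col] = bytearray(gf_mul(inv, v) for v in b[col])
        for r in range(k):
            if r != col and a[r][col]:
                f = a[r][col]                               # eliminate column entry
                a[r] = [x ^ gf_mul(f, y) for x, y in zip(a[r], a[col])]
                b[r] = bytearray(x ^ gf_mul(f, y) for x, y in zip(b[r], b[col]))
    return [bytes(x) for x in b]


# Coefficient rows of B2, B3, B5 from the example.
rows = [(8, 5, 2), (1, 4, 23), (4, 32, 11)]
pieces = [b"abc", b"def", b"ghi"]
coded = [encode(pieces, row) for row in rows]
restored = b"".join(decode(rows, coded))
```

The elimination works on the 3×3 coefficient matrix only once; the same row operations are then applied to every byte position of the coded pieces.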
Distributed Storage System using Random Network Coding
● Has high affinity to P2P
(Figure: five nodes holding B1, B2, B3, B4, B5)
● Saves disk space and achieves higher reliability
Distributed Storage System using Random Network Coding
● Erasure Coding (RAID5, RAID6, etc.)
– Simpler and usually faster than RNC
– Microsoft Azure, Hadoop, OpenStack, etc.
– Example with a single XOR parity piece:
B1 = X1
B2 = X2
B3 = X3
B4 = X1 ⊕ X2 ⊕ X3
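The single-parity layout can be sketched as follows. The helper name `xor_blocks` and the sample data are made up for illustration. With one XOR parity piece, any one lost piece can be rebuilt by XORing the remaining three; surviving two simultaneous losses needs a second, different parity piece, as in RAID6.

```python
def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# Data pieces are stored as-is; B4 is the parity piece.
x1, x2, x3 = b"abcd", b"efgh", b"ijkl"
b1, b2, b3 = x1, x2, x3
b4 = xor_blocks(x1, x2, x3)

# Suppose the server holding B2 fails: rebuild X2 from the other three.
recovered_x2 = xor_blocks(b1, b3, b4)
```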
Distributed Storage System using Random Network Coding
● Pros
– Saves disk space
– More reliable than traditional distributed system
– Easy to add servers
– Safe because data are encoded
● Cons
– Encoding and decoding require CPU power
  ● To solve linear equations, Gaussian elimination is necessary (O(n^3))
  ● Calculation in GF is also slow?
  ● Who decodes data?
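On the question of GF speed: one widely used way to make GF(2^8) multiplication cheaper is to replace the bitwise loop with log/antilog table lookups. The sketch below shows that technique under stated assumptions (polynomial 0x11B, generator 3). It is not a description of what rnccdn does, and the names `gf_mul_slow`, `gf_mul_table`, `EXP`, and `LOG` are hypothetical.

```python
def gf_mul_slow(a: int, b: int) -> int:
    """Reference shift-and-XOR multiply in GF(2^8) modulo 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


# Precompute powers of the generator 3 and the matching logarithms.
EXP = [0] * 510          # doubled so LOG[a] + LOG[b] never needs a modulo
LOG = [0] * 256
value = 1
for power in range(255):
    EXP[power] = value
    LOG[value] = power
    value = gf_mul_slow(value, 3)
for power in range(255, 510):
    EXP[power] = EXP[power - 255]


def gf_mul_table(a: int, b: int) -> int:
    """Multiply via tables: a * b = 3^(log a + log b)."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]
```

The tables cost 766 small integers of memory and turn each multiplication into two lookups and one addition.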
Content Delivery Network (CDN)
● Puts the same contents on different servers
● Getting more popular
Content Delivery Network (CDN)
● DDoS attacks are increasing all over the world
● Enterprises employ temporary CDNs to survive attacks
● A DDoS attack costs only $5/h (free for first 5 min)
CDN + RNC
● Saves disk space
– Can utilize SSD space
– Achieves higher bandwidth
● But how do we distribute data?
● Can we reduce the overall amount of transferred data?
● Can we guarantee non-duplication of equations?
CDN + RNC
● How do we minimize duplication of equations?
(Figure: eight servers holding coded pieces with duplicated equations: 3 X1 + 10 X2 + 7 X3 appears on four servers, 34 X1 + 23 X2 + 9 X3 on two, and 8 X1 + 50 X2 + 100 X3 on two)
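One possible way to detect a duplicated equation, sketched under assumptions: before storing a new coded piece, check whether its coefficient row raises the rank of the rows already held. Scalar multiples and other linear combinations are then caught as well as exact copies. This is not the author's proposed technique. The field (GF(2^8), polynomial 0x11B) and the names `rank` and `is_innovative` are hypothetical.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo 0x11B (assumed field)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inv(a: int) -> int:
    """Inverse as a^254; valid for a != 0."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def rank(rows) -> int:
    """Rank of a coefficient matrix over GF(2^8) by row reduction."""
    m = [list(r) for r in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        inv = gf_inv(m[rk][col])
        m[rk] = [gf_mul(inv, v) for v in m[rk]]
        for r in range(len(m)):
            if r != rk and m[r][col]:
                f = m[r][col]
                m[r] = [x ^ gf_mul(f, y) for x, y in zip(m[r], m[rk])]
        rk += 1
    return rk


def is_innovative(held_rows, new_row) -> bool:
    """True if new_row adds information beyond the rows already held."""
    return rank(list(held_rows) + [new_row]) > rank(held_rows)


held = [(3, 10, 7)]
duplicate = (3, 10, 7)
scaled = tuple(gf_mul(2, v) for v in held[0])   # 2 * (3, 10, 7)
fresh = (34, 23, 9)
```

Here `duplicate` and `scaled` would both be rejected, while `fresh` would be accepted.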
CDN + RNC
● Who decodes data? Client or server?
● Should we create a plugin for web browsers?
Programs – rnccdn & rnccdnd
● Server: rnccdnd
– Daemon process
– Receives messages from clients and other servers
● Client: rnccdn
– Sends requests + data to servers and controls them
(Figure: one client running rnccdn and five servers, each running rnccdnd)
Programs – rnccdn & rnccdnd
● Open source, BSD license (more permissive than the GPL)
● Target OSs: Linux, FreeBSD
● Language: C or C++
● Libraries to use: libevent (optimizes polling functions), LibreSSL (for communication)?
● Message channel: SSL/TLS
● Data channel: SSL/TLS for raw data, no encryption for encoded data
● HTTP/HTTPS for client–server communication?
Project Goal
● Creating open source programs that implement CDN + RNC
● If possible, implement a new technique to distribute encoded data