beyond the mds bound in distributed cloud storage jian li, tongtong li, jian ren michigan state...

53
Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications 2014/9/18 1

Upload: wendy-armstrong

Post on 17-Dec-2015

223 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

1

Beyond the MDS Bound in Distributed Cloud Storage

Jian Li, Tongtong Li, Jian RenMichigan State University

IEEE INFOCOM 2014-Conference on Computer Communications

2014/9/18

Page 2: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

2

Outline

• Introduction• Related Work• Preliminary and Assumptions• Encoding H-MSR Code• Regeneration of the H-MSR Code• Reconstruction of the H-MSR Code• Performance Analysis• Conclusion2014/9/18

Page 3: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

3

Introduction

• The distributed cloud storage can also increase data availability while reducing network congestion that leads to increased resiliency.

• A popular approach is to employ an (n, k) maximum distance separable (MDS) code such as an Reed-Solomon (RS) code [1], [2].

2014/9/18

[1] S. Rhea, C. Wells, P. Eaton, D. Geels, B. Zhao, H. Weatherspoon, and J. Kubiatowicz, “Maintenance-free global data storage,” [2] R. Bhagwan, K. Tati, Y.-C. Cheng, S. Savage, and G. M. Voelker, “Total recall: System support for automated availability management,”

Page 4: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

4

Introduction

• For RS code, the data is stored in n storage nodes in the network. The data collector (DC) can reconstruct the data by connecting to any k healthy nodes.

• While RS code works perfect in reconstructing the data, it lacks scalability in repairing or regenerating a failed node.

2014/9/18

Page 5: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

5

Introduction

• Regenerating code– Allow a replacement node to connect to some

individual nodes directly.– Instead of first recovering the original data.– Regenerate a substitute of the failed node.

• Minimum storage regeneration (MSR)• Minimum bandwidth regeneration (MBR)

2014/9/18

Page 6: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

6

Introduction

• However, the existing research either has no error detection capability, or has the error correction capability limited by the RS code.

• Moreover, error correction capability are unable to determine whether the error correction is successful.

2014/9/18

Page 7: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

7

Introduction

• In this paper, we construct the H-MSR code by combining the Hermitian code and regenerating code at the MSR point.– Can detect the errors in the network while

achieving the error correction capability beyond the RS code.

– Can determine whether the error correction is successful.

2014/9/18

Page 8: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

8

Outline

• Introduction• Related Work• Preliminary and Assumptions• Encoding H-MSR Code• Regeneration of the H-MSR Code• Reconstruction of the H-MSR Code• Performance Analysis• Conclusion2014/9/18

Page 9: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

9

Related Work

• Dimakis et al. [3] introduced the conception of {n, k, d, α, β,B} regenerating code.

• The contents stored in a failed node can be regenerated by the replacement node through downloading γ help symbols from d helper nodes.– The bandwidth consumption for the failed node

regeneration could be far less than the whole file.

2014/9/18

[3] A. Dimakis, P. Godfrey, Y. Wu, M. Wainwright, and K. Ramchandran, “Network coding for distributed storage systems,”

Page 10: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

10

Related Work

• A data collector (DC) can reconstruct the original file stored in the network by downloading α symbols from each of the k storage nodes.

• In [3], the authors proved that there is a tradeoff between bandwidth γ and per node storage α.

2014/9/18

Page 11: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

11

Related Work

• In [10], [11], [12], discussed the safely stored, check the integrity of the data, analyzed the error resilience of the RS code.

• This paper’s design is based on the structural analysis of the Hermitian code and the efficient decoding algorithm proposed in [13].

2014/9/18

[10] S. Pawar, S. El Rouayheb, and K. Ramchandran, “Securing dynamic distributedstorage systems against eavesdropping and adversarial attacks,”[11] Y. Han, R. Zheng, and W. H. Mow, “Exact regenerating codes for byzantine fault tolerance in distributed storage,” [12] K. Rashmi, N. Shah, K. Ramchandran, and P. Kumar, “Regenerating codes for errors and erasures in distributed storage,”

[13] J. Ren, “On the structure of hermitian codes and decoding for burst errors,”

Page 12: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

12

Outline

• Introduction• Related Work• Preliminary and Assumptions

– Regenerating Code– Hermitian Code– Adversarial Model

• Encoding H-MSR Code• Regeneration of the H-MSR Code• Reconstruction of the H-MSR Code• Performance Analysis• Conclusion2014/9/18

Page 13: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

13

Preliminary and AssumptionsRegenerating Code

• Regenerating code introduced in [3] is a linear code over – with a set of parameters {n, k, d, α, β,B}.– A file of size B is stored in n storage nodes, each of

which stores α symbols.– A replacement node can regenerate the contents

of a failed node by downloading β symbols from each of d randomly selected storage nodes.

2014/9/18

Page 14: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

14

Preliminary and AssumptionsRegenerating Code

• Total bandwidth needed to regenerate a failed node is γ = dβ.

• The data collector (DC) can reconstruct the whole file by downloading α symbols from each of k ≤ d randomly selected storage nodes.

2014/9/18

?

Page 15: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

15

Preliminary and AssumptionsRegenerating Code

• From equation (1), a tradeoff between the regeneration bandwidth γ and the storage requirement α was derived. γ and α cannot be decreased at the same time.

2014/9/18

Page 16: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

16

Preliminary and AssumptionsHermitian Code

• A Hermitian curve H(q) over GF() in affine coordinates is defined by:

2014/9/18

Page 17: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

17

Preliminary and AssumptionsAdversarial Model

• In this paper– Assume some network nodes can be corrupted

due to hardware failure or communication errors.– Be compromised by malicious users.– These nodes may send out incorrect response to

disrupt the data regeneration and reconstruction.– Refer these symbols as bogus symbols without

making distinction between the corrupted symbols and compromised symbols.

2014/9/18

Page 18: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

18

Outline

• Introduction• Related Work• Preliminary and Assumptions• Encoding H-MSR Code• Regeneration of the H-MSR Code• Reconstruction of the H-MSR Code• Performance Analysis• Conclusion2014/9/18

Page 19: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

19

Encoding H-MSR Code

• Analyze the H-MSR code for d = 2k − 2 = 2α.• Let , , …, be a strictly decreasing integer

sequence– LCM of , , …, is A

• Suppose the data contains B = A.

2014/9/18

Page 20: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

20

Encoding H-MSR Code

• We arrange the B symbols into two matrices S, T as below:

2014/9/18

Page 21: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

21

Encoding H-MSR Code

• is a symmetric matrix of size × with the upper-triangular entries filled by data symbols.

• Thus contains ( + 1)/2 symbols.• So it contains .

2014/9/18

Page 22: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

22

Encoding H-MSR Code

• For distributed storage, we encode each pair of matrices (S, T) using Algorithm 1.

2014/9/18

Page 23: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

23

Encoding H-MSR Code

2014/9/18

Page 24: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

24

Encoding H-MSR Code

• Theorem 1. By processing the data symbols using Algorithm 1, we can achieve the MSR point in distributed storage.

2014/9/18

Page 25: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

25

Outline

• Introduction• Related Work• Preliminary and Assumptions• Encoding H-MSR Code• Regeneration of the H-MSR Code– Regeneration in Error-free Network– Regeneration in the Hostile Network

• Reconstruction of the H-MSR Code• Performance Analysis• Conclusion2014/9/18

Page 26: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

26

Regeneration of the H-MSR CodeRegeneration in Error-free Network

• Suppose node z fails, we devise Algorithms 2 to Algorithm 4 in the network to regenerate the exact H-MSR code symbols of node z in a replacement node .

2014/9/18

Page 27: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

27

Regeneration of the H-MSR CodeRegeneration in Error-free Network

2014/9/18

Page 28: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

28

Regeneration of the H-MSR CodeRegeneration in Error-free Network

2014/9/18

Page 29: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

29

Regeneration of the H-MSR CodeRegeneration in Error-free Network

• From Algorithm 2 to Algorithm 4, we can derive the equivalent storage parameters for each symbol block of size = A ( + 1), = 2 , = + 1, = A, = A/ .

2014/9/18

Page 30: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

30

Regeneration of the H-MSR CodeRegeneration in the Hostile Network

• In a hostile network, Algorithm 4 may be unable to regenerate the failed node due to the possible bogus symbols received from the responses.

• It also cannot verify the correctness of the result.

2014/9/18

Page 31: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

31

Regeneration of the H-MSR CodeRegeneration in the Hostile Network

• There are two modes for the helper nodes to regenerate the contents of a failed storage node in the hostile network.– Detection mode– Recovery mode

2014/9/18

Page 32: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

32

Regeneration of the H-MSR CodeRegeneration in the Hostile Network

2014/9/18

Page 33: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

33

Regeneration of the H-MSR CodeRegeneration in the Hostile Network

2014/9/18

Page 34: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

34

Regeneration of the H-MSR CodeRegeneration in the Hostile Network

• Recovery Mode: Once the replacement node detects errors using Algorithm 5, it will send integer j = q − 1 to all the other −1 nodes in the network requesting help symbols.

• Helper node i will send help symbols using Algorithm 3. – can regenerate symbols using Algorithm 6.

2014/9/18

Page 35: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

35

Regeneration of the H-MSR CodeRegeneration in the Hostile Network

2014/9/18

Page 36: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

36

Outline

• Introduction• Related Work• Preliminary and Assumptions• Encoding H-MSR Code• Regeneration of the H-MSR Code• Reconstruction of the H-MSR Code– Reconstruction in Error-free Network– Reconstruction in Hostile Network

• Performance Analysis• Conclusion2014/9/18

Page 37: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

37

Reconstruction of the H-MSR CodeReconstruction in Error-free Network

2014/9/18

Page 38: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

38

Reconstruction of the H-MSR CodeReconstruction in Error-free Network

2014/9/18

[4] K. Rashmi, N. Shah, and P. Kumar, “Optimal exact-regenerating codes for distributed storage at the msr and mbr points via a productmatrix construction,”

Page 39: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

39

Reconstruction of the H-MSR CodeReconstruction in Hostile Network

• Algorithm 9 cannot verify whether the result is correct or not.

• H-MSR– Detection mode– Recovery mode

2014/9/18

Page 40: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

40

Reconstruction of the H-MSR CodeReconstruction in Hostile Network

2014/9/18

Page 41: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

41

Reconstruction of the H-MSR CodeReconstruction in Hostile Network

2014/9/18

Page 42: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

42

Reconstruction of the H-MSR CodeReconstruction in Hostile Network

2014/9/18

Page 43: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

43

Reconstruction of the H-MSR CodeReconstruction in Hostile Network

2014/9/18

Page 44: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

44

Outline

• Introduction• Related Work• Preliminary and Assumptions• Encoding H-MSR Code• Regeneration of the H-MSR Code• Reconstruction of the H-MSR Code• Performance Analysis

– Scalable Error Correction– Error Correction Capability– Complexity Discussion

• Conclusion2014/9/18

Page 45: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

45

Performance AnalysisScalable Error Correction

• The RS-MSR code in [12] can correct up to τ errors by downloading symbols from d + 2τ nodes.

• When the number of errors is larger than τ , the decoding process will fail without being detected.

2014/9/18

Page 46: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

46

Performance AnalysisScalable Error Correction(regeneration)

• The H-MSR code can detect the erroneous decodings using Algorithm 5.

• If no error is detected, the regeneration of HMSR only needs to download symbols from one more node than the regeneration in the error-free network, while the extra cost for the RS-MSR code is 2τ .

• The H-MSR code can correct the errors using Algorithm 6.

2014/9/18

Page 47: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

47

Performance AnalysisScalable Error Correction(reconstruction)

• The RS-MSR code can correct up to τ errors with support from 2τ additional helper nodes.

• The H-MSR code only requires symbols from one additional node using Algorithm 10 to do error detection.

• The errors can then be corrected using Algorithm 11.

2014/9/18

Page 48: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

48

Performance AnalysisError Correction Capability

• Theorem 4. For the data regeneration, the number of errors that the H-MSR code and the RS-MSR code can correct satisfy > when q ≥ 3.

• Similarly, we can conclude that the number of errors that can be corrected by the H-MSR code is larger than the RSMSR code under the same code rate.

2014/9/18

Page 49: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

49

Performance AnalysisComplexity Discussion(regeneration)

• H-MSR code will slightly increase the complexity of the helper nodes.– The extra operation is a matrix multiplication.– The complexity is O() = O() = O().

2014/9/18

Page 50: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

50

Performance AnalysisComplexity Discussion(regeneration)

• Similar to [13], for a replacement node, from Algorithm 4 and Algorithm 5, we can derive that the complexity to regenerate symbols for RS-MSR is O().

• While the complexity for H-MSR is only O().

2014/9/18

Page 51: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

51

Performance AnalysisComplexity Discussion(reconstruction)

• The computational complexity for DC to reconstruct the data is O() for the H-MSR code and O() for the RS-MSR code.

2014/9/18

Page 52: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

52

Outline

• Introduction• Related Work• Preliminary and Assumptions• Encoding H-MSR Code• Regeneration of the H-MSR Code• Reconstruction of the H-MSR Code• Performance Analysis• Conclusion2014/9/18

Page 53: Beyond the MDS Bound in Distributed Cloud Storage Jian Li, Tongtong Li, Jian Ren Michigan State University IEEE INFOCOM 2014-Conference on Computer Communications

53

Conclusion

• H-MSR code can significantly improve the performance of the regenerating code under malicious attacks.

• Lower complexity than (RS-MSR) code in both regeneration and reconstruction.

2014/9/18