ludovicbarman · por2 –proofsofretrievabilityandredundancy ludovicbarman epfl august27,2015...
TRANSCRIPT
PoR2 – Proofs of Retrievability and Redundancy
Ludovic Barman
EPFL
August 27, 2015
Ludovic Barman (EPFL) PoR2 August 27, 2015 1 / 54
"Google loses data as lightling strikes", BBC.com, 19.08.2015
Ludovic Barman (EPFL) PoR2 August 27, 2015 2 / 54
It is not a critical event, but some data were lost
Which (naturally...) led the journalists to write advice such as :
Still, no one should ever put their faith in the cloud — always backupyour data locally - Geek.com, 21.08.2015
Indeed, but what if we could ?
Nothing is perfectly safe - but it would be nice to know the risks !
Ludovic Barman (EPFL) PoR2 August 27, 2015 3 / 54
Table of Contents
1 Introduction
2 Related work
3 Model
4 Proposed solutions
5 Conclusion
Ludovic Barman (EPFL) PoR2 August 27, 2015 4 / 54
Table of Contents
1 Introduction
2 Related work
3 Model
4 Proposed solutions
5 Conclusion
Ludovic Barman (EPFL) PoR2 August 27, 2015 5 / 54
IntroductionMotivation
1 Cloud services are more widespread than ever2 It offers numerous benefits, in particular for the mobile user3 It simplifies data storage, synchronization and backup4 One downside : Data availability is less obvious for users5 Current cloud providers accept little liability for data loss
Ludovic Barman (EPFL) PoR2 August 27, 2015 6 / 54
IntroductionMotivation
1 Cloud providers often replicate the data to reduce risks of loss2 This replication is often done obliviously to the users3 Some cloud providers allows users to choose the degree of replication4 But users cannot check the veracity of these claims
Hence, users cannot assess nor control the risks they incurs, and need totrust the cloud providers.
Ludovic Barman (EPFL) PoR2 August 27, 2015 7 / 54
IntroductionMotivation
1 Protocols (PDP/POR) exists to check the retrievability of a file2 Recently, PDP have been extended for checking multiples replicas3 In all those protocols, the users needs to replicate locally, and send the
replicas
Numerical example: A user wants to store a 1GB file, and wants a degree ofreplication of 3 to reduce risks. He needs to sends around 3GB of data.
5 Worst, the cloud providers cannot check the replicas6 Users can cheat by claiming encrypted files are replicas
Ludovic Barman (EPFL) PoR2 August 27, 2015 8 / 54
IntroductionGoals
1 We design proofs that enable users to check1 the retrievability of their stored files
2 the correctness of the replicas of the data
2 In addition, the cloud provider creates the replicas, which1 significantly lessen the burden on users
2 prevents the users from cheating 1, hence allowing new business modelsfor the cloud providers
1by claiming some encrypted data are replicasLudovic Barman (EPFL) PoR2 August 27, 2015 9 / 54
IntroductionTeam
Dr. Ghassan Karame Prof. Dr. Frederik Armknecht
Dr. Jens-Matthias Bohli Ludovic Barman
Ludovic Barman (EPFL) PoR2 August 27, 2015 10 / 54
IntroductionContribution
1 We propose a new formal model PoR2 which extends the POR/PDPmodel and addresses new threats not covered so far
2 We propose new novel solution for data replication and retrievability,Mirror, and show that it is secure in the PoR2 model
3 We implement and evaluate the performance of Mirror in a realisticcloud setting.
Ludovic Barman (EPFL) PoR2 August 27, 2015 11 / 54
IntroductionContribution
My main contributions :1 We explore the solution space for constructing Mirror, and propose
three novel solutions2 We analyze the performance and security of our proposals, and extract
the relevant lessons3 By leveraging our observations, we design a fourth solution, Mirror,
which establishes a strong balance between security and performance4 We implement a prototype of all four devised solutions, and compare
the performance in a realistic cloud setting
Ludovic Barman (EPFL) PoR2 August 27, 2015 12 / 54
Table of Contents
1 Introduction
2 Related work
3 Model
4 Proposed solutions
5 Conclusion
Ludovic Barman (EPFL) PoR2 August 27, 2015 13 / 54
Related workDefinitions
1 Two (similar) kind of protocols allows to check the integrity of aremote file :
Definitions of PDP and PoRPDP. Provable Data Possession. Aims to validate the integrity ofoutsourced data efficiently and securely.PoR. Proof of Retrievability. Aims to provide evidence that a verifier canretrieve and reconstruct the entire outsourced data.
2 Related work mainly try to improve performance, to give the ability toupdate an uploaded file (dynamic schemes), or to process file withmultiple replicas
3 We focus on efficient multi-replica schemes
Ludovic Barman (EPFL) PoR2 August 27, 2015 14 / 54
Related WorkCommon architecture
Usually, (multi-replica) PDP and POR schemes share this commonarchitecture :
User U Server S
Store F , Γi−−−−−−−−−−−−−−−−−−−−−−−−−−B
Replicate <OR>−−−−−−−−−−−−−−−−−−−−−−−−−−B Replicate
Challenge Q−−−−−−−−−−−−−−−−−−−−−−−−−−B AnswerChallenge
UserVerify C−−−−−−−−−−−−−−−−−−−−−−−−−−
Ludovic Barman (EPFL) PoR2 August 27, 2015 15 / 54
Related workDefinitions (2)
Formal difference between PDP and PoR : PoR provide extractability.
ExtractabilityGiven a sufficient number of runs of the Verify where the user accepts,there exists an algorithm Extractor, run by the user, that takes as inputthe secret parameters of the user, and has non-black box access to themachine of the server, that recovers the file with high probability.
Personal analysis : It’s a desirable trait (which often hinders performance)
Ludovic Barman (EPFL) PoR2 August 27, 2015 16 / 54
Related WorkCommon architecture (2)
Usually, (multi-replica) PDP and POR schemes share these common traits :
1 The file F is usually considered to be encrypted by the user beforehand(for privacy purposes).
2 We consider that the user applies erasure-coding, i.e. if a givenproportion of the file is recoverable, then the whole file is.
Ludovic Barman (EPFL) PoR2 August 27, 2015 17 / 54
Table of Contents
1 Introduction
2 Related work
3 Model
4 Proposed solutions
5 Conclusion
Ludovic Barman (EPFL) PoR2 August 27, 2015 18 / 54
ModelIntroduction
We quickly summarize the most important aspects from the PoR2 model:
1 The user U and the server S agree on a SLA on storing a file F . Theserver needs to:
1 Store the file F in its entirety
2 Store R replicas F (1)...F (R) in their entirety
2 The protocol is the same as the one presented in common architecture.3 Security goals :
1 Extractability
2 Storage allocation – enough storage is allocated by S
3 Correct replication – replicas do not encode extra information
Ludovic Barman (EPFL) PoR2 August 27, 2015 19 / 54
ModelAdversary
We consider the rational attacker model as done in related work
Such adversary only deviates from the protocol if he gains somethingby doing so
For instance :
the cloud provider tries to cut corners and save costs, by storing lessthan the agreed storage
the user tries to save costs by uploading other information instead ofthe replicas
Ludovic Barman (EPFL) PoR2 August 27, 2015 20 / 54
Table of Contents
1 Introduction
2 Related work
3 Model
4 Proposed solutions
5 Conclusion
Ludovic Barman (EPFL) PoR2 August 27, 2015 21 / 54
Proposed solutionsOverview
Those are the common trait of our proposals :1 We want the replication to happen in the cloud2 We do not want the server to replicate on-the-fly3 Hence, the replication is designed as a puzzle4 This puzzle need to :
1 require a significant time to compute for the server
2 require significantly less time to compute for the user
3 have solutions at least as large as the required storage for a replica
4 be efficiently combined in the scheme
Our schemes are inspired from the private version of SW-POR.
Ludovic Barman (EPFL) PoR2 August 27, 2015 22 / 54
∑Additive Scheme
Ludovic Barman (EPFL) PoR2 August 27, 2015 23 / 54
Proposed solutionsAdditive scheme - Protocol (1)
Client
Store (File F , kPRF):
Using the PRF :... Pick α1 · · ·αs from Zφ(N)
... Pick secret1 · · · secretn from ZN
For i ∈ [1, n] :... σi ←secreti +
∑sj=1αj ·mi ,j
τU ← (kPRF, n)Keep τU , σi, may delete mi ,j
mi ,j, σi−−−−−−−−−−−−−−−−B Server
Ludovic Barman (EPFL) PoR2 August 27, 2015 24 / 54
Proposed solutionsAdditive scheme - Protocol (2)
Server
Replicate (r) :For all i ∈ [1, n], j ∈ [1, s ] :... m(r)
i ,j ← mi ,j + pE (i−1)s+jrr
Client
Challenge:Choose l indices i1 · · · il in [1, n]Pick vi1 · · · vil at random in Zp
Q = (ik , vik ) ∀ k ∈ [1, l ] Q−−−−−−−−−−−−−−−−B
Ludovic Barman (EPFL) PoR2 August 27, 2015 25 / 54
Proposed solutionsAdditive scheme - Protocol (3)Server
σ ←∑
(i ,vi )∈Q σi ·vi∀j ∈ [1, s ],... µj ←
∑(i ,vi )∈Q vi ·m(r)
i ,jσ, µj
−−−−−−−−−−−−−−−−B
Client
LHS =∑s
j=1 αj · µj +∑
(i ,vi )∈Q vi · secretiRHS = σ +
∑sj=1 αj ·Pj , where
Pj :=∑
(i ,vi )∈Q vi · pEr (i−1)·s+j mod φ(N)r mod (N)
Accepts if LHS = RHS
Ludovic Barman (EPFL) PoR2 August 27, 2015 26 / 54
Proposed solutionsBenchmarking
1 Prototypes made in Scala (run on JVM), run in realistic cloud setting
Parameter ValueMachine Type 2 x Intel Xeon E5-2640Number of cores per machine 24 coresRAM per machine 32 GBNetwork speed 100 MbpsMean packet latency 10 msVariance in packet latency 4 ms
Table: Implementation Settings
1 Each data point is averaged over 10 independant samples2 95% confidence intervals are shown where applicable
Ludovic Barman (EPFL) PoR2 August 27, 2015 27 / 54
Proposed solutionsAdditive Scheme - Performance
0
10
20
30
40
50
60
70
1 2 8 16 32
Late
ncy
[s]
File size [MB]
Store
0
50
100
150
200
250
300
1 2 8 16 32
Late
ncy
[min
]
File size [MB]
Replicate
0
50
100
150
200
1 2 8 16 32
Late
ncy
[ms]
File size [MB]
ChallengeAnswerChallenge
0
10
20
30
40
50
60
70
80
1 2 8 16 32
Late
ncy
[s]
File size [MB]
UserVerify
Ludovic Barman (EPFL) PoR2 August 27, 2015 28 / 54
Proposed solutionsAdditive scheme - Security Analysis (1)
Puzzle hardness1 Puzzle of the form pE mod N, E N, which is inherently a sequential
process that is hard to parallelize
2 Hence, multi-threaded server have no significant advantage overmono-threaded ones
3 Assuming a server with bounded number of threads, we can detect for lbig enough (puzzles starts to queue up)
Extractability
1 Each successful run, the user learns s equations µj ←∑
(i ,vi )∈Q vi ·m(r)i ,j
2 With enough equations, we apply standard Gaussian elimination
3 With x successful run, x ≥ l · r , we can recover r replicas
Ludovic Barman (EPFL) PoR2 August 27, 2015 29 / 54
Proposed solutionsAdditive scheme - Security Analysis (2)
Storage allocation1 Suppose a malicious server that stores correctly a percentage ρ of a file
2 Number of puzzles to be recomputed per challenge l · s · (1− ρ)
3 Let the puzzle time be ∆Puzzle, the extra time needed to answer is
l · s · (1− ρ) · ∆Puzzle
4 This needs to be tuned given the expected response time of an honestserver
Correct replication1 Correct by design
Ludovic Barman (EPFL) PoR2 August 27, 2015 30 / 54
∏Multiplicative Scheme
Ludovic Barman (EPFL) PoR2 August 27, 2015 31 / 54
Proposed solutionsMultiplicative scheme
We replace1 every addition with a multiplication2 every multiplication with a exponentiation
... which preserves the homomorphic properties of the scheme.
As a result :1 everything is slightly slower2 but we can use a speed-up in UserVerify
Ludovic Barman (EPFL) PoR2 August 27, 2015 32 / 54
Proposed solutionsMultiplicative scheme - Security Analysis
Puzzle hardnessExtractability
1 Each successful run, the user learns s equations µj ←∏
(i ,vi )∈Q m(r)i ,j
vi
2 This time, the solutions are not unique (2 or 4 possibilities per m(r)i ,j )
3 It will be fixed in Mirror by forcing each sector to include arecognizable pattern
Storage allocation
Correct replication
Ludovic Barman (EPFL) PoR2 August 27, 2015 33 / 54
Proposed solutionsMultiplicative scheme - Performance
0
50
100
150
200
1 2 8 16 32
Late
ncy
[s]
File size [MB]
Store
0
50
100
150
200
250
1 2 8 16 32
Late
ncy
[min
]
File size [MB]
Replicate
0
1
2
3
4
5
6
7
1 2 8 16 32
Late
ncy
[s]
File size [MB]
ChallengeAnswerChallenge
0
5
10
15
20
1 2 8 16 32
Late
ncy
[s]
File size [MB]
UserVerify
Ludovic Barman (EPFL) PoR2 August 27, 2015 34 / 54
⋃Scheme with precomputation
Ludovic Barman (EPFL) PoR2 August 27, 2015 35 / 54
Proposed solutionsScheme with precomputation
We now try to play with the replication time.We rearrange the previous n · s puzzles in :
1 1 precomputation (hopefully tunable as we want)
2 n · s fast computations based on the precomputation
1 We fill a lookup-table with powers of pE x , 0 < x ≤ Λ, withΛ = n · s · λ
2 In the multiplicative scheme, we replace the puzzle
pE (i−1)·s+j
with ∏ik∈ Ω(i ,j)
pE ik
where Ω is a PRF that yield w values in [1, Λ]Ludovic Barman (EPFL) PoR2 August 27, 2015 36 / 54
Proposed solutionsScheme with precomputation - Performance
0
50
100
150
200
1 2 8 16 32
Late
ncy
[s]
File size [MB]
Store
0
2
4
6
8
10
12
14
16
1 2 8 16 32
Late
ncy
[min
]
File size [MB]
Total ReplicationPrecomputation
0
1000
2000
3000
4000
5000
1 2 8 16 32
Late
ncy
[ms]
File size [MB]
ChallengeAnswerChallenge
0
5
10
15
20
25
1 2 8 16 32
Late
ncy
[s]
File size [MB]
UserVerify
Ludovic Barman (EPFL) PoR2 August 27, 2015 37 / 54
Proposed solutionsScheme with precomputation - Security
1 Unfortunately, the malicious cloud provider has an efficient way ofcheating
2 He computes a partial lookup table pEk·x , k ∈N, 0 < x ≤ Λ
3 He can freely choose k as a tradeoff between storage and time toanswer a challenge
4 With k sufficiently high, the lookup table is smaller than a replica5 To recompute a missing value pE x requires at most k exponentiations
power E — hard to detect
Ludovic Barman (EPFL) PoR2 August 27, 2015 38 / 54
Mirror
Ludovic Barman (EPFL) PoR2 August 27, 2015 39 / 54
MirrorIntroduction
1 We use the same construction as before (multiplicative scheme)2 We change the way of producing the puzzles3 We use a combination of LFSR and Time-lock puzzles
Ludovic Barman (EPFL) PoR2 August 27, 2015 40 / 54
MirrorLFSR
Stands for Linear Feedback Shift Register
Produces a random-looking sequence from an initial state S and afeedback function F
F
s0 s1 s2 s3 s4 s5 s6 s7
s8
Under certain conditions, a LFSR can produce the same outputsequence than a shorter one
We give the user the short LFSR, and the server the long one
We increase the asymmetry by giving the elements mapped to x → gx ,and making the server compute exponentiations
Ludovic Barman (EPFL) PoR2 August 27, 2015 41 / 54
MirrorLFSR
ClientLFSR of length λ = 3
ServerLFSR of length λ2 = 5
F : s4 = a1 · s1 + a2 · s2 + a3 · s3F ′ : s6 = a′1 · s1 + a′2 · s2 + a′3 · s3 +a′4 · s4 + a′5 · s5
The client generates an element g , and sends ti ← g si instead of theinitial state si to the server.
F : t4 = ga1·s1+a2·s2+a3·s3 F ′ : t6 = t1a′1 · t2a′2 · t3a′3 · t4a′4 · t5a′5
Both (L)FSR produce the same sequence, but one takes more resources
The asymmetry is further increase when computing the product ofoutput element
Ludovic Barman (EPFL) PoR2 August 27, 2015 42 / 54
Proposed solutionsLFSR
The output sequence of the server LFSR is be the puzzle / blindingfactor
We use two LFSR pairs
The two server LFSR output sequence are called gi and hi
The new sector is computed as follow
m(r)i ,j ← mi ,j · g(i−1)s+j · hns−(i−1)s−j
One sequence is going forward, the other backwards
Ludovic Barman (EPFL) PoR2 August 27, 2015 43 / 54
MirrorProtocol (1)
Client
Store (File F , kPRF):
Using the PRF :... Pick ε1 · · · εs from Zφ(N)
For i ∈ [1, n] :... σi ←
∏sj=1 mi ,j
εj
τU ← (kPRF, n)Keep τU , σi, may delete mi ,j
mi ,j, σi−−−−−−−−−−−−−−−−B Server
Ludovic Barman (EPFL) PoR2 August 27, 2015 44 / 54
MirrorProtocol (2)Server
Replicate (r) :For all i ∈ [1, n], j ∈ [1, s ] : m(r)
i ,j ← mi ,j · g (r)(i−1)s+j
For all i ∈ [n, 1], j ∈ [s, 1] : m(r)i ,j ← m(r)
i ,j · h(r)(i−1)s+j
Client
Challenge:Choose l indices i1 · · · il in [1, n]Pick vi1 · · · vil at random in Zp
Q = (ik , vik ) ∀ k ∈ [1, l ] Q−−−−−−−−−−−−−−−−B
Ludovic Barman (EPFL) PoR2 August 27, 2015 45 / 54
MirrorProtocol (3)Server
σ ←∏
(i ,vi )∈Q
(σi ·
∏sj=1
∏r∈[1,R ] m
(r)i ,j
)vi
∀j ∈ [1, s ],... µj ←
∏(i ,vi )∈Q m(r)
i ,jvi
σ, µj−−−−−−−−−−−−−−−−B
Client
LHS← σ ·∏
(i ,vi )∈Q
(∏sj=1
∏r∈k g
(r)(i−1)s+j · h
(r)(i−1)s+j
)−vi
RHS←∏
j=1 µjεj+‖k‖
Accepts if LHS = RHS
Ludovic Barman (EPFL) PoR2 August 27, 2015 46 / 54
Proposed solutionsMirror - Security Analysis
Puzzle hardness1 The demonstration is outside the scope of this thesis
2 There is a formal proof which gives a lower bound on the puzzlecomputation time
3 This lower bound depends on the biggest LSFR coefficient, WLOG thefirst one α1
4 Even with this pessimistic lower bound, this time is sufficiently high tobe detectable by the user
Extractability - as in the multiplicative scheme
Storage allocation - as in the additive scheme
Correct replication - correct by design
Ludovic Barman (EPFL) PoR2 August 27, 2015 47 / 54
Proposed solutionsMirror - Performance (1)
0
20
40
60
80
100
120
140
160
180
1 2 8 16 32
Late
ncy
[s]
File size [MB]
Store
0
20
40
60
80
100
120
140
160
180
1 2 8 16 32
Late
ncy
[min
]
File size [MB]
Replicate
0
1
2
3
4
5
6
7
1 2 8 16 32
Late
ncy
[s]
File size [MB]
ChallengeAnswerChallenge
UserVerify
Ludovic Barman (EPFL) PoR2 August 27, 2015 48 / 54
Proposed solutionsMirror - Performance (2)
The performance are now reasonable, especially in terms ofAnswerChallenge and UserVerify
We will compare the scheme against the scheme from Curtmola et al,MR-PDP
Ludovic Barman (EPFL) PoR2 August 27, 2015 49 / 54
Proposed solutionsMirror - Performance (3)
0
1000
2000
3000
4000
5000
6000
7000
1 2 8 16 32 64 128 1024
Late
ncy
[s]
File size [MB]
MR-PDP (Store)Mirror (Store)
500
1000
1500
2000
2500
3000
3500
4000
1 2 8 32 64
Late
ncy
[s]
r
Mirror (Replicate)
0
50
100
150
200
250
300
350
400
1 2 8 32 64
Late
ncy
[s]
r
MR-PDP (Replicate)
0
1
2
3
4
5
6
1 2 8 32 64
Late
ncy
[s]
r
MR-PDP (UserVerify)MR-PDP (AnswerChallenge)
Mirror (UserVerify)Mirror (AnswerChallenge)
Ludovic Barman (EPFL) PoR2 August 27, 2015 50 / 54
Proposed solutionsMirror - Choice of parameters
0
0.5
1
1.5
2
2.5
3
0 40 80 120 160 200
Late
ncy
[s]
|α1*|
0
200
400
600
800
1000
1200
10 20 30 40 50 60 70 80 90 100
Tim
e to
rep
licat
e 64
MB
[s]
|α1*|
λ*=5λ*=15λ*=40λ*=60
The bitsize of α1 (and the size of the server LFSR) has an impact onthe replication time
The bitsize of α1 has a big impact on the response time of amisbehaving cloud provider (that only stores 90% of the file)
Ludovic Barman (EPFL) PoR2 August 27, 2015 51 / 54
Table of Contents
1 Introduction
2 Related work
3 Model
4 Proposed solutions
5 Conclusion
Ludovic Barman (EPFL) PoR2 August 27, 2015 52 / 54
Conclusion
In this work,1 we proposed four novel solutions that allow users to check the
retrievability and redundancy of their files2 we analyzed, implemented, and benchmarked them in a realistic
settings, before discussing their strengths and weaknesses3 the selected solution, Mirror, incurs tolerable overhead and secure in
the chosen model4 All those solutions make the cloud replicate the data, lessening the
burden on users, and preventing abuse from the users
Ludovic Barman (EPFL) PoR2 August 27, 2015 53 / 54
References I
Shacham H., Waters B.
Compact proofs of retrievability.
Advances in Cryptology-ASIACRYPT 2008, 2008 - Springer.
Curtmola, R., Khan, O., Burns, R., Ateniese
MR-PDP: Multiple-replica provable data possession.
Distributed Computing Systems, 2008. ICDCS’08.
Ludovic Barman (EPFL) PoR2 August 27, 2015 54 / 54