Key Recovery Using Noised Secret Sharing with Discounts Over Large Clouds

Sushil Jajodia (1), W. Litwin (2) & Th. Schwarz (3)

(1) George Mason University, Fairfax, VA {jajodia@gmu.edu}
(2) Université Paris Dauphine, Lamsade {witold.litwin@dauphine.fr}
(3) Thomas Schwarz, UCU, Montevideo {tschwarz@ucu.edu.uy}

What ?
• New schemes for encryption key recovery
• A lost key is always recoverable in practical time
– From a specifically encrypted backup
• Using a Noised Secret Sharing Scheme (NS)
– We proposed such schemes in 2012

What ?
• The schemes guaranteed:
1. Recoverability in practical time of any key backed up using an NS-scheme
• E.g., in 10 min max
• But only provided a large cloud is available: 1K – 1M nodes

What ?
• Such a cloud is a scarce & usually external resource
• Possibly expensive & hard to use illegally
o Money trail, traces in many logs…
• The constraint should make backup disclosure attempts rare
• But not impossible

What ?
• After all, there is also no guarantee that someone does not run a red light right into you
• Never mind, you still move around, don't you ?
2. The schemes also guaranteed that no other scheme recovers the key faster than the NS-scheme on the same cloud

What ?
• We now enhance the proposed schemes:
• The key owner escrowing the key backup may retain a secret discount (code)
• Discounted recovery may be orders of magnitude less complex than the full-cost (brute force) recovery
– The latter was the only possibility for the 2012 schemes
• E.g., 256 times less complex for a 1-byte (1B) code

What ?
• Less complex means:
– Fewer computations altogether
– Faster recovery on the same cloud
• E.g., 256 times for a 1B code
– Smaller cloud for the same time
• E.g., 256 times for a 1B code
• In every case, cheaper in cloud cost
– E.g., up to 256 times for a 1B code

What ?
• We present schemes for discounted recovery
• These naturally also allow the full-cost recovery
• But even in this case, these are not the same schemes as our earlier ones

Why
• Currently, it's better to encrypt the outsourced data
• Using client-side encryption
– See news
• Especially Big Data
– If you count on server-side encryption: good luck
– Also see news

Why
• You may also enjoy songs about privacy
– By renowned CS Prof Th. Imielinski (Rutgers U)
– http://www.system-crash.com/mp3/09-System%20Crash%20_%20Privacy.mp3
– http://www.system-crash.com/mp3/HackersRevenge/05_SystemCrash_HackersRevenge_WeWillForgiveYourFuture.mp3

Why
• The danger of key loss is the Achilles' heel of modern crypto
– Makes many folks refrain from any encryption
• With occasionally known disclosure consequences
– Others shed many tears if the unthinkable happens

Why
• Frequent causes of key loss
– Corrupted or lost memory & no backup
– Sad loss of the key owner with all her/his secrets
– Disgruntled employee departed with the key & backups

Why
• Frequent causes of key disclosure
– Hacking of one among many (simple) backup copies at the owner's site
– Sad loss of the key owner with all her/his secrets
– Disgruntled employee departed with the key & backups

Why
• Frequent causes of key disclosure
– Dishonest super-user / system administrator
• Mandatory escrow of any key copy in many companies
– Insider attack within an escrow service

Why
• Frequent consequences of key loss
– Valuable company data
• Possibly years of work by many people
– Unpublished valuable manuscripts
• Personally known case
– Possibly critical personal health data
• Time-series of CAT scans, ECGs, blood analyses, med. treatments…

Why
• Frequent consequences of key disclosure
– Listen to news

How ? Backup Encryption
• Client (key owner) X backs up encryption key K
• X estimates the 1-node inhibitive recovery time D
– Say 70 days
• D measures trust in the Escrow
– Lesser trust ?
• Choose 700 days

Client Side Encryption
• D determines the minimal cloud size N for future recovery in any acceptable calculus time R
– Chosen by the recovery requestor
• E.g., 10 min
– X expects N > D / R but also N ≈ D / R
• E.g., N ≈ 10K for D = 70 days
– N ≈ 100K for D = 700 days
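The relation between D, R and N can be checked with a few lines of arithmetic. This is only a sketch: the function name `min_cloud_size` is ours, and it simply rounds D / R up to a whole number of nodes.

```python
import math

DAY = 24 * 3600  # seconds per day


def min_cloud_size(d_seconds: float, r_seconds: float) -> int:
    """Smallest whole node count N with N >= D / R."""
    return math.ceil(d_seconds / r_seconds)


R = 10 * 60  # 10 min, in seconds
n_70 = min_cloud_size(70 * DAY, R)    # D = 70 days
n_700 = min_cloud_size(700 * DAY, R)  # D = 700 days
print(n_70, n_700)
```

Both values land close to the 10K and 100K figures quoted on the slide.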

Client Side Encryption
• X creates a classical shared secret for K
– Sees K as a large integer
• E.g., 256b long for AES
– Basically, X creates a 2-share secret
– Share s0 is a random integer
– Share s1 is s1 = s0 XOR K
• Common knowledge:
– K = s0 XOR s1
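The 2-share XOR secret can be sketched as follows. The function names are illustrative, and the 256-bit width follows the AES example above.

```python
import secrets

KEY_BITS = 256  # key width from the AES example


def split_key(k: int) -> tuple:
    """Return shares (s0, s1) with s0 random and s1 = s0 XOR K."""
    s0 = secrets.randbits(KEY_BITS)
    s1 = s0 ^ k
    return s0, s1


def join_shares(s0: int, s1: int) -> int:
    """Recover K = s0 XOR s1."""
    return s0 ^ s1


key = secrets.randbits(KEY_BITS)
s0, s1 = split_key(key)
recovered = join_shares(s0, s1)
```

Either share alone is a uniformly random integer, so both are needed to rebuild K.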

Client Side Encryption
• X transforms the shared secret into a noised one
– X makes s0 a noised share:
• Chooses a 1-way hash H
– E.g., SHA 256
• Computes the hint h = H (s0)
– Chooses the noise space I = 0, 1…, m,… M-1
– For some large M determined as we explain

Client-Side Encryption

– Each noise m and s0 define a noise share s
• In a way we show soon as well
– There are M different pseudo random noise shares
• All but one are different from s0
• But it is not known which one is s0
– The only way to find for any s whether s = s0 is to attempt the match H (s) ?= h
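A match attempt H (s) ?= h might look like the sketch below, using SHA-256 from Python's standard library. Encoding a share as a 32-byte big-endian string is our assumption; the slides do not say how shares are serialized.

```python
import hashlib

SHARE_BYTES = 32  # assumed serialization width (256-bit shares)


def H(s: int) -> bytes:
    """One-way hash of a share, here SHA-256 over its big-endian bytes."""
    return hashlib.sha256(s.to_bytes(SHARE_BYTES, "big")).digest()


def matches(s: int, h: bytes) -> bool:
    """One match attempt: does candidate share s hash to the hint h?"""
    return H(s) == h


s0 = 42          # toy share, for illustration only
hint = H(s0)     # the hint stored in the backup
```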

Noised Secret Sharing

Client Side Encryption

• X estimates the 1-node throughput T
– # of match attempts H (s) ?= h per time unit
• 1 sec by default
• X sets M to M = Int (DT).
– M may be expected to be 2^40 ÷ 2^50 in practice
• M determines the worst-case and the average complexity of the recovery
• Respectively O(M) and O(M / 2)
• As we will show
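As a rough numerical check of M = Int (DT), the sketch below assumes a throughput of one million match attempts per second. That value is a hypothetical figure chosen for illustration, not one given in the slides.

```python
DAY = 24 * 3600  # seconds per day


def noise_space_size(d_seconds: float, t_per_second: float) -> int:
    """M = Int(D * T): match attempts one node can make in time D."""
    return int(d_seconds * t_per_second)


T = 1_000_000                      # assumed 1-node throughput (attempts / sec)
M = noise_space_size(70 * DAY, T)  # D = 70 days
worst_case_attempts = M            # O(M)
average_attempts = M // 2          # O(M / 2)
print(M, M.bit_length())
```

Under this assumed T, M needs 43 bits, which falls inside the range quoted above.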

Client Side Encryption

• X forms the base noise share f = p | 0
• Hides s0 as the noised share s0n = (f, M, h) in its noise share space [f, f + M[
• Sends the noised shared secret S' = (s0n, s1) to Escrow as the key backup

Client Side Encryption : Discount Code

• X may define some discount code d
• Basically, d is the m-bit suffix of s0
• In practice, we expect m = 8, 16.
• Represented in ASCII or Unicode, d is then simply one or two characters, e.g., 'j'.
• X retains d secretly for the future recovery request(or)

Discount Code Action
o 2^m times reduced noise space I ; M' = M / 2^m
o 2^m times reduced noise share space S'
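One way to picture the effect of the discount is shown below. It is a sketch under the assumption that the code d is used as a filter on the m low-order bits of each candidate share; the helper names are ours.

```python
def discounted_space_size(m_size: int, m_bits: int) -> int:
    """M' = M / 2^m."""
    return m_size // 2 ** m_bits


def discounted_candidates(f: int, m_size: int, d: int, m_bits: int) -> range:
    """Noise shares in [f, f + M[ whose m-bit suffix equals d."""
    step = 1 << m_bits
    first = f + ((d - f) % step)  # first share >= f ending in d
    return range(first, f + m_size, step)


d = ord("j")  # 1-byte discount code, as in the 'j' example
candidates = discounted_candidates(f=1000, m_size=1024, d=d, m_bits=8)
print(list(candidates))
```

Only M' of the M noise shares remain to be tried, here 4 instead of 1024.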

Escrow-Side Recovery (Backup Decryption)

• Escrow E receives a legitimate request for recovery of S in time R at most
• E sends data S" = (s0n, R) to some cloud node C with a request for processing accordingly
– Keeps s1 out of the cloud
• C will choose between static or scalable recovery schemes

Recovery Processing Parameters

• Node load Ln : # of noises among M assigned to node n for match attempts

• Throughput Tn : # of match attempts node n can process / sec

• Bucket (node) capacity Bn : # of match attempts node n can process / time R
– Bn = R Tn
• Load factor αn = Ln / Bn

[Figure: node load L, illustrating overload, normal load and optimal load]

Recovery Processing Parameters

• Notice the data storage oriented vocabulary
• Node n respects R iff αn ≤ 1
– Assuming T constant during the processing
• The cloud respects R if for every n we have αn ≤ 1
• This is our goal
– For both the static and scalable schemes we now present

Static Scheme

• Intended for a homogeneous cloud
– All nodes provide the same throughput

Static Scheme Full-Cost Recovery Init Phase

• Node C that got S" from E becomes the coordinator
• Calculates α(M) = M / B (C)
– Usually α(M) >> 1
• Defines N as ⌈α(M)⌉
– Implicitly considers the cloud as homogeneous
• E.g., N = 10K or N = 100K in our ex.
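The processing parameters and the init-phase computation fit in a few lines. In this sketch the function names are ours, N is taken as the load factor rounded up, and the throughput of one million attempts per second is a hypothetical value.

```python
def capacity(r_seconds: int, t_n: int) -> int:
    """Bucket capacity Bn = R * Tn."""
    return r_seconds * t_n


def load_factor(l_n: int, r_seconds: int, t_n: int) -> float:
    """Load factor: node load Ln divided by capacity Bn."""
    return l_n / capacity(r_seconds, t_n)


def static_cloud_size(m_size: int, r_seconds: int, t_c: int) -> int:
    """N = ceil(M / B(C)), using the coordinator's own throughput."""
    b = capacity(r_seconds, t_c)
    return max(1, (m_size + b - 1) // b)  # integer ceiling


M = 6_048_000_000_000  # D = 70 days at the assumed T
R = 600                # 10 min
T_C = 1_000_000        # assumed coordinator throughput (attempts / sec)

alpha_m = load_factor(M, R, T_C)   # load factor of a single node
N = static_cloud_size(M, R, T_C)
print(alpha_m, N)
```

With M spread evenly over N nodes, each node load M / N stays within the capacity B, so every load factor is at most 1.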

Static Scheme Full-Cost Recovery Map Phase

• C asks for allocation of N-1 nodes
• Associates logical address n = 1, 2…N-1 with each new node & 0 with itself
• Sends out to every node n data (n, a0, P)
– a0 is its own physical address, e.g., IP
– P specifies the Reduce phase

Static Scheme Full-Cost Recovery Reduce Phase

• P requests node n to attempt matches for every noise share s = (f + m) such that n = m mod N
• In practice, e.g., while m < M:
– Node 0 loops over noise m = 0, N, 2N…
• So over the noise shares f, f + N, f + 2N…
– Node 1 loops over noise m = 1, N+1, 2N+1…
– …..
– Node N – 1 loops over m = (your guess here)
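The Reduce phase can be simulated sequentially on one machine, as sketched below. The names and the toy sizes are ours, the 40-byte share serialization is an assumption, and a real deployment would run each `node_reduce` call on its own cloud node.

```python
import hashlib


def H(s: int) -> bytes:
    # Assumed serialization: 40 bytes, so that f + m cannot overflow.
    return hashlib.sha256(s.to_bytes(40, "big")).digest()


def node_noises(n: int, n_nodes: int, m_size: int) -> range:
    """Noises m < M with n = m mod N: n, n + N, n + 2N, ..."""
    return range(n, m_size, n_nodes)


def node_reduce(n, n_nodes, f, m_size, h):
    """Node n tries its noise shares f + m; returns s0 or None."""
    for m in node_noises(n, n_nodes, m_size):
        if H(f + m) == h:
            return f + m
    return None


def static_recover(f, m_size, h, n_nodes):
    """Run every node in turn and collect the successful matches."""
    results = (node_reduce(n, n_nodes, f, m_size, h) for n in range(n_nodes))
    return [s for s in results if s is not None]


s0 = 123_456_789              # toy share
f, M, N = s0 - 777, 1000, 7   # toy base share, noise space and cloud size
hits = static_recover(f, M, H(s0), N)
```

Exactly one node reports a hit, since the loops of the N nodes partition the noise space.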

Static Scheme : Node Load for Full Cost Recovery

Node 0: f, f + N, f + 2N, …
Node 1: f + 1, f + N + 1, f + 2N + 1, …
Node 2: f + 2, f + N + 2, f + 2N + 2, …
…
Node N - 1: f + N - 1, f + 2N - 1, f + 3N - 1, …

Static Scheme : Termination
• Node n that gets the successful match sends s to C
• Otherwise node n terminates locally
• C asks every node to terminate
– Details depend on the actual cloud
• C forwards s as s0 to E

Static Scheme : Termination
• E discloses the secret S and sends S to the Requestor
– Bill included (we guess)
• E.g., up to 400$ on CloudLayer for
– D = 70 days
– R = 10 min
– Both implied N = 10K with the private option

Static Scheme

• Observe that N ≥ D / R and N ≈ D / R
– If the initial estimate of T by the S owner holds
• Observe also that for every node n, we have α(n) ≤ 1
– Provided the initial estimate of T by C holds in time

Static Scheme

• Under the above assumptions, the maximal recovery time is indeed R
– See the papers for details
• Average recovery time is R / 2
– Every noise share is indeed equally likely to be the lucky one

Static Scheme : Discounted Recovery
• Noise m < M' = M / 2^m, where the exponent m is the bit length of d
[Figure: three regions, each labeled M']

Static Scheme : Discounted Recovery

• If full-cost recovery needs 10K nodes for R = 10 min max. recovery calculus time and m = 8 then:

• The 10 min max. recovery requires only a 40-node wide cloud
• The 10K-node wide cloud suffices for R = 2.3 sec
• A 1K-node cloud suffices for a recovery time up to 25 sec
• Etc.
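These figures follow from dividing the full-cost work by 2^m = 256. A quick check, assuming the work scales as nodes times time:

```python
import math

FULL_NODES = 10_000   # nodes for full-cost recovery
R = 10 * 60           # 10 min, in seconds
DISCOUNT = 2 ** 8     # m = 8

# Same 10 min deadline, smaller cloud.
nodes_for_10_min = math.ceil(FULL_NODES / DISCOUNT)

# Same 10K-node cloud, shorter deadline (seconds).
time_on_10k_nodes = R / DISCOUNT

# A 1K-node cloud: ten times fewer nodes, so ten times longer (seconds).
time_on_1k_nodes = R * FULL_NODES / (1_000 * DISCOUNT)

print(nodes_for_10_min, time_on_10k_nodes, time_on_1k_nodes)
```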

Static Scheme

• See papers for
– Details
– Numerical examples
– Proof of correctness
• The scheme really partitions I
• Whatever N and s0 are, one and only one node finds s0

Static Scheme
• Safety
– No disclosure method can in practice offer the (1-node) calculus faster than the scheme
– Dictionary attack, inverted file of hints…
• Other properties

Scalable Scheme

• Heterogeneous cloud
– Throughput may differ among the nodes

Scalable Scheme

• Intended for heterogeneous clouds
– Different node throughputs
– Basically only locally known
• E.g.
– Private or hybrid cloud
– Public cloud without the so-called private node option

Scalable Scheme

• Each node n : n = 0,1… recursively splits its load through hash partitioning
• Until its load factor αn decreases to αn ≤ 1
• See our previous papers for details

Scalable Scheme
• Init phase similar up to α(M) for full cost or α(M') for discounted recovery calculus
– Basically α(M) and α(M') >> 1
– Also we note it now α0
• If α > 1 we say that the node overflows
• Node 0 then sets its level j to j = 0 and splits
– Requests node 2^j = 1
– Sets j to j = 1
– Sends to node 1, (S", j, a0)

Scalable Scheme
• As a result
– There are N = 2 nodes
– Both have j = 1
– Node 0 and node 1 should each process M / 2 match attempts
• We show precisely how on the next slides
– Iff both α0 and α1 are no more than 1
• Usually it should not be the case
• The splitting should continue as follows

Scalable Scheme
• Recursive rule
– Each node n splits until αn ≤ 1
– Each split increases node level jn to jn + 1
– Each split creates new node n' = n + 2^jn
– Each node n' gets jn' = jn initially
• Node 0 thus splits perhaps into nodes 1, 2, 4…
• Until α0 ≤ 1
• Node 1 starts with j = 1 and splits into nodes 3, 5, 9…
• Until α1 ≤ 1

Scalable Scheme

• Node 2 starts with j = 2 and splits into 6, 10, 18…
• Until α2 ≤ 1
• Your general rule here
• Node with smaller T splits more times and vice versa
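The recursive splitting rule can be simulated on a single machine. The sketch below rests on several assumptions of ours: a node at level j carries a load of about M / 2^j, each split uses the level held before the split to address the new node (which reproduces the 1, 2, 4… and 3, 5, 9… sequences above), and `throughput` is a hypothetical function giving each node's match attempts per second.

```python
from collections import deque


def scalable_split(m_size, r_seconds, throughput):
    """Return {node address: final level j} once every load factor is <= 1."""
    levels = {0: 0}
    pending = deque([0])
    while pending:
        n = pending.popleft()
        j = levels[n]
        # Overflow test: load M / 2^j exceeds capacity R * Tn.
        while m_size / 2 ** j > r_seconds * throughput(n):
            child = n + 2 ** j   # new node n' = n + 2^jn
            j += 1               # level jn becomes jn + 1
            levels[child] = j    # child starts at the parent's new level
            pending.append(child)
        levels[n] = j
    return levels


# Homogeneous toy cloud: every node handles 130 attempts / sec.
homogeneous = scalable_split(1000, 1, lambda n: 130)

# Heterogeneous toy cloud: node 1 is half as fast as the others.
heterogeneous = scalable_split(1000, 1, lambda n: 65 if n == 1 else 130)

print(sorted(homogeneous), sorted(heterogeneous))
```

In the heterogeneous run the slower node 1 splits once more than its peers, so address 9 appears while address 8 does not.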

Scalable Scheme : Splitting

[Figure: node 0 and nodes 1, 2, 4, 8, 16, annotated with load factors α = 0.5, α = 0.7 and α = 0.8]

Node 0 split 5 times. Other nodes did not split.

Scalable Scheme
• If the cloud is homogeneous, the address space is contiguous
• Otherwise, it is not
– No problem
– Unlike for an extensible or linear hash data structure

Scalable Scheme : Reduce phase

• Every node n attempts matches for every noise k ∈ [0, M-1] such that n = k mod 2^jn.
• If node 0 splits three times, in the Reduce phase it attempts to match noised shares (f + k) with k = 0, 8, 16…
• If node 1 splits three times, it attempts to match noised shares (f + k) with k = 1, 17, 33…
• Etc.
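The assignment of noises to nodes in the scalable Reduce phase can be sketched as below. The `levels` table is a hypothetical outcome of the splitting for a small heterogeneous cloud, written by hand for illustration.

```python
def node_noises(n: int, level: int, m_size: int) -> range:
    """Noises k in [0, M-1] with n = k mod 2^level."""
    return range(n, m_size, 2 ** level)


# Node 0 after three splits has level 3; node 1, starting at level 1,
# has level 4 after three splits.
node0 = list(node_noises(0, 3, 40))
node1 = list(node_noises(1, 4, 40))

# Hypothetical final levels: node 1 split once more than the others.
levels = {0: 3, 1: 4, 2: 3, 3: 3, 4: 3, 5: 3, 6: 3, 7: 3, 9: 4}

M = 100
owners = [0] * M  # how many nodes try each noise
for n, j in levels.items():
    for k in node_noises(n, j, M):
        owners[k] += 1
```

Every noise ends up with exactly one owner, even though the address space is not contiguous.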

Scalable Scheme : Reduce Phase

[Figure: node loads in the Reduce phase, α = 0.5]
Node 0: f, f + 8, f + 16, …
Node 1: f + 1, f + 17, f + 33, …

Scalable Scheme
• N ≥ D / R
– If the S owner's initial estimate holds
• For a homogeneous cloud, N is 30% greater on average and twice as big at worst, compared to the static scheme
• Cloud cost may still be lower
– No need for the private option
• Besides, versatility may still make it preferable

Scalable Scheme
• Max recovery time is up to R
– Depends on the homogeneity of the cloud
• Average recovery time is up to R / 2
• See again the papers for
– Examples
– Correctness
– Safety
– …
– Detailed perf. analysis remains future work

Multi-share noised secret sharing
• For max recovery time R arbitrarily close to the average one : R = Rave * (k + 1) / k for M = D T / k ; M' = M / 2^m

Multi-share noised secret sharing
• We now have
– R = Rave * (k + 1) / k
– Welcome capability for some
• Drawback: the discount code becomes randomly valued d = (d0…dk-1)
– E.g., d = (j,…,a) in the figure
• Each di is m-bit wide
• Perhaps harder to retain for k > 4
• See the paper for paths to solutions
• See previous papers for scheme analysis
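The ratio between the maximal and the average recovery time, R = Rave * (k + 1) / k, shrinks quickly as k grows. A small illustration, with k taken as the number of noised shares (our reading of the formula):

```python
def max_over_average(k: int) -> float:
    """R / Rave = (k + 1) / k."""
    return (k + 1) / k


ratios = {k: max_over_average(k) for k in (1, 2, 4, 10)}
print(ratios)
```

For k = 1 the ratio is 2, consistent with the single-share average of R / 2; for k = 10 the maximum exceeds the average by only 10%.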

Related Work

• RE scheme for outsourced LH* files
• CSCP scheme for outsourced LH* records sharing
• Crypto puzzles
• One way (hash) functions with trapdoors
• 30-year old excitement around Clipper chip
• Botnets
• Snowden's Leaks (so-called)

Conclusion
• NS-schemes with discounts effectively relieve the fear of encryption key loss/disclosure
– Achilles' heel of cryptography
• Key is always recoverable from its backup
• Especially easily through a discounted recovery

Conclusion
• The backup disclosure can be made as hard as desired
• So it can also be arbitrarily unlikely
• But it always remains a possibility
• Like a driver running a red light right into us
• Despite all precautions
• But we still drive our cars, don't we ?

Conclusion
• To the best of our knowledge, our technique is the only one with the discussed properties at present
• It appears of interest to many apps
• It should especially help potential Big Data users
• Many of whom were deterred up to now
• By potentially catastrophic consequences of outsourced data loss / disclosure

Conclusion
• NS sharing is a generalization of the well-known secret sharing
• An arbitrary additional cost C still protects the secret, even if one gets all the shares
– Lowered at will by the discounts
• The "classical" scheme amounts to C = 0
• NS sharing appears of general interest for cryptography
– Beyond our current use

Conclusion
• Consider a company secret split into shares among some insiders
• With NS sharing, the secret is still arbitrarily protected even if collusion / hacking among the insiders happens
– So that someone discloses all the shares
• Not with the classical algorithm

Conclusion
• The scalable partitioning scheme also appears to be a general purpose tool
• Could be called Scalable Map-Reduce
• Apparently a failed dream of Amazon's cloud service
– Under the name of Elastic Map-Reduce

Future work
• Deeper formal analysis
• Proof of concept implementation
– Started using the popular open source Vagrant VM for the scalable scheme
• Experiments show 2.7s on average for a node to (re)start Vagrant on another node so that it is ready to process the data sent in.

Future work
• One should also test the static scheme on some easily available Map-Reduce implementation
• E.g., Amazon EMR in the commercial world

Future work
• Nevertheless, all things considered, the static scheme with discounts appears ready for use
• All 'bricks' are already out there
• Ready to be assembled

Thanks for Your Attention

Witold LITWIN & al
