correcting localized deletions using guess & check...

22
Correcting Localized Deletions Using Guess & Check Codes Salim El Rouayheb Joint work with Serge Kas Hanna and Hieu Nguyen Rutgers University 55th Annual Allerton Conference on Communication, Control, and Computing

Upload: others

Post on 26-Sep-2020

11 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Correcting Localized Deletions Using Guess & Check Codeseceweb1.rutgers.edu/~csi/Allerton_GC.pdf · v Guess 2: v Guess 3: 8 4 7 10 8 4 7 10 • Decoding failure: more than one possible

CorrectingLocalized DeletionsUsingGuess&CheckCodes

SalimElRouayheb

JointworkwithSergeKas HannaandHieu Nguyen

RutgersUniversity

55thAnnualAllertonConferenceonCommunication, Control,andComputing

Page 2: Correcting Localized Deletions Using Guess & Check Codeseceweb1.rutgers.edu/~csi/Allerton_GC.pdf · v Guess 2: v Guess 3: 8 4 7 10 8 4 7 10 • Decoding failure: more than one possible

Motivation

2

• Ourmotivation:filesynchronization,E.g.Dropbox

Alice Bob

10101010…

• Recentapplication:DNA-basedstorage

• DeletionswerefirststudiedbyVarshamov-Tenengolts (‘65)andLevenshtein (‘66)

110010…

• Deletions:10101010 110010Transmitted Received

Page 3: Correcting Localized Deletions Using Guess & Check Codeseceweb1.rutgers.edu/~csi/Allerton_GC.pdf · v Guess 2: v Guess 3: 8 4 7 10 8 4 7 10 • Decoding failure: more than one possible

LocalizedDeletions

3

• Motivation:filesynchronization,E.g.Dropbox

Localizededits

Page 4: Correcting Localized Deletions Using Guess & Check Codeseceweb1.rutgers.edu/~csi/Allerton_GC.pdf · v Guess 2: v Guess 3: 8 4 7 10 8 4 7 10 • Decoding failure: more than one possible

PreviousWorkonDeletions

4

Ø Unrestricteddeletions

Ø Bursty deletions

• Filesynchronization:[Maetal.‘11]

• Codeconstructions: [Levenshtein ‘67],[Chengetal.14],[Schoeny etal‘17]

Existenceofcodesforlocalizedmodelw=3,4

• Informationtheoreticapproach:[Gallager ’61],[Dobrushin ‘67];lowerandupperboundsonthecapacity:[Mitzenmacher andDrinea ‘06],[Diggavi etal.‘07],[Kanoria andMontanari ‘13],[Venkataramanan etal.’13]…

• Codeconstructionsandfundamentallimits:[Varshamov andTenenglots‘65],[Levenshtein ‘66],[SchulmanandZuckerman ‘99],[Helberg andFerreira ‘02],[Cullina andKiyavash ‘14],[Gabrys etal.‘16],[Brankensieketal.’16],[Thomas etal.’17]…

• Recentfilesynchronizationalgorithms:[Yazdi andDolecek ‘14],[Venkataramanan etal.‘15],[Salaetal.’17] …

Page 5: Correcting Localized Deletions Using Guess & Check Codeseceweb1.rutgers.edu/~csi/Allerton_GC.pdf · v Guess 2: v Guess 3: 8 4 7 10 8 4 7 10 • Decoding failure: more than one possible

ModelandContribution

5

! = 101010111000100100010110101110000 01010001100111111000010

Ø # ≤ % deletions localizedinawindowofsize%windowofsize%

# deletions(inred)

Ø Ourassumptions:1)positionsofthedeletions areindependent ofthecodeword;2)informationmessage isuniform iid

Ø Contribution:Explicitcodeswithdeterministicpolynomialtimeencodinganddecodingthatcancorrectlocalizeddeletionswhp

• Logarithmicredundancy:& − ( = ) log ( +% + 1

• Polynomialtimeencodinganddecoding• Asymptoticallyvanishingprobabilityofdecoding failure• Canbegeneralized tomultiplewindows

Guess&Check(GC)Codes

Ø Hardproblemfor% = &

Ø [Schoeny etal.’17]:existenceofcodesfor% = 3,4

Page 6: Correcting Localized Deletions Using Guess & Check Codeseceweb1.rutgers.edu/~csi/Allerton_GC.pdf · v Guess 2: v Guess 3: 8 4 7 10 8 4 7 10 • Decoding failure: more than one possible

GCCodesExample

6

• Encodingthemessageoflengthk=16: 1110000000110001

1110000000110001GF(17)

14 0 3 1 (6,4)MDS 14 0 3 1 1 0

Ø Guess1:

• Decoding

111000000000112 0 1?

5 12 0 1

2MDSparities

Ø Guess2: 14 3 0 1

Ø Guess3: 14 0 3 1

Ø Guess4: 14 0 0 4

1110 000000001

1110000000001

1110000000001

0

Decodedusing1st parity

Checkwith2nd parity

1110000000001

• Assumethatthedeletions (inred)affectonlyonesystematicblock

16bits 13bits

111000000011 0001

GF(17)

[Kas HannaandElRouayhebISIT17’]

Deletionsoccurinoneofthesewindows

Page 7: Correcting Localized Deletions Using Guess & Check Codeseceweb1.rutgers.edu/~csi/Allerton_GC.pdf · v Guess 2: v Guess 3: 8 4 7 10 8 4 7 10 • Decoding failure: more than one possible

GeneralizingtoAnyWindowPosition

7

Ø Guess1:

• Decoding,windowofsizelogkcanaffectatmost2consecutive blocks

11000001100013 1?

14 4 3 1

Ø Guess2: 12 7 2 1

Ø Guess3: 12 1 11 15

1100000110001

1100000110001

12

Decodedusing1st &2nd parity

Checkwith3rd parity

• Sameencodingwithoneextraparity

1110010000110001GF(17)

(7,4)MDS 14 4 3 1 5 83MDSparities

12logkbits

1110010000110001 1100000110001

• Assumethat3deletions (inred)affect systematicbits,w=logk=4bits

16bits 13bits

window

Page 8: Correcting Localized Deletions Using Guess & Check Codeseceweb1.rutgers.edu/~csi/Allerton_GC.pdf · v Guess 2: v Guess 3: 8 4 7 10 8 4 7 10 • Decoding failure: more than one possible

RecoveringtheMDSparities

8

• Buffer:wzeros +asingleone

Ø Howtorecover theMDSparitysymbolsatthedecoder?

Ø Trivialsolution:repeat theparitybits

systematicbits

(# + 1) repetitioncode

Ø Better solution:insertabuffer betweensystematicandMDSparitybits

• Ifparitybitsgetaffected ->simplyoutputthefirst( bits.ElseapplyGuess&Checkdecoding

• Deletionscannotaffect bothsystematicandparitybitssimultaneously

111000000011000100010000MDSparities

systematicbits

11100000001100010000000000001111 0… 0

systematicbits

111000000011000100001 00010000MDSparitiesbuffer

w+1bits

Page 9: Correcting Localized Deletions Using Guess & Check Codeseceweb1.rutgers.edu/~csi/Allerton_GC.pdf · v Guess 2: v Guess 3: 8 4 7 10 8 4 7 10 • Decoding failure: more than one possible

Whendoesdecodingfail?

9

• Encodingthemessageoflengthk=16: 1100010001110010

1100010001110010GF(17)

14 4 7 2 (6,4)MDS 14 4 7 2 8 13

Ø Guess1:

• Decoding

00100011100104 7 2?

14 4 7 2

2MDSparities

Ø Guess2: 2 14 7 2

Ø Guess3: 2 3 1 2

Ø Guess4: 2 3 9 11

0010001110010

001000111 0010

0010001110010

13

Decodedusing1st parity

Checkwith2nd parity

logkbits

0010001110010

• Assumethatthedeletions (inred)affectonlyonesystematicblock

16bits 13bits

1100 010001110010

GF(17)

DecodingFailure

Page 10: Correcting Localized Deletions Using Guess & Check Codeseceweb1.rutgers.edu/~csi/Allerton_GC.pdf · v Guess 2: v Guess 3: 8 4 7 10 8 4 7 10 • Decoding failure: more than one possible

Simulations– DecodingFailure

10

( (messagelengthinbits)

Prob

abilityofFailure

4 = 106 iterations

• Simulationresults for:% = log ( , # = log ( − 1 , c = 3,4MDSparities

Page 11: Correcting Localized Deletions Using Guess & Check Codeseceweb1.rutgers.edu/~csi/Allerton_GC.pdf · v Guess 2: v Guess 3: 8 4 7 10 8 4 7 10 • Decoding failure: more than one possible

MainResults

11

Sketch of Theorem 2 (8 > : windows):

Ø Encoding complexity is ;(( log (), Decoding complexity is ;((<=>)

Ø Probability of decoding failure: Pr A ≤ (<(B=C)DE

Ø Redundancy: )(F% + 1) log (

Theorem 1 (One window): Guess & Check (GC) codes can correct inpolynomial time up to # ≤ % = G(log () localized deletions, whereH log ( < % < H + 1 log ( for some integer H ≥ 0. Let ) > H+ 2 be aconstant integer.

Ø Encoding complexity is ;(( log (), Decoding complexity is ;((L/ log ()Ø Probability of decoding failure: Pr A ≤ (B=CDE/ log (

Ø Redundancy: ) log ( + %+ 1

Sketch of Theorem 3 [ISIT ‘17] (Unrestricted deletions):Ø Redundancy: )(# + 1) log (

Ø Encoding complexity is ;(( log (), Decoding complexity is ;((N=>/ logN ()Ø Probability of decoding failure: Pr A = ;((>NDE/ logN ()

Page 12: Correcting Localized Deletions Using Guess & Check Codeseceweb1.rutgers.edu/~csi/Allerton_GC.pdf · v Guess 2: v Guess 3: 8 4 7 10 8 4 7 10 • Decoding failure: more than one possible

TestGCCodesOnline

12

Ø C++&PythoncodesareavailableonGitHub

GitHub repository:https://github.com/serge-k-hanna/GC

Ø TestthecodesonlineusingtheJupyter notebook

Goto:https://try.jupyter.org/

Upload thenotebook filesfromhttp://eceweb1.rutgers.edu/csi/software.html

Ø Formoredetails:http://eceweb1.rutgers.edu/csi/software.html

Page 13: Correcting Localized Deletions Using Guess & Check Codeseceweb1.rutgers.edu/~csi/Allerton_GC.pdf · v Guess 2: v Guess 3: 8 4 7 10 8 4 7 10 • Decoding failure: more than one possible

Simulations– RunningTime

13

C++:Earlyterminationwithprecomputing

Python:Allcaseswithoutprecomputing

Python:Allcaseswithprecomputing

Python:Earlyterminationwithprecomputing

Page 14: Correcting Localized Deletions Using Guess & Check Codeseceweb1.rutgers.edu/~csi/Allerton_GC.pdf · v Guess 2: v Guess 3: 8 4 7 10 8 4 7 10 • Decoding failure: more than one possible

DecodingFailures:WhatHappened.

13

• Exampleforonedeletion:16-bitmessage0000100011110110

Ø (6,4)MDSencodingoverGF(17): 0 8 15 6 12 52parities

Ø Suppose14th bitgetsdeleted,decoding:v Guess1: 8 4 7 10

v Guess4: 0 8 15 6

Guesses1&4satisfythe2parities

• Probabilityofdecoding failureforagivenstring:combinatorialproblemthatdependsonthestringanddeletionposition

v Guess2:

v Guess3:8 4 7 10

8 4 7 10

• Decodingfailure:morethanonepossibleguess,different decodedstrings

• Proofapproach:assumemessage isuniformiid,averageoverallpossiblemessages

Page 15: Correcting Localized Deletions Using Guess & Check Codeseceweb1.rutgers.edu/~csi/Allerton_GC.pdf · v Guess 2: v Guess 3: 8 4 7 10 8 4 7 10 • Decoding failure: more than one possible

DecodingFailure– 1Deletion15

Setofalltransmittedk-bitstrings

B B

A

SetofallGCdecoderoutputs

Setsatisfyingall) parities

AssumeWLOGthatGuess 1iscorrect,observetheoutputofdecoderatwrong GuessO ≠ 1

DecodingfailsifdecodedstringisinA

Lemma:atmost2differenttransmitted sequences canleadtothesamedecodedstring inanyGuessO ≠ 1

Setsatisfyingfirstparity

QR decodingfailureinguessO ≠ 1 = QR(decodedstringisin])

···

·

··

14

Fixed:- Guess- Deletion- Firstparity

Page 16: Correcting Localized Deletions Using Guess & Check Codeseceweb1.rutgers.edu/~csi/Allerton_GC.pdf · v Guess 2: v Guess 3: 8 4 7 10 8 4 7 10 • Decoding failure: more than one possible

ProofofPr(F)forOneDeletion

15

B B

A

k : length of message

Yi : string decoded in Guess i

c : number of parities

A : set satisfying all c parities

B : set satisfying first parity

Unionbound

Lemma

Subspacecardinality

q : field size

Page 17: Correcting Localized Deletions Using Guess & Check Codeseceweb1.rutgers.edu/~csi/Allerton_GC.pdf · v Guess 2: v Guess 3: 8 4 7 10 8 4 7 10 • Decoding failure: more than one possible

Claim

17

Ø Claim1(onedeletion):atmost2different transmittedstringscanleadtothesamedecodedstringinanywrongguess

Setofalltransmittedk-bitstrings

B B

SetofallGCdecoderoutputs

Setsatisfyingfirstparity

·

···

Fixed:- Guess- Deletionposition- Firstparity

Ø Claim2(# deletions): aconstant numberofdifferent transmittedstringscanleadtothesamedecodedstringinanywrongguess

Page 18: Correcting Localized Deletions Using Guess & Check Codeseceweb1.rutgers.edu/~csi/Allerton_GC.pdf · v Guess 2: v Guess 3: 8 4 7 10 8 4 7 10 • Decoding failure: more than one possible

Claim- Example

17

^: = 0000000000000000

^_ = 0010000000000010

000000000000000

000000000000010

Ø Claim1(onedeletion):atmost2different transmittedstringscanleadtothesamedecodedstringinanywrongguess

0000000000000000

Ø Example

Ø 3rd bitdeleted;Guess:deletionoccurred in4th block

0 0 0 0 0parity

` 0 0 ` 0

EncodinginGF(16)

Ø ^L = 1111111111111111 and^a = 0010000000000000

Ø Twoconditions:(1)Symmetryconstraint;(2)Algebraic linearconstraint

Decodeb:

b_Received

Page 19: Correcting Localized Deletions Using Guess & Check Codeseceweb1.rutgers.edu/~csi/Allerton_GC.pdf · v Guess 2: v Guess 3: 8 4 7 10 8 4 7 10 • Decoding failure: more than one possible

Claim

18

Ø Suppose3rd bitisdeleted, guess:deletionoccurred in4th block

8b1 + 4b2 + 2b4 + b5

b1b2b4b5 b6b7b8b9 b10b11b12b13 b14b15b16

8b6 + 4b7 + 2b8 + b9 8b10 + 4b11 + 2b12 + b13 ?

Symbol1 Symbol2 Symbol3 Erasure

Ø Howmanydifferent messagescanleadtosamedecodedstring?

GF(17)

Ø Symmetryconstraint: samedecodedstring⇒ samebitvaluesatpositionsofsymbols1,2and3

Ø Bitswhichcanbedifferent: and(deleted bit)b14, b15, b16 b3

Ø Algebraicconstraint:erasure isdecodedusingfirstparity

4b14 + 2(b3 + b15) + b16 = p1

b14, b15, b16, b3 2 GF (2)

p1 2 GF (17)

Equationhasatmost2solutions

Page 20: Correcting Localized Deletions Using Guess & Check Codeseceweb1.rutgers.edu/~csi/Allerton_GC.pdf · v Guess 2: v Guess 3: 8 4 7 10 8 4 7 10 • Decoding failure: more than one possible

ApplicationtoFileSynchronization

20

• Interactive synchronizationalgorithmby[Venkataramanan etal.’15]Ø Isolatesingledeletions,useVTcodesØ Modification:isolate# orfewer deletions,useGCcodes

• Gain:(1)lesscommunicationrounds,(2)lowercommunicationcost

Upto75%improvement

Upto15%improvement

19

N=1000iterations

Page 21: Correcting Localized Deletions Using Guess & Check Codeseceweb1.rutgers.edu/~csi/Allerton_GC.pdf · v Guess 2: v Guess 3: 8 4 7 10 8 4 7 10 • Decoding failure: more than one possible

Summary

16

• Guess&CheckCodes forlocalizeddeletions

Ø Explicitcodeconstruction withlogarithmicredundancy

Ø Deterministicpolynomialtimeencodinganddecoding

Ø Asymptoticallyvanishingprobabilityofdecoding failure

Ø Forsingleormultiplewindows

• Openproblems• Capacityofdeletionchannelwithlocalizeddeletions?• Codes foradversarial localizeddeletions• Andofcourse for“unrestricted” deletioncapacityandcodesarestill

openproblems

Page 22: Correcting Localized Deletions Using Guess & Check Codeseceweb1.rutgers.edu/~csi/Allerton_GC.pdf · v Guess 2: v Guess 3: 8 4 7 10 8 4 7 10 • Decoding failure: more than one possible

20