data integrity © prof. aiman hanna department of computer science concordia university montreal,...

Data IntegrityData Integrity

© Prof. Aiman Hanna© Prof. Aiman HannaDepartment of Computer Science Department of Computer Science

Concordia University Concordia University Montreal, CanadaMontreal, Canada

22

DD ata Integrity ata Integrity A possible reality: A possible reality:

Data transferred Data transferred data altered data altered datadata received received

Electrical interference, power fluctuation, sunspots, storms, …Electrical interference, power fluctuation, sunspots, storms, …etc. are all factors etc. are all factors

Error detectionError detection is hence needed is hence needed

What can be done if errors are detected? What can be done if errors are detected?

Error correctionError correction may sometimes be the most suitable solution may sometimes be the most suitable solution

33

EE rror Detection rror Detection Parity Check Error Detection Parity Check Error Detection Simple/naïve Simple/naïve

Send additional bits along with data Send additional bits along with data

The value of those additional bits depend on the data The value of those additional bits depend on the data itselfitself

Theoretically, if data is alerted, the value of the Theoretically, if data is alerted, the value of the additional bits will no longer correspond to the new additional bits will no longer correspond to the new datadata

44

EE rror Detection rror Detection Parity Check Error Detection Parity Check Error Detection Parity checkingParity checking – – even parityeven parity & & odd parityodd parity

One One parity bitparity bit is added to achieve that is added to achieve that

Figure 6. 1 – Detecting Single-bit Errors Using Even Parity Checking

55

EE rror Detection rror Detection Parity Check Error Detection Parity Check Error Detection How many errors can a parity check detect? How many errors can a parity check detect?

Examples: Assume the use of even parity Examples: Assume the use of even parity Sent: 10100010 Sent: 10100010 11 Received: 1010 Received: 101011010 010 11

Sent: 10011110 Sent: 10011110 11 Received: 10 Received: 1010101110 1110 11 Sent: 10001110 Sent: 10001110 00 Received: 1 Received: 11111111110 1110 00 Sent: 10001110 Sent: 10001110 00 Received: 1 Received: 11111111110 1110 11

≡ ≡ Errors detectedErrors detected ≡ ≡ No errors detectedNo errors detectedNotice that all the above examples do have errors!Notice that all the above examples do have errors!

66

EE rror Detection rror Detection Parity Check Error Detection Parity Check Error Detection In reality, single-bit errors are rare In reality, single-bit errors are rare

An error for a duration of 1/100 second over a 10Mbps line may affect An error for a duration of 1/100 second over a 10Mbps line may affect 10,000 bits 10,000 bits

When many bits are damaged, we refer to that as When many bits are damaged, we refer to that as burst errors burst errors

In average, parity check would catch about 50% of burst errorsIn average, parity check would catch about 50% of burst errors

In other words, it is 50% accurate, which is simply not good for In other words, it is 50% accurate, which is simply not good for communications networks communications networks

This however does not mean that parity checking is useless. Why?This however does not mean that parity checking is useless. Why?

77

EE rror Detection rror Detection Checksums Checksums Divide all data bits into 32-bit groups Divide all data bits into 32-bit groups

Treat each group as an integer value Treat each group as an integer value

The values are then added together to give a The values are then added together to give a checksumchecksum

The checksum is then appended to the sent data and tested at the The checksum is then appended to the sent data and tested at the receiving end receiving end

More accurate than parity checking, however it may not detect More accurate than parity checking, however it may not detect all errors. Why? all errors. Why?

88

EE rror Detection - CRCrror Detection - CRCCyclic Redundancy Checks (CRC) Cyclic Redundancy Checks (CRC) Errors are detected via Errors are detected via polynomial divisionpolynomial division

In general, a polynomial interpretation is made for any bit stringIn general, a polynomial interpretation is made for any bit string

For example, the bit string For example, the bit string

bbn-1n-1bbn-2n-2bbn-3n-3….…. bb22bb11bb00

is interpreted as the polynomial is interpreted as the polynomial

bbn-1n-1xxn-1 n-1 + b+ bn-2n-2xx

n-2 n-2 + b+ bn-3n-3xxn-3 n-3 + ….. + b+ ….. + b22xx

2 2 + b+ b11xx + b+ b00

Since each bit here is either 0 or 1, we can consider Since each bit here is either 0 or 1, we can consider xxii only when b=1 only when b=1

For example, the bit string For example, the bit string 1001010111010010101110 is interpreted as is interpreted as

xx10 10 + x+ x7 7 + x+ x5 5 + x+ x3 3 + x+ x2 2 + x+ x11

99

EE rror Detection - CRCrror Detection - CRCCyclic Redundancy Checks (CRC) - Cyclic Redundancy Checks (CRC) - Polynomial DivisionPolynomial Division Modulo 2 addition and subtraction (Ex-Or arithmetic) are as Modulo 2 addition and subtraction (Ex-Or arithmetic) are as

follows:follows:

Addition:Addition: 0+0=0; 1+0=1; 0+1=1; 1+1=0 0+0=0; 1+0=1; 0+1=1; 1+1=0

Subtraction:Subtraction: 0-0=0; 1-0=1; 0-1=1; 1-1=0 0-0=0; 1-0=1; 0-1=1; 1-1=0

Assume Assume T(x)T(x) and and G(x)G(x) are two polynomials as follows: are two polynomials as follows:

T(x) = xT(x) = x10 10 + x+ x9 9 + x+ x7 7 + x+ x55 + x+ x44

G(x)G(x) = = xx4 4 + x+ x33 + x+ x00

What is What is T(x)/G(x)?T(x)/G(x)?

1010

EE rror Detection - CRCrror Detection - CRCCyclic Redundancy Checks (CRC) - Cyclic Redundancy Checks (CRC) - Polynomial DivisionPolynomial Division What is What is T(x)/G(x)?T(x)/G(x)?

Figure 6. 2 – Calculation of (xx10 10 + x+ x9 9 + x+ x7 7 + x+ x55 + x+ x44 )/(xx4 4 + x+ x33 + 1+ 1 )

1111

EE rror Detection - CRCrror Detection - CRCCyclic Redundancy Checks (CRC) - Cyclic Redundancy Checks (CRC) - Polynomial DivisionPolynomial Division The equivalent synthetic division of The equivalent synthetic division of T(x)/G(x) would look as followT(x)/G(x) would look as follow

Figure 6.3 – Synthetic Division of

(xx10 10 + x+ x9 9 + x+ x7 7 + x+ x55 + x+ x44 )/(xx4 4 + x+ x33 + 1+ 1 )

1212

EE rror Detection - CRCrror Detection - CRCCyclic Redundancy Checks (CRC)Cyclic Redundancy Checks (CRC) How does CRC work?How does CRC work?

• Given a bit string, append several 0s to it, and call be Given a bit string, append several 0s to it, and call be BB• Find the polynomial interpretation of B; that is Find the polynomial interpretation of B; that is B(x)B(x)• Agree on some polynomial value Agree on some polynomial value G(x)G(x) (called (called generator polynomialgenerator polynomial) ) • Divide Divide B(x)/G(x)B(x)/G(x) and find the remainder polynomial and find the remainder polynomial R(x)R(x) • Define Define T(x) = B(x) – R(x)T(x) = B(x) – R(x)• Find the bit string of T(x); call it Find the bit string of T(x); call it TT• Transmit Transmit TT• Let Let T’T’ is the received string is the received string • From From TT find out find out T’(x)T’(x)• DivideDivide T’(x)/G(x) T’(x)/G(x) if there is aif there is a zero remainder zero remainder thenthen T = T’ T = T’, which means no errors, which means no errors• Otherwise, errors are detectedOtherwise, errors are detected

1313

EE rror Detection - CRCrror Detection - CRCCyclic Redundancy Checks (CRC)Cyclic Redundancy Checks (CRC) Example: Suppose the bit string 1101011 is to be sentExample: Suppose the bit string 1101011 is to be sent

• Assume Assume G(x)G(x) = = xx4 4 + x+ x3 3 + 1+ 1

• Append four 0s at the end of the string Append four 0s at the end of the string B = 11010110000B = 11010110000

• B(x) = xB(x) = x10 10 + x+ x9 9 + x+ x7 7 + x+ x5 5 + x+ x4 4

• B(x)/G(x)B(x)/G(x) The remainder The remainder R(x) = xR(x) = x3 3 + x+ x

• T(x) = B(x) – R(x); T(x) = B(x) – R(x); this can be done by subtracting over the bit stringsthis can be done by subtracting over the bit strings

11010110000 11010110000 bit string Bbit string B

-1010 -1010 bit string Rbit string R

11010111010 11010111010 bit string T bit string T (Notice that T(x)/G(x) has a zero (Notice that T(x)/G(x) has a zero remainder) remainder)

In algebra the following analogy is true: If p and q are integers and r is the remainder of p/q, then p-r is evenly divisible by q. for example, 8/3 generates a remainder of 2. 8-2 is evenly divisible by 3.

1414

EE rror Detection - CRCrror Detection - CRCCyclic Redundancy Checks (CRC)Cyclic Redundancy Checks (CRC) Example (continues…):Example (continues…):

Figure 6.4 – Dividing T(x)/G(x)

1515

EE rror Detection - CRCrror Detection - CRCCyclic Redundancy Checks (CRC)Cyclic Redundancy Checks (CRC) Example (continues…):Example (continues…):

• Transmit Transmit TT

• Let Let T’T’ is the received string is the received string

• From From TT find out find out T’(x)T’(x)

• DivideDivide T’(x)/G(x) T’(x)/G(x) if there is aif there is a zero remainder zero remainder thenthen T = T’ T = T’, which means no errors, which means no errors

• Otherwise, errors are detectedOtherwise, errors are detected

• Notes: In practice the following holds true:Notes: In practice the following holds true: A non-zero remainder concludes that errors have occurredA non-zero remainder concludes that errors have occurred A zero remainder however is not that conclusive. Unless G(x) is chosen A zero remainder however is not that conclusive. Unless G(x) is chosen

carefully, it is possible that a damaged T’ may still result in a zero remainder carefully, it is possible that a damaged T’ may still result in a zero remainder when T’(x) is divided by G(x)when T’(x) is divided by G(x)

1616

EE rror Detection - CRCrror Detection - CRCCRC Implementation Using Circular Shifts CRC Implementation Using Circular Shifts Performing CRC check for each data arrival is a massive overheadPerforming CRC check for each data arrival is a massive overhead

How fast can the polynomial division be performed? How fast can the polynomial division be performed?

Figure 6.5 – Dividing T’(x)/G(x) When Error Occurs

1717

EE rror Detection - CRCrror Detection - CRCCRC Implementation Using Circular ShiftsCRC Implementation Using Circular Shifts Division using Circular ShiftsDivision using Circular Shifts

Figure 6.6 – Dividing T(x)/G(x) Using Circular Shift

1818

EE rror Correctionrror Correction When errors are detected, either resend the message or correct itWhen errors are detected, either resend the message or correct it

In many cases correction will be a much better choiceIn many cases correction will be a much better choice

1 parity bit check 1 parity bit check 2 possibilities 2 possibilities whether or not an error occurred; but not where whether or not an error occurred; but not where

2 parity bit checks 2 parity bit checks 4 possibilities of failures and successes 4 possibilities of failures and successes

3 parity bit checks 3 parity bit checks 8 possibilities of failures and successes 8 possibilities of failures and successes

nn parity bit checks parity bit checks 22nn possibilities of failures and successes possibilities of failures and successes

1919

EE rror Correctionrror Correction How many How many single-errorsingle-error possibilities are there in the a 3-bit possibilities are there in the a 3-bit

string, for example string, for example mm11 m m2 2 mm33?? 1 parity check 1 parity check whether or not an error occurred, but not whether or not an error occurred, but not

where – Not enough to correctwhere – Not enough to correct 2 parity checks 2 parity checks 4 possibilities 4 possibilities

• Example: Assume Example: Assume 1 0 11 0 1 is sent and there are P1 & P2 parity checks is sent and there are P1 & P2 parity checks • P1 is even parity for P1 is even parity for mm11 & m & m2 2 ; ; P2P2 is even parity for is even parity for mm11 & m & m33

• P1 is hence set to 1, P2P1 is hence set to 1, P2 is set to 0. Assume also that parities are delivered is set to 0. Assume also that parities are delivered successfully! successfully!

• Possible deliveries with a maximum of a single-bit error are:Possible deliveries with a maximum of a single-bit error are: 00 0 1 0 1 P1 fails, P2 fails P1 fails, P2 fails 1 1 11 1 1 P1 fails, P2 succeed P1 fails, P2 succeed 1 0 1 0 00 P1 succeed, P2 fails P1 succeed, P2 fails 1 0 11 0 1 P1 succeed, P2 succeed P1 succeed, P2 succeed No error in transmission No error in transmission

As the receiver, can you correct the error?As the receiver, can you correct the error?

2020

EE rror Correctionrror CorrectionHamming Codes Hamming Codes Hamming code requires the insertion of few parity bits Hamming code requires the insertion of few parity bits

in the bit stream before sending itin the bit stream before sending it

The parity bits check the parity in strategic locationsThe parity bits check the parity in strategic locations

If bits are altered, the receiver will examine the If bits are altered, the receiver will examine the combinations of failures to discover where the errors combinations of failures to discover where the errors areare

2121

EE rror Correctionrror CorrectionHamming Codes – Hamming Codes – Single-bit Error CorrectionSingle-bit Error Correction How many How many single-errorsingle-error possibilities are there in an 8- possibilities are there in an 8-

bit string, for example bit string, for example mm11 m m2 2 mm3 3 mm4 4 mm5 5 mm6 6 mm7 7 mm88??

3 parity checks 3 parity checks wouldwould be enough to discover a single be enough to discover a single error in one of 8 positions, howevererror in one of 8 positions, however

Errors may as also occur in the parity bitsErrors may as also occur in the parity bits

4 parity checks are needed 4 parity checks are needed 2 244 > 12 > 12

2222

EE rror Correctionrror CorrectionHamming Codes – Hamming Codes – Single-bit Error CorrectionSingle-bit Error Correction In general, must have n parity checks, where 2In general, must have n parity checks, where 2nn is is

larger than the bits sent larger than the bits sent

nn (parity checks) (parity checks) Bits sentBits sent Success/fail Success/fail

possibilitiespossibilities

11 99 22

22 1010 44

33 1111 88

44 1212 1616(enough to cover the 13 possible events (enough to cover the 13 possible events – no error or single-bit error in one of – no error or single-bit error in one of

12 positions)12 positions)

2323

EE rror Correctionrror CorrectionHamming Codes – Hamming Codes – Single-bit Error CorrectionSingle-bit Error Correction Which positions will parity checks cover? Where are Which positions will parity checks cover? Where are

they inserted? they inserted?

Figure 6.7 – Hamming Code for Single-bit Errors

2424

EE rror Correctionrror CorrectionHamming Codes – Hamming Codes – Single-bit Error CorrectionSingle-bit Error Correction

Assume 4-bit binary number, Assume 4-bit binary number, bb11 b b22 b b33 b b44, where , where bbii = 0 if the parity = 0 if the parity ppii succeed, and 1 otherwise succeed, and 1 otherwise Error Position Error Position Binary Binary

bb44, b, b33, b, b22 & b & b11

Parity PParity P11 / / PositionPosition

Parity PParity P22

/ Position/ Position

Parity PParity P33


Parity PParity P44


0 (No Error)0 (No Error) 00000000

11 00010001 11

22 00100010 22

33 00110011 33 33

44 01000100 44

55 01010101 55 55

66 01100110 66 66

77 01110111 77 77 77

88 10001000 88

99 10011001 99 99

1010 10101010 1010 1010

1111 10111011 1111 1111 1111

1212 11001100 1212 1212

2525

EE rror Correctionrror CorrectionHamming Codes – Hamming Codes – Single-bit Error CorrectionSingle-bit Error CorrectionExample:Example:

Figure 6.8 – Bit Stream

Before Transmission

Figure 6.9 – Parity Checks of Frame After Transmission

2626

EE rror Correctionrror CorrectionHamming Codes – Hamming Codes – Multiple-bit Error CorrectionMultiple-bit Error Correction A more general Hamming code for multiple-bit error do existA more general Hamming code for multiple-bit error do exist

The number of extra parity bits becomes much larger in this caseThe number of extra parity bits becomes much larger in this case

When there are many possible codes, a reduction is needed in When there are many possible codes, a reduction is needed in order to analyze them quickly order to analyze them quickly

Code Correlation can be used to achieve such reduction Code Correlation can be used to achieve such reduction

When you have two, or more, Hamming codes, a Hamming When you have two, or more, Hamming codes, a Hamming distance is defined as the minimum number of bit changes that distance is defined as the minimum number of bit changes that can lead one code to be detected as anothercan lead one code to be detected as another

A Radius is defined as Hamming Distance / 2A Radius is defined as Hamming Distance / 2

2727

EE rror Correctionrror CorrectionHamming Codes – Hamming Codes – Multiple-bit Error CorrectionMultiple-bit Error Correction Example: Example:

Note:Note: This topic is a major subject of higher-level/graduate courses. It is This topic is a major subject of higher-level/graduate courses. It is notnot a part of this course and you a part of this course and you will will not not be examined on it. This example is just to give an extra knowledge on the subject. be examined on it. This example is just to give an extra knowledge on the subject.

Code 1Code 1 Code 2Code 2 Code 3Code 3

11 00 11

00 11 11

00 11 00

11 00 00

The codes have a minimum Hamming Distance of 2, or a Radius of 1The codes have a minimum Hamming Distance of 2, or a Radius of 1 A code such as 0100 (which is not among the exact valid codes) can be coded as A code such as 0100 (which is not among the exact valid codes) can be coded as

either code 3 with a missing first bit, or as code 2 with a missing third biteither code 3 with a missing first bit, or as code 2 with a missing third bit A change of 2 bits may lead to two codes to be identical; or in other words decoded A change of 2 bits may lead to two codes to be identical; or in other words decoded

incorrectly (for example if code 3 is changed to incorrectly (for example if code 3 is changed to 0011110, it will be decoded incorrectly 0, it will be decoded incorrectly as code 2)as code 2)

data integrity © prof. aiman hanna department of computer science concordia university montreal,...

Documents