Methods for Correcting Multiple Errors of Information Storage Devices Used in Microprocessor Facilities of Measurement Technology (a discussion)



    A method is considered for ensuring resistance to failure in operational computer memory devices by utilizing linear correcting codes with a posteriori correction of multiple errors. The proposed method makes it possible to extend the correcting capability of the code, i.e., to determine the configuration of any error with the minimum code redundancy and the lowest hardware and software costs.

    A characteristic feature of modern monitoring and measurement devices is the use of specialized computers intended for mathematically processing and analyzing the results obtained. In turn, up to 70% of the equipment in the apparatus considered consists of memory [1], and so the reliability of the information obtained is largely dependent on the functional reliability of the storage device and the transfer of information. Codes which correct individual errors [1–5] are widely used in order to increase the reliability of the functioning of these devices. Here it is assumed that it is individual errors which are most likely to occur in digital devices. Making this assumption, linear codes are decoded using the method of maximum likelihood. Erroneous code sets having an error in the same bit form a coset of errors characterized by a definite value of the error syndrome, with the leader of the coset being the error vector.

    Decoding is considered to be correct if the error vector really is the leader of the coset, the erroneous code set being transformed into a code word located at the shortest Hamming distance from it. In practice, this limitation is not always justified. The increased complexity of modern computers and intense operating conditions (for example, power supply voltages departing from their nominal values, or the influence of external electromagnetic or radioactive radiation) increase the probability of faulty correction, because errors of arbitrary multiplicity may appear which have the same error syndrome as a correctable single error (multiple errors which are corrected as a single error).
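    The faulty correction described above can be illustrated with a minimal sketch. It assumes a (7,4) Hamming code with the usual position numbering (positions 1, 2 and 4 hold the control bits); this particular code and the helper names `syndrome`, `correct_single` and `flip` are chosen for illustration only and do not come from the article.

```python
# Sketch (illustrative assumption): syndrome decoding of a (7,4) Hamming code.
# Bit positions are numbered 1..7; the syndrome is the XOR of the positions of all set bits.

def syndrome(word):
    """Return the syndrome of a 7-bit word as an integer in 0..7."""
    s = 0
    for pos, bit in enumerate(word, start=1):
        if bit:
            s ^= pos
    return s

def correct_single(word):
    """Flip the bit addressed by the syndrome, i.e. assume the coset leader is a single error."""
    s = syndrome(word)
    fixed = list(word)
    if s:
        fixed[s - 1] ^= 1
    return fixed

def flip(word, *positions):
    """Return a copy of word with the given 1-based positions inverted."""
    out = list(word)
    for p in positions:
        out[p - 1] ^= 1
    return out

codeword = [0, 0, 0, 0, 0, 0, 0]   # the all-zero word belongs to every linear code

single = flip(codeword, 3)         # single error in position 3
double = flip(codeword, 1, 2)      # double error in positions 1 and 2

# Both erroneous words lie in the same coset: their syndromes coincide.
print(syndrome(single), syndrome(double))

# The single error is repaired; the double error is "corrected" into a different code word.
print(correct_single(single) == codeword)
print(correct_single(double) == codeword)
```

    In this sketch the decoder cannot distinguish the two cases from the syndrome alone, which is the situation that motivates the a posteriori approach.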

    Therefore, when designing computers which are resistant to failure, it becomes necessary to utilize correcting codes which rectify multiple errors. However, the correction of multiple errors based on linear codes leads to a sharp increase in the redundancy of the code and to large hardware costs for encoding and decoding the information, and this not only does not allow increasing the reliability and confidence in the functioning of the failure-resistant computer but even lowers these indices.

    The main idea for eliminating this contradiction consists in the a posteriori correction of errors. In order to detect the errors which arise, a linear correcting code is used which rectifies a single error (requiring the minimum hardware costs), while the determination of the configuration (the error bits) of a multiple error and its correction are performed from the results of an analysis of the response obtained by applying a single test action (requiring a minimum expenditure of time).

    Fundamental Concepts and Definitions. Let the rectification of the errors of a code set be provided on the basis of a linear correcting code which corrects a single error.

    Measurement Techniques, Vol. 45, No. 2, 2002


    Al-r A. Pavlov and A. A. Pavlov UDC 519.725(047)

    Translated from Izmeritel'naya Tekhnika, No. 2, pp. 21–23, February, 2002. Original article submitted June 25, 2001.

    0543-1972/02/4502-0141$27.00 © 2002 Plenum Publishing Corporation

    To each working input set $X_{is}$ there corresponds a code set

    $Y = \{y_1, y_2, \ldots, y_k, r_{k+1}, r_{k+2}, \ldots, r_n\}$,

    where $y_i$ and $r_j$ are respectively the values of the signals in the information and control bits. The vector $R$ of the control bits is a function of the information bits and is determined by the information encoding rule $f$ of the chosen code:

    $R = \{r_{k+1}, r_{k+2}, \ldots, r_n\} = f(y_1, y_2, \ldots, y_k)$.

    After reception of the message, the vector of the control bits $R_r$ is formed again from the received information bits, and the error syndrome is determined:

    $E_s = R \oplus R_r$.
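    The formation of the control bits and of the error syndrome can be sketched as follows. The sketch assumes the five-bit Hamming layout $(r_1 r_2 y_2 r_3 y_1)$ that the article uses in its example, together with the parity rules $r_1 = y_2 \oplus y_1$, $r_2 = y_2$, $r_3 = y_1$. These rules are an assumption (the standard Hamming position rule applied to that layout); the article does not state them explicitly. The function names are hypothetical.

```python
# Sketch under assumed parity rules for the layout (r1 r2 y2 r3 y1).

def encode(y1, y2):
    """Control bits (r1, r2, r3) computed from the information bits (assumed rules)."""
    return (y1 ^ y2, y2, y1)

def code_set(y1, y2):
    """Code set in the order r1 r2 y2 r3 y1."""
    r1, r2, r3 = encode(y1, y2)
    return (r1, r2, y2, r3, y1)

def error_syndrome(word):
    """E_s = R xor R_r: received control bits XOR control bits recomputed from received data."""
    r1, r2, y2, r3, y1 = word
    received = (r1, r2, r3)
    recomputed = encode(y1, y2)
    return tuple(a ^ b for a, b in zip(received, recomputed))

word = code_set(1, 1)
print(word)                      # (0, 1, 1, 1, 1), the error-free code set 01111
print(error_syndrome(word))      # (0, 0, 0): no error detected

damaged = (0, 1, 1, 1, 0)        # y1 (the last position) inverted
print(error_syndrome(damaged))   # (1, 0, 1): r1 + 2*r2 + 4*r3 = 5, the position of y1
```

    With these assumed rules the syndrome, read as a binary number, addresses the position of a single erroneous bit.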

    For each working input set $X_{is}$ providing a definite value of the signals in the information and control bits $Y_k = \{y_1, y_2, \ldots, y_k, r_{k+1}, r_{k+2}, \ldots, r_n\}$, we have a corresponding test set $T_{ts} = \{Y_k, R_r\} \to Y_t$ which produces the opposite value of the signals in the information and control bits.

    Definition 1. We shall consider the inverse value of the result of summing the information and control bits $Y_k = \{y_1, y_2, \ldots, y_k, r_{k+1}, r_{k+2}, \ldots, r_n\}$ obtained for a working input set with the information and control bits $Y_t$ obtained for the test set to be the test error vector:

    $B = \overline{Y_k \oplus Y_t}$.

    If there is no error, the test error vector will assume a value of zero.

    Definition 2. An error which is not manifested in the considered input working set will be called a hidden error.

    Example. A variant with unit values in the information bits of a Hamming code $(r_1 r_2 y_2 r_3 y_1)$ corresponds to the error-free code set 01111. When a constant-1 (stuck-at-1) error is present in the first information bit, we have for the input set considered an output code set $01111^{+}$ (the + sign marks the erroneous bit) which does not differ from the error-free code set.
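    The hidden error of the example and its detection by a test action can be sketched as follows. The memory model (`FaultyMemory`, with cells that may be stuck at a constant) and the function `compute_test_error_vector` are hypothetical names introduced here; the sketch assumes that the test action writes the bitwise inverse of the word that was read and that the test error vector is the inverse of the bitwise sum of the two readings.

```python
# Sketch: a word of memory cells, some of which may be stuck at a constant value.

class FaultyMemory:
    def __init__(self, size, stuck=None):
        self.cells = [0] * size
        self.stuck = dict(stuck or {})   # cell index -> constant value it always returns

    def write(self, bits):
        self.cells = list(bits)

    def read(self):
        return [self.stuck.get(i, bit) for i, bit in enumerate(self.cells)]

def compute_test_error_vector(memory, working_word):
    """Return (Y_k, B), where B = NOT(Y_k xor Y_t) and Y_t is read after writing NOT(Y_k)."""
    memory.write(working_word)
    y_k = memory.read()
    memory.write([bit ^ 1 for bit in y_k])   # test action: opposite values in every bit
    y_t = memory.read()
    memory.write(working_word)               # restore the working word
    b = [1 ^ (a ^ c) for a, c in zip(y_k, y_t)]
    return y_k, b

working = [0, 1, 1, 1, 1]                    # code set 01111 in the order r1 r2 y2 r3 y1
healthy = FaultyMemory(5)
faulty = FaultyMemory(5, stuck={4: 1})       # y1 (last cell) stuck at constant 1

y_ok, b_ok = compute_test_error_vector(healthy, working)
y_hid, b_hid = compute_test_error_vector(faulty, working)

print(y_hid == working)   # True: the error is hidden in the working set
print(b_ok)               # [0, 0, 0, 0, 0]
print(b_hid)              # [0, 0, 0, 0, 1]: the test action reveals the faulty cell

# Inverting the flagged bit of the (actually correct) word damages it.
pseudo = [bit ^ flag for bit, flag in zip(y_hid, b_hid)]
print(pseudo)             # [0, 1, 1, 1, 0]
```

    The last step shows why a word containing a hidden error must not be corrected directly from the test error vector: the flagged bit already holds the right value for the working set.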

    Definition 3. We shall say that an erroneous code set is correct if it does not contain hidden errors. If it does contain such errors, it will be said to be incorrect.

    Statement 1. Rectification of an incorrect erroneous code set by utilizing a test error vector leads to pseudocorrection.

    Proof. When a test stimulus is applied which provides the opposite value of the information bits, any errors are detected. In this case, the test error vector indicates the numbers of the erroneous information bits and, in particular, of the bits containing hidden errors. Since hidden errors correspond to the working input set, their correction based on a test error vector in turn leads to an error in the corrected code set.

    Corollary 1. A posteriori correction of multiple errors is possible under conditions when hidden errors are revealed (when corrections to the test error vector are formed).

    On the basis of the concepts and definitions given, the problem is posed of revealing the configuration of multiple errors from the results of algebraic operations with the values of the error syndrome $E_s$ and the test error vector $B$ obtained when the test stimulus is applied.

    Rule for Forming Error Vector Values. The procedure for determining the error vector is based on the following theoretical postulates.

    Encoding the information bits $B_i$ of the test error vector $B = \overline{Y_k \oplus Y_t}$ by the encoding rule $f$ of the considered code gives the error code of the test bits:

    $E_i = f(B_i)$.


    After summing the error syndrome and the error code of the test bits, we obtain the address code of the correction to the hidden error

    $E_c = E_s \oplus E_i$.
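    The three quantities can be computed side by side in a small sketch. It again assumes the five-bit layout $(r_1 r_2 y_2 r_3 y_1)$ with the parity rules $r_1 = y_2 \oplus y_1$, $r_2 = y_2$, $r_3 = y_1$, which are an assumption rather than something stated in the article; `encode`, `xor` and `address_code` are hypothetical helper names.

```python
# Sketch: E_s, E_i and E_c for the assumed layout (r1 r2 y2 r3 y1).

def encode(y1, y2):
    """Control bits (r1, r2, r3) from the information bits (assumed parity rules)."""
    return (y1 ^ y2, y2, y1)

def xor(a, b):
    """Bitwise modulo-2 sum of two equal-length tuples."""
    return tuple(p ^ q for p, q in zip(a, b))

def address_code(y_k, b):
    """Return (E_s, E_i, E_c) for a working reading y_k and a test error vector b."""
    r1, r2, y2, r3, y1 = y_k
    e_s = xor((r1, r2, r3), encode(y1, y2))   # error syndrome of the working reading
    _, _, b_y2, _, b_y1 = b
    e_i = encode(b_y1, b_y2)                  # information bits of B, encoded by the code
    e_c = xor(e_s, e_i)                       # address code of the correction
    return e_s, e_i, e_c

b = (0, 0, 0, 0, 1)                           # the test flags the cell holding y1

# Hidden error: y1 stuck at 1 while the stored value is also 1.
print(address_code((0, 1, 1, 1, 1), b))       # ((0, 0, 0), (1, 0, 1), (1, 0, 1))

# Manifested error: y1 stuck at 0 while the stored value is 1.
print(address_code((0, 1, 1, 1, 0), b))       # ((1, 0, 1), (1, 0, 1), (0, 0, 0))
```

    In this toy case a nonzero $E_c$ coincides with the hidden error and a zero $E_c$ with the manifested one; this is an observation about the sketch, not a general result taken from the article.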

    Based on the values obtained for $E_s$, $E_i$, and $E_c$, a decision is taken concerning the correction of errors in the information bits when the number of errors $d$ in the information bits satisfies the condition $d \le k - 1$. In this case, the decoding strategy includes the following postulates:

    correction is impossible if the bits of the test error vector corresponding to the control bits have zero values;

    the transfer of information bits without correction is permitted if the test error vector contains zero values in the information bits and unit values (errors) in the control bits;

    correction is forbidden (the "device failure" signal is formed) if all the bits of the test error vector corresponding to the information bits have unit values (k-fold error) or if unit values of the signals are present simultaneously in the information and control bits of the test error vector;

    when a hidden error appears, the error vector is formed by adding the correction to the test error vector.

    Rule for Forming a Correction when a Hidden Error Appears. Let us construct a decision table in order to determine the correction of the test error vector (corrections to each hidden error). Then the number of corrections forms a set of cardinality $S_M = 2^k$.


    For each hidden error, we have a corresponding value of the correction and the corresponding address code of the correction. We represent this combination in the form of the defining matrix

    $$
    \left\|
    \begin{array}{cccc|cccc}
    0 & 0 & \ldots & 0 & e_{0,k+1} & e_{0,k+2} & \ldots & e_{0,n} \\
    0 & 0 & \ldots & 1 & e_{1,k+1} & e_{1,k+2} & \ldots & e_{1,n} \\
    0 & \ldots & 1 & 0 & e_{2,k+1} & e_{2,k+2} & \ldots & e_{2,n} \\
    \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots & \ldots \\
    1 & 0 & \ldots & 0 & e_{2^k,k+1} & e_{2^k,k+2} & \ldots & e_{2^k,n}
    \end{array}
    \right\|
    $$

    where $c_{ij}$ are the values of the bits of the correction vector (the left-hand group of elements), $i = 0, \ldots, 2^k$ is the row number, $j = 0, \ldots, k$ is the column number, and $e_{ij}$ are the values of the bits of the address code of the correction.

    Property 1. To each address code of the corrections (the right-hand group of elements of the defining matrix) there corresponds a direct and inverse value of the bits of the correction vector

    $E_{ei} = (e_{i,k+1}, e_{i,k+2}, \ldots, e_{i,n}) \to \{c_1, c_2, \ldots, c_k;\ \bar{c}_1, \bar{c}_2, \ldots, \bar{c}_k\}$,

    where $c_i$ and $\bar{c}_i$ are respectively the direct and inverse values of the error vector bit.

    This property follows from the definition of the dual erroneous code set: the opposite values of the erroneous code set correspond to the same value of the error syndrome.

    Let us choose from the defining matrix those rows for which the number (binary equivalent) of the values of the correction vector corresponds to $2^i$, $i = 1, 2, \ldots, k$, and construct the error table

