

Error-correction schemes for volume optical memories

Mark A. Neifeld and Jerry D. Hayes

An optically addressed Reed–Solomon parallel decoder has been designed and fabricated for one-dimensional parallel access optical memories. The (15, 9) Reed–Solomon decoder operates on 60 parallel optical inputs and has been demonstrated at a data rate of 300 megabits/s. Compared with equivalent serial decoding solutions, this decoder is shown to be more area efficient and offers reduced latency. An extension to two-dimensional error correction using both the parallel and serial strategies is presented, and comparisons are made in terms of parallelism, page rate, and information rate for the two architectures. A hybrid optoelectronic decoding architecture that uses optical finite-field matrix–vector multipliers is given and is shown to offer error correction at large block sizes and aggregate data rates exceeding $10^{12}$ bits/s.

Key words: Optical memory, parallel error correction, page access memory, Reed–Solomon codes.

© 1995 Optical Society of America

1. Introduction

Volume optical memories offer high storage capacity, highly parallel access, and potentially fast data-transfer rates [1–6]. Three-dimensional storage schemes that utilize volume optical memories (e.g., photorefractives) are typically accessed by the use of a two-dimensional (2D) page format, as shown in Fig. 1. The parallelism achievable by the use of page access with volume memories can easily exceed $10^5$ bits, defining a highly parallel access memory system. With page rates approaching 10 MHz, the effective data rate for volume memory systems is expected to exceed $10^{12}$ bits/s (bps). Although a 10-MHz page rate is technologically quite optimistic, significant processing is assumed to take place within the smart memory interface, and this estimate is therefore based on detectability considerations rather than on conventional memory architecture requirements or CCD scan rates.

As an example, Fig. 1 illustrates a conceptual architecture for volume memory that serves a group of serial users. The memory interface might respond to users by serving simple retrieval requests. In this case the user would provide an address to the interface, and the interface would respond by accessing the corresponding page. It is also expected that such an interface could provide data-independent partitioning prior to distributing the retrieved data. Because within this scheme the interface does not perform data-dependent functions, data reliability is not required at the interface and conventional error correction can occur at each user location. Conventional serial error-correction data rates are currently of the order of $10^8$ bps [7,8]. If instead the memory interface processing is data driven, then error correction must occur within the interface. Examples of data-dependent processing that might be required of the parallel memory server include database sorting, data fusion, data compression, and data decompression. Because these operations require high data reliability, error correction must occur before distribution and therefore may demand processing rates greater than $10^{12}$ bps, far exceeding the capabilities of conventional serial error-correction hardware and thus identifying a need for parallel error-correction techniques. Although the need for error correction in such systems has been discussed and demonstrated elsewhere, these treatments have not addressed the parallel realization of such error-correction techniques [9–11].

Here we first discuss a one-dimensional (1D) parallel error-correction solution that we designed and tested that is based on the Reed–Solomon (RS) code structure. This decoder is optically addressed and offers 300-Mbps error correction for 1D parallel optical data. In Section 2 a brief introduction to RS codes, together with a comparison between a conventional serial RS decoder and our 1D parallel decoder, is given. Architectural extensions to 2D page access error correction that use both our 1D parallel decoder and conventional serial decoders are presented in Section 3, along with comparisons between these extended architectures in terms of parallelism, page rate, and data rate. In Section 4 we present a hybrid optoelectronic decoder design based on our 1D parallel decoder. The advantages and cost of using optical finite-field matrix–vector multipliers for increasing parallelism and information rate are also discussed.

The authors are with the Department of Electrical and Computer Engineering, The Optical Sciences Center, University of Arizona, Tucson, Arizona 85721.
Received 20 December 1994; revised manuscript received 7 July 1995.
0003-6935/95/358183-09\$06.00/0.
© 1995 Optical Society of America.
10 December 1995 / Vol. 34, No. 35 / APPLIED OPTICS 8183

2. Parallel Error Correction

To ensure data integrity in data-storage systems, block error-correction codes are often used to provide robustness to errors incurred during the storage and retrieval processes. A block code treats the data to be encoded as message blocks of $k$ symbols, where each symbol is represented by $m$ bits. An encoding process is then used to add redundancy to each message block to form an $n$-symbol code word, where $n > k$. The added redundancy defines a mathematical structure among the $n$ symbols that facilitates both error detection and correction. An $(n, k)$ RS code is a specific example of a block code, with a random symbol error-correction capability of $t = n(1 - k/n)/2$ symbols, where the ratio $k/n$ is referred to as the code rate and is a measure of the information content of the encoded system. A measure of the burst error-correction capability of an RS code is the number of consecutive bit errors that can be corrected, which is given as $m(t - 1) + 1$, where $m = \log_2(n + 1)$. The exceptional burst and random error-correction capabilities of RS codes are major reasons for their use in existing optical memory systems [12].

Fig. 1. Page access volume optical memory with parallel interface that serves a group of serial users.

In conventional memory systems, an $n$-symbol stored code word is typically retrieved in a symbolwise serial fashion, as shown in Fig. 2(a). The syndrome, Euclidean algorithm, and error-correction blocks form the core of a serial decoder that executes a predefined decoding algorithm based on the mathematical structure of the code. A salient feature of this architecture is the utilization of time resources through the use of feedback, thus minimizing space resource requirements. Such a utilization of time resources, however, may create a bottleneck when the data rates achievable for page access memories are considered. One approach to overcoming this bottleneck could be the use of an array of $B_p/B_s$ serial decoders in parallel to achieve the required aggregate data rate, where $B_p$ is the required system data rate and $B_s$ is the data rate associated with a single serial decoder. An alternative to such an array of serial decoders might be a parallel decoding solution that uses space resources in support of the required $B_p$ data rate.

Fig. 2. Candidate RS decoding strategies: (a) conventional time domain (serial) decoder, (b) 1D parallel decoder.

Figure 2(b) shows the block diagram of a parallel decoder that receives code words in one dimension and processes the decoding algorithm in space rather than time. Instead of employing feedback, the parallel decoder uses a fully feedforward architecture to enhance data throughput. To facilitate comparison between a conventional serial decoder and this parallel decoder, we designed, fabricated, and tested a (15, 9) parallel decoder based on the parallel processing architecture shown in Fig. 2(b) [13]. The decoder operates on 60-bit parallel optical input words and has been demonstrated at a clock rate of 5 MHz, yielding an effective data rate of 300 Mbps. A photograph of our parallel decoder is given in Fig. 3. The 1D parallel optical input array can be seen running vertically in the upper center portion of the chip. Directly to the right of the input array is a parallel implementation of the syndrome block. The syndrome block receives the 15 four-bit symbols from the optical input array in parallel and calculates six syndrome symbols in one clock cycle. The syndrome symbols contain information about error magnitudes and locations. If we refer to the 15 received symbols as the received vector $\mathbf{r}$ and the six syndrome symbols as the syndrome vector $\mathbf{s}$, then the computation within the syndrome block can be described by the matrix–vector multiply

$$\mathbf{s} = \mathbf{H}\mathbf{r}.$$

Matrix $\mathbf{H}$ is referred to as the parity check matrix, and the symbols of $\mathbf{H}$ are finite-field elements defined by our (15, 9) RS code. A fully parallel electronic implementation of a finite-field matrix–vector multiply results in a highly connected, wire-intensive, crossbar architecture. These long wires dominate the response time of the syndrome block, as shown in Fig. 4. For simplicity, the figure represents the response of only one of the 24 syndrome output lines. These waveforms were obtained by the use of a single optical pixel input with sufficient power to ensure that the detector integration time was significantly shorter than the 200-ns syndrome response time predicted by MAGIC, the VLSI design tool used in this research [14]. The figure shows a 250-ns delay between an applied optical input pulse and a syndrome output transition. This delay represents a potential bottleneck for the parallel decoder for high page rate processing. The highly connective nature of the syndrome block suggests the feasibility of using optical matrix–vector multipliers that could remove this potential bottleneck and also reduce the VLSI area requirements of the decoder. Such an architecture is presented in Section 4.
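As an illustration of the syndrome computation $\mathbf{s} = \mathbf{H}\mathbf{r}$, the following Python sketch builds a parity check matrix for a (15, 9) RS code over GF($2^4$) and evaluates the syndrome of a valid and of a corrupted code word. Several choices are assumptions made only for this sketch and are not taken from the decoder chip: the field is generated by $x^4 + x + 1$, the code roots are $\alpha^1, \ldots, \alpha^6$, the code word is formed nonsystematically as message times generator polynomial, and all names (`gf_mul`, `poly_mul`, `syndrome`, `H`) are hypothetical.

```python
# Sketch only. GF(2^4) from x^4 + x + 1 (assumed); bit i of an int is the
# coefficient of x^i, and alpha is the element x.
M, N, K = 4, 15, 9
T = (N - K) // 2           # t = n(1 - k/n)/2 correctable symbols
BURST = M * (T - 1) + 1    # m(t - 1) + 1 correctable consecutive bits

# Antilog (EXP) and log (LOG) tables for powers of alpha.
EXP, LOG = [0] * (2 * N), [0] * (N + 1)
elem = 1
for i in range(N):
    EXP[i], LOG[elem] = elem, i
    elem <<= 1
    if elem & 0b10000:     # reduce using x^4 = x + 1
        elem ^= 0b10011
for i in range(N, 2 * N):
    EXP[i] = EXP[i - N]

def gf_mul(a, b):
    """Multiply two GF(16) elements; addition in the field is XOR."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]

def poly_mul(p, q):
    """Multiply polynomials over GF(16), lowest-order coefficient first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] ^= gf_mul(pi, qj)
    return out

# Parity check matrix: row j (j = 1..2t) holds alpha^(j*i) for i = 0..n-1.
H = [[EXP[(j * i) % N] for i in range(N)] for j in range(1, 2 * T + 1)]

def syndrome(r):
    """s = H r over GF(16): six symbols from fifteen received symbols."""
    s = []
    for row in H:
        acc = 0
        for h, ri in zip(row, r):
            acc ^= gf_mul(h, ri)
        s.append(acc)
    return s

# Generator polynomial with roots alpha^1 .. alpha^6 (assumed convention).
g = [1]
for j in range(1, 2 * T + 1):
    g = poly_mul(g, [EXP[j], 1])          # factor (alpha^j + x)

message = [1, 2, 3, 4, 5, 6, 7, 8, 9]     # nine arbitrary 4-bit symbols
codeword = poly_mul(message, g)           # 15 symbols, nonsystematic

received = list(codeword)
received[4] ^= 0b0101                     # corrupt a single symbol

print("t =", T, "burst bits =", BURST)
print("syndrome of code word:", syndrome(codeword))
print("syndrome after error: ", syndrome(received))
```

A valid code word yields an all-zero syndrome, whereas the single corrupted symbol makes every syndrome symbol nonzero. The nested loops in `syndrome` touch every entry of `H`, which is the full connectivity that the text identifies as wire intensive in a parallel electronic implementation.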

Fig. 3. Photograph of 1D parallel decoder chip with the 60-element photodiode array running vertically in the upper center portion of the chip. The syndrome block is in the upper right-hand portion of the figure.

Referring once again to Fig. 2(b), we see that following the syndrome block is the Euclidean algorithm module, which is used to decouple the error magnitude and location information represented by the syndrome vector into an error magnitude vector, $\boldsymbol{\omega}$, and an error location vector, $\boldsymbol{\sigma}$. For our parallel decoder the Euclidean module consists of two pipelined iterative arrays, $A_1$ and $A_2$, as shown in Fig. 5. Such an implementation was proposed by Mandelbaum [15]. Each array utilizes $2t$ identical rows for fully parallel decoder processing. Following each clock cycle, the results of each row are pipelined to a subsequent row until error magnitude vector $\boldsymbol{\omega}$ and error location vector $\boldsymbol{\sigma}$ become available at the outputs of $A_1$ and $A_2$, respectively. At any one time there are $2t$ received vectors being processed by this decoder in pipeline fashion, and the total decoding latency therefore is $2t$ clocks. The local connectivity within this architecture makes the Euclidean algorithm module amenable to VLSI implementation. Following the determination of $\boldsymbol{\omega}$ and $\boldsymbol{\sigma}$, the error correction block in Fig. 2(b) uses another matrix–vector multiply to determine the error vector, $\mathbf{e}$, which is then subtracted from the received vector, $\mathbf{r}$, to produce an estimate of the original error-free code-word vector. This estimate is guaranteed to be identical to the original code word if $t$ or fewer errors have occurred.

Fig. 4. Measured time response of syndrome output with single pixel input. The horizontal scaling is 500 ns/div.

Fig. 5. Euclidean arrays for computing the error-locator and error-magnitude vectors.

Fig. 6. Comparisons between serial and parallel RS decoding strategies. Results were derived with an equivalent parallelism and a 0.1-µm complementary metal-oxide semiconductor process: (a) VLSI area cost versus code rate, (b) decoding latency versus code rate.

Resource scaling laws were determined from our parallel decoder design and were used for comparison with a conventional serial approach. A natural comparison between parallel and serial decoding solutions is the VLSI area and decoding latency required for comparable parallelism, i.e., the number of input bits to the decoder(s). Figure 6(a) shows the VLSI area cost versus code rate for various block sizes for both the 1D parallel decoder and an equivalent serial decoder array. The number of serial decoders in an array is determined from the requirement of equivalent parallelism and is given by $n$, because each serial decoder operates on $m$ bits whereas the parallel decoder operates on $nm$ bits. Figure 6(a) indicates a substantial reduction in VLSI area required for the parallel decoder at typical code rates (i.e., higher than 0.8). This reduction in space resources is due to the sharing of decoder hardware. For the serial decoder array each decoder acts as an independent channel, thus eliminating the possibility of any common utilization of resources among decoders. This results in hardware redundancy that increases the required VLSI area as compared with the parallel case. For example, a parallel decoder with a typical code rate of 0.8745 and a block size of 255 requires five times less area than an equivalent serial decoder array. Figure 6(b) compares the latency in clock cycles of the two decoding strategies. This figure demonstrates that in all cases the parallel decoder offers reduced latency, because the parallel solution has utilized space resources to execute the decoding task.
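The sizing relations used in this section can be checked numerically. The Python sketch below evaluates them for the block size of 255 quoted above; the value $k = 223$ is an assumption inferred from the quoted code rate of 0.8745 and is not stated in the text, and the function name `rs_params` is hypothetical. The last lines apply the $B_p/B_s$ estimate with the order-of-magnitude rates given in the Introduction.

```python
def rs_params(n, k):
    """Return (m, t) for an (n, k) RS code with n = 2**m - 1."""
    m = (n + 1).bit_length() - 1    # m = log2(n + 1)
    t = (n - k) // 2                # t = n(1 - k/n)/2
    return m, t

n, k = 255, 223                     # k assumed from code rate 0.8745
m, t = rs_params(n, k)

code_rate = k / n
parallel_bits = n * m               # bits presented to one parallel decoder
serial_decoders = parallel_bits // m  # serial decoders for equal parallelism (= n)
pipeline_clocks = 2 * t             # rows per Euclidean array, hence latency in clocks

# Serial decoders needed to reach the system rate: B_p / B_s.
B_p = 10**12                        # required aggregate rate, bps
B_s = 10**8                         # rate of one serial decoder, bps
serial_for_rate = B_p // B_s

print(f"m = {m}, t = {t}, code rate = {code_rate:.4f}")
print(f"parallel decoder input: {parallel_bits} bits")
print(f"equivalent serial array: {serial_decoders} decoders")
print(f"parallel decoder latency: {pipeline_clocks} clocks")
print(f"serial decoders for 1e12 bps: {serial_for_rate}")
```

The sketch covers only counts of bits, decoders, and clock cycles. The VLSI area comparison of Fig. 6(a) depends on the layout-derived scaling laws, which are not reproduced here.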

3. Two-Dimensional Decoding Based on One-Dimensional Decoders

A 2D decoding extension (i.e., page access decoding) that uses our 1D parallel decoder can be realized by the formation of an array of 1D decoders, as shown in Fig. 7(a). Such a configuration is natural because of the 1D algebraic structure of block codes. Specifically, the position of any symbol in a code word can be described as a function of a single variable. We observe that the data associated with a code word are measured along the $x$ axis while the data processing for each decoder occurs along the $y$ axis, forming a planar decoder implementation. A page access decoding solution results when we extend an array of 1D decoders along the $z$ axis, forming a three-dimensional hardware implementation. We could also form a 2D array of serial decoders for page access decoding, as shown in Fig. 7(b). We now compare these two strategies in the presence of fixed decoding resources such as VLSI area and electrical power density.

Fig. 7. 2D decoding extension, using (a) 1D parallel decoder array, (b) array of conventional serial decoders.

Fixing the total area (and process size) of a VLSI decoder implementation will limit the number of decoders that can be fabricated and will therefore provide an upper limit on the parallelism achievable. The number of decoders per array is determined when the area per decoder found in Fig. 6(a) is divided into the total available VLSI area. Parallelism is then defined as the number of decoders per array multiplied by the number of bits per decoder. For a fixed VLSI area of 10 cm$^2$, a 0.1-µm VLSI process, and input and output bit-error rates (BERs) of $10^{-4}$ and $10^{-12}$, respectively, a comparison of parallelism is given in Fig. 8(a) for the parallel and serial decoder arrays at various block sizes. We observe a substantial increase in parallelism for the parallel decoder array for all block sizes. This is consistent with the results of Fig. 6(a), because the area per decoder is seen to be reduced in the parallel implementation. A peak in parallelism is noted at small block sizes, which implies that the area per decoder is scaling faster than the number of inputs per decoder for increases in block size beyond this peak. This result suggests that small block codes are preferred from the perspective of maximizing decoding parallelism.

In addition to decoder parallelism, the maximum decoding clock rate (equivalent to page rate for a 2D data format) is another important implementational characteristic. Decoder power dissipation is assumed to be dominated by the constituent metal-oxide semiconductor gate capacitances and can therefore be computed from knowledge of this capacitance, the gate count per unit area (which is determined by block size and code rate), the supply voltage, and the decoder page rate. Alternatively, we can take page rate to be limited by the maximum tolerable decoder power density. For a fixed power density of 1.0 W/cm$^2$, page rate versus block size for the two decoding architectures is given in Fig. 8(b). From this figure we observe that the page rates achievable for the two strategies are approximately the same. The effective information rate ($I$) of a decoding architecture is given as the product of page rate ($f$), parallelism ($N$), and code rate ($r$): $I = Nfr$. The information rate versus block size for the two decoding strategies considered here is shown in Fig. 8(c) and indicates once again the superior performance of the parallel decoder array for all block sizes. Corresponding to the previously seen peak in parallelism, a peak in information rate is observed for small block sizes. This again suggests that small block sizes are optimal; however, it should be noted that small block codes are less efficient in utilizing memory resources, as indicated in Fig. 9. For a fixed input–output BER performance, a decrease in block size results in a decrease in code rate, corresponding to a reduction in memory storage efficiency. Therefore, large block codes are desirable in maximizing memory efficiencies and at the same time provide an increase in the burst error-correction capability of the system.
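The relation $I = Nfr$ can be exercised with numbers that appear in this paper. In the Python sketch below, the first call uses the order-of-magnitude page parallelism and page rate from the Introduction with the code rate set to 1 purely for illustration. The second part takes the page size, page rate, and information rate quoted elsewhere in the paper for the hybrid decoder array at a block size of 511 and solves for the code rate they imply; that implied code rate is derived here and is not a figure stated in the paper. The function name is hypothetical.

```python
def information_rate(parallelism_bits, page_rate_hz, code_rate):
    """I = N * f * r, in information bits per second."""
    return parallelism_bits * page_rate_hz * code_rate

# Order-of-magnitude case: 1e5 bits per page at 10 MHz, r = 1 for illustration.
raw_rate = information_rate(10**5, 10e6, 1.0)

# Quoted hybrid-array figures at block size 511.
N_hybrid = 700 * 700        # pixels per page
f_hybrid = 12e6             # page rate, Hz
I_hybrid = 5e12             # information rate, bps

implied_r = I_hybrid / (N_hybrid * f_hybrid)

print(f"raw rate: {raw_rate:.1e} bps")
print(f"implied code rate: {implied_r:.2f}")
```

Because $I$ is a simple product, a loss in any one factor (for example, the lower code rate of a small block code) must be offset by parallelism or page rate to hold the information rate constant.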

Fig. 8. Comparisons between serial and parallel RS decoding strategies for 2D page access. Results were derived with a total VLSI area of 10 cm$^2$ and a 0.1-µm process size: (a) parallelism versus block size, (b) page rate versus block size with a maximum power density of 1 W/cm$^2$, (c) information rate versus block size.


To improve the information rate achievable in our parallel decoder for large block sizes, it is necessary to increase parallelism. This requires that the number of inputs per decoder scale faster than the area per decoder at large block sizes. Figure 10 shows the area scaling relationships among the functional modules of our parallel RS decoder for various block sizes. We observe that whenever the area of the syndrome and error-correction modules dominates, the area of the decoder scales faster than the number of inputs per decoder, resulting in an overall decrease in parallelism. If the electronic syndrome and error-correction modules are replaced with an equivalent off-chip optical solution, the area of the decoder (i.e., the Euclidean algorithm alone) scales more slowly than the number of inputs per decoder, resulting in an increase in parallelism for large block codes. We have therefore investigated moving these two highly connected functional units out of the VLSI domain and into the optical domain.

4. Two-Dimensional Optoelectronic Hybrid Decoding

A hybrid decoding solution results when the syndrome and error-correction matrix–vector multiplies are implemented in optics, as shown in Fig. 11. This figure shows a volume memory that is accessed through an optically addressed spatial light modulator (SLM), used to interface the volume memory with an array of hybrid decoders. Only one decoding channel (i.e., decoder) is shown in the figure for simplicity. Each input channel comprises a column of $nm$ pixels, where each pixel is anamorphically imaged onto a row of the $nm$ by $2t(2m - 1)$ $\mathbf{H}$ matrix. The syndrome vector is formed by integration along each column of $\mathbf{H}$. The resulting vector is then applied to the detector array that comprises the input to the Euclidean algorithm module. After electronic computation of error magnitude vector $\boldsymbol{\omega}$ and error location vector $\boldsymbol{\sigma}$, the $2.5mt$ pixels representing $\boldsymbol{\omega}$ and $\boldsymbol{\sigma}$ are optically applied to error correction matrix $\mathbf{M}$. This optical output from the Euclidean processing chip requires that some modulator technology be integrated along with the VLSI computing circuitry. Any number of high-speed modulators can be either directly integrated or flip-chip bonded with the Si-based Euclidean processing hardware, and candidate modulator technologies include GaAs SEED devices and PLZT silicon [16,17]. Each output pixel from the Euclidean array is anamorphically imaged onto a row of $\mathbf{M}$, and the output is formed by integration once again along the $n(2m - 1)$ columns. The output of this computation is an analog representation of the error vector and is applied to a final circuit stage to be subtracted from the received vector. There are two important characteristics of the $\mathbf{H}$ and $\mathbf{M}$ matrices shown in Fig. 11. First, these matrices are binary valued and are determined only by code characteristics such as the block size and code rate, and are therefore fixed for any given code. This permits the physical masks representing these matrices to be generated through the use of techniques that support submicrometer spatial tolerances along with high contrast. Second, each matrix consists of finite-field elements. This results in the need to develop an optical technique that facilitates multiplication over finite fields.

Fig. 9. Code rate versus block size for an input BER of $10^{-4}$ and an output BER of $10^{-12}$.

Fig. 10. Area relationship among functional modules of a RS 1D parallel decoder as a function of block size.

Fig. 11. Schematic of hybrid RS decoder that uses optical matrix–vector multiplies.

When computing the product of two $m$-bit finite-field elements, we find it useful to represent each field element as a polynomial of order $m - 1$ with binary coefficients. For the case in which $m = 4$, two third-order polynomials are multiplied to form a sixth-order polynomial. In order to produce a result within the field, the higher-order terms (i.e., higher than three) are mapped back to lower-order terms through a relationship defined by the irreducible polynomial. For example, if we let $a = [1111]$, $b = [1011]$, and $x^4 = x + 1$ be the irreducible polynomial, then the product of $a$ and $b$ is given as

$$\begin{aligned}
(1 + x + x^2 + x^3)(1 + x^2 + x^3) &= 1 + x + x^3 + x^6 \\
&= 1 + x + x^3 + (x^2 + x^3) \\
&= 1 + x + x^2.
\end{aligned}$$
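The worked product above can be reproduced with a short Python sketch. It splits the multiplication into three stages: integer sums of partial products, reduction of those sums modulo 2, and folding of the terms above $x^3$ back into the field using $x^4 = x + 1$. Bit vectors are written lowest order first, matching $a = [1111]$ and $b = [1011]$. The function and variable names are hypothetical, and the expanded-operand matrix is simply one convenient way to organize the partial products.

```python
M = 4  # bits per field element

# Folding rules that follow from x^4 = x + 1:
# x^4 -> 1 + x, x^5 -> x + x^2, x^6 -> x^2 + x^3.
REDUCTION = {4: (0, 1), 5: (1, 2), 6: (2, 3)}

def expand(b):
    """Build the (2m - 1) x m matrix whose column j is b shifted down j rows."""
    return [[b[i - j] if 0 <= i - j < M else 0 for j in range(M)]
            for i in range(2 * M - 1)]

def gf16_mul(a, b):
    """Return (intermediate product bits, final field element bits)."""
    B = expand(b)
    # Stage 1: integer row sums (count of partial products per power of x).
    sums = [sum(B[i][j] * a[j] for j in range(M)) for i in range(2 * M - 1)]
    # Stage 2: reduce each sum modulo 2 to get binary coefficients.
    bits = [s % 2 for s in sums]
    # Stage 3: XOR each high-order term into the low-order terms.
    low = bits[:M]
    for degree in range(M, 2 * M - 1):
        if bits[degree]:
            for target in REDUCTION[degree]:
                low[target] ^= 1
    return bits, low

a = [1, 1, 1, 1]            # 1 + x + x^2 + x^3
b = [1, 0, 1, 1]            # 1 + x^2 + x^3
intermediate, product = gf16_mul(a, b)
print("intermediate:", intermediate)   # coefficients of x^0 .. x^6
print("product:     ", product)        # coefficients of x^0 .. x^3
```

For these operands the intermediate bits are `[1, 1, 0, 1, 0, 0, 1]`, i.e., $1 + x + x^3 + x^6$, and the folded result is `[1, 1, 1, 0]`, i.e., $1 + x + x^2$, in agreement with the derivation.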

This finite-field multiplication can be accomplished optically by the representation of one operand by its $m$-bit binary transmittance and expansion of the remaining operand in two dimensions, as shown in Fig. 12. Operand B is represented by the $(2m - 1)$-bit by $m$-bit matrix, where $m = 4$. Each pixel of operand A is anamorphically imaged along a column of the expanded operand B. Summation is applied across each row of the expanded operand, producing an analog result. The finite-field product is therefore seen to require a simple optical matrix–vector product. A MOD 2 operation is then applied to the $2m - 2$ analog output terms to form a binary representation of the intermediate product. Each of the higher-order terms is then electrically mapped back to the lower-order terms by the use of simple XOR circuitry, producing the final finite-field product. The required MOD 2 operation can be performed optically by the use of simple nonlinear etalon optical switches or electrically by the use of resonant tunneling devices [18–21].

Fig. 12. Optical implementation of a four-bit, finite-field, symbolwise multiplication.

A comparison of information rates for the serial, parallel, and optoelectronic hybrid decoder arrays is given in Fig. 13(a). The figure indicates a substantial increase in the information rate at large block sizes for the hybrid decoder. This increase is due to the reduction of VLSI decoder area that was realized when the area-intensive syndrome and error-correction modules were moved off chip. A reduction in area per decoder permits more decoders to be placed within a fixed VLSI area (e.g., 10 cm$^2$) and thus results in an increase in parallelism, thereby increasing the information rate. For a block size of 511, the information rate for a hybrid decoder array is $5 \times 10^{12}$ bps, as compared with $2 \times 10^{11}$ bps for the parallel decoder array and $4 \times 10^{10}$ bps for the serial decoder array. The $5 \times 10^{12}$ bps information rate corresponds to a page rate of 12 MHz and a parallelism of $700 \times 700$ pixels.

Fig. 13. Comparison among serial, parallel, and hybrid RS decoding strategies for 2D page access: (a) information rate versus block size, (b) hybrid decoder optical power requirements.

The additional cost associated with operating a hybrid optoelectronic decoder can be measured in terms of optical power, as shown in Fig. 13(b). This figure indicates the minimum optical power required in the hybrid architecture to achieve the information rates shown in Fig. 13(a). The total optical power required of the hybrid system is found by first determining the minimum optical power required per detector on both the Euclidean chip and output array. This minimum power per detector is estimated from a consideration of the signal-to-noise ratio (SNR). For example, an SNR of 10 is required for an output BER of $10^{-12}$, assuming zero-mean Gaussian noise, which is characteristic of a Johnson noise-limited detection scheme. The minimum power per detector is then given by

$$C P_d = 2\left(\pi k T_a C_d\,\mathrm{SNR}\right)^{1/2}(B/R), \tag{1}$$

where $C$ is the received signal contrast, $C = 1 - (\text{low level}/\text{high level})$, $k$ is Boltzmann's constant, $T_a$ is the effective receiver input noise temperature [22], $C_d$ is the detector capacitance, $B$ is the bandwidth, and $R$ is the responsivity of the detector. The values for $T_a$ and $C_d$ are determined from the photodiode and supporting circuitry, whereas the bandwidth is determined by the maximum allowable electrical power density as shown in Fig. 8(b). Once the minimum required detector power is found, the total optical power required for the syndrome computation is given by

$$P_{\mathrm{syn}} = \frac{P_d\,2mnt(2m - 1)}{\zeta_{\mathrm{mask}}\,\zeta_{\mathrm{trans}}}. \tag{2}$$

The values for $n$, $m$, and $t$ are code specific, whereas $\zeta_{\mathrm{mask}}$ refers to the effective transparent area of the $\mathbf{H}$ mask as a fraction of the total mask area and $\zeta_{\mathrm{trans}}$ refers to the transmission efficiency of the on regions of the mask. A similar approach is used to determine the required optical power for the error-correction computation (i.e., the $\mathbf{M}$ mask) and is given by

$$P_e = \frac{P_d\,2.5mnt(2m - 1)}{\zeta_{\mathrm{mask}}\,\zeta_{\mathrm{trans}}\,\zeta_{\mathrm{pads}}}. \tag{3}$$

The $\zeta_{\mathrm{pads}}$ term refers to the throughput of the Euclidean algorithm output modulators. The parameter values for Eqs. (1)–(3) are given in Table 1. The total optical power required by the hybrid decoder given in Fig. 13(b) was found by summation of Eqs. (2) and (3). The figure indicates that less than 5 W of optical power is required for the block sizes shown, using these fairly conservative system assumptions.
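Equations (1)–(3) are straightforward to evaluate once every parameter is fixed. The Python sketch below does so with the Table 1 values; it reads Eq. (1) as $C P_d = 2(\pi k T_a C_d\,\mathrm{SNR})^{1/2}(B/R)$, which is how the equation is reconstructed in this text. The contrast, bandwidth, and code parameters are placeholders chosen only so that the example runs (unit contrast, the 12-MHz page rate quoted for the hybrid array, and the (15, 9) code of the demonstrated chip); they are not the values behind Fig. 13(b), and the result is per decoder channel rather than for a full array. Function names are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def detector_power(contrast, t_a, c_d, snr, bandwidth, responsivity):
    """Eq. (1) solved for P_d, in watts."""
    return (2.0 * math.sqrt(math.pi * K_B * t_a * c_d * snr)
            * bandwidth / (responsivity * contrast))

def syndrome_power(p_d, n, m, t, z_mask, z_trans):
    """Eq. (2): optical power for the syndrome (H mask) computation."""
    return p_d * 2 * m * n * t * (2 * m - 1) / (z_mask * z_trans)

def correction_power(p_d, n, m, t, z_mask, z_trans, z_pads):
    """Eq. (3): optical power for the error-correction (M mask) computation."""
    return p_d * 2.5 * m * n * t * (2 * m - 1) / (z_mask * z_trans * z_pads)

# Table 1 values.
T_A, C_D, SNR, R = 1500.0, 10e-15, 10.0, 0.3
Z_MASK, Z_TRANS, Z_PADS = 0.5, 0.9, 0.9

# Placeholders, NOT taken from the paper's power calculation.
CONTRAST = 1.0
BANDWIDTH = 12e6
N_SYM, M_BITS, T_ERR = 15, 4, 3

p_d = detector_power(CONTRAST, T_A, C_D, SNR, BANDWIDTH, R)
p_syn = syndrome_power(p_d, N_SYM, M_BITS, T_ERR, Z_MASK, Z_TRANS)
p_e = correction_power(p_d, N_SYM, M_BITS, T_ERR, Z_MASK, Z_TRANS, Z_PADS)

print(f"power per detector:        {p_d:.3e} W")
print(f"syndrome + correction sum: {p_syn + p_e:.3e} W per decoder")
```

Two scalings are visible directly in the functions: the per-detector power grows linearly with bandwidth, and for the same code the error-correction mask needs $1.25/\zeta_{\mathrm{pads}}$ times the optical power of the syndrome mask.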

Table 1. Parameters Used to Calculate Optical Power Requirements for Hybrid Optoelectronic Decoding

| $T_a$ (K) | $C_d$ (fF) | SNR | $R$ (A/W) | $\zeta_{\mathrm{mask}}$ | $\zeta_{\mathrm{trans}}$ | $\zeta_{\mathrm{pads}}$ |
|---|---|---|---|---|---|---|
| 1500 | 10 | 10 | 0.3 | 0.5 | 0.9 | 0.9 |

8190 APPLIED OPTICS / Vol. 34, No. 35 / 10 December 1995

5. Conclusions

Here we discussed the design and experimental results of a parallel RS decoder for use in 1D optical access memories. Our decoder operates on 60-bit optical input words and has been demonstrated at a data rate of 300 Mbps. It was shown that for an equivalent parallelism between our parallel decoder and an array of serial decoders, the parallel decoder is less area intensive and exhibits lower latency. We have extended both the 1D parallel decoder and serial decoder architectures to 2D error correction by forming an appropriate array of decoders required for page access memories. For fixed decoding resources such as VLSI area and power density, the parallel decoder array offers a higher degree of parallelism than the serial array for all block sizes. A peak in parallelism was observed at small block sizes for the parallel decoder array. As a way to improve parallelism for large block sizes, the area intensive, highly connected finite-field matrix–vector multiplies were moved off chip and realized in optics. The resulting optoelectronic hybrid decoder array offers increased information rate performance for all block sizes as compared with the parallel and serial decoder arrays. Finally, an example 2D hybrid decoder based on a 0.1-µm process and 10-cm² VLSI chip area may provide error correction at an information rate of $5 \times 10^{12}$ bps, corresponding to a page rate of 12 MHz and a page size of 700 × 700 pixels.
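The figures in the closing example can be cross-checked with a little arithmetic. The Python sketch below multiplies the page size by the page rate to get a raw pixel rate and compares it with the quoted information rate. It assumes one bit per pixel, which the text does not state, and the resulting fraction is only an implied figure rather than a code rate reported in the paper.

```python
# Back-of-the-envelope check of the closing 2D hybrid decoder example.
# Assumption (not stated in the text): each page pixel carries one bit.

PAGE_SIDE_PIXELS = 700   # page is 700 x 700 pixels
PAGE_RATE_HZ = 12e6      # 12-MHz page rate
INFO_RATE_BPS = 5e12     # quoted information rate


def raw_rate_bps(side_pixels: int, page_rate_hz: float) -> float:
    """Raw channel rate in bits/s for a square page read at page_rate_hz."""
    return side_pixels * side_pixels * page_rate_hz


raw = raw_rate_bps(PAGE_SIDE_PIXELS, PAGE_RATE_HZ)
implied_fraction = INFO_RATE_BPS / raw

print(f"raw rate: {raw:.3e} bits/s")
print(f"information / raw: {implied_fraction:.2f}")
```

Under the one-bit-per-pixel assumption the raw rate is about $5.88 \times 10^{12}$ bits/s, so the quoted information rate is roughly 85% of it, which would be consistent with part of each page being spent on parity.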

References

1. D. Psaltis, “Parallel optical memories,” Byte 17, 179–182 (1992).

2. L. Hesselink and M. C. Bashaw, “Optical memories implemented with photorefractive media,” Opt. Quantum Electron. 25, S611–S661 (1993).

3. F. H. Mok, “Angle-multiplexed storage of 5000 holograms in lithium niobate,” Opt. Lett. 18, 915–917 (1993).

4. K. Blotekjaer, “Limitations on holographic storage capacity in photochromic and photorefractive media,” Appl. Opt. 18, 57–67 (1979).

5. D. Brady and D. Psaltis, “Control of volume holograms,” J. Opt. Soc. Am. A 9, 1167–1170 (1992).

6. C. Gu, J. Hong, I. McMichael, R. Saxena, and F. Mok, “Crosstalk limited storage capacity of volume holographic memory,” J. Opt. Soc. Am. A 9, 1978–1983 (1992).

7. S. R. Whitaker, J. A. Canaris, and K. B. Cameron, “Reed–Solomon VLSI code for advanced television,” IEEE Trans. Circuits Syst. Video Technol. 1, 230–236 (1991).

8. Po Tong, “A 40-MHz encoder-decoder chip generated by a Reed–Solomon code compiler,” presented at the IEEE Custom Integrated Circuits Conference, Boston, Mass., 13–16 May 1990.

9. J. F. Heanue, M. C. Bashaw, and L. Hesselink, “Volume holographic storage and retrieval of digital data,” Science 265, 749–752 (1994).

10. S. A. Dombrovskii, “Effectiveness of using error-correcting codes in holographic storage systems,” Optoelectron. Instrumen. Data Process. 2, 58–62 (1989).

11. M. A. Neifeld and M. McDonald, “Error correction for increasing the usable capacity of photorefractive memories,” Opt. Lett. 19, 1483–1485 (1994).

12. S. W. Golomb, “Optical disk error correction,” Byte 11(5), 203–210 (1986).

13. M. A. Neifeld and J. D. Hayes, “Parallel error correction for optical memories,” Opt. Mem. Neural Net. 3, 87–98 (1994).

14. J. K. Ousterhout, G. T. Hamachi, R. N. Mayo, W. S. Scott, and G. S. Taylor, “MAGIC: A VLSI layout system,” in Proceedings of the 21st Design Automation Conference (Institute of Electrical and Electronics Engineers, New York, 1984), pp. 152–159.

15. D. M. Mandelbaum, “On iterative arrays for the Euclidean algorithm over finite field,” IEEE Trans. Comput. 38, 1473–1478 (1989).

16. A. L. Lentine and D. A. B. Miller, “Evolution of the SEED technology: bistable logic gates to optoelectronic smart pixels,” IEEE J. Quantum Electron. 29, 655–669 (1993).

17. S. H. Lee, S. C. Esener, C. Sakik, M. A. Title, and T. J. Drabik, “Two dimensional silicon/PLZT spatial light modulators: design considerations and technology,” Opt. Eng. 25, 250–260 (1986).

18. B. Acklin, R. Dandlinker, and N. Collings, “The nonlinear Fabry–Perot device: its parameters and their scaling with finesse,” Opt. Commun. 103, 490–498 (1993).

19. J. He and M. Cada, “Optical bistability in semiconductor periodic structures,” IEEE J. Quantum Electron. 27, 1182–1188 (1991).

20. H. H. Tsai, Y. K. Su, H. H. Lin, R. L. Wang, and T. L. Lee, “P-N double-quantum-well resonant interband tunneling diode with peak-to-valley current ratio of 144 at room temperature,” IEEE Electron Device Lett. 15, 357–359 (1994).

21. C. Y. Huang, J. E. Morris, Y. K. Su, and T. H. Kuo, “New method for modeling a multipeak resonant-tunneling diode,” Electron. Lett. 30, 1012–1013 (1994).

22. A. Yariv, Optical Electronics (Saunders, Philadelphia, Pa., 1991), Chap. 11, pp. 430–432.
