Citation:
Gao, Z., Zhang, L., Cheng, Y., Guo, K., Ullah, A. & Reviriego, P. (2021). Design of FPGA-Implemented Reed–Solomon Erasure Code (RS-EC) Decoders With Fault Detection and Location on User Memory. IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 29(6), pp. 1073–1082.
Funder:
Ministerio de Economía y Competitividad (España)
Sponsor:
This work was supported in part by the National Natural Science Foundation of China under Grant 61501321, in part by the China Postdoctoral Science Foundation and Luoyang Newvid Technology Company, Ltd., and in part by the ACHILLES Project PID2019-104207RB-I00 funded by the Spanish Ministry of Science and Innovation.
Reed–Solomon erasure codes (RS-ECs) are widely used in packet communication and storage systems to recover erasures. When the RS-EC decoder is implemented on a field-programmable gate array (FPGA) in a space platform, it will suffer single-event upsets (SEUs) that can cause failures. In this article, the reliability of an RS-EC decoder implemented on an FPGA when there are errors in the user memory is first studied. Then, a fault detection and location scheme is proposed based on partial reencoding for the faults in the user memory of the RS-EC decoder. Furthermore, check bits are added in the generator matrix to improve the fault location performance. The theoretical analysis shows that the scheme could detect most faults with small missed-detection and false-detection probabilities. Experimental results on a case study show that more than 90% of the faults on user memory could be tolerated by the decoder, and all the other faults can be detected by the fault detection scheme when the number of erasures is smaller than the correction capability of the code. Although false alarms exist (with probability smaller than 4%), they can be used to avoid fault accumulation. Finally, the fault location scheme could accurately locate all the faults. The theoretical estimates are very close to the experimental results, which verifies the correctness of the analysis.
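The idea of detecting decoder faults by partial reencoding can be illustrated with a minimal software sketch. It is not the authors' FPGA design. It assumes a Vandermonde-style code over the prime field GF(257) (hardware decoders typically work in GF(2^m)), and all names (`encode_symbol`, `decode`, `check_by_partial_reencoding`) are hypothetical. The decoder recovers the data from k surviving symbols, then reencodes only one spare received symbol and compares it. This is possible only when fewer symbols are erased than the code can correct, which matches the condition stated in the abstract.

```python
P = 257  # prime modulus; an assumption made for simplicity


def encode_symbol(data, x):
    """Evaluate the data polynomial at x (one row of a Vandermonde generator matrix)."""
    acc = 0
    for coeff in reversed(data):
        acc = (acc * x + coeff) % P
    return acc


def encode(data, n):
    """Produce n code symbols at evaluation points 1..n."""
    return [encode_symbol(data, x) for x in range(1, n + 1)]


def solve_mod(A, b):
    """Solve A*x = b over GF(P) by Gauss-Jordan elimination."""
    k = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(k):
        pivot = next(r for r in range(col, k) if M[r][col] % P)
        M[col], M[pivot] = M[pivot], M[col]
        inv = pow(M[col][col], -1, P)  # modular inverse (Python 3.8+)
        M[col] = [v * inv % P for v in M[col]]
        for r in range(k):
            if r != col and M[r][col]:
                f = M[r][col]
                M[r] = [(a - f * c) % P for a, c in zip(M[r], M[col])]
    return [M[r][k] for r in range(k)]


def decode(received, k):
    """Recover k data symbols from a dict {point: symbol} of surviving symbols."""
    points = sorted(received)[:k]
    A = [[pow(x, j, P) for j in range(k)] for x in points]
    b = [received[x] for x in points]
    return solve_mod(A, b), points


def check_by_partial_reencoding(decoded, received, used_points):
    """Reencode one spare symbol and compare it with what was received."""
    spare = [x for x in sorted(received) if x not in used_points]
    if not spare:
        raise ValueError("no spare symbol: erasures equal the correction capability")
    x = spare[0]
    return encode_symbol(decoded, x) == received[x]


# Demo: k = 4 data symbols, n = 6 code symbols, one erasure (point 2).
data = [10, 20, 30, 40]
codeword = encode(data, 6)
received = {x: s for x, s in zip(range(1, 7), codeword) if x != 2}
decoded, used = decode(received, 4)
fault_free = check_by_partial_reencoding(decoded, received, used)
```

In this sketch a corrupted decoder output changes the data polynomial, so the reencoded spare symbol usually no longer matches the received one. A mismatch can be missed when the corrupted polynomial happens to agree with the original at the spare point, which is one way to picture the small missed-detection probability mentioned in the abstract.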