Design, Implementation and Evaluation of a Low Redundant Error Correction Code
Keywords:
Error Correction Code, Fault-Tolerant Systems, Low Redundancy, Multiple Cell Upsets, Reliability, Single Cell Upsets
Abstract
The continuous increase in the integration scale of CMOS technology has led to a rise in the fault rate. In particular, computer memory is affected by Single Cell Upsets (SCU) and Multiple Cell Upsets (MCU). A common method to tolerate errors in memory is the use of Error Correction Codes (ECC). Adding an ECC introduces several overheads: the silicon area, power consumption and delay of the encoding and decoding circuits, as well as the extra bits needed to detect and/or correct errors. An ECC can be designed with different goals in mind: low redundancy, low delay, error coverage, etc. The aim of this paper is to study the overheads produced when an ECC is added to a microprocessor. New ECCs with different characteristics are continuously proposed. However, many of these proposals present only the ECC itself, without showing how it behaves when used in a microprocessor. In this work, we present the design of an ECC whose main characteristic is a low number of code bits (low redundancy). Then, we study the overhead this ECC introduces. First, we study the silicon area, delay and power consumption of the encoder and decoder circuits; second, we show how the addition of this ECC affects a RISC microprocessor.
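To make the notion of redundancy concrete, the following minimal sketch uses the classic Hamming single-error-correcting code as an illustration. It is not the ECC proposed in this paper; the function names (`min_parity_bits`, `hamming74_encode`, `hamming74_decode`) are hypothetical and chosen only for this example. It shows how many extra bits a single-error-correcting code needs for a given data width, and how a Hamming(7,4) codeword corrects one flipped bit.

```python
from typing import List, Tuple


def min_parity_bits(k: int) -> int:
    """Smallest r with 2**r >= k + r + 1, i.e. the number of check bits
    a Hamming single-error-correcting code needs for k data bits."""
    r = 1
    while 2 ** r < k + r + 1:
        r += 1
    return r


def hamming74_encode(data: List[int]) -> List[int]:
    """Encode 4 data bits into a 7-bit Hamming codeword.
    Codeword positions 1..7 hold: p1 p2 d1 p4 d2 d3 d4."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over positions 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(word: List[int]) -> Tuple[List[int], int]:
    """Correct at most one flipped bit and return (data bits, syndrome).
    A non-zero syndrome is the 1-based position of the corrected bit."""
    w = list(word)
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s4 = w[3] ^ w[4] ^ w[5] ^ w[6]
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        w[syndrome - 1] ^= 1  # flip the erroneous bit back
    return [w[2], w[4], w[5], w[6]], syndrome


if __name__ == "__main__":
    # Redundancy (check bits per data bit) for common word sizes.
    for k in (8, 16, 32, 64):
        r = min_parity_bits(k)
        print(f"k={k:2d} data bits -> r={r} check bits, redundancy={r / k:.1%}")

    # Inject a single cell upset and recover the original data.
    data = [1, 0, 1, 1]
    codeword = hamming74_encode(data)
    codeword[4] ^= 1  # upset at codeword position 5
    recovered, position = hamming74_decode(codeword)
    print(recovered == data, position)
```

The relative cost of the check bits falls as the data word grows, which is one reason redundancy is treated as a design parameter alongside encoder and decoder area, delay and power. Codes that also tolerate multiple cell upsets generally need more check bits than this single-error-correcting baseline.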