research-article
Open access

SoftSNN: low-cost fault tolerance for spiking neural network accelerators under soft errors

Published: 23 August 2022
  • Abstract

    Specialized hardware accelerators have been designed and employed to maximize the performance efficiency of Spiking Neural Networks (SNNs). However, such accelerators are vulnerable to transient faults (i.e., soft errors), which occur due to high-energy particle strikes and manifest as bit flips at the hardware layer. These errors can change the weight values and neuron operations in the compute engine of SNN accelerators, thereby leading to incorrect outputs and accuracy degradation. The impact of soft errors in the compute engine, and the respective mitigation techniques, have not yet been thoroughly studied for SNNs. A potential solution is to employ redundant executions (re-execution) to ensure correct outputs, but this incurs large latency and energy overheads. To address this, we propose SoftSNN, a novel methodology that mitigates soft errors in the weight registers (synapses) and neurons of SNN accelerators without re-execution, thereby maintaining accuracy with low latency and energy overheads. Our SoftSNN methodology employs the following key steps: (1) analyzing the SNN characteristics under soft errors to identify faulty weights and neuron operations, which are required for recognizing faulty SNN behavior; (2) applying a Bound-and-Protect technique that leverages this analysis to improve the SNN fault tolerance by bounding the weight values and protecting the neurons from faulty operations; and (3) devising lightweight hardware enhancements for the neural hardware accelerator to efficiently support the proposed technique. The experimental results show that, for a 900-neuron network, even at a high fault rate, SoftSNN keeps the accuracy degradation below 3% while reducing latency and energy by up to 3x and 2.3x, respectively, compared to the re-execution technique.
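
    The weight-bounding idea in step (2) can be illustrated with a small sketch. It is not the authors' implementation: the 8-bit unsigned register width, the per-bit fault model, the choice of the fault-free maximum as the bound, and the names `inject_bit_flips`, `bound_weights`, `w_max`, and `w_safe` are all assumptions made for illustration. Neuron protection is not covered here.

```python
import random

WEIGHT_BITS = 8  # assumed register width; the abstract does not state it


def inject_bit_flips(weights, fault_rate, rng):
    """Flip each bit of each weight independently with probability fault_rate
    (a simple, assumed soft-error model)."""
    faulty = []
    for w in weights:
        for bit in range(WEIGHT_BITS):
            if rng.random() < fault_rate:
                w ^= 1 << bit  # a soft error toggles this register bit
        faulty.append(w)
    return faulty


def bound_weights(weights, w_max, w_safe):
    """Replace any weight above the assumed fault-free maximum w_max with w_safe."""
    return [w_safe if w > w_max else w for w in weights]


rng = random.Random(0)
clean = [rng.randint(0, 100) for _ in range(1000)]  # hypothetical trained weights
w_max = max(clean)  # bound taken from the fault-free weights (assumption)
faulty = inject_bit_flips(clean, fault_rate=0.01, rng=rng)
bounded = bound_weights(faulty, w_max=w_max, w_safe=w_max)
```

    Bounding only catches flips that push a weight above the expected range; flips that keep a weight within range pass through unchanged, which is one reason a sketch like this says nothing about the accuracy figures reported above.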

    Published In

    DAC '22: Proceedings of the 59th ACM/IEEE Design Automation Conference
    July 2022
    1462 pages
    ISBN:9781450391429
    DOI:10.1145/3489517
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Research-article

    Funding Sources

    • Indonesia Endowment Fund for Education (LPDP) Graduate Scholarship
    • NYUAD Center for CyberSecurity (CCS), funded by Tamkeen
    • NYUAD Center for Interacting Urban Networks (CITIES), funded by Tamkeen

    Conference

    DAC '22: 59th ACM/IEEE Design Automation Conference
    July 10 - 14, 2022
    San Francisco, California

    Acceptance Rates

    Overall Acceptance Rate 1,770 of 5,499 submissions, 32%
