Reliability Assessment Methodologies for ANN-based Systems
In recent decades, deep learning (DL)-based solutions have gained a great deal of interest in industry and academia due to their outstanding computational capabilities. Electronic devices running applications based on Artificial Neural Networks (ANNs) are spreading in several areas, including safety-critical applications such as self-driving cars, robots, and space applications. ANNs are often regarded as inherently robust and fault-tolerant, being brain-inspired and redundant computing models. However, to use them safely in human contexts, there is a compelling need to assess their reliability. Indeed, when they are deployed on resource-constrained hardware devices, single physical faults might jeopardize the activity of multiple neurons, leading to undesirable results. Since reliability assessment is a growing concern, many efforts have been made in recent decades to propose efficient approaches to assess the reliability of ANN-based systems. The intent of this article is to overview the main reliability assessment methodologies for ANN-based systems, focusing mainly on Fault Injection techniques used to evaluate ANN resilience at different abstraction levels
SCI-FI: a Smart, aCcurate and unIntrusive Fault-Injector for Deep Neural Networks
In recent years, the reliability of Deep Neural Networks (DNNs) has become the focus of an increasing number of research activities. In particular, researchers have focused on understanding how a DNN behaves when the underlying hardware is affected by a fault. This is a challenging task: slight changes in a network architecture can significantly impact how the network reacts to faults. There are several approaches to simulate the behaviour of a faulty network: the most accurate one is to perform low-level fault simulations. Nonetheless, this task is very time-consuming and costly to implement. Even though the injection time can be reduced by injecting faults at the application level, for sufficiently large networks this time is still very high, requiring weeks to complete a single simulation. This work aims at providing a fast and accurate solution for injecting software-level faults in a DNN that is independent of its architecture and does not require any modification to its structure. For this reason, this paper introduces SCI-FI, a Smart, aCcurate and unIntrusive Fault-Injector. SCI-FI smartly reduces the fault injection time required for a complete fault simulation of the network by taking advantage of two fundamental mechanisms: Fault Dropping and Delayed Start. Experimental results from various ResNet, DenseNet and EfficientNet architectures targeting the CIFAR-10 and ImageNet datasets show that combining these techniques reduces the simulation time by up to 70%
On the Detection of Always-On Hardware Trojans Supported by a Pre-Silicon Verification Methodology
Hardware-based vulnerabilities are becoming a serious threat in the Integrated Circuit (IC) industry. Current System-on-Chip (SoC) designs comprise many Intellectual Property (IP) blocks coming from third-party vendors. These vendors can maliciously insert additional hardware, commonly known as Hardware Trojans, aiming at degrading performance, altering functionality or even leaking secret information. According to their activation mechanism, Hardware Trojans are classified as triggered or always-on. While the detection approaches for the first class are widely explored even during the early stages of the IC design flow, the detection of the always-on type mainly relies on side-channel analyses, carried out after fabrication. This work presents a methodology to detect always-on Hardware Trojans during the pre-silicon design stage. The proposed approach is able to detect suspicious intrusions by exploiting a signature mechanism developed during the RTL verification phase. The activity of carefully selected signals is monitored to record and keep the state of the core. Finally, the efficacy of the technique has been validated on an open-source IP core with three different always-on Trojans
A Fast Reliability Analysis of Image Segmentation Neural Networks Exploiting Statistical Fault Injections
The reliability of hardware running deep neural networks (DNNs) is becoming the object of multiple research works. Fault injections (FIs) are one of the most used solutions to determine the reliability of DNN models. However, defining how many faults to inject in the model is not a trivial task. An exhaustive FI campaign on a modern DNN requires injecting faults into billions or trillions of parameters. On the other hand, random FI campaigns do not offer a practical measure of the accuracy of the result. A different approach is to perform a statistical FI: the number of faults to inject is decided based on the number of possible faults and by fixing an error margin and a confidence level on the measured output metric. While the statistical approach offers the best of both worlds, it requires a proper setup to guarantee its statistical significance. In this work, a study on the statistical fault injection procedure on an image segmentation neural network is proposed. In particular, the study compares results from a random FI campaign and an improperly defined statistical FI campaign, and shows how they fail at highlighting some of the critical aspects of U-Net, a state-of-the-art DNN used for image segmentation. The proposed approach, by injecting only 0.07% of all the possible faults, accurately measures the criticality of each layer and of each parameter bit with an error margin of 1% and a confidence level of 99%
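As a rough illustration of how a statistical FI campaign can be sized, the sketch below uses the finite-population sample-size formula that is common in the statistical fault injection literature. The abstract does not state which formula the authors use, so the formula itself, the function name `statistical_fi_sample_size`, and the conservative default `p = 0.5` are assumptions made here for illustration only.

```python
import math
from statistics import NormalDist


def statistical_fi_sample_size(population, error_margin=0.01, confidence=0.99, p=0.5):
    """Number of faults to inject out of `population` possible faults.

    Assumed formula (not taken from the abstract):
        n = N / (1 + e^2 * (N - 1) / (t^2 * p * (1 - p)))
    where N is the fault population, e the error margin, p the estimated
    failure probability, and t the two-sided normal quantile for the
    requested confidence level.
    """
    t = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = population / (
        1 + error_margin**2 * (population - 1) / (t**2 * p * (1 - p))
    )
    return math.ceil(n)


# Hypothetical fault population of 10**9 possible single-bit faults.
sample_size = statistical_fi_sample_size(10**9)
```

With `p = 0.5` the term `p * (1 - p)` is at its maximum, so the returned sample size is the largest the formula can give for a chosen margin and confidence level.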
On the resilience of representative and novel data formats in CNNs
In recent years, a wide range of data type representations have been employed for training and storing the parameters of Deep Neural Networks (DNNs). The decision to employ a particular data type over another is influenced by various requirements, including the desire to enhance training accuracy or reduce data size to minimize memory usage, energy and power consumption. However, opting for one data type over another inevitably impacts the reliability of the model. This work studies the impact of different data representations on the reliability of LeNet-5, a popular Convolutional Neural Network (CNN) used for image classification tasks. An investigation is performed to evaluate the efficacy of the Average Bit-Flip Distance (ABFD) in predicting the criticality of bit positions in the data representation. The data types under analysis are FP32, POSIT32, POSIT16 and INT8. Together with the widely adopted metrics, this work proposes a new metric, called Soft SDC-n, to measure the percentage of faults that cause a change in the order of the top-n output elements. Experimental results show that POSIT is not as reliable as FP32, while indicating that the most reliable data type is INT8. Furthermore, the same results confirm the presence of a relationship between the ABFD and the criticality of a bit in all the data representations under analysis
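The Soft SDC-n definition above can be read as a simple counting procedure. The following is a minimal sketch of that reading for a single input, not the authors' implementation: the function names, the tie-breaking rule (lower index first), and the choice to compare against one fault-free ("golden") output vector are assumptions.

```python
def top_n_order(scores, n):
    """Indices of the n largest scores, in descending score order.

    Ties are broken by the lower index (an assumption of this sketch).
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:n]


def soft_sdc_n(golden_output, faulty_outputs, n):
    """Percentage of faults whose output changes the top-n ordering.

    golden_output: output vector of the fault-free network for one input.
    faulty_outputs: one output vector per injected fault, same input.
    """
    golden_order = top_n_order(golden_output, n)
    changed = sum(
        1 for output in faulty_outputs if top_n_order(output, n) != golden_order
    )
    return 100.0 * changed / len(faulty_outputs)


# Hypothetical outputs of a 3-class network under four injected faults.
golden = [0.1, 0.7, 0.2]
faulty = [[0.1, 0.7, 0.2], [0.1, 0.6, 0.3], [0.3, 0.7, 0.2], [0.9, 0.05, 0.05]]
soft_sdc_2 = soft_sdc_n(golden, faulty, n=2)
```

Under this reading, a fault can count towards Soft SDC-2 even when the top-1 prediction is unchanged, as with the third faulty output in the example.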
Open-Set Recognition: an Inexpensive Strategy to Increase DNN Reliability
Deep Neural Networks (DNNs) are nowadays widely used in low-cost accelerators, characterized by limited computational resources. These models, and in particular DNNs for image classification, are becoming increasingly popular in safety-critical applications, where they are required to be highly reliable. Unfortunately, increasing DNN reliability without computational overheads, which might not be affordable in low-power devices, is a non-trivial task. Our intuition is to detect network executions affected by faults as outliers with respect to the distribution of the network's normal output. To this purpose, we propose to exploit Open-Set Recognition (OSR) techniques to perform Fault Detection in an extremely low-cost manner. In particular, we analyze the Maximum Logit Score (MLS), which is an established Open-Set Recognition technique, and compare it against other well-known OSR methods, namely OpenMax, energy-based out-of-distribution detection and ODIN. Our experiments, performed on a ResNet-20 classifier trained on CIFAR-10 and SVHN datasets, demonstrate that MLS guarantees satisfactory detection performance while adding a negligible computational overhead. Most remarkably, MLS is extremely convenient to configure and deploy, as it does not require any modification or re-training of the existing network. A discussion of the advantages and limitations of the analysed solutions concludes the paper
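To make the MLS idea concrete, the sketch below scores each inference by its largest logit and flags executions whose score falls below a threshold calibrated on fault-free runs. The one-sided test (low score means outlier, as in the usual OSR setting), the quantile-based calibration, and all function names are assumptions; the abstract does not describe how the threshold is chosen.

```python
def max_logit_score(logits):
    """MLS: the largest raw (pre-softmax) output of the classifier."""
    return max(logits)


def calibrate_threshold(fault_free_logits, false_alarm_rate=0.05):
    """Pick a threshold from fault-free runs (assumed calibration scheme).

    Roughly `false_alarm_rate` of the fault-free scores fall below it.
    """
    scores = sorted(max_logit_score(logits) for logits in fault_free_logits)
    index = int(false_alarm_rate * len(scores))
    return scores[index]


def is_suspicious(logits, threshold):
    """Flag an execution whose MLS is lower than the calibrated threshold."""
    return max_logit_score(logits) < threshold


# Hypothetical logits from four fault-free inferences of a 2-class network.
fault_free = [[5.0, 1.0], [6.0, 0.5], [4.0, 2.0], [7.0, 1.0]]
threshold = calibrate_threshold(fault_free, false_alarm_rate=0.25)
flagged = is_suspicious([1.0, 0.9], threshold)
```

The check needs only the logits that the network already produces, which is consistent with the abstract's point that no modification or re-training is required.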
Reliability of Deep Neural Networks: Impact and Open Issues
Nowadays, Deep Neural Networks (DNNs) are widely used in safety-critical fields such as automotive and healthcare, where their reliability is crucial due to their direct impact on human lives. Over the years, evaluating their resilience through software-level fault injection experiments has become a common research approach. The corruption of individual bits in the model’s parameters has been one of the most studied fault models in the last decade. This work introduces a methodology to evaluate the impact of permanent faults on DNN weights in image classification and object detection tasks, highlighting key ideas, main contributions, and the research’s impact over time
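A software-level injection of a permanent fault into a single weight can be sketched as follows. The stuck-at model, the FP32 encoding, and the helper names are assumptions for illustration; the abstract only states that permanent faults on DNN weights are considered.

```python
import struct


def float_to_bits(value):
    """Reinterpret a number as its 32-bit IEEE 754 single-precision pattern."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def bits_to_float(bits):
    """Reinterpret a 32-bit pattern as a single-precision float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def inject_stuck_at(weight, bit, stuck_value):
    """Force one bit of an FP32 weight to 0 or 1 (assumed stuck-at model).

    bit: position 0 (mantissa LSB) to 31 (sign bit).
    stuck_value: 0 for stuck-at-0, 1 for stuck-at-1.
    """
    bits = float_to_bits(weight)
    mask = 1 << bit
    faulty_bits = (bits | mask) if stuck_value else (bits & ~mask)
    return bits_to_float(faulty_bits)


# Bit 30 is the most significant exponent bit: stuck-at-1 turns 0.5 into 2**127.
faulty_weight = inject_stuck_at(0.5, bit=30, stuck_value=1)
```

A stuck-at fault changes the weight only when the stored bit differs from the stuck value, so some injected faults leave the weight unchanged.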
A Suitability analysis of software based testing strategies for the on-line testing of artificial neural networks applications in embedded devices
Electronic devices based on artificial intelligence solutions are pervading our everyday life. Nowadays, human decision processes are supported by real-time data gathered from intelligent systems. Artificial Neural Networks (ANNs) are one of the most used deep learning predictive models due to their outstanding computational capabilities. However, assessing their reliability is still an open issue faced by both the academic and industrial worlds, especially when ANNs are deployed on safety-critical systems, such as self-driving cars in the automotive world. In these systems, a strategy for identifying hardware faults is required by industry standards (e.g., ISO 26262 for automotive, and DO-254 for avionics). Among the existing in-field test strategies, the periodic scheduling of an on-line Software Test Library (STL) is a widely adopted strategy; an STL makes it possible to reach an acceptable fault coverage without the need for additional hardware. However, when dealing with ANN-based applications, the execution of on-line tests interleaved with the ANN inferences may jeopardise the pursuit of maximum performance. The paper presents a comprehensive analysis of six possible scenarios concerning the execution of on-line self-test programs in embedded devices running ANN-based applications. In the proposed scenarios, the impact of the STL execution on the ANN performance is analyzed; in particular, the execution times of an inference and the Fault Detection Time (FDT) of the STL are discussed and compared. Experimental analyses are provided by relying on: an open-source RISC-V platform running two different convolutional neural networks; an STL for RISC-V cores with a maximum achievable fault coverage of 90%
Early Detection of Permanent Faults in DNNs Through the Application of Tensor-Related Metrics
Computational models based on deep learning are today integrated in many safety-critical domains. These algorithms, such as deep neural networks (DNNs), are rapidly growing in size, reaching billions or even trillions of parameters. This growth poses major challenges not only for performance goals but also for dependability aspects such as reliability. The larger the model, the more challenging the reliability assessment becomes. It is now crucial to develop new test approaches with acceptable computational costs for the detection of random hardware faults such as permanent faults, which may change the predictions of DNNs. The aim of this paper is to leverage tensor-related metrics to detect faulty behaviors early during the inference of DNNs. This involves calculating metrics applied to tensors across various domains (such as image processing, audio analysis, and regression) on the Output Feature Maps (OFMs) of a layer. This analysis makes it possible to know in advance the effect that a permanent fault will have on the output of the DNN application. The effectiveness of the approach has been experimentally demonstrated by means of software fault injection campaigns considering faults affecting weights of Convolutional Neural Networks (CNNs), i.e., ResNet20 and MobileNetV2. The quality of the metrics is discussed in terms of the trade-off between energy consumption and the ability to differentiate between critical and non-critical faults
