    Machine Learning in Resource-constrained Devices: Algorithms, Strategies, and Applications

    The ever-increasing growth of technologies is changing people's everyday life. As a major consequence, (1) the amount of available data is growing and (2) several applications rely on battery-powered devices that are required to process data in real time. In this scenario, the need for ad hoc strategies for developing low-power, low-latency intelligent systems capable of learning inductive rules from data using a modest amount of computational resources is becoming vital. At the same time, one needs to develop specific methodologies to manage complex patterns such as text and images. This thesis presents different approaches and techniques for the development of fast learning models explicitly designed to be hosted on embedded systems. The proposed methods proved able to achieve state-of-the-art performance in terms of the trade-off between generalization capabilities and area requirements when implemented in low-cost digital devices. In addition, advanced strategies for efficient sentiment analysis in text and images are proposed.

    Combining Compressed Sensing and Neural Architecture Search for Sensor-Near Vibration Diagnostics

    Compressed sensing (CS) for sensor-near vibration diagnostics represents a suitable approach for the design of network-efficient structural health monitoring (SHM) systems. This article presents a solution for vibration analysis based on deep neural networks (DNNs) trained on compressed data. The envisioned maintenance system consists of a network of sensing nodes orchestrated by a highly constrained centralizing unit. The latter is equipped with a microcontroller unit (MCU) that predicts the health state using the aggregated information. As a major contribution, the DNN architectures are generated automatically from the data through a procedure inspired by hardware-aware (HW) neural architecture search (NAS), called HW-NAS-CS, which is refined with additional constraints that consider both the peculiarities of CS parameters and the limitations of embedded devices. The proposed approach has been validated using two real-world SHM datasets for vibration damage identification and eventually deployed on a low-end computing platform (the STM32L5 MCU). Results demonstrate that DNNs combined with adapted CS schemes can attain classification scores always above 90%, even at very high compression levels (higher than 64x). This performance significantly improves on that attained by state-of-the-art approaches in the field, with the added advantage of being portable to embedded devices.
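
The compression step that precedes the DNN in a scheme of this kind can be illustrated with a minimal sketch. It assumes a random ±1 (Bernoulli) sensing matrix and a 512-sample window, neither of which is stated in the abstract; the function names `sensing_matrix` and `compress` are hypothetical. Only the CS measurement y = Φx is shown, not the HW-NAS-CS procedure or the classifier.

```python
import random


def sensing_matrix(m, n, seed=0):
    """Build an m x n matrix of random +1/-1 entries (assumed Bernoulli design)."""
    rng = random.Random(seed)
    return [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(m)]


def compress(phi, x):
    """Compute the compressed measurements y = phi @ x."""
    return [sum(p * xi for p, xi in zip(row, x)) for row in phi]


# Illustrative numbers only: a 512-sample window compressed 64x to 8 values.
n, ratio = 512, 64
m = n // ratio
phi = sensing_matrix(m, n)
window = [(i % 7) - 3 for i in range(n)]  # synthetic stand-in for a vibration window
y = compress(phi, window)
```

With these assumed sizes, each node would transmit 8 measurements per window instead of 512 samples; a classifier trained on such compressed data would take `y` as its input.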

    An approximate randomization-based neural network with dedicated digital architecture for energy-constrained devices

    Variable energy constraints affect the implementations of neural networks on battery-operated embedded systems. This paper describes a learning algorithm for randomization-based neural networks with hard-limit activation functions. The approach adopts a novel cost function that balances accuracy and network complexity during training. From an energy-specific perspective, the new learning strategy allows the number of operations in the network’s forward phase to be adjusted dynamically and in real time. The proposed learning scheme leads to efficient predictors supported by digital architectures. The resulting digital architecture can switch to approximate computing at run time, in compliance with the available energy budget. Experiments on 10 real-world prediction testbeds confirmed the effectiveness of the learning scheme. Additional tests on limited-resource devices supported the implementation efficiency of the overall design approach.
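
The idea of a forward phase whose cost can be tuned at run time can be sketched as follows. This is not the paper's learning algorithm or cost function: it only shows a single-hidden-layer network with fixed random hidden weights and a hard-limit activation, where a hypothetical `budget` argument truncates the number of hidden neurons evaluated. The truncation rule, the names, and the assumption that output weights `beta` are already trained are all illustrative.

```python
import random


def hard_limit(v):
    """Hard-limit (sign-like) activation returning +1 or -1."""
    return 1 if v >= 0 else -1


def make_hidden_layer(n_inputs, n_hidden, seed=0):
    """Draw fixed random hidden weights and biases; they are never trained."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
               for _ in range(n_hidden)]
    biases = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    return weights, biases


def forward(x, weights, biases, beta, budget=None):
    """Evaluate the network using only the first `budget` hidden neurons.

    budget=None uses every neuron (exact mode); a smaller budget skips the
    remaining neurons, trading accuracy for fewer operations (assumed scheme).
    """
    n_used = len(beta) if budget is None else min(budget, len(beta))
    out = 0.0
    for j in range(n_used):
        pre = sum(w * xi for w, xi in zip(weights[j], x)) + biases[j]
        out += beta[j] * hard_limit(pre)
    return out
```

Because the hidden activations are ±1, each neuron contributes an addition or subtraction of its output weight, which is one reason hard-limit units suit compact digital hardware.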

    Compression-Accuracy Co-optimization Through Hardware-aware Neural Architecture Search for Vibration Damage Detection

    The Internet of Things (IoT) is a key enabler for the transition to the Automatic Structural Health Monitoring (ASHM) of technical facilities, thanks to the seamless flow of data from a multitude of always-connected devices. Current IoT-ASHM installations, however, face the double challenge of ensuring high accuracy while meeting the requirement of minimal energy consumption. The paper tackles these issues from a deep-learning perspective, and describes an IoT-enabled monitoring approach based on a distributed end-to-end deep neural network (DNN). The architecture supports both data compression and damage detection. A low-end microcontroller hosts a specific local DNN; a hardware-aware neural architecture search strategy governs network optimization, in order to satisfy the resource constraints set by low-end computing devices. The features extracted from data feed an aggregating unit, which includes a stacked global classification layer for full-scale damage detection. After proper quantization, the designed models are eventually deployed on a wireless accelerometer sensor. Finally, a cost-benefit analysis evaluates the system’s impact on the sensor energy autonomy. Experiments on a well-known dataset proved that the proposed solution could achieve state-of-the-art classification scores (all metrics above 98.4%) with a minimal transmission cost (less than 53 B on average); compared with conventional approaches, the described strategy yielded a reduction of three orders of magnitude in energy consumption.

    Enhanced HW-NAS for Affordance Segmentation on Wearable Robotics

    The automatic design of deep neural network architectures executed in real time on mobile processors would pave the way for new developments in wearable robotics. Processing information from cameras in real time is essential to implement semiautonomous control pipelines. This work presents a hardware-aware neural architecture search suitable for generating architectures for affordance segmentation supported by mobile processors. The procedure uses a weight-sharing mechanism to speed up the search and improve the convergence of the network selection procedure. In addition, the proposed search space has been designed to induce multi-resolution features. These factors allow the network generation procedure to select architectures with a better trade-off between generalization performance and hardware requirements when compared to existing solutions.

    Tiny Neural Networks for Session-Level Traffic Classification

    This paper presents a system for session-level traffic classification on endpoint devices, developed using a Hardware-aware Neural Architecture Search (HW-NAS) framework. HW-NAS optimizes Convolutional Neural Network (CNN) architectures by integrating hardware constraints, ensuring efficient deployment on resource-constrained devices. Tested on the ISCX VPN-nonVPN dataset, the method achieves 97.06% accuracy while reducing the parameter count by a factor of over 200 and FLOPs by a factor of nearly 4 compared to leading models. The proposed model requires up to 15.5 times less RAM and 26.4 times fewer FLOPs than the most hardware-demanding models. This system enhances compatibility across network architectures and ensures efficient deployment on diverse hardware, making it suitable for applications like firewall policy enforcement and traffic monitoring.

    Low-complexity digital architecture for solving the point location problem in explicit Model Predictive Control

    This paper describes a digital circuit architecture which implements a recently proposed algorithm for the solution of the point location problem in the evaluation of piecewise affine functions. The circuit is suitable for FPGA implementation of explicit Model Predictive Control. The performance of the architecture is tested in a case study through hardware-in-the-loop simulation. Results show that the proposed circuit can be implemented on limited hardware resources even for quite complex, possibly discontinuous control functions, thus representing a good fallback solution when other existing circuit architectures (generally faster but more resource-demanding) cannot be deployed.
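
For readers unfamiliar with the point location problem, the sketch below shows the naive baseline: a sequential search over polyhedral regions followed by evaluation of the affine law of the region found. This is not the algorithm implemented by the circuit, which the abstract does not specify. It assumes each region i is given as inequalities H_i x <= k_i and each affine law as F_i x + g_i; all names are hypothetical.

```python
def locate(regions, x, tol=1e-9):
    """Return the index of the first region containing x, or None.

    regions: list of (H, k) pairs describing the polyhedron {x : H x <= k}.
    """
    for i, (H, k) in enumerate(regions):
        inside = all(
            sum(h * xi for h, xi in zip(row, x)) <= ki + tol
            for row, ki in zip(H, k)
        )
        if inside:
            return i
    return None


def evaluate_pwa(regions, gains, x):
    """Evaluate u = F_i x + g_i, where i is the region that contains x."""
    i = locate(regions, x)
    if i is None:
        raise ValueError("x lies outside every region")
    F, g = gains[i]
    return [sum(f * xi for f, xi in zip(row, x)) + gi for row, gi in zip(F, g)]
```

The cost of this baseline grows linearly with the number of regions, which is why dedicated point location algorithms matter when the control law must run on limited hardware.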

    Digital Architecture for the n-mode Tensor-Matrix Multiplication Based on Pipelined Computing Units

    Compact digital circuitry supporting data processing is a key requirement of modern engineering. This paper addresses the design of digital architectures for a crucial operation in multi-linear algebra: the n-mode tensor-matrix product, implemented in fixed-point representation. A pipelined architecture that optimizes throughput and balances area and energy consumption is proposed. A cost-effective classifier based on this architecture was deployed on an embedded system. Experimental tests conducted on a Kintex-7 FPGA demonstrate that the circuit achieves efficient digital implementations, providing real-time performance on benchmark applications with power consumption lower than 130 mW. This implementation proves to be more efficient than its non-pipelined counterpart.
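
As a software reference for the operation itself, the n-mode product Y = X ×_n U replaces dimension n of the tensor X with the row dimension of the matrix U, with each output entry obtained by summing U[j][k] times the entry of X whose n-th index is k. The sketch below is a plain sequential version that assumes a flat, row-major tensor layout and 0-based modes; it does not model the pipelined architecture or the fixed-point format, and the function name is hypothetical.

```python
from itertools import product


def mode_n_product(x, shape, u, n):
    """Return (y, y_shape) for the n-mode product of tensor x with matrix u.

    x     : flat list holding the tensor in row-major order
    shape : dimensions (I_1, ..., I_N) of x
    u     : matrix given as J rows, each of length shape[n]
    n     : 0-based index of the mode being multiplied
    """
    shape = tuple(shape)
    j_dim = len(u)
    y_shape = shape[:n] + (j_dim,) + shape[n + 1:]

    def flat_index(idx, shp):
        # Row-major position of a multi-index.
        pos = 0
        for i, d in zip(idx, shp):
            pos = pos * d + i
        return pos

    y = [0] * (len(x) // shape[n] * j_dim)
    for idx in product(*(range(d) for d in y_shape)):
        acc = 0
        for k in range(shape[n]):
            src = idx[:n] + (k,) + idx[n + 1:]
            acc += u[idx[n]][k] * x[flat_index(src, shape)]
        y[flat_index(idx, y_shape)] = acc
    return y, y_shape
```

Integer inputs give exact integer results, which loosely mirrors fixed-point arithmetic with an implied scale factor; the inner multiply-accumulate loop is the part a pipelined computing unit would typically target.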

    Learning with similarity functions: a tensor-based framework

    Machine learning algorithms are typically designed to deal with data represented as vectors. Several major applications, however, involve multi-way data, such as video sequences and multi-sensory arrays. In those cases, tensors offer a more consistent way to capture multi-modal relations, which may be lost by a conventional remapping of original data into a vector representation. This paper presents a tensor-oriented machine learning framework, and shows that the theory of learning with similarity functions provides an effective paradigm to support this framework. The proposed approach adopts a specific similarity function, which defines a measure of similarity between a pair of tensors. The performance of the tensor-based framework is evaluated on a set of complex, real-world pattern-recognition problems. Experimental results confirm the effectiveness of the framework, which compares favorably with state-of-the-art machine learning methodologies that can accept tensors as inputs. Indeed, a formal analysis proves that the framework is also more efficient than state-of-the-art methodologies in terms of computational cost. The paper thus provides two main outcomes: (1) a theoretical framework that enables the use of tensor-oriented similarity notions and (2) a cognitively inspired notion of similarity that leads to computationally efficient predictors.