Temporal Variability Analysis in sEMG Hand Grasp Recognition using Temporal Convolutional Networks
Hand movement recognition via surface electromyographic (sEMG) signals is a promising approach for advancing Human-Computer Interaction. However, this field has to deal with two main issues: (1) the long-term reliability of sEMG-based control is limited by the variability affecting the sEMG signal (especially variability over time); (2) the classification algorithms need to be suitable for implementation on embedded devices, which have strict constraints in terms of power budget and computational resources. Current solutions suffer a performance drop over time that makes them unsuitable for reliable gesture controller design. In this paper, we address temporal variability of sEMG-based grasp recognition, proposing a new approach based on Temporal Convolutional Networks, a class of deep learning algorithms particularly suited for time series analysis and temporal pattern recognition. Our approach improves by 7.6% on the best results achieved in the literature on the NinaPro DB6, a reference dataset for temporal variability analysis of sEMG. Moreover, when targeting the much more challenging inter-session accuracy objective, our method achieves an accuracy drop of just 4.8% between intra- and inter-session validation. This proves the suitability of our setup for a robust, reliable long-term implementation. Furthermore, we distill the network using deep network quantization and pruning techniques, demonstrating that our approach can reach a memory footprint down to 120× lower than that of the initial network and 4× lower than that of a baseline Support Vector Machine, with an inter-session accuracy degradation of only 2.5%, proving that the solution is suitable for embedded resource-constrained implementations.
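As an illustration of the building block behind Temporal Convolutional Networks, the following minimal sketch implements a causal dilated 1-D convolution and the receptive field of a stack of such layers. It is not the network from the abstract: the function names, the kernel, and the dilation schedule are assumptions chosen only for the example.

```python
def causal_dilated_conv1d(signal, kernel, dilation=1):
    """y[t] = sum_k kernel[k] * signal[t - k*dilation], zero-padded on the left."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # samples before the start of the window count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Input samples seen by one output of a stack of dilated layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    # Hypothetical 2-tap kernel: each output is x[t] - x[t-2]
    print(causal_dilated_conv1d(x, [1.0, -1.0], dilation=2))
    # Hypothetical stack: kernel size 3, dilations doubling per layer
    print(receptive_field(3, [1, 2, 4, 8]))
```

Because each output depends only on current and past samples, such a layer can run on streaming sEMG windows, and doubling the dilation per layer grows the receptive field quickly without adding weights.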
A Microcontroller is All You Need: Enabling Transformer Execution on Low-Power IoT Endnodes
Transformer networks have become state-of-the-art for many tasks such as NLP and are closing the gap on other tasks like image recognition. Similarly, Transformers and Attention methods are starting to attract attention on smaller-scale tasks, which fit the typical memory envelope of MCUs. In this work, we propose a new set of execution kernels tuned for efficient execution on MCU-class RISC-V and ARM Cortex-M cores. We focus on minimizing memory movements while maximizing data reuse in the Attention layers. With our library, we obtain 3.4×, 1.8×, and 2.1× lower latency and energy on 8-bit Attention layers, compared to previous state-of-the-art (SoA) linear and matrix multiplication kernels in the CMSIS-NN and PULP-NN libraries on the STM32H7 (Cortex M7), STM32L4 (Cortex M4), and GAP8 (RISC-V IMC-Xpulp) platforms, respectively. As a use case for our TinyTransformer library, we also demonstrate that we can fit a 263 kB Transformer on the GAP8 platform, outperforming the previous SoA convolutional architecture on the TinyRadarNN dataset, with a latency of 9.24 ms and 0.47 mJ energy consumption and an accuracy improvement of 3.5%.
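For readers unfamiliar with the computation that such kernels accelerate, here is a minimal floating-point sketch of single-head scaled dot-product attention. It does not reproduce the 8-bit kernels or the memory-reuse strategy of the library described above; the function names and list-of-lists layout are assumptions made for the example.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Single-head attention: softmax(Q K^T / sqrt(d)) V, on lists of rows."""
    d = len(Q[0])
    out = []
    for q in Q:
        # Similarity of this query with every key, scaled by sqrt(d)
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        # Weighted sum of the value rows
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

The two matrix products and the row-wise softmax are the stages where data reuse matters on an MCU, since the score matrix grows with the square of the sequence length.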
Work-in-progress: DORY: Lightweight memory hierarchy management for deep NN inference on IoT endnodes
IoT endnodes often couple a small and fast L1 scratchpad memory with a higher-capacity but lower-bandwidth and slower L2 background memory. The absence of a coherent hardware cache hierarchy saves energy but comes at the cost of labor-intensive explicit memory management, complicating the deployment of algorithms with a large data memory footprint, such as Deep Neural Network (DNN) inference. In this work, we present DORY, a lightweight software-cache dedicated to DNN Deployment Oriented to memoRY. DORY leverages static data tiling and DMA-based double buffering to hide the complexity of manual L1-L2 memory traffic management. DORY enables storage of activations and weights in L2 with less than 4% performance overhead with respect to direct execution in L1. We show that a 142 kB DNN achieving 79.9% accuracy on CIFAR-10 runs 3.2× faster compared to its execution directly from L2 memory while consuming 1.9× less energy.
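The combination of static tiling and double buffering mentioned above can be sketched as a sequential simulation: data living in a large memory is cut into fixed-size tiles, and two small buffers alternate so that the next tile is fetched while the current one is processed. This is only a conceptual model, not DORY's implementation; `run_tiled`, the tile size, and the kernel callback are hypothetical, and the list copy stands in for an asynchronous DMA transfer.

```python
def run_tiled(l2_data, tile_size, kernel):
    """Process l2_data tile by tile using two alternating L1 buffers."""
    tiles = [l2_data[i:i + tile_size] for i in range(0, len(l2_data), tile_size)]
    l1 = [None, None]  # two L1 buffers used in ping-pong fashion
    out = []
    if not tiles:
        return out
    l1[0] = list(tiles[0])  # prologue: load the first tile
    for i in range(len(tiles)):
        cur, nxt = i % 2, (i + 1) % 2
        if i + 1 < len(tiles):
            # Stands in for a DMA transfer that would overlap with compute
            l1[nxt] = list(tiles[i + 1])
        out.extend(kernel(l1[cur]))  # compute on the tile already in L1
    return out


if __name__ == "__main__":
    doubled = run_tiled(list(range(10)), 4, lambda tile: [2 * v for v in tile])
    print(doubled)
```

On real hardware the transfer and the compute run concurrently, so the transfer time is hidden as long as it is shorter than the compute time of one tile.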
Embedding Principal Component Analysis for Data Reduction in Structural Health Monitoring on Low-Cost IoT Gateways
Principal component analysis (PCA) is a powerful data reduction method for Structural Health Monitoring. However, its computational cost and data memory footprint pose a significant challenge when PCA has to run on limited-capability embedded platforms in low-cost IoT gateways. This paper presents a memory-efficient parallel implementation of the streaming History PCA algorithm. On our dataset, it achieves a 10× compression factor and a 59× memory reduction with less than 0.15 dB degradation in the reconstructed signal-to-noise ratio (RSNR) compared to standard PCA. Moreover, the algorithm benefits from parallelization on multiple cores, achieving a maximum speedup of 4.8× on Samsung ARTIK 710.
Hyperdimensional Computing with Local Binary Patterns: One-Shot Learning of Seizure Onset and Identification of Ictogenic Brain Regions Using Short-Time iEEG Recordings
We develop a fast learning algorithm combining symbolic dynamics and brain-inspired hyperdimensional computing for both seizure onset detection and identification of ictogenic (seizure generating) brain regions from intracranial electroencephalography (iEEG). Methods: Our algorithm first transforms iEEG time series from each electrode into symbolic local binary pattern codes, from which a holographic distributed representation of the brain state of interest is constructed across all the electrodes and over time in a hyperdimensional space. The representation is used to quickly learn from few seizures, detect their onset, and identify the spatial brain regions that generated them. Results: We assess our algorithm on our dataset, which contains 99 short-time iEEG recordings from 16 drug-resistant epilepsy patients implanted with 36-100 electrodes. For the majority of the patients (ten out of 16), our algorithm quickly learns from one or two seizures and perfectly (100%) generalizes on novel seizures using k-fold cross-validation. For the remaining six patients, the algorithm requires three to six seizures for learning. Our algorithm surpasses the state-of-the-art, including deep learning algorithms, by achieving higher specificity (94.84% versus 94.77%) and macroaveraging accuracy (95.42% versus 94.96%), and 74× lower memory footprint, but slightly higher average latency in detection (15.9 s versus 14.7 s). Moreover, the algorithm can reliably identify (with a p-value < 0.01) the relevant electrodes covering an ictogenic brain region at two levels of granularity: cerebral hemispheres and lobes. Conclusion and significance: Our algorithm provides: 1) a unified method for both learning and classification tasks with end-to-end binary operations; 2) one-shot learning from seizure examples; 3) linear computational scalability for increasing number of electrodes; and 4) generation of transparent codes that enable post-translational support for clinical decision making.
Our source code and anonymized iEEG dataset are freely available at http://ieeg-swez.ethz.ch
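The pipeline described in this abstract, symbolic codes followed by a binary hyperdimensional representation, can be sketched as follows. This is a simplified single-electrode illustration and not the authors' code: it assumes that each local binary pattern bit marks whether the signal rises between consecutive samples, and the code length of 6, the vector dimension, the random item memory, and the majority-vote bundling are all choices made for the example.

```python
import random


def lbp_codes(signal, length=6):
    """One integer code per position; bit i is 1 if the signal rises at step i."""
    codes = []
    for t in range(len(signal) - length):
        code = 0
        for i in range(length):
            bit = 1 if signal[t + i + 1] > signal[t + i] else 0
            code = (code << 1) | bit
        codes.append(code)
    return codes


def random_hv(dim, rng):
    """Random binary hypervector of the given dimension."""
    return [rng.randint(0, 1) for _ in range(dim)]


def bundle(hvs):
    """Bitwise majority vote across hypervectors; ties resolve to 0."""
    n = len(hvs)
    return [1 if 2 * sum(bits) > n else 0 for bits in zip(*hvs)]


def hamming(a, b):
    """Normalised Hamming distance between two binary vectors."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def encode(signal, item_memory, length=6):
    """Bundle the hypervectors of all LBP codes found in the signal."""
    return bundle([item_memory[c] for c in lbp_codes(signal, length)])


if __name__ == "__main__":
    rng = random.Random(0)
    item_memory = {c: random_hv(1000, rng) for c in range(2 ** 6)}
    window = [rng.random() for _ in range(64)]
    prototype = encode(window, item_memory)
    print(hamming(prototype, encode(window, item_memory)))
```

Classification in this style reduces to comparing the encoded window with stored prototype vectors by Hamming distance, which keeps every step a binary operation.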
Model-based vs. Data-driven Approaches for Anomaly Detection in Structural Health Monitoring: A Case Study
Modern Structural Health Monitoring (SHM) systems are becoming pervasive in civil engineering because they can track the structural condition and detect damage to critical and civil infrastructures such as buildings, viaducts, and tunnels. Although notable work has been done to improve anomaly detection for ensuring public safety, algorithms that can be executed on low-cost hardware for long-term monitoring are still an open issue for the community. This paper presents a new framework that exploits compression techniques to identify anomalies in the structure, avoiding continuous streaming of raw data to the cloud. We used a real installation on a bridge in Italy to test the proposed anomaly detection algorithm. We trained three compression models, namely a Principal Component Analysis (PCA), a fully-connected autoencoder, and a convolutional autoencoder. Performance comparison is also provided through an ablation study that analyzes the impact of various parameters. Results demonstrate that the model-based approach, i.e., PCA, can reach a better accuracy whereas data-driven models, i.e., autoencoders, are limited by training set size.
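One common way to turn a compression model into an anomaly detector is to compress and reconstruct each sample and flag those with a large reconstruction error. The abstract does not state that this is the paper's exact criterion, so the sketch below is an assumption: it fits a single principal component by power iteration on normal data and scores new samples by their residual. All names and the threshold are hypothetical.

```python
import math


def first_component(rows, iters=200):
    """Mean and leading principal direction of the rows, by power iteration."""
    n = len(rows)
    mu = [sum(col) / n for col in zip(*rows)]
    centred = [[v - m for v, m in zip(r, mu)] for r in rows]
    d = len(mu)
    # Uniform start vector; a start exactly orthogonal to the leading
    # direction would not converge, which is acceptable for a sketch.
    w = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        proj = [sum(a * b for a, b in zip(r, w)) for r in centred]
        new = [sum(p * r[j] for p, r in zip(proj, centred)) for j in range(d)]
        norm = math.sqrt(sum(v * v for v in new))
        if norm == 0.0:
            break
        w = [v / norm for v in new]
    return mu, w


def reconstruction_error(x, mu, w):
    """Euclidean norm of what is lost when x is compressed to one score."""
    c = [v - m for v, m in zip(x, mu)]
    score = sum(a * b for a, b in zip(c, w))
    return math.sqrt(sum((ci - score * wi) ** 2 for ci, wi in zip(c, w)))


def is_anomaly(x, mu, w, threshold):
    """Flag samples whose reconstruction error exceeds a chosen threshold."""
    return reconstruction_error(x, mu, w) > threshold
```

Only the compressed scores and the occasional anomaly flag would need to leave the sensor node, which is what avoids streaming raw data.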
Laelaps: An Energy-Efficient Seizure Detection Algorithm from Long-term Human iEEG Recordings without False Alarms
We propose Laelaps, an energy-efficient and fast learning algorithm with no false alarms for epileptic seizure detection from long-term intracranial electroencephalography (iEEG) signals. Laelaps uses end-to-end binary operations by exploiting symbolic dynamics and brain-inspired hyperdimensional computing. Laelaps's results surpass those yielded by state-of-the-art (SoA) methods [1], [2], [3], including deep learning, on a new very large dataset containing 116 seizures of 18 drug-resistant epilepsy patients in 2656 hours of recordings - each patient implanted with 24 to 128 iEEG electrodes. Laelaps trains 18 patient-specific models by using only 24 seizures: 12 models are trained with one seizure per patient, the others with two seizures. The trained models detect 79 out of 92 unseen seizures without any false alarms across all the patients as a big step forward in practical seizure detection. Importantly, a simple implementation of Laelaps on the Nvidia Tegra X2 embedded device achieves 1.7×-3.9× faster execution and 1.4×-2.9× lower energy consumption compared to the best result from the SoA methods. Our source code and anonymized iEEG dataset are freely available at http://ieeg-swez.ethz.ch
Enhancing structural health monitoring with vehicle identification and tracking
Traffic load monitoring and structural health monitoring (SHM) have been gaining increasing attention over the last decade. However, most of the current installations treat the two monitoring types as separate problems, thereby using dedicated installed sensors, such as smart cameras for traffic load or accelerometers for SHM. This paper presents a new framework aimed at leveraging the data collected by an SHM system for a second use, namely, monitoring vehicles passing on the structure being monitored (a viaduct). Our framework first processes the raw three-axial acceleration signals through a series of transformations and extracts their energy. Then, an anomaly detection algorithm is used to detect peaks from 90 installed sensors, and a linear regression together with a simple threshold filters out false detections by estimating the speed of the vehicles. Initial results in conditions of moderate traffic load are promising, demonstrating the detection of vehicles and realistic characterization of their speed. Moreover, a k-means clustering analysis distinguishes two groups of peaks with statistically different features, such as amplitude and damping duration, that are likely associated with heavy vehicles and cars, respectively.
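The regression-plus-threshold step can be illustrated with a small sketch: if each sensor along the viaduct reports the time of its energy peak, the slope of sensor position against peak time is an estimate of vehicle speed, and implausible speeds can be discarded. The sensor positions, the speed bounds, and the function names below are assumptions for the example, not values from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least squares for ys ~ slope * xs + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def estimate_speed(peak_times_s, sensor_positions_m):
    """Speed in m/s as the slope of position regressed on peak time."""
    slope, _ = fit_line(peak_times_s, sensor_positions_m)
    return slope


def is_plausible_vehicle(speed_mps, min_mps=2.0, max_mps=50.0):
    """Simple threshold filter; the bounds are hypothetical."""
    return min_mps <= abs(speed_mps) <= max_mps


if __name__ == "__main__":
    # Hypothetical peaks seen by four sensors spaced 20 m apart
    speed = estimate_speed([0.0, 1.0, 2.0, 3.0], [0.0, 20.0, 40.0, 60.0])
    print(speed, is_plausible_vehicle(speed))
```

A set of peaks that does not line up across sensors yields a slope outside the plausible range, which is how the threshold rejects false detections in this sketch.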
Adversarially-Trained Tiny Autoencoders for Near-Sensor Continuous Structural Health Monitoring
Structural Health Monitoring (SHM) systems are increasingly employed in many civil structures such as buildings, tunnels and viaducts. Typical installations consist of sensors that gather information and send it to a central computing unit, which then periodically analyzes the incoming data and produces an assessment of the structure conditions. To avoid the transmission of a huge amount of raw data and reduce latency in the detection of structural anomalies, recent works focus on moving computation onto the sensor nodes. This work shows that a small autoencoder, which fits the tiny 2 MB memory of a typical microcontroller used for SHM sensor nodes, can achieve very competitive accuracy in detecting structural anomalies as well as vehicle passage on bridges by leveraging adversarial training based on generative adversarial networks (GANs). We improve accuracy over state-of-the-art algorithms in two use-cases on real-standing buildings: i) predicting anomalies on a bridge (+7.4%) and ii) detecting vehicles on a viaduct (2.30×).
