xTern: Energy-Efficient Ternary Neural Network Inference on RISC-V-Based Edge Systems
ISSN:2160-051
Securing Tiny Transformer-based Computer Vision Models: Evaluating Real-World Patch Attacks
Transformers have significantly impacted the field of Computer Vision (CV) and the Internet of Things (IoT), surpassing Convolutional Neural Networks (CNN) in various tasks. However, ensuring the security of CV models for critical real-world IoT applications such as autonomous driving, surveillance, and biomedical technologies is crucial. The adversarial robustness of these models has become a key research area, especially for edge processing. This work evaluates the robustness of Swin tiny and ConvNeXt tiny, specifically focusing on real-world patch attacks in Object Detection scenarios. To ensure a fair comparison, we establish a level playing field between Transformer-based and CNN architectures, examining their vulnerabilities and potential defenses. Experimental results demonstrate the susceptibility of the Swin tiny and ConvNeXt tiny models to patch attacks, resulting in a significant decrease in average precision (AP) for the "Person" class. When trained adversarial patches were applied, the AP dropped to 12.8% and 15.2% for the Swin tiny and ConvNeXt tiny models, respectively, highlighting their vulnerability to these attacks. This paper contributes to securing CV models on IoT vision devices, providing insights into the robustness of transformer-based architectures against real-world attacks, and advancing the field of adversarial robustness in embedded computer vision.
WideVision: A Low-Power, Multi-Protocol Wireless Vision Platform for Distributed Surveillance
The trend in Internet of Things research points toward performing increasingly compute-intensive data analysis tasks on embedded sensor nodes rather than in server centers. Exploiting technological advances in both energy efficiency and Tiny Machine Learning algorithms and methods, an increasing number of recognition and classification tasks can be performed by small, low-power, wireless sensor nodes. This paper presents WideVision, a wireless, wide-area sensing platform capable of performing on-board person detection with power requirements in the mW range. The WideVision platform integrates seamlessly into the Internet of Things by coupling a dedicated multiradio platform, including a LoRa interface that enables medium- and long-range communication, with a novel parallel RISC-V microcontroller. We evaluate the proposed platform with the GAP8 microcontroller, which includes an 8-core RISC-V cluster, and a greyscale camera to perform person detection by training and deploying an advanced, quantized neural network, achieving a statistical accuracy of 84.5% for a 5-person detection task with a latency of only 182 ms. Experimental results demonstrate that the WideVision sensor node platform, while performing on-board inference at a rate of one image per minute, is capable of lasting 300 days on a 2400 mAh Li-ion battery, and 65 days when evaluating one image per 10 seconds, while providing effective surveillance of its perimeter.
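The battery-life figures above imply an average current budget that can be checked with simple arithmetic. The sketch below is only a back-of-the-envelope illustration: it assumes the full nominal 2400 mAh is usable and ignores self-discharge and converter losses, neither of which the abstract discusses. The function names are hypothetical.

```python
# Back-of-the-envelope check of the battery-life figures quoted in the abstract.
# Assumption (not from the source): the full nominal capacity is usable and
# self-discharge and conversion losses are ignored.

BATTERY_CAPACITY_MAH = 2400.0  # capacity stated in the abstract


def average_current_ma(lifetime_days: float,
                       capacity_mah: float = BATTERY_CAPACITY_MAH) -> float:
    """Average current draw (mA) that drains the battery in `lifetime_days`."""
    hours = lifetime_days * 24.0
    return capacity_mah / hours


def charge_per_image_mah(lifetime_days: float, seconds_per_image: float,
                         capacity_mah: float = BATTERY_CAPACITY_MAH) -> float:
    """Upper bound on charge per image (mAh) if all charge went to inference."""
    images = lifetime_days * 24.0 * 3600.0 / seconds_per_image
    return capacity_mah / images


slow = average_current_ma(300)  # one image per minute
fast = average_current_ma(65)   # one image per 10 seconds
print(f"300-day budget: {slow:.3f} mA average")
print(f"65-day budget:  {fast:.3f} mA average")
```

Under these assumptions the one-image-per-minute scenario corresponds to an average draw of about a third of a milliamp, which suggests that sleep current, not inference, dominates at low duty cycles.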
7 μJ/inference end-to-end gesture recognition from dynamic vision sensor data using ternarized hybrid convolutional neural networks
Dynamic vision sensor (DVS) cameras enable energy-activity proportional visual sensing by only propagating events produced by changes in the observed scene. Furthermore, by generating these events asynchronously, they offer μs-scale latency while eliminating the redundant data transmission inherent to classical, frame-based cameras. However, the potential of DVS to improve the energy efficiency of IoT sensor nodes can only be fully realized with efficient and flexible systems that tightly integrate sensing, processing, and actuation capabilities. In this paper, we propose a complete end-to-end pipeline for DVS event data classification implemented on the Kraken parallel ultra-low power (PULP) system-on-chip and apply it to gesture recognition. A dedicated on-chip peripheral interface for DVS cameras aggregates the received events into ternary event frames. We process these video frames with a fully ternarized two-stage temporal convolutional network (TCN). The neural network can be executed either on Kraken’s PULP cluster of general-purpose RISC-V cores or on CUTIE, the on-chip ternary neural network accelerator. We perform extensive ablations on network structure, training, and data generation parameters. We achieve a validation accuracy of 97.7 % on the DVS128 11-class gesture dataset, a new record for embedded implementations. With in-silicon power and energy measurements, we demonstrate a classification energy of 7 μJ at a latency of 0.9 ms when running the TCN on CUTIE, a reduction of inference energy when compared to the state of the art in embedded gesture recognition. The processing system consumes as little as 4.7 mW in continuous inference, enabling always-on gesture recognition and closing the gap between the efficiency potential of DVS cameras and application scenarios.
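The aggregation of DVS events into ternary event frames can be illustrated in software. The sketch below is not the Kraken peripheral's actual logic: the event format `(x, y, polarity)` and the rule that each pixel takes the sign of its summed polarities within a time window are both assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

# Hypothetical event format: (x, y, polarity), polarity +1 (brighter) or -1 (darker).
Event = Tuple[int, int, int]


def ternary_event_frame(events: Iterable[Event], width: int,
                        height: int) -> List[List[int]]:
    """Aggregate the events of one time window into a frame over {-1, 0, +1}.

    Assumed rule: each pixel takes the sign of its summed event polarities.
    Events outside the frame are ignored.
    """
    acc = [[0] * width for _ in range(height)]
    for x, y, polarity in events:
        if 0 <= x < width and 0 <= y < height:
            acc[y][x] += polarity
    # (v > 0) - (v < 0) is the integer sign of v.
    return [[(v > 0) - (v < 0) for v in row] for row in acc]


window = [(0, 0, 1), (0, 0, 1), (1, 0, -1), (2, 1, 1), (2, 1, -1)]
frame = ternary_event_frame(window, width=3, height=2)
print(frame)  # [[1, -1, 0], [0, 0, 0]]
```

Pixels with no events, or with cancelling events, stay at 0, so a static scene yields an all-zero frame.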
ITA: An Energy-Efficient Attention and Softmax Accelerator for Quantized Transformers
Transformer networks have emerged as the state-of-the-art approach for natural language processing tasks and are gaining popularity in other domains such as computer vision and audio processing. However, the efficient hardware acceleration of transformer models poses new challenges due to their high arithmetic intensities, large memory requirements, and complex dataflow dependencies. In this work, we propose ITA, a novel accelerator architecture for transformers and related models that targets efficient inference on embedded systems by exploiting 8-bit quantization and an innovative softmax implementation that operates exclusively on integer values. By computing on-the-fly in streaming mode, our softmax implementation minimizes data movement and energy consumption. ITA achieves competitive energy efficiency with respect to state-of-the-art transformer accelerators with 16.9 TOPS/W, while outperforming them in area efficiency with 5.93 TOPS/mm² in 22 nm fully-depleted silicon-on-insulator technology at 0.8 V. Accepted for publication at the 2023 ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED).
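To give a feel for what a softmax that "operates exclusively on integer values" can look like, the sketch below approximates the exponential with powers of two so that only subtractions, shifts, multiplications, and integer divisions are needed. This is a generic illustration, not ITA's algorithm: the base-2 approximation, the `shift` parameter, the 16-bit fixed-point scale, and the two-pass structure (ITA is described as streaming) are all assumptions.

```python
from typing import List


def integer_softmax(logits: List[int], shift: int = 2,
                    out_bits: int = 8) -> List[int]:
    """Integer-only softmax approximation using powers of two.

    Each weight is 2**(-((peak - x) >> shift)) in 16-bit fixed point, so the
    exponential becomes a right shift. Outputs are unsigned integers whose
    sum is at most 2**out_bits - 1 (integer division rounds down).
    """
    peak = max(logits)
    one = 1 << 16  # fixed-point representation of 2**0
    # Clamp the shift so every weight stays at least 1.
    weights = [one >> min((peak - x) >> shift, 16) for x in logits]
    total = sum(weights)
    scale = (1 << out_bits) - 1
    return [(w * scale) // total for w in weights]


print(integer_softmax([10, 10]))         # [127, 127]
print(integer_softmax([8, 0], shift=2))  # [204, 51]
```

A larger `shift` flattens the distribution, playing a role loosely similar to a temperature; a real design would fold the quantization scale of the logits into this parameter.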
Reliability and validity of the German “Evidence-Based Practice Confidence (EPIC) Scale” for allied health professionals
Introduction: The Evidence-Based Practice Confidence (EPIC) Scale measures health professionals’ self-efficacy associated with evidence-based practice activities. The scale has been cross-culturally translated into German together with physical therapists. To support its use in German-speaking countries, the measurement properties of the scale need to be determined. Therefore, the primary objective of this study was to assess the measurement properties of the German EPIC scale. In a preparatory step, we aimed to evaluate the comprehensibility of the scale among German-speaking occupational therapists, speech and language therapists, and nurses.
Methods: First, semi-structured cognitive interviews were used to evaluate the comprehensibility of the EPIC scale. Second, a longitudinal online survey with repeated measures (baseline and retest survey) was conducted. The target group included physical therapists, occupational therapists, speech and language therapists, and nurses from Germany, Austria, and Switzerland. Reliability, responsiveness, and validity were evaluated using internal consistency, test-retest reliability, the standard error of measurement, the known-groups method, exploratory factor analysis, and the minimal detectable change.
Results: Comprehensibility of the German EPIC scale was confirmed by eleven health care professionals (four occupational therapists, two speech and language therapists, five nurses). The baseline and the retest surveys were completed by 708 and 222 participants, respectively. The measure demonstrated an internal consistency of .930, with an intraclass correlation coefficient (ICC) for test-retest reliability of .936 (95% CI: .917 to .951). The standard error of measurement was 4.92, and the minimal detectable change at the 95% confidence level was 6.02. All hypotheses in the known-groups method were confirmed, and construct validity was acceptable. Factor analysis revealed two main factors affecting the results of the scale.
Conclusion: The findings provide evidence that supports the use of the German EPIC scale among health professionals. For instance, it can be used to evaluate self-efficacy during EBP training.
Background: The Evidence-Based Practice Confidence (EPIC) Scale measures health professionals’ self-efficacy in evidence-based practice activities. The English-language scale was translated into German in 2019 and subsequently cross-culturally adapted with physical therapists. However, the scale cannot yet be used in German-speaking countries because its psychometric properties have not been determined. The primary aim of this study is therefore to determine the psychometric properties of the German EPIC scale. In a preparatory step, the comprehensibility of the scale was to be evaluated among German-speaking occupational therapists, speech and language therapists, and nurses.
Methods: First, the comprehensibility of the EPIC scale was examined using semi-structured cognitive interviews. In a second step, an online survey with a longitudinal design was conducted, consisting of a baseline survey followed by a retest survey. The target group comprised physical therapists, occupational therapists, speech and language therapists, and nurses from Germany, Austria, and Switzerland. Reliability, responsiveness, and validity were assessed using internal consistency, test-retest reliability, the standard error of measurement, the known-groups method, exploratory factor analysis, and the minimal detectable change.
Results: The comprehensibility of the German EPIC scale was confirmed by eleven health professionals (four occupational therapists, two speech and language therapists, five nurses). The baseline and retest surveys were completed by 708 and 222 participants, respectively. The calculations showed an internal consistency of .930, with an intraclass correlation for test-retest reliability of .936 (95% CI: .917 to .951). The standard error of measurement was 4.92 and the minimal detectable change was 6.02 (95% confidence interval). All hypotheses in the known-groups method were confirmed, so that acceptable construct validity was established. Factor analysis yielded two factors for the scale.
Conclusion: The results support the use of the German version of the EPIC scale by allied health professionals, particularly for evaluating continuing-education courses that teach evidence-based practice skills.
Optimizing the Deployment of Tiny Transformers on Low-Power MCUs
Transformer networks are rapidly becoming State of the Art (SotA) in many fields, such as Natural Language Processing (NLP) and Computer Vision (CV). Similarly to Convolutional Neural Networks (CNNs), there is a strong push for deploying Transformer models at the extreme edge, ultimately fitting the tiny power budget and memory footprint of Micro-Controller Units (MCUs). However, the early approaches in this direction are mostly ad hoc, platform-specific, and model-specific. This work aims to enable and optimize the flexible, multi-platform deployment of encoder Tiny Transformers on commercial MCUs. We propose a complete framework to perform end-to-end deployment of Transformer models onto single and multi-core MCUs. Our framework provides an optimized library of kernels to maximize data reuse and avoid unnecessary data marshaling operations into the crucial attention block. A novel Multi-Head Self-Attention (MHSA) inference schedule, named Fused-Weight Self-Attention (FWSA), is introduced, fusing the linear projection weights offline to further reduce the number of operations and parameters. Furthermore, to mitigate the memory peak reached by the computation of the attention map, we present a Depth-First Tiling (DFT) scheme for MHSA tailored for cache-less MCU devices that allows splitting the computation of the attention map into successive steps, never materializing the whole matrix in memory. We evaluate our framework on three different MCU classes exploiting ARM and RISC-V Instruction Set Architecture (ISA), namely the STM32H7 (ARM Cortex M7), the STM32L4 (ARM Cortex M4), and GAP9 (RV32IMC-XpulpV2). We reach an average of 4.79× and 2.0× lower latency compared to SotA libraries CMSIS-NN (ARM) and PULP-NN (RISC-V), respectively. Moreover, we show that our MHSA depth-first tiling scheme reduces the memory peak by up to 6.19×, while the fused-weight attention can reduce the runtime by 1.53× and the number of parameters by 25 %.
Leveraging the optimizations proposed in this work, we run end-to-end inference of three SotA Tiny Transformers for three applications characterized by different input dimensions and network hyperparameters. We report significant improvements across the networks: for instance, when executing a transformer block for the task of radar-based hand-gesture recognition on GAP9, we achieve a latency of 0.14 ms and energy consumption of 4.92 μJ, 2.32× lower than the SotA PULP-NN library on the same platform.
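The two attention optimizations described in this abstract rest on simple linear-algebra identities, sketched below in plain Python. This is one plausible reading, not the authors' kernels: it assumes the fused weights are the offline product of the query and key projections, W_q · W_kᵀ, and that depth-first tiling processes the attention scores one block of query rows at a time. The helper names, matrix sizes, and values are hypothetical, and the softmax and value multiplication that would follow each tile are omitted.

```python
from typing import Iterator, List

Matrix = List[List[int]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m: Matrix) -> Matrix:
    return [list(col) for col in zip(*m)]


def tiled_scores(x: Matrix, w_fused: Matrix, tile_rows: int) -> Iterator[Matrix]:
    """Yield attention scores one block of query rows at a time.

    Only a (tile_rows x S) slice of the S x S score matrix exists at once.
    """
    x_t = transpose(x)
    for start in range(0, len(x), tile_rows):
        tile = x[start:start + tile_rows]
        yield matmul(matmul(tile, w_fused), x_t)


# Hypothetical sizes: S = 3 tokens, embedding E = 2, projection P = 3.
X = [[1, 2], [3, 4], [5, 6]]
W_q = [[1, 0, 2], [0, 1, 1]]  # E x P
W_k = [[2, 1, 0], [1, 0, 1]]  # E x P

# Unfused: project to Q and K, then form Q K^T.
scores = matmul(matmul(X, W_q), transpose(matmul(X, W_k)))

# Fused: W_q W_k^T is computed once offline (E x E), then X W_fused X^T.
W_fused = matmul(W_q, transpose(W_k))
scores_fused = matmul(matmul(X, W_fused), transpose(X))

# Tiled: same result, assembled from row blocks.
scores_tiled = [row for block in tiled_scores(X, W_fused, tile_rows=2)
                for row in block]

print(scores == scores_fused == scores_tiled)
```

The fused matrix has E × E entries, against 2 × E × P for the two separate projections, so in this reading fusion saves parameters only when E is smaller than 2P.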
TCN-CUTIE: A 1,036-TOp/s/W, 2.72-μJ/Inference, 12.2-mW All-Digital Ternary Accelerator in 22-nm FDX Technology
Tiny machine learning (TinyML) applications impose μJ/inference constraints, with a maximum power consumption of tens of milliwatts. It is extremely challenging to meet these requirements at a reasonable accuracy level. This work addresses the challenge with a flexible, fully digital ternary neural network (TNN) accelerator in a RISC-V-based System-on-Chip (SoC). Besides supporting ternary convolutional neural networks, we introduce extensions to the accelerator design that enable the processing of time-dilated temporal convolutional neural networks (TCNs). The design achieves 5.5-μJ/inference, 12.2 mW, 8,000 inferences/s at 0.5 V for a dynamic vision sensor (DVS)-based TCN with an accuracy of 94.5%, and 2.72-μJ/inference, 12.2 mW, 3,200 inferences/s at 0.5 V for a nontrivial 9-layer, 96 channels-per-layer convolutional network with a CIFAR-10 accuracy of 86%. The peak energy efficiency is 1,036 TOp/s/W, outperforming the state-of-the-art silicon-proven TinyML quantized accelerators by 1.67× while achieving competitive accuracy.
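The combination of ternary weights and time-dilated convolution mentioned above can be shown with a small software model. This is a generic sketch, not the accelerator's datapath: the threshold-based ternarization rule, the causal zero padding, and the kernel indexing convention (tap k looks back k × dilation steps) are assumptions chosen for illustration.

```python
from typing import List


def ternarize(weights: List[float], threshold: float) -> List[int]:
    """Map real weights to {-1, 0, +1} with a symmetric threshold (assumed rule)."""
    return [(w > threshold) - (w < -threshold) for w in weights]


def dilated_causal_conv1d(signal: List[int], kernel: List[int],
                          dilation: int) -> List[int]:
    """y[t] = sum_k kernel[k] * signal[t - k * dilation], zero-padded in the past.

    With ternary weights every multiplication reduces to an add, a subtract,
    or a skip, which is what makes such networks cheap in hardware.
    """
    out = []
    for t in range(len(signal)):
        acc = 0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0 and w != 0:
                acc += signal[idx] if w > 0 else -signal[idx]
        out.append(acc)
    return out


kernel = ternarize([0.8, -0.05, -0.6], threshold=0.1)
print(kernel)  # [1, 0, -1]
print(dilated_causal_conv1d([1, 2, 3, 4, 5], kernel, dilation=2))  # [1, 2, 3, 4, 4]
```

With dilation 2 the three-tap kernel spans five time steps, so stacking layers with growing dilation widens the temporal receptive field without adding weights.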
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
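The difference between the two counting schemes can be made concrete with a small example. The sketch below is an illustration under stated assumptions, not the study's procedure: it counts each author pair at most once per citing paper, it lets co-authors of the same cited work count as co-cited, and the author names and reference lists are invented.

```python
from collections import Counter
from itertools import combinations
from typing import List

# A citing paper is a list of cited works; a cited work is its byline in order.
CitingPaper = List[List[str]]


def cocitation_counts(citing_papers: List[CitingPaper], max_authors: int) -> Counter:
    """Count author pairs cited together, using the first `max_authors` of each byline.

    max_authors=1 gives traditional first-author counting; max_authors=5
    gives the first-five-authors variant. Each pair counts once per citing paper.
    """
    counts: Counter = Counter()
    for references in citing_papers:
        cited_authors = set()
        for byline in references:
            cited_authors.update(byline[:max_authors])
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


# Invented data for illustration only.
papers = [
    [["Adams", "Baker"], ["Chen", "Diaz"]],
    [["Adams"], ["Diaz", "Chen"]],
]

first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
print(dict(first_author))
print(dict(first_five))
```

In this toy data, first-author counting splits the link to the Chen and Diaz work across two different pairs, while first-five counting records Adams with both authors in both citing papers, which is the kind of effect that can make author groups more coherent.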
- …
