Search CORE

1,720,995 research outputs found

xTern: Energy-Efficient Ternary Neural Network Inference on RISC-V-Based Edge Systems

Author: Mihali Joan
Scherer Moritz
Benini Luca
Rutishauser Georg
Bonini Lucas
Publication date: 01/01/2024

ISSN:2160-051

ETHzürich Repository for Publications and Research Data

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Securing Tiny Transformer-based Computer Vision Models: Evaluating Real-World Patch Attacks

Author: Cioflan Cristian
Mattei Andrea
Benini Luca
Scherer Moritz
Magno Michele
Publication date: 01/01/2023

Transformers have significantly impacted the field of Computer Vision (CV) and the Internet of Things (IoT), surpassing Convolutional Neural Networks (CNN) in various tasks. However, ensuring the security of CV models for critical real-world IoT applications such as autonomous driving, surveillance, and biomedical technologies is crucial. The adversarial robustness of these models has become a key research area, especially for edge processing. This work evaluates the robustness of Swin tiny and ConvNeXt tiny, specifically focusing on real-world patch attacks in Object Detection scenarios. To ensure a fair comparison, we establish a level playing field between Transformer-based and CNN architectures, examining their vulnerabilities and potential defenses. Experimental results demonstrate the susceptibility of the Swin tiny and ConvNeXt tiny models to patch attacks, resulting in a significant decrease in average precision (AP) for the “Person” class. When trained adversarial patches are applied, the AP drops to 12.8% and 15.2% for Swin tiny and ConvNeXt tiny models, respectively, highlighting their vulnerability to these attacks. This paper contributes to securing CV models on IoT vision devices, providing insights into the robustness of transformer-based architectures against real-world attacks, and advancing the field of adversarial robustness in embedded computer vision.

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

WideVision: A Low-Power, Multi-Protocol Wireless Vision Platform for Distributed Surveillance

Author: Scherer Moritz
Sidler Fabian
Benini Luca
Rogenmoser Michael
Magno Michele
Publication date: 01/01/2022

The trend in Internet of Things research points toward performing increasingly compute-intensive data analysis tasks on embedded sensor nodes, rather than server centers. Exploiting the technological advances in both energy efficiency and Tiny Machine Learning algorithms and methods, an increasing number of recognition and classification tasks can be performed by small, low-power, wireless sensor nodes. This paper presents WideVision, a wireless, wide-area sensing platform capable of performing on-board person detection with power requirements in the mW range. The WideVision platform integrates seamlessly into the Internet of Things by coupling a dedicated multiradio platform, including a LoRa interface, enabling medium and long-range communication, with a novel parallel RISC-V microcontroller. We evaluate the proposed platform with the GAP8 microcontroller, which includes an 8-core RISC-V cluster, and a greyscale camera to perform person detection by training and deploying an advanced, quantized neural network, achieving a statistical accuracy of 84.5% for a 5-person detection task with a latency of only 182 ms. Experimental results demonstrate that the WideVision sensor node platform, while performing on-board inference at a rate of one image per minute, is capable of lasting 300 days on a 2400 mAh Li-ion battery, and 65 days when evaluating one image per 10 seconds, while providing effective surveillance of its perimeter.
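The battery figures in this abstract imply an average current draw, which can be worked out as a rough back-of-the-envelope check. The sketch below assumes the full 2400 mAh is usable and ignores self-discharge and converter losses; the function and variable names are illustrative and do not come from the paper.

```python
# Average current implied by the reported battery lifetimes.
# Assumption: the whole 2400 mAh capacity is usable (no self-discharge or losses).

BATTERY_CAPACITY_MAH = 2400.0

def average_current_ma(lifetime_days: float,
                       capacity_mah: float = BATTERY_CAPACITY_MAH) -> float:
    """Mean current (mA) that drains `capacity_mah` in `lifetime_days`."""
    hours = lifetime_days * 24.0
    return capacity_mah / hours

i_per_minute = average_current_ma(300)  # one image per minute, 300 days
i_per_10s = average_current_ma(65)      # one image per 10 seconds, 65 days

print(f"{i_per_minute:.3f} mA, {i_per_10s:.3f} mA")  # prints "0.333 mA, 1.538 mA"
```

Under these assumptions, raising the inference rate sixfold raises the average current by a factor of about 4.6, which suggests a non-negligible baseline draw that does not scale with the frame rate.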

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

7 μJ/inference end-to-end gesture recognition from dynamic vision sensor data using ternarized hybrid convolutional neural networks

Author: Scherer Moritz
Benini Luca
Rutishauser Georg
Fischer Tim
Publication date: 01/01/2023

Dynamic vision sensor (DVS) cameras enable energy-activity proportional visual sensing by only propagating events produced by changes in the observed scene. Furthermore, by generating these events asynchronously, they offer μs-scale latency while eliminating the redundant data transmission inherent to classical, frame-based cameras. However, the potential of DVS to improve the energy efficiency of IoT sensor nodes can only be fully realized with efficient and flexible systems that tightly integrate sensing, processing, and actuation capabilities. In this paper, we propose a complete end-to-end pipeline for DVS event data classification implemented on the Kraken parallel ultra-low power (PULP) system-on-chip and apply it to gesture recognition. A dedicated on-chip peripheral interface for DVS cameras aggregates the received events into ternary event frames. We process these video frames with a fully ternarized two-stage temporal convolutional network (TCN). The neural network can be executed either on Kraken’s PULP cluster of general-purpose RISC-V cores or on CUTIE, the on-chip ternary neural network accelerator. We perform extensive ablations on network structure, training, and data generation parameters. We achieve a validation accuracy of 97.7 % on the DVS128 11-class gesture dataset, a new record for embedded implementations. With in-silicon power and energy measurements, we demonstrate a classification energy of 7 μJ at a latency of 0.9 ms when running the TCN on CUTIE, a reduction of inference energy when compared to the state of the art in embedded gesture recognition. The processing system consumes as little as 4.7 mW in continuous inference, enabling always-on gesture recognition and closing the gap between the efficiency potential of DVS cameras and application scenarios.
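The abstract mentions that events are aggregated into ternary event frames but does not say how. One plausible reading, sketched below, is to sum the signed events per pixel over a time window and keep only the sign. The event layout `(x, y, polarity)` and the sign-of-sum rule are assumptions made for illustration, not the behaviour of the Kraken peripheral.

```python
from typing import Iterable, List, Tuple

# Hypothetical event layout: (x, y, polarity) with polarity in {+1, -1}.
Event = Tuple[int, int, int]

def events_to_ternary_frame(events: Iterable[Event],
                            width: int, height: int) -> List[List[int]]:
    """Sum signed events per pixel, then keep only the sign (-1, 0 or +1)."""
    acc = [[0] * width for _ in range(height)]
    for x, y, polarity in events:
        acc[y][x] += polarity
    # (v > 0) - (v < 0) is the integer sign of v.
    return [[(v > 0) - (v < 0) for v in row] for row in acc]

# Toy window: two ON events at (0,0), one OFF event at (1,0),
# and one ON plus one OFF event at (2,1) that cancel out.
frame = events_to_ternary_frame(
    [(0, 0, 1), (0, 0, 1), (1, 0, -1), (2, 1, 1), (2, 1, -1)],
    width=3, height=2,
)
print(frame)  # prints [[1, -1, 0], [0, 0, 0]]
```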

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

ITA: An Energy-Efficient Attention and Softmax Accelerator for Quantized Transformers

Author: Scherer Moritz
Garofalo Angelo
İslamoğlu Gamze
Paulin Gianna
Jung Victor J. B.
Benini Luca
Fischer Tim
Publication date: 01/01/2023

Transformer networks have emerged as the state-of-the-art approach for natural language processing tasks and are gaining popularity in other domains such as computer vision and audio processing. However, the efficient hardware acceleration of transformer models poses new challenges due to their high arithmetic intensities, large memory requirements, and complex dataflow dependencies. In this work, we propose ITA, a novel accelerator architecture for transformers and related models that targets efficient inference on embedded systems by exploiting 8-bit quantization and an innovative softmax implementation that operates exclusively on integer values. By computing on-the-fly in streaming mode, our softmax implementation minimizes data movement and energy consumption. ITA achieves competitive energy efficiency with respect to state-of-the-art transformer accelerators with 16.9 TOPS/W, while outperforming them in area efficiency with 5.93 TOPS/mm² in 22 nm fully-depleted silicon-on-insulator technology at 0.8 V. Accepted for publication at the 2023 ACM/IEEE International Symposium on Low Power Electronics and Design (ISLPED).
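To give a feel for what an integer-only, streaming softmax can look like, the sketch below uses powers of two instead of `exp` and updates a running maximum and denominator as inputs arrive. It is not ITA's actual datapath: the base-2 formulation, the assumption that inputs are pre-scaled so that one integer step equals a factor of two, and the parameters `frac_bits` and `out_max` are all illustrative choices. The input list is assumed to be non-empty.

```python
from typing import List

def streaming_int_softmax(xs: List[int], frac_bits: int = 16,
                          out_max: int = 255) -> List[int]:
    """Base-2 softmax using only integer shifts, adds and one integer division.

    Pass 1 streams over the inputs and keeps a running maximum and a
    fixed-point denominator; when a larger maximum appears, the denominator
    accumulated so far is rescaled by a right shift.
    Pass 2 normalises each term to the range 0..out_max.
    """
    running_max = xs[0]
    denom = 0
    for x in xs:
        if x > running_max:
            denom >>= (x - running_max)   # rescale earlier terms to the new max
            running_max = x
        denom += (1 << frac_bits) >> (running_max - x)
    return [(((1 << frac_bits) >> (running_max - x)) * out_max) // denom
            for x in xs]

print(streaming_int_softmax([1, 3, 2]))  # prints [36, 145, 72]
```

For the inputs `[1, 3, 2]` the exact base-2 probabilities are 1/7, 4/7 and 2/7, so the 8-bit outputs are the truncated values of 255/7, 4·255/7 and 2·255/7. Right shifts truncate, so in general the result is an approximation.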

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Reliability and validity of the German “Evidence-Based Practice Confidence (EPIC) Scale” for allied health professionals

Author: Scherer Moritz
Elser Alexander
Diermayr Gudrun
Garbade Sven F.
Stadel Maria
Publication date: 01/01/2025

Introduction: The Evidence-Based Practice Confidence (EPIC) Scale measures health professionals’ self-efficacy associated with evidence-based practice activities. The scale has been cross-culturally translated into German together with physical therapists. To support its use in German-speaking countries, the measurement properties of the scale need to be determined. Therefore, the primary objective of this study was to assess the measurement properties of the German EPIC scale. In a preparatory step, we aimed to evaluate the comprehensibility of the scale among German-speaking occupational therapists, speech and language therapists, and nurses. Methods: First, semi-structured cognitive interviews were used to evaluate the comprehensibility of the EPIC scale. Second, a longitudinal online survey with repeated measures (baseline and retest survey) was conducted. The target group included physical therapists, occupational therapists, speech and language therapists, and nurses from Germany, Austria, and Switzerland. Reliability, responsiveness, and validity were evaluated using internal consistency, test-retest reliability, standard error of measurement, known-groups method, exploratory factor analysis and the minimal detectable change, respectively. Results: Comprehensibility of the German EPIC scale was confirmed by eleven health care professionals (four occupational therapists, two speech and language therapists, five nurses). The baseline and the retest surveys were completed by 708 and 222 participants, respectively. The measure demonstrated an internal consistency of .930, with an intraclass correlation coefficient (ICC) for test-retest reliability of .936 (95% CI: .917 to .951). The standard error of measurement was 4.92, and the minimal detectable change at the 95% confidence level was 6.02. All hypotheses in the known-groups method were confirmed, and construct validity was acceptable. 
Factor analysis revealed two main factors affecting the results of the scale. Conclusion: The findings provide evidence that supports the use of the German EPIC scale among health professionals. For instance, it can be used to evaluate self-efficacy during EBP training.
Background: The Evidence-Based Practice Confidence Scale (EPIC scale) measures the self-efficacy of health professionals in evidence-based practice activities. The English-language scale was translated into German in 2019 and subsequently cross-culturally adapted with physical therapists. However, the scale cannot yet be used in German-speaking countries because its psychometric properties have not been determined. The primary aim of this study is therefore to determine the psychometric properties of the German EPIC scale. In a preparatory step, the comprehensibility of the scale was to be evaluated among German-speaking occupational therapists, speech and language therapists, and nurses. Methods: First, the comprehensibility of the EPIC scale was examined using semi-structured cognitive interviews. In a second step, a longitudinal online survey was conducted, consisting of a baseline survey followed by a retest survey. The target group comprised physical therapists, occupational therapists, speech and language therapists, and nurses from Germany, Austria, and Switzerland. Reliability, responsiveness, and validity were assessed using internal consistency, test-retest reliability, the standard error of measurement, the known-groups method, exploratory factor analysis, and the minimal detectable change. Results: The comprehensibility of the German EPIC scale was confirmed by eleven health professionals (four occupational therapists, two speech and language therapists, five nurses). The baseline and retest surveys were completed by 708 and 222 participants, respectively. The calculations showed an internal consistency of .930, with an intraclass correlation for test-retest reliability of .936 (95% CI: .917 to .951). The standard error of measurement was 4.92 and the minimal detectable change was 6.02 (95% confidence interval). All hypotheses in the known-groups method were confirmed, so that acceptable construct validity was established. The factor analysis yielded two factors for the scale. Conclusion: The results support the use of the German version of the EPIC scale by health professionals, particularly for evaluating training courses that teach evidence-based practice skills.

GRO.publications

GRO.publications (Univ. Göttingen)

OPUS HS Trier

Optimizing the Deployment of Tiny Transformers on Low-Power MCUs

Author: Scherer Moritz
Jung Victor J. B.
Benini Luca
Burrello Alessio
Conti Francesco
Publication date: 01/01/2025

Transformer networks are rapidly becoming State of the Art (SotA) in many fields, such as Natural Language Processing (NLP) and Computer Vision (CV). Similarly to Convolutional Neural Networks (CNNs), there is a strong push for deploying Transformer models at the extreme edge, ultimately fitting the tiny power budget and memory footprint of Micro-Controller Units (MCUs). However, the early approaches in this direction are mostly ad hoc, platform-specific, and model-specific. This work aims to enable and optimize the flexible, multi-platform deployment of encoder Tiny Transformers on commercial MCUs. We propose a complete framework to perform end-to-end deployment of Transformer models onto single and multi-core MCUs. Our framework provides an optimized library of kernels to maximize data reuse and avoid unnecessary data marshaling operations in the crucial attention block. A novel Multi-Head Self-Attention (MHSA) inference schedule, named Fused-Weight Self-Attention (FWSA), is introduced, fusing the linear projection weights offline to further reduce the number of operations and parameters. Furthermore, to mitigate the memory peak reached by the computation of the attention map, we present a Depth-First Tiling (DFT) scheme for MHSA tailored for cache-less MCU devices that allows splitting the computation of the attention map into successive steps, never materializing the whole matrix in memory. We evaluate our framework on three different MCU classes exploiting ARM and RISC-V Instruction Set Architecture (ISA), namely the STM32H7 (ARM Cortex M7), the STM32L4 (ARM Cortex M4), and GAP9 (RV32IMC-XpulpV2). We reach an average of 4.79× and 2.0× lower latency compared to SotA libraries CMSIS-NN (ARM) and PULP-NN (RISC-V), respectively. Moreover, we show that our MHSA depth-first tiling scheme reduces the memory peak by up to 6.19×, while the fused-weight attention can reduce the runtime by 1.53× and the number of parameters by 25 %.
Leveraging the optimizations proposed in this work, we run end-to-end inference of three SotA Tiny Transformers for three applications characterized by different input dimensions and network hyperparameters. We report significant improvements across the networks: for instance, when executing a transformer block for the task of radar-based hand-gesture recognition on GAP9, we achieve a latency of 0.14 ms and energy consumption of 4.92 μJ, 2.32× lower than the SotA PULP-NN library on the same platform.
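The two ideas named in this abstract, fusing projection weights offline and computing the attention map in pieces, can be illustrated with a toy example. The sketch below is one possible reading and not the framework's actual kernels: it assumes a single head, fuses only the query and key projections using the identity (X·W_Q)(X·W_K)ᵀ = X·(W_Q·W_Kᵀ)·Xᵀ, and tiles only the score matrix by rows, leaving out scaling, softmax and the value product. All function names and the toy integer values are hypothetical.

```python
# Sketch only: single head, integer toy values, no scaling/softmax/value product.

def matmul(a, b):
    """Plain nested-list matrix product a @ b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(col) for col in zip(*a)]

def fuse_qk(w_q, w_k):
    """Offline step: W_fused = W_Q @ W_K^T, so Q @ K^T == X @ W_fused @ X^T."""
    return matmul(w_q, transpose(w_k))

def scores_unfused(x, w_q, w_k):
    """Reference: project to Q and K, then form the full score matrix."""
    return matmul(matmul(x, w_q), transpose(matmul(x, w_k)))

def scores_fused_tiled(x, w_fused, tile_rows):
    """Yield the score matrix a few rows at a time instead of all at once."""
    x_t = transpose(x)
    for start in range(0, len(x), tile_rows):
        x_tile = x[start:start + tile_rows]
        yield matmul(matmul(x_tile, w_fused), x_t)

x = [[1, 2], [3, 4], [5, 6]]   # 3 tokens, embedding size 2 (toy values)
w_q = [[1, 0], [2, 1]]
w_k = [[0, 1], [1, 1]]

w_fused = fuse_qk(w_q, w_k)
tiled = [row for tile in scores_fused_tiled(x, w_fused, tile_rows=2)
         for row in tile]
assert tiled == scores_unfused(x, w_q, w_k)  # same scores, built tile by tile
```

If the four projection matrices of a head are all square with the same size, replacing the query and key matrices by one fused matrix removes one of four, which would be consistent with the 25 % parameter reduction reported above; that reading is an inference from the abstract, not a statement from the paper.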

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

TCN-CUTIE: A 1,036-TOp/s/W, 2.72-μJ/Inference, 12.2-mW All-Digital Ternary Accelerator in 22-nm FDX Technology

Author: Scherer Moritz
Mauro Alfio Di
Benini Luca
Fischer Tim
Rutishauser Georg
Publication date: 01/01/2023

Tiny machine learning (TinyML) applications impose μJ/inference constraints, with a maximum power consumption of tens of mW. It is extremely challenging to meet these requirements at a reasonable accuracy level. This work addresses the challenge with a flexible, fully digital ternary neural network (TNN) accelerator in a reduced instruction set computer-five (RISC-V)-based System-on-Chip (SoC). Besides supporting ternary convolutional neural networks, we introduce extensions to the accelerator design that enable the processing of time-dilated temporal convolutional neural networks (TCNs). The design achieves 5.5-μJ/inference, 12.2 mW, 8,000 inferences/s at 0.5 V for a dynamic vision sensor (DVS)-based TCN and an accuracy of 94.5%, and 2.72-μJ/inference, 12.2 mW, 3,200 inferences/s at 0.5 V for a nontrivial 9-layer, 96 channels-per-layer convolutional network with CIFAR-10 accuracy of 86%. The peak energy efficiency is 1,036 TOp/s/W, outperforming the state-of-the-art silicon-proven TinyML quantized accelerators by 1.67× while achieving competitive accuracy.

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Author Instructions

Author: Instructions Author
Publication date: 04/11/2013

Crossref

Cartographic Perspectives (E-Journal - North American Cartographic Information Society, NACIS)

Going Beyond Counting First Authors in Author Co-citation Analysis

Author: Zhao Dangzhi
Publication date: 01/01/2005

The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed
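The difference between the two counting schemes compared in this abstract can be sketched in a few lines. The code below is an illustration under assumptions, not the study's procedure: each citing paper is represented as a list of cited works, each cited work as an ordered list of author names, and two authors are counted as co-cited once per citing paper in which both appear among the considered authors. How co-authors of the same cited work are treated is also an assumption here (they count as co-cited). The data and the name `cocitation_counts` are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import List

def cocitation_counts(citing_papers: List[List[List[str]]],
                      max_authors: int) -> Counter:
    """Count author pairs co-cited by the same citing paper.

    max_authors=1 gives first-author counting; max_authors=5 considers
    the first five authors of every cited work.
    """
    counts: Counter = Counter()
    for references in citing_papers:
        cited_authors = set()
        for authors in references:
            cited_authors.update(authors[:max_authors])
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1   # one count per citing paper and pair
    return counts

# Two hypothetical citing papers with their reference lists.
papers = [
    [["A", "B"], ["C"]],
    [["A", "B"], ["B", "C"]],
]

first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
print(dict(first_author))  # prints {('A', 'C'): 1, ('A', 'B'): 1}
print(dict(first_five))    # prints {('A', 'B'): 2, ('A', 'C'): 2, ('B', 'C'): 2}
```

In this toy data, first-author counting never links B and C, while counting up to five authors links all three, which mirrors the denser author groups the abstract describes.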

E-LIS