
    Unleashing Fine-Grained Parallelism on Embedded Many-Core Accelerators with Lightweight OpenMP Tasking

    In recent years, programmable many-core accelerators (PMCAs) have been introduced in embedded systems to satisfy stringent performance/Watt requirements. This has increased the need for programming models capable of effectively leveraging hundreds to thousands of processors. Task-based parallelism has the potential to provide such capabilities, offering high-level abstractions to outline abundant and irregular parallelism in embedded applications. However, efficiently supporting this programming paradigm on embedded PMCAs is challenging, due to the large time and space overheads it introduces. In this paper we describe a lightweight OpenMP tasking runtime environment (RTE) design for a state-of-the-art embedded PMCA, the Kalray MPPA 256. We provide an exhaustive characterization of the costs of our RTE, considering both synthetic workloads and real programs, and we compare it to several other tasking RTEs. Experimental results confirm that our solution achieves near-ideal parallelization speedups for tasks as small as 5K cycles, and an average speedup of 12× for real benchmarks, which is 60% higher than what we observe with the original Kalray OpenMP implementation.

    Using gait symmetry to virtually align a triaxial accelerometer during running and walking

    During running and walking the human centre of mass experiences a symmetric acceleration along the mediolateral direction. This reported work shows how to exploit this knowledge to correct misalignments of the axes of a trunk-mounted accelerometer with respect to the body axes. After vertical alignment, based on the gravitational component of the signal, the technique computes the virtual rotation angle of the axes lying in the horizontal plane. The chosen angle minimises the autocorrelation of the signal along the mediolateral direction
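The horizontal-plane step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the abstract does not say at which lag the autocorrelation is evaluated or how the angle is searched, so the sketch assumes a lag of one step period (`step_lag`, in samples) and a plain grid search over [0, π). The function names are hypothetical, and the inputs `x` and `y` are assumed to be the two horizontal axes after vertical alignment.

```python
import math


def autocorr_at_lag(signal, lag):
    """Normalised autocorrelation of `signal` at the given lag (in samples)."""
    mean = sum(signal) / len(signal)
    centred = [v - mean for v in signal]
    energy = sum(v * v for v in centred)
    if energy == 0.0:
        return 0.0
    # zip stops at the shorter sequence, pairing s[i] with s[i + lag]
    return sum(a * b for a, b in zip(centred, centred[lag:])) / energy


def find_horizontal_rotation(x, y, step_lag, n_angles=180):
    """Return the virtual rotation angle (radians, in [0, pi)) whose rotated
    mediolateral axis has the lowest autocorrelation at `step_lag`.

    Assumption: `step_lag` is one step period in samples, so a left/right
    symmetric mediolateral signal flips sign across that lag.
    """
    best_angle, best_score = 0.0, float("inf")
    for k in range(n_angles):
        theta = math.pi * k / n_angles
        c, s = math.cos(theta), math.sin(theta)
        # Candidate mediolateral signal after rotating the axes by theta
        ml = [-xi * s + yi * c for xi, yi in zip(x, y)]
        score = autocorr_at_lag(ml, step_lag)
        if score < best_score:
            best_angle, best_score = theta, score
    return best_angle
```

The angle is only determined modulo π, because a mediolateral signal and its negation have the same autocorrelation; resolving left from right would need additional information.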

    Modeling and Evaluation of Application-Aware Dynamic Thermal Control in HPC Nodes

    As side effects of the end of Dennard’s scaling, power and thermal technological walls stand in front of the evolution of supercomputers towards the exaflops era. Energy and temperature walls are big challenges to face in assuring a constant growth of performance in the future. New-generation architectures for HPC systems implement HW and SW components to address energy and thermal issues and to enable more powerful and efficient computing for scientific workloads. In thermal-bound HPC machines, workload-aware runtimes can leverage hardware knobs to guarantee the best operating point in terms of performance and power saving without violating thermal constraints. In this paper, we present an integer-linear programming formulation for job mapping and frequency selection for thermal-bound HPC nodes. We use a fast solver and workload traces extracted from a real supercomputer to test our methodology. Our runtime is integrated into the MPI library, and it is capable of assigning high-performance cores to performance-critical processes. Critical processes are identified at execution time through a mathematical formulation, which relies on the characterization of the application workload and on the global synchronization barriers. We demonstrate that by combining long- and short-horizon predictions with information on the critical processes retrieved from the programming model, we can drastically improve the performance of the target application w.r.t. state-of-the-art DTM solutions.

    An optimized task-based runtime system for resource-constrained parallel accelerators

    Manycore accelerators have recently proven a promising solution for increasingly powerful and energy-efficient computing systems. This raises the need for parallel programming models capable of effectively leveraging hundreds to thousands of processors. Task-based parallelism has the potential to provide such capabilities, offering flexible support for fine-grained and irregular parallelism. However, efficiently supporting this programming paradigm on resource-constrained parallel accelerators is a challenging task. In this paper, we present an optimized implementation of the OpenMP tasking model for embedded parallel accelerators, discussing the key design solutions that guarantee a small memory footprint and minimize performance overheads. We validate our design by comparing it to several state-of-the-art tasking implementations, using the most representative parallelization patterns. The experimental results confirm that our solution achieves near-ideal speedups for tasks as small as 5K cycles.

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type that only counts the first one among a cited work's authors on the one hand and a non-traditional type that takes into account the first 5 authors of a cited work on the other hand. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed
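The two counting schemes compared above differ only in how many leading authors of each cited work are credited. The sketch below is illustrative rather than the study's actual procedure: the function name, the input layout (each citing paper as a list of cited works, each cited work as an ordered author list) and the choice to count each author pair at most once per citing paper are all assumptions.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count how often each pair of authors is cited together.

    citing_papers: list of reference lists; each reference is an ordered
        list of author names for one cited work.
    max_authors: number of leading authors credited per cited work
        (1 gives traditional first-author counting, 5 the first-five variant).
    """
    counts = Counter()
    for references in citing_papers:
        # Authors credited by this citing paper; the set ensures a paper
        # adds at most 1 to any pair (an assumed convention).
        credited = set()
        for authors in references:
            credited.update(authors[:max_authors])
        for pair in combinations(sorted(credited), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: two citing papers, two cited works each.
papers = [
    [["A", "B"], ["C"]],
    [["A"], ["C", "D"]],
]
first_author_only = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

On this toy data, first-author counting sees only the pair (A, C), while the first-five variant also links B and D to the others. Note that under this convention co-authors of the same cited work count as co-cited with each other, which may or may not match a given study's definition.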

    Deploying a Communicating Automatic Weather Station on an Alpine Glacier

    The cost and effort of installing and maintaining an automatic weather station (AWS) on a glacier may be mitigated by the possibility of gathering sensor data in near real-time, and of controlling and programming the station remotely. In this paper we report our experience with upgrading an existing AWS, operating on an Italian glacier, from a mere datalogger into a networked sensing station. Design choices, energy constraints and power-aware programming of the station, as determined by the harsh environment, are discussed. Deployment operations and results are described. The upgraded AWS provides low-power connectivity from a remote location and is able to serve as a base station for a wireless sensor network operating on the glacier.

    COUNTDOWN: A Run-Time Library for Performance-Neutral Energy Saving in MPI Applications

    Power and energy consumption are becoming key challenges for the supercomputers' exascale race. HPC systems' processors waste active power during communication and synchronization among the MPI processes in large-scale HPC applications. However, due to the time scale at which communication happens, transitioning into low-power states while waiting for the completion of each communication may introduce unacceptable overhead. In this article, we present COUNTDOWN, a run-time library for identifying and automatically reducing the power consumption of the CPUs during communication and synchronization. COUNTDOWN saves energy without penalizing the time-to-completion by lowering the CPUs' power consumption only during idle times for which the power state transition overhead is negligible. This is done transparently to the user, without requiring labor-intensive and error-prone application code modifications or recompilation of the application. We test our methodology on a production Tier-1 system. For the NAS benchmarks, COUNTDOWN saves between 6 and 50 percent energy, with a time-to-solution penalty lower than 5 percent. In a complete production run of Quantum ESPRESSO on 3.5K cores, COUNTDOWN saves 22.36 percent energy, with a performance penalty below 3 percent. Energy saving increases to 37 percent, with a performance penalty of 6.38 percent, if the application is executed without communication tuning.

    A smartphone based sonification and telemetry platform for on-water rowing training

    On-water rowing training greatly benefits from sonification. However, no smartphone-based platform usable in real time exists for the acquisition and sonification of data measured during rowing. We propose the use of a smartphone-based system, coupled with an Accrow (BeSB) data acquisition device. The whole system is able to convey the produced sound within 100 ms of the movement, thus enabling the presentation of functional real-time feedback to the rowers. The system is intended to be useful for both athletes and coaches. The sonification presented to the athletes is aimed at enhancing their perception of the movement execution, with the final aim of synchronizing the crew in a uniform rhythm in order to improve the boat velocity. The sonification presented to the coaches is aimed at assisting their visual observation of the boat motion in the daily training routine, letting them listen to the sound in order to detect fluctuations that are not visible. An empirically investigated concept of acoustic feedback that is presented in real time during on-water rowing training sessions already exists. This paper deals with the extension of the technical hardware currently used in high-performance rowing training to a smartphone-based platform, in order to provide the sonification to more users and to everyday club training, including young and older rowers (juniors and masters).

    Net Zeb case studies. Leaf House

    In this book, accomplished international experts present advanced modeling techniques as well as in-depth case studies meant to lead designers towards the optimal use of simulation tools for the design of net-zero energy buildings (Net ZEBs). The book discusses different design processes and tools used in designing Net ZEBs, starting from the fundamental concepts, design strategies, and technologies. These processes and tools are then evaluated by referring to four diverse Net ZEBs where the authors were intimately involved from the design concept to operation. The high resolution measured performance data from these case studies are compared with the predictions made using the respective design tools. Written by academics and building designers based in North America and Europe, this book provides a broad perspective on Net ZEBs. It is a guideline for advanced building designers that draws from both the theoretical background and the vast practical experience of the authors