    Towards efficient code generation for exposed datapath architectures

    Coarse-grained reconfigurable architectures and other exposed datapath architectures, such as transport-triggered architectures, promise high energy efficiency for accelerating data-oriented workloads. Their main drawback is that they push complexity from the architecture to the programmer; compiler techniques that start from a higher-level programming language and robustly generate efficient code for such architectures are still an open research area. In this article we survey the main known sources of challenges and outline a generic processor architecture template that covers the most common architecture variations, along with a proposal for a common code generation framework for such challenging architectures.

    Reviewing inference performance of state-of-the-art deep learning frameworks

    Deep learning models have replaced conventional methods for machine learning tasks. Efficient inference on edge devices with limited resources is key for broader deployment. In this work, we focus on the tool selection challenge for inference deployment. We present an extensive evaluation of the inference performance of deep learning software tools using state-of-the-art CNN architectures for multiple hardware platforms. We benchmark these hardware-software pairs for a broad range of network architectures, inference batch sizes, and floating-point precision, focusing on latency and throughput. Our results reveal interesting combinations for optimal tool selection, resulting in different optima when considering minimum latency and maximum throughput.
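As a rough illustration of why minimum latency and maximum throughput can point to different tool and batch-size choices, the following Python sketch ranks a handful of measurements both ways. The tool names (`tool_a`, `tool_b`) and all timings are hypothetical placeholders chosen for the example, not results from the paper, and throughput is assumed to be batch size divided by per-batch latency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    tool: str         # hypothetical tool label, not a real framework name
    batch_size: int   # number of inputs processed per inference call
    latency_s: float  # wall-clock seconds for one batch

    @property
    def throughput(self) -> float:
        # Assumed definition: inputs processed per second.
        return self.batch_size / self.latency_s


def best_latency(measurements):
    """Return the measurement with the lowest per-batch latency."""
    return min(measurements, key=lambda m: m.latency_s)


def best_throughput(measurements):
    """Return the measurement with the highest inputs-per-second rate."""
    return max(measurements, key=lambda m: m.throughput)


# Made-up numbers, for illustration only.
MEASUREMENTS = [
    Measurement("tool_a", 1, 0.004),
    Measurement("tool_a", 32, 0.080),
    Measurement("tool_b", 1, 0.006),
    Measurement("tool_b", 32, 0.050),
]
```

With these placeholder numbers, `best_latency(MEASUREMENTS)` picks `tool_a` at batch size 1, while `best_throughput(MEASUREMENTS)` picks `tool_b` at batch size 32, so the two criteria disagree.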

    Exploiting specification modularity to prune the optimization-space of manufacturing systems

    In this paper we address the makespan optimization of industrial-sized manufacturing systems. We introduce a framework which specifies functional system requirements in a compositional way and automatically computes makespan-optimal solutions respecting these requirements. We show the optimization problem to be NP-Hard. To scale towards systems of industrial complexity, we propose a novel approach based on a subclass of compositional requirements which we call constraints. We prove that these constraints always prune the worst-case optimization-space, thereby increasing the odds of finding an optimal solution (with respect to the additional constraints). We demonstrate the applicability of the framework on an industrial-sized manufacturing system.

    CIM-SIM: Computation in Memory SIMulator

    Computation-in-memory reverses the trend in von-Neumann processors by bringing the computation closer to the data, even into the memory array itself, as opposed to introducing new memory hierarchies to keep (frequently used) data closer to a central processing unit (CPU). In recent years, new non-volatile memory (NVM) technologies, e.g., memristors and PCM, have proven that they can function as memories and perform computations on the stored data as well. In particular, when they are combined with a modest set of (digital) peripheral modules, a wider range of operations can be supported, e.g., vector-matrix multiplication and Boolean logic. In this paper, we introduce CIM-SIM, an open-source simulator written in SystemC, which is capable of simulating the functional behaviour of such architectures. The architecture includes the definition of a set of technology-agnostic nano-instructions.

    Memory and parallelism analysis using a platform-independent approach

    Emerging computing architectures such as near-memory computing (NMC) promise improved performance for applications by reducing the data movement between CPU and memory. However, detecting such applications is not a trivial task. In this ongoing work, we extend a state-of-the-art platform-independent software analysis tool with NMC-related metrics such as memory entropy, spatial locality, and data-level and basic-block-level parallelism. These metrics help to identify the applications better suited to NMC architectures.
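The abstract does not define its metrics. As a minimal sketch, one common reading of memory entropy is the Shannon entropy of the distribution of accessed addresses in a trace. The function name `memory_entropy` and the `drop_low_bits` parameter (which coarsens addresses, for example to cache-line granularity) are assumptions made for this illustration, not the tool's actual interface.

```python
import math
from collections import Counter


def memory_entropy(addresses, drop_low_bits=0):
    """Shannon entropy (in bits) of the accessed-address distribution.

    addresses: iterable of non-negative integer addresses from a trace.
    drop_low_bits: number of low-order bits to discard before counting,
        e.g. 6 to group accesses by 64-byte block (an assumed granularity).
    """
    counts = Counter(addr >> drop_low_bits for addr in addresses)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # empty trace: define the entropy as zero
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

A trace that touches one address repeatedly has entropy 0, while a trace spread uniformly over four distinct addresses has entropy 2 bits. Under this definition, higher values indicate a less predictable access pattern.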

    Real-time audio processing for hearing aids using a model-based Bayesian inference framework

    Development of hearing aid (HA) signal processing algorithms entails an iterative process between two design steps, namely algorithm development and the embedded implementation. Algorithm designers favor high-level programming languages for several reasons including higher productivity, code readability and, perhaps most importantly, availability of state-of-the-art signal processing frameworks that open new research directions. Embedded software, on the other hand, is preferably implemented using a low-level programming language to allow finer control of the hardware, an essential trait in real-time processing applications. In this paper we present a technique that allows deploying DSP algorithms written in Julia, a modern high-level programming language, on a real-time HA processing platform known as openMHA. We demonstrate this technique by using a model-based Bayesian inference framework to perform real-time audio processing

    Data dependent energy modeling for worst case energy consumption analysis

    Safely meeting Worst Case Energy Consumption (WCEC) criteria requires accurate energy modeling of software. We investigate the impact of instruction operand values upon energy consumption in cacheless embedded processors. Existing instruction-level energy models typically use measurements from random input data, providing estimates unsuitable for safe WCEC analysis. We examine probabilistic energy distributions of instructions and propose a model for composing instruction sequences using distributions, enabling WCEC analysis on program basic blocks. The worst case is predicted with statistical analysis. Further, we verify that the energy of embedded benchmarks can be characterised as a distribution, and compare our proposed technique with other methods of estimating energy consumption