    Larger-Than-Memory Stateful Stream Processing with WindFlow

    In stream processing, a vast volume of data is continuously processed by standing queries that extract insights from raw inputs. These queries often maintain an internal state, representing useful information from the stream’s history, to produce results. Notable examples of state paradigms include sliding windows, where computation is periodically repeated over the most recent data (e.g., inputs received in the last ten seconds, sliding every half second). Additionally, this state is replicated per distinct key, a user-defined attribute used to partition the physical stream into logical sub-streams. The combination of numerous keys (often millions in real-world scenarios) and the window size can make the overall state of a streaming query enormous, potentially exceeding available memory. This issue is particularly critical when processing runs on resource-constrained, low-end devices, as in the Edge computing paradigm. In this paper, we focus on designing a family of persistent operators capable of transparently maintaining their internal state in an external Key-Value Store, thereby leveraging secondary memory. We present this design within the context of the WindFlow stream processing library for multi-core architectures. The paper details our design and implementation, along with an experimental evaluation based on a set of benchmarks, to assess the performance of persistent operators compared with traditional in-memory processing.
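The keyed sliding-window state described in this abstract can be illustrated with a minimal Python sketch. It is not WindFlow's API or the paper's design: the class name `KeyedSlidingSum`, the sum aggregate, and the plain `dict` standing in for the external Key-Value Store are all assumptions made for illustration, and timestamps are assumed to arrive in order for each key.

```python
WINDOW = 10.0  # window length in seconds (the example from the abstract)
SLIDE = 0.5    # slide in seconds (the example from the abstract)


class KeyedSlidingSum:
    """Hypothetical per-key sliding-window sum whose state lives in `store`."""

    def __init__(self, store=None, window=WINDOW, slide=SLIDE):
        # `store` maps key -> (buffered tuples, next firing time).
        # A dict is used here; a persistent store would take its place.
        self.store = {} if store is None else store
        self.window = window
        self.slide = slide

    def process(self, key, ts, value):
        """Add one input and return (key, window_end, sum) for closed windows."""
        buf, next_fire = self.store.get(key, ([], ts + self.slide))
        results = []
        while ts >= next_fire:
            lower = next_fire - self.window
            # Evict inputs that fell out of the window [lower, next_fire).
            buf = [(t, v) for (t, v) in buf if t >= lower]
            results.append((key, next_fire, sum(v for _, v in buf)))
            next_fire += self.slide
        buf.append((ts, value))
        # Write the whole entry back, as a get/put store would require.
        self.store[key] = (buf, next_fire)
        return results
```

Each key carries its own buffer and firing time, so the total state grows with both the number of keys and the window length, which is the growth the abstract identifies as the problem.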

    The 4th International Workshop on Autonomic Solutions for Parallel and Distributed Data Stream Processing (Auto-DaSP 2021)

    The organizers of the 4th International Workshop on Autonomic Solutions for Parallel and Distributed Data Stream Processing (Auto-DaSP 2021) are delighted to welcome you to the workshop proceedings as part of the ICPE 2021 conference companion.

    Boosting general-purpose stream processing with reconfigurable hardware

    Reconfigurable devices such as field-programmable gate arrays (FPGAs) offer flexible solutions to workload acceleration with high energy efficiency. Despite this potential advantage, they often prove hard for application programmers to program. High-level synthesis languages have been developed to provide higher-level abstractions, allowing developers to define the FPGA behavior using an imperative programming approach based on C/C++ languages. However, such approaches still leave the developer responsible for applying the low-level optimizations required to develop efficient FPGA programs. Along this line, this paper introduces FSPX, a framework that helps programmers develop FPGA-accelerated data stream processing (DSP) applications. The approach provides a high-level Python API to develop the data-flow graph of operators, which is automatically translated into efficient Vitis source code targeting Xilinx devices. The execution of the bitstreams implementing two benchmark applications showcases the efficiency of using FPGAs for DSP workloads. In general, FSPX provides, with a reasonable time-to-solution, higher performance compared with state-of-the-art DSP frameworks.

    Seamless FPGA Integration with Stream Processing Engines

    Stream processing is a computing paradigm enabling the analysis of data streams arriving at high speed from data producers. Its goal is to extract knowledge and complex events by processing streams with high throughput and low latency. To accomplish this goal, Stream Processing Engines (SPEs) try to exploit the parallel processing capabilities provided by modern hardware (usually multi-core CPUs and distributed systems). The exploitation of hardware accelerators, and in particular of FPGAs, is promising because they can maximize parallelism and reduce energy consumption. However, programming FPGAs is a cumbersome and challenging task that requires considerable expertise. In this paper, we discuss the seamless integration of FSPX, a prototype system for generating FPGA-based implementations of streaming pipelines, with an existing SPE (WindFlow). Our goal is to integrate these two tools by providing high-level programming interfaces to end users and guaranteeing high performance with efficient hardware utilization.

    PPOIJ: Shared-Nothing Parallel Patterns for Efficient Online Interval Joins over Data Streams

    Joining data streams is a fundamental stateful operation in stream processing. It involves evaluating join pairs of tuples from two streams that meet specific user-defined criteria. This operator is typically time-consuming and often represents the major bottleneck in several real-world continuous queries. This paper focuses on a specific class of join operator, named online interval join, where we seek join pairs of tuples that occur within a certain time frame of each other. Our contribution is to propose different parallel patterns for implementing this join operator efficiently in the presence of watermarked data streams and skewed key distributions. The proposed patterns comply with the shared-nothing parallelization paradigm, a popular paradigm adopted by most existing Stream Processing Engines. Among the proposed patterns, we introduce one based on hybrid parallelism, which is particularly effective in handling various scenarios in terms of key distribution, number of keys, batching, and parallelism, as demonstrated in our experimental analysis.
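As a rough illustration of what an online interval join computes, the following sequential Python sketch joins two keyed streams and purges state on watermarks. It does not reproduce any of the parallel patterns proposed in the paper. The class `IntervalJoin`, its method names, and the convention that a left tuple at time `t` matches right tuples with timestamps in `[t + lower, t + upper]` are assumptions chosen for the example.

```python
from collections import defaultdict


class IntervalJoin:
    """Hypothetical sequential interval join over two keyed streams."""

    def __init__(self, lower, upper):
        self.lower = lower
        self.upper = upper
        self.left = defaultdict(list)   # key -> [(ts, value), ...]
        self.right = defaultdict(list)  # key -> [(ts, value), ...]

    def on_left(self, key, ts, value):
        """Buffer a left tuple and return its matches among right tuples."""
        self.left[key].append((ts, value))
        return [((ts, value), r) for r in self.right[key]
                if ts + self.lower <= r[0] <= ts + self.upper]

    def on_right(self, key, ts, value):
        """Buffer a right tuple and return its matches among left tuples."""
        self.right[key].append((ts, value))
        return [(l, (ts, value)) for l in self.left[key]
                if l[0] + self.lower <= ts <= l[0] + self.upper]

    def on_watermark(self, wm):
        """Drop tuples that cannot match any future tuple with ts >= wm."""
        for key in list(self.left):
            self.left[key] = [t for t in self.left[key]
                              if t[0] + self.upper >= wm]
        for key in list(self.right):
            self.right[key] = [t for t in self.right[key]
                               if t[0] - self.lower >= wm]
```

Because matching happens only within a key, the per-key buffers could be partitioned across workers; a skewed key distribution would then load those workers unevenly, which is the kind of scenario the abstract says its patterns target.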

    Towards Parallel Data Stream Processing on System-on-Chip CPU+GPU Devices

    Data Stream Processing is a pervasive computing paradigm with a wide spectrum of applications. Traditional streaming systems exploit the processing capabilities provided by homogeneous Clusters and Clouds. With the transition to streaming systems suitable for IoT/Edge environments, there is an urgent need for new streaming frameworks and tools tailored for embedded platforms, often available as System-on-Chips composed of a small multicore CPU and an integrated on-chip GPU. Exploiting this hybrid hardware requires special care in the runtime system design. In this paper, we discuss the support provided by the WindFlow library, showing its design principles and its effectiveness on the NVIDIA Jetson Nano board.

    Autonomic management experiences in structured parallel programming

    Structured parallel programming models based on parallel design patterns are gaining more and more importance. Several state-of-the-art industrial frameworks build on the parallel design pattern concept, including Intel TBB and Microsoft PPL. In these frameworks, explicitly exposing the parallel structure of the application helps identify inefficiencies, enables techniques that increase the efficiency of the implementation, and ensures that most of the critical aspects of efficiently exploiting the available parallelism move from application programmers to framework designers. The same exposure of the graph representing the parallel activities enables framework designers to put in place efficient autonomic management of non-functional concerns, such as performance tuning or power management. In this paper, we discuss how autonomic management features evolved in different structured parallel programming frameworks based on algorithmic skeletons and parallel design patterns. We show that different levels of autonomic management are possible, ranging from the simple provisioning of mechanisms that support programmers in implementing ad hoc autonomic managers to complete autonomic managers whose behaviour application programmers can program using high-level rules.

    Distributed-Memory FastFlow Building Blocks

    We present the new distributed-memory run-time system (RTS) of the C++-based open-source structured parallel programming library FastFlow. The new RTS enables the execution of FastFlow shared-memory applications written using its Building Blocks (BBs) on distributed systems with minimal changes to the original program. The changes required are all high-level and deal with introducing distributed groups (dgroup), i.e., logical partitions of the BBs composing the application streaming graph. A dgroup, which in turn is implemented using FastFlow’s BBs, can be deployed and executed on a remote machine and communicate with other dgroups according to the original shared-memory FastFlow streaming programming model. We present how to define the distributed groups and how we addressed the problem of data serialization and communication performance tuning through transparent message batching and scheduling. Finally, we present a study of the overhead introduced by dgroups considering some benchmarks on a sixteen-node cluster.

    Evaluation of Adaptive Micro-batching Techniques for GPU-Accelerated Stream Processing

    Stream processing plays a vital role in applications that require continuous, low-latency data processing. Thanks to their extensive parallel processing capabilities and relatively low cost, GPUs are well-suited to scenarios where such applications require substantial computational resources. Micro-batching is essential for efficient GPU computation within stream processing systems. However, finding appropriate batch sizes to maintain an adequate level of service is often challenging, particularly in cases where applications experience fluctuations in input rate and workload. Addressing this challenge requires adjusting the batch size at runtime. This study proposes a methodology for evaluating different self-adaptive micro-batching strategies in a real-world complex streaming application used as a benchmark.
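To make the idea of adjusting the batch size at runtime concrete, here is a minimal Python sketch of one possible control rule. It is not one of the strategies evaluated in the study, which the abstract does not name. The function `adapt_batch_size`, its parameters, and the halve-or-double rule are assumptions, as is the premise that larger batches raise per-batch latency.

```python
def adapt_batch_size(batch, latency, target, lo=1, hi=4096, tol=0.1):
    """Return the next batch size given the last measured latency.

    Hypothetical rule: halve the batch when latency exceeds the target
    by more than `tol`, double it when latency is below the target by
    more than `tol`, and otherwise keep it. The result is clamped to
    the range [lo, hi].
    """
    if latency > target * (1 + tol):
        batch = batch // 2   # too slow: smaller batches
    elif latency < target * (1 - tol):
        batch = batch * 2    # headroom available: larger batches
    return max(lo, min(hi, batch))
```

A stream operator would call this after each batch completes, feeding in the measured latency. The tolerance band keeps the size from oscillating when latency hovers near the target.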

    New Landscapes of the Data Stream Processing in the era of Fog Computing

    The “New Landscapes of the Data Stream Processing in the era of Fog Computing” special issue aims to present new research works on topics related to recent advances in the Data Stream Processing (DSP) computing paradigm in the emerging environments of Fog Computing and the Internet of Things (IoT). The papers included in this special issue are relevant examples of recent research achievements in the definition of new DSP applications in the Fog Computing context, of run-time system mechanisms and techniques targeting DSP frameworks, and of new high-level interfaces for data streaming in highly dynamic IT environments.