1,721,108 research outputs found

    DataPACT: compliance by design of data/AI operations and pipelines

    DataPACT is a key initiative that develops novel tools and methodologies for efficient, compliant, ethical, and sustainable data/AI operations and pipelines. DataPACT contributes to their design, implementation, and management by embedding compliance, privacy, and environmental sustainability at the core of their design. It delivers compliance-by-design for data/AI operations and pipelines by developing innovative technical tools (Compliance Toolbox) and supportive methodologies (Compliance Framework) for compliance assessment and realization of data/AI pipelines designed, deployed, and executed through a set of management tools and techniques (Compliance-aware Data/AI Pipeline Toolbox). This paper presents an overview of DataPACT, focusing on motivation, methodology, and use cases.

    On the next generations of infrastructure-as-a-services

    Following the wide adoption of cloud computing technologies by industry, we can speak of a second generation of cloud services and products that is currently in the design phase. However, it is not yet clear what the third generation of cloud products and services of the next decade will look like, especially at the delivery level of Infrastructure-as-a-Service. To answer this challenging question at least partially, we initiated a literature overview and two surveys involving the members of a cluster of European research and innovation actions. This paper interprets the results and identifies a set of topics of interest for the third generation.

    Cloud Cost Modelling and Optimisation: A Taxonomy and Approaches for Storage Object Classification and Cloud Resource Placement

    The adoption of cloud services continues to rise due to their flexibility and potential cost savings. However, the cost structure of cloud services is complex and often leads to significant service wastage. Existing cost models are industry-specific and lack the ability to incorporate user requirements, which is why only a small percentage of cloud users actually use these models to optimise service costs. As a result, this often leads to higher cloud usage costs. The primary objective of this thesis is to explore approaches for cloud cost modelling and optimisation in order to aid understanding of the complex cost structure and cost management. This includes identifying varying cost dimensions with cost-saving potential and, based on that, developing industry-independent approaches that are applicable to a wide variety of scenarios fulfilling the users’ requirements. In this respect, there are three main contributions produced in this thesis. First, a comprehensive cloud storage cost taxonomy is developed in relation to other cost elements. This taxonomy serves as a framework for understanding the various cost elements associated with cloud storage. It provides a structured approach to dissecting the cost ecosystem, making it more comprehensible and manageable. Moreover, different cost optimisation strategies are identified for storage cost, network usage cost, compute cost, etc. Focusing on the first area of optimisation, which is storage cost, the second contribution involves the development of two novel approaches for the classification of storage objects across different storage tiers. These approaches aim to optimise the allocation of storage objects, ensuring that each object is stored in the most cost-effective tier. This not only helps in managing storage costs but also enhances the efficiency of data retrieval processes.
    The proposed storage tier classification approaches are evaluated on both synthetic and semi-synthetic datasets that mimic real-world big data pipeline scenarios. The results show a considerable amount of cost savings compared to the scenario where data is not moved between tiers and is stored in the same tier for the entire duration. Third, in order to optimise network usage cost and address the trade-off between cost and performance, a graph-based approach is devised for the placement of cloud resources. This approach uses graph theory and allows for the optimal allocation of resources, taking into consideration factors such as cost, performance, and availability. For the graph-based approach, four different big data deployment scenarios are created to accommodate various situations. The deployment model is generated through the proposed approach and evaluated. The evaluations demonstrate not only the potential for cost savings but also the ability of the approach to incorporate Quality of Service (QoS) elements. In summary, through the taxonomy and the proposed approaches, the thesis seeks to simplify the cost structure of cloud services and provides approaches that can lead to more efficient and cost-effective utilisation of cloud services. Based on the evaluations, the proposed approaches show the potential to significantly impact how organisations approach their cloud strategies, leading to notable cost savings and improved operational efficiency.

    The coming age of pervasive data processing

    Emerging Big Data analytics and machine learning applications require a significant amount of computational power. While there exists a plethora of large-scale data processing frameworks which thrive in handling the various complexities of data-intensive workloads, the ever-increasing demands of applications have made us reconsider the traditional ways of scaling (e.g., scale-out) and seek new opportunities for improving performance. In order to prepare for an era where data collection and processing occur on a wide range of devices, from powerful HPC machines to small embedded devices, it is crucial to investigate and eliminate the potential sources of inefficiency in the current state-of-the-art platforms. In this paper, we address the current and upcoming challenges of pervasive data processing and present directions for designing the next generation of large-scale data processing systems.

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based. We compare ACA results based on two different types of co-citation counting: the traditional type, which counts only the first of a cited work's authors, and a non-traditional type, which takes into account the first 5 authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
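
    The difference between the two counting rules can be illustrated with a minimal sketch. The function name `cocitation_counts`, the toy author names, and the data layout (one list of cited works per citing paper, each cited work given as an ordered author list) are assumptions made for illustration and are not taken from the study. As a simplification, co-authors of the same cited work are also treated as co-cited when more than one author per work is counted; the study may handle this case differently.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors=1):
    """Count author co-citations over a collection of citing papers.

    citing_papers: list of reference lists; each reference is an ordered
                   list of author names (hypothetical layout).
    max_authors:   number of leading authors of each cited work to count
                   (1 = first-author counting, 5 = first five authors).
    """
    counts = Counter()
    for references in citing_papers:
        # Collect every author credited under the chosen counting rule.
        cited_authors = set()
        for authors in references:
            cited_authors.update(authors[:max_authors])
        # Each unordered author pair is counted once per citing paper.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


# Two toy citing papers, each citing two works.
papers = [
    [["Adams", "Baker"], ["Chen", "Diaz"]],
    [["Adams"], ["Evans", "Chen"]],
]

first_author = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

    In this toy data, Chen is a first author in one cited work and a second author in another, so the Adams and Chen pair is seen in only one citing paper under first-author counting but in both when the first five authors are considered.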

    The ENTICE Project: dEcentralized repositories for traNsparent and efficienT vIrtual maChine opErations


    Variations on the Author

    “Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who propels discourses rather than expressing opinions; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.

    Appropriate Similarity Measures for Author Cocitation Analysis

    We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance. Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
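
    As a rough illustration of how the Pearson correlation and the cosine can behave differently on cocitation profiles, the sketch below compares both measures before and after padding two profiles with zeros (authors cocited with neither of the two). The profiles and function names are hypothetical, and the sensitivity to added zeros shown here is one commonly discussed property of the Pearson correlation, not necessarily the argument made in the paper. The two further measures mentioned in the abstract are not named there and are therefore not shown.

```python
import math


def cosine(x, y):
    """Cosine of the angle between two equally long count vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson(x, y):
    """Pearson correlation, computed as the cosine of the mean-centred vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])


# Hypothetical cocitation profiles of two authors over three other authors.
x = [3, 0, 2]
y = [0, 4, 1]

# The same profiles after adding five authors cocited with neither of them.
x_padded = x + [0] * 5
y_padded = y + [0] * 5

cos_before, cos_after = cosine(x, y), cosine(x_padded, y_padded)
r_before, r_after = pearson(x, y), pearson(x_padded, y_padded)
```

    Here the cosine is unchanged by the added zeros, while the Pearson correlation moves from strongly negative towards zero, because the padding shifts the means used for centring.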