    Discrete Bayesian Optimization via Machine Learning

    Bayesian Optimization (BO) is a family of powerful algorithms designed to solve complex optimization problems involving expensive black-box functions. These sequential algorithms iteratively update a surrogate model of the objective function (OF), effectively balancing exploration and exploitation to identify near-optimal solutions within a limited number of iterations. BO was originally designed for continuous, unconstrained domains, and its efficiency has inspired adaptations for discrete, constrained optimization problems. Machine Learning (ML) models, in turn, allow accurate predictions for black-box functions, although they typically require large amounts of training data. Leveraging the strengths of BO and ML, this research tackles the challenge of identifying optimal configurations in the context of cloud computing. This paradigm has become pervasive due to its ability to provide flexible and scalable resources. Identifying the optimal hardware-software configuration is essential for minimizing costs while meeting Quality of Service constraints. This task involves solving complex optimization problems over multidimensional discrete domains, with black-box objective functions and constraints, within a limited number of iterations. To address this challenge, this work introduces d-MALIBOO, a BO-based algorithm that integrates ML techniques to enhance the efficiency of finding near-optimal solutions in discrete and bounded domains. While BO builds the surrogate model of the OF, ML models determine the feasible region of the black-box constraints and guide the BO algorithm toward promising regions of the discrete domain. Furthermore, we introduce an epsilon-greedy approach to favor exploration in domains with multiple local optima. Experimental results show that our algorithm outperforms OpenTuner, a popular framework for constrained optimization, by reducing the average regret by 29%, and SVM-CBO, a BO-based algorithm that integrates SVM models to determine the feasible region, by 82%.
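
The epsilon-greedy idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not the d-MALIBOO implementation: the function `epsilon_greedy_pick`, the toy acquisition score, and the toy feasibility rule are all hypothetical stand-ins for the surrogate-based acquisition function and the ML feasibility model described in the abstract.

```python
import random
from itertools import product


def epsilon_greedy_pick(candidates, acquisition, is_feasible, epsilon, rng):
    """Pick the next configuration from a finite (discrete) candidate set.

    With probability `epsilon` a random candidate is explored; otherwise the
    candidate with the highest acquisition score is exploited. Candidates
    predicted infeasible are skipped whenever at least one feasible one exists.
    """
    feasible = [c for c in candidates if is_feasible(c)]
    pool = feasible if feasible else list(candidates)
    if rng.random() < epsilon:
        return rng.choice(pool)          # exploration step
    return max(pool, key=acquisition)    # exploitation step


# Hypothetical discrete domain: (cores, memory_gb) pairs.
candidates = list(product([1, 2, 4, 8], [2, 4, 8]))


def toy_acquisition(c):
    # Assumed stand-in for an acquisition function built on a BO surrogate.
    return -((c[0] - 4) ** 2) - (c[1] - 8) ** 2


def toy_feasible(c):
    # Assumed stand-in for an ML classifier of the black-box constraint.
    return c[0] * c[1] <= 16


rng = random.Random(0)
greedy_choice = epsilon_greedy_pick(candidates, toy_acquisition, toy_feasible, 0.0, rng)
explore_choice = epsilon_greedy_pick(candidates, toy_acquisition, toy_feasible, 1.0, rng)
print(greedy_choice)  # -> (2, 8), the best-scoring feasible configuration
```

With `epsilon = 0` the choice is purely greedy; raising `epsilon` trades some exploitation for random exploration, which is the kind of behavior the abstract associates with domains that have multiple local optima.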

    Integrating Bayesian Optimization and Machine Learning for the Optimal Configuration of Cloud Systems

    Bayesian Optimization (BO) is an efficient method for finding optimal cloud configurations for several types of applications. Machine Learning (ML), in turn, can provide helpful knowledge about the application at hand thanks to its predictive capabilities. This work proposes a general BO-based approach that integrates elements of ML techniques in multiple ways to find an optimal configuration for recurring jobs running in public and private cloud environments, possibly subject to black-box constraints such as application execution time or accuracy. We test our approach on several use cases, including edge computing, scientific computing, and Big Data applications. Results show that our solution outperforms other state-of-the-art black-box techniques, including classical autotuning and BO- and ML-based algorithms, reducing the number of unfeasible executions and the corresponding costs by up to 2–4 times.

    BayesMix: Bayesian Mixture Models in C++

    We describe BayesMix, a C++ library for MCMC posterior simulation for general Bayesian mixture models. The goal of BayesMix is to provide computer scientists, statisticians and practitioners with a self-contained ecosystem for performing inference on mixture models. The key idea of this library is extensibility, as we want users to be able to adapt our software easily to their specific Bayesian mixture models. In addition to the several models and MCMC algorithms for posterior inference included in the library, new users with little familiarity with mixture models and the related MCMC algorithms can extend our library with minimal coding effort. Our library is computationally very efficient when compared to competitor software. Examples show that typical code runtimes are from two to 25 times faster than competitors for data dimensions from one to ten. We also provide Python (bayesmixpy) and R (bayesmixr) interfaces. Our library is publicly available on GitHub at https://github.com/bayesmix-dev/bayesmix/

    Exploring the Utility of Graph Methods in HPC Thermal Modeling

    This work critically examines several approaches to temperature prediction for High-Performance Computing (HPC) systems, focusing on component-level and holistic models. In particular, we use publicly available data from the Tier-0 Marconi100 supercomputer and propose models ranging from a room-level Graph Neural Network (GNN) spatial model to node-level models. Our results highlight the importance of correct graph structures and suggest that while graph-based models can enhance predictions in certain scenarios, node-level models remain optimal when data is abundant. These findings contribute to understanding the effectiveness of different modeling approaches in HPC thermal prediction tasks, enabling proactive management of the modeled system.

    Going Beyond Counting First Authors in Author Co-citation Analysis

    The present study examines one of the fundamental aspects of author co-citation analysis (ACA): the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based. We compare ACA results based on two different types of co-citation counting: the traditional type, which counts only the first of a cited work's authors, and a non-traditional type that takes into account the first 5 authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
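
The two counting schemes contrasted in the abstract above can be sketched as follows. This is a simplified illustration, not the study's procedure: the function `cocitation_counts` and the author labels are hypothetical, and the sketch assumes that any two distinct authors appearing in one citing paper's reference list count as co-cited once, including coauthors of the same cited work.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers, max_authors):
    """Count author co-citations over a collection of citing papers.

    `citing_papers` is a list of reference lists; each reference is the
    ordered author list of one cited work. Only the first `max_authors`
    authors of each cited work are considered (1 = first-author counting).
    """
    counts = Counter()
    for cited_works in citing_papers:
        authors = set()
        for work_authors in cited_works:
            authors.update(work_authors[:max_authors])
        # Each unordered pair of distinct authors is co-cited once per citing paper.
        for pair in combinations(sorted(authors), 2):
            counts[pair] += 1
    return counts


# Hypothetical data: two citing papers, each citing two works.
citing_papers = [
    [["A", "B"], ["C"]],
    [["A", "B"], ["D", "C"]],
]

first_author = cocitation_counts(citing_papers, max_authors=1)
first_five = cocitation_counts(citing_papers, max_authors=5)
print(first_author)
print(first_five)
```

In this toy data, first-author counting never links authors A and B or B and C, because B appears only in second position, whereas counting the first 5 authors links them in both citing papers.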

    Variations on the Author

    “Variations on the Author” discusses two of Eduardo Coutinho’s recent films (Um Dia na Vida, from 2010, and Últimas Conversas, posthumously released in 2015) and their contribution to the general question of documentary authorship. The director’s filmography is characterized by a consistent yet self-effacing form of authorial self-inscription: Coutinho often features as an interviewer who, rather than expressing opinions, propels discourses; an interviewer who is good at listening. This mode of self-inscription characterizes him as an author who is not expressive but who is nonetheless markedly present on the screen. In Um Dia na Vida, however, Coutinho is completely absent from the image, while Últimas Conversas, on the contrary, includes a confessional prologue that moves the director from the margins to the center of his films. This article examines the ways in which these works stand out in the filmography of a director who offers new insights into the notion of cinematic authorship.

    Appropriate Similarity Measures for Author Cocitation Analysis

    We provide a number of new insights into the methodological discussion about author cocitation analysis. We first argue that the use of the Pearson correlation for measuring the similarity between authors’ cocitation profiles is not very satisfactory. We then discuss what kind of similarity measures may be used as an alternative to the Pearson correlation. We consider three similarity measures in particular. One is the well-known cosine. The other two similarity measures have not been used before in the bibliometric literature. Finally, we show by means of an example that our findings have a high practical relevance.
    Keywords: information science; Pearson correlation; cosine; similarity measure; author cocitation analysis
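
As a small illustration of how the Pearson correlation and the cosine can behave differently on cocitation profiles, the sketch below compares both measures before and after padding two profiles with zeros (that is, adding authors with whom neither author is cocited). The profiles are hypothetical, and this sensitivity to added zeros is only one property raised in the wider bibliometric discussion; the abstract does not state which arguments the paper itself relies on.

```python
import math


def cosine(x, y):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)  # undefined for an all-zero vector


def pearson(x, y):
    """Pearson correlation, computed as the cosine of the mean-centered vectors."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return cosine([a - mean_x for a in x], [b - mean_y for b in y])


# Hypothetical cocitation profiles of two authors with four other authors.
x = [4, 0, 2, 1]
y = [0, 3, 1, 2]

# The same profiles after adding four authors with whom neither is cocited.
x_padded = x + [0, 0, 0, 0]
y_padded = y + [0, 0, 0, 0]

print(cosine(x, y), cosine(x_padded, y_padded))    # cosine is unaffected by the added zeros
print(pearson(x, y), pearson(x_padded, y_padded))  # Pearson shifts when zeros are added
```

Because the added zeros change the means used for centering, the Pearson value moves even though the relationship between the two authors' nonzero cocitation counts is unchanged, while the cosine stays the same.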

    Dispelling the Myths Behind First-author Citation Counts

    We conducted a full-scale evaluative citation analysis study of scholars in the XML research field to explore how much author rankings produced by different citation counting methods actually differ from each other, and to demonstrate the capability of emerging data and tools on the Web to support more realistic citation counting methods. Our results contest some common arguments for the continued use of first-author citation counts in the evaluation of scholars, such as high correlations between author rankings by first-author citation counts and other citation counting methods, and high costs of using more realistic citation counting methods that are not well supported by the ISI databases. It is argued that increasingly available digital full-text research papers make it possible for citation analysis studies to go beyond what the ISI databases have directly supported and to employ more sophisticated methods.