
    Local Rademacher Complexity Machine

    In this paper we present the Local Rademacher Complexity Machine, a transposition of the Local Rademacher Complexity Theory into a learning algorithm. Using a series of real-world small-sample datasets, we show the advantages of our proposal over Support Vector Machines, i.e. the transposition of the milestone results of V. N. Vapnik and A. Chervonenkis into a learning algorithm.

    Improving the union bound: A distribution dependent approach

    Statistical Learning Theory deals with the problem of estimating the performance of a learning procedure. Any learning procedure implies making choices, and these choices imply a risk. When the number of choices is finite, the state-of-the-art tool for evaluating the total risk of all the choices made is the Union Bound. The problem with the Union Bound is that it is very loose in practice if no a priori information is available. In fact, the Union Bound considers all choices equally plausible while, as a matter of fact, a learning procedure targets just particular choices, disregarding the others. In this work we show that it is possible to improve Union Bound based results using a distribution-dependent weighting strategy of the true risks associated with each choice. We then prove that our proposal outperforms the Union Bound or, in the worst case, degenerates into it.
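
    As a worked illustration of why the Union Bound treats all choices as equally plausible, the sketch below combines it with Hoeffding's inequality. The abstract does not say which concentration inequality is used, so Hoeffding, the bounded loss in [0,1], the sample size n, and the symbols R(h) (true risk) and \hat{R}(h) (empirical risk) are assumptions made here for the example.

```latex
% Assumptions: loss bounded in [0,1], n i.i.d. samples,
% H = {h_1, ..., h_m} fixed before seeing the data.
% Hoeffding's inequality for one fixed choice h:
\[
  P\left\{ R(h) > \hat{R}(h) + \sqrt{\frac{\ln(1/\delta)}{2n}} \right\} \le \delta .
\]
% Union Bound over all m choices, each given the same confidence budget \delta/m:
\[
  P\left\{ \exists h \in H : R(h) > \hat{R}(h) + \sqrt{\frac{\ln(m/\delta)}{2n}} \right\}
  \le \sum_{h \in H} \frac{\delta}{m} = \delta .
\]
```

    Every choice pays the same \ln(m) penalty, whether or not the learning procedure is ever likely to select it.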

    Towards algorithms and models that we can trust: A theoretical perspective

    Over the last decade it has become increasingly apparent that technical metrics such as accuracy, sustainability, and non-regressiveness cannot fully characterize the behavior of intelligent systems. Such systems are now also required to meet ethical requirements such as explainability, fairness, robustness, and privacy, which increase our trust in their use in the wild. Technical and ethical metrics are often in tension with each other, but the final goal is to develop a new generation of more responsible and trustworthy machine learning. In this paper, we focus on machine learning algorithms and the associated predictive models, asking for the first time, from a theoretical perspective, whether it is possible to simultaneously guarantee their performance in terms of both technical and ethical metrics, moving towards machine learning algorithms that we can trust. In particular, we investigate for the first time both the theory and practice of deterministic and randomized algorithms and the associated predictive models, showing the advantages and disadvantages of the different approaches. For this purpose we leverage the most recent advances in statistical learning theory: Complexity-Based Methods, Distribution Stability, PAC-Bayes, and Differential Privacy. Results show that it is possible to develop consistent algorithms which generate predictive models with guarantees on multiple trustworthiness metrics.

    Informed Machine Learning: Excess risk and generalization

    Machine Learning (ML) has transformed both research and industry by offering powerful models capable of capturing complex phenomena. However, these models often require large, high-quality datasets and may struggle to generalize beyond the distributions on which they are trained. Informed Machine Learning (IML) tackles these challenges by incorporating domain knowledge at various stages of the ML pipeline, thereby reducing data requirements and enhancing generalization. Building on statistical learning theory, we present theoretical comparisons of, and insights into, the excess risk and generalization performance of ML and IML. We then illustrate how these theoretical insights can be leveraged in practice through some practical examples. Our findings shed light on the mechanisms and conditions under which IML can outperform traditional ML, offering guidance for effective implementation in real-world settings.

    Generalization performances of randomized classifiers and algorithms built on data dependent distributions

    In this paper we prove that a randomized algorithm based on the data generating dependent prior and data dependent posterior Boltzmann distributions of Catoni (2007) is Differentially Private (DP) and shows better generalization properties than the Gibbs (randomized) classifier associated with the same distributions. For this purpose, we develop a tight DP-based generalization bound, which improves over the current state-of-the-art Hoeffding-type bound.
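
    To make the term "data dependent posterior Boltzmann distribution" concrete, here is a minimal sketch over a finite set of hypotheses. It is not the paper's algorithm and says nothing about its privacy or generalization guarantees: the finite hypothesis set, the function names, the empirical risk values, and the inverse-temperature parameter `gamma` are all assumptions chosen for illustration.

```python
import math


def boltzmann_weights(emp_risks, gamma):
    """Weights proportional to exp(-gamma * empirical risk), normalized to sum to 1.

    `gamma` is a hypothetical inverse-temperature parameter; gamma = 0 gives
    the uniform distribution, larger values concentrate on low-risk hypotheses.
    """
    lowest = min(emp_risks)  # shift by the minimum for numerical stability
    unnormalized = [math.exp(-gamma * (r - lowest)) for r in emp_risks]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def gibbs_empirical_risk(emp_risks, weights):
    """Expected empirical risk of a classifier that draws h from `weights`."""
    return sum(w * r for w, r in zip(weights, emp_risks))


# Illustrative empirical risks for three hypotheses (made-up numbers).
emp_risks = [0.10, 0.25, 0.40]
weights = boltzmann_weights(emp_risks, gamma=10.0)
uniform = boltzmann_weights(emp_risks, gamma=0.0)
```

    A Gibbs (randomized) classifier in this setting draws one hypothesis according to `weights` each time it makes a prediction, so its empirical risk is the weighted average returned by `gibbs_empirical_risk`.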

    Local Rademacher Complexity Machine

    Support Vector Machines (SVMs) are a powerful, state-of-the-art learning algorithm that can effectively solve many real-world problems. SVMs are the transposition of the Vapnik–Chervonenkis (VC) theory into a learning algorithm. In this paper, we present the Local Rademacher Complexity Machine (LRCM), a transposition of the Local Rademacher Complexity (LRC) theory, the state-of-the-art evolution of the VC theory, into a learning algorithm. Analogously to what has been done for SVMs, we first present the theoretical ideas behind the LRC theory, then show how these ideas can be translated into a learning algorithm, the LRCM, and finally how the LRCM can be made efficient and kernelizable. Using a series of real-world datasets, we show the effectiveness of the LRCM against SVMs.

    Distribution-Dependent Weighted Union Bound

    In this paper, we deal with the classical Statistical Learning Theory problem of bounding, with high probability, the true risk $R(h)$ of a hypothesis $h$ chosen from a set $H$ of $m$ hypotheses. The Union Bound (UB) allows one to state that $P\{L(\hat{R}(h), \delta q_h) \le R(h) \le U(\hat{R}(h), \delta p_h)\} \ge 1 - \delta$, where $\hat{R}(h)$ is the empirical error, if it is possible to prove that $P\{R(h) \ge L(\hat{R}(h), \delta)\} \ge 1 - \delta$ and $P\{R(h) \le U(\hat{R}(h), \delta)\} \ge 1 - \delta$, when $h$, $q_h$, and $p_h$ are chosen before seeing the data such that $q_h, p_h \in [0, 1]$ and $\sum_{h \in H} (q_h + p_h) = 1$. If no a priori information is available, $q_h$ and $p_h$ are set to $\frac{1}{2m}$, namely equally distributed. This approach gives poor results since, as a matter of fact, a learning procedure targets just particular hypotheses, namely hypotheses with small empirical error, disregarding the others. In this work we set $q_h$ and $p_h$ in a distribution-dependent way, increasing the probability of being chosen for functions with small true risk. We call this proposal the Distribution-Dependent Weighted UB (DDWUB) and we derive sufficient conditions on the choice of $q_h$ and $p_h$ under which DDWUB outperforms or, in the worst case, degenerates into UB. Furthermore, theoretical and numerical results show the applicability, the validity, and the potential of DDWUB.
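
    The effect of the weights q_h and p_h can be sketched numerically. The example below assumes Hoeffding-type functions L and U for a loss bounded in [0, 1], which the abstract does not specify, and uses arbitrary weights fixed in advance rather than the distribution-dependent choice the paper proposes. All function names and numbers are hypothetical.

```python
import math


def hoeffding_radius(n, delta):
    """Half-width of a one-sided Hoeffding bound for a loss in [0, 1]."""
    return math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def weighted_union_intervals(emp_risks, n, delta, q, p):
    """Intervals [L(R_hat(h), delta*q_h), U(R_hat(h), delta*p_h)] for every h.

    The weights must be fixed before seeing the data, be strictly positive
    here (a zero weight would give a vacuous bound), and satisfy
    sum_h (q_h + p_h) = 1.
    """
    m = len(emp_risks)
    if len(q) != m or len(p) != m:
        raise ValueError("need one q_h and one p_h per hypothesis")
    if min(q) <= 0.0 or min(p) <= 0.0:
        raise ValueError("weights must be strictly positive")
    if abs(sum(q) + sum(p) - 1.0) > 1e-9:
        raise ValueError("weights must satisfy sum(q_h + p_h) = 1")
    intervals = []
    for r, q_h, p_h in zip(emp_risks, q, p):
        lower = max(0.0, r - hoeffding_radius(n, delta * q_h))
        upper = min(1.0, r + hoeffding_radius(n, delta * p_h))
        intervals.append((lower, upper))
    return intervals


# Made-up empirical risks of m = 4 hypotheses measured on n = 1000 samples.
emp_risks = [0.10, 0.25, 0.30, 0.40]
n, delta, m = 1000, 0.05, len(emp_risks)

# Plain Union Bound: q_h = p_h = 1 / (2m) for every hypothesis.
uniform = [1.0 / (2 * m)] * m
uniform_intervals = weighted_union_intervals(emp_risks, n, delta, uniform, uniform)

# Hypothetical non-uniform weights favouring the first hypothesis.
skewed = [0.35, 0.05, 0.05, 0.05]
skewed_intervals = weighted_union_intervals(emp_risks, n, delta, skewed, skewed)
```

    With the skewed weights the interval for the first hypothesis is tighter than under the plain UB, while the intervals for the other hypotheses are looser. This is the trade-off that a good choice of weights is meant to exploit.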