Iterate Averaging as Regularization for Stochastic Gradient Descent

Authors: Neu G., Rosasco L.
Publication date: 01/01/2018

Archivio istituzionale della ricerca - Università di Genova

Statistical and Computational Trade-Offs in Kernel K-Means

Authors: Calandriello D, Rosasco L
Publication date: 01/01/2018

We investigate the efficiency of k-means in terms of both statistical and computational requirements. More precisely, we study a Nyström approach to kernel k-means. We analyze the statistical properties of the proposed method and show that it achieves the same accuracy as exact kernel k-means with only a fraction of the computations. Indeed, we prove under basic assumptions that sampling √n Nyström landmarks greatly reduces computational costs without incurring any loss of accuracy. To the best of our knowledge, this is the first result of this kind for unsupervised learning.

Archivio istituzionale della ricerca - Università di Genova

Fast approximation of orthogonal matrices and application to PCA

Authors: Rusu C., Rosasco L.
Publication date: 01/01/2022

Orthogonal projections are a standard technique for dimensionality reduction in machine learning applications. We study the problem of approximating orthogonal matrices so that their application is numerically fast and yet accurate. We find an approximation by solving an optimization problem over a set of structured matrices, which we call extended orthogonal Givens transformations, including Givens rotations as a special case. We propose an efficient greedy algorithm to solve such a problem and show that it strikes a balance between approximation accuracy and speed of computation. The approach is relevant to spectral methods and we illustrate its application to PCA.

Archivio istituzionale della ricerca - Università di Genova

Learning with SGD and Random Features

Authors: Rudi A, Carratino L, Rosasco L
Publication date: 01/01/2018

Sketching and stochastic gradient methods are arguably the most common techniques to derive efficient large-scale learning algorithms. In this paper, we investigate their application in the context of nonparametric statistical learning. More precisely, we study the estimator defined by stochastic gradient with mini-batches and random features. The latter can be seen as a form of nonlinear sketching and used to define approximate kernel methods. The considered estimator is not explicitly penalized/constrained and regularization is implicit. Indeed, our study highlights how different parameters, such as the number of features, iterations, step-size and mini-batch size, control the learning properties of the solutions. We do this by deriving optimal finite sample bounds, under standard assumptions. The obtained results are corroborated and illustrated by numerical experiments.

Archivio istituzionale della ricerca - Università di Genova

Generalization properties and implicit regularization for multiple passes SGM

Authors: Rosasco L, Lin J, Camoriano R
Publication date: 01/01/2016

We study the generalization properties of stochastic gradient methods for learning with convex loss functions and linearly parameterized functions. We show that, in the absence of penalizations or constraints, the stability and approximation properties of the algorithm can be controlled by tuning either the step-size or the number of passes over the data. In this view, these parameters can be seen to control a form of implicit regularization. Numerical results complement the theoretical findings.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

A General Framework for Consistent Structured Prediction with Implicit Loss Embeddings

Authors: Rudi A, Ciliberto C, Rosasco L
Publication date: 01/01/2020

We propose and analyze a novel theoretical and algorithmic framework for structured prediction. While so far the term has referred to discrete output spaces, here we consider more general settings, such as manifolds or spaces of probability measures. We define structured prediction as a problem where the output space lacks a vectorial structure. We identify and study a large class of loss functions that implicitly defines a suitable geometry on the problem. The latter is the key to developing an algorithmic framework amenable to a sharp statistical analysis and yielding efficient computations. When dealing with output spaces with infinite cardinality, a suitable implicit formulation of the estimator is shown to be crucial.

Archivio istituzionale della ricerca - Università di Genova

Beating SGD saturation with tail-averaging and minibatching

Authors: Mücke N., Neu G., Rosasco L.
Publication date: 01/01/2019

Archivio istituzionale della ricerca - Università di Genova

Reproducing kernel Hilbert spaces on manifolds: Sobolev and diffusion spaces

Authors: Rosasco L., De Vito E., Mücke N.
Publication date: 01/01/2020

We study reproducing kernel Hilbert spaces (RKHS) on a Riemannian manifold. In particular, we discuss under which conditions Sobolev spaces are RKHS and characterize their reproducing kernels. Further, we introduce and discuss a class of smoother RKHS that we call diffusion spaces. We illustrate the general results with a number of detailed examples. While connections between Sobolev spaces, differential operators and RKHS are well known in the Euclidean setting, here we present a self-contained study of analogous connections for Riemannian manifolds. By collecting a number of results in a unified way, we think our study can be useful for researchers interested in the topic.

Archivio istituzionale della ricerca - Università di Genova

Less is more: Nyström computational regularization

Authors: Rosasco L, Rudi A, Camoriano R
Publication date: 01/01/2015

We study Nyström-type subsampling approaches to large-scale kernel methods, and prove learning bounds in the statistical learning setting, where random sampling and high probability estimates are considered. In particular, we prove that these approaches can achieve optimal learning bounds, provided the subsampling level is suitably chosen. These results suggest a simple incremental variant of Nyström kernel ridge regression, where the subsampling level controls at the same time regularization and computations. Extensive experimental analysis shows that the considered approach achieves state-of-the-art performance on benchmark large-scale datasets.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

From inexact optimization to learning via gradient concentration

Authors: Stankewitz B., Rosasco L., Mücke N.
Publication date: 01/01/2023

Optimization in machine learning typically deals with the minimization of empirical objectives defined by training data. The ultimate goal of learning, however, is to minimize the error on future data (test error), for which the training data provides only partial information. In this view, the optimization problems that are practically feasible are based on inexact quantities that are stochastic in nature. In this paper, we show how probabilistic results, specifically gradient concentration, can be combined with results from inexact optimization to derive sharp test error guarantees. By considering unconstrained objectives, we highlight the implicit regularization properties of optimization for learning.

Archivio istituzionale della ricerca - Università di Genova