Freie Universität Berlin

Repository: Freie Universität Berlin (FU), Math Department (fu_mi_publications)
    2251 research outputs found

    Federated Learning with Deep Neural Networks: A Privacy-Preserving Approach to Enhanced ECG Classification

    No full text
    In response to increasing data privacy regulations, this work examines the use of federated learning for deep residual networks to diagnose cardiac abnormalities from electrocardiogram (ECG) data. This approach allows medical institutions to collaborate without exchanging raw patient data. We utilize the publicly available data from the PhysioNet/Computing in Cardiology Challenge 2021, featuring diverse ECG databases, to compare the classification performance of three federated learning methods against both central training with data sharing and isolated training scenarios. We show that federated learning outperforms ECG classifiers trained in isolation. In particular, our findings demonstrate that a globally trained model fine-tuned to specific local datasets surpasses non-collaborative approaches. This shows that models trained in federation learn general features that can be tailored to specific tasks. Furthermore, federated learning almost matches the performance of central training with data sharing on out-of-distribution data from non-participating institutions. These results highlight the ability of federated learning to develop models that generalize well across diverse patient data, without the need to share data among institutions, thus addressing data privacy concerns.
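The abstract does not name its three federated learning methods. As a rough illustration of how institutions can combine models without exchanging raw data, the sketch below assumes plain federated averaging (FedAvg), where a server averages client parameters weighted by local dataset size. The function name `federated_average`, the hospital parameter vectors, and the dataset sizes are all hypothetical.

```python
from typing import List, Sequence


def federated_average(client_params: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Average per-client parameter vectors, weighted by local sample count."""
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one dataset size per client")
    total = sum(client_sizes)
    dim = len(client_params[0])
    averaged = [0.0] * dim
    for params, size in zip(client_params, client_sizes):
        if len(params) != dim:
            raise ValueError("all clients must share one parameter shape")
        for i, value in enumerate(params):
            # Each client contributes in proportion to its share of the data.
            averaged[i] += (size / total) * value
    return averaged


# Hypothetical example: three hospitals share parameters, never ECG records.
hospital_params = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
hospital_sizes = [100, 100, 200]
global_params = federated_average(hospital_params, hospital_sizes)
```

In a full training loop this averaging step would alternate with local training rounds at each institution; only the parameter vectors cross institutional boundaries.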

    Barrier-crossing transition-path times for non-Markovian systems

    No full text
    By simulation and asymptotic theory, we investigate the transition-path time of a one-dimensional finite-mass reaction coordinate crossing a double-well potential in the presence of non-Markovian friction. First, we consider single-exponential memory kernels and demonstrate that memory accelerates transition paths compared to the Markovian case, especially in the low-mass/high-friction limit. Then, we generalize to multi-exponential kernels and construct an asymptotic formula for the transition-path time that compares well with simulation data

    Deriving a GENERIC system from a Hamiltonian system

    Full text link
    We reconsider the fundamental problem of coarse-graining infinite-dimensional Hamiltonian dynamics to obtain a macroscopic system which includes dissipative mechanisms. In particular, we study the thermodynamical implications concerning Hamiltonians, energy, and entropy and the induced geometric structures such as Poisson and Onsager brackets (symplectic and dissipative brackets). We start from a general finite-dimensional Hamiltonian system that is coupled linearly to an infinite-dimensional heat bath with linear dynamics. The latter is assumed to admit a compression to a finite-dimensional dissipative semigroup (i.e., the heat bath is a dilation of the semigroup) describing the dissipative evolution of new macroscopic variables. Already in the finite-energy case (zero-temperature heat bath) we obtain the so-called GENERIC structure (General Equations for Non-Equilibrium Reversible-Irreversible Coupling), with conserved energy, nondecreasing entropy, a new Poisson structure, and an Onsager operator describing the dissipation. However, their origin is not obvious at this stage. After extending the system in a natural way to the case of positive temperature, giving a heat bath with infinite energy, the compression property leads to an exact multivariate Ornstein-Uhlenbeck process that drives the rest of the system. Thus, we are able to identify a conserved energy, an entropy, and an Onsager operator (involving the Green-Kubo formalism) which indeed provide a GENERIC structure for the macroscopic system.

    Uncertainty quantification for random domains using periodic random variables

    No full text
    Partial differential equations (PDEs) with uncertain or random inputs have been considered in many studies of uncertainty quantification. In forward uncertainty quantification, one is interested in analyzing the stochastic response of the PDE subject to input uncertainty, which usually involves solving high-dimensional integrals of the PDE output over a sequence of stochastic variables. In practical computations, one typically needs to discretize the problem in several ways: approximating an infinite-dimensional input random field with a finite-dimensional random field, spatial discretization of the PDE using, e.g., finite elements, and approximating high-dimensional integrals using cubatures such as quasi-Monte Carlo methods. In this paper, we focus on the error resulting from dimension truncation of an input random field. We show how Taylor series can be used to derive theoretical dimension truncation rates for a wide class of problems and we provide a simple checklist of conditions that a parametric mathematical model needs to satisfy in order for our dimension truncation error bound to hold. Some of the novel features of our approach include that our results are applicable to non-affine parametric operator equations, dimensionally-truncated conforming finite element discretized solutions of parametric PDEs, and even compositions of PDE solutions with smooth nonlinear quantities of interest. As a specific application of our method, we derive an improved dimension truncation error bound for elliptic PDEs with lognormally parameterized diffusion coefficients. Numerical examples support our theoretical findings

    Optimizing Job/Task Granularity for Metagenomic Workflows in Heterogeneous Cluster Infrastructures

    No full text
    Data analysis workflows are popular for sequencing activities in large-scale and complex scientific processes. Scheduling approaches attempt to find an appropriate assignment of workflow tasks to the computing nodes for minimizing the makespan in heterogeneous cluster infrastructures. A common feature of these approaches is that they already know the structure of the workflow. However, for many workflows, a high degree of parallelization can be achieved by splitting the large input data of a single task into chunks and processing them independently. We call this problem task granularity, which involves finding an assignment of tasks to computing nodes and simultaneously optimizing the structure of a bag of tasks. Accordingly, this paper addresses the problem of task granularity for metagenomic workflows. To this end, we first formulated the problem as a mathematical model. We then solved the proposed model using the genetic algorithm. To overcome the challenge of not knowing the number of tasks, we adjusted the number of tasks as a factor of the number of computing nodes. The procedure of increasing the number of tasks is performed interactively and evolutionarily. Experimental results showed that a desirable makespan value can be achieved after a few steps of the increase

    Application of Dimension Truncation Error Analysis to High-Dimensional Function Approximation in Uncertainty Quantification

    No full text
    Parametric mathematical models such as parameterizations of partial differential equations with random coefficients have received a lot of attention within the field of uncertainty quantification. The model uncertainties are often represented via a series expansion in terms of the parametric variables. In practice, this series expansion needs to be truncated to a finite number of terms, introducing a dimension truncation error to the numerical simulation of a parametric mathematical model. There have been several studies of the dimension truncation error corresponding to different models of the input random field in recent years, but many of these analyses have been carried out within the context of numerical integration. In this paper, we study the dimension truncation error of the parametric model problem. Estimates of this kind arise in the assessment of the dimension truncation error for function approximation in high dimensions. In addition, we show that the dimension truncation error rate is invariant with respect to certain transformations of the parametric variables. Numerical results are presented which showcase the sharpness of the theoretical results

    Risk-neutral limit of adaptive importance sampling of random stopping times

    No full text
    We discuss importance sampling of exit problems that involve unbounded stopping times; examples are mean first passage times, transition rates or committor probabilities in molecular dynamics. The naive application of variance minimization techniques can lead to pathologies here, including proposal measures that are not absolutely continuous with respect to the reference measure or importance sampling estimators that formally have zero variance, but that produce infinitely long trajectories. We illustrate these issues with simple examples and discuss a possible solution that is based on a risk-sensitive optimal control framework of importance sampling.
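For readers unfamiliar with the basic mechanism, the sketch below shows importance sampling on a toy rare event, the Gaussian tail probability P(X > c) for X ~ N(0, 1). It is not one of the exit problems treated in the paper and involves no stopping time, so the pathologies described above do not arise in it. The shifted proposal N(c, 1) and the function name `tail_probability_is` are assumptions made for illustration.

```python
import math
import random


def tail_probability_is(threshold: float, n_samples: int, seed: int = 0) -> float:
    """Estimate P(X > threshold), X ~ N(0, 1), by sampling from N(threshold, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(threshold, 1.0)  # draw from the proposal measure
        if x > threshold:
            # Likelihood ratio of N(0, 1) to N(threshold, 1) evaluated at x.
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n_samples


threshold = 4.0
estimate = tail_probability_is(threshold, n_samples=20000)
# Closed-form reference value from the complementary error function.
exact = 0.5 * math.erfc(threshold / math.sqrt(2.0))
relative_error = abs(estimate - exact) / exact
```

The proposal here is absolutely continuous with respect to the reference measure and vice versa, which is exactly the property that can fail when variance minimization is applied naively to unbounded stopping times.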

    Characterising information gains and losses when collecting multiple epidemic model outputs

    No full text
    Background. Collaborative comparisons and combinations of epidemic models are used as policy-relevant evidence during epidemic outbreaks. In the process of collecting multiple model projections, such collaborations may gain or lose relevant information. Typically, modellers contribute a probabilistic summary at each time-step. We compared this to directly collecting simulated trajectories. We aimed to explore information on key epidemic quantities; ensemble uncertainty; and performance against data, investigating potential to continuously gain information from a single cross-sectional collection of model results. Methods. We compared July 2022 projections from the European COVID-19 Scenario Modelling Hub. Five modelling teams projected incidence in Belgium, the Netherlands, and Spain. We compared projections by incidence, peaks, and cumulative totals. We created a probabilistic ensemble drawn from all trajectories, and compared to ensembles from a median across each model’s quantiles, or a linear opinion pool. We measured the predictive accuracy of individual trajectories against observations, using this in a weighted ensemble. We repeated this sequentially against increasing weeks of observed data. We evaluated these ensembles to reflect performance with varying observed data. Results. By collecting modelled trajectories, we showed policy-relevant epidemic characteristics. Trajectories contained a right-skewed distribution well represented by an ensemble of trajectories or a linear opinion pool, but not models’ quantile intervals. Ensembles weighted by performance typically retained the range of plausible incidence over time, and in some cases narrowed this by excluding some epidemic shapes. Conclusions. We observed several information gains from collecting modelled trajectories rather than quantile distributions, including potential for continuously updated information from a single model collection. The value of information gains and losses may vary with each collaborative effort’s aims, depending on the needs of projection users. Understanding the differing information potential of methods to collect model projections can support the accuracy, sustainability, and communication of collaborative infectious disease modelling efforts. Data availability. All code and data are available on GitHub: https://github.com/covid19-forecast-hub-europe/aggregation-info-los
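The difference between the two ensemble types named in the abstract can be sketched on made-up numbers. The code below assumes equal model weights, equal sample counts per model, and a nearest-rank quantile definition; none of these choices is taken from the paper, and the three sample sets are hypothetical. With equal sample counts, pooling all samples corresponds to an equal-weight linear opinion pool, while the quantile-median ensemble takes the median across models at each quantile level.

```python
import math
import statistics
from typing import Dict, List, Sequence


def empirical_quantile(samples: Sequence[float], level: float) -> float:
    """Nearest-rank empirical quantile (an assumed definition)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(level * len(ordered)))
    return ordered[rank - 1]


def quantile_median(models: Sequence[Sequence[float]],
                    levels: Sequence[float]) -> Dict[float, float]:
    """Median across models of each model's own quantile."""
    return {q: statistics.median(empirical_quantile(m, q) for m in models)
            for q in levels}


def linear_pool(models: Sequence[Sequence[float]],
                levels: Sequence[float]) -> Dict[float, float]:
    """Quantiles of the equal-weight mixture, via pooling all samples."""
    pooled: List[float] = [x for m in models for x in m]
    return {q: empirical_quantile(pooled, q) for q in levels}


# Hypothetical incidence samples at one time step; the third is right-skewed.
models = [
    [10, 12, 14, 16, 18],
    [11, 13, 15, 17, 19],
    [10, 20, 40, 80, 160],
]
levels = [0.5, 0.95]
by_quantile_median = quantile_median(models, levels)
by_linear_pool = linear_pool(models, levels)
```

On this toy input the pooled ensemble keeps the long upper tail contributed by the skewed model, whereas the quantile median discards it, which mirrors the qualitative finding reported in the Results.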

    Connecting Stochastic Optimal Control and Reinforcement Learning

    No full text
    In this paper the connection between stochastic optimal control and reinforcement learning is investigated. Our main motivation is to apply importance sampling to sampling rare events which can be reformulated as an optimal control problem. By using a parameterised approach the optimal control problem becomes a stochastic optimization problem which still raises some open questions regarding how to tackle the scalability to high-dimensional problems and how to deal with the intrinsic metastability of the system. To explore new methods we link the optimal control problem to reinforcement learning since both share the same underlying framework, namely a Markov Decision Process (MDP). For the optimal control problem we show how the MDP can be formulated. In addition we discuss how the stochastic optimal control problem can be interpreted in the framework of reinforcement learning. At the end of the article we present the application of two different reinforcement learning algorithms to the optimal control problem and a comparison of the advantages and disadvantages of the two algorithms

    Optimal sampling for stochastic and natural gradient descent (Robert Gruhlke)

    No full text
    We consider the problem of optimising the expected value of a loss functional over a nonlinear model class of functions, assuming that we only have access to realisations of the gradient of the loss. This is a classical task in statistics, machine learning and physics-informed machine learning. A straightforward solution is to replace the exact objective with a Monte Carlo estimate before employing standard first-order methods like gradient descent, which yields the classical stochastic gradient descent method. But replacing the true objective with an estimate incurs a "generalisation error". Rigorous bounds for this error typically require strong compactness and Lipschitz continuity assumptions while providing a very slow decay with sample size. We propose a different optimisation strategy relying on a natural gradient descent in which the true gradient is approximated in local linearisations of the model class via (quasi-)projections based on optimal sampling methods. Under classical assumptions on the loss and the nonlinear model class, we prove that this scheme converges almost surely monotonically to a stationary point of the true objective and we provide convergence rates.
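The classical baseline mentioned in the abstract, stochastic gradient descent driven by gradient realisations, can be sketched on a one-parameter toy problem. This is not the natural gradient scheme the paper proposes. The objective E[0.5 (theta - X)^2] with X ~ N(2, 1), the step size 1/k, and the name `sgd_mean` are assumptions chosen so that the iterate can be checked by hand: with this step size it equals the running sample mean.

```python
import random
from typing import Iterable


def sgd_mean(samples: Iterable[float], theta0: float = 0.0) -> float:
    """Minimise E[0.5 * (theta - X)**2] from one gradient realisation per step."""
    theta = theta0
    for k, x in enumerate(samples, start=1):
        gradient_sample = theta - x    # gradient of 0.5 * (theta - x)**2
        theta -= gradient_sample / k   # decaying step size 1/k
    return theta


rng = random.Random(1)
samples = [rng.gauss(2.0, 1.0) for _ in range(5000)]
theta_hat = sgd_mean(samples)
sample_mean = sum(samples) / len(samples)
```

The gap between `theta_hat` and the true minimiser 2.0 is a small-scale instance of the generalisation error the abstract refers to: the iterate is exact for the sampled objective but only approximate for the true one.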

    783 full texts, 2,251 metadata records
    Updated in last 30 days.