Bellman's principle of optimality and deep reinforcement learning for time-varying tasks
This paper presents the first framework (to the authors' knowledge) to address time-varying objectives in finite-horizon Deep Reinforcement Learning (DeepRL), based on a switching control solution built on Bellman's principle of optimality. By augmenting the state space of the system with information on its visit time, the DeepRL agent is able to solve problems in which its task changes dynamically within the same episode. To address the scalability problems caused by the state space augmentation, we propose a procedure that partitions the episode length into separate sub-problems, each of which is then solved by a specialised DeepRL agent. Contrary to standard solutions, with the proposed approach the DeepRL agents correctly estimate the value function at each time-step and are hence able to solve time-varying tasks. Numerical simulations validate the approach in a classic RL environment.
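The role of the visit time in the value function can be illustrated with a small tabular sketch. This is not the paper's DeepRL method: it is plain backward induction (Bellman's principle of optimality over a finite horizon) on a hypothetical five-state chain whose rewarded target switches halfway through the episode. The environment, the function names, and the numbers are all assumptions made for illustration.

```python
def backward_induction(n_states, horizon, reward):
    """Finite-horizon dynamic programming with a time-indexed value table.

    V[t][s] is the best return obtainable from state s at time t, so the
    pair (t, s) plays the role of the time-augmented state.
    """
    V = [[0.0] * n_states for _ in range(horizon + 1)]  # V[horizon][s] = 0
    policy = [[0] * n_states for _ in range(horizon)]
    for t in range(horizon - 1, -1, -1):  # sweep backwards in time
        for s in range(n_states):
            best_value, best_action = None, None
            for a in (-1, +1):  # move left or right along the chain
                s_next = min(max(s + a, 0), n_states - 1)  # clip at the ends
                q = reward(t, s_next) + V[t + 1][s_next]
                if best_value is None or q > best_value:
                    best_value, best_action = q, a
            V[t][s] = best_value
            policy[t][s] = best_action
    return V, policy


# Hypothetical time-varying task: the rewarded state is 0 before the
# switch time and the last state afterwards.
N_STATES, HORIZON, SWITCH = 5, 8, 4


def switching_reward(t, s_next):
    target = 0 if t < SWITCH else N_STATES - 1
    return 1.0 if s_next == target else 0.0


V, policy = backward_induction(N_STATES, HORIZON, switching_reward)
print(V[0][0], V[HORIZON - 1][0])  # same state, different values over time
```

In this toy problem the value of state 0 depends on when it is visited, which is exactly what a value function over the state alone cannot represent.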
Stability and Wardrop Equilibria of Non-Cooperative Routing With Time-Varying Load
Non-cooperative or selfish routing problems emerge in several applications of network control theory. Considering a multi-commodity setting subject to time-varying traffic demands, this paper studies the convergence properties of a family of non-cooperative routing control laws, originally developed in the literature for constant traffic demands. By employing results from hybrid systems theory and set stability, this paper identifies the minimum time between bounded load variations that ensures the convergence of the controlled system towards a set of approximated Wardrop equilibria. Numerical simulations validate the results on a test scenario.
A Weighted Average Consensus Approach for Decentralized Federated Learning
Federated learning (FedL) is a machine learning (ML) technique used to train deep neural networks (DeepNNs) in a distributed way without the need to share data among the federated training clients. FedL was proposed for edge computing and Internet of Things (IoT) tasks in which a centralized server was responsible for coordinating and governing the training process. To remove the design limitation implied by the centralized entity, this work proposes two different solutions to decentralize existing FedL algorithms, enabling the application of FedL on networks with arbitrary communication topologies, and thus extending the domain of application of FedL to more complex scenarios and new tasks. Of the two proposed algorithms, one, called FedLCon, is developed based on results from discrete-time weighted average consensus theory and is able to recover the performance of the standard centralized FedL solutions, as also shown by the reported validation tests.
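The abstract does not give the FedLCon update rule, so the following is only a generic sketch of discrete-time weighted average consensus, not the paper's algorithm. It assumes an undirected, connected communication graph, uses Metropolis mixing weights (an assumption), and lets each node hold one scalar parameter plus a weight such as its local dataset size. Every node runs ordinary average consensus on the weighted value and on the weight itself, and the ratio of the two converges to the weighted average that a central server would compute. All names and numbers are hypothetical.

```python
def metropolis_step(values, neighbors):
    """One synchronous consensus iteration with Metropolis weights."""
    updated = []
    for i, x_i in enumerate(values):
        acc = x_i
        for j in neighbors[i]:
            # Edge weight depends only on the degrees of the two endpoints.
            w_ij = 1.0 / (1 + max(len(neighbors[i]), len(neighbors[j])))
            acc += w_ij * (values[j] - x_i)
        updated.append(acc)
    return updated


def weighted_average_consensus(params, weights, neighbors, iterations=200):
    """Each node estimates sum(w_i * p_i) / sum(w_i) using only neighbor exchanges."""
    numerators = [w * p for w, p in zip(weights, params)]
    denominators = [float(w) for w in weights]
    for _ in range(iterations):
        numerators = metropolis_step(numerators, neighbors)
        denominators = metropolis_step(denominators, neighbors)
    return [n / d for n, d in zip(numerators, denominators)]


# Hypothetical 4-node line topology: 0 - 1 - 2 - 3 (no central server).
neighbors = [[1], [0, 2], [1, 3], [2]]
local_params = [1.0, 2.0, 3.0, 4.0]  # e.g. one model parameter per client
dataset_sizes = [10, 20, 30, 40]     # weights, e.g. local sample counts

estimates = weighted_average_consensus(local_params, dataset_sizes, neighbors)
centralized = sum(w * p for w, p in zip(dataset_sizes, local_params)) / sum(dataset_sizes)
print(estimates, centralized)
```

In a real FedL setting the scalar would be replaced by the full parameter vector of each client's network, with the same update applied element-wise.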
Precise Orbit Determination on LEO Satellite using Pseudorange and Pseudorange-Rate Measurements
Along with the trend towards highly autonomous satellites, there is a strong motivation to improve real-time Precise Orbit Determination (POD), in particular for Low Earth Orbit (LEO) satellites. The development of Global Navigation Satellite System (GNSS) sensors makes it possible to obtain low-noise measurements and to provide a satellite with autonomous, continuous onboard tracking. Following the deactivation of Selective Availability, a representative real-time positioning accuracy of 10 m is presently achieved by means of Global Positioning System (GPS) receivers on LEO satellites. The introduction of dynamical filtering methods has opened a new way to improve this accuracy by making use of measurements such as pseudorange or carrier-phase. This paper presents a Kalman filtering approach using pseudorange and pseudorange-rate measurements instead of pseudorange and carrier-phase ones, with advantages in terms of storage and processing requirements. An error of around 0.2 m in position and 1e-3 m/s in velocity is obtained, which is in line with, if not better than, other approaches.
Comparative evaluation of contact ultrasonography and transcystic cholangiography during laparoscopic cholecystectomy
- …
