Increasing Autonomy of Aerospace Systems via PINN-based Solutions of HJB Equation
Closed-loop optimal control is crucial for enhancing the autonomy of aerospace systems. However, its computation can be challenging, as it typically involves solving the Hamilton-Jacobi-Bellman (HJB) equation—a nonlinear partial differential equation (PDE) that poses significant numerical difficulties. This paper focuses on employing Bellman Neural Networks (BeNNs), a specialized framework within Physics-Informed Neural Networks (PINNs), to learn the solution of the HJB PDE and thereby ascertain the closed-loop optimal control. BeNNs leverage the constrained expressions from the Theory of Functional Connections and utilize shallow neural networks, trained via the Extreme Learning Machine (X-TFC) approach, to approximate the elusive solution of the HJB PDE. We achieve the solution to the nonlinear HJB by integrating the method of successive approximation with the solution of the linear Generalized HJB (GHJB) equation. The effectiveness of these frameworks is evaluated in the context of a missile pitch-plane autopilot optimal control problem. The results demonstrate that our framework can accurately compute the closed-loop optimal control within the specified domain, achieving low final errors relative to the reference states
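The successive-approximation idea in this abstract can be illustrated on a much simpler problem than the missile autopilot. The sketch below is not the BeNN framework. It assumes a scalar linear system dx/dt = a*x + b*u with cost integrand q*x^2 + r*u^2 and a quadratic value function V(x) = p*x^2, so each linear GHJB solve reduces to one algebraic equation. All function and parameter names are hypothetical.

```python
import math

def solve_ghjb(k, a, b, q, r):
    """For a fixed feedback u = -k*x, solve the linear GHJB
    V_x*(a*x + b*u) + q*x**2 + r*u**2 = 0 for p, with V(x) = p*x**2."""
    closed_loop = a - b * k
    if closed_loop >= 0.0:
        raise ValueError("policy must be stabilizing (a - b*k < 0)")
    return (q + r * k**2) / (-2.0 * closed_loop)

def successive_approximation(k0, a, b, q, r, n_iter=20):
    """Alternate GHJB solves and policy improvement u = -(b/(2r)) * V_x."""
    k = k0
    for _ in range(n_iter):
        p = solve_ghjb(k, a, b, q, r)   # policy evaluation (linear problem)
        k = b * p / r                   # policy improvement
    return p, k

def riccati_solution(a, b, q, r):
    """Closed-form solution of the scalar HJB (algebraic Riccati) equation."""
    return r * (a + math.sqrt(a**2 + b**2 * q / r)) / b**2

p, k = successive_approximation(k0=2.0, a=1.0, b=1.0, q=1.0, r=1.0)
print(p, riccati_solution(1.0, 1.0, 1.0, 1.0))
```

In this toy setting the iterates approach the Riccati solution from any stabilizing initial gain. In the nonlinear case described in the abstract, each GHJB solve is instead a linear PDE that a network has to approximate.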
Meta-Reinforcement Learning with Transformer Networks for Space Guidance Applications
Transformer neural networks have revolutionized machine learning, excelling in text and image processing. Their self-attention mechanism captures sequence dependencies, facilitating feature extraction and avoiding gradient problems of recurrent networks. Transformers naturally implement a meta-reinforcement learning framework when used in reinforcement learning, using self-attention weights as context-dependent parameters for task inference. This paper proposes a meta-reinforcement learning algorithm based on the gated transformerXL model for autonomous spacecraft guidance during a planetary landing, by considering the presence of unmodeled dynamics, inaccurate navigation data, and control errors. The method will be compared with standard reinforcement learning via a feed-forward network to demonstrate the potential of transformers for real-time and robust spacecraft guidance in uncertain mission scenarios
Bayesian inversion of coupled radiative and heat transfer models for asteroid regoliths and lakes
estimation, the quantities we seek to retrieve are considered as random variables. The randomness includes the uncertainty regarding their true values. We intend to use this approach to perform inversion of coupled radiative and heat transfer models for asteroid regoliths and lakes. The Bayesian inversion of models of this kind allows estimating optical and thermodynamic properties of the systems considered, and also allows finding any correlation among these properties, which would be quite difficult to find with the classical approaches
Comparative Analysis of Reinforcement Learning Algorithms for Robust Interplanetary Trajectory Design
This paper focuses on the application of reinforcement learning to the robust design of low-thrust interplanetary trajectories in the presence of severe dynamical uncertainties modeled as Gaussian additive process noise. A closed-loop control policy is used to steer the spacecraft to a final target state despite the perturbations. The control policy is approximated by a deep neural network, trained by reinforcement learning to output the optimal control thrust given the current spacecraft state as input. The effectiveness of three different model-free reinforcement learning algorithms is assessed and compared on a three-dimensional low-thrust transfer between Earth and Mars, selected as the case study
Robust Waypoint Guidance of a Hexacopter on Mars using Meta-Reinforcement Learning
This paper presents a meta-reinforcement learning approach to the robust and autonomous waypoint guidance of a six-rotor unmanned aerial vehicle in Mars' atmosphere. The meta-learning is implemented by using a recurrent neural network as a control policy to map data about the hexacopter state provided by onboard sensors to the six rotor angular speeds. The network is trained by proximal policy optimization, a state-of-the-art policy gradient reinforcement learning algorithm. During the training, the network is also provided with information about the previous control output and reward, to improve the policy adaptability to different environment instances. Several mission scenarios, involving uncertainties on Mars' atmosphere's properties, the presence of random wind gusts, and Gaussian noise on the collected sensor data, are investigated to assess the robustness of the proposed approach in realistic operative conditions. The flexibility and performance of meta-reinforcement learning are also compared against standard reinforcement learning with a fully-connected neural network, to better highlight the potential of the proposed methodology in real-world autonomous guidance applications
Meta-reinforcement learning for adaptive spacecraft guidance during finite-thrust rendezvous missions
In this paper, a meta-reinforcement learning approach is investigated to design an adaptive guidance algorithm capable of carrying out multiple rendezvous space missions. Specifically, both a standard fully-connected network and a recurrent neural network are trained by proximal policy optimization on a wide distribution of finite-thrust rendezvous transfers between circular co-planar orbits. The recurrent network is also provided with the control and reward at the previous simulation step, thus allowing it to build, thanks to its history-dependent state, an internal representation of the considered task distribution. The ultimate goal is to generate a model which could adapt to unseen tasks and produce a nearly-optimal guidance law along any transfer leg of a multi-target mission. As a first step towards the solution of a complete multi-target problem, a sensitivity analysis on the single rendezvous leg is carried out in this paper, by varying the radius either of the initial or the final orbit, the transfer time, and the initial phasing between the chaser and the target. Numerical results show that the recurrent-network-based meta-reinforcement learning approach is able to better reconstruct the optimal control in almost all the analyzed scenarios, and, at the same time, to meet, with greater accuracy, the terminal rendezvous condition, even when considering problem instances that fall outside the original training domain
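The input augmentation described in this abstract (feeding the recurrent policy the control and reward from the previous step) can be sketched as a thin environment wrapper. This is only an illustration under stated assumptions: `ToyEnv` is a hypothetical one-dimensional double integrator, not the rendezvous model of the paper, and the `reset`/`step` interface is assumed rather than taken from any specific library.

```python
class ToyEnv:
    """Hypothetical 1D double integrator used only to exercise the wrapper."""
    def __init__(self, dt=0.1):
        self.dt = dt
        self.state = [1.0, 0.0]

    def reset(self):
        self.state = [1.0, 0.0]
        return list(self.state)

    def step(self, action):
        x, v = self.state
        v += action[0] * self.dt
        x += v * self.dt
        self.state = [x, v]
        reward = -abs(x)                  # penalize distance from the origin
        done = abs(x) < 1e-3
        return list(self.state), reward, done


class PrevActionRewardWrapper:
    """Appends the previous action and previous reward to each observation."""
    def __init__(self, env, action_dim):
        self.env = env
        self.action_dim = action_dim

    def reset(self):
        obs = self.env.reset()
        # No previous step exists yet, so pad with zeros.
        return list(obs) + [0.0] * self.action_dim + [0.0]

    def step(self, action):
        obs, reward, done = self.env.step(action)
        return list(obs) + list(action) + [reward], reward, done


env = PrevActionRewardWrapper(ToyEnv(), action_dim=1)
first_obs = env.reset()
next_obs, reward, done = env.step([0.5])
```

A recurrent policy consuming these augmented observations can, in principle, infer which task instance it is facing from the history of its own actions and rewards. A feed-forward policy sees only the current vector.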
Physics-informed Neural Networks for Optimal Intercept Problem
The novel Extreme Theory of Functional Connections (X-TFC) method is employed to solve the optimal intercept problem. With X-TFC, for the first time, Theory of Functional Connections (TFC) and shallow Neural Networks (NNs) trained via the Extreme Learning Machine (ELM) algorithm are brought together as a class of PINN methods and applied to solving a broad class of ODEs and PDEs. In particular, the unknown solutions (in the strong sense) of the ODEs and PDEs are approximated via particular expressions, called constrained expressions (CEs), defined within TFC. A CE is a functional that always analytically satisfies the specified constraints and has a free-function that does not affect the specified constraints. In the X-TFC method, the free-function is a single-layer NN, trained via the ELM algorithm. According to the ELM algorithm, the unknown constant coefficients appear linearly and thus a least-squares method (for linear problems) or an iterative least-squares method (for nonlinear problems) is used to compute the unknowns by minimizing the residual of the differential equations. In this work, the differential equations are represented by the system arising from the indirect method formulation of optimal control problems, which exploits the Hamiltonian function and the Pontryagin Maximum/Minimum Principle (PMP) to obtain a Two-Point Boundary Value Problem. The proposed method is tested by solving the Feldbaum problem and the minimum time-energy optimal intercept problem. It is shown that the major advantage of this method is its accuracy, comparable to that of state-of-the-art methods for the solution of optimal control problems, combined with an extremely fast computational time. In particular, the low computational time makes the proposed method suitable for real-time applications
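The mechanics described in this abstract (a constrained expression that satisfies the boundary conditions analytically, a single-layer free-function with fixed random input weights, and a linear least-squares solve for the output weights) can be sketched on a toy linear boundary value problem. The example below is an assumption-laden illustration, not the intercept problem of the paper: it solves y'' + y = 0 on [0, pi/2] with y(0) = 0 and y(pi/2) = 1, whose exact solution is sin(x). The activation, weight ranges, and neuron count are arbitrary choices.

```python
import numpy as np

def xtfc_solve(n_neurons=40, n_points=100, seed=0):
    """Toy X-TFC-style solve of y'' + y = 0, y(0) = 0, y(L) = 1, L = pi/2."""
    rng = np.random.default_rng(seed)
    L = np.pi / 2
    # ELM: input weights and biases are drawn once and never trained.
    w = rng.uniform(-3.0, 3.0, n_neurons)
    b = rng.uniform(-3.0, 3.0, n_neurons)

    def h(x):
        # Hidden-layer outputs, shape (len(x), n_neurons).
        return np.tanh(np.outer(x, w) + b)

    def h_xx(x):
        # Second derivative of tanh(w*x + b) with respect to x.
        t = np.tanh(np.outer(x, w) + b)
        return -2.0 * t * (1.0 - t**2) * w**2

    x = np.linspace(0.0, L, n_points)
    s = (x / L)[:, None]
    h0 = h(np.array([0.0]))
    hL = h(np.array([L]))

    # Constrained expression: y = g(x) + (1 - s)*(0 - g(0)) + s*(1 - g(L)),
    # with free-function g(x) = h(x) @ beta. It meets both boundary
    # conditions for any beta.
    basis = h(x) - (1.0 - s) * h0 - s * hL

    # Residual y'' + y = (h_xx + basis) @ beta + x/L is linear in beta,
    # so a single least-squares solve gives the output weights.
    A = h_xx(x) + basis
    rhs = -x / L
    beta, *_ = np.linalg.lstsq(A, rhs, rcond=None)

    y = basis @ beta + x / L
    return x, y

x, y = xtfc_solve()
print("max abs error vs sin(x):", float(np.max(np.abs(y - np.sin(x)))))
```

For a nonlinear problem such as the two-point boundary value problem arising from the indirect method, the residual is no longer linear in the output weights, and the abstract indicates that an iterative least-squares scheme is used instead of a single solve.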
Application of ZEM/ZEV guidance for closed-loop transfer in the Earth-Moon System
The vision for advanced missions to the lunar far-side brings many challenges to the forefront of the space community. This paper focuses on two subjects pertaining to lunar far-side operations: utilization of a feedback guidance algorithm and the ability to exploit invariant manifolds for reaching desired states in a fuel-efficient manner. The performance of the δr/δv form of the ZEM/ZEV (Zero-Effort-Miss/Zero-Effort-Velocity) feedback guidance algorithm is evaluated through scenarios in the Circular Restricted Three-Body Problem. All scenarios include the use of invariant manifolds to complete transfers. A grid search is implemented to find invariant manifolds and δr/δv trajectories that minimize the fuel used during the transfer. Overall, the use of the δr/δv guidance algorithm for targeting manifolds is successful when thrust is not saturated, but becomes problematic when thrust saturation exists. Conclusions were drawn by exploring halo-to-halo transfers as well as trajectories descending from and ascending to the lunar far-side
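For readers unfamiliar with ZEM/ZEV feedback, the sketch below shows the classical form of the law, a = 6*ZEM/t_go^2 - 2*ZEV/t_go, under a constant-gravity assumption. This is not the δr/δv form evaluated in the paper, it does not model the Circular Restricted Three-Body Problem, and it ignores thrust saturation. The function names and the numerical scenario are hypothetical.

```python
def zem_zev_accel(r, v, r_f, v_f, g, t_go):
    """Classical ZEM/ZEV commanded acceleration under constant gravity g."""
    # Zero-effort miss: target position minus the coasting position at t_f.
    zem = [rf - (ri + vi * t_go + 0.5 * gi * t_go**2)
           for ri, vi, rf, gi in zip(r, v, r_f, g)]
    # Zero-effort velocity error: target velocity minus coasting velocity.
    zev = [vf - (vi + gi * t_go) for vi, vf, gi in zip(v, v_f, g)]
    return [6.0 * m / t_go**2 - 2.0 * e / t_go for m, e in zip(zem, zev)]

def simulate(r0, v0, r_f, v_f, g, t_f, n_steps):
    """Closed-loop propagation with the command held constant over each step."""
    dt = t_f / n_steps
    r, v = list(r0), list(v0)
    for k in range(n_steps):
        t_go = (n_steps - k) * dt          # never reaches zero inside the loop
        a = zem_zev_accel(r, v, r_f, v_f, g, t_go)
        # Exact update for constant total acceleration (a + g) over dt.
        r = [ri + vi * dt + 0.5 * (ai + gi) * dt**2
             for ri, vi, ai, gi in zip(r, v, a, g)]
        v = [vi + (ai + gi) * dt for vi, ai, gi in zip(v, a, g)]
    return r, v

r_end, v_end = simulate(
    r0=[1000.0, 500.0, 2000.0], v0=[-20.0, 10.0, -50.0],
    r_f=[0.0, 0.0, 0.0], v_f=[0.0, 0.0, 0.0],
    g=[0.0, 0.0, -1.62], t_f=60.0, n_steps=600,
)
print(r_end, v_end)
```

Because the law divides by powers of the time-to-go, the commanded acceleration is unbounded in general, which is consistent with the abstract's remark that performance degrades once thrust saturates.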
Autonomous guidance for cislunar orbit transfers via reinforcement learning
This paper investigates the use of reinforcement learning for the optimal guidance of a spacecraft during a time-free low-thrust transfer between two libration point orbits in the cislunar environment. To this aim, a deep neural network is trained via Proximal Policy Optimization to map any spacecraft state to the optimal control action. A general-purpose reward is used to guide the network toward a fuel-optimal control law regardless of the specific orbits considered, and without the use of any ad-hoc reward shaping technique. Finally, the learned control policies are compared with the optimal solutions provided by a direct method in two different mission scenarios, and Monte Carlo simulations are used to assess the policies' robustness to navigation uncertainties
