1,721,047 research outputs found
From Functional Optimization to Nonlinear Programming by the Extended Ritz Method
This chapter describes the approximate solution of infinite-dimensional optimization problems by the “Extended Ritz Method” (ERIM). The ERIM consists in substituting the admissible functions with fixed-structure parametrized (FSP) functions containing vectors of “free” parameters. The larger the dimensions, the more accurate the approximations of the optimal solutions of the original functional optimization problems. This requires solving easier nonlinear programming problems. In the area of function approximation, we review the definition of approximating sequences of sets, which enjoy the property of density in the sets of functions one wants to approximate. Then, we provide the definition of polynomially complex approximating sequences of sets, which are able to approximate functions provided with suitable regularity properties by using, for a desired arbitrary accuracy, a number of “free” parameters increasing at most polynomially when the number of function arguments grows. In the less studied area of approximate solution of infinite-dimensional optimization problems, the optimizing sequences and the polynomially complex optimizing sequences of FSP functions are defined. Results are presented that allow to conclude that, if appropriate hypotheses occur, polynomially complex approximating sequences of sets give rise to polynomially complex optimizing sequences of FSP functions, possibly mitigating the curse of dimensionality
Deterministic Optimal Control over a Finite Horizon
This chapter addresses discrete-time deterministic problems, where optimal controls have to be generated at a finite number of decision stages. No random variables influence either the dynamic system or the cost function. Then, there is no necessity of estimating the state vector. Such optimization problems are stated for their intrinsic practical importance and to describe the basic concepts of dynamic programming. As the problems are formulated under very general assumptions, their optimal solutions cannot be found in an analytical form. Therefore, we resort to an approximation consisting of the discretization of the state space into suitable grids at each decision stage. The discretization by regular grids is the simplest approach (and the one most widely used until some time ago). However, unless a small dimension of the state space is considered, this approach leads to an exponential growth of the number of samples, and thus to the curse of dimensionality. Therefore, the discretization by deterministic sequences of samples is addressed, which spread the samples in the most uniform way. Specifically, low-discrepancy sequences are considered, like quasi-Monte Carlo sequences. We also point out that the optimization problem can even be viewed as a nonlinear programming problem solvable by gradient-based descent techniques
Design of Mathematical Models by Learning From Data and FSP Functions
First, well-known concepts from Statistical Learning Theory are reviewed. In reference to the problem of modelling an unknown input/output (I/O) relationship by fixed-structure parametrized functions, the concepts of expected risk, empirical risk, and generalization error are described. The last error is then split into approximation and estimation errors. Four quantities of interest are emphasized: the accuracy, the number of arguments of the I/O relationship, the model complexity, and the number of samples generated for the estimation. The possibility of generating such samples by deterministic algorithms like quasi-Monte Carlo methods, orthogonal arrays, Latin hypercubes, etc. gives rise to the so-called Deterministic Learning Theory. This possibility is an intriguing alternative to the random generation of input data, typically obtained by using Monte Carlo techniques, since it enables one to reduce the number of samples (under the same accuracy) and to obtain upper bounds on the errors in deterministic terms rather than in probabilistic ones. Deterministic learning relies on some basic quantities such as variation and discrepancy. Special families of deterministic sequences called “low-discrepancy sequences” are useful in the computation of integrals and in dynamic programming, to mitigate the danger of incurring the curse of dimensionality deriving from the use of regular grids
Numerical Methods for Integration and Search for Minima
Two topics are addressed. The first refers to the numerical computation of integrals and expected values of functions that may depend on a large number of random variables. Of course, integration includes the computation of the expected values of functions dependent on random variables. However, the latter shows peculiar nontrivial aspects that the former does not have. In case of a large number of random variables, the use of regular grids implies the risk of incurring the curse of dimensionality. Then, suitable sampling methods are taken into account to reduce such risk. In particular, Monte Carlo and quasi-Monte Carlo sequences are addressed. The second topic refers to the solution of the nonlinear programming problems obtained from the approximation of infinite-dimensional optimization problems by the Extended Ritz Method. We mention a few well-known direct techniques and gradient-based descent algorithms. In the case of nonlinear programming problems stated in stochastic frameworks, the stochastic approximation approach deserves attention and thus it is considered in some detail. Within this context, we describe the stochastic gradient algorithm enabling one to avoid the computation of integrals, hence, the computation of expected values of functions dependent on random variables. Convergence properties of that algorithm are reported
The Basic Infinite-Dimensional or Functional Optimization Problem
In infinite-dimensional or functional optimization problems, one has to minimize (or maximize) a functional with respect to admissible solutions belonging to infinite-dimensional spaces of functions, often dependent on a large number of variables. As we consider optimization problems characterized by very general conditions, optimal solutions might not be found analytically and classical numerical methods might fail. In these cases, one must use suitable approximation methods. This chapter describes an approximation method that uses families of nonlinear approximators – including commonly used shallow and deep neural networks as special cases – to reduce infinite-dimensional problems to finite-dimensional nonlinear programming ones. It is named “Extended Ritz Method” (ERIM). This term originates from the classical Ritz method, which uses linear combinations of functions as an approximation tool. It does not seem that the Ritz method has met with important successes as regards problems whose admissible solutions depend on large number of variables. This might be ascribed to the fact that this method can be subject to one of the forms of the so-called “curse of dimensionality.” However, theoretical and numerical results show that the ERIM might mitigate this drawback. Examples of infinite-dimensional optimization problems are presented that can be approximated by nonlinear programming ones
Optimal Control Problems over an Infinite Horizon
Optimal control problems over an infinite number of decision stages are considered with emphasis on the deterministic scenario. Both the open-loop and the closed-loop formulations are given and conditions for the existence of a stationary optimal control law are provided. Unless strong assumptions are made on the dynamic system and on the random variables (if present), the design of the optimal infinite-horizon controllers is an almost impossible task. Then, the well-known “receding-horizon” (RH) approximation is considered and the optimal control problem is restated accordingly. In the second part of the chapter, we consider the fundamental issue of closed-loop stability that arises owing to the infinite number of decision stages. More specifically, we address the stability properties of the closed-loop deterministic system under the action of approximate RH control laws obtained by the “extended Ritz method” and implemented through fixed-structure parametrized functions containing vectors of “free” parameters. Conditions are established on the maximum allowable approximation errors so as to ensure the boundedness of the state trajectories
Some Families of FSP Functions and Their Properties
We report properties of fixed-structure parametrized (FSP) functions that give insights into the effectiveness of the “Extended Ritz Method” (ERIM) as a methodology for the approximate solution of infinite-dimensional optimization problems. First, we present the structure of some widespread FSP functions, including linear combinations of fixed-basis functions, one-hidden-layer (OHL) and multiple-hidden-layer (MHL) networks, and kernel smoothing models. Second, focusing on the case of OHL neural networks based on ridge and radial constructions, we report their density properties under different metrics. Third, we present rates of function approximation via ridge OHL neural networks, by reporting a fundamental theorem by Maurey, Jones, and Barron, together with its extensions, based on a norm tailored to approximation by computational units from a given set of functions. We also discuss approximation properties valid for MHL networks. Fourth, we compare the classical Ritz method and the ERIM from the point of view of the curse of dimensionality, proving advantages of the latter for a specific class of problems, where the functional to be optimized is quadratic. Finally, we provide rates of approximate optimization by the ERIM, based on the concepts of modulus of continuity and modulus of convexity of the functional to be optimized
Team Optimal Control Problems
We consider discrete-time stochastic optimal control problems over a finite number of decision stages in which several controllers share different information and aim at minimizing a common cost functional. This organization can be described within the framework of “team theory.” Unlike the classical optimal control problems, linear-quadratic-Gaussian hypotheses are sufficient neither to obtain an optimal solution in closed-loop form nor to understand whether an optimal solution exists. In order to obtain an optimal solution in closed-loop form, additional suitable assumptions must be introduced on the “information structure” of the team. The “information structure” describes the way in which each controller’s information vector is influenced by the stochastic environment and by the decisions of the other controllers. Dynamic programming cannot be applied unless the information structure takes particular forms. On the contrary, the “extended Ritz method” (ERIM) can be always applied. The ERIM consists in substituting the admissible functions with fixed-structure parametrized functions containing vectors of “free” parameters. The ERIM is tested in two case studies. The former is the well-known Witsenhausen counterexample. The latter is an optimal routing problem in a store-and-forward packet-switching network
Stochastic Optimal Control with Perfect State Information over a Finite Horizon
Discrete-time stochastic optimal control problems are considered. These problems are stated over a finite number of decision stages. The state vector is assumed to be observed through a noisy measurement channel. Because of the very general assumptions under which the problems are stated, obtaining analytically optimal solutions is practically impossible. Note that the controller has to retain the vector of all the measures and of all the controls in memory, up to the most recent decision stage. Such measures and controls constitute the “information vector” that the control function depends on. The increasing dimension of the information vector makes it practically impossible to use dynamic programming. Then, we resort to the “extended Ritz method” (ERIM). The ERIM consists in substituting the admissible functions with fixed-structure parametrized functions containing vectors of “free” parameters. Of course, if the number of decision stages is large, the application of the ERIM is also impossible. Therefore, an approximate approach is followed by truncating the information vector and retaining in the memory only a suitable “limited-memory information vector.
- …
