Automated curriculum learning for embodied agents: a neuroevolutionary approach
We demonstrate how the evolutionary training of embodied agents can be extended with a curriculum learning algorithm that automatically selects the environmental conditions in which the evolving agents are evaluated. The environmental conditions are selected to match the level of difficulty to the ability of the current evolving agents and to challenge their weaknesses. The method does not require domain knowledge and does not introduce additional hyperparameters. The results collected on two benchmark problems, which require solving a task in significantly varying environmental conditions, demonstrate that the proposed method outperforms conventional learning methods and generates solutions that are robust to variations and able to cope with different environmental conditions.
Qualitative differences between evolutionary strategies and reinforcement learning methods for control of autonomous agents
In this paper we analyze the qualitative differences between evolutionary strategies and reinforcement learning algorithms by focusing on two popular state-of-the-art algorithms: the OpenAI-ES evolutionary strategy and the Proximal Policy Optimization (PPO) reinforcement learning algorithm – the most similar methods of the two families. We analyze how the methods differ with respect to: (i) general efficacy, (ii) ability to cope with rewards that are sparse in time, (iii) propensity/capacity to discover minimal solutions, (iv) dependency on reward shaping, and (v) ability to cope with variations of the environmental conditions. The analysis of the performance and of the behavioral strategies displayed by the agents trained with the two methods on benchmark problems enables us to demonstrate qualitative differences that were not identified in previous studies, to identify the relative weaknesses of the two methods, and to propose ways to ameliorate some of those weaknesses. We show that the characteristics of the reward function have a strong impact, which varies qualitatively not only between the OpenAI-ES evolutionary algorithm and the PPO reinforcement learning algorithm but also across other reinforcement learning algorithms, thus demonstrating the importance of tailoring the characteristics of the reward function to the algorithm used.
Enhancing Cartesian genetic programming through preferential selection of larger solutions
We demonstrate how the efficiency of Cartesian genetic programming methods can be enhanced through the preferential selection of phenotypically larger solutions among equally good solutions. The advantage is demonstrated in two qualitatively different problems: the eight-bit parity problem and the “Paige” regression problem. In both cases, the preferential selection of larger solutions provides an advantage in terms of performance and speed, i.e. the number of evaluations required to evolve optimal or high-quality solutions. Performance can be further enhanced by self-adapting the mutation rate through the one-fifth success rule. Finally, we demonstrate that, for problems like the Paige regression in which neutrality plays a smaller role, performance can be further improved by preferentially selecting larger solutions also among candidates with similar fitness.
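The one-fifth success rule mentioned in the abstract above can be sketched as follows. This is the classic Rechenberg-style form, not necessarily the variant used in the paper: the function name `one_fifth_update`, the adaptation constant `c`, and the clamping bounds `lo` and `hi` are all assumptions made for illustration.

```python
def one_fifth_update(rate, successes, trials, c=0.85, lo=1e-4, hi=0.5):
    """Adapt a mutation rate with the classic one-fifth success rule.

    successes / trials is the fraction of recent mutations that improved
    the parent. The constant c and the bounds lo/hi are illustrative
    assumptions, not values taken from the paper.
    """
    ratio = successes / trials
    if ratio > 0.2:
        # Too many successes: the search is too cautious, so mutate more.
        rate = rate / c
    elif ratio < 0.2:
        # Too few successes: the search is too disruptive, so mutate less.
        rate = rate * c
    # Keep the rate inside a sane range.
    return min(hi, max(lo, rate))


# Hypothetical usage: 3 improving offspring out of the last 10.
new_rate = one_fifth_update(0.1, successes=3, trials=10)
```

The rate is left unchanged when exactly one fifth of the mutations succeed, which is the equilibrium the rule aims for.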
Spatial Frames of Reference and Action: A Study with Evolved Neuro-agents
Solving spatial tasks is crucial for adaptation and is made possible by the representation of space. The exact nature of this representation, which can rely on egocentric and allocentric frames of reference, is still debated. In this paper, a modelling approach is proposed to complement research on humans and animal models. Artificial agents, simulated mobile robots controlled by an artificial neural network, are evolved through evolutionary strategies to solve a spatial task that consists of locating the central area between two landmarks in a rectangular enclosure. This is a non-trivial task that requires the agent to identify the landmarks' locations, the spatial relation between the landmarks, and the landmarks' position relative to the environment. Different populations of agents with different spatial frames of reference are compared. Results indicate that both egocentric and allocentric frames of reference are effective, but allocentric frames give advantages and lead to better performance.
Going Beyond Counting First Authors in Author Co-citation Analysis
The present study examines one of the fundamental aspects of author co-citation analysis (ACA) - the way co-citation counts are defined. Co-citation counting provides the data on which all subsequent statistical analyses and mappings are based, and we compare ACA results based on two different types of co-citation counting - the traditional type, which counts only the first author of a cited work, and a non-traditional type, which takes into account the first 5 authors of a cited work. Results indicate that the picture produced through this non-traditional author co-citation counting contains more coherent author groups and is therefore considerably clearer. However, this picture represents fewer specialties in the research field being studied than that produced through the traditional first-author co-citation counting when the same number of top-ranked authors is selected and analyzed. Reasons for these effects are discussed.
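The difference between the two counting schemes described above can be illustrated with a small sketch. It assumes a simple data layout (each citing paper is a list of cited works, each cited work an ordered list of author names) and counts each author pair at most once per citing paper; the function name `cocitation_counts` and this counting convention are assumptions, not details taken from the study.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists, max_authors=1):
    """Count author co-citations over a collection of citing papers.

    reference_lists: one entry per citing paper; each entry is a list of
    cited works, and each cited work is an ordered list of author names.
    max_authors=1 gives first-author counting; max_authors=5 considers
    the first five authors of every cited work.
    """
    counts = Counter()
    for cited_works in reference_lists:
        cited_authors = set()
        for authors in cited_works:
            cited_authors.update(authors[:max_authors])
        # Every unordered pair of authors cited by the same paper
        # is co-cited once by that paper.
        for pair in combinations(sorted(cited_authors), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: two citing papers.
papers = [
    [["A", "B"], ["C"]],
    [["A"], ["B", "C"]],
]
first_author_only = cocitation_counts(papers, max_authors=1)
first_five = cocitation_counts(papers, max_authors=5)
```

With the toy data, first-author counting never pairs B with C, whereas counting the first five authors does, which is the kind of additional linkage the abstract attributes to the non-traditional scheme.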
Moderate Environmental Variation Across Generations Promotes the Evolution of Robust Solutions
Previous evolutionary studies demonstrated how robust solutions can be obtained by evaluating agents multiple times in variable environmental conditions. Here we demonstrate how agents evolved in environments that vary across generations outperform agents evolved in environments that remain fixed. Moreover, we demonstrate that the best performance is obtained when the environment varies at a moderate rate across generations, that is, when the environment does not vary every generation but every N generations. The advantage of exposing evolving agents to environments that vary across generations at a moderate rate is due, at least in part, to the fact that this condition maximizes the retention of changes that alter the behavior of the agents, which in turn facilitates the discovery of better solutions. Finally, we demonstrate that moderate environmental variations are also advantageous from an evolutionary computation perspective, that is, from the perspective of maximizing the performance that can be achieved within a limited computational budget.
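The idea of varying the environment every N generations rather than every generation can be sketched as a schedule of environment seeds. The function name `environment_seed_schedule`, the use of integer seeds to stand for environmental conditions, and the parameter `period` (the N of the abstract) are assumptions made for illustration.

```python
import random


def environment_seed_schedule(generations, period, rng):
    """Return one environment seed per generation.

    A new seed (standing for a new set of environmental conditions) is
    drawn only every `period` generations; period=1 changes the
    environment every generation, while a very large period keeps it
    fixed for the whole run.
    """
    seeds = []
    current = None
    for generation in range(generations):
        if generation % period == 0:
            current = rng.randrange(2**31)
        seeds.append(current)
    return seeds


# Hypothetical usage: 10 generations, environment changes every 5.
schedule = environment_seed_schedule(10, period=5, rng=random.Random(0))
```

Inside an evolutionary loop, all individuals of generation g would then be evaluated in the conditions generated from `schedule[g]`.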
Efficacy of Modern Neuro-Evolutionary Strategies for Continuous Control Optimization
We analyze the efficacy of modern neuro-evolutionary strategies for continuous control optimization. Overall, the results collected on a wide variety of qualitatively different benchmark problems indicate that these methods are generally effective and scale well with respect to the number of parameters and the complexity of the problem. Moreover, they are relatively robust with respect to the setting of hyper-parameters. The comparison of the most promising methods indicates that the OpenAI-ES algorithm outperforms or equals the other algorithms on all considered problems. Moreover, we demonstrate how the reward functions optimized for reinforcement learning methods are not necessarily effective for evolutionary strategies and vice versa. This finding can lead to a reconsideration of the relative efficacy of the two classes of algorithms, since it implies that the comparisons performed to date are biased toward one or the other class.
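For readers unfamiliar with this family of methods, the core update of an OpenAI-ES-style evolutionary strategy can be sketched as below. This is a deliberately simplified illustration, not the implementation evaluated in the paper: it uses mirrored Gaussian perturbations and plain gradient ascent, and omits refinements such as fitness rank normalization, the Adam optimizer and weight decay. The function name `es_step` and all default parameter values are assumptions.

```python
import random


def es_step(theta, fitness, rng, pairs=20, sigma=0.1, lr=0.05):
    """One simplified evolutionary-strategy update of the parameters theta.

    Each of the `pairs` Gaussian perturbations is evaluated with both
    signs (mirrored sampling); the fitness differences weight the
    perturbations to form a gradient estimate, followed by plain
    gradient ascent.
    """
    grad = [0.0] * len(theta)
    for _ in range(pairs):
        eps = [rng.gauss(0.0, 1.0) for _ in theta]
        f_plus = fitness([t + sigma * e for t, e in zip(theta, eps)])
        f_minus = fitness([t - sigma * e for t, e in zip(theta, eps)])
        for i, e in enumerate(eps):
            grad[i] += (f_plus - f_minus) * e
    # Gradient estimate: sum / (2 * pairs * sigma), scaled by the step size.
    scale = lr / (2 * pairs * sigma)
    return [t + scale * g for t, g in zip(theta, grad)]


def toy_fitness(params):
    """Hypothetical fitness: higher is better, maximum at the origin."""
    return -sum(p * p for p in params)


rng = random.Random(0)
theta = [1.0, -2.0, 0.5]
for _ in range(100):
    theta = es_step(theta, toy_fitness, rng)
```

Because the update relies only on fitness values of perturbed parameters, the same loop applies unchanged when `fitness` is the total reward of an episode in a control problem.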
Robustness, evolvability and phenotypic complexity: insights from evolving digital circuits
We analyze the relation between robustness to mutations, phenotypic complexity, and evolvability in the context of artificial circuits evolved for the ability to solve a parity problem. We demonstrate that whether robustness to mutations enhances or diminishes phenotypic variability and evolvability depends on whether robustness is achieved through the development of parsimonious (phenotypically simple) solutions that minimize the number of genes playing functional roles, or through phenotypically more complex solutions capable of buffering the effect of mutations. We show that the characteristics of the selection process strongly influence the robustness and the performance of the evolving candidate solutions. Finally, we propose a new evolutionary method that outperforms evolutionary algorithms commonly used in this domain.
Maximizing adaptive power in neuroevolution
In this paper we systematically compare the most promising neuroevolutionary methods and two new original methods on the double-pole balancing problem with respect to: the ability to discover solutions that are robust to variations of the environment, the speed with which such solutions are found, and the ability to scale up to more complex versions of the problem. The results indicate that the two original methods introduced in this paper and the Exponential Natural Evolutionary Strategy method largely outperform the other methods with respect to all considered criteria. The results collected in different experimental conditions also reveal the importance of regulating the selective pressure and the importance of exposing evolving agents to variable environmental conditions. The data collected and the results of the comparisons are used to identify the most effective methods and the most promising research directions.
