Sequence-Form and Evolutionary Dynamics: Realization Equivalence to Agent Form and Logit Dynamics

Authors: RESTELLI MARCELLO; GATTI NICOLA
Publication date: 01/01/2016

Evolutionary game theory provides the principal tools to model the dynamics of multi-agent learning algorithms. While there is a long-standing literature on evolutionary game theory in strategic-form games, few results are known for extensive-form games, and the exponential size of the representations currently adopted makes the evolutionary analysis of such games unaffordable. In this paper, we focus on dynamics for the sequence form of extensive-form games and provide three dynamics: one realization equivalent to the normal-form logit dynamic, one realization equivalent to the agent-form replicator dynamic, and one realization equivalent to the agent-form logit dynamic. All the considered dynamics require polynomial time and space, giving an exponential compression w.r.t. the dynamics currently known and thus tools that can be effectively employed in practice. Moreover, we use our tools to compare the agent-form and normal-form dynamics and to provide new "hybrid" dynamics.
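As a point of reference for the logit dynamic named in the abstract above, the following is a minimal sketch of the single-population logit dynamic on a small strategic-form (matrix) game. It is not the sequence-form construction of the paper. The function names, the noise level `eta`, the Euler step `dt`, and the rock-paper-scissors payoff matrix are all assumptions chosen for illustration.

```python
import math

def logit_choice(payoffs, eta):
    """Softmax of the payoffs with noise level eta (numerically stabilised)."""
    top = max(payoffs)
    weights = [math.exp((p - top) / eta) for p in payoffs]
    total = sum(weights)
    return [w / total for w in weights]

def logit_step(x, A, eta=0.5, dt=0.01):
    """One explicit Euler step of dx/dt = logit_choice(A x) - x."""
    n = len(x)
    payoffs = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    target = logit_choice(payoffs, eta)
    return [xi + dt * (ti - xi) for xi, ti in zip(x, target)]

# Illustrative payoff matrix (rock-paper-scissors), not taken from the paper.
RPS = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]

x = [0.6, 0.3, 0.1]
for _ in range(1000):
    x = logit_step(x, RPS)
print(x)
```

Each step moves the population state a fraction `dt` of the way toward the smoothed best response, so the state stays on the probability simplex.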

Archivio istituzionale della ricerca - Politecnico di Milano

Association for the Advancement of Artificial Intelligence: AAAI Publications

Equilibrium Approximation in Extensive-Form Simulation-Based Games

Authors: RESTELLI MARCELLO; GATTI NICOLA
Publication date: 01/01/2011

Archivio istituzionale della ricerca - Politecnico di Milano

An architecture for adaptive coordination of heterogeneous agents

Authors: BONARINI ANDREA; RESTELLI MARCELLO
Publication date: 01/01/2002

Archivio istituzionale della ricerca - Politecnico di Milano

An architecture to implement adaptive cooperative strategies for heterogeneous agents

Authors: BONARINI ANDREA; RESTELLI MARCELLO
Publication date: 01/01/2002

Archivio istituzionale della ricerca - Politecnico di Milano

Inverse Reinforcement Learning through Policy Gradient Minimization

Authors: RESTELLI MARCELLO; PIROTTA MATTEO
Publication date: 01/01/2016

Inverse Reinforcement Learning (IRL) deals with the problem of recovering the reward function optimized by an expert, given a set of demonstrations of the expert's policy. Most IRL algorithms need to repeatedly compute the optimal policy for different reward functions. This paper proposes a new IRL approach that recovers the reward function without solving any "direct" RL problem. The idea is to find the reward function that minimizes the gradient of a parameterized representation of the expert's policy. In particular, when the reward function can be represented as a linear combination of some basis functions, we show that this optimization problem can be solved efficiently. We present an empirical evaluation of the proposed approach on a multidimensional version of the Linear-Quadratic Regulator (LQR), both in the case where the parameters of the expert's policy are known and in the (more realistic) case where they need to be inferred from the expert's demonstrations. Finally, the algorithm is compared against the state of the art on the mountain car domain, where the expert's policy is unknown.
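The linear-reward case mentioned in the abstract above can be illustrated with a toy sketch. It assumes that the policy gradient is linear in the reward weights, so that with two basis functions the gradient is `a * g1 + (1 - a) * g2`, and it assumes the weights are constrained to be non-negative and sum to one. The vectors `g1`, `g2` and the function name are hypothetical; this is not the paper's algorithm, only the two-feature special case of minimizing the gradient norm.

```python
def min_gradient_weights(g1, g2):
    """Return (a, 1 - a) minimising ||a*g1 + (1 - a)*g2||^2 with 0 <= a <= 1.

    g1, g2: policy gradients obtained under each basis reward (assumed given).
    """
    diff = [u - v for u, v in zip(g1, g2)]          # g1 - g2
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        # Identical gradients: every weighting gives the same norm.
        return 0.5, 0.5
    # Unconstrained minimiser: a = g2 . (g2 - g1) / ||g1 - g2||^2
    a = sum(v * (v - u) for u, v in zip(g1, g2)) / denom
    a = min(1.0, max(0.0, a))                       # project onto [0, 1]
    return a, 1.0 - a

# Hypothetical gradients: the two basis rewards pull the policy in
# opposite directions, so an equal mix makes the expert's policy stationary.
weights = min_gradient_weights([1.0, 0.0], [-1.0, 0.0])
print(weights)
```

With more than two basis functions the same objective becomes a small quadratic program over the simplex, which would need a general-purpose solver rather than this closed form.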

Archivio istituzionale della ricerca - Politecnico di Milano

Association for the Advancement of Artificial Intelligence: AAAI Publications

Policy gradient in Lipschitz Markov Decision Processes

Authors: RESTELLI MARCELLO; PIROTTA MATTEO; BASCETTA LUCA
Publication date: 01/01/2015

This paper exploits Lipschitz continuity properties of Markov Decision Processes to safely speed up policy-gradient algorithms. Starting from assumptions about the Lipschitz continuity of the state-transition model, the reward function, and the policies considered in the learning process, we show that both the expected return of a policy and its gradient are Lipschitz continuous w.r.t. policy parameters. By leveraging such properties, we define policy-parameter updates that guarantee a performance improvement at each iteration. The proposed methods are empirically evaluated and compared to other related approaches using different configurations of three popular control scenarios: the linear quadratic regulator, the mass-spring-damper system, and ship-steering control.
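The link between a Lipschitz-continuous gradient and a guaranteed improvement can be illustrated with a standard result: if the gradient of an objective J is L-Lipschitz, a gradient-ascent step of size 1/L increases J by at least ||grad J||^2 / (2L). The sketch below checks this on a toy concave objective. The objective, the constant `L = 2`, and the function names are assumptions for illustration; the bounds derived in the paper for expected return are not reproduced here.

```python
def ascent_step(theta, grad, lipschitz_const):
    """Gradient-ascent update with step size 1 / L."""
    step = 1.0 / lipschitz_const
    return [t + step * g for t, g in zip(theta, grad)]

def guaranteed_gain(grad, lipschitz_const):
    """Lower bound on the improvement: ||grad||^2 / (2 L)."""
    return sum(g * g for g in grad) / (2.0 * lipschitz_const)

# Toy objective standing in for expected return (assumption):
# J(theta) = -sum((theta_i - 3)^2), whose gradient is 2-Lipschitz.
def J(theta):
    return -sum((t - 3.0) ** 2 for t in theta)

def grad_J(theta):
    return [-2.0 * (t - 3.0) for t in theta]

theta = [0.0, 5.0]
g = grad_J(theta)
new_theta = ascent_step(theta, g, lipschitz_const=2.0)
print(J(new_theta) - J(theta), ">=", guaranteed_gain(g, 2.0))
```

Overestimating L shrinks the step but keeps the guarantee valid, which is why a conservative Lipschitz constant is safe, only slower.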

Archivio istituzionale della ricerca - Politecnico di Milano

Crossref

Evolutionary dynamics of Q-learning over the sequence form

Authors: RESTELLI MARCELLO; PANOZZO FABIO; GATTI NICOLA
Publication date: 01/01/2014

Multi-agent learning is a challenging open task in artificial intelligence. There is a known connection between multi-agent learning algorithms and evolutionary game theory, showing that the learning dynamics of some algorithms can be modeled as replicator dynamics with a mutation term. Inspired by the recent sequence-form replicator dynamics, we develop a new version of the Q-learning algorithm that works on the sequence form of an extensive-form game, thus allowing an exponential reduction of the dynamics length w.r.t. that of the normal form. The dynamics of the proposed algorithm can be modeled by the sequence-form replicator dynamics with a mutation term. We show that, although sequence-form and normal-form replicator dynamics are realization equivalent, the Q-learning algorithm applied to the two forms has non-realization-equivalent dynamics. Departing from previous work on evolutionary game theory models for multi-agent learning, we produce an experimental evaluation to show the accuracy of the model.
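For orientation, "replicator dynamics with a mutation term" can be sketched in the normal form, using the selection-mutation expression commonly given for Boltzmann Q-learning. This is the normal-form counterpart, not the sequence-form dynamics developed in the paper. The learning rate `alpha`, temperature `tau`, the matching-pennies payoff matrix, and the function name are assumptions for illustration.

```python
import math

def q_learning_dynamics(x, y, A, alpha=0.1, tau=1.0):
    """Time derivative of the row player's mixed strategy x against y.

    selection: (alpha / tau) * x_i * ((A y)_i - x . A y)   (replicator term)
    mutation:  alpha * x_i * sum_j x_j * ln(x_j / x_i)      (entropy term)
    Requires every x_i to be strictly positive.
    """
    n = len(x)
    payoff = [sum(A[i][j] * y[j] for j in range(len(y))) for i in range(n)]
    average = sum(x[i] * payoff[i] for i in range(n))
    derivative = []
    for i in range(n):
        selection = (alpha / tau) * x[i] * (payoff[i] - average)
        mutation = alpha * x[i] * sum(
            x[j] * math.log(x[j] / x[i]) for j in range(n)
        )
        derivative.append(selection + mutation)
    return derivative

# Illustrative game (matching pennies, row player's payoffs).
PENNIES = [[1, -1],
           [-1, 1]]

dx = q_learning_dynamics([0.7, 0.3], [0.4, 0.6], PENNIES)
print(dx)
```

Both terms sum to zero across actions, so the strategy stays on the simplex; the mutation term pushes toward the uniform strategy and grows as the temperature-weighted selection term weakens.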

Archivio istituzionale della ricerca - Politecnico di Milano

Association for the Advancement of Artificial Intelligence: AAAI Publications

Dead Reckoning for Mobile Robots Using Two Optical Mice

Authors: BONARINI ANDREA; MATTEUCCI MATTEO; RESTELLI MARCELLO
Publication date: 01/01/2004

Archivio istituzionale della ricerca - Politecnico di Milano

Automatic Error Detection and Reduction for an Odometric Sensor based on Two Optical Mice

Authors: BONARINI ANDREA; MATTEUCCI MATTEO; RESTELLI MARCELLO
Publication date: 01/01/2005

In this paper, we present a dead reckoning sensor to support reliable odometry on mobile robots. The sensor is based on a pair of optical mice rigidly connected to the robot body. Its main advantages are: 1) the localization system is independent of the kinematics of the robot; 2) the measurements given by the mice are not subject to slipping, since the mice are independent of the traction wheels, nor to crawling, since they measure displacements in any direction; 3) it is a low-cost solution with a precision comparable to classical shaft encoders. Since the measurements are redundant, it is possible to detect non-systematic errors; in this paper, an automatic procedure to reduce non-systematic errors of the sensor is presented and validated with experimental results on a real mobile robot.
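The redundancy mentioned in the abstract above can be made concrete with a small rigid-body sketch. It assumes a hypothetical mounting: the two mice sit on the robot's lateral axis, `baseline` apart, with their axes aligned to the robot frame (x forward, y to the left), and each motion increment is small enough for a small-angle approximation. The geometry, names, and tolerance are assumptions for illustration and are not taken from the paper.

```python
import math

def mice_to_motion(left, right, baseline):
    """Convert two (dx, dy) mouse readings into robot-frame motion.

    Returns (forward, sideways, dtheta, residual). For a rigid body both
    mice must report the same sideways displacement, so the residual
    |dy_left - dy_right| is a consistency check on the readings.
    """
    (xl, yl), (xr, yr) = left, right
    dtheta = (xr - xl) / baseline        # right side leads when turning left
    forward = (xl + xr) / 2.0
    sideways = (yl + yr) / 2.0
    residual = abs(yl - yr)
    return forward, sideways, dtheta, residual

def update_pose(pose, left, right, baseline, tolerance=1e-3):
    """Dead-reckoning update; also reports whether the readings agree."""
    x, y, theta = pose
    forward, sideways, dtheta, residual = mice_to_motion(left, right, baseline)
    heading = theta + dtheta / 2.0       # mid-step heading
    x += forward * math.cos(heading) - sideways * math.sin(heading)
    y += forward * math.sin(heading) + sideways * math.cos(heading)
    return (x, y, theta + dtheta), residual <= tolerance

pose, consistent = update_pose((0.0, 0.0, 0.0), (1.0, 0.0), (1.0, 0.0), 1.0)
print(pose, consistent)
```

Four readings determine only three degrees of freedom, which is what leaves one residual available for flagging a faulty measurement.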

Archivio istituzionale della ricerca - Politecnico di Milano

A novel model to rule behavior interaction

Authors: BONARINI ANDREA; MATTEUCCI MATTEO; RESTELLI MARCELLO
Publication date: 01/01/2004

Archivio istituzionale della ricerca - Politecnico di Milano