1,721,721 research outputs found
Sparse Distributed Learning Based on Diffusion Adaptation
This article proposes diffusion LMS strategies for distributed estimation over adaptive networks that are able to exploit sparsity in the underlying system model. The approach relies on convex regularization, common in compressive sensing, to enhance the detection of sparsity via a diffusive process over the network. The resulting algorithms endow networks with learning abilities and allow them to learn the sparse structure from the incoming data in real-time, and also to track variations in the sparsity of the model. We provide convergence and mean-square performance analysis of the proposed method and show under what conditions it outperforms the unregularized diffusion version. We also show how to adaptively select the regularization parameter. Simulation results illustrate the advantage of the proposed filters for sparse data recovery.AS
Sparse distributed learning based on diffusion adaptation
This article proposes diffusion LMS strategies for distributed estimation over adaptive networks that are able to exploit sparsity in the underlying system model. The approach relies on convex regularization, common in compressive sensing, to enhance the detection of sparsity via a diffusive process over the network. The resulting algorithms endow networks with learning abilities and allow them to learn the sparse structure from the incoming data in real-time, and also to track variations in the sparsity of the model. We provide convergence and mean-square performance analysis of the proposed method and show under what conditions it outperforms the unregularized diffusion version. We also show how to adaptively select the regularization parameter. Simulation results illustrate the advantage of the proposed filters for sparse data recovery
Diffusion-based adaptive distributed detection: Steady-state performance in the slow adaptation regime
This paper examines the close interplay between cooperation and adaptation for distributed detection schemes over fully decentralized networks. The combined attributes of cooperation and adaptation are necessary to enable networks of detectors to continually learn from streaming data and to continually track drifts in the state of nature when deciding in favor of one hypothesis or another. The results in this paper establish a fundamental scaling law for the steady-state probabilities of miss detection and false alarm in the slow adaptation regime, when the agents interact with each other according to distributed strategies that employ small constant step-sizes. The latter are critical to enable continuous adaptation and learning. This paper establishes three key results. First, it is shown that the output of the collaborative process at each agent has a steady-state distribution. Second, it is shown that this distribution is asymptotically Gaussian in the slow adaptation regime of small step-sizes. Third, by carrying out a detailed large deviations analysis, closed-form expressions are derived for the decaying rates of the false-alarm and miss-detection probabilities. Interesting insights are gained from these expressions. In particular, it is verified that as the step-size decreases, the error probabilities are driven to zero exponentially fast as functions of , and that the exponents governing the decay increase linearly in the number of agents. It is also verified that the scaling laws governing the errors of detection and the errors of estimation over the network behave very differently, with the former having exponential decay proportional to , while the latter scales linearly with decay proportional to . Moreover, and interestingly, it is shown that the cooperative strategy allows each agent to reach the same detection performance, in terms of detection error exponents, of a centralized stochastic-gradient solution. The results of this paper are illustrated by applying them to canonical distributed detection problems
Exact asymptotics of distributed detection over adaptive networks
In [1], an important step toward the characterization of distributed detection over adaptive networks has been made by establishing the fundamental scaling law of the error probabilities. However, empirical evidence reported in [1] revealed that a refined asymptotic analysis is necessary in order to capture the exact impact of network connectivity on the detection performance of each individual agent. Here we address this open issue by exploiting the framework of exact asymptotics
Recommended from our members
Adaptive Techniques for Mitigating Circuit Imperfections in High Performance A/D Converters
In this dissertation, we examine the effect of four sources of circuit imperfections on the performance of analog-to-digital converters (ADCs), including sampling clock jitters, spurious sidebands, timing mismatches, and gain mismatches. These imperfections distort the sampled data and degrade the signal-to-noise ratio (SNR) of the ADCs. We develop signal models for the distortions and propose effective adaptive signal processing techniques to filter the sampled data and mitigate the spurious effects. Rather than remove the distortions by perfecting the circuitry, the proposed techniques focus on processing the sampled data by using adaptive DSP algorithms. Analog circuit impairments create many distortions including I/Q imbalances, phase noise, frequency offsets, and sampling clock jitter. Timing jitters generally arise from noise in the clock generating crystal and phase-locked-loop (PLL). The jitters cause the ADCs to sample the input signals at non-uniform sampling times and introduce distortion that limits the signal fidelity and degrades the SNR. While the effects of jitter noise can be neglected at low frequencies, applications requiring enhanced performance at higher frequencies demand higher SNR from the sampling circuit. We first examine the effect of the clock jitter on the SNR of the sampled signal and subsequently propose compensation methods based on a signal injection structure for direct down-conversion architectures. We also address the effect of non-ideal PLL circuitry on the quality of the sampled data. In a non-ideal PLL circuit, leakage of the reference signal into the control line produces spurious tones. When the distorted PLL signal is used to generate the sampling clock, it injects the spurious tones into the sampled data. These distortions are harmful for wideband applications, such as spectrum sensing, since they affect the detection of vacant frequency bands. We again examine the distortion effect in some detail and propose techniques in the digital domain to clean the data and remove the PLL leakage effects. We study the performance of the proposed algorithms and compare it against the corresponding Cramer-Rao bound (CRB). We further propose an adaptive frequency-domain structure to compensate the effect of timing and gain mismatches in time-interleaved ADCs. An M-channel time-interleaved ADC uses M ADCs to sample an input signal to obtain a larger effective sampling rate. However, in practice, combining ADCs introduces mismatches among the various ADC channels. In the proposed solution, the signal is split into multiple frequency bins and adaptation across the frequency channels is combined by means of an adaptive strategy. The construction is able to assign more or less weight to the various frequency channels depending on whether their estimates are more or less reliable in comparison to other channels
Recommended from our members
Learning under Imperfections by Networked Agents
Distributed learning deals with the problem of optimizing aggregate cost functions by networked agents from streaming data. This scenario arises in many contexts including distributed estimation, machine learning, resource allocation, and in the modeling of flocking and swarming behavior by biological networks. Among several available solutions such as consensus and incremental strategies, the class of diffusion strategies has proven to be particularly attractive because these techniques are scalable, robust, fully-distributed, and endow networks with real-time adaptation and learning abilities.One key challenge in real applications is that networked agents generally face many types of asynchronous imperfections, such as random link failures, random data arrival times, noisy links, random topology changes, agents turning on and off randomly, and even drifting objectives. This dissertation provides a detailed analysis of the stability and performance of asynchronous diffusion strategies for solving distributed optimization and adaptation problems over networks in the presence of such imperfections. Conditions are developed to ensure the stability of the mean-square and mean-fourth-order moments of the network error vectors; closed-form expressions are derived to reveal how the network parameters influence the learning behavior; and the performance of the asynchronous networks is then compared against centralized solutions and synchronous networks. One notable conclusion is that the mean-square performance of asynchronous networks degrades only in the order of &mu, which is a small step-size parameter, while the convergence rate remains largely unaltered. A second notable conclusion is that even under the influence of asynchronous events, all agents in the adaptive network can still reach an O(&mu1+&gamma<\super>)$ near-agreement with some constant &gamma > 0, while approaching the desired solution within O(&mu) accuracy. These theoretical results provide a solid justification for the remarkable resilience of cooperative networks in the face of random imperfections at multiple levels: agents, links, data arrivals, and topology. The dissertation also examines a second challenging form of uncertainty arising from agents in a network pursuing different objectives or observing data arising from different unknown models. In these cases, indiscriminate cooperation will lead to undesired results. A useful adaptive clustering and learning strategy is developed in order to allow agents to learn which neighbors should be trusted and which other neighbors should be ignored. The resulting procedure enables agents to identify their grouping and to attain improved learning performance
Recommended from our members
Teamwork and Exploration in Reinforcement Learning
Reinforcement learning (RL) is a powerful machine learning paradigm that studies the interaction between a single agent with an unknown environment. A plethora of applications fit into the RL framework, however, in many cases of interest, a team of agents will need to interact with the environment and with each other to achieve a common goal. This is the object study of collaborative multi-agent RL (MARL).Several challenges arise when considering collaborative MARL. One of these challenges is decentralization. In many cases, due to design constraints, it is undesirable or inconvenient to constantly relay data between agents and a centralized location. Therefore, fully distributed solutions become preferable. The first part of this dissertation addresses the challenge of designing fully decentralized MARL algorithms. We consider two problems: policy evaluation and policy optimization. In the policy evaluation problem, the objective is to estimate the performance of a target team policy in a particular environment. This problem has been studied before for the case with streaming data, however, in most implementations the target policy is evaluated using a finite data set. For this case, existing algorithms guarantee convergence at a sub-linear rate. In this dissertation we introduce Fast Diffusion for Policy Evaluation (FDPE), an algorithm that converges at linear rate for the finite data set case. We then consider the policy optimization problem, where the objective is for all agents to learn an optimal team policy. This problem has also been studied recently, however, existing solutions are data inefficient and converge to Nash equilibria (whose performance can be catastrophically bad) as opposed to team optimal policies. For this case we introduce the Diffusion for Team Policy Optimization (DTPO) algorithm. DTPO is more data efficient than previous algorithms and does not converge to Nash equilibria. For both of these cases, we provide experimental studies that show the effectiveness of the proposed methods.Another challenge that arises in collaborative MARL, which is orthogonal to the decentralization problem, is that of scalability. The parameters that need to be estimated when full team policies are learned, grow exponentially with the number of agents. Hence, algorithms that learn joint team policies quickly become intractable. A solution to this problem is for each agent to learn an individual policy, such that the resulting joint team policy is optimal. This problem has been the object of much research lately. However, most solution methods are data inefficient and often make unrealistic assumptions that greatly limit the applicability of these approaches. To address this problem we introduce Logical Team Q-learning (LTQL), an algorithm that learns factored policies in a data efficient manner and is applicable to any cooperative MARL environment. We show that LTQL outperforms previous methods in a challenging predator-prey task.Another challenge is that of efficient exploration. This is a problem both in the single-agent and multi-agent settings, although in MARL it becomes more severe due to the larger state-action space. The challenge of deriving policies that are efficient at exploring the state space has been addressed in many recent works. However, most of these approaches rely on heuristics, and more importantly, they consider the problem of exploring the state space separately from that of learning an optimal policy (even though they are related, since the state-space is explored to collect data to learn an optimal policy). To address this challenge, we introduce the Information Seeking Learner (ISL), an algorithm that displays state of the art performance in difficult exploration benchmarks. The fundamental value of our work on exploration is that we take a fundamentally different approach from previous works. As opposed to earlier methods we consider the problem of exploring the state space and learning an optimal policy jointly. The main insight of our approach is that in RL, obtaining point estimates of the quantities of interest is not sufficient and confidence bound estimates are also necessary
Reinforcement Learning by Networked Agents
Multi-agent reinforcement learning (MARL) has emerged as a compelling framework for modeling the collaborative behavior of autonomous agents operating in interconnected systems. The potential for agents to collectively achieve goals that are infeasible for any single unit makes MARL a powerful tool across a wide range of domains. However, the multi-agent setting introduces distinct challenges that require specialized solutions. This thesis addresses two fundamental challenges in MARL: (i) effective deep exploration, and (ii) global state estimation under partial observability. To this end, we leverage the networked structure of agents and their communication capabilities to develop decentralized learning algorithms that facilitate robust and scalable collaboration under uncertain conditions.
First, we propose a novel, counting-free deep exploration algorithm for MARL that guarantees all state-action pairs are visited infinitely often. Deep exploration is essential for avoiding suboptimal learning in environments with sparse or deceptively structured rewards. Our method distributes an ensemble of value estimates across the network of agents and uses statistical variance to guide exploration. The count-free nature of the design makes it suitable for large or continuous state spaces. Theoretical guarantees are established for sufficient exploration, and the approach is validated through extensive simulations.
Second, we address the challenge of global state estimation in partially observable environments, where agents have access only to local, incomplete observations. Individually, these observations are insufficient to recover the global state; however, through local communication, agents can collaboratively estimate it. We explore two social learning-based strategies to tackle this issue: standard and adaptive social learning. Standard social learning does not impose constraints on state dynamics but introduces a two-time-scale learning structure. We provide theoretical analysis showing that MARL combined with this approach achieves -optimality with respect to the fully observable baseline.
To overcome the limitation of two-time-scale learning, we introduce an adaptive social learning method that enables single-time-scale integration of state estimation and reinforcement learning, assuming slowly evolving state dynamics. Under appropriate choices of the adaptation and learning parameters, we show that the proposed method also achieves -optimal performance. Both methods are fully decentralized, rely solely on local communication, and are supported by rigorous convergence guarantees. Empirical evaluations further confirm the effectiveness of both approaches.AS
Recommended from our members
On the Performance and Linear Convergence of Decentralized Primal-Dual Methods
This dissertation studies the performance and linear convergence properties of primal-dual methods for the solution of decentralized multi-agent optimization problems. Decentralized multi-agent optimization is a powerful paradigm that finds applications in diverse fields in learning and engineering design. In these setups, a network of agents is connected through some topology and agents are allowed to share information only locally. Their overall goal is to seek the minimizer of a global optimization problem through localized interactions. In decentralized consensus problems, the agents are coupled through a common consensus variable that they need to agree upon. While in decentralized resource allocation problems, the agents are coupled through global affine constraints. Various decentralized consensus optimization algorithms already exist in the literature. Some methods are derived from a primal-dual perspective, while other methods are derived as gradient tracking mechanisms meant to track the average of local gradients. Among the gradient tracking methods are the adapt-then-combine implementations motivated by diffusion strategies, which have been observed to perform better than other implementations. In this dissertation, we develop a novel adapt-then-combine primal-dual algorithmic framework that captures most state-of-the-art gradient based methods as special cases including all the variations of the gradient-tracking methods. We also develop a concise and novel analysis technique that establishes the linear convergence of this general framework under strongly-convex objectives. Due to our unified framework, the analysis reveals important characteristics for these methods such as their convergence rates and step-size stability ranges. Moreover, the analysis reveals how the augmented Lagrangian penalty term, which is utilized in most of these methods, affects the performance of decentralized algorithms. Another important question that we answer is whether decentralized proximal gradient methods can achieve global linear convergence for non-smooth composite optimization. For centralized algorithms, linear convergence has been established in the presence of a non-smooth composite term. In this dissertation, we close the gap between centralized and decentralized proximal gradient algorithms and show that decentralized proximal algorithms can also achieve linear convergence in the presence of a non-smooth term. Furthermore, we show that when each agent possesses a different local non-smooth term then global linear convergence cannot be established in the worst case. Most works that study decentralized optimization problems assume that all agents are involved in computing all variables. However, in many applications the coupling across agents is sparse in the sense that only a few agents are involved in computing certain variables. We show how to design decentralized algorithms in sparsely coupled consensus and resource allocation problems. More importantly, we establish analytically the importance of exploiting the sparsity structure in coupled large-scale networks
- …
