    Partial Hypernetworks for Continual Learning

    Hypernetworks mitigate forgetting in continual learning (CL) by generating task-dependent weights and penalizing weight changes at a meta-model level. Unfortunately, generating all weights is computationally expensive for larger architectures, and it is not well understood whether generating all model weights is necessary. Inspired by latent replay methods in CL, we propose partial weight generation for the final layers of a model using hypernetworks while freezing the initial layers. With this objective, we first answer the question of how many layers can be frozen without compromising the final performance. Through several experiments, we empirically show that the number of layers that can be frozen is proportional to the distributional similarity in the CL stream. Then, to demonstrate the effectiveness of hypernetworks, we show that noisy streams can significantly impact the performance of latent replay methods, leading to increased forgetting when features from noisy experiences are replayed with old samples. In contrast, partial hypernetworks are more robust to noise, maintaining accuracy on previous experiences. Finally, we conduct experiments on the split CIFAR-100 and TinyImagenet benchmarks and compare different versions of partial hypernetworks to latent replay methods. We conclude that partial weight generation using hypernetworks is a promising solution to the problem of forgetting in neural networks. It can provide an effective balance between computation and final test accuracy in CL streams.

    RaSP: Relation-aware Semantic Prior for Weakly Supervised Incremental Segmentation

    Class-incremental semantic image segmentation assumes multiple model updates, each enriching the model to segment new categories. This is typically carried out by providing expensive pixel-level annotations to the training algorithm for all new objects, limiting the adoption of such methods in practical applications. Approaches that solely require image-level labels offer an attractive alternative, yet such coarse annotations lack precise information about the location and boundary of the new objects. In this paper we argue that, since classes represent not just indices but semantic entities, the conceptual relationships between them can provide valuable information that should be leveraged. We propose a weakly supervised approach that exploits such semantic relations to transfer objectness prior from the previously learned classes into the new ones, complementing the supervisory signal from image-level labels. We validate our approach on a number of continual learning tasks, and show how even a simple pairwise interaction between classes can significantly improve the segmentation mask quality of both old and new classes. We show these conclusions still hold for longer and, hence, more realistic sequences of tasks and for a challenging few-shot scenario.

    Learning in POMDPs with Monte Carlo Tree Search

    The POMDP is a powerful framework for reasoning under outcome and information uncertainty, but constructing an accurate POMDP model is difficult. Bayes-Adaptive Partially Observable Markov Decision Processes (BA-POMDPs) extend POMDPs to allow the model to be learned during execution. BA-POMDPs are a Bayesian RL approach that, in principle, allows for an optimal trade-off between exploitation and exploration. Unfortunately, BA-POMDPs are currently impractical to solve for any non-trivial domain. In this paper, we extend the Monte-Carlo Tree Search method POMCP to BA-POMDPs and show that the resulting method, which we call BA-POMCP, is able to tackle problems that previous solution methods have been unable to solve. Additionally, we introduce several techniques that exploit the BA-POMDP structure to improve the efficiency of BA-POMCP along with proof of their convergence.

    Reinforcement learning in the continuous double auction and the trading agents competition

    In this thesis we investigate whether reinforcement learning (RL) techniques can be successfully used to build automated trading agents. We present two case studies in which we develop RL agents for participating in auctions. The first case study focuses on the continuous double auction (CDA), a market mechanism used in many electronic trading venues. There are currently several automated bidding strategies for participating in the CDA, geared both toward personal profit and toward increasing the efficiency of the entire market. We use model-free reinforcement learning to construct a bidding strategy for the CDA and empirically evaluate its performance against other well-known automated strategies. The second case study deals with the larger but related problem of interdependent electronic auctions. We describe an RL agent for the trading agent competition (TAC), and analyze its performance. This competition features multiple dependent auctions, and hence provides a much harder test-bed. The empirical results did not show the same success for the RL strategy as in the CDA environment. We attribute this problem to the difficulty of dealing with dependent auctions, in which the optimal strategy in one auction depends on the state of the other auctions as well.

    Comparing machine learning and hand-crafted approaches for information extraction from HTML documents

    The problem of automatically extracting information from web pages is becoming very important, due to the explosion of information available on the World Wide Web. In this thesis, we explore and compare hand-crafted information extraction tools with tools constructed using machine learning algorithms. The task we consider is the extraction of organization names and contact information, such as addresses and phone numbers, from web pages. Given the huge number of company web pages on the Internet, automating this task is of great practical interest. The system we developed consists of two components. The first component achieves the labeling or tagging of named entities (such as company names, addresses and phone numbers) in HTML documents. We compare the performance of hand-coded regular expressions and decision trees for this task. Using decision trees allows us to generate tagging rules that are significantly more accurate. The second component is used to establish relationships between named entities (i.e. company names, phone numbers and addresses), for the purpose of structuring the data into a useful record (i.e. a contact, or an organization). For this task we experimented with two approaches. The first approach uses an aggregator that implements human-generated heuristics to relate the tags and create the records sought. The second approach is based on Hidden Markov Models (HMMs). As far as we know, no one has used HMMs before to establish relationships between more than two tagged entities. Our empirical results suggest that HMMs compare favorably with the hand-crafted aggregator in terms of performance and ease of development.
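
    As a rough illustration of what a hand-coded regular expression tagger for phone numbers might look like, the Python sketch below strips tags from an HTML fragment and extracts North American style numbers. The pattern, the function name extract_phones, and the sample input are assumptions made for this example; they are not taken from the thesis, whose actual rules are not given in the abstract.

```python
import re
from html import unescape

# Hypothetical pattern (not from the thesis): matches numbers such as
# "(514) 555-0199", "514-555-0199", or "514.555.0199".
PHONE_RE = re.compile(r"\(?\b(\d{3})\)?[\s.-]?(\d{3})[\s.-](\d{4})\b")

# Crude tag stripper; a real system would use an HTML parser instead.
TAG_RE = re.compile(r"<[^>]+>")


def extract_phones(html):
    """Return phone numbers found in an HTML fragment, normalized to NNN-NNN-NNNN."""
    # Replace tags with spaces so text in adjacent elements is not glued together,
    # then decode entities such as &nbsp; before matching.
    text = unescape(TAG_RE.sub(" ", html))
    return ["-".join(match.groups()) for match in PHONE_RE.finditer(text)]


sample = "<p>Acme Corp.</p><div>Tel: (514) 555-0199 &nbsp; Fax: 514.555.0123</div>"
print(extract_phones(sample))
```

    A rule of this kind is quick to write but brittle: every new number format needs another hand edit, which is the sort of maintenance cost that motivates comparing against learned taggers.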

    Automated discovery of options in reinforcement learning

    AI planning benefits greatly from the use of temporally-extended or macro-actions. Macro-actions allow for faster and more efficient planning as well as the reuse of knowledge from previous solutions. In recent years, a significant amount of research has been devoted to incorporating macro-actions in learned controllers, particularly in the context of Reinforcement Learning. One general approach is the use of options (temporally-extended actions) in Reinforcement Learning [22]. While the properties of options are well understood, it is not clear how to find new options automatically. In this thesis we propose two new algorithms for discovering options and compare them to one algorithm from the literature. We also contribute a new algorithm for learning with options which improves on the performance of two widely used learning algorithms. Extensive experiments are used to demonstrate the effectiveness of the proposed algorithms.

    Leveraging node attributes for incomplete relational data

    Relational data are usually highly incomplete in practice, which inspires us to leverage side information to improve the performance of community detection and link prediction. This paper presents a Bayesian probabilistic approach that incorporates various kinds of node attributes encoded in binary form in relational models with Poisson likelihood. Our method works flexibly with both directed and undirected relational networks. Inference can be done by efficient Gibbs sampling, which leverages the sparsity of both networks and node attributes. Extensive experiments show that our models achieve state-of-the-art link prediction results, especially with highly incomplete relational data.