Association for the Advancement of Artificial Intelligence: AAAI Publications
S-INF: Towards Realistic Indoor Scene Synthesis via Scene Implicit Neural Field
Learning-based methods have become increasingly popular in 3D indoor scene synthesis (ISS), showing superior performance over traditional optimization-based approaches. These learning-based methods typically model distributions on simple yet explicit scene representations using generative models. However, because the oversimplified explicit representations overlook detailed information and lack guidance from multimodal relationships within the scene, most learning-based methods struggle to generate indoor scenes with realistic object arrangements and styles. In this paper, we introduce a new method, Scene Implicit Neural Field (S-INF), for indoor scene synthesis, which aims to learn meaningful representations of multimodal relationships in order to enhance the realism of the synthesized scenes. S-INF assumes that the scene layout is often related to detailed object information. It disentangles the multimodal relationships into scene layout relationships and detailed object relationships, fusing them later through implicit neural fields (INFs). By learning specialized scene layout relationships and projecting them into S-INF, we achieve realistic generation of scene layouts. Additionally, S-INF captures dense and detailed object relationships through differentiable rendering, ensuring stylistic consistency across objects. Through extensive experiments on the benchmark 3D-FRONT dataset, we demonstrate that our method consistently achieves state-of-the-art performance under different types of ISS.
Does Your AI Agent Get You? A Personalizable Framework for Approximating Human Models from Argumentation-based Dialogue Traces
Explainable AI is increasingly employing argumentation methods to facilitate interactive explanations between AI agents and human users. While existing approaches typically rely on predetermined human user models, there remains a critical gap in dynamically learning and updating these models during interactions. In this paper, we present a framework that enables AI agents to adapt their understanding of human users through argumentation-based dialogues. Our approach, called Persona, draws on prospect theory and integrates a probability weighting function with a Bayesian belief update mechanism that refines a probability distribution over possible human models based on exchanged arguments. Through empirical evaluations with human users in an applied argumentation setting, we demonstrate that Persona effectively captures evolving human beliefs, facilitates personalized interactions, and outperforms state-of-the-art methods.
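To make the combination of a probability weighting function and a Bayesian belief update more concrete, here is a minimal Python sketch. It is not Persona's actual mechanism, which the abstract does not specify. It assumes the Prelec form w(p) = exp(-(-ln p)^alpha), a common weighting function in the prospect theory literature, and it assumes that each candidate human model assigns a likelihood to the argument just observed. The names `prelec_weight`, `update_belief`, `model_a`, and `model_b` are hypothetical.

```python
import math


def prelec_weight(p, alpha=0.65):
    """Prelec probability weighting: w(p) = exp(-(-ln p)^alpha).

    Assumption: this particular form and the default alpha are illustrative
    choices, not taken from the abstract.
    """
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    return math.exp(-((-math.log(p)) ** alpha))


def update_belief(prior, likelihoods, alpha=0.65):
    """Bayesian-style update over candidate human models.

    prior:       dict mapping model name -> prior probability
    likelihoods: dict mapping model name -> probability that this model
                 would produce the observed argument
    The likelihood is passed through the weighting function before the
    usual multiply-and-normalize step.
    """
    unnormalized = {
        name: prior[name] * prelec_weight(likelihoods[name], alpha)
        for name in prior
    }
    total = sum(unnormalized.values())
    if total == 0.0:
        # No model explains the observation; keep the prior unchanged.
        return dict(prior)
    return {name: value / total for name, value in unnormalized.items()}


# Hypothetical example: two candidate models, one observed argument.
prior = {"model_a": 0.5, "model_b": 0.5}
likelihoods = {"model_a": 0.9, "model_b": 0.2}
posterior = update_belief(prior, likelihoods)
```

With `alpha=1.0` the weighting function reduces to the identity, so the update becomes ordinary Bayes' rule. Smaller values of `alpha` inflate small probabilities and deflate large ones.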
Learning to Collaborate with Unknown Agents in the Absence of Reward
With the advancement of artificial intelligence (AI), scenarios involving close collaboration between AI and other unknown agents are becoming increasingly common. This sometimes requires training AI agents to collaborate with unknown agents in the absence of a reward function (which may be unavailable to the AI agents or even undefined by the unknown agents themselves), thus posing new challenges to existing learning algorithms that often require knowing the shared reward. In this paper, we show that effective teaming with unknown agents can be achieved in the absence of a reward function, through actively modeling other unknown agents and reasoning about their latent rewards from available interaction/observation history. In particular, we propose a novel framework that leverages a kernel density Bayesian inverse learning method for active reward/goal inference and prove that multi-agent reinforcement learning guided by the inferred reward signals can converge to an optimal policy for teaming with unknown agents. The result enables us to develop an adaptive policy update strategy, through the use of a family of pre-trained, goal-conditioned policies, further eliminating the need for online retraining. The proposed solution is evaluated using a wide range of diverse unknown agents with latent and even non-stationary rewards. Our solution significantly increases the teaming performance between AI and unknown agents in the absence of reward.
Personalized Sleep Staging Leveraging Source-free Unsupervised Domain Adaptation
Sleep staging is important for monitoring sleep quality and diagnosing sleep-related disorders. Recently, numerous deep learning-based models have been proposed for automatic sleep staging using polysomnography recordings. Most of them are trained and tested on the same labeled datasets, which results in poor generalization to unseen target domains. However, they regard the subjects in the target domains as a whole and overlook the individual discrepancies, which limits the model's generalization ability to new patients (i.e., unseen subjects) and plug-and-play applicability in clinics. To address this, we propose a novel Source-Free Unsupervised Individual Domain Adaptation (SF-UIDA) framework for sleep staging, leveraging sequential cross-view contrasting and pseudo-label based fine-tuning. It is a two-step subject-specific adaptation scheme, which enables the source model to effectively adapt to a newly appearing unlabeled individual without access to the source data. It meets the practical needs of real-world scenarios, where personalized customization can be applied to new subjects in a plug-and-play manner. Our framework is applied to three classic sleep staging models and evaluated on three public sleep datasets, achieving state-of-the-art performance.
When Should We Prefer State-to-Visual DAgger over Visual Reinforcement Learning?
Learning policies from high-dimensional visual inputs, such as pixels and point clouds, is crucial in various applications. Visual reinforcement learning is a promising approach that directly trains policies from visual observations, although it faces challenges in sample efficiency and computational costs. This study conducts an empirical comparison of State-to-Visual DAgger (a two-stage framework that initially trains a state policy before adopting online imitation to learn a visual policy) and Visual RL across a diverse set of tasks. We evaluate both methods across 16 tasks from three benchmarks, focusing on their asymptotic performance, sample efficiency, and computational costs. Surprisingly, our findings reveal that State-to-Visual DAgger does not universally outperform Visual RL but shows significant advantages in challenging tasks, offering more consistent performance. In contrast, its benefits in sample efficiency are less pronounced, although it often reduces the overall wall-clock time required for training. Based on our findings, we provide recommendations for practitioners and hope that our results contribute valuable perspectives for future research in visual policy learning.
First-Order Automata
First-order linear temporal logic (FOLTL) is a flexible and expressive formalism capable of naturally describing complex behaviors and properties. Although the logic is in general highly undecidable, the idea of using it as a specification language for the verification of complex infinite-state systems is appealing. However, a missing piece, which has proved to be an invaluable tool in dealing with other temporal logics, is an automaton model capable of capturing the logic. In this paper we address this issue by defining and studying such a model, which we call a first-order automaton. We define this very general class of automata, and the corresponding notion of regular first-order language (of finite words), showing their closure under most language-theoretic operations. We show how they can capture any FOLTL formula over finite words, over any signature and theory, and provide sufficient conditions for the semi-decidability of their non-emptiness problem. Then, to show the usefulness of the formalism, we prove the decidability of monodic FOLTL, a classic result known in the literature, with a simpler and more direct proof.
New Compilation Languages Based on Restricted Weak Decomposability
This paper introduces two new compilation languages restricting weak decomposable negation normal form (wDNNF) circuits and integrates them into the knowledge compilation map. Positive (resp. negative) wDNNF circuits, abbreviated pwDNNF (resp. nwDNNF), restrict wDNNF circuits so that each variable shared among the inputs of a conjunction node can only have positive (resp. negative) occurrences in that subcircuit. Unlike wDNNF circuits, pwDNNF (resp. nwDNNF) circuits satisfy the maximum (resp. minimum) cardinality query. We present a compiler for converting CNF formulae into pwDNNF and nwDNNF circuits by extending Bella, the state-of-the-art compiler for wDNNF circuits. We introduce a new caching scheme, called Cara, that exploits isomorphism. Using that scheme, we show a new compilation method based on copying subcircuits, which may significantly speed up compilations at the expense of increasing circuit sizes. Our experiments demonstrate that nwDNNF circuits are suitable for computing most probable explanations (MPEs) in two-layer Bayesian networks (BNs) with large domains.
An And-Sum Circuit with Signed Edges That Is More Succinct than SDD
Knowledge compilation is a method of transforming knowledge into a compressed and tractable form that permits more efficient operations. For Boolean functions, numerous representations have been proposed that enhance succinctness and tractability. In this paper, we introduce a new representation named structured Decomposable And-Sum Circuit (st-DASC), which employs AND and SUM nodes with signed edges, in place of the standard AND and OR nodes with unsigned edges. Notably, incorporating negative signs permits polytime logical negation. By following a knowledge compilation map, we show that st-DASCs are more succinct than Sentential Decision Diagrams (SDDs) while maintaining support for every operation on the knowledge compilation map that SDD supports. Furthermore, st-DASCs are even more succinct than structured d-DNNFs (st-d-DNNFs), which are more succinct than SDDs although they support fewer operations than SDDs. Accordingly, st-DASCs break the traditional trade-off between succinctness and tractability over SDDs and st-d-DNNFs.
Even-if Explanations: Formal Foundations, Priorities and Complexity
Explainable AI has received significant attention in recent years. Machine learning models often operate as black boxes, lacking explainability and transparency while supporting decision-making processes. Local post-hoc explainability queries attempt to answer why individual inputs are classified in a certain way by a given model. While there has been important work on counterfactual explanations, less attention has been devoted to semifactual ones. In this paper, we focus on local post-hoc explainability queries within the semifactual 'even-if' thinking and their computational complexity among different classes of models, and show that both linear and tree-based models are strictly more interpretable than neural networks. After this, we introduce a preference-based framework enabling users to personalize explanations based on their preferences, both in the case of semifactuals and counterfactuals, enhancing interpretability and user-centricity. Finally, we explore the complexity of several interpretability problems in the proposed preference-based framework and provide algorithms for polynomial cases.
Masked Language Modeling Becomes Conditional Density Estimation for Tabular Data Synthesis
In this paper, our goal is to generate synthetic data for heterogeneous (mixed-type) tabular datasets with high machine learning utility (MLu). Since the MLu performance depends on accurately approximating the conditional distributions, we focus on devising a synthetic data generation method based on conditional distribution estimation. We introduce MaCoDE by redefining the consecutive multi-class classification task of Masked Language Modeling (MLM) as histogram-based non-parametric conditional density estimation. Our approach enables the estimation of conditional densities across arbitrary combinations of target and conditional variables. We bridge the theoretical gap between distributional learning and MLM by demonstrating that minimizing the orderless multi-class classification loss leads to minimizing the total variation distance between conditional distributions. To validate our proposed model, we evaluate its performance in synthetic data generation across 10 real-world datasets, demonstrating its ability to adjust data privacy levels easily without re-training. Additionally, since masked input tokens in MLM are analogous to missing data, we further assess its effectiveness in handling training datasets with missing values, including multiple imputations of the missing entries.
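The idea of treating bin classification as histogram-based density estimation can be illustrated with a small Python sketch. It is not the MaCoDE implementation. It assumes equal-frequency (quantile) binning, and it uses empirical bin frequencies as a stand-in for the class probabilities a trained masked model would predict. All function names are hypothetical.

```python
import bisect
import random


def make_bin_edges(values, n_bins):
    """Equal-frequency (quantile) bin edges for one continuous column."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return [ordered[round(i * last / n_bins)] for i in range(n_bins + 1)]


def to_bin(value, edges):
    """Map a value to its bin index in [0, n_bins - 1]."""
    idx = bisect.bisect_right(edges, value) - 1
    return min(max(idx, 0), len(edges) - 2)


def sample_from_bins(bin_probs, edges, rng):
    """Draw a bin from the categorical distribution, then a value uniformly
    inside that bin (a piecewise-constant, histogram-style density)."""
    k = rng.choices(range(len(bin_probs)), weights=bin_probs, k=1)[0]
    return rng.uniform(edges[k], edges[k + 1])


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


rng = random.Random(0)
column = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # toy continuous column
edges = make_bin_edges(column, n_bins=4)
labels = [to_bin(v, edges) for v in column]  # classification targets

# Stand-in for classifier output: empirical frequency of each bin.
bin_probs = [labels.count(k) / len(labels) for k in range(4)]
sample = sample_from_bins(bin_probs, edges, rng)
```

In a full pipeline, `bin_probs` would come from a model conditioned on the unmasked columns, and `total_variation` could compare predicted and empirical bin distributions.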