Association for the Advancement of Artificial Intelligence: AAAI Publications
    26155 research outputs found

    Neural Reasoning Networks: Efficient Interpretable Neural Networks with Automatic Textual Explanations

    Recent advances in machine learning have led to a surge in adoption of neural networks for various tasks, but lack of interpretability remains an issue for many others in which an understanding of the features influencing the prediction is necessary to ensure fairness, safety, and legal compliance. In this paper we consider one class of such tasks, tabular dataset classification, and propose a novel neuro-symbolic architecture, Neural Reasoning Networks (NRN), that is scalable and generates logically sound textual explanations for its predictions. NRNs are connected layers of logical neurons that implement a form of real-valued logic. A training algorithm (R-NRN) learns the weights of the network as usual, using gradient descent optimization with backpropagation, but also learns the network structure itself using a bandit-based optimization. Both are implemented in an extension to PyTorch that takes full advantage of GPU scaling and batched training. Evaluation on a diverse set of 22 open-source datasets for tabular classification demonstrates performance (measured by ROC AUC) that improves over Multilayer Perceptron (MLP) and is statistically similar to other state-of-the-art approaches such as Random Forest, XGBoost and Gradient Boosted Trees, while offering 43% faster training and a more than 2 orders of magnitude reduction in the number of parameters required, on average. Furthermore, R-NRN explanations are shorter than those of the compared approaches while producing more accurate feature importance scores.
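
The abstract does not say which real-valued logic the logical neurons implement. As a rough illustration only, the sketch below assumes a weighted Łukasiewicz-style conjunction, one common choice in neuro-symbolic work; the function names, the `bias` parameter, and the weights are hypothetical and are not taken from the paper or its PyTorch extension.

```python
def clamp01(value):
    """Clamp a number to the truth-value interval [0, 1]."""
    return max(0.0, min(1.0, value))


def weighted_and(truths, weights, bias=1.0):
    """Weighted Lukasiewicz-style AND (an assumed form, not the paper's).

    Each input is a truth value in [0, 1]. A larger weight makes the
    neuron more sensitive to that input being false.
    """
    penalty = sum(w * (1.0 - t) for t, w in zip(truths, weights))
    return clamp01(bias - penalty)


def weighted_or(truths, weights, bias=1.0):
    """Weighted OR obtained from weighted_and through De Morgan's law."""
    negated = [1.0 - t for t in truths]
    return 1.0 - weighted_and(negated, weights, bias)


# With unit weights and crisp inputs the neurons reduce to Boolean logic.
crisp_and = weighted_and([1.0, 0.0], [1.0, 1.0])
crisp_or = weighted_or([1.0, 0.0], [1.0, 1.0])
# Intermediate truth values give a graded result.
soft_and = weighted_and([0.7, 0.6], [1.0, 1.0])
```

Because every operation here is piecewise linear in the weights, a neuron of this kind can in principle be trained by gradient descent, which is consistent with the training setup the abstract describes.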

    Understanding Individual Agent Importance in Multi-Agent System via Counterfactual Reasoning

    Explaining multi-agent systems (MAS) is urgent as these systems become increasingly prevalent in various applications. Previous work has provided explanations for the actions or states of agents, yet falls short in understanding a black-box agent’s importance within a MAS and the overall team strategy. To bridge this gap, we propose EMAI, a novel agent-level explanation approach that evaluates the individual agent’s importance. Inspired by counterfactual reasoning, a larger change in reward caused by randomizing an agent’s action indicates that the agent is more important. We model it as a MARL problem to capture interactions across agents. Utilizing counterfactual reasoning, EMAI learns masking agents to identify important agents. Specifically, we define the optimization function to minimize the reward difference before and after action randomization and introduce sparsity constraints to encourage the exploration of more action randomization of agents during training. The experimental results in seven multi-agent tasks demonstrate that EMAI achieves higher fidelity in explanations compared to baselines and provides more effective guidance in practical applications concerning understanding policies, launching attacks, and patching policies.
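
The counterfactual idea in this abstract (randomize one agent's action and see how much the team reward changes) can be sketched with a toy Monte Carlo estimate. This is not EMAI itself, which learns masking agents through MARL; the linear `team_reward` function, the binary action space, and all names below are assumptions made purely for illustration.

```python
import random


def team_reward(actions, weights):
    """Toy team reward: a weighted sum of binary agent actions (assumed)."""
    return sum(w * a for w, a in zip(weights, actions))


def counterfactual_importance(policy_actions, weights, n_trials=500, seed=0):
    """Estimate each agent's importance as the mean absolute change in
    team reward when only that agent's action is replaced by a random one."""
    rng = random.Random(seed)
    baseline = team_reward(policy_actions, weights)
    scores = []
    for agent in range(len(policy_actions)):
        total_change = 0.0
        for _ in range(n_trials):
            perturbed = list(policy_actions)
            perturbed[agent] = rng.choice([0, 1])  # randomized action
            total_change += abs(baseline - team_reward(perturbed, weights))
        scores.append(total_change / n_trials)
    return scores


# Agent 0 contributes most to the reward; agent 2 contributes nothing.
importance = counterfactual_importance([1, 1, 1], weights=[5.0, 1.0, 0.0])
```

In this toy setting the agent whose action has no effect on the reward receives an importance of exactly zero, and agents with larger reward weights rank higher.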

    Integrating Sequence and Image Modeling in Irregular Medical Time Series Through Self-Supervised Learning

    Medical time series are often irregular and face significant missingness, posing challenges for data analysis and clinical decision-making. Existing methods typically adopt a single modeling perspective, either treating series data as sequences or transforming them into image representations for further classification. In this paper, we propose a joint learning framework that incorporates both sequence and image representations. We also design three self-supervised learning strategies to facilitate the fusion of sequence and image representations, capturing a more generalizable joint representation. The results indicate that our approach outperforms seven other state-of-the-art models in three representative real-world clinical datasets. We further validate our approach by simulating two major types of real-world missingness through leave-sensors-out and leave-samples-out techniques. The results demonstrate that our approach is more robust and significantly surpasses other baselines in terms of classification performance

    A Pioneering Neural Network Method for Efficient and Robust Fuel Sloshing Simulation in Aircraft

    Simulating fuel sloshing within aircraft tanks during flight is crucial for aircraft safety research. Traditional methods based on Navier-Stokes equations are computationally expensive. In this paper, we treat fluid motion as point cloud transformation and propose the first neural network method specifically designed for simulating fuel sloshing in aircraft. This model is also the first deep learning model capable of stably modeling fluid particle dynamics in such complex scenarios. Our triangle feature fusion design achieves an optimal balance among fluid dynamics modeling, momentum conservation constraints, and global stability control. Additionally, we constructed the Fueltank dataset, the first dataset for aircraft fuel surface sloshing. It comprises 320,000 frames across four typical tank types and covers a wide range of flight maneuvers, including multi-directional rotations. We conducted comprehensive experiments on both our dataset and the take-off scenario of the aircraft. Compared to existing neural network-based fluid simulation algorithms, we significantly enhanced accuracy while maintaining high computational speed. Compared to traditional SPH methods, our speed improved approximately 10 times. Furthermore, compared to traditional fluid simulation software such as Flow3D, our computation speed increased by more than 300 times

    Unveiling the Threat of Fraud Gangs to Graph Neural Networks: Multi-Target Graph Injection Attacks Against GNN-Based Fraud Detectors

    Graph neural networks (GNNs) have emerged as an effective tool for fraud detection, identifying fraudulent users, and uncovering malicious behaviors. However, attacks against GNN-based fraud detectors and their risks have rarely been studied, thereby leaving potential threats unaddressed. Recent findings suggest that frauds are increasingly organized as gangs or groups. In this work, we design attack scenarios where fraud gangs aim to make their fraud nodes misclassified as benign by camouflaging their illicit activities in collusion. Based on these scenarios, we study adversarial attacks against GNN-based fraud detectors by simulating attacks of fraud gangs in three real-world fraud cases: spam reviews, fake news, and medical insurance frauds. We define these attacks as multi-target graph injection attacks and propose MonTi, a transformer-based Multi-target one-Time graph injection attack model. MonTi simultaneously generates attributes and edges of all attack nodes with a transformer encoder, capturing interdependencies between attributes and edges more effectively than most existing graph injection attack methods that generate these elements sequentially. Additionally, MonTi adaptively allocates the degree budget for each attack node to explore diverse injection structures involving target, candidate, and attack nodes, unlike existing methods that fix the degree budget across all attack nodes. Experiments show that MonTi outperforms the state-of-the-art graph injection attack methods on five real-world graphs

    MimiQ: Low-Bit Data-Free Quantization of Vision Transformers with Encouraging Inter-Head Attention Similarity

    Data-free quantization (DFQ) is a technique that creates a lightweight network from its full-precision counterpart without the original training data, often through a synthetic dataset. Although several DFQ methods have been proposed for vision transformer (ViT) architectures, they fail to achieve efficacy in low-bit settings. Examining the existing methods, we observe that their synthetic data produce misaligned attention maps, while those of the real samples are highly aligned. From this observation, we find that aligning attention maps of synthetic data helps improve the overall performance of quantized ViTs. Motivated by this finding, we devise MimiQ, a novel DFQ method designed for ViTs that enhances inter-head attention similarity. First, we generate synthetic data by aligning head-wise attention outputs from each spatial query patch. Then, we align the attention maps of the quantized network to those of the full-precision teacher by applying head-wise structural attention distillation. The experimental results show that the proposed method significantly outperforms baselines, setting a new state-of-the-art for ViT-DFQ

    Creating Coherence in Federated Non-Negative Matrix Factorization

    In many real-world applications, data is inherently decentralized, necessitating data analysis methods that prioritize privacy while delivering interpretable results. Federated Non-Negative Matrix Factorization (FedNMF) meets this requirement by factorizing latent components from distributed data that cannot be freely shared among clients. A significant challenge in FedNMF arises when clients converge on different solutions due to prolonged independent optimization, leading to drift and incoherent models. While Federated Learning (FL) typically mitigates drift through frequent synchronizations and strong regularization, it often overlooks critical properties of Non-Negative Matrix Factorization, such as permutation invariance. As a result, solutions from FedNMF clients may be misidentified by FL drift as distinct, despite being equivalent. Using an alignment-aware drift, we create coherence through proximal optimization and barycenter aggregation for FedNMF. We analyze the computational complexity of our approach, provide efficient heuristics, and ensure the convergence of our algorithms. On a diverse set of real-world and synthetic datasets, we demonstrate the effectiveness of our methods
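
The permutation invariance mentioned above can be checked on a small example: reordering the latent components of a factorization leaves the product unchanged, yet a naive element-wise comparison of the factors reports a large difference. The sketch below uses a brute-force search over permutations as a stand-in for alignment; it is an assumption for illustration and not the proximal or barycenter method the abstract refers to. All matrices and function names are hypothetical.

```python
from itertools import permutations


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def permute_columns(w, perm):
    """Reorder the columns (latent components) of W."""
    return [[row[p] for p in perm] for row in w]


def permute_rows(h, perm):
    """Reorder the rows (latent components) of H."""
    return [h[p] for p in perm]


def squared_distance(a, b):
    """Naive element-wise squared distance between two matrices."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def aligned_distance(w_ref, w_other):
    """Distance after the best column permutation (brute force, small k only)."""
    k = len(w_ref[0])
    return min(squared_distance(w_ref, permute_columns(w_other, perm))
               for perm in permutations(range(k)))


W = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]
H = [[1.0, 2.0], [0.5, 0.0]]
swap = (1, 0)
W_swapped = permute_columns(W, swap)
H_swapped = permute_rows(H, swap)

same_product = matmul(W, H) == matmul(W_swapped, H_swapped)
naive_drift = squared_distance(W, W_swapped)
aligned_drift = aligned_distance(W, W_swapped)
```

Here the two factorizations reconstruct the same matrix, the naive drift is nonzero, and the aligned drift is zero. Brute-force search scales factorially in the number of components, so a practical method would need an assignment solver or a heuristic.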

    OneBatchPAM: A Fast and Frugal K-Medoids Algorithm

    This paper proposes a novel k-medoids approximation algorithm to handle large-scale datasets with reasonable computational time and memory complexity. We develop a local-search algorithm that iteratively improves the medoid selection based on the estimation of the k-medoids objective. A single batch of size

    3SAT: A Simple Self-Supervised Adversarial Training Framework

    The combination of self-supervised learning and adversarial training (AT) can significantly improve the adversarial robustness of self-supervised models. However, the robustness of self-supervised adversarial training (self-AT) still lags behind that of state-of-the-art (SOTA) supervised AT (sup-AT), even though the performance of current self-supervised learning models has already matched or even surpassed that of SOTA supervised learning models. This issue raises concerns about the secure application of self-supervised learning models. The inclusion of adversarial training turns self-AT into a challenging joint optimization problem, and recent studies have shown that the data augmentation methods necessary for constructing positive pairs in self-supervised learning negatively impact the robustness improvement in self-AT. Inspired by this, we propose 3SAT, a simple self-supervised adversarial training framework. 3SAT conducts adversarial training on original, unaugmented samples, reducing the difficulty of optimizing the adversarial training subproblem and fundamentally eliminating the negative impact of data augmentation on robustness improvement. Additionally, 3SAT introduces a dynamic training objective scheduling strategy to address the issue of model training collapse during the joint optimization process when using original samples directly. 3SAT is not only structurally simple and computationally efficient, reducing self-AT training time by half, but it also improves the SOTA self-AT robustness accuracy by 16.19% and standard accuracy by 11.41% under Auto-Attack on the CIFAR-10 dataset. Even more impressively, 3SAT surpasses the SOTA sup-AT method in robust accuracy by a significant margin of 11.25%. This marks the first time that self-AT has outperformed SOTA sup-AT in robustness, indicating that self-AT is a superior method for improving model robustness.

    Beyond Federated Prototype Learning: Learnable Semantic Anchors with Hyperspherical Contrast for Domain-Skewed Data

    Federated prototype learning is in the spotlight as global prototypes are effective in enhancing the learning of local representation spaces, facilitating the ability to generalize the global model. However, when encountering domain-skewed data, conventional federated prototype learning is susceptible to two dilemmas: 1) Local prototypes obtained by averaging intra-class embeddings carry domain-specific markers, so the margins among aggregated global prototypes could be attenuated, which is detrimental to inter-class separation. 2) Local domain-skewed embeddings may not exhibit a uniform distribution in Euclidean space, which is not conducive to the prototype-induced intra-class compactness. To address the two drawbacks, we go beyond the conventional paradigm of federated prototype learning and propose learnable semantic anchors with hyperspherical contrast (FedLSA) for domain-skewed data. Specifically, we eschew the pattern of yielding prototypes via averaging intra-class embeddings and directly learn a set of semantic anchors aided by the global semantic-aware classifier. Meanwhile, the margins between anchors are augmented by pulling them apart, ensuring decent inter-class separation. To guarantee that local domain-skewed representations can be uniformly distributed, local data is projected into the hyperspherical space, and the intra-class compactness is achieved by optimizing the contrastive loss derived from the von Mises-Fisher distribution. Finally, extensive experimental results on three multi-domain datasets show the superiority of the proposed FedLSA compared to existing typical and state-of-the-art methods.
