Association for the Advancement of Artificial Intelligence: AAAI Publications
26155 research outputs found
Designing Specialized Two-Dimensional Graph Spectral Filters for Spatial-Temporal Graph Modeling
Spatial-temporal graph modeling is challenging due to the diverse node interactions across spatial and temporal dimensions. Recent studies typically adopt Graph Neural Networks (GNNs) to perform node-level aggregation at different time steps, acting as a series of low-pass graph spectral filters, for node interaction modeling. However, these filters, confined to the spatial dimension, are ill-suited for processing signals of nodes with inherent spatial-temporal interdependencies. Moreover, oversimplified low-pass filtering fails to fully exploit information from diverse node interactions. To address these issues, we propose a Spatial-Temporal Spectral Graph Neural Network (STSGNN), which designs specialized two-dimensional (2-D) graph spectral filters for comprehensive spatial-temporal graph modeling. First, based on the normalized Laplacian spectrum of spatial and temporal graphs, we extend the existing graph spectral theory from a univariate spatial dimension to a bivariate spatial-temporal dimension through a 2-D Discrete Graph Fourier Transform (2-D DGFT). Then, we leverage the bivariate Bernstein polynomial approximation, with learned basis coefficients, to design 2-D filters with specialized spectral properties for unified spatial-temporal signal filtering. Finally, the filtered signals, with refined spatial-temporal representations, are fed into well-designed pyramidal gated convolution modules to acquire multiple ranges of spatial-temporal dependencies. Experiments on traffic and meteorological prediction tasks demonstrate that STSGNN achieves state-of-the-art performance. Additionally, we visualize the 2-D filters learned from inputs with distinct spatial-temporal characteristics to enhance the model's interpretability
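As a rough illustration of the bivariate Bernstein approximation mentioned in this abstract, the sketch below evaluates the frequency response of a 2-D filter at a pair of spatial and temporal graph frequencies. It assumes the usual convention that normalized Laplacian eigenvalues lie in [0, 2] and are rescaled to [0, 1]; the names `bernstein_basis`, `filter_response_2d`, and `theta` are hypothetical, and the abstract does not specify the exact parameterization STSGNN uses.

```python
from math import comb


def bernstein_basis(k, K, lam):
    """k-th Bernstein basis polynomial of degree K at eigenvalue lam.

    Assumes lam is a normalized-Laplacian eigenvalue in [0, 2],
    rescaled here to x = lam / 2 in [0, 1].
    """
    x = lam / 2.0
    return comb(K, k) * x**k * (1.0 - x) ** (K - k)


def filter_response_2d(theta, lam_s, lam_t):
    """Response h(lam_s, lam_t) = sum_ij theta[i][j] * B_i(lam_s) * B_j(lam_t).

    theta is a (K_s + 1) x (K_t + 1) grid of basis coefficients
    (learned in the paper, fixed by hand in this sketch).
    """
    K_s = len(theta) - 1
    K_t = len(theta[0]) - 1
    return sum(
        theta[i][j] * bernstein_basis(i, K_s, lam_s) * bernstein_basis(j, K_t, lam_t)
        for i in range(K_s + 1)
        for j in range(K_t + 1)
    )


# A filter that passes only low spatial AND low temporal frequencies.
low_low = [[1.0, 0.0], [0.0, 0.0]]
print(filter_response_2d(low_low, 0.0, 0.0))  # full response at the lowest frequencies
print(filter_response_2d(low_low, 2.0, 2.0))  # no response at the highest frequencies
```

Because the Bernstein basis forms a partition of unity, setting every coefficient to 1 gives an all-pass filter; other coefficient patterns yield low-pass, high-pass, or band-pass behavior along each dimension.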
POI-Enhancer: An LLM-based Semantic Enhancement Framework for POI Representation Learning
POI representation learning plays a crucial role in handling tasks related to user mobility data. Recent studies have shown that enriching POI representations with multimodal information can significantly enhance their task performance.
Previously, the textual information incorporated into POI representations typically involved only POI categories or check-in content, leading to relatively weak textual features in existing methods.
In contrast, large language models (LLMs) trained on extensive text data have been found to possess rich textual knowledge.
However, leveraging such knowledge to enhance POI representation learning presents two key challenges: first, how to extract POI-related knowledge from LLMs effectively, and second, how to integrate the extracted information to enhance POI representations.
To address these challenges, we propose POI-Enhancer, a portable framework that leverages LLMs to improve POI representations produced by classic POI learning models. We first design three specialized prompts to extract semantic information from LLMs efficiently. Then, the Dual Feature Alignment module enhances the quality of the extracted information, while the Semantic Feature Fusion module preserves its integrity. The Cross Attention Fusion module then adaptively integrates this high-quality information into POI representations, and Multi-View Contrastive Learning further injects human-understandable semantic information into these representations. Extensive experiments on three real-world datasets demonstrate the effectiveness of our framework, showing significant improvements across all baseline representations.
Representation Space Augmentation for Effective Self-Supervised Learning on Tabular Data
Tabular data, widely used across industries, remains underexplored in deep learning. Self-supervised learning (SSL) shows promise for pre-training deep neural networks (DNNs) on tabular data, but its potential is hindered by challenges in designing suitable augmentations. Unlike image and text data, where SSL leverages inherent spatial or semantic structures, tabular data lacks such explicit structure. This makes traditional input-level augmentations, like modifying or removing features, less effective due to difficulties in balancing critical information preservation with variability. To address these challenges, we propose RaTab, a novel method that shifts augmentation from input-level to representation-level using matrix factorization, specifically truncated SVD. This approach preserves essential data structures while generating diverse representations by applying dropout at various stages of the representation, thereby significantly enhancing SSL performance for tabular data
Mixed-Curvature Multi-Modal Knowledge Graph Completion
Multi-modal Knowledge Graph Completion (KGC), which aims to enrich knowledge graph embeddings by incorporating images and text as supplementary information alongside triplets, is a significant task in learning KGs. Existing multi-modal KGC methods mainly focus on modality-level fusion, neglecting the importance of modeling complex structures, such as hierarchical and circular patterns. To address this, we propose a Mixed-Curvature multi-modal Knowledge Graph Completion method (MCKGC) that embeds the information into three single-curvature spaces, namely hyperbolic space, hyperspherical space, and Euclidean space, and incorporates multi-modal information into a mixed space. Specifically, MCKGC consists of a Modality Information Mixed-Curvature Module (MIMCM) and a Progressive Fusion Module (PFM). To improve the expressive ability for different modalities, MIMCM introduces multi-modal information into the three single-curvature spaces for interaction. Then, to extract useful information from different modalities and capture the complex structure from the geometric information, PFM implements a progressive fusion strategy that uses modality-level and space-level gates to adaptively incorporate the information from different spaces. Extensive experiments on three widely used benchmarks demonstrate the effectiveness of our method.
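To make the three single-curvature spaces concrete, here is a minimal sketch of the standard distance function in each: Euclidean distance, geodesic distance on the Poincaré ball (one common model of hyperbolic space), and great-circle distance on the unit hypersphere. The function names are hypothetical, and the abstract does not say which hyperbolic model or which curvature values MCKGC uses, so unit curvature is assumed throughout.

```python
from math import acos, acosh, sqrt


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def euclidean_distance(u, v):
    """Straight-line distance in flat (zero-curvature) space."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def poincare_distance(u, v):
    """Geodesic distance on the Poincare ball with curvature -1.

    Assumes both points lie strictly inside the unit ball.
    """
    diff_sq = sum((a - b) ** 2 for a, b in zip(u, v))
    denom = (1.0 - _dot(u, u)) * (1.0 - _dot(v, v))
    return acosh(1.0 + 2.0 * diff_sq / denom)


def spherical_distance(u, v):
    """Great-circle distance on the unit hypersphere (curvature +1).

    Assumes both points are unit vectors; the dot product is clamped
    to [-1, 1] to guard against floating-point drift.
    """
    return acos(max(-1.0, min(1.0, _dot(u, v))))


print(euclidean_distance((0.0, 0.0), (3.0, 4.0)))
print(poincare_distance((0.0, 0.0), (0.5, 0.0)))
print(spherical_distance((1.0, 0.0), (0.0, 1.0)))
```

Hyperbolic distances grow rapidly near the boundary of the ball, which suits tree-like hierarchies, while spherical distances are bounded and wrap around, which suits circular patterns.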
Behavior Importance-Aware Graph Neural Architecture Search for Cross-Domain Recommendation
Cross-domain recommendation (CDR) mitigates data sparsity and cold-start issues in recommendation systems. While recent CDR approaches using graph neural networks (GNNs) capture complex user-item interactions, they rely on manually designed architectures that are often suboptimal and labor-intensive. Additionally, extracting valuable behavioral information from source domains to improve target domain recommendations remains challenging. To address these challenges, we propose Behavior importance-aware Graph Neural Architecture Search (BiGNAS), a framework that jointly optimizes GNN architecture and data importance for CDR. BiGNAS introduces two key components: a Cross-Domain Customized Supernetwork and a Graph-Based Behavior Importance Perceptron. The supernetwork, a one-shot, retrain-free module, automatically searches for the optimal GNN architecture for each domain. The perceptron uses auxiliary learning to dynamically assess the importance of source domain behaviors, thereby improving target domain recommendations. Extensive experiments on benchmark CDR datasets and a large-scale industry advertising dataset demonstrate that BiGNAS consistently outperforms state-of-the-art baselines. To the best of our knowledge, this is the first work to jointly optimize GNN architecture and behavior data importance for cross-domain recommendation.
MSR: A Multifaceted Self-Retrieval Framework for Microscopic Cascade Prediction
The microscopic cascade prediction task has wide applications in downstream areas such as "rumor detection".
Its goal is to forecast the diffusion routes of information cascades within networks.
Existing works typically formulate it as a classification task, which fails to align well with the Social Homophily assumption, as it uses only the features of "infected" users while neglecting those of "uninfected" users in representation learning.
Moreover, these methods focus primarily on social relationships, thereby dismissing other vital dimensions like users' historical behavior and the underlying preferences behind it.
To address these challenges, we introduce the MSR (Multifaceted Self-Retrieval) framework.
During encoding, in addition to the existing social graph, we construct a preference graph to represent "behavioral preferences" and further propose a modified multi-channel GRAU for multi-view analysis of the cascade phenomenon.
For decoding, our approach diverges from classification-based methods by reformulating the task as an information retrieval problem that predicts the target user with similarity measures.
Empirical evaluations on public datasets demonstrate that this framework significantly outperforms baselines on Hits@κ and MAP@κ, affirming its enhanced ability
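The retrieval-style decoding described above can be sketched as ranking candidate users by similarity to a cascade representation and scoring the ranking with Hits@k. Cosine similarity, the toy vectors, and the names `rank_candidates` and `hits_at_k` are assumptions made for illustration; the abstract does not state which similarity measure MSR uses.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def rank_candidates(query, candidates):
    """Return candidate user ids sorted from most to least similar to the query.

    candidates maps a user id to that user's embedding vector.
    """
    return sorted(candidates, key=lambda u: cosine(query, candidates[u]), reverse=True)


def hits_at_k(ranked, target, k):
    """1.0 if the true next user appears in the top-k results, else 0.0."""
    return 1.0 if target in ranked[:k] else 0.0


# Toy cascade representation and three candidate users (hypothetical values).
cascade_vec = [1.0, 0.0]
users = {"a": [1.0, 0.1], "b": [0.0, 1.0], "c": [-1.0, 0.0]}
ranked = rank_candidates(cascade_vec, users)
print(ranked, hits_at_k(ranked, "b", 2))
```

Unlike a classifier over "infected" users only, a retrieval formulation scores every candidate, so the representations of users who have not yet joined the cascade also take part in the prediction.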
Bridging the User-side Knowledge Gap in Knowledge-aware Recommendations with Large Language Models
In recent years, knowledge graphs have been integrated into recommender systems as item-side auxiliary information, enhancing recommendation accuracy. However, constructing and integrating structural user-side knowledge remains a significant challenge due to the improper granularity and inherent scarcity of user-side features. Recent advancements in Large Language Models (LLMs) offer the potential to bridge this gap by leveraging their human behavior understanding and extensive real-world knowledge. Nevertheless, integrating LLM-generated information into recommender systems presents challenges, including the risk of noisy information and the need for additional knowledge transfer. In this paper, we propose an LLM-based user-side knowledge inference method alongside a carefully designed recommendation framework to address these challenges. Our approach employs LLMs to infer user interests based on historical behaviors, integrating this user-side information with item-side and collaborative data to construct a hybrid structure: the Collaborative Interest Knowledge Graph (CIKG). Furthermore, we propose a CIKG-based recommendation framework that includes a user interest reconstruction module and a cross-domain contrastive learning module to mitigate potential noise and facilitate knowledge transfer. We conduct extensive experiments on three real-world datasets to validate the effectiveness of our method. Our approach achieves state-of-the-art performance compared to competitive baselines, particularly for users with sparse interactions
Time Series Supplier Allocation via Deep Black-Litterman Model
As a typical problem of Spatiotemporal Resource Management, Time Series Supplier Allocation (TSSA) poses a complex NP-hard challenge: refining future order dispatching strategies to satisfy the trade-off between demands and maximum supply. The Black-Litterman (BL) model, which comes from financial portfolio management, offers a new perspective for TSSA by balancing expected returns against insufficient supply risks. However, the BL model is not only constrained by manually constructed perspective matrices and spatio-temporal market dynamics but also restricted by the absence of supervisory signals and unreliable supplier data. To address these limitations, we introduce the pioneering Deep Black-Litterman Model (DBLM) for TSSA, which adapts the BL model from the financial domain to the supply chain context. Specifically, DBLM leverages Spatio-Temporal Graph Neural Networks (STGNNs) to capture spatio-temporal dependencies for automatically generating future perspective matrices. Moreover, a novel Spearman rank correlation is designed as the supervisory signal for DBLM to navigate the complex risks and interactions of suppliers. Finally, DBLM uses a masking mechanism to counteract the bias of unreliable data, thus improving precision and reliability. Extensive experiments on two datasets demonstrate significant improvements of DBLM on TSSA.
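For reference, the sketch below computes the standard Spearman rank correlation, with tied values given their average rank, between a predicted and an observed supplier scoring. It is the textbook coefficient only, not the novel variant the abstract says DBLM designs as a supervisory signal, and the supplier scores are made-up values.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks of values; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of entries tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0.0 or vy == 0.0:
        return 0.0  # undefined for constant input; 0.0 is a convention chosen here
    return cov / sqrt(vx * vy)


predicted = [0.9, 0.4, 0.7, 0.1]      # hypothetical predicted supplier scores
observed = [120.0, 60.0, 95.0, 10.0]  # hypothetical realized supply volumes
print(spearman(predicted, observed))
```

Because it depends only on ranks, this coefficient rewards getting the ordering of suppliers right rather than matching exact supply volumes.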
HLMEA: Unsupervised Entity Alignment Based on Hybrid Language Models
Entity alignment (EA) is crucial for integrating knowledge graphs (KGs) constructed from diverse sources. Conventional unsupervised EA approaches attempt to eliminate human intervention but often suffer from accuracy limitations. With the rise of large language models (LLMs), leveraging their capabilities for EA presents a promising direction. However, it introduces new challenges: formulating the LLM-based EA problem and extracting the background knowledge in LLMs to realize EA without human intervention. This paper proposes HLMEA, a novel hybrid language model-based unsupervised EA method. HLMEA formulates the EA task into a filtering and single-choice problem and synergistically integrates small language models (SLMs) and LLMs. Specifically, SLMs filter candidate entities based on textual representations generated from KG triples. Then, LLMs refine this selection to identify the most semantically aligned entities. An iterative self-training mechanism allows SLMs to distill knowledge from LLM outputs, enhancing the EA ability of hybrid language models in subsequent rounds cooperatively. We also conducted extensive experiments on benchmark datasets to evaluate HLMEA's performance. The results demonstrate that HLMEA significantly outperforms unsupervised and even supervised EA baselines, proving its potential for scalable and effective EA across large KGs. The code and data are available at https://github.com/xnjin-ai/HLMEA
Unified Graph Neural Networks Pre-training for Multi-domain Graphs
Graph Neural Networks (GNNs) have proven effective and typically benefit from pre-training on accessible graphs to enhance performance on tasks with limited labeled data. However, existing GNNs are constrained by the "one-domain-one-model" limitation, which restricts their effectiveness across diverse graph domains. In this paper, we tackle this problem by developing a method called Multi-Domain Pre-training for a Unified GNN Model (MDP-GNN). This method is based on the philosophical notion that everything is interconnected, suggesting that a latent meta-domain exists to encompass the diverse graph domains and their interconnections. MDP-GNN seeks to identify and utilize this meta-domain to train a unified GNN model through three core strategies. Firstly, it integrates node feature semantics from different domains to create unified representations. Secondly, it employs a bi-level learning strategy to build a domain-synthesized network that identifies latent connections to facilitate cross-domain knowledge transfer. Thirdly, it uses Wasserstein distance to map diverse domains into the common meta-domain for graph distribution alignment. We validate the effectiveness of MDP-GNN through theoretical analysis and extensive experiments on four real-world graph datasets, showing its superiority in enhancing GNN performance across diverse domains.
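As a small illustration of the distribution-alignment idea, the sketch below computes the 1-Wasserstein distance between two equal-sized one-dimensional samples, where the optimal transport plan simply matches sorted values. This is a deliberately simplified case: the abstract does not describe how MDP-GNN computes Wasserstein distance over graph representations, and the function name and sample values are hypothetical.

```python
def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-sized 1-D empirical samples.

    For equal-sized samples on the real line, the optimal coupling pairs
    the i-th smallest value of xs with the i-th smallest value of ys.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


# Hypothetical scalar features drawn from two graph domains.
domain_a = [0.0, 1.0, 2.0]
domain_b = [3.0, 1.0, 2.0]
print(wasserstein_1d(domain_a, domain_b))
```

Minimizing a distance of this kind between domain-specific feature distributions and a shared target distribution pulls the domains toward a common space, which is the role the abstract assigns to the meta-domain.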