arXiv.org e-Print Archive

623509 research outputs found

Markov chains for the analysis of states of one-dimensional spin systems

Author: Yasinskaya D. N.
Panov Y. D.
Publication date: 18/11/2024

We analyze frustrated states of the one-dimensional dilute Ising chain with charged interacting impurities of two types by mapping the system onto a Markov chain. We classify the resulting chains and find two types: periodic with period 2 and aperiodic. Frustrated phases with different types of chains have different properties. In phases with periodic Markov chains, long-range order is observed in one sublattice while the other sublattice remains disordered. This results in the coexistence of non-zero residual entropy and an infinite correlation length. In frustrated phases with aperiodic chains, there is no long-range order, and the correlation length remains finite. It is shown that, under a magnetic field, the most significant change in the spin chain structure corresponds to a change of the Markov chain type. 19 pages, 4 figures; for the associated PDF file, see https://journals.ioffe.ru/articles/viewPDF/5897
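
The periodic/aperiodic distinction in the abstract above can be illustrated with a minimal Python sketch that computes the period of an irreducible Markov chain from its transition matrix. The function name `chain_period` and the toy matrices are assumptions made for illustration; they are not taken from the paper, and the paper's actual mapping from the spin chain to a Markov chain is not reproduced here.

```python
from collections import deque
from math import gcd


def chain_period(P, tol=1e-12):
    """Return the period of an irreducible Markov chain.

    P is a square transition matrix given as a list of rows.
    A breadth-first search from state 0 assigns each state a level;
    the period is the gcd of (level[u] + 1 - level[v]) over all
    transitions u -> v with positive probability.
    """
    n = len(P)
    level = [None] * n
    level[0] = 0
    queue = deque([0])
    period = 0
    while queue:
        u = queue.popleft()
        for v in range(n):
            if P[u][v] <= tol:
                continue  # no transition u -> v
            if level[v] is None:
                level[v] = level[u] + 1
                queue.append(v)
            else:
                period = gcd(period, abs(level[u] + 1 - level[v]))
    # This only checks that every state is reachable from state 0;
    # full irreducibility is assumed by the caller.
    if any(lv is None for lv in level):
        raise ValueError("some states are unreachable from state 0")
    return period


# Hypothetical toy chain: state A alternates with a random choice of B1 or B2.
# Every second site is always A (ordered), the others are random (disordered).
periodic = [
    [0.0, 0.5, 0.5],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]

# Hypothetical toy chain with a self-loop, which makes it aperiodic.
aperiodic = [
    [0.5, 0.5],
    [1.0, 0.0],
]

print(chain_period(periodic))   # 2
print(chain_period(aperiodic))  # 1
```

In the period-2 toy chain, one "sublattice" is fully determined while the other stays random, which mirrors the coexistence of order and residual entropy described in the abstract.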

Intelligent Pooling: Proactive Resource Provisioning in Large-scale Cloud Service

Author: Ravikumar Deepak
Yeo Alex
Zhu Yiwen
Lakra Aditya
Nagulapalli Harsha
Ravindran Santhosh Kumar
Suh Steve
Dutta Niharika
Fogarty Andrew
Park Yoonjae
Khushalani Sumeet
Tarafdar Arijit
Parekh Kunal
Krishnan Subru
Publication date: 18/11/2024

The proliferation of big data and analytic workloads has driven the need for cloud compute and cluster-based job processing. With Apache Spark, users can process terabytes of data with ease using hundreds of parallel executors. At Microsoft, we aim to provide a fast and succinct interface for users to run Spark applications, such as through creating simple notebook sessions by abstracting the underlying complexity of the cloud. Providing low latency access to Spark clusters and sessions is a challenging problem due to the large overheads of cluster creation and session startup. In this paper, we introduce Intelligent Pooling, a system for proactively provisioning compute resources to combat the aforementioned overheads. To reduce the COGS (cost-of-goods-sold), our system (1) predicts usage patterns using an innovative hybrid Machine Learning (ML) model with low latency and high accuracy; and (2) optimizes the pool size dynamically to meet customer demand while reducing extraneous COGS. The proposed system auto-tunes its hyper-parameters to balance between performance and operational cost with minimal to no engineering input. Evaluated using large-scale production data, Intelligent Pooling achieves up to 43% reduction in cluster idle time compared to static pooling when targeting 99% pool hit rate. Currently deployed in production, Intelligent Pooling is on track to save tens of millions of dollars in COGS per year as compared to traditional pre-provisioned pools.
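
The trade-off between pool hit rate and idle capacity mentioned above can be sketched with a simple empirical-quantile rule in Python. This is not the paper's hybrid ML model or its optimizer; the function names, the demand samples, and the reading of "hit rate" as the fraction of intervals whose demand fits within the pool are all assumptions made for illustration.

```python
def empirical_hit_rate(pool_size, demand_samples):
    """Fraction of intervals whose demand is covered by the pool (assumed definition)."""
    hits = sum(1 for demand in demand_samples if demand <= pool_size)
    return hits / len(demand_samples)


def smallest_pool_size(demand_samples, target_hit_rate=0.99):
    """Smallest observed demand level that meets the target hit rate."""
    if not demand_samples:
        raise ValueError("demand_samples must not be empty")
    for candidate in sorted(set(demand_samples)):
        if empirical_hit_rate(candidate, demand_samples) >= target_hit_rate:
            return candidate
    # Only reached if target_hit_rate > 1; fall back to the largest demand.
    return max(demand_samples)


# Hypothetical per-interval demand (number of clusters requested).
demand = [2, 3, 3, 4, 10]

print(smallest_pool_size(demand, target_hit_rate=0.8))  # 4
print(smallest_pool_size(demand, target_hit_rate=1.0))  # 10
```

A static pool sized for the worst case (10 here) leaves most clusters idle in typical intervals, which is the kind of overhead a dynamically sized pool aims to reduce.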

Enhancing Decision Transformer with Diffusion-Based Trajectory Branch Generation

Author: Liu Zhihong
Qian Long
Liu Zeyang
Wan Lipeng
Chen Xingyu
Lan Xuguang
Publication date: 18/11/2024

Decision Transformer (DT) can learn effective policies from offline datasets by converting offline reinforcement learning (RL) into a supervised sequence modeling task, where the trajectory elements are generated auto-regressively conditioned on the return-to-go (RTG). However, the sequence modeling approach tends to learn policies that converge on the sub-optimal trajectories within the dataset, for lack of bridging data to move to better trajectories, even if the condition is set to the highest RTG. To address this issue, we introduce Diffusion-Based Trajectory Branch Generation (BG), which expands the trajectories of the dataset with branches generated by a diffusion model. The trajectory branch is generated based on a segment of a trajectory within the dataset, and leads to trajectories with higher returns. We concatenate the generated branch with the trajectory segment as an expansion of the trajectory. After expanding, DT has more opportunities to learn policies to move to better trajectories, preventing it from converging to the sub-optimal trajectories. Empirically, after processing with BG, DT outperforms state-of-the-art sequence modeling methods on the D4RL benchmark, demonstrating the effectiveness of adding branches to the dataset without further modifications.
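
The return-to-go conditioning mentioned above can be sketched in a few lines of Python. The sketch assumes the usual undiscounted definition, where the RTG at step t is the sum of rewards from t to the end of the trajectory; the function name `returns_to_go` is illustrative, and the diffusion-based branch generation itself is not implemented here.

```python
def returns_to_go(rewards):
    """Return the undiscounted return-to-go for each step of one trajectory."""
    rtg = [0.0] * len(rewards)
    running = 0.0
    # Accumulate rewards from the last step backwards.
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        rtg[t] = running
    return rtg


# Hypothetical three-step trajectory.
print(returns_to_go([1.0, 0.0, 2.0]))  # [3.0, 2.0, 2.0]
```

Under this definition, a trajectory segment followed by a higher-return branch receives larger RTG values at its early steps than the original trajectory did, which is one way to read the claim that branches give the model examples of moving to better trajectories.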

Visual-Semantic Graph Matching Net for Zero-Shot Learning

Author: Duan Bowen
Chen Shiming
Guo Yufei
Xie Guo-Sen
Ding Weiping
Wang Yisong
Publication date: 18/11/2024

Zero-shot learning (ZSL) aims to leverage additional semantic information to recognize unseen classes. To transfer knowledge from seen to unseen classes, most ZSL methods learn a shared embedding space by simply aligning visual embeddings with semantic prototypes. However, methods trained under this paradigm often struggle to learn a robust embedding space because they align the two modalities in an isolated manner among classes, which ignores the crucial class relationships during the alignment process. To address these challenges, this paper proposes a Visual-Semantic Graph Matching Net, termed VSGMN, which leverages semantic relationships among classes to aid in visual-semantic embedding. VSGMN employs a Graph Build Network (GBN) and a Graph Matching Network (GMN) to achieve two-stage visual-semantic alignment. Specifically, GBN first utilizes an embedding-based approach to build visual and semantic graphs in the semantic space and aligns each embedding with its prototype for first-stage alignment. Additionally, to supplement unseen class relations in these graphs, GBN also builds the unseen class nodes based on semantic relationships. In the second stage, GMN continuously integrates neighbor and cross-graph information into the constructed graph nodes, and aligns the node relationships between the two graphs under the class relationship constraint. Extensive experiments on three benchmark datasets demonstrate that VSGMN achieves superior performance in both conventional and generalized ZSL scenarios. The implementation of our VSGMN and experimental results are available on GitHub: https://github.com/dbwfd/VSGMN. 15 pages, 6 figures
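
The baseline paradigm the abstract criticizes, aligning a visual embedding with per-class semantic prototypes in isolation, can be sketched as nearest-prototype classification in Python. This is not VSGMN and uses no graph matching; the function names, the cosine-similarity choice, and the toy attribute vectors are assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def predict_class(embedding, prototypes):
    """Pick the class whose semantic prototype is most similar to the embedding."""
    return max(prototypes, key=lambda name: cosine(embedding, prototypes[name]))


# Hypothetical attribute prototypes: [four-legged, solid coat, striped].
prototypes = {
    "horse": [1.0, 1.0, 0.0],
    "zebra": [1.0, 0.0, 1.0],  # stands in for an unseen class
}

# Hypothetical visual embedding already projected into the semantic space.
print(predict_class([0.9, 0.1, 0.8], prototypes))  # zebra
```

Each class is scored independently here, so relationships among classes play no role in the decision; that isolation is the limitation the abstract sets out to address.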

Superpixel-informed Implicit Neural Representation for Multi-Dimensional Data

Author: Li Jiayi
Zhao Xile
Wang Jianli
Wang Chao
Wang Min
Publication date: 18/11/2024

Recently, implicit neural representations (INRs) have attracted increasing attention for multi-dimensional data recovery. However, INRs simply map coordinates via a multi-layer perceptron (MLP) to corresponding values, ignoring the inherent semantic information of the data. To leverage semantic priors from the data, we propose a novel Superpixel-informed INR (S-INR). Specifically, we suggest utilizing generalized superpixels instead of pixels as the basic unit of INR for multi-dimensional data (e.g., images and weather data). The coordinates of generalized superpixels are first fed into exclusive attention-based MLPs, and then the intermediate results interact with a shared dictionary matrix. The elaborately designed modules in S-INR allow us to ingeniously exploit the semantic information within and across generalized superpixels. Extensive experiments on various applications validate the effectiveness and efficacy of our S-INR compared to state-of-the-art INR methods. Accepted at ECCV 2024; 18 pages, 7 figures

Continual Task Learning through Adaptive Policy Self-Composition

Author: Hu Shengchao
Zhou Yuhang
Fan Ziqing
Hu Jifeng
Shen Li
Zhang Ya
Tao Dacheng
Publication date: 18/11/2024

Training a generalizable agent to continually learn a sequence of tasks from offline trajectories is a natural requirement for long-lived agents, yet remains a significant challenge for current offline reinforcement learning (RL) algorithms. Specifically, an agent must be able to rapidly adapt to new tasks using newly collected trajectories (plasticity), while retaining knowledge from previously learned tasks (stability). However, systematic analyses of this setting are scarce, and it remains unclear whether conventional continual learning (CL) methods are effective in continual offline RL (CORL) scenarios. In this study, we develop the Offline Continual World benchmark and demonstrate that traditional CL methods struggle with catastrophic forgetting, primarily due to the unique distribution shifts inherent to CORL scenarios. To address this challenge, we introduce CompoFormer, a structure-based continual transformer model that adaptively composes previous policies via a meta-policy network. Upon encountering a new task, CompoFormer leverages semantic correlations to selectively integrate relevant prior policies alongside newly trained parameters, thereby enhancing knowledge sharing and accelerating the learning process. Our experiments reveal that CompoFormer outperforms conventional CL methods, particularly in longer task sequences, showcasing a promising balance between plasticity and stability. 21 pages, 8 figures

The Dark Side of Trust: Authority Citation-Driven Jailbreak Attacks on Large Language Models

Author: Yang Xikang
Tang Xuehai
Han Jizhong
Hu Songlin
Publication date: 18/11/2024

The widespread deployment of large language models (LLMs) across various domains has showcased their immense potential while exposing significant safety vulnerabilities. A major concern is ensuring that LLM-generated content aligns with human values. Existing jailbreak techniques reveal how this alignment can be compromised through specific prompts or adversarial suffixes. In this study, we introduce a new threat: LLMs' bias toward authority. While this inherent bias can improve the quality of outputs generated by LLMs, it also introduces a potential vulnerability, increasing the risk of producing harmful content. Notably, this bias shows up as varying levels of trust given to different types of authoritative information in harmful queries. For example, for malware development, LLMs tend to place more trust in GitHub. To better reveal the risks to LLMs, we propose DarkCite, an adaptive authority citation matcher and generator designed for a black-box setting. DarkCite matches optimal citation types to specific risk types and generates authoritative citations relevant to harmful instructions, enabling more effective jailbreak attacks on aligned LLMs. Our experiments show that DarkCite achieves a higher attack success rate (e.g., Llama-2 at 76% versus 68%) than previous methods. To counter this risk, we propose an authenticity and harm verification defense strategy, raising the average defense pass rate (DPR) from 11% to 74%. More importantly, the ability to link citations to the content they encompass has become a foundational function in LLMs, amplifying the influence of LLMs' bias toward authority.

Multidimensional specific relative entropy between continuous martingales

Author: Backhoff Julio
Bellotto Edoardo Kimani
Publication date: 18/11/2024

In continuous time, the laws of martingales tend to be singular to each other. Notably, N. Gantert introduced the concept of specific relative entropy between real-valued continuous martingales, defined as a scaling limit of finite-dimensional relative entropies, and showed that this quantity is non-trivial despite the aforementioned mutual singularity of martingale laws. Our main mathematical contribution is to extend this object, originally restricted to one-dimensional martingales, to multiple dimensions. Among other results, we establish that Gantert's inequality, bounding the specific relative entropy with respect to Wiener measure from below by an explicit functional of the quadratic variation, essentially carries over to higher dimensions. We also prove that this lower bound is tight, in the sense that it is the convex lower semicontinuous envelope of the specific relative entropy. This is a novel result even in dimension one. Finally, we establish closed-form expressions for the specific relative entropy in simple multidimensional examples.

Causal Effect of Group Diversity on Redundancy and Coverage in Peer-Reviewing

Author: Goyal Navita
Stelmakh Ivan
Shah Nihar
Daumé III Hal
Publication date: 18/11/2024

A large host of scientific journals and conferences solicit peer reviews from multiple reviewers for the same submission, aiming to gather a broader range of perspectives and mitigate individual biases. In this work, we reflect on the role of diversity in the slate of reviewers assigned to evaluate a submitted paper as a factor in diversifying perspectives and improving the utility of the peer-review process. We propose two measures for assessing review utility: review coverage (reviews should cover most contents of the paper) and review redundancy (reviews should add information not already present in other reviews). We hypothesize that reviews from diverse reviewers will exhibit high coverage and low redundancy. We conduct a causal study of the effect of different measures of reviewer diversity on review coverage and redundancy using observational data from a peer-reviewed conference with approximately 5,000 submitted papers. Our study reveals disparate effects of different diversity measures on review coverage and redundancy. We find that assigning a group of reviewers who are topically diverse, have different seniority levels, or have distinct publication networks leads to broader coverage of the paper or review criteria, but we find no evidence of an increase in coverage for reviewer slates with reviewers from diverse organizations or geographical locations. Reviewers from different organizations, seniority levels, topics, or publication networks (all except geographical diversity) lead to a decrease in redundancy in reviews. Furthermore, publication network-based diversity alone also helps bring in varying perspectives (that is, low redundancy), even within specific review criteria. Our study adopts a group decision-making perspective for reviewer assignments in peer review and suggests dimensions of diversity that can help guide the reviewer assignment process.

Unveiling the Inflexibility of Adaptive Embedding in Traffic Forecasting

Author: Wang Hongjun
Chen Jiyuan
Zhang Lingyu
Jiang Renhe
Song Xuan
Publication date: 18/11/2024

Spatiotemporal Graph Neural Networks (ST-GNNs) and Transformers have shown significant promise in traffic forecasting by effectively modeling temporal and spatial correlations. However, rapid urbanization in recent years has led to dynamic shifts in traffic patterns and travel demand, posing major challenges for accurate long-term traffic prediction. The generalization capability of ST-GNNs in extended temporal scenarios and cross-city applications remains largely unexplored. In this study, we evaluate state-of-the-art models on an extended traffic benchmark and observe substantial performance degradation in existing ST-GNNs over time, which we attribute to their limited inductive capabilities. Our analysis reveals that this degradation stems from an inability to adapt to evolving spatial relationships within urban environments. To address this limitation, we reconsider the design of adaptive embeddings and propose a Principal Component Analysis (PCA) embedding approach that enables models to adapt to new scenarios without retraining. We incorporate PCA embeddings into existing ST-GNN and Transformer architectures, achieving marked improvements in performance. Notably, PCA embeddings allow for flexibility in graph structures between training and testing, enabling models trained on one city to perform zero-shot predictions on other cities. This adaptability demonstrates the potential of PCA embeddings in enhancing the robustness and generalization of spatiotemporal models.

375,182 full texts

623,509 metadata records

Updated in last 30 days.

arXiv.org e-Print Archive is based in the United States
