Hong Kong University of Science and Technology
Hong Kong University of Science and Technology Institutional Repository
Transformer-Based Scalable Multi-Agent Reinforcement Learning for Joint Resource Optimization in Cloud–Edge–End Video Streaming Systems
Cloud-edge-end (CEE) collaboration has demonstrated significant potential in video streaming analysis. However, dynamic wireless environments, lack of incentive mechanisms, and constrained resources (e.g., transmission power, bandwidth, and computing resources) remain the primary bottlenecks for achieving efficient CEE-based video processing. To address these challenges, this paper focuses on a multi-user CEE scenario with dynamic wireless channels and investigates the joint optimization of adaptive incentives, cooperative offloading, and resource allocation. We propose a novel framework, called JROC, to motivate edge devices (EDs) to participate in collaborative computation within CEE, thereby enhancing system utility. Specifically, JROC encompasses a smart contract-based adaptive incentive mechanism and an Adaptive Transformer-based multi-agent reinforcement Learning Algorithm (ATLA). The incentive mechanism leverages blockchain to ensure trustworthy and automated incentive distribution, while ATLA captures long-term dependencies and global state features among agents to guide video tasks in dynamic environments through adaptive compression, cooperative offloading, and resource allocation. Moreover, we discuss key steps for deploying the proposed algorithm in a real CEE prototype, including lightweight actor inference at the terminal side and training at the edge. Experimental results based on a real-world operator dataset show that, compared to existing methods, JROC achieves higher long-term system utility while maintaining favorable scalability, thereby validating its effectiveness in resource-constrained and under-incentivized CEE video streaming scenarios.
STAR++: Region-aware Conditional Semantics via Interpretable Side Information for Zero-Shot Skeleton Action Recognition
Zero-shot skeleton action recognition endeavors to classify novel action categories by transferring previously learned seen skeleton-semantic priors to unseen categories. However, current methods struggle to distinguish highly similar action categories, primarily due to the coarse-grained cross-modal alignment and non-discriminative representation space. To address these issues, we propose STAR++, a novel framework that aligns skeleton and semantics in a fine-grained and conditional manner. The key idea is to first establish region-level correspondences between body parts and semantic cues, and then utilize these local alignments to inform a global alignment process. This design is inspired by human visual cognition, which first attends to crucial local details before perceiving the broader scene. Concretely, we refine both skeleton and semantic representations with a dual-prompt attention mechanism driven by the structural decomposition of the human body and side information generated by a large language model (LLM). This encourages skeleton representations to be more compact within each class and semantic embeddings to be more separable across classes, which helps resolve ambiguity between highly similar actions and provides better interpretability of how unseen actions are perceived. Furthermore, we construct a region-aware holistic fusion module that aggregates these fine-grained features into a unified representation, yielding more discriminative holistic representations. Finally, the global alignment is conditioned on region-aware semantics feedback derived from fine-grained alignment, forming a conditional process that achieves more effective cross-modal alignment. Extensive experiments on four mainstream benchmarks demonstrate that our method achieves state-of-the-art performance in the zero-shot learning (ZSL) and generalized zero-shot learning (GZSL) settings.
A Dual-Source Seven-Level Switched-Capacitor Inverter With Common-Ground Structure and Scalability
This article presents a dual-source, seven-level switched-capacitor inverter that comprises nine transistors, one diode, two capacitors, and two dc voltage sources. Its common-ground structure effectively addresses the leakage current issue commonly encountered in photovoltaic applications. Meanwhile, the scalability of the inverter allows it to be adapted to different numbers of input sources while improving the quality of the output waveform, making it suitable for multi-input applications. By employing a phase opposition disposition PWM algorithm, both capacitor ripple and low harmonic components of the output voltage are effectively suppressed. Further, a theoretical analysis of the capacitors’ charging process, the system’s leakage current, and the power losses is provided. When compared to existing solutions, the proposed structure stands out due to its high device utilization, scalability, and common-ground design. Finally, simulation and experimental results demonstrate that the proposed inverter has self-balancing capacitor voltages without using any voltage sensor or control strategy, and performs well under both steady and dynamic load conditions.
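The phase opposition disposition (POD) PWM mentioned in the abstract above can be illustrated with a minimal sketch. It shows only the generic textbook level-selection rule for a seven-level output (six stacked triangular carriers, with the carriers below zero shifted by half a carrier period). It is not the article's switching logic. The function names, modulation index, and carrier ratio are all assumptions made for illustration.

```python
import math


def triangle(phase):
    """Unit triangle wave in [0, 1]; phase is measured in carrier cycles."""
    frac = phase % 1.0
    return 2.0 * frac if frac < 0.5 else 2.0 * (1.0 - frac)


def pod_level(reference, carrier_phase, levels=7):
    """Output level (-3..3 for seven levels) for a reference in [-1, 1]."""
    n_carriers = levels - 1      # six carriers for seven levels
    half = n_carriers // 2
    band = 1.0 / half            # height of each stacked carrier band
    count = 0
    for k in range(n_carriers):
        bottom = -1.0 + k * band
        # POD: carriers below zero are shifted by half a carrier period.
        shift = 0.5 if k < half else 0.0
        carrier = bottom + band * triangle(carrier_phase + shift)
        if reference > carrier:
            count += 1
    return count - half


def simulate(mod_index=0.9, carrier_ratio=40, samples=4000):
    """Level sequence over one fundamental period (hypothetical parameters)."""
    out = []
    for i in range(samples):
        t = i / samples
        ref = mod_index * math.sin(2 * math.pi * t)
        out.append(pod_level(ref, carrier_ratio * t))
    return out


levels_seen = sorted(set(simulate()))
print(levels_seen)
```

Mapping each level to actual transistor states and capacitor charging paths depends on the inverter topology, which the abstract does not detail, so the sketch leaves it out.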
Correction to: Dissecting the contributions of organic nitrogen aerosols to global atmospheric nitrogen deposition and implications for ecosystems
In the article by Li et al., ‘Dissecting the contributions of organic nitrogen aerosols to global atmospheric nitrogen deposition and implications for ecosystems’, National Science Review 2023; 10: nwad244, https://doi.org/10.1093/nsr/nwad244, errors were identified concerning Fig. 4G, Table 1 and the corresponding values in the text. The specific corrections are as follows:
1. Global atmospheric burden of ON: Our original calculation did not properly account for the vertical variation of air density, leading to an overestimation of the ON burden (Column 2 in Table 1). We corrected this by using the correct air density profile.
2. Imine secondary organic nitrogen (SON) formation: In the original version, we calculated imine SON formation assuming constant reactant abundance. We updated this calculation by using pseudo-first-order rate constants, where ammonium (NH4+) is assumed constant, while glyoxal and methylglyoxal decrease over timesteps.
The corrected Table 1 and corresponding text are presented below. Corrections involve the values of atmospheric burden (second column from left in Table 1, and those in the main text; both highlighted in bold). (Table presented).
GLOBAL BUDGET OF ATMOSPHERIC ON AND CONTRIBUTION TO ATMOSPHERIC TN DEPOSITION
Table 1 summarizes the global budget of atmospheric ON as simulated by our model. The total atmospheric burden of ON was 0.4 Tg N (range in sensitivity experiments was 0.26 Tg N to 0.56 Tg N), including 0.2 Tg N of ONg and 0.2 Tg N of ONp. ONg species were mostly chemically produced in the atmosphere as acyl peroxy nitrates (e.g. peroxyacetyl nitrate) and non-acyl peroxy nitrates (e.g. methyl peroxy nitrate), and all ONg species had limited solubility [2]. As such, ONg were mainly removed from the atmosphere by thermal decomposition, photolysis or OH oxidation [26,60], with deposition accounting for a mere 1% to 2% of its global sink [61]. Globally, ONg only constituted 9% of the total atmospheric ON deposition.
In contrast, ONp constituted only 50% of the global atmospheric ON burden but dominated the global atmospheric ON deposition (91%). Of the 0.2 Tg N global atmospheric ONp burden, 87% (0.18 Tg N) was in the fine mode (ONfp). ONcp constituted only 13% (0.03 Tg N) of the global ONp burden because of its rapid deposition. The corrected Fig. 4 is presented below. There is a slight difference in Fig. 4G while the other sub-figures remain unchanged. The publicly accessible model code has been updated at the original repository link: https://doi.org/10.57760/sciencedb.o00005.00024. We apologize for any inconvenience caused. These corrections do not affect the conclusions of the article.
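The second correction above contrasts a constant-reactant assumption with pseudo-first-order kinetics. The sketch below is one possible reading of that difference, not the model's code. It assumes that "constant reactant abundance" means the reactant stays at its initial value for the whole timestep, and that under pseudo-first-order kinetics it decays exponentially at an effective rate `k_eff` (the rate constant multiplied by the fixed NH4+ abundance). All numerical values are hypothetical.

```python
import math


def product_constant_abundance(c0, k_eff, dt):
    """Product formed in one timestep if the reactant stays at c0 throughout."""
    return k_eff * c0 * dt


def product_pseudo_first_order(c0, k_eff, dt):
    """Product formed in one timestep when the reactant decays as c0 * exp(-k_eff * t)."""
    return c0 * (1.0 - math.exp(-k_eff * dt))


# Hypothetical values for illustration only; none come from the article.
c0 = 1.0        # initial glyoxal abundance (arbitrary units)
k_eff = 2.0e-4  # rate constant times the fixed NH4+ abundance (1/s)
dt = 3600.0     # model timestep (s)

print(product_constant_abundance(c0, k_eff, dt))
print(product_pseudo_first_order(c0, k_eff, dt))
```

Because 1 - exp(-x) never exceeds x for x ≥ 0, the pseudo-first-order estimate is never larger than the constant-abundance estimate for the same inputs, and it can never exceed the available reactant `c0`.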
Electrode Net: tailoring deep learning with signed distance field for fast and accurate multiscale design of porous electrodes
Designing novel porous electrodes with desirable merits is the key to advancing next-generation high-performing flow cells such as fuel cells, water electrolyzers, and flow batteries. However, engineering porous electrodes rationally and methodically remains challenging because it demands an in-depth understanding of their complex microstructures, which comes at extremely high computational cost. In this study, we develop a tailored deep learning framework (named “Electrode Net”, a 3-dimensional convolutional neural network with signed distance field) to efficiently and accurately predict the anisotropic transport properties of porous electrodes. A comprehensive dataset consisting of 15,433 real and generated geometric samples and their corresponding anisotropic transport properties is constructed, using an experimentally and numerically validated pore-scale model. Electrode Net can significantly accelerate the design of porous electrodes by reducing the computational cost by up to 96% while precisely predicting (R-squared range: 0.95–0.99) the porosity, tortuosity, and permeability. The outstanding generalizability of our model is further confirmed by accurate prediction of three practical flow cell electrodes in fuel cells, water electrolyzers, and flow batteries. Furthermore, we demonstrate a practical multiscale electrode design by incorporating the pore-scale anisotropic transport properties from Electrode Net into cell-scale simulations, enabling the rational and efficient optimization of three essential parameters of the gas diffusion layers in proton exchange membrane fuel cells.
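The signed distance field named in the abstract above can be sketched for a small voxelized geometry as follows. This is a brute-force illustration under stated assumptions, not Electrode Net's preprocessing. The sign convention (negative inside solid, positive in pore space), the voxel-center distance metric, and the function name `signed_distance_field` are all assumptions, since the abstract does not specify them.

```python
import math
from itertools import product


def signed_distance_field(solid):
    """solid[z][y][x] is True for solid voxels; returns nested lists of signed distances.

    Each voxel gets the Euclidean distance to the nearest voxel of the
    opposite phase: negative inside solid, positive in pore space.
    Both phases must be present, otherwise min() raises ValueError.
    """
    nz, ny, nx = len(solid), len(solid[0]), len(solid[0][0])
    coords = list(product(range(nz), range(ny), range(nx)))
    solid_pts = [c for c in coords if solid[c[0]][c[1]][c[2]]]
    pore_pts = [c for c in coords if not solid[c[0]][c[1]][c[2]]]
    sdf = [[[0.0] * nx for _ in range(ny)] for _ in range(nz)]
    for c in coords:
        inside = solid[c[0]][c[1]][c[2]]
        others = pore_pts if inside else solid_pts
        d = min(math.dist(c, o) for o in others)
        sdf[c[0]][c[1]][c[2]] = -d if inside else d
    return sdf


# Toy geometry: a single solid voxel at the center of a 5x5x5 pore volume.
n = 5
solid = [[[(z, y, x) == (2, 2, 2) for x in range(n)] for y in range(n)] for z in range(n)]
sdf = signed_distance_field(solid)
print(sdf[2][2][2], sdf[2][2][3], sdf[0][0][0])
```

The brute-force search scales quadratically with the number of voxels, so realistic electrode volumes would need a dedicated distance-transform routine. The resulting field could then serve as an input channel to a 3D convolutional network in place of the raw binary geometry.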