HKU-Pasteur Research Pole

HKU Scholars Hub
    299645 research outputs found

    AnyDoor: Zero-shot Image Customization with Region-to-region Reference

    No full text
    This work presents AnyDoor, a diffusion-based image generator with the power to teleport target objects to new scenes at user-specified locations with desired shapes. Instead of tuning parameters for each object, our model is trained only once and effortlessly generalizes to diverse object-scene combinations at the inference stage. Such a challenging zero-shot setting requires an adequate characterization of a given object. To this end, we leverage a powerful self-supervised image encoder (i.e., DINOv2) to extract the discriminative identity feature of the target object. Besides, we complement the identity feature with detail features, which are carefully designed to maintain appearance details yet allow versatile local variations (e.g., lighting, orientation, posture, etc.), supporting the object in favorably blending with different surroundings. We further propose to borrow knowledge from video datasets, where we can observe various forms (i.e., along the time axis) of a single object, leading to stronger model generalizability and robustness. Starting from the task of object insertion, we further extend the framework of AnyDoor to a general solution with region-to-region image reference. With different definitions of the source region and target region, the tasks of object insertion, object removal, and image variation can be integrated into one model without introducing extra parameters. In addition, we investigate incorporating other conditions such as the mask, pose skeleton, and depth map as additional guidance to achieve more controllable generation.

    UniMatch V2: Pushing the Limit of Semi-Supervised Semantic Segmentation

    No full text
    Semi-supervised semantic segmentation (SSS) aims at learning rich visual knowledge from cheap unlabeled images to enhance semantic segmentation capability. Among recent works, UniMatch (Yang et al. 2023) improves substantially on its predecessors by amplifying the practice of weak-to-strong consistency regularization. Subsequent works typically follow similar pipelines and propose various delicate designs. Despite the achieved progress, strangely, even in this flourishing era of numerous powerful vision models, almost all SSS works still stick to 1) using outdated ResNet encoders with small-scale ImageNet-1K pre-training, and 2) evaluating on the simple Pascal and Cityscapes datasets. In this work, we argue that it is necessary to switch the baseline of SSS from ResNet-based encoders to more capable ViT-based encoders (e.g., DINOv2) that are pre-trained on massive data. A simple update of the encoder (even using 2× fewer parameters) can bring more significant improvement than careful method designs. Built on this competitive baseline, we present our upgraded and simplified UniMatch V2, inheriting the core spirit of weak-to-strong consistency from V1, but requiring less training cost and providing consistently better results. Additionally, given the gradually saturating performance on Pascal and Cityscapes, we urge a shift of focus to more challenging benchmarks with complex taxonomy, such as the ADE20K and COCO datasets.
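The weak-to-strong consistency idea named in the abstract above can be illustrated with a minimal sketch. It follows the generic FixMatch-style recipe (confident predictions on a weakly augmented view become pseudo-labels for a strongly augmented view) and is not the UniMatch V2 implementation. The function names, the 0.95 threshold, and the toy logits are all assumptions made for illustration.

```python
import math

def softmax(logits):
    # Numerically stable softmax over one pixel's class logits.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def weak_to_strong_loss(weak_logits, strong_logits, threshold=0.95):
    """Cross-entropy of strong-view predictions against pseudo-labels from
    the weak view, counting only pixels where the weak view is confident.
    The threshold value is an assumed hyperparameter."""
    total = 0.0
    for w, s in zip(weak_logits, strong_logits):
        p_weak = softmax(w)
        confidence = max(p_weak)
        if confidence < threshold:
            continue  # low-confidence pixel contributes zero loss
        pseudo_label = p_weak.index(confidence)
        p_strong = softmax(s)
        total += -math.log(p_strong[pseudo_label])
    # Average over all pixels, not only the confident ones.
    return total / len(weak_logits)

# Hypothetical logits for two pixels and three classes.
weak = [[8.0, 0.0, 0.0], [0.2, 0.1, 0.0]]
strong = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
loss = weak_to_strong_loss(weak, strong)
```

Only the first pixel passes the confidence threshold here, so the second pixel, where the two views disagree, does not penalize the model.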

    Dynamic Event-Triggered Adaptive Tracking Control for Switched Nonlinear System With Vanishing Control Gains

    No full text
    The asymptotic tracking control problem is investigated for nonstrict-feedback switched nonlinear systems by developing a dynamic event-triggered adaptive control technique. A key contribution is the design of a hysteresis-type switching rule and an event-triggering-based switching controller to achieve asymptotic convergence. First, by utilizing the Nussbaum-type function technique, the obstacle caused by unknown nonsmooth saturation nonlinearity is circumvented. Second, a novel gain-dependent switching signal is constructed, which ensures a time interval between any two adjacent switchings. Furthermore, a dynamic event-triggered scheme is established, which allows for the asynchronous switching phenomenon generated by switching and event-triggering. An event-triggering-based switching controller is designed so that the adaptive asymptotic tracking control problem is solvable even when input saturation and vanishing control gains arise. Finally, simulation results with theoretical analysis are presented to validate the feasibility of the control algorithm. Note to Practitioners: With the development of artificial intelligence techniques, the study of switched nonlinear systems has received widespread attention. This investigation is motivated by the vanishing control gains problem for switched nonlinear systems. We propose a co-design of a hysteresis-type gain-dependent switching rule and an event-triggering-based switching controller for switched nonlinear systems under input saturation and vanishing control gains. In existing results, the designed control signal is required to enter into the systems everywhere, while we relax this restriction. An improved dynamic event-triggered adaptive control framework has been established to obtain less conservative conditions, where the control gain is allowed to vanish at some points. This study presents the method for practitioners interested in adaptive control design.

    Hybrid-Generative Diffusion Models for Attack-Oriented Twin Migration in Vehicular Metaverses

    No full text
    The vehicular metaverse is envisioned as a blended immersive domain that promises to bring revolutionary changes to the automotive industry. As a core component of vehicular metaverses, Vehicle Twins (VTs) are digital twins that cover the entire life cycle of vehicles, providing immersive virtual services for Vehicular Metaverse Users (VMUs). Vehicles with limited resources offload the computationally intensive tasks of constructing and updating VTs to edge servers and migrate VTs between these servers, ensuring seamless and immersive experiences for VMUs. However, the high mobility of vehicles, uneven deployment of edge servers, and potential security threats pose challenges to achieving efficient and reliable VT migrations. To address these issues, we propose a secure and reliable VT migration framework in vehicular metaverses. Specifically, we design a two-layer trust evaluation model to comprehensively evaluate the reputation value of edge servers in the network communication and interaction layers. Then, we model the VT migration problem as a partially observable Markov decision process and design a hybrid-Generative Diffusion Model (GDM) algorithm based on deep reinforcement learning to generate optimal migration decisions by taking hybrid actions (i.e., continuous actions and discrete actions). Numerical results demonstrate that the hybrid-GDM algorithm outperforms the baseline algorithms, showing strong adaptability in various settings and highlighting its potential for addressing various optimization issues in vehicular metaverses.

    Generative AI Based Secure Wireless Sensing for ISAC Networks

    No full text
    Integrated sensing and communications (ISAC) is one of the crucial technologies for 6G, and channel state information (CSI) based sensing serves as an essential part of ISAC. However, current research on ISAC focuses mainly on improving sensing performance, overlooking security issues, particularly the unauthorized sensing of users. Hence, this paper proposes a diffusion model based secure sensing system (DFSS). Specifically, we first propose a discrete conditional diffusion model to generate graphs with nodes and edges, which guides the ISAC system to appropriately activate wireless links and nodes, ensuring the sensing performance while minimizing the operation cost. Using the activated links and nodes, DFSS then employs a continuous conditional diffusion model to generate safeguarding signals, which are then modulated onto the pilot at the transmitter to mask fluctuations caused by user activities. As such, only authorized ISAC devices with the safeguarding signals can extract the true CSI for sensing, while unauthorized devices are unable to perform effective sensing. Experimental results demonstrate that DFSS can reduce the activity recognition accuracy of unauthorized devices by approximately 70%, effectively shielding the user from illegitimate surveillance.

    SecureShare: Blockchain based Secure and Verifiable Knowledge Sharing for AI-Generated Content (AIGC) Services

    No full text
    Benefiting from the rapidly expanding Internet of Things (IoT) data and powerful computing devices, AI-generated content (AIGC) trains models with vast knowledge to provide automated content generation services. Sharing knowledge through the ciphertext-policy attribute-based encryption (CP-ABE) algorithm is beneficial for training high-quality AIGC models to offer better services. However, existing CP-ABE sharing schemes often involve untrusted third parties, which can result in issues such as knowledge deletion, unverifiable access, and single points of failure. To address these challenges, some blockchain-based sharing schemes have been developed. However, they still face privacy leakage problems. In this paper, we propose SecureShare, a secure and verifiable knowledge sharing scheme based on a consortium blockchain for AIGC services. We begin by outlining a blockchain knowledge sharing architecture and optimizing the Delegated Proof of Stake (DPOS) committee node selection method to ensure that entities can achieve verifiable access control. Additionally, to achieve fine-grained access to knowledge ciphertext while preserving privacy, we propose a CP-ABE scheme with Policy Hiding, attribute privacy preservation, and Revocation, referred to as PHR-CP-ABE. PHR-CP-ABE ensures the privacy of access policies and attributes, and users whose attributes have been revoked can no longer decrypt knowledge. A case study on Dall-E illustrates the operational mechanism of the proposed scheme. We provide theoretical analysis of the security of both the AIGC knowledge sharing scheme and PHR-CP-ABE. Through extensive performance analysis and comparisons with existing schemes, our approach demonstrates significant advantages in terms of computation and communication overhead.

    StyleAdapter: A Unified Stylized Image Generation Model

    No full text
    This work focuses on generating high-quality images with the specific style of reference images and the content of provided textual descriptions. Current leading algorithms, i.e., DreamBooth and LoRA, require fine-tuning for each style, leading to time-consuming and computationally expensive processes. In this work, we propose StyleAdapter, a unified stylized image generation model capable of producing a variety of stylized images that match both the content of a given prompt and the style of reference images, without the need for per-style fine-tuning. It introduces a two-path cross-attention (TPCA) module to separately process style information and the textual prompt, which cooperates with a semantic suppressing vision model (SSVM) to suppress the semantic content of style images. In this way, it ensures that the prompt maintains control over the content of the generated images, while also mitigating the negative impact of semantic information in style references. As a result, the content of the generated image adheres to the prompt, and its style aligns with the style references. Besides, StyleAdapter can be integrated with existing controllable synthesis methods, such as T2I-adapter and ControlNet, to attain a more controllable and stable generation process. Extensive experiments demonstrate the superiority of our method over previous works.

    Mixture of Experts-augmented Deep Unfolding for Activity Detection in IRS-aided Systems

    No full text
    In the realm of activity detection for massive machine-type communications, intelligent reflecting surfaces (IRS) have shown significant potential in enhancing coverage for devices lacking direct connections to the base station (BS). However, traditional activity detection methods are typically designed for a single type of channel model, which does not reflect the complexities of real-world scenarios, particularly in systems incorporating IRS. To address this challenge, this paper introduces a novel approach that combines model-driven deep unfolding with a mixture of experts (MoE) framework. By automatically selecting one of three expert designs and applying it to the unfolded projected gradient method, our approach eliminates the need for prior knowledge of channel types between devices and the BS. Simulation results demonstrate that the proposed MoE-augmented deep unfolding method surpasses the traditional covariance-based method and black-box neural network design, delivering superior detection performance under mixed channel fading conditions.

    Learning to Optimize Resource Allocation in Dynamic Wireless Environments: Embracing the New While Engaging the Old

    No full text
    Wireless resource allocation is a critical component in modern communication systems, and deep neural networks (DNNs) have shown great promise in addressing this challenge. However, conventional DNNs assume that testing data follows the same distribution as the training data, which is at odds with the dynamic nature of real-world wireless environments. This paper introduces a new training algorithm designed specifically for dynamic wireless environments where the channel distribution varies. This method helps DNNs adapt to new environments while preserving previously learned information. The proposed approach distinguishes itself by updating the DNN parameters in the null space of the low-rank covariance of previous data, which reduces memory needs and boosts training efficiency. Additionally, to counter the problem of DNNs hitting their model capacity during continuous adaptation, a selective forgetting mechanism is proposed. This mechanism allows DNNs to discard unimportant knowledge over time, freeing up model capacity for more effective adaptation. The effectiveness of the algorithm is validated by integrating it with graph neural networks and multilayer perceptrons for weighted sum-rate maximization. Through a comprehensive evaluation that includes synthetic and ray-tracing-based datasets, superior performance is demonstrated compared to existing methods.
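The null-space update mentioned in the abstract above can be illustrated for a single linear layer. The uncentered covariance of the previous inputs has the span of those inputs as its range, so its null space is the orthogonal complement of that span. Projecting each gradient row onto the complement gives an update that leaves the layer's outputs on the old inputs unchanged. The sketch below is exact and uses Gram-Schmidt rather than a low-rank approximation. It is not the paper's algorithm, and all names and numbers are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def remove_components(vector, basis):
    # Subtract the projection of `vector` onto each orthonormal basis vector.
    r = list(vector)
    for b in basis:
        c = dot(r, b)
        r = [ri - c * bi for ri, bi in zip(r, b)]
    return r

def orthonormal_basis(vectors, tol=1e-10):
    """Gram-Schmidt basis for the span of the previous inputs, which is the
    range of their uncentered covariance."""
    basis = []
    for v in vectors:
        r = remove_components(v, basis)
        norm = dot(r, r) ** 0.5
        if norm > tol:  # skip vectors already in the span
            basis.append([ri / norm for ri in r])
    return basis

def project_gradient(grad_rows, old_inputs):
    """Project every row of a weight gradient onto the null space of the
    old-input covariance, so that (W + dW) x == W x for each old input x."""
    basis = orthonormal_basis(old_inputs)
    return [remove_components(row, basis) for row in grad_rows]

# Hypothetical data: two old inputs in R^3 and a 2x3 weight gradient.
old_inputs = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
grad = [[0.5, -2.0, 3.0], [1.0, 1.0, 1.0]]
projected = project_gradient(grad, old_inputs)
```

Because the two old inputs span the first two coordinates, only the third column of the gradient survives the projection.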

    Evaluation of Periodontal Infrabony Defect Topography via CBCT and Comparisons with Direct Intrasurgical Measurements

    Full text link
    Background: Two-dimensional periapical radiographs (PAs) offer only limited information regarding three-dimensional periodontal infrabony defects. In contrast, cone beam computed tomography (CBCT) enables visualization of the entire defect morphology. This study aimed to evaluate the agreement between CBCT and direct intrasurgical measurements (ISs) regarding the characteristics of infrabony defects, including measurements of defect depth, width, the type of defect (one-wall, two-wall, three-wall), and defect extension. Methods: Intrasurgical and radiographic assessments were performed by two calibrated examiners on 26 infrabony defects in 17 patients who underwent periodontal surgery. The defect depth, width, type, and extension were compared between intrasurgical observations and PA or CBCT findings. The CBCT assessment was performed mainly using axial reconstructions. Angle measurements were compared between CBCT and PAs. Results: The mean differences between CBCT and intrasurgical measurements were −0.11 ± 0.49 mm for depth and −0.07 ± 0.41 mm for width, with no significant differences. The ICC values were 0.938 and 0.923 for depth and width, respectively. The mean difference in width between PAs and ISs was significant (−0.36 ± 0.73 mm; p = 0.002). CBCT demonstrated high agreement with intrasurgical observations for defect type (κ = 0.819) and defect extension (κ = 0.855), while lower agreement was found for PAs. Conclusions: CBCT is a valid assessment modality for infrabony defects. It demonstrated strong agreement with ISs, taken as the gold standard, for depth and width measurements, and its agreement with ISs regarding defect type and extension appeared to surpass that of PAs.
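For readers unfamiliar with the agreement statistics reported above, the following sketch shows how a mean difference ± SD and Cohen's κ are computed. The measurements and defect classifications below are invented for illustration and are not the study's data. The ICC is omitted, and the function name `cohens_kappa` is an assumption.

```python
from collections import Counter
from statistics import mean, stdev

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two categorical ratings.
    Undefined (division by zero) when expected agreement equals 1."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical defect depths in mm (NOT data from the study).
cbct_depth = [4.1, 5.0, 3.2, 6.4]
surgical_depth = [4.0, 5.5, 3.0, 6.5]
diffs = [c - s for c, s in zip(cbct_depth, surgical_depth)]
mean_diff, sd_diff = mean(diffs), stdev(diffs)  # sample SD of the differences

# Hypothetical defect-type classifications (NOT data from the study).
cbct_type = ["1-wall", "2-wall", "2-wall", "3-wall", "3-wall", "3-wall"]
surgical_type = ["1-wall", "2-wall", "3-wall", "3-wall", "3-wall", "3-wall"]
kappa = cohens_kappa(cbct_type, surgical_type)
```

A mean difference near zero with a small SD indicates little systematic bias between the two methods, while κ close to 1 indicates agreement on categories well beyond what chance would produce.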

    38,105 full texts
    299,645 metadata records
    Updated in last 30 days.