arXiv.org e-Print Archive

    623509 research outputs found

    Relevance-guided Audio Visual Fusion for Video Saliency Prediction

    Audio data, often synchronized with video frames, plays a crucial role in guiding the audience's visual attention. Incorporating audio information into video saliency prediction tasks can enhance the prediction of human visual behavior. However, existing audio-visual saliency prediction methods often directly fuse audio and visual features, ignoring the possibility of inconsistency between the two modalities, such as when the audio serves as background music. To address this issue, we propose a novel relevance-guided audio-visual saliency prediction network dubbed AVRSP. Specifically, the Relevance-guided Audio-Visual feature Fusion module (RAVF) dynamically adjusts the retention of audio features based on the semantic relevance between audio and visual elements, thereby refining the integration process with visual features. Furthermore, the Multi-scale feature Synergy (MS) module integrates visual features from different encoding stages, enhancing the network's ability to represent objects at various scales. The Multi-scale Regulator Gate (MRG) transfers crucial fusion information to visual features, thus optimizing the utilization of multi-scale visual features. Extensive experiments on six audio-visual eye movement datasets demonstrate that our AVRSP network achieves competitive performance in audio-visual saliency prediction.

    The truncated univariate rational moment problem

    Given a closed subset $K$ in $\mathbb{R}$, the rational $K$-truncated moment problem ($K$-RTMP) asks to characterize the existence of a positive Borel measure $\mu$, supported on $K$, such that a linear functional $\mathcal{L}$, defined on all rational functions of the form $\frac{f}{q}$, where $q$ is a fixed polynomial with all real zeros of even order and $f$ is any real polynomial of degree at most $2k$, is an integration with respect to $\mu$. The case of a compact set $K$ was solved by Chandler in 1994, but there is no argument that ensures that $\mu$ vanishes on all real zeros of $q$. An obvious necessary condition for the solvability of the $K$-RTMP is that $\mathcal{L}$ is nonnegative on every $f$ satisfying $f|_{K}\geq 0$. If $\mathcal{L}$ is strictly positive on every $0\neq f|_{K}\geq 0$, we add the missing argument from Chandler's solution and also bound the number of atoms in a minimal representing measure. We show by an example that nonnegativity of $\mathcal{L}$ is not sufficient and add the missing conditions to the solution. We also solve the $K$-RTMP for unbounded $K$ and derive the solutions to the strong truncated Hamburger moment problem and the truncated moment problem on the unit circle as special cases. 18 pages

    Real-Time Fitness Exercise Classification and Counting from Video Frames

    This paper introduces a novel method for real-time exercise classification using a Bidirectional Long Short-Term Memory (BiLSTM) neural network. Existing exercise recognition approaches often rely on synthetic datasets and on raw coordinate inputs that are sensitive to user and camera variations, and they fail to fully exploit the temporal dependencies in exercise movements. These issues limit their generalizability and robustness in real-world conditions, where lighting, camera angles, and user body types vary. To address these challenges, we propose a BiLSTM-based model that leverages invariant features, such as joint angles, alongside raw coordinates. By using both angles and (x, y, z) coordinates, the model adapts to changes in perspective, user positioning, and body differences, improving generalization. Training on 30-frame sequences enables the BiLSTM to capture the temporal context of exercises and recognize patterns evolving over time. We compiled a dataset combining synthetic data from the InfiniteRep dataset and real-world videos from Kaggle and other sources. This dataset includes four common exercises: squat, push-up, shoulder press, and bicep curl. The model was trained and validated on these diverse datasets, achieving an accuracy of over 99% on the test set. To assess generalizability, the model was tested on two separate test sets representative of typical usage conditions. Comparisons with the previous approach from the literature are presented in the results section, showing that the proposed model performs best. The classifier is integrated into a web application providing real-time exercise classification and repetition counting without manual exercise selection. Demo and datasets are available at the following GitHub repository: https://github.com/RiccardoRiccio/Fitness-AI-Trainer-With-Automatic-Exercise-Recognition-and-Counting
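    The abstract above names joint angles as an invariant feature computed from (x, y, z) keypoints. As a rough illustration only, the following sketch shows one common way such an angle can be computed from three keypoints. The function name `joint_angle` and the tuple-based keypoint format are assumptions made here for illustration and are not taken from the paper or its repository.

```python
import math


def joint_angle(a, b, c):
    """Return the angle in degrees at joint b formed by keypoints a-b-c.

    Each keypoint is an (x, y, z) tuple. This is a hypothetical helper,
    not the feature extractor used by the paper's authors.
    """
    # Vectors pointing from the joint to its two neighbouring keypoints.
    ba = [a[i] - b[i] for i in range(3)]
    bc = [c[i] - b[i] for i in range(3)]

    dot = sum(p * q for p, q in zip(ba, bc))
    norm = math.sqrt(sum(p * p for p in ba)) * math.sqrt(sum(q * q for q in bc))
    if norm == 0.0:
        raise ValueError("degenerate joint: two keypoints coincide")

    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))
```

    Because the angle depends only on the directions of the two limb vectors, it does not change when all keypoints are translated or uniformly scaled, which is the kind of invariance the abstract attributes to angle features. For example, `joint_angle((1, 0, 0), (0, 0, 0), (0, 1, 0))` describes a right angle at the origin.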

    Twin Peak Method for Estimating Tissue Viscoelasticity using Shear Wave Elastography

    Tissue viscoelasticity is becoming an increasingly useful biomarker beyond elasticity and can theoretically be estimated using shear wave elastography (SWE), by inverting the propagation and attenuation characteristics of shear waves. Estimating viscosity is often more difficult than elasticity because attenuation, the main effect of viscosity, leads to poor signal-to-noise ratio of the shear wave motion. In the present work, we provide an alternative to existing methods of viscoelasticity estimation that is robust against noise. The method minimizes the difference between simulated and measured versions of two sets of peaks (twin peaks) in the frequency-wavenumber domain, obtained first by traversing through each frequency and then by traversing through each wavenumber. The slopes and deviation of the twin peaks are sensitive to elasticity and viscosity respectively, leading to the effectiveness of the proposed inversion algorithm for characterizing mechanical properties. This expected effectiveness is confirmed through in silico verification, followed by ex vivo validation and in vivo application, indicating that the proposed approach can be effectively used in accurately estimating viscoelasticity, thus potentially contributing to the development of enhanced biomarkers. 18 pages, 11 figures

    Analysis of Generalized Hebbian Learning Algorithm for Neuromorphic Hardware Using Spinnaker

    Neuromorphic computing, inspired by biological neural networks, has emerged as a promising approach for solving complex machine learning tasks with greater efficiency and lower power consumption. The integration of biologically plausible learning algorithms, such as the Generalized Hebbian Algorithm (GHA), is key to enhancing the performance of neuromorphic systems. In this paper, we explore the application of GHA in large-scale neuromorphic platforms, specifically SpiNNaker, a hardware platform designed to simulate large neural networks. Our results demonstrate significant improvements in classification accuracy, showcasing the potential of biologically inspired learning algorithms in advancing the field of neuromorphic computing. 8 pages, 1 figure, 7 tables
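    For readers unfamiliar with the Generalized Hebbian Algorithm mentioned above, the sketch below shows the textbook form of the update (Sanger's rule), in which output unit i learns from the input minus the reconstruction contributed by units 0..i. It is plain Python and does not model SpiNNaker or the authors' implementation. The function names, learning rate, epoch count and initialisation are all assumptions chosen here for illustration.

```python
import random


def gha_step(W, x, lr):
    """Apply one Sanger's-rule update and return the new weight matrix.

    W is a list of weight rows (one per output unit) and x is one
    zero-mean input vector. Hypothetical helper for illustration only.
    """
    # Linear outputs y_i = sum_j W[i][j] * x[j].
    y = [sum(w * xj for w, xj in zip(row, x)) for row in W]
    new_W = []
    for i, row in enumerate(W):
        # Reconstruction of x using only units 0..i (the lower-triangular term).
        recon = [sum(y[k] * W[k][j] for k in range(i + 1)) for j in range(len(x))]
        # Hebbian term y_i * x_j minus the decorrelating term y_i * recon_j.
        new_W.append([w + lr * y[i] * (xj - rj) for w, xj, rj in zip(row, x, recon)])
    return new_W


def train_gha(X, n_components, lr=0.001, epochs=20, seed=0):
    """Run GHA over the samples in X and return the learned weight rows."""
    rng = random.Random(seed)
    n_inputs = len(X[0])
    # Small random initial weights; the scale 0.1 is an arbitrary choice.
    W = [[rng.gauss(0.0, 0.1) for _ in range(n_inputs)] for _ in range(n_components)]
    for _ in range(epochs):
        for x in X:
            W = gha_step(W, x, lr)
    return W
```

    On zero-mean data with a sufficiently small learning rate, the rows of `W` are expected to approach the leading principal directions of the inputs in order of decreasing variance, which is why the rule is often described as a Hebbian form of principal component analysis.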

    Topology in 2D non-Abelian Lattice Gauge Theories

    In two dimensions, $U(N_c)$ gauge theories exhibit a non-trivial topological structure, while $SU(N_c)$ theories are topologically trivial. Hence, for $G = U(N_c)$ the phase space is divided into topological sectors, characterized by a topological index (a.k.a. "topological charge"). These sectors are separated by action barriers, which diverge if the lattice spacing is taken small, resulting in an algorithmic problem known as "topological freezing". We study these theories in various box sizes and at various couplings. With the help of gradient flow we derive instanton-like solutions for 2D $U(N_c)$ theory with a specific focus on the case of $N_c = 2$. 8 pages, 7 figures, 1 table. Proceedings of the 41st International Symposium on Lattice Field Theory (LATTICE2024), 28 July - 3 August, 2024, Liverpool, UK

    Generalized Treatment of Energy Accommodation in Gas-Surface Interactions for Satellite Aerodynamics Applications

    In the context of satellite aerodynamics in the Very-Low-Earth-Orbit (VLEO) regime, accurate modeling of gas-surface interactions (GSI) is crucial for determining aerodynamic forces and torques. Common models such as Sentman's assume that gas particles are reflected diffusely from a surface, which leads to the incorporation of energy accommodation into the model. This technical note discusses the limitations of existing approaches for handling energy accommodation and provides a generalized treatment thereof that is valid for any molecular speed ratio. A new general expression for the temperature ratio of reflected to impinging particles is derived, which, when used in a GSI model, retains its validity even in hypothermal flows. Additionally, a simplified hyperthermal approximation is presented, proven to be an asymptote of the general expression, and shown to be an improvement upon existing approximations by comparison for a realistic VLEO scenario. The results contribute to a better understanding and modeling of GSI, potentially benefiting scientific investigations and operational applications in satellite aerodynamics.

    Improved PIR Schemes using Matching Vectors and Derivatives

    In this paper, we construct new t-server Private Information Retrieval (PIR) schemes with communication complexity subpolynomial in the previously best known, for all but finitely many t. Our results are based on combining derivatives (in the spirit of Woodruff-Yekhanin) with the Matching Vector based PIRs of Yekhanin and Efremenko. Previously such a combination was achieved in an ingenious way by Dvir and Gopi, using polynomials and derivatives over certain exotic rings, en route to their fundamental result giving the first 2-server PIR with subpolynomial communication. Our improved PIRs are based on two ingredients:
    - We develop a new and direct approach to combine derivatives with Matching Vector based PIRs. This approach is much simpler than that of Dvir-Gopi: it works over the same field as the original PIRs, and only uses elementary properties of polynomials and derivatives.
    - A key subproblem that arises in the above approach is a higher-order polynomial interpolation problem. We show how sparse S-decoding polynomials, a powerful tool from the original constructions of Matching Vector PIRs, can be used to solve this higher-order polynomial interpolation problem using surprisingly few higher-order evaluations.
    Using the known sparse S-decoding polynomials in combination with our ideas leads to our improved PIRs. Notably, we get a 3-server PIR scheme with communication $2^{O^{\sim}((\log n)^{1/3})}$, improving upon the previously best known communication of $2^{O^{\sim}(\sqrt{\log n})}$ due to Efremenko. 16 pages

    Explicit Two-Sided Vertex Expanders Beyond the Spectral Barrier

    We construct the first explicit two-sided vertex expanders that bypass the spectral barrier. Previously, the strongest known explicit vertex expanders were given by $d$-regular Ramanujan graphs, whose spectral properties imply that every small subset of vertices $S$ has at least $0.5d|S|$ distinct neighbors. However, it is possible to construct Ramanujan graphs containing a small set $S$ with no more than $0.5d|S|$ neighbors. In fact, no explicit construction was known to break the $0.5d$-barrier. In this work, we give an explicit construction of an infinite family of $d$-regular graphs (for large enough $d$) where every small set expands by a factor of $\approx 0.6d$. More generally, for large enough $d_1,d_2$, we give an infinite family of $(d_1,d_2)$-biregular graphs where small sets on the left expand by a factor of $\approx 0.6d_1$, and small sets on the right expand by a factor of $\approx 0.6d_2$. In fact, our construction satisfies an even stronger property: small sets on the left and right have unique-neighbor expansion $0.6d_1$ and $0.6d_2$ respectively. Our construction follows the tripartite line product framework of Hsieh, McKenzie, Mohanty & Paredes, and instantiates it using the face-vertex incidence of the $4$-dimensional Ramanujan clique complex as its base component. As a key part of our analysis, we derive new bounds on the triangle density of small sets in the Ramanujan clique complex. 28 pages

    Computing $1/m_Q$ and $1/m_Q^2$ corrections to the static potential with lattice gauge theory using gradient flow

    We present selected preliminary lattice gauge theory results for $O(1/m_Q)$ and $O(1/m_Q^2)$ corrections to the static potential. These results are based on Wilson loops with two field strength insertions, which we renormalize using gradient flow. We explore tree level improvement to reduce systematic errors in the Wilson loops due to the finite lattice spacing and flow time, in particular at small temporal and spatial separations. 10 pages, 5 figures; talk given at the 41st International Symposium on Lattice Field Theory (LATTICE2024), 28 July - 3 August 2024, Liverpool, UK

    375,182 full texts
    623,509 metadata records
    Updated in last 30 days.
    arXiv.org e-Print Archive is based in United States