    Modelling sagittal and vertical phase differences in a lumped and distributed elements vocal fold model

    The quality and timbre of disordered voices rely heavily on the vibration properties of the vocal folds. We discuss the representation of sagittal phase differences in vocal fold oscillations through a numerical biomechanical model involving lumped elements as well as distributed elements, i.e., delay lines. A dynamic glottal source model is proposed in which the fold displacement along the vertical and the sagittal dimensions is modelled using delay lines. In contrast to other models, in which reproducing sagittal phase differences is impossible (e.g., two-mass models) or not easy to control (e.g., 3D 16-mass and multi-mass models in general), the one proposed here provides direct control over the amount of phase delay between the folds’ oscillations at the posterior and anterior parts of the glottis, i.e., along the sagittal axis, and at the superior and inferior parts of the glottis, i.e., along the vertical axis, while keeping the dynamic model simple and computationally efficient. The model is assessed by addressing the reproduction of oscillatory patterns observed in high-speed videoendoscopic data, in which sagittal phase differences are observed. Also, timing asymmetry parameters observed in hemi-glottal area waveforms (GAWs) are used for fitting

    Non-modal voice synthesis by low-dimensional physical models

    The synthesis of different voice qualities by means of a low-dimensional glottal model is discussed. The glottal model is based on a one-mass model provided with a number of enhancements that make it suitable for the aim of the study. The simulation of modal and non-modal phonatory regimes is discussed. Both symmetric and nonsymmetric configurations are explored. The class of models under consideration is shown to be able to reproduce a broad range of phonation styles and to provide interesting control properties.

    Modelling longitudinal phase differences in a lumped and distributed elements vocal fold model

    We discuss the representation of anterior-posterior (A-P) phase differences in vocal cord oscillations through a numerical biomechanical model involving lumped elements as well as distributed elements, i.e., delay lines. A dynamic glottal source model is illustrated in which the fold displacement along the vertical and the longitudinal dimensions is modeled using numerical waveguide components. In contrast to other models, in which the reproduction of longitudinal phase differences is impossible (e.g., two-mass models) or not easy to control (e.g., 3D 16-mass and multi-mass models in general), the one proposed here provides direct control over the amount of phase delay between the folds' oscillations at the posterior and anterior parts of the glottis, while keeping the dynamic model simple and computationally efficient. The model is assessed by addressing the reproduction of oscillatory patterns observed in high-speed videoendoscopic data, in which A-P phase differences are observed, and of parameters related to the glottal area waveform.
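
    The delay-line idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' model: a plain sinusoid stands in for the displacement of the lumped mass, and the names `delay_line`, `n_sections`, and `max_delay`, as well as all numeric values, are assumptions chosen for illustration. Each section along the anterior-posterior axis receives a delayed copy of the same displacement, so the phase lag at each section is set directly by the delay length.

```python
import numpy as np

def delay_line(signal, delay_samples):
    """Return `signal` delayed by an integer number of samples (zero-padded)."""
    if delay_samples == 0:
        return signal.copy()
    out = np.zeros_like(signal)
    out[delay_samples:] = signal[:-delay_samples]
    return out

fs = 44100.0   # sampling rate in Hz (assumed)
f0 = 120.0     # oscillation frequency in Hz (assumed)
t = np.arange(2048) / fs

# Stand-in for the displacement of the lumped mass at the posterior end.
x_post = np.sin(2 * np.pi * f0 * t)

# Hypothetical discretization: 5 sections from posterior to anterior,
# with a total delay of 40 samples between the two ends.
n_sections = 5
max_delay = 40
delays = np.round(np.linspace(0, max_delay, n_sections)).astype(int)

# Row i holds the displacement of section i along the A-P axis.
displacement = np.stack([delay_line(x_post, d) for d in delays])

# Phase lag of each section relative to the posterior end, in degrees.
phase_lag_deg = 360.0 * f0 * delays / fs
```

    Changing `max_delay` is the only control needed to widen or narrow the phase difference between the two ends, which is the kind of direct control the abstract describes.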

    Improved fold closure in mass-spring low-dimensional glottal models

    This work presents a low-dimensional physical model of the glottis in which a 2-D representation of fold displacement captures both the vertical and longitudinal displacements of the folds. We use a one-mass mechanical model, coupled to aerodynamic driving forces, and we use a delay line representation to account for the propagation of the displacement on the body-cover. The waveform is characterized by means of a set of acoustic parameters (open quotient, speed quotient, return quotient, fundamental frequency F0, etc.) that are used in the literature as typical voice source quantification parameters. The paper provides comparisons between values of these parameters computed for the proposed model and for analytical models (LF) of the flow.

    Fitting a biomechanical model of the folds to high-speed video data through bayesian estimation

    High-speed video recording of the vocal folds during sustained phonation has become a widespread diagnostic tool, and the development of imaging techniques able to perform automated tracking and analysis of relevant glottal cues, such as folds edge position or glottal area, is an active research field. In this paper, a vocal folds vibration analysis method based on the processing of visual data through a biomechanical model of the laryngeal dynamics is proposed. The procedure relies on a Bayesian non-stationary estimation of the biomechanical model parameters and state, to fit the folds edge position extracted from the high-speed video endoscopic data. This finely tuned dynamical model is then used as a state transition model in a Bayesian setting, and it yields a physiologically motivated estimation of upper and lower vocal folds edge position. Based on model prediction, a hypothesis on the lower fold position can be made even in complete fold occlusion conditions occurring during the end of the closed phase and the beginning of the open phase of the glottal cycle. To demonstrate the suitability of the procedure, the method is assessed on a set of audiovisual recordings featuring high-speed video endoscopic data from healthy subjects producing sustained voiced phonation with different laryngeal settings.

    Spherical Harmonic Diagonal Unloading Beamforming with Ego-Noise Reduction for DOA Estimation from Autonomous Systems

    A method to improve the localization of a sound source using a spherical microphone array embedded into autonomous systems is presented. The method is based on a low-complexity diagonal unloading (DU) beamforming in the spherical harmonic (SH) domain using a frequency smoothing power transform (FSPT) of the covariance matrices with a novel ego-noise reduction. The attenuation of the ego-noise in the signal-plus-ego-noise broadband FSPT covariance matrix is achieved by estimating the FSPT ego-noise covariance matrix and exploiting the subspace orthogonality property using a diagonal unloading procedure. Experiments with controlled real-world recordings performed by an aerial drone equipped with a 19-microphone spherical array while sensing a flying target drone demonstrate the efficiency of the proposed method.

    Diagonal Unloading Beamforming in the Spherical Harmonic Domain for Acoustic Source Localization in Reverberant Environments

    Spherical microphone arrays allow sound field analysis in three dimensions with the advantage of having the same resolution in all directions. By considering the frequency-independent character of the steering vectors in the spherical harmonic (SH) domain, we propose a very low-complexity SH diagonal unloading (DU) beamforming with a novel frequency smoothing power transform (FSPT) of the covariance matrices. We consider the direction of arrival (DOA) estimation problem of acoustic sources in reverberant conditions. The DU beamforming provides a high-resolution directional response since it exploits the subspace orthogonality property of the covariance matrix by the removal or the attenuation of the signal subspaces, obtained through the subtraction of a suitable diagonal matrix from the covariance matrix. The FSPT aims at smoothing the narrowband covariance matrices of the entire set of frequency domain components, and it pursues this goal by minimizing the narrowband error contributions due to reverberation in the broadband frequency smoothing covariance matrix. We analyze the DOA estimation performance using speech signals with simulations and real acoustic data in reverberant conditions. The results show that the proposed SH-DU-FSPT has a DOA estimation performance comparable to that of high-resolution state-of-the-art methods with a significant reduction of the computational cost, since the steering directional responses are computed on the broadband frequency smoothing covariance matrix.
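
    The diagonal unloading step described above can be sketched in a much simpler setting. The code below is an illustration under stated assumptions, not the SH-DU-FSPT method itself: it uses a narrowband uniform linear array rather than a spherical array in the SH domain, it omits the FSPT entirely, and it assumes the subtracted diagonal matrix is the trace of the covariance matrix times the identity (the abstract does not specify which matrix is used). The helper names `steering` and `du_pseudo_spectrum` and all numeric values are hypothetical.

```python
import numpy as np

def steering(theta_deg, n_mics, spacing_wavelengths=0.5):
    """Steering vector of a uniform linear array (assumed geometry)."""
    theta = np.deg2rad(theta_deg)
    k = np.arange(n_mics)
    return np.exp(-2j * np.pi * spacing_wavelengths * k * np.sin(theta))

def du_pseudo_spectrum(R, grid_deg):
    """Pseudo-spectrum from the covariance matrix after diagonal unloading.

    Assumption: the unloading matrix is trace(R) * I. The sign-flipped
    matrix trace(R) * I - R is positive semidefinite, and its quadratic
    form is smallest along the dominant (signal) subspace.
    """
    n = R.shape[0]
    R_du = np.trace(R).real * np.eye(n) - R
    spectrum = []
    for th in grid_deg:
        a = steering(th, n)
        spectrum.append(1.0 / np.real(a.conj() @ R_du @ a))
    return np.array(spectrum)

# Synthetic single-source scene (all values are illustrative).
rng = np.random.default_rng(0)
n_mics, n_snap, true_doa = 8, 400, 20.0
s = rng.standard_normal(n_snap) + 1j * rng.standard_normal(n_snap)
noise = 0.1 * (rng.standard_normal((n_mics, n_snap))
               + 1j * rng.standard_normal((n_mics, n_snap)))
X = np.outer(steering(true_doa, n_mics), s) + noise

# Sample covariance matrix and DOA estimate on a 1-degree grid.
R = X @ X.conj().T / n_snap
grid = np.arange(-90.0, 90.5, 1.0)
spectrum = du_pseudo_spectrum(R, grid)
est_doa = grid[np.argmax(spectrum)]
```

    No eigendecomposition or matrix inversion is needed, which is consistent with the low computational cost the abstract emphasizes.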