A flow waveform-matched low-dimensional glottal model based on physical knowledge
The purpose of this study is to explore whether physically based mathematical models of the voice source can accurately reproduce inverse-filtered glottal volume-velocity waveforms. A low-dimensional, self-oscillating model of the glottal source with waveform-matching properties is proposed. The model relies on a lumped mechano-aerodynamic scheme loosely inspired by the one- and multimass lumped models. The vocal folds are represented by a single mechanical resonator and a propagation line which takes into account the vertical phase differences. The vocal-fold displacement is coupled to the glottal flow by means of an aerodynamic driving block which includes a general parametric nonlinear component. The principal characteristics of the flow-induced oscillations are retained, and the overall model is able to match inverse-filtered glottal flow signals. The method offers, in principle, the possibility of transforming the glottal flow by acting on the physiologically based parameters of the model. This is a desirable property, e.g., for speech synthesis applications. The model was tested on a data set which included inverse-filtered glottal flow waveforms of different characteristics. The results demonstrate the possibility of reproducing natural speech waveforms with high accuracy, and of controlling important characteristics of the synthesis such as pitch.
Synthesis of voiced sounds by means of waveform adaptive physical models
The reproduction of voiced sounds by physical modeling is addressed. A major focus is put on the possibility of fitting a physically constrained model to real voice samples. A source-filter scheme is adopted in which the vocal tract is represented by an all-pole filter and the voice source model relies on a lumped mechano-aerodynamic scheme inspired by the mass-spring paradigm. The vocal folds are represented by a mechanical resonator plus a delay line which takes into account the vertical phase differences. The vocal fold displacement is coupled to the glottal flow by means of a general parametric nonlinear model. An adaptive data-driven identification procedure is outlined, where the parameters of the model are tuned in order to accurately match the target speech waveform. The simultaneous optimization of the source and the vocal tract parameters is discussed. A recursive algorithm based on the Kalman filtering approach is proposed and evaluated. The performance on time-varying voiced signals is discussed.
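The recursive, Kalman-style tuning mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a scalar, linear-in-the-parameter observation model y = h * theta + noise with a random-walk parameter; the function name `kalman_track` and the noise settings `q` and `r` are hypothetical and are not taken from the paper, whose actual algorithm estimates source and vocal tract parameters jointly.

```python
import math


def kalman_track(observations, regressors, q=1e-6, r=1e-2, theta0=0.0, p0=1.0):
    """Track one slowly varying parameter theta from y_t = h_t * theta + noise.

    q: assumed random-walk (process) variance of theta
    r: assumed measurement noise variance
    Returns the list of estimates, one per observation.
    """
    theta, p = theta0, p0
    estimates = []
    for y, h in zip(observations, regressors):
        p = p + q                               # predict: uncertainty grows
        gain = p * h / (h * p * h + r)          # Kalman gain for the scalar case
        theta = theta + gain * (y - h * theta)  # correct with the innovation
        p = (1.0 - gain * h) * p                # posterior variance
        estimates.append(theta)
    return estimates
```

A larger `q` lets the estimate follow time-varying signals more quickly at the cost of noisier estimates; this trade-off is a general property of such filters, not a result reported in the abstract.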
A flow waveform adaptive mechanical glottal model
A waveform-adaptive physical model of the glottal source is proposed. The model relies on a lumped mechano-aerodynamic scheme loosely inspired by the one- and two-mass lumped models. The vocal folds are represented by a single mechanical resonator and a propagation line which takes into account the vertical phase differences. The vocal-fold displacement is coupled to the glottal flow by means of an aerodynamic driving block which includes a general parametric nonlinear component. The principal characteristics of the flow-induced oscillations are retained, and the overall model is able to adapt to glottal flow signals with different characteristics.
Physically oriented glottis models with inverse filtered waveform matching properties
A low-dimensional physically oriented model of the glottal source is discussed. The model relies on a lumped mechano-aerodynamic scheme based on the mass-spring paradigm. The vocal folds are represented by a mechanical resonator plus a delay line which takes into account the vertical phase differences. First, a simple flow model based on Bernoulli’s law is assumed, and the properties of the system are discussed. The class of models under consideration is shown to be able to reproduce a broad range of phonation styles, and to provide interesting control properties. Second, an extended flow model is introduced with the aim of reproducing realistic glottal source waveforms obtained by inverse filtering. The new flow model is based on a general parametric nonlinear model. For this new scheme, the principal characteristics of the flow-induced oscillations are retained, and the overall model is suited to an identification approach in which real inverse-filtered glottal flow signals are to be reproduced. A data-driven identification procedure is outlined, where the parameters of the model are tuned in order to accurately match the target waveform. A set of inverse-filtered glottal flow waveforms with different characteristics is used to test the effectiveness of the approach. The results demonstrate that the model can reproduce a wide range of target waveforms.
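The first scheme described above (a single resonator, a delay line for the vertical phase difference, and a Bernoulli-type flow) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' model: the driving pressure ps * (1 - a2 / a1) is a simplified textbook-style expression, and every parameter value, the clamping rule, and the names `bernoulli_flow` and `simulate` are hypothetical.

```python
import math
from collections import deque

RHO = 1.2  # air density in kg/m^3 (assumed value)


def bernoulli_flow(area, ps, rho=RHO):
    """Volume velocity (m^3/s) through an orifice of the given area (m^2)
    under pressure drop ps (Pa), from Bernoulli's law."""
    if area <= 0.0 or ps <= 0.0:
        return 0.0
    return area * math.sqrt(2.0 * ps / rho)


def simulate(n_steps=4410, fs=44100.0, ps=800.0, m=1e-4, k=40.0, r=0.01,
             x0=2e-4, length=0.014, thickness=0.003, delay_s=5e-4):
    """Return glottal flow samples from a one-resonator model with a delay line."""
    dt = 1.0 / fs
    n_delay = max(1, int(round(delay_s * fs)))
    # Delay line: history[0] is the displacement n_delay samples ago.
    history = deque([0.0] * n_delay, maxlen=n_delay)
    x, v = 0.0, 0.0
    flow = []
    for _ in range(n_steps):
        a1 = 2.0 * length * (x0 + x)           # lower-edge glottal area
        a2 = 2.0 * length * (x0 + history[0])  # upper edge lags the lower edge
        if a1 <= 0.0 or a2 <= 0.0:
            pm = ps  # closed glottis: assume full subglottal pressure on the fold
        else:
            # Assumed driving pressure, clamped to keep the force bounded.
            pm = ps * max(-1.0, min(1.0, 1.0 - a2 / a1))
        force = pm * length * thickness
        # Semi-implicit Euler step of m*x'' + r*x' + k*x = force.
        v += dt * (force - r * v - k * x) / m
        x += dt * v
        history.append(x)
        flow.append(bernoulli_flow(min(a1, a2), ps))
    return flow
```

Whether the sketch sustains oscillation depends on the assumed parameters; it shows only how displacement, delayed displacement, and flow are coupled.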
Radial Basis Function Networks for Conversion of Sound Spectra
In many advanced signal processing tasks, such as pitch shifting, voice conversion or sound synthesis, accurate spectral processing is required. Here, the use of Radial Basis Function Networks (RBFN) is proposed for the modeling of the spectral changes (or conversions) related to the control of important sound parameters, such as pitch or intensity. The identification of such conversion functions is based on a procedure which learns the shape of the conversion from a few pairs of target spectra from a data set. The generalization properties of RBFNs provide for interpolation with respect to the pitch range. In the construction of the training set, mel-cepstral encoding of the spectrum is used to capture the perceptually most relevant spectral changes. Moreover, a singular value decomposition (SVD) approach is used to reduce the dimension of conversion functions. The RBFN conversion functions introduced are characterized by a perceptually based fast training procedure, desirable interpolation properties and computational efficiency.
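As a rough illustration of how an RBF network can interpolate a conversion function from a few training pairs, the sketch below fits Gaussian basis functions centred on the training pitches and solves for the output weights exactly. It is a scalar simplification under stated assumptions: the mel-cepstral encoding and SVD reduction from the abstract are omitted, and the names `fit_rbfn`, `predict`, and `solve`, as well as the `width` parameter, are hypothetical.

```python
import math


def gaussian(r, width):
    """Gaussian radial basis function of distance r."""
    return math.exp(-(r / width) ** 2)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(col + 1, n):
            f = aug[i][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[i][j] -= f * aug[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_rbfn(centers, targets, width):
    """One basis function per training point; weights reproduce the targets exactly."""
    phi = [[gaussian(abs(ci - cj), width) for cj in centers] for ci in centers]
    return solve(phi, targets)


def predict(x, centers, weights, width):
    """Evaluate the fitted network at input x (e.g. a pitch in Hz)."""
    return sum(w * gaussian(abs(x - c), width) for c, w in zip(centers, weights))
```

For example, with training pitches `[100.0, 150.0, 200.0, 250.0]` and one made-up target coefficient per pitch, `predict(175.0, ...)` returns a smoothly interpolated value between the neighbouring targets.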
Sound Morphing With Gaussian Mixture Models
In this work, a sound transformation model based on Gaussian Mixture Models (GMM) is introduced and evaluated for audio morphing. To this aim, the GMM is used to build the acoustic model of the source sound, and a set of conversion functions, which rely on the acoustic model, is used to transform the source sound. The method is tested on a set of monophonic sounds, and the results show that it provides promising features.
Transformation of instrumental sound related noise by means of adaptive filtering techniques
Hybrid parametric-physiological glottal modelling with application to voice quality assessment
A glottal model based on physical constraints is proposed. The model describes the vocal fold as a simple oscillator, i.e., a damped mass-spring system. The oscillator is coupled with a nonlinear block, accounting for fold interaction with the airflow. The nonlinear block is modelled as a regressor-based functional with weights to be identified, and a pitch-synchronous identification procedure is outlined. The model is used to analyse voiced sounds from normal and from pathological voices, and the application of the proposed analysis procedure to voice quality assessment is discussed.
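The idea of a regressor-based functional with weights to be identified can be illustrated with an ordinary least-squares fit. The sketch below assumes only two regressors and solves the 2x2 normal equations in closed form; the choice of regressors, the function name `fit_two_regressors`, and the omission of the pitch-synchronous framing are simplifications, not the paper's procedure.

```python
def fit_two_regressors(xs, ys, phi1, phi2):
    """Least-squares weights (w1, w2) for the model y ~ w1*phi1(x) + w2*phi2(x)."""
    s11 = sum(phi1(x) ** 2 for x in xs)
    s12 = sum(phi1(x) * phi2(x) for x in xs)
    s22 = sum(phi2(x) ** 2 for x in xs)
    b1 = sum(phi1(x) * y for x, y in zip(xs, ys))
    b2 = sum(phi2(x) * y for x, y in zip(xs, ys))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-15:
        raise ValueError("regressors are (nearly) collinear on this data")
    # Cramer's rule on the normal equations.
    w1 = (b1 * s22 - b2 * s12) / det
    w2 = (s11 * b2 - s12 * b1) / det
    return w1, w2
```

In a pitch-synchronous setting, one could imagine running such a fit once per glottal cycle so that the weights track cycle-to-cycle changes; that usage is an assumption here.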
- …
