Approximately a decade after the Kelly-Lochbaum voice model was
developed, Linear Predictive Coding (LPC) of speech began
[20,298,299]. The linear-prediction voice model
is best classified as a parametric, spectral, source-filter model, in
which the short-time spectrum is decomposed into a flat
excitation spectrum multiplied by a smooth spectral
envelope capturing primarily vocal formants (resonances).
LPC has been used quite often as a spectral transformation technique
in computer music, as well as for general-purpose audio spectral
envelopes [384],
and it remains much used for low-bit-rate speech coding in the variant
known as Codebook Excited Linear Prediction (CELP)
[340].

When applying LPC to audio at high sampling rates, it is important to
carry out some kind of auditory
frequency warping, such as according to mel, Bark, or ERB
frequency scales [183,461,485].
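
As a rough illustration of the source-filter decomposition described above (flat excitation times smooth all-pole envelope), the following sketch estimates the envelope 1/A(z) by the autocorrelation method with the Levinson-Durbin recursion. It is a minimal pure-Python sketch under stated assumptions, not the method of any cited reference: the function names are hypothetical, and the test signal is simply the impulse response of a two-pole resonator, so the excitation is exactly flat. No frequency warping is applied here.

```python
def autocorrelation(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of the frame x."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (a, k, err), where a = [1, a1, ..., ap] defines the
    prediction-error filter A(z) = 1 + a1 z^-1 + ... + ap z^-p,
    k holds the reflection coefficients, and err is the final
    prediction-error power.
    """
    a = [1.0] + [0.0] * order
    k = []
    err = r[0]
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        km = -acc / err
        k.append(km)
        prev = a[:]
        for i in range(1, m):
            a[i] = prev[i] + km * prev[m - i]
        a[m] = km
        err *= 1.0 - km * km
    return a, k, err

def all_pole_impulse_response(a, n):
    """First n samples of the impulse response of 1/A(z)."""
    h = []
    for t in range(n):
        y = 1.0 if t == 0 else 0.0
        for i in range(1, len(a)):
            if t - i >= 0:
                y -= a[i] * h[t - i]
        h.append(y)
    return h

# Assumed test case: a single stable resonance (pole radius about 0.894).
true_a = [1.0, -1.3, 0.8]
frame = all_pole_impulse_response(true_a, 400)
est_a, refl, err = levinson_durbin(autocorrelation(frame, 2), 2)
```

Because the excitation in this test is a single impulse, the estimated coefficients in `est_a` closely match `true_a`, and `err` is close to the excitation energy of 1. For real speech, the glottal source is not flat, so the estimated envelope also absorbs the source spectrum.
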
Interestingly, it was recognized from the beginning that the all-pole
LPC vocal-tract model could be interpreted as a modified
piecewise-cylindrical acoustic-tube model [20,299],
and this interpretation was most explicit when the vocal-tract filters
(computed by LPC in direct form) were realized as ladder filters
[299]. The physical interpretation is not really valid, however,
unless the vocal-tract filter parameters are estimated jointly with a
realistic glottal pulse shape.
LPC demands that the vocal
tract be driven by a flat spectrum (either an impulse or
low-pitched impulse train, or white noise), which is not physically
accurate. When the glottal pulse shape (and lip radiation
characteristic) are "factored out", it becomes possible to convert
LPC coefficients into vocal-tract shape parameters (area ratios).
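
The conversion from direct-form LPC coefficients to area ratios can be sketched as follows. This is an illustrative sketch only: the step-down (backward Levinson) recursion is standard, but the function names are hypothetical, and the area-ratio formula (1 - k)/(1 + k) is one assumed convention. The sign of the reflection coefficients and the direction in which tube sections are numbered differ between texts, so the ratios may need to be inverted to match a given reference.

```python
def step_down(a):
    """Convert A(z) = [1, a1, ..., ap] to reflection coefficients [k1, ..., kp]."""
    cur = list(a)
    p = len(cur) - 1
    k = [0.0] * p
    for m in range(p, 0, -1):
        km = cur[m]
        if abs(km) >= 1.0:
            # A reflection coefficient of magnitude >= 1 means 1/A(z) is not stable.
            raise ValueError("1/A(z) is not stable")
        k[m - 1] = km
        # Recover the order-(m-1) polynomial from the order-m polynomial.
        cur = [1.0] + [(cur[i] - km * cur[m - i]) / (1.0 - km * km)
                       for i in range(1, m)]
    return k

def area_ratios(k):
    """Adjacent-section area ratios under the assumed convention (1 - k)/(1 + k)."""
    return [(1.0 - km) / (1.0 + km) for km in k]

# Assumed example: the two-pole filter A(z) = 1 - 1.3 z^-1 + 0.8 z^-2.
k = step_down([1.0, -1.3, 0.8])
ratios = area_ratios(k)
```

As the surrounding text notes, such ratios are only physically meaningful once the glottal pulse and lip radiation have been removed from the signal before the LPC analysis.
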
Approximate results can be obtained by assuming a simple roll-off
characteristic for the glottal pulse spectrum (e.g., -12 dB/octave) and
lip-radiation frequency response (nominally +6 dB/octave), which
together give a net tilt of about -6 dB/octave, and
compensating with a simple preemphasis characteristic (e.g.,
+6 dB/octave) [299].
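
A simple preemphasis characteristic of this kind is often realized as a first-order difference. The sketch below is an assumption for illustration, not necessarily the filter used in [299]: the function names and the coefficient `mu` are hypothetical, and `mu = 1.0` gives a slope close to +6 dB/octave only well below half the sampling rate.

```python
import cmath
import math

def preemphasize(x, mu=1.0):
    """First-order difference y[n] = x[n] - mu * x[n-1], with x[-1] taken as 0."""
    return [x[n] - (mu * x[n - 1] if n > 0 else 0.0) for n in range(len(x))]

def preemphasis_gain_db(freq_hz, fs_hz, mu=1.0):
    """Magnitude response of 1 - mu * z^-1 in dB at the given frequency."""
    w = 2.0 * math.pi * freq_hz / fs_hz
    return 20.0 * math.log10(abs(1.0 - mu * cmath.exp(-1j * w)))

# Gain change over one octave (100 Hz to 200 Hz) at an assumed 16 kHz sampling rate.
slope_db = preemphasis_gain_db(200.0, 16000.0) - preemphasis_gain_db(100.0, 16000.0)
```

Here `slope_db` comes out close to 6 dB. Values of `mu` slightly below 1 are common in practice; they flatten the response at very low frequencies, so the slope is then no longer +6 dB/octave all the way down to dc.
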
More accurate glottal pulse estimation in terms of parameters of the
derivative-glottal-wave models by Liljencrants, Fant, and Klatt
[134,259] (still assuming +6 dB/octave for lip radiation)
was carried out in the thesis research of Vicky Lu
[292], and further extension of that work appears in
[252,214,253].