In 1962, John L. Kelly and Carol C. Lochbaum published a software
version of a digitized vocal-tract analog model
[247,248]. This may be the first
instance of a sampled traveling-wave model of the vocal tract,
as opposed to a lumped-parameter transmission-line model. In other
words, Kelly and Lochbaum apparently returned to the original acoustic
tube model (a sequence of cylinders), obtained d'Alembert's
traveling-wave solution in each section,
and applied Nyquist's
sampling theorem to digitize the system. This sampled, bandlimited
approach to digitization contrasts with the use of bilinear
transforms as in wave digital filters. Its advantage is that the
frequency axis is not warped; its drawback is that it is prone to
aliasing when the parameters vary over time (or when nonlinearities
are present).
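The traveling-wave decomposition and its sampling can be written out as a short worked equation. This is a sketch in assumed notation, not taken from the original paper: $p^{+}$ and $p^{-}$ denote the right- and left-going pressure waves in one cylindrical section, $c$ the speed of sound, $T$ the sampling interval, and $X = cT$ the corresponding spatial sampling interval.

```latex
% d'Alembert: pressure in one cylindrical section as a sum of two traveling waves
p(t,x) = p^{+}\!\left(t - \frac{x}{c}\right) + p^{-}\!\left(t + \frac{x}{c}\right)

% Sampling in time (t = nT) and space (x = mX, with X = cT)
p(nT, mX) = p^{+}\big((n-m)T\big) + p^{-}\big((n+m)T\big)
```

Under these assumptions, propagation over one spatial sample costs exactly one sample of delay in each direction, which is why each tube section can be simulated by a pair of delay elements.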
At the junction of two cylindrical tube sections, i.e., at area
discontinuities, lossless scattering occurs. As mentioned in §A.5.4, reflection/transmission at impedance
discontinuities was well formulated in classical network theory
[34,35], and in transmission-line theory.
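As an illustration of lossless scattering at an area discontinuity, the following minimal sketch implements one junction for pressure waves. It assumes the wave impedance of each section is inversely proportional to its cross-sectional area, so that signal power is proportional to area times pressure squared. The function names (`reflection_coefficient`, `scatter`, `power`) are hypothetical and chosen only for this example.

```python
def reflection_coefficient(area_left, area_right):
    """Reflection coefficient seen by a right-going pressure wave.

    Assumes wave impedance R = rho*c/area in each section, so
    k = (R_right - R_left) / (R_right + R_left)
      = (area_left - area_right) / (area_left + area_right).
    """
    return (area_left - area_right) / (area_left + area_right)


def scatter(p_in_left, p_in_right, k):
    """One scattering junction for pressure waves (illustrative sketch).

    p_in_left  : right-going wave arriving from the left section
    p_in_right : left-going wave arriving from the right section
    Returns (p_out_right, p_out_left): the waves leaving the junction
    into the right section and back into the left section.
    """
    p_out_right = (1.0 + k) * p_in_left - k * p_in_right
    p_out_left = k * p_in_left + (1.0 - k) * p_in_right
    return p_out_right, p_out_left


def power(p, area):
    """Quantity proportional to signal power p**2 / R, with R ~ 1/area."""
    return area * p * p


# Example: a junction from area 2.0 into area 1.0.
a_left, a_right = 2.0, 1.0
k = reflection_coefficient(a_left, a_right)
out_right, out_left = scatter(1.0, 0.5, k)
power_in = power(1.0, a_left) + power(0.5, a_right)
power_out = power(out_left, a_left) + power(out_right, a_right)
```

In this example `power_in` and `power_out` agree to within rounding, which is the sense in which the scattering is lossless. Equal areas give `k = 0`, so both waves pass through unchanged.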
The Kelly-Lochbaum model can be regarded as a kind of ladder
filter [299] or, more precisely, using later terminology, a
digital waveguide filter [437]. Ladder and lattice
digital filters can be used to realize arbitrary transfer functions
[299], and they enjoy low sensitivity to round-off error,
guaranteed stability under coefficient interpolation, and freedom from
overflow oscillations and limit cycles under general conditions.
Ladder/lattice filters remain important options when designing
fixed-point implementations of digital filters, e.g., in VLSI. In the
context of wave digital filters, the Kelly-Lochbaum model may be
viewed as a digitized unit element filter [137],
reminiscent of waveguide filters used in microwave engineering. In
more recent terminology, it may be called a digital waveguide
model of the vocal tract in which the digital waveguides are
degenerated to single-sample
width [437,445].
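The structure described above, a chain of tube sections joined by scattering junctions, can be sketched as follows. This is a simplified illustration, not Kelly and Lochbaum's program: it assumes one sample of delay per direction per section, pressure-wave scattering with impedance inversely proportional to area, and hypothetical fixed reflection coefficients `r_glottis` and `r_lips` at the two ends. The function name `kl_tube_response` is likewise invented for this example.

```python
def kl_tube_response(areas, num_samples, r_glottis=0.9, r_lips=-0.9):
    """Impulse response of a piecewise-cylindrical tube (illustrative sketch).

    areas       : cross-sectional areas of the sections, glottis to lips
    r_glottis   : assumed reflection coefficient at the glottis end
    r_lips      : assumed reflection coefficient at the lip end
    Each section holds one sample of delay in each direction.
    """
    n = len(areas)
    # Reflection coefficient at each interior junction (pressure waves).
    k = [(areas[i] - areas[i + 1]) / (areas[i] + areas[i + 1])
         for i in range(n - 1)]
    right = [0.0] * n  # right-going wave arriving at right end of section i
    left = [0.0] * n   # left-going wave arriving at left end of section i
    out = []
    for t in range(num_samples):
        x = 1.0 if t == 0 else 0.0  # unit impulse excitation
        new_right = [0.0] * n
        new_left = [0.0] * n
        # Glottis end: inject the input and reflect the returning wave.
        new_right[0] = x + r_glottis * left[0]
        # Interior junctions: scatter between sections i and i+1.
        for i in range(n - 1):
            new_right[i + 1] = (1.0 + k[i]) * right[i] - k[i] * left[i + 1]
            new_left[i] = k[i] * right[i] + (1.0 - k[i]) * left[i + 1]
        # Lip end: partial reflection; the remainder is taken as the output.
        new_left[n - 1] = r_lips * right[n - 1]
        out.append((1.0 + r_lips) * right[n - 1])
        # Advance every one-sample delay element.
        right, left = new_right, new_left
    return out


# Example: a uniform three-section tube and a two-section tube with a step.
uniform = kl_tube_response([1.0, 1.0, 1.0], 10)
stepped = kl_tube_response([2.0, 1.0], 4)
```

For the uniform tube, the impulse first reaches the lips after as many samples as there are sections, and echoes recur every round trip with alternating sign because the two end reflections have opposite signs in this setup. Time-varying `areas` would require recomputing `k` inside the loop, which is where the aliasing concern mentioned earlier becomes relevant.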
In 1961, Kelly and Lochbaum collaborated with Max Mathews to create
what was most likely the first digital physical-modeling synthesis
example by any method. The voice
was computed on an IBM 704 computer using speech-vowel data from
Gunnar Fant's recent book [133]. Interestingly, Fant's
vocal-tract shape data were obtained (via x-rays) for Russian vowels,
not English, but they were close enough to be understandable. Arthur
C. Clarke, visiting John Pierce at Bell Labs, heard this demo, and he
later used it in "2001: A Space Odyssey": the HAL 9000 computer
slowly sang its "first song" ("Bicycle Built for Two") as its
disassembly by astronaut Dave Bowman neared completion.
Perhaps due in part to J. L. Kelly's untimely death afterward,
research on vocal-tract analog models tapered off, although
there was some additional work [309]. Perhaps the main
reason for the demise of this research thread was that spectral models
(both nonparametric models such as the vocoder, and parametric
source-filter models such as linear predictive coding (discussed
below)) proved to be more effective when the application was simply
speech coding at acceptably low bit rates and high fidelity levels.
In telephone speech-coding applications, there was no requirement that
a physical voice model be retained for purposes of expressive musical
performance. In fact, it was desired to automate and minimize the
"performance expertise" required to operate the voice production
model. One could go so far as to say that the musical expressivity of
voice synthesis models reached its peak in the 1939 Voder and
related (manual) systems (§A.6.1).
In computer music, the Kelly-Lochbaum vocal tract model was revived
for singing-voice synthesis in the thesis research of Perry Cook
[87]. In addition to the basic vocal tract model with a side
branch for the nasal tract, Cook included neck radiation (e.g., for
'b') and damping extensions. Additional work on incorporating
damping within the tube sections was carried out by Amir et
al. [15]. Other extensions include sparse
acoustic tube modeling [149] and extension to
piecewise conical acoustic tubes [511]. The
digital waveguide modeling framework [434,435,437]
can be viewed as an adaptation of extremely sparse acoustic-tube models
for artificial reverberation, vibrating strings, and wind-instrument
bores.