Singing Kelly-Lochbaum Vocal Tract

In 1962, John L. Kelly and Carol C. Lochbaum published a software version of a digitized vocal-tract analog model [247,248]. This may be the first instance of a sampled traveling-wave model of the vocal tract, as opposed to a lumped-parameter transmission-line model. In other words, Kelly and Lochbaum apparently returned to the original acoustic tube model (a sequence of cylinders), obtained d'Alembert's traveling-wave solution in each section, and applied Nyquist's sampling theorem to digitize the system. This sampled, bandlimited approach to digitization contrasts with the use of bilinear transforms as in wave digital filters. Its advantage is that the frequency axis is not warped; its drawback is that it is prone to aliasing when the parameters vary over time (or if nonlinearities are present).
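For reference, the traveling-wave decomposition in one cylindrical section and its sampled form can be written as below. The notation is assumed for this sketch and does not appear in the text above: $p$ is acoustic pressure, $p^{+}$ and $p^{-}$ are the right- and left-going components, $c$ is the speed of sound, $T$ is the sampling interval, and $X = cT$ is the corresponding spatial sampling interval.

```latex
% d'Alembert's traveling-wave solution in one cylindrical section
p(t,x) = p^{+}\!\left(t - \frac{x}{c}\right) + p^{-}\!\left(t + \frac{x}{c}\right)

% Sampling in time at t = nT and in space at x = mX, with X = cT
p(nT, mX) = p^{+}\big((n-m)T\big) + p^{-}\big((n+m)T\big)
```

With this choice of $X$, propagation over one spatial sample corresponds to a delay of exactly one time sample, so each section reduces to a pair of delay lines.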

At the junction of two cylindrical tube sections, i.e., at area discontinuities, lossless scattering occurs.^A.14 As mentioned in §A.5.4, reflection/transmission at impedance discontinuities was well formulated in classical network theory [34,35], and in transmission-line theory.
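The scattering at a single area discontinuity can be sketched in a few lines of Python. This is a minimal illustration rather than Kelly and Lochbaum's original program: the function names are hypothetical, the waves are assumed to be pressure waves, and the wave impedance of each cylindrical section is assumed to be inversely proportional to its cross-sectional area.

```python
def reflection_coefficient(area_left, area_right):
    """Pressure-wave reflection coefficient seen from the left section.

    Assumes wave impedance R is proportional to 1/area, so that
    k = (R_right - R_left) / (R_right + R_left)
      = (area_left - area_right) / (area_left + area_right).
    """
    return (area_left - area_right) / (area_left + area_right)


def scatter(p_in_left, p_in_right, k):
    """Scatter two incoming pressure waves at one junction.

    p_in_left  : right-going wave arriving from the left section
    p_in_right : left-going wave arriving from the right section
    Returns (p_out_right, p_out_left): the wave transmitted into the
    right section and the wave sent back into the left section.
    """
    p_out_right = (1.0 + k) * p_in_left - k * p_in_right
    p_out_left = k * p_in_left + (1.0 - k) * p_in_right
    return p_out_right, p_out_left


def wave_power(p_toward_left_section, p_toward_right_section,
               area_left, area_right):
    """Power carried by two waves, up to a constant factor (p^2 / R ~ p^2 * area)."""
    return (area_left * p_toward_left_section ** 2
            + area_right * p_toward_right_section ** 2)


if __name__ == "__main__":
    a_left, a_right = 1.0, 3.0
    k = reflection_coefficient(a_left, a_right)
    out_right, out_left = scatter(1.0, 0.0, k)
    print(k, out_right, out_left)
```

Because the junction only redistributes the incoming waves, the total power leaving it equals the total power arriving, which is what "lossless scattering" means here. For positive areas the coefficient always satisfies |k| < 1.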

The Kelly-Lochbaum model can be regarded as a kind of ladder filter [299] or, more precisely, using later terminology, a digital waveguide filter [437]. Ladder and lattice digital filters can be used to realize arbitrary transfer functions [299], and they enjoy low sensitivity to round-off error, guaranteed stability under coefficient interpolation, and freedom from overflow oscillations and limit cycles under general conditions. Ladder/lattice filters remain important options when designing fixed-point implementations of digital filters, e.g., in VLSI. In the context of wave digital filters, the Kelly-Lochbaum model may be viewed as a digitized unit element filter [137], reminiscent of waveguide filters used in microwave engineering. In more recent terminology, it may be called a digital waveguide model of the vocal tract in which the digital waveguides are degenerated to single-sample width [437,445].
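The ladder structure described above can be sketched as a chain of tube sections, each reduced to a one-sample delay in each direction, with a scattering junction between neighbors. The code below is a simplified illustration under stated assumptions, not the original 1962 program: `simulate_tube` is a hypothetical name, the waves are pressure waves, and the glottis and lips are modeled as fixed real reflection coefficients (`r_glottis`, `r_lips`), which ignores the frequency-dependent terminations and losses a realistic model would need.

```python
def simulate_tube(areas, excitation, r_glottis=0.0, r_lips=0.0):
    """Run a chain of cylindrical sections with one-sample delays.

    areas      : cross-sectional area of each section, glottis to lips
    excitation : pressure samples injected at the glottis end
    r_glottis  : assumed real reflection coefficient at the glottis
    r_lips     : assumed real reflection coefficient at the lips
    Returns the pressure samples transmitted past the lips.
    """
    n_sec = len(areas)
    # Reflection coefficient at each interior junction (|k| < 1 for positive areas)
    k = [(areas[i] - areas[i + 1]) / (areas[i] + areas[i + 1])
         for i in range(n_sec - 1)]
    right = [0.0] * n_sec  # right-going wave held in each section's delay
    left = [0.0] * n_sec   # left-going wave held in each section's delay
    output = []
    for x in excitation:
        new_right = [0.0] * n_sec
        new_left = [0.0] * n_sec
        # Glottis end: inject the excitation and reflect the returning wave
        new_right[0] = x + r_glottis * left[0]
        # Interior junctions: scattering between sections i and i + 1
        for i in range(n_sec - 1):
            new_right[i + 1] = (1.0 + k[i]) * right[i] - k[i] * left[i + 1]
            new_left[i] = k[i] * right[i] + (1.0 - k[i]) * left[i + 1]
        # Lip end: part of the wave reflects, the rest is transmitted out
        new_left[n_sec - 1] = r_lips * right[n_sec - 1]
        output.append((1.0 + r_lips) * right[n_sec - 1])
        right, left = new_right, new_left
    return output


if __name__ == "__main__":
    # A uniform tube with non-reflecting ends acts as a pure delay
    print(simulate_tube([1.0, 1.0, 1.0], [1.0, 0.0, 0.0, 0.0, 0.0]))
```

Since every junction coefficient is computed from positive areas, the scattering stays passive for any tube shape, which is one way to see why this structure remains stable when its coefficients are interpolated.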

In 1961, Kelly and Lochbaum collaborated with Max Mathews to create what was most likely the first digital physical-modeling synthesis example by any method.^A.15 The voice was computed on an IBM 704 computer using speech-vowel data from Gunnar Fant's recent book [133]. Interestingly, Fant's vocal-tract shape data were obtained (via x-rays) for Russian vowels, not English, but they were close enough to be understandable. Arthur C. Clarke, visiting John Pierce at Bell Labs, heard this demo, and he later used it in ``2001: A Space Odyssey'': the HAL 9000 computer slowly sang its ``first song'' (``Bicycle Built for Two'') as its disassembly by astronaut Dave Bowman neared completion.^A.16

Perhaps due in part to J. L. Kelly's untimely death, research on vocal-tract analog models tapered off thereafter, although there was some additional work [309]. Perhaps the main reason for the demise of this research thread was that spectral models, both nonparametric models such as the vocoder and parametric source-filter models such as linear predictive coding (discussed below), proved to be more effective when the application was simply speech coding at acceptably low bit rates and high fidelity levels. In telephone speech-coding applications, there was no requirement that a physical voice model be retained for purposes of expressive musical performance. In fact, it was desired to automate and minimize the ``performance expertise'' required to operate the voice production model. One could go so far as to say that the musical expressivity of voice synthesis models reached its peak in the 1939 Voder and related (manual) systems (§A.6.1).

In computer music, the Kelly-Lochbaum vocal tract model was revived for singing-voice synthesis in the thesis research of Perry Cook [87].^A.17 In addition to the basic vocal tract model with a side branch for the nasal tract, Cook included neck radiation (e.g., for `b') and damping extensions. Additional work on incorporating damping within the tube sections was carried out by Amir et al. [15]. Other extensions include sparse acoustic tube modeling [149] and extension to piecewise conical acoustic tubes [511]. The digital waveguide modeling framework [434,435,437] can be viewed as an adaptation of extremely sparse acoustic-tube models for artificial reverberation, vibrating strings, and wind-instrument bores.

``Physical Audio Signal Processing'', by Julius O. Smith III, W3K Publishing, 2010, ISBN 978-0-9745607-2-4 Copyright © 2024-06-28 by Julius O. Smith III Center for Computer Research in Music and Acoustics (CCRMA), Stanford University
