The PARSHL Program

This appendix is adapted from the original paper describing the PARSHL program [271] for sinusoidal modeling of audio. While many of the main points are summarized elsewhere in the text, the PARSHL paper is included here as a source of more detailed information on carrying out elementary sinusoidal modeling of sound based on the STFT.

As mentioned in §G.7.1, the phase vocoder was a widely used analysis tool for additive synthesis starting in the 1970s. A difficulty with the phase vocoder, as traditionally implemented, is that it uses a fixed uniform filter bank. While this works well for periodic signals, it is relatively inconvenient for inharmonic signals. An ``inharmonic phase vocoder'' called PARSHL was developed in 1985 to address this problem in the context of piano signal modeling [271]. PARSHL worked by tracking peaks in the short-time Fourier transform (STFT), thereby synthesizing an adaptive inharmonic FIR filter bank, replacing the fixed uniform filter bank of the vocoder. In other respects, PARSHL could be regarded as a phase-vocoder analysis program.

The PARSHL program converted an STFT to a set of amplitude and frequency envelopes for inharmonic, quasi-sinusoidal-sum signals. Only the most prominent peaks in the spectrum of the input signal were tracked. For quasi-harmonic sounds, such as the piano, the amplitudes and frequencies were sampled approximately once per period of the lowest frequency in the analysis band. For resynthesis, PARSHL supported additive synthesis [233] using an oscillator bank, overlap-add reconstruction from the STFT, or both.
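As a rough illustration of sampling the envelopes about once per period of the lowest frequency, the following sketch converts a lowest analysis frequency into a hop size in samples. The function name, the 44.1 kHz sampling rate, and the 27.5 Hz lowest frequency (the lowest note on a standard piano) are assumptions chosen for the example, not values taken from PARSHL.

```python
def hop_size_for_lowest_frequency(sample_rate_hz, lowest_freq_hz):
    """Hop size in samples spanning about one period of the lowest frequency."""
    return max(1, round(sample_rate_hz / lowest_freq_hz))

# Assumed example values: 44.1 kHz audio, lowest partial at 27.5 Hz.
hop = hop_size_for_lowest_frequency(44100, 27.5)
frames_per_second = 44100 / hop
```

With these assumed values, one period of the lowest frequency is about 1604 samples, so the envelopes would be sampled roughly 27.5 times per second.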

PARSHL followed the amplitude, frequency, and phase of the most prominent peaks over time in a series of spectra, computed using the Fast Fourier Transform (FFT). The synthesis part of the program used the analysis parameters, or their modification, to generate a sinewave in the output for each peak track found.
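The idea of generating one sinewave per peak track can be sketched as a small oscillator bank. This is a minimal illustration under stated assumptions, not the PARSHL implementation: the function names are hypothetical, amplitude and frequency are interpolated linearly between frames, every track is assumed to span the same frames, and the phase is obtained by integrating frequency rather than by using measured peak phases.

```python
import numpy as np

def synth_track(amps, freqs_hz, hop, sample_rate):
    """Synthesize one track from per-frame amplitude and frequency values."""
    amps = np.asarray(amps, dtype=float)
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    frame_times = np.arange(len(amps)) * hop        # sample index of each frame
    n = np.arange((len(amps) - 1) * hop)            # output sample indices
    amp = np.interp(n, frame_times, amps)           # linear amplitude envelope
    freq = np.interp(n, frame_times, freqs_hz)      # linear frequency envelope
    phase = 2.0 * np.pi * np.cumsum(freq) / sample_rate  # integrate frequency
    return amp * np.sin(phase)

def additive_synth(tracks, hop, sample_rate):
    """Sum the sinewaves of all tracks; each track is (amps, freqs_hz)."""
    return sum(synth_track(a, f, hop, sample_rate) for a, f in tracks)

# Assumed test signal: one steady 440 Hz track over five frames.
out = additive_synth([([0.5] * 5, [440.0] * 5)], hop=256, sample_rate=8000.0)
```

Modifying the per-frame amplitude or frequency lists before calling `additive_synth` corresponds to altering the analysis parameters prior to resynthesis.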

The steps carried out by PARSHL were as follows:

1. Compute the STFT $\tilde{x}_m^\prime (e^{j\omega_k})$ using the frame size, window type, FFT size, and hop size specified by the user.
2. Compute the squared magnitude spectrum in dB ($20\log_{10}\left\vert\tilde{x}_m^\prime (e^{j\omega_k})\right\vert$).
3. Find the bin numbers (frequency samples) of the spectral peaks. Parabolic interpolation is used to refine the peak location estimates. Three spectral samples (in dB), consisting of the local peak in the FFT and the samples on either side of it, suffice to determine the parabola used.
4. The magnitude and phase of each peak are calculated from the maximum of the parabola determined in the previous step. The parabola is evaluated separately on the real and imaginary parts of the spectrum to provide a complex interpolated spectrum value.
5. Each peak is assigned to a frequency track by matching the peaks of the previous frame with those of the current one. These tracks can be ``started up,'' ``turned-off'' or ``turned-on'' at any frame by ramping in amplitude from or toward 0.
6. Arbitrary modifications can be applied to the analysis parameters before resynthesis.
7. If additive synthesis is requested, a sinewave is generated for each frequency track, and all are summed into an output buffer. The instantaneous amplitude, frequency, and phase for each sinewave are calculated by interpolating the values from frame to frame. The length of the output buffer is equal to the hop size, which is typically some fraction of the window length.
8. Repeat from step 1, advancing by the hop size (in samples) each iteration until the end of the input sound is reached.
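The first three steps above can be sketched for a single frame as follows. This is a minimal sketch under stated assumptions rather than the original program: the function names, the Hann window, the zero-padding factor of 4, the 60 dB peak-selection range, and the test sinusoid are all choices made for illustration. The complex-valued interpolation of the fourth step is not included; only the dB magnitude is interpolated.

```python
import numpy as np

def db_spectrum(frame, window, fft_size):
    """Windowed, zero-padded FFT of one frame, returned as magnitude in dB."""
    spectrum = np.fft.rfft(frame * window, n=fft_size)
    return 20.0 * np.log10(np.maximum(np.abs(spectrum), 1e-12))

def interpolated_peaks(db, range_db=60.0):
    """Return (fractional_bin, height_db) for each local maximum in db."""
    floor = db.max() - range_db          # ignore peaks far below the maximum
    peaks = []
    for k in range(1, len(db) - 1):
        alpha, beta, gamma = db[k - 1], db[k], db[k + 1]
        if beta > alpha and beta >= gamma and beta > floor:
            # Vertex of the parabola through the three dB samples.
            p = 0.5 * (alpha - gamma) / (alpha - 2.0 * beta + gamma)
            peaks.append((k + p, beta - 0.25 * (alpha - gamma) * p))
    return peaks

# Assumed test signal: a 1003.3 Hz sinusoid sampled at 8 kHz.
sample_rate, frame_len, fft_size = 8000.0, 1024, 4096
n = np.arange(frame_len)
frame = np.cos(2.0 * np.pi * 1003.3 * n / sample_rate)
db = db_spectrum(frame, np.hanning(frame_len), fft_size)
peak_bin, peak_db = max(interpolated_peaks(db), key=lambda pk: pk[1])
est_hz = peak_bin * sample_rate / fft_size   # refined frequency estimate
```

The offset `p` always lies between -0.5 and 0.5 bins, so the refined location stays within half a bin of the local FFT maximum. A full analysis would repeat this for each frame and pass the resulting peaks to the track-matching step.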

The following sections provide further details:


``Spectral Audio Signal Processing'', by Julius O. Smith III, W3K Publishing, 2011, ISBN 978-0-9745607-3-1.
Copyright © 2022-02-28 by Julius O. Smith III
Center for Computer Research in Music and Acoustics (CCRMA), Stanford University
