Envelope Compression

Once we have our data in the form of amplitude and frequency envelopes for each filter-bank channel, we can compress them by a large factor. If there are $N$ channels, we nominally expect to be able to downsample by a factor of $N$, as discussed initially in Chapter 9 and more extensively in Chapter 11.

In early computer music [97,186], amplitude and frequency envelopes were ``downsampled'' by means of piecewise linear approximation. That is, a set of breakpoints was defined in time, and the envelope was approximated by a straight-line segment between each pair of adjacent breakpoints. These breakpoints correspond to ``knot points'' in the context of polynomial spline interpolation [286]. Piecewise linear approximation yielded large compression ratios for relatively steady tonal signals. (For example, compression ratios of 100:1 were not uncommon for isolated ``toots'' on tonal orchestral instruments [97].)
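The breakpoint idea can be sketched in a few lines of Python. The source does not say how the breakpoints were chosen in [97,186], so the recursive splitting rule below (add a breakpoint wherever a straight segment misses the envelope by more than a tolerance) is an assumption made purely for illustration, as are the names `fit_breakpoints`, `reconstruct`, and `adsr_example`, and the synthetic test envelope.

```python
def fit_breakpoints(env, tol):
    """Return increasing sample indices (breakpoints) such that linear
    interpolation between them stays within `tol` of `env`.
    Hypothetical splitting rule; needs at least two samples."""
    def split(lo, hi):
        # Find the sample farthest from the chord joining env[lo] and env[hi].
        worst, worst_i = 0.0, None
        for i in range(lo + 1, hi):
            t = (i - lo) / (hi - lo)
            approx = env[lo] + t * (env[hi] - env[lo])
            err = abs(env[i] - approx)
            if err > worst:
                worst, worst_i = err, i
        if worst_i is None or worst <= tol:
            return [lo]  # one segment suffices for lo..hi
        return split(lo, worst_i) + split(worst_i, hi)

    return split(0, len(env) - 1) + [len(env) - 1]


def reconstruct(knots, values, length):
    """Rebuild a sampled envelope by linear interpolation between breakpoints."""
    out = [0.0] * length
    points = list(zip(knots, values))
    for (k0, v0), (k1, v1) in zip(points, points[1:]):
        for i in range(k0, k1 + 1):
            out[i] = v0 + (v1 - v0) * (i - k0) / (k1 - k0)
    return out


def adsr_example():
    """Synthetic 101-sample envelope: attack, decay, sustain, release."""
    env = []
    for i in range(101):
        if i <= 10:
            env.append(i / 10)                        # attack 0 -> 1
        elif i <= 20:
            env.append(1.0 - 0.5 * (i - 10) / 10)     # decay 1 -> 0.5
        elif i <= 90:
            env.append(0.5)                           # sustain
        else:
            env.append(0.5 * (100 - i) / 10)          # release 0.5 -> 0
    return env


env = adsr_example()
knots = fit_breakpoints(env, tol=1e-9)
values = [env[k] for k in knots]
approx = reconstruct(knots, values, len(env))
print(len(env), "samples stored as", len(knots), "breakpoints")
```

For this idealized envelope the 101 samples reduce to five breakpoints, one at each corner. A measured envelope would need a looser tolerance, and the achievable ratio would depend on how steady the tone is.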

A more straightforward method is to simply downsample each envelope by some factor. Since each subband is bandlimited to the channel bandwidth, we expect a downsampling factor on the order of the number of channels in the filter bank. Using a hop size $R$ in the STFT results in downsampling by the factor $R$ (as discussed in §9.8). If $N$ channels are each downsampled by $N$, then the total number of samples coming out of the filter bank equals the number of samples going into the filter bank. This may be called critical downsampling, which is invariably used in filter banks for audio compression, as discussed further in Chapter 11. A benefit of converting a signal to critically sampled filter-bank form is that bits can be allocated based on the amount of energy in each subband relative to the psychoacoustic masking threshold in that band. Bit allocation is typically different for tonal and noise signals in a band [113,25,16].
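The sample-count bookkeeping behind critical downsampling can be checked with a small sketch. It assumes the simplest possible case, a rectangular window of length $N$ with a length-$N$ DFT per frame. That choice only keeps the example short and is not the windowing the text recommends. The function names and the test signal are likewise hypothetical.

```python
import cmath
import math


def dft(frame):
    """Length-N DFT of one frame (direct O(N^2) sum, for clarity only)."""
    N = len(frame)
    return [sum(frame[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N)) for k in range(N)]


def idft(spec):
    """Inverse of dft()."""
    N = len(spec)
    return [sum(spec[k] * cmath.exp(2j * cmath.pi * k * n / N)
                for k in range(N)) / N for n in range(N)]


def stft(x, N, R):
    """Rectangular-window STFT: N channels, hop size R, full frames only."""
    return [dft(x[m:m + N]) for m in range(0, len(x) - N + 1, R)]


def channel_sample_count(frames):
    """Total number of channel samples produced by the filter bank."""
    return sum(len(f) for f in frames)


def istft_critical(frames):
    """Resynthesis for the R == N rectangular-window case: inverse-transform
    each frame and concatenate the (real) results."""
    y = []
    for spec in frames:
        y.extend(v.real for v in idft(spec))
    return y


N = 8
x = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(64)]

critical = stft(x, N, R=N)           # hop equals the number of channels
oversampled = stft(x, N, R=N // 2)   # hop of half the channel count

print(len(x), channel_sample_count(critical), channel_sample_count(oversampled))
y = istft_critical(critical)
```

With $R = N$ the filter bank emits exactly as many channel samples as it receives, and in this rectangular-window case the input can still be recovered. With $R = N/2$ the output count roughly doubles. Each channel sample here is complex, but for a real input the DFT bins are conjugate-symmetric, so the count of independent real values is unchanged.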

``Spectral Audio Signal Processing'', by Julius O. Smith III, W3K Publishing, 2011, ISBN 978-0-9745607-3-1. Copyright © 2022-02-28 by Julius O. Smith III Center for Computer Research in Music and Acoustics (CCRMA), Stanford University
