Audio Filter Banks

It is well known that the frequency resolution of human hearing decreases with frequency [71,276]. As a result, any ``auditory filter bank'' must be a nonuniform filter bank in which the channel bandwidths increase with frequency over most of the spectrum. A classic approximate example is the third-octave filter bank.^11.14 A simpler (cruder) approximation is the octave filter bank,^11.15 also called a dyadic filter bank when implemented using a binary tree structure [287]. Both are examples of constant-Q filter banks [29,30,244], in which the bandwidth of each filter-bank channel is proportional to center frequency [263]. Approximate auditory filter banks, such as constant-Q filter banks, have extensive applications in computer music, audio engineering, and basic hearing research.
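As a rough illustration of the constant-Q property, the following sketch computes band edges for idealized third-octave and octave filter banks and checks that the ratio of center frequency to bandwidth is the same in every channel. The function names, the exact base-2 spacing, and the geometric-mean band edges are assumptions made for this example; standardized third-octave banks use rounded nominal center frequencies.

```python
import math

def fractional_octave_bands(f_start, f_stop, bands_per_octave=3):
    """Return (lower_edge, center, upper_edge) in Hz for each band.

    Hypothetical helper: centers are spaced by 2**(1/bands_per_octave)
    and band edges lie at the geometric mean of adjacent centers.
    """
    ratio = 2.0 ** (1.0 / bands_per_octave)  # center-to-center ratio
    edge = math.sqrt(ratio)                  # center-to-edge ratio
    bands = []
    k = 0
    while f_start * ratio ** k <= f_stop:
        fc = f_start * ratio ** k
        bands.append((fc / edge, fc, fc * edge))
        k += 1
    return bands

def quality_factor(band):
    """Q taken here as center frequency divided by bandwidth."""
    lower, center, upper = band
    return center / (upper - lower)

third_octave = fractional_octave_bands(100.0, 8000.0, bands_per_octave=3)
octave = fractional_octave_bands(125.0, 8000.0, bands_per_octave=1)

for lower, center, upper in third_octave[:4]:
    print(f"{center:8.1f} Hz  bandwidth {upper - lower:7.1f} Hz  "
          f"Q = {quality_factor((lower, center, upper)):.3f}")
```

The bandwidth grows in proportion to the center frequency, so Q stays fixed across channels while the absolute frequency resolution becomes coarser toward high frequencies.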

If the output signals from all channels of a constant-Q filter bank are sampled at the same time instant, we obtain what may be called a constant-Q transform [29]. A constant-Q transform can be efficiently implemented by smoothing the output of a Fast Fourier Transform (FFT) [30]. More generally, a multiresolution spectrogram can be implemented by combining FFTs of different lengths and advancing the FFTs forward through time (§7.3). Such nonuniform filter banks can also be implemented based on the Goertzel algorithm [33].
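A minimal sketch of how a Goertzel recursion might be used for a constant-Q style analysis is given below. It is not taken from [33]: the `goertzel` function is the textbook second-order recursion for a single DFT bin, and the choice of giving each channel a rectangular frame holding exactly `Q` periods of its center frequency is an assumption made for this example, as are the sampling rate and test signal.

```python
import cmath
import math

def goertzel(x, k):
    """Return DFT bin k (an integer) of the sequence x via the Goertzel recursion."""
    N = len(x)
    w = 2.0 * math.pi * k / N
    coeff = 2.0 * math.cos(w)
    s1 = 0.0  # state s[n-1]
    s2 = 0.0  # state s[n-2]
    for sample in x:
        s0 = sample + coeff * s1 - s2
        s2 = s1
        s1 = s0
    # Final complex step: X[k] = exp(jw) * s[N-1] - s[N-2]
    return cmath.exp(1j * w) * s1 - s2

# Hypothetical setup: 8 kHz sampling rate, unit-amplitude 500 Hz cosine.
fs = 8000.0
Q = 8  # cycles of the center frequency per analysis frame
x = [math.cos(2.0 * math.pi * 500.0 * n / fs) for n in range(256)]

centers = [250.0, 500.0, 1000.0]
levels = []
for fc in centers:
    N = round(Q * fs / fc)           # frame length shrinks as fc grows
    X = goertzel(x[:N], Q)           # bin Q of a length-N frame sits at fc
    levels.append(abs(X) / N)

for fc, level in zip(centers, levels):
    print(f"{fc:7.1f} Hz  level {level:.4f}")
```

Because each channel uses its own frame length, low-frequency channels get long frames (fine frequency resolution) and high-frequency channels get short frames (fine time resolution), which is the trade-off a constant-Q analysis is meant to provide.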

While the topic of filter banks is well developed in the literature, including constant-Q, nonuniform FFT-based, and wavelet filter banks, the simple, robust methods presented in this section appear to be new [265]. In particular, classic nonuniform FFT filter banks as described in [226] have not offered the perfect reconstruction property [287] in which the filter-bank sum yields the input signal exactly (to within a delay and/or scale factor) when the filter-band signals are not modified. The voluminous literature on perfect-reconstruction filter banks [287] addresses nonuniform filter banks, such as dyadic filter banks designed based on pseudo quadrature mirror filter designs, but simpler STFT methods do not yet appear to be incorporated. In the cosine-modulated filter-bank domain, subband DCTs have been used in a related way [303], but apparently without consideration for the possibility of a common time domain across multiple channels.^11.16
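To make the perfect reconstruction property concrete, here is a small sketch of a dyadic (octave) filter bank built as a binary tree of two-channel Haar stages. The Haar pair is chosen only because it is the shortest filter pair with exact reconstruction; it is a stand-in for illustration, not the method of this section or of [287], and all function names are hypothetical.

```python
import math

S = math.sqrt(0.5)

def haar_analysis(x):
    """Split even-length x into lowpass and highpass bands, each downsampled by 2."""
    low = [S * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    high = [S * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return low, high

def haar_synthesis(low, high):
    """Invert haar_analysis: upsample and recombine the two bands."""
    x = []
    for l, h in zip(low, high):
        x.append(S * (l + h))
        x.append(S * (l - h))
    return x

def dyadic_analysis(x, levels):
    """Repeatedly split the lowpass branch; len(x) must be divisible by 2**levels."""
    bands = []
    low = list(x)
    for _ in range(levels):
        low, high = haar_analysis(low)
        bands.append(high)       # highest octave first
    bands.append(low)            # residual lowpass band
    return bands

def dyadic_synthesis(bands):
    """Rebuild the signal from the unmodified band signals."""
    low = bands[-1]
    for high in reversed(bands[:-1]):
        low = haar_synthesis(low, high)
    return low

signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.7 * n) for n in range(16)]
bands = dyadic_analysis(signal, levels=3)
rebuilt = dyadic_synthesis(bands)
error = max(abs(a - b) for a, b in zip(signal, rebuilt))
print("band lengths:", [len(b) for b in bands], "max error:", error)
```

When the band signals are left unmodified, the synthesis tree returns the input to within rounding error, and the total number of band samples equals the input length.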

This section can be viewed as an extension of [30] to the FFT filter-bank case. Alternatively, it may be viewed as a novel method for nonuniform FIR filter-bank design and implementation, based on STFT methodology, with arbitrarily accurate reconstruction and controlled aliasing in the downsampled case. While we consider only auditory (approximately constant-Q) filter banks, the method works equally well for arbitrary nonuniform spectral partitions and overlap-add decompositions in the frequency domain.
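The idea of an overlap-add decomposition in the frequency domain can be sketched for a single frame as follows. The spectrum is multiplied by overlapping band windows that sum to one at every bin, and each windowed spectrum is inverse transformed to give one channel signal. The piecewise-linear window shape, the octave-spaced node bins, and the function names are assumptions chosen for this example. The sketch omits downsampling and frame-by-frame processing, so it is not the full method described in this section.

```python
import cmath
import math

def dft(x):
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def hat_windows(nodes, N):
    """Piecewise-linear band windows over bins 0..N-1 that sum to 1 at every bin.

    nodes must start at bin 0 and end at bin N // 2.
    """
    windows = []
    for j, c in enumerate(nodes):
        w = []
        for k in range(N):
            f = min(k, N - k)  # fold negative-frequency bins onto 0..N/2
            if f == c:
                g = 1.0
            elif j > 0 and nodes[j - 1] < f < c:
                g = (f - nodes[j - 1]) / (c - nodes[j - 1])
            elif j + 1 < len(nodes) and c < f < nodes[j + 1]:
                g = (nodes[j + 1] - f) / (nodes[j + 1] - c)
            else:
                g = 0.0
            w.append(g)
        windows.append(w)
    return windows

def split_bands(x, nodes):
    """Return one real time-domain signal per band window."""
    N = len(x)
    X = dft(x)
    return [[v.real for v in idft([w[k] * X[k] for k in range(N)])]
            for w in hat_windows(nodes, N)]

N = 64
nodes = [0, 1, 2, 4, 8, 16, 32]  # octave-spaced nodes: widths double per band
frame = [math.sin(0.2 * n) + 0.3 * math.cos(2.1 * n) for n in range(N)]
channels = split_bands(frame, nodes)
band_sum = [sum(ch[n] for ch in channels) for n in range(N)]
recon_error = max(abs(a - b) for a, b in zip(frame, band_sum))
print("channels:", len(channels), "max reconstruction error:", recon_error)
```

Since the windows form a partition of unity over the bins, the channel signals sum back to the input frame regardless of how the node bins are chosen, which is why the same construction applies to arbitrary nonuniform spectral partitions.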

``Spectral Audio Signal Processing'', by Julius O. Smith III, W3K Publishing, 2011, ISBN 978-0-9745607-3-1.
Copyright © 2022-02-28 by Julius O. Smith III
Center for Computer Research in Music and Acoustics (CCRMA), Stanford University
