Once we have our data in the form of amplitude and frequency envelopes
for each filter-bank channel, we can compress them by a large factor.
If there are $N$ channels, we nominally expect to be able to
downsample by a factor of $N$, as discussed initially in Chapter 9
and more extensively in Chapter 11.
In early computer music [97,186], amplitude and
frequency envelopes were "downsampled" by means of piecewise
linear approximation. That is, a set of breakpoints was
defined in time, between which linear segments were used. These
breakpoints correspond to "knot points" in the context of polynomial
spline interpolation [286]. Piecewise linear approximation
yielded large compression ratios for relatively steady tonal
signals. For example, compression ratios of 100:1 were not uncommon for
isolated "toots" on tonal orchestral instruments [97].
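As a rough illustration of the breakpoint idea, the following minimal sketch rebuilds an envelope from a handful of (time, value) breakpoints by linear interpolation. The function name `piecewise_linear`, the breakpoint values, and the envelope length are all hypothetical, chosen only for illustration; they are not taken from the cited work, and the way breakpoints were actually selected is not shown here.

```python
def piecewise_linear(breakpoints, num_samples):
    """Rebuild an envelope from (sample_index, value) breakpoints.

    Adjacent breakpoints are joined by straight-line segments.
    Assumes the breakpoints are sorted by strictly increasing index
    and span the range 0 .. num_samples - 1.
    """
    env = [0.0] * num_samples
    for (t0, v0), (t1, v1) in zip(breakpoints, breakpoints[1:]):
        for t in range(t0, t1 + 1):
            env[t] = v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return env


# Hypothetical amplitude envelope for one channel:
# attack, initial decay, slow sustain, release.
breakpoints = [(0, 0.0), (50, 1.0), (150, 0.6), (800, 0.5), (999, 0.0)]
num_samples = 1000
envelope = piecewise_linear(breakpoints, num_samples)

# Each breakpoint stores two numbers (a time and a value), so the
# storage cost is 2 * len(breakpoints) numbers instead of num_samples.
compression_ratio = num_samples / (2 * len(breakpoints))
print(compression_ratio)
```

With these made-up numbers, five breakpoints stand in for 1000 envelope samples. How well such an approximation works in practice depends on how steady the envelope is.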
A more straightforward method is to simply downsample each envelope by
some factor. Since each subband is bandlimited to the channel
bandwidth, we expect a downsampling factor on the order of the number
of channels in the filter bank. Using a hop size $R$ in the STFT
results in downsampling by the factor $R$ (as discussed
in §9.8). If $N$ channels are downsampled by $N$, then the
total number of samples coming out of the filter bank equals the
number of samples going into the filter bank. This may be called
critical downsampling, which is invariably used in filter banks
for audio compression, as discussed further in Chapter 11. A benefit
of converting a signal to critically sampled filter-bank form is that
bits can be allocated based on the amount of energy in each subband
relative to the psychoacoustic masking threshold in that band.
Bit-allocation is typically different for tonal and noise signals in a
band [113,25,16].
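The sample-count bookkeeping behind critical downsampling can be checked with a small sketch. It assumes the simplest possible STFT filter bank: a length-$N$ rectangular window followed by an $N$-point DFT, with hop size $R$. The function name `stft_frames` and the test signal are hypothetical, and practical audio coders use better windows and transforms than this.

```python
import cmath


def stft_frames(x, N, R):
    """N-channel DFT of successive length-N frames of x, advancing by R.

    A rectangular window is assumed (no tapering), and frames that
    would run past the end of x are dropped.
    """
    frames = []
    for start in range(0, len(x) - N + 1, R):
        frame = x[start:start + N]
        spectrum = [
            sum(frame[n] * cmath.exp(-2j * cmath.pi * k * n / N)
                for n in range(N))
            for k in range(N)
        ]
        frames.append(spectrum)
    return frames


N = 8                                   # number of channels
x = [float(n % 5) for n in range(64)]   # arbitrary test signal

critical = stft_frames(x, N, R=N)       # hop size equal to channel count
oversampled = stft_frames(x, N, R=1)    # no downsampling

samples_in = len(x)
samples_out_critical = len(critical) * N
samples_out_oversampled = len(oversampled) * N
print(samples_in, samples_out_critical, samples_out_oversampled)
```

With $R = N$ the filter bank emits exactly as many samples as it receives, while $R = 1$ emits roughly $N$ times as many. The outputs here are complex, but for real input the DFT is conjugate symmetric, so each frame still carries only $N$ independent real values.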