Another question related to the analysis window is the hop size $R$,
i.e., how much we can advance the analysis time origin from frame to
frame. This depends very much on the purposes of the analysis. In
general, more overlap will give more analysis points and therefore
smoother results across time, but the computational expense is
proportionately greater. For purposes of spectrogram display or
additive synthesis parameter extraction, criterion Eq. (6) is a good
general-purpose choice. It states that successive frames should
overlap in time in such a way that all data are weighted equally.
However, it can be overly conservative for steady-state signals. For
additive synthesis purposes, it is more efficient and still effective
to increase the hop size to the number of samples over which the
spectrum is not changing appreciably. In the case of the steady-state
portion of piano tones, the hop size appears to be limited by the
fastest amplitude envelope ``beat'' frequency caused by mistuned
strings on one key or by overlapping partials from different keys.

For certain window types (sum-of-cosine windows), there exist perfect
overlap factors in the sense of Eq. (6). For example, a rectangular
window of length $M$ can hop by $M/k$, where $k$ is any positive integer, and a
Hanning or Hamming window can use any hop size of the form $(M/2)/k$.
For the Kaiser window, on the other hand, there is no perfect hop
size other than $R=1$.
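
The perfect-overlap property can be checked numerically. The following is a minimal sketch, not the paper's own code: it assumes a window of length $M$ hopped by $R$ samples, and it uses "periodic" window definitions (cosine period $M$ rather than $M-1$), which is an assumption needed for the overlap-add sum to be exactly constant. The helper names `periodic_window` and `overlap_add_ripple`, and the Kaiser beta of 8, are illustrative choices only.

```python
import numpy as np

def periodic_window(name, M):
    """Length-M window in 'periodic' form (assumed here so the sum is exactly flat)."""
    n = np.arange(M)
    if name == "rectangular":
        return np.ones(M)
    if name == "hann":
        return 0.5 - 0.5 * np.cos(2 * np.pi * n / M)
    if name == "hamming":
        return 0.54 - 0.46 * np.cos(2 * np.pi * n / M)
    if name == "kaiser":
        return np.kaiser(M, 8.0)  # beta = 8 is an arbitrary illustrative value
    raise ValueError(name)

def overlap_add_ripple(w, R):
    """Relative peak-to-peak deviation of sum_m w(n - m*R) from a constant."""
    M = len(w)
    # The overlap-add sum is periodic in n with period R, so one period suffices.
    s = np.zeros(R)
    for start in range(0, M, R):
        seg = w[start:start + R]
        s[:len(seg)] += seg
    return (s.max() - s.min()) / s.mean()

M = 96
for name, R in [("rectangular", M // 4), ("hann", M // 2), ("hann", M // 4),
                ("hamming", M // 6), ("kaiser", M // 2), ("kaiser", 1)]:
    print(f"{name:12s} R={R:3d} ripple={overlap_add_ripple(periodic_window(name, M), R):.2e}")
```

A ripple at the level of floating-point rounding indicates that all data are weighted equally; a clearly nonzero ripple, as for the Kaiser window at $R = M/2$, indicates that they are not.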

The perfect overlap-add criterion for windows and their hop sizes is
not the best perspective to take when overlap-add synthesis is being
constructed from modified spectra [1]. As mentioned earlier, the hop size $R$
is the downsampling factor applied to each FFT filter-bank output, and the
window is the envelope of each filter's impulse response. The
downsampling by $R$ causes aliasing, and the frame rate $f_s/R$
(where $f_s$ denotes the sampling rate)
is equal to twice the ``folding frequency'' of this aliasing.
Consequently, to minimize aliasing, the hop size $R$ should
be chosen such that the folding frequency exceeds the ``cut-off frequency'' of
the window. The cut-off frequency of a window can be defined as the
frequency above which the window transform magnitude is less than or
equal to the worst-case sidelobe level. For convenience, we typically
use the frequency of the first zero-crossing beyond the main lobe as
the definition of cut-off frequency. Following this rule yields
50% overlap for the rectangular window, 75% overlap for Hamming and
Hanning windows, and 83% (5/6) overlap for Blackman windows. The
hop size usable with a Kaiser window is determined by its design
parameters (principally, the desired time-bandwidth product of the
window, or, the ``beta'' parameter) [8].
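
The overlap figures for the fixed windows follow from a short calculation, sketched here under stated assumptions. Let $M$ be the window length, $R$ the hop size, and $f_s$ the sampling rate, and suppose the first zero-crossing beyond the main lobe lies at $B f_s / M$, with the commonly quoted values $B=1$ for the rectangular window, $B=2$ for Hamming and Hanning windows, and $B=3$ for the Blackman window. The symbol $B$ is introduced only for this illustration. Taking the limiting case in which the folding frequency $f_s/(2R)$ equals the cut-off frequency:

```latex
\frac{f_s}{2R} \;\ge\; B\,\frac{f_s}{M}
\quad\Longrightarrow\quad
R \;\le\; \frac{M}{2B},
\qquad
\text{overlap} \;=\; 1 - \frac{R}{M} \;\ge\; 1 - \frac{1}{2B}
\;=\;
\begin{cases}
1/2, & B = 1 \ \text{(rectangular)} \\
3/4, & B = 2 \ \text{(Hamming, Hanning)} \\
5/6, & B = 3 \ \text{(Blackman)}
\end{cases}
```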

One may wonder what happens to the aliasing in the
perfect-reconstruction case in which Eq. (6) is satisfied. The
answer is that aliasing does occur in the individual filter-bank
outputs, but this aliasing is canceled in the reconstruction by
overlap-add, provided the STFT has not been modified. For
a general discussion of aliasing cancellation in downsampled filter
banks, see [23,24].
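
The cancellation in the unmodified case can be illustrated with a small round trip. This is a sketch under assumptions, not the paper's implementation: it uses a periodic Hann window of length `M` at hop size `R = M/2` (a perfect-overlap choice that violates the minimum-aliasing rule above), a zero-padded FFT of size `N`, and a random test signal. The function names `stft_frames` and `overlap_add` are hypothetical.

```python
import numpy as np

def stft_frames(x, w, R, N):
    """Zero-padded length-N FFT of each windowed frame, advancing by hop size R."""
    M = len(w)
    starts = range(0, len(x) - M + 1, R)
    return [np.fft.fft(x[s:s + M] * w, N) for s in starts]

def overlap_add(frames, M, R, length):
    """Inverse-FFT each frame and overlap-add the results at hop size R."""
    y = np.zeros(length)
    for m, X in enumerate(frames):
        y[m * R:m * R + M] += np.fft.ifft(X).real[:M]
    return y

rng = np.random.default_rng(0)
M, R, N = 64, 32, 128
n = np.arange(M)
w = 0.5 - 0.5 * np.cos(2 * np.pi * n / M)  # periodic Hann: shifted copies sum to 1 at R = M/2
x = rng.standard_normal(16 * M)

frames = stft_frames(x, w, R, N)           # no spectral modification applied
y = overlap_add(frames, M, R, len(x))

# Compare away from the signal edges, where fewer than two frames overlap.
interior = slice(M, len(x) - M)
error_unmodified = np.max(np.abs(y[interior] - x[interior]))
print("max reconstruction error:", error_unmodified)
```

The sketch covers only the unmodified case; once the frames are altered between analysis and synthesis, the cancellation is no longer guaranteed, which is why the more conservative hop sizes are preferred there.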