Spectral Audio Signal Processing
The Short-Time Fourier Transform
Practical Computation of the STFT
Read $M$ samples of the input signal $x$ into a local buffer of length $N \ge M$ which is initially zeroed:

$$\tilde{x}_m(n) \triangleq x(n + mR), \qquad n = -M_h, \ldots, M_h.$$

We call $x_m$ (the $M$ samples of $x$ centered at time $mR$) the $m$th frame of the input signal, and $\tilde{x}_m$ the $m$th time-normalized input frame (time-normalized by translating it to time zero). The frame length is $M \triangleq 2M_h + 1$, which we assume to be odd for reasons to be discussed later. The time advance $R$ (in samples) from one frame to the next is called the hop size or step size.
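Multiply the data frame pointwise by a length-$M$ spectrum analysis window $w$ to obtain the $m$th windowed data frame (time normalized):

$$\tilde{x}^w_m(n) \triangleq \tilde{x}_m(n)\, w(n), \qquad n = -M_h, \ldots, M_h, \quad M_h \triangleq \frac{M-1}{2}.$$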
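Extend $\tilde{x}^w_m$ with zeros on both sides to obtain a zero-padded frame:

$$\tilde{x}^{w'}_m(n) \triangleq \begin{cases} \tilde{x}^w_m(n), & |n| \le \frac{M-1}{2} \\ 0, & \frac{M-1}{2} < n \le \frac{N}{2} - 1 \\ 0, & -\frac{N}{2} \le n < -\frac{M-1}{2} \end{cases} \tag{8.5}$$

where $N$ is chosen to be a power of two larger than $M$. The number $N/M$ is the zero-padding factor. As discussed in §2.5.3, the zero-padding factor is the interpolation factor for the spectrum, i.e., each FFT bin is replaced by $N/M$ bins, interpolating the spectrum using ideal bandlimited interpolation [264], where the "band" in this case is the $M$-sample nonzero duration of $\tilde{x}^w_m$ in the time domain.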
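The interpolation property can be checked numerically. The following is a minimal sketch, assuming NumPy and a hypothetical 9-sample windowed frame. For brevity it pads causally (zeros appended at the end, via the `n` argument of `np.fft.fft`) rather than on both sides, which changes the spectrum only by a linear phase term. Doubling the FFT size leaves the original spectral samples in place at the even bins and fills in interpolated values between them.

```python
import numpy as np

def padded_spectrum(frame, N):
    """FFT of `frame` after appending zeros up to length N >= len(frame)."""
    frame = np.asarray(frame, dtype=float)
    if N < len(frame):
        raise ValueError("N must be at least the frame length")
    return np.fft.fft(frame, n=N)

# Hypothetical windowed frame of odd length M = 9 (illustration only)
frame = np.hamming(9) * np.cos(0.7 * np.arange(9))

X16 = padded_spectrum(frame, 16)
X32 = padded_spectrum(frame, 32)

# Every second bin of the longer FFT coincides with a bin of the shorter one;
# the odd bins of X32 are the interpolated spectral samples.
same_samples = np.allclose(X32[::2], X16)
```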
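Take a length-$N$ FFT of $\tilde{x}^{w'}_m$ to obtain the time-normalized, frequency-sampled STFT at time $m$:

$$\tilde{X}^{w'}_m(\omega_k) = \sum_{n=-N/2}^{N/2-1} \tilde{x}^{w'}_m(n)\, e^{-j\omega_k n}, \qquad \omega_k \triangleq \frac{2\pi k}{N},$$

where $\omega_k$ is expressed in radians per sample and $k$ is the bin number.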
If needed, time normalization may be removed using a linear phase term to yield the sampled STFT:

$$X^{w'}_m(\omega_k) = e^{-j\omega_k mR}\, \tilde{X}^{w'}_m(\omega_k) \tag{8.7}$$

The (continuous-frequency) STFT may be approached arbitrarily closely by using more zero padding and/or other interpolation methods.
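The steps above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not a reference implementation: the function name `stft_frames` is hypothetical, frequencies are in radians per sample, the signal is padded with $(M-1)/2$ zeros at each end so that frame 0 is centered on sample 0, and the zero-centered frame is stored in the FFT buffer with negative-time samples wrapped to the end (indices taken modulo $N$).

```python
import numpy as np

def stft_frames(x, w, R, N):
    """Return (X_tilde, X): time-normalized and absolute-time sampled STFTs.

    x : real input signal
    w : analysis window of odd length M
    R : hop size in samples
    N : FFT size, N >= M
    Both outputs have shape (num_frames, N).
    """
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    M = len(w)
    if M % 2 != 1:
        raise ValueError("window length M is assumed odd")
    if N < M:
        raise ValueError("FFT size N must be >= window length M")
    Mh = (M - 1) // 2

    # Assumption: pad the ends so every frame centered at m*R is complete.
    xp = np.concatenate([np.zeros(Mh), x, np.zeros(Mh)])
    num_frames = (len(x) - 1) // R + 1
    wk = 2 * np.pi * np.arange(N) / N          # bin frequencies, rad/sample

    X_tilde = np.empty((num_frames, N), dtype=complex)
    X = np.empty((num_frames, N), dtype=complex)
    for m in range(num_frames):
        frame = xp[m * R : m * R + M]          # x(n + mR), n = -Mh..Mh
        xw = frame * w                         # windowed, time-normalized frame
        buf = np.zeros(N)                      # zero-padded FFT buffer
        buf[: Mh + 1] = xw[Mh:]                # n = 0..Mh
        buf[N - Mh :] = xw[:Mh]                # n = -Mh..-1, wrapped modulo N
        X_tilde[m] = np.fft.fft(buf)           # time-normalized sampled STFT
        X[m] = np.exp(-1j * wk * m * R) * X_tilde[m]   # remove normalization
    return X_tilde, X
```

For a unit impulse at sample $n_0$, each frame that covers the impulse should return the window value at that offset times the linear phase $e^{-j\omega_k n_0}$, which is a convenient sanity check on the indexing.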
Note that there is no irreversible time-aliasing when the STFT frequency axis $\omega$ is sampled to the points $\omega_k = 2\pi k/N$, provided the FFT size $N$ is greater than or equal to the window length $M$.
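The condition on the FFT size can be illustrated with a small experiment. The sketch below assumes NumPy, uses hypothetical helper names, and indexes the frame causally ($n = 0, \ldots, M-1$) for simplicity. It samples the spectrum of a 9-sample frame at 16 and at 8 frequencies and inverts each set of samples: with 16 samples the frame comes back intact, while with 8 samples the last frame sample folds onto the first.

```python
import numpy as np

def sample_dtft(frame, N):
    """Sample the DTFT of `frame` at w_k = 2*pi*k/N, k = 0..N-1.

    Unlike np.fft.fft(frame, n=N), this does not truncate when N < len(frame).
    """
    frame = np.asarray(frame, dtype=float)
    n = np.arange(len(frame))
    k = np.arange(N)
    return np.exp(-2j * np.pi * np.outer(k, n) / N) @ frame

def time_from_samples(frame, N):
    """Inverse FFT of the N spectral samples: the time data they retain."""
    return np.fft.ifft(sample_dtft(frame, N))

# Hypothetical windowed frame, M = 9 (illustration only)
frame = np.hamming(9) * np.arange(1, 10)

recovered = time_from_samples(frame, 16)   # N >= M: frame followed by zeros
aliased = time_from_samples(frame, 8)      # N < M: frame[8] wraps onto index 0
```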