Filter-Based Processing
This technique used a set of parallel Butterworth filters to
separate the audio signal into several frequency sections. A Matlab script was
created that took as its input the audio stream, sampling frequency and sample
resolution, as well as four parameters: number of sections, length of interval,
overlap between intervals and the order of the filters to use.
The number of sections determined the frequency resolution of the resulting
data set by setting how many frequency bands the signal was broken into. The interval
length determined the time resolution of the resulting data set by setting the length
of signal examined to produce a single vector. The overlap further affected
the time resolution by setting how much each interval
overlapped the preceding one. The following figure illustrates the meaning of
these three parameters. The final parameter, the filter order, controlled how sharp the
cutoff between two adjacent frequency sections was. Each parameter offered a trade-off
between the accuracy of the data and the efficiency of the process.
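The relationship between interval length and overlap can be made concrete with a short sketch. It is written in Python rather than Matlab and is not the original script; the function name interval_starts and the decision to drop a trailing partial interval are assumptions made for illustration.

```python
def interval_starts(n_samples, fs, interval_s, overlap):
    """Return (starts, interval_len, hop) for overlapping analysis intervals.

    n_samples  : total number of samples in the recording
    fs         : sampling frequency in Hz
    interval_s : interval length in seconds
    overlap    : fraction of each interval shared with the previous one
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in the range [0, 1)")
    interval_len = int(round(interval_s * fs))
    # The hop is the part of an interval NOT shared with its predecessor.
    hop = max(1, int(round(interval_len * (1.0 - overlap))))
    if n_samples < interval_len:
        return [], interval_len, hop
    # ASSUMPTION: a trailing partial interval is discarded.
    starts = list(range(0, n_samples - interval_len + 1, hop))
    return starts, interval_len, hop


# Three seconds of audio, one-second intervals, 85% overlap.
starts, interval_len, hop = interval_starts(3 * 44100, 44100, 1.0, 0.85)
print(len(starts), interval_len, hop)
```

A larger overlap shrinks the hop, so more vectors are produced per second of audio and each sample is processed more often.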
To produce a vector, an interval of the audio signal was passed through the filters.
Each resulting signal was then rectified and low-pass filtered to produce an envelope function for its
frequency section. The mean value of each envelope then became one element of the vector.
This process was repeated for each interval extracted from the audio signal to produce a set of
vectors representing the entire recording.
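The per-interval step described above can be sketched as follows. This is a minimal Python illustration using NumPy and SciPy, not the original Matlab script. Several details are assumptions because the text does not specify them: the bands are linearly spaced between f_lo and f_hi, the envelope filter is a second-order low-pass at env_cutoff Hz, and rectification is full-wave. Note also that SciPy's butter, like Matlab's, yields a band-pass filter of twice the requested order.

```python
import numpy as np
from scipy.signal import butter, sosfilt


def interval_vector(x, fs, n_sections, order=10,
                    f_lo=50.0, f_hi=None, env_cutoff=20.0):
    """Return one feature vector (one mean envelope value per band)."""
    if f_hi is None:
        f_hi = 0.45 * fs  # ASSUMPTION: stay safely below the Nyquist frequency
    # ASSUMPTION: linearly spaced band edges; the source gives no spacing.
    edges = np.linspace(f_lo, f_hi, n_sections + 1)
    # ASSUMPTION: second-order low-pass used to smooth the rectified signal.
    env_sos = butter(2, env_cutoff, btype="lowpass", fs=fs, output="sos")

    vector = np.empty(n_sections)
    for k in range(n_sections):
        # Second-order sections keep high-order filters numerically stable.
        band_sos = butter(order, [edges[k], edges[k + 1]],
                          btype="bandpass", fs=fs, output="sos")
        band = sosfilt(band_sos, x)                 # isolate one section
        envelope = sosfilt(env_sos, np.abs(band))   # rectify, then smooth
        vector[k] = envelope.mean()                 # one element per section
    return vector


# Usage: a 2500 Hz tone should dominate the band that contains it.
fs = 16000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 2500.0 * t)
print(interval_vector(tone, fs, n_sections=4, order=4, f_lo=100.0, f_hi=4000.0))
```

Calling this function once per interval and stacking the results gives the set of vectors for a recording. The filters are redesigned on every call here for clarity; designing them once and reusing them would be the obvious first optimization.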
Various filter orders were tested, and it was found that a tenth-order filter was necessary
to achieve adequate distinction between frequency sections. This resulted in an
excessive amount of processing time being required to transform the data. The
process was run on a set of audio recordings with a total duration of just under two
minutes at a sampling frequency of 44100 Hz. Using 85% overlap, one-second intervals and both
ten and twenty-five sections, the total processing time exceeded twelve hours on a 1.2 GHz workstation.
As a result, this method was deemed too expensive to be useful.
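A rough count of the work implied by these settings helps explain the cost. The sketch below is illustrative arithmetic only, under stated assumptions: the recordings are treated as one continuous 120-second signal (an upper bound for "just under two minutes"), every interval is filtered independently, and only the band filters are counted. The function name workload is hypothetical.

```python
def workload(duration_s, fs, interval_s, overlap, n_sections):
    """Count intervals and band-filter passes for one configuration."""
    interval_len = int(round(interval_s * fs))
    hop = int(round(interval_len * (1.0 - overlap)))
    n_samples = int(duration_s * fs)
    n_intervals = (n_samples - interval_len) // hop + 1
    return {
        "n_intervals": n_intervals,
        # One band-pass run per section per interval.
        "band_filter_passes": n_intervals * n_sections,
        # Samples pushed through band filters in total.
        "samples_filtered": n_intervals * n_sections * interval_len,
        # How many intervals each sample falls into, on average.
        "redundancy": interval_len / hop,
    }


# ASSUMPTION: 120 s stands in for "just under two minutes".
for n_sections in (10, 25):
    print(n_sections, workload(120, 44100, 1.0, 0.85, n_sections))
```

With 85% overlap, each sample falls into roughly 6.7 intervals, so the filtering work is multiplied by about that factor relative to non-overlapping intervals, before the cost of the high-order filters themselves is considered.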