Filter-Based Processing

This technique used a set of parallel Butterworth filters to separate the audio signal into several frequency sections. A Matlab script was created that took as its input the audio stream, sampling frequency and sample resolution, as well as four parameters: number of sections, length of interval, overlap between intervals and the order of the filters to use.

The number of sections determined the frequency resolution of the resulting data set by setting how many frequency bands the signal was divided into. The interval length determined the time resolution by setting the length of signal examined to produce a single vector. The overlap further affected the time resolution by setting how much each interval overlapped the preceding one. The following figure illustrates the meaning of these three parameters. The final parameter, the filter order, allowed the sharpness of the cutoff between two adjacent frequency sections to be adjusted. Each parameter involved a trade-off between the accuracy of the data and the efficiency of the process.
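The relationship between interval length and overlap can be sketched in a few lines. This is an illustrative Python sketch, not the Matlab script described here; the function name `interval_starts` and its arguments are hypothetical, and it assumes that only full-length intervals are kept and that overlap is given as a fraction of the interval length.

```python
def interval_starts(num_samples, interval_len, overlap_fraction):
    """Return the start index of every full-length interval.

    Assumption: a trailing partial interval is dropped, and the step
    between intervals is interval_len * (1 - overlap_fraction).
    """
    if not 0 <= overlap_fraction < 1:
        raise ValueError("overlap_fraction must be in [0, 1)")
    # Step between consecutive interval starts, at least one sample.
    hop = max(1, round(interval_len * (1 - overlap_fraction)))
    return list(range(0, num_samples - interval_len + 1, hop))


# Ten samples, four-sample intervals, 50% overlap.
starts = interval_starts(num_samples=10, interval_len=4, overlap_fraction=0.5)
print(starts)  # [0, 2, 4, 6]
```

Raising the overlap shrinks the step between intervals, so more vectors are produced for the same recording, which is where the cost of finer time resolution comes from.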

To produce a set of vectors, the audio signal was passed through the filters. Each resulting signal was then rectified and low-pass filtered to produce an envelope function for its frequency range. The mean value of each envelope was then used as one element of a vector. This process was repeated for each interval extracted from the audio signal to produce a set of vectors representing the entire recording.
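The rectify, smooth and average step can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the original Matlab code: it takes signals that have already been band-filtered, and it uses a simple one-pole smoother as a stand-in for the low-pass filter, whose type and cutoff are not given here. The names `envelope_mean` and `feature_vector` and the 20 Hz default cutoff are hypothetical.

```python
import math


def envelope_mean(band_signal, sample_rate, cutoff_hz=20.0):
    """Rectify a band-limited signal, smooth it, and return the mean envelope.

    Assumption: a one-pole low-pass stands in for the unspecified
    envelope filter; cutoff_hz is an illustrative value.
    """
    if not band_signal:
        return 0.0
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    state = 0.0
    total = 0.0
    for x in band_signal:
        rectified = abs(x)                    # full-wave rectification
        state += alpha * (rectified - state)  # one-pole low-pass
        total += state
    return total / len(band_signal)


def feature_vector(band_signals, sample_rate):
    """Build one vector for an interval: one mean envelope per band."""
    return [envelope_mean(band, sample_rate) for band in band_signals]


# Two synthetic tones stand in for the outputs of two band filters.
rate = 44100
loud = [math.sin(2 * math.pi * 440 * n / rate) for n in range(rate)]
quiet = [0.25 * math.sin(2 * math.pi * 1000 * n / rate) for n in range(rate)]
vector = feature_vector([loud, quiet], rate)
```

Each element of `vector` summarises the energy in one frequency section over the interval, so the louder band yields the larger value.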

Various filter orders were tested, and a tenth-order filter was found to be necessary to achieve adequate distinction between frequency sections. This made the processing time required to transform the data excessive. The process was run on a set of audio recordings with a total duration of just under two minutes at a sampling frequency of 44100 Hz. Using 85% overlap, one-second intervals and both ten and twenty-five sections, the total processing time exceeded twelve hours on a 1.2 GHz workstation. As a result, this method was deemed too expensive to be useful.
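A rough count of the filtering work implied by these settings helps explain the cost. The Python sketch below is an estimate only. It assumes a total duration of exactly 120 seconds (an upper bound for "just under two minutes"), treats the set of recordings as one continuous signal, and assumes every filter is applied to every interval separately. None of these details are confirmed by the description above.

```python
SAMPLE_RATE = 44100        # Hz, from the text
DURATION_S = 120           # assumption: upper bound on the total duration
INTERVAL_S = 1             # one-second intervals, from the text
OVERLAP_PERCENT = 85       # from the text
SECTION_COUNTS = (10, 25)  # both configurations were run

interval_len = INTERVAL_S * SAMPLE_RATE
# Step between interval starts: the 15% of each interval that is new.
hop = interval_len * (100 - OVERLAP_PERCENT) // 100
num_intervals = (DURATION_S * SAMPLE_RATE - interval_len) // hop + 1
# Assumption: one band-filter pass per section per interval.
band_passes = num_intervals * sum(SECTION_COUNTS)
samples_filtered = band_passes * interval_len

print(hop, num_intervals, band_passes, samples_filtered)
# 6615 794 27790 1225539000
```

Under these assumptions each audio sample falls into roughly six or seven intervals, and the tenth-order filters process on the order of a billion samples in total, before the envelope filtering is counted.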