THE ALGORITHM - Jack Wei

In the field of signal processing, Fourier Transforms are used to transform a signal in the time-domain to the frequency domain. This allows the analysis of the different frequency components of a signal.

The note detection algorithm that was chosen was to perform a series of discrete Fourier transforms at the Nyquest frequency of the notes. With 1/32 notes at 120 BPM, each Fourier transform would be performed on 1/32 seconds worth of data.

With the sound card configured to sample at the same sampling rate as audio CDs, each Fourier transform would only have around 1300 samples to operate upon. This issue is further complicated by the fact that the FFT (fast Fourier transform) algorithm that is to be used can only perform the transform on a 2^N data points without the possibility of adding artifacts in the output data. At 32 sampling periods for the FFT per second, the resulting transform graph would not have a high enough resolution to properly distinguish between different notes. Thus it was decided that the detection of 1/32 notes would be discarded and instead the fastest note would be the 1/16 note, allowing the FFT to operate on 2048 samples and output the same number of data points.

Once the data are transformed to the frequency domain, a note would be represented by a series of peaks unique to each note. It is the locations of these peaks that would allow the system to discern between different notes. A heuristic algorithm was devised to detect the existence of these peaks in the input of the data. Due to the low resolution of the transformed data (2048 data points for 44.1 KHz), multiple peaks need to be matched to differentiate between different notes.