Enrollment creates the voiceprint: the biometric identity of the user. The procedure used here is called vector quantization, a compression technique that reduces a set of Mel-Frequency Cepstral Coefficient (MFCC) vectors to a smaller set of "code vectors" (also called codewords). As mentioned in the feature extraction section, each frame yields 12 coefficients, so each frame can be treated as a vector in 12-dimensional space. The number of vectors depends on the length of the recording. Given the desired number of codewords, vector quantization groups the frame vectors into that many clusters and takes the centroid of each cluster as a codeword. The specific algorithm used is the Linde-Buzo-Gray (LBG) clustering algorithm. See the technical design document for algorithm details.
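To make the clustering step concrete, here is a minimal sketch of LBG-style codebook generation in plain Python. It is an illustration under stated assumptions, not the implementation described in the technical design document: it assumes squared Euclidean distance, a codebook size that is a power of two, and hypothetical values for the split factor `eps` and the stopping tolerance `tol`. The function names (`lbg`, `nearest`, and so on) are invented for this example.

```python
def mean_vector(vectors):
    """Component-wise mean (centroid) of a non-empty list of vectors."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def sq_dist(a, b):
    """Squared Euclidean distance (assumed distortion measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vector, codebook):
    """Index of the codeword closest to the given vector."""
    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))


def lbg(vectors, num_codewords, eps=0.01, tol=1e-4, max_iter=100):
    """Build a codebook by repeated splitting and centroid refinement.

    Assumes num_codewords is a power of two; eps and tol are
    hypothetical defaults chosen for illustration only.
    """
    # Start with a single codeword: the centroid of all frame vectors.
    codebook = [mean_vector(vectors)]
    while len(codebook) < num_codewords:
        # Split every codeword into two slightly perturbed copies.
        codebook = [[c * (1 + s * eps) for c in cw]
                    for cw in codebook for s in (1, -1)]
        prev = float("inf")
        for _ in range(max_iter):
            # Assign each vector to its nearest codeword.
            clusters = [[] for _ in codebook]
            distortion = 0.0
            for v in vectors:
                i = nearest(v, codebook)
                clusters[i].append(v)
                distortion += sq_dist(v, codebook[i])
            distortion /= len(vectors)
            # Move each codeword to the centroid of its cluster;
            # an empty cluster keeps its previous codeword.
            codebook = [mean_vector(c) if c else cw
                        for c, cw in zip(clusters, codebook)]
            # Stop refining once the average distortion stops improving.
            if prev - distortion <= tol * max(distortion, 1e-12):
                break
            prev = distortion
    return codebook
```

Under these assumptions, calling `lbg(frames, 16)` on a list of 12-element MFCC vectors would return 16 codewords of 12 components each, which together would serve as the stored voiceprint.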

Below is a plot of the MFCCs and the corresponding code vectors.

Figure 4: Enrollment/Code Vectors Plot