Enrollment involves the creation of the voiceprint -- the biometric identity
of the user. The procedure used here is called vector quantization. Vector
quantization is a compression technique that reduces a set of Mel-Frequency
Cepstral Coefficients (MFCCs) to a smaller set of "code vectors." As mentioned in the
feature extraction section, each frame gets 12 coefficients. These sets
of coefficients can be seen as vectors in 12-space. The number of vectors
depends on the length of the recording. Given the number of desired codewords,
vector quantization generates the codewords by grouping the MFCC vectors into
clusters and taking the centroid of each cluster as a codeword. The specific
algorithm used is the Linde-Buzo-Gray (LBG) clustering algorithm. See the
technical design document for algorithm details.
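The technical design document remains the reference for the actual
implementation. As a rough illustration only, the codebook-splitting procedure
commonly associated with LBG can be sketched in Python as follows. The function
names (lbg, centroid, nearest), the split factor epsilon, the tolerance tol,
the iteration cap max_iter, and the restriction to power-of-two codebook sizes
are assumptions made for this sketch, not details taken from the project.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def nearest(vector, codebook):
    """Index of the codeword closest to the given vector."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def lbg(vectors, num_codewords, epsilon=0.01, tol=1e-3, max_iter=100):
    """Build a codebook by repeated splitting and centroid refinement.

    Assumption of this sketch: num_codewords is a power of two, because
    every split doubles the size of the codebook.
    """
    if num_codewords < 1 or num_codewords & (num_codewords - 1):
        raise ValueError("num_codewords must be a power of two")

    # Start with a single codeword: the centroid of all feature vectors.
    codebook = [centroid(vectors)]

    while len(codebook) < num_codewords:
        # Split every codeword into two slightly perturbed copies.
        # Caveat: a component that is exactly 0 is not moved by this
        # multiplicative perturbation.
        codebook = [[c * (1 + sign * epsilon) for c in codeword]
                    for codeword in codebook
                    for sign in (1, -1)]

        previous = float("inf")
        for _ in range(max_iter):
            # Assign each vector to its nearest codeword.
            clusters = [[] for _ in codebook]
            distortion = 0.0
            for v in vectors:
                i = nearest(v, codebook)
                clusters[i].append(v)
                distortion += squared_distance(v, codebook[i])
            distortion /= len(vectors)

            # Move each codeword to the centroid of its cluster; a codeword
            # with an empty cluster is left where it is.
            codebook = [centroid(cluster) if cluster else codeword
                        for cluster, codeword in zip(clusters, codebook)]

            # Stop refining once the average distortion has settled.
            if previous - distortion <= tol * distortion:
                break
            previous = distortion

    return codebook
```

In the enrollment setting described here, vectors would be the list of
12-coefficient MFCC frames from the recording, and the returned codebook would
play the role of the code vectors that make up the voiceprint. The sketch works
for any vector dimension, so it can be tried on small 2-D data first.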
Below is a plot of the MFCCs and the corresponding code vectors.
Figure 4: Enrollment/Code Vectors Plot