Because some words can be said more consistently than others, and speakers
differ in how consistently they say them, it makes sense to use user-specific
thresholds. After users have created their voiceprint, they are asked
to make a second recording. The system takes this second recording and
calculates the difference, or average distortion, between the second recording
and the saved voiceprint. Average distortion is the average Euclidean
distance between the test vectors and the codebook vectors. This distortion
is then used as the basis for the threshold. Specifically, the distortion
from the second recording is multiplied by 1.2 and saved as the threshold.
The reasoning behind the multiplication factor is that the user is likely
to say the word somewhat differently the third time around, during verification.
It is important to say the word with fair accuracy during threshold
creation, because if the resulting threshold is very high, imposters will
have an easier time passing the verification test.
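
The threshold calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the system's actual code: the function names (average_distortion, make_threshold, verify) are hypothetical, feature vectors are assumed to be plain tuples of floats, and each test vector is assumed to be compared against its nearest codebook vector, as is usual for vector-quantization codebooks. The text does not state how a verification attempt is compared to the threshold, so the "accept if distortion is at or below the threshold" rule is also an assumption.

```python
import math

# Multiplication factor from the text: threshold = 1.2 * distortion.
THRESHOLD_FACTOR = 1.2


def average_distortion(test_vectors, codebook):
    """Mean Euclidean distance from each test vector to a codebook vector.

    Assumption: each test vector is matched to its NEAREST codebook vector.
    """
    if not test_vectors or not codebook:
        raise ValueError("test_vectors and codebook must be non-empty")
    total = 0.0
    for vector in test_vectors:
        total += min(math.dist(vector, code) for code in codebook)
    return total / len(test_vectors)


def make_threshold(second_recording, codebook, factor=THRESHOLD_FACTOR):
    """Threshold derived from the user's second recording."""
    return factor * average_distortion(second_recording, codebook)


def verify(attempt, codebook, threshold):
    """Assumed decision rule: accept when distortion is at or below threshold."""
    return average_distortion(attempt, codebook) <= threshold


if __name__ == "__main__":
    # Toy two-dimensional data, invented purely for illustration.
    codebook = [(0.0, 0.0), (10.0, 0.0)]
    second_recording = [(3.0, 4.0), (10.0, 5.0)]
    threshold = make_threshold(second_recording, codebook)
    print("threshold:", threshold)
    print("close attempt accepted:", verify([(0.0, 5.5), (10.0, 5.5)], codebook, threshold))
    print("distant attempt accepted:", verify([(0.0, 7.0), (10.0, 7.0)], codebook, threshold))
```

The sketch also makes the trade-off visible: a careless second recording raises the average distortion, the threshold grows in proportion, and attempts that are farther from the voiceprint are then accepted.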