Speaker Verification System

Threshold Creation

Some words can be said more consistently than others, and some people speak more consistently than others, so it makes sense to use user-specific thresholds. After users have created their voiceprint, they are asked to make a second recording. The system calculates the difference, or average distortion, between this second recording and the saved voiceprint. Average distortion is the average Euclidean distance between the test vectors and the codebook vectors. This distortion is then used as the basis for the threshold: the distortion from the second recording is multiplied by 1.2 and saved as the threshold. The multiplication factor leaves some margin, because the user is likely to say the word somewhat differently the third time around, during verification.
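The calculation described above can be sketched in a few lines of Python. This is only an illustration, not the system's actual code: the function names are hypothetical, the 2-D vectors are toy data, and it assumes each test vector is compared against its nearest codebook vector, as is usual in vector quantization. Only the factor of 1.2 comes from the description.

```python
import math

THRESHOLD_FACTOR = 1.2  # multiplication factor given in the text


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_distortion(test_vectors, codebook):
    """Mean distance from each test vector to its nearest codebook vector.

    Assumption: nearest-codeword matching, as in standard vector quantization.
    """
    total = sum(min(euclidean(v, c) for c in codebook) for v in test_vectors)
    return total / len(test_vectors)


def create_threshold(second_recording, codebook, factor=THRESHOLD_FACTOR):
    """Scale the second recording's distortion into a user-specific threshold."""
    return factor * average_distortion(second_recording, codebook)


# Toy 2-D data for illustration only; real vectors would come from
# the feature extraction step.
codebook = [(0.0, 0.0), (4.0, 0.0)]                      # saved voiceprint
second_recording = [(0.0, 1.0), (4.0, 2.0), (3.0, 0.0)]  # test vectors

distortion = average_distortion(second_recording, codebook)
threshold = create_threshold(second_recording, codebook)
print(distortion, threshold)
```

In this toy case the nearest-codeword distances are 1, 2 and 1, so the average distortion is 4/3 and the saved threshold is 1.2 times that value.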

It is important to say the word with fair accuracy during threshold creation: if the resulting threshold is very high, imposters will have an easier time passing the verification test.
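A small numeric sketch shows why a careless second recording weakens security. It assumes the verification step accepts a speaker when the measured distortion does not exceed the stored threshold; that decision rule, the function name, and all distortion values below are hypothetical and chosen only for illustration.

```python
FACTOR = 1.2  # multiplication factor given in the text


def passes_verification(distortion, threshold):
    """Assumed decision rule: accept when distortion does not exceed the threshold."""
    return distortion <= threshold


# Hypothetical distortions of the second recording against the voiceprint.
careful_threshold = FACTOR * 1.0  # word repeated consistently
sloppy_threshold = FACTOR * 2.5   # word repeated inconsistently

# Hypothetical distortion produced by an imposter's attempt.
imposter_distortion = 2.0

print(passes_verification(imposter_distortion, careful_threshold))  # False
print(passes_verification(imposter_distortion, sloppy_threshold))   # True
```

The same imposter attempt is rejected under the tighter threshold and accepted under the inflated one.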