Speaker Verification System
Specifications

Threshold Creation


Some words can be said more consistently than others, and some people speak more consistently than others, so it makes sense to use user-specific thresholds. After users have created their voiceprint, they are asked to make a second recording. The system calculates the difference, or average distortion, between this second recording and the saved voiceprint. Average distortion is the average Euclidean distance between the test vectors and the codebook vectors. This distortion is then used as the basis for the threshold: the distortion from the second recording is multiplied by 1.2 and saved as the threshold. The multiplication factor allows for the likelihood that the user will say the word somewhat differently the third time around, during verification.
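The threshold calculation described above can be sketched roughly as follows. This is a minimal illustration, not the system's actual code: the function names are hypothetical, feature vectors are represented as plain tuples of numbers, and it assumes that each test vector is compared with its nearest codebook vector, as is usual in vector quantization. Only the factor of 1.2 comes from the description itself.

```python
import math

THRESHOLD_FACTOR = 1.2  # multiplication factor given in the text


def average_distortion(test_vectors, codebook):
    """Mean Euclidean distance from each test vector to a codebook vector.

    Assumption: each test vector is matched to its closest codebook
    vector; the text does not state the matching rule explicitly.
    """
    if not test_vectors or not codebook:
        raise ValueError("test_vectors and codebook must be non-empty")
    total = 0.0
    for vector in test_vectors:
        # Distance to the nearest codebook vector for this test vector.
        total += min(math.dist(vector, code) for code in codebook)
    return total / len(test_vectors)


def create_threshold(second_recording, codebook, factor=THRESHOLD_FACTOR):
    """Scale the second recording's distortion to get the user's threshold."""
    return factor * average_distortion(second_recording, codebook)
```

For example, with a made-up two-entry codebook `[(0.0, 0.0), (3.0, 4.0)]` and a second recording `[(0.0, 1.0), (3.0, 3.0)]`, each test vector lies at distance 1 from its nearest codebook vector, so the average distortion is 1.0 and the saved threshold is 1.2.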

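It is important to say the word with fair accuracy during threshold creation, much as it was said when the voiceprint was created. A poorly matching second recording produces a large distortion and therefore a very high threshold, and with a very high threshold imposters will have an easier time passing the verification test.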
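The effect of a careless second recording can be illustrated with a small sketch. All distortion values here are invented for illustration, and the decision rule (accept when the distortion does not exceed the threshold) is an assumption consistent with the description above rather than a statement of how the system is coded.

```python
FACTOR = 1.2  # multiplication factor given in the text


def is_accepted(distortion, threshold):
    """Assumed decision rule: accept when distortion is within the threshold."""
    return distortion <= threshold


# Hypothetical distortion of an imposter's attempt against the voiceprint.
imposter_distortion = 2.0

# Hypothetical second recordings: one said carefully, one said carelessly.
careful_threshold = FACTOR * 1.0
careless_threshold = FACTOR * 2.5

accepted_with_careful = is_accepted(imposter_distortion, careful_threshold)
accepted_with_careless = is_accepted(imposter_distortion, careless_threshold)

print(accepted_with_careful)   # False: 2.0 exceeds the careful threshold
print(accepted_with_careless)  # True: 2.0 is within the careless threshold
```

The same imposter attempt is rejected under the threshold from the careful recording but accepted under the inflated one.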