Speaker Verification System
Specifications


High Level Description


The Speaker Verification process can be broken up into a number of modules, as shown in Fig. 1 below. A one-second voice signal sampled at 16 kHz first undergoes preprocessing to minimize analysis errors. Preprocessing includes silence removal and high-pass filtering. The processed voice signal then goes into the feature extraction module, which identifies the voice characteristics. The next step depends on the specific task: enrollment, threshold creation, or verification.
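The preprocessing stage described above can be sketched in a few lines of Python. This is only an illustration, not the system's actual implementation: the frame length, the energy ratio used to detect silence, the filter coefficient `alpha`, and the order of the two steps are all assumed values, and the function names are hypothetical.

```python
import math

SAMPLE_RATE = 16_000  # from the text: the voice signal is sampled at 16 kHz


def remove_silence(signal, frame_len=320, energy_ratio=0.1):
    """Drop frames whose mean energy is at or below energy_ratio * the loudest frame.

    frame_len=320 is 20 ms at 16 kHz; both parameters are assumed values.
    """
    frames = [signal[i:i + frame_len] for i in range(0, len(signal), frame_len)]
    if not frames:
        return []
    energies = [sum(s * s for s in f) / len(f) for f in frames]
    cutoff = energy_ratio * max(energies)
    kept = []
    for frame, energy in zip(frames, energies):
        if energy > cutoff:
            kept.extend(frame)
    return kept


def high_pass(signal, alpha=0.95):
    """First-order FIR high-pass filter: y[n] = x[n] - alpha * x[n-1] (alpha is assumed)."""
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - alpha * signal[n - 1])
    return out


def preprocess(signal):
    """Silence removal followed by high-pass filtering (the order is an assumption)."""
    return high_pass(remove_silence(signal))


# Usage: one second of audio, half silence and half a 440 Hz tone.
silence = [0.0] * (SAMPLE_RATE // 2)
tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE // 2)]
processed = preprocess(silence + tone)
```

In this sketch the silent half of the signal is discarded before filtering, so only the voiced frames reach the feature extraction stage.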

In enrollment, the feature vectors (from the feature extraction module) are compressed using vector quantisation. The compressed vectors are referred to as the voice print, which is stored on the computer's hard drive.
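As a rough illustration of enrollment, the sketch below compresses feature vectors into a small codebook and writes it to disk. The text only says "vector quantisation", so the use of plain k-means, the codebook size, the iteration count, the JSON file format, and all function names here are assumptions made for the example.

```python
import json


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vector, codebook):
    """Index of the codeword closest to the given vector."""
    return min(range(len(codebook)), key=lambda i: squared_distance(vector, codebook[i]))


def train_codebook(vectors, size=2, iterations=20):
    """Compress feature vectors into `size` codewords with plain k-means (assumed method)."""
    if len(vectors) < size:
        raise ValueError("need at least as many vectors as codewords")
    # Deterministic start: evenly spaced vectors taken from the input.
    step = len(vectors) // size
    codebook = [list(vectors[i * step]) for i in range(size)]
    for _ in range(iterations):
        clusters = [[] for _ in codebook]
        for v in vectors:
            clusters[nearest(v, codebook)].append(v)
        for i, members in enumerate(clusters):
            if members:  # keep the old codeword if its cluster is empty
                dim = len(members[0])
                codebook[i] = [sum(m[d] for m in members) / len(members) for d in range(dim)]
    return codebook


def save_voice_print(codebook, path):
    """Store the voice print (the codebook) on disk as JSON (assumed format)."""
    with open(path, "w") as f:
        json.dump(codebook, f)


# Usage: six toy 2-D feature vectors forming two obvious groups.
features = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
voice_print = train_codebook(features, size=2)
```

The codebook is much smaller than the original set of feature vectors, which is what makes it practical to store one voice print per enrolled speaker.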

In threshold creation, the feature vectors are compared with the existing voice print to create the threshold used for verification.

In verification, the feature vectors are compared with the voice print in the same way as in threshold creation, but the resulting difference is then compared to the threshold to arrive at a pass or fail verdict. A pass means that the speaker was verified to be who they claim to be.
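The comparison, threshold, and verdict steps could look roughly like the sketch below. The text does not say how the "difference" or the threshold is computed, so two assumptions are made here: the difference is modelled as the average quantisation distortion against the voice print, and the threshold is a margin times the mean distortion over known-genuine utterances. The function names and the `margin` parameter are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def average_distortion(vectors, codebook):
    """Mean squared distance from each feature vector to its nearest codeword."""
    total = sum(min(squared_distance(v, c) for c in codebook) for v in vectors)
    return total / len(vectors)


def create_threshold(utterances, codebook, margin=1.5):
    """Assumed rule: margin * mean distortion over known-genuine utterances."""
    scores = [average_distortion(u, codebook) for u in utterances]
    return margin * sum(scores) / len(scores)


def verify(vectors, codebook, threshold):
    """Pass (True) when the distortion does not exceed the threshold."""
    return average_distortion(vectors, codebook) <= threshold


# Usage with a toy voice print of two codewords.
voice_print = [[0.0, 0.0], [10.0, 10.0]]
genuine_utterances = [[[0, 1], [10, 9]], [[1, 0], [9, 10]]]
threshold = create_threshold(genuine_utterances, voice_print)

claimed_speaker = [[0, 1], [10, 10]]  # close to the voice print
other_speaker = [[5, 5], [4, 6]]      # far from both codewords
passed = verify(claimed_speaker, voice_print, threshold)
rejected = not verify(other_speaker, voice_print, threshold)
```

A larger `margin` accepts more genuine attempts at the cost of more false acceptances, so in practice it would need tuning against recorded data.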

Figure 1: High Level System Diagram