In the recognition stage, an input utterance is vector-quantized by using the codebook of each reference speaker; the VQ distortion accumulated over the entire input utterance is used for making the recognition determination.
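The recognition step described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the function names `vq_distortion` and `identify_speaker`, the squared Euclidean distance measure, and the toy two-dimensional "feature vectors" are all assumptions made for the example; real systems would use cepstral feature frames and codebooks trained beforehand.

```python
import math

def vq_distortion(frames, codebook):
    """Accumulate the quantization distortion over every frame of an utterance.

    Each frame is mapped to its nearest codeword; squared Euclidean distance
    is used here as an assumed distortion measure.
    """
    total = 0.0
    for frame in frames:
        total += min(math.dist(frame, codeword) ** 2 for codeword in codebook)
    return total

def identify_speaker(frames, codebooks):
    """Return the speaker whose codebook gives the smallest accumulated distortion."""
    scores = {speaker: vq_distortion(frames, cb) for speaker, cb in codebooks.items()}
    return min(scores, key=scores.get), scores

# Hypothetical codebooks for two reference speakers (toy 2-D vectors).
codebooks = {
    "alice": [(0.0, 0.0), (1.0, 1.0)],
    "bob": [(5.0, 5.0), (6.0, 6.0)],
}
utterance = [(0.1, -0.1), (0.9, 1.2), (0.2, 0.1)]

best, scores = identify_speaker(utterance, codebooks)
print(best, scores)
```

Because the distortion is summed over the whole utterance, a single poorly matched frame has limited influence on the decision.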
In forensic applications, it is common to first perform a speaker identification process to create a list of "best matches" and then perform a series of verification processes to determine a conclusive match.
Text-Independent Speaker Recognition Methods
In text-independent speaker recognition, generally the words or sentences used in recognition trials cannot be predicted.
This was done for each of the digits making up the input utterance.
Therefore, text-independent methods have attracted more attention. Attempts have been made to find efficient ways of compressing the training data using vector quantization (VQ) techniques.
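One way to picture this compression is to reduce a speaker's training vectors to a small codebook of centroids. The sketch below uses plain k-means for that purpose; this is an assumption for illustration, since the text does not name a specific clustering algorithm, and the function name `train_codebook`, its parameters, and the toy data are hypothetical.

```python
import math
import random

def train_codebook(vectors, size, iterations=20, seed=0):
    """Compress training vectors into `size` codewords with plain k-means."""
    rng = random.Random(seed)
    # Start from randomly chosen training vectors (assumed initialization).
    codebook = [tuple(v) for v in rng.sample(vectors, size)]
    for _ in range(iterations):
        # Assign every training vector to its nearest codeword.
        clusters = [[] for _ in codebook]
        for v in vectors:
            nearest = min(range(size), key=lambda i: math.dist(v, codebook[i]))
            clusters[nearest].append(v)
        # Move each codeword to the mean of its cluster; keep it if the cluster is empty.
        codebook = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, codebook)
        ]
    return codebook

# Toy training data: two well-separated groups of 2-D vectors.
training_vectors = [
    (0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
    (10.0, 10.0), (10.2, 9.9), (9.9, 10.1),
]
codebook = train_codebook(training_vectors, size=2)
print(sorted(codebook))
```

Here six training vectors are represented by two codewords; in practice the codebook would be far smaller than the set of training frames, which is the source of the compression.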
High-level Speaker Recognition
High-level features such as word idiolect, pronunciation, phone usage, prosody, etc.