🎙️

Voice Biometrics

Biometric Workshop Suite — Feature Visualisation

Waveform (time domain)
Spectrogram — Mel scale (scrolling)
Mel Filterbank Energies (26 filters)
Live Audio Stats
RMS Energy
Pitch (F0)
Spectral centroid
Zero-crossing rate
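
As a rough illustration of three of the stats listed above, the sketch below computes RMS energy, zero-crossing rate and spectral centroid for a single frame. The function names and the assumption that the centroid is fed the magnitude bins of a real FFT (N/2 + 1 bins) are illustrative choices, not taken from this page. Pitch (F0) estimation is left out because the page does not say which method it uses.

```python
import math

def rms_energy(frame):
    """Root-mean-square amplitude of one frame of samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def spectral_centroid(magnitudes, sample_rate):
    """Magnitude-weighted mean frequency in Hz.

    Assumes magnitudes[k] is bin k of an N-point real FFT,
    so there are N/2 + 1 bins spanning 0 Hz to sample_rate / 2.
    """
    n_bins = len(magnitudes)
    total = sum(magnitudes)
    if n_bins < 2 or total == 0:
        return 0.0
    bin_hz = sample_rate / (2 * (n_bins - 1))
    return sum(k * bin_hz * m for k, m in enumerate(magnitudes)) / total
```

A loud, noisy frame tends to show high RMS together with a high zero-crossing rate, while a voiced vowel usually has a lower zero-crossing rate and a lower centroid.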
MFCCs (13 coefficients)

Each bar is one cepstral coefficient. C0 (energy) is omitted. The pattern across C1–C13 is a compact “voiceprint” of the current sound frame.
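
A minimal sketch of how one frame's bars could be derived, assuming the input is the 26 mel filterbank energies and that an unscaled DCT-II of their logarithms is used. The function name, the log floor of 1e-10 and the lack of normalisation are assumptions; the page's own scaling may differ.

```python
import math

def mfcc_from_filterbank(energies, n_coeffs=13):
    """Return C1..C{n_coeffs} from mel filterbank energies.

    Applies an unscaled DCT-II to the log energies and skips C0,
    which mostly reflects overall frame energy.
    """
    n = len(energies)
    # Floor tiny values so log() never sees zero (floor value is an assumption).
    log_e = [math.log(max(e, 1e-10)) for e in energies]
    coeffs = []
    for k in range(1, n_coeffs + 1):  # start at 1 to omit C0
        c = sum(log_e[m] * math.cos(math.pi * k * (m + 0.5) / n) for m in range(n))
        coeffs.append(c)
    return coeffs
```

A perfectly flat filterbank (all 26 energies equal) gives coefficients of approximately zero, because C1 and above only respond to the shape of the spectrum, not its overall level.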

Speaker Verification

Record three samples to enroll. The pattern of your voice is the biometric.
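
The sketch below shows one way enrollment and matching could work, assuming the profile is the average of the three recordings' mean MFCC vectors and that a fixed cosine-similarity threshold decides acceptance. The function names and the 0.9 threshold are hypothetical; the page does not state how it combines recordings or what cut-off it applies.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def enroll(recordings):
    """Build a profile from several recordings.

    Each recording is a list of per-frame MFCC vectors. The profile is
    the mean of the per-recording mean vectors (an assumption).
    """
    return mean_vector([mean_vector(frames) for frames in recordings])

def verify(profile, frames, threshold=0.9):
    """Return (score, accepted) for a new recording; threshold is hypothetical."""
    score = cosine_similarity(profile, mean_vector(frames))
    return score, score >= threshold
```

Cosine similarity ignores vector length, so a louder or quieter recording with the same coefficient pattern scores the same. In practice the threshold would be tuned against genuine and impostor attempts.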

How voice recognition works
🎤
1. Capture
Microphone samples audio at 44.1 kHz — speak for ~10 seconds
📊
2. Frame + FFT
Short windows (25 ms) transformed to frequency domain
🔟
3. Mel filterbank
26 triangular filters on a perceptual frequency scale
🧮
4. MFCC
DCT compresses filterbank energies to 13 cepstral coefficients
🔍
5. Match
Mean MFCC vector compared via cosine similarity to enrolled profile
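
Steps 2 and 3 above can be sketched as follows: split the signal into 25 ms frames, then place 26 filter peaks evenly along the mel scale. The 10 ms hop, the 0 Hz to 22.05 kHz range and the common 2595 * log10(1 + f / 700) mel formula are assumptions, since the page does not specify them; the FFT itself is omitted.

```python
import math

def hz_to_mel(hz):
    """Common mel-scale formula (assumed, not confirmed by the page)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def frame_signal(samples, sample_rate=44100, frame_ms=25, hop_ms=10):
    """Split samples into overlapping frames; hop_ms is an assumed value."""
    frame_len = int(sample_rate * frame_ms / 1000)
    hop = int(sample_rate * hop_ms / 1000)
    last_start = len(samples) - frame_len
    return [samples[i:i + frame_len] for i in range(0, last_start + 1, hop)]

def mel_center_frequencies(n_filters=26, f_min=0.0, f_max=22050.0):
    """Peak frequencies (Hz) of triangular filters spaced evenly in mel."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]
```

Because the peaks are evenly spaced in mel rather than in Hz, the filters sit close together at low frequencies and spread out at high frequencies, which roughly mirrors how the ear resolves pitch.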