Speaker Embedding Endpoint

Custom HuggingFace Inference Endpoint for extracting speaker embeddings using SpeechBrain's ECAPA-TDNN model.

Model

This endpoint uses the speechbrain/spkrec-ecapa-voxceleb model, which has the following characteristics:

  • 0.80% EER on VoxCeleb1 test set
  • 192-dimensional speaker embeddings

Usage

API Request

curl -X POST \
  https://your-endpoint-url.endpoints.huggingface.cloud \
  -H "Authorization: Bearer YOUR_HF_TOKEN" \
  -H "Content-Type: audio/wav" \
  --data-binary "@audio.wav"
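
The same request can be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_request` and `fetch_embedding` are hypothetical, and the URL and token are the same placeholders used in the curl command.

```python
import json
import urllib.request


def build_request(endpoint_url: str, hf_token: str, wav_bytes: bytes) -> urllib.request.Request:
    """Build a POST request equivalent to the curl command (hypothetical helper)."""
    return urllib.request.Request(
        endpoint_url,
        data=wav_bytes,  # raw WAV bytes, like curl's --data-binary
        headers={
            "Authorization": f"Bearer {hf_token}",
            "Content-Type": "audio/wav",
        },
        method="POST",
    )


def fetch_embedding(endpoint_url: str, hf_token: str, wav_path: str) -> dict:
    """Send a WAV file to the endpoint and return the decoded JSON response."""
    with open(wav_path, "rb") as f:
        request = build_request(endpoint_url, hf_token, f.read())
    # The 60-second timeout is an arbitrary choice, not an endpoint requirement.
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a deployed endpoint, `fetch_embedding("https://your-endpoint-url.endpoints.huggingface.cloud", "YOUR_HF_TOKEN", "audio.wav")` should return a dictionary with the fields described under Response.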

Response

{
  "embedding": [0.123, -0.456, ...],
  "dimension": 192,
  "model": "speechbrain/spkrec-ecapa-voxceleb"
}
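
On the client side, it can be worth checking that the returned vector has the advertised length before using it. The sketch below assumes the response has exactly the fields shown above; `parse_embedding_response` is a hypothetical helper name, not part of the endpoint.

```python
import json


def parse_embedding_response(body: str) -> list[float]:
    """Decode the JSON response body and return the embedding as floats."""
    payload = json.loads(body)
    embedding = [float(x) for x in payload["embedding"]]
    # The "dimension" field should match the actual vector length.
    if len(embedding) != payload["dimension"]:
        raise ValueError(
            f"expected {payload['dimension']} values, got {len(embedding)}"
        )
    return embedding
```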

Speaker Verification

To verify whether two audio files are from the same speaker:

  1. Extract embeddings from both audio files
  2. Calculate cosine similarity between embeddings
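  3. If the similarity exceeds the threshold (0.6), treat the two files as the same speaker

from scipy.spatial.distance import cosine

# embedding1 and embedding2 are the "embedding" arrays returned by the endpoint.
# SciPy's cosine() returns a distance, so subtract it from 1 to get the similarity.
similarity = 1 - cosine(embedding1, embedding2)
is_same_speaker = similarity > 0.6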
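
If SciPy is not available, the same comparison can be written with the standard library alone. This is a minimal sketch: the function names are hypothetical, and the 0.6 default simply mirrors the threshold stated above.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors, in the range [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def is_same_speaker(embedding1, embedding2, threshold=0.6):
    """Apply the threshold from the steps above to two embeddings."""
    return cosine_similarity(embedding1, embedding2) > threshold
```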

Project

Part of Deep Truth - AI Deepfake Voice Detection & Speaker Verification Service
