praise-ml-handler
Unified ASR + Diarization + Speaker Embedding + Speaker Matching handler for praise.global.
Forked from sergeipetrov/asrdiarization-handler.
Extensions over upstream
- Speaker embedding extraction — extracts per-speaker embeddings from pyannote's internal wespeaker model as a byproduct of diarization
- Speaker matching — matches diarized speakers against known voice profiles using cosine similarity
- Confidence tiers — HIGH (≥0.55), MEDIUM (≥0.35), LOW (<0.35) calibrated for pyannote embeddings
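The matching and tiering described above can be sketched in a few lines. This is a minimal illustration, not the handler's actual code: the function names (`cosine_similarity`, `confidence_tier`, `best_match`) and the plain-list `centroid` field are assumptions made for the example. Only the thresholds (0.55 and 0.35) come from the list above.

```python
import math

# Thresholds taken from the confidence tiers listed above.
HIGH_THRESHOLD = 0.55
MEDIUM_THRESHOLD = 0.35


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def confidence_tier(score):
    """Map a similarity score to HIGH / MEDIUM / LOW."""
    if score >= HIGH_THRESHOLD:
        return "HIGH"
    if score >= MEDIUM_THRESHOLD:
        return "MEDIUM"
    return "LOW"


def best_match(embedding, known_speakers):
    """Return the known speaker most similar to `embedding`, or None if the list is empty.

    `known_speakers` is assumed (hypothetically) to be a list of dicts with
    `slug`, `name`, and an already-decoded `centroid` list of floats.
    """
    best = None
    for speaker in known_speakers:
        score = cosine_similarity(embedding, speaker["centroid"])
        if best is None or score > best["score"]:
            best = {
                "matched_slug": speaker["slug"],
                "matched_name": speaker["name"],
                "score": score,
            }
    if best is not None:
        best["confidence"] = confidence_tier(best["score"])
    return best
```

Whether the handler reports LOW-tier matches or discards them is not stated here; the sketch simply returns the top-scoring profile with its tier.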
API
The handler accepts a standard Inference Endpoint request: POST / with inputs (base64-encoded audio) and parameters:
{
  "inputs": "<base64_audio>",
  "parameters": {
    "task": "transcribe",
    "language": "en",
    "batch_size": 24,
    "chunk_length_s": 30,
    "min_speakers": 2,
    "max_speakers": 12,
    "return_embeddings": true,
    "known_speakers": [
      {"slug": "bob-ryan", "name": "Bob Ryan", "centroid_b64": "..."}
    ]
  }
}
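A request like the one above could be assembled and sent from Python roughly as follows. This is a sketch using only the standard library. The helper names `build_payload` and `post_payload` are hypothetical, the endpoint URL and token are placeholders you would supply, and the bearer-token header follows the usual Inference Endpoints convention rather than anything specific to this handler.

```python
import base64
import json
import urllib.request


def build_payload(audio_bytes, known_speakers=None, language="en",
                  min_speakers=2, max_speakers=12):
    """Build the JSON body shown above from raw audio bytes."""
    parameters = {
        "task": "transcribe",
        "language": language,
        "batch_size": 24,
        "chunk_length_s": 30,
        "min_speakers": min_speakers,
        "max_speakers": max_speakers,
        "return_embeddings": True,
    }
    # Only include known_speakers when profiles are supplied.
    if known_speakers:
        parameters["known_speakers"] = known_speakers
    return {
        "inputs": base64.b64encode(audio_bytes).decode("ascii"),
        "parameters": parameters,
    }


def post_payload(endpoint_url, token, payload, timeout=300):
    """POST the payload to the endpoint root and return the parsed JSON response."""
    request = urllib.request.Request(
        endpoint_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Typical usage would read an audio file in binary mode, pass its bytes to `build_payload`, and hand the result to `post_payload` together with your endpoint URL and token.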
Response
{
  "text": "full transcript...",
  "chunks": [...],
  "speakers": [...],
  "speaker_embeddings": {
    "SPEAKER_00": {"embedding_b64": "...", "embedding_dim": 512, "total_seconds": 45.2, "num_segments": 12}
  },
  "speaker_matches": {
    "SPEAKER_00": {"matched_slug": "bob-ryan", "matched_name": "Bob Ryan", "confidence": "HIGH", "score": 0.72}
  }
}
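To reuse a returned embedding on the client side (for example, to build a centroid for a new voice profile), the `embedding_b64` field has to be decoded. The binary layout is not documented in this README, so the sketch below assumes the field holds raw little-endian float32 values, `embedding_dim` of them. Verify that assumption against the handler source before relying on it. The function name `decode_embedding` is hypothetical.

```python
import base64
import struct


def decode_embedding(entry):
    """Decode one `speaker_embeddings` entry into a list of floats.

    ASSUMPTION: `embedding_b64` is base64 over raw little-endian float32 values.
    """
    raw = base64.b64decode(entry["embedding_b64"])
    dim = entry["embedding_dim"]
    # A length mismatch suggests the float32 assumption is wrong.
    if len(raw) != 4 * dim:
        raise ValueError(
            f"expected {4 * dim} bytes for {dim} float32 values, got {len(raw)}"
        )
    return list(struct.unpack(f"<{dim}f", raw))
```

The length check makes a wrong layout assumption fail loudly instead of silently producing meaningless vectors.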
Deployment
Create the endpoint via the HF Inference Endpoints API with the following environment variables:
ASR_MODEL=openai/whisper-large-v3
DIARIZATION_MODEL=pyannote/speaker-diarization-3.1
HF_TOKEN=<your_token>
ASSISTANT_MODEL=distil-whisper/distil-large-v3 (optional, for speculative decoding)
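When scripting the deployment, it can help to assemble these variables in one place before passing them to whatever client creates the endpoint. The helper below is a hypothetical sketch (`build_endpoint_env` is not part of this repository). It treats `HF_TOKEN` as required, on the assumption that the diarization model needs an authenticated download, and adds `ASSISTANT_MODEL` only when one is given.

```python
def build_endpoint_env(hf_token, assistant_model=None):
    """Return the environment variables listed above as a dict.

    `assistant_model` is optional and enables speculative decoding when set.
    """
    if not hf_token:
        raise ValueError("HF_TOKEN is required")
    env = {
        "ASR_MODEL": "openai/whisper-large-v3",
        "DIARIZATION_MODEL": "pyannote/speaker-diarization-3.1",
        "HF_TOKEN": hf_token,
    }
    # Leave ASSISTANT_MODEL out entirely when speculative decoding is not wanted.
    if assistant_model:
        env["ASSISTANT_MODEL"] = assistant_model
    return env
```

For example, `build_endpoint_env(token, "distil-whisper/distil-large-v3")` yields all four variables, while omitting the second argument yields only the first three.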