title: Speechlib API
emoji: π€
colorFrom: blue
colorTo: purple
sdk: docker
app_file: app.py
pinned: false
Speechlib REST API (ECAPA-TDNN)
REST API for speaker diarization, speaker identification, and speech-to-text (STT).
Features
- Speaker diarization: separates multiple speakers with pyannote/speaker-diarization-3.1
- Speaker identification: identifies enrolled speakers with speechbrain ECAPA-TDNN (high precision)
- Speech recognition: STT with faster-whisper (large-v3-turbo)
API Endpoints
GET /
Check API status
GET /health
Health check
POST /transcribe
Simple STT + speaker diarization (no speaker identification)
Parameters (multipart/form-data):
- audio: audio file (required)
- language: language code (default: ko)
- hf_token: HuggingFace token (required)
POST /process
Full pipeline: speaker diarization + speaker identification + STT
Parameters (multipart/form-data):
- audio: audio file to analyze (required)
- voice_sample: voice sample file of the speaker to identify (optional)
- speaker_name: name of the speaker to identify (default: speaker)
- language: language code (default: ko)
- hf_token: HuggingFace token (required)
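The parameter list above can be checked on the client before anything is uploaded. The following is a minimal sketch, not part of the API: `build_process_request` and `SUPPORTED_EXTENSIONS` are hypothetical names, the defaults mirror the ones documented for `/process`, and the extension set is taken from the supported formats listed under Notes. It assumes the same format restriction applies to `voice_sample`, which the README does not state.

```python
from pathlib import Path

# Extensions taken from the "supported audio formats" note in this README.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".m4a", ".ogg", ".flac", ".aac"}


def build_process_request(audio_path, hf_token, voice_sample_path=None,
                          speaker_name="speaker", language="ko"):
    """Hypothetical helper: validate inputs and return (data, file_paths) for POST /process."""
    if not hf_token:
        raise ValueError("hf_token is required")

    # Map multipart field names to local file paths; voice_sample is optional.
    file_paths = {"audio": audio_path}
    if voice_sample_path is not None:
        file_paths["voice_sample"] = voice_sample_path

    # Assumption: both files must use one of the documented audio formats.
    for field, path in file_paths.items():
        if Path(path).suffix.lower() not in SUPPORTED_EXTENSIONS:
            raise ValueError(f"{field}: unsupported audio format: {path}")

    data = {"speaker_name": speaker_name, "language": language, "hf_token": hf_token}
    return data, file_paths
```

The returned `data` dict can be passed as form fields and each path in `file_paths` opened in binary mode and sent under its field name.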
Usage Example
cURL
# Simple STT
curl -X POST "https://YOUR_SPACE.hf.space/transcribe" \
  -F "audio=@audio.wav" \
  -F "language=ko" \
  -F "hf_token=hf_YOUR_TOKEN"

# With speaker identification
curl -X POST "https://YOUR_SPACE.hf.space/process" \
  -F "audio=@conversation.wav" \
  -F "voice_sample=@speaker_sample.wav" \
  -F "speaker_name=홍길동" \
  -F "language=ko" \
  -F "hf_token=hf_YOUR_TOKEN"
Python
import requests

# Simple STT
with open("audio.wav", "rb") as audio:
    response = requests.post(
        "https://YOUR_SPACE.hf.space/transcribe",
        files={"audio": audio},
        data={"language": "ko", "hf_token": "hf_YOUR_TOKEN"},
    )
print(response.json())

# With speaker identification
with open("conversation.wav", "rb") as audio, open("speaker_sample.wav", "rb") as sample:
    response = requests.post(
        "https://YOUR_SPACE.hf.space/process",
        files={"audio": audio, "voice_sample": sample},
        data={
            "speaker_name": "홍길동",
            "language": "ko",
            "hf_token": "hf_YOUR_TOKEN",
        },
    )
print(response.json())
JavaScript/Node.js
const FormData = require('form-data');
const fs = require('fs');
const axios = require('axios');

// Top-level await is not available in CommonJS modules, so wrap the call.
async function main() {
  const form = new FormData();
  form.append('audio', fs.createReadStream('audio.wav'));
  form.append('language', 'ko');
  form.append('hf_token', 'hf_YOUR_TOKEN');

  const response = await axios.post(
    'https://YOUR_SPACE.hf.space/transcribe',
    form,
    { headers: form.getHeaders() }
  );
  console.log(response.data);
}

main().catch(console.error);
Response Format
{
  "success": true,
  "segments": [
    {
      "start": 0.0,
      "end": 2.5,
      "text": "안녕하세요",
      "speaker": "홍길동",
      "similarity": 85.3
    }
  ],
  "speaker_stats": {
    "홍길동": {
      "count": 10,
      "duration": 45.5
    }
  },
  "total_segments": 20
}
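A response of this shape can be turned into a readable transcript and per-speaker totals on the client. The sketch below is illustrative only: `format_segment` and `speaker_totals` are hypothetical helpers, the sample payload is made up, and the `"unknown"` label for an unmatched speaker is an assumption rather than documented behavior. Whether the server computes `speaker_stats` the same way is also not stated here.

```python
from collections import defaultdict

# Illustrative payload shaped like the documented response (values are made up).
sample = {
    "success": True,
    "segments": [
        {"start": 0.0, "end": 2.5, "text": "hello", "speaker": "speaker", "similarity": 85.3},
        # "unknown" is a hypothetical label for a segment with no matched speaker.
        {"start": 2.5, "end": 4.0, "text": "hi", "speaker": "unknown"},
    ],
}


def format_segment(seg):
    """Render one segment as '[start-end] speaker: text'."""
    return f"[{seg['start']:.1f}-{seg['end']:.1f}] {seg.get('speaker', '?')}: {seg['text']}"


def speaker_totals(segments):
    """Recompute segment count and total speaking time per speaker."""
    totals = defaultdict(lambda: {"count": 0, "duration": 0.0})
    for seg in segments:
        entry = totals[seg.get("speaker", "?")]
        entry["count"] += 1
        entry["duration"] += seg["end"] - seg["start"]
    return dict(totals)


if sample["success"]:
    for seg in sample["segments"]:
        print(format_segment(seg))
    print(speaker_totals(sample["segments"]))
```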
Notes
- ECAPA-TDNN matches a speaker when the similarity is at or above the 25% threshold
- The GPU is used automatically when one is available
- Supported audio formats: wav, mp3, m4a, ogg, flac, aac
- API docs: https://YOUR_SPACE.hf.space/docs
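The 25% matching rule in the notes can be illustrated with a small sketch. It assumes the similarity is a cosine similarity between speaker embeddings, scaled to a percentage. That is a common choice with ECAPA-TDNN embeddings, but this README does not confirm the metric. `cosine_similarity_percent` and `match_speaker` are hypothetical names, and the embeddings here are toy vectors rather than real model output.

```python
import math

SIMILARITY_THRESHOLD = 25.0  # percent, as stated in the notes


def cosine_similarity_percent(a, b):
    """Cosine similarity of two vectors, scaled to a percentage (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 0.0
    return 100.0 * dot / norm


def match_speaker(segment_emb, enrolled_emb, speaker_name="speaker",
                  threshold=SIMILARITY_THRESHOLD):
    """Return (speaker_name or None, score): a match requires score >= threshold."""
    score = cosine_similarity_percent(segment_emb, enrolled_emb)
    return (speaker_name if score >= threshold else None), score
```

With this rule, a segment whose embedding points in nearly the same direction as the enrolled sample is labeled with `speaker_name`, and anything scoring below the threshold is left unmatched.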