title: Reachy SpeechBrain API
emoji: π€
colorFrom: blue
colorTo: purple
sdk: docker
pinned: false
Reachy SpeechBrain API
FastAPI-based Speaker Recognition API for Reachy robots.
Reachy SpeechBrain API is a lightweight speaker recognition service built with FastAPI and SpeechBrain, designed to run on Hugging Face Spaces (Docker) or locally, and to be easily integrated with Reachy robots or any backend.
Features
- Speaker recognition powered by SpeechBrain ECAPA-TDNN
- Speaker enrollment, identification, and verification
- FastAPI HTTP API (simple & stateless)
- Hugging Face Docker Space compatible
- CPU-friendly speaker embeddings
- Ready to integrate with Reachy Mini
- Dependency management with uv
- Flexible storage: local or Hugging Face Hub dataset
API Endpoints
Health check
GET /health
Response:
{ "status": "ok" }
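A Space that has gone to sleep can take a while to answer, so a client may want to poll the health check before sending audio. The following is a minimal sketch using only the Python standard library; the helper names `is_healthy` and `wait_until_healthy`, and the timeout values, are illustrative assumptions and not part of this project.

```python
import json
import time
import urllib.error
import urllib.request


def is_healthy(payload: bytes) -> bool:
    """Return True only if the body is a JSON object with status == "ok"."""
    try:
        return json.loads(payload).get("status") == "ok"
    except (ValueError, AttributeError):
        # Not JSON (for example an HTML error page) or not a JSON object.
        return False


def wait_until_healthy(base_url: str, timeout_s: float = 60.0, interval_s: float = 2.0) -> bool:
    """Poll GET /health until it reports ok or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url.rstrip('/')}/health", timeout=5) as resp:
                if is_healthy(resp.read()):
                    return True
        except (urllib.error.URLError, OSError):
            pass  # Service not reachable yet; retry after a short pause.
        time.sleep(interval_s)
    return False
```

For a local run this could be called as `wait_until_healthy("http://localhost:7860")`.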
List speakers
GET /speakers
Response:
{
"speakers": ["alice", "bob", "charlie"]
}
Enroll a speaker
POST /speakers/{name}/enroll
Request
- Content type: multipart/form-data
- Field: file (audio file: WAV, MP3, FLAC, etc.)
Example
curl -X POST \
-F "file=@voice_sample.wav" \
http://localhost:7860/speakers/alice/enroll
Response
{
"message": "Speaker 'alice' enrolled successfully",
"embedding_size": 192
}
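The same enrollment request can be issued from Python. This is a sketch under stated assumptions: it uses only the standard library and builds the multipart body by hand, and the helper names `encode_multipart` and `enroll_speaker` are hypothetical. A client library such as `requests` would make this shorter.

```python
import json
import os
import urllib.parse
import urllib.request
import uuid


def encode_multipart(field, filename, data, content_type="application/octet-stream"):
    """Encode one file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def enroll_speaker(base_url, name, audio_path):
    """POST an audio file to /speakers/{name}/enroll and return the decoded JSON reply."""
    with open(audio_path, "rb") as fh:
        body, content_type = encode_multipart("file", os.path.basename(audio_path), fh.read())
    # Percent-encode the name so spaces or slashes cannot alter the URL path.
    url = f"{base_url.rstrip('/')}/speakers/{urllib.parse.quote(name, safe='')}/enroll"
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

With the API running locally, `enroll_speaker("http://localhost:7860", "alice", "voice_sample.wav")` mirrors the curl command above.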
Delete a speaker
DELETE /speakers/{name}
Example
curl -X DELETE http://localhost:7860/speakers/alice
Response
{
"message": "Speaker 'alice' deleted successfully"
}
Identify speaker
POST /identify
Identifies who is speaking from the enrolled speakers.
Request
- Content type: multipart/form-data
- Field: file (audio file)
Example
curl -X POST \
-F "file=@unknown_voice.wav" \
http://localhost:7860/identify
Response
{
"identified": true,
"speaker": "alice",
"confidence": 0.85,
"threshold": 0.25
}
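To make the `confidence` and `threshold` fields easier to reason about, here is a sketch of how identification over stored embeddings could work. It assumes the score is a cosine similarity between the query embedding and each enrolled embedding, with the best match accepted only if it reaches the threshold. This README does not state the scoring method, so treat the function below (and its name `identify_from_embeddings`) as an illustration and not as the code in app.py.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify_from_embeddings(query, enrolled, threshold=0.25):
    """Pick the enrolled speaker closest to the query embedding.

    `enrolled` maps speaker name -> embedding (list of floats).
    The result mimics the shape of the /identify response.
    """
    best_name, best_score = None, float("-inf")
    for name, reference in enrolled.items():
        score = cosine_similarity(query, reference)
        if score > best_score:
            best_name, best_score = name, score

    identified = best_name is not None and best_score >= threshold
    return {
        "identified": identified,
        "speaker": best_name if identified else None,
        "confidence": best_score if best_name is not None else 0.0,
        "threshold": threshold,
    }
```

Real embeddings would have 192 dimensions, as reported by the enroll response; two-dimensional toy vectors such as `{"alice": [1.0, 0.0], "bob": [0.0, 1.0]}` are enough to try the function.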
Verify speaker
POST /verify?name={speaker_name}
Verifies if the audio matches a specific speaker.
Request
- Query param: name (speaker name to verify against)
- Content type: multipart/form-data
- Field: file (audio file)
Example
curl -X POST \
-F "file=@voice.wav" \
"http://localhost:7860/verify?name=alice"
Response
{
"verified": true,
"speaker": "alice",
"confidence": 0.92,
"threshold": 0.25
}
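On the client side, two small details are worth handling: the speaker name must be URL-encoded in the query string, and a caller may want a stricter bar than the server threshold before acting on a result. The sketch below shows both. The helper names and the `min_confidence` parameter are hypothetical and exist only for illustration.

```python
from urllib.parse import urlencode


def verify_url(base_url, name):
    """Build the /verify URL, encoding the speaker name as a query parameter."""
    return f"{base_url.rstrip('/')}/verify?{urlencode({'name': name})}"


def accept_verification(response, expected, min_confidence=None):
    """Decide whether a /verify response is good enough for the caller.

    `response` is the decoded JSON body. If `min_confidence` is None,
    the threshold reported by the server is used as the floor.
    """
    if not response.get("verified") or response.get("speaker") != expected:
        return False
    floor = response.get("threshold", 0.0) if min_confidence is None else min_confidence
    return response.get("confidence", 0.0) >= floor
```

A voice-command gate could, for example, call `accept_verification(result, "alice", min_confidence=0.5)` so that borderline matches are rejected.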
Deployment (Hugging Face Space)
Recommended setup:
- Space type: Docker
- Hardware: CPU (default) or GPU
- Exposed port: 7860
Repository structure
reachy-speechbrain-api/
├── app.py              # FastAPI application
├── storage.py          # Storage backends (local & Hugging Face)
├── Dockerfile          # Docker image definition
├── pyproject.toml      # Project configuration and dependencies
├── uv.lock             # Lockfile for reproducible builds
├── .gitignore          # Git ignore rules
├── speakers/           # Speaker embeddings storage (created at runtime)
├── tests/              # Test suite
│   ├── __init__.py
│   ├── conftest.py     # Pytest fixtures
│   ├── test_api.py     # API tests
│   └── test_storage.py # Storage tests
└── README.md
Once pushed, the Space will automatically build and expose:
https://<username>-<space-name>.hf.space
Docker (local run)
docker build -t reachy-speechbrain-api .
docker run -p 7860:7860 reachy-speechbrain-api
Storage Configuration
Speaker embeddings can be stored locally or on Hugging Face Hub.
Local storage (default)
By default, embeddings are stored in speakers/embeddings.json. No configuration needed.
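As an illustration of what a local JSON backend can look like, the sketch below keeps a mapping of speaker name to embedding in a single file and writes it atomically. The file layout (`{"alice": [0.12, ...]}`) and the class name `LocalEmbeddingStore` are assumptions made for this example; the actual format used by storage.py may differ.

```python
import json
import os
import tempfile
from pathlib import Path


class LocalEmbeddingStore:
    """Hypothetical JSON-file store mapping speaker name -> embedding list."""

    def __init__(self, path="speakers/embeddings.json"):
        self.path = Path(path)

    def load(self):
        """Return all stored embeddings, or an empty dict if the file is missing."""
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text(encoding="utf-8"))

    def save(self, embeddings):
        """Write to a temp file first, then rename, so readers never see a partial file."""
        self.path.parent.mkdir(parents=True, exist_ok=True)
        fd, tmp_name = tempfile.mkstemp(dir=self.path.parent, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(embeddings, fh)
        os.replace(tmp_name, self.path)

    def enroll(self, name, embedding):
        """Add or overwrite one speaker's embedding."""
        embeddings = self.load()
        embeddings[name] = list(embedding)
        self.save(embeddings)

    def delete(self, name):
        """Remove a speaker; returns False if the name was not enrolled."""
        embeddings = self.load()
        if name not in embeddings:
            return False
        del embeddings[name]
        self.save(embeddings)
        return True
```

Writing through a temporary file and `os.replace` is a design choice that protects the store if the process is stopped mid-write, which matters on a Space that can be restarted at any time.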
Hugging Face Hub storage
To persist embeddings in a Hugging Face dataset (useful for sharing between instances):
# Set environment variables
export HF_EMBEDDINGS_REPO="username/my-speaker-embeddings"
export HF_TOKEN="hf_xxxxxxxxxxxxx" # Optional if logged in via `huggingface-cli login`
# Run the API
uv run uvicorn app:app --host 0.0.0.0 --port 7860
Or in Docker:
docker run -p 7860:7860 \
-e HF_EMBEDDINGS_REPO="username/my-speaker-embeddings" \
-e HF_TOKEN="hf_xxxxxxxxxxxxx" \
reachy-speechbrain-api
The dataset will be created automatically (as private) if it doesn't exist.
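The choice between the two backends follows from the environment variables above. A minimal sketch of that selection logic, assuming the backend is chosen purely by whether `HF_EMBEDDINGS_REPO` is set, could look like this. The `StorageConfig` type and the function name are hypothetical and are not taken from storage.py.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class StorageConfig:
    backend: str                    # "local" or "hub"
    repo_id: Optional[str] = None   # dataset repo, e.g. "username/my-speaker-embeddings"
    token: Optional[str] = None     # None means: rely on a cached CLI login


def storage_config_from_env(env: Mapping[str, str] = os.environ) -> StorageConfig:
    """Select the storage backend from environment variables.

    Assumption: an empty or missing HF_EMBEDDINGS_REPO means local storage.
    """
    repo_id = env.get("HF_EMBEDDINGS_REPO", "").strip()
    if not repo_id:
        return StorageConfig(backend="local")
    # HF_TOKEN is optional; an empty string is treated as "not provided".
    return StorageConfig(backend="hub", repo_id=repo_id, token=env.get("HF_TOKEN") or None)
```

Passing a plain dict instead of `os.environ` makes the function easy to exercise in tests without touching the real environment.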
Dependencies
Dependencies are managed using uv.
Main dependencies:
- fastapi
- uvicorn
- speechbrain (develop branch)
- torchaudio
- python-multipart
- requests
- huggingface-hub
The lockfile (uv.lock) ensures reproducible builds.
Development
Install dev dependencies:
uv sync --extra dev
Tools
- ruff - Linter and formatter
- mypy - Static type checker
- pytest - Testing framework
- pytest-cov - Code coverage
Run tests
uv run pytest
A coverage report is generated in htmlcov/ and also printed in the terminal.
Lint and format
uv run ruff check .
uv run ruff format .
Type checking
uv run mypy .
Release workflow
This project uses commitizen for versioning and changelog generation.
To trigger a new release, push a commit to main whose message is exactly "chore: release a new version":
git commit --allow-empty -m "chore: release a new version"
git push origin main
This will:
- Bump the version based on conventional commits
- Generate/update the CHANGELOG
- Create a GitHub Release
- Sync to Hugging Face Space
Usage with Reachy
This API is designed to be called from:
- Reachy Mini
- A central VPS backend
- Another Hugging Face Space
Typical flow:
- Enrollment: Record voice samples from known users and enroll them
- Identification: When someone speaks, send audio to /identify to know who it is
- Verification: Use /verify to confirm a claimed identity
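The enrollment, identification, and verification steps can be sketched as three thin wrappers around the endpoints. To keep the example independent of any HTTP library, the upload itself is delegated to a caller-supplied `post(path, audio_bytes)` function that returns the decoded JSON response. That callable and the wrapper names are assumptions made for this sketch.

```python
from typing import Callable, Dict, Optional
from urllib.parse import quote

# Hypothetical transport: uploads audio bytes as the multipart "file" field
# to the given path and returns the decoded JSON body.
Post = Callable[[str, bytes], dict]


def enroll_all(post: Post, samples: Dict[str, bytes]) -> None:
    """Enrollment: send one recorded sample per known user."""
    for name, audio in samples.items():
        post(f"/speakers/{quote(name, safe='')}/enroll", audio)


def identify_speaker(post: Post, audio: bytes) -> Optional[str]:
    """Identification: return the speaker name, or None if nobody matched."""
    result = post("/identify", audio)
    return result.get("speaker") if result.get("identified") else None


def verify_speaker(post: Post, claimed: str, audio: bytes) -> bool:
    """Verification: check the audio against one claimed identity."""
    result = post(f"/verify?name={quote(claimed, safe='')}", audio)
    return bool(result.get("verified"))
```

Because the transport is injected, the same functions can run on a robot, on a backend server, or in tests with a fake `post` that returns canned responses.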
Use cases:
- Personalized interactions based on who is speaking
- Access control for voice commands
- Multi-user conversation tracking
License
This project is licensed under the MIT License - see the LICENSE file for details.