TotTalk Cry Eval
Real-time multi-model baby cry classification tool. Available as a CLI (terminal with live mic) and a Gradio web app (browser-based, deployable for free).
Models
| # | Name | Type | Source | Speed |
|---|---|---|---|---|
| 1 | foduucom-SVC | sklearn SVC, 194-dim MFCC features | HuggingFace | < 1 ms |
| 2 | DistilHuBERT | DistilHuBERT fine-tune (5 classes) | HuggingFace | ~35 ms |
| 3 | Kibalama-9c | Wav2Vec2 fine-tune (9 classes incl. discomfort, tired, cold/hot) | HuggingFace | ~90 ms |
| 4 | YAMNet-detector | TF Hub YAMNet (binary cry gate) | TF Hub | < 10 ms |
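The "binary cry gate" role listed for YAMNet-detector can be pictured as a cheap check that runs before the slower classifiers. The following is a minimal sketch of that idea, not the project's code: `gate_predictions`, the `0.5` threshold, and the stand-in detector and classifier are all hypothetical.

```python
from typing import Callable, Dict, Optional, Sequence

# Hypothetical types: a detector returns a cry score in [0, 1];
# a classifier returns a label for the audio chunk.
Detector = Callable[[Sequence[float]], float]
Classifier = Callable[[Sequence[float]], str]


def gate_predictions(
    chunk: Sequence[float],
    detector: Detector,
    classifiers: Dict[str, Classifier],
    threshold: float = 0.5,  # assumed value, not taken from the project
) -> Optional[Dict[str, str]]:
    """Run the classifiers only when the detector score reaches the threshold."""
    if detector(chunk) < threshold:
        return None  # gate closed: skip the slower models
    return {name: clf(chunk) for name, clf in classifiers.items()}


def peak_detector(chunk: Sequence[float]) -> float:
    """Stand-in detector: fires on any sample louder than 0.1."""
    return 1.0 if max(abs(s) for s in chunk) > 0.1 else 0.0


# Stand-in classifier that always returns a placeholder label.
stand_in_classifiers = {"svc": lambda chunk: "placeholder-label"}

quiet = gate_predictions([0.0, 0.01], peak_detector, stand_in_classifiers)
loud = gate_predictions([0.0, 0.8], peak_detector, stand_in_classifiers)
```

Here `quiet` is `None` because the gate stays closed, while `loud` holds one label per classifier. Passing `--no-yamnet-gate` on the CLI presumably corresponds to skipping this check.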
Web app (Gradio)
cd cry-eval
uv sync
uv run python app.py
Open http://localhost:7860, then record audio from your mic or upload a file.
Deploy for free on HuggingFace Spaces
- Go to huggingface.co/new-space
- Select Gradio → Blank, CPU Basic (free), Public visibility
- Create the Space, then push:
cp README.md README_GITHUB.md
cp README_HF.md README.md
git remote add hf https://huggingface.co/spaces/YOUR_USERNAME/cry-eval
git add -A && git commit -m "Configure for HF Spaces"
git push hf main
- Deploys automatically (~5 min first build)
CLI (terminal)
# Run with mic input
uv run python main.py
# Run with an audio file
uv run python main.py --file path/to/cry.wav
# Select specific models
uv run python main.py --models svc,hubert,kibalama
# Disable YAMNet gating
uv run python main.py --no-yamnet-gate
# Save predictions to JSONL
uv run python main.py --save-log results.jsonl
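A log written with `--save-log` can be inspected with a few lines of Python. This sketch assumes only that the file is JSON Lines (one JSON object per line). The record fields are not documented here, so the `model` and `label` keys below are hypothetical placeholders, and the in-memory sample stands in for a real log.

```python
import io
import json
from collections import Counter
from typing import Iterable, Iterator


def read_jsonl(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# In-memory sample standing in for results.jsonl; field names are assumed.
sample = io.StringIO(
    '{"model": "svc", "label": "a"}\n'
    '{"model": "hubert", "label": "a"}\n'
    "\n"
    '{"model": "svc", "label": "b"}\n'
)

records = list(read_jsonl(sample))
# Count how many records each (hypothetical) model contributed.
per_model = Counter(r["model"] for r in records)
```

For a real file, replace the sample with `open("results.jsonl", encoding="utf-8")` and adjust the keys to whatever the log actually contains.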
Requirements
- Python ≥ 3.11
- A working microphone (for live mode)
- ~1 GB RAM for transformer models
Model weights are auto-downloaded on first run into HuggingFace/TF Hub caches.
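To see where those caches are likely to live on a given machine, the sketch below resolves the usual locations from environment variables. It reflects the commonly documented defaults of `huggingface_hub` (`HF_HUB_CACHE`, `HF_HOME`, `XDG_CACHE_HOME`) and `tensorflow_hub` (`TFHUB_CACHE_DIR`, else a `tfhub_modules` folder in the temp directory). The helper names are hypothetical, defaults can differ between library versions, and this project may override them.

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping


def hf_hub_cache(env: Mapping[str, str] = os.environ) -> Path:
    """Resolve the Hugging Face Hub cache directory (assumed defaults)."""
    if "HF_HUB_CACHE" in env:
        return Path(env["HF_HUB_CACHE"]).expanduser()
    xdg_cache = env.get("XDG_CACHE_HOME", "~/.cache")
    hf_home = env.get("HF_HOME", os.path.join(xdg_cache, "huggingface"))
    return Path(hf_home).expanduser() / "hub"


def tfhub_cache(env: Mapping[str, str] = os.environ) -> Path:
    """Resolve the TF Hub cache directory (assumed defaults)."""
    if "TFHUB_CACHE_DIR" in env:
        return Path(env["TFHUB_CACHE_DIR"]).expanduser()
    return Path(tempfile.gettempdir()) / "tfhub_modules"


if __name__ == "__main__":
    print("Hugging Face Hub cache:", hf_hub_cache())
    print("TF Hub cache:", tfhub_cache())
```

Setting `HF_HOME` or `TFHUB_CACHE_DIR` before the first run is one way to keep the downloads on a disk with enough space.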
Project structure
cry-eval/
├── pyproject.toml
├── requirements.txt        # for HF Spaces / pip deployments
├── README.md
├── README_HF.md            # HuggingFace Spaces metadata
├── app.py                  # Gradio web UI
├── main.py                 # CLI entrypoint
├── models/
│   ├── base.py             # abstract CryClassifier + CryPrediction
│   ├── foduucom_svc.py     # sklearn SVC
│   ├── wiam_wav2vec2.py    # DistilHuBERT fine-tune
│   ├── kibalama.py         # Wav2Vec2 9-class fine-tune
│   ├── yamnet.py           # YAMNet binary detector
│   └── ensemble.py         # orchestrates all models
├── audio/
│   ├── capture.py          # MicCapture + FileCapture
│   └── preprocess.py       # MFCC, mel, resample, RMS
├── display/
│   └── table.py            # Rich live table renderer
└── weights/                # auto-downloaded (gitignored)
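The split between `models/base.py` (abstract `CryClassifier` plus `CryPrediction`) and `models/ensemble.py` suggests a common interface that every model implements. The sketch below shows one plausible shape for that interface. Only the two class names come from the tree above; the fields, the `predict` signature, `ConstantClassifier`, and `run_ensemble` are assumptions for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class CryPrediction:
    # Field names are assumed, not taken from the project.
    model: str
    label: str
    confidence: float


class CryClassifier(ABC):
    """Common interface each model wrapper would implement."""

    name: str = "base"

    @abstractmethod
    def predict(self, audio: Sequence[float], sample_rate: int) -> CryPrediction:
        """Classify one audio chunk."""


class ConstantClassifier(CryClassifier):
    """Toy stand-in that always returns the same label."""

    def __init__(self, name: str, label: str) -> None:
        self.name = name
        self.label = label

    def predict(self, audio: Sequence[float], sample_rate: int) -> CryPrediction:
        return CryPrediction(model=self.name, label=self.label, confidence=1.0)


def run_ensemble(
    models: Sequence[CryClassifier], audio: Sequence[float], sample_rate: int
) -> List[CryPrediction]:
    """Collect one prediction per model, in the order given."""
    return [m.predict(audio, sample_rate) for m in models]


preds = run_ensemble(
    [ConstantClassifier("a", "x"), ConstantClassifier("b", "y")],
    audio=[0.0] * 16000,
    sample_rate=16000,
)
```

With an interface like this, adding a model means writing one new subclass, and the table renderer only needs to understand `CryPrediction`.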