🍼 TotTalk Cry Eval

A real-time, multi-model baby cry classification tool, available as a CLI (terminal, live microphone input) and as a Gradio web app (browser-based, deployable for free).

Models

| # | Name | Type | Source | Speed |
|---|------|------|--------|-------|
| 1 | foduucom-SVC | sklearn SVC, 194-dim MFCC features | HuggingFace | < 1 ms |
| 2 | DistilHuBERT | DistilHuBERT fine-tune (5 classes) | HuggingFace | ~35 ms |
| 3 | Kibalama-9c | Wav2Vec2 fine-tune (9 classes incl. discomfort, tired, cold/hot) | HuggingFace | ~90 ms |
| 4 | YAMNet-detector | TF Hub YAMNet (binary cry gate) | TF Hub | < 10 ms |

Web app (Gradio)

cd cry-eval
uv sync
uv run python app.py

Open http://localhost:7860, then record audio from your microphone or upload a file.

Deploy for free on HuggingFace Spaces

1. Go to huggingface.co/new-space.
2. Select Gradio → Blank, CPU Basic (free), and Public visibility.
3. Create the Space, then push:

cp README.md README_GITHUB.md
cp README_HF.md README.md
git remote add hf https://huggingface.co/spaces/YOUR_USERNAME/cry-eval
git add -A && git commit -m "Configure for HF Spaces"
git push hf main

The Space then builds and deploys automatically (about 5 minutes for the first build).

CLI (terminal)

# Run with mic input
uv run python main.py

# Run with an audio file
uv run python main.py --file path/to/cry.wav

# Select specific models
uv run python main.py --models svc,hubert,kibalama

# Disable YAMNet gating
uv run python main.py --no-yamnet-gate

# Save predictions to JSONL
uv run python main.py --save-log results.jsonl
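
The file written by --save-log can be inspected with a few lines of Python. The sketch below assumes only the JSONL convention (one JSON object per line); the log's field names are not documented here, so the code discovers them rather than guessing. `load_jsonl` and `field_counts` are hypothetical helper names, not part of the project.

```python
import json
from collections import Counter
from pathlib import Path


def load_jsonl(path):
    """Return one parsed JSON value per non-empty line of the file."""
    records = []
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # tolerate blank lines, e.g. a trailing newline
                records.append(json.loads(line))
    return records


def field_counts(records):
    """Count how often each top-level key appears, to discover the schema."""
    counts = Counter()
    for record in records:
        if isinstance(record, dict):
            counts.update(record.keys())
    return counts
```

For example, `field_counts(load_jsonl("results.jsonl"))` lists the keys present in the log and how many records carry each one.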

Requirements

Python ≥ 3.11
A working microphone (for live mode)
~1 GB RAM for transformer models

Model weights are auto-downloaded on first run into HuggingFace/TF Hub caches.
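
To find those caches (for instance to free disk space or to pre-populate them), the following sketch resolves their default locations with the standard library only. It assumes the usual defaults of the `huggingface_hub` and `tensorflow_hub` libraries: `HF_HUB_CACHE`, then `HF_HOME` (falling back to `~/.cache/huggingface`), and `TFHUB_CACHE_DIR` (falling back to a `tfhub_modules` folder in the system temp directory). The function names are hypothetical, and whether this project overrides those defaults is not stated here.

```python
import os
import tempfile
from pathlib import Path


def hf_hub_cache_dir(env=None):
    """Resolve the Hugging Face Hub cache directory from environment variables."""
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    xdg = env.get("XDG_CACHE_HOME") or "~/.cache"
    hf_home = env.get("HF_HOME") or os.path.join(xdg, "huggingface")
    return Path(hf_home).expanduser() / "hub"


def tfhub_cache_dir(env=None):
    """Resolve the TF Hub cache directory (assumed default: temp dir)."""
    env = os.environ if env is None else env
    if env.get("TFHUB_CACHE_DIR"):
        return Path(env["TFHUB_CACHE_DIR"]).expanduser()
    return Path(tempfile.gettempdir()) / "tfhub_modules"


def dir_size_mb(path):
    """Total size of regular files under path in MiB (0.0 if it is missing)."""
    path = Path(path)
    if not path.is_dir():
        return 0.0
    # Skip symlinks so linked snapshot files are not counted twice.
    total = sum(
        f.stat().st_size
        for f in path.rglob("*")
        if not f.is_symlink() and f.is_file()
    )
    return total / (1024 * 1024)
```

Calling `dir_size_mb(hf_hub_cache_dir())` after the first run gives a rough idea of how much disk space the downloaded weights occupy.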

Project structure

cry-eval/
├── pyproject.toml
├── requirements.txt       # for HF Spaces / pip deployments
├── README.md
├── README_HF.md           # HuggingFace Spaces metadata
├── app.py                 # Gradio web UI
├── main.py                # CLI entrypoint
├── models/
│   ├── base.py           # abstract CryClassifier + CryPrediction
│   ├── foduucom_svc.py   # sklearn SVC
│   ├── wiam_wav2vec2.py  # DistilHuBERT fine-tune
│   ├── kibalama.py       # Wav2Vec2 9-class fine-tune
│   ├── yamnet.py         # YAMNet binary detector
│   └── ensemble.py       # orchestrates all models
├── audio/
│   ├── capture.py        # MicCapture + FileCapture
│   └── preprocess.py     # MFCC, mel, resample, RMS
├── display/
│   └── table.py          # Rich live table renderer
└── weights/              # auto-downloaded (gitignored)
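
As a rough illustration of how the pieces in models/ could fit together, here is a minimal sketch of an abstract classifier, a prediction record, and an ensemble with an optional binary cry gate. Only the names `CryClassifier` and `CryPrediction` come from the tree above; every field, method signature, the `Ensemble` and `ConstantClassifier` classes, the "cry" label, and the 0.5 threshold are assumptions made for illustration and may differ from the actual code in base.py and ensemble.py.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class CryPrediction:
    # Field names are assumptions for illustration, not the project's schema.
    model: str
    label: str
    confidence: float


class CryClassifier(ABC):
    """Common interface: one prediction per audio window."""

    name = "base"

    @abstractmethod
    def predict(self, samples, sample_rate):
        """Return a CryPrediction for one audio window."""


class ConstantClassifier(CryClassifier):
    """Stand-in model that always returns the same prediction (demo only)."""

    def __init__(self, name, label, confidence):
        self.name = name
        self._label = label
        self._confidence = confidence

    def predict(self, samples, sample_rate):
        return CryPrediction(self.name, self._label, self._confidence)


class Ensemble:
    """Runs every classifier on a window, optionally behind a binary gate."""

    def __init__(self, classifiers, gate=None, gate_threshold=0.5):
        self.classifiers = list(classifiers)
        self.gate = gate  # None corresponds to running without gating
        self.gate_threshold = gate_threshold

    def predict(self, samples, sample_rate):
        if self.gate is not None:
            detection = self.gate.predict(samples, sample_rate)
            is_cry = (
                detection.label == "cry"
                and detection.confidence >= self.gate_threshold
            )
            if not is_cry:
                return []  # gate closed: skip the slower classifiers
        return [clf.predict(samples, sample_rate) for clf in self.classifiers]
```

Putting a cheap detector in front of the slower models means the transformer classifiers only run on windows that plausibly contain a cry; passing `gate=None` in this sketch mirrors what a flag like --no-yamnet-gate would do.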