	# 🍼 TotTalk Cry Eval

	Real-time multi-model baby cry classification tool. Available as a CLI (terminal with live mic) and a Gradio web app (browser-based, deployable for free).

	## Models

	| # | Name | Type | Source | Latency |
	|---|------|------|--------|---------|
	| 1 | foduucom-SVC | sklearn SVC, 194-dim MFCC features | [HuggingFace](https://huggingface.co/foduucom/baby-cry-classification) | < 1 ms |
	| 2 | DistilHuBERT | DistilHuBERT fine-tune (5 classes) | [HuggingFace](https://huggingface.co/AmeerHesham/distilhubert-finetuned-baby_cry) | ~35 ms |
	| 3 | Kibalama-9c | Wav2Vec2 fine-tune (9 classes incl. discomfort, tired, cold/hot) | [HuggingFace](https://huggingface.co/Kibalama/baby_cry_classification_model) | ~90 ms |
	| 4 | YAMNet-detector | TF Hub YAMNet (binary cry gate) | [TF Hub](https://tfhub.dev/google/yamnet/1) | < 10 ms |
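
To illustrate how a binary cry gate can sit in front of the slower classifiers, here is a minimal sketch. It is not the project's `ensemble.py`: the function name `gated_predict`, the `0.5` threshold, the toy models, and the labels are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical type: a classifier maps raw samples to {label: probability}.
Classifier = Callable[[List[float]], Dict[str, float]]


def gated_predict(
    samples: List[float],
    detector: Callable[[List[float]], float],
    classifiers: Dict[str, Classifier],
    threshold: float = 0.5,  # assumed cutoff, not taken from the project
) -> Optional[Dict[str, str]]:
    """Return each classifier's top label, or None when no cry is detected."""
    if detector(samples) < threshold:
        return None  # gate closed: skip the slower classifiers entirely
    results = {}
    for name, classify in classifiers.items():
        scores = classify(samples)
        results[name] = max(scores, key=scores.get)
    return results


# Toy stand-ins for the real models, for illustration only.
def toy_detector(samples: List[float]) -> float:
    return 0.9 if max(abs(s) for s in samples) > 0.1 else 0.0


toy_classifiers = {
    "svc": lambda s: {"hungry": 0.7, "tired": 0.3},
    "hubert": lambda s: {"hungry": 0.4, "discomfort": 0.6},
}

print(gated_predict([0.0, 0.5, -0.3], toy_detector, toy_classifiers))
print(gated_predict([0.0, 0.01], toy_detector, toy_classifiers))
```

Running the cheap detector first means the transformer models only run on audio that is likely to contain a cry, which matters most in live mic mode.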

	## Web app (Gradio)

	```bash
	cd cry-eval
	uv sync
	uv run python app.py
	```

	Open `http://localhost:7860`, then record audio from your mic or upload a file.

	### Deploy for free on HuggingFace Spaces

	1. Go to [huggingface.co/new-space](https://huggingface.co/new-space)
	2. Select Gradio → Blank, CPU Basic (free), Public visibility
	3. Create the Space, then push:
	```bash
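	# Keep a copy of the GitHub README, then make the Spaces README the default
	cp README.md README_GITHUB.md
	cp README_HF.md README.md
	# Add the Space as a git remote (replace YOUR_USERNAME)
	git remote add hf https://huggingface.co/spaces/YOUR_USERNAME/cry-eval
	# Commit the swap and push to trigger the build
	git add -A && git commit -m "Configure for HF Spaces"
	git push hf main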
	```
	4. The Space builds and deploys automatically (the first build takes about 5 minutes)

	## CLI (terminal)

	```bash
	# Run with mic input
	uv run python main.py

	# Run with an audio file
	uv run python main.py --file path/to/cry.wav

	# Select specific models
	uv run python main.py --models svc,hubert,kibalama

	# Disable YAMNet gating
	uv run python main.py --no-yamnet-gate

	# Save predictions to JSONL
	uv run python main.py --save-log results.jsonl
	```
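
Once predictions are saved with `--save-log`, the JSONL file can be summarized with a few lines of Python. The sketch below is only an illustration: the log schema is not documented here, so the `label` and `model` field names are assumptions, and `count_labels` is a hypothetical helper rather than part of the project.

```python
import io
import json
from collections import Counter


def count_labels(lines, key="label"):
    """Tally the value of `key` across JSONL records, skipping blank lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        counts[record.get(key, "<missing>")] += 1
    return counts


# Hypothetical records; the real field names may differ.
sample = io.StringIO(
    '{"model": "svc", "label": "hungry"}\n'
    '{"model": "hubert", "label": "tired"}\n'
    '{"model": "kibalama", "label": "hungry"}\n'
)
print(count_labels(sample).most_common())
```

For a real log, pass an open file handle instead of the in-memory sample, for example `count_labels(open("results.jsonl"))`, and adjust `key` to match the fields actually written.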

	## Requirements

	- Python ≥ 3.11
	- A working microphone (for live mode)
	- ~1 GB RAM for transformer models

	Model weights are auto-downloaded on first run into HuggingFace/TF Hub caches.

	## Project structure

	```
	cry-eval/
	├── pyproject.toml
	├── requirements.txt        # for HF Spaces / pip deployments
	├── README.md
	├── README_HF.md            # HuggingFace Spaces metadata
	├── app.py                  # Gradio web UI
	├── main.py                 # CLI entrypoint
	├── models/
	│   ├── base.py             # abstract CryClassifier + CryPrediction
	│   ├── foduucom_svc.py     # sklearn SVC
	│   ├── wiam_wav2vec2.py    # DistilHuBERT fine-tune
	│   ├── kibalama.py         # Wav2Vec2 9-class fine-tune
	│   ├── yamnet.py           # YAMNet binary detector
	│   └── ensemble.py         # orchestrates all models
	├── audio/
	│   ├── capture.py          # MicCapture + FileCapture
	│   └── preprocess.py       # MFCC, mel, resample, RMS
	├── display/
	│   └── table.py            # Rich live table renderer
	└── weights/                # auto-downloaded (gitignored)
	```
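
The tree lists `models/base.py` as holding an abstract `CryClassifier` and a `CryPrediction`. One plausible shape for that pair is sketched below. Only the two class names come from the tree; the fields, the `scores`/`predict` methods, and the `ConstantClassifier` stand-in are assumptions, so the real module may look quite different.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass(frozen=True)
class CryPrediction:
    """Result of one model on one audio window (field names are assumed)."""
    model: str
    label: str
    confidence: float
    scores: Dict[str, float]


class CryClassifier(ABC):
    """Common interface so an ensemble can treat every model the same way."""
    name: str = "unnamed"

    @abstractmethod
    def scores(self, samples: Sequence[float], sample_rate: int) -> Dict[str, float]:
        """Return a {label: probability} mapping for the given audio."""

    def predict(self, samples: Sequence[float], sample_rate: int) -> CryPrediction:
        scores = self.scores(samples, sample_rate)
        label = max(scores, key=scores.get)  # highest-probability label
        return CryPrediction(self.name, label, scores[label], scores)


class ConstantClassifier(CryClassifier):
    """Toy subclass for illustration; it ignores the audio entirely."""
    name = "constant"

    def scores(self, samples, sample_rate):
        return {"hungry": 0.8, "tired": 0.2}


pred = ConstantClassifier().predict([0.0] * 16000, 16000)
print(pred.label, pred.confidence)
```

With an interface like this, each file under `models/` only has to implement `scores`, and the table renderer can work from `CryPrediction` objects without knowing which model produced them.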