# 🍼 TotTalk Cry Eval

Real-time multi-model baby cry classification tool. Available as a **CLI** (terminal with live mic) and a **Gradio web app** (browser-based, deployable for free).

## Models

| # | Name | Type | Source | Speed |
|---|------|------|--------|-------|
| 1 | **foduucom-SVC** | sklearn SVC, 194-dim MFCC features | [HuggingFace](https://huggingface.co/foduucom/baby-cry-classification) | < 1 ms |
| 2 | **DistilHuBERT** | DistilHuBERT fine-tune (5 classes) | [HuggingFace](https://huggingface.co/AmeerHesham/distilhubert-finetuned-baby_cry) | ~35 ms |
| 3 | **Kibalama-9c** | Wav2Vec2 fine-tune (9 classes incl. discomfort, tired, cold/hot) | [HuggingFace](https://huggingface.co/Kibalama/baby_cry_classification_model) | ~90 ms |
| 4 | **YAMNet-detector** | TF Hub YAMNet (binary cry gate) | [TF Hub](https://tfhub.dev/google/yamnet/1) | < 10 ms |

## Web app (Gradio)

```bash
cd cry-eval
uv sync
uv run python app.py
```

Open `http://localhost:7860` and record audio from your mic or upload a file.

### Deploy for free on HuggingFace Spaces

1. Go to [huggingface.co/new-space](https://huggingface.co/new-space)
2. Select **Gradio → Blank**, **CPU Basic** (free), Public visibility
3. Create the Space, then push:

   ```bash
   cp README.md README_GITHUB.md
   cp README_HF.md README.md
   git remote add hf https://huggingface.co/spaces/YOUR_USERNAME/cry-eval
   git add -A && git commit -m "Configure for HF Spaces"
   git push hf main
   ```

4. Deploys automatically (~5 min first build)

## CLI (terminal)

```bash
# Run with mic input
uv run python main.py

# Run with an audio file
uv run python main.py --file path/to/cry.wav

# Select specific models
uv run python main.py --models svc,hubert,kibalama

# Disable YAMNet gating
uv run python main.py --no-yamnet-gate

# Save predictions to JSONL
uv run python main.py --save-log results.jsonl
```

## Requirements

- Python ≥ 3.11
- A working microphone (for live mode)
- ~1 GB RAM for transformer models

Model weights are auto-downloaded on first run into HuggingFace/TF Hub caches.

## Project structure

```
cry-eval/
├── pyproject.toml
├── requirements.txt          # for HF Spaces / pip deployments
├── README.md
├── README_HF.md              # HuggingFace Spaces metadata
├── app.py                    # Gradio web UI
├── main.py                   # CLI entrypoint
├── models/
│   ├── base.py               # abstract CryClassifier + CryPrediction
│   ├── foduucom_svc.py       # sklearn SVC
│   ├── wiam_wav2vec2.py      # DistilHuBERT fine-tune
│   ├── kibalama.py           # Wav2Vec2 9-class fine-tune
│   ├── yamnet.py             # YAMNet binary detector
│   └── ensemble.py           # orchestrates all models
├── audio/
│   ├── capture.py            # MicCapture + FileCapture
│   └── preprocess.py         # MFCC, mel, resample, RMS
├── display/
│   └── table.py              # Rich live table renderer
└── weights/                  # auto-downloaded (gitignored)
```
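
The layout above names an abstract `CryClassifier` and a `CryPrediction` type in `models/base.py`, plus an ensemble that can be gated by a binary cry detector. The sketch below shows one way such an interface and gate could fit together. It is not the repository's actual code: the field names, the `predict` signature, `run_ensemble`, `EnergyGate`, and `ConstantClassifier` are all hypothetical, and the energy-based gate is only a toy stand-in for YAMNet so the example runs without model downloads.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from math import sqrt
from typing import Optional, Sequence


@dataclass(frozen=True)
class CryPrediction:
    """One model's output for one audio window (hypothetical fields)."""
    model: str
    label: str
    confidence: float


class CryClassifier(ABC):
    """Hypothetical common interface shared by every model."""
    name = "base"

    @abstractmethod
    def predict(self, audio: Sequence[float], sample_rate: int) -> CryPrediction:
        ...


class EnergyGate(CryClassifier):
    """Toy stand-in for a binary cry detector: loud windows count as 'cry'."""
    name = "energy-gate"

    def __init__(self, rms_threshold: float = 0.1) -> None:
        self.rms_threshold = rms_threshold

    def predict(self, audio: Sequence[float], sample_rate: int) -> CryPrediction:
        # Root-mean-square amplitude of the window (0.0 for an empty window).
        rms = sqrt(sum(x * x for x in audio) / len(audio)) if audio else 0.0
        label = "cry" if rms >= self.rms_threshold else "no_cry"
        return CryPrediction(self.name, label, min(rms, 1.0))


class ConstantClassifier(CryClassifier):
    """Stub classifier that always returns the same label (for illustration)."""

    def __init__(self, name: str, label: str) -> None:
        self.name = name
        self.label = label

    def predict(self, audio: Sequence[float], sample_rate: int) -> CryPrediction:
        return CryPrediction(self.name, self.label, 1.0)


def run_ensemble(
    audio: Sequence[float],
    sample_rate: int,
    classifiers: Sequence[CryClassifier],
    gate: Optional[CryClassifier] = None,
) -> list[CryPrediction]:
    """Run every classifier, but only if the optional gate reports a cry."""
    if gate is not None and gate.predict(audio, sample_rate).label != "cry":
        return []  # gate closed: skip the more expensive classifiers
    return [clf.predict(audio, sample_rate) for clf in classifiers]
```

In this sketch, passing `gate=None` runs every classifier unconditionally, which is the kind of behavior a flag like `--no-yamnet-gate` would presumably select. Putting the cheap detector in front means the slower transformer models only run on windows that appear to contain a cry.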