# 🍼 TotTalk Cry Eval

A real-time baby cry classification tool that runs several models side by side. It is available as a **CLI** (terminal, live microphone input) and a **Gradio web app** (browser-based, deployable for free).

## Models

| # | Name | Type | Source | Latency |
|---|------|------|--------|-------|
| 1 | **foduucom-SVC** | sklearn SVC, 194-dim MFCC features | [HuggingFace](https://huggingface.co/foduucom/baby-cry-classification) | < 1 ms |
| 2 | **DistilHuBERT** | DistilHuBERT fine-tune (5 classes) | [HuggingFace](https://huggingface.co/AmeerHesham/distilhubert-finetuned-baby_cry) | ~35 ms |
| 3 | **Kibalama-9c** | Wav2Vec2 fine-tune (9 classes incl. discomfort, tired, cold/hot) | [HuggingFace](https://huggingface.co/Kibalama/baby_cry_classification_model) | ~90 ms |
| 4 | **YAMNet-detector** | TF Hub YAMNet (binary cry gate) | [TF Hub](https://tfhub.dev/google/yamnet/1) | < 10 ms |
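
The table describes YAMNet as a binary cry gate in front of the slower classifiers. The sketch below shows one way such a gate could work. It is an illustration only: `gated_predict`, `CRY_THRESHOLD`, and the callable signatures are hypothetical names, and the actual logic in `models/ensemble.py` may differ.

```python
from typing import Callable, Dict, Optional, Sequence

# Hypothetical threshold; the value the project actually uses is not stated here.
CRY_THRESHOLD = 0.5

# A classifier is modelled as any callable that maps audio samples to a label.
Classifier = Callable[[Sequence[float]], str]


def gated_predict(
    audio: Sequence[float],
    cry_score: Callable[[Sequence[float]], float],
    classifiers: Dict[str, Classifier],
    use_gate: bool = True,
    threshold: float = CRY_THRESHOLD,
) -> Optional[Dict[str, str]]:
    """Run the classifiers only when the detector scores the window as a cry."""
    if use_gate and cry_score(audio) < threshold:
        return None  # below threshold: skip the more expensive models
    return {name: clf(audio) for name, clf in classifiers.items()}
```

Passing `use_gate=False` in this sketch mirrors what a flag such as `--no-yamnet-gate` would plausibly do: every window goes to every classifier regardless of the detector score.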

## Web app (Gradio)

```bash
cd cry-eval
uv sync
uv run python app.py
```

Open `http://localhost:7860`, then record audio from your microphone or upload a file.

### Deploy for free on HuggingFace Spaces

1. Go to [huggingface.co/new-space](https://huggingface.co/new-space)
2. Select **Gradio → Blank**, **CPU Basic** (free), Public visibility
3. Create the Space, then push:
   ```bash
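   # Back up the GitHub README, then swap in the one carrying the Spaces metadata
   cp README.md README_GITHUB.md
   cp README_HF.md README.md
   # Add the Space as a git remote, commit the swap, and push
   git remote add hf https://huggingface.co/spaces/YOUR_USERNAME/cry-eval
   git add -A && git commit -m "Configure for HF Spaces"
   git push hf main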
   ```
4. The Space builds and deploys automatically; the first build takes about 5 minutes.

## CLI (terminal)

```bash
# Run with mic input
uv run python main.py

# Run with an audio file
uv run python main.py --file path/to/cry.wav

# Select specific models
uv run python main.py --models svc,hubert,kibalama

# Disable YAMNet gating
uv run python main.py --no-yamnet-gate

# Save predictions to JSONL
uv run python main.py --save-log results.jsonl
```
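
A JSONL log holds one JSON object per line, so it can be inspected with the standard library alone. The sketch below parses such lines and tallies labels per model. The field names `model` and `label` are assumptions, not a documented schema; check a line of your own `results.jsonl` and adjust the keys.

```python
import json
from collections import Counter
from typing import Dict, Iterable, List


def parse_jsonl(lines: Iterable[str]) -> List[Dict]:
    """Parse an iterable of JSONL lines, skipping blank ones."""
    return [json.loads(line) for line in lines if line.strip()]


def count_labels(
    records: Iterable[Dict],
    model_key: str = "model",  # hypothetical field name
    label_key: str = "label",  # hypothetical field name
) -> Counter:
    """Count how often each (model, label) pair appears in the log."""
    return Counter((r.get(model_key), r.get(label_key)) for r in records)
```

To use it on a saved log, open the file and pass the handle directly, for example `count_labels(parse_jsonl(open("results.jsonl", encoding="utf-8")))`.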

## Requirements

- Python ≥ 3.11
- A working microphone (for live mode)
- ~1 GB RAM for transformer models

Model weights are auto-downloaded on first run into HuggingFace/TF Hub caches.

## Project structure

```
cry-eval/
├── pyproject.toml
├── requirements.txt       # for HF Spaces / pip deployments
├── README.md
├── README_HF.md           # HuggingFace Spaces metadata
├── app.py                 # Gradio web UI
├── main.py                # CLI entrypoint
├── models/
│   ├── base.py           # abstract CryClassifier + CryPrediction
│   ├── foduucom_svc.py   # sklearn SVC
│   ├── wiam_wav2vec2.py  # DistilHuBERT fine-tune
│   ├── kibalama.py       # Wav2Vec2 9-class fine-tune
│   ├── yamnet.py         # YAMNet binary detector
│   └── ensemble.py       # orchestrates all models
├── audio/
│   ├── capture.py        # MicCapture + FileCapture
│   └── preprocess.py     # MFCC, mel, resample, RMS
├── display/
│   └── table.py          # Rich live table renderer
└── weights/              # auto-downloaded (gitignored)
```
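
The tree lists `models/base.py` as holding an abstract `CryClassifier` and a `CryPrediction` type, which each model file implements. As a rough sketch of what such an interface could look like: the fields `label` and `confidence`, the `predict` signature, and the toy `ConstantClassifier` are all assumptions for illustration, not the project's actual definitions.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class CryPrediction:
    # Hypothetical fields; the real type may carry more or different ones.
    label: str
    confidence: float


class CryClassifier(ABC):
    """Common interface so an ensemble can treat every model the same way."""

    name: str = "unnamed"

    @abstractmethod
    def predict(self, audio: Sequence[float], sample_rate: int) -> CryPrediction:
        """Classify one window of audio samples."""


class ConstantClassifier(CryClassifier):
    """Toy implementation for illustration: always returns the same label."""

    name = "constant"

    def __init__(self, label: str) -> None:
        self._label = label

    def predict(self, audio: Sequence[float], sample_rate: int) -> CryPrediction:
        return CryPrediction(label=self._label, confidence=1.0)
```

With a shared base class like this, adding a fifth model would mean adding one file under `models/` that subclasses `CryClassifier`, without touching the capture or display code.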