# Model results and comparison
**Demo catalog:** [`configs/model_catalog.yaml`](../configs/model_catalog.yaml) · Baseline metrics: [`models/baseline/manifest.json`](../models/baseline/manifest.json)
| Model | F1 (test, weighted) | Train–test gap | Default in UI |
|-------|---------------------|----------------|---------------|
| LR + TF-IDF (Baseline) | 0.758 | 4.76 pp | No |
| Frozen Toxic-BERT (Baseline) | 0.790 | 0.16 pp | No |
| **Meta-Feature Stacking (Production)** | **0.805** | **2.54 pp** | **Yes** |
**Handover:** [`reports/HANDOVER_REPORT.md`](../reports/HANDOVER_REPORT.md) · **Production JSON:** [`reports/notebook_14/final_result.json`](../reports/notebook_14/final_result.json) · **Golden baseline:** [`reports/golden_baseline/`](../reports/golden_baseline/)
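For readers unfamiliar with the two metric columns in the table above, the following is a minimal sketch of how a support-weighted F1 and a train–test gap in percentage points (pp) can be computed. It is not the project's evaluation code: the function names `weighted_f1` and `gap_pp` are hypothetical, and it assumes the gap is the train score minus the test score of the same metric, which this page does not state explicitly.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (illustrative helper)."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        f1 = 2 * tp / (2 * tp + fp + fn)  # denominator > 0 because n >= 1
        score += (n / total) * f1  # weight each class by its share of samples
    return score


def gap_pp(train_score, test_score):
    """Train-test gap in percentage points (ASSUMPTION: train minus test)."""
    return (train_score - test_score) * 100


if __name__ == "__main__":
    # Toy labels only; these are not the project's data.
    y_true = [0, 0, 1, 1]
    y_pred = [0, 1, 1, 1]
    print(f"weighted F1 = {weighted_f1(y_true, y_pred):.3f}")
    print(f"gap = {gap_pp(0.90, 0.85):.2f} pp")
```

Under this reading, a small gap means train and test scores are close, while a larger gap suggests more overfitting to the training split.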
## Baselines
**LR + TF-IDF** — Notebooks 01–03, artifact `models/baseline/lr_tfidf.joblib`, tuning in [`configs/best_params.yaml`](../configs/best_params.yaml).
**Frozen Toxic-BERT** — Notebook 12, `unitary/toxic-bert` inference-only; see golden baseline reports and `manifest.json` → `frozen_toxic_bert`.
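To look up a baseline's recorded metrics programmatically, something like the sketch below may be enough. It assumes `manifest.json` is a JSON object whose top-level keys are model names (as the `manifest.json` → `frozen_toxic_bert` pointer suggests); the helper `load_manifest_entry` is hypothetical, and the fields inside each entry are not documented here, so the sketch returns the entry as-is.

```python
import json
from pathlib import Path


def load_manifest_entry(manifest_path, key):
    """Return the object stored under `key` in a JSON manifest.

    ASSUMPTION: the manifest is a JSON object keyed by model name.
    """
    with Path(manifest_path).open(encoding="utf-8") as fh:
        manifest = json.load(fh)
    if key not in manifest:
        # List the available keys to make a typo easy to spot.
        raise KeyError(f"{key!r} not found; available keys: {sorted(manifest)}")
    return manifest[key]
```

A call such as `load_manifest_entry("models/baseline/manifest.json", "frozen_toxic_bert")`, run from the repository root, would then return whatever the manifest stores for the frozen Toxic-BERT baseline.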
## Production
```bash
uv run python -m src.experiments.notebook_14_final_stack
```
This requires the `hf` extra; install it first with `uv sync --extra hf`.
## Other experiments
Historical table: [`reports/summary.csv`](../reports/summary.csv). The RF/XGBoost pipelines and the `reports/v2/` figures are teammate or archived work and are not part of the demo model catalog.