YouTube Toxic Comment Detector (youtube_hate_detector)
Español: README.es.md
Automated Safe vs Toxic classification for YouTube-style comments. Production stack: FastAPI (REST) + React (YouTube Watch UI). Default model: Logistic Regression + TF-IDF (models/final_model.joblib).
Clone and layout
git clone <your-repo-url>
cd youtube_hate_detector # use this folder name locally (team convention)
youtube_hate_detector/
├── configs/            # pipeline, features, model_catalog, suggested_videos
├── frontend/           # React SPA (Vite)
├── models/             # final_model.joblib, experiments/
├── src/
│   ├── api/            # FastAPI routes
│   └── service/        # ModelService (inference)
├── pyproject.toml      # uv dependencies
├── uv.lock
└── docker-compose.yml
How to use FastAPI
The API loads ModelService once at startup and serves JSON only (the React app is the UI).
cp .env.example .env
uv sync # baseline (LR model only)
uv sync --extra hf # required for DistilBERT / toxic-bert / Fine-tuned HF models
uv run uvicorn src.api.main:app --reload --port 8000
Verify HF deps: uv run python -c "import transformers; print('ok')".
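The same check can be made from Python without importing the heavy packages. This is a minimal sketch, assuming the `hf` extra provides `transformers` and `torch` (the two packages named in the Docker section); the helper name `missing_packages` is hypothetical and not part of this repo.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level packages from `names` that are not importable."""
    # find_spec returns None for a top-level module that is not installed,
    # and it does not execute the module, so the check stays cheap.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: these two packages are what the `hf` extra installs.
    missing = missing_packages(["transformers", "torch"])
    print("ok" if not missing else f"missing: {', '.join(missing)}")
```

Run it with `uv run python` so that it inspects the project environment, not the system interpreter.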
| Resource | URL |
|---|---|
| Swagger | http://localhost:8000/docs |
| Health | http://localhost:8000/health |
Main endpoints
| Method | Path | Description |
|---|---|---|
| POST | /predict | Score one comment { "text", "threshold" } |
| POST | /predict-video | Fetch YouTube comments + score { "url", "max_comments", "threshold" } |
| GET | /videos/suggested | Metadata for right-rail videos (from configs/suggested_videos.yaml) |
| GET | /models | Available models |
| GET | /models/status | Per-model availability (HF deps, local weights) |
| PUT | /model/{name} | Switch active model (warmup-validated) |
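The endpoints above can be called from any HTTP client. The following is a rough sketch using only the Python standard library, under some assumptions: the API is running at the local dev address, `threshold=0.5` is an illustrative value and not a documented default, and `PUT /model/{name}` is assumed to need no request body. The helper names are hypothetical. The response schema is not described in this README, so the sketch prints the returned JSON as-is.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

BASE_URL = "http://localhost:8000"  # local dev address used in this README


def predict_request(text, threshold=0.5):
    """Build the POST /predict request for a single comment."""
    # threshold=0.5 is an illustrative value, not a documented default.
    body = json.dumps({"text": text, "threshold": threshold}).encode("utf-8")
    return Request(
        f"{BASE_URL}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def switch_model_request(name):
    """Build the PUT /model/{name} request (assumed to take no body)."""
    return Request(f"{BASE_URL}/model/{quote(name, safe='')}", method="PUT")


def send(request):
    """Send a request and return the decoded JSON body (API must be running)."""
    with urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    print(send(predict_request("example comment")))
```

Building the request separately from sending it keeps the payload easy to inspect; the Swagger page at /docs remains the authoritative reference for request and response fields.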
Set YOUTUBE_API_KEY in .env for real comments and suggested-video thumbnails.
To change models without modifying the UI, edit configs/model_catalog.yaml, then restart the API or use Settings in the app.
React UI (local dev)
# Terminal 1: API
uv run uvicorn src.api.main:app --reload --port 8000
# Terminal 2: frontend (proxies API)
cd frontend && npm install && npm run dev
Open http://localhost:5173 to see the Watch page with a staged demo player, real suggested videos (click one to load its comments), and an English UI.
Docker
export YOUTUBE_API_KEY=your_key # optional but recommended
docker compose up --build # LR model only (default)
# Hugging Face models (transformers + torch; larger image):
INSTALL_HF=1 docker compose build --build-arg INSTALL_HF=1
INSTALL_HF=1 docker compose up
| URL | Service |
|---|---|
| http://localhost:8000 | API + built React SPA |
| http://localhost:8000/docs | Swagger |
Container: youtube_hate_detector-app.
Training (unchanged)
uv run python -m src.pipeline.run_pipeline --model lr
See docs/PIPELINE.md.
Configuration
| File | Purpose |
|---|---|
| .env | Secrets (YOUTUBE_API_KEY, MODEL_NAME) |
| configs/model_catalog.yaml | Inference models for API/UI |
| configs/suggested_videos.yaml | YouTube IDs for the suggested rail |
| configs/pipeline.yaml | Training data paths |
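Before starting the API, it can help to confirm which of the variables listed for `.env` are actually set. This is a small sketch, assuming `.env` uses plain `KEY=VALUE` lines; how the application itself loads the file is not covered in this README, and the helper names are hypothetical.

```python
from pathlib import Path

# The two variables this README lists for .env.
EXPECTED_KEYS = ("YOUTUBE_API_KEY", "MODEL_NAME")


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, expected=EXPECTED_KEYS):
    """Return the expected keys that are absent or empty."""
    return [key for key in expected if not values.get(key)]


if __name__ == "__main__":
    env_file = Path(".env")
    values = parse_env(env_file.read_text()) if env_file.exists() else {}
    # Informational only: YOUTUBE_API_KEY is optional but recommended.
    print("unset:", missing_keys(values) or "none")
```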
Tests
uv sync --extra dev --extra hf
uv run pytest
Briefing vs team stack
| Topic | Briefing | This repo |
|---|---|---|
| UI | Streamlit | React |
| API | FastAPI | FastAPI |
| Package manager | varies | uv |
Legacy Streamlit (src/app/) has been removed.