---
title: Wasla Feedback Moderation API
emoji: 🛡️
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false
---

# Feedback Moderation API

AI-powered microservice that detects toxic / abusive language in user-submitted text. Classification runs on **local** Hugging Face transformer models; unless you enable the optional LLM verification layer, your data never leaves your infrastructure.

### Dual-Model Architecture

| Language | Model | Labels |
| -------- | ----- | ------ |
| English, French, Italian (and more) | `citizenlab/distilbert-base-multilingual-cased-toxicity` | `toxic` / `not_toxic` |
| Arabic | `Hate-speech-CNERG/dehatebert-mono-arabic` | `HATE` / `NON_HATE` |

Arabic text is **automatically detected** using the [lingua](https://github.com/pemistahl/lingua-py) language-detection library and routed to the dedicated Arabic hate-speech model for significantly higher accuracy (~87.8 %). All other languages use the multilingual model.

### LLM Verification Layer (optional)

When the local classifier's confidence falls in a configurable **grey zone** (default: 0.40 – 0.85), the text is forwarded to an LLM served through the free **Hugging Face Inference API** for a second opinion. The model is set by `LLM_MODEL_NAME` (see Configuration for the default). The LLM score is blended with the local score (40/60 weighting) to produce a refined verdict.

- ⚡ **No extra latency** for clear-cut predictions (>95 % of requests)
- 🆓 **Free**: uses HF Inference Providers with a free token
- 🔒 **Graceful fallback**: if the LLM is unavailable, the local model result is used as-is

Set `HF_TOKEN` in `.env` to enable. Leave it empty to disable.

---

## Quick Start

### 1. Clone & create a virtual environment

```bash
git clone moderator-api
cd moderator-api
python -m venv venv

# Windows
.\venv\Scripts\Activate.ps1

# macOS / Linux
source venv/bin/activate
```

### 2. Install dependencies

```bash
pip install -r requirements.txt
```

### 3. Configure

Copy the example env file and set a strong API key:

```bash
cp .env.example .env
# then edit .env → MODERATOR_API_KEY=
```

### 4. Run

```bash
uvicorn main:app --reload
```

The server starts at **http://127.0.0.1:8000**.
Interactive docs are at **http://127.0.0.1:8000/docs**.

> **First launch** will download two models (~1 GB total).
> Subsequent starts load from the local cache.

---

## API Reference

### `GET /health`

Liveness probe. Returns `{"status": "ok", "model_loaded": true}`.

### `POST /api/v1/moderate`

Analyse a piece of text for toxicity.

**Headers**

| Header      | Required | Description         |
| ----------- | -------- | ------------------- |
| `X-API-Key` | ✅       | Your secret API key |

**Request body**

```json
{
  "text": "This product is terrible and I hate everything about it!"
}
```

**Response**

```json
{
  "has_bad_words": true,
  "confidence": 0.9812,
  "label": "toxic",
  "detected_language": "en",
  "llm_verified": false
}
```

---

## Calling from an External App

```python
import requests

API_URL = "http://your-server:8000/api/v1/moderate"
API_KEY = "your-secret-api-key"

response = requests.post(
    API_URL,
    json={"text": "I absolutely love this!"},
    headers={"X-API-Key": API_KEY},
    timeout=10,  # avoid hanging if the service is unreachable
)
response.raise_for_status()  # surface auth or server errors early
result = response.json()

if result["has_bad_words"]:
    print("⚠️ Feedback rejected: toxic content detected.")
else:
    print("✅ Feedback is clean.")
```

---

## Docker

### Build

```bash
docker build -t moderator-api .
```

### Run

```bash
docker run -d \
  -p 8000:8000 \
  -e MODERATOR_API_KEY=your-secret-api-key \
  --name moderator-api \
  moderator-api
```

> If the container listens on a different port (the Space metadata above sets `app_port: 7860`), adjust the `-p` mapping to match.

---

## Deploy to Hugging Face Spaces (Free)

### 1. Create a Space

Go to [huggingface.co/new-space](https://huggingface.co/new-space) and create a new Space:

- **Space name**: `moderator-api`
- **SDK**: Docker
- **Visibility**: Private (recommended)

### 2. Add secrets

In your Space → **Settings → Secrets**, add:

| Secret | Value |
| ------ | ----- |
| `MODERATOR_API_KEY` | A strong random string |
| `HF_TOKEN` | Your Hugging Face token |
| `LLM_VERIFY_LOW` | `0.0` (or `0.40` for grey-zone only) |
| `LLM_VERIFY_HIGH` | `1.0` (or `0.85` for grey-zone only) |

### 3. Push your code

```bash
git init
git remote add space https://huggingface.co/spaces/YOUR_USERNAME/moderator-api
git add .
git commit -m "Deploy moderator API"
git push space main
```

### 4. Access your API

Once built, your API is live at:

```
https://YOUR_USERNAME-moderator-api.hf.space/api/v1/moderate
```

Interactive docs at:

```
https://YOUR_USERNAME-moderator-api.hf.space/docs
```

> ⚠️ First deploy takes ~5 minutes (model downloads ~1 GB).
> Subsequent deploys use cached layers and are much faster.

---

## Configuration (Environment Variables)

| Variable | Default | Description |
| -------- | ------- | ----------- |
| `MODERATOR_API_KEY` | `change-me-before-production` | Secret key clients must send in the `X-API-Key` header |
| `MODEL_NAME` | `citizenlab/distilbert-base-multilingual-cased-toxicity` | Multilingual Hugging Face model (en/fr/it) |
| `ARABIC_MODEL_NAME` | `Hate-speech-CNERG/dehatebert-mono-arabic` | Dedicated Arabic hate-speech model |
| `TOXICITY_THRESHOLD` | `0.70` | Confidence cutoff to flag text as toxic |
| `ARABIC_TOXICITY_THRESHOLD` | `0.45` | Lower threshold for Arabic (model outputs lower scores) |
| `HF_TOKEN` | *(empty: LLM disabled)* | Free HF token to enable LLM verification |
| `LLM_MODEL_NAME` | `Qwen/Qwen2.5-72B-Instruct` | LLM used for grey-zone verification |
| `LLM_VERIFY_LOW` | `0.40` | Lower bound of the grey zone |
| `LLM_VERIFY_HIGH` | `0.85` | Upper bound of the grey zone |

---

## Project Structure

```
moderator-api/
├── main.py              # FastAPI application & moderation endpoint
├── requirements.txt     # Python dependencies
├── Dockerfile           # Production container image
├── .dockerignore        # Files excluded from Docker build
├── .env                 # Local environment variables (git-ignored)
├── .env.example         # Template for .env
├── .gitignore
└── README.md
```

## License

MIT
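
## Appendix: Grey-Zone Blending Sketch

The grey-zone behaviour described under "LLM Verification Layer" can be illustrated with a small, self-contained sketch. This is not the code in `main.py`; it only shows one plausible reading of the rules. The names `Verdict`, `moderate_score`, and `llm_scorer` are hypothetical, the 40/60 split is assumed to mean 0.4 for the LLM and 0.6 for the local model, and the `>=` comparison against the threshold is also an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Defaults copied from the Configuration table.
LLM_VERIFY_LOW = 0.40
LLM_VERIFY_HIGH = 0.85
TOXICITY_THRESHOLD = 0.70

# ASSUMPTION: the README says "40/60 weighting" without naming the sides.
LLM_WEIGHT = 0.4
LOCAL_WEIGHT = 0.6


@dataclass
class Verdict:
    has_bad_words: bool
    confidence: float
    llm_verified: bool


def moderate_score(
    local_score: float,
    llm_scorer: Optional[Callable[[], float]] = None,
) -> Verdict:
    """Blend a local toxicity score with an optional LLM second opinion.

    `llm_scorer` is a hypothetical zero-argument callable that returns the
    LLM's toxicity score in [0, 1]; pass None when HF_TOKEN is not set.
    """
    score = local_score
    verified = False
    in_grey_zone = LLM_VERIFY_LOW <= local_score <= LLM_VERIFY_HIGH

    if in_grey_zone and llm_scorer is not None:
        try:
            llm_score = llm_scorer()
        except Exception:
            # Graceful fallback: keep the local result if the LLM call fails.
            llm_score = None
        if llm_score is not None:
            score = LLM_WEIGHT * llm_score + LOCAL_WEIGHT * local_score
            verified = True

    # ASSUMPTION: a score at or above the threshold counts as toxic.
    return Verdict(score >= TOXICITY_THRESHOLD, round(score, 4), verified)
```

Under these assumptions, a clear-cut local score such as `0.95` never triggers the LLM call, while a borderline score such as `0.60` is blended with the LLM's answer before being compared against `TOXICITY_THRESHOLD`.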