metadata
title: Wasla Feedback Moderation API
emoji: 🛡️
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false

Feedback Moderation API

AI-powered microservice that detects toxic / abusive language in user-submitted text.
It runs local Hugging Face transformer models, so your text stays on your infrastructure unless the optional LLM verification layer is enabled.

Dual-Model Architecture

| Language | Model | Labels |
| --- | --- | --- |
| English, French, Italian (and more) | citizenlab/distilbert-base-multilingual-cased-toxicity | toxic / not_toxic |
| Arabic | Hate-speech-CNERG/dehatebert-mono-arabic | HATE / NON_HATE |

Arabic text is automatically detected using the lingua language-detection library and routed to the dedicated Arabic hate-speech model for significantly higher accuracy (~87.8 %). All other languages use the multilingual model.
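The routing rule above can be sketched as follows. This is only an illustration: the service detects language with lingua, whereas this sketch swaps in a simple Unicode-range check so that it runs without extra dependencies. The function names `looks_arabic` and `pick_model` and the 0.5 ratio are hypothetical and do not come from the project.

```python
MULTILINGUAL_MODEL = "citizenlab/distilbert-base-multilingual-cased-toxicity"
ARABIC_MODEL = "Hate-speech-CNERG/dehatebert-mono-arabic"


def looks_arabic(text: str, min_ratio: float = 0.5) -> bool:
    """Stand-in for lingua: True if most letters fall in the Arabic Unicode block."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    arabic = sum(1 for ch in letters if "\u0600" <= ch <= "\u06FF")
    return arabic / len(letters) >= min_ratio


def pick_model(text: str) -> str:
    """Route Arabic text to the dedicated model, everything else to the multilingual one."""
    return ARABIC_MODEL if looks_arabic(text) else MULTILINGUAL_MODEL
```

A real detector such as lingua also handles mixed-script and short inputs better than a character-range heuristic, which is why the sketch should not be read as the project's implementation.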

LLM Verification Layer (optional)

When the local classifier's confidence falls in a configurable grey zone (default: 0.40 to 0.85), the text is forwarded to an LLM on the free Hugging Face Inference API (default: mistralai/Mistral-7B-Instruct-v0.3) for a second opinion. The LLM score is then blended with the local score (40/60 weighting) to produce a refined verdict.

  • ⚡ No extra latency for clear-cut predictions (>95 % of requests)
  • 🆓 Free: uses HF Inference Providers with a free token
  • 🔒 Graceful fallback: if the LLM is unavailable, the local model result is used as-is

Set HF_TOKEN in .env to enable. Leave it empty to disable.
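The grey-zone check and score blending might look roughly like the sketch below. Two details are assumptions, because the README does not spell them out: that the zone bounds are inclusive, and that the 40 % share of the 40/60 weighting belongs to the LLM score. The function names are hypothetical.

```python
from typing import Optional, Tuple


def in_grey_zone(score: float, low: float = 0.40, high: float = 0.85) -> bool:
    """True when the local confidence is uncertain enough to ask the LLM (bounds assumed inclusive)."""
    return low <= score <= high


def blend(local_score: float, llm_score: float, llm_weight: float = 0.4) -> float:
    """Weighted average of the two scores; llm_weight=0.4 is an assumed reading of '40/60'."""
    return llm_weight * llm_score + (1.0 - llm_weight) * local_score


def refine(local_score: float, llm_score: Optional[float]) -> Tuple[float, bool]:
    """Return (final_score, llm_verified).

    llm_score is None when the LLM is disabled or unavailable, in which case
    the local score is returned unchanged (the graceful fallback).
    """
    if llm_score is None or not in_grey_zone(local_score):
        return local_score, False
    return blend(local_score, llm_score), True
```

For example, under these assumptions a local score of 0.60 and an LLM score of 1.0 blend to 0.4 × 1.0 + 0.6 × 0.60 = 0.76.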


Quick Start

1. Clone & create a virtual environment

git clone <your-repo-url> moderator-api
cd moderator-api
python -m venv venv

# Windows
.\venv\Scripts\Activate.ps1
# macOS / Linux
source venv/bin/activate

2. Install dependencies

pip install -r requirements.txt

3. Configure

Copy the example env file and set a strong API key:

cp .env.example .env
# then edit .env → MODERATOR_API_KEY=<your-secret>
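One way to produce a strong key is Python's standard `secrets` module. The helper name `generate_api_key` and the 32-byte length are illustrative choices, not project requirements.

```python
import secrets


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random string built from num_bytes of OS-level randomness."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Print a line that can be pasted straight into .env
    print(f"MODERATOR_API_KEY={generate_api_key()}")
```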

4. Run

uvicorn main:app --reload

The server starts at http://127.0.0.1:8000.
Interactive docs are at http://127.0.0.1:8000/docs.

First launch will download two models (~1 GB total).
Subsequent starts load from the local cache.


API Reference

GET /health

Liveness probe. Returns {"status": "ok", "model_loaded": true}.

POST /api/v1/moderate

Analyse a piece of text for toxicity.

Headers

| Header | Required | Description |
| --- | --- | --- |
| X-API-Key | ✅ | Your secret API key |

Request body

{
  "text": "This product is terrible and I hate everything about it!"
}

Response

{
  "has_bad_words": true,
  "confidence": 0.9812,
  "label": "toxic",
  "detected_language": "en",
  "llm_verified": false
}
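A client may want to parse this response into a typed object rather than index into a raw dict. The sketch below mirrors the five fields shown above. The `ModerationResult` class and the `parse_result` helper are hypothetical names, and the field types are inferred from the example rather than from a published schema.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ModerationResult:
    has_bad_words: bool
    confidence: float
    label: str
    detected_language: str
    llm_verified: bool


def parse_result(body: str) -> ModerationResult:
    """Parse the JSON response body; raises KeyError if an expected field is missing."""
    data = json.loads(body)
    return ModerationResult(
        has_bad_words=bool(data["has_bad_words"]),
        confidence=float(data["confidence"]),
        label=str(data["label"]),
        detected_language=str(data["detected_language"]),
        llm_verified=bool(data["llm_verified"]),
    )
```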

Calling from an External App

import requests

API_URL = "http://your-server:8000/api/v1/moderate"
API_KEY = "your-secret-api-key"

response = requests.post(
    API_URL,
    json={"text": "I absolutely love this!"},
    headers={"X-API-Key": API_KEY},
    timeout=10,  # avoid hanging forever if the server is unreachable
)
response.raise_for_status()  # fail loudly on any HTTP error status
result = response.json()

if result["has_bad_words"]:
    print("⚠️  Feedback rejected: toxic content detected.")
else:
    print("✅  Feedback is clean.")

Docker

Build

docker build -t moderator-api .

Run

docker run -d \
  -p 8000:8000 \
  -e MODERATOR_API_KEY=your-secret-api-key \
  --name moderator-api \
  moderator-api

Deploy to Hugging Face Spaces (Free)

1. Create a Space

Go to huggingface.co/new-space and create a new Space:

  • Space name: moderator-api
  • SDK: Docker
  • Visibility: Private (recommended)

2. Add secrets

In your Space → Settings → Secrets, add:

| Secret | Value |
| --- | --- |
| MODERATOR_API_KEY | A strong random string |
| HF_TOKEN | Your Hugging Face token |
| LLM_VERIFY_LOW | 0.0 (or 0.40 for grey-zone only) |
| LLM_VERIFY_HIGH | 1.0 (or 0.85 for grey-zone only) |

3. Push your code

git init
git remote add space https://huggingface.co/spaces/YOUR_USERNAME/moderator-api
git add .
git commit -m "Deploy moderator API"
git push space main

4. Access your API

Once built, your API is live at:

https://YOUR_USERNAME-moderator-api.hf.space/api/v1/moderate

Interactive docs at:

https://YOUR_USERNAME-moderator-api.hf.space/docs

⚠️ First deploy takes ~5 minutes (model downloads ~1 GB).
Subsequent deploys use cached layers and are much faster.


Configuration (Environment Variables)

| Variable | Default | Description |
| --- | --- | --- |
| MODERATOR_API_KEY | change-me-before-production | Secret key clients must send in headers |
| MODEL_NAME | citizenlab/distilbert-base-multilingual-cased-toxicity | Multilingual Hugging Face model (en/fr/it) |
| ARABIC_MODEL_NAME | Hate-speech-CNERG/dehatebert-mono-arabic | Dedicated Arabic hate-speech model |
| TOXICITY_THRESHOLD | 0.70 | Confidence cutoff to flag text as toxic |
| ARABIC_TOXICITY_THRESHOLD | 0.45 | Lower threshold for Arabic (model outputs lower scores) |
| HF_TOKEN | (empty; LLM disabled) | Free HF token to enable LLM verification |
| LLM_MODEL_NAME | Qwen/Qwen2.5-72B-Instruct | LLM used for grey-zone verification |
| LLM_VERIFY_LOW | 0.40 | Lower bound of the grey zone |
| LLM_VERIFY_HIGH | 0.85 | Upper bound of the grey zone |
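As a rough sketch of how these variables could be read with their defaults, the snippet below loads them from a mapping such as `os.environ`. The `Settings` class, `load_settings`, and `threshold_for` are hypothetical names and are not taken from main.py. The defaults are copied from the table above, and using "ar" as the Arabic language code is an assumption based on the "en" code in the response example.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    api_key: str
    toxicity_threshold: float
    arabic_toxicity_threshold: float
    hf_token: str
    llm_model_name: str
    llm_verify_low: float
    llm_verify_high: float

    @property
    def llm_enabled(self) -> bool:
        """LLM verification is on only when a token is configured."""
        return bool(self.hf_token)

    def threshold_for(self, language: str) -> float:
        """Pick the cutoff for a language code ("ar" is an assumed code for Arabic)."""
        if language == "ar":
            return self.arabic_toxicity_threshold
        return self.toxicity_threshold


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from env, falling back to the documented defaults."""
    return Settings(
        api_key=env.get("MODERATOR_API_KEY", "change-me-before-production"),
        toxicity_threshold=float(env.get("TOXICITY_THRESHOLD", "0.70")),
        arabic_toxicity_threshold=float(env.get("ARABIC_TOXICITY_THRESHOLD", "0.45")),
        hf_token=env.get("HF_TOKEN", ""),
        llm_model_name=env.get("LLM_MODEL_NAME", "Qwen/Qwen2.5-72B-Instruct"),
        llm_verify_low=float(env.get("LLM_VERIFY_LOW", "0.40")),
        llm_verify_high=float(env.get("LLM_VERIFY_HIGH", "0.85")),
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests without touching the real environment.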

Project Structure

moderator-api/
├── main.py              # FastAPI application & moderation endpoint
├── requirements.txt     # Python dependencies
├── Dockerfile           # Production container image
├── .dockerignore        # Files excluded from Docker build
├── .env                 # Local environment variables (git-ignored)
├── .env.example         # Template for .env
├── .gitignore
└── README.md

License

MIT