title: Wasla Feedback Moderation API
emoji: 🛡️
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false

Feedback Moderation API

An AI-powered microservice that detects toxic or abusive language in user-submitted text.
It runs local Hugging Face transformer models, so submitted text stays on your infrastructure unless the optional LLM verification layer described below is enabled.

Dual-Model Architecture

| Language | Model | Labels |
| --- | --- | --- |
| English, French, Italian (and more) | `citizenlab/distilbert-base-multilingual-cased-toxicity` | `toxic` / `not_toxic` |
| Arabic | `Hate-speech-CNERG/dehatebert-mono-arabic` | `HATE` / `NON_HATE` |

Arabic text is automatically detected using the lingua language-detection library and routed to the dedicated Arabic hate-speech model for significantly higher accuracy (~87.8 %). All other languages use the multilingual model.
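The routing rule above can be sketched roughly as follows. This is a minimal illustration, not the service's actual code: `looks_arabic` is a hypothetical stand-in that checks the Arabic Unicode block so the example has no dependencies, whereas the real service relies on lingua for detection. `pick_model` is likewise an illustrative name.

```python
from typing import Callable

# Model identifiers taken from the table above.
MULTILINGUAL_MODEL = "citizenlab/distilbert-base-multilingual-cased-toxicity"
ARABIC_MODEL = "Hate-speech-CNERG/dehatebert-mono-arabic"


def looks_arabic(text: str) -> bool:
    """Stand-in detector (assumption): True when most letters are Arabic script.

    The real service uses lingua; this heuristic only keeps the sketch
    dependency-free.
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    arabic = sum(1 for ch in letters if "\u0600" <= ch <= "\u06FF")
    return arabic / len(letters) > 0.5


def pick_model(text: str, is_arabic: Callable[[str], bool] = looks_arabic) -> str:
    """Return the name of the model that should score this text."""
    return ARABIC_MODEL if is_arabic(text) else MULTILINGUAL_MODEL
```

Passing the detector as a parameter keeps the routing decision separate from the detection library, so a lingua-backed function could be swapped in without touching `pick_model`.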

LLM Verification Layer (optional)

When the local classifier's confidence falls in a configurable grey zone (default: 0.40 – 0.85), the text is forwarded to an LLM on the free Hugging Face Inference API for a second opinion. The LLM is selected by LLM_MODEL_NAME (default per the Configuration table: Qwen/Qwen2.5-72B-Instruct). The LLM score is blended with the local score (40/60 weighting) to produce a refined verdict.

⚡ No extra latency for clear-cut predictions (>95 % of requests)
🆓 Free — uses HF Inference Providers with a free token
🔒 Graceful fallback — if the LLM is unavailable, the local model result is used as-is

Set HF_TOKEN in .env to enable. Leave it empty to disable.
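As a rough sketch of the grey-zone logic, the functions below show one way the check and the blend could fit together. Several details are assumptions: the function names are illustrative, the bounds are treated as inclusive, and the 40/60 weighting is read as 40 % LLM and 60 % local, which the text above does not state explicitly.

```python
from typing import Optional


def in_grey_zone(confidence: float, low: float = 0.40, high: float = 0.85) -> bool:
    """True when the local score is uncertain enough to ask the LLM.

    Inclusive bounds are an assumption of this sketch.
    """
    return low <= confidence <= high


def blend(local_score: float, llm_score: float, llm_weight: float = 0.40) -> float:
    """Weighted average of the two scores (assumed: 40 % LLM, 60 % local)."""
    return (1.0 - llm_weight) * local_score + llm_weight * llm_score


def refine(
    local_score: float,
    llm_score: Optional[float],
    low: float = 0.40,
    high: float = 0.85,
) -> float:
    """Return the final score.

    llm_score is None when the LLM is disabled or unavailable; in that case,
    and for clear-cut predictions, the local score is returned unchanged.
    """
    if llm_score is None or not in_grey_zone(local_score, low, high):
        return local_score
    return blend(local_score, llm_score)
```

With these assumptions, a local score of 0.60 and an LLM score of 1.0 blend to 0.76, while a local score of 0.95 bypasses the LLM entirely.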

Quick Start

1. Clone & create a virtual environment

git clone <your-repo-url> moderator-api
cd moderator-api
python -m venv venv

# Windows
.\venv\Scripts\Activate.ps1
# macOS / Linux
source venv/bin/activate

2. Install dependencies

pip install -r requirements.txt

3. Configure

Copy the example env file and set a strong API key:

cp .env.example .env
# then edit .env → MODERATOR_API_KEY=<your-secret>
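One way to produce a strong key is Python's standard `secrets` module. The helper name `generate_api_key` is illustrative, and 32 random bytes is a common choice rather than a requirement of this project.

```python
import secrets


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random string suitable for use as an API key."""
    return secrets.token_urlsafe(num_bytes)


# Print a line that can be pasted into .env.
print(f"MODERATOR_API_KEY={generate_api_key()}")
```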

4. Run

uvicorn main:app --reload

The server starts at http://127.0.0.1:8000.
Interactive docs are at http://127.0.0.1:8000/docs.

First launch will download two models (~1 GB total).
Subsequent starts load from the local cache.

API Reference

`GET /health`

Liveness probe. Returns {"status": "ok", "model_loaded": true}.
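Because the first launch downloads the models, a client or deploy script may want to wait for the probe before sending traffic. The sketch below polls `/health` using only the standard library. The function names, the retry counts, and the assumption that the server may answer with `model_loaded` set to false while loading are all illustrative, not confirmed behaviour.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_health(base_url: str = "http://127.0.0.1:8000") -> dict:
    """GET /health and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
        return json.load(resp)


def wait_until_ready(
    fetch: Callable[[], dict] = fetch_health,
    attempts: int = 30,
    delay: float = 2.0,
) -> bool:
    """Poll until the service reports status ok with the model loaded."""
    for _ in range(attempts):
        try:
            payload = fetch()
            if payload.get("status") == "ok" and payload.get("model_loaded") is True:
                return True
        except (OSError, ValueError):
            # Connection refused, timeout, or a non-JSON body: retry.
            pass
        time.sleep(delay)
    return False
```

Injecting `fetch` makes the loop testable without a running server.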

`POST /api/v1/moderate`

Analyse a piece of text for toxicity.

Headers

| Header | Required | Description |
| --- | --- | --- |
| `X-API-Key` | ✅ | Your secret API key |

Request body

{
  "text": "This product is terrible and I hate everything about it!"
}

Response

{
  "has_bad_words": true,
  "confidence": 0.9812,
  "label": "toxic",
  "detected_language": "en",
  "llm_verified": false
}
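A client may find it convenient to parse this response into a typed object. The dataclass below is a hypothetical client-side helper whose fields mirror the sample response above; it is not part of the service.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModerationResult:
    """Typed view of the /api/v1/moderate response shown above."""

    has_bad_words: bool
    confidence: float
    label: str
    detected_language: str
    llm_verified: bool

    @classmethod
    def from_json(cls, payload: dict) -> "ModerationResult":
        # A KeyError here means the response is missing a documented field.
        return cls(
            has_bad_words=bool(payload["has_bad_words"]),
            confidence=float(payload["confidence"]),
            label=str(payload["label"]),
            detected_language=str(payload["detected_language"]),
            llm_verified=bool(payload["llm_verified"]),
        )
```

For example, `ModerationResult.from_json(response.json()).has_bad_words` gives the verdict as a plain boolean.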

Calling from an External App

import requests

API_URL = "http://your-server:8000/api/v1/moderate"
API_KEY = "your-secret-api-key"

response = requests.post(
    API_URL,
    json={"text": "I absolutely love this!"},
    headers={"X-API-Key": API_KEY},
    timeout=10,  # avoid hanging forever if the server is unreachable
)
response.raise_for_status()  # surface auth or server errors before reading the body
result = response.json()

if result["has_bad_words"]:
    print("⚠️  Feedback rejected: toxic content detected.")
else:
    print("✅  Feedback is clean.")

Docker

Build

docker build -t moderator-api .

Run

docker run -d \
  -p 8000:8000 \
  -e MODERATOR_API_KEY=your-secret-api-key \
  --name moderator-api \
  moderator-api

Deploy to Hugging Face Spaces (Free)

1. Create a Space

Go to huggingface.co/new-space and create a new Space:

Space name: moderator-api
SDK: Docker
Visibility: Private (recommended)

2. Add secrets

In your Space → Settings → Secrets, add:

| Secret | Value |
| --- | --- |
| `MODERATOR_API_KEY` | A strong random string |
| `HF_TOKEN` | Your Hugging Face token |
| `LLM_VERIFY_LOW` | `0.0` (or `0.40` for grey-zone only) |
| `LLM_VERIFY_HIGH` | `1.0` (or `0.85` for grey-zone only) |

3. Push your code

git init
git remote add space https://huggingface.co/spaces/YOUR_USERNAME/moderator-api
git add .
git commit -m "Deploy moderator API"
git branch -M main   # ensure the local branch is named main before pushing
git push space main

4. Access your API

Once built, your API is live at:

https://YOUR_USERNAME-moderator-api.hf.space/api/v1/moderate

Interactive docs at:

https://YOUR_USERNAME-moderator-api.hf.space/docs

⚠️ First deploy takes ~5 minutes (model downloads ~1 GB).
Subsequent deploys use cached layers and are much faster.

Configuration (Environment Variables)

| Variable | Default | Description |
| --- | --- | --- |
| `MODERATOR_API_KEY` | `change-me-before-production` | Secret key clients must send in the `X-API-Key` header |
| `MODEL_NAME` | `citizenlab/distilbert-base-multilingual-cased-toxicity` | Multilingual Hugging Face model (en/fr/it) |
| `ARABIC_MODEL_NAME` | `Hate-speech-CNERG/dehatebert-mono-arabic` | Dedicated Arabic hate-speech model |
| `TOXICITY_THRESHOLD` | `0.70` | Confidence cutoff to flag text as toxic |
| `ARABIC_TOXICITY_THRESHOLD` | `0.45` | Lower threshold for Arabic (model outputs lower scores) |
| `HF_TOKEN` | (empty; LLM disabled) | Free HF token to enable LLM verification |
| `LLM_MODEL_NAME` | `Qwen/Qwen2.5-72B-Instruct` | LLM used for grey-zone verification |
| `LLM_VERIFY_LOW` | `0.40` | Lower bound of the grey zone |
| `LLM_VERIFY_HIGH` | `0.85` | Upper bound of the grey zone |
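To illustrate how these variables and defaults might be read at startup, here is a minimal sketch using only the standard library. The `Settings` class, `load_settings`, and `threshold_for` are hypothetical names, and the `"ar"` language code is an assumption based on the two-letter code in the sample response; the actual `main.py` may load its configuration differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    api_key: str
    toxicity_threshold: float
    arabic_toxicity_threshold: float
    hf_token: str
    llm_verify_low: float
    llm_verify_high: float

    @property
    def llm_enabled(self) -> bool:
        # An empty HF_TOKEN disables LLM verification.
        return bool(self.hf_token)

    def threshold_for(self, language: str) -> float:
        # "ar" as the Arabic language code is an assumption of this sketch.
        if language == "ar":
            return self.arabic_toxicity_threshold
        return self.toxicity_threshold


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from the environment, using the defaults from the table."""
    return Settings(
        api_key=env.get("MODERATOR_API_KEY", "change-me-before-production"),
        toxicity_threshold=float(env.get("TOXICITY_THRESHOLD", "0.70")),
        arabic_toxicity_threshold=float(env.get("ARABIC_TOXICITY_THRESHOLD", "0.45")),
        hf_token=env.get("HF_TOKEN", ""),
        llm_verify_low=float(env.get("LLM_VERIFY_LOW", "0.40")),
        llm_verify_high=float(env.get("LLM_VERIFY_HIGH", "0.85")),
    )
```

Accepting any mapping rather than reading `os.environ` directly makes the loader easy to exercise with a plain dictionary.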

Project Structure

moderator-api/
├── main.py              # FastAPI application & moderation endpoint
├── requirements.txt     # Python dependencies
├── Dockerfile           # Production container image
├── .dockerignore        # Files excluded from Docker build
├── .env                 # Local environment variables (git-ignored)
├── .env.example         # Template for .env
├── .gitignore
└── README.md

License

MIT