title: Wasla Feedback Moderation API
emoji: 🛡️
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860
pinned: false
Feedback Moderation API
AI-powered microservice that detects toxic / abusive language in user-submitted text.
Classification runs on local Hugging Face transformer models, so your text stays on your infrastructure unless the optional LLM verification layer is enabled.
Dual-Model Architecture
| Language | Model | Labels |
|---|---|---|
| English, French, Italian (and more) | `citizenlab/distilbert-base-multilingual-cased-toxicity` | toxic / not_toxic |
| Arabic | `Hate-speech-CNERG/dehatebert-mono-arabic` | HATE / NON_HATE |
Arabic text is automatically detected using the lingua language-detection library and routed to the dedicated Arabic hate-speech model for significantly higher accuracy (~87.8 %). All other languages use the multilingual model.
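The routing rule described above can be sketched as a small lookup. This is an illustrative sketch, not the service's actual code: the helper names `select_model` and `is_positive` are hypothetical, and it assumes the language detector's result has already been reduced to an ISO 639-1 code such as `"ar"`. The model names and labels are taken from the table above.

```python
# Hypothetical routing helpers; the real service may structure this differently.
MULTILINGUAL_MODEL = "citizenlab/distilbert-base-multilingual-cased-toxicity"
ARABIC_MODEL = "Hate-speech-CNERG/dehatebert-mono-arabic"

# Label each model emits for harmful text, per the table above.
POSITIVE_LABEL = {
    MULTILINGUAL_MODEL: "toxic",
    ARABIC_MODEL: "HATE",
}


def select_model(language_code: str) -> str:
    """Return the model name for an ISO 639-1 language code (assumed input)."""
    if language_code.lower() == "ar":
        return ARABIC_MODEL
    return MULTILINGUAL_MODEL


def is_positive(model_name: str, label: str) -> bool:
    """True when the label is the given model's 'harmful' class."""
    return label == POSITIVE_LABEL[model_name]


if __name__ == "__main__":
    for code in ("en", "fr", "ar"):
        print(code, "->", select_model(code))
```

Keeping the label mapping next to the routing makes it harder to compare an Arabic-model label against the multilingual model's vocabulary by mistake.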
LLM Verification Layer (optional)
When the local classifier's confidence falls in a configurable grey zone (default: 0.40 to 0.85), the text is forwarded to a free Hugging Face Inference API LLM (default: mistralai/Mistral-7B-Instruct-v0.3) for a second opinion. The LLM score is blended with the local score (40/60 weighting) to produce a refined verdict.
- No extra latency for clear-cut predictions (>95 % of requests)
- Free: uses HF Inference Providers with a free token
- Graceful fallback: if the LLM is unavailable, the local model result is used as-is
Set HF_TOKEN in .env to enable. Leave it empty to disable.
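The grey-zone check and score blending can be sketched as follows. This is a minimal illustration under stated assumptions, not the service's implementation: the function names are hypothetical, the bounds are treated as inclusive, and the "40/60" split is assumed to mean 40 % LLM score and 60 % local score (the README does not say which way round it is). Passing `llm_score=None` stands in for a disabled or unavailable LLM.

```python
from typing import Optional, Tuple

LLM_VERIFY_LOW = 0.40   # default lower bound of the grey zone
LLM_VERIFY_HIGH = 0.85  # default upper bound of the grey zone
LLM_WEIGHT = 0.40       # ASSUMPTION: the "40" in 40/60 is the LLM share
LOCAL_WEIGHT = 0.60     # ASSUMPTION: the "60" is the local-model share


def in_grey_zone(score: float,
                 low: float = LLM_VERIFY_LOW,
                 high: float = LLM_VERIFY_HIGH) -> bool:
    """ASSUMPTION: both bounds are inclusive."""
    return low <= score <= high


def refine(local_score: float,
           llm_score: Optional[float] = None) -> Tuple[float, bool]:
    """Return (final_score, llm_verified).

    The local score is returned unchanged when it is clear-cut or when
    no LLM score is available (graceful fallback).
    """
    if not in_grey_zone(local_score) or llm_score is None:
        return local_score, False
    blended = LLM_WEIGHT * llm_score + LOCAL_WEIGHT * local_score
    return blended, True


if __name__ == "__main__":
    print(refine(0.98, llm_score=0.10))  # clear-cut: LLM score is ignored
    print(refine(0.60, llm_score=None))  # grey zone, but no LLM available
    print(refine(0.50, llm_score=1.00))  # grey zone: scores are blended
```

Setting the bounds to 0.0 and 1.0 would, under the same assumptions, send every request through the LLM.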
Quick Start
1. Clone & create a virtual environment
git clone <your-repo-url> moderator-api
cd moderator-api
python -m venv venv
# Windows
.\venv\Scripts\Activate.ps1
# macOS / Linux
source venv/bin/activate
2. Install dependencies
pip install -r requirements.txt
3. Configure
Copy the example env file and set a strong API key:
cp .env.example .env
# then edit .env and set MODERATOR_API_KEY=<your-secret>
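One way to produce a strong key is Python's standard `secrets` module. The sketch below is a suggestion only; the helper name `generate_api_key` and the 32-byte length are assumptions, and any sufficiently long random string works.

```python
import secrets


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random token built from num_bytes of randomness."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Paste the printed line into .env
    print(f"MODERATOR_API_KEY={generate_api_key()}")
```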
4. Run
uvicorn main:app --reload
The server starts at http://127.0.0.1:8000.
Interactive docs are at http://127.0.0.1:8000/docs.
First launch will download two models (~1 GB total).
Subsequent starts load from the local cache.
API Reference
GET /health
Liveness probe. Returns {"status": "ok", "model_loaded": true}.
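Because the first launch downloads the models, a client or deploy script may want to poll this endpoint before sending traffic. The sketch below is one possible approach, not part of the service: `fetch_health` and `wait_until_ready` are hypothetical helpers, and it assumes `model_loaded` can be `false` while the models are still loading, which the README does not state explicitly.

```python
import json
import time
import urllib.request
from typing import Callable


def fetch_health(base_url: str) -> dict:
    """GET <base_url>/health and decode the JSON body."""
    with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
        return json.load(resp)


def wait_until_ready(fetch: Callable[[], dict],
                     attempts: int = 30,
                     delay: float = 2.0) -> bool:
    """Poll until the probe reports status 'ok' with the model loaded."""
    for attempt in range(attempts):
        try:
            body = fetch()
            if body.get("status") == "ok" and body.get("model_loaded") is True:
                return True
        except (OSError, ValueError):
            pass  # server not reachable yet, or body was not valid JSON
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

A typical call would be `wait_until_ready(lambda: fetch_health("http://127.0.0.1:8000"))`. Passing the fetch function in keeps the polling logic testable without a running server.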
POST /api/v1/moderate
Analyse a piece of text for toxicity.
Headers
| Header | Required | Description |
|---|---|---|
| `X-API-Key` | Yes | Your secret API key |
Request body
{
  "text": "This product is terrible and I hate everything about it!"
}
Response
{
  "has_bad_words": true,
  "confidence": 0.9812,
  "label": "toxic",
  "detected_language": "en",
  "llm_verified": false
}
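On the client side, the response can be parsed into a typed object so that a missing field fails loudly instead of surfacing later. This is a sketch, assuming the five fields shown above are always present; the `ModerationResult` class and `parse_result` function are hypothetical names, not part of the API.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ModerationResult:
    has_bad_words: bool
    confidence: float
    label: str
    detected_language: str
    llm_verified: bool


def parse_result(raw: str) -> ModerationResult:
    """Parse a /api/v1/moderate response body; raises KeyError on a missing field."""
    data = json.loads(raw)
    return ModerationResult(
        has_bad_words=bool(data["has_bad_words"]),
        confidence=float(data["confidence"]),
        label=str(data["label"]),
        detected_language=str(data["detected_language"]),
        llm_verified=bool(data["llm_verified"]),
    )


if __name__ == "__main__":
    sample = (
        '{"has_bad_words": true, "confidence": 0.9812, "label": "toxic", '
        '"detected_language": "en", "llm_verified": false}'
    )
    print(parse_result(sample))
```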
Calling from an External App
import requests

API_URL = "http://your-server:8000/api/v1/moderate"
API_KEY = "your-secret-api-key"

# Send the text with the API key header; the timeout avoids hanging on a stalled server.
response = requests.post(
    API_URL,
    json={"text": "I absolutely love this!"},
    headers={"X-API-Key": API_KEY},
    timeout=10,
)
response.raise_for_status()  # surface HTTP errors before parsing the body
result = response.json()

if result["has_bad_words"]:
    print("⚠️ Feedback rejected: toxic content detected.")
else:
    print("✅ Feedback is clean.")
Docker
Build
docker build -t moderator-api .
Run
docker run -d \
  -p 8000:8000 \
  -e MODERATOR_API_KEY=your-secret-api-key \
  --name moderator-api \
  moderator-api
Deploy to Hugging Face Spaces (Free)
1. Create a Space
Go to huggingface.co/new-space and create a new Space:
- Space name: `moderator-api`
- SDK: Docker
- Visibility: Private (recommended)
2. Add secrets
In your Space → Settings → Secrets, add:
| Secret | Value |
|---|---|
| `MODERATOR_API_KEY` | A strong random string |
| `HF_TOKEN` | Your Hugging Face token |
| `LLM_VERIFY_LOW` | `0.0` (or `0.40` for grey-zone only) |
| `LLM_VERIFY_HIGH` | `1.0` (or `0.85` for grey-zone only) |
3. Push your code
git init
git remote add space https://huggingface.co/spaces/YOUR_USERNAME/moderator-api
git add .
git commit -m "Deploy moderator API"
git push space main
4. Access your API
Once built, your API is live at:
https://YOUR_USERNAME-moderator-api.hf.space/api/v1/moderate
Interactive docs at:
https://YOUR_USERNAME-moderator-api.hf.space/docs
⚠️ First deploy takes ~5 minutes (model downloads ~1 GB).
Subsequent deploys use cached layers and are much faster.
Configuration (Environment Variables)
| Variable | Default | Description |
|---|---|---|
| `MODERATOR_API_KEY` | `change-me-before-production` | Secret key clients must send in the `X-API-Key` header |
| `MODEL_NAME` | `citizenlab/distilbert-base-multilingual-cased-toxicity` | Multilingual Hugging Face model (en/fr/it) |
| `ARABIC_MODEL_NAME` | `Hate-speech-CNERG/dehatebert-mono-arabic` | Dedicated Arabic hate-speech model |
| `TOXICITY_THRESHOLD` | `0.70` | Confidence cutoff to flag text as toxic |
| `ARABIC_TOXICITY_THRESHOLD` | `0.45` | Lower threshold for Arabic (model outputs lower scores) |
| `HF_TOKEN` | (empty: LLM disabled) | Free HF token to enable LLM verification |
| `LLM_MODEL_NAME` | `Qwen/Qwen2.5-72B-Instruct` | LLM used for grey-zone verification |
| `LLM_VERIFY_LOW` | `0.40` | Lower bound of the grey zone |
| `LLM_VERIFY_HIGH` | `0.85` | Upper bound of the grey zone |
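As an illustration of how these variables fit together, the sketch below reads them from a mapping with the defaults from the table above. It is not the service's actual loader: the `Settings` class and `load_settings` function are hypothetical, and the range check on the grey-zone bounds is an added assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Defaults copied from the table above.
DEFAULTS = {
    "MODERATOR_API_KEY": "change-me-before-production",
    "MODEL_NAME": "citizenlab/distilbert-base-multilingual-cased-toxicity",
    "ARABIC_MODEL_NAME": "Hate-speech-CNERG/dehatebert-mono-arabic",
    "TOXICITY_THRESHOLD": "0.70",
    "ARABIC_TOXICITY_THRESHOLD": "0.45",
    "HF_TOKEN": "",
    "LLM_MODEL_NAME": "Qwen/Qwen2.5-72B-Instruct",
    "LLM_VERIFY_LOW": "0.40",
    "LLM_VERIFY_HIGH": "0.85",
}


@dataclass(frozen=True)
class Settings:
    api_key: str
    model_name: str
    arabic_model_name: str
    toxicity_threshold: float
    arabic_toxicity_threshold: float
    hf_token: str
    llm_model_name: str
    llm_verify_low: float
    llm_verify_high: float

    @property
    def llm_enabled(self) -> bool:
        """LLM verification is on only when a token is set."""
        return bool(self.hf_token)


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from env (defaults to os.environ), falling back to DEFAULTS."""
    source = os.environ if env is None else env

    def get(name: str) -> str:
        return source.get(name, DEFAULTS[name])

    low = float(get("LLM_VERIFY_LOW"))
    high = float(get("LLM_VERIFY_HIGH"))
    if not 0.0 <= low <= high <= 1.0:  # ASSUMPTION: sanity check, not in the README
        raise ValueError("grey-zone bounds must satisfy 0 <= low <= high <= 1")
    return Settings(
        api_key=get("MODERATOR_API_KEY"),
        model_name=get("MODEL_NAME"),
        arabic_model_name=get("ARABIC_MODEL_NAME"),
        toxicity_threshold=float(get("TOXICITY_THRESHOLD")),
        arabic_toxicity_threshold=float(get("ARABIC_TOXICITY_THRESHOLD")),
        hf_token=get("HF_TOKEN"),
        llm_model_name=get("LLM_MODEL_NAME"),
        llm_verify_low=low,
        llm_verify_high=high,
    )
```

Calling `load_settings({})` returns the documented defaults, which makes it easy to check what a deployment would use before any secrets are set.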
Project Structure
moderator-api/
├── main.py             # FastAPI application & moderation endpoint
├── requirements.txt    # Python dependencies
├── Dockerfile          # Production container image
├── .dockerignore       # Files excluded from Docker build
├── .env                # Local environment variables (git-ignored)
├── .env.example        # Template for .env
├── .gitignore
└── README.md
License
MIT