---
title: Content Moderation OpenEnv
emoji: 🛡️
colorFrom: blue
colorTo: indigo
sdk: docker
app_file: app.py
pinned: false
---
# Content Moderation OpenEnv
An AI content moderation environment built to the OpenEnv specification. Agents triage real-world content — spam emails, harmful social media posts, and AI-generated deepfakes — using a standard `step()` / `reset()` / `state()` API.
[![OpenEnv Spec](https://img.shields.io/badge/OpenEnv-Spec-blue)](https://github.com/openenv-core/spec)
[![Python 3.11+](https://img.shields.io/badge/Python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![FastAPI](https://img.shields.io/badge/FastAPI-0.111.0-green.svg)](https://fastapi.tiangolo.com/)
[![Docker](https://img.shields.io/badge/Docker-Ready-blue.svg)](https://www.docker.com/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
---
## 📋 Table of Contents
- [Environment Description & Motivation](#environment-description--motivation)
- [Task Descriptions](#task-descriptions)
- [Observation Space](#observation-space)
- [Action Space](#action-space)
- [Reward Functions](#reward-functions)
- [Baseline Scores](#baseline-scores)
- [Setup & Usage](#setup--usage)
- [Requirements](#requirements)
- [Local Installation](#local-installation)
- [Docker Deployment](#docker-deployment)
- [HuggingFace Spaces Deployment](#huggingface-spaces-deployment)
- [Running the Inference Script](#running-the-inference-script)
- [API Reference](#api-reference)
- [Project Structure](#project-structure)
- [Environment Variables](#environment-variables)
- [Running Tests](#running-tests)
- [Troubleshooting](#troubleshooting)
- [Citation](#citation)
- [Acknowledgements](#acknowledgements)
---
## Environment Description & Motivation
Content moderation is a high-stakes, high-volume real-world task. Human moderators review millions of items daily across platforms and inboxes. This environment simulates a realistic moderation pipeline across three difficulty levels, enabling AI agents to learn decision-making strategies under resource constraints.
**Key Challenges:**
- Multi-label classification with imbalanced datasets
- Confidence calibration under uncertainty
- Real-world content variability (spam, deepfakes, policy violations)
- Escalation vs. immediate action tradeoffs
| Task | Difficulty | Content Type | Metrics | Description |
|---|---|---|---|---|
| `text_spam` | Easy | Email / SMS | Binary classification + confidence | Spam vs. legitimate email filtering with sender reputation signals |
| `content_moderation` | Medium | Social media posts | Multi-label (up to 11 labels) | Detect policy violations (hate speech, harassment, violence, etc.) |
| `deepfake_detection` | Hard | Multimodal (image + text) | Detection accuracy + labels | Identify AI-manipulated media using vision transformer + metadata |
**Models Used:**
- **Deepfake Detection**: `dima806/deepfake_vs_real_image_detection` (ViT, ~300MB)
- **Language Model**: `meta-llama/Llama-3.1-8B-Instruct` (8B params, via transformers or HF API)
- **Infrastructure**: FastAPI + Uvicorn, Docker containerization
---
## Task Descriptions
### Task 1: Text Spam Classification (Easy)
**Objective:** Binary classification of emails/SMS as spam or legitimate.
**Dataset:**
- 50 items (30 spam, 20 legitimate)
- Features: text content, sender reputation score, link count, source
- Ground truth: decision + labels
**Example:**
```json
{
"content_id": "ts_001",
"content_type": "text",
"text": "CONGRATULATIONS! You've won $1,000,000! Click here NOW to claim your prize!!!",
"metadata": {"source": "email", "sender_reputation": 0.05, "link_count": 3},
"ground_truth": {
"decision": "reject",
"labels": ["spam", "scam"],
"is_harmful": true
}
}
```
---
### Task 2: Content Moderation (Medium)
**Objective:** Multi-label classification of social media posts for policy violations.
**Dataset:**
- 40 items across diverse platforms
- Labels: spam, scam, phishing, hate_speech, violence, harassment, misinformation, adult_content, deepfake, political_manipulation, fraud
- Features: post text, engagement metrics, user reputation, report count
**Violation Categories:**
| Category | Definition | Examples |
|---|---|---|
| Hate Speech | Dehumanizing content targeting identity | Slurs, discrimination, incitement |
| Violence | Threats or glorification of violence | Physical harm, weapon promotion |
| Harassment | Coordinated or severe personal attacks | Doxxing, targeted campaigns |
| Misinformation | False claims with societal impact | Election fraud claims, health hoaxes |
---
### Task 3: Deepfake Detection (Hard)
**Objective:** Detect AI-manipulated media and classify content appropriately.
**Dataset:**
- 30 items (multimodal: images + descriptions)
- Deepfake detection model outputs raw confidence scores (0-1)
- Features: image description, detector_score, metadata
**Detector Score Interpretation:**
- `0.0-0.3`: Likely real/authentic
- `0.3-0.7`: Uncertain, may require additional analysis
- `0.7-1.0`: Likely deepfake/manipulated
**Example:**
```json
{
"content_id": "df_001",
"content_type": "multimodal",
"image_description": "Portrait of person in business attire, lighting appears natural",
"detector_score": 0.82,
"metadata": {"platform": "social_media", "report_count": 3}
}
```
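The three score bands above can be expressed as a small helper. This is a minimal sketch, not the project's code: the function name `interpret_detector_score` and the band names are hypothetical, and because the documented ranges share their endpoints, assigning exactly 0.3 and 0.7 to the higher band is an assumption.

```python
def interpret_detector_score(score: float) -> str:
    """Map a raw detector score in [0, 1] to one of the three documented bands."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"detector_score must be in [0, 1], got {score}")
    if score < 0.3:
        return "likely_real"
    if score < 0.7:
        # Assumption: 0.3 itself falls in the uncertain band.
        return "uncertain"
    # Assumption: 0.7 itself falls in the deepfake band.
    return "likely_deepfake"


# The example item above carries detector_score 0.82.
print(interpret_detector_score(0.82))
```

An agent could use the band as one input to its decision, for instance escalating items that land in the uncertain band rather than rejecting them outright.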
---
## Observation Space
Every step returns a `ContentObservation` with the following structure:
```json
{
  "content_id": "string",
  "content_type": "text | multimodal",
  "text": "string (optional, for text tasks)",
  "image_description": "string (optional, deepfake task only)",
  "detector_score": "float 0.0-1.0 (optional, deepfake task only)",
  "metadata": {
    "source": "email | social_media | platform",
    "sender_reputation": "float 0.0-1.0",
    "link_count": 0,
    "report_count": 0,
    "timestamp": "ISO8601"
  },
  "step_num": 1,
  "total_steps": 10
}
```
| Field | Type | Task | Description |
|---|---|---|---|
| `content_id` | string | All | Unique identifier for the content item |
| `content_type` | string | All | Type of content: `text` or `multimodal` |
| `text` | string | text_spam, content_moderation | The actual email/post body |
| `image_description` | string | deepfake_detection | AI-generated description of the image |
| `detector_score` | float | deepfake_detection | Raw output from deepfake model (0-1) |
| `metadata` | object | All | Platform-specific signals (reputation, reports, etc.) |
| `step_num` | int | All | Current step in episode |
| `total_steps` | int | All | Total steps in this episode |
---
## Action Space
Agents must respond with a `ModerationAction` object:
```json
{
  "decision": "approve | reject | escalate | flag",
  "reason": "string (max 200 chars, explanation)",
  "confidence": "float 0.0-1.0",
  "labels": ["spam", "scam", "phishing", "hate_speech", "violence",
             "harassment", "misinformation", "adult_content",
             "deepfake", "political_manipulation", "fraud"]
}
```
| Field | Type | Range | Description |
|---|---|---|---|
| `decision` | enum | {approve, reject, escalate, flag} | **approve**: Safe content; **reject**: Clear violation, remove immediately; **flag**: Borderline, needs review; **escalate**: Uncertain, route to specialist |
| `reason` | string | 0-200 chars | Brief explanation of the decision |
| `confidence` | float | [0.0, 1.0] | Agent's confidence in this decision. Calibration is rewarded. |
| `labels` | array | 0-11 labels | Detected violation categories. Each label must come from the allowed set below. |
**Valid Labels:**
- `spam` — Unsolicited bulk messaging
- `scam` — Financial or identity theft attempts
- `phishing` — Social engineering attacks
- `hate_speech` — Dehumanizing content targeting identity groups
- `violence` — Threats, glorification, or instruction of violence
- `harassment` — Personal attacks, doxxing, coordinated abuse
- `misinformation` — False claims with societal impact
- `adult_content` — NSFW or sexually explicit material
- `deepfake` — AI-manipulated media
- `political_manipulation` — Coordinated inauthentic behavior
- `fraud` — Financial scams, false claims
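Before sending an action to the server, a client can check it against the constraints in the table above. The following is a minimal standard-library sketch; the class name `ModerationActionSketch` and its `validate` method are hypothetical and are not the Pydantic `ModerationAction` model that the server defines.

```python
from dataclasses import dataclass, field

VALID_DECISIONS = {"approve", "reject", "escalate", "flag"}
VALID_LABELS = {
    "spam", "scam", "phishing", "hate_speech", "violence", "harassment",
    "misinformation", "adult_content", "deepfake", "political_manipulation",
    "fraud",
}


@dataclass
class ModerationActionSketch:
    """Hypothetical client-side mirror of the documented action fields."""
    decision: str
    reason: str
    confidence: float
    labels: list[str] = field(default_factory=list)

    def validate(self) -> list[str]:
        """Return a list of problems; an empty list means the action looks valid."""
        problems = []
        if self.decision not in VALID_DECISIONS:
            problems.append(f"unknown decision: {self.decision!r}")
        if len(self.reason) > 200:
            problems.append("reason exceeds 200 characters")
        if not 0.0 <= self.confidence <= 1.0:
            problems.append("confidence outside [0.0, 1.0]")
        for label in self.labels:
            if label not in VALID_LABELS:
                problems.append(f"invalid label: {label!r}")
        return problems


action = ModerationActionSketch(
    "reject", "Prize scam with suspicious links", 0.92, ["spam", "scam"]
)
print(action.validate())
```

Collecting every problem in a list, rather than raising on the first one, lets an agent loop repair a malformed LLM response in a single pass.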
---
## Reward Functions
Rewards are computed per task based on decision accuracy, label coverage (F1), and confidence calibration.
### text_spam (Easy)
| Component | Reward | Condition |
|---|---|---|
| Correct decision | **+0.65** | `decision` matches ground truth |
| Escalate on harmful | **+0.30** | Harmful content + escalate/flag (partial credit) |
| Label F1 contribution | **+0.20** | F1 score of predicted vs. true labels |
| Confidence calibration | **±0.10** | Bonus if confident on correct, penalty if confident on wrong |
| **Max per step** | **1.00** | Sum of components (capped) |
### content_moderation (Medium)
| Component | Reward | Condition |
|---|---|---|
| Correct decision | **+0.50** | `decision` matches ground truth |
| Partial credit | **+0.25** | Harmful content + flag/escalate (conservative approach) |
| Label F1 contribution | **+0.35** | Multi-label F1 score (up to 11 labels) |
| Confidence calibration | **±0.10** | Brier score penalty for miscalibration |
| **Max per step** | **1.00** | Sum of components (capped) |
### deepfake_detection (Hard)
| Component | Reward | Condition |
|---|---|---|
| Correct decision | **+0.40** | `decision` matches ground truth |
| Deepfake detection | **+0.30** | Accuracy vs. detector_score threshold |
| Detector alignment | **+0.10** | Bonus for leveraging model signals |
| Label F1 contribution | **+0.20** | Multi-label F1 (fewer labels than medium task) |
| Confidence calibration | **±0.10** | Calibration error penalty |
| **Max per step** | **1.00** | Sum of components (capped) |
**Calibration Bonus Formula:**
```
bonus = 0.1 × (confidence if correct else -confidence)
```
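To make the component tables concrete, here is a sketch of how the `text_spam` reward could be assembled from the documented weights. It is an illustration under stated assumptions, not the contents of `graders.py`: the function names are hypothetical, "correct" is taken to mean the decision matches ground truth, two empty label sets are scored as F1 = 1.0, and the result is clamped to the range [0, 1].

```python
def label_f1(predicted: set[str], truth: set[str]) -> float:
    """Set-based F1 between predicted and ground-truth labels."""
    if not predicted and not truth:
        return 1.0  # assumption: empty vs. empty counts as a perfect match
    overlap = len(predicted & truth)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(truth)
    return 2 * precision * recall / (precision + recall)


def text_spam_reward(decision: str, confidence: float,
                     labels: list[str], truth: dict) -> float:
    """Combine the documented text_spam components into one step reward."""
    correct = decision == truth["decision"]
    if correct:
        reward = 0.65
    elif truth["is_harmful"] and decision in {"escalate", "flag"}:
        reward = 0.30  # partial credit for a cautious call on harmful content
    else:
        reward = 0.0
    reward += 0.20 * label_f1(set(labels), set(truth["labels"]))
    reward += 0.10 * (confidence if correct else -confidence)
    return max(0.0, min(1.0, reward))  # assumption: clamp to [0, 1]


# Ground truth taken from the ts_001 example earlier in this README.
truth = {"decision": "reject", "labels": ["spam", "scam"], "is_harmful": True}
print(round(text_spam_reward("reject", 0.9, ["spam"], truth), 3))
```

The other two tasks would follow the same pattern with their own weights from the tables above.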
---
## Baseline Scores
Scores are reported for **Llama-3.1-8B-Instruct** with `temperature=0.2` and `top_p=0.95`:
| Task | Score | Steps | Notes |
|---|---|---|---|
| `text_spam` | **0.72** | 5 | Strong on obvious spam; struggles with phishing disguised as legitimate |
| `content_moderation` | **0.58** | 8 | Good binary decisions; incomplete label coverage (F1 ≈0.52) |
| `deepfake_detection` | **0.44** | 10 | Relies on image descriptions; independent detector signals underutilized |
---
## Setup & Usage
### Requirements
- **Python**: 3.11 or higher
- **Docker** (optional, for containerized deployment)
- **GPU** (optional, recommended for deepfake models): CUDA 12.1+
- **Memory**: 8GB+ RAM (16GB recommended for local LLM inference)
- **Disk**: 10GB+ (models cached in `~/.cache/huggingface/`)
### Local Installation
1. **Clone and navigate:**
```bash
git clone https://github.com/Anidipta/Content-Moderation-env.git
cd Content-Moderation-env
```
2. **Create virtual environment:**
```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
3. **Install dependencies:**
```bash
pip install -r server/requirements.txt
```
4. **Start the server:**
```bash
uvicorn server.main:app --host 0.0.0.0 --port 7860
```
Server runs at `http://localhost:7860`
5. **Access API documentation:**
- Swagger UI: `http://localhost:7860/docs`
- ReDoc: `http://localhost:7860/redoc`
### Docker Deployment
#### Build the Image
```bash
# Basic build
docker build -f server/Dockerfile -t content-moderation-env .
# Build with memory allocation (recommended)
docker build --memory=4g -f server/Dockerfile -t content-moderation-env .
# Build with progress output
docker build --progress=plain -f server/Dockerfile -t content-moderation-env .
```
#### Run the Container
```bash
# Basic run
docker run -p 7860:7860 content-moderation-env
# Run with environment variables
docker run -p 7860:7860 \
-e API_BASE_URL="https://router.huggingface.co/v1" \
-e MODEL_NAME="meta-llama/Llama-3.1-8B-Instruct" \
-e HF_TOKEN="hf_your_token_here" \
content-moderation-env
# Run with GPU support
docker run --gpus all -p 7860:7860 content-moderation-env
# Run with volume mounts (cache models locally)
docker run -p 7860:7860 \
-v ~/.cache/huggingface:/app/.cache/huggingface \
content-moderation-env
# Run in background
docker run -d -p 7860:7860 --name moderation-env content-moderation-env
# Check logs
docker logs moderation-env
# Stop container
docker stop moderation-env
```
#### Dockerfile Details
The [server/Dockerfile](server/Dockerfile) uses:
- **Base Image**: `python:3.11-slim` (~300MB) — minimal footprint with Python runtime
- **System Dependencies**: `libgl1 libglib2.0-0 curl` — required for vision models and health checks
- **Dependencies Installation**: Multi-stage approach with pip cache optimization
- **Model Preloading**: Deepfake detection model downloaded during build for faster startup
- **Environment Setup**: HuggingFace cache directories and Python settings pre-configured
- **Entry Point**: FastAPI app via Uvicorn on port 7860
```dockerfile
# Key optimizations:
# - --no-cache-dir: Reduces image size by 50%
# - --no-build-isolation: Prevents memory spikes during pip install
# - Pre-downloaded models: Eliminates first-run delays
# - Minimal dependencies: Only libraries needed for the environment
```
#### Deployment to Production
**Docker Compose:**
```yaml
version: '3.8'
services:
moderation-api:
build:
context: .
dockerfile: server/Dockerfile
ports:
- "7860:7860"
environment:
- API_BASE_URL=https://router.huggingface.co/v1
- MODEL_NAME=meta-llama/Llama-3.1-8B-Instruct
- HF_TOKEN=${HF_TOKEN}
volumes:
- ~/.cache/huggingface:/app/.cache/huggingface
restart: unless-stopped
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:7860/health"]
interval: 30s
timeout: 10s
retries: 3
```
Run with: `docker-compose up -d`
### HuggingFace Spaces Deployment
1. Create a new Space with Docker SDK
2. Add Secrets (Settings → Repository secrets):
- `HF_TOKEN`: Your HuggingFace API token
3. Add Variables (Settings → Repository variables):
- `API_BASE_URL`: `https://router.huggingface.co/v1`
- `MODEL_NAME`: `meta-llama/Llama-3.1-8B-Instruct`
4. Push this repository to the Space
5. Space URL becomes your `PING_URL` for validation scripts
---
## Running the Inference Script
```bash
# API mode (HF inference endpoint)
export API_BASE_URL="https://router.huggingface.co/v1"
export MODEL_NAME="meta-llama/Llama-3.1-8B-Instruct"
export HF_TOKEN="hf_your_token_here"
export SERVER_URL="http://localhost:7860"
export TASK_NAME="text_spam"
python inference.py
# Local transformers pipeline mode
export USE_LOCAL_MODEL="true"
python inference.py
```
### Output Format
```
[START] task=text_spam env=content_moderation_env model=meta-llama/Llama-3.1-8B-Instruct
[STEP] step=1 action={"decision":"reject","confidence":0.9,"labels":["spam"]} reward=0.85 done=false error=null
[STEP] step=2 action={"decision":"approve","confidence":0.8,"labels":[]} reward=0.75 done=false error=null
[STEP] step=3 action={"decision":"escalate","confidence":0.5,"labels":["scam"]} reward=0.30 done=false error=null
[STEP] step=4 action={"decision":"reject","confidence":0.85,"labels":["phishing"]} reward=0.70 done=false error=null
[STEP] step=5 action={"decision":"approve","confidence":0.88,"labels":[]} reward=0.75 done=true error=null
[END] success=true steps=5 score=0.720 rewards=0.85,0.75,0.30,0.70,0.75
```
| Field | Type | Description |
|---|---|---|
| `task` | string | The task being evaluated |
| `step` | int | Current step number in episode |
| `decision` | string | Agent's moderation decision |
| `confidence` | float | Agent's confidence (0-1) |
| `labels` | array | Detected violation labels |
| `reward` | float | Reward received for this step |
| `done` | boolean | Episode completion flag |
| `error` | string/null | Error message if applicable |
| `score` | float | Final episode score |
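Because each `[STEP]` line follows a fixed `key=value` layout, the log can be parsed for later analysis. The parser below is a sketch based only on the sample output shown above; the name `parse_step_line` is hypothetical, and the regular expression assumes the `action` value is a single-line JSON object.

```python
import json
import re

# Pattern derived from the sample [STEP] lines; it is not an official grammar.
STEP_RE = re.compile(
    r"^\[STEP\] step=(?P<step>\d+) action=(?P<action>\{.*\}) "
    r"reward=(?P<reward>-?\d+(?:\.\d+)?) done=(?P<done>true|false) "
    r"error=(?P<error>.*)$"
)


def parse_step_line(line: str) -> dict:
    """Turn one [STEP] log line into a dictionary of typed fields."""
    match = STEP_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a [STEP] line: {line!r}")
    return {
        "step": int(match["step"]),
        "action": json.loads(match["action"]),
        "reward": float(match["reward"]),
        "done": match["done"] == "true",
        "error": None if match["error"] == "null" else match["error"],
    }


sample = ('[STEP] step=1 action={"decision":"reject","confidence":0.9,'
          '"labels":["spam"]} reward=0.85 done=false error=null')
print(parse_step_line(sample))
```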
---
## API Reference
### Server Endpoints
All endpoints are JSON-based with FastAPI's automatic validation.
#### 1. Reset Episode
**POST** `/reset`
Start a new moderation episode.
**Request Body:**
```json
{
"task": "text_spam"
}
```
**Response (200 OK):**
```json
{
"observation": {
"content_id": "ts_001",
"content_type": "text",
"text": "CONGRATULATIONS! You've won $1,000,000!...",
"metadata": {"source": "email", "sender_reputation": 0.05, "link_count": 3},
"step_num": 1,
"total_steps": 10
},
"info": {}
}
```
**Error (400):**
```json
{
"detail": "Unknown task 'invalid_task'. Valid: ['text_spam', 'content_moderation', 'deepfake_detection']"
}
```
---
#### 2. Submit Action
**POST** `/step`
Submit a moderation action for the current content.
**Request Body:**
```json
{
"decision": "reject",
"reason": "Email contains typical spam patterns and suspicious links",
"confidence": 0.92,
"labels": ["spam", "scam"]
}
```
**Response (200 OK):**
```json
{
"observation": {
"content_id": "ts_002",
"content_type": "text",
"text": "Hi Sarah, confirming our meeting tomorrow...",
"metadata": {"source": "email", "sender_reputation": 0.92, "link_count": 0},
"step_num": 2,
"total_steps": 10
},
"reward": 0.85,
"done": false,
"info": {}
}
```
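The `/reset` and `/step` endpoints together form a simple episode loop. The client below is a standard-library sketch, not the project's `inference.py`: the helper names `post_json`, `run_episode`, and `always_escalate` are hypothetical, and the loop assumes every `/step` response carries the `reward` and `done` fields shown above.

```python
import json
import urllib.request


def post_json(base_url: str, path: str, payload: dict) -> dict:
    """POST a JSON body and decode the JSON response."""
    request = urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))


def run_episode(base_url: str, task: str, policy, post=post_json) -> float:
    """Reset the environment, then step until done; return the summed rewards."""
    result = post(base_url, "/reset", {"task": task})
    total = 0.0
    done = False
    while not done:
        action = policy(result["observation"])
        result = post(base_url, "/step", action)
        total += result["reward"]
        done = result["done"]
    return total


def always_escalate(observation: dict) -> dict:
    """Placeholder policy (hypothetical): escalate everything at low confidence."""
    return {"decision": "escalate", "reason": "placeholder policy",
            "confidence": 0.5, "labels": []}
```

With a server running locally, `run_episode("http://localhost:7860", "text_spam", always_escalate)` would play one full episode. The `post` parameter exists so the loop can be exercised with a stub instead of a live server.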
---
#### 3. Get Current State
**GET** `/state`
Retrieve the current episode state without taking an action.
**Response (200 OK):**
```json
{
"observation": {...},
"reward": 0.85,
"done": false,
"info": {
"task": "text_spam",
"items_completed": 2,
"total_items": 10,
"cumulative_reward": 1.60
}
}
```
---
#### 4. Close Episode
**POST** `/close`
Explicitly close the episode and clean up resources.
**Response (200 OK):**
```json
{
"status": "closed",
"final_reward": 7.20,
"steps_completed": 10
}
```
---
#### 5. List Available Tasks
**GET** `/tasks`
Get metadata about all available tasks.
**Response (200 OK):**
```json
{
"text_spam": {
"description": "Classify email/message content as spam or legitimate",
"difficulty": "easy",
"num_items": 50,
"content_type": "text"
},
"content_moderation": {
"description": "Detect policy violations in social media posts",
"difficulty": "medium",
"num_items": 40,
"content_type": "text"
},
"deepfake_detection": {
"description": "Identify AI-manipulated media",
"difficulty": "hard",
"num_items": 30,
"content_type": "multimodal"
}
}
```
---
#### 6. Health Check
**GET** `/health`
Check server health and status.
**Response (200 OK):**
```json
{
"status": "ok"
}
```
---
#### 7. Root Endpoint
**GET** `/`
Redirects to interactive Swagger UI documentation.
---
## Project Structure
```
content-moderation-env/
├── README.md # This file
├── uv.lock # Dependency lock file (UV package manager)
├── inference.py # Baseline agent script (235 lines)
│ # Demonstrates LLM agent interaction
│ # Supports HF API and local inference modes
├── server/ # FastAPI application (core)
│ ├── __init__.py # Package marker (empty)
│ │
│ ├── main.py # FastAPI app & HTTP endpoints (57 lines)
│ │ # Defines: /reset, /step, /state, /close
│ │ # /tasks, /health, / endpoints
│ │
│ ├── env.py # OpenEnv environment implementation (122 lines)
│ │ # Core logic: reset(), step(), state(), close()
│ │ # Thread-safe with locks for concurrency
│ │
│ ├── models.py # Pydantic data models
│ │ # Defines: ContentObservation, ModerationAction
│ │ # StepResult, ResetResult, EnvState
│ │
│ ├── tasks.py # Task datasets & ground truth (193 lines)
│ │ # Contains: text_spam, content_moderation,
│ │ # deepfake_detection task definitions & items
│ │
│ ├── graders.py # Reward functions per task (95 lines)
│ │ # Implements: label F1, calibration bonus,
│ │ # decision accuracy scoring logic
│ │
│ ├── deepfake_model.py # HF deepfake detection pipeline (90 lines)
│ │ # Lazy-loads: dima806/deepfake_vs_real...
│ │ # Caches model in HF_HOME for reuse
│ │
│ ├── openenv.yaml # OpenEnv specification metadata
│ │ # Declares task specs, observation/action space
│ │
│ ├── Dockerfile # Docker container definition
│ │ # Base: python:3.11-slim (~300MB)
│ │ # Installs system deps, pip packages,
│ │ # pre-downloads deepfake model
│ │
│ └── requirements.txt # Python dependencies (12 packages)
│ # Key: fastapi, uvicorn, transformers,
│ # torch, openai, python-dotenv
├── test/ # Test suite
│ └── test.py # pytest tests (20+ test cases)
│ # Coverage: tasks, endpoints, rewards
└── .env # Environment variables (git-ignored)
# Stores: HF_TOKEN, API_BASE_URL, etc.
```
---
## Environment Variables
Configuration is controlled via environment variables. Create a `.env` file in the project root:
```env
# ============ API Configuration ============
API_BASE_URL=https://router.huggingface.co/v1
# URL of the LLM inference endpoint
# Default: HuggingFace router (requires HF_TOKEN)
MODEL_NAME=meta-llama/Llama-3.1-8B-Instruct
# Which LLM to use for agent inference
# Other options: gpt-3.5-turbo, claude-3-opus, mistral-large, etc.
HF_TOKEN=hf_your_token_here
# HuggingFace API token for authenticated requests
# Get from: https://huggingface.co/settings/tokens
# ============ Server Configuration ============
SERVER_URL=http://localhost:7860
# Where the OpenEnv API server runs
# Used by inference.py to connect to environment
# ============ Task & Inference Configuration ============
TASK_NAME=text_spam
# Which task to run: text_spam, content_moderation, deepfake_detection
USE_LOCAL_MODEL=false
# If true: Load Llama-3.1-8B locally via transformers
# If false: Use remote API (requires HF_TOKEN)
# Local mode requires 16GB+ RAM
# ============ HuggingFace Model Caching ============
HF_HOME=/app/.cache/huggingface
# Directory for cached HF models and datasets
# Mounted as volume in Docker for persistence
TRANSFORMERS_CACHE=/app/.cache/huggingface
# Alternative env var for transformers library caching
# (recent transformers releases deprecate it in favor of HF_HOME)
# ============ Python Configuration ============
PYTHONDONTWRITEBYTECODE=1
# Don't create __pycache__ directories
PYTHONUNBUFFERED=1
# Stream logs immediately (useful in Docker)
# ============ Logging ============
LOG_LEVEL=INFO
# Log level: DEBUG, INFO, WARNING, ERROR, CRITICAL
```
### Variable Precedence
1. Environment variables (highest priority)
2. `.env` file
3. Hardcoded defaults in code (lowest priority)
Example override:
```bash
export HF_TOKEN="hf_custom_token" && python inference.py
# Uses custom token instead of .env value
```
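The precedence order can be sketched in a few lines of standard-library Python. This is an illustration only: `parse_env_file` and `resolve` are hypothetical helpers, and the project itself may rely on `python-dotenv` (listed among its dependencies), whose `load_dotenv()` by default also leaves already-set environment variables untouched.

```python
import os


def parse_env_file(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def resolve(name: str, env_file: dict[str, str], default: str,
            environ=os.environ) -> str:
    """Environment variable first, then the .env value, then the default."""
    if name in environ:
        return environ[name]
    return env_file.get(name, default)


env_file = parse_env_file("# sample .env\nTASK_NAME=text_spam\n")
# The process environment wins over the .env value.
print(resolve("TASK_NAME", env_file, "text_spam",
              environ={"TASK_NAME": "deepfake_detection"}))
```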
---
## Running Tests
The project includes a pytest suite of 20+ test cases covering tasks, endpoints, and reward grading.
### Setup
```bash
pip install pytest pytest-cov
```
### Run All Tests
```bash
pytest test/test.py -v
```
### Run Specific Test Class
```bash
pytest test/test.py::TestTasks -v
```
### Run with Coverage Report
```bash
pytest test/test.py --cov=server --cov-report=html
# Writes the report to htmlcov/index.html; open that file in a browser to view coverage
```
### Test Categories
| Test | Coverage | Status |
|---|---|---|
| Task loading | All 3 tasks initialize correctly | ✓ |
| API endpoints | /reset, /step, /state, /close, /tasks, /health | ✓ |
| Reward grading | text_spam, content_moderation, deepfake_detection | ✓ |
| Input validation | Action schema validation, label validation | ✓ |
| Edge cases | Empty labels, out-of-range confidence, etc. | ✓ |
---
## Troubleshooting
### Installation Issues
**Problem:** `ImportError: No module named 'openai'`
```bash
# Solution: install the OpenAI client
pip install "openai>=1.40.0"
```
**Problem:** `ImportError: No module named 'torch'`
```bash
# Solution: install PyTorch
pip install torch torchvision
# For GPU: pip install torch torchvision -f https://download.pytorch.org/whl/cu121/torch_stable.html
```
**Problem:** `FileNotFoundError: requirements.txt`
```bash
# Solution: run the install from the project root
cd Content-Moderation-env/
pip install -r server/requirements.txt
```
### Docker Issues
**Problem:** `Segmentation fault (core dumped)` during build
```
# Solution: allocate more memory to the Docker build
docker build --memory=8g -f server/Dockerfile -t content-moderation-env .
```
**Problem:** `failed to solve: failed to compute cache key`
```
# Solution: make sure requirements.txt is in the server/ directory
# Correct: server/requirements.txt
# Wrong:   ./requirements.txt
```
**Problem:** Port 7860 already in use
```bash
# Solution: map a different host port
docker run -p 8000:7860 content-moderation-env
# Now access at http://localhost:8000
```
### Runtime Issues
**Problem:** `Connection refused: localhost:7860`
```bash
# Solution: make sure the server is running
uvicorn server.main:app --host 0.0.0.0 --port 7860
# In Docker, inspect the container logs: docker logs <container_id>
```
**Problem:** `Client.__init__() got an unexpected keyword argument 'proxies'`
```bash
# Solution: upgrade the OpenAI client
pip install --upgrade openai
```
**Problem:** HuggingFace models downloading very slowly
```bash
# Solution: check your internet connection and verify HF_TOKEN
export HF_TOKEN="hf_your_token_here"
# Or download models ahead of time
python -c "from transformers import pipeline; pipeline('image-classification', model='dima806/deepfake_vs_real_image_detection')"
```
### API Issues
**Problem:** Invalid request to `/step` without `/reset`
```
Error: "Environment not initialized. Call /reset first."
Solution: Always call POST /reset before any /step requests.
```
**Problem:** Invalid label in action
```
Error: {"detail": "Invalid label: 'unknown_label'"}
Solution: Use only valid labels from the specification.
```
**Problem:** Confidence out of range
```
Solution: Ensure confidence is between 0.0 and 1.0.
```
---
## Citation
If you use this environment in your research, please cite:
```bibtex
@software{content_moderation_openenv_2025,
title={Content Moderation OpenEnv: A Real-World AI Triage Environment},
author={Anidipta},
year={2025},
url={https://github.com/Anidipta/Content-Moderation-env},
note={OpenEnv Specification Compliant}
}
```
---
## Acknowledgements
🙏 Built for the **OpenEnv Hackathon 2025**.
**Special Thanks To:**
- OpenEnv community for the specification and framework
- HuggingFace for model hosting and inference APIs
- Meta for the Llama-3.1-8B-Instruct model
- Contributors and testers who improved the environment
**Dataset & Content Note:**
The email and content corpus is entirely **synthetic** and does not represent any real individuals, companies, organizations, or actual events. All examples are generated for demonstration and testing purposes only.
**License:** MIT License — See [LICENSE](LICENSE) file for details
**Questions?** Open an issue on GitHub or contact the maintainers.
---
**Last Updated:** April 8, 2026 | **OpenEnv Spec Version:** 1.0