---
title: Content Moderation OpenEnv
emoji: 🛡️
colorFrom: blue
colorTo: indigo
sdk: docker
app_file: app.py
pinned: false
---

# Content Moderation OpenEnv

An AI content moderation environment built to the OpenEnv specification. Agents triage real-world content (spam emails, harmful social media posts, and AI-generated deepfakes) using a standard `step()` / `reset()` / `state()` API.

[![OpenEnv Spec](https://img.shields.io/badge/OpenEnv-Spec-blue)](https://github.com/openenv-core/spec)
[![Python 3.11+](https://img.shields.io/badge/Python-3.11+-blue.svg)](https://www.python.org/downloads/)
[![FastAPI](https://img.shields.io/badge/FastAPI-0.111.0-green.svg)](https://fastapi.tiangolo.com/)
[![Docker](https://img.shields.io/badge/Docker-Ready-blue.svg)](https://www.docker.com/)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)

---

## 📋 Table of Contents

- [Environment Description & Motivation](#environment-description--motivation)
- [Task Descriptions](#task-descriptions)
- [Observation Space](#observation-space)
- [Action Space](#action-space)
- [Reward Functions](#reward-functions)
- [Baseline Scores](#baseline-scores)
- [Setup & Usage](#setup--usage)
  - [Requirements](#requirements)
  - [Local Installation](#local-installation)
  - [Docker Deployment](#docker-deployment)
  - [HuggingFace Spaces Deployment](#huggingface-spaces-deployment)
- [Running the Inference Script](#running-the-inference-script)
- [API Reference](#api-reference)
- [Project Structure](#project-structure)
- [Environment Variables](#environment-variables)
- [Running Tests](#running-tests)
- [Troubleshooting](#troubleshooting)
- [Citation](#citation)
- [Acknowledgements](#acknowledgements)

---

## Environment Description & Motivation

Content moderation is a high-stakes, high-volume real-world task. Human moderators review millions of items daily across platforms and inboxes.
This environment simulates a realistic moderation pipeline across three difficulty levels, enabling AI agents to learn decision-making strategies under resource constraints.

**Key Challenges:**
- Multi-label classification with imbalanced datasets
- Confidence calibration under uncertainty
- Real-world content variability (spam, deepfakes, policy violations)
- Escalation vs. immediate action tradeoffs

| Task | Difficulty | Content Type | Metrics | Description |
|---|---|---|---|---|
| `text_spam` | Easy | Email / SMS | Binary classification + confidence | Spam vs. legitimate email filtering with sender reputation signals |
| `content_moderation` | Medium | Social media posts | Multi-label (up to 11 labels) | Detect policy violations (hate speech, harassment, violence, etc.) |
| `deepfake_detection` | Hard | Multimodal (image + text) | Detection accuracy + labels | Identify AI-manipulated media using vision transformer + metadata |

**Models Used:**
- **Deepfake Detection**: `dima806/deepfake_vs_real_image_detection` (ViT, ~300MB)
- **Language Model**: `meta-llama/Llama-3.1-8B-Instruct` (8B params, via transformers or HF API)
- **Infrastructure**: FastAPI + Uvicorn, Docker containerization

---

## Task Descriptions

### Task 1: Text Spam Classification (Easy)

**Objective:** Binary classification of emails/SMS as spam or legitimate.

**Dataset:**
- 50 items (30 spam, 20 legitimate)
- Features: text content, sender reputation score, link count, source
- Ground truth: decision + labels

**Example:**
```json
{
  "content_id": "ts_001",
  "content_type": "text",
  "text": "CONGRATULATIONS! You've won $1,000,000! Click here NOW to claim your prize!!!",
  "metadata": {"source": "email", "sender_reputation": 0.05, "link_count": 3},
  "ground_truth": {
    "decision": "reject",
    "labels": ["spam", "scam"],
    "is_harmful": true
  }
}
```

---

### Task 2: Content Moderation (Medium)

**Objective:** Multi-label classification of social media posts for policy violations.

**Dataset:**
- 40 items across diverse platforms
- Labels: spam, scam, phishing, hate_speech, violence, harassment, misinformation, adult_content, deepfake, political_manipulation, fraud
- Features: post text, engagement metrics, user reputation, report count

**Violation Categories:**

| Category | Definition | Examples |
|---|---|---|
| Hate Speech | Dehumanizing content targeting identity | Slurs, discrimination, incitement |
| Violence | Threats or glorification of violence | Physical harm, weapon promotion |
| Harassment | Coordinated or severe personal attacks | Doxxing, targeted campaigns |
| Misinformation | False claims with societal impact | Election fraud claims, health hoaxes |

---

### Task 3: Deepfake Detection (Hard)

**Objective:** Detect AI-manipulated media and classify content appropriately.

**Dataset:**
- 30 items (multimodal: images + descriptions)
- Deepfake detection model outputs raw confidence scores (0-1)
- Features: image description, detector_score, metadata

**Detector Score Interpretation:**
- `0.0-0.3`: Likely real/authentic
- `0.3-0.7`: Uncertain, may require additional analysis
- `0.7-1.0`: Likely deepfake/manipulated

**Example:**
```json
{
  "content_id": "df_001",
  "content_type": "multimodal",
  "image_description": "Portrait of person in business attire, lighting appears natural",
  "detector_score": 0.82,
  "metadata": {"platform": "social_media", "report_count": 3}
}
```

---

## Observation Space

Every step returns a `ContentObservation` with the following structure (values shown are type and range placeholders, not literal data):

```json
{
  "content_id": "string",
  "content_type": "text | multimodal",
  "text": "string (optional, for text tasks)",
  "image_description": "string (optional, deepfake task only)",
  "detector_score": "0.0-1.0 (optional, deepfake task only)",
  "metadata": {
    "source": "email | social_media | platform",
    "sender_reputation": "0.0-1.0",
    "link_count": 0,
    "report_count": 0,
    "timestamp": "ISO8601"
  },
  "step_num": 1,
  "total_steps": 10
}
```

| Field | Type | Task | Description |
|---|---|---|---|
| `content_id` | string | All | Unique identifier for the content item |
| `content_type` | string | All | Type of content: `text` or `multimodal` |
| `text` | string | text_spam, content_moderation | The actual email/post body |
| `image_description` | string | deepfake_detection | AI-generated description of the image |
| `detector_score` | float | deepfake_detection | Raw output from deepfake model (0-1) |
| `metadata` | object | All | Platform-specific signals (reputation, reports, etc.) |
| `step_num` | int | All | Current step in episode |
| `total_steps` | int | All | Total steps in this episode |

---

## Action Space

Agents must respond with a `ModerationAction` object:

```json
{
  "decision": "approve | reject | escalate | flag",
  "reason": "string (max 200 chars, explanation)",
  "confidence": "0.0-1.0",
  "labels": ["spam", "scam", "phishing", "hate_speech", "violence", "harassment", "misinformation", "adult_content", "deepfake", "political_manipulation", "fraud"]
}
```

| Field | Type | Range | Description |
|---|---|---|---|
| `decision` | enum | {approve, reject, escalate, flag} | **approve**: Safe content; **reject**: Clear violation, remove immediately; **flag**: Borderline, needs review; **escalate**: Uncertain, route to specialist |
| `reason` | string | 0-200 chars | Brief explanation of the decision |
| `confidence` | float | [0.0, 1.0] | Agent's confidence in this decision. Calibration is rewarded. |
| `labels` | array | 0-11 labels | Detected violation categories. Each label must come from the allowed set. |

**Valid Labels:**
- `spam`: Unsolicited bulk messaging
- `scam`: Financial or identity theft attempts
- `phishing`: Social engineering attacks
- `hate_speech`: Dehumanizing content targeting identity groups
- `violence`: Threats, glorification, or instruction of violence
- `harassment`: Personal attacks, doxxing, coordinated abuse
- `misinformation`: False claims with societal impact
- `adult_content`: NSFW or sexually explicit material
- `deepfake`: AI-manipulated media
- `political_manipulation`: Coordinated inauthentic behavior
- `fraud`: Financial scams, false claims

---

## Reward Functions

Rewards are computed per task based on decision accuracy, label coverage (F1), and confidence calibration.

### text_spam (Easy)

| Component | Reward | Condition |
|---|---|---|
| Correct decision | **+0.65** | `decision` matches ground truth |
| Escalate on harmful | **+0.30** | Harmful content + escalate/flag (partial credit) |
| Label F1 contribution | **+0.20** | F1 score of predicted vs. true labels |
| Confidence calibration | **±0.10** | Bonus if confident on correct, penalty if confident on wrong |
| **Max per step** | **1.00** | Sum of components (capped) |

### content_moderation (Medium)

| Component | Reward | Condition |
|---|---|---|
| Correct decision | **+0.50** | `decision` matches ground truth |
| Partial credit | **+0.25** | Harmful content + flag/escalate (conservative approach) |
| Label F1 contribution | **+0.35** | Multi-label F1 score (up to 11 labels) |
| Confidence calibration | **±0.10** | Brier score penalty for miscalibration |
| **Max per step** | **1.00** | Sum of components (capped) |

### deepfake_detection (Hard)

| Component | Reward | Condition |
|---|---|---|
| Correct decision | **+0.40** | `decision` matches ground truth |
| Deepfake detection | **+0.30** | Accuracy vs. detector_score threshold |
| Detector alignment | **+0.10** | Bonus for leveraging model signals |
| Label F1 contribution | **+0.20** | Multi-label F1 (fewer labels than medium task) |
| Confidence calibration | **±0.10** | Calibration error penalty |
| **Max per step** | **1.00** | Sum of components (capped) |

**Calibration Bonus Formula:**
```
bonus = 0.1 × (confidence if correct else -confidence)
```

---

## Baseline Scores

Scores reported for **Llama-3.1-8B-Instruct** with `temperature=0.2` and `top_p=0.95`:

| Task | Score | Steps | Notes |
|---|---|---|---|
| `text_spam` | **0.72** | 5 | Strong on obvious spam; struggles with phishing disguised as legitimate |
| `content_moderation` | **0.58** | 8 | Good binary decisions; incomplete label coverage (F1 ≈0.52) |
| `deepfake_detection` | **0.44** | 10 | Relies on image descriptions; independent detector signals underutilized |

---

## Setup & Usage

### Requirements

- **Python**: 3.11 or higher
- **Docker** (optional, for containerized deployment)
- **GPU** (optional, recommended for deepfake models): CUDA 12.1+
- **Memory**: 8GB+ RAM (16GB recommended for local LLM inference)
- **Disk**: 10GB+ (models cached in `~/.cache/huggingface/`)

### Local Installation

1. **Clone and navigate:**
   ```bash
   git clone https://github.com/Anidipta/Content-Moderation-env.git
   cd Content-Moderation-env
   ```

2. **Create virtual environment:**
   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows: venv\Scripts\activate
   ```

3. **Install dependencies:**
   ```bash
   pip install -r server/requirements.txt
   ```

4. **Start the server:**
   ```bash
   uvicorn server.main:app --host 0.0.0.0 --port 7860
   ```
   Server runs at `http://localhost:7860`

5. **Access API documentation:**
   - Swagger UI: `http://localhost:7860/docs`
   - ReDoc: `http://localhost:7860/redoc`

### Docker Deployment

#### Build the Image

```bash
# Basic build
docker build -f server/Dockerfile -t content-moderation-env .

# Build with memory allocation (recommended)
docker build --memory=4g -f server/Dockerfile -t content-moderation-env .

# Build with progress output
docker build --progress=plain -f server/Dockerfile -t content-moderation-env .
```

#### Run the Container

```bash
# Basic run
docker run -p 7860:7860 content-moderation-env

# Run with environment variables
docker run -p 7860:7860 \
  -e API_BASE_URL="https://router.huggingface.co/v1" \
  -e MODEL_NAME="meta-llama/Llama-3.1-8B-Instruct" \
  -e HF_TOKEN="hf_your_token_here" \
  content-moderation-env

# Run with GPU support
docker run --gpus all -p 7860:7860 content-moderation-env

# Run with volume mounts (cache models locally)
docker run -p 7860:7860 \
  -v ~/.cache/huggingface:/app/.cache/huggingface \
  content-moderation-env

# Run in background
docker run -d -p 7860:7860 --name moderation-env content-moderation-env

# Check logs
docker logs moderation-env

# Stop container
docker stop moderation-env
```

#### Dockerfile Details

The [server/Dockerfile](server/Dockerfile) uses:

- **Base Image**: `python:3.11-slim` (~300MB), a minimal footprint with the Python runtime
- **System Dependencies**: `libgl1 libglib2.0-0 curl`, required for vision models and health checks
- **Dependencies Installation**: Multi-stage approach with pip cache optimization
- **Model Preloading**: Deepfake detection model downloaded during build for faster startup
- **Environment Setup**: HuggingFace cache directories and Python settings pre-configured
- **Entry Point**: FastAPI app via Uvicorn on port 7860

```dockerfile
# Key optimizations:
# - --no-cache-dir: Reduces image size by 50%
# - --no-build-isolation: Prevents memory spikes during pip install
# - Pre-downloaded models: Eliminates first-run delays
# - Minimal dependencies: Only libraries needed for the environment
```

#### Deployment to Production

**Docker Compose:**
```yaml
version: '3.8'
services:
  moderation-api:
    build:
      context: .
      dockerfile: server/Dockerfile
    ports:
      - "7860:7860"
    environment:
      - API_BASE_URL=https://router.huggingface.co/v1
      - MODEL_NAME=meta-llama/Llama-3.1-8B-Instruct
      - HF_TOKEN=${HF_TOKEN}
    volumes:
      - ~/.cache/huggingface:/app/.cache/huggingface
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:7860/health"]
      interval: 30s
      timeout: 10s
      retries: 3
```

Run with: `docker-compose up -d`

### HuggingFace Spaces Deployment

1. Create a new Space with Docker SDK
2. Add Secrets (Settings → Repository secrets):
   - `HF_TOKEN`: Your HuggingFace API token
3. Add Variables (Settings → Repository variables):
   - `API_BASE_URL`: `https://router.huggingface.co/v1`
   - `MODEL_NAME`: `meta-llama/Llama-3.1-8B-Instruct`
4. Push this repository to the Space
5. Space URL becomes your `PING_URL` for validation scripts

---

## Running the Inference Script

```bash
# API mode (HF inference endpoint)
export API_BASE_URL="https://router.huggingface.co/v1"
export MODEL_NAME="meta-llama/Llama-3.1-8B-Instruct"
export HF_TOKEN="hf_your_token_here"
export SERVER_URL="http://localhost:7860"
export TASK_NAME="text_spam"
python inference.py

# Local transformers pipeline mode
export USE_LOCAL_MODEL="true"
python inference.py
```

### Output Format

```
[START] task=text_spam env=content_moderation_env model=meta-llama/Llama-3.1-8B-Instruct
[STEP] step=1 action={"decision":"reject","confidence":0.9,"labels":["spam"]} reward=0.85 done=false error=null
[STEP] step=2 action={"decision":"approve","confidence":0.8,"labels":[]} reward=0.75 done=false error=null
[STEP] step=3 action={"decision":"escalate","confidence":0.5,"labels":["scam"]} reward=0.30 done=false error=null
[STEP] step=4 action={"decision":"reject","confidence":0.85,"labels":["phishing"]} reward=0.70 done=false error=null
[STEP] step=5 action={"decision":"approve","confidence":0.88,"labels":[]} reward=0.75 done=true error=null
[END] success=true steps=5 score=0.720 rewards=0.85,0.75,0.30,0.70,0.75
```

| Field | Type | Description |
|---|---|---|
| `task` | string | The task being evaluated |
| `step` | int | Current step number in episode |
| `decision` | string | Agent's moderation decision |
| `confidence` | float | Agent's confidence (0-1) |
| `labels` | array | Detected violation labels |
| `reward` | float | Reward received for this step |
| `done` | boolean | Episode completion flag |
| `error` | string/null | Error message if applicable |
| `score` | float | Final episode score |

---

## API Reference

### Server Endpoints

All endpoints are JSON-based with FastAPI's automatic validation.

#### 1. Reset Episode

**POST** `/reset`

Start a new moderation episode.

**Request Body:**
```json
{
  "task": "text_spam"
}
```

**Response (200 OK):**
```json
{
  "observation": {
    "content_id": "ts_001",
    "content_type": "text",
    "text": "CONGRATULATIONS! You've won $1,000,000!...",
    "metadata": {"source": "email", "sender_reputation": 0.05, "link_count": 3},
    "step_num": 1,
    "total_steps": 10
  },
  "info": {}
}
```

**Error (400):**
```json
{
  "detail": "Unknown task 'invalid_task'. Valid: ['text_spam', 'content_moderation', 'deepfake_detection']"
}
```

---

#### 2. Submit Action

**POST** `/step`

Submit a moderation action for the current content.

**Request Body:**
```json
{
  "decision": "reject",
  "reason": "Email contains typical spam patterns and suspicious links",
  "confidence": 0.92,
  "labels": ["spam", "scam"]
}
```

**Response (200 OK):**
```json
{
  "observation": {
    "content_id": "ts_002",
    "content_type": "text",
    "text": "Hi Sarah, confirming our meeting tomorrow...",
    "metadata": {"source": "email", "sender_reputation": 0.92, "link_count": 0},
    "step_num": 2,
    "total_steps": 10
  },
  "reward": 0.85,
  "done": false,
  "info": {}
}
```

---

#### 3. Get Current State

**GET** `/state`

Retrieve the current episode state without taking an action.

**Response (200 OK):**
```json
{
  "observation": {...},
  "reward": 0.85,
  "done": false,
  "info": {
    "task": "text_spam",
    "items_completed": 2,
    "total_items": 10,
    "cumulative_reward": 1.60
  }
}
```

---

#### 4. Close Episode

**POST** `/close`

Explicitly close the episode and clean up resources.

**Response (200 OK):**
```json
{
  "status": "closed",
  "final_reward": 7.20,
  "steps_completed": 10
}
```

---

#### 5. List Available Tasks

**GET** `/tasks`

Get metadata about all available tasks.

**Response (200 OK):**
```json
{
  "text_spam": {
    "description": "Classify email/message content as spam or legitimate",
    "difficulty": "easy",
    "num_items": 50,
    "content_type": "text"
  },
  "content_moderation": {
    "description": "Detect policy violations in social media posts",
    "difficulty": "medium",
    "num_items": 40,
    "content_type": "text"
  },
  "deepfake_detection": {
    "description": "Identify AI-manipulated media",
    "difficulty": "hard",
    "num_items": 30,
    "content_type": "multimodal"
  }
}
```

---

#### 6. Health Check

**GET** `/health`

Check server health and status.

**Response (200 OK):**
```json
{
  "status": "ok"
}
```

---

#### 7. Root Endpoint

**GET** `/`

Redirects to interactive Swagger UI documentation.
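
#### Example: Minimal Client Loop

The endpoints above can be driven from a short script: one `POST /reset`, then repeated `POST /step` calls until `done` is true. The sketch below is only an illustration and is not the repository's `inference.py`. The helper names `post_json`, `rule_based_policy`, and `run_episode` are hypothetical, the policy is a hand-written stand-in for an LLM agent, and the code assumes the request and response shapes shown in this section. With the server running, `run_episode("http://localhost:7860", "text_spam")` should return the list of per-step rewards.

```python
import json
from urllib import request


def post_json(url, payload):
    """POST a JSON body and return the decoded JSON response."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


def rule_based_policy(observation):
    """Hypothetical stand-in for an LLM agent.

    Rejects items from low-reputation senders that contain links and
    approves everything else. Thresholds are illustrative assumptions.
    """
    meta = observation.get("metadata", {})
    suspicious = (
        meta.get("sender_reputation", 1.0) < 0.2
        and meta.get("link_count", 0) > 0
    )
    return {
        "decision": "reject" if suspicious else "approve",
        "reason": "low sender reputation with links" if suspicious else "no spam signals",
        "confidence": 0.9 if suspicious else 0.7,
        "labels": ["spam"] if suspicious else [],
    }


def run_episode(base_url, task, policy=rule_based_policy, post=post_json):
    """Run one episode and return the list of per-step rewards.

    `post` is injectable so the loop can be exercised without a live server.
    """
    result = post(f"{base_url}/reset", {"task": task})
    rewards = []
    done = False
    while not done:
        # Decide on the current observation, then submit the action.
        action = policy(result["observation"])
        result = post(f"{base_url}/step", action)
        rewards.append(result["reward"])
        done = result["done"]
    return rewards
```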

---

## Project Structure

```
content-moderation-env/
│
├── README.md                 # This file
├── uv.lock                   # Dependency lock file (UV package manager)
├── inference.py              # Baseline agent script (235 lines)
│                             # Demonstrates LLM agent interaction
│                             # Supports HF API and local inference modes
│
├── server/                   # FastAPI application (core)
│   ├── __init__.py           # Package marker (empty)
│   │
│   ├── main.py               # FastAPI app & HTTP endpoints (57 lines)
│   │                         # Defines: /reset, /step, /state, /close
│   │                         #          /tasks, /health, / endpoints
│   │
│   ├── env.py                # OpenEnv environment implementation (122 lines)
│   │                         # Core logic: reset(), step(), state(), close()
│   │                         # Thread-safe with locks for concurrency
│   │
│   ├── models.py             # Pydantic data models
│   │                         # Defines: ContentObservation, ModerationAction
│   │                         #          StepResult, ResetResult, EnvState
│   │
│   ├── tasks.py              # Task datasets & ground truth (193 lines)
│   │                         # Contains: text_spam, content_moderation,
│   │                         #           deepfake_detection task definitions & items
│   │
│   ├── graders.py            # Reward functions per task (95 lines)
│   │                         # Implements: label F1, calibration bonus,
│   │                         #             decision accuracy scoring logic
│   │
│   ├── deepfake_model.py     # HF deepfake detection pipeline (90 lines)
│   │                         # Lazy-loads: dima806/deepfake_vs_real...
│   │                         # Caches model in HF_HOME for reuse
│   │
│   ├── openenv.yaml          # OpenEnv specification metadata
│   │                         # Declares task specs, observation/action space
│   │
│   ├── Dockerfile            # Docker container definition
│   │                         # Base: python:3.11-slim (~300MB)
│   │                         # Installs system deps, pip packages,
│   │                         # pre-downloads deepfake model
│   │
│   └── requirements.txt      # Python dependencies (12 packages)
│                             # Key: fastapi, uvicorn, transformers,
│                             #      torch, openai, python-dotenv
│
├── test/                     # Test suite
│   └── test.py               # pytest tests (20+ test cases)
│                             # Coverage: tasks, endpoints, rewards
│
└── .env                      # Environment variables (git-ignored)
                              # Stores: HF_TOKEN, API_BASE_URL, etc.
```

---

## Environment Variables

Configuration is controlled via environment variables.
Create a `.env` file in the project root:

```env
# ============ API Configuration ============
API_BASE_URL=https://router.huggingface.co/v1
# URL of the LLM inference endpoint
# Default: HuggingFace router (requires HF_TOKEN)

MODEL_NAME=meta-llama/Llama-3.1-8B-Instruct
# Which LLM to use for agent inference
# Other options: gpt-3.5-turbo, claude-3-opus, mistral-large, etc.

HF_TOKEN=hf_your_token_here
# HuggingFace API token for authenticated requests
# Get from: https://huggingface.co/settings/tokens

# ============ Server Configuration ============
SERVER_URL=http://localhost:7860
# Where the OpenEnv API server runs
# Used by inference.py to connect to environment

# ============ Task & Inference Configuration ============
TASK_NAME=text_spam
# Which task to run: text_spam, content_moderation, deepfake_detection

USE_LOCAL_MODEL=false
# If true: Load Llama-3.1-8B locally via transformers
# If false: Use remote API (requires HF_TOKEN)
# Local mode requires 16GB+ RAM

# ============ HuggingFace Model Caching ============
HF_HOME=/app/.cache/huggingface
# Directory for cached HF models and datasets
# Mounted as volume in Docker for persistence

TRANSFORMERS_CACHE=/app/.cache/huggingface
# Alternative env var for transformers library caching

# ============ Python Configuration ============
PYTHONDONTWRITEBYTECODE=1
# Don't create __pycache__ directories

PYTHONUNBUFFERED=1
# Stream logs immediately (useful in Docker)

# ============ Logging ============
LOG_LEVEL=INFO
# Log level: DEBUG, INFO, WARNING, ERROR, CRITICAL
```

### Variable Precedence

1. Environment variables (highest priority)
2. `.env` file
3. Hardcoded defaults in code (lowest priority)

Example override:
```bash
# Uses the custom token instead of the .env value
export HF_TOKEN="hf_custom_token" && python inference.py
```

---

## Running Tests

The project includes a comprehensive test suite using pytest.
### Setup

```bash
pip install pytest pytest-cov
```

### Run All Tests

```bash
pytest test/test.py -v
```

### Run Specific Test Class

```bash
pytest test/test.py::TestTasks -v
```

### Run with Coverage Report

```bash
pytest test/test.py --cov=server --cov-report=html
# Writes the report to htmlcov/index.html; open it in a browser to view coverage
```

### Test Categories

| Test | Coverage | Status |
|---|---|---|
| Task loading | All 3 tasks initialize correctly | ✓ |
| API endpoints | /reset, /step, /state, /close, /tasks, /health | ✓ |
| Reward grading | text_spam, content_moderation, deepfake_detection | ✓ |
| Input validation | Action schema validation, label validation | ✓ |
| Edge cases | Empty labels, out-of-range confidence, etc. | ✓ |

---

## Troubleshooting

### Installation Issues

**Problem:** `ImportError: No module named 'openai'`

**Solution:**
```bash
pip install "openai>=1.40.0"
```

**Problem:** `ImportError: No module named 'torch'`

**Solution:**
```bash
pip install torch torchvision
# For GPU (CUDA 12.1 wheels):
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu121
```

**Problem:** `FileNotFoundError: requirements.txt`

**Solution:** Ensure you're in the project root, then install from `server/`:
```bash
cd Content-Moderation-env/
pip install -r server/requirements.txt
```

### Docker Issues

**Problem:** `Segmentation fault (core dumped)` during build

**Solution:** Allocate more memory to the Docker build:
```bash
docker build --memory=8g -f server/Dockerfile -t content-moderation-env .
```

**Problem:** `failed to solve: failed to compute cache key`

**Solution:** Ensure `requirements.txt` is in the `server/` directory:
```
# Correct: server/requirements.txt
# Wrong:   ./requirements.txt
```

**Problem:** Port 7860 already in use

**Solution:** Map a different host port:
```bash
docker run -p 8000:7860 content-moderation-env
# Now access at http://localhost:8000
```

### Runtime Issues

**Problem:** `Connection refused: localhost:7860`

**Solution:** Ensure the server is running. In Docker, check the container logs instead:
```bash
uvicorn server.main:app --host 0.0.0.0 --port 7860
# In Docker (container started with --name moderation-env):
docker logs moderation-env
```

**Problem:** `Client.__init__() got an unexpected keyword argument 'proxies'`

**Solution:** Update the OpenAI client:
```bash
pip install --upgrade openai
```

**Problem:** HuggingFace models downloading very slowly

**Solution:** Check your internet connection and verify `HF_TOKEN`, or download the model ahead of time:
```bash
export HF_TOKEN="hf_your_token_here"
# Or download models ahead of time
python -c "from transformers import pipeline; pipeline('image-classification', model='dima806/deepfake_vs_real_image_detection')"
```

### API Issues

**Problem:** Invalid request to `/step` without `/reset`

**Error:**
```json
"Environment not initialized. Call /reset first."
```

**Solution:** Always call `POST /reset` before any `/step` requests.

**Problem:** Invalid label in action

**Error:**
```json
{"detail": "Invalid label: 'unknown_label'"}
```

**Solution:** Use only valid labels from the specification.

**Problem:** Confidence out of range

**Solution:** Ensure `confidence` is between 0.0 and 1.0.

---

## Citation

If you use this environment in your research, please cite:

```bibtex
@software{content_moderation_openenv_2025,
  title={Content Moderation OpenEnv: A Real-World AI Triage Environment},
  author={Anidipta},
  year={2025},
  url={https://github.com/Anidipta/Content-Moderation-env},
  note={OpenEnv Specification Compliant}
}
```

---

## Acknowledgements

🙏 Built for the **OpenEnv Hackathon 2025**.

**Special Thanks To:**
- OpenEnv community for the specification and framework
- HuggingFace for model hosting and inference APIs
- Meta for the Llama-3.1-8B-Instruct model
- Contributors and testers who improved the environment

**Dataset & Content Note:** The email and content corpus is entirely **synthetic** and does not represent any real individuals, companies, organizations, or actual events. All examples are generated for demonstration and testing purposes only.

**License:** MIT License. See the [LICENSE](LICENSE) file for details.

**Questions?** Open an issue on GitHub or contact the maintainers.

---

**Last Updated:** April 8, 2026 | **OpenEnv Spec Version:** 1.0