
📄 API_CONTRACT.md: API Specification & Schemas


🧠 Overview

This document defines the API contract for the moderation environment.

All endpoints:

  • follow REST principles
  • return JSON responses
  • are deterministic and reproducible
  • align with OpenEnv requirements

🌐 Base URL

http://localhost:8000

(Will be deployed on Hugging Face Spaces)


📌 Endpoints Summary

| Endpoint | Method | Description |
|---|---|---|
| /reset | POST | Initialize a new episode |
| /step | POST | Apply an action |
| /state | GET | Get current state |
| /tasks | GET | List available tasks |
| /grader | GET | Get score for current episode |
| /baseline | GET | Run rule-based baseline agent |
| /agent/run | POST | Run Gemini 2.5 Flash agent (🔹 Added) |
| /health | GET | Liveness check |

🚀 1. /reset

Description

Initializes a new episode.


Request

{
  "task": "easy | medium | hard",
  "seed": 42
}

Response

{
  "state": {
    "post_content": "...",
    "user_history": [...],
    "reports": ...,
    "engagement": "...",
    "geo": "...",
    "available_context": []
  },
  "available_actions": [
    "fetch_user_history",
    "fetch_thread_context",
    "check_policy_clause",
    "mark_violation_type",
    "allow",
    "flag",
    "remove",
    "escalate"
  ],
  "done": false
}
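As an illustration, a client could build the /reset call as follows. This is a minimal sketch using only the Python standard library; the helper names `build_reset_request` and `reset` are hypothetical, and it assumes the server accepts the JSON body shown above at the base URL given in this document.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # base URL stated in this document


def build_reset_request(task: str, seed: int,
                        base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST /reset request. Hypothetical helper."""
    body = json.dumps({"task": task, "seed": seed}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/reset",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def reset(task: str, seed: int) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(build_reset_request(task, seed)) as resp:
        return json.load(resp)
```

Separating request construction from sending keeps the payload easy to inspect without a live server.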

🔄 2. /step

Description

Executes one action in the environment.


Request

{
  "action": "fetch_user_history",
  "parameters": {}
}

Response

{
  "state": {
    "post_content": "...",
    "user_history": [...],
    "reports": ...,
    "engagement": "...",
    "geo": "...",
    "thread_context": [...],
    "policy_context": "...",
    "predicted_violation": "..."
  },
  "reward": 0.2,
  "done": false,
  "info": {
    "step": 2,
    "message": "User history fetched"
  }
}
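The reset/step cycle can be sketched as a small driver loop. This is an illustration under assumptions: `run_episode` and the `post` callable are hypothetical names, and `post` stands in for whatever HTTP client sends a JSON payload to a path and returns the decoded response.

```python
from typing import Any, Callable, Dict, List

# Hypothetical transport: takes (path, json_payload) and returns the decoded JSON.
Post = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_episode(post: Post, actions: List[Dict[str, Any]],
                task: str = "easy", seed: int = 42) -> float:
    """Reset the environment, then apply actions until done. Returns summed reward."""
    result = post("/reset", {"task": task, "seed": seed})
    total_reward = 0.0
    for action in actions:
        if result.get("done"):
            break  # the episode has ended; remaining actions are not sent
        result = post("/step", action)
        total_reward += result.get("reward", 0.0)
    return total_reward
```

Passing the transport in as a parameter lets the same loop run against a live server or a stub in tests.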

📊 3. /state

Description

Returns the current environment state without taking an action.


Request

No request body is required.

Response

{
  "state": {...},
  "step_count": 3,
  "done": false
}

📋 4. /tasks

Description

Returns available tasks and action schema.


Response

{
  "tasks": [
    {
      "name": "easy",
      "description": "Clear violations"
    },
    {
      "name": "medium",
      "description": "Ambiguous moderation"
    },
    {
      "name": "hard",
      "description": "Contextual + geo-aware"
    }
  ],
  "action_schema": {
    "action": "string",
    "parameters": "object (optional)"
  }
}

🧪 5. /grader

Description

Returns the score for the completed episode.


Response

{
  "score": 0.82,
  "breakdown": {
    "final_action": 1.0,
    "violation": 1.0,
    "investigation": 0.66,
    "efficiency": 0.8
  },
  "steps_taken": 4,
  "max_steps": 7
}
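A client might condense this response into a quick diagnostic. The sketch below assumes the field names shown in the example response; `summarize_grade` is a hypothetical helper and does not reproduce how the server computes `score`.

```python
from typing import Any, Dict, Tuple


def summarize_grade(grade: Dict[str, Any]) -> Tuple[str, int]:
    """Return (lowest-scoring breakdown component, unused step budget)."""
    breakdown = grade["breakdown"]
    # min() over the dict keys, ranked by their score values
    weakest = min(breakdown, key=breakdown.get)
    steps_left = grade["max_steps"] - grade["steps_taken"]
    return weakest, steps_left
```

An empty `breakdown` would raise `ValueError`, so callers may want to guard against that case.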

🤖 6. /baseline

Description

Runs the rule-based baseline agent on a task and returns the graded result.

Query Parameters

| Param | Type | Default | Description |
|---|---|---|---|
| task_id | string | easy_harassment | Task to run |
| seed | int | task default | Optional seed |

Response

{
  "task_id": "easy_harassment",
  "seed": 42,
  "score": {
    "final_action_score": 1.0,
    "classification_score": 1.0,
    "investigation_score": 1.0,
    "efficiency_score": 0.5,
    "total": 0.875,
    "breakdown": {}
  },
  "trajectory": [...]
}
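Because /baseline is a GET endpoint, its inputs travel in the query string. A minimal sketch of building that URL, assuming the parameter names in the table above; `baseline_url` is a hypothetical helper.

```python
from typing import Optional
from urllib.parse import urlencode


def baseline_url(task_id: str = "easy_harassment", seed: Optional[int] = None,
                 base_url: str = "http://localhost:8000") -> str:
    """Build the GET /baseline URL; seed is omitted so the task default applies."""
    params = {"task_id": task_id}
    if seed is not None:
        params["seed"] = seed
    return f"{base_url}/baseline?{urlencode(params)}"
```

Leaving `seed` out of the query, rather than sending an empty value, lets the server fall back to the task's default seed.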

🧠 7. /agent/run (🔹 Added)

Description

Runs the Gemini 2.5 Flash LLM agent on a full episode. Requires GOOGLE_API_KEY or GEMINI_API_KEY in the server environment.

Method: POST

Request Body

{
  "task_id": "hard_misinformation",
  "seed": 777
}

Response

{
  "task_id": "hard_misinformation",
  "seed": 777,
  "score": {
    "final_action_score": 1.0,
    "classification_score": 1.0,
    "investigation_score": 1.0,
    "efficiency_score": 0.43,
    "total": 0.88,
    "breakdown": {...}
  },
  "trajectory": [
    { "step": 1, "action": {"action_type": "fetch_user_history", "parameters": {}}, "reward": 0.2, "reward_reason": "..." },
    ...
  ]
}

Error: Missing API Key (HTTP 500)

{ "detail": "GOOGLE_API_KEY or GEMINI_API_KEY environment variable is required..." }
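A client might translate this failure into an actionable message. The sketch below assumes the error body carries a `detail` string as shown; `explain_agent_run_failure` is a hypothetical helper, and matching on the substring `API_KEY` is an illustrative heuristic rather than part of the contract.

```python
import json


def explain_agent_run_failure(status: int, body: str) -> str:
    """Turn a failed /agent/run response into a human-readable hint."""
    try:
        detail = json.loads(body).get("detail", "")
    except (ValueError, AttributeError):
        detail = ""  # body was not a JSON object
    if status == 500 and "API_KEY" in str(detail):
        return "Server is missing GOOGLE_API_KEY or GEMINI_API_KEY; set one and restart."
    return f"Request failed with HTTP {status}: {detail or body}"
```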


🧾 Action Definitions


🔍 Investigation Actions

| Action | Parameters | Description |
|---|---|---|
| fetch_user_history | none | Retrieve user past activity |
| fetch_thread_context | none | Retrieve conversation context |
| check_policy_clause | {clause_id} | Retrieve policy info |

🧠 Classification Actions

| Action | Parameters | Description |
|---|---|---|
| mark_violation_type | {violation_type} | Predict violation category |

🎯 Decision Actions

| Action | Parameters | Description |
|---|---|---|
| allow | none | Approve content |
| flag | none | Mark for review |
| remove | none | Remove content |
| escalate | none | Send to human review |
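The action tables can be mirrored in a small client-side check that runs before a payload is sent to /step. This is a sketch, not the server's validation logic: `REQUIRED_PARAMS` and `validate_action` are hypothetical names, and the parameter name `violation_type` is assumed from the "Missing required parameter" error example in this document.

```python
from typing import Any, Dict, Optional, Tuple

# Action name -> required parameter names, following the action tables.
# Assumption: mark_violation_type expects a parameter named "violation_type".
REQUIRED_PARAMS: Dict[str, Tuple[str, ...]] = {
    "fetch_user_history": (),
    "fetch_thread_context": (),
    "check_policy_clause": ("clause_id",),
    "mark_violation_type": ("violation_type",),
    "allow": (),
    "flag": (),
    "remove": (),
    "escalate": (),
}


def validate_action(payload: Dict[str, Any]) -> Optional[str]:
    """Return an error message for a /step payload, or None if it looks valid."""
    action = payload.get("action")
    if not isinstance(action, str) or action not in REQUIRED_PARAMS:
        return "Invalid action"
    params = payload.get("parameters") or {}
    for name in REQUIRED_PARAMS[action]:
        if name not in params:
            return f"Missing required parameter: {name}"
    return None
```

Checking locally avoids spending a step, and possibly a negative reward, on a malformed request.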

⚠️ Error Handling


Invalid Action

{
  "error": "Invalid action",
  "reward": -0.2
}

Episode Already Complete

{
  "error": "Episode finished",
  "done": true
}

Missing Parameters

{
  "error": "Missing required parameter: violation_type"
}
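A client can branch on these three error shapes. The sketch below assumes the `error` strings are stable and match the examples above; `StepOutcome` and `classify_response` are hypothetical names introduced for illustration.

```python
from enum import Enum
from typing import Any, Dict


class StepOutcome(Enum):
    OK = "ok"
    INVALID_ACTION = "invalid_action"
    EPISODE_FINISHED = "episode_finished"
    MISSING_PARAMETER = "missing_parameter"
    UNKNOWN_ERROR = "unknown_error"


def classify_response(resp: Dict[str, Any]) -> StepOutcome:
    """Map a decoded response body onto one of the documented outcomes."""
    error = resp.get("error")
    if error is None:
        return StepOutcome.OK
    message = str(error)
    if message == "Invalid action":
        return StepOutcome.INVALID_ACTION
    if message == "Episode finished":
        return StepOutcome.EPISODE_FINISHED
    if message.startswith("Missing required parameter"):
        return StepOutcome.MISSING_PARAMETER
    return StepOutcome.UNKNOWN_ERROR
```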

🔍 State Schema

{
  "post_content": "string",
  "user_history": "array",
  "reports": "integer",
  "engagement": "low | medium | high",
  "geo": "string",
  "thread_context": "array",
  "policy_context": "string",
  "predicted_violation": "string",
  "final_action": "string"
}
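The schema can be checked on the client with a few lines of Python. This is a sketch under assumptions: fields are treated as optional because the example responses omit some of them, and `state_problems` is a hypothetical helper rather than part of the environment.

```python
from typing import Any, Dict, List

# Field name -> expected Python type, transcribed from the schema above.
STATE_FIELD_TYPES = {
    "post_content": str,
    "user_history": list,
    "reports": int,
    "engagement": str,
    "geo": str,
    "thread_context": list,
    "policy_context": str,
    "predicted_violation": str,
    "final_action": str,
}
ENGAGEMENT_LEVELS = ("low", "medium", "high")


def state_problems(state: Dict[str, Any]) -> List[str]:
    """List schema mismatches in a state object; an empty list means none found."""
    problems = []
    for name, expected in STATE_FIELD_TYPES.items():
        if name not in state:
            continue  # assumption: a field may be absent until it is populated
        value = state[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, expected) or isinstance(value, bool):
            problems.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    engagement = state.get("engagement")
    if isinstance(engagement, str) and engagement not in ENGAGEMENT_LEVELS:
        problems.append(f"engagement: unexpected value {engagement!r}")
    return problems
```

Fields outside the schema are ignored rather than reported, since the /reset example also includes `available_context`.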

🧠 Design Principles

  • consistent request/response format
  • explicit state representation
  • deterministic outputs
  • minimal ambiguity

🧠 One-Line Summary

A clean, deterministic API contract enabling structured interaction between agent and moderation environment.