---
title: Context Corruption Env
emoji: 🔍
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: mit
---
ContextCorruption-Env
OpenEnv Hackathon | Meta x Hugging Face x PyTorch
ContextCorruption-Env is an OpenEnv environment for training epistemic robustness in LLMs. The agent receives a factual question plus retrieved documents, some of which are deliberately corrupted. It must answer the question and flag unreliable sources.
This submission targets Theme #3.1: World Modeling / Professional Tasks. The environment simulates a partially observable information workspace where some evidence is trustworthy and some evidence lies.
Required Materials
- Environment Space: https://huggingface.co/spaces/Siddh12334/context-corruption-env
- Mini-blog / writeup: BLOG.md
- Training Space: https://huggingface.co/spaces/Siddh12334/context-corruption-training
- Trained LoRA checkpoint: https://huggingface.co/Siddh12334/qwen-1.5b-context-corruption
- Training logs/history: assets/training_history_rl5jygl8.csv
- Raw training output log: assets/wandb_run_rl5jygl8/output.log
- Completion samples: assets/completions_samples.md
- Training script: training/train_grpo.py
- Notebook: training/ContextCorruption_GRPO.ipynb
Environment Summary
Each episode contains:
- 1 factual question
- 8 retrieved documents
- 1-4 corrupted documents
- 12-step budget
- deterministic reward
The agent can take four actions:
- read_doc: spend budget to inspect a document
- flag_suspicious: mark a document as likely corrupted
- unflag_doc: remove a flag
- submit_answer: finish with an answer and confidence score
The environment is intentionally simple to run but hard to master. A weak agent can guess an answer. A stronger agent must notice contradictions and avoid over-flagging clean documents.
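To make the episode layout and the four actions concrete, here is a minimal sketch of an episode state. It is not the code in environment/: the class name `EpisodeState`, its fields, and the rule that each `read_doc` call costs exactly one budget step are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EpisodeState:
    """Hypothetical episode state; names are illustrative, not the repo's API."""
    question: str
    documents: list[str]          # the 8 retrieved documents
    budget: int = 12              # 12-step budget
    read: set[int] = field(default_factory=set)
    flagged: set[int] = field(default_factory=set)
    answer: Optional[str] = None
    confidence: Optional[float] = None
    done: bool = False

    def read_doc(self, doc_id: int) -> str:
        # Assumption: every read costs exactly one step of budget.
        if self.done or self.budget <= 0:
            raise RuntimeError("episode finished or budget exhausted")
        self.budget -= 1
        self.read.add(doc_id)
        return self.documents[doc_id]

    def flag_suspicious(self, doc_id: int) -> None:
        # Mark a document as likely corrupted.
        self.flagged.add(doc_id)

    def unflag_doc(self, doc_id: int) -> None:
        # Remove a flag; unflagging an unflagged document is a no-op here.
        self.flagged.discard(doc_id)

    def submit_answer(self, answer: str, confidence: float) -> None:
        # Ends the episode with an answer and a confidence score.
        self.answer = answer
        self.confidence = confidence
        self.done = True


state = EpisodeState(
    question="What is the capital of France?",
    documents=[f"document {i}" for i in range(8)],
)
state.read_doc(0)
state.flag_suspicious(3)
state.unflag_doc(3)
state.submit_answer("Paris", confidence=0.8)
```

In this sketch flagging and unflagging are free, so the agent's only cost is reading. Whether the real environment charges budget for other actions is not stated here.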
Interactive Demo UI
The FastAPI app serves a lightweight frontend at /. It lets users start an episode, inspect the eight retrieved documents, spend read budget, flag suspicious documents, submit an answer with confidence, and optionally call the trained model through /model/infer.
Run locally with:
uvicorn environment.server:app --host 0.0.0.0 --port 7860
Reward
The reward is deterministic and compositional. There is no hidden LLM judge.
| Component | What It Rewards | Weight |
|---|---|---|
| Answer correctness | exact match after normalization | +0.40 |
| Corruption recall | fraction of corrupt docs found | +0.30 |
| Precision | avoids false accusations | +0.20 |
| Confidence calibration | confidence helps only when correct | +/-0.10 |
| Efficiency | small bonus for conserving budget | +0.05 |
Reward range: -0.5 to 1.05.
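The weighted sum in the table can be sketched as follows. Only the component names and weights come from the table. The normalization rule, the precision and calibration formulas, the efficiency ratio, and the names `normalize` and `sketch_reward` are assumptions, so treat this as an illustration rather than the environment's actual reward code.

```python
import string


def normalize(text: str) -> str:
    """Assumed normalization: lowercase, drop punctuation, collapse whitespace."""
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    return " ".join(cleaned.split())


def sketch_reward(
    answer: str,
    gold: str,
    confidence: float,
    flagged: set[int],
    corrupted: set[int],
    budget_left: int,
    budget_total: int = 12,
) -> float:
    """Hypothetical compositional reward using the weights from the table."""
    correct = normalize(answer) == normalize(gold)
    true_positives = len(flagged & corrupted)
    # Recall: fraction of corrupted documents that were flagged.
    recall = true_positives / len(corrupted) if corrupted else 0.0
    # Precision: fraction of flags that hit a corrupted document.
    # Assumption: flagging nothing earns no precision credit.
    precision = true_positives / len(flagged) if flagged else 0.0
    # Calibration: confidence helps when correct and hurts when wrong.
    calibration = confidence if correct else -confidence
    # Efficiency: share of the step budget left unspent (assumed form).
    efficiency = budget_left / budget_total
    return (
        0.40 * float(correct)
        + 0.30 * recall
        + 0.20 * precision
        + 0.10 * calibration
        + 0.05 * efficiency
    )


best = sketch_reward("Paris.", "paris", 1.0, {1, 2}, {1, 2}, budget_left=12)
worst = sketch_reward("Lyon", "paris", 1.0, set(), {1, 2}, budget_left=0)
```

Under these assumptions `best` reaches the stated maximum of about 1.05, while `worst` only falls to about -0.10. The documented lower bound of -0.5 therefore implies penalty terms that this sketch does not model.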
Results
We trained Qwen2-1.5B-Instruct with GRPO using Unsloth / TRL. The run was sized for hackathon constraints, but its final logged reward (0.3289) was roughly 2.5 times the random baseline's average reward (0.1302).
| Agent | Reward Evidence |
|---|---|
| Random baseline | 0.1302 avg reward over 100 episodes |
| Qwen2-1.5B GRPO | 0.3289 final logged reward in the finished WandB run |
The trained LoRA adapter is pushed to the Hub and is loaded by the hosted Space through /model/infer for a live sanity check.
Additional exported charts are stored under assets/.
The WandB run was exported into this repo so judges do not need access to a private project. See the raw log, scalar history, config, summary, and completion tables under assets/wandb_run_rl5jygl8/.
Repo Structure
environment/ # OpenEnv environment, actions, reward, server, model inference
data/ # QA loading, corruptions, document generation
training/ # GRPO training script and notebook
eval/ # random baseline evaluation
assets/ # charts, exported training logs, completion samples

