# PROJECT.md — OpenEnv Environment Project ## 🎯 Project Overview **Environment Name:** `[ENV_NAME]` **Domain:** `[DOMAIN]` _(e.g., Software Engineering / Finance / Healthcare / Legal)_ **Task Summary:** `[ONE_SENTENCE_DESCRIPTION_OF_REAL_WORLD_TASK]` > ⚠️ This must be a **real-world task** — not a game or toy environment. --- ## 🔗 Official OpenEnv References (Always Follow) | # | Tutorial | |---|---------| | 1 | [01-environments.md](https://github.com/meta-pytorch/OpenEnv/blob/main/tutorial/01-environments.md) | | 2 | [02-deployment.md](https://github.com/meta-pytorch/OpenEnv/blob/main/tutorial/02-deployment.md) | | 3 | [03-scaling.md](https://github.com/meta-pytorch/OpenEnv/blob/main/tutorial/03-scaling.md) | | 4 | [04-training.md](https://github.com/meta-pytorch/OpenEnv/blob/main/tutorial/04-training.md) | --- ## 🚀 High-Level Pipeline ``` [REAL_WORLD_TASK / DATA_SOURCE] │ ▼ OpenEnv Environment (server/environment.py) │ FastAPI (server/app.py) ←→ Docker ▼ HTTPEnvClient (client.py) │ reset() / step() / state() ▼ GRPO Training (TRL + vLLM) │ ▼ Fine-tuned LLM → pushed to Hugging Face Hub ``` --- ## 📦 Tech Stack | Layer | Technology | |-------|-----------| | Environment server | FastAPI + Uvicorn | | Containerisation | Docker | | Deployment | Hugging Face Spaces | | Training framework | TRL (GRPOTrainer) | | Model backend | vLLM (colocate mode) | | Base model | `[BASE_MODEL]` _(e.g., Qwen/Qwen3-1.7B)_ | | Package manager | `uv` | --- ## 📁 Repository Layout ``` [ENV_NAME]/ ├── server/ │ ├── app.py ← FastAPI entry point │ ├── environment.py ← Core environment logic │ └── Dockerfile ├── models.py ← Typed Action / Observation / State ├── client.py ← HTTPEnvClient subclass ├── openenv.yaml ← Manifest (required) └── pyproject.toml ``` --- ## ✅ Definition of Done - [ ] `openenv init` scaffold created - [ ] `models.py` — typed `Action`, `Observation`, `State` defined - [ ] `environment.py` — `reset()`, `step()`, `state` implemented - [ ] `server/app.py` — uses `create_fastapi_app(env)` - [ ] `curl /health` → `{"status": "healthy"}` - [ ] Docker image builds and runs locally - [ ] Pushed to HF Spaces via `openenv push` - [ ] GRPO training runs end-to-end - [ ] Fine-tuned model pushed to HF Hub - [ ] Evaluation metrics recorded