# PROJECT.md: OpenEnv Environment Project
## Project Overview
**Environment Name:** `[ENV_NAME]`
**Domain:** `[DOMAIN]` _(e.g., Software Engineering / Finance / Healthcare / Legal)_
**Task Summary:** `[ONE_SENTENCE_DESCRIPTION_OF_REAL_WORLD_TASK]`
> ⚠️ This must be a **real-world task**, not a game or toy environment.
---
## Official OpenEnv References (Always Follow)
| # | Tutorial |
|---|---------|
| 1 | [01-environments.md](https://github.com/meta-pytorch/OpenEnv/blob/main/tutorial/01-environments.md) |
| 2 | [02-deployment.md](https://github.com/meta-pytorch/OpenEnv/blob/main/tutorial/02-deployment.md) |
| 3 | [03-scaling.md](https://github.com/meta-pytorch/OpenEnv/blob/main/tutorial/03-scaling.md) |
| 4 | [04-training.md](https://github.com/meta-pytorch/OpenEnv/blob/main/tutorial/04-training.md) |
---
## High-Level Pipeline
```
[REAL_WORLD_TASK / DATA_SOURCE]
        │
        ▼
OpenEnv Environment (server/environment.py)
        │  FastAPI (server/app.py) ── Docker
        ▼
HTTPEnvClient (client.py)
        │  reset() / step() / state()
        ▼
GRPO Training (TRL + vLLM)
        │
        ▼
Fine-tuned LLM → pushed to Hugging Face Hub
```
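The client side of this pipeline reduces to a reset/step loop. The sketch below is a minimal illustration under stated assumptions: `StepResult`, `EnvClient`, `EchoEnv`, and `run_episode` are hypothetical names rather than OpenEnv APIs, and the in-memory stub stands in for a real `HTTPEnvClient` subclass talking to the server.

```python
from dataclasses import dataclass
from typing import Callable, Protocol


@dataclass
class StepResult:
    """Hypothetical result shape for one reset() or step() call."""
    observation: str
    reward: float
    done: bool


class EnvClient(Protocol):
    """Hypothetical interface; a real client would subclass HTTPEnvClient."""

    def reset(self) -> StepResult: ...

    def step(self, action: str) -> StepResult: ...


class EchoEnv:
    """In-memory stand-in for the HTTP client, used only for illustration."""

    def __init__(self, max_steps: int = 3) -> None:
        self.max_steps = max_steps
        self.steps = 0

    def reset(self) -> StepResult:
        self.steps = 0
        return StepResult(observation="start", reward=0.0, done=False)

    def step(self, action: str) -> StepResult:
        self.steps += 1
        # Toy reward: 1.0 when the action is exactly "ok", otherwise 0.0.
        reward = 1.0 if action == "ok" else 0.0
        done = self.steps >= self.max_steps
        return StepResult(observation=f"step {self.steps}", reward=reward, done=done)


def run_episode(env: EnvClient, policy: Callable[[str], str]) -> float:
    """Run one episode to completion and return the summed reward."""
    result = env.reset()
    total = 0.0
    while not result.done:
        result = env.step(policy(result.observation))
        total += result.reward
    return total


total_reward = run_episode(EchoEnv(max_steps=3), policy=lambda obs: "ok")
print(total_reward)
```

During training, the policy callable would be replaced by model generation and the per-episode reward would feed the GRPO reward function; how that wiring looks depends on the TRL version in use.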
---
## Tech Stack
| Layer | Technology |
|-------|-----------|
| Environment server | FastAPI + Uvicorn |
| Containerisation | Docker |
| Deployment | Hugging Face Spaces |
| Training framework | TRL (GRPOTrainer) |
| Model backend | vLLM (colocate mode) |
| Base model | `[BASE_MODEL]` _(e.g., Qwen/Qwen3-1.7B)_ |
| Package manager | `uv` |
---
## Repository Layout
```
[ENV_NAME]/
├── server/
│   ├── app.py           ← FastAPI entry point
│   ├── environment.py   ← Core environment logic
│   └── Dockerfile
├── models.py            ← Typed Action / Observation / State
├── client.py            ← HTTPEnvClient subclass
├── openenv.yaml         ← Manifest (required)
└── pyproject.toml
```
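To make the split between `models.py` and `server/environment.py` concrete, here is a minimal sketch of typed models plus an environment exposing `reset()`, `step()`, and a `state` property. It uses plain dataclasses so that it runs on its own; in a real project these classes would presumably derive from the OpenEnv base types. The ticket-labelling task and every class name are hypothetical placeholders for `[ENV_NAME]`.

```python
from dataclasses import dataclass


@dataclass
class TicketAction:
    """Hypothetical action: the label the agent assigns to a ticket."""
    label: str


@dataclass
class TicketObservation:
    """Hypothetical observation: next ticket text plus reward and done flag."""
    text: str
    reward: float = 0.0
    done: bool = False


@dataclass
class TicketState:
    """Hypothetical episode bookkeeping exposed through `state`."""
    episode_id: int = 0
    step_count: int = 0


class TicketEnvironment:
    """Toy environment: one labelled ticket per step (illustrative data only)."""

    TICKETS = [("Refund not received", "billing"), ("App crashes on login", "bug")]

    def __init__(self) -> None:
        self._state = TicketState()
        self._index = 0

    def reset(self) -> TicketObservation:
        # Start a new episode and show the first ticket.
        self._state = TicketState(episode_id=self._state.episode_id + 1)
        self._index = 0
        return TicketObservation(text=self.TICKETS[0][0])

    def step(self, action: TicketAction) -> TicketObservation:
        if self._index >= len(self.TICKETS):
            raise RuntimeError("Episode is finished; call reset() first.")
        _, expected = self.TICKETS[self._index]
        reward = 1.0 if action.label == expected else 0.0
        self._state.step_count += 1
        self._index += 1
        done = self._index >= len(self.TICKETS)
        next_text = "" if done else self.TICKETS[self._index][0]
        return TicketObservation(text=next_text, reward=reward, done=done)

    @property
    def state(self) -> TicketState:
        return self._state
```

Keeping the models in their own module lets `client.py` and the server import the same types, so both sides agree on the wire format.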
---
## Definition of Done
- [ ] `openenv init` scaffold created
- [ ] `models.py`: typed `Action`, `Observation`, `State` defined
- [ ] `environment.py`: `reset()`, `step()`, `state` implemented
- [ ] `server/app.py`: uses `create_fastapi_app(env)`
- [ ] `curl /health` → `{"status": "healthy"}`
- [ ] Docker image builds and runs locally
- [ ] Pushed to HF Spaces via `openenv push`
- [ ] GRPO training runs end-to-end
- [ ] Fine-tuned model pushed to HF Hub
- [ ] Evaluation metrics recorded
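The `/health` item in the checklist can be verified from a script as well as with `curl`. The following is a minimal sketch using only the standard library; the helper names `is_healthy` and `check_health` are hypothetical, and the base URL passed in (for example a local container port) is an assumption that depends on how the server is started.

```python
import json
import urllib.request


def is_healthy(body: bytes) -> bool:
    """Return True when the response body is JSON with status == "healthy"."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Covers both invalid JSON and undecodable bytes.
        return False
    return isinstance(payload, dict) and payload.get("status") == "healthy"


def check_health(base_url: str, timeout: float = 5.0) -> bool:
    """GET <base_url>/health and report whether the server says it is healthy."""
    url = f"{base_url.rstrip('/')}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and is_healthy(resp.read())
    except (OSError, ValueError):
        # OSError covers connection failures and HTTP errors;
        # ValueError covers malformed URLs.
        return False
```

Calling `check_health("http://localhost:8000")` after `docker run` would cover the health and Docker items together, assuming the container publishes port 8000.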