---
title: OpenReward Echo Env (TRL test fixture)
emoji: 🦜
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
license: apache-2.0
---

# OpenReward Echo Env

Minimal [Open Reward Standard](https://openrewardstandard.io) environment used as the test fixture for [`trl.experimental.openreward`](https://github.com/huggingface/trl/tree/main/trl/experimental/openreward).

The model is given a target string and must call `echo(text=...)` with exactly that string. Reward is `1.0` on match, `0.0` otherwise; the episode finishes on a correct echo.

Pure Python — no sandbox, no external state — so responses are deterministic and the env can run thousands of concurrent sessions on free-tier hardware.

## Use

```python
import os
os.environ["OPENREWARD_API_URL"] = "https://trl-internal-testing-openreward-echo-env.hf.space"
os.environ["OPENREWARD_SESSION_URL"] = "https://trl-internal-testing-openreward-echo-env.hf.space"

from trl.experimental.openreward import OpenRewardSpec

spec = OpenRewardSpec(
    "https://trl-internal-testing-openreward-echo-env.hf.space",
    env_name="echoenvironment",
)
print(spec.train_dataset)  # 8 rows: target ∈ {"hello", "world", "trl", ...}
```

The two `OPENREWARD_*_URL` overrides are needed because the `openreward` SDK by default expects a two-subdomain platform layout (`api.` + `sessions.`); for a single-host self-hosted server both have to point at the same URL.

## Tasks

| split | count | shape |
|---|---|---|
| `train` | 8 | `{"id": "echo-N", "target": ""}` |

## Local development

```bash
pip install -r requirements.txt
python server.py   # listens on PORT (default 8080)
```

## Files

- `server.py` — env definition + `Server([EchoEnvironment]).run()`
- `Dockerfile` — `python:3.11-slim` + the deps; HF Spaces serves it on port 7860
- `requirements.txt` — `openreward`, `fastapi`, `uvicorn`
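
## Reward logic sketch

The reward rule described at the top of this README (reward `1.0` on an exact match, `0.0` otherwise, episode finished on a correct echo) can be illustrated in plain Python. This is a minimal sketch, not the code in `server.py`: `EchoResult` and `score_echo` are hypothetical names, it does not use the `openreward` SDK, and it assumes "exactly" means a case-sensitive comparison with no whitespace trimming.

```python
from dataclasses import dataclass


@dataclass
class EchoResult:
    """Outcome of one echo call (hypothetical type, not part of the openreward SDK)."""

    reward: float
    finished: bool


def score_echo(target: str, text: str) -> EchoResult:
    """Give reward 1.0 and finish the episode only when text equals target exactly."""
    matched = text == target  # assumption: case-sensitive, no trimming
    return EchoResult(reward=1.0 if matched else 0.0, finished=matched)


print(score_echo("hello", "hello"))  # EchoResult(reward=1.0, finished=True)
print(score_echo("hello", "Hello"))  # EchoResult(reward=0.0, finished=False)
```

Because the score depends only on the two strings, the same inputs always produce the same reward, which is what makes the environment suitable as a deterministic test fixture.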