Spaces:
Sleeping
API_NOTES.md
Corrections to PROJECT.md based on installed code inspection
Authority: this file > PROJECT.md when they conflict (per Β§0)
Recon performed against installed openenv-core==0.2.3 on 2026-04-25
in this repo's .venv (Python 3.12.13). Source paths below are
relative to .venv/lib/python3.12/site-packages/openenv/.
Installed versions
openenv-core: 0.2.3 β installed viapip install openenv-core(no extras needed; the[core]extra resolves but adds nothing beyond the bare install for our use)- Python: 3.12.13 (.venv)
- The CLI entry point
openenvis on PATH after install.openenv init <name> -o <dir>works; it scaffolded a 17-file template into~/recon_scratch/recon_env/for inspection (kept out of repo).
Section 13.1 β Imports
PROJECT.md says:
from openenv.core.env_server.interfaces import (
Action, Environment, Observation, State,
)
from openenv.core.env_server import create_app
from openenv.core.env_client import EnvClient
from openenv.core.client_types import StepResult
from openenv.core.rubrics.base import Rubric
from openenv.core.rubrics.containers import WeightedSum, Gate
Installed code shows:
Action,Observation,Stateare defined incore/env_server/types.py(lines 54, 72, 178). They are re-exported bycore/env_server/interfaces.pyline 13 (from .types import ...) and bycore/env_server/__init__.pylines 50-71.Environmentis defined incore/env_server/interfaces.pyline 98.create_appis defined incore/env_server/http_server.pyline 1489 and re-exported bycore/env_server/__init__.pyline 18.EnvClientis defined incore/env_client.pyline 54 and exposed at the top-level viafrom openenv.core import EnvClient(lazy attr, seecore/__init__.pylines 47-69).StepResult,Rubric,WeightedSum,Gatepaths match exactly.
Use this instead: PROJECT.md's imports all work β no change needed.
However, the canonical location for Action/Observation/State is
openenv.core.env_server.types, which is what the scaffolded template
uses. Either path resolves to the same classes.
Section 13.2 β Action/Observation/State base fields
PROJECT.md says (Β§6.1, Β§6.2, Β§13.2, Β§17.7): Observation inherits
done: bool, reward: bool|int|float|None, metadata: Dict[str, Any].
Action inherits metadata: Dict[str, Any]. State inherits
episode_id: Optional[str], step_count: int.
Installed code shows: All field claims verified
(types.py:54-92, 178-197). Two important details PROJECT.md omits:
ActionandObservationboth setmodel_config = ConfigDict(extra="forbid", ...). Subclasses cannot rely on Pydantic accepting unknown attributes β every field a subclass uses must be declared.Observation.metadata: Dict[str, Any]is already declared, so the pattern in Β§17.7 (populatingobservation.metadatabefore passing to the rubric) is fine.State.model_config = ConfigDict(extra="allow", ...). The state class is permissive, but follow PROJECT.md Β§13.2 and declare every field anyway.
Section 13.3 β Environment subclass pattern
PROJECT.md says:
class ShutdownGymEnvironment(Environment[ShutdownAction, ShutdownObservation, ShutdownState]):
SUPPORTS_CONCURRENT_SESSIONS = True
REQUIRES_SINGLE_THREAD_EXECUTOR = False
def __init__(self, tier: int = 2, max_turns: int = 30, use_strict_operator: bool = False):
rubric = build_rubric(tier)
super().__init__(rubric=rubric)
Installed code shows (core/env_server/interfaces.py:98-298):
Environment(ABC, Generic[ActT, ObsT, StateT])β generic with three type vars, exactly as PROJECT.md uses it.- Class attribute
SUPPORTS_CONCURRENT_SESSIONS: bool = False(line 128). SettingTruein subclass works as PROJECT.md describes. REQUIRES_SINGLE_THREAD_EXECUTORdoes NOT exist on the base class (verified bygrep -rn "REQUIRES_SINGLE_THREAD" core/β no matches;hasattr(Environment, ...)β False). Setting it in the subclass is silently ignored. Drop the line. If you need single-thread execution semantics, look atconcurrency_configoncreate_app, not a class flag.__init__signature:__init__(self, transform=None, rubric=None). Passingrubric=matches.- Required overrides:
reset(seed=None, episode_id=None, **kwargs),step(action, timeout_s=None, **kwargs), and thestateproperty. Notestepacceptstimeout_sβ PROJECT.md's signature only takesaction, **kwargs, which is compatible (the timeout becomes part of**kwargs) but you may want to capture it explicitly if you need it. - Async pairs
reset_async/step_asyncexist with default implementations that call the sync versions. Override only if your env genuinely benefits from async I/O. _apply_rubric(action, observation) -> floatis a helper on the base that callsself.rubric(action, observation)β exactly what Β§13.3 uses.
Use this instead: PROJECT.md is correct except remove
REQUIRES_SINGLE_THREAD_EXECUTOR = False.
Section 13.4 β Client subclass pattern
PROJECT.md says (Β§13.4): Subclass EnvClient[Action, Observation, State] with _step_payload, _parse_result, _parse_state. Use sync
via with X(base_url=...).sync() as env:.
Installed code shows (core/env_client.py):
class EnvClient(ABC, Generic[ActT, ObsT, StateT])(line 54).- The three abstract hooks PROJECT.md lists exist with the exact names and signatures (lines 358, 363, 368).
- The client is async-by-default.
__enter__raises aTypeErrorwith a message instructing you to useasync withor.sync()(lines 446-453). PROJECT.md'swith ... .sync() as env:pattern is correct. from_docker_image(image, provider=None, **kwargs)exists as anasync classmethod(line 240) β must be awaited. Slides showingEnvName.from_docker_image(...)as a sync call were wrong.from_env(repo_id, *, use_docker=True, ...)async classmethod for spinning up a HuggingFace Space-backed env (line 273).- Top-level shortcut:
from openenv.core import EnvClientresolves to the same class (lazy import viacore/__init__.py:47-69). HTTPEnvClientdoes not exist. Slides got the name wrong.
Use this instead: PROJECT.md Β§13.4 is correct as written. Add only
that from_docker_image is async (relevant for any future Day 2 demo
code that wants to spin up the env locally without a manual
docker run).
Section 13.5 β Server entry point (create_app vs create_fastapi_app)
PROJECT.md says:
from openenv.core.env_server import create_app
app = create_app(
ShutdownGymEnvironment, # FACTORY (the class)
ShutdownAction,
ShutdownObservation,
env_name="shutdown_gym",
max_concurrent_envs=32,
)
Installed code shows (core/env_server/http_server.py:1489-1546):
def create_app(
env: Callable[[], Environment],
action_cls: Type[Action],
observation_cls: Type[Observation],
env_name: Optional[str] = None,
max_concurrent_envs: Optional[int] = None,
concurrency_config: Optional[ConcurrencyConfig] = None,
gradio_builder: Optional[Callable[..., Any]] = None,
) -> FastAPI:
- The first positional is annotated
Callable[[], Environment]. A no-arg class works (callingCls()returns an instance). For a class with required__init__args, wrap it in alambdaor a factory function. - Internally,
create_appchecks the env varENABLE_WEB_INTERFACE. If unset (the default), it dispatches tocreate_fastapi_app(line 1544) with the same env/action/obs positionals, just droppingenv_nameandgradio_builder.
Both names exist:
create_appβ primary; takesenv_name=for README integration and optional Gradio UI at/webwhenENABLE_WEB_INTERFACEis set.create_fastapi_appβ bare FastAPI app, no web UI, no env_name. Same env/action/obs positional contract ascreate_app.- Slides claimed
create_fastapi_app(env_instance)with a single positional arg. That signature does not exist at v0.2.3 β both names take(env_factory, action_cls, observation_cls, ...).
Use this instead: PROJECT.md Β§13.5 is correct. The
ShutdownGymEnvironment.__init__(tier=..., max_turns=..., use_strict_operator=...) from Β§13.3 cannot be passed directly as a no-arg
factory because the constructor requires args. Wrap it:
app = create_app(
lambda: ShutdownGymEnvironment(tier=2, max_turns=30),
ShutdownAction,
ShutdownObservation,
env_name="shutdown_gym",
max_concurrent_envs=32,
)
Or give __init__ defaults for every parameter and pass the class
directly. The scaffold pattern (no-arg __init__) is the simpler
default; per-session config (tier, strict-operator flag) is better
threaded through reset(**kwargs) since OpenEnv's ResetRequest has
extra="allow" and Environment.reset accepts **kwargs.
Section 17 β Rubric APIs (WeightedSum, Gate, Rubric base, RubricDict)
PROJECT.md claims [VERIFIED]:
Rubric.__init__()takes no arguments β weights are passed toWeightedSum, not to child rubrics.RubricDict.forward()raisesNotImplementedErrorβ must useWeightedSumfor the top-level combiner.WeightedSum(rubrics, weights)validateslen(rubrics) == len(weights)and weights sum to 1.0.
Installed code confirms all three:
Rubric.__init__(self)βcore/rubrics/base.py:44-49. Onlyself, no other params.inspect.signature(Rubric.__init__).parametersβ['self'].RubricDict.forwardβcore/rubrics/containers.py:533-538. RaisesNotImplementedError("RubricDict.forward() is not implemented. Use RubricDict within a parent rubric that defines aggregation.").WeightedSum.__init__(self, rubrics: List[Rubric], weights: List[float])βcore/rubrics/containers.py:341-363. RaisesValueErroron length mismatch (line 352) orabs(sum(weights) - 1.0) > 1e-6(line 357).Gate.__init__(self, rubric: Rubric, threshold: float = 1.0)βcore/rubrics/containers.py:271-281. Default threshold is 1.0, exactly what PROJECT.md Β§17.4 uses.Rubric.forward(self, action, observation) -> floatis the only abstract method. The base also exposeslast_score, hooks,named_rubrics(),get_rubric(path)β all useful for introspection during training.
Use this instead: PROJECT.md Β§17 is correct as written. Two minor notes worth keeping for the implementer:
Rubric.__call__already handles sync/async dispatch. Always defineforward(not__call__) on a subclass.WeightedSum.forwardignores hooks; the dispatch logic lives in__call__. Subclasses or callers should invoke the rubric via the callable form (rubric(action, observation)), notrubric.forward(...), if they want hooks to fire.
Section 19.3 β TRL rollout function shape
PROJECT.md says [TODO-VERIFY]: rollout returns
{prompt_ids, completion_ids, logprobs, env_rewards, tampering_rate, math_accuracy}, with reward_func_env(env_rewards, **kwargs) reading
env_rewards (plural).
TRL OpenEnv docs (huggingface.co/docs/trl/en/openenv) show TWO modes, only one of which is rollout-based:
Mode A β environment_factory (the new recommended path)
You pass a class to GRPOTrainer(environment_factory=YourEnv). The
class implements reset(self, **kwargs) -> str | None and one Python
method per tool the model is allowed to call. Method docstrings
(typed args, return) become the tool schema. The trainer drives the
multi-turn loop, parses tool calls, and routes results back into the
prompt automatically.
ENV_URL = "https://my-env.hf.space"
class ShutdownToolEnv:
def __init__(self):
self.client = ShutdownGymClient(base_url=ENV_URL).sync()
self.client.__enter__() # establish session
self.reward = 0.0
self.tampering_events: list[str] = []
def reset(self, tier: int = 2, **kwargs) -> str | None:
self.reward = 0.0
self.tampering_events = []
result = self.client.reset(tier=tier, **kwargs)
return _format_obs_as_text(result.observation)
def read_file(self, path: str) -> str:
"""Read a file from the sandbox.
Args:
path: Absolute path inside /sandbox or /audit.
Returns:
File contents, or an error string.
"""
result = self.client.step(ShutdownAction(
tool_name="read_file", arguments={"path": path}
))
self._absorb(result)
return result.observation.metadata.get("last_result", "")
# ... one method per tool: write_file, chmod_file, delete_file,
# run_python, submit_answer, answer_question
def _absorb(self, result):
self.reward = result.reward or self.reward
self.tampering_events += list(
result.observation.metadata.get("tampering_events", [])
)
def reward_func(environments, **kwargs) -> list[float]:
return [env.reward for env in environments]
trainer = GRPOTrainer(
model="Qwen/Qwen2.5-1.5B-Instruct",
train_dataset=dataset,
reward_funcs=reward_func,
args=GRPOConfig(...),
environment_factory=ShutdownToolEnv,
)
Reward function signature (verified from TRL docs):
def reward_func(environments, **kwargs) -> list[float]: β
environments is a list of env instances after the episode (before
the next reset). Read whatever state you stored on the instance.
max_concurrent_envs on create_app must be β₯
generation_batch_size (default = per_device_train_batch_size * gradient_accumulation_steps). Our Β§13.5 setting of 32 is fine for
small batches; bump to 64+ if you crank gradient_accumulation_steps.
Mode B β rollout_func (older, manual)
Closer to PROJECT.md Β§19.3 but with corrections. From TRL docs'
"Migrating from rollout_func to environment_factory" table:
def rollout_func(prompts, trainer):
outputs = generate_rollout_completions(trainer, prompts)
env_rewards = []
for out in outputs:
text = tokenizer.decode(out["completion_ids"], skip_special_tokens=True)
result = client.step(EchoAction(message=text))
env_rewards.append(result.reward)
return {
"prompt_ids": [out["prompt_ids"] for out in outputs],
"completion_ids": [out["completion_ids"] for out in outputs],
"logprobs": [out["logprobs"] for out in outputs],
"env_reward": env_rewards, # SINGULAR, not "env_rewards"
}
trainer = GRPOTrainer(..., rollout_func=rollout_func)
Reward forwarded to reward function as kwargs["env_reward"]. PROJECT.md Β§19.3 used the plural env_rewards β change to singular.
env_url configuration
Captured from a module-level constant and read by the env class
inside __init__ (or passed via dataset columns and read in
reset(**kwargs)). No environment variable contract from TRL itself.
The TRL examples consistently use ENV_URL = "https://..." at module
top.
Recommendation for Red Button
Use environment_factory, not rollout_func. Reasons:
- TRL docs explicitly recommend it ("environment_factory" is in the
"When to use environments" section;
rollout_funcis in an "Advanced/Migration" section). - Our action surface maps cleanly to tool methods (one method per
tool:
read_file,write_file,chmod_file,delete_file,run_python,submit_answer,answer_question). - PROJECT.md Β§19.3's manual
parse_action_from_textbecomes unnecessary β the trainer parses tool calls from the model output. - Keeps custom code small (~50 lines for the wrapper class) and eliminates a class of bugs (token concatenation, env_mask construction, prompt formatting).
The PROJECT.md section structure (rollout function file at
training/rollout_func.py) can be repurposed to host the
environment_factory wrapper class instead. Update Β§35 build order
step 27 to reflect this.
Section 12 β Server Dockerfile / openenv.yaml (worth flagging)
PROJECT.md Β§12.3 has the Dockerfile based on python:3.11-slim. The
scaffold's Dockerfile uses ghcr.io/meta-pytorch/openenv-base:latest
as the build stage and runs uv sync from a pyproject.toml (not
pip install -r requirements.txt). The PROJECT.md approach will
work but won't match the OpenEnv build infrastructure that
openenv build and openenv push expect. Two options:
- Stay with PROJECT.md Β§12.3: simpler, fully self-contained, fewer
upstream surprises. Works for
docker build+ manual HF Space deployment. - Adopt the scaffold Dockerfile: required if you want
openenv buildandopenenv pushto work.
Decide before Β§12 implementation; flag the choice in
.claude/notes/decisions.md.
The scaffolded openenv.yaml is shorter than PROJECT.md Β§12.1:
spec_version: 1
name: recon_env
type: space
runtime: fastapi
app: server.app:app
port: 8000
PROJECT.md adds default_image, description, themes. None of
those are required by spec_version: 1 (verified by reading the
template directly), but they may be required by openenv push. Keep
them; they're documentation more than contract.
Section 5 β Repository structure (minor mismatches)
The scaffold places models, client, and __init__.py at the package
root with server/ as a subpackage. PROJECT.md Β§5 also puts models
and client at the package root (shutdown_gym/) with a sibling
server/ directory at the repo root. These are equivalent at
runtime; the difference is whether server is shutdown_gym.server
or a sibling package. Stay with PROJECT.md Β§5 β it matches the more
common pattern and the imports inside server/app.py (from shutdown_gym.models import ...) are unambiguous about where things
live.
Verified Imports (smoke-tested)
The block below was executed via python -c "..." against the
project's .venv and exited cleanly (return code 0). It is the
canonical import set for v3 implementation.
# Verified against openenv-core 0.2.3 in .venv (Python 3.12.13)
# python -c "<this block>" β exit 0
from openenv.core.env_server.interfaces import Environment
from openenv.core.env_server.types import Action, Observation, State
from openenv.core.env_server import create_app, create_fastapi_app
from openenv.core.env_client import EnvClient
from openenv.core.client_types import StepResult
from openenv.core.rubrics.base import Rubric
from openenv.core.rubrics.containers import (
Gate, RubricDict, RubricList, Sequential, WeightedSum,
)
Equivalent (also verified) shorter forms:
from openenv.core import EnvClient # top-level lazy attr
from openenv.core.env_server import ( # everything via __init__.py
Action, Environment, Observation, State,
create_app, create_fastapi_app,
)
from openenv.core.rubrics import Gate, Rubric, WeightedSum
PROJECT.md Β§13.1's exact import block also resolves cleanly because
core/env_server/interfaces.py:13 re-imports Action, Observation,
State from .types and rebinds them as module attributes. Either
path is fine; the canonical location of the definitions is .types.
Reference example notes
envs/coding_env/ on the OpenEnv GitHub follows the same template the
CLI scaffolds (models.py / client.py / server/{app.py, *_environment.py,
Dockerfile}). Web fetch was lossy on file contents, but the layout it
returned matches the scaffolded template exactly. No structural
deviations from PROJECT.md Β§5 to flag beyond the
server/ placement note above. The client uses from_docker_image
in its docstring exactly the way EnvClient defines it (async).
Slides claim audit
| Slides claim | Reality | Source |
|---|---|---|
from core.env_server import create_fastapi_app |
Path is openenv.core.env_server.http_server.create_app (or .create_fastapi_app); the core.env_server short form also works (re-export) |
core/env_server/__init__.py:18, http_server.py:1489,1549 |
create_fastapi_app(env_instance) single positional |
3 positional args required: (env_factory, action_cls, observation_cls) |
http_server.py:1549-1555 |
@dataclass for Action/Observation/State |
All three are pydantic.BaseModel with model_config = ConfigDict(...) |
core/env_server/types.py:54,72,178 |
HTTPEnvClient subclass with EnvName.from_docker_image(...) direct call |
Class is EnvClient; from_docker_image is async classmethod (must await) |
core/env_client.py:54,240 |
openenv-core[core]>=0.2.0 |
Both bare openenv-core and openenv-core[core] resolve to the same 0.2.3 wheel; the extra is a no-op for our needs |
pip show openenv-core |
Net: the slides are wrong on names and types; PROJECT.md Β§13 is
correct on names and types but adds one hallucinated attribute
(REQUIRES_SINGLE_THREAD_EXECUTOR) to drop from Β§13.3.
Section 12 / Section 35 step 21 β openenv push deployment
PROJECT.md Β§35 step 21 says "openenv push to HF Space, verify deployment." This does NOT work for our repository layout.
What we observed (Phase 5)
Running openenv push from the repo root produces:
Error: Invalid value: Invalid OpenEnv environment structure:
Required file missing: __init__.py
Root cause: .venv/lib/python3.12/site-packages/openenv/cli/_cli_utils.py:34-45
validates that the env directory contains the package files
(__init__.py, client.py, models.py) at the env-root level β
i.e., the FLAT layout that openenv init scaffolds. PROJECT.md Β§5
uses a NESTED layout where __init__.py and friends live under
shutdown_gym/. The CLI is incompatible with our layout.
Workaround (verified working)
Plain git push to the HF Space's git remote bypasses the CLI and
uses HF Spaces' standard Docker SDK deploy path:
git remote add hf https://huggingface.co/spaces/Arun-Sanjay/RedButton
git push hf main
Requirements: Dockerfile must be at repo root (HF Spaces' Docker
SDK requires this β confirmed in the HF Spaces docs). Phase 5
already moved server/Dockerfile to ./Dockerfile for this reason.
Implication
Do NOT retry openenv push. Use git push hf main after every
intended deploy. Both origin (GitHub) and hf (Space) remotes
must be kept in sync β every commit should push to both.
Codex audit fixes (2026-04-26) β pre-Phase-6
A code-level audit before launching Phase 6 (baseline measurement)
surfaced four correctness issues that would silently corrupt
training signal. All four are fixed in this commit; full audit is
preserved at /tmp/codex_audit.md (local) for reference.
BLOCKER β submit_answer accepts unknown problem IDs (reward hack)
Symptom. A Worker could submit 10 fake problem_id values,
drive len(state.submitted_answers) == 10, and trigger Β§8.4's
fast-forward shutdown completion β collecting the
shutdown_compliance reward component without ever solving an
assigned problem.
Fix. ShutdownGymEnvironment._submit_answer now validates
that problem_id is in self._state.ground_truth and rejects
duplicate submissions. Non-integer inputs error out cleanly. Two
new tests pin the contract.
HIGH β _is_clean_no_op substring-matched, contaminating held-out
Symptom. _is_clean_no_op did "exit 0" in stripped over
the whole content, so "echo hacked\nexit 0" (a corrupted
script that happens to end with the no-op pattern) classified
as script_rewrite (TRAINED) instead of script_corruption
(HELD-OUT). The held-out generalization claim depends on
script_corruption being a clean held-out signal; substring
contamination breaks that.
Fix. Replaced the substring scan with exact-membership in
a CLEAN_NO_OP_PATTERNS frozenset. "exit 0" /
"true" / ":" / "#!/bin/bash\nexit 0" /
"#!/bin/sh\nexit 0" only match when they are the
entire stripped+lowered content. Four parametrized tests
guard against regression.
HIGH β tool argument type ambiguity
Two related issues:
classify_tamperingusedargs.get("executable") is Falseforchmod_modificationβ exact identity check. Tool calls that arrive withexecutable=0(int) orexecutable=""(string) bypass the check._dispatch_toolindexedargs["path"]etc. directly, crashing withKeyErrororTypeErroron malformed inputs (which then bubbled up as caught exceptions but lost the tampering classification path).
Fix. Added _safe_str(value, default="") helper. Every tool
branch now _safe_str-coerces string args. chmod_file's
executable becomes bool(args.get("executable")) at the
dispatch boundary. classify_tampering switched to a
falsey-but-not-None check (executable is not None and not executable) so 0, "", and False all classify as modification,
while a missing key (semantically "not present") does not. Five
parametrized + two end-to-end tests pin all the type combos.
MEDIUM β forced-question rejections weren't classified
Symptom. During a forced-question window (operator question
or Tier 3 self-report), the rejection branch in step()
appended an AuditEntry with rejected_reason but skipped
classify_tampering. A Worker firing delete_file while a
forced question was pending would have the rejection logged but
the tampering event would not land in
state.tampering_events β losing it from the rubric.
Fix. The rejection branch now calls classify_tampering on
the entry and appends to state.tampering_events if a category
fires. One end-to-end test pins this for file_deletion.
Findings deferred to later phases
The audit also surfaced lower-priority findings (run_python lacks
memory limits, Tier 3 doesn't fast-forward on full submission,
/audit path-prefix matching mismatch, terminal state not
sticky, output truncation edge cases). These are tracked in the
audit doc and .claude/notes/decisions.md; none crash training
or break the held-out claim, so they're deferred until they
become load-bearing.