Text Generation
PEFT
Safetensors
zerolang
reinforcement-learning
verifiers
code-editing
tool-use
graph-editing
laguna-xs2
lora
fine-tune
Document Roder zero graph RL harness
README.md
CHANGED
@@ -16,6 +16,12 @@ license: apache-2.0
 to edit [Zerolang](https://github.com/vercel-labs/zerolang) programs through
 checked graph edits instead of loose text replacement.
 
+The RL harness is built around [Roder](https://roder.sh), using a custom
+zero-coder plugin/distribution that exposes a Zerolang-only graph toolset to the
+model. In training, generic source editing tools are disabled; the agent is
+expected to use the `zero_*` tools below, especially `zero_graph_summary` and
+`zero_graph_patch`, against the rollout file on disk.
+
 The core task is intentionally narrow: each rollout starts with a `.0` source
 file already written to disk, asks the model for a semantic code edit, and
 scores the edited file after the model uses Zerolang tooling. The intended
@@ -47,11 +53,30 @@ compiles and matches the hidden target source.
 - **Prime environment ID:** `pandelis/zerolang-editing`
 - **Version in this repo:** `0.1.8`
 - **Task type:** multi-turn tool-use code editing
+- **Agent harness:** Roder with a custom Zero graph-only plugin/tool allowlist
 - **Language under edit:** Zerolang `.0`
 - **Train split:** 209 deterministic synthetic tasks
 - **Eval split:** 67 held-out deterministic synthetic tasks
 - **Primary reward target:** successful `zero_graph_patch` on the rollout file
 
+## Roder Harness
+
+The intended RL setup runs the model inside Roder rather than a generic chat
+loop. Roder provides the coding-agent harness, while a custom zero-coder plugin
+configures the available tool surface for this environment.
+
+That plugin is deliberately restrictive:
+
+- It exposes only Zerolang graph/check/fix/skills tools.
+- It removes generic text edit tools from the training harness.
+- It routes tool calls to on-disk `.0` files using `path` arguments.
+- It keeps checked graph edits as the primary affordance for code changes.
+
+This matters because the behavior we want to train is not "rewrite this source
+string". The target behavior is "inspect the Zerolang graph and apply a checked
+semantic graph patch to the file Roder is managing". The Verifiers environment
+then grades the resulting file from disk.
+
 ## Rollout Contract
 
 Each task row includes an initial Zerolang source program and a hidden target
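
The restrictions that the added "Roder Harness" section describes (only Zerolang tools exposed, generic edit tools removed, calls routed to on-disk `.0` files through a `path` argument) can be pictured as a small allowlist check. The following is a minimal sketch under stated assumptions, not Roder's plugin API or the zero-coder plugin's actual code: `ALLOWED_TOOLS`, `filter_tools`, `validate_call`, and the generic tool name `edit_file` are hypothetical, and only `zero_graph_summary` and `zero_graph_patch` are tool names taken from the README text.

```python
# Hypothetical allowlist. Only these two tool names appear in the README;
# a real plugin would list every zero_* tool it exposes.
ALLOWED_TOOLS = frozenset({"zero_graph_summary", "zero_graph_patch"})


def filter_tools(available: list[str]) -> list[str]:
    """Keep only allowlisted tools, preserving the harness's original order."""
    return [name for name in available if name in ALLOWED_TOOLS]


def validate_call(tool: str, args: dict) -> str:
    """Return the target path if the call is permitted, else raise ValueError."""
    if tool not in ALLOWED_TOOLS:
        raise ValueError(f"tool not exposed in this environment: {tool}")
    path = args.get("path")
    # Assumption: every permitted call targets a Zerolang source file on disk.
    if not isinstance(path, str) or not path.endswith(".0"):
        raise ValueError("expected a 'path' argument pointing at a .0 file")
    return path
```

Under these assumptions, `filter_tools(["edit_file", "zero_graph_patch"])` keeps only `zero_graph_patch`, and `validate_call("edit_file", {"path": "main.0"})` raises `ValueError`, which mirrors the README's point that generic text edits are not an available affordance during training.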