| --- |
| library_name: peft |
| base_model: Tesslate/OmniCoder-9B |
| tags: |
| - carl |
| - terminals |
| - intuition-labs |
| - rl |
| - grpo |
| - tool-calling |
| - coding-agent |
| license: other |
| --- |
| |
| # OmniCoder-9B-Zero-Phase2Prime (v10) |
|
|
| **CARL** (Coherence-Aware Reinforcement Learning) LoRA adapter by [Intuition Labs](https://terminals.tech) / [Tej Desai](https://github.com/wheattoast11). |
|
|
| Phase 2' Environment GRPO: Tool-calling through real interaction. The model learned WHICH tools solve WHICH tasks through 80 steps of GRPO with real subprocess execution. |
|
|
| ## Eval Results (2026-04-09) |
|
|
| | Metric | Value | |
| |--------|-------| |
| | Task completion | **92%** | |
| | Tool format compliance | 99% | |
| | Mean tool calls | 11.09 | |
| | Individual tool failure rate | 43% (recovers via retry) | |
| | Mean tokens | 1441 | |
| | Phase 2' Gate | **PASS** | |
|
|
| ## Training |
|
|
| - **Base model:** [Tesslate/OmniCoder-9B](https://huggingface.co/Tesslate/OmniCoder-9B) |
| - **Method:** GRPO with CodingSandboxEnv (real subprocess execution) |
| - **Steps:** 80 | **Generations:** 2 per prompt |
| - **Rewards:** 5-function cascade (tool_engagement + task_completion + gated_CARL + tool_format + GR3_length) |
| - **LoRA:** r=64, alpha=128, targets=qkvo+gate+up+down |
| |
| ## Usage |
| |
| |
| |
| ## CARL Naming |
| |
| This adapter is also available as a merged model at `wheattoast11/il-terminals-carl-omni9b-v10` (pending). |
| |
| Pattern: `il-terminals-carl-{base}-{tag}` | [Intuition Labs](https://terminals.tech) |
| |
| ## Papers |
| |
| - Bounded Informational Time Crystals (DOI: 10.5281/zenodo.18906944) |
| - Semantic Realizability (DOI: 10.5281/zenodo.18992031) |
| |