---
library_name: peft
base_model: Tesslate/OmniCoder-9B
tags:
- carl
- terminals
- intuition-labs
- rl
- grpo
- tool-calling
- coding-agent
license: other
---

# OmniCoder-9B-Zero-Phase2Prime (v10)

**CARL** (Coherence-Aware Reinforcement Learning) LoRA adapter by [Intuition Labs](https://terminals.tech) / [Tej Desai](https://github.com/wheattoast11).

Phase 2' Environment GRPO: Tool-calling through real interaction. The model learned WHICH tools solve WHICH tasks through 80 steps of GRPO with real subprocess execution.

## Eval Results (2026-04-09)

| Metric | Value |
|--------|-------|
| Task completion | **92%** |
| Tool format compliance | 99% |
| Mean tool calls | 11.09 |
| Individual tool failure rate | 43% (recovers via retry) |
| Mean tokens | 1441 |
| Phase 2' Gate | **PASS** |

## Training

- **Base model:** [Tesslate/OmniCoder-9B](https://huggingface.co/Tesslate/OmniCoder-9B)
- **Method:** GRPO with CodingSandboxEnv (real subprocess execution)
- **Steps:** 80 | **Generations:** 2 per prompt
- **Rewards:** 5-function cascade (tool_engagement + task_completion + gated_CARL + tool_format + GR3_length)
- **LoRA:** r=64, alpha=128, targets=qkvo+gate+up+down

## Usage

## CARL Naming

This adapter is also available as a merged model at `wheattoast11/il-terminals-carl-omni9b-v10` (pending).

Pattern: `il-terminals-carl-{base}-{tag}` | [Intuition Labs](https://terminals.tech)

## Papers

- Bounded Informational Time Crystals (DOI: 10.5281/zenodo.18906944)
- Semantic Realizability (DOI: 10.5281/zenodo.18992031)
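
A minimal loading sketch for the adapter described above, not an official recipe. It assumes the base model loads through `AutoModelForCausalLM`, that `transformers`, `peft`, and `accelerate` are installed, and that you supply the adapter's Hub repo id yourself (`adapter_id` is a placeholder, since this card does not state it). The module names in `TARGET_MODULES` are an assumed expansion of "qkvo+gate+up+down" using common Hugging Face naming; check them against the adapter's `adapter_config.json`.

```python
# Hyperparameters stated in the Training section of this card.
LORA_R = 64
LORA_ALPHA = 128

# ASSUMPTION: typical Hugging Face projection names for "qkvo+gate+up+down".
TARGET_MODULES = [
    "q_proj", "k_proj", "v_proj", "o_proj",
    "gate_proj", "up_proj", "down_proj",
]


def lora_scaling(r: int, alpha: int) -> float:
    """Standard LoRA scaling factor applied to the low-rank update (alpha / r)."""
    return alpha / r


def load_adapter(adapter_id: str, base_id: str = "Tesslate/OmniCoder-9B"):
    """Load the base model and attach the LoRA adapter on top of it.

    adapter_id is a placeholder for the adapter's Hub repo id or a local path.
    Heavy imports are kept inside the function so the module imports cheaply.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_id,
        torch_dtype="auto",   # use the dtype stored in the checkpoint
        device_map="auto",    # requires accelerate; spreads layers over devices
    )
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return model, tokenizer


if __name__ == "__main__":
    # With r=64 and alpha=128 the adapter update is scaled by 2.0.
    print(lora_scaling(LORA_R, LORA_ALPHA))
```

Calling `load_adapter("<adapter repo id or local path>")` returns the PEFT-wrapped model and its tokenizer. Tool-calling prompts would then go through the tokenizer's chat template, whose tool schema depends on the base model and is not specified here.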