---
library_name: peft
base_model: Tesslate/OmniCoder-9B
tags:
- carl
- terminals
- intuition-labs
- rl
- grpo
- tool-calling
- coding-agent
license: other
---
# OmniCoder-9B-Zero-Phase2Prime (v10)
**CARL** (Coherence-Aware Reinforcement Learning) LoRA adapter by [Intuition Labs](https://terminals.tech) / [Tej Desai](https://github.com/wheattoast11).
Phase 2' environment GRPO: tool calling learned through real interaction. Over 80 GRPO steps with real subprocess execution, the model learned which tools solve which tasks.
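
For readers unfamiliar with GRPO, the core idea is that each sampled completion is scored relative to the other completions drawn for the same prompt. The sketch below shows one common form of that group-relative normalization. It is illustrative only: the function name, the `eps` value, and the use of the population standard deviation are assumptions, not details confirmed by this card.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize rewards within one prompt's group of sampled completions.

    Hypothetical helper for illustration; the training code behind this
    adapter may normalize differently.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    # eps keeps the division finite when all rewards in the group are equal
    return [(r - mu) / (sigma + eps) for r in rewards]


# Two completions for one prompt: one solved the task, one did not.
advantages = group_relative_advantages([1.0, 0.0])

# Identical rewards carry no learning signal: every advantage is zero.
tied = group_relative_advantages([1.0, 1.0])
```

With only two completions per group, the better completion gets a positive advantage and the worse one a negative advantage of equal size, while ties contribute nothing to the update.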
## Eval Results (2026-04-09)
| Metric | Value |
|--------|-------|
| Task completion | **92%** |
| Tool format compliance | 99% |
| Mean tool calls | 11.09 |
| Individual tool failure rate | 43% (recovers via retry) |
| Mean tokens | 1441 |
| Phase 2' Gate | **PASS** |
## Training
- **Base model:** [Tesslate/OmniCoder-9B](https://huggingface.co/Tesslate/OmniCoder-9B)
- **Method:** GRPO with CodingSandboxEnv (real subprocess execution)
- **Steps:** 80 | **Generations:** 2 per prompt
- **Rewards:** 5-function cascade (tool_engagement + task_completion + gated_CARL + tool_format + GR3_length)
- **LoRA:** r=64, alpha=128, targets=qkvo+gate+up+down
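
The LoRA settings above could be expressed as a `peft` configuration roughly as follows. This is a sketch under assumptions: the card only gives the shorthand `qkvo+gate+up+down`, so the `*_proj` module names are the usual Llama/Qwen-style names and may not match the base model exactly, and `task_type` is assumed.

```python
# LoRA hyperparameters from the card; module names are ASSUMED expansions
# of the shorthand "qkvo+gate+up+down".
LORA_KWARGS = {
    "r": 64,
    "lora_alpha": 128,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
    "task_type": "CAUSAL_LM",  # assumption: standard causal LM adapter
}


def build_lora_config():
    """Build a peft LoraConfig; the import is local so peft is only
    required when the config is actually constructed."""
    from peft import LoraConfig

    return LoraConfig(**LORA_KWARGS)


# Standard LoRA scales the low-rank update by alpha / r.
scaling = LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"]
```

With `r=64` and `alpha=128`, the adapter update is scaled by a factor of 2 under the standard (non-rank-stabilized) LoRA formulation.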
## Usage
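
A minimal sketch of loading the adapter on top of the base model with `transformers` and `peft`. `ADAPTER_ID` is a placeholder, since this card does not spell out the adapter's repository ID. The dtype and `device_map` choices are assumptions rather than settings confirmed here, and the sketch assumes the base model loads through `AutoModelForCausalLM`.

```python
BASE_MODEL_ID = "Tesslate/OmniCoder-9B"
ADAPTER_ID = "<this-adapter-repo-id>"  # placeholder: replace with the real repo ID


def load_model(adapter_id=ADAPTER_ID, base_model_id=BASE_MODEL_ID):
    """Load the base model and attach the LoRA adapter.

    Heavy imports are local so the module can be imported without
    torch, transformers, or peft installed.
    """
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_model_id,
        torch_dtype=torch.bfloat16,  # assumption: bf16 inference
        device_map="auto",           # requires the accelerate package
    )
    # Wrap the base model with the adapter weights.
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return model, tokenizer
```

Calling `load_model("<adapter-repo-id>")` returns the adapted model and its tokenizer, which can then be used with the usual `model.generate(...)` workflow.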
## CARL Naming
A merged version of this adapter is planned at `wheattoast11/il-terminals-carl-omni9b-v10` (pending).
Pattern: `il-terminals-carl-{base}-{tag}` | [Intuition Labs](https://terminals.tech)
## Papers
- Bounded Informational Time Crystals (DOI: 10.5281/zenodo.18906944)
- Semantic Realizability (DOI: 10.5281/zenodo.18992031)