metadata
library_name: peft
base_model: Tesslate/OmniCoder-9B
tags:
  - carl
  - terminals
  - intuition-labs
  - rl
  - grpo
  - tool-calling
  - coding-agent
license: other

OmniCoder-9B-Zero-Phase2Prime (v10)

CARL (Coherence-Aware Reinforcement Learning) LoRA adapter by Intuition Labs / Tej Desai.

Phase 2' (Environment GRPO): tool calling learned through real interaction. Over 80 GRPO steps with real subprocess execution, the model learned which tools solve which tasks.

Eval Results (2026-04-09)

| Metric | Value |
| --- | --- |
| Task completion | 92% |
| Tool format compliance | 99% |
| Mean tool calls | 11.09 |
| Individual tool failure rate | 43% (recovers via retry) |
| Mean tokens | 1441 |
| Phase 2' Gate | PASS |
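
A 43% per-call failure rate next to 92% task completion can look contradictory, so here is a rough illustration of how retries change the picture. It assumes each retry of a failed tool call is an independent attempt with the same 43% failure rate. That independence is an assumption made only for this sketch and is not something the eval claims. The function name is hypothetical.

```python
def success_within(attempts: int, fail_rate: float = 0.43) -> float:
    """Probability that at least one of `attempts` tool calls succeeds.

    Assumes attempts are independent with a fixed per-call failure rate
    (an illustrative assumption, not a measured property of the model).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    return 1.0 - fail_rate ** attempts


# One attempt: 0.57, two attempts: 0.8151, three attempts: about 0.9205
for k in (1, 2, 3):
    print(k, round(success_within(k), 4))
```

These figures are arithmetic on the reported failure rate, not additional eval results.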

Training

  • Base model: Tesslate/OmniCoder-9B
  • Method: GRPO with CodingSandboxEnv (real subprocess execution)
  • Steps: 80 | Generations: 2 per prompt
  • Rewards: 5-function cascade (tool_engagement + task_completion + gated_CARL + tool_format + GR3_length)
  • LoRA: r=64, alpha=128, targets=qkvo+gate+up+down
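
The LoRA settings above might map onto a PEFT `LoraConfig` roughly as follows. This is a sketch, not the training configuration itself: the module names (`q_proj`, `gate_proj`, and so on) are an assumption about how "qkvo+gate+up+down" is spelled in the base model, and `task_type` is likewise assumed. Dropout, bias, and other options are not stated in the card and are left at library defaults.

```python
# LoRA hyperparameters from the Training list; module names are assumed.
LORA_KWARGS = {
    "r": 64,
    "lora_alpha": 128,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention: q, k, v, o
        "gate_proj", "up_proj", "down_proj",     # MLP: gate, up, down
    ],
    "task_type": "CAUSAL_LM",  # assumption: causal language model adapter
}


def lora_scaling(r: int, lora_alpha: int) -> float:
    """Standard LoRA scales the low-rank update by lora_alpha / r."""
    return lora_alpha / r


def build_lora_config():
    """Build a peft.LoraConfig from the values above."""
    # Imported lazily so the constants can be inspected without peft installed.
    from peft import LoraConfig

    return LoraConfig(**LORA_KWARGS)
```

With r=64 and alpha=128, the standard scaling factor is 2.0.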

Usage
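
A minimal loading sketch with `transformers` and `peft`, under two assumptions. First, the base model is assumed to load through `AutoModelForCausalLM`. Second, `ADAPTER_ID` is a placeholder, because this card does not state the adapter's repository id; replace it with the actual repo id or a local path. The helper names are hypothetical.

```python
BASE_MODEL_ID = "Tesslate/OmniCoder-9B"  # base model named in this card
ADAPTER_ID = "<adapter-repo-id-or-local-path>"  # placeholder, not stated in the card


def load_model(adapter_id: str = ADAPTER_ID, base_model_id: str = BASE_MODEL_ID):
    """Load the base model and attach the LoRA adapter on top of it."""
    # Imported lazily so this module can be imported without the heavy deps.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base = AutoModelForCausalLM.from_pretrained(base_model_id, torch_dtype="auto")
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()  # inference mode
    return model, tokenizer


def generate(model, tokenizer, prompt: str, max_new_tokens: int = 256) -> str:
    """Run generation for a single prompt and return the decoded text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Typical use would be `model, tokenizer = load_model("your/adapter-id")` followed by `generate(model, tokenizer, "...")`. The prompt and tool-call format expected by the adapter are not specified here.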

CARL Naming

A merged version of this adapter is planned at wheattoast11/il-terminals-carl-omni9b-v10 (pending).

Pattern: il-terminals-carl-{base}-{tag} | Intuition Labs

Papers

  • Bounded Informational Time Crystals (DOI: 10.5281/zenodo.18906944)
  • Semantic Realizability (DOI: 10.5281/zenodo.18992031)