
library_name: peft
base_model: Tesslate/OmniCoder-9B
tags:
  - carl
  - terminals
  - intuition-labs
  - rl
  - grpo
  - tool-calling
  - coding-agent
license: other

OmniCoder-9B-Zero-Phase2Prime (v10)

CARL (Coherence-Aware Reinforcement Learning) LoRA adapter by Intuition Labs / Tej Desai.

Phase 2' Environment GRPO: tool calling learned through real interaction. Over 80 steps of GRPO (Group Relative Policy Optimization) with real subprocess execution, the model learned which tools solve which tasks.
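The card does not show how the training environment executes tools. Purely as an illustration of what "real subprocess execution" can look like, the sketch below runs one command in a child process and reports whether it succeeded. The names `ToolResult` and `run_tool`, the result fields, and the timeout value are hypothetical and are not taken from the adapter's training code.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class ToolResult:
    ok: bool      # True when the process exited with code 0
    stdout: str
    stderr: str


def run_tool(argv: list[str], timeout_s: float = 10.0) -> ToolResult:
    """Run one command in a real child process (hypothetical helper)."""
    try:
        proc = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_s
        )
    except (subprocess.TimeoutExpired, OSError) as exc:
        # A hung process or a missing binary counts as a failed tool call.
        return ToolResult(ok=False, stdout="", stderr=str(exc))
    return ToolResult(
        ok=proc.returncode == 0, stdout=proc.stdout, stderr=proc.stderr
    )


# One call that succeeds and one that exits with a non-zero code.
good = run_tool([sys.executable, "-c", "print('hello')"])
bad = run_tool([sys.executable, "-c", "raise SystemExit(3)"])
```

A reward function could then score an episode from results like these, which is what distinguishes this setup from scoring tool calls on format alone.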

Eval Results (2026-04-09)

Metric	Value
Task completion	92%
Tool format compliance	99%
Mean tool calls	11.09
Individual tool failure rate	43% (recovers via retry)
Mean tokens	1441
Phase 2' Gate	PASS
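
A 43% individual tool failure rate and 92% task completion can coexist when failed calls are retried. The arithmetic below is a back-of-the-envelope illustration only. It assumes each call fails independently with probability 0.43 and that the rate is measured over all calls; the card states neither, and real failures are probably correlated.

```python
FAIL_RATE = 0.43          # individual tool failure rate, from the table
MEAN_TOOL_CALLS = 11.09   # mean tool calls, from the table


def prob_all_fail(attempts: int, p_fail: float = FAIL_RATE) -> float:
    """Probability that every one of `attempts` independent calls fails."""
    return p_fail ** attempts


# Expected number of calls that succeed in an average episode,
# under the independence assumption stated above.
expected_successful_calls = MEAN_TOOL_CALLS * (1 - FAIL_RATE)

for k in (1, 2, 3, 5):
    print(f"{k} attempt(s): P(all fail) = {prob_all_fail(k):.3f}")
print(f"expected successful calls: {expected_successful_calls:.2f}")
```

Under that assumption, two consecutive failures occur about 18% of the time and three about 8%, so a handful of retries is enough for most tool calls to eventually go through.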

Training

Base model: Tesslate/OmniCoder-9B
Method: GRPO with CodingSandboxEnv (real subprocess execution)
Steps: 80 | Generations: 2 per prompt
Rewards: 5-function cascade (tool_engagement + task_completion + gated_CARL + tool_format + GR3_length)
LoRA: r=64, alpha=128, targets=qkvo+gate+up+down
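
The hyperparameters above can be sketched in code as follows. This is a sketch under stated assumptions, not the training script: the module names expand "qkvo+gate+up+down" using the naming common to Llama/Qwen-style models, and the advantage function uses the textbook group-relative form with a population standard deviation, whereas implementations differ in such details.

```python
import math

# LoRA hyperparameters from the card. The module names are an ASSUMPTION
# that expands the shorthand "qkvo+gate+up+down".
LORA_KWARGS = {
    "r": 64,
    "lora_alpha": 128,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "task_type": "CAUSAL_LM",
}

# Standard LoRA scales its low-rank update by alpha / r.
lora_scale = LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"]


def make_lora_config():
    """Build a peft.LoraConfig (imported lazily, so peft is optional here)."""
    from peft import LoraConfig
    return LoraConfig(**LORA_KWARGS)


def group_advantages(rewards, eps=1e-4):
    """Group-relative advantages: (r - mean) / (std + eps) within one group."""
    mean = sum(rewards) / len(rewards)
    std = math.sqrt(sum((r - mean) ** 2 for r in rewards) / len(rewards))
    return [(r - mean) / (std + eps) for r in rewards]


# With 2 generations per prompt, the group holds exactly two rewards.
print(lora_scale)
print(group_advantages([1.0, 0.0]))
print(group_advantages([0.5, 0.5]))
```

With a group of two, this normalization yields advantages close to +1 and -1 whenever the rewards differ, and zero when they tie, so each prompt mainly signals which of its two completions scored higher.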

Usage
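
A minimal sketch of loading a PEFT LoRA adapter on top of the base model is given here. It assumes the base model loads through `AutoModelForCausalLM`, which this card does not confirm. `ADAPTER_ID` is a placeholder, because the adapter's repository id is not stated in the card.

```python
BASE_MODEL = "Tesslate/OmniCoder-9B"   # base_model from the card metadata
ADAPTER_ID = "path-or-repo-id-of-this-adapter"  # placeholder, not a real id


def load_model(adapter_id: str = ADAPTER_ID, base_model: str = BASE_MODEL):
    """Load the base model and attach the LoRA adapter (downloads weights)."""
    # Imported lazily so this file can be read without the libraries installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_model)
    base = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype="auto")
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model
```

Calling `load_model()` with the real adapter id or a local adapter directory returns a tokenizer and a model ready for generation.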

CARL Naming

A merged version of this adapter is pending at wheattoast11/il-terminals-carl-omni9b-v10.

Pattern: il-terminals-carl-{base}-{tag} | Intuition Labs
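
The pattern can be filled in mechanically. In the small example below, the helper name `carl_name` is hypothetical, and the values `omni9b` and `v10` are read off the merged model name given in this card.

```python
PATTERN = "il-terminals-carl-{base}-{tag}"


def carl_name(base: str, tag: str, owner: str = "") -> str:
    """Format a CARL model name, optionally prefixed with a Hub owner."""
    name = PATTERN.format(base=base, tag=tag)
    return f"{owner}/{name}" if owner else name


print(carl_name("omni9b", "v10", owner="wheattoast11"))
# prints: wheattoast11/il-terminals-carl-omni9b-v10
```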

Papers

Bounded Informational Time Crystals (DOI: 10.5281/zenodo.18906944)
Semantic Realizability (DOI: 10.5281/zenodo.18992031)