File size: 1,556 Bytes
c56d8a8
 
1aa292d
c56d8a8
1aa292d
 
 
 
 
 
 
 
c56d8a8
 
1aa292d
c56d8a8
1aa292d
c56d8a8
1aa292d
c56d8a8
1aa292d
c56d8a8
1aa292d
 
 
 
 
 
 
 
c56d8a8
1aa292d
c56d8a8
1aa292d
 
 
 
 
c56d8a8
1aa292d
c56d8a8
 
 
1aa292d
c56d8a8
1aa292d
c56d8a8
1aa292d
c56d8a8
1aa292d
c56d8a8
1aa292d
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
---
library_name: peft
base_model: Tesslate/OmniCoder-9B
tags:
  - carl
  - terminals
  - intuition-labs
  - rl
  - grpo
  - tool-calling
  - coding-agent
license: other
---

# OmniCoder-9B-Zero-Phase2Prime (v10)

**CARL** (Coherence-Aware Reinforcement Learning) LoRA adapter by [Intuition Labs](https://terminals.tech) / [Tej Desai](https://github.com/wheattoast11).

Phase 2' Environment GRPO: Tool-calling through real interaction. The model learned WHICH tools solve WHICH tasks through 80 steps of GRPO with real subprocess execution.

## Eval Results (2026-04-09)

| Metric | Value |
|--------|-------|
| Task completion | **92%** |
| Tool format compliance | 99% |
| Mean tool calls | 11.09 |
| Individual tool failure rate | 43% (recovers via retry) |
| Mean tokens | 1441 |
| Phase 2' Gate | **PASS** |

## Training

- **Base model:** [Tesslate/OmniCoder-9B](https://huggingface.co/Tesslate/OmniCoder-9B)
- **Method:** GRPO with CodingSandboxEnv (real subprocess execution)
- **Steps:** 80 | **Generations:** 2 per prompt
- **Rewards:** 5-function cascade (tool_engagement + task_completion + gated_CARL + tool_format + GR3_length)
- **LoRA:** r=64, alpha=128, targets=qkvo+gate+up+down

## Usage



## CARL Naming

This adapter is also available as a merged model at `wheattoast11/il-terminals-carl-omni9b-v10` (pending).

Pattern: `il-terminals-carl-{base}-{tag}` | [Intuition Labs](https://terminals.tech)

## Papers

- Bounded Informational Time Crystals (DOI: 10.5281/zenodo.18906944)
- Semantic Realizability (DOI: 10.5281/zenodo.18992031)