Spaces: Jayant2304/commitment-os

Jayant2304 committed 29 days ago

Commit 665fae0 (verified) · 1 parent: 2c07089

Update Blog.md

Files changed (1): Blog.md

@@ -246,19 +246,19 @@ The training loop connects directly to the live CommitmentOS API — not a stati
 **hard_011 — Investor Dinner Cascade**
-| | Before Training | After Training |
-|--|----------------|----------------|
 | Steps taken | 1 (immediate surrender) | 6 |
 | Constraints met | 0 / 6 | **6 / 6** |
 | Commitments honored | 0 | **1** (happy hour renegotiated) |
 | Emails sent | 0 | **2** (Team + VP_Chen) |
 | Final reward | 0.50 | **0.99** |
-**Reward by task — before vs after across all 15 scenarios:**
 ![Baseline vs Improved Reward by Task — blue bars near 1.0, grey baseline bars ranging 0.4-0.76](reward_by_task.svg)
-*Every task improves. Every single one.*
 **LLM checkpoint results (pre-RL vs post-RL Qwen2.5-1.5B):**

 **hard_011 — Investor Dinner Cascade**
+| | No-Action Baseline | Task-Completing Agent |
+|--|-------------------|----------------------|
 | Steps taken | 1 (immediate surrender) | 6 |
 | Constraints met | 0 / 6 | **6 / 6** |
 | Commitments honored | 0 | **1** (happy hour renegotiated) |
 | Emails sent | 0 | **2** (Team + VP_Chen) |
 | Final reward | 0.50 | **0.99** |
+**Capability gap across all 15 tasks:**
 ![Baseline vs Improved Reward by Task — blue bars near 1.0, grey baseline bars ranging 0.4-0.76](reward_by_task.svg)
+*An agent that submits immediately (grey) vs one that uses the tools correctly (blue). This is the capability gap CommitmentOS trains a model to close.*
 **LLM checkpoint results (pre-RL vs post-RL Qwen2.5-1.5B):**