v8: Stage 2 Context-Contract Planning (MLX LoRA, iter 1000, val=0.032)

Files changed:
- 0000200_adapters.safetensors +3 -0
- 0000400_adapters.safetensors +3 -0
- 0000600_adapters.safetensors +3 -0
- 0000800_adapters.safetensors +3 -0
- 0001000_adapters.safetensors +3 -0
- 0001200_adapters.safetensors +3 -0
- README.md +171 -0
- adapter_config.json +49 -0
- adapters.safetensors +3 -0
0000200_adapters.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:67322f026246e071e283f1df0a52731c457a3b23f573868fb6be965fef6d1713
size 161523781

0000400_adapters.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d0af972dab5d6f496b81f8706ba1ac3c1131d3130fc578c9949322e72383633e
size 161523781

0000600_adapters.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1fceb28c77779930aed540ed89653261fcae365f4c9ebc784a295043c125837d
size 161523781

0000800_adapters.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c67b0f6a1ca8ffeb76a1d8e22ae842999b0fef063433289b59185d8a57213cb1
size 161523781

0001000_adapters.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:65fe2fd597efdf2d7124923a35dd1489254889e870ae543ed265d264c896967a
size 161523781

0001200_adapters.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ddc136f59c37c531b2f1f9e7ea49ccc20ffcf865e4216f5df04567091aa388bf
size 161523781
README.md
ADDED
@@ -0,0 +1,171 @@
---
license: apache-2.0
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
- workflow-planner
- slm
- lora
- mlx
- context-contract-planning
- tier-2-eligibility
language:
- en
pipeline_tag: text-generation
---
# SLM Workflow Planner v8 — Context-Contract Planning (MLX LoRA)

## Overview

**v8** is a Stage 2 enhancement of the SLM Workflow Planner. It extends the v3-best checkpoint with **context-contract planning** — the ability to make routing decisions based on the `required_context` and `produces_context` of **all** nodes in a workflow graph, not just directly connected edges.

This enables three new capabilities:

- **Recovery Routing (Backjump):** on failure, jump backward to an earlier context-satisfiable node
- **Stage Skipping:** skip unnecessary stages when the required context is already available (e.g., walk-in customers)
- **Non-Adjacent Parallelism:** fork two independent context-satisfiable nodes that are not connected by fork-edges
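The context contracts above can be sketched as a minimal data model. This is illustrative only (the names `Node` and `invocable` are hypothetical; the actual graph schema is not part of this repo):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    # Hypothetical sketch of a workflow node's context contract.
    node_id: str
    stage: int
    required_context: set[str] = field(default_factory=set)
    produces_context: set[str] = field(default_factory=set)

    def invocable(self, produced_keys: set[str]) -> bool:
        # A node is context-satisfiable when everything it requires
        # has already been produced somewhere in the run so far.
        return self.required_context <= produced_keys

# The off-path validator from the prompt example below.
validator = Node("NODE_X", stage=5,
                 required_context={"intake_data"},
                 produces_context={"validation"})
print(validator.invocable({"ctx_start", "intake_data", "assessment_score"}))  # True
print(validator.invocable({"ctx_start"}))                                     # False
```

Any node passing this check is a Tier-2 candidate, regardless of whether an edge connects it to the current node.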
## Model Details

| Property | Value |
|---|---|
| **Base Model** | Qwen/Qwen2.5-7B-Instruct |
| **Fine-tune Type** | LoRA (MLX format) |
| **LoRA Rank** | 16 |
| **LoRA Scale** | 2.0 |
| **LoRA Dropout** | 0.02 |
| **Tuned Layers** | 28/32 |
| **Trainable Parameters** | 40.37M (0.53%) |
| **Framework** | MLX (Apple Silicon) |
## Training

| Property | Value |
|---|---|
| **Lineage** | base(8000) → v2(100) → v3(200) → v3-cont → v3-best → **v8(1000)** |
| **Resume Checkpoint** | v3-best (59.2% on the 76-scenario suite) |
| **Training Iterations** | 1000 (stopped early; val loss had converged) |
| **Learning Rate** | 2e-5 (cosine decay to 1e-6, 100-step warmup) |
| **Batch Size** | 4 (effective 8 with gradient accumulation) |
| **Max Sequence Length** | 768 tokens |
| **Dataset** | 696K samples from 150 workflows |
| **Val Loss** | 0.032 (down from 0.272 at the start of the run) |
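The learning-rate row can be reproduced with a small sketch of linear warmup plus cosine decay. It mirrors the `lr_schedule` in `adapter_config.json` (`cosine_decay` with arguments `[2e-5, 2000, 1e-6]`, 100 warmup steps, training stopped at iteration 1000) but is not the MLX scheduler itself:

```python
import math

def lr_at(step: int, peak: float = 2e-5, end: float = 1e-6,
          total: int = 2000, warmup: int = 100) -> float:
    """Sketch of linear warmup followed by cosine decay (not the MLX code)."""
    if step < warmup:
        return peak * step / warmup                 # linear warmup from 0
    t = (step - warmup) / max(1, total - warmup)    # fraction of decay phase
    return end + 0.5 * (peak - end) * (1 + math.cos(math.pi * t))

print(f"{lr_at(0):.2e}")     # 0.00e+00 (start of warmup)
print(f"{lr_at(100):.2e}")   # 2.00e-05 (peak at end of warmup)
print(f"{lr_at(2000):.2e}")  # 1.00e-06 (fully decayed)
```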
### Training Data Distribution

| Category | Count | % | Description |
|---|---|---|---|
| META | 187K | 26.9% | Dead-end escalation |
| NEGATIVE | 187K | 26.9% | Tier-2 visible but edge chosen ("satisfiable ≠ sensible") |
| NEXT_EDGE | 116K | 16.7% | Normal edge progression |
| NEXT_SKIP 🛡 | 55K | 8.0% | Forward dead-end recovery (Tier-2) |
| RETRY | 36K | 5.2% | Edge retry on failure |
| JOIN | 30K | 4.3% | Parallel branch merge |
| NEXT_BACKJUMP 🛡 | 28K | 4.0% | Failure recovery to earlier node (Tier-2) |
| FORK_EDGE | 28K | 4.0% | Edge-adjacent fork |
| FORK_NONADJ 🛡 | 28K | 4.0% | Non-adjacent parallel fork (Tier-2) |

🛡 = Protected from downsampling during balancing
## Prompt Format

The model uses a **tiered prompt** with two candidate sections:

```
Current node: NODE_A (SYSTEM, stage 3)
Outcome: success
Failure type: none

State:
goal_progress=0.40
retry_count=0
...

Produced context: {ctx_start, intake_data, assessment_score}

Edge candidates (normal path):
1. NODE_B (AGENT) [processor] → requires: {assessment_score} → produces: {approval}

Context-eligible (off-path, invocable now):
1. NODE_X (SYSTEM, stage 5, gap=+2) [validator] → requires: {intake_data} ✓ → produces: {validation}

Forkable sets: []
Join-ready: []

What is the best action?
```
**Output format:** `DECISION_TYPE NODE_ID`

- `NEXT NODE_B` — advance to NODE_B
- `FORK NODE_A, NODE_B` — fork in parallel
- `RETRY NODE_A` — retry the current node
- `JOIN NODE_A` — merge parallel branches
- `META` — escalate to a human
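Because the output grammar is this small, it can be validated before execution. Below is a minimal sketch of such a check (the helper `parse_decision` is hypothetical, not part of the released adapter or its guardrails):

```python
import re

# One decision verb, optionally followed by comma-separated node ids.
DECISION_RE = re.compile(
    r"^(NEXT|FORK|RETRY|JOIN|META)(?:\s+([A-Z0-9_]+(?:,\s*[A-Z0-9_]+)*))?$"
)

def parse_decision(text: str) -> tuple[str, list[str]]:
    """Parse 'DECISION_TYPE NODE_ID' output; raise on anything malformed."""
    m = DECISION_RE.match(text.strip())
    if not m:
        raise ValueError(f"malformed decision: {text!r}")
    action, nodes = m.group(1), m.group(2)
    targets = [n.strip() for n in nodes.split(",")] if nodes else []
    if action == "META" and targets:
        raise ValueError("META takes no target node")
    if action != "META" and not targets:
        raise ValueError(f"{action} requires a node id")
    return action, targets

print(parse_decision("FORK NODE_A, NODE_B"))  # ('FORK', ['NODE_A', 'NODE_B'])
print(parse_decision("META"))                 # ('META', [])
```

A downstream guardrail would additionally confirm that each returned node id exists in the workflow graph.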
## Evaluation Results

### Section A: Stratified Test (100 held-out samples)

| Category | Exact Accuracy | Type Accuracy |
|---|---|---|
| META | 20/20 (100%) | 20/20 (100%) |
| NEGATIVE (Tier-2 visible, edge chosen) | 5/5 (100%) | 5/5 (100%) |
| SKIP_FORWARD | 7/7 (100%) | 7/7 (100%) |
| RETRY | 18/20 (90%) | 18/20 (90%) |
| JOIN | 16/20 (80%) | 16/20 (80%) |
| FORK (non-adjacent) | 12/18 (67%) | 14/18 (78%) |
| NEXT (edge) | 5/8 (63%) | 8/8 (100%) |
| **TOTAL** | **83/100 (83%)** | **88/100 (88%)** |
### Section B: Tier-2 Specific (90 held-out samples)

| Category | Exact Accuracy | Type Accuracy |
|---|---|---|
| Non-Adjacent Fork | 15/15 (100%) | 15/15 (100%) |
| META with Context | 15/15 (100%) | 15/15 (100%) |
| Negative Contrast | 14/15 (93%) | 14/15 (93%) |
| RETRY with Context | 14/15 (93%) | 14/15 (93%) |
| Skip Forward | 13/15 (87%) | 14/15 (93%) |
| JOIN with Context | 10/15 (67%) | 10/15 (67%) |
| **TOTAL** | **81/90 (90%)** | **82/90 (91%)** |
+
## Key Capabilities
|
| 132 |
+
|
| 133 |
+
1. **Context-Contract Reasoning:** Evaluates `required_context ⊆ produced_keys` to identify all invocable nodes
|
| 134 |
+
2. **Recovery Routing:** Backjumps on process/resource failure when no edge retry exists
|
| 135 |
+
3. **Stage Skipping:** Advances to forward context-eligible nodes at dead-ends
|
| 136 |
+
4. **Non-Adjacent Parallelism:** Forks independent context-eligible nodes with different actors
|
| 137 |
+
5. **Negative Contrast:** Learned "satisfiable ≠ sensible" — doesn't take Tier-2 when edge path is correct
|
| 138 |
+
|
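Capabilities 2-4 all fall out of the same eligibility check, partitioned by stage and actor. The sketch below shows one plausible classification (the function, field names, and the fork-independence heuristic are illustrative assumptions, not the model's internals):

```python
# Hypothetical sketch: classify context-eligible off-path nodes into the
# three Tier-2 moves, relative to the current stage.
def tier2_candidates(nodes, produced_keys, current_stage):
    eligible = [n for n in nodes
                if n["required_context"] <= produced_keys]  # subset check
    backjump = [n for n in eligible if n["stage"] < current_stage]  # recovery
    skip     = [n for n in eligible if n["stage"] > current_stage]  # stage skip
    # Non-adjacent fork: pairs of eligible nodes with different actors and
    # disjoint outputs (a simple independence heuristic).
    forks = [(a, b) for i, a in enumerate(eligible) for b in eligible[i + 1:]
             if a["actor"] != b["actor"]
             and not (a["produces_context"] & b["produces_context"])]
    return backjump, skip, forks

nodes = [
    {"id": "NODE_X", "stage": 5, "actor": "SYSTEM",
     "required_context": {"intake_data"}, "produces_context": {"validation"}},
    {"id": "NODE_Y", "stage": 5, "actor": "AGENT",
     "required_context": {"intake_data"}, "produces_context": {"summary"}},
]
back, skip, forks = tier2_candidates(nodes, {"ctx_start", "intake_data"}, 3)
print(len(back), len(skip), len(forks))  # 0 2 1
```

The NEGATIVE training category then teaches the model *not* to take any of these candidates when the normal edge path is the right move.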
## Usage (MLX)

```python
from mlx_lm import load, generate

model, tokenizer = load(
    "Qwen/Qwen2.5-7B-Instruct",
    adapter_path="sameer-saraf-quant-ai/slm-workflow-planner-v8-mlx",
)

messages = [
    {"role": "system", "content": "You are a workflow planner..."},
    {"role": "user", "content": "<tiered prompt>"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
response = generate(model, tokenizer, prompt=prompt, max_tokens=30)
print(response)  # e.g. "NEXT ESTIMATION_AND_APPROVAL"
```
## Ensemble Recommendation

For production use, pair v8 with a GPT-4.1 arbiter to cover the ~10% of edge cases it misses (mainly JOIN confusion):

- v8 handles 90%+ of decisions autonomously
- GPT-4.1 validates uncertain decisions (an estimated 5-10% of traffic)
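One way to wire this up is a confidence gate in front of the arbiter. The sketch below is an assumption about how such routing could look (the `route` helper, the confidence signal, and the 0.9 threshold are all hypothetical, not a shipped component):

```python
VALID_ACTIONS = {"NEXT", "FORK", "RETRY", "JOIN", "META"}

def route(slm_decision: str, confidence: float, threshold: float = 0.9):
    """Send a decision to the SLM path or the arbiter (hypothetical sketch)."""
    action = slm_decision.split()[0] if slm_decision.strip() else ""
    if action not in VALID_ACTIONS:
        return "arbiter", slm_decision   # malformed output: always validate
    if action == "JOIN" or confidence < threshold:
        return "arbiter", slm_decision   # known weak spot, or low confidence
    return "slm", slm_decision

print(route("NEXT NODE_B", 0.97))  # ('slm', 'NEXT NODE_B')
print(route("JOIN NODE_A", 0.97))  # ('arbiter', 'JOIN NODE_A')
```

In practice the confidence signal could come from token log-probabilities of the generated decision.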
## Architecture Context

This adapter is part of the **Agentic OS** system:

- **Temporal** handles durable execution and state management
- **Neo4j** stores workflow graph definitions
- **SLM (this model)** makes real-time routing decisions
- **Guardrails** validate SLM output before execution
adapter_config.json
ADDED
@@ -0,0 +1,49 @@
{
  "adapter_path": "src_slm/training/adapters_7b_v8",
  "batch_size": 4,
  "config": "src_slm/training/lora_config_v8.yaml",
  "data": "src_slm/training/data",
  "fine_tune_type": "lora",
  "grad_accumulation_steps": 2,
  "grad_checkpoint": true,
  "iters": 2000,
  "learning_rate": 2e-05,
  "lora_parameters": {
    "rank": 16,
    "dropout": 0.02,
    "scale": 2.0
  },
  "lr_schedule": {
    "name": "cosine_decay",
    "arguments": [
      2e-05,
      2000,
      1e-06
    ],
    "warmup": 100,
    "warmup_init": 0.0
  },
  "mask_prompt": true,
  "max_seq_length": 768,
  "model": "Qwen/Qwen2.5-7B-Instruct",
  "num_layers": 28,
  "optimizer": "adam",
  "optimizer_config": {
    "adam": {},
    "adamw": {},
    "muon": {},
    "sgd": {},
    "adafactor": {}
  },
  "project_name": null,
  "report_to": null,
  "resume_adapter_file": "src_slm/training/adapters_7b_v3_best/adapters.safetensors",
  "save_every": 200,
  "seed": 42,
  "steps_per_eval": 100,
  "steps_per_report": 20,
  "test": true,
  "test_batches": 200,
  "train": true,
  "val_batches": 100
}
adapters.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:65fe2fd597efdf2d7124923a35dd1489254889e870ae543ed265d264c896967a
size 161523781