sameer-saraf-quant-ai committed on Commit d7cca15 · verified · 1 Parent(s): a9ee43a

v8: Stage 2 Context-Contract Planning (MLX LoRA, iter 1000, val=0.032)

0000200_adapters.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:67322f026246e071e283f1df0a52731c457a3b23f573868fb6be965fef6d1713
size 161523781
0000400_adapters.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:d0af972dab5d6f496b81f8706ba1ac3c1131d3130fc578c9949322e72383633e
size 161523781
0000600_adapters.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:1fceb28c77779930aed540ed89653261fcae365f4c9ebc784a295043c125837d
size 161523781
0000800_adapters.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:c67b0f6a1ca8ffeb76a1d8e22ae842999b0fef063433289b59185d8a57213cb1
size 161523781
0001000_adapters.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:65fe2fd597efdf2d7124923a35dd1489254889e870ae543ed265d264c896967a
size 161523781
0001200_adapters.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:ddc136f59c37c531b2f1f9e7ea49ccc20ffcf865e4216f5df04567091aa388bf
size 161523781
README.md ADDED
@@ -0,0 +1,171 @@
---
license: apache-2.0
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
- workflow-planner
- slm
- lora
- mlx
- context-contract-planning
- tier-2-eligibility
language:
- en
pipeline_tag: text-generation
---

# SLM Workflow Planner v8 — Context-Contract Planning (MLX LoRA)

## Overview

**v8** is a Stage 2 enhancement of the SLM Workflow Planner. It extends the v3-best checkpoint with **context-contract planning** — the ability to make routing decisions based on the `required_context` and `produces_context` of ALL nodes in a workflow graph, not just directly connected edges.

This enables three new capabilities:
- **Recovery Routing (Backjump):** On failure, jump backward to an earlier context-satisfiable node
- **Stage Skipping:** Skip unnecessary stages when the required context is already available (e.g., walk-in customers)
- **Non-Adjacent Parallelism:** Fork two independent context-satisfiable nodes that aren't connected by fork-edges

## Model Details

| Property | Value |
|---|---|
| **Base Model** | Qwen/Qwen2.5-7B-Instruct |
| **Fine-tune Type** | LoRA (MLX format) |
| **LoRA Rank** | 16 |
| **LoRA Scale** | 2.0 |
| **LoRA Dropout** | 0.02 |
| **Tuned Layers** | 28/32 |
| **Trainable Parameters** | 40.37M (0.53%) |
| **Framework** | MLX (Apple Silicon) |

## Training

| Property | Value |
|---|---|
| **Lineage** | base(8000) → v2(100) → v3(200) → v3-cont → v3-best → **v8(1000)** |
| **Resume Checkpoint** | v3-best (59.2% on 76-scenario suite) |
| **Training Iterations** | 1000 (stopped early — val loss converged) |
| **Learning Rate** | 2e-5 (cosine decay to 1e-6, 100-step warmup) |
| **Batch Size** | 4 (effective 8 with grad accumulation) |
| **Max Sequence Length** | 768 tokens |
| **Dataset** | 696K samples from 150 workflows |
| **Val Loss** | 0.032 (down from 0.272 at the start) |

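The learning-rate schedule in the table can be written out explicitly. The sketch below is an illustrative reconstruction from the values in `adapter_config.json` (peak 2e-5, end 1e-6, 100-step warmup, 2000 configured iterations); exactly how MLX counts warmup steps against the decay horizon may differ slightly.

```python
import math

# Illustrative reconstruction: linear warmup to the peak LR, then
# cosine decay to the end LR over the remaining configured steps.
PEAK_LR, END_LR, WARMUP, TOTAL_ITERS = 2e-5, 1e-6, 100, 2000

def lr_at(step: int) -> float:
    if step < WARMUP:
        return PEAK_LR * step / WARMUP  # linear warmup from 0
    # cosine decay from PEAK_LR down to END_LR after warmup ends
    progress = min((step - WARMUP) / (TOTAL_ITERS - WARMUP), 1.0)
    return END_LR + 0.5 * (PEAK_LR - END_LR) * (1 + math.cos(math.pi * progress))
```

Since the run stopped at iteration 1000, training ended roughly halfway down the cosine curve rather than at the 1e-6 floor.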
### Training Data Distribution

| Category | Count | % | Description |
|---|---|---|---|
| META | 187K | 26.9% | Dead-end escalation |
| NEGATIVE | 187K | 26.9% | Tier-2 visible but edge chosen ("satisfiable ≠ sensible") |
| NEXT_EDGE | 116K | 16.7% | Normal edge progression |
| NEXT_SKIP 🛡 | 55K | 8.0% | Forward dead-end recovery (Tier-2) |
| RETRY | 36K | 5.2% | Edge retry on failure |
| JOIN | 30K | 4.3% | Parallel branch merge |
| NEXT_BACKJUMP 🛡 | 28K | 4.0% | Failure recovery to earlier node (Tier-2) |
| FORK_EDGE | 28K | 4.0% | Edge-adjacent fork |
| FORK_NONADJ 🛡 | 28K | 4.0% | Non-adjacent parallel fork (Tier-2) |

🛡 = Protected from downsampling during balancing

## Prompt Format

The model uses a **tiered prompt** with two candidate sections:

```
Current node: NODE_A (SYSTEM, stage 3)
Outcome: success
Failure type: none

State:
goal_progress=0.40
retry_count=0
...

Produced context: {ctx_start, intake_data, assessment_score}

Edge candidates (normal path):
1. NODE_B (AGENT) [processor] → requires: {assessment_score} → produces: {approval}

Context-eligible (off-path, invocable now):
1. NODE_X (SYSTEM, stage 5, gap=+2) [validator] → requires: {intake_data} ✓ → produces: {validation}

Forkable sets: []
Join-ready: []

What is the best action?
```

**Output format:** `DECISION_TYPE NODE_ID`
- `NEXT NODE_B` — advance to NODE_B
- `FORK NODE_A, NODE_B` — parallel fork
- `RETRY NODE_A` — retry current
- `JOIN NODE_A` — merge parallel branches
- `META` — escalate to human

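Downstream code has to turn this string back into a structured decision before it can be validated and executed. A minimal parser sketch, assuming the grammar above (the `parse_decision` helper is illustrative, not part of the released code):

```python
import re

# One decision type, optionally followed by one or more node IDs.
DECISION_RE = re.compile(r"^(NEXT|FORK|RETRY|JOIN|META)(?:\s+(.+))?$")

def parse_decision(text: str):
    """Parse 'DECISION_TYPE NODE_ID' into (type, [node_ids]).

    META carries no node; FORK may list several comma-separated nodes.
    """
    match = DECISION_RE.match(text.strip())
    if not match:
        raise ValueError(f"unparseable planner output: {text!r}")
    decision, rest = match.group(1), match.group(2)
    nodes = [n.strip() for n in rest.split(",")] if rest else []
    if decision == "META" and nodes:
        raise ValueError("META takes no node argument")
    if decision != "META" and not nodes:
        raise ValueError(f"{decision} requires at least one node")
    return decision, nodes
```

Rejecting anything the regex does not match gives the guardrail layer a cheap first line of defense against malformed generations.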
## Evaluation Results

### Section A: Stratified Test (100 held-out samples)

| Category | Exact Accuracy | Type Accuracy |
|---|---|---|
| META | 20/20 (100%) | 20/20 (100%) |
| NEGATIVE (Tier-2 visible, edge chosen) | 5/5 (100%) | 5/5 (100%) |
| SKIP_FORWARD | 7/7 (100%) | 7/7 (100%) |
| RETRY | 18/20 (90%) | 18/20 (90%) |
| JOIN | 16/20 (80%) | 16/20 (80%) |
| FORK (non-adjacent) | 12/18 (67%) | 14/18 (78%) |
| NEXT (edge) | 5/8 (63%) | 8/8 (100%) |
| **TOTAL** | **83/100 (83%)** | **88/100 (88%)** |

### Section B: Tier-2 Specific (90 held-out samples)

| Category | Exact Accuracy | Type Accuracy |
|---|---|---|
| Non-Adjacent Fork | 15/15 (100%) | 15/15 (100%) |
| META with Context | 15/15 (100%) | 15/15 (100%) |
| Negative Contrast | 14/15 (93%) | 14/15 (93%) |
| RETRY with Context | 14/15 (93%) | 14/15 (93%) |
| Skip Forward | 13/15 (87%) | 14/15 (93%) |
| JOIN with Context | 10/15 (67%) | 10/15 (67%) |
| **TOTAL** | **81/90 (90%)** | **82/90 (91%)** |

## Key Capabilities

1. **Context-Contract Reasoning:** Evaluates `required_context ⊆ produced_keys` to identify all invocable nodes
2. **Recovery Routing:** Backjumps on process/resource failure when no edge retry exists
3. **Stage Skipping:** Advances to forward context-eligible nodes at dead-ends
4. **Non-Adjacent Parallelism:** Forks independent context-eligible nodes with different actors
5. **Negative Contrast:** Learned "satisfiable ≠ sensible" — doesn't take the Tier-2 option when the edge path is correct

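Capability 1 is the subset test that the prompt's "Context-eligible" section is built from. A minimal sketch, assuming a hypothetical `Node` record (in the real system the graph definition lives in Neo4j):

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    node_id: str
    required_context: set = field(default_factory=set)
    produces_context: set = field(default_factory=set)

def context_eligible(nodes, produced_keys):
    """Nodes whose required_context is covered by the keys produced so
    far (required_context ⊆ produced_keys) are invocable right now."""
    return [n.node_id for n in nodes if n.required_context <= produced_keys]
```

With the `Produced context` from the prompt example above, a node requiring `{assessment_score}` and one requiring `{intake_data}` are both eligible, which is exactly what the two candidate sections enumerate; the NEGATIVE training category then teaches the model that eligible does not always mean it should be chosen.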
## Usage (MLX)

```python
from mlx_lm import load, generate

model, tokenizer = load(
    "Qwen/Qwen2.5-7B-Instruct",
    adapter_path="sameer-saraf-quant-ai/slm-workflow-planner-v8-mlx",
)

messages = [
    {"role": "system", "content": "You are a workflow planner..."},
    {"role": "user", "content": "<tiered prompt>"},
]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
response = generate(model, tokenizer, prompt=prompt, max_tokens=30)
print(response)  # "NEXT ESTIMATION_AND_APPROVAL"
```

## Ensemble Recommendation

For production use, combine v8 with a GPT-4.1 arbiter to cover the ~10% of edge cases it still misses (mainly JOIN confusion):
- v8 handles 90%+ of decisions autonomously
- The arbiter validates uncertain decisions (estimated 5-10% of traffic)

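One way to implement the split is a simple gate on decision type and model confidence. The sketch below is illustrative only: the 0.85 threshold is a placeholder rather than a tuned value, and singling out JOIN simply reflects its weakest exact accuracy in the tables above.

```python
# JOIN had the lowest exact accuracy in both evaluation sections, so it
# is a natural candidate for arbiter review.  Threshold is a placeholder.
UNCERTAIN_TYPES = {"JOIN"}
CONFIDENCE_THRESHOLD = 0.85

def needs_arbiter(decision_type: str, confidence: float) -> bool:
    """Route the decision to the arbiter model when the SLM is likely
    to be wrong; otherwise execute it autonomously."""
    return decision_type in UNCERTAIN_TYPES or confidence < CONFIDENCE_THRESHOLD
```

Gating on type plus confidence keeps arbiter traffic in the estimated 5-10% band while leaving the common NEXT/RETRY/META decisions fully autonomous.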
## Architecture Context

This adapter is part of the **Agentic OS** system:
- **Temporal** handles durable execution and state management
- **Neo4j** stores workflow graph definitions
- **SLM (this model)** makes real-time routing decisions
- **Guardrails** validate SLM output before execution

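The guardrail step can be as simple as checking a parsed decision against the graph before Temporal executes it. A hypothetical sketch (`validate_decision` is not part of the released code):

```python
VALID_TYPES = {"NEXT", "FORK", "RETRY", "JOIN", "META"}

def validate_decision(decision_type: str, nodes: list, graph_nodes: set) -> bool:
    """Reject malformed decision types and decisions that reference
    nodes absent from the workflow graph definition."""
    if decision_type not in VALID_TYPES:
        return False
    return all(n in graph_nodes for n in nodes)
```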
adapter_config.json ADDED
@@ -0,0 +1,49 @@
{
  "adapter_path": "src_slm/training/adapters_7b_v8",
  "batch_size": 4,
  "config": "src_slm/training/lora_config_v8.yaml",
  "data": "src_slm/training/data",
  "fine_tune_type": "lora",
  "grad_accumulation_steps": 2,
  "grad_checkpoint": true,
  "iters": 2000,
  "learning_rate": 2e-05,
  "lora_parameters": {
    "rank": 16,
    "dropout": 0.02,
    "scale": 2.0
  },
  "lr_schedule": {
    "name": "cosine_decay",
    "arguments": [
      2e-05,
      2000,
      1e-06
    ],
    "warmup": 100,
    "warmup_init": 0.0
  },
  "mask_prompt": true,
  "max_seq_length": 768,
  "model": "Qwen/Qwen2.5-7B-Instruct",
  "num_layers": 28,
  "optimizer": "adam",
  "optimizer_config": {
    "adam": {},
    "adamw": {},
    "muon": {},
    "sgd": {},
    "adafactor": {}
  },
  "project_name": null,
  "report_to": null,
  "resume_adapter_file": "src_slm/training/adapters_7b_v3_best/adapters.safetensors",
  "save_every": 200,
  "seed": 42,
  "steps_per_eval": 100,
  "steps_per_report": 20,
  "test": true,
  "test_batches": 200,
  "train": true,
  "val_batches": 100
}
adapters.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:65fe2fd597efdf2d7124923a35dd1489254889e870ae543ed265d264c896967a
size 161523781