> Latest commit `d7cca15` (verified): v8: Stage 2 Context-Contract Planning (MLX LoRA, iter 1000, val=0.032)
---
license: apache-2.0
base_model: Qwen/Qwen2.5-7B-Instruct
tags:
  - workflow-planner
  - slm
  - lora
  - mlx
  - context-contract-planning
  - tier-2-eligibility
language:
  - en
pipeline_tag: text-generation
---

# SLM Workflow Planner v8: Context-Contract Planning (MLX LoRA)

## Overview

v8 is a Stage 2 enhancement of the SLM Workflow Planner. It extends the v3-best checkpoint with context-contract planning: the ability to make routing decisions based on the `required_context` and `produces_context` of **all** nodes in a workflow graph, not just of directly connected edges.

This enables three new capabilities:

- **Recovery Routing (Backjump):** on failure, jump backward to an earlier context-satisfiable node
- **Stage Skipping:** skip unnecessary stages when the required context is already available (e.g., walk-in customers)
- **Non-Adjacent Parallelism:** fork two independent context-satisfiable nodes that aren't connected by fork edges

## Model Details

| Property | Value |
|---|---|
| Base Model | Qwen/Qwen2.5-7B-Instruct |
| Fine-tune Type | LoRA (MLX format) |
| LoRA Rank | 16 |
| LoRA Scale | 2.0 |
| LoRA Dropout | 0.02 |
| Tuned Layers | 28/32 |
| Trainable Parameters | 40.37M (0.53%) |
| Framework | MLX (Apple Silicon) |

## Training

| Property | Value |
|---|---|
| Lineage | base(8000) → v2(100) → v3(200) → v3-cont → v3-best → v8(1000) |
| Resume Checkpoint | v3-best (59.2% on 76-scenario suite) |
| Training Iterations | 1000 (stopped early; val loss converged) |
| Learning Rate | 2e-5 (cosine decay to 1e-6, 100-step warmup) |
| Batch Size | 4 (effective 8 with gradient accumulation) |
| Max Sequence Length | 768 tokens |
| Dataset | 696K samples from 150 workflows |
| Val Loss | 0.032 (down from 0.272) |

## Training Data Distribution

| Category | Count | % | Description |
|---|---|---|---|
| META | 187K | 26.9% | Dead-end escalation |
| NEGATIVE | 187K | 26.9% | Tier-2 visible but edge chosen ("satisfiable ≠ sensible") |
| NEXT_EDGE | 116K | 16.7% | Normal edge progression |
| NEXT_SKIP 🛑 | 55K | 8.0% | Forward dead-end recovery (Tier-2) |
| RETRY | 36K | 5.2% | Edge retry on failure |
| JOIN | 30K | 4.3% | Parallel branch merge |
| NEXT_BACKJUMP 🛑 | 28K | 4.0% | Failure recovery to earlier node (Tier-2) |
| FORK_EDGE | 28K | 4.0% | Edge-adjacent fork |
| FORK_NONADJ 🛑 | 28K | 4.0% | Non-adjacent parallel fork (Tier-2) |

🛑 = protected from downsampling during balancing
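
The balancing script itself isn't published with this card; a minimal sketch of what protected-category downsampling could look like (function name and sample schema are hypothetical):

```python
import random

# Tier-2 categories exempt from downsampling, per the table above
PROTECTED = {"NEXT_SKIP", "NEXT_BACKJUMP", "FORK_NONADJ"}

def balance(samples, cap, seed=0):
    """Downsample each unprotected category to at most `cap` samples."""
    rng = random.Random(seed)
    by_category = {}
    for sample in samples:
        by_category.setdefault(sample["category"], []).append(sample)

    balanced = []
    for category, items in by_category.items():
        if category in PROTECTED or len(items) <= cap:
            balanced.extend(items)          # keep Tier-2 data intact
        else:
            balanced.extend(rng.sample(items, cap))
    return balanced
```

With a cap below a category's count, unprotected categories shrink to the cap while Tier-2 categories keep every sample.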

## Prompt Format

The model uses a tiered prompt with two candidate sections:

```text
Current node: NODE_A (SYSTEM, stage 3)
Outcome: success
Failure type: none

State:
  goal_progress=0.40
  retry_count=0
  ...

Produced context: {ctx_start, intake_data, assessment_score}

Edge candidates (normal path):
  1. NODE_B (AGENT) [processor] → requires: {assessment_score} → produces: {approval}

Context-eligible (off-path, invocable now):
  1. NODE_X (SYSTEM, stage 5, gap=+2) [validator] → requires: {intake_data} ✓ → produces: {validation}

Forkable sets: []
Join-ready: []

What is the best action?
```

Output format: `DECISION_TYPE NODE_ID`

- `NEXT NODE_B` – advance to NODE_B
- `FORK NODE_A, NODE_B` – parallel fork
- `RETRY NODE_A` – retry the current node
- `JOIN NODE_A` – merge parallel branches
- `META` – escalate to a human
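
A downstream executor has to parse this output before acting on it. The released code doesn't include a parser; a small hypothetical one for the format above might look like:

```python
import re

# DECISION_TYPE followed by an optional comma-separated node list
DECISION_RE = re.compile(r"^(NEXT|FORK|RETRY|JOIN|META)(?:\s+(.+))?$")

def parse_decision(text):
    """Parse 'DECISION_TYPE NODE_ID' output into (type, [node_ids])."""
    match = DECISION_RE.match(text.strip())
    if not match:
        raise ValueError(f"unparseable decision: {text!r}")
    decision_type, rest = match.group(1), match.group(2)
    nodes = [n.strip() for n in rest.split(",")] if rest else []
    if decision_type == "META" and nodes:
        raise ValueError("META takes no node argument")
    if decision_type != "META" and not nodes:
        raise ValueError(f"{decision_type} requires a node argument")
    return decision_type, nodes
```

Rejecting anything outside the five decision types is a cheap first guardrail before graph-level validation.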

## Evaluation Results

### Section A: Stratified Test (100 held-out samples)

| Category | Exact Accuracy | Type Accuracy |
|---|---|---|
| META | 20/20 (100%) | 20/20 (100%) |
| NEGATIVE (Tier-2 visible, edge chosen) | 5/5 (100%) | 5/5 (100%) |
| SKIP_FORWARD | 7/7 (100%) | 7/7 (100%) |
| RETRY | 18/20 (90%) | 18/20 (90%) |
| JOIN | 16/20 (80%) | 16/20 (80%) |
| FORK (non-adjacent) | 12/18 (67%) | 14/18 (78%) |
| NEXT (edge) | 5/8 (63%) | 8/8 (100%) |
| **TOTAL** | **83/100 (83%)** | **88/100 (88%)** |

### Section B: Tier-2 Specific (90 held-out samples)

| Category | Exact Accuracy | Type Accuracy |
|---|---|---|
| Non-Adjacent Fork | 15/15 (100%) | 15/15 (100%) |
| META with Context | 15/15 (100%) | 15/15 (100%) |
| Negative Contrast | 14/15 (93%) | 14/15 (93%) |
| RETRY with Context | 14/15 (93%) | 14/15 (93%) |
| Skip Forward | 13/15 (87%) | 14/15 (93%) |
| JOIN with Context | 10/15 (67%) | 10/15 (67%) |
| **TOTAL** | **81/90 (90%)** | **82/90 (91%)** |

## Key Capabilities

1. **Context-Contract Reasoning:** evaluates `required_context ⊆ produced_keys` to identify all invocable nodes
2. **Recovery Routing:** backjumps on process/resource failure when no edge retry exists
3. **Stage Skipping:** advances to forward context-eligible nodes at dead-ends
4. **Non-Adjacent Parallelism:** forks independent context-eligible nodes with different actors
5. **Negative Contrast:** learned "satisfiable ≠ sensible"; doesn't take a Tier-2 action when the edge path is correct
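
The eligibility test behind capability 1 is a plain set-subset check. A minimal illustration (helper name and node schema are hypothetical), using the example graph from the prompt format above:

```python
def eligible_nodes(nodes, produced_keys):
    """Return ids of nodes whose required_context is already satisfied.

    nodes: dict of node_id -> {"required_context": set of context keys}
    produced_keys: set of context keys produced so far
    """
    return [
        node_id
        for node_id, spec in nodes.items()
        if spec["required_context"] <= produced_keys  # required ⊆ produced
    ]

# Nodes from the example prompt: NODE_B is the edge candidate,
# NODE_X is off-path but context-eligible, NODE_Y is not yet invocable.
nodes = {
    "NODE_B": {"required_context": {"assessment_score"}},
    "NODE_X": {"required_context": {"intake_data"}},
    "NODE_Y": {"required_context": {"approval"}},
}
produced = {"ctx_start", "intake_data", "assessment_score"}
print(sorted(eligible_nodes(nodes, produced)))  # ['NODE_B', 'NODE_X']
```

Note that eligibility is only the candidate filter; the model still has to choose which eligible node is *sensible* (capability 5).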

## Usage (MLX)

```python
from mlx_lm import load, generate

# Load the base model with this LoRA adapter applied
model, tokenizer = load(
    "Qwen/Qwen2.5-7B-Instruct",
    adapter_path="sameer-saraf-quant-ai/slm-workflow-planner-v8-mlx",
)

messages = [
    {"role": "system", "content": "You are a workflow planner..."},
    {"role": "user", "content": "<tiered prompt>"},
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
response = generate(model, tokenizer, prompt=prompt, max_tokens=30)
print(response)  # e.g. "NEXT ESTIMATION_AND_APPROVAL"
```

## Ensemble Recommendation

For production use, combine with a GPT-4.1 arbiter for the ~10% edge cases (mainly JOIN confusion):

- v8 handles 90%+ of decisions autonomously
- GPT-4.1 validates uncertain decisions (an estimated 5–10% of traffic)
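
One way such an ensemble could be wired, with `slm_decide` and `gpt_arbiter` as hypothetical stand-ins for the host system's model calls (not a published API):

```python
# Escalate the decision types with the weakest v8 held-out accuracy
ESCALATE_TYPES = {"JOIN", "FORK"}

def decide(prompt, slm_decide, gpt_arbiter):
    """Run v8 first; send its weakest decision types to the arbiter."""
    decision = slm_decide(prompt)           # e.g. "JOIN NODE_A"
    decision_type = decision.split()[0]
    if decision_type in ESCALATE_TYPES:
        # The arbiter may confirm or override the SLM's decision
        return gpt_arbiter(prompt, decision)
    return decision
```

Gating on decision type keeps arbiter traffic roughly in the 5–10% band the card estimates, since JOIN and FORK are a small share of decisions.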

## Architecture Context

This adapter is part of the Agentic OS system:

- **Temporal** handles durable execution and state management
- **Neo4j** stores workflow graph definitions
- **SLM (this model)** makes real-time routing decisions
- **Guardrails** validate SLM output before execution