Qwenjamin Franklin V2 (Full)

The second-generation local workshop release, built on Qwen 3.5 9B and tuned for stronger everyday reasoning, stricter JSON/tool behavior, and solid false-premise correction. V2 is trained from fresh CUDA-native adapters that do not reuse earlier MLX adapters.

What This Release Is

  • Full fused PyTorch weights from the best V2 checkpoint
  • Base: Qwen/Qwen3.5-9B
  • Workshop lineage: v55 targeted SFT
  • CUDA-native PEFT training (no MLX adapter reuse)

Comparisons

Internal workshop evals — directional, not leaderboard claims.

Eval                    Stock Qwen3.5-9B   V1 (Qwenjamin_Franklin)   V2 (this)
full40                  309/400            325/400                   341/400
false_smoke             102/110            110/110                    97/110
json_hard                15/30              30/30                     30/30
tool_schema_canary       50/175            106/175                   169/175
workbench_local_agent    63/100             72/100                    70/100
no_tool_leakage          99/100            100/100                    99/100

What improved vs V1: full40 reasoning (+16 pts), tool schema discipline (+63 pts). What V1 still leads on: false-premise correction (110 vs 97), workbench (72 vs 70).

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("stamsam/Qwenjamin_Franklin_V2", torch_dtype="auto", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("stamsam/Qwenjamin_Franklin_V2", trust_remote_code=True)
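For chat-style generation, a minimal sketch using the standard transformers chat-template API; the prompt text and `max_new_tokens` value here are illustrative, not recommendations:

```python
def build_messages(user_prompt: str) -> list[dict]:
    """Wrap a user prompt in the chat format expected by apply_chat_template."""
    return [{"role": "user", "content": user_prompt}]

def generate_reply(user_prompt: str, model_id: str = "stamsam/Qwenjamin_Franklin_V2") -> str:
    # Imports kept local so helpers above can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", trust_remote_code=True
    )
    inputs = tokenizer.apply_chat_template(
        build_messages(user_prompt), add_generation_prompt=True, return_tensors="pt"
    )
    outputs = model.generate(inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Call `generate_reply("Summarize the V2 changes.")` after the weights are downloaded; loading a 9B model in BF16 needs roughly 18 GB of memory.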

Notes

  • For strict tool JSON, use explicit output instructions.
  • Verify important outputs in high-stakes workflows.
  • A compact 4-bit MLX version is at stamsam/Qwenjamin_Franklin_V2_4bit.
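The first note above can be sketched as an explicit output instruction plus post-hoc validation; the schema and prompt wording are hypothetical, not the workshop's actual tool format:

```python
import json

# Hypothetical instruction to prepend to the system or user prompt.
TOOL_INSTRUCTION = (
    "Respond with ONLY a JSON object matching this schema, no prose:\n"
    '{"tool": <string>, "arguments": <object>}'
)

def parse_tool_call(raw: str) -> dict:
    """Validate a model reply as a tool call; raise ValueError on violations."""
    call = json.loads(raw)  # raises ValueError if the reply is not pure JSON
    if not isinstance(call, dict) or set(call) != {"tool", "arguments"}:
        raise ValueError("reply is not a flat tool-call object")
    if not isinstance(call["tool"], str) or not isinstance(call["arguments"], dict):
        raise ValueError("wrong field types")
    return call

# A well-formed reply passes; a reply with leading prose fails json.loads.
call = parse_tool_call('{"tool": "search", "arguments": {"query": "ben franklin"}}')
```

Rejecting and retrying on `ValueError` is a simple way to apply the "verify important outputs" note for tool calls.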