YAML Metadata Warning:empty or missing yaml metadata in repo card
Check out the documentation for more information.
Files
v3 run β original GRPO (Apr 2026)
final/adapter_model.safetensors(445 MB) β inference-ready, load with PEFTfinal/adapter_config.jsonβ PEFT configfinal/{tokenizer.json, chat_template.jinja, tokenizer_config.json}β tokenizercheckpoints/checkpoint-N/adapter_model.safetensorsβ intermediate adapters at steps 10, 20, 30, 40, 50, 60, 70, 74
v4-3 run β hard-curriculum continuation (May 2026)
Continued from v3's checkpoint-74 for 50 more steps with a harder
curriculum and tightened reward shaping. See
checkpoints/v4-3-may01-hard-curriculum/README.md for the full diff.
checkpoints/v4-3-may01-hard-curriculum/checkpoint-{10,20,30,40,50}/β intermediate adapters (adapter_config.json+adapter_model.safetensors)
Inference
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
base = AutoModelForCausalLM.from_pretrained(
"<path-to-Qwen3.5-9B-smtlib-sft-fixed>",
torch_dtype="bfloat16",
device_map="auto",
)
tok = AutoTokenizer.from_pretrained("Debargha/Qwen3.5-9B-GRPO-Adapters",
subfolder="final")
# v3 final adapter:
model = PeftModel.from_pretrained(base, "Debargha/Qwen3.5-9B-GRPO-Adapters",
subfolder="final")
# v4-3 latest adapter (step 50, on top of v3 checkpoint-74):
# model = PeftModel.from_pretrained(
# base, "Debargha/Qwen3.5-9B-GRPO-Adapters",
# subfolder="checkpoints/v4-3-may01-hard-curriculum/checkpoint-50",
# )
Inference Providers NEW
This model isn't deployed by any Inference Provider. π Ask for provider support