Merged checkpoint of:
HallD/SkeptiSTEM-4B-stageR1-merged-16bit
HallD/SkeptiSTEM-4B-stageR2-format-lora
HallD/SkeptiSTEM-4B-stageR3-grpo-lora
This checkpoint merges both LoRA adapters (stageR2-format and stageR3-grpo) into the stageR1 16-bit weights, so inference loads a single model with no separate adapter files.
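For readers who want to reproduce a merge like the one described above, here is a minimal sketch using `transformers` and `peft`. It is not the author's actual script. It assumes the base loads with `AutoModelForCausalLM`, that both adapters are standard PEFT LoRA adapters, and that the stored dtype is the intended one. The function name `merge_adapters` and the output directory `merged-checkpoint` are hypothetical.

```python
# Repo ids taken from the list above.
BASE_ID = "HallD/SkeptiSTEM-4B-stageR1-merged-16bit"
ADAPTER_IDS = [
    "HallD/SkeptiSTEM-4B-stageR2-format-lora",
    "HallD/SkeptiSTEM-4B-stageR3-grpo-lora",
]


def merge_adapters(base_id=BASE_ID, adapter_ids=ADAPTER_IDS,
                   out_dir="merged-checkpoint"):
    """Fold each LoRA adapter into the base weights and save the result.

    Hypothetical helper; out_dir is an illustrative path.
    """
    # Heavy imports stay local so this module imports without them installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    # Assumption: a causal-LM head; "auto" keeps the dtype stored in the repo.
    model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto")

    for adapter_id in adapter_ids:
        # Attach the adapter, then add its low-rank update into the weights
        # and drop the PEFT wrapper so the next adapter sees a plain model.
        model = PeftModel.from_pretrained(model, adapter_id)
        model = model.merge_and_unload()

    model.save_pretrained(out_dir)
    # Assumption: the tokenizer (and chat template) ship with the base repo.
    AutoTokenizer.from_pretrained(base_id).save_pretrained(out_dir)
    return out_dir
```

Calling `merge_adapters()` downloads all three repositories and writes a standalone checkpoint. Because each LoRA merge adds a fixed low-rank update to the existing weights, the merged result does not depend on the order of the two adapters, up to floating-point rounding.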