SkeptiSTEM-4B Final Merged (16-bit)

Merged checkpoint of:

  • Base: HallD/SkeptiSTEM-4B-stageR1-merged-16bit
  • Stage R2 (format): HallD/SkeptiSTEM-4B-stageR2-format-lora
  • Stage R3 (GRPO): HallD/SkeptiSTEM-4B-stageR3-grpo-lora
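The card does not show how the merge was performed. The sketch below is one plausible way to combine the stages listed above, assuming the two LoRA adapters were applied in order (R2, then R3) on top of the R1 base using the `peft` library. The function name `merge_adapters`, the output directory `merged-16bit`, and the sequential order are illustrative assumptions, not the author's confirmed procedure.

```python
# Repository IDs taken from the list above.
BASE_ID = "HallD/SkeptiSTEM-4B-stageR1-merged-16bit"
ADAPTER_IDS = (
    "HallD/SkeptiSTEM-4B-stageR2-format-lora",  # Stage R2 (format)
    "HallD/SkeptiSTEM-4B-stageR3-grpo-lora",    # Stage R3 (GRPO)
)


def merge_adapters(base_id=BASE_ID, adapter_ids=ADAPTER_IDS, out_dir="merged-16bit"):
    """Hypothetical helper: fold each LoRA adapter into the base weights in turn.

    Assumes the adapters are meant to be stacked in the order given.
    """
    # Heavy imports are kept local so the constants can be inspected
    # without torch, transformers, or peft installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
    for adapter_id in adapter_ids:
        # Attach the adapter, then bake its low-rank update into the weights.
        model = PeftModel.from_pretrained(model, adapter_id)
        model = model.merge_and_unload()

    # Save the merged weights as safetensors, plus the base tokenizer.
    model.save_pretrained(out_dir, safe_serialization=True)
    AutoTokenizer.from_pretrained(base_id).save_pretrained(out_dir)
    return out_dir


if __name__ == "__main__":
    print(merge_adapters())
```

Merging after each adapter keeps the model a plain `transformers` model between steps, so the second adapter is loaded against weights that already include the first.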

This checkpoint merges both LoRA adapters (Stage R2 and Stage R3) into the base weights, so inference needs only a single model load with no adapters to attach at runtime.
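Because the adapters are already folded into the weights, the checkpoint should load like any other causal language model. The following is a minimal sketch using the `transformers` library, assuming the repository ID `HallD/SkeptiSTEM-4B-final-merged-16bit` and BF16 weights. The helper names `load_merged` and `generate`, the prompt, and the generation settings are illustrative and not part of the card.

```python
MODEL_ID = "HallD/SkeptiSTEM-4B-final-merged-16bit"


def load_merged(model_id=MODEL_ID):
    """Hypothetical helper: load the tokenizer and merged model in bfloat16."""
    # Imports are local so this module can be imported without the
    # heavy dependencies installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    return tokenizer, model


def generate(prompt, max_new_tokens=128):
    """Hypothetical helper: run one generation pass on a plain-text prompt."""
    tokenizer, model = load_merged()
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Example prompt chosen for illustration only.
    print(generate("Explain why the sky appears blue."))
```

No `peft` import is needed here, since nothing is attached at load time. The card does not document a prompt format, so the plain-text prompt above is an assumption.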

Safetensors · Model size: 4B params · Tensor type: BF16

Model tree for HallD/SkeptiSTEM-4B-final-merged-16bit

Base model: Qwen/Qwen3-4B-Base