Qwen3-8B-SPoT

Model Description

Qwen3-8B-SPoT is a reasoning-enhanced large language model post-trained from the Qwen/Qwen3-8B base model. It is trained using the Surgical Post-Training (SPoT) paradigm, which significantly improves the model's reasoning capabilities while alleviating the catastrophic forgetting typically associated with Supervised Fine-Tuning (SFT).

This model was introduced in the paper *Surgical Post-Training: Cutting Errors, Keeping Knowledge* (Lin & Han, 2026).

Training Details & Performance

  • Efficiency: The model was trained on only 4k rectified math data pairs, avoiding the standard multi-phase pipeline (SFT → GRPO → DPO).
  • Reasoning Improvement: SPoT improves the base Qwen3-8B's accuracy by 6.2% on average across in-domain and Out-of-Domain (OOD) complex math and reasoning tasks.
  • Knowledge Retention: The model robustly mitigates catastrophic forgetting, maintaining stable performance on general-capability benchmarks such as IFEval.
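The term "rectified math data pairs" suggests training examples whose reasoning has been corrected before supervised training. As an illustrative sketch only (the field names, schema, and rectification procedure below are assumptions, not taken from the paper), such a pair might be packaged in chat format like this:

```python
# Illustrative sketch: packaging a rectified math example as a chat-format
# SFT record. The schema and helper name are assumptions for illustration;
# the paper's actual data format may differ.

def build_rectified_pair(problem: str, corrected_solution: str) -> dict:
    """Pair a math problem with its corrected (rectified) solution
    as a chat-style supervised fine-tuning example."""
    return {
        "messages": [
            {"role": "user", "content": problem},
            {"role": "assistant", "content": corrected_solution},
        ]
    }

example = build_rectified_pair(
    problem="What is 17 * 24?",
    corrected_solution="17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408.",
)
print(example["messages"][1]["content"])
```

Only the corrected assistant turn serves as the training target, so the model learns the repaired reasoning trace rather than the original flawed one.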

Usage

You can load the model and generate text using the standard Hugging Face `transformers` API:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "linius/Qwen3-8B-SPoT"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Prepare the prompt with the chat template
prompt = "Solve the following math problem step-by-step: ..."
messages = [
    {"role": "system", "content": "You are a helpful and precise reasoning assistant."},
    {"role": "user", "content": prompt},
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate, then strip the prompt tokens from the output
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=2048,
)
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```

Citation

If you find our model, data, or the SPoT methodology useful in your research, please consider citing our paper:

BibTeX:

```bibtex
@article{lin2026surgical,
  title   = {Surgical Post-Training: Cutting Errors, Keeping Knowledge},
  author  = {Wenye Lin and Kai Han},
  journal = {arXiv preprint arXiv:2603.01683},
  year    = {2026}
}
```