---
license: apache-2.0
base_model: Qwen/Qwen3-8B
tags:
  - qwen
  - qwen3
  - math
  - reasoning
  - alignment
  - rlhf
  - dpo
datasets:
  - math
arxiv: 2603.01683
language:
  - en
---

# Qwen3-8B-SPoT

## Model Description

Qwen3-8B-SPoT is a reasoning-enhanced large language model post-trained from the Qwen/Qwen3-8B base model. It is trained using the Surgical Post-Training (SPoT) paradigm, which significantly improves the model's reasoning capabilities while alleviating the catastrophic forgetting typically associated with Supervised Fine-Tuning (SFT).

This model was introduced in the paper *Surgical Post-Training: Cutting Errors, Keeping Knowledge* (Lin & Han, 2026).

## Training Details & Performance

- **Efficiency:** The model was trained on only 4k rectified math data pairs, avoiding standard multi-phase pipelines (SFT → GRPO → DPO).
- **Reasoning improvement:** SPoT improves the base Qwen3-8B's accuracy by 6.2% on average across in-domain and out-of-domain (OOD) complex math and reasoning tasks.
- **Knowledge retention:** The model robustly mitigates catastrophic forgetting, remaining stable on general-capability benchmarks such as IFEval.

## Usage

You can load the model and generate text with the standard Hugging Face `transformers` API:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "linius/Qwen3-8B-SPoT"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Prepare prompt
prompt = "Solve the following math problem step-by-step: ..."
messages = [
    {"role": "system", "content": "You are a helpful and precise reasoning assistant."},
    {"role": "user", "content": prompt}
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=2048
)

# Strip the prompt tokens so only newly generated tokens remain
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
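Qwen3 checkpoints emit an internal reasoning trace delimited by `<think>…</think>` before the final answer. If you want to present only the final answer, the upstream Qwen3 model card splits the generated ids on the `</think>` token (id 151668 in the Qwen3 tokenizer). A minimal sketch of that post-processing, assuming this fine-tune keeps the same special-token id (`split_thinking` is a hypothetical helper name):

```python
def split_thinking(output_ids, think_end_id=151668):
    """Split generated token ids into (thinking_ids, content_ids).

    Searches for the last occurrence of the assumed </think> token id;
    if it is absent, everything is treated as final content.
    """
    try:
        # Position just after the last </think> token
        idx = len(output_ids) - output_ids[::-1].index(think_end_id)
    except ValueError:
        return [], output_ids
    return output_ids[:idx], output_ids[idx:]


# Example: decode each part separately, e.g. with
# tokenizer.decode(content_ids, skip_special_tokens=True)
thinking_ids, content_ids = split_thinking([1, 2, 151668, 3, 4])
```

For thinking-mode generation, the upstream Qwen3 card recommends sampling (e.g. `temperature=0.6`, `top_p=0.95`) rather than greedy decoding; check this model's card for any SPoT-specific overrides.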

## Citation

If you find our model, data, or the SPoT methodology useful in your research, please consider citing our paper:

BibTeX:

```bibtex
@article{lin2026surgical,
  title   = {Surgical Post-Training: Cutting Errors, Keeping Knowledge},
  author  = {Wenye Lin and Kai Han},
  journal = {arXiv preprint arXiv:2603.01683},
  year    = {2026}
}
```