Text Generation
Transformers
Safetensors
English
qwen2
conversational
text-generation-inference
Sky-T1-32B-Preview / README.md
nielsr's picture
nielsr HF Staff
Add pipeline tag, link to paper, and cite the paper
2743d64 verified
|
raw
history blame
2.65 kB
metadata
base_model:
  - Qwen/Qwen2.5-32B-Instruct
datasets:
  - codeparrot/apps
  - BAAI/TACO
  - AI-MO/NuminaMath-CoT
language:
  - en
library_name: transformers
license: apache-2.0
pipeline_tag: text-generation

Model Details

Model Description

This is a 32B reasoning model trained from Qwen2.5-32B-Instruct with 17K data. The performance is on par with o1-preview model on both math and coding. Please see our blog post for more details.

  • Developed by: NovaSky Team from Sky Computing Lab at UC Berkeley.

Training Details

Training Data

17K verified correct responses from Qwen/QwQ-32B-Preview on coding, math. In addition, we add the science portion from the Still-2 paper.

Training Procedure

We perform supervised fine tuning on the data, with a batch size of 96.

Speeds

We use Llama-Factory for training. On 8 H100, the training takes 19 hours with DeepSpeed Zero-3 Offload.

Evaluation

Sky-T1-32B-Preview Qwen-2.5-32B-Instruct QwQ o1-preview
Math500 82.4 76.2 85.4 81.4
AIME2024 43.3 16.7 50.0 40.0
LiveCodeBench-Easy 86.3 84.6 90.7 92.9
LiveCodeBench-Medium 56.8 40.8 56.3 54.9
LiveCodeBench-Hard 17.9 9.8 17.1 16.3
GPQA-Diamond 56.8 45.5 52.5 75.2

Acknowledgement

We would like to thanks the compute resources from Lambda Lab and AnyScale. We would like to thanks the academic feedback and support from the Still-2 Team, and Junyang Lin from the Qwen Team.

Citation

Please considering citing our paper if you found it useful for your research. Thank you!

@misc{zhu2025llms,
  author       = {Wenxuan Zhu and Xiangru Tang and Ziyang Ma and Hongbo Zhang and Tianqi Chen},
  title        = {LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters!},
  year         = {2025},
  eprint={2502.07374},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url = {https://arxiv.org/abs/2502.07374}
}