---
base_model:
- Qwen/Qwen2.5-Coder-7B-Instruct
library_name: transformers
license: mit
metrics:
- accuracy
pipeline_tag: text-generation
---
<div align="center">
<h1 align="center">
Z1: Efficient Test-time Scaling with Code
</h1>
<p>Training Large Language Models to Reason with Shifted Thinking
</p>
</div>
<p align="center">
<a href="https://arxiv.org/abs/2504.00810"><b>[πŸ“œ Paper]</b></a> β€’
<a href="https://huggingface.co/efficientscaling/Z1-7B"><b>[πŸ€— HF Models]</b></a> β€’
<a href="https://github.com/efficientscaling/Z1"><b>[🐱 GitHub]</b></a>
<!-- <a href="https://9557c5365a6f44dc84.gradio.live"><b>[🐯 Gradio Demo]</b></a> -->
<br>
<!-- <a href="#-quick-start">Quick Start</a> β€’ -->
<!-- <a href="#%EF%B8%8F-citation">Citation</a> -->
</p>
## Model Details
To get started with the shifted thinking mode, please refer to the [Z1 GitHub repository](https://github.com/efficientscaling/Z1).
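
The card does not include a usage snippet, so here is a minimal sketch of loading Z1-7B through the standard `transformers` causal-LM API. The prompt text and the `max_new_tokens` value are illustrative choices, not the paper's shifted-thinking token budget; see the GitHub repository for the intended inference setup.

```python
# Minimal sketch: load Z1-7B and generate with the Hugging Face transformers API.
# Note: the checkpoint is ~15 GB, so run this on a machine with a suitable GPU.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "efficientscaling/Z1-7B"


def build_prompt(tokenizer, question: str) -> str:
    # Format a single user turn with the model's chat template.
    messages = [{"role": "user", "content": question}]
    return tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )


def generate(question: str, max_new_tokens: int = 512) -> str:
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    prompt = build_prompt(tokenizer, question)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )


# generate("Write a Python function that reverses a string.")
# Uncomment to run; this downloads the full model weights.
```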
## Evaluation
<p align="left">
<img src="tts.png" width="700">
<br>
<!-- <em>Test-time scaling comparison between Z1-7B and R1-Distill-Qwen-7B. </em> -->
</p>
## Citation
```bibtex
@misc{yu2025efficientscaling,
  title={Z1: Efficient Test-time Scaling with Code},
  author={Zhaojian Yu and Yinghao Wu and Yilun Zhao and Arman Cohan and Xiao-Ping Zhang},
  year={2025},
  eprint={2504.00810},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2504.00810},
}
```