efficientscaling
/

Z1-7B

Text Generation

text-generation-inference

Model card Files Files and versions

Z1-7B / README.md

nielsr's picture

nielsr HF Staff

Add pipeline tag

c4e6ba5 verified 11 months ago

|

1.42 kB

	---
	base_model:
	- Qwen/Qwen2.5-Coder-7B-Instruct
	library_name: transformers
	license: mit
	metrics:
	- accuracy
	pipeline_tag: text-generation
	---

	<div align="center">
	<h1 align="center">
	Z1: Efficient Test-time Scaling with Code
	</h1>
	<p>Train Large Language Model to Reason with Shifted Thinking
	</p>
	</div>
	<p align="center">
	<a href="https://arxiv.org/abs/2504.00810"><b>[📜 Paper]</b></a> •
	<a href="https://huggingface.co/efficientscaling/Z1-7B"><b>[🤗 HF Models]</b></a> •
	<a href="https://github.com/efficientscaling/Z1"><b>[🐱 GitHub]</b></a>
	<!-- <a href="https://9557c5365a6f44dc84.gradio.live"><b>[🐯 Gradio Demo]</b></a> -->
	<br>

	<!-- <a href="#-quick-start">Quick Start</a> • -->
	<!-- <a href="#%EF%B8%8F-citation">Citation</a> -->
	</p>

	## Model Details
	To begin with the shifted thinking mode, please refer to https://github.com/efficientscaling/Z1.


	## Evaluation

	<p align="left">
	<img src="tts.png" width="700">
	<br>
	<!-- <em>Test-time scaling comparison between Z1-7B and R1-Distill-Qwen-7B. </em> -->
	</p>

	## Citation
	```
	@misc{yu2025efficientscaling,
	title={Z1: Efficient Test-time Scaling with Code},
	author={Zhaojian Yu and Yinghao Wu and Yilun Zhao and Arman Cohan and Xiao-Ping Zhang},
	year={2025},
	eprint={2504.00810},
	archivePrefix={arXiv},
	primaryClass={cs.CL},
	url={https://arxiv.org/abs/2504.00810},
	}
	```