---
language:
- zh
- en
pipeline_tag: text-generation
tags:
- deepscaler
- grpo
- qwen2
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
license: other
library_name: transformers
---

# DECS_7B

This is the official model for the ICLR 2026 Oral paper "Overthinking Reduction with Decoupled Rewards and Curriculum Data Scheduling".
DECS_7B is a reasoning-focused causal language model built from `deepseek-ai/DeepSeek-R1-Distill-Qwen-7B` and further trained with the DECS algorithm; it uses roughly 50% fewer tokens when answering reasoning-intensive problems.
|
|
## Model Summary
|
|
- Base model: `deepseek-ai/DeepSeek-R1-Distill-Qwen-7B`
- Upload date: `2026-02-24`
- Recommended use: long-form reasoning and mathematical/problem-solving generation
- Paper: https://arxiv.org/pdf/2509.25827
- Project page: https://pixas.github.io/decs-iclr26-site/
- GitHub repo: https://github.com/pixas/DECS
|
|
## Quick Start (Transformers)
|
|
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pixas/DECS_7B"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Solve: if x^2 - 5x + 6 = 0, what are the values of x?"}
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=True,  # required for temperature/top_p to take effect
        temperature=0.6,
        top_p=0.95,
    )

new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```
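Like its DeepSeek-R1-Distill base model, DECS_7B is expected to wrap its chain of thought in `<think>...</think>` tags (an assumption carried over from the base model; verify on your own outputs). A small helper to separate the reasoning trace from the final answer:

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Split a generation into (reasoning, final_answer).

    Assumes the DeepSeek-R1-style convention of wrapping the chain of
    thought in <think>...</think>; if no tags are found, the whole
    text is treated as the answer.
    """
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

sample = "<think>x^2 - 5x + 6 = (x-2)(x-3)</think>\nThe solutions are x = 2 and x = 3."
reasoning, answer = split_reasoning(sample)
print(answer)  # The solutions are x = 2 and x = 3.
```

This is also a convenient place to measure token savings: tokenize `reasoning` and `answer` separately to compare trace lengths across models.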
|
|
## Quick Start (vLLM)
|
|
```python
from vllm import LLM, SamplingParams

llm = LLM(model="pixas/DECS_7B", trust_remote_code=True)
sampling = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=512)
prompt = "Please reason step by step: what is 37 * 48?"
outputs = llm.generate([prompt], sampling_params=sampling)
print(outputs[0].outputs[0].text)
```
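For deployment, vLLM's OpenAI-compatible server can also host the model. A minimal sketch (the port, context length, and request values below are assumptions to adjust for your hardware and task):

```shell
# Launch an OpenAI-compatible server (assumes vLLM is installed)
vllm serve pixas/DECS_7B --trust-remote-code --max-model-len 8192

# Query it with the chat completions API
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "pixas/DECS_7B",
    "messages": [{"role": "user", "content": "What is 37 * 48?"}],
    "temperature": 0.6,
    "top_p": 0.95,
    "max_tokens": 512
  }'
```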
|
|
## Notes
|
|
- This model may produce incorrect or unverifiable reasoning. Always validate outputs in high-stakes settings.
- Performance can vary with prompt style and decoding parameters.
- License and acceptable-use constraints follow the upstream base model and your deployment policy.

## Citation
|
|
If you use this model, please cite our paper:
```bibtex
@inproceedings{jiang2026overthinking,
  title={Overthinking Reduction with Decoupled Rewards and Curriculum Data Scheduling},
  author={Shuyang Jiang and Yusheng Liao and Ya Zhang and Yanfeng Wang and Yu Wang},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=kdeiRledV6}
}
```