Improve model card: add library_name, paper link, and clean up structure

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +24 -95
README.md CHANGED
@@ -1,105 +1,34 @@
1
  ---
 
2
  language:
3
  - zh
4
  - en
 
5
  pipeline_tag: text-generation
 
6
  tags:
7
  - deepscaler
8
  - reasoning
9
  - grpo
10
  - qwen2
11
- base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
12
- license: other
13
  ---
14
 
15
  # DECS_7B
16
 
17
- This is the official model for ICLR 2026 Oral "Overthinking Reduction with Decoupled Rewards and Curriculum Data Scheduling".
18
- DECS_7B is a reasoning-focused causal language model built from `deepseek-ai/DeepSeek-R1-Distill-Qwen-7B` and further trained with DECS algorithm, focused on 50% fewer tokens when answering a reasoning-required problem.
19
-
20
- ## Model Summary
21
-
22
- - Base model: `deepseek-ai/DeepSeek-R1-Distill-Qwen-7B`
23
- - Upload date: `2026-02-24`
24
- - Recommended use: long-form reasoning and mathematical/problem-solving style generation
25
-
26
- ## Quick Start (Transformers)
27
-
28
- ```python
29
- import torch
30
- from transformers import AutoModelForCausalLM, AutoTokenizer
31
-
32
- model_id = "pixas/DECS_7B"
33
- tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
34
- model = AutoModelForCausalLM.from_pretrained(
35
- model_id,
36
- torch_dtype=torch.bfloat16,
37
- device_map="auto",
38
- )
39
-
40
- messages = [
41
- {"role": "user", "content": "Solve: If x^2 - 5x + 6 = 0, what are x values?"}
42
- ]
43
- prompt = tokenizer.apply_chat_template(
44
- messages, tokenize=False, add_generation_prompt=True
45
- )
46
- inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
47
-
48
- with torch.no_grad():
49
- outputs = model.generate(
50
- **inputs,
51
- max_new_tokens=512,
52
- temperature=0.6,
53
- top_p=0.95,
54
- )
55
-
56
- new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
57
- print(tokenizer.decode(new_tokens, skip_special_tokens=True))
58
- ```
59
 
60
- ## Quick Start (vLLM)
61
 
62
- ```python
63
- from vllm import LLM, SamplingParams
64
 
65
- llm = LLM(model="pixas/DECS_7B", trust_remote_code=True)
66
- sampling = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=512)
67
- prompt = "Please reason step by step: what is 37 * 48?"
68
- outputs = llm.generate([prompt], sampling_params=sampling)
69
- print(outputs[0].outputs[0].text)
70
- ```
71
-
72
- ## Notes
73
-
74
- - This model may produce incorrect or unverifiable reasoning. Always validate outputs in high-stakes settings.
75
- - Performance can vary by prompt style and decoding parameters.
76
- - License and acceptable-use constraints should follow the upstream base model and your deployment policy.
77
-
78
-
79
- ## Citation
80
- ---
81
- language:
82
- - zh
83
- - en
84
- pipeline_tag: text-generation
85
- tags:
86
- - deepscaler
87
- - reasoning
88
- - grpo
89
- - qwen2
90
- base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
91
- license: other
92
- ---
93
-
94
- # DECS_1.5B
95
- This is the official model for ICLR 2026 Oral "Overthinking Reduction with Decoupled Rewards and Curriculum Data Scheduling".
96
- DECS_1.5B is a reasoning-focused causal language model built from `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B` and further trained with DECS algorithm, focused on 50% fewer tokens when answering a reasoning-required problem.
97
 
98
  ## Model Summary
99
 
100
- - Base model: `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B`
101
- - Upload date: `2026-02-24`
102
- - Recommended use: long-form reasoning and mathematical/problem-solving style generation
103
 
104
  ## Quick Start (Transformers)
105
 
@@ -107,7 +36,7 @@ DECS_1.5B is a reasoning-focused causal language model built from `deepseek-ai/D
107
  import torch
108
  from transformers import AutoModelForCausalLM, AutoTokenizer
109
 
110
- model_id = "pixas/DECS_1.5B"
111
  tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
112
  model = AutoModelForCausalLM.from_pretrained(
113
  model_id,
@@ -140,7 +69,7 @@ print(tokenizer.decode(new_tokens, skip_special_tokens=True))
140
  ```python
141
  from vllm import LLM, SamplingParams
142
 
143
- llm = LLM(model="pixas/DECS_1.5B", trust_remote_code=True)
144
  sampling = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=512)
145
  prompt = "Please reason step by step: what is 37 * 48?"
146
  outputs = llm.generate([prompt], sampling_params=sampling)
@@ -149,20 +78,20 @@ print(outputs[0].outputs[0].text)
149
 
150
  ## Notes
151
 
152
- - This model may produce incorrect or unverifiable reasoning. Always validate outputs in high-stakes settings.
153
- - Performance can vary by prompt style and decoding parameters.
154
- - License and acceptable-use constraints should follow the upstream base model and your deployment policy.
155
-
156
 
157
  ## Citation
158
 
159
  If you use this model, please cite our paper:
 
160
  ```bibtex
161
- @inproceedings{jiang2026overthinking,
162
- title={Overthinking Reduction with Decoupled Rewards and Curriculum Data Scheduling},
163
- author={Shuyang Jiang and Yusheng Liao and Ya Zhang and Yanfeng Wang and Yu Wang},
164
- booktitle={The Fourteenth International Conference on Learning Representations},
165
- year={2026},
166
- url={https://openreview.net/forum?id=kdeiRledV6}
 
167
  }
168
- ```
 
1
  ---
2
+ base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
3
  language:
4
  - zh
5
  - en
6
+ license: other
7
  pipeline_tag: text-generation
8
+ library_name: transformers
9
  tags:
10
  - deepscaler
11
  - reasoning
12
  - grpo
13
  - qwen2
 
 
14
  ---
15
 
16
  # DECS_7B
17
 
18
+ This is the official model repository for **DECS_7B**, presented in the ICLR 2026 Oral paper: **"Overthinking Reduction with Decoupled Rewards and Curriculum Data Scheduling"**.
19
 
20
+ [**Paper**](https://huggingface.co/papers/2509.25827) | [**Code**](https://github.com/pixas/DECS) | [**Project Page**](https://pixas.github.io/decs-iclr26-site/)
21
 
22
+ ## Model Description
23
+ DECS_7B is a reasoning-focused causal language model built from `deepseek-ai/DeepSeek-R1-Distill-Qwen-7B` and further trained with the **DECS** (Decoupled Rewards and Curriculum Scheduling) algorithm.
24
 
25
+ The DECS framework addresses the "overthinking" problem in large reasoning models, where models generate excessively long reasoning paths with no corresponding performance benefit. It introduces a decoupled token-level reward mechanism and a curriculum batch scheduling strategy to optimize the efficiency-efficacy equilibrium, reducing reasoning tokens by over 50% across multiple benchmarks while maintaining or improving accuracy.
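
As a rough illustration of the decoupled-reward idea described above (a toy sketch, not the DECS implementation; the function name, token budget, and penalty weighting are all assumptions for demonstration), the correctness signal and the length signal can be kept as two separate reward components rather than one blended scalar:

```python
def decoupled_reward(is_correct: bool, num_reasoning_tokens: int,
                     target_tokens: int = 512, length_weight: float = 0.5):
    """Toy sketch of a decoupled reward: return the correctness term
    and the length term separately instead of one entangled scalar."""
    correctness_r = 1.0 if is_correct else 0.0
    # Penalize only tokens beyond the budget, with the penalty scaled
    # relative to the budget and capped at length_weight.
    overshoot = max(0, num_reasoning_tokens - target_tokens)
    length_r = -length_weight * min(1.0, overshoot / target_tokens)
    return correctness_r, length_r

# A correct but verbose answer keeps full correctness credit while
# receiving a separate, bounded length penalty.
c, l = decoupled_reward(True, 1024, target_tokens=512)
print(c, l)  # 1.0 -0.5
```

Keeping the two terms separate means a trainer can shorten outputs without the length penalty eroding the correctness signal, which is the intuition behind avoiding a single mixed reward.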
26
 
27
  ## Model Summary
28
 
29
+ - **Base model:** `deepseek-ai/DeepSeek-R1-Distill-Qwen-7B`
30
+ - **Upload date:** `2026-02-24`
31
+ - **Recommended use:** Long-form reasoning, mathematical problem solving, and efficient step-by-step logic generation.
32
 
33
  ## Quick Start (Transformers)
34
 
 
36
  import torch
37
  from transformers import AutoModelForCausalLM, AutoTokenizer
38
 
39
+ model_id = "pixas/DECS_7B"
40
  tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
41
  model = AutoModelForCausalLM.from_pretrained(
42
  model_id,
 
69
  ```python
70
  from vllm import LLM, SamplingParams
71
 
72
+ llm = LLM(model="pixas/DECS_7B", trust_remote_code=True)
73
  sampling = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=512)
74
  prompt = "Please reason step by step: what is 37 * 48?"
75
  outputs = llm.generate([prompt], sampling_params=sampling)
 
78
 
79
  ## Notes
80
 
81
+ - **Reasoning Accuracy:** While optimized for efficiency, this model may produce incorrect or unverifiable reasoning. Always validate outputs in high-stakes settings.
82
+ - **Licensing:** License and acceptable-use constraints follow the upstream base model and your deployment policy.
 
 
83
 
84
  ## Citation
85
 
86
  If you use this model, please cite our paper:
87
+
88
  ```bibtex
89
+ @inproceedings{jiang2026decs,
90
+ title = {Overthinking Reduction with Decoupled Rewards and Curriculum Data Scheduling},
91
+ author = {Jiang, Shuyang and Liao, Yusheng and Zhang, Ya and Wang, Yanfeng and Wang, Yu},
92
+ booktitle = {International Conference on Learning Representations (ICLR)},
93
+ year = {2026},
94
+ note = {Oral},
95
+ url = {https://arxiv.org/abs/2509.25827}
96
  }
97
+ ```