pixas committed
Commit beed8de · verified · 1 Parent(s): a9a1882

Update README.md

Files changed (1): README.md (+4 -79)
README.md CHANGED
@@ -5,11 +5,11 @@ language:
 pipeline_tag: text-generation
 tags:
 - deepscaler
-- reasoning
 - grpo
 - qwen2
 base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
 license: other
+library_name: transformers
 ---
 
 # DECS_7B
@@ -22,6 +22,8 @@ DECS_7B is a reasoning-focused causal language model built from `deepseek-ai/Dee
 - Base model: `deepseek-ai/DeepSeek-R1-Distill-Qwen-7B`
 - Upload date: `2026-02-24`
 - Recommended use: long-form reasoning and mathematical/problem-solving style generation
+- Paper link: https://arxiv.org/pdf/2509.25827
+- Project page: https://pixas.github.io/decs-iclr26-site/
 
 ## Quick Start (Transformers)
 
@@ -76,83 +78,6 @@ print(outputs[0].outputs[0].text)
 - License and acceptable-use constraints should follow the upstream base model and your deployment policy.
 
 
-## Citation
----
-language:
-- zh
-- en
-pipeline_tag: text-generation
-tags:
-- deepscaler
-- reasoning
-- grpo
-- qwen2
-base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
-license: other
----
-
-# DECS_1.5B
-This is the official model for ICLR 2026 Oral "Overthinking Reduction with Decoupled Rewards and Curriculum Data Scheduling".
-DECS_1.5B is a reasoning-focused causal language model built from `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B` and further trained with DECS algorithm, focused on 50% fewer tokens when answering a reasoning-required problem.
-
-## Model Summary
-
-- Base model: `deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B`
-- Upload date: `2026-02-24`
-- Recommended use: long-form reasoning and mathematical/problem-solving style generation
-
-## Quick Start (Transformers)
-
-```python
-import torch
-from transformers import AutoModelForCausalLM, AutoTokenizer
-
-model_id = "pixas/DECS_1.5B"
-tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
-model = AutoModelForCausalLM.from_pretrained(
-    model_id,
-    torch_dtype=torch.bfloat16,
-    device_map="auto",
-)
-
-messages = [
-    {"role": "user", "content": "Solve: If x^2 - 5x + 6 = 0, what are x values?"}
-]
-prompt = tokenizer.apply_chat_template(
-    messages, tokenize=False, add_generation_prompt=True
-)
-inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
-
-with torch.no_grad():
-    outputs = model.generate(
-        **inputs,
-        max_new_tokens=512,
-        temperature=0.6,
-        top_p=0.95,
-    )
-
-new_tokens = outputs[0][inputs["input_ids"].shape[-1]:]
-print(tokenizer.decode(new_tokens, skip_special_tokens=True))
-```
-
-## Quick Start (vLLM)
-
-```python
-from vllm import LLM, SamplingParams
-
-llm = LLM(model="pixas/DECS_1.5B", trust_remote_code=True)
-sampling = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=512)
-prompt = "Please reason step by step: what is 37 * 48?"
-outputs = llm.generate([prompt], sampling_params=sampling)
-print(outputs[0].outputs[0].text)
-```
-
-## Notes
-
-- This model may produce incorrect or unverifiable reasoning. Always validate outputs in high-stakes settings.
-- Performance can vary by prompt style and decoding parameters.
-- License and acceptable-use constraints should follow the upstream base model and your deployment policy.
-
 
 ## Citation
 
@@ -165,4 +90,4 @@ booktitle={The Fourteenth International Conference on Learning Representations},
 year={2026},
 url={https://openreview.net/forum?id=kdeiRledV6}
 }
-```
+```