## Introduction

E1-Math-1.5B is a language model fine-tuned from DeepSeek-R1-Distilled-Qwen-1.5B. It is trained for Elastic Reasoning via a budget-constrained rollout strategy integrated into GRPO, which teaches the model to reason adaptively when its thinking process is cut short and to generalize effectively to unseen budget constraints without additional training.
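The budget-constrained behavior described above can be sketched as a small helper. This is a hypothetical illustration (the function name and token-list representation are assumptions, not part of the released training code): it models how a thinking trace is cut at the token budget and closed with the end-of-thinking marker, so decoding moves on to the solution phase.

```python
def apply_thinking_budget(tokens, budget, end_marker="</think>"):
    """Enforce a thinking-token budget on a generated trace.

    `tokens` stands in for the decoded thinking segment (here a list of
    strings for illustration). `end_marker` is assumed to be the model's
    end-of-thinking delimiter, as in DeepSeek-R1-style chat templates.
    """
    # If the model closed its thinking within the budget, keep the trace
    # as generated, up to and including the marker.
    if end_marker in tokens[: budget + 1]:
        return tokens[: tokens.index(end_marker) + 1]
    # Otherwise, truncate at the budget and force the marker so the model
    # must produce a solution from the partial reasoning.
    return tokens[:budget] + [end_marker]


# Thinking exceeds the budget: cut short and closed.
print(apply_thinking_budget(["step1", "step2", "step3", "step4"], budget=2))
# Thinking finished early: kept intact.
print(apply_thinking_budget(["step1", "</think>", "answer"], budget=5))
```

At inference time the same idea applies: generation is stopped (or truncated) once the thinking budget is reached, the end-of-thinking marker is appended, and the model continues decoding the final solution.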
## Performance (Avg@16)

| Model | Tokens | Acc (%) | Tokens | Acc (%) | Tokens | Acc (%) | Tokens | Acc (%) | Tokens | Acc (%) |
|-------|--------|---------|--------|---------|--------|---------|--------|---------|--------|---------|