# Model Overview

### Description:

DLER-Qwen-R1-1.5B is an ultra-efficient 1.5B-parameter open-weight reasoning model designed for challenging tasks such as mathematics, programming, and scientific problem-solving. It is trained with the DLER algorithm on the [agentica-org/DeepScaleR-Preview-Dataset](https://huggingface.co/datasets/agentica-org/DeepScaleR-Preview-Dataset). Compared to DeepSeek's 1.5B model, DLER-Qwen-R1-1.5B reduces the average response length by nearly 80% across diverse mathematical benchmarks while improving accuracy.

This model is for research and development only.

### Deployment Geography:

Global <br>

### Use Case:

Researchers and developers can use this model to solve math, coding, and STEM questions.

### Release Date:

Hugging Face 9/10/2025 via https://huggingface.co/nvidia/DLER-R1-1.5B <br>

## Model Architecture:

**Architecture Type:** Dense decoder-only Transformer model <br>

**Network Architecture:** [DeepSeek-R1-Distill-Qwen-1.5B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B) <br>

This model was developed based on DeepSeek-R1-Distill-Qwen-1.5B. <br>

## Software Integration:

**Runtime Engine(s):** Transformers <br>

**Supported Hardware Microarchitecture Compatibility:** <br>
* NVIDIA Ampere <br>
* NVIDIA Hopper <br>

**Preferred/Supported Operating System(s):**
* Linux <br>

The integration of foundation and fine-tuned models into AI systems requires additional testing using use-case-specific data to ensure safe and effective deployment. Following the V-model methodology, iterative testing and validation at both unit and system levels are essential to mitigate risks, meet technical and functional requirements, and ensure compliance with safety and ethical standards before deployment.
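For example, a unit-level check in the sense above might assert that every completion ends with a `\boxed{}` final answer before the model is wired into a larger system. A minimal sketch; the `fake_generate` stub stands in for real model output and is purely illustrative:

```python
def has_boxed_answer(text: str) -> bool:
    # Unit-level acceptance check: a completion must contain a \boxed{...} final answer.
    return "\\boxed{" in text


# Stub standing in for model.generate during unit testing (illustrative only).
def fake_generate(prompt: str) -> str:
    return "r = 3 and theta = pi/2, so the answer is \\boxed{(3, \\pi/2)}."


assert has_boxed_answer(fake_generate("Convert (0,3) to polar coordinates."))
assert not has_boxed_answer("The model stopped before giving a final answer.")
```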

## Model Version(s):

1.0

### Training Dataset:

| Dataset | Link |
|---------|------|
| DeepScaleR-Preview-Dataset | [Link](https://huggingface.co/datasets/agentica-org/DeepScaleR-Preview-Dataset) |

**Properties:** 479K question-and-answer pairs <br>

### Evaluation Results:

**Benchmark Scores** <br>

| Model | MATH | # Tokens | AIME | # Tokens | AMC | # Tokens | Minerva | # Tokens | Olympiad | # Tokens | Avg # Tokens |
|-------|------|----------|------|----------|-----|----------|---------|----------|----------|----------|--------------|
| DeepSeek-R1-1.5B | 84.31 | 5500 | 29.79 | 16916 | 61.97 | 10967 | 38.41 | 7494 | 44.07 | 11620 | 10499 |
| **DLER-R1-1.5B** | **86.95 (+2.64%)** | **1652 (-70%)** | **34.38 (+4.59%)** | **3551 (-80%)** | **70.48 (+8.51%)** | **2537 (-77%)** | **43.58 (+5.18%)** | **2029 (-73%)** | **48.31 (+4.24%)** | **2563 (-78%)** | **2466 (-77%)** |
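As a sanity check, the length-reduction percentages follow directly from the token counts in the table above (numbers copied from the table; small differences from the printed figures are rounding):

```python
# Average response length in tokens, taken from the evaluation table above.
baseline = {"MATH": 5500, "AIME": 16916, "AMC": 10967,
            "Minerva": 7494, "Olympiad": 11620, "Average": 10499}
dler = {"MATH": 1652, "AIME": 3551, "AMC": 2537,
        "Minerva": 2029, "Olympiad": 2563, "Average": 2466}

for bench, base_len in baseline.items():
    reduction = 100 * (1 - dler[bench] / base_len)
    print(f"{bench}: {reduction:.1f}% shorter")
```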

# Inference:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Use a GPU if one is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = AutoModelForCausalLM.from_pretrained("nvidia/DLER-R1-1.5B").to(device)
tokenizer = AutoTokenizer.from_pretrained("nvidia/DLER-R1-1.5B")

messages = [
    {
        "role": "user",
        "content": (
            "Convert the point $(0,3)$ in rectangular coordinates to polar coordinates. "
            "Enter your answer in the form $(r,\\theta),$ where $r > 0$ and $0 \\le \\theta < 2 \\pi."
            " Let's think step by step and output the final answer within \\boxed{}."
        ),
    },
]

# Apply the chat template and move the input ids to the model's device.
tokenized_chat = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    tokenized_chat,
    max_new_tokens=10000,
    eos_token_id=tokenizer.eos_token_id,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### License/Terms of Use
TBD

## Ethical Considerations:
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).

## Citation
If you find our model helpful, please cite the following [paper]():

```

```