Update README.md
README.md
CHANGED
@@ -10,7 +10,7 @@ library_name: transformers
 
 # SimpleSD-4B-thinking
 
-This model
+This model is an example of the **Simple Self-Distillation (SimpleSD)** method that improves code generation by fine-tuning a language model on its own sampled outputs, without rewards, verifiers, teacher models, or reinforcement learning. Please see the paper below for more information. This uses Qwen for initialization.
 
 - **Self-distillation sampling:** temperature=1.1, top_p=0.95, top_k=20
 - **Evaluation sampling:** temperature=0.7, top_p=0.95, top_k=20
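
The two sampling configurations listed in the README differ only in temperature. As a rough illustration of what that difference does, here is a minimal pure-Python sketch of temperature scaling followed by top-k and top-p (nucleus) filtering. The function name `filtered_distribution` and the toy logits are hypothetical, and the filtering order (temperature, then top-k, then top-p) is an assumption; this is not the SimpleSD code or any library's exact implementation.

```python
import math


def filtered_distribution(logits, temperature, top_k, top_p):
    """Return token probabilities after temperature, top-k, then top-p filtering."""
    # Temperature scaling followed by a numerically stable softmax.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-k: keep the indices of the k most probable tokens.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    k_mass = sum(probs[i] for i in order)

    # Top-p: keep the smallest prefix whose renormalised cumulative mass reaches top_p.
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i] / k_mass
        if cumulative >= top_p:
            break

    # Renormalise over the surviving tokens; all other tokens get probability 0.
    kept_mass = sum(probs[i] for i in kept)
    out = [0.0] * len(probs)
    for i in kept:
        out[i] = probs[i] / kept_mass
    return out


# Settings copied from the README bullets.
SELF_DISTILLATION = {"temperature": 1.1, "top_p": 0.95, "top_k": 20}
EVALUATION = {"temperature": 0.7, "top_p": 0.95, "top_k": 20}

toy_logits = [2.0, 1.0, 0.5, 0.0, -1.0]  # hypothetical values, for illustration only
train_probs = filtered_distribution(toy_logits, **SELF_DISTILLATION)
eval_probs = filtered_distribution(toy_logits, **EVALUATION)
```

On the same logits, the lower evaluation temperature concentrates more probability on the top token, while the higher self-distillation temperature spreads it across more candidates.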