Update README.md
README.md CHANGED
@@ -45,8 +45,10 @@ This model was trained with GRPO, a method introduced in [DeepSeekMath: Pushing
 
 ---
 license: apache-2.0
-
+
+Datasets:
 - MasterControlAIML/JSON-Unstructured-Structured
+
 ---
 **DeepSeek R1 Strategy Replication on Qwen-2.5-1.5b on 8*H100 GPUS**
 