Update README.md
README.md CHANGED
@@ -159,10 +159,10 @@ If you use this model or parts of this work, please consider citing the referenc
 ## References

 * Qwen/Qwen2-5-Coder-3B-Instruct
-[https://huggingface.co/Qwen/Qwen2
+[https://huggingface.co/Qwen/Qwen2.5-Coder-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-3B-Instruct)

 * Group Relative Policy Optimization (GRPO):
-[https://arxiv.org/abs/
+[https://arxiv.org/abs/2402.03300](https://arxiv.org/abs/2402.03300)

 * Unsloth – Fast and memory-efficient fine-tuning via QLoRA
 [https://github.com/unslothai/unsloth](https://github.com/unslothai/unsloth)