GiuLeo01/FortranCodeGen-3B-SynthData

GiuLeo01 committed on May 19

Commit e413293 · verified · 1 Parent(s): eea4487

Update README.md

Files changed (1)

README.md +2 -2

@@ -159,10 +159,10 @@ If you use this model or parts of this work, please consider citing the referenc
 ## References
 * Qwen/Qwen2-5-Coder-3B-Instruct
-  [https://huggingface.co/Qwen/Qwen2-5-Coder-3B-Instruct](https://huggingface.co/Qwen/Qwen2-5-Coder-3B-Instruct)
+  [https://huggingface.co/Qwen/Qwen2.5-Coder-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-3B-Instruct)
 * Group Relative Policy Optimization (GRPO):
-  [https://arxiv.org/abs/2205.13636](https://arxiv.org/abs/2205.13636)
+  [https://arxiv.org/abs/2402.03300](https://arxiv.org/abs/2402.03300)
 * Unsloth – Fast and memory-efficient fine-tuning via QLoRA
   [https://github.com/unslothai/unsloth](https://github.com/unslothai/unsloth)