NousResearch/NousCoder-14B

Text Generation

jli505 committed 19 days ago

Commit c3610aa · verified · 1 Parent(s): 97abdef

Update README.md

Files changed (1)

README.md +7 -1

@@ -16,4 +16,10 @@ pipeline_tag: text-generation
 We introduce *HermesCoder-14B*, a code reasoning model post-trained on [Qwen3-14B](https://huggingface.co/Qwen/Qwen3-14B) via reinforcement learning with verifiable rewards (RLVR).
 On LiveCodeBench v5 (08/01/2024 - 05/01/2025), we achieve a Pass@1 accuracy of 67.87\%, up 7.08\% from the baseline Pass@1 accuracy of 60.79\%
 of Qwen3-14B. To the best of our knowledge, this is the highest-performing 14B model to date.
-We trained on 24k verifiable coding problems using 48 B200s over the course of four days.

+We trained on 24k verifiable coding problems using 48 B200s over the course of four days.
+![test](./lcb_score_vs_step.png)
+![test](./performance_params_ratio.png)
+# Acknowledgements
+I would like to thank my mentor, Roger Jin, Dakota Mahan, Teknium, and others at the Nous Research team for their invaluable support throughout this project. I would also like to thank Together AI and Agentica for their immensely helpful blog posts on DeepCoder-14B. Finally, thank you to Modal and Lambda for their generous support by providing me with credits.