jli505 committed on
Commit c3610aa · verified · 1 Parent(s): 97abdef

Update README.md

Files changed (1)
  1. README.md +7 -1
README.md CHANGED
@@ -16,4 +16,10 @@ pipeline_tag: text-generation
16
  We introduce *HermesCoder-14B*, a code reasoning model post-trained on [Qwen3-14B](https://huggingface.co/Qwen/Qwen3-14B) via reinforcement learning with verifiable rewards (RLVR).
17
  On LiveCodeBench v5 (08/01/2024 - 05/01/2025), we achieve a Pass@1 accuracy of 67.87\%, up 7.08\% from the baseline Pass@1 accuracy of 60.79\%
18
  of Qwen3-14B. To the best of our knowledge, this is the highest-performing 14B model to date.
19
- We trained on 24k verifiable coding problems using 48 B200s over the course of four days.
19
+ We trained on 24k verifiable coding problems using 48 B200s over the course of four days.
20
+
21
+ ![test](./lcb_score_vs_step.png)
22
+ ![test](./performance_params_ratio.png)
23
+
24
+ # Acknowledgements
25
+ I would like to thank my mentor, Roger Jin, Dakota Mahan, Teknium, and others at the Nous Research team for their invaluable support throughout this project. I would also like to thank Together AI and Agentica for their immensely helpful blog posts on DeepCoder-14B. Finally, thank you to Modal and Lambda for their generous support by providing me with credits.