We introduce *HermesCoder-14B*, a code reasoning model post-trained on [Qwen3-14B](https://huggingface.co/Qwen/Qwen3-14B) via reinforcement learning with verifiable rewards (RLVR).
On LiveCodeBench v5 (08/01/2024 to 05/01/2025), we achieve a Pass@1 accuracy of 67.87%, up 7.08 points from Qwen3-14B's baseline Pass@1 of 60.79%. To the best of our knowledge, this makes HermesCoder-14B the highest-performing 14B model on this benchmark to date.
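Pass@1 here refers to the standard unbiased pass@k estimator popularized by the HumanEval evaluation; a minimal sketch, assuming `n` samples per problem of which `c` pass (the exact sampling setup used for this evaluation is not specified here):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: probability that at least one of k samples,
    drawn without replacement from n generations (c of them correct),
    passes the tests."""
    if n - c < k:
        return 1.0  # fewer than k failures exist, so a pass is guaranteed
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 4 generations per problem, 2 correct -> pass@1 of 0.5
print(pass_at_k(4, 2, 1))
```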
We trained on 24k verifiable coding problems using 48 NVIDIA B200 GPUs over four days.
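In RLVR, each training problem carries machine-checkable tests and the policy receives a reward only when its solution verifiably passes them. A minimal sketch of such a binary reward, assuming the candidate is a Python callable and tests are input/output pairs (the actual sandboxed harness and reward shaping used for HermesCoder-14B may differ):

```python
def verifiable_reward(candidate, test_cases):
    """Binary RLVR-style reward: 1.0 iff the candidate passes every
    test case, else 0.0. Illustrative sketch only.
    candidate: a callable; test_cases: list of (args_tuple, expected)."""
    for args, expected in test_cases:
        try:
            if candidate(*args) != expected:
                return 0.0
        except Exception:
            return 0.0  # crashes count as failures
    return 1.0

# A correct candidate earns reward 1.0:
print(verifiable_reward(lambda a, b: a + b, [((1, 2), 3), ((0, 5), 5)]))
```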
# Acknowledgements
I would like to thank my mentor, Roger Jin, as well as Dakota Mahan, Teknium, and others on the Nous Research team for their invaluable support throughout this project. I would also like to thank Together AI and Agentica for their immensely helpful blog posts on DeepCoder-14B. Finally, thank you to Modal and Lambda for their generous support in providing compute credits.