---
base_model:
- beyoru/EvolLLM
tags:
- text-generation-inference
- transformers
- qwen3
- code
- tool
- agent
- evolution
- merge
- RL
- grpo
- rlvr
license: apache-2.0
language:
- en
library_name: transformers
---

This model is a Qwen model fine-tuned with a custom reinforcement learning (RL) framework that rewards it for producing solutions that pass automated test cases, similar to how programming tasks are evaluated on LeetCode.
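
The card does not describe the reward function itself, so the following is only a minimal sketch of what a test-case-based reward could look like, not the framework used to train this model. The function name `pass_rate_reward`, the `solve` entry point, and the `(args, expected)` test format are all assumptions made for illustration.

```python
from typing import Any, Sequence

# Hypothetical test-case format: (positional arguments, expected return value).
Case = tuple[tuple[Any, ...], Any]


def pass_rate_reward(code: str, tests: Sequence[Case], entry_point: str = "solve") -> float:
    """Return the fraction of test cases that the generated code passes."""
    if not tests:
        return 0.0
    namespace: dict[str, Any] = {}
    try:
        # NOTE: exec on model output is unsafe. A real setup would run this in a
        # sandbox with time and memory limits; that is omitted here for brevity.
        exec(code, namespace)
        func = namespace[entry_point]
    except Exception:
        # Code that fails to compile or lacks the entry point earns no reward.
        return 0.0
    passed = 0
    for args, expected in tests:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a runtime error counts as a failed test
    return passed / len(tests)


# Two illustrative completions for an "add two numbers" task.
good = "def solve(a, b):\n    return a + b\n"
buggy = "def solve(a, b):\n    return a * b\n"
tests = [((2, 2), 4), ((1, 2), 3)]

print(pass_rate_reward(good, tests))   # passes both tests
print(pass_rate_reward(buggy, tests))  # passes only the (2, 2) case
```

A reward of this shape needs no reference solution: only the tests decide the score, so any program that passes them is rewarded equally.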

<p align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/65905af887944e494e37e09a/s4drmYGEYWZyt2ZUkxIpI.png" width="300">
</p>

Instead of relying on labeled ground-truth answers, the model learns through test-case-based rewards, promoting generalization and reasoning ability in algorithmic problem-solving.

# Support me at:

<p align="center">
<a href="https://www.buymeacoffee.com/ductransa0g" target="_blank">
<img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" width="150px">
</a>
</p>