MaxCoder-4B / README.md
---
base_model:
- beyoru/EvolLLM
tags:
- text-generation-inference
- transformers
- qwen3
- code
- tool
- agent
- evolution
- merge
- RL
- grpo
- rlvr
license: apache-2.0
language:
- en
library_name: transformers
---
This model is a Qwen model fine-tuned with a custom reinforcement learning (RL) framework. The framework rewards the model for producing solutions that pass automated test cases, similar to how programming submissions are judged on LeetCode.
<p align="center">
<img src="https://cdn-uploads.huggingface.co/production/uploads/65905af887944e494e37e09a/s4drmYGEYWZyt2ZUkxIpI.png" width="300">
</p>
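
The training framework itself is not published in this README, so the following is only a minimal sketch of what a test-case-based reward could look like. The function name `pass_rate_reward`, the `(args, expected)` test format, and the choice of pass rate as the reward value are all assumptions made for illustration, not the author's actual implementation.

```python
from typing import Any, Callable

# Hypothetical test format: (positional arguments, expected return value).
Case = tuple[tuple[Any, ...], Any]


def pass_rate_reward(code: str, entry_point: str, cases: list[Case]) -> float:
    """Return the fraction of test cases that the candidate code passes.

    Assumption: the reward is a pass rate in [0, 1]. A real framework
    might instead use an all-or-nothing signal or add other terms.
    """
    namespace: dict[str, Any] = {}
    try:
        # WARNING: exec runs untrusted code in-process. A real setup would
        # use a sandboxed subprocess with time and memory limits.
        exec(code, namespace)
        func: Callable[..., Any] = namespace[entry_point]
    except Exception:
        return 0.0  # code that does not define the entry point earns nothing

    if not cases:
        return 0.0

    passed = 0
    for args, expected in cases:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a runtime error counts as a failed test
    return passed / len(cases)


# Two hypothetical model completions for the same task.
correct = "def add(a, b):\n    return a + b\n"
buggy = "def add(a, b):\n    return a - b\n"
cases: list[Case] = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]

print(pass_rate_reward(correct, "add", cases))
print(pass_rate_reward(buggy, "add", cases))
```

Note that no reference solution appears anywhere in the sketch: the reward depends only on whether the generated code behaves correctly on the tests, which is the property the model card describes.
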
Instead of relying on labeled ground-truth answers, the model learns from test-case-based rewards, which promotes generalization and reasoning ability in algorithmic problem-solving.

<p align="center">
# Support me at:
<a href="https://www.buymeacoffee.com/ductransa0g" target="_blank">
<img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" width="150px">
</a>
</p>