---
base_model:
- beyoru/EvolLLM
tags:
- text-generation-inference
- transformers
- qwen3
- code
- tool
- agent
- evolution
- merge
- RL
- grpo
license: apache-2.0
language:
- en
---
This model is a fine-tuned Qwen model trained with a custom reinforcement learning (RL) framework. The framework rewards the model for producing solutions that pass automated test cases, similar to how programming tasks are evaluated on LeetCode.
Instead of relying on labeled ground-truth answers, the model learns from test-case-based rewards, which is intended to promote generalization and reasoning ability in algorithmic problem-solving.
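
To illustrate the idea of a test-case-based reward, here is a minimal sketch. It is not the framework used to train this model: the function name `reward_from_tests`, the `solve` entry point, and the "fraction of tests passed" scoring are all assumptions made for illustration.

```python
from typing import Any, Sequence


def reward_from_tests(
    candidate_source: str,
    tests: Sequence[tuple[tuple, Any]],
    entry_point: str = "solve",  # hypothetical name of the function under test
) -> float:
    """Return the fraction of test cases the candidate solution passes."""
    namespace: dict = {}
    try:
        # Assumption: candidates are plain Python source. A real pipeline
        # should run untrusted code in a sandbox with time and memory limits.
        exec(candidate_source, namespace)
    except Exception:
        return 0.0  # code that does not even load earns no reward

    func = namespace.get(entry_point)
    if not callable(func) or not tests:
        return 0.0

    passed = 0
    for args, expected in tests:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a runtime error counts as a failed test
    return passed / len(tests)


if __name__ == "__main__":
    tests = [((1, 2), 3), ((0, 0), 0), ((5, 5), 10)]
    candidate = "def solve(a, b):\n    return a + b\n"
    print(reward_from_tests(candidate, tests))
```

No reference solution appears anywhere in this sketch: the reward depends only on whether the generated code behaves correctly on the tests, which is the property described above.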