README.md · beyoru/MinCoder-4B-Expert at main

---
base_model:
  - beyoru/EvolLLM
tags:
  - text-generation-inference
  - transformers
  - qwen3
  - code
  - tool
  - agent
  - evolution
  - merge
  - RL
  - grpo
license: apache-2.0
language:
  - en
---

This model is a Qwen model fine-tuned with a custom reinforcement learning (RL) framework. The framework rewards the model for producing solutions that pass automated test cases, similar to how programming tasks are evaluated on LeetCode.
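The reward idea above can be illustrated with a minimal sketch. It is not the framework used to train this model: the function name `reward_from_tests`, the `(args, expected)` test format, and the choice of "fraction of tests passed" as the score are all assumptions made for illustration. A real setup would also run untrusted code in a sandbox with time and memory limits rather than calling `exec` directly.

```python
from typing import Any, List, Tuple

# Hypothetical test format: (positional arguments, expected return value).
Case = Tuple[tuple, Any]


def reward_from_tests(source: str, entry_point: str, cases: List[Case]) -> float:
    """Return the fraction of test cases the candidate solution passes (0.0 to 1.0)."""
    namespace: dict = {}
    try:
        # WARNING: exec runs arbitrary code; this sketch has no sandboxing.
        exec(source, namespace)
        func = namespace[entry_point]
    except Exception:
        # Code that fails to compile or does not define the entry point earns nothing.
        return 0.0
    if not cases:
        return 0.0

    passed = 0
    for args, expected in cases:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            # A runtime error counts as a failed test.
            pass
    return passed / len(cases)


# Usage: score a generated solution to a LeetCode-style "two sum" task.
candidate = (
    "def two_sum(nums, target):\n"
    "    seen = {}\n"
    "    for i, n in enumerate(nums):\n"
    "        if target - n in seen:\n"
    "            return [seen[target - n], i]\n"
    "        seen[n] = i\n"
)
cases = [(([2, 7, 11, 15], 9), [0, 1]), (([3, 2, 4], 6), [1, 2])]
print(reward_from_tests(candidate, "two_sum", cases))
```

A fractional score gives partial credit for partly correct solutions; an all-or-nothing reward (1.0 only if every test passes) is an equally plausible design, and this README does not say which one was used.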

Instead of relying on labeled ground-truth answers, the model learns from test-case-based rewards. This approach is intended to promote generalization and reasoning ability in algorithmic problem-solving.
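To show how scalar rewards can drive learning without reference answers, here is a small sketch of group-relative advantage computation in the style of GRPO. The model's tags mention `grpo`, but the README does not describe the training details, so this is an assumption about the general technique rather than the author's implementation. The function name `group_relative_advantages`, the `eps` term, and the use of the population standard deviation are illustrative choices.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-6) -> List[float]:
    """Normalize the rewards of several completions sampled for one prompt.

    Each completion is compared against the other samples in its group,
    so no labeled ground-truth answer is needed, only a reward per sample.
    `rewards` must be non-empty.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    # eps avoids division by zero when every sample earns the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Usage: four sampled solutions whose rewards came from running test cases.
rewards = [1.0, 0.5, 0.0, 0.5]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Completions that pass more tests than the group average receive a positive advantage and are reinforced; those that pass fewer receive a negative one. When all samples score the same, every advantage is zero and the group contributes no learning signal.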