LoRA adapters (Qwen3-1.7B) for training RLMs via RL. SFT, STaR, DPO, GRPO-v4. Code: github.com/pythonomar22/rl4rlm
Omar Abul-Hassan
omar81939
AI & ML interests
None yet
Recent Activity
liked a model about 2 months ago
omar81939/rlm-qwen35-35b-a3b
upvoted a collection about 2 months ago
RL4RLM: Training Native Recursive Language Models
updated a model about 2 months ago
omar81939/rlm-qwen35-35b-a3b
Organizations
None yet