LoRA adapters for Qwen3-1.7B, used to train Recursive Language Models (RLMs) via reinforcement learning. Methods: SFT, STaR, DPO, GRPO-v4. Code: github.com/pythonomar22/rl4rlm
Omar Abul-Hassan
omar81939
Recent Activity
Updated a collection 10 days ago
RL4RLM: Training Native Recursive Language Models