omar81939's Collections

RL4RLM: Training Native Recursive Language Models

LoRA adapters (base model: Qwen3-1.7B) for training recursive language models (RLMs) via reinforcement learning (RL). Methods covered: SFT, STaR, DPO, and GRPO-v4. Code: github.com/pythonomar22/rl4rlm