---
title: RLM Training - Needle in Haystack
sdk: docker
hardware: t4-small
---

# RLM Training - Recursive Language Model Skills

Training Qwen3-0.6B-Base to find needles in haystacks using GRPO.
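
For readers new to GRPO, here is a minimal sketch of its core idea: several completions are sampled for the same prompt, and each one is scored relative to the rest of its group. The function name `group_relative_advantages` and the `eps` value are illustrative assumptions, not taken from this Space's training code, and implementations differ on details such as population versus sample standard deviation.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same prompt.

    Hypothetical helper for illustration; `eps` guards against division by
    zero when every completion in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation over the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one prompt; only the second one found the needle.
advantages = group_relative_advantages([0.0, 1.0, 0.0, 0.0])
```

The successful completion gets a positive advantage and the others get negative ones, so the policy update favors the answer that found the needle. When all rewards in a group are equal, every advantage is zero and that group contributes no learning signal.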

## Task

- Long context with hidden facts
- Model learns to extract specific information
- Quick test run of 20 training steps
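
The task above can be sketched as a small data generator plus a reward function. This is only an illustration under stated assumptions: the filler sentence, the "secret code" needle format, the prompt wording, and the names `make_sample` and `reward` are all hypothetical and not taken from this Space's code.

```python
import random

# Hypothetical filler text; a real haystack would use varied distractor content.
FILLER = "The sky was clear and the market was quiet that day."


def make_sample(rng, n_filler=200):
    """Build one (prompt, answer) pair with a single fact hidden in filler text."""
    answer = str(rng.randint(1000, 9999))
    needle = f"The secret code is {answer}."
    sentences = [FILLER] * n_filler
    # Insert the needle at a random position, including the very start or end.
    sentences.insert(rng.randrange(n_filler + 1), needle)
    prompt = " ".join(sentences) + "\n\nQuestion: What is the secret code?\nAnswer:"
    return prompt, answer


def reward(completion, answer):
    """Return 1.0 if the completion contains the hidden answer, else 0.0."""
    return 1.0 if answer in completion else 0.0


rng = random.Random(0)  # fixed seed so the sample is reproducible
prompt, answer = make_sample(rng)
```

A binary substring reward like this is easy to verify automatically, which is what makes the task convenient for a short reinforcement-learning test run. Increasing `n_filler` lengthens the context and makes the needle harder to find.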

## Based on

- RLM Paper (arXiv:2512.24601)
- Sebastian Raschka's GRPO insights