metadata
title: RLM Training - Needle in Haystack
sdk: docker
hardware: t4-small
RLM Training - Recursive Language Model Skills
Training Qwen3-0.6B-Base to find needles in haystacks using GRPO.
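As a rough illustration of the GRPO step mentioned above: the method samples several completions per prompt and scores each one relative to the rest of its group. The sketch below shows only that normalization. The function name `group_relative_advantages`, the `eps` constant, and the use of population standard deviation are assumptions made for illustration, not this Space's actual training code.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize the rewards of one prompt's completions within their group.

    Hypothetical helper: each reward is centered on the group mean and
    divided by the group standard deviation (eps avoids division by zero
    when every completion gets the same reward).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions for the same prompt: two found the needle, two did not.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Completions that beat the group average get a positive advantage and the rest get a negative one. A group where every completion earns the same reward yields advantages of zero and so contributes no learning signal.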
Task
- Long context with hidden facts
- Model learns to extract specific information
- Quick test run of 20 training steps
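The task described above might be set up along these lines: a short fact (the needle) is inserted at a random position in filler text (the haystack), and a completion is rewarded when it contains the fact. This is a minimal sketch. The filler sentences, the "secret code" phrasing, and the names `make_sample` and `exact_match_reward` are all hypothetical and do not come from this Space's code.

```python
import random

# Hypothetical filler sentences; a real haystack would be much longer and more varied.
FILLER = [
    "The sky was overcast for most of the afternoon.",
    "Several trains were delayed because of track maintenance.",
    "The library extended its opening hours during the exam period.",
    "A new bakery opened on the corner of the main street.",
]


def make_sample(rng, n_filler=200):
    """Build one training sample with a single fact hidden in filler text."""
    code = str(rng.randint(1000, 9999))
    needle = f"The secret code is {code}."
    sentences = [rng.choice(FILLER) for _ in range(n_filler)]
    # Insert the needle at a random position, including the very start or end.
    sentences.insert(rng.randint(0, n_filler), needle)
    return {
        "context": " ".join(sentences),
        "question": "What is the secret code?",
        "answer": code,
    }


def exact_match_reward(completion, answer):
    """Return 1.0 if the answer string appears in the completion, else 0.0."""
    return 1.0 if answer in completion else 0.0


sample = make_sample(random.Random(0))
```

A binary substring reward like this is easy to verify automatically, which suits a short test run. It also accepts completions that list many guesses, so a stricter check on the output format may be worth adding.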
Based on
- RLM Paper (arXiv:2512.24601)
- Sebastian Raschka's GRPO insights