metadata
title: RLM Training - Needle in Haystack
sdk: docker
hardware: t4-small

RLM Training - Recursive Language Model Skills

Trains Qwen3-0.6B-Base with GRPO (Group Relative Policy Optimization) to find needles in haystacks, i.e. to retrieve a specific fact hidden in a long context.
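
As a rough illustration of the GRPO idea mentioned above: several completions are sampled for the same prompt, each is scored, and every reward is normalized against its own group's mean and standard deviation. The sketch below is not taken from this Space's training code. The function name `group_relative_advantages`, the `eps` value, and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize the rewards of one group of completions sampled for the same prompt.

    Hypothetical helper: advantage_i = (r_i - mean(r)) / (std(r) + eps).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption, implementations differ
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four completions for one prompt: two found the needle, two did not.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Completions that beat their group's average get a positive advantage and the rest get a negative one. When every completion in a group earns the same reward, all advantages are zero and that group contributes no learning signal.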

Task

  • Each prompt is a long context with facts hidden in it
  • The model learns to extract the specific piece of information requested
  • Quick test run of 20 training steps
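
The task above can be sketched as a small data generator plus a reward function. This is a minimal sketch under stated assumptions, not the code used in this Space. The filler sentences, the "secret code" needle format, the prompt template, and the names `make_example` and `reward` are all hypothetical.

```python
import random

# Hypothetical filler text; contains no digits, so it cannot collide with the needle.
FILLER = [
    "The sky was overcast that morning.",
    "Trains ran on time across the valley.",
    "The library closes at nine on weekdays.",
    "A small stream runs behind the old mill.",
]


def make_example(n_sentences=200, seed=0):
    """Build one long-context prompt with a single hidden fact (the needle)."""
    rng = random.Random(seed)
    secret = str(rng.randint(1000, 9999))
    needle = f"The secret code is {secret}."
    sentences = [rng.choice(FILLER) for _ in range(n_sentences)]
    # Hide the needle at a random position in the haystack.
    sentences.insert(rng.randrange(n_sentences + 1), needle)
    prompt = " ".join(sentences) + "\n\nQuestion: What is the secret code?\nAnswer:"
    return {"prompt": prompt, "answer": secret}


def reward(completion, answer):
    """Binary reward: 1.0 if the completion contains the hidden fact, else 0.0."""
    return 1.0 if answer in completion else 0.0


example = make_example(seed=1)
score = reward("The secret code is " + example["answer"], example["answer"])
```

A binary substring match keeps the reward easy to verify. A stricter variant could require an exact match on the answer to discourage the model from listing many candidate codes.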

Based on

  • RLM Paper (arXiv:2512.24601)
  • Sebastian Raschka's GRPO insights