Spaces: mindchain/rlm-training-test

	---
	title: RLM Training - Needle in Haystack
	sdk: docker
	hardware: t4-small
	---

	# RLM Training - Recursive Language Model Skills

	Trains Qwen3-0.6B-Base with GRPO (Group Relative Policy Optimization) to retrieve a hidden fact (the needle) from a long distractor context (the haystack).
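
For readers new to GRPO, the core idea is that several completions are sampled for the same prompt and each one is scored relative to its own group rather than by a learned value model. The sketch below shows the group-relative normalization commonly described for GRPO. It is an illustration only: the function name, the `eps` value, and the example rewards are assumptions, and this Space's actual training code may differ.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of completions for the same prompt.

    Hypothetical helper: subtract the group mean and divide by the group
    standard deviation (plus eps to avoid division by zero).
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# Four hypothetical completions for one prompt: two found the needle, two did not.
advantages = group_relative_advantages([1.0, 0.0, 1.0, 0.0])
```

Completions that beat the group average get a positive advantage and the rest get a negative one. When every completion in a group earns the same reward, all advantages are zero and that group contributes no learning signal.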

	## Task
	- Input: a long context containing hidden facts
	- Goal: the model learns to extract the specific information requested
	- Run length: a quick 20-step test
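
A minimal sketch of what one sample for the task above could look like, together with a matching reward. Everything here is an assumption made for illustration: the filler sentence, the "secret code" needle format, the prompt template, and the binary substring reward are hypothetical and not taken from this Space's code.

```python
import random

# Hypothetical distractor sentence repeated to form the haystack.
FILLER = "The sky was clear and the market was quiet that day."


def make_sample(rng, n_filler=200):
    """Build one prompt with a single fact hidden at a random position."""
    code = str(rng.randint(1000, 9999))
    needle = f"The secret code is {code}."
    sentences = [FILLER] * n_filler
    # Insert the needle anywhere from the very start to the very end.
    sentences.insert(rng.randrange(n_filler + 1), needle)
    prompt = " ".join(sentences) + "\n\nQuestion: What is the secret code?\nAnswer:"
    return prompt, code


def reward(completion, answer):
    """Binary reward: 1.0 if the completion contains the hidden value, else 0.0."""
    return 1.0 if answer in completion else 0.0


rng = random.Random(0)  # fixed seed so the sample is reproducible
prompt, answer = make_sample(rng)
```

A binary reward like this keeps scoring simple, but it gives no partial credit, so a group of completions that all miss the needle produces no useful signal for a GRPO update.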

	## Based on
	- RLM Paper (arXiv:2512.24601)
	- Sebastian Raschka's GRPO insights