Add README.md
README.md CHANGED
@@ -1,10 +1,18 @@
 ---
-title:
-emoji: 🌍
-colorFrom: purple
-colorTo: indigo
+title: RLM Training - Needle in Haystack
 sdk: docker
-
+hardware: t4-small
 ---
 
-
+# RLM Training - Recursive Language Model Skills
+
+Training Qwen3-0.6B-Base to find needles in haystacks using GRPO.
+
+## Task
+- Long context with hidden facts
+- Model learns to extract specific information
+- 20 steps quick test
+
+## Based on
+- RLM Paper (arXiv:2512.24601)
+- Sebastian Raschka's GRPO insights
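The new README describes the task only in bullet form, so here is a minimal sketch of what a needle-in-haystack sample and a GRPO-style reward signal could look like. It is not the Space's training code. The helper names (`make_sample`, `reward`, `group_advantages`), the filler sentence, and the "secret code" phrasing are all hypothetical. The advantage normalization uses the population standard deviation with a small epsilon, which is one common choice and is assumed here rather than taken from the commit.

```python
import random
import statistics

# Hypothetical filler text; the commit does not say what the haystack contains.
FILLER = "The sky was clear and the market opened on time."


def make_sample(rng, n_filler=200):
    """Build one illustrative example: a long context with one hidden fact."""
    code = str(rng.randint(1000, 9999))
    needle = f"The secret code is {code}."
    sentences = [FILLER] * n_filler
    # Hide the needle at a random position among the filler sentences.
    sentences.insert(rng.randrange(n_filler + 1), needle)
    prompt = " ".join(sentences) + "\nQuestion: What is the secret code?\nAnswer:"
    return prompt, code


def reward(completion, answer):
    """Binary reward: 1.0 if the hidden value appears in the completion.

    Plain substring matching is an assumption made for brevity.
    """
    return 1.0 if answer in completion else 0.0


def group_advantages(rewards, eps=1e-4):
    """Group-relative advantages in the GRPO style: (r - mean) / (std + eps).

    `rewards` holds the scores of several completions sampled for one prompt.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    rng = random.Random(0)
    prompt, code = make_sample(rng, n_filler=10)
    # Pretend the policy produced four completions for the same prompt.
    completions = [f"The code is {code}", "I do not know", "0000", code]
    scores = [reward(c, code) for c in completions]
    print(scores, group_advantages(scores))
```

Completions that recover the hidden value receive a positive advantage relative to the rest of their group, and a group in which every completion scores the same contributes zero advantage, so it provides no learning signal for that prompt.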