mindchain committed on
Commit 11e046c · verified · 1 Parent(s): 94a59aa

Add README.md

Files changed (1)
  1. README.md +13 -6
README.md CHANGED
@@ -1,10 +1,17 @@
 ---
-title: Rlm Evaluation Test
-emoji: 😻
-colorFrom: pink
-colorTo: indigo
+title: RLM Model Evaluation
 sdk: docker
-pinned: false
+hardware: t4-small
 ---
 
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+# RLM Model Evaluation
+
+Evaluates the trained needle-in-haystack model against the base model.
+
+## Models
+- Base: Qwen/Qwen3-0.6B-Base
+- Trained: mindchain/qwen3-0.6b-rlm-needle
+
+## Expected Results
+- Base: ~25% accuracy (random guessing)
+- Trained: 50-75% accuracy (after GRPO training)
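
The expected results in the new README describe the base model as scoring about 25%, labelled random guessing. That figure is consistent with a four-way multiple-choice task, although the commit does not state the task format. The sketch below is a minimal illustration under that assumption: `accuracy` and `chance_accuracy` are hypothetical helpers, and the answer and prediction lists are made-up placeholders, not output from either model.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Fraction of predictions that exactly match the reference answers."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


def chance_accuracy(num_choices: int) -> float:
    """Expected accuracy of uniform random guessing over num_choices options."""
    if num_choices < 1:
        raise ValueError("num_choices must be at least 1")
    return 1.0 / num_choices


# Placeholder data (assumption): eight questions with four options each.
answers = ["A", "C", "B", "D", "A", "B", "C", "D"]
base_preds = ["A", "A", "A", "A", "A", "A", "A", "A"]      # 2 of 8 correct
trained_preds = ["A", "C", "B", "D", "A", "A", "A", "A"]   # 5 of 8 correct

base_acc = accuracy(base_preds, answers)
trained_acc = accuracy(trained_preds, answers)
baseline = chance_accuracy(4)

print(f"chance:  {baseline:.1%}")
print(f"base:    {base_acc:.1%}")
print(f"trained: {trained_acc:.1%}")
```

Comparing both scores against `chance_accuracy(4)` makes it explicit whether a model is doing better than guessing; with a different number of answer options the 25% baseline would change accordingly.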