RLVR-World
RLVR-World: Training World Models with Reinforcement Learning
Paper • 2505.13934
thuml/rt1-frame-tokenizer
thuml/rt1-world-model-single-step-base
0.1B
thuml/rt1-world-model-single-step-rlvr
thuml/rt1-compressive-tokenizer
thuml/rt1-world-model-multi-step-base
0.1B
thuml/rt1-world-model-multi-step-rlvr
thuml/webarena-world-model-cot
• 6.41k • 72
thuml/webarena-world-model-sft
2B
thuml/webarena-world-model-rlvr
2B
thuml/bytesized32-world-model-cot
• 304k • 65
• 3
thuml/bytesized32-world-model-sft
2B
thuml/bytesized32-world-model-rlvr-binary-reward
2B
thuml/bytesized32-world-model-rlvr-task-specific-reward
2B