RLVR-SvS/SvS-LLama-8B
Reinforcement Learning
Safetensors
RLVR-SvS/Variational-DAPO
English
llama
arxiv: 2508.14029
License: mit
SvS-LLama-8B/tokenizer.json (main)
Commit History
add model weights and training data
344e672 (verified)
MasterVito committed on Dec 11, 2025