Schrieffer/Llama-SARM-4B
Tags: Reinforcement Learning, Transformers, Safetensors, llama, text-classification, reward-model, rlhf, sparse-autoencoder, interpretability, custom_code, text-generation-inference
arxiv: 2508.08746
License: llama3.1
Branch: main
Llama-SARM-4B: 9.11 GB, 4 contributors
History: 18 commits
Latest commit: 56cebd7 by zhangshuyi.0109, "update citation & evaluation", 14 days ago
File                                Size             Last commit                           Updated
.gitattributes                      1.52 kB          initial commit                        4 months ago
README.md                           4.04 kB          update citation & evaluation          14 days ago
config.json                         1.26 kB          init                                  2 months ago
model-00001-of-00002.safetensors    4.98 GB (xet)    Initial commit of the reward model    4 months ago
model-00002-of-00002.safetensors    4.13 GB (xet)    Initial commit of the reward model    4 months ago
model.safetensors.index.json        12.3 kB          Initial commit of the reward model    4 months ago
modeling_sarm_llama.py              23.7 kB          init                                  2 months ago
special_tokens_map.json             434 Bytes        Add tokenizer                         4 months ago
tokenizer.json                      9.09 MB          Add tokenizer                         4 months ago
tokenizer_config.json               55.6 kB          add remote code                       3 months ago