LeTue09/arithmetic-grpo
arxiv: 14 papers
main: arithmetic-grpo/verl/trainer (575 kB)
1 contributor
History: 1 commit
LeTue09: initial clean commit (1faccd4), 29 days ago
Name                        Size        Last commit            Updated
config                      -           initial clean commit   29 days ago
ppo                         -           initial clean commit   29 days ago
README.md                   1.14 kB     initial clean commit   29 days ago
__init__.py                 600 Bytes   initial clean commit   29 days ago
constants_ppo.py            2.05 kB     initial clean commit   29 days ago
main_eval.py                2.8 kB      initial clean commit   29 days ago
main_generation_server.py   6.67 kB     initial clean commit   29 days ago
main_ppo.py                 18.3 kB     initial clean commit   29 days ago
runtime_env.yaml            195 Bytes   initial clean commit   29 days ago
sft_trainer.py              18.6 kB     initial clean commit   29 days ago
sft_trainer_ray.py          15.7 kB     initial clean commit   29 days ago