LeTue09/arithmetic-grpo
arxiv: 14 papers
Branch: main
Path: arithmetic-grpo/tests/special_e2e (94.2 kB)
1 contributor, History: 1 commit
Latest commit: 1faccd4 "initial clean commit" by LeTue09, 29 days ago
Name                                     Size       Last commit           Updated
envs/                                    -          initial clean commit  29 days ago
generation/                              -          initial clean commit  29 days ago
ppo_trainer/                             -          initial clean commit  29 days ago
sft/                                     -          initial clean commit  29 days ago
README.md                                83 Bytes   initial clean commit  29 days ago
__init__.py                              600 Bytes  initial clean commit  29 days ago
check_custom_rwd_fn.py                   1.17 kB    initial clean commit  29 days ago
check_results.py                         1.75 kB    initial clean commit  29 days ago
run_dapo.sh                              3.55 kB    initial clean commit  29 days ago
run_fully_async_policy.sh                8.12 kB    initial clean commit  29 days ago
run_geo3k_fsdp_sgl_multiturn_w_tool.sh   2.61 kB    initial clean commit  29 days ago
run_grpo_lora_with_merge.sh              3.59 kB    initial clean commit  29 days ago
run_gsm8k_fsdp_sgl_multiturn_sf_tool.sh  2.52 kB    initial clean commit  29 days ago
run_gsm8k_fsdp_sgl_multiturn_w_tool.sh   2.6 kB     initial clean commit  29 days ago
run_one_step_off_policy.sh               7.56 kB    initial clean commit  29 days ago
run_ppo_trainer_megatron.sh              13.1 kB    initial clean commit  29 days ago
run_ppo_trainer_torchtitan.sh            3.18 kB    initial clean commit  29 days ago
run_ppo_trainer_veomni.sh                3.06 kB    initial clean commit  29 days ago
run_test.sh                              359 Bytes  initial clean commit  29 days ago