zz1358m/SofT-GRPO-master
TensorBoard
Safetensors
arxiv:2511.06411
main
1 contributor
History: 18 commits
zz1358m: Update README.md (128d4db, verified), 4 months ago
| Name | Size | Last commit | Updated |
| --- | --- | --- | --- |
| Results | | Delete Results/training curves/tensorboard_show-1.5B/run.txt | 4 months ago |
| Soft-Thinking+noise+loss-main | | Upload 4 files | 4 months ago |
| assets | | Upload mainprocess.png | 4 months ago |
| saved_weight | | Upload folder using huggingface_hub | 4 months ago |
| verl-0.4.x | | Upload folder using huggingface_hub | 4 months ago |
| .gitattributes | 3.26 kB | Upload mainprocess.png | 4 months ago |
| .gitignore | 39 Bytes | Upload folder using huggingface_hub | 4 months ago |
| README.md | 5.57 kB | Update README.md | 4 months ago |
| SofT-GRPO-deepscaler-8k-dir.sh | 3.18 kB | Upload 5 files | 4 months ago |
| SofT-GRPO-deepscaler-8k-gau.sh | 3.12 kB | Upload 5 files | 4 months ago |
| SofT-GRPO-deepscaler-8k-llama3.sh | 3.28 kB | Update SofT-GRPO-deepscaler-8k-llama3.sh | 4 months ago |
| SofT-GRPO-deepscaler-8k-qwen7.sh | 3.3 kB | Update SofT-GRPO-deepscaler-8k-qwen7.sh | 4 months ago |
| SofT-GRPO-deepscaler-8k.sh | 3.28 kB | Update SofT-GRPO-deepscaler-8k.sh | 4 months ago |
| requirements.txt | 4.01 kB | Upload 2 files | 4 months ago |