zz1358m/SofT-GRPO-master
TensorBoard
Safetensors
arxiv: 2511.06411
52.2 GB
1 contributor
History: 14 commits
Latest commit: zz1358m, "Update SofT-GRPO-deepscaler-8k-llama3.sh" (eca69d7, verified), 6 months ago
| Name | Size | Last commit | Updated |
| --- | --- | --- | --- |
| Results | - | Delete Results/training curves/tensorboard_show-1.5B/run.txt | 6 months ago |
| Soft-Thinking+noise+loss-main | - | Upload 4 files | 6 months ago |
| assets | - | Upload mainprocess.png | 6 months ago |
| saved_weight | - | Upload folder using huggingface_hub | 6 months ago |
| verl-0.4.x | - | Upload folder using huggingface_hub | 6 months ago |
| .gitattributes | 3.26 kB | Upload mainprocess.png | 6 months ago |
| .gitignore | 39 Bytes | Upload folder using huggingface_hub | 6 months ago |
| README.md | 5.19 kB | Upload README.md | 6 months ago |
| SofT-GRPO-deepscaler-8k-dir.sh | 3.18 kB | Upload 5 files | 6 months ago |
| SofT-GRPO-deepscaler-8k-gau.sh | 3.12 kB | Upload 5 files | 6 months ago |
| SofT-GRPO-deepscaler-8k-llama3.sh | 3.28 kB | Update SofT-GRPO-deepscaler-8k-llama3.sh | 6 months ago |
| SofT-GRPO-deepscaler-8k-qwen7.sh | 3.3 kB | Upload 5 files | 6 months ago |
| SofT-GRPO-deepscaler-8k.sh | 3.28 kB | Update SofT-GRPO-deepscaler-8k.sh | 6 months ago |
| requirements.txt | 4.01 kB | Upload 2 files | 6 months ago |