wang
wzx111
0 followers
·
1 following
AI & ML interests
None yet
wzx111's activity
New activity in
wzx111/Qwen3-1.7B-MATH-GDPO
13 days ago
Which post-training method was actually used for this model, GDPO or GRPO?
1
#1 opened 14 days ago by
roseblooming
updated
a dataset
about 1 month ago
wzx111/MATH-lighteval-level3
•
Updated
Dec 9, 2025
•
2.72k
•
8
published
a dataset
about 1 month ago
wzx111/MATH-lighteval-level3
•
Updated
Dec 9, 2025
•
2.72k
•
8
published
a model
about 1 month ago
wzx111/Qwen3-1.7B-GRPO-math
Updated
Nov 29, 2025
updated
a model
about 1 month ago
wzx111/Qwen3-1.7B-GRPO-math
Updated
Nov 29, 2025
updated
a dataset
about 2 months ago
wzx111/MATH-lighteval-level-middlehigh
•
Updated
Nov 24, 2025
•
5.63k
•
7
published
a dataset
about 2 months ago
wzx111/MATH-lighteval-level-middlehigh
•
Updated
Nov 24, 2025
•
5.63k
•
7
updated
a dataset
about 2 months ago
wzx111/MATH-lighteval-level-middle
•
Updated
Nov 24, 2025
•
7.87k
•
3
published
a dataset
about 2 months ago
wzx111/MATH-lighteval-level-middle
•
Updated
Nov 24, 2025
•
7.87k
•
3
updated
a model
about 2 months ago
wzx111/Qwen3-1.7B-Open-R1-ADPO
Text Generation
•
2B
•
Updated
Nov 23, 2025
•
1
published
a model
about 2 months ago
wzx111/Qwen3-1.7B-Open-R1-ADPO
Text Generation
•
2B
•
Updated
Nov 23, 2025
•
1
updated
a model
about 2 months ago
wzx111/Qwen3-1.7B-Open-R1-GRPO-Baseline
Text Generation
•
2B
•
Updated
Nov 22, 2025
•
1
published
a model
about 2 months ago
wzx111/Qwen3-1.7B-Open-R1-GRPO-Baseline
Text Generation
•
2B
•
Updated
Nov 22, 2025
•
1
New activity in
Qwen/Qwen3-235B-A22B
8 months ago
Does the reward function lack an n-gram repetition penalty?
2
#7 opened 9 months ago by
wzx111
updated
a model
8 months ago
wzx111/Qwen3-1.7B-Open-R1-GRPO
2B
•
Updated
May 14, 2025
•
2
published
a model
8 months ago
wzx111/Qwen3-1.7B-Open-R1-GRPO
2B
•
Updated
May 14, 2025
•
2
updated
a model
8 months ago
wzx111/Qwen3-1.7B-Open-R1-GDPO-epcoh_
Text Generation
•
2B
•
Updated
May 14, 2025
•
5
published
a model
8 months ago
wzx111/Qwen3-1.7B-Open-R1-GDPO-epcoh_
Text Generation
•
2B
•
Updated
May 14, 2025
•
5
updated
a model
8 months ago
wzx111/Qwen3-1.7B-MATH-GDPO-EPOCH2
Text Generation
•
2B
•
Updated
May 2, 2025
•
4
published
a model
8 months ago
wzx111/Qwen3-1.7B-MATH-GDPO-EPOCH2
Text Generation
•
2B
•
Updated
May 2, 2025
•
4