Mehul Damani
PRO
mehuldamani
3 followers · 0 following
https://damanimehul.github.io
MehulDamani2
damanimehul
AI & ML interests
Reinforcement Learning, Large Language Models
Recent Activity
published a dataset about 20 hours ago: mehuldamani/multi-answer-sft-target-dataset
published a model 2 days ago: mehuldamani/sfted_rlvr_multi__veryHardDataset_moreThinking
updated a dataset 2 days ago: mehuldamani/multi-answer-sft-target-dataset
Organizations
None yet
mehuldamani's models (218)
Sort: Recently updated
mehuldamani/RLCR-hotpot-octo • Text Generation • 8B • Updated Nov 19, 2025 • 3 • 1
mehuldamani/RLVR-hotpot-octo • Text Generation • 8B • Updated Nov 19, 2025 • 2
mehuldamani/RLCR-hotpot-olmo-v2 • Text Generation • 7B • Updated Nov 19, 2025 • 1
mehuldamani/RLCR-hotpot-olmo • Text Generation • 7B • Updated Nov 18, 2025 • 2
mehuldamani/RLVR-hotpot-olmo • Text Generation • 7B • Updated Nov 17, 2025 • 1
mehuldamani/RLVR-hotpot-mistral • Text Generation • 7B • Updated Nov 17, 2025 • 2
mehuldamani/calibration-only-v4 • Text Generation • 8B • Updated Nov 17, 2025 • 2
mehuldamani/bandit-log-RLCR-v2 • Text Generation • 8B • Updated Nov 17, 2025 • 2
mehuldamani/bandit-brier-RLCR-v1 • Text Generation • 8B • Updated Nov 17, 2025 • 2
mehuldamani/bandit-log-RLCR-v1 • Text Generation • 8B • Updated Nov 17, 2025 • 1
mehuldamani/toy-log-RLCR-v3 • Text Generation • 8B • Updated Nov 17, 2025 • 3 • 1
mehuldamani/toy-log-RLCR-v1 • Text Generation • 8B • Updated Nov 17, 2025 • 2
mehuldamani/RLVR-hotpot-gemma • Updated Nov 16, 2025
mehuldamani/RLCR-hotpot-llama-base • Updated Nov 16, 2025
mehuldamani/RLCR-hotpot-mistral • Updated Nov 16, 2025
mehuldamani/nov15_qwen3_8b_math_rlVr_single • Updated Nov 16, 2025
mehuldamani/calibration-only-v3 • Updated Nov 15, 2025
mehuldamani/nov15_qwen3_8b_math_rlcr_single • Updated Nov 15, 2025
mehuldamani/nov15_qwen3_8b_math_rlcr_multiple • Updated Nov 15, 2025
mehuldamani/RLCR-math-tac-v2 • Updated Nov 15, 2025
mehuldamani/RLVR-hotpot-llama • Updated Nov 15, 2025
mehuldamani/RLCR-hotpot-llama • Updated Nov 15, 2025
mehuldamani/RLCR-hotpot-log-loss-highw • Text Generation • 8B • Updated Nov 15, 2025 • 1
mehuldamani/RLCR-hotpot-tac • Text Generation • 8B • Updated Nov 14, 2025 • 1
mehuldamani/calibration-only-v2 • Updated Nov 14, 2025
mehuldamani/RLCR-math-tac • Updated Nov 14, 2025
mehuldamani/RLCR-hotpot-log-loss • Text Generation • 8B • Updated Nov 14, 2025 • 3
mehuldamani/calibration-only • Updated Nov 13, 2025
mehuldamani/RLCR-hotpot-log-loss-v3 • Updated Nov 13, 2025
mehuldamani/RLCR-hotpot-normal • Updated Nov 13, 2025