Tanmay Jain
tanmayyyj
0 followers · 1 following
AI & ML interests
Computer Vision, NLP, Reinforcement Learning
Organizations
spaces
2
Reward Fn Inference 🤖 • pinned • Paused
Reward Fn Trainer 🏋 • pinned • Paused
Fine-tune a language model with DPO using your dataset
models
13
tanmayyyj/ministral-8b-reward-fn-dpo • Updated Feb 28
tanmayyyj/ministral-8b-reward-fn-sft • Updated Feb 28
tanmayyyj/Qwen-3-math-reasoning • Updated Jun 2, 2025
tanmayyyj/dqn-SpaceInvadersNoFrameskip-v4 • Reinforcement Learning • Updated Jul 4, 2023 • 2
tanmayyyj/Taxi-v3 • Updated Jun 30, 2023
tanmayyyj/Taxi_v3 • Reinforcement Learning • Updated Jun 30, 2023
tanmayyyj/q-FrozenLake-v2-4x4-Non_Slippery • Reinforcement Learning • Updated Jun 30, 2023 • 1
tanmayyyj/ppo-PyramidsRND • Reinforcement Learning • Updated Jun 20, 2023
tanmayyyj/ppo-SnowballTarget • Reinforcement Learning • Updated Jun 20, 2023
tanmayyyj/Cartpole-v1 • Reinforcement Learning • Updated Jun 19, 2023
datasets
2
tanmayyyj/reward-fn-dpo • Viewer • Updated Feb 28 • 161 • 7
tanmayyyj/reward-fn-sft • Viewer • Updated Feb 28 • 33 • 11