ICYMI, you can fine-tune open LLMs using Claude Code
just tell it: "Fine-tune Qwen3-0.6B on open-r1/codeforces-cots"
and Claude submits a real training job on HF GPUs using TRL.
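under the hood, the submitted job is a plain TRL script. a minimal sketch of what it might look like (the output name and the "solutions" subset are assumptions here; check the dataset card):

```python
# pip install trl datasets
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# reasoning-trace dataset named in the post; "solutions" is assumed
# to be the subset to train on
dataset = load_dataset("open-r1/codeforces-cots", "solutions", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen3-0.6B",  # model named in the post
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="qwen3-0.6b-codeforces-cots",
        push_to_hub=True,  # puts the finished model on the Hub
    ),
)
trainer.train()
```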
it handles everything:
> dataset validation
> GPU selection
> training + Trackio monitoring
> job submission + cost estimation
when it's done, your model is on the Hub, ready to use
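prefer to submit by hand? the same kind of job goes through the `hf jobs` CLI from huggingface_hub. a sketch, assuming a recent huggingface_hub and that the a10g-large flavor fits your budget:

```bash
# run the training script on HF GPUs; --flavor picks the hardware
hf jobs uv run --flavor a10g-large train.py

# list your jobs and check their status
hf jobs ps
```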
TRL includes GDPO, the latest variant of GRPO for multi-reward RL ✨
GDPO decouples reward normalization to avoid reward collapse and improve per-reward convergence, developed by @sliuau, @SimonX, et al.
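a sketch of multi-reward RL with TRL's GRPOTrainer. the reward functions are toys, the dataset is the one from TRL's GRPO docs, and `loss_type="gdpo"` is an assumption about how TRL exposes GDPO; the exact switch may differ in your version:

```python
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# two toy rewards; plain GRPO normalizes their combined signal jointly,
# while GDPO (per the post) normalizes each reward separately
def reward_brevity(completions, **kwargs):
    return [-len(c) / 1000 for c in completions]

def reward_mentions_code(completions, **kwargs):
    return [1.0 if "def " in c else 0.0 for c in completions]

dataset = load_dataset("trl-lib/tldr", split="train")  # prompt-only dataset

trainer = GRPOTrainer(
    model="Qwen/Qwen3-0.6B",
    reward_funcs=[reward_brevity, reward_mentions_code],  # multi-reward RL
    args=GRPOConfig(
        output_dir="qwen3-0.6b-gdpo",
        loss_type="gdpo",  # assumption: the GDPO switch's name may differ
    ),
    train_dataset=dataset,
)
trainer.train()
```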