🏗️ Building on HF

Elena M

borntobeignored

AI & ML interests

None yet

Recent Activity

liked a model 7 days ago

poolside/Laguna-M.1

upvoted a paper 6 days ago

From Trainee to Trainer: LLM-Designed Training Environment for RL with Multi-Agent Reasoning

Paper • 2606.17682 • Published 10 days ago • 26

upvoted an article 6 days ago

Holo3.1: Fast & Local Computer Use Agents

Article • Hcompany • 23 days ago • 32

upvoted 4 papers 13 days ago

Toward Generalist Autonomous Research via Hypothesis-Tree Refinement

Paper • 2606.11926 • Published 16 days ago • 118

On the Geometry of On-Policy Distillation

Paper • 2606.07082 • Published 21 days ago • 73

Claw-SWE-Bench: A Benchmark for Evaluating OpenClaw-style Agent Harnesses on Coding Tasks

Paper • 2606.12344 • Published 16 days ago • 68

EvoArena: Tracking Memory Evolution for Robust LLM Agents in Dynamic Environments

Paper • 2606.13681 • Published 15 days ago • 140

upvoted a paper 14 days ago

Rethinking the Divergence Regularization in LLM RL

Paper • 2606.09821 • Published 18 days ago • 33

upvoted a collection 15 days ago

Gemma 4

Collection • 15 items • Updated 15 days ago • 991

upvoted an article 16 days ago

The Open Source Community is backing OpenEnv for Agentic RL

Article • burtenshaw, spisakjo, lysandre, darktex, willcb, qjoy, pawalt, cwing-nv, danielhanchen, andrewzhou, thegovind, shimmyshimmer, Hamid-Nazeri, Sanyam, zkwentz, emre0, lewtun, sergiopaniego, banghua • 18 days ago • 91

upvoted a paper 20 days ago

SCOPE: Self-Play via Co-Evolving Policies for Open-Ended Tasks

Paper • 2605.31433 • Published 28 days ago • 28

upvoted a paper 21 days ago

Trust Region On-Policy Distillation

Paper • 2606.01249 • Published 26 days ago • 44

upvoted a collection 21 days ago

Nemotron-Post-Training-v3

Collection • Collection of datasets used in the post-training phase of Nemotron Nano, Super, and Ultra v3. • 50 items • Updated 14 days ago • 167

upvoted an article 27 days ago

Profiling in PyTorch (Part 1): A Beginner's Guide to torch.profiler

Article • ariG23498, sayakpaul, sergiopaniego, ror, pcuenq • 28 days ago • 127

upvoted a paper 29 days ago

CUA-Gym: Scaling Verifiable Training Environments and Tasks for Computer-Use Agents

Paper • 2605.25624 • Published May 25 • 34

upvoted 2 papers about 1 month ago

Heterogeneous Agent Collaborative Reinforcement Learning

Paper • 2603.02604 • Published Mar 3 • 198

Golden Goose: A Simple Trick to Synthesize Unlimited RLVR Tasks from Unverifiable Internet Text

Paper • 2601.22975 • Published Jan 30 • 113

upvoted a collection about 1 month ago

Mellum

Collection • Series of code models by JetBrains • 12 items • Updated Oct 1, 2025 • 50

upvoted 2 papers about 1 month ago

Reinforcement Learning via Self-Distillation

Paper • 2601.20802 • Published Jan 28 • 50

Embarrassingly Simple Self-Distillation Improves Code Generation

Paper • 2604.01193 • Published Apr 1 • 56

upvoted an article about 1 month ago

TRL v1.0: Post-Training Library Built to Move with the Field

Article • qgallouedec, stevhliu, pcuenq, sergiopaniego • Mar 31 • 57
