Somshubra Majumdar

smajumdar94


AI & ML interests

None yet

Recent Activity

upvoted a paper about 2 months ago

Rubric-based On-policy Distillation

upvoted a paper about 2 months ago

Unmasking On-Policy Distillation: Where It Helps, Where It Hurts, and Why

liked a model 2 months ago

moonshotai/Kimi-K2.6


Organizations

upvoted 2 papers about 2 months ago

Rubric-based On-policy Distillation

Paper • 2605.07396 • Published May 8 • 41

Unmasking On-Policy Distillation: Where It Helps, Where It Hurts, and Why

Paper • 2605.10889 • Published May 11 • 6

upvoted a paper 3 months ago

SKILL0: In-Context Agentic Reinforcement Learning for Skill Internalization

Paper • 2604.02268 • Published Apr 2 • 103

upvoted a paper 4 months ago

daVinci-Env: Open SWE Environment Synthesis at Scale

Paper • 2603.13023 • Published Mar 13 • 30

upvoted a collection 4 months ago

Nemotron-Post-Training-v3

Collection

Collection of datasets used in the post-training phase of Nemotron Nano, Super, and Ultra v3. • 50 items • Updated 25 days ago • 169

upvoted a paper 4 months ago

BeyondSWE: Can Current Code Agent Survive Beyond Single-Repo Bug Fixing?

Paper • 2603.03194 • Published Mar 3 • 57

upvoted 3 articles 5 months ago

Custom Kernels for All from Codex and Claude

Article • burtenshaw, sayakpaul, ariG23498, evalstate • Feb 13 • 80

Forge: Scalable Agent RL Framework and Algorithm

Article • MiniMax-AI • Feb 13 • 156

We Got Claude to Build CUDA Kernels and teach open models!

Article • burtenshaw, evalstate, merve, pcuenq • Jan 28 • 158

upvoted a collection 6 months ago

Openhands Trajectories

Collection

Dataset of 67,074 OpenHands trajectories collected with Qwen3-Coder-480B-A35B-Instruct and two RFT checkpoints trained on the data • 3 items • Updated Dec 23, 2025 • 8

upvoted a paper 7 months ago

From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence

Paper • 2511.18538 • Published Nov 23, 2025 • 306

upvoted a paper 9 months ago

DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

Paper • 2509.25454 • Published Sep 29, 2025 • 147

upvoted 4 papers 10 months ago

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7, 2025 • 190

upvoted an article 11 months ago

Supercharge Edge AI With High‑Accuracy Reasoning Using NVIDIA Nemotron Nano 2 9B

Article • nvidia • Aug 18, 2025 • 32

upvoted 2 papers 12 months ago

Replacing thinking with tool usage enables reasoning in small language models

Paper • 2507.05065 • Published Jul 7, 2025 • 17

SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?

Paper • 2507.12415 • Published Jul 16, 2025 • 43

upvoted an article 12 months ago

SmolLM3: smol, multilingual, long-context reasoner

Article • eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 780
