Roman Nekrasov

Rob1234567

romannekrasovaillm

AI & ML interests

Agentic mid-training, reinforcement learning with verifiable rewards (RLVR), scaling agent environments, and interleaved agent reasoning with tools

Recent Activity

liked a model 26 days ago

deepseek-ai/DeepSeek-V4-Flash

Organizations

None yet

upvoted a collection 7 days ago

DeepSeek-V4

Collection

6 items • Updated 6 days ago • 710

upvoted a collection 10 days ago

Qwen3.6

Collection

4 items • Updated Apr 22 • 425

upvoted 2 collections 3 months ago

Gemma 4

Collection

15 items • Updated 22 days ago • 1.01k

GigaChat 3.1

Collection

6 items • Updated Mar 24 • 62

upvoted a paper 3 months ago

LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning

Paper • 2603.21065 • Published Mar 22 • 78

upvoted a paper 4 months ago

VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

Paper • 2602.10693 • Published Feb 11 • 221

upvoted a collection 6 months ago

MedGemma Release

Collection

Collection of Gemma 3 variants for performance on medical text and image comprehension to accelerate building healthcare-based AI applications. • 9 items • Updated Mar 12 • 509

upvoted a collection 7 months ago

Olmo 3

Collection

Artifacts for the Olmo 3 release. • 7 items • Updated Mar 2 • 171

upvoted a paper 7 months ago

DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning

Paper • 2511.22570 • Published Nov 27, 2025 • 96

upvoted 2 articles 7 months ago

Article

What makes good reasoning data

MiniMax-AI • Oct 30, 2025 • 45

Article

Aligning to What? Rethinking Agent Generalization in MiniMax M2

MiniMax-AI • Oct 30, 2025 • 43

upvoted a collection 8 months ago

Gemma 3 Release

Collection

28 items • Updated Mar 12 • 644

upvoted a collection 9 months ago

Qwen3Guard

Collection

7 items • Updated Dec 31, 2025 • 69

upvoted a paper about 1 year ago

Chain-of-Model Learning for Language Model

Paper • 2505.11820 • Published May 17, 2025 • 121

upvoted 2 collections about 1 year ago

Qwen2.5

Collection

Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 43 items • Updated Mar 2 • 728

LiveBench

Collection

Datasets for LiveBench • 8 items • Updated Mar 31, 2025 • 15

upvoted a paper over 1 year ago

I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders

Paper • 2503.18878 • Published Mar 24, 2025 • 121

upvoted a collection over 1 year ago

DeepSeek-R1

Collection

10 items • Updated Nov 27, 2025 • 857
