Egor Bogomolov

egor-bogomolov

AI & ML interests

None yet

Recent Activity

updated a dataset 2 days ago

JetBrains-Research/cwm-benchmarks-dl4c-traces

Viewer • Updated 2 days ago • 435 • 21

published a dataset 2 days ago

JetBrains-Research/cwm-benchmarks-dl4c-traces

Viewer • Updated 2 days ago • 435 • 21

updated a dataset 2 days ago

JetBrains-Research/cwm-benchmarks-dl4c-benchmark

Viewer • Updated 2 days ago • 435 • 21

published a dataset 2 days ago

JetBrains-Research/cwm-benchmarks-dl4c-benchmark

Viewer • Updated 2 days ago • 435 • 21

updated a dataset 2 days ago

JetBrains-Research/cwm-benchmarks-dl4c-environments

Viewer • Updated 2 days ago • 217 • 22

published a dataset 2 days ago

JetBrains-Research/cwm-benchmarks-dl4c-environments

Viewer • Updated 2 days ago • 217 • 22

updated a Space 3 months ago

ML4SE Benchmark Viewer

📊

Explore ML4SE benchmark problems with filters and search

upvoted a paper 3 months ago

SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale

Paper • 2602.23866 • Published Feb 27 • 89

published a Space 3 months ago

ML4SE Benchmark Viewer

📊

Explore ML4SE benchmark problems with filters and search

updated a dataset 3 months ago

JetBrains-Research/REval

Viewer • Updated Mar 2 • 1.7k • 62 • 1

published a dataset 3 months ago

JetBrains-Research/REval

Viewer • Updated Mar 2 • 1.7k • 62 • 1

updated a Space 3 months ago

Long Code Arena

🏟

View model performance leaderboards for various tasks

authored a paper 7 months ago

The Best of N Worlds: Aligning Reinforcement Learning with Best-of-N Sampling via max@k Optimisation

Paper • 2510.23393 • Published Oct 27, 2025 • 21

upvoted a paper 7 months ago

The Best of N Worlds: Aligning Reinforcement Learning with Best-of-N Sampling via max@k Optimisation

Paper • 2510.23393 • Published Oct 27, 2025 • 21

authored 2 papers 7 months ago

The Complexity Trap: Simple Observation Masking Is as Efficient as LLM Summarization for Agent Context Management

Paper • 2508.21433 • Published Aug 29, 2025 • 8

Diff-XYZ: A Benchmark for Evaluating Diff Understanding

Paper • 2510.12487 • Published Oct 14, 2025 • 9

upvoted 2 papers 7 months ago

The Complexity Trap: Simple Observation Masking Is as Efficient as LLM Summarization for Agent Context Management

Paper • 2508.21433 • Published Aug 29, 2025 • 8

Diff-XYZ: A Benchmark for Evaluating Diff Understanding

Paper • 2510.12487 • Published Oct 14, 2025 • 9

authored 2 papers 8 months ago

Drawing Pandas: A Benchmark for LLMs in Generating Plotting Code

Paper • 2412.02764 • Published Dec 3, 2024

EnvBench: A Benchmark for Automated Environment Setup

Paper • 2503.14443 • Published Mar 18, 2025 • 1
