OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models
Paper
• 2411.04905
• Published
• 127
Granite Code Models: A Family of Open Foundation Models for Code Intelligence
Paper
• 2405.04324
• Published
• 26
Seed-Coder: Let the Code Model Curate Data for Itself
Paper
• 2506.03524
• Published
• 6
Qwen2.5-Coder Technical Report
Paper
• 2409.12186
• Published
• 153
X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests
Paper
• 2601.06953
• Published
• 45
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code
Paper
• 2410.08196
• Published
• 48
Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale
Paper
• 2409.17115
• Published
• 64
Competitive Programming with Large Reasoning Models
Paper
• 2502.06807
• Published
• 69
How Programming Concepts and Neurons Are Shared in Code Language Models
Paper
• 2506.01074
• Published
• 4
Multi-Programming Language Sandbox for LLMs
Paper
• 2410.23074
• Published
codefuse-ai/CodeExercise-Python-27k
Updated
• 1.02k
• 67
Viewer
• Updated
• 887k • 1.48k
• 11
agentica-org/DeepCoder-Preview-Dataset
Viewer
• Updated
• 25k • 2.68k
• 99
nvidia/Nemotron-Competitive-Programming-v1
Preview
• Updated
• 1.59k
• 22
inclusionAI/Ling-Coder-SFT
Viewer
• Updated
• 4.48M • 567
• 37
allenai/Dolci-RL-Zero-Code-7B
Viewer
• Updated
• 13.3k • 163
• 10
Viewer
• Updated
• 49.6k • 3.58k
• 170
ByteDance-Seed/Code-Contests-Plus
Viewer
• Updated
• 49.2k • 5.38k
• 60
theblackcat102/evol-code-zh
Viewer
• Updated
• 10.3k • 28
• 11
microsoft/NextCoderDataset
Viewer
• Updated
• 381k • 668
• 54
RLVR-SvS/Variational-DAPO
Viewer
• Updated
• 314k • 26
• 3
Viewer
• Updated
• 1.35M • 101
• 4
Fate-Zero/ArcherCodeR-Dataset
Updated
• 177
• 2
nvidia/OpenCodeGeneticInstruct
Viewer
• Updated
• 15.1M • 275
• 20
Viewer
• Updated
• 4.97M • 2.8k
• 65
microsoft/EpiCoder-func-380k
Viewer
• Updated
• 380k • 38
• 29
EpiCoder: Encompassing Diversity and Complexity in Code Generation
Paper
• 2501.04694
• Published
• 18
IterPref: Focal Preference Learning for Code Generation via Iterative Debugging
Paper
• 2503.02783
• Published
• 7
SynthCoder: A Synthetical Strategy to Tune LLMs for Code Completion
Paper
• 2508.15495
• Published
• 1
Increasing LLM Coding Capabilities through Diverse Synthetic Coding Tasks
Paper
• 2510.23208
• Published
• 1
AutoML-org/SyntheticCode-800K
Viewer
• Updated
• 792k • 15
• 3
Viewer
• Updated
• 80k • 25
• 14
Can Programming Languages Boost Each Other via Instruction Tuning?
Paper
• 2308.16824
• Published
• 12
Idea First, Code Later: Disentangling Problem Solving from Code Generation in Evaluating LLMs for Competitive Programming
Paper
• 2601.11332
• Published
• 1
Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs
Paper
• 2506.19290
• Published
• 53
SoTaNa: The Open-Source Software Development Assistant
Paper
• 2308.13416
• Published
• 14
OpenCodeInstruct: A Large-scale Instruction Tuning Dataset for Code LLMs
Paper
• 2504.04030
• Published
• 3