OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models Paper • 2411.04905 • Published Nov 7, 2024 • 127
Granite Code Models: A Family of Open Foundation Models for Code Intelligence Paper • 2405.04324 • Published May 7, 2024 • 25
X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests Paper • 2601.06953 • Published Jan 2026 • 43
MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code Paper • 2410.08196 • Published Oct 10, 2024 • 48
Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale Paper • 2409.17115 • Published Sep 25, 2024 • 64
How Programming Concepts and Neurons Are Shared in Code Language Models Paper • 2506.01074 • Published Jun 1, 2025 • 4
EpiCoder: Encompassing Diversity and Complexity in Code Generation Paper • 2501.04694 • Published Jan 8, 2025 • 16
IterPref: Focal Preference Learning for Code Generation via Iterative Debugging Paper • 2503.02783 • Published Mar 4, 2025 • 7
SynthCoder: A Synthetical Strategy to Tune LLMs for Code Completion Paper • 2508.15495 • Published Aug 21, 2025 • 1
Increasing LLM Coding Capabilities through Diverse Synthetic Coding Tasks Paper • 2510.23208 • Published Oct 27, 2025 • 1
Can Programming Languages Boost Each Other via Instruction Tuning? Paper • 2308.16824 • Published Aug 31, 2023 • 12
Idea First, Code Later: Disentangling Problem Solving from Code Generation in Evaluating LLMs for Competitive Programming Paper • 2601.11332 • Published Jan 2026 • 1
Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs Paper • 2506.19290 • Published Jun 24, 2025 • 53
SoTaNa: The Open-Source Software Development Assistant Paper • 2308.13416 • Published Aug 25, 2023 • 13
OpenCodeInstruct: A Large-scale Instruction Tuning Dataset for Code LLMs Paper • 2504.04030 • Published Apr 5, 2025 • 1