JSCPPProgrammer's picture
Initial: GenSearcher workflow + FireRed /generate adapter + Gradio
80b7188 verified

DeepCoder Programming Agent Example

This example demonstrates training and running DeepCoder, a code reasoning LLM fine-tuned from DeepSeek-R1-Distill-Qwen-14B on coding competition problems with RL. The model achieves 60.6% Pass@1 accuracy on LiveCodeBench v5, representing an 8% improvement over the base model.

Overview

The DeepCoder examples demonstrate:

  • How to use rLLM's CompetitionCodingAgent for programming tasks
  • How to train agents with iterative context lengthening (16K -> 32K)
  • How to evaluate coding performance on LiveCodeBench

Quick Start

Setup Coding Data

First, prepare your coding datasets:

cd examples/deepcoder
python prepare_deepcoder_data.py

Model Hosting

Start a model server (choose one option):

Option 1: Using vLLM

python -m vllm.entrypoints.openai.api_server \
    --model agentica-org/DeepCoder-14B-Preview \
    --host 0.0.0.0 \
    --port 30000 \
    --dtype bfloat16 \
    --max-model-len 32768

Option 2: Using SGLang

python -m sglang_router.launch_server \
    --model-path agentica-org/DeepCoder-14B-Preview \ 
    --dp-size 1 \
    --dtype bfloat16

Run DeepCoder Agent

Execute the coding agent for evaluation:

python evaluate_deepcoder.py

Train DeepCoder Agent

Train your own DeepCoder agent with iterative context lengthening:

# Train with 16K context
bash train_deepcoder_16k.sh

# Train with 32K context (modify MODEL_PATH to 16k checkpoint)
bash train_deepcoder_32k.sh

Code Reference

Code Agent Evaluator

Main script for evaluating coding performance:

--8<-- "examples/deepcoder/run_deepcoder.py"

Training Script

DeepCoder training configuration:

--8<-- "examples/deepcoder/train_deepcoder.py"

For detailed setup instructions, see the README in the deepcoder example directory.