File size: 2,500 Bytes
80b7188
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
# DeepCoder Training Examples

This directory contains examples for training and running DeepCoder, a code reasoning LLM fine-tuned from DeepSeek-R1-Distill-Qwen-14B using distributed reinforcement learning (RL). 

Our examples uses the following:
* DeepSeek-R1-Distill-Qwen-14B as the base model
* agentica-org/DeepCoder-Preview-Dataset (lcbv5 subset) for training and evaluation



## Model Hosting

### Option 1: Using vLLM

Start a vLLM server with OpenAI-compatible API:

```bash

python -m vllm.entrypoints.openai.api_server \

    --model agentica-org/DeepCoder-14B-Preview \

    --host 0.0.0.0 \

    --port 30000 \

    --dtype bfloat16 \

    --max-model-len 65536

```

### Option 2: Using SGLang

```bash

python -m sglang_router.launch_server \

    --model-path agentica-org/DeepCoder-14B-Preview \ 

    --dp-size 1 \

    --dtype bfloat16

# increase dp_size to enable data-parallel processing on multi-GPU 

```

The server should be accessible at `http://localhost:30000/v1`

## Dataset Preparation

Prepare the DeepCoder Preview Dataset:

```bash

cd examples/deepcoder

python prepare_deepcoder_data.py

```

This will:
- Download the agentica-org/DeepCoder-Preview-Dataset (lcbv5 subset)
- Register both train/test splits with the RLLM DatasetRegistry

## Running Inference

Once your model server is running and datasets are prepared, you can run inference:

```bash

cd examples/deepcoder

python run_deepcoder.py

```

### Configuration Options

You can modify the inference script parameters:

- `n_parallel_agents`: Number of parallel agents (default: 64)
- `model_name`: Model to use (default: "agentica-org/DeepCoder-14B-Preview")
- `base_url`: API server URL (default: "http://localhost:30000/v1")
- `max_response_length`: Maximum response length (default: 64000)
- `max_prompt_length`: Maximum prompt length (default: 2048)
- `temperature`: Sampling temperature (default: 0.6)
- `top_p`: Top-p sampling (default: 0.95)

The script will:
1. Load the DeepCoder Preview test dataset
2. Run parallel and async trajectory collection using the agent execution engine
3. Evaluate results and report accuracy metrics

## Training

### Basic Training

To train DeepCoder with iterative context lengthening (16K -> 32K -> 64K):

```bash

bash examples/deepcoder/train_deepcoder_16k.sh



# modify MODEL_PATH to the 16k checkpoint path before running the script.

bash examples/deepcoder/train_deepcoder_32k.sh

```