---
license: apache-2.0
datasets:
  - AGI-Eval/PRDbench
language:
  - en
  - zh
metrics:
  - accuracy
  - code_eval
base_model:
  - Qwen/Qwen3-Coder-30B-A3B-Instruct
pipeline_tag: text-generation
tags:
  - code
---

# Model Deployment & Evaluation Guide

## 1. Deploy Model (vLLM OpenAI-Compatible Server)

Use vLLM to deploy a HuggingFace model as an OpenAI-compatible API server.

### Basic Command

```bash
python -m vllm.entrypoints.openai.api_server \
  --model <path_to_PRDJudge_model> \
  --served-model-name PRDJudge \
  --port 8004 \
  --enable-auto-tool-choice \
  --tool-call-parser qwen3_coder \
  --tensor-parallel-size <number_of_GPUs>
```

You can change `8004` to any other free port.
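Loading a large checkpoint can take several minutes, so the endpoint may not answer immediately after the command is launched. A minimal readiness-polling sketch, using only the Python standard library; `wait_for_server` and `http_ok` are hypothetical helpers for illustration, not part of the repo:

```python
import time
import urllib.request
from typing import Callable

def wait_for_server(fetch: Callable[[], bool],
                    timeout: float = 300.0,
                    interval: float = 5.0) -> bool:
    """Poll `fetch` until it returns True (server up) or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if fetch():
            return True
        time.sleep(interval)
    return False

def http_ok(url: str = "http://localhost:8004/v1/models") -> bool:
    """Return True if the endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except OSError:
        return False

# Usage: wait_for_server(http_ok)
```

The probe is injected as a callable so the wait loop can be reused with any health check (or a stub in tests).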

Once deployed, the service endpoint will be:

- Local access: `http://localhost:8004/v1`
- Cross-machine access: `http://<server_ip>:8004/v1`

You can verify the deployment with the following command:

```bash
curl http://localhost:8004/v1/models
```
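The same check can be scripted. An OpenAI-compatible `/v1/models` endpoint returns a list object whose `data` entries carry the served model IDs; the sketch below (hypothetical helpers, assuming the standard response shape) confirms that `PRDJudge` is listed:

```python
import json
import urllib.request

def served_model_ids(models_json: dict) -> list:
    """Extract model IDs from an OpenAI-compatible /v1/models response."""
    return [m["id"] for m in models_json.get("data", [])]

def check_deployment(base_url: str = "http://localhost:8004/v1",
                     expected: str = "PRDJudge") -> bool:
    """Fetch /v1/models and confirm the expected served model name is present."""
    with urllib.request.urlopen(f"{base_url}/models", timeout=5) as resp:
        payload = json.load(resp)
    return expected in served_model_ids(payload)
```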

## 2. Configure the Deployed Model in Minimal Agent-Eval

Follow our ADK-based agent configuration. Edit `EvalAgent/code_eval_agent/config.py` and add your model configuration:

```python
"your_model_name": LiteLlmWithSleep(
    model="openai/PRDJudge",               # "openai/" + the vLLM --served-model-name value
    api_base="http://<server_ip>:8004/v1", # vLLM service URL; use localhost when deployed locally
    api_key="EMPTY",                       # vLLM requires no API key by default, so use "EMPTY"
    max_tokens_threshold=64000,
    enable_compression=True,
    temperature=0.1
)
```

Note: The `model` field must include the `openai/` prefix; this is the LiteLLM routing format for OpenAI-compatible endpoints. The part after the prefix must match the name passed to the vLLM `--served-model-name` parameter (you can verify it via `curl http://<server_ip>:8004/v1/models`).
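Both constraints on the `model` field can be checked before launching an evaluation run. A minimal validation sketch; `validate_litellm_model` is a hypothetical helper, not part of the repo:

```python
def validate_litellm_model(model: str, served_name: str) -> None:
    """Check that a LiteLLM model string routes to the deployed vLLM model.

    LiteLLM uses the "openai/" prefix for OpenAI-compatible endpoints; the
    remainder must equal the name passed to vLLM's --served-model-name.
    Raises ValueError on any mismatch.
    """
    prefix, _, name = model.partition("/")
    if prefix != "openai" or not name:
        raise ValueError(f"expected 'openai/<served_name>', got {model!r}")
    if name != served_name:
        raise ValueError(f"model suffix {name!r} != served name {served_name!r}")

validate_litellm_model("openai/PRDJudge", "PRDJudge")  # passes silently
```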