Instructions for using Leopo1d/OpenVul-Qwen3-4B-ORPO with libraries and local apps.

Libraries

How to use Leopo1d/OpenVul-Qwen3-4B-ORPO with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Leopo1d/OpenVul-Qwen3-4B-ORPO")
messages = [
    {"role": "user", "content": "Who are you?"},
]
# Print the result so the generated text is visible when run as a script
print(pipe(messages))

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Leopo1d/OpenVul-Qwen3-4B-ORPO")
model = AutoModelForCausalLM.from_pretrained("Leopo1d/OpenVul-Qwen3-4B-ORPO")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use Leopo1d/OpenVul-Qwen3-4B-ORPO with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Leopo1d/OpenVul-Qwen3-4B-ORPO"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Leopo1d/OpenVul-Qwen3-4B-ORPO",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

SGLang

How to use Leopo1d/OpenVul-Qwen3-4B-ORPO with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Leopo1d/OpenVul-Qwen3-4B-ORPO" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Leopo1d/OpenVul-Qwen3-4B-ORPO",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Leopo1d/OpenVul-Qwen3-4B-ORPO" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Leopo1d/OpenVul-Qwen3-4B-ORPO",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Leopo1d/OpenVul-Qwen3-4B-ORPO with Docker Model Runner:
```
docker model run hf.co/Leopo1d/OpenVul-Qwen3-4B-ORPO
```

OpenVul-Qwen3-4B-ORPO

OpenVul-Qwen3-4B-ORPO is a vulnerability detection LLM post-trained from OpenVul-Qwen3-4B-SFT-ep5. It is optimized to distinguish vulnerable code from its patched counterpart, and its preference optimization (ORPO) requires neither a reference model nor a reward model.
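To illustrate why no reference or reward model is needed, here is a minimal sketch of the ORPO objective as it is usually described: a standard SFT loss on the preferred response plus an odds-ratio penalty computed from the policy's own likelihoods. The function names, the weight `lam`, and the use of length-averaged log-probabilities are assumptions for illustration. This card does not state the exact loss or hyperparameters used to train this model.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Log-odds of a response: log(p / (1 - p)), with p = exp(avg_logp).

    avg_logp is the length-averaged token log-probability and must be < 0.
    """
    p = math.exp(avg_logp)
    return avg_logp - math.log1p(-p)


def orpo_loss(chosen_avg_logp: float, rejected_avg_logp: float, lam: float = 0.1) -> float:
    """Illustrative ORPO loss for one (chosen, rejected) pair.

    Only the policy's own log-probabilities appear here: there is no
    reference-model term and no reward-model score.
    """
    # SFT term: negative log-likelihood of the preferred response.
    sft_term = -chosen_avg_logp
    # Odds-ratio term: -log(sigmoid(log-odds ratio)) = log(1 + exp(-ratio)).
    ratio = log_odds(chosen_avg_logp) - log_odds(rejected_avg_logp)
    or_term = math.log1p(math.exp(-ratio))
    return sft_term + lam * or_term
```

In this sketch the loss falls as the policy assigns higher likelihood to the preferred response relative to the rejected one, which is the behavior a paired vulnerable/patched setup would rely on.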

📚 Data Curation:

Trained on paired chain-of-thought (CoT) responses sampled directly from the SFT LLM to minimize distribution shift.

💡 Key Feature:

Focuses on context-level vulnerability detection, using inter-procedural context (global variables, type definitions, callee functions, etc.) rather than isolated functions.

🔗 Related Links:

📄 Prompt Template (RECOMMENDED!):

We recommend using vLLM for inference. Set enable_thinking=True, n=8, repetition_penalty=1.0, temperature=0.6, top_p=0.95, top_k=20, min_p=0, and max_tokens=32768. More details can be found in the code.
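As a rough sketch, the recommended settings could be sent to a vLLM OpenAI-compatible server started with `vllm serve` as shown earlier. The helper names `build_chat_request` and `send_chat_request` are hypothetical. Passing `top_k`, `min_p`, `repetition_penalty`, and `chat_template_kwargs` as extra fields in the request body is an assumption about the vLLM version in use, so check the documentation for your installed version.

```python
import json
import urllib.request

MODEL_ID = "Leopo1d/OpenVul-Qwen3-4B-ORPO"


def build_chat_request(system_prompt: str, user_prompt: str) -> dict:
    """Build a /v1/chat/completions body with the recommended settings."""
    return {
        "model": MODEL_ID,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
        # Standard OpenAI-style sampling fields.
        "n": 8,
        "temperature": 0.6,
        "top_p": 0.95,
        "max_tokens": 32768,
        # Assumed vLLM-specific extras (not part of the OpenAI schema).
        "top_k": 20,
        "min_p": 0.0,
        "repetition_penalty": 1.0,
        "chat_template_kwargs": {"enable_thinking": True},
    }


def send_chat_request(body: dict, base_url: str = "http://localhost:8000") -> dict:
    """POST the body to a running server and return the parsed JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With `n=8`, the reply should contain eight sampled completions per prompt under `choices`. How those samples are aggregated is not specified here.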

System Prompt

You are a vulnerability detection expert specializing in identifying security flaws in C/C++ code, with a focus on Common Weakness Enumeration (CWE) standards. You provide precise, evidence-based analysis without speculation, and clearly label any vulnerabilities you detect.

User Prompt

Your task is to evaluate whether the following C/C++ code contains any security vulnerabilities.

You will be provided with two sections:
1. Context: Relevant code such as includes, type definitions, global variables, macros, and definitions of any functions called within the target function.
2. Code: The target function to analyze.

Use all available information to analyze the function step by step.
If the target function alone is insufficient to determine whether a vulnerability exists, refer to the Context section before making a judgment.
Do not assume vulnerabilities — only report what is supported by the code and context.

In your final response, list all detected vulnerabilities and CWE identifiers if applicable.
Conclude with one of the following indicators on a new line:
- HAS_VUL — if any vulnerabilities are found
- NO_VUL — if no vulnerabilities are found

```Context
{Context}
```
```Code
File: {Located File}
Method: {Function Name}
----------------------------------------
{Target Function}
```

Analyze the code now.
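The template above can be filled in and its result checked programmatically. The following is a minimal sketch, assuming the user prompt is stored verbatim in a string with its `{Context}`, `{Located File}`, `{Function Name}`, and `{Target Function}` placeholders intact. The helper names `fill_prompt` and `parse_verdict` are hypothetical, and stripping everything up to a closing `</think>` tag assumes Qwen3-style thinking output.

```python
import re
from typing import Optional


def fill_prompt(template: str, *, context: str, located_file: str,
                function_name: str, target_function: str) -> str:
    """Substitute the four placeholders of the user prompt in a single pass."""
    values = {
        "{Context}": context,
        "{Located File}": located_file,
        "{Function Name}": function_name,
        "{Target Function}": target_function,
    }
    # A single regex pass avoids re-substituting placeholder-like text
    # that happens to occur inside the inserted source code.
    pattern = re.compile("|".join(re.escape(key) for key in values))
    return pattern.sub(lambda match: values[match.group(0)], template)


def parse_verdict(response: str) -> Optional[str]:
    """Return 'HAS_VUL', 'NO_VUL', or None if no indicator is present."""
    # Ignore the reasoning segment if the model emitted one.
    answer = response.rsplit("</think>", 1)[-1]
    labels = re.findall(r"\b(HAS_VUL|NO_VUL)\b", answer)
    # The prompt asks for the indicator at the end, so take the last match.
    return labels[-1] if labels else None
```

For example, `parse_verdict("<think>...</think>Out-of-bounds write (CWE-787).\nHAS_VUL")` returns `"HAS_VUL"`, and a response without either indicator returns `None` so that malformed outputs can be handled separately.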

📎 Citation:

@misc{li2026sftrldemystifyingposttraining,
      title={From SFT to RL: Demystifying the Post-Training Pipeline for LLM-based Vulnerability Detection}, 
      author={Youpeng Li and Fuxun Yu and Xinda Wang},
      year={2026},
      eprint={2602.14012},
      archivePrefix={arXiv},
      primaryClass={cs.CR},
      url={https://arxiv.org/abs/2602.14012}, 
}