log_copilot_32b_with_rag

A log-analysis copilot fine-tuned from Qwen/QwQ-32B with supervised fine-tuning (SFT) on RAG-style data. It is intended for log triage, troubleshooting, and root-cause analysis.

Create a virtual environment and install the dependencies

python3.12 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt

Log in to Hugging Face

hf auth login

Download the model

hf download bmwlab-ntust/log_copilot_32b_with_rag --local-dir /home/dgx/johnson/models/log_copilot_32b

Start API Server

python3 -m vllm.entrypoints.openai.api_server \
    --model /home/dgx/johnson/models/log_copilot_32b \
    --max-model-len 4096 \
    --dtype bfloat16 \
    --port 8000 \
    --gpu-memory-utilization 0.8
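
Loading a 32B model can take several minutes, so it may help to wait until the server answers before sending requests. The following is a minimal sketch, assuming the server is reachable at http://localhost:8000 and exposes the OpenAI-compatible /v1/models route. The function names and the retry settings are illustrative and not part of this repository.

```python
import json
import time
import urllib.request


def fetch_model_ids(base_url, timeout=5.0):
    """Return the model IDs listed by the server's /v1/models endpoint."""
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=timeout) as resp:
        body = json.load(resp)
    return [entry["id"] for entry in body.get("data", [])]


def wait_until_ready(base_url, attempts=60, delay=5.0, fetch=fetch_model_ids):
    """Poll the server; return its model IDs, or None if it never answers."""
    for _ in range(attempts):
        try:
            return fetch(base_url)
        except OSError:
            # Connection refused, timeouts and urllib's URLError are all OSError
            # subclasses; treat them as "not ready yet" and retry.
            time.sleep(delay)
    return None


if __name__ == "__main__":
    # Assumed address: matches --port 8000 from the command above.
    ids = wait_until_ready("http://localhost:8000")
    if ids is None:
        print("server did not come up")
    else:
        print("server ready, models:", ids)
```

The `fetch` parameter exists only so the polling loop can be exercised without a running server.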

Test the API

curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "/home/dgx/johnson/models/log_copilot_32b",
        "messages": [
            {"role": "user", "content": "Here is a snippet of OAI gNB error log: [ERROR] RRC Connection Reestablishment Reject. Which parameter configuration might be misconfigured, and how can I fix the gNB configuration?"}
        ],
        "temperature": 0.2,
        "max_tokens": 1024
    }'
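
The same request can be sent from Python using only the standard library. This is a sketch that mirrors the curl call above, assuming the server runs on localhost:8000 and was started with the same --model path. The helper names `build_payload` and `ask` are illustrative.

```python
import json
import urllib.request

# Assumptions: same model path and port as in the commands above.
MODEL = "/home/dgx/johnson/models/log_copilot_32b"
URL = "http://localhost:8000/v1/chat/completions"


def build_payload(prompt, model=MODEL, temperature=0.2, max_tokens=1024):
    """Build the JSON body for a single-turn chat completion request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


def ask(prompt, url=URL):
    """POST the prompt to the server and return the assistant's reply text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=300) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    print(ask(
        "Here is a snippet of OAI gNB error log: "
        "[ERROR] RRC Connection Reestablishment Reject. "
        "Which parameter configuration might be misconfigured, "
        "and how can I fix the gNB configuration?"
    ))
```

Note that the server was started with --max-model-len 4096, so the prompt and the 1024 generated tokens together have to fit within that limit.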

Monitor the GPU

watch -n 2 nvidia-smi
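
For a scriptable alternative to watching the full nvidia-smi table, the query mode can be parsed from Python. This is a rough sketch, assuming nvidia-smi is on the PATH and reports numeric values for every queried field (some devices report "[N/A]", which this parser does not handle). The function names are illustrative.

```python
import subprocess

# Fields requested from nvidia-smi, in the order the parser expects them.
QUERY = "index,memory.used,memory.total,utilization.gpu"


def parse_gpu_csv(text):
    """Parse `nvidia-smi --format=csv,noheader,nounits` output into dicts."""
    rows = []
    for line in text.strip().splitlines():
        index, used, total, util = [field.strip() for field in line.split(",")]
        rows.append({
            "index": int(index),
            "used_mib": int(used),
            "total_mib": int(total),
            "util_pct": int(util),
        })
    return rows


def read_gpus():
    """Run nvidia-smi once and return one dict per GPU."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        check=True, capture_output=True, text=True,
    )
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    for gpu in read_gpus():
        share = gpu["used_mib"] / gpu["total_mib"]
        print(f"GPU {gpu['index']}: {gpu['used_mib']}/{gpu['total_mib']} MiB "
              f"({share:.0%}), utilization {gpu['util_pct']}%")
```

With --gpu-memory-utilization 0.8, the memory share on the serving GPUs would be expected to sit near 80% once the model has loaded.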