---
base_model: unsloth/Qwen2.5-Coder-32B-Instruct-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- qwen2
- trl
license: apache-2.0
language:
- en
---

# Qwen2.5-Coder-32B-Instruct-WMX

LoRA adapters for unsloth/Qwen2.5-Coder-32B-Instruct.

**These LoRA adapters have been fine-tuned for WMX services using the following datasets:**

- https://huggingface.co/datasets/Jake5/movensys-info
- https://huggingface.co/datasets/Jake5/wmx-doc-user
- https://huggingface.co/datasets/Jake5/wmx-doc-robot

## Version v0.9

- Source: lora_model
- Base model: unsloth/Qwen2.5-Coder-32B-Instruct
- Uploaded on: 2025-09-12

## Usage

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model = AutoModelForCausalLM.from_pretrained("unsloth/Qwen2.5-Coder-32B-Instruct")
model = PeftModel.from_pretrained(
    base_model,
    "Jake5/Qwen2.5-Coder-32B-Instruct-WMX",
    subfolder="adapters_v0.9",
)
tokenizer = AutoTokenizer.from_pretrained(
    "Jake5/Qwen2.5-Coder-32B-Instruct-WMX",
    subfolder="adapters_v0.9",
)
```

## vLLM Serving

```bash
python -m vllm.entrypoints.openai.api_server \
    --model unsloth/Qwen2.5-Coder-32B-Instruct \
    --enable-lora \
    --lora-modules my-lora=Jake5/Qwen2.5-Coder-32B-Instruct-WMX/adapters_v0.9 \
    --dtype bfloat16 \
    --port 8000
```
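Once the server is up, the adapter can be queried through vLLM's OpenAI-compatible chat completions endpoint by passing the LoRA name (`my-lora`, as registered with `--lora-modules` above) as the model. A minimal sketch using only the standard library; the prompt text is a hypothetical example, and the host/port match the serving command above:

```python
import json
from urllib import request

# Chat completion request targeting the LoRA served as "my-lora"
# (the name given to --lora-modules in the vLLM command above).
payload = {
    "model": "my-lora",
    "messages": [
        # Hypothetical WMX-related prompt for illustration.
        {"role": "user", "content": "How do I configure a WMX robot service?"}
    ],
    "max_tokens": 256,
}

body = json.dumps(payload).encode("utf-8")
req = request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=body,
    headers={"Content-Type": "application/json"},
)
# With the server running, send the request and read the reply:
# with request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Any OpenAI-compatible client works the same way; only the `model` field needs to name the registered LoRA rather than the base model.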