# EdgeAI Docs Qwen2.5 Coder 7B Instruct (LoRA Adapter)

This repository contains a LoRA adapter (not full model weights) trained for an offline Edge AI + MCP documentation assistant workflow.

Base model: `Qwen/Qwen2.5-Coder-7B-Instruct`
## Intended use
- Use this adapter with a local RAG pipeline.
- Keep retrieval output as the factual source.
- Use the adapter for response behavior: format, citation style, and grounded answering.
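Grounded answering means the retrieved documentation, not the model's parametric memory, supplies the facts. A minimal sketch of assembling a grounded chat prompt from retrieval output (the helper name and the `[n]` citation convention here are illustrative, not part of this repo):

```python
def build_grounded_messages(question: str, retrieved_chunks: list[str]) -> list[dict]:
    """Assemble a chat message list that keeps retrieval output as the factual source.

    Hypothetical helper for illustration; adapt to your own RAG pipeline.
    """
    # Number each chunk so the model can cite it as [1], [2], ...
    context = "\n\n".join(f"[{i + 1}] {chunk}" for i, chunk in enumerate(retrieved_chunks))
    system = (
        "You are a documentation assistant. Answer only from the provided "
        "context and cite sources as [n]. If the context does not contain "
        "the answer, say so."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
    ]
```

The resulting message list can be fed directly to the tokenizer's chat template in the quick start below.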
## Training summary

- Train examples: 115
- Eval examples: 13
- Max steps: 30
- Precision/load strategy: QLoRA 4-bit (NF4), bf16 compute
- Final eval loss: 0.0641
- Device: cuda (8 GB VRAM class local GPU profile)
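The exact LoRA hyperparameters are recorded in `adapter_config.json` in this repo. For orientation, a QLoRA setup of this shape is typically declared with a PEFT `LoraConfig` such as the following; the rank, alpha, dropout, and target modules below are placeholders for illustration, not the values used for this adapter:

```python
from peft import LoraConfig

# Illustrative config only -- the real settings for this adapter live in
# adapter_config.json and may differ from these placeholder values.
lora_config = LoraConfig(
    r=16,                # adapter rank (placeholder)
    lora_alpha=32,       # LoRA scaling factor (placeholder)
    lora_dropout=0.05,   # dropout on adapter layers (placeholder)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # placeholder
    task_type="CAUSAL_LM",
)
```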
## Files

- `adapter_model.safetensors`: trained LoRA adapter weights
- `adapter_config.json`: PEFT adapter config
- `tokenizer.json`, `tokenizer_config.json`, `chat_template.jinja`: tokenizer/chat formatting assets
- `run_summary.json`, `trainer_train_metrics.json`, `training_args.bin`: training metadata/artifacts
## Quick start

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

base_model = "Qwen/Qwen2.5-Coder-7B-Instruct"
adapter_repo = "eoinedge/EdgeAI-Docs-Qwen2.5-Coder-7B-Instruct"

# Match the training setup: 4-bit NF4 quantization with bf16 compute
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_repo)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Minimal generation example
messages = [{"role": "user", "content": "How do I load this adapter?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
## Notes
- This adapter is optimized for docs-assistant behavior, not as a standalone factual memory.
- For best results, pair with MCP tools + document retrieval context.