Instructions to use eoinedge/EdgeAI-Docs-Qwen2.5-Coder-7B-Instruct with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use eoinedge/EdgeAI-Docs-Qwen2.5-Coder-7B-Instruct with PEFT:
from peft import PeftModel from transformers import AutoModelForCausalLM base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-7B-Instruct") model = PeftModel.from_pretrained(base_model, "eoinedge/EdgeAI-Docs-Qwen2.5-Coder-7B-Instruct") - Notebooks
- Google Colab
- Kaggle
from peft import PeftModel
from transformers import AutoModelForCausalLM
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-Coder-7B-Instruct")
model = PeftModel.from_pretrained(base_model, "eoinedge/EdgeAI-Docs-Qwen2.5-Coder-7B-Instruct")EdgeAI Docs Qwen2.5 Coder 7B Instruct (LoRA Adapter)
This repository contains a LoRA adapter (not full model weights) trained for an offline Edge AI + MCP documentation assistant workflow.
Base model:
Qwen/Qwen2.5-Coder-7B-Instruct
Intended use
- Use this adapter with a local RAG pipeline.
- Keep retrieval output as the factual source.
- Use the adapter for response behavior: format, citation style, and grounded answering.
Training summary
- Train examples:
115 - Eval examples:
13 - Max steps:
30 - Precision/load strategy:
QLoRA 4-bit (NF4), bf16 compute - Final eval loss:
0.0641 - Device:
cuda(8GB VRAM class local GPU profile)
Files
adapter_model.safetensors: trained LoRA adapter weightsadapter_config.json: PEFT adapter configtokenizer.json,tokenizer_config.json,chat_template.jinja: tokenizer/chat formatting assetsrun_summary.json,trainer_train_metrics.json,training_args.bin: training metadata/artifacts
Quick start
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
base_model = "Qwen/Qwen2.5-Coder-7B-Instruct"
adapter_repo = "eoinedge/EdgeAI-Docs-Qwen2.5-Coder-7B-Instruct"
bnb = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_quant_type="nf4",
bnb_4bit_use_double_quant=True,
bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
base_model,
quantization_config=bnb,
device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_repo)
tokenizer = AutoTokenizer.from_pretrained(base_model)
Notes
- This adapter is optimized for docs-assistant behavior, not as a standalone factual memory.
- For best results, pair with MCP tools + document retrieval context.
- Downloads last month
- -
# Gated model: Login with a HF token with gated access permission hf auth login