---
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
library_name: peft
pipeline_tag: text-generation
license: apache-2.0
tags:
- lora
- qlora
- peft
- qwen2.5
- mcp
- edge-ai
- offline-rag
---
# EdgeAI Docs Qwen2.5 Coder 7B Instruct (LoRA Adapter)
This repository contains a **LoRA adapter** (not full model weights) trained for an offline Edge AI + MCP documentation assistant workflow.
Base model:
- `Qwen/Qwen2.5-Coder-7B-Instruct`
## Intended use
- Use this adapter with a local RAG pipeline.
- Keep retrieval output as the factual source.
- Use the adapter for response behavior: format, citation style, and grounded answering.
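As a minimal sketch of how retrieval output can be kept as the factual source, the helper below (hypothetical, not shipped with this repo) assembles a chat message list that places retrieved chunks in the system prompt and asks the model to cite them by number:

```python
def build_messages(question: str, chunks: list[str]) -> list[dict]:
    """Assemble a chat message list grounded in retrieved documentation.

    `chunks` are the top-k passages from the local retriever; numbering
    them as [n] gives the model stable labels to cite in its answer.
    """
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    system = (
        "Answer only from the documentation excerpts below. "
        "Cite sources as [n]. If the excerpts do not contain the answer, say so.\n\n"
        + context
    )
    return [
        {"role": "system", "content": system},
        {"role": "user, ".strip(", ") or "user", "content": question},
    ][:1] + [{"role": "user", "content": question}]
```

The returned list can be passed directly to `tokenizer.apply_chat_template` in the Quick start below.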
## Training summary
- Train examples: `115`
- Eval examples: `13`
- Max steps: `30`
- Precision/load strategy: `QLoRA 4-bit (NF4), bf16 compute`
- Final eval loss: `0.0641`
- Device: `cuda` (profiled for a local GPU in the 8 GB VRAM class)
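The exact adapter hyperparameters are recorded in `adapter_config.json`. The fragment below is only an illustrative QLoRA-style `LoraConfig`; the rank, alpha, dropout, and target modules shown are typical values, not the ones used for this adapter:

```python
from peft import LoraConfig

# Illustrative values only -- see adapter_config.json for the real settings
lora_config = LoraConfig(
    r=16,                     # adapter rank (assumption, not this repo's value)
    lora_alpha=32,            # scaling factor (assumption)
    lora_dropout=0.05,        # (assumption)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # common choice
    task_type="CAUSAL_LM",
)
```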
## Files
- `adapter_model.safetensors`: trained LoRA adapter weights
- `adapter_config.json`: PEFT adapter config
- `tokenizer.json`, `tokenizer_config.json`, `chat_template.jinja`: tokenizer/chat formatting assets
- `run_summary.json`, `trainer_train_metrics.json`, `training_args.bin`: training metadata/artifacts
## Quick start
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
base_model = "Qwen/Qwen2.5-Coder-7B-Instruct"
adapter_repo = "eoinedge/EdgeAI-Docs-Qwen2.5-Coder-7B-Instruct"
bnb = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_quant_type="nf4",
bnb_4bit_use_double_quant=True,
bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
base_model,
quantization_config=bnb,
device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_repo)
tokenizer = AutoTokenizer.from_pretrained(base_model)
```
## Notes
- This adapter is optimized for docs-assistant behavior, not as a standalone factual memory.
- For best results, pair with MCP tools + document retrieval context.
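Because the adapter targets citation-style answering rather than standalone factual recall, a lightweight post-check can flag responses that cite none of the retrieved sources. This is a hypothetical sketch; the `[n]` citation format is an assumption of the RAG prompt, not something defined by this repo:

```python
import re

def cited_sources(answer: str, num_chunks: int) -> set[int]:
    """Return the set of in-range [n] citation indices found in an answer."""
    found = {int(m) for m in re.findall(r"\[(\d+)\]", answer)}
    return {n for n in found if 1 <= n <= num_chunks}

answer = "Set load_in_4bit=True [1] and use NF4 quantization [2]."
print(cited_sources(answer, num_chunks=3))  # -> {1, 2}
```

An empty result is a cheap signal that the response may be ungrounded and should be regenerated or flagged.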