---
base_model: Qwen/Qwen2.5-Coder-7B-Instruct
library_name: peft
pipeline_tag: text-generation
license: apache-2.0
tags:
  - lora
  - qlora
  - peft
  - qwen2.5
  - mcp
  - edge-ai
  - offline-rag
---

# EdgeAI Docs Qwen2.5 Coder 7B Instruct (LoRA Adapter)

This repository contains a LoRA adapter only (not full model weights), trained for an offline Edge AI + Model Context Protocol (MCP) documentation-assistant workflow.

## Base model

- [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct)

## Intended use

- Use this adapter with a local RAG pipeline.
- Keep retrieval output as the factual source.
- Use the adapter for response behavior: formatting, citation style, and grounded answering.
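The retrieval-as-factual-source pattern above can be sketched as a plain prompt builder. This is a minimal illustration only; the chunk format, citation markers, and example file paths are assumptions, not part of this repository:

```python
def build_grounded_prompt(question, chunks):
    """Assemble a prompt that keeps retrieved docs as the factual source.

    `chunks` is a list of (source_id, text) pairs from a local retriever.
    The [n] markers let the model cite its sources in the answer.
    """
    context = "\n\n".join(
        f"[{i + 1}] ({source}) {text}" for i, (source, text) in enumerate(chunks)
    )
    return (
        "Answer using ONLY the context below. Cite sources as [n].\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

prompt = build_grounded_prompt(
    "How do I enable the MCP server?",
    [("docs/mcp.md", "Start the MCP server with `mcp serve --port 8080`.")],
)
print(prompt)
```

The adapter then only shapes *how* the answer is written; the facts come from the retrieved chunks.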

## Training summary

- Train examples: 115
- Eval examples: 13
- Max steps: 30
- Precision/load strategy: QLoRA 4-bit (NF4), bf16 compute
- Final eval loss: 0.0641
- Device: CUDA (8 GB VRAM-class local GPU)
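As a rough sanity check on why QLoRA 4-bit fits an 8 GB VRAM profile, here is a back-of-envelope estimate of the quantized base-model footprint. The parameter count and overhead factor are approximations, not measured values:

```python
# Back-of-envelope memory estimate for a ~7B-parameter model in NF4.
params = 7.6e9            # approximate parameter count of Qwen2.5-Coder-7B
bits_per_param = 4        # NF4 quantization
quant_overhead = 1.1      # rough factor for quantization constants etc.

weights_gib = params * bits_per_param / 8 / 2**30 * quant_overhead
print(f"~{weights_gib:.1f} GiB for 4-bit weights")
```

That leaves headroom on an 8 GB card for the LoRA adapter, optimizer state on the adapter parameters, and activations at modest sequence lengths.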

## Files

- `adapter_model.safetensors`: trained LoRA adapter weights
- `adapter_config.json`: PEFT adapter config
- `tokenizer.json`, `tokenizer_config.json`, `chat_template.jinja`: tokenizer/chat formatting assets
- `run_summary.json`, `trainer_train_metrics.json`, `training_args.bin`: training metadata/artifacts
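The adapter hyperparameters live in `adapter_config.json`. A sketch of inspecting such a file; note the values below are illustrative placeholders, not the actual settings shipped in this repository:

```python
import json

# Illustrative adapter_config.json contents; these values are hypothetical,
# not the actual configuration of this adapter.
example_config = json.loads("""
{
  "peft_type": "LORA",
  "r": 16,
  "lora_alpha": 32,
  "lora_dropout": 0.05,
  "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"]
}
""")

def lora_scaling(cfg):
    # PEFT applies the low-rank update scaled by lora_alpha / r.
    return cfg["lora_alpha"] / cfg["r"]

print(f"rank={example_config['r']}, scaling={lora_scaling(example_config)}")
```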

## Quick start

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

base_model = "Qwen/Qwen2.5-Coder-7B-Instruct"
adapter_repo = "eoinedge/EdgeAI-Docs-Qwen2.5-Coder-7B-Instruct"

# 4-bit NF4 quantization with bf16 compute (matches the training setup).
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb,
    device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter_repo)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Generate with the chat template; put retrieved docs in the user turn.
messages = [
    {
        "role": "user",
        "content": "Context:\n<retrieved docs here>\n\nQuestion: How do I start the MCP server?",
    },
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

## Notes

- This adapter is optimized for docs-assistant behavior, not as a standalone factual memory.
- For best results, pair it with MCP tools and document retrieval context.