PyTorch-RAG LoRA adapter
LoRA adapter for Qwen/Qwen2.5-Coder-7B-Instruct, fine-tuned for a
retrieval-augmented QA system over PyTorch documentation + StackOverflow.
Part of an HSE university project on RAG over a private knowledge base.
Training summary
- Base model: Qwen/Qwen2.5-Coder-7B-Instruct
- Method: LoRA (PEFT), bf16, RAG-aware SFT
- Rank / alpha / dropout: 16 / 32 / 0.05
- Target modules: down_proj, gate_proj, k_proj, o_proj, q_proj, up_proj, v_proj
- Trainable params: ~40M (0.53% of 7.66B)
- Dataset: ~1.8k StackOverflow PyTorch Q&A pairs, each enriched with top-k retrieved documentation chunks as context, plus ~15% adversarial "cannot answer from context" examples.
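The trainable-parameter figure above can be sanity-checked with a little arithmetic. The sketch below assumes the layer dimensions from the published Qwen2.5-7B configuration (hidden size 3584, MLP size 18944, 28 layers, 4 key/value heads of dimension 128); these numbers are not stated in this card, so treat them as assumptions. Only the rank, the target modules, and the 7.66B total come from the summary.

```python
# LoRA adds two matrices per adapted linear layer: A (r x d_in) and B (d_out x r),
# which is r * (d_in + d_out) extra trainable parameters per layer.

RANK = 16  # from the training summary

# Assumed Qwen2.5-7B dimensions (taken from the public model config, not from this card).
HIDDEN = 3584
INTERMEDIATE = 18944
NUM_LAYERS = 28
KV_DIM = 4 * 128  # 4 key/value heads x head_dim 128 (grouped-query attention)

# (d_in, d_out) of each target module inside one transformer block.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(d_in: int, d_out: int, rank: int = RANK) -> int:
    """Number of parameters in one LoRA pair (A and B) without bias terms."""
    return rank * (d_in + d_out)


per_layer = sum(lora_params(d_in, d_out) for d_in, d_out in TARGET_SHAPES.values())
total = per_layer * NUM_LAYERS
share = 100 * total / 7.66e9  # 7.66B is the total quoted in the summary

print(f"{total:,} trainable parameters ({share:.2f}% of 7.66B)")
# prints: 40,370,176 trainable parameters (0.53% of 7.66B)
```

Under these assumptions the count agrees with the "~40M (0.53%)" reported in the summary.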
Usage
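from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model in bf16, then attach the LoRA adapter on top of it.
# (Older transformers releases spell the dtype argument torch_dtype.)
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct", dtype="bfloat16", device_map="auto"
)
model = PeftModel.from_pretrained(base, "akeakeaki/pytorch-rag-lora-synth-v2")
tok = AutoTokenizer.from_pretrained("akeakeaki/pytorch-rag-lora-synth-v2")

# The retrieved chunks and the question go into a single user turn.
messages = [
    {"role": "system", "content": "You are an expert PyTorch assistant. "
                                  "Answer using ONLY the provided Context."},
    {"role": "user", "content": "Context:\n<retrieved chunks>\n\nQuestion: <q>"},
]

# return_dict=True yields input_ids plus attention_mask, so generate() receives both.
inputs = tok.apply_chat_template(
    messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
).to(model.device)
out = model.generate(**inputs, max_new_tokens=400)

# Decode only the newly generated tokens, skipping the prompt.
prompt_len = inputs["input_ids"].shape[1]
print(tok.decode(out[0][prompt_len:], skip_special_tokens=True))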
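In a real pipeline the `<retrieved chunks>` and `<q>` placeholders are filled from the retriever's output. A minimal sketch of that step follows; `build_messages` is a hypothetical helper, and joining chunks with a blank line is an assumption, since this card does not specify how chunks were separated during training.

```python
SYSTEM_PROMPT = (
    "You are an expert PyTorch assistant. "
    "Answer using ONLY the provided Context."
)


def build_messages(question: str, chunks: list[str]) -> list[dict]:
    """Format retrieved chunks and a question into the chat layout shown above.

    Hypothetical helper: the blank-line separator between chunks is an assumption.
    """
    context = "\n\n".join(chunk.strip() for chunk in chunks)
    user_turn = f"Context:\n{context}\n\nQuestion: {question.strip()}"
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_turn},
    ]
```

The returned list can be passed to `tok.apply_chat_template` in place of the hard-coded `messages`.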
Serving with vLLM
vllm serve Qwen/Qwen2.5-Coder-7B-Instruct \
    --enable-lora \
    --lora-modules pytorch-rag=akeakeaki/pytorch-rag-lora-synth-v2 \
    --max-lora-rank 16
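Once the server is running, the adapter is addressed through vLLM's OpenAI-compatible API by the name given to `--lora-modules` (`pytorch-rag` here). The sketch below uses only the standard library and assumes the server listens on vLLM's default `localhost:8000`; `build_request` and `ask` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"  # assumed: vLLM's default host and port
ADAPTER_NAME = "pytorch-rag"  # the alias passed to --lora-modules


def build_request(messages: list[dict], max_tokens: int = 400) -> urllib.request.Request:
    """Build a chat-completions request that selects the LoRA adapter by name."""
    payload = {"model": ADAPTER_NAME, "messages": messages, "max_tokens": max_tokens}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(messages: list[dict]) -> str:
    """Send the request to a running server and return the generated answer text."""
    with urllib.request.urlopen(build_request(messages)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Calling `ask(messages)` with the same system and user turns as in the Usage section returns the adapter's answer. Setting `"model"` to `Qwen/Qwen2.5-Coder-7B-Instruct` instead queries the base model, which is convenient for side-by-side comparison.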
Note on results
This is a research artifact. In our evaluation the v1 adapter did not
outperform the base model on the RAG task: the composite RAG score dropped
relative to the base-vanilla baseline, most likely because of a stylistic
shift toward terse StackOverflow answers and the limited training data. See
the project report for the full analysis. Use this adapter as a baseline or
starting point.