Neloy262/rust_instruction_dataset
Viewer • Updated • 10k • 56 • 3
LoRA fine-tuned version of Qwen2.5-Coder-14B-Instruct specifically optimized for Rust code generation. This model significantly improves Rust syntax understanding and generates 100% Rust code compared to the base model which sometimes generates Python/C++ code. Trained with Q-LoRA (4-bit quantization) on RTX 3090, achieving final loss of 0.5738.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch
bnb_config = BitsAndBytesConfig(
load_in_4bit=True,
bnb_4bit_use_double_quant=True,
bnb_4bit_quant_type="nf4",
bnb_4bit_compute_dtype=torch.bfloat16
)
base_model = "Qwen/Qwen2.5-Coder-14B-Instruct"
model = AutoModelForCausalLM.from_pretrained(base_model, quantization_config=bnb_config, device_map="auto")
model = PeftModel.from_pretrained(model, "huaiwuai/Qwen2.5-Coder-14B-Instruct-Rust-LoRA")
tokenizer = AutoTokenizer.from_pretrained(base_model)
Base model
Qwen/Qwen2.5-14B