helixql-nl2hql

This repository contains a PEFT LoRA adapter for Qwen/Qwen2.5-0.5B-Instruct.

Model Details

Parameter Value
Base model Qwen/Qwen2.5-0.5B-Instruct
Adapter repo Tranium/helixql-nl2hql
Training type qlora
Strategy chain SFT (3ep)
Dataset source local
Datasets train.jsonl, eval.jsonl
Batch Size 8
LoRA r 16
LoRA alpha 32
LoRA dropout 0.05
LoRA bias none
Target modules all-linear
DoRA False
rsLoRA False
Init gaussian

Training Details

Hyperparameter Value
epochs 3
learning_rate 0.0002
warmup_ratio 0.05
per_device_train_batch_size 8
gradient_accumulation_steps 4
effective_batch_size 32
optimizer paged_adamw_8bit
lr_scheduler cosine

Training Results

Run timeline

  • Started at: 2026-03-24T16:53:10.196819
  • Completed at: 2026-03-24T16:53:43.045308
Phase Strategy Status train_loss eval_loss global_step epoch runtime_s peak_mem_gb
0 sft completed 2.1116 — 12 3.00 30.5 3.75

Usage

Load as a PEFT adapter (recommended)

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_id = "Qwen/Qwen2.5-0.5B-Instruct"
adapter_id = "Tranium/helixql-nl2hql"

tokenizer = AutoTokenizer.from_pretrained(adapter_id, trust_remote_code=False)
model = AutoModelForCausalLM.from_pretrained(base_model_id, device_map="auto", torch_dtype="auto", trust_remote_code=False)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

prompt = "Hello!"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Merge adapter into base model (optional)

merged = model.merge_and_unload()
merged.save_pretrained("merged-model")
tokenizer.save_pretrained("merged-model")

Training Infrastructure

  • Platform: runpod
  • GPU: NVIDIA RTX A4000
Downloads last month

-

Downloads are not tracked for this model. How to track
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for Tranium/helixql-nl2hql

Adapter
(450)
this model