Qwen2.5-7B-Instruct Stack & Merge (DBBench × ALFWorld)
This repository provides a merged full model created by stacking and merging two LoRA adapters (DBBench adapter + ALFWorld adapter) into the base model unsloth/Qwen2.5-7B-Instruct.
Unlike adapter-only repositories, this repo contains the merged model weights (e.g., model.safetensors),
so you can use it directly with transformers without loading separate LoRA adapters.
How This Model Was Built
- Load base model:
unsloth/Qwen2.5-7B-Instruct - Load two LoRA adapters:
- DBBench LoRA
- ALFWorld LoRA
- Create a weighted combined adapter via linear stacking:
[ \text{stacked} = \lambda \cdot \text{DB} + (1-\lambda) \cdot \text{ALF} ]
- Merge the stacked adapter into the base model (
merge_and_unload) and save the merged weights.
Stacking Weights
- DB weight (λ): 0.60
- ALF weight (1−λ): 0.40
Training Objective
This merged model is intended to improve multi-turn agent task performance across:
- ALFWorld: household embodied tasks (observation → action → observation)
- DBBench: database operation tasks (iterative SQL drafting and correction)
Loss during adapter training is applied to assistant turns in multi-turn trajectories, enabling the model to learn: environment interpretation, action selection, tool use patterns, and recovery from failures.
Training Configuration (Adapters)
Both adapters were fine-tuned from the same base model and then merged.
- Base model:
unsloth/Qwen2.5-7B-Instruct - Method: LoRA (Unsloth)
- Max sequence length: ALFWorld 3072, DBBench 768
- Epochs: 1
- Learning rate: 1e-05
- LoRA: r=64, alpha=128
- Target modules:
q_proj,k_proj,v_proj,o_proj,gate_proj,up_proj,down_proj
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model_id = "Mountaingorillas/Qwen2.5-7B-Instruct-StackMerge-db0.60-alf0.40"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
model_id,
torch_dtype=torch.bfloat16, # A100/H100 recommended
device_map="auto",
)
# Generate
prompt = "You are a helpful agent.\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
- Downloads last month
- -