Qwen2.5-7B-Instruct Stack & Merge (DBBench × ALFWorld)

This repository provides a merged full model created by stacking and merging two LoRA adapters (DBBench adapter + ALFWorld adapter) into the base model unsloth/Qwen2.5-7B-Instruct.

Unlike adapter-only repositories, this repo contains the merged model weights (e.g., model.safetensors), so you can use it directly with transformers without loading separate LoRA adapters.

How This Model Was Built

Load base model: unsloth/Qwen2.5-7B-Instruct
Load two LoRA adapters:
- DBBench LoRA
- ALFWorld LoRA
Create a weighted combined adapter via linear stacking:

[ \text{stacked} = \lambda \cdot \text{DB} + (1-\lambda) \cdot \text{ALF} ]

Merge the stacked adapter into the base model (merge_and_unload) and save the merged weights.

Stacking Weights

DB weight (λ): 0.60
ALF weight (1−λ): 0.40

Training Objective

This merged model is intended to improve multi-turn agent task performance across:

ALFWorld: household embodied tasks (observation → action → observation)
DBBench: database operation tasks (iterative SQL drafting and correction)

Loss during adapter training is applied to assistant turns in multi-turn trajectories, enabling the model to learn: environment interpretation, action selection, tool use patterns, and recovery from failures.

Training Configuration (Adapters)

Both adapters were fine-tuned from the same base model and then merged.

Base model: unsloth/Qwen2.5-7B-Instruct
Method: LoRA (Unsloth)
Max sequence length: ALFWorld 3072, DBBench 768
Epochs: 1
Learning rate: 1e-05
LoRA: r=64, alpha=128
Target modules: q_proj,k_proj,v_proj,o_proj,gate_proj,up_proj,down_proj

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Mountaingorillas/Qwen2.5-7B-Instruct-StackMerge-db0.60-alf0.40"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # A100/H100 recommended
    device_map="auto",
)

# Generate
prompt = "You are a helpful agent.\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))

Downloads last month: -

Safetensors

Model size

8B params

Tensor type

BF16

Model tree for Mountaingorillas/Qwen2.5-7B-Instruct-StackMerge-db0.60-alf0.40

Base model

Qwen/Qwen2.5-7B

Finetuned

Qwen/Qwen2.5-7B-Instruct

Finetuned

unsloth/Qwen2.5-7B-Instruct

Finetuned

(1680)

this model