Qwen2.5-7B-Instruct Stack & Merge (DBBench × ALFWorld)

This repository provides a merged full model created by stacking and merging two LoRA adapters (DBBench adapter + ALFWorld adapter) into the base model unsloth/Qwen2.5-7B-Instruct.

Unlike adapter-only repositories, this repo contains the merged model weights (e.g., model.safetensors), so you can use it directly with transformers without loading separate LoRA adapters.

How This Model Was Built

  1. Load base model: unsloth/Qwen2.5-7B-Instruct
  2. Load two LoRA adapters:
    • DBBench LoRA
    • ALFWorld LoRA
  3. Create a weighted combined adapter via linear stacking:

stacked = λ · DB + (1 − λ) · ALF

  4. Merge the stacked adapter into the base model (merge_and_unload) and save the merged weights.
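The stacking and merging steps above can be sketched numerically. This is a minimal NumPy illustration, not the actual merge code (which operates on every LoRA-targeted weight matrix, e.g. via PEFT's add_weighted_adapter and merge_and_unload): each adapter contributes a low-rank delta ΔW = (alpha/r) · B · A, the stacked adapter's effective delta is the λ-weighted sum of the two, and merging folds that delta into the base weight. The matrix sizes here are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

d, r, lam = 16, 4, 0.60     # toy hidden dim, toy LoRA rank, DB weight λ from the card
scale = 128 / 64            # alpha / r from the card: alpha=128, r=64

# Toy LoRA factors for the two adapters (the real ones live on the
# attention and MLP projections listed under "Target modules").
A_db, B_db = rng.normal(size=(r, d)), rng.normal(size=(d, r))
A_alf, B_alf = rng.normal(size=(r, d)), rng.normal(size=(d, r))

delta_db = scale * B_db @ A_db      # DBBench adapter's weight delta
delta_alf = scale * B_alf @ A_alf   # ALFWorld adapter's weight delta

# Linear stacking: the combined adapter's effective delta is the λ-weighted sum.
delta_stacked = lam * delta_db + (1 - lam) * delta_alf

# Merging (merge_and_unload) then folds the delta into the base weight,
# so no adapter needs to be loaded at inference time.
W_base = rng.normal(size=(d, d))
W_merged = W_base + delta_stacked
print(W_merged.shape)  # (16, 16)
```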

Stacking Weights

  • DB weight (λ): 0.60
  • ALF weight (1−λ): 0.40

Training Objective

This merged model is intended to improve multi-turn agent task performance across:

  • ALFWorld: household embodied tasks (observation → action → observation)
  • DBBench: database operation tasks (iterative SQL drafting and correction)

Loss during adapter training is applied to assistant turns in multi-turn trajectories, enabling the model to learn: environment interpretation, action selection, tool use patterns, and recovery from failures.
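Assistant-only loss can be pictured as a label mask: tokens from system, user, and environment turns get label -100 (the index ignored by the cross-entropy loss in PyTorch/transformers), while assistant tokens keep their ids. A minimal sketch with toy token lists follows; the helper function and token values are illustrative, not the actual training code.

```python
IGNORE_INDEX = -100  # label id ignored by transformers' cross-entropy loss

def mask_non_assistant(turns):
    """Build (input_ids, labels) from [(role, token_ids), ...] turns.

    Loss is computed only on assistant tokens; all other turns are masked out.
    """
    input_ids, labels = [], []
    for role, ids in turns:
        input_ids.extend(ids)
        labels.extend(ids if role == "assistant" else [IGNORE_INDEX] * len(ids))
    return input_ids, labels

# Toy multi-turn trajectory: observation -> action -> observation -> action
turns = [
    ("user", [101, 102, 103]),       # environment observation
    ("assistant", [201, 202]),       # model action (trained on)
    ("user", [104, 105]),            # next observation
    ("assistant", [203, 204, 205]),  # next action (trained on)
]
ids, labels = mask_non_assistant(turns)
print(labels)  # [-100, -100, -100, 201, 202, -100, -100, 203, 204, 205]
```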

Training Configuration (Adapters)

Both adapters were fine-tuned from the same base model and then merged.

  • Base model: unsloth/Qwen2.5-7B-Instruct
  • Method: LoRA (Unsloth)
  • Max sequence length: ALFWorld 3072, DBBench 768
  • Epochs: 1
  • Learning rate: 1e-05
  • LoRA: r=64, alpha=128
  • Target modules: q_proj,k_proj,v_proj,o_proj,gate_proj,up_proj,down_proj
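The adapter hyperparameters above map onto a PEFT LoraConfig roughly as follows. This is a hedged sketch: the adapters were trained with Unsloth, whose own API (e.g. FastLanguageModel.get_peft_model) takes equivalent arguments, so this fragment is for orientation rather than reproduction.

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,                # LoRA rank from the card
    lora_alpha=128,      # alpha from the card (scale = alpha / r = 2.0)
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
    ],
    task_type="CAUSAL_LM",
)
```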

Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Mountaingorillas/Qwen2.5-7B-Instruct-StackMerge-db0.60-alf0.40"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # A100/H100 recommended
    device_map="auto",
)

# Generate
prompt = "You are a helpful agent.\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```
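For the instruct model, prompts are usually best built via the chat template (tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)) rather than raw text. Under the hood, Qwen2.5's template renders ChatML-style markup; the pure-Python sketch below shows the simplified shape of that output (the real template also handles details like a default system message, so treat this as illustration only).

```python
def to_chatml(messages, add_generation_prompt=True):
    """Render messages into Qwen2.5's ChatML-style prompt (simplified sketch)."""
    out = ""
    for m in messages:
        out += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    if add_generation_prompt:
        out += "<|im_start|>assistant\n"  # cue the model to produce a reply
    return out

messages = [
    {"role": "system", "content": "You are a helpful agent."},
    {"role": "user", "content": "List the tables in the database."},
]
prompt = to_chatml(messages)
print(prompt)
```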