# qwen25_7b_agentbench_lora_trained

This repository provides a LoRA adapter fine-tuned from Qwen/Qwen2.5-7B-Instruct with LoRA via Unsloth, using a single-phase training strategy.

Note: This repository contains LoRA adapter weights only. The base model must be loaded separately.

## Training Objective

This adapter is trained to improve multi-turn agent task performance on ALFWorld (household tasks) and DBBench (database operations).

Loss is applied to all assistant turns in the multi-turn trajectory, enabling the model to learn environment observation, action selection, tool use, and recovery from errors.
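The repository does not publish the masking code, but "loss on all assistant turns" typically means copying the input IDs into the labels and setting every non-assistant token to `-100`, the ignore index used by cross-entropy loss in most trainers. A minimal sketch, with hypothetical token IDs and turn spans:

```python
IGNORE_INDEX = -100  # ignored by cross-entropy loss in most trainers

def mask_labels(token_ids, turn_spans):
    """Copy token_ids into labels, keeping loss only on assistant spans.

    turn_spans: list of (role, start, end) half-open index ranges.
    """
    labels = [IGNORE_INDEX] * len(token_ids)
    for role, start, end in turn_spans:
        if role == "assistant":
            labels[start:end] = token_ids[start:end]
    return labels

# Toy trajectory: user turn (tokens 0-3), assistant turn (3-6),
# environment observation (6-8), assistant turn (8-10).
ids = [11, 12, 13, 21, 22, 23, 31, 32, 41, 42]
spans = [("user", 0, 3), ("assistant", 3, 6),
         ("tool", 6, 8), ("assistant", 8, 10)]
print(mask_labels(ids, spans))
# Only the two assistant spans keep their token IDs; the rest are -100.
```

Because intermediate environment observations are masked rather than dropped, the model still conditions on them while learning to produce only the assistant actions.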

## Single-Phase Training Strategy

| Phase | Data | Epochs | LR | Purpose |
|-------|------|--------|----|---------|

## Training Configuration

| Parameter | Value |
|-----------|-------|
| Base model | Qwen/Qwen2.5-7B-Instruct |
| Method | QLoRA (4-bit quantized base + FP16 LoRA adapters) |
| LoRA r / alpha | 64 / 128 |
| RSLoRA | Enabled |
| Max seq length | 4096 |
| Optimizer | adamw_8bit |
| Gradient clip | 1.0 |
| LR scheduler | Cosine with warmup |
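The practical effect of enabling RSLoRA is in the adapter's scaling factor: standard LoRA scales the update by alpha/r, while rank-stabilized LoRA scales by alpha/sqrt(r), which keeps update magnitudes from shrinking at higher ranks. With r = 64 and alpha = 128 as configured above:

```python
import math

r, alpha = 64, 128

standard_scale = alpha / r            # classic LoRA scaling
rslora_scale = alpha / math.sqrt(r)   # rank-stabilized LoRA scaling

print(standard_scale)  # 2.0
print(rslora_scale)    # 16.0
```

So at this rank, RSLoRA applies an 8x larger effective scale than standard LoRA would for the same alpha.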

## Training Results

| Phase | Final Train Loss | Time |
|-------|------------------|------|
| Total | — | 1.2h |

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen2.5-7B-Instruct"
adapter = "tomoniyukiwo/qwen25_7b_agentbench_lora_trained"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.float16,
    device_map="auto",
)
# Attach the LoRA adapter weights to the base model
model = PeftModel.from_pretrained(model, adapter)
```
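For inference, prompts should follow the model's chat format; in practice you would call `tokenizer.apply_chat_template(messages, add_generation_prompt=True)`. Qwen2.5 uses the ChatML format, which the hypothetical helper below reproduces by hand purely for illustration:

```python
def chatml_prompt(messages):
    """Render messages in Qwen2.5's ChatML format, ending with an
    open assistant turn for the model to complete."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
             for m in messages]
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = chatml_prompt([
    {"role": "system", "content": "You are an agent in a household environment."},
    {"role": "user", "content": "Task: put a clean mug on the coffee table."},
])
print(prompt)
```

With the model loaded above, tokenize the prompt (`tokenizer(prompt, return_tensors="pt")`) and pass it to `model.generate`. Prefer the tokenizer's built-in chat template over hand-built strings, since it is guaranteed to match the format seen during training.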

### With Unsloth (faster)

```python
from unsloth import FastLanguageModel

# Unsloth resolves the base model from the adapter config automatically
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="tomoniyukiwo/qwen25_7b_agentbench_lora_trained",
    max_seq_length=4096,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)
```
