# qwen25_7b_agentbench_lora_trained
This repository provides a LoRA adapter fine-tuned from Qwen/Qwen2.5-7B-Instruct using Unsloth with a single-phase training recipe.

**Note:** This repository contains LoRA adapter weights only; the base model must be loaded separately.
## Training Objective
This adapter is trained to improve multi-turn agent task performance on ALFWorld (household tasks) and DBBench (database operations).
Loss is applied to all assistant turns in each multi-turn trajectory, so the model learns environment observation, action selection, tool use, and recovery from errors.
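The all-assistant-turns objective amounts to masking every non-assistant token out of the loss. A minimal sketch (illustrative only; real pipelines tokenize with the chat template and use the framework's collator):

```python
IGNORE_INDEX = -100  # labels with this value are excluded from the cross-entropy loss


def build_labels(turns):
    """Build (input_ids, labels) for one multi-turn trajectory.

    `turns` is a list of (role, token_ids) pairs. Loss is kept on every
    assistant turn and masked on user/system/environment turns.
    """
    input_ids, labels = [], []
    for role, ids in turns:
        input_ids.extend(ids)
        if role == "assistant":
            labels.extend(ids)  # learn from every assistant turn
        else:
            labels.extend([IGNORE_INDEX] * len(ids))  # mask everything else
    return input_ids, labels
```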
## Single-Phase Training Strategy
| Phase | Data | Epochs | LR | Purpose |
|---|---|---|---|---|
## Training Configuration
| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen2.5-7B-Instruct |
| Method | QLoRA (4-bit quantized base + FP16 LoRA) |
| LoRA R / Alpha | 64 / 128 |
| RSLoRA | Enabled |
| Max seq length | 4096 |
| Optimizer | adamw_8bit |
| Gradient clip | 1.0 |
| LR scheduler | Cosine with warmup |
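The cosine-with-warmup schedule in the table ramps the learning rate linearly from zero, then decays it along a cosine curve. A small sketch of that shape (the `total_steps`/`warmup_steps` values here are placeholders, not the run's actual settings):

```python
import math


def lr_at(step, total_steps, warmup_steps, peak_lr):
    """Learning rate at `step`: linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```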
## Training Results
| Phase | Final Train Loss | Time |
|---|---|---|
| Total | | 1.2h |
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen2.5-7B-Instruct"
adapter = "tomoniyukiwo/qwen25_7b_agentbench_lora_trained"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.float16,
    device_map="auto",
)
# Attach the LoRA adapter weights on top of the base model
model = PeftModel.from_pretrained(model, adapter)
```
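For prompting, Qwen2.5 uses the ChatML message layout; in practice you should call `tokenizer.apply_chat_template(messages, add_generation_prompt=True)`, but a rough sketch of the format it produces looks like this (approximation for illustration, not the exact template):

```python
def to_chatml(messages):
    """Render a list of {"role", "content"} dicts in ChatML-style layout.

    Rough sketch of Qwen2.5's chat format; prefer
    tokenizer.apply_chat_template(...) in real code.
    """
    prompt = ""
    for m in messages:
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    # Generation prompt: the model continues from the assistant header
    return prompt + "<|im_start|>assistant\n"
```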
### With Unsloth (faster)
```python
from unsloth import FastLanguageModel

# Unsloth loads the adapter repo and resolves the base model automatically
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="tomoniyukiwo/qwen25_7b_agentbench_lora_trained",
    max_seq_length=4096,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable optimized inference kernels
```
## Sources & Terms
- Training data: u-10bei/dbbench_sft_dataset_react_v4
- Dataset License: MIT License
- Base model: Qwen/Qwen2.5-7B-Instruct
- Compliance: Users must comply with the MIT license and the base model's original terms of use.