# qwen25_7b_agentbench_lora_trained
This repository provides a LoRA adapter fine-tuned from Qwen/Qwen2.5-7B-Instruct using Unsloth with a single-phase training recipe.

**Note:** This repository contains LoRA adapter weights only; the base model must be loaded separately.
## Training Objective
This adapter is trained to improve multi-turn agent task performance on ALFWorld (household tasks) and DBBench (database operations).
Loss is applied to all assistant turns in each multi-turn trajectory, so the model learns environment observation, action selection, tool use, and recovery from errors.
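The all-assistant-turns objective amounts to masking every non-assistant token out of the loss. A minimal sketch (illustrative only; real pipelines tokenize with the chat template and use the framework's collator):

```python
IGNORE_INDEX = -100  # labels with this value are excluded from the cross-entropy loss


def build_labels(turns):
    """Build (input_ids, labels) for one multi-turn trajectory.

    `turns` is a list of (role, token_ids) pairs. Loss is kept on every
    assistant turn and masked on user/system/environment turns.
    """
    input_ids, labels = [], []
    for role, ids in turns:
        input_ids.extend(ids)
        if role == "assistant":
            labels.extend(ids)  # learn from every assistant turn
        else:
            labels.extend([IGNORE_INDEX] * len(ids))  # mask everything else
    return input_ids, labels
```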
## Single-Phase Training Strategy
| Phase | Data | Epochs | LR | Purpose |
|---|---|---|---|---|
## Training Configuration
| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen2.5-7B-Instruct |
| Method | QLoRA (4-bit quantized base + FP16 LoRA) |
| LoRA R / Alpha | 64 / 128 |
| RSLoRA | Enabled |
| Max seq length | 4096 |
| Optimizer | adamw_8bit |
| Gradient clip | 1.0 |
| LR scheduler | Cosine with warmup |
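The cosine-with-warmup schedule in the table ramps the learning rate linearly from zero, then decays it along a cosine curve. A small sketch of that shape (the `total_steps`/`warmup_steps` values here are placeholders, not the run's actual settings):

```python
import math


def lr_at(step, total_steps, warmup_steps, peak_lr):
    """Learning rate at `step`: linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```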
## Training Results
| Phase | Final Train Loss | Time |
|---|---|---|
| Total | | 1.2h |
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base = "Qwen/Qwen2.5-7B-Instruct"
adapter = "tomoniyukiwo/qwen25_7b_agentbench_lora_trained"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.float16,
    device_map="auto",
)
# Attach the LoRA adapter weights on top of the base model
model = PeftModel.from_pretrained(model, adapter)
```
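For prompting, Qwen2.5 uses the ChatML message layout; in practice you should call `tokenizer.apply_chat_template(messages, add_generation_prompt=True)`, but a rough sketch of the format it produces looks like this (approximation for illustration, not the exact template):

```python
def to_chatml(messages):
    """Render a list of {"role", "content"} dicts in ChatML-style layout.

    Rough sketch of Qwen2.5's chat format; prefer
    tokenizer.apply_chat_template(...) in real code.
    """
    prompt = ""
    for m in messages:
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    # Generation prompt: the model continues from the assistant header
    return prompt + "<|im_start|>assistant\n"
```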
### With Unsloth (faster)
```python
from unsloth import FastLanguageModel

# Unsloth loads the adapter repo and resolves the base model automatically
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="tomoniyukiwo/qwen25_7b_agentbench_lora_trained",
    max_seq_length=4096,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable optimized inference kernels
```
## Sources & Terms
- Training data: u-10bei/dbbench_sft_dataset_react_v4
- Dataset License: MIT License
- Base model: Qwen/Qwen2.5-7B-Instruct
- Compliance: Users must comply with the MIT license and the base model's original terms of use.