# Qwen2.5-7B DB Bench Combined SFT (v1-v4)

This repository provides a full-weight model fine-tuned from Qwen2.5-7B-Instruct with LoRA via Unsloth, then merged to 16-bit weights.

## Training Objective
This model is trained to improve DB Bench (database operation) performance on the AgentBench evaluation benchmark. ALFWorld performance relies entirely on the base model's inherent capability (no ALFWorld training data used).
Loss is applied to all assistant turns in the multi-turn trajectory, enabling the model to learn SQL generation, action selection, and error recovery.
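For illustration, below is a minimal sketch of assistant-only loss masking over a multi-turn trajectory. The function and per-turn templating are assumptions for illustration, not the actual training code; real implementations must also handle chat-template framing tokens carefully.

```python
import torch

IGNORE_INDEX = -100  # label value ignored by PyTorch's cross-entropy loss


def build_labels(tokenizer, messages):
    """Tokenize a multi-turn trajectory, keeping loss only on assistant turns.

    Hypothetical sketch: renders each turn with the chat template and masks
    the labels of non-assistant turns with IGNORE_INDEX.
    """
    input_ids, labels = [], []
    for message in messages:
        ids = tokenizer.apply_chat_template([message], tokenize=True)
        input_ids.extend(ids)
        if message["role"] == "assistant":
            labels.extend(ids)  # supervised: SQL, actions, error recovery
        else:
            labels.extend([IGNORE_INDEX] * len(ids))  # context only, no loss
    return torch.tensor(input_ids), torch.tensor(labels)
```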
## Training Data
- DB Bench v1 (u-10bei/dbbench_sft_dataset_react): ~750 samples
- DB Bench v2 (u-10bei/dbbench_sft_dataset_react_v2): ~750 samples
- DB Bench v3 (u-10bei/dbbench_sft_dataset_react_v3): ~750 samples
- DB Bench v4 (u-10bei/dbbench_sft_dataset_react_v4): ~750 samples
- Total: ~3,000 samples (a loading sketch follows this list)
- ALFWorld data intentionally excluded to preserve base model performance
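A minimal sketch for assembling the combined corpus with the `datasets` library. The `train` split name is an assumption; check the dataset cards for the actual layout.

```python
from datasets import load_dataset, concatenate_datasets

versions = [
    "u-10bei/dbbench_sft_dataset_react",     # v1
    "u-10bei/dbbench_sft_dataset_react_v2",  # v2
    "u-10bei/dbbench_sft_dataset_react_v3",  # v3
    "u-10bei/dbbench_sft_dataset_react_v4",  # v4
]

# The "train" split is an assumption; adjust to the actual dataset layout.
combined = concatenate_datasets([load_dataset(v, split="train") for v in versions])
print(len(combined))  # expected to be on the order of ~3,000 samples
```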
## Training Configuration
- Base model: Qwen/Qwen2.5-7B-Instruct
- Method: LoRA → merged to 16-bit (a configuration sketch follows this list)
- Max sequence length: 2048
- Epochs: 2
- Learning rate: 2e-6
- LoRA: r=64, alpha=128
- Batch size: 2, Gradient accumulation: 4 (effective batch 8)
- Optimizer: AdamW (cosine scheduler)
- Framework: Unsloth
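As a sketch, the configuration above maps onto Unsloth roughly as follows. The `target_modules` choice and the output path are assumptions, not the exact training script.

```python
from unsloth import FastLanguageModel
from transformers import TrainingArguments

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Qwen/Qwen2.5-7B-Instruct",
    max_seq_length=2048,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=64,
    lora_alpha=128,
    # target_modules is an assumption; a common choice for Qwen2-family models.
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
args = TrainingArguments(
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,  # effective batch size 8
    num_train_epochs=2,
    learning_rate=2e-6,
    lr_scheduler_type="cosine",
    optim="adamw_torch",
    output_dir="outputs",  # assumed path
)
# After training (e.g. with trl's SFTTrainer), Unsloth can merge the adapter:
# model.save_pretrained_merged("merged", tokenizer, save_method="merged_16bit")
```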
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "koguma-ai/dbbench-combined-baseline0301"

# Load the merged 16-bit checkpoint; bfloat16 matches the merged precision.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```
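A quick generation check might look like this; the prompt and decoding settings are illustrative only.

```python
messages = [
    {"role": "user",
     "content": "Count the rows in the `orders` table."},  # example prompt
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```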
## Sources & Terms
Training data: u-10bei/dbbench_sft_dataset_react (v1-v4)
Dataset License: Apache-2.0. Users must comply with the Apache-2.0 license and the base model's original terms of use.
## Limitations
- Optimized for DB Bench tasks only
- ALFWorld performance relies on base model capability
- Weakest task categories: aggregation-MAX (16.7% success rate) and INSERT (33.3% success rate)