Qwen3-4B Agent SFT v9 (All Datasets + Optimized)
A LoRA adapter fine-tuned from Qwen/Qwen3-4B-Instruct-2507 using LoRA + Unsloth. This repository contains the LoRA adapter weights only; the base model must be loaded separately.
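Since only adapter weights are published here, inference requires loading the base model first and then attaching the adapter. A minimal sketch, assuming `transformers` and `peft` are installed (the repo ids are taken from this card; the helper name `load_model` is illustrative):

```python
# Minimal sketch: attach this repo's LoRA adapter to the base model.
# Assumptions: `transformers` and `peft` are installed; sufficient GPU/CPU
# memory is available for a 4B-parameter model.
BASE_MODEL = "Qwen/Qwen3-4B-Instruct-2507"
ADAPTER_REPO = "Chattso-GPT/adv-sft-v9"

def load_model():
    """Load the base model and tokenizer, then apply the adapter weights."""
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    model = AutoModelForCausalLM.from_pretrained(BASE_MODEL, device_map="auto")
    # PeftModel wraps the base model and injects the LoRA weights.
    model = PeftModel.from_pretrained(model, ADAPTER_REPO)
    return tokenizer, model

if __name__ == "__main__":
    tokenizer, model = load_model()
```

For deployment, `model.merge_and_unload()` can fold the adapter into the base weights so no `peft` dependency is needed at serving time.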
Training Configuration
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Method: LoRA (full precision) + Unsloth
- Datasets: ALFWorld v2-v5 (deduplicated, EEF) + DBBench v1-v9 (deduplicated, 2x upsampled)
- Max sequence length: 4096
- Epochs: 1
- Learning rate: 2e-6
- LoRA: r=64, alpha=128
- Scheduler: cosine with warmup 10%
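The hyperparameters above can be expressed as a `peft`/`transformers` configuration. This is a sketch under the assumption that a standard `LoraConfig` + `TrainingArguments` setup mirrors the Unsloth run; target modules and batch size are not stated on this card and are omitted:

```python
# Config fragment reflecting the hyperparameters listed above.
# Assumption: the run maps onto standard peft/transformers settings;
# unstated options (target modules, batch size) are left at defaults.
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=64,            # LoRA rank
    lora_alpha=128,  # scaling factor (alpha)
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="adv-sft-v9",
    num_train_epochs=1,
    learning_rate=2e-6,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,  # 10% warmup
)
```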
Sources & Terms
Dataset license: MIT.