# Qwen3-4B Agent SFT v7 (All Datasets + Optimized)

LoRA adapter for Qwen/Qwen3-4B-Instruct-2507, fine-tuned with Unsloth. This repository contains the LoRA adapter weights only, not the merged model.
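A loading snippet usually accompanies an adapter card. The following is a minimal sketch, assuming the standard `transformers` + `peft` APIs and the repo id `Chattso-GPT/adv-sft-v7` named further down this card; it is not an official usage example from the author.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "Qwen/Qwen3-4B-Instruct-2507"
adapter_id = "Chattso-GPT/adv-sft-v7"

# Load the base model (BF16, matching the adapter's tensor type),
# then attach the LoRA adapter on top of it.
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="bfloat16")
model = PeftModel.from_pretrained(model, adapter_id)

# Optional: merge the adapter into the base weights for adapter-free inference.
# model = model.merge_and_unload()
```

Merging is a one-way convenience for deployment; keep the adapter unmerged if you plan to stack or swap adapters.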

## Training Configuration

- Base model: Qwen/Qwen3-4B-Instruct-2507
- Method: LoRA (full precision) + Unsloth
- Datasets: ALFWorld v2-v5 (deduplicated, EEF) + DBBench v1-v7 (deduplicated, 2x upsampled)
- Max sequence length: 4096
- Epochs: 1
- Learning rate: 2e-6
- LoRA: r=64, alpha=128
- Scheduler: cosine with 10% warmup
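The hyperparameters above can be collected into a plain config sketch. Key names below are illustrative, not necessarily the exact Unsloth/TRL argument names the author used; note that with r=64 and alpha=128 the effective LoRA scaling factor alpha/r is 2.

```python
# Hyperparameters from the card, gathered into plain dicts.
# Key names are illustrative; the actual trainer arguments may differ.
lora_config = {
    "r": 64,            # LoRA rank
    "lora_alpha": 128,  # LoRA alpha
}

training_config = {
    "learning_rate": 2e-6,
    "num_train_epochs": 1,
    "max_seq_length": 4096,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.10,
}

# Effective scaling applied to the LoRA update (B @ A): alpha / r.
lora_scaling = lora_config["lora_alpha"] / lora_config["r"]
print(lora_scaling)  # → 2.0
```

A scaling of 2 means the adapter's low-rank update is doubled before being added to the frozen base weights.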

## Sources & Terms

Dataset license: MIT.

## Model Details

- Format: Safetensors
- Model size: 4B params
- Tensor type: BF16

## Model Tree

- Base model: Qwen/Qwen3-4B-Instruct-2507
- This model: Chattso-GPT/adv-sft-v7 (LoRA adapter)