u-10bei/sft_alfworld_trajectory_dataset_v5
Viewer • Updated • 2.5k • 898
How to use UtsuSl0th/mixed-lora-1 with PEFT:
from peft import PeftModel
from transformers import AutoModelForCausalLM
base_model = AutoModelForCausalLM.from_pretrained("unsloth/Qwen2.5-7B-Instruct")
model = PeftModel.from_pretrained(base_model, "UtsuSl0th/mixed-lora-1")This repository provides a merged model fine-tuned from unsloth/Qwen2.5-7B-Instruct using LoRA + Unsloth.
Training data was built by mixing and preprocessing two trajectory datasets:
u-10bei/sft_alfworld_trajectory_dataset_v5): 2,327 samples after cleaningu-10bei/dbbench_sft_dataset_react_v4): 1,200 samples after cleaningCategory-level upsampling was applied to reinforce weak task types:
| Category | Multiplier |
|---|---|
| ALFWorld multi-object | ×3 |
| ALFWorld cool | ×2 |
| ALFWorld examine | ×1.5 |
| DBBench aggregation-MAX | ×3 |
| DBBench INSERT | ×2 |
| DBBench counting | ×2 |
Final dataset size: 5,169 samples
| Parameter | Value |
|---|---|
| Base model | unsloth/Qwen2.5-7B-Instruct |
| Method | LoRA + Unsloth (Colab Pro A100) |
| Max sequence length | 4096 |
| Epochs | 3 |
| Learning rate | 8e-6 |
| LoRA r / alpha | 64 / 128 |
| Effective batch size | 16 (bs=4 × grad_accum=4) |
Dataset License: MIT License. Users must comply with the MIT license and the base model's original terms of use.