(Please fill in the task name yourself.)
This repository provides a LoRA adapter fine-tuned from Qwen/Qwen3-4B-Instruct-2507 using Unsloth.
This repository contains LoRA adapter weights only. The base model must be loaded separately.
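Because only the adapter weights are stored here, inference requires loading the base model first and attaching the adapter to it. A minimal sketch using `transformers` and `peft` (the two repo IDs come from this card; everything else, including the lazy-loading structure, is an illustrative assumption, not the author's script):

```python
# Repo IDs taken from this model card; the rest is an illustrative sketch.
BASE_MODEL = "Qwen/Qwen3-4B-Instruct-2507"
ADAPTER_REPO = "KOUJI039/structeval-qwen3-4b-sft-try54"

def load_adapter_model():
    # Imports are local so the file can be read without the libraries installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    base = AutoModelForCausalLM.from_pretrained(BASE_MODEL, device_map="auto")
    # Attach this repository's LoRA weights on top of the frozen base model.
    model = PeftModel.from_pretrained(base, ADAPTER_REPO)
    return tokenizer, model

# Usage (downloads the ~4B-parameter base model; a GPU is strongly recommended):
# tokenizer, model = load_adapter_model()
```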
Training Configuration
- Base model: Qwen/Qwen3-4B-Instruct-2507
- Max sequence length: 2048
- Epochs: 2
- Learning rate: 2e-06
- LoRA: r=64, alpha=128
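The hyperparameters above map onto an Unsloth training setup roughly as follows. This is a sketch under stated assumptions, not the author's actual training script: only the values listed in the card (base model, sequence length, epochs, learning rate, r, alpha) are from the source, and the trainer wiring via TRL's `SFTTrainer` is an assumption:

```python
# Hyperparameters copied from the card; collected in one place for clarity.
HPARAMS = {
    "base_model": "Qwen/Qwen3-4B-Instruct-2507",
    "max_seq_length": 2048,
    "epochs": 2,
    "learning_rate": 2e-6,
    "lora_r": 64,
    "lora_alpha": 128,
}

def build_trainer(train_dataset):
    # Imports are local so the file can be read without the libraries installed.
    from unsloth import FastLanguageModel
    from trl import SFTConfig, SFTTrainer

    # Load the base model with Unsloth's fast loader.
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=HPARAMS["base_model"],
        max_seq_length=HPARAMS["max_seq_length"],
    )
    # Wrap it with LoRA adapters (r=64, alpha=128 as listed above).
    model = FastLanguageModel.get_peft_model(
        model,
        r=HPARAMS["lora_r"],
        lora_alpha=HPARAMS["lora_alpha"],
    )
    return SFTTrainer(
        model=model,
        tokenizer=tokenizer,
        train_dataset=train_dataset,
        args=SFTConfig(
            num_train_epochs=HPARAMS["epochs"],
            learning_rate=HPARAMS["learning_rate"],
        ),
    )
```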
Sources & Terms (IMPORTANT)
- Training data (YAML): u-10bei/sft_alfworld_trajectory_dataset_v5
- Training data (actually used by training code): /content/mix_alf_v5_dbweak30.jsonl
Note: Hugging Face YAML metadata does not allow local file paths under datasets:.
If training used a locally derived JSONL (e.g., /content/alf_try_balanced.jsonl), document it here in the card body (as above), not in the YAML.
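To make the note above concrete, the card's YAML front matter would list only the published Hub dataset, while the local JSONL stays documented in the card body. A sketch (field layout follows the Hub metadata conventions; it is not this card's actual front matter):

```yaml
# Model-card front matter sketch: only Hub dataset IDs belong under `datasets:`.
base_model: Qwen/Qwen3-4B-Instruct-2507
datasets:
  - u-10bei/sft_alfworld_trajectory_dataset_v5
# A local path like /content/mix_alf_v5_dbweak30.jsonl must NOT appear here;
# describe it in the card body instead.
```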
Model tree for KOUJI039/structeval-qwen3-4b-sft-try54
- Base model: Qwen/Qwen3-4B-Instruct-2507