οΌœγ€θͺ²ι‘Œγ€‘ここはθ‡ͺεˆ†γ§θ¨˜ε…₯γ—γ¦δΈ‹γ•γ„οΌž

This repository provides a LoRA adapter for Qwen/Qwen3-4B-Instruct-2507, fine-tuned with Unsloth.

This repository contains LoRA adapter weights only. The base model must be loaded separately.
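Since only the adapter weights are stored here, loading takes two steps: load the base model, then attach the adapter. The sketch below shows one way to do this, assuming the `transformers` and `peft` packages are installed. The adapter ID is taken from this repository's name, the `load_model` helper is a hypothetical name introduced for illustration, and dtype and device placement are left at library defaults because the card does not specify them.

```python
BASE_MODEL_ID = "Qwen/Qwen3-4B-Instruct-2507"
# Assumption: the adapter ID is this repository's name on the Hub.
ADAPTER_ID = "KOUJI039/structeval-qwen3-4b-sft-try54"


def load_model(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model, then attach the LoRA adapter weights to it."""
    # Imported lazily so this file can be imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    # The adapter repository holds only LoRA weights; PEFT wraps the base model with them.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    return tokenizer, model
```

Calling `tokenizer, model = load_model()` downloads both the base model and the adapter on first use, so it needs network access and enough memory for a 4B-parameter model.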

Training Configuration

  • Base model: Qwen/Qwen3-4B-Instruct-2507
  • Max sequence length: 2048
  • Epochs: 2
  • Learning rate: 2e-06
  • LoRA: r=64, alpha=128
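To make the LoRA settings concrete, the values above can be collected into a small config object. This is a minimal sketch: the `TrainingConfig` class and its field names are hypothetical, not taken from the training code. The scaling factor assumes standard LoRA, where the low-rank update is multiplied by alpha / r. The card does not say whether a variant such as rank-stabilized LoRA was used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical container for the hyperparameters listed in this card."""
    base_model: str = "Qwen/Qwen3-4B-Instruct-2507"
    max_seq_length: int = 2048
    epochs: int = 2
    learning_rate: float = 2e-6
    lora_r: int = 64
    lora_alpha: int = 128

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA scales the low-rank update B @ A by alpha / r.
        return self.lora_alpha / self.lora_r


config = TrainingConfig()
print(config.lora_scaling)  # 2.0
```

With r=64 and alpha=128, the adapter update is applied at twice its raw magnitude under this assumption.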

Sources & Terms (IMPORTANT)

  • Training data (YAML): u-10bei/sft_alfworld_trajectory_dataset_v5
  • Training data (actually used by training code): /content/mix_alf_v5_dbweak30.jsonl

Note: Hugging Face YAML metadata does not allow local file paths in the datasets: field. A locally derived JSONL used for training (e.g., /content/alf_try_balanced.jsonl) should be described here, as above, not in the YAML.
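The split between Hub dataset IDs (which belong in the YAML) and local files (which belong in this section) can be sketched as a small helper. Everything here is hypothetical: the `split_dataset_sources` function, the regular expression, and the suffix list are rough approximations for illustration, not the Hub's actual validation rules.

```python
import re

# Rough shape of a Hub repo ID: "name" or "namespace/name".
# This is an approximation for illustration, not the Hub's validation rule.
_REPO_ID = re.compile(r"^[A-Za-z0-9][\w.\-]*(/[A-Za-z0-9][\w.\-]*)?$")
_DATA_FILE_SUFFIXES = (".jsonl", ".json", ".csv", ".parquet")


def split_dataset_sources(sources):
    """Return (hub_ids, local_paths) for a list of dataset source strings."""
    hub_ids, local_paths = [], []
    for src in sources:
        # Treat anything path-like or ending in a data-file suffix as local.
        looks_local = src.startswith(("/", "./", "~")) or src.endswith(_DATA_FILE_SUFFIXES)
        if not looks_local and _REPO_ID.match(src):
            hub_ids.append(src)
        else:
            local_paths.append(src)
    return hub_ids, local_paths


hub_ids, local_paths = split_dataset_sources([
    "u-10bei/sft_alfworld_trajectory_dataset_v5",
    "/content/mix_alf_v5_dbweak30.jsonl",
])
```

Here `hub_ids` holds the entry suitable for the YAML datasets: field, and `local_paths` holds the entry that should only be described in the card text.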


Model tree: KOUJI039/structeval-qwen3-4b-sft-try54 is an adapter of the base model.