u-10bei/sft_alfworld_trajectory_dataset_v5
Viewer • Updated • 2.5k • 742
How to use AF0815/agentbench with PEFT:
Task type is invalid.
How to use AF0815/agentbench with Unsloth Studio:
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for AF0815/agentbench to start chatting
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for AF0815/agentbench to start chatting
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for AF0815/agentbench to start chatting
pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
model_name="AF0815/agentbench",
max_seq_length=2048,
)irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for AF0815/agentbench to start chatting# No setup required# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for AF0815/agentbench to start chattingpip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
model_name="AF0815/agentbench",
max_seq_length=2048,
)This repository provides a LoRA adapter fine-tuned from Qwen/Qwen3-4B-Instruct-2507 using LoRA + Unsloth for AgentBench-style multi-turn agent trajectories.
This repository contains LoRA adapter weights only. The base model must be loaded separately.
This adapter is trained to improve multi-turn agent task performance on:
Loss is applied to all assistant turns in the trajectory, enabling the model to learn:
u-10bei/dbbench_sft_dataset_react_v4u-10bei/sft_alfworld_trajectory_dataset_v5Enabled: False
DB category weights used during training-data preparation:
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
base = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "AF0815/agentbench"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
base,
torch_dtype=torch.float16,
device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)
Training data:
Dataset license / terms:
Base model
Qwen/Qwen3-4B-Instruct-2507
Install Unsloth Studio (macOS, Linux, WSL)
# Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for AF0815/agentbench to start chatting