mark-22/alfworld_cleaned_for_agentbench_v4
Viewer • Updated • 2.43k • 30
How to use mark-22/qwen3-4b-agent-trajectory-lora with PEFT:
Task type is invalid.
How to use mark-22/qwen3-4b-agent-trajectory-lora with Unsloth Studio:
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for mark-22/qwen3-4b-agent-trajectory-lora to start chatting
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for mark-22/qwen3-4b-agent-trajectory-lora to start chatting
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for mark-22/qwen3-4b-agent-trajectory-lora to start chatting
pip install unsloth
from unsloth import FastModel
model, tokenizer = FastModel.from_pretrained(
model_name="mark-22/qwen3-4b-agent-trajectory-lora",
max_seq_length=2048,
)This repository provides a Dual-Skill LoRA adapter fine-tuned from Qwen/Qwen3-4B-Instruct-2507. It is specifically optimized for two distinct agentic tasks: Household operations (ALFWorld) and Database interactions (DBBench).
| Parameter | Value |
|---|---|
| Base model | Qwen/Qwen3-4B-Instruct-2507 |
| Hardware | NVIDIA A100 SXM4 40GB |
| Precision | bfloat16 |
| Max context length | 3072 tokens |
| Epochs | 2 |
| Learning rate | 2e-06 |
| Batch size (effective) | 8 |
| LoRA Rank / Alpha | r=64 / a=128 |
| Target Modules | All Linear Layers (Q,K,V,O,Gate,Up,Down) |
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
base = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "your_id/your-repo-name"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
base,
torch_dtype=torch.bfloat16,
device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)
mark-22/alfworld_cleaned_for_agentbench - Focused on household task completion and navigation.mark-22/dbbench_cleaned_for_agentbench - Focused on SQL generation and database manipulation (UPDATE/SELECT).This adapter is distributed under the Apache-2.0 license. Please ensure compliance with the base model's usage terms.
Base model
Qwen/Qwen3-4B-Instruct-2507