# Qwen2.5-7B-Agent-Trajectory-LoRA (AWQ 4-bit)
This repository provides an AWQ 4-bit quantized version of UtsuSl0th/trajectory-lora-repo, a LoRA adapter fine-tuned from unsloth/Qwen2.5-7B-Instruct with Unsloth and subsequently merged into a single standalone model.
Note: This is the quantized, ready-to-run version. The original LoRA adapter weights (non-quantized) are available at the base repository above.
## What is AWQ?
AWQ (Activation-aware Weight Quantization) is a 4-bit quantization method that preserves model quality by identifying and protecting the most important weights. This quantization was performed using AutoAWQ with the following configuration:
| Parameter | Value |
|---|---|
| Bits | 4 |
| Group size | 128 |
| Zero point | True |
| Version | GEMM |
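The table above maps directly onto an AutoAWQ `quant_config` dictionary. The following is a sketch of that recipe, not the exact script used to produce this repository; the commented call sequence assumes access to the merged FP16 model and a GPU:

```python
# Quantization recipe (sketch): the table above expressed as an
# AutoAWQ quant_config dictionary.
quant_config = {
    "w_bit": 4,           # Bits
    "q_group_size": 128,  # Group size
    "zero_point": True,   # Zero point
    "version": "GEMM",    # Kernel version
}

# Applied roughly as follows (requires the merged FP16 model and a GPU):
#   from awq import AutoAWQForCausalLM
#   from transformers import AutoTokenizer
#   model = AutoAWQForCausalLM.from_pretrained(fp16_model_path)
#   tokenizer = AutoTokenizer.from_pretrained(fp16_model_path)
#   model.quantize(tokenizer, quant_config=quant_config)
#   model.save_quantized(output_path)
```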
## Training Objective (Original Model)
The source adapter was trained to improve multi-turn agent task performance on ALFWorld (household tasks) and DBBench (database operations).
Loss was applied to all assistant turns in the multi-turn trajectory, so the model learns to interpret environment observations, select actions, call tools, and recover from errors.
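In practice, "loss on all assistant turns" means the labels for non-assistant tokens are set to the ignore index (-100) so cross-entropy skips them. Below is a minimal, self-contained sketch of that masking; the token ids and the `build_labels` helper are illustrative (the actual pipeline tokenizes with the Qwen2.5 chat template):

```python
# Sketch of supervision masking for a multi-turn agent trajectory.
# Tokens from assistant turns keep their ids as labels (loss applies);
# everything else is replaced with -100 (ignored by cross-entropy).
IGNORE_INDEX = -100

def build_labels(turns):
    """turns: list of (role, token_ids). Returns (input_ids, labels)."""
    input_ids, labels = [], []
    for role, token_ids in turns:
        input_ids.extend(token_ids)
        if role == "assistant":
            labels.extend(token_ids)  # supervised span
        else:
            labels.extend([IGNORE_INDEX] * len(token_ids))  # masked span
    return input_ids, labels

# Illustrative trajectory (token ids are fake):
trajectory = [
    ("user", [101, 102]),            # environment observation
    ("assistant", [201, 202, 203]),  # action / tool call
    ("user", [103]),                 # new observation
    ("assistant", [204, 205]),       # recovery / next action
]
ids, labels = build_labels(trajectory)
```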
## Training Configuration (Original LoRA)
| Parameter | Value |
|---|---|
| Base model | unsloth/Qwen2.5-7B-Instruct |
| Method | LoRA + Unsloth (Colab Pro A100) |
| Dataset | u-10bei/sft_alfworld_trajectory_dataset_v5 |
| Max sequence length | 4096 |
| Epochs | 2 |
| Learning rate | 2e-5 |
| LoRA r | 64 |
| LoRA alpha | 128 |
| LoRA dropout | 0 |
| LoRA target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Per-device batch size | 4 |
| Gradient accumulation | 4 (effective batch size: 16) |
| Warmup ratio | 0.1 |
| Weight decay | 0.05 |
| Seed | 3407 |
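The LoRA hyperparameters above can be expressed as a `peft` `LoraConfig` for reproduction. This is a hedged reconstruction, not the original training script: the run used Unsloth's wrappers, so the exact call sites differ, and only the values shown in the table are taken from the source.

```python
# Reconstruction of the adapter configuration in peft terms (sketch).
# Note alpha/r = 128/64 = 2, so adapter updates are scaled by 2.
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.0,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```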
## Usage

### With vLLM (recommended: fastest inference)
```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="UtsuSl0th/trajectory-lora-repo-AWQ",
    quantization="awq",
    dtype="auto",
)

sampling_params = SamplingParams(temperature=0.7, max_tokens=512)
outputs = llm.generate(["Your prompt here"], sampling_params)
print(outputs[0].outputs[0].text)
```
### With AutoAWQ + Transformers
```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_id = "UtsuSl0th/trajectory-lora-repo-AWQ"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoAWQForCausalLM.from_quantized(
    model_id,
    fuse_layers=True,
    trust_remote_code=False,
    safetensors=True,
)

inputs = tokenizer("Your prompt here", return_tensors="pt").to("cuda")
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
### With Transformers (standard pipeline)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "UtsuSl0th/trajectory-lora-repo-AWQ"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
)

inputs = tokenizer("Your prompt here", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
## Sources & Terms (IMPORTANT)
- Original adapter: UtsuSl0th/trajectory-lora-repo
- Training data: u-10bei/sft_alfworld_trajectory_dataset_v5
- Dataset license: MIT License. The dataset is used and distributed under the terms of the MIT License.
- Compliance: users must comply with the MIT License (including preservation of the copyright notice) and with the base model's original terms of use.