---
base_model: unsloth/Qwen2.5-7B-Instruct
datasets:
- u-10bei/sft_alfworld_trajectory_dataset_v5
- u-10bei/dbbench_sft_dataset_react_v4
language:
- en
license: apache-2.0
library_name: autoawq
pipeline_tag: text-generation
tags:
- awq
- 4bit
- quantized
- agent
- tool-use
- alfworld
- dbbench
---

# Qwen2.5-7B-Agent-Mixed-Trajectory-AWQ v3

This repository provides a **4-bit AWQ quantized** version of a merged model fine-tuned from **unsloth/Qwen2.5-7B-Instruct** using **LoRA + Unsloth**. The original LoRA adapter was trained on mixed agent trajectory data (ALFWorld + DBBench), merged into the base model, and then quantized with AutoAWQ for faster inference.

## Quantization Details

| Parameter | Value |
|---|---|
| Method | AWQ (Activation-aware Weight Quantization) |
| Bits | 4-bit |
| Group size | 128 |
| Zero point | True |
| Version | GEMM |
| Library | autoawq 0.2.7.post3 |

## Dataset Construction (v3)

Training data was built by mixing and preprocessing two trajectory datasets:

- **ALFWorld** (`u-10bei/sft_alfworld_trajectory_dataset_v5`): 1,845 samples after cleaning and success-only filtering
- **DBBench** (`u-10bei/dbbench_sft_dataset_react_v4`): 1,200 samples after cleaning

Preprocessing steps:

- Removal of htags template contamination
- Removal of hallucinated object IDs (e.g. `bowl 99`), ALFWorld only
- **[v3 new]** ALFWorld failed trajectories excluded (success-only filtering): 2,327 → 1,845 samples

Category-level upsampling was applied to reinforce weak task types:

| Category | Multiplier |
|---|---|
| ALFWorld multi-object | ×3 |
| ALFWorld cool | ×2 |
| ALFWorld examine | ×1.5 |
| DBBench aggregation-MAX | ×3 |
| DBBench INSERT | ×2 |
| DBBench counting | ×2 |

Final dataset size: **4,687 samples**

## Training Configuration

| Parameter | Value |
|---|---|
| Base model | unsloth/Qwen2.5-7B-Instruct |
| Method | LoRA + Unsloth (Colab Pro L4) |
| Max sequence length | 4096 |
| Epochs | 3 |
| Learning rate | 8e-6 |
| LoRA r / alpha | 64 / 128 |
| Effective batch size | 16 (bs=2 × grad_accum=8) |
| load_in_4bit | True |

## Usage

```python
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_id = "UtsuSl0th/mixed-lora-3-awq"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoAWQForCausalLM.from_quantized(
    model_id,
    device_map="auto",
    fuse_layers=True,
)

inputs = tokenizer("Your prompt here", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Sources & Terms

Dataset license: MIT. Users must comply with the MIT license and the base model's original terms of use.
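## Appendix: Upsampling Sketch (Illustrative)

The category-level upsampling described in the dataset section can be sketched in plain Python. This is a minimal illustration, not the actual training pipeline: the category key names, the `category` field on each sample, and the handling of fractional multipliers (repeat ⌊m⌋ times, then add one extra copy with probability equal to the fractional part) are all assumptions.

```python
import random

# Multipliers from the model card's upsampling table
# (the category key strings themselves are assumed names).
MULTIPLIERS = {
    "alfworld/multi-object": 3.0,
    "alfworld/cool": 2.0,
    "alfworld/examine": 1.5,
    "dbbench/aggregation-max": 3.0,
    "dbbench/insert": 2.0,
    "dbbench/counting": 2.0,
}

def upsample(samples, multipliers, seed=0):
    """Repeat each sample floor(m) times; with probability frac(m),
    emit one additional copy (one plausible way to honor a x1.5 factor)."""
    rng = random.Random(seed)
    out = []
    for s in samples:
        m = multipliers.get(s["category"], 1.0)  # unlisted categories: x1
        whole = int(m)
        frac = m - whole
        out.extend([s] * whole)
        if rng.random() < frac:
            out.append(s)
    return out

# Toy mix: one x3 category, one x2 category, one unlisted (x1) category.
toy = [
    {"category": "alfworld/multi-object"},
    {"category": "dbbench/insert"},
    {"category": "other"},
]
print(len(upsample(toy, MULTIPLIERS)))  # 3 + 2 + 1 = 6
```

With integer multipliers the result is deterministic; only fractional multipliers such as ×1.5 introduce the seeded coin flip, so two runs with the same seed reproduce the same dataset.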