---
license: apache-2.0
tags:
- qwen2
- llm-advanced-competition-2025
- react-agent
- alfworld
- dbbench
---

# LLM-Advanced-Competition-2025

This repository provides a **fully fine-tuned model** based on **Qwen/Qwen2.5-7B-Instruct**, trained in **16-bit precision (BF16)**.

## Training Objective

This model is trained to improve **ReAct-style agent performance** on ALFWorld (household tasks) and DBBench (database operations). The training data includes curated trajectories, data distilled from Qwen/Qwen3-32B, and augmented data targeting specific failure patterns.

## Training Data

| Dataset | Count |
| --- | --- |
| u-10bei/sft_alfworld_trajectory_dataset_v5 | 2,502 |
| u-10bei/dbbench_sft_dataset_react_v4 | 1,200 |
| Distilled (Qwen/Qwen3-32B) | 1,200 |
| ALFWorld augmented | 215 |
| Recovery loop avoidance | 120 |
| No-examine | 155 |
| **Total** | **5,392** |

## Training Configuration

* Base model: Qwen/Qwen2.5-7B-Instruct
* Precision: 16-bit (BF16)
* Epochs: 2
* GPU: A100 80GB

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Sakai0920/LLM-Advanced-Competition-2025"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```

## Sources & Terms (IMPORTANT)

* Base model: [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)
* Distillation teacher: [Qwen/Qwen3-32B](https://huggingface.co/Qwen/Qwen3-32B)
* Compliance: Users must comply with the Apache 2.0 license and the base model's original terms of use.
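
## Example: ReAct Agent Loop (Sketch)

Since the model is trained for ReAct-style agent tasks, a minimal sketch of the interaction loop that would drive it against an environment is shown below. This is illustrative only: the `generate_fn` and `env_step_fn` callables, and the `Thought:` / `Action:` output format, are assumptions here, not the actual evaluation harness or prompt format used in training.

```python
import re


def parse_react_step(text):
    """Extract the Thought and Action lines from one model turn.

    Assumes ReAct-style output of the form 'Thought: ...\\nAction: ...';
    the exact format used in training is an assumption.
    """
    thought = re.search(r"Thought:\s*(.*)", text)
    action = re.search(r"Action:\s*(.*)", text)
    return (
        thought.group(1).strip() if thought else None,
        action.group(1).strip() if action else None,
    )


def run_agent(generate_fn, env_step_fn, task, max_turns=10):
    """Minimal ReAct loop: alternate model turns and environment feedback.

    generate_fn(prompt) -> model turn text (e.g. wraps model.generate);
    env_step_fn(action) -> (observation, done) from the environment.
    Both are hypothetical adapters the caller supplies.
    """
    history = [f"Task: {task}"]
    for _ in range(max_turns):
        turn = generate_fn("\n".join(history))
        _, action = parse_react_step(turn)
        if action is None:  # no parsable action: stop rather than loop
            break
        history.append(turn)
        observation, done = env_step_fn(action)
        history.append(f"Observation: {observation}")
        if done:
            break
    return history
```

In practice, `generate_fn` would build a chat prompt from the history and call `model.generate`, and `env_step_fn` would step the ALFWorld or DBBench environment; the loop structure itself is independent of either.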