# LLM-Advanced-Competition-2025

This repository provides a full fine-tune of Qwen/Qwen2.5-7B-Instruct, trained in 16-bit (BF16) precision.
## Training Objective
This model is trained to improve the performance of ReAct-style agents on ALFWorld (household tasks) and DBBench (database operations). The training data combines curated trajectories, data distilled from Qwen/Qwen3-32B, and augmented examples targeting specific failure patterns.
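For readers unfamiliar with the format, ReAct-style trajectories interleave reasoning, actions, and environment feedback. The sketch below is illustrative only; the exact prompt template used in the training datasets may differ.

```python
# Illustrative ReAct-style trajectory for an ALFWorld household task.
# The Thought / Action / Observation structure is the standard ReAct
# format; this specific content is a made-up example.
steps = [
    ("Thought", "I need to find a mug; mugs are often on the countertop."),
    ("Action", "go to countertop 1"),
    ("Observation", "On the countertop 1, you see a mug 1 and a plate 2."),
    ("Thought", "I found the mug. I should take it."),
    ("Action", "take mug 1 from countertop 1"),
]

def render_trajectory(steps):
    """Join (role, text) pairs into an interleaved ReAct transcript."""
    return "\n".join(f"{role}: {text}" for role, text in steps)

print(render_trajectory(steps))
```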
## Training Data
| Dataset | Count |
|---|---|
| u-10bei/sft_alfworld_trajectory_dataset_v5 | 2,502 |
| u-10bei/dbbench_sft_dataset_react_v4 | 1,200 |
| Distilled (Qwen/Qwen3-32B) | 1,200 |
| ALFWorld augmented | 215 |
| Recovery loop avoidance | 120 |
| No-examine | 155 |
| Total | 5,392 |
## Training Configuration
- Base model: Qwen/Qwen2.5-7B-Instruct
- Precision: 16-bit (BF16)
- Epochs: 2
- GPU: A100 80GB
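The settings above can be collected into a single configuration sketch. The training script is not published, so everything below beyond the four stated facts (base model, BF16, 2 epochs, A100 80GB) is an assumption; batch size and learning rate are typical values for full SFT, not reported ones.

```python
# Hypothetical training configuration reflecting the card's stated settings.
# Keys follow common Hugging Face Trainer argument names.
training_config = {
    "model_name_or_path": "Qwen/Qwen2.5-7B-Instruct",  # base model (stated)
    "bf16": True,                        # 16-bit BF16 precision (stated)
    "num_train_epochs": 2,               # stated
    "per_device_train_batch_size": 4,    # assumption for one A100 80GB
    "gradient_accumulation_steps": 8,    # assumption
    "learning_rate": 1e-5,               # assumption: typical for full SFT
}
```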
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Sakai0920/LLM-Advanced-Competition-2025"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # load in BF16, matching training precision
    device_map="auto",
)

# Generate a response using the chat template
messages = [{"role": "user", "content": "You are in a kitchen. Find a mug."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
## Sources & Terms (IMPORTANT)

- Base model: Qwen/Qwen2.5-7B-Instruct
- Distillation teacher: Qwen/Qwen3-32B
- Compliance: Users must comply with the Apache 2.0 license and the base model's original terms of use.