---
license: apache-2.0
tags:
- qwen2
- llm-advanced-competition-2025
- react-agent
- alfworld
- dbbench
---

# LLM-Advanced-Competition-2025

This repository provides a **fully fine-tuned model** based on **Qwen/Qwen2.5-7B-Instruct**, trained in **16-bit precision (BF16)**.

## Training Objective

This model is trained to improve **ReAct-style agent performance** on ALFWorld (household tasks) and DBBench (database operations). The training data includes curated trajectories, data distilled from Qwen/Qwen3-32B, and augmented data targeting specific failure patterns.

## Training Data

| Dataset | Count |
| --- | --- |
| u-10bei/sft_alfworld_trajectory_dataset_v5 | 2,502 |
| u-10bei/dbbench_sft_dataset_react_v4 | 1,200 |
| Distilled (Qwen/Qwen3-32B) | 1,200 |
| ALFWorld augmented | 215 |
| Recovery loop avoidance | 120 |
| No-examine | 155 |
| **Total** | **5,392** |

## Training Configuration

* Base model: Qwen/Qwen2.5-7B-Instruct
* Precision: 16-bit (BF16)
* Epochs: 2
* GPU: A100 80GB

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Sakai0920/LLM-Advanced-Competition-2025"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```

## Sources & Terms (IMPORTANT)

* Base model: [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)
* Distillation teacher: [Qwen/Qwen3-32B](https://huggingface.co/Qwen/Qwen3-32B)
* Compliance: Users must comply with the Apache 2.0 license and the base model's original terms of use.
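
## Example: ReAct Agent Loop (Sketch)

Since the model is trained for ReAct-style agent tasks, a minimal sketch of the interaction loop that would drive it against an environment is shown below. This is illustrative only: the `generate_fn` and `env_step_fn` callables, and the `Thought:` / `Action:` output format, are assumptions here, not the actual evaluation harness or prompt format used in training.

```python
import re


def parse_react_step(text):
    """Extract the Thought and Action lines from one model turn.

    Assumes ReAct-style output of the form 'Thought: ...\\nAction: ...';
    the exact format used in training is an assumption.
    """
    thought = re.search(r"Thought:\s*(.*)", text)
    action = re.search(r"Action:\s*(.*)", text)
    return (
        thought.group(1).strip() if thought else None,
        action.group(1).strip() if action else None,
    )


def run_agent(generate_fn, env_step_fn, task, max_turns=10):
    """Minimal ReAct loop: alternate model turns and environment feedback.

    generate_fn(prompt) -> model turn text (e.g. wraps model.generate);
    env_step_fn(action) -> (observation, done) from the environment.
    Both are hypothetical adapters the caller supplies.
    """
    history = [f"Task: {task}"]
    for _ in range(max_turns):
        turn = generate_fn("\n".join(history))
        _, action = parse_react_step(turn)
        if action is None:  # no parsable action: stop rather than loop
            break
        history.append(turn)
        observation, done = env_step_fn(action)
        history.append(f"Observation: {observation}")
        if done:
            break
    return history
```

In practice, `generate_fn` would build a chat prompt from the history and call `model.generate`, and `env_step_fn` would step the ALFWorld or DBBench environment; the loop structure itself is independent of either.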