---
license: apache-2.0
tags:
- qwen2
- llm-advanced-competition-2025
- react-agent
- alfworld
- dbbench
---

# LLM-Advanced-Competition-2025

This repository provides a **fully fine-tuned model** based on
**Qwen/Qwen2.5-7B-Instruct**, trained in **16-bit precision (BF16)**.

## Training Objective

This model is trained to improve **ReAct-style agent performance**
on ALFWorld (household tasks) and DBBench (database operations).

Training data includes curated trajectories, distilled data from Qwen/Qwen3-32B,
and augmented data targeting specific failure patterns.

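For readers unfamiliar with the format, a ReAct-style trajectory interleaves *Thought*, *Action*, and *Observation* turns. A minimal sketch of how one such turn can be rendered as text (the helper and field names are illustrative, not the exact training format):

```python
# Illustrative only: renders one ReAct-style Thought/Action/Observation turn.
# The exact prompt layout used in this model's training data may differ.

def format_react_step(thought: str, action: str, observation: str) -> str:
    """Render one Thought/Action/Observation turn as plain text."""
    return (
        f"Thought: {thought}\n"
        f"Action: {action}\n"
        f"Observation: {observation}\n"
    )

step = format_react_step(
    thought="The mug is likely in a cabinet.",
    action="go to cabinet 1",
    observation="You open cabinet 1. You see a mug 2.",
)
print(step)
```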
## Training Data

| Dataset | Count |
| --- | --- |
| u-10bei/sft_alfworld_trajectory_dataset_v5 | 2,502 |
| u-10bei/dbbench_sft_dataset_react_v4 | 1,200 |
| Distilled (Qwen/Qwen3-32B) | 1,200 |
| ALFWorld augmented | 215 |
| Recovery loop avoidance | 120 |
| No-examine | 155 |
| **Total** | **5,392** |

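As a quick sanity check, the per-source counts in the table sum to the stated total:

```python
# Counts copied from the table above, in row order.
counts = [2502, 1200, 1200, 215, 120, 155]
total = sum(counts)
print(total)  # 5392
```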
## Training Configuration

* Base model: Qwen/Qwen2.5-7B-Instruct
* Precision: 16-bit (BF16)
* Epochs: 2
* GPU: A100 80GB

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "Sakai0920/LLM-Advanced-Competition-2025"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Generate a response using the chat template.
messages = [{"role": "user", "content": "Hello!"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][inputs.shape[-1]:], skip_special_tokens=True))
```

|
| | ## Sources & Terms (IMPORTANT) |
| |
|
| | Base model: [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) |
| |
|
| | Distillation teacher: [Qwen/Qwen3-32B](https://huggingface.co/Qwen/Qwen3-32B) |
| |
|
| | Compliance: Users must comply with the Apache 2.0 license and the base model's original terms of use. |