File size: 1,471 Bytes
188fcd7 7537c1a 0e33f3c 76a5aea 7537c1a 76a5aea 7537c1a 76a5aea 7537c1a 76a5aea |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 |
---
title: README
emoji: 🐠
colorFrom: yellow
colorTo: gray
sdk: gradio
pinned: false
---
# GTAlign: Game-Theoretic Alignment of LLM Assistants for Mutual Welfare
[](https://arxiv.org/abs/2510.08872)
[](https://huggingface.co/GTAlign)
**GTAlign** applies game-theoretic principles to fine-tune reasoning LLMs, encouraging them to make decisions that are not only accurate but also rational, cooperative, and transparent in dialogue settings.
## Models
We have released five model checkpoints, and we are preparing more thoroughly trained models.
| Model Name | Size | Dataset | Hugging Face Link |
|------------|------|--------|--------------------|
| `GTAlign/Qwen2.5-3B-Math-140step` | 3B | Math | [Model](https://huggingface.co/GTAlign/Qwen2.5-3B-Math-140step) |
| `GTAlign/Qwen2.5-3B-Medium-110step` | 3B | Medium | [Model](https://huggingface.co/GTAlign/Qwen2.5-3B-Medium-110step) |
| `GTAlign/Qwen2.5-3B-AbgQA-140step` | 3B | Ambig-QA | [Model](https://huggingface.co/GTAlign/Qwen2.5-3B-AbgQA-140step) |
| `GTAlign/Qwen2.5-3B-WildGuard-140step` | 3B | WildGuard | [Model](https://huggingface.co/GTAlign/Qwen2.5-3B-WildGuard-140step) |
| `GTAlign/Qwen2.5-3B-Full-160step` | 3B | Full | [Model](https://huggingface.co/GTAlign/Qwen2.5-3B-Full-160step) |
|