Update README.md

README.md CHANGED

@@ -7,4 +7,29 @@ sdk: gradio
pinned: false
---

# GTAlign: Game-Theoretic Alignment of LLM Assistants for Mutual Welfare

[arXiv](https://arxiv.org/abs/2505.19590) · [Model collection](https://huggingface.co/collections/sunblaze-ucb/intuitor-684f895c78ed2d3ef3a678b3)

**GTAlign** applies game-theoretic principles to fine-tune reasoning LLMs, encouraging them to make decisions that are not only accurate but also rational, cooperative, and transparent in dialogue settings.

# Models

We have released five model checkpoints trained on four datasets.

[View Model Collections](https://huggingface.co/collections/sunblaze-ucb/intuitor-684f895c78ed2d3ef3a678b3)

| Model Name | Size | Dataset | Hugging Face Link |
|------------|------|---------|-------------------|
| `` | 3B | Math | [View Model](https://huggingface.co/sunblaze-ucb/Qwen2.5-1.5B-Intuitor-MATH-1EPOCH) |
| `` | 3B | Medium | [View Model](https://huggingface.co/sunblaze-ucb/Qwen2.5-3B-Intuitor-MATH-1EPOCH) |
| `` | 3B | Ambig-QA | [View Model](https://huggingface.co/sunblaze-ucb/OLMo-2-7B-SFT-Intuitor-MATH-1EPOCH) |
| `` | 3B | WildGuard | [View Model](https://huggingface.co/sunblaze-ucb/Qwen3-14B-Intuitor-MATH-1EPOCH) |
| `` | 3B | Mixed | [View Model](https://huggingface.co/sunblaze-ucb/Qwen2.5-1.5B-GRPO-MATH-1EPOCH) |
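The checkpoints in the table can be loaded like any other model on the Hugging Face Hub. A minimal sketch, assuming the `transformers` library is installed; the model ID is copied from the first table row's link, and the `generate_answer` helper is a hypothetical wrapper for illustration, not part of this repository:

```python
# Hypothetical usage sketch: load one of the released checkpoints with
# Hugging Face `transformers` and run a single generation.
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model ID taken from the first table row; swap in any other link target.
MODEL_ID = "sunblaze-ucb/Qwen2.5-1.5B-Intuitor-MATH-1EPOCH"

def generate_answer(prompt: str, max_new_tokens: int = 64) -> str:
    """Download the checkpoint on first use and return a text completion."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate_answer("What is 2 + 2?"))
```

The first call downloads the weights, so expect a delay; subsequent calls hit the local cache.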