Spaces:

GTAlign
/

README

No application file

zsqzz commited on Oct 10, 2025

Commit

76a5aea

verified ·

1 Parent(s): 7537c1a

Update README.md

Files changed (1) hide show

README.md CHANGED Viewed

@@ -9,27 +9,20 @@ pinned: false
 # GTAlign: Game-Theoretic Alignment of LLM Assistants for Mutual Welfare
-[![ArXiv](https://img.shields.io/badge/arXiv-25210.66666-b31b1b?style=flat&logo=arxiv&logoColor=white)](https://arxiv.org/abs/2505.19590)
-[![Hugging Face](https://img.shields.io/badge/HuggingFace-GTAlign-orange?logo=huggingface&logoColor=white)](https://huggingface.co/collections/sunblaze-ucb/intuitor-684f895c78ed2d3ef3a678b3)
 **GTAlign** applies game-theoretic principles to fine-tune reasoning LLMs, encouraging them to make decisions that are not only accurate but also rational, cooperative, and transparent in dialogue settings.
-# Models
-We have released five model checkpoints trained on four datasets.
-[View Model Collections](https://huggingface.co/collections/sunblaze-ucb/intuitor-684f895c78ed2d3ef3a678b3)
 | Model Name | Size | Dataset | Hugging Face Link |
 |------------|------|--------|--------------------|
-| `` | 3B | Math | [View Model](https://huggingface.co/sunblaze-ucb/Qwen2.5-1.5B-Intuitor-MATH-1EPOCH) |
-| `` | 3B   | Medium | [View Model](https://huggingface.co/sunblaze-ucb/Qwen2.5-3B-Intuitor-MATH-1EPOCH) |
-| `` | 3B   | Ambig-QA | [View Model](https://huggingface.co/sunblaze-ucb/OLMo-2-7B-SFT-Intuitor-MATH-1EPOCH) |
-| `` | 3B  | WildGuard | [View Model](https://huggingface.co/sunblaze-ucb/Qwen3-14B-Intuitor-MATH-1EPOCH) |
-| `` | 3B | Mixed | [View Model](https://huggingface.co/sunblaze-ucb/Qwen2.5-1.5B-GRPO-MATH-1EPOCH) |

 # GTAlign: Game-Theoretic Alignment of LLM Assistants for Mutual Welfare
+[![ArXiv](https://img.shields.io/badge/arXiv-25210.66666-b31b1b?style=flat&logo=arxiv&logoColor=white)](none)
+[![Hugging Face](https://img.shields.io/badge/HuggingFace-GTAlign-orange?logo=huggingface&logoColor=white)](https://huggingface.co/GTAlign)
 **GTAlign** applies game-theoretic principles to fine-tune reasoning LLMs, encouraging them to make decisions that are not only accurate but also rational, cooperative, and transparent in dialogue settings.
+## Models
+We have released five model checkpoints, and we are preparing more thoroughly trained models.
 | Model Name | Size | Dataset | Hugging Face Link |
 |------------|------|--------|--------------------|
+| `GTAlign/Qwen2.5-3B-Math-140step` | 3B | Math | [Model](https://huggingface.co/GTAlign/Qwen2.5-3B-Math-140step) |
+| `GTAlign/Qwen2.5-3B-Medium-110step` | 3B | Medium | [Model](https://huggingface.co/GTAlign/Qwen2.5-3B-Medium-110step) |
+| `GTAlign/Qwen2.5-3B-AbgQA-140step` | 3B | Ambig-QA | [Model](https://huggingface.co/GTAlign/Qwen2.5-3B-AbgQA-140step) |
+| `GTAlign/Qwen2.5-3B-WildGuard-140step` | 3B | WildGuard | [Model](https://huggingface.co/GTAlign/Qwen2.5-3B-WildGuard-140step) |
+| `GTAlign/Qwen2.5-3B-Full-160step` | 3B | Full | [Model](https://huggingface.co/GTAlign/Qwen2.5-3B-Full-160step) |