zsqzz committed on
Commit 7537c1a · verified · 1 Parent(s): 188fcd7

Update README.md

Files changed (1)
  1. README.md +26 -1
README.md CHANGED
@@ -7,4 +7,29 @@ sdk: gradio
  pinned: false
  ---
 
- Edit this `README.md` markdown file to author your organization card.
+ # GTAlign: Game-Theoretic Alignment of LLM Assistants for Mutual Welfare
+
+ [![ArXiv](https://img.shields.io/badge/arXiv-25210.66666-b31b1b?style=flat&logo=arxiv&logoColor=white)](https://arxiv.org/abs/2505.19590)
+ [![Hugging Face](https://img.shields.io/badge/HuggingFace-GTAlign-orange?logo=huggingface&logoColor=white)](https://huggingface.co/collections/sunblaze-ucb/intuitor-684f895c78ed2d3ef3a678b3)
+
+ **GTAlign** applies game-theoretic principles to fine-tune reasoning LLMs, encouraging them to make decisions that are not only accurate but also rational, cooperative, and transparent in dialogue settings.
+
+
+ # Models
+
+
+ We have released five model checkpoints trained on four datasets.
+
+ [View Model Collections](https://huggingface.co/collections/sunblaze-ucb/intuitor-684f895c78ed2d3ef3a678b3)
+
+ | Model Name | Size | Dataset | Hugging Face Link |
+ |------------|------|--------|--------------------|
+ | `` | 3B | Math | [View Model](https://huggingface.co/sunblaze-ucb/Qwen2.5-1.5B-Intuitor-MATH-1EPOCH) |
+ | `` | 3B | Medium | [View Model](https://huggingface.co/sunblaze-ucb/Qwen2.5-3B-Intuitor-MATH-1EPOCH) |
+ | `` | 3B | Ambig-QA | [View Model](https://huggingface.co/sunblaze-ucb/OLMo-2-7B-SFT-Intuitor-MATH-1EPOCH) |
+ | `` | 3B | WildGuard | [View Model](https://huggingface.co/sunblaze-ucb/Qwen3-14B-Intuitor-MATH-1EPOCH) |
+ | `` | 3B | Mixed | [View Model](https://huggingface.co/sunblaze-ucb/Qwen2.5-1.5B-GRPO-MATH-1EPOCH) |
+
+
+
+