title: README
emoji: 🐠
colorFrom: yellow
colorTo: gray
sdk: gradio
pinned: false

GTAlign: Game-Theoretic Alignment of LLM Assistants for Mutual Welfare

GTAlign applies game-theoretic principles to fine-tune reasoning LLMs, encouraging them to make decisions that are not only accurate but also rational, cooperative, and transparent in dialogue settings.

Models

We have released five model checkpoints; models with more thorough training are in preparation.

| Model Name | Size | Dataset | Hugging Face Link |
| --- | --- | --- | --- |
| `GTAlign/Qwen2.5-3B-Math-140step` | 3B | Math | Model |
| `GTAlign/Qwen2.5-3B-Medium-110step` | 3B | Medium | Model |
| `GTAlign/Qwen2.5-3B-AbgQA-140step` | 3B | Ambig-QA | Model |
| `GTAlign/Qwen2.5-3B-WildGuard-140step` | 3B | WildGuard | Model |
| `GTAlign/Qwen2.5-3B-Full-160step` | 3B | Full | Model |
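
The checkpoints listed above can likely be loaded with the `transformers` library. The following is a minimal sketch, not an official usage guide: it assumes each repository is a standard causal-LM checkpoint that `AutoModelForCausalLM` can load, and the helper names `CHECKPOINTS`, `load_checkpoint`, and `generate` are hypothetical names introduced here for illustration. Only the repository IDs come from the table.

```python
# Repository IDs copied from the table above, keyed by training dataset.
CHECKPOINTS = {
    "Math": "GTAlign/Qwen2.5-3B-Math-140step",
    "Medium": "GTAlign/Qwen2.5-3B-Medium-110step",
    "Ambig-QA": "GTAlign/Qwen2.5-3B-AbgQA-140step",
    "WildGuard": "GTAlign/Qwen2.5-3B-WildGuard-140step",
    "Full": "GTAlign/Qwen2.5-3B-Full-160step",
}


def load_checkpoint(dataset: str):
    """Download and return (tokenizer, model) for one checkpoint.

    Assumption: the repository is a standard causal-LM checkpoint
    loadable by transformers; the README does not confirm this.
    """
    # Imported lazily so the lookup table works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = CHECKPOINTS[dataset]
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    return tokenizer, model


def generate(dataset: str, prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a completion for `prompt` with the chosen checkpoint."""
    tokenizer, model = load_checkpoint(dataset)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode the first (and only) sequence in the batch.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Math", "What is 12 * 7?")` would download the weights on first use. The expected prompt format for these models is not stated here, so a plain-text prompt is an assumption as well.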