Abstract
The GameTalk framework trains large language models to make strategic decisions through multi-turn dialogue, optimizing a global objective with reward signals over full conversations and outperforming untrained models in complex game scenarios.
Strategic decision-making in multi-agent settings is a key challenge for large language models (LLMs), particularly when coordination and negotiation must unfold over extended conversations. While recent work has explored the use of LLMs in isolated decision tasks, little attention has been given to optimizing long-term objectives through dialogue. We introduce GameTalk, a framework for training LLMs to make strategic decisions via multi-turn interactions. Unlike prior work that focuses on single-turn objectives or static action prediction, we train LLMs to optimize a global objective across full conversations. We achieve this by adapting fine-tuning methods like GRPO, DPO, and STaR to incorporate reward signals that depend on the entire interaction. We evaluate this approach on a suite of increasingly complex games, designed to stress different aspects of reasoning, coordination, and opponent modeling. Our results show that GameTalk significantly outperforms untrained models, especially under reward shaping, with DPO consistently yielding the strongest gains. These findings position conversational fine-tuning as a promising path for LLMs to reason, negotiate, and act in interactive environments.
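The paper's code is not linked here, so as a rough illustration of what "reward signals that depend on the entire interaction" could mean for GRPO-style training, below is a minimal sketch in which a single terminal game reward is normalized within a sampled group of conversations and then broadcast to every assistant token in each dialogue. All names (`grpo_conversation_loss`, the toy tensors) are hypothetical and this is an assumption-laden sketch, not the authors' implementation.

```python
# Hypothetical sketch of a conversation-level GRPO objective (not GameTalk's code):
# one terminal game reward per dialogue, normalized across the sampled group,
# applied uniformly to every assistant token in that dialogue.
import torch

def grpo_conversation_loss(logprobs_per_conv, rewards, eps=1e-6):
    """logprobs_per_conv: list of 1-D tensors, token log-probs of the assistant
    turns in each sampled conversation. rewards: 1-D tensor with one terminal
    game reward per conversation in the group."""
    # Group-relative advantage: normalize rewards within the sampled group.
    adv = (rewards - rewards.mean()) / (rewards.std() + eps)
    losses = []
    for lp, a in zip(logprobs_per_conv, adv):
        # The same conversation-level advantage weights every token,
        # so the gradient reflects the outcome of the full interaction.
        losses.append(-(a * lp).sum())
    return torch.stack(losses).mean()

# Toy usage: a group of 4 sampled conversations with terminal rewards.
convs = [torch.randn(n, requires_grad=True) for n in (12, 9, 15, 11)]
rewards = torch.tensor([1.0, 0.0, 0.5, 1.0])
loss = grpo_conversation_loss(convs, rewards)
loss.backward()
```

The point of the sketch is the contrast with single-turn objectives: the advantage is a function of the whole conversation's outcome rather than a per-turn score, which is the shift the abstract describes.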
Community
I'm a game dev and stuff like this is exactly what I need to be reading. Glad people are working on this. We're looking into distilling models for things like "engine-assisted coaching" for complicated strategy games, and getting those distills running on low-end / integrated GPUs.
Added to my personal collection of papers that'll help us figure out how to get where we need to go :) https://huggingface.co/collections/YellowjacketGames/papers-gameplay-optimization
This is an automated message from the Librarian Bot. I found the following papers similar to this one, recommended by the Semantic Scholar API:
- SpeakRL: Synergizing Reasoning, Speaking, and Acting in Language Models with Reinforcement Learning (2025)
- When Actions Teach You to Think: Reasoning-Action Synergy via Reinforcement Learning in Conversational Agents (2025)
- UserLM-R1: Modeling Human Reasoning in User Language Models with Multi-Reward Reinforcement Learning (2026)
- Agentic Conversational Search with Contextualized Reasoning via Reinforcement Learning (2026)
- LinguaGame: A Linguistically Grounded Game-Theoretic Paradigm for Multi-Agent Dialogue Generation (2026)
- MUSIC: MUlti-Step Instruction Contrast for Multi-Turn Reward Models (2025)
- ICPO: Illocution-Calibrated Policy Optimization for Multi-Turn Conversation (2026)
You can directly ask Librarian Bot for paper recommendations by tagging it in a comment: @librarian-bot recommend