A QLoRA fine-tuned adapter for Qwen2.5-7B-Instruct, specialized in solving game theory problems with rigorous step-by-step mathematical reasoning.
## Model Description
GameTheory-Solver is a LoRA adapter trained on the GameTheory-Bench dataset, the first comprehensive, computationally verified game theory dataset for LLM training. The adapter turns Qwen2.5-7B-Instruct into a specialized solver that produces detailed, step-by-step solutions with mathematical proofs and clear final answers.
**Key result:** The fine-tuned model achieves 94% overall accuracy (up from 82% for the base model) and 94.4% on hard problems (up from 66.7%), a +12 pp improvement overall and +27.7 pp on hard problems.
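A minimal usage sketch, assuming the standard `transformers` + `peft` loading path. The adapter repo id below is a placeholder (the published id may differ), and `build_prompt`/`solve_with_adapter` are illustrative helper names, not part of the training pipeline:

```python
# Hedged usage sketch: format a payoff matrix as a problem statement and
# query the base model with the LoRA adapter attached. The adapter id is a
# placeholder assumption; replace it with the published repo id.

def build_prompt(payoffs):
    """Format a 2-player payoff matrix as a plain-text problem statement.

    payoffs[i][j] is the (row, column) payoff pair when the row player
    plays action i and the column player plays action j.
    """
    rows = []
    for i, row in enumerate(payoffs):
        cells = ", ".join(f"({a}, {b})" for a, b in row)
        rows.append(f"Row action {i}: {cells}")
    return (
        "Find all Nash equilibria (pure and mixed) of the following "
        "normal-form game. Show your reasoning step by step.\n"
        + "\n".join(rows)
    )

def solve_with_adapter(prompt, adapter_id="GameTheory-Solver"):
    """Load base + adapter and generate a solution.

    Requires a GPU and the published adapter weights; adapter_id here is
    a placeholder.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base_id = "Qwen/Qwen2.5-7B-Instruct"
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    model = PeftModel.from_pretrained(
        AutoModelForCausalLM.from_pretrained(base_id, device_map="auto"),
        adapter_id,
    )
    messages = [{"role": "user", "content": prompt}]
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=1024)
    return tokenizer.decode(
        out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )

# Prisoner's dilemma payoffs: (row payoff, column payoff)
prompt = build_prompt([[(3, 3), (0, 5)], [(5, 0), (1, 1)]])
# print(solve_with_adapter(prompt))  # uncomment once the real adapter id is set
```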
## Capabilities
| Capability | Details |
|---|---|
| Nash Equilibrium Computation | Pure and mixed strategies for 2×2, 3×3, 3×4, and 4×4 games |
| Dominant Strategy Analysis | IESDS (Iterated Elimination of Strictly Dominated Strategies) |
| Zero-Sum Game Solving | Minimax theorem, saddle-point detection, mixed strategies |
| Sequential Game Analysis | Backward induction, subgame perfect equilibrium (up to 3 stages) |
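To make the dominant-strategy capability concrete, here is a minimal IESDS sketch for two-player games (pure-strategy strict dominance only, no dominance by mixed strategies); the function and matrix layout are my own illustration, not code from the model or dataset:

```python
# Minimal IESDS sketch (assumption: my own illustration, not project code).
# A[i][j] is the row player's payoff, B[i][j] the column player's payoff.

def iesds(A, B):
    """Iteratively eliminate strictly dominated pure strategies.

    Returns the surviving (row, column) action index lists.
    """
    rows = set(range(len(A)))
    cols = set(range(len(A[0])))
    changed = True
    while changed:
        changed = False
        # A row i is eliminated if some other surviving row k beats it
        # strictly against every surviving column.
        for i in sorted(rows):
            if i in rows and any(
                all(A[k][j] > A[i][j] for j in cols) for k in rows - {i}
            ):
                rows.discard(i)
                changed = True
        # Symmetrically for columns, using the column player's payoffs.
        for j in sorted(cols):
            if j in cols and any(
                all(B[i][l] > B[i][j] for i in rows) for l in cols - {j}
            ):
                cols.discard(j)
                changed = True
    return sorted(rows), sorted(cols)

# Prisoner's dilemma: Defect (index 1) strictly dominates Cooperate.
A = [[3, 0], [5, 1]]  # row player's payoffs
B = [[3, 5], [0, 1]]  # column player's payoffs
survivors = iesds(A, B)  # ([1], [1]) -> (Defect, Defect)
```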
## Evaluation

The model was evaluated on a diverse benchmark spanning all 10 categories and 3 difficulty levels.
### Overall Performance: Base vs. Solver

| Metric | Base (Qwen2.5-7B) | Solver (Fine-tuned) | Δ Improvement |
|---|---|---|---|
| Overall Accuracy | 82% | 94% | +12 pp |
| Hard Problems | 66.7% | 94.4% | +27.7 pp |
### Per-Category Accuracy

| Category | Base | Solver | Δ | Trend |
|---|---|---|---|---|
| Normal Form 2×2 | 100% | 80% | -20 pp | ↓ |
| Normal Form 3×3 | 80% | 60% | -20 pp | ↓ |
| Normal Form 3×4 | 100% | 100% | 0 | → |
| Normal Form 4×4 | 100% | 100% | 0 | → |
| Zero-Sum | 100% | 100% | 0 | → |
| Sequential Game | 100% | 100% | 0 | → |
| Auction Theory | 80% | 100% | +20 pp | ↑ |
| Bayesian Game | 0% | 100% | +100 pp | ↑ |
| Cooperative Game | 100% | 100% | 0 | → |
| Mechanism Design | 60% | 100% | +40 pp | ↑ |
**Highlight:** The most dramatic gains come on the categories where the base model was weakest, Bayesian Games (0% → 100%) and Mechanism Design (60% → 100%), while perfect scores are maintained on zero-sum, sequential, and cooperative games.
## Limitations

- **Small-matrix regression:** Accuracy on 2×2 and 3×3 normal-form games decreased after fine-tuning (100% → 80% and 80% → 60%, respectively). The base model already handled these well; the adapter slightly regresses on simpler subcategories while dramatically improving harder ones.
- **Mixed-strategy precision:** Complex mixed-strategy Nash equilibria involving irrational numbers may suffer from floating-point precision issues.
- **Context length:** The maximum sequence length of 2,048 tokens may truncate very large game matrices or extremely detailed solutions.
- **Synthetic training data:** The model was trained on programmatically generated problems; real-world game theory scenarios with ambiguous framing may require additional prompting.
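One way to sidestep the precision caveat when checking the model's answers is to verify candidate mixed strategies in exact rational arithmetic rather than floats. A minimal sketch, using an illustrative 2×2 game that is not from the benchmark:

```python
# Hedged sketch: exact verification of a 2x2 mixed-strategy indifference
# condition with fractions.Fraction. The game values are illustrative
# assumptions, not benchmark data.
from fractions import Fraction

def indifference_mix(a, b, c, d):
    """Probability q the column player puts on its first action so the row
    player is indifferent between rows, for row payoffs [[a, b], [c, d]].

    Solves a*q + b*(1 - q) = c*q + d*(1 - q) exactly.
    """
    return Fraction(d - b, (a - b) - (c - d))

# Battle-of-the-Sexes-style row payoffs [[2, 0], [0, 1]]:
q = indifference_mix(2, 0, 0, 1)
print(q)  # 1/3, held exactly; as a binary float it would only be approximate
```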