2reb committed on Feb 25

Commit eb51fe2 (verified) · 1 Parent(s): 1b82de3

Upload GameTheory-Solver QLoRA adapter with evaluation results

Files changed (9)

.gitattributes +1 -0
README.md +215 -0
adapter_config.json +46 -0
adapter_model.safetensors +3 -0
chat_template.jinja +54 -0
tokenizer.json +3 -0
tokenizer_config.json +29 -0
training_args.bin +3 -0
training_stats.json +16 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text

README.md ADDED

	@@ -0,0 +1,215 @@

+---
+base_model: Qwen/Qwen2.5-7B-Instruct
+library_name: peft
+license: apache-2.0
+pipeline_tag: text-generation
+tags:
+- game-theory
+- math
+- reasoning
+- lora
+- qlora
+- sft
+- qwen2
+- transformers
+- trl
+- 4-bit
+- bitsandbytes
+datasets:
+- 2reb/GameTheory-Bench
+model-index:
+- name: GameTheory-Solver
+  results:
+  - task:
+      type: text-generation
+      name: Game Theory Problem Solving
+    dataset:
+      name: GameTheory-Bench
+      type: 2reb/GameTheory-Bench
+    metrics:
+    - name: Accuracy
+      type: accuracy
+      value: 80.0
+      verified: false
+---
+# GameTheory-Solver
+A QLoRA fine-tuned adapter for **Qwen/Qwen2.5-7B-Instruct** specialized in solving game theory problems with step-by-step mathematical reasoning.
+## Model Description
+GameTheory-Solver is a LoRA adapter trained on the [GameTheory-Bench](https://huggingface.co/datasets/2reb/GameTheory-Bench) dataset, which contains 2,913 diverse game theory problems spanning 10 categories. The model generates detailed, step-by-step solutions with mathematical proofs and clear final answers.
+### Capabilities
+- **Nash Equilibrium computation** (pure and mixed strategies) for 2x2, 3x3, 3x4, and 4x4 games
+- **Dominant strategy analysis** and Iterated Elimination of Strictly Dominated Strategies (IESDS)
+- **Zero-sum game solving** with minimax theorem and saddle point detection
+- **Sequential game analysis** via backward induction (up to 3 stages)
+- **Bayesian game equilibria** with incomplete information
+- **Cooperative game theory** including Shapley value computation
+- **Auction theory** (first-price, second-price, all-pay, revenue equivalence)
+- **Mechanism design** and incentive compatibility analysis
+## Training Details
+### Base Model
+- **Model**: [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct)
+- **Parameters**: 7.6B (base), 161M trainable (LoRA)
+### Dataset
+- **Dataset**: [2reb/GameTheory-Bench](https://huggingface.co/datasets/2reb/GameTheory-Bench)
+- **Train split**: 2,767 examples
+- **Eval split**: 146 examples (5% held out)
+### QLoRA Configuration
+| Parameter | Value |
+|---|---|
+| LoRA rank (r) | 64 |
+| LoRA alpha | 128 |
+| LoRA dropout | 0.05 |
+| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
+| Quantization | 4-bit NF4 with double quantization |
+| Compute dtype | bfloat16 |
+| Trainable parameters | 161M (2.1% of total) |
+### Training Hyperparameters
+| Parameter | Value |
+|---|---|
+| Epochs | 3 |
+| Batch size (per device) | 2 |
+| Gradient accumulation steps | 8 |
+| Effective batch size | 16 |
+| Learning rate | 2e-4 |
+| LR scheduler | Cosine |
+| Warmup ratio | 0.05 |
+| Weight decay | 0.01 |
+| Max sequence length | 2048 |
+| Packing | Enabled |
+| Optimizer | paged_adamw_8bit |
+| Gradient checkpointing | Enabled |
+| Precision | bf16 |
+### Training Results
+| Metric | Value |
+|---|---|
+| Train loss | 0.1613 |
+| Eval loss | 0.0873 |
+| Token accuracy | 96.1% |
+| Total steps | 135 |
+| Training runtime | 1h 55m |
+## Evaluation Results
+Evaluated on 15 problems sampled across all 10 categories and 3 difficulty levels, so each per-category figure below rests on only one to three problems.
+### Overall Performance
+| Metric | Value |
+|---|---|
+| **Overall Accuracy** | **12/15 (80.0%)** |
+| Avg generation time | 24.7s per problem |
+| Avg output tokens | 322 tokens |
+### Per-Category Accuracy
+| Category | Correct/Total | Accuracy |
+|---|---|---|
+| auction_theory | 2/2 | 100.0% |
+| bayesian_game | 0/1 | 0.0% |
+| cooperative_game | 0/1 | 0.0% |
+| mechanism_design | 2/2 | 100.0% |
+| normal_form_2x2 | 3/3 | 100.0% |
+| normal_form_3x3 | 1/1 | 100.0% |
+| normal_form_3x4 | 2/2 | 100.0% |
+| normal_form_4x4 | 1/1 | 100.0% |
+| sequential_game | 1/1 | 100.0% |
+| zero_sum | 0/1 | 0.0% |
+### Per-Difficulty Accuracy
+| Difficulty | Correct/Total | Accuracy |
+|---|---|---|
+| easy | 3/3 | 100.0% |
+| medium | 4/6 | 66.7% |
+| hard | 5/6 | 83.3% |
+### Sample Results
+| Category | Subcategory | Difficulty | Result |
+|---|---|---|---|
+| normal_form_2x2 | random_extra | easy | CORRECT |
+| normal_form_3x3 | 3x3_pure_ne | medium | CORRECT |
+| normal_form_3x4 | 3x4_pure_ne | hard | CORRECT |
+| normal_form_4x4 | 4x4_iesds | hard | CORRECT |
+| zero_sum | minimax | medium | INCORRECT |
+## Usage
+### Installation
+```bash
+pip install transformers peft bitsandbytes accelerate torch
+```
+### Loading the Model
+```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+from peft import PeftModel
+# Load in 4-bit (same as training)
+bnb_config = BitsAndBytesConfig(
+    load_in_4bit=True,
+    bnb_4bit_quant_type="nf4",
+    bnb_4bit_compute_dtype=torch.bfloat16,
+    bnb_4bit_use_double_quant=True,
+)
+base_model = AutoModelForCausalLM.from_pretrained(
+    "Qwen/Qwen2.5-7B-Instruct",
+    quantization_config=bnb_config,
+    device_map="auto",
+)
+model = PeftModel.from_pretrained(base_model, "2reb/GameTheory-Solver")
+tokenizer = AutoTokenizer.from_pretrained("2reb/GameTheory-Solver")
+```
+### Solving a Game Theory Problem
+```python
+messages = [
+    {"role": "system", "content": "You are a game theory expert. Solve the given problem step-by-step, showing all mathematical reasoning. Provide the final answer clearly."},
+    {"role": "user", "content": "Consider the following game:\n\nPlayer 1 \\ Player 2 | Left | Right\n--- | --- | ---\nUp | (3,1) | (0,0)\nDown | (1,1) | (2,3)\n\nFind all Nash Equilibria."},
+]
+inputs = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_dict=True, return_tensors="pt").to(model.device)
+with torch.no_grad():
+    outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
+response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
+print(response)
+```
+## Limitations
+- Performance on **Bayesian games**, **cooperative games** (Shapley value), and **zero-sum games** is less reliable than on normal-form games: each of these categories scored 0/1 in the evaluation above
+- Complex mixed-strategy Nash Equilibria with irrational numbers may have precision issues
+- Training used a maximum sequence length of 2048 tokens, so prompts with very large game matrices may exceed the lengths seen during training
+- The model was trained on synthetically generated problems; real-world game theory scenarios may differ
+## License
+This adapter is released under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).
+## Citation
+```bibtex
+@misc{gametheory-solver-2025,
+  title={GameTheory-Solver: QLoRA Fine-tuned Qwen2.5-7B for Game Theory},
+  author={2reb},
+  year={2025},
+  publisher={HuggingFace},
+  url={https://huggingface.co/2reb/GameTheory-Solver}
+}
+```
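
As a sanity check on the evaluation tables in the README above, the per-category counts can be re-added to confirm the headline figure. This is a minimal sketch, not part of the commit: the dictionary simply transcribes the Per-Category Accuracy table, and the variable names are illustrative.

```python
# Per-category (correct, total) counts transcribed from the README table.
per_category = {
    "auction_theory": (2, 2),
    "bayesian_game": (0, 1),
    "cooperative_game": (0, 1),
    "mechanism_design": (2, 2),
    "normal_form_2x2": (3, 3),
    "normal_form_3x3": (1, 1),
    "normal_form_3x4": (2, 2),
    "normal_form_4x4": (1, 1),
    "sequential_game": (1, 1),
    "zero_sum": (0, 1),
}

# Sum the counts and recompute the overall accuracy as a percentage.
correct = sum(c for c, _ in per_category.values())
total = sum(t for _, t in per_category.values())
accuracy = 100.0 * correct / total
print(f"{correct}/{total} = {accuracy:.1f}%")

# Categories in which no sampled problem was solved.
failed = sorted(name for name, (c, _) in per_category.items() if c == 0)
print(failed)
```

With one to three problems per category, every 0% or 100% entry reflects very few samples, so the per-category rows are best read as anecdotal.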

adapter_config.json ADDED

	@@ -0,0 +1,46 @@

+{
+  "alora_invocation_tokens": null,
+  "alpha_pattern": {},
+  "arrow_config": null,
+  "auto_mapping": null,
+  "base_model_name_or_path": "Qwen/Qwen2.5-7B-Instruct",
+  "bias": "none",
+  "corda_config": null,
+  "ensure_weight_tying": false,
+  "eva_config": null,
+  "exclude_modules": null,
+  "fan_in_fan_out": false,
+  "inference_mode": true,
+  "init_lora_weights": true,
+  "layer_replication": null,
+  "layers_pattern": null,
+  "layers_to_transform": null,
+  "loftq_config": {},
+  "lora_alpha": 128,
+  "lora_bias": false,
+  "lora_dropout": 0.05,
+  "megatron_config": null,
+  "megatron_core": "megatron.core",
+  "modules_to_save": null,
+  "peft_type": "LORA",
+  "peft_version": "0.18.1",
+  "qalora_group_size": 16,
+  "r": 64,
+  "rank_pattern": {},
+  "revision": null,
+  "target_modules": [
+    "o_proj",
+    "v_proj",
+    "down_proj",
+    "up_proj",
+    "gate_proj",
+    "q_proj",
+    "k_proj"
+  ],
+  "target_parameters": null,
+  "task_type": "CAUSAL_LM",
+  "trainable_token_indices": null,
+  "use_dora": false,
+  "use_qalora": false,
+  "use_rslora": false
+}

adapter_model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:681f6ee09f57be855a5cce57a1ddbcee711cf1befc5bc4ac15b695d0942cdab2
+size 645975704
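
The README's "161M trainable" figure can be roughly reproduced from the adapter configuration above. The sketch below is an illustration under stated assumptions: the layer dimensions are the commonly published Qwen2.5-7B values (hidden size 3584, intermediate size 18944, 28 layers, 4 key/value heads of dimension 128) and are not stated anywhere in this commit; only the rank and the target module list come from `adapter_config.json`.

```python
# Assumed Qwen2.5-7B dimensions (not stated in this commit).
HIDDEN = 3584
INTERMEDIATE = 18944
KV_DIM = 4 * 128  # assumed 4 key/value heads x 128 head dim
NUM_LAYERS = 28

RANK = 64  # "r" in adapter_config.json

# (in_features, out_features) of each targeted linear layer.
shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    """LoRA adds A (rank x in) and B (out x rank) to each adapted layer."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in shapes.values())
total = per_layer * NUM_LAYERS
print(total)      # trainable LoRA parameters
print(total * 4)  # bytes if every parameter were stored as float32
```

Under these assumptions the count comes to roughly 161M parameters, in line with the README. At 4 bytes per parameter that is about 646 MB, close to the 645,975,704-byte size in the pointer above; the float32 storage type is an inference from that match, not something the commit states.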

chat_template.jinja ADDED

	@@ -0,0 +1,54 @@

+{%- if tools %}
+    {{- '<|im_start|>system\n' }}
+    {%- if messages[0]['role'] == 'system' %}
+        {{- messages[0]['content'] }}
+    {%- else %}
+        {{- 'You are Qwen, created by Alibaba Cloud. You are a helpful assistant.' }}
+    {%- endif %}
+    {{- "\n\n# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
+    {%- for tool in tools %}
+        {{- "\n" }}
+        {{- tool | tojson }}
+    {%- endfor %}
+    {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
+{%- else %}
+    {%- if messages[0]['role'] == 'system' %}
+        {{- '<|im_start|>system\n' + messages[0]['content'] + '<|im_end|>\n' }}
+    {%- else %}
+        {{- '<|im_start|>system\nYou are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>\n' }}
+    {%- endif %}
+{%- endif %}
+{%- for message in messages %}
+    {%- if (message.role == "user") or (message.role == "system" and not loop.first) or (message.role == "assistant" and not message.tool_calls) %}
+        {{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}
+    {%- elif message.role == "assistant" %}
+        {{- '<|im_start|>' + message.role }}
+        {%- if message.content %}
+            {{- '\n' + message.content }}
+        {%- endif %}
+        {%- for tool_call in message.tool_calls %}
+            {%- if tool_call.function is defined %}
+                {%- set tool_call = tool_call.function %}
+            {%- endif %}
+            {{- '\n<tool_call>\n{"name": "' }}
+            {{- tool_call.name }}
+            {{- '", "arguments": ' }}
+            {{- tool_call.arguments | tojson }}
+            {{- '}\n</tool_call>' }}
+        {%- endfor %}
+        {{- '<|im_end|>\n' }}
+    {%- elif message.role == "tool" %}
+        {%- if (loop.index0 == 0) or (messages[loop.index0 - 1].role != "tool") %}
+            {{- '<|im_start|>user' }}
+        {%- endif %}
+        {{- '\n<tool_response>\n' }}
+        {{- message.content }}
+        {{- '\n</tool_response>' }}
+        {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
+            {{- '<|im_end|>\n' }}
+        {%- endif %}
+    {%- endif %}
+{%- endfor %}
+{%- if add_generation_prompt %}
+    {{- '<|im_start|>assistant\n' }}
+{%- endif %}
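
To make the template's output format concrete, here is a minimal pure-Python sketch of its no-tools path for plain system, user, and assistant turns. The function name `render_chatml` is hypothetical, and tool calls and tool responses are deliberately left out; for real use, `tokenizer.apply_chat_template` remains the reference.

```python
DEFAULT_SYSTEM = "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."


def render_chatml(messages, add_generation_prompt=True):
    """Approximate the template's no-tools branch (no tool roles handled)."""
    # The template always opens with a system turn: the caller's own if the
    # first message is a system message, otherwise the default Qwen prompt.
    if messages and messages[0]["role"] == "system":
        system_text = messages[0]["content"]
        rest = messages[1:]
    else:
        system_text = DEFAULT_SYSTEM
        rest = messages

    parts = [f"<|im_start|>system\n{system_text}<|im_end|>\n"]
    # Every remaining turn is wrapped in <|im_start|>role ... <|im_end|>.
    for message in rest:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    # The generation prompt opens an assistant turn for the model to complete.
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


prompt = render_chatml([{"role": "user", "content": "Find all Nash Equilibria."}])
print(prompt)
```

The sketch shows why supplying a system message matters: without one, the default Qwen system prompt is injected rather than the game-theory prompt used in the README example.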

tokenizer.json ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3fd169731d2cbde95e10bf356d66d5997fd885dd8dbb6fb4684da3f23b2585d8
+size 11421892
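
The three-line entries shown for `tokenizer.json`, `adapter_model.safetensors`, and `training_args.bin` are Git LFS pointer files, not the files themselves. As an illustration, a small parser (the helper name `parse_lfs_pointer` is hypothetical) can pull out the hash and size from such a pointer as it appears in this diff:

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer; leading diff '+' markers are stripped."""
    fields = {}
    for line in text.splitlines():
        line = line.lstrip("+").strip()
        if not line:
            continue
        # Each pointer line is "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = parse_lfs_pointer(
    """+version https://git-lfs.github.com/spec/v1
+oid sha256:3fd169731d2cbde95e10bf356d66d5997fd885dd8dbb6fb4684da3f23b2585d8
+size 11421892
"""
)
print(pointer["hash_algo"], pointer["size"])
```

The `size` field is the byte length of the real file, so the tokenizer here is about 11.4 MB.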

tokenizer_config.json ADDED

	@@ -0,0 +1,29 @@

+{
+  "add_prefix_space": false,
+  "backend": "tokenizers",
+  "bos_token": null,
+  "clean_up_tokenization_spaces": false,
+  "eos_token": "<|im_end|>",
+  "errors": "replace",
+  "extra_special_tokens": [
+    "<|im_start|>",
+    "<|im_end|>",
+    "<|object_ref_start|>",
+    "<|object_ref_end|>",
+    "<|box_start|>",
+    "<|box_end|>",
+    "<|quad_start|>",
+    "<|quad_end|>",
+    "<|vision_start|>",
+    "<|vision_end|>",
+    "<|vision_pad|>",
+    "<|image_pad|>",
+    "<|video_pad|>"
+  ],
+  "is_local": false,
+  "model_max_length": 131072,
+  "pad_token": "<|endoftext|>",
+  "split_special_tokens": false,
+  "tokenizer_class": "Qwen2Tokenizer",
+  "unk_token": null
+}

training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:da56877e9478f0b041766a7794d67df1b222b095968deca6b97c805b4609fc25
+size 5649

training_stats.json ADDED

	@@ -0,0 +1,16 @@

+{
+  "base_model": "Qwen/Qwen2.5-7B-Instruct",
+  "dataset": "2reb/GameTheory-Bench",
+  "train_examples": 2767,
+  "eval_examples": 146,
+  "lora_r": 64,
+  "lora_alpha": 128,
+  "epochs": 3,
+  "batch_size": 2,
+  "grad_accum": 8,
+  "effective_batch": 16,
+  "lr": 0.0002,
+  "train_loss": 0.1613485331888552,
+  "eval_loss": 0.08727391809225082,
+  "runtime_seconds": 6895.8492
+}
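
The derived figures quoted in the README (effective batch size, the 5% hold-out, and the 1h 55m runtime) follow from the raw values in this file. The sketch below recomputes them; it copies a subset of the JSON by hand and assumes single-device training, which the reported effective batch of 16 is consistent with but which the commit does not state.

```python
# Subset of training_stats.json, copied by hand.
stats = {
    "train_examples": 2767,
    "eval_examples": 146,
    "batch_size": 2,
    "grad_accum": 8,
    "effective_batch": 16,
    "runtime_seconds": 6895.8492,
}

# Effective batch = per-device batch x accumulation steps (one device assumed).
effective = stats["batch_size"] * stats["grad_accum"]

# Share of the full dataset held out for evaluation.
total_examples = stats["train_examples"] + stats["eval_examples"]
eval_fraction = stats["eval_examples"] / total_examples

# Runtime rounded to the nearest minute, formatted as hours and minutes.
minutes = round(stats["runtime_seconds"] / 60)
hours, mins = divmod(minutes, 60)
runtime_label = f"{hours}h {mins}m"

print(effective, total_examples, f"{eval_fraction:.1%}", runtime_label)
```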