Upload folder using huggingface_hub
This view is limited to 50 files because it contains too many changes.
- README.md +59 -219
- __init__.py +6 -12
- client.py +32 -62
- common/__init__.py +1 -0
- common/__pycache__/__init__.cpython-311.pyc +0 -0
- common/__pycache__/games.cpython-311.pyc +0 -0
- common/__pycache__/strategies.cpython-311.pyc +0 -0
- common/games.py +298 -0
- common/games_coop/__pycache__/cooperative.cpython-311.pyc +0 -0
- common/games_coop/__pycache__/dynamic.cpython-311.pyc +0 -0
- common/games_coop/__pycache__/infinite.cpython-311.pyc +0 -0
- common/games_coop/__pycache__/pd_variants.cpython-311.pyc +0 -0
- common/games_coop/__pycache__/stochastic.cpython-311.pyc +0 -0
- common/games_coop/cooperative.py +169 -0
- common/games_coop/dynamic.py +162 -0
- common/games_coop/infinite.py +72 -0
- common/games_coop/pd_variants.py +145 -0
- common/games_coop/stochastic.py +128 -0
- common/games_ext/__pycache__/auction.cpython-311.pyc +0 -0
- common/games_ext/__pycache__/generated.cpython-311.pyc +0 -0
- common/games_ext/__pycache__/matrix_games.cpython-311.pyc +0 -0
- common/games_ext/__pycache__/nplayer.cpython-311.pyc +0 -0
- common/games_ext/__pycache__/sequential.cpython-311.pyc +0 -0
- common/games_ext/auction.py +138 -0
- common/games_ext/generated.py +144 -0
- common/games_ext/matrix_games.py +152 -0
- common/games_ext/nplayer.py +143 -0
- common/games_ext/sequential.py +140 -0
- common/games_info/__pycache__/bayesian.cpython-311.pyc +0 -0
- common/games_info/__pycache__/communication.cpython-311.pyc +0 -0
- common/games_info/__pycache__/contracts.cpython-311.pyc +0 -0
- common/games_info/__pycache__/network.cpython-311.pyc +0 -0
- common/games_info/__pycache__/signaling.cpython-311.pyc +0 -0
- common/games_info/bayesian.py +125 -0
- common/games_info/communication.py +162 -0
- common/games_info/contracts.py +125 -0
- common/games_info/network.py +120 -0
- common/games_info/signaling.py +142 -0
- common/games_market/__pycache__/advanced.cpython-311.pyc +0 -0
- common/games_market/__pycache__/classic.cpython-311.pyc +0 -0
- common/games_market/__pycache__/contests.cpython-311.pyc +0 -0
- common/games_market/__pycache__/generated_v2.cpython-311.pyc +0 -0
- common/games_market/__pycache__/oligopoly.cpython-311.pyc +0 -0
- common/games_market/advanced.py +125 -0
- common/games_market/classic.py +164 -0
- common/games_market/contests.py +188 -0
- common/games_market/generated_v2.py +125 -0
- common/games_market/oligopoly.py +152 -0
- common/games_meta/__pycache__/coalition_config.cpython-311.pyc +0 -0
- common/games_meta/__pycache__/dynamic.cpython-311.pyc +0 -0
README.md
CHANGED
|
@@ -1,6 +1,6 @@
|
|
| 1 |
---
|
| 2 |
-
title:
|
| 3 |
-
emoji:
|
| 4 |
colorFrom: green
|
| 5 |
colorTo: yellow
|
| 6 |
sdk: docker
|
|
@@ -10,245 +10,85 @@ tags:
|
|
| 10 |
- openenv
|
| 11 |
---
|
| 12 |
|
| 13 |
-
#
|
| 14 |
|
| 15 |
-
A
|
| 16 |
|
| 17 |
-
##
|
| 18 |
-
|
| 19 |
-
The simplest way to use the Kantbench environment is through the `KantbenchEnv` class:
|
| 20 |
-
|
| 21 |
-
```python
|
| 22 |
-
from KantBench import KantbenchAction, KantbenchEnv
|
| 23 |
-
|
| 24 |
-
try:
|
| 25 |
-
# Create environment from Docker image
|
| 26 |
-
KantBenchenv = KantbenchEnv.from_docker_image("KantBench-env:latest")
|
| 27 |
-
|
| 28 |
-
# Reset
|
| 29 |
-
result = KantBenchenv.reset()
|
| 30 |
-
print(f"Reset: {result.observation.echoed_message}")
|
| 31 |
-
|
| 32 |
-
# Send multiple messages
|
| 33 |
-
messages = ["Hello, World!", "Testing echo", "Final message"]
|
| 34 |
-
|
| 35 |
-
for msg in messages:
|
| 36 |
-
result = KantBenchenv.step(KantbenchAction(message=msg))
|
| 37 |
-
print(f"Sent: '{msg}'")
|
| 38 |
-
print(f" → Echoed: '{result.observation.echoed_message}'")
|
| 39 |
-
print(f" → Length: {result.observation.message_length}")
|
| 40 |
-
print(f" → Reward: {result.reward}")
|
| 41 |
-
|
| 42 |
-
finally:
|
| 43 |
-
# Always clean up
|
| 44 |
-
KantBenchenv.close()
|
| 45 |
-
```
|
| 46 |
-
|
| 47 |
-
That's it! The `KantbenchEnv.from_docker_image()` method handles:
|
| 48 |
-
- Starting the Docker container
|
| 49 |
-
- Waiting for the server to be ready
|
| 50 |
-
- Connecting to the environment
|
| 51 |
-
- Container cleanup when you call `close()`
|
| 52 |
-
|
| 53 |
-
## Building the Docker Image
|
| 54 |
-
|
| 55 |
-
Before using the environment, you need to build the Docker image:
|
| 56 |
-
|
| 57 |
-
```bash
|
| 58 |
-
# From project root
|
| 59 |
-
docker build -t KantBench-env:latest -f server/Dockerfile .
|
| 60 |
-
```
|
| 61 |
-
|
| 62 |
-
## Deploying to Hugging Face Spaces
|
| 63 |
|
| 64 |
-
|
| 65 |
|
| 66 |
-
|
| 67 |
-
# From the environment directory (where openenv.yaml is located)
|
| 68 |
-
openenv push
|
| 69 |
-
|
| 70 |
-
# Or specify options
|
| 71 |
-
openenv push --namespace my-org --private
|
| 72 |
-
```
|
| 73 |
-
|
| 74 |
-
The `openenv push` command will:
|
| 75 |
-
1. Validate that the directory is an OpenEnv environment (checks for `openenv.yaml`)
|
| 76 |
-
2. Prepare a custom build for Hugging Face Docker space (enables web interface)
|
| 77 |
-
3. Upload to Hugging Face (ensuring you're logged in)
|
| 78 |
-
|
| 79 |
-
### Prerequisites
|
| 80 |
-
|
| 81 |
-
- Authenticate with Hugging Face: The command will prompt for login if not already authenticated
|
| 82 |
-
|
| 83 |
-
### Options
|
| 84 |
-
|
| 85 |
-
- `--directory`, `-d`: Directory containing the OpenEnv environment (defaults to current directory)
|
| 86 |
-
- `--repo-id`, `-r`: Repository ID in format 'username/repo-name' (defaults to 'username/env-name' from openenv.yaml)
|
| 87 |
-
- `--base-image`, `-b`: Base Docker image to use (overrides Dockerfile FROM)
|
| 88 |
-
- `--private`: Deploy the space as private (default: public)
|
| 89 |
-
|
| 90 |
-
### Examples
|
| 91 |
-
|
| 92 |
-
```bash
|
| 93 |
-
# Push to your personal namespace (defaults to username/env-name from openenv.yaml)
|
| 94 |
-
openenv push
|
| 95 |
-
|
| 96 |
-
# Push to a specific repository
|
| 97 |
-
openenv push --repo-id my-org/my-env
|
| 98 |
-
|
| 99 |
-
# Push with a custom base image
|
| 100 |
-
openenv push --base-image ghcr.io/meta-pytorch/openenv-base:latest
|
| 101 |
-
|
| 102 |
-
# Push as a private space
|
| 103 |
-
openenv push --private
|
| 104 |
-
|
| 105 |
-
# Combine options
|
| 106 |
-
openenv push --repo-id my-org/my-env --base-image custom-base:latest --private
|
| 107 |
-
```
|
| 108 |
|
| 109 |
-
|
| 110 |
-
`https://huggingface.co/spaces/<repo-id>`
|
| 111 |
|
| 112 |
-
|
| 113 |
-
- **Web Interface** at `/web` - Interactive UI for exploring the environment
|
| 114 |
-
- **API Documentation** at `/docs` - Full OpenAPI/Swagger interface
|
| 115 |
-
- **Health Check** at `/health` - Container health monitoring
|
| 116 |
-
- **WebSocket** at `/ws` - Persistent session endpoint for low-latency interactions
|
| 117 |
-
|
| 118 |
-
## Environment Details
|
| 119 |
-
|
| 120 |
-
### Action
|
| 121 |
-
**KantbenchAction**: Contains a single field
|
| 122 |
-
- `message` (str) - The message to echo back
|
| 123 |
-
|
| 124 |
-
### Observation
|
| 125 |
-
**KantbenchObservation**: Contains the echo response and metadata
|
| 126 |
-
- `echoed_message` (str) - The message echoed back
|
| 127 |
-
- `message_length` (int) - Length of the message
|
| 128 |
-
- `reward` (float) - Reward based on message length (length × 0.1)
|
| 129 |
-
- `done` (bool) - Always False for echo environment
|
| 130 |
-
- `metadata` (dict) - Additional info like step count
|
| 131 |
-
|
| 132 |
-
### Reward
|
| 133 |
-
The reward is calculated as: `message_length × 0.1`
|
| 134 |
-
- "Hi" → reward: 0.2
|
| 135 |
-
- "Hello, World!" → reward: 1.3
|
| 136 |
-
- Empty message → reward: 0.0
|
| 137 |
-
|
| 138 |
-
## Advanced Usage
|
| 139 |
-
|
| 140 |
-
### Connecting to an Existing Server
|
| 141 |
-
|
| 142 |
-
If you already have a Kantbench environment server running, you can connect directly:
|
| 143 |
|
| 144 |
```python
|
| 145 |
-
from KantBench import
|
| 146 |
-
|
| 147 |
-
|
| 148 |
-
|
| 149 |
-
|
| 150 |
-
|
| 151 |
-
|
| 152 |
-
|
| 153 |
```
|
| 154 |
|
| 155 |
-
|
| 156 |
-
|
| 157 |
-
### Using the Context Manager
|
| 158 |
-
|
| 159 |
-
The client supports context manager usage for automatic connection management:
|
| 160 |
|
| 161 |
```python
|
| 162 |
-
|
| 163 |
-
|
| 164 |
-
# Connect with context manager (auto-connects and closes)
|
| 165 |
-
with KantbenchEnv(base_url="http://localhost:8000") as env:
|
| 166 |
-
result = env.reset()
|
| 167 |
-
print(f"Reset: {result.observation.echoed_message}")
|
| 168 |
-
# Multiple steps with low latency
|
| 169 |
-
for msg in ["Hello", "World", "!"]:
|
| 170 |
-
result = env.step(KantbenchAction(message=msg))
|
| 171 |
-
print(f"Echoed: {result.observation.echoed_message}")
|
| 172 |
-
```
|
| 173 |
|
| 174 |
-
|
| 175 |
-
|
| 176 |
-
- **Persistent session**: Server maintains your environment state
|
| 177 |
-
- **Efficient for episodes**: Better for many sequential steps
|
| 178 |
-
|
| 179 |
-
### Concurrent WebSocket Sessions
|
| 180 |
-
|
| 181 |
-
The server supports multiple concurrent WebSocket connections. To enable this,
|
| 182 |
-
modify `server/app.py` to use factory mode:
|
| 183 |
-
|
| 184 |
-
```python
|
| 185 |
-
# In server/app.py - use factory mode for concurrent sessions
|
| 186 |
-
app = create_app(
|
| 187 |
-
KantbenchEnvironment, # Pass class, not instance
|
| 188 |
-
KantbenchAction,
|
| 189 |
-
KantbenchObservation,
|
| 190 |
-
max_concurrent_envs=4, # Allow 4 concurrent sessions
|
| 191 |
-
)
|
| 192 |
```
|
| 193 |
|
| 194 |
-
|
| 195 |
|
| 196 |
-
``
|
| 197 |
-
|
| 198 |
-
|
| 199 |
-
|
| 200 |
-
def run_episode(client_id: int):
|
| 201 |
-
with KantbenchEnv(base_url="http://localhost:8000") as env:
|
| 202 |
-
result = env.reset()
|
| 203 |
-
for i in range(10):
|
| 204 |
-
result = env.step(KantbenchAction(message=f"Client {client_id}, step {i}"))
|
| 205 |
-
return client_id, result.observation.message_length
|
| 206 |
-
|
| 207 |
-
# Run 4 episodes concurrently
|
| 208 |
-
with ThreadPoolExecutor(max_workers=4) as executor:
|
| 209 |
-
results = list(executor.map(run_episode, range(4)))
|
| 210 |
-
```
|
| 211 |
|
| 212 |
-
##
|
| 213 |
-
|
| 214 |
-
### Direct Environment Testing
|
| 215 |
|
| 216 |
-
|
| 217 |
|
| 218 |
-
|
| 219 |
-
|
| 220 |
-
python3 server/KantBench_environment.py
|
| 221 |
-
```
|
| 222 |
|
| 223 |
-
|
| 224 |
-
- Environment resets correctly
|
| 225 |
-
- Step executes actions properly
|
| 226 |
-
- State tracking works
|
| 227 |
-
- Rewards are calculated correctly
|
| 228 |
|
| 229 |
-
|
| 230 |
|
| 231 |
-
|
| 232 |
|
| 233 |
```bash
|
| 234 |
-
|
| 235 |
-
```
|
| 236 |
-
|
| 237 |
-
## Project Structure
|
| 238 |
-
|
| 239 |
-
```
|
| 240 |
-
KantBench/
|
| 241 |
-
├── .dockerignore # Docker build exclusions
|
| 242 |
-
├── __init__.py # Module exports
|
| 243 |
-
├── README.md # This file
|
| 244 |
-
├── openenv.yaml # OpenEnv manifest
|
| 245 |
-
├── pyproject.toml # Project metadata and dependencies
|
| 246 |
-
├── uv.lock # Locked dependencies (generated)
|
| 247 |
-
├── client.py # KantbenchEnv client
|
| 248 |
-
├── models.py # Action and Observation models
|
| 249 |
-
└── server/
|
| 250 |
-
├── __init__.py # Server module exports
|
| 251 |
-
├── KantBench_environment.py # Core environment logic
|
| 252 |
-
├── app.py # FastAPI application (HTTP + WebSocket endpoints)
|
| 253 |
-
└── Dockerfile # Container image definition
|
| 254 |
```
|
|
|
|
| 1 |
---
|
| 2 |
+
title: KantBench Environment Server
|
| 3 |
+
emoji: 🎮
|
| 4 |
colorFrom: green
|
| 5 |
colorTo: yellow
|
| 6 |
sdk: docker
|
|
|
|
| 10 |
- openenv
|
| 11 |
---
|
| 12 |
|
| 13 |
+
# KantBench: 90+ Game Theory Environments for LLM Training
|
| 14 |
|
| 15 |
+
A comprehensive game theory environment for training and evaluating LLM strategic reasoning via OpenEnv. Supports GRPO/DPO training with the environment as a reward oracle.
|
| 16 |
|
| 17 |
+
## Games (90+)
|
| 18 |
|
| 19 |
+
| Category | Examples | Count |
|
| 20 |
+
|---|---|---|
|
| 21 |
+
| **Classic Matrix** | Prisoner's Dilemma, Stag Hunt, Hawk-Dove, Battle of the Sexes | 20+ |
|
| 22 |
+
| **Economic/Market** | Cournot, Bertrand, Hotelling, Nash Demand, Double Auction | 23 |
|
| 23 |
+
| **Information & Signaling** | Beer-Quiche, Spence Signaling, Bayesian Persuasion, Moral Hazard | 21 |
|
| 24 |
+
| **Cooperative & Repeated** | Shapley Allocation, Stable Matching, Discounted PD, Stochastic PD | 23 |
|
| 25 |
+
| **Auctions & Contests** | First-Price, Vickrey, All-Pay, Colonel Blotto, Tullock Contest | 10+ |
|
| 26 |
+
| **Sequential** | Ultimatum, Trust, Centipede, Stackelberg, Dictator | 6 |
|
| 27 |
|
| 28 |
+
## Opponent Strategies (17)
|
| 29 |
|
| 30 |
+
`random`, `always_cooperate`, `always_defect`, `tit_for_tat`, `tit_for_two_tats`, `grudger`, `pavlov`, `suspicious_tit_for_tat`, `generous_tit_for_tat`, `adaptive`, `mixed`, `ultimatum_fair`, `ultimatum_low`, `trust_fair`, `trust_generous`, `public_goods_fair`, `public_goods_free_rider`
|
|
|
|
| 31 |
|
| 32 |
+
## Quick Start
|
| 33 |
|
| 34 |
```python
|
| 35 |
+
from KantBench import KantBenchAction, KantBenchEnv
|
| 36 |
+
|
| 37 |
+
with KantBenchEnv(base_url="https://openenv-community-kantbench.hf.space") as env:
|
| 38 |
+
# Reset with a specific game and opponent strategy
|
| 39 |
+
result = env.reset(game="prisoners_dilemma", strategy="tit_for_tat")
|
| 40 |
+
print(f"Game: {result.observation.game_name}")
|
| 41 |
+
print(f"Moves: {result.observation.available_moves}")
|
| 42 |
+
|
| 43 |
+
# Play rounds until done
|
| 44 |
+
while not result.done:
|
| 45 |
+
result = env.step(KantBenchAction(move="cooperate"))
|
| 46 |
+
print(f"Round {result.observation.round_number}: "
|
| 47 |
+
f"you={result.observation.your_move}, "
|
| 48 |
+
f"opp={result.observation.opponent_move}, "
|
| 49 |
+
f"payoff={result.observation.your_payoff}")
|
| 50 |
+
|
| 51 |
+
print(f"Final score: {result.observation.cumulative_score}")
|
| 52 |
```
|
| 53 |
|
| 54 |
+
## Reset Parameters
|
| 55 |
|
| 56 |
```python
|
| 57 |
+
# Specific game and strategy
|
| 58 |
+
result = env.reset(game="stag_hunt", strategy="grudger")
|
| 59 |
|
| 60 |
+
# Random game and strategy (default)
|
| 61 |
+
result = env.reset()
|
| 62 |
```
|
| 63 |
|
| 64 |
+
## API Endpoints
|
| 65 |
|
| 66 |
+
- **Web Interface** at `/web` — Interactive UI for exploring the environment
|
| 67 |
+
- **API Docs** at `/docs` — Full OpenAPI/Swagger interface
|
| 68 |
+
- **Health Check** at `/health` — Container health monitoring
|
| 69 |
+
- **WebSocket** at `/ws` — Persistent session endpoint
|
| 70 |
|
| 71 |
+
## Environment Details
|
| 72 |
|
| 73 |
+
### Action
|
| 74 |
|
| 75 |
+
**KantBenchAction**: Single field
|
| 76 |
+
- `move` (str) — Your move (e.g. `"cooperate"`, `"defect"`, `"hawk"`, `"produce_5"`)
|
| 77 |
|
| 78 |
+
### Observation
|
| 79 |
|
| 80 |
+
**KantBenchObservation**: Full round result and episode state
|
| 81 |
+
- `game_name`, `game_description` — Current game info
|
| 82 |
+
- `available_moves` — Valid moves for this game
|
| 83 |
+
- `your_move`, `opponent_move` — Moves played this round
|
| 84 |
+
- `your_payoff`, `opponent_payoff` — Payoffs this round
|
| 85 |
+
- `cumulative_score` — Your total score
|
| 86 |
+
- `round_number`, `max_rounds` — Episode progress
|
| 87 |
+
- `opponent_strategy` — Opponent strategy name
|
| 88 |
+
- `history` — Full round-by-round history
|
| 89 |
|
| 90 |
+
## Deployment
|
| 91 |
|
| 92 |
```bash
|
| 93 |
+
python spaces/kant/deploy.py
|
| 94 |
```
|
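The Quick Start loop in the new README always plays `"cooperate"`. A minimal sketch of a reactive policy that could replace that fixed move follows, using only the `available_moves` and `opponent_move` observation fields the README documents. The helper name `choose_move` is hypothetical and not part of KantBench. The sketch also assumes that `opponent_move` is an empty string before the first round, and mirroring the opponent only makes sense in symmetric games where both players share one move set.

```python
from typing import Sequence


def choose_move(available_moves: Sequence[str], opponent_move: str) -> str:
    """Tit-for-tat style choice (hypothetical helper, not part of KantBench).

    Mirrors the opponent's previous move when that move is valid for us;
    otherwise falls back to the first available move.
    """
    if not available_moves:
        raise ValueError("no moves available")
    if opponent_move in available_moves:
        return opponent_move
    return available_moves[0]


# Assumption: before the first round there is no opponent move yet ("").
first = choose_move(["cooperate", "defect"], "")
# After the opponent defects, the policy answers in kind.
second = choose_move(["cooperate", "defect"], "defect")
print(first, second)
```

Inside the Quick Start loop this could be called as `env.step(KantBenchAction(move=choose_move(obs.available_moves, obs.opponent_move)))`, where `obs` is `result.observation`.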
__init__.py
CHANGED
|
@@ -1,16 +1,10 @@
|
|
| 1 |
-
|
| 2 |
-
# All rights reserved.
|
| 3 |
-
#
|
| 4 |
-
# This source code is licensed under the BSD-style license found in the
|
| 5 |
-
# LICENSE file in the root directory of this source tree.
|
| 6 |
|
| 7 |
-
|
| 8 |
-
|
| 9 |
-
from .client import KantbenchEnv
|
| 10 |
-
from .models import KantbenchAction, KantbenchObservation
|
| 11 |
|
| 12 |
__all__ = [
|
| 13 |
-
"
|
| 14 |
-
"
|
| 15 |
-
"
|
| 16 |
]
|
|
|
|
| 1 |
+
"""KantBench Environment — 90+ game theory games for LLM training."""
|
| 2 |
|
| 3 |
+
from .client import KantBenchEnv
|
| 4 |
+
from .models import KantBenchAction, KantBenchObservation
|
| 5 |
|
| 6 |
__all__ = [
|
| 7 |
+
"KantBenchAction",
|
| 8 |
+
"KantBenchObservation",
|
| 9 |
+
"KantBenchEnv",
|
| 10 |
]
|
client.py
CHANGED
|
@@ -1,10 +1,4 @@
|
|
| 1 |
-
|
| 2 |
-
# All rights reserved.
|
| 3 |
-
#
|
| 4 |
-
# This source code is licensed under the BSD-style license found in the
|
| 5 |
-
# LICENSE file in the root directory of this source tree.
|
| 6 |
-
|
| 7 |
-
"""Kantbench Environment Client."""
|
| 8 |
|
| 9 |
from typing import Dict
|
| 10 |
|
|
@@ -12,69 +6,54 @@ from openenv.core.client_types import StepResult
|
|
| 12 |
from openenv.core.env_server.types import State
|
| 13 |
from openenv.core import EnvClient
|
| 14 |
|
| 15 |
-
from .models import
|
| 16 |
|
| 17 |
|
| 18 |
-
class
|
| 19 |
-
EnvClient[
|
| 20 |
):
|
| 21 |
"""
|
| 22 |
-
Client for the
|
| 23 |
|
| 24 |
-
|
| 25 |
-
|
| 26 |
-
Each client instance has its own dedicated environment session on the server.
|
| 27 |
|
| 28 |
Example:
|
| 29 |
-
>>>
|
| 30 |
-
>>> with KantbenchEnv(base_url="http://localhost:8000") as client:
|
| 31 |
... result = client.reset()
|
| 32 |
-
... print(result.observation.
|
|
|
|
| 33 |
...
|
| 34 |
-
... result = client.step(
|
| 35 |
-
... print(result.observation.
|
| 36 |
|
| 37 |
-
Example with
|
| 38 |
-
>>>
|
| 39 |
-
>>> client = KantbenchEnv.from_docker_image("KantBench-env:latest")
|
| 40 |
-
>>> try:
|
| 41 |
... result = client.reset()
|
| 42 |
-
... result = client.step(
|
| 43 |
-
... finally:
|
| 44 |
-
... client.close()
|
| 45 |
"""
|
| 46 |
|
| 47 |
-
def _step_payload(self, action:
|
| 48 |
-
""
|
| 49 |
-
Convert KantbenchAction to JSON payload for step message.
|
| 50 |
-
|
| 51 |
-
Args:
|
| 52 |
-
action: KantbenchAction instance
|
| 53 |
-
|
| 54 |
-
Returns:
|
| 55 |
-
Dictionary representation suitable for JSON encoding
|
| 56 |
-
"""
|
| 57 |
-
return {
|
| 58 |
-
"message": action.message,
|
| 59 |
-
}
|
| 60 |
-
|
| 61 |
-
def _parse_result(self, payload: Dict) -> StepResult[KantbenchObservation]:
|
| 62 |
-
"""
|
| 63 |
-
Parse server response into StepResult[KantbenchObservation].
|
| 64 |
|
| 65 |
-
|
| 66 |
-
payload: JSON response data from server
|
| 67 |
-
|
| 68 |
-
Returns:
|
| 69 |
-
StepResult with KantbenchObservation
|
| 70 |
-
"""
|
| 71 |
obs_data = payload.get("observation", {})
|
| 72 |
-
observation =
|
| 73 |
-
|
| 74 |
-
|
| 75 |
done=payload.get("done", False),
|
| 76 |
reward=payload.get("reward"),
|
| 77 |
-
|
| 78 |
)
|
| 79 |
|
| 80 |
return StepResult(
|
|
@@ -84,15 +63,6 @@ class KantbenchEnv(
|
|
| 84 |
)
|
| 85 |
|
| 86 |
def _parse_state(self, payload: Dict) -> State:
|
| 87 |
-
"""
|
| 88 |
-
Parse server response into State object.
|
| 89 |
-
|
| 90 |
-
Args:
|
| 91 |
-
payload: JSON response from state request
|
| 92 |
-
|
| 93 |
-
Returns:
|
| 94 |
-
State object with episode_id and step_count
|
| 95 |
-
"""
|
| 96 |
return State(
|
| 97 |
episode_id=payload.get("episode_id"),
|
| 98 |
step_count=payload.get("step_count", 0),
|
|
|
|
| 1 |
+
"""KantBench Environment Client."""
|
| 2 |
|
| 3 |
from typing import Dict
|
| 4 |
|
|
|
|
| 6 |
from openenv.core.env_server.types import State
|
| 7 |
from openenv.core import EnvClient
|
| 8 |
|
| 9 |
+
from .models import KantBenchAction, KantBenchObservation
|
| 10 |
|
| 11 |
|
| 12 |
+
class KantBenchEnv(
|
| 13 |
+
EnvClient[KantBenchAction, KantBenchObservation]
|
| 14 |
):
|
| 15 |
"""
|
| 16 |
+
Client for the KantBench game theory environment.
|
| 17 |
|
| 18 |
+
Maintains a persistent WebSocket connection to the environment server.
|
| 19 |
+
Each client instance has its own dedicated environment session.
|
|
|
|
| 20 |
|
| 21 |
Example:
|
| 22 |
+
>>> with KantBenchEnv(base_url="http://localhost:8000") as client:
|
|
|
|
| 23 |
... result = client.reset()
|
| 24 |
+
... print(result.observation.game_name)
|
| 25 |
+
... print(result.observation.available_moves)
|
| 26 |
...
|
| 27 |
+
... result = client.step(KantBenchAction(move="cooperate"))
|
| 28 |
+
... print(result.observation.your_payoff)
|
| 29 |
|
| 30 |
+
Example with HF Space:
|
| 31 |
+
>>> with KantBenchEnv(base_url="https://openenv-community-kantbench.hf.space") as client:
|
| 32 |
... result = client.reset()
|
| 33 |
+
... result = client.step(KantBenchAction(move="cooperate"))
|
| 34 |
"""
|
| 35 |
|
| 36 |
+
def _step_payload(self, action: KantBenchAction) -> Dict:
|
| 37 |
+
return {"move": action.move}
|
| 38 |
|
| 39 |
+
def _parse_result(self, payload: Dict) -> StepResult[KantBenchObservation]:
|
| 40 |
obs_data = payload.get("observation", {})
|
| 41 |
+
observation = KantBenchObservation(
|
| 42 |
+
game_name=obs_data.get("game_name", ""),
|
| 43 |
+
game_description=obs_data.get("game_description", ""),
|
| 44 |
+
available_moves=obs_data.get("available_moves", []),
|
| 45 |
+
your_move=obs_data.get("your_move", ""),
|
| 46 |
+
opponent_move=obs_data.get("opponent_move", ""),
|
| 47 |
+
your_payoff=obs_data.get("your_payoff", 0.0),
|
| 48 |
+
opponent_payoff=obs_data.get("opponent_payoff", 0.0),
|
| 49 |
+
cumulative_score=obs_data.get("cumulative_score", 0.0),
|
| 50 |
+
round_number=obs_data.get("round_number", 0),
|
| 51 |
+
max_rounds=obs_data.get("max_rounds", 10),
|
| 52 |
+
opponent_strategy=obs_data.get("opponent_strategy", ""),
|
| 53 |
+
history=obs_data.get("history", []),
|
| 54 |
done=payload.get("done", False),
|
| 55 |
reward=payload.get("reward"),
|
| 56 |
+
message=obs_data.get("message", ""),
|
| 57 |
)
|
| 58 |
|
| 59 |
return StepResult(
|
|
|
|
| 63 |
)
|
| 64 |
|
| 65 |
def _parse_state(self, payload: Dict) -> State:
|
| 66 |
return State(
|
| 67 |
episode_id=payload.get("episode_id"),
|
| 68 |
step_count=payload.get("step_count", 0),
|
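The new `_parse_result` in `client.py` reads every observation field with `dict.get` and a default, so a payload that omits a key still yields a usable observation. A reduced sketch of that pattern follows. `ObservationSketch` and `parse_observation` are hypothetical stand-ins: the real `KantBenchObservation` lives in `models.py`, which is not shown in this diff, and only a subset of its fields is reproduced here.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ObservationSketch:
    """Hypothetical stand-in for KantBenchObservation (subset of fields)."""

    game_name: str = ""
    available_moves: List[str] = field(default_factory=list)
    your_payoff: float = 0.0
    round_number: int = 0
    max_rounds: int = 10
    done: bool = False


def parse_observation(payload: Dict[str, Any]) -> ObservationSketch:
    """Build an observation from a server payload, tolerating missing keys."""
    obs_data = payload.get("observation", {})
    return ObservationSketch(
        game_name=obs_data.get("game_name", ""),
        available_moves=obs_data.get("available_moves", []),
        your_payoff=obs_data.get("your_payoff", 0.0),
        round_number=obs_data.get("round_number", 0),
        max_rounds=obs_data.get("max_rounds", 10),
        # As in the diff, `done` is read from the top level of the payload.
        done=payload.get("done", False),
    )


# Made-up payload in which only some keys are present.
partial = {"observation": {"game_name": "Stag Hunt", "round_number": 3}}
obs = parse_observation(partial)
print(obs.game_name, obs.round_number, obs.max_rounds, obs.available_moves)
```

One consequence of this design is that a malformed or empty payload does not raise. It silently produces default values, so callers that need strict validation would have to check fields such as `available_moves` themselves.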
common/__init__.py
ADDED
|
@@ -0,0 +1 @@
|
|
|
|
|
|
|
| 1 |
+
"""Shared game infrastructure: game definitions, strategies, and extensions."""
|
common/__pycache__/__init__.cpython-311.pyc
ADDED
|
Binary file (251 Bytes).
|
|
|
common/__pycache__/games.cpython-311.pyc
ADDED
|
Binary file (10.8 kB).
|
|
|
common/__pycache__/strategies.cpython-311.pyc
ADDED
|
Binary file (18.8 kB).
|
|
|
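The `common/games.py` file added in this commit computes Trust Game and Public Goods payoffs from integer suffixes on action strings. A worked sketch of that arithmetic follows. All constant values here are assumptions chosen for illustration: the real values are imported from `constant_definitions.game_constants`, which does not appear in this diff. The opponent action name `return_6` is also hypothetical, since the diff only shows that the integer suffix is parsed.

```python
# Hypothetical constants; the real ones live in
# constant_definitions.game_constants and are not shown in this diff.
TRUST_ENDOWMENT = 10
TRUST_MULTIPLIER = 3
PG_ENDOWMENT = 20
PG_MULTIPLIER_NUMERATOR = 3
PG_MULTIPLIER_DENOMINATOR = 2
PG_NUM_PLAYERS = 2


def parse_amount(action: str) -> int:
    """Return the integer suffix of an action such as 'invest_4'."""
    return int(action.rsplit("_", maxsplit=1)[1])


def trust_payoff(player_action: str, opponent_action: str) -> tuple[float, float]:
    """Investor keeps what was not invested plus whatever is returned."""
    investment = parse_amount(player_action)
    returned = parse_amount(opponent_action)
    player = float(TRUST_ENDOWMENT - investment + returned)
    opponent = float(investment * TRUST_MULTIPLIER - returned)
    return player, opponent


def public_goods_payoff(player_action: str, opponent_action: str) -> tuple[float, float]:
    """Contributions are pooled, multiplied, and split equally."""
    p = parse_amount(player_action)
    o = parse_amount(opponent_action)
    pot = (p + o) * PG_MULTIPLIER_NUMERATOR / PG_MULTIPLIER_DENOMINATOR
    share = pot / PG_NUM_PLAYERS
    return float(PG_ENDOWMENT - p) + share, float(PG_ENDOWMENT - o) + share


# Invest 4 of 10; the trustee receives 12 and returns 6.
print(trust_payoff("invest_4", "return_6"))                   # (12.0, 6.0)
# One player contributes 10, the other free-rides.
print(public_goods_payoff("contribute_10", "contribute_0"))   # (17.5, 27.5)
```

Under these assumed constants, the free rider in the public goods example ends with 27.5 against the contributor's 17.5, which is the incentive problem the game is meant to expose.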
common/games.py
ADDED
|
@@ -0,0 +1,298 @@
|
|
| 1 |
+
"""Game configuration registry and payoff computation for KantBench."""
|
| 2 |
+
|
| 3 |
+
from __future__ import annotations
|
| 4 |
+
|
| 5 |
+
from dataclasses import dataclass
|
| 6 |
+
from typing import Callable
|
| 7 |
+
|
| 8 |
+
from constant_definitions.game_constants import (
|
| 9 |
+
DEFAULT_ZERO_FLOAT,
|
| 10 |
+
DEFAULT_ZERO_INT,
|
| 11 |
+
# Prisoner's Dilemma
|
| 12 |
+
PD_CC_PAYOFF,
|
| 13 |
+
PD_CD_PAYOFF,
|
| 14 |
+
PD_DC_PAYOFF,
|
| 15 |
+
PD_DD_PAYOFF,
|
| 16 |
+
# Stag Hunt
|
| 17 |
+
SH_SS_PAYOFF,
|
| 18 |
+
SH_SH_PAYOFF,
|
| 19 |
+
SH_HS_PAYOFF,
|
| 20 |
+
SH_HH_PAYOFF,
|
| 21 |
+
# Hawk-Dove
|
| 22 |
+
HD_HH_PAYOFF,
|
| 23 |
+
HD_HD_PAYOFF,
|
| 24 |
+
HD_DH_PAYOFF,
|
| 25 |
+
HD_DD_PAYOFF,
|
| 26 |
+
# Ultimatum
|
| 27 |
+
ULTIMATUM_POT,
|
| 28 |
+
# Trust
|
| 29 |
+
TRUST_MULTIPLIER,
|
| 30 |
+
TRUST_ENDOWMENT,
|
| 31 |
+
# Public Goods
|
| 32 |
+
PG_MULTIPLIER_NUMERATOR,
|
| 33 |
+
PG_MULTIPLIER_DENOMINATOR,
|
| 34 |
+
PG_ENDOWMENT,
|
| 35 |
+
PG_DEFAULT_NUM_PLAYERS,
|
| 36 |
+
# Round counts
|
| 37 |
+
DEFAULT_NUM_ROUNDS,
|
| 38 |
+
SINGLE_SHOT_ROUNDS,
|
| 39 |
+
)
|
| 40 |
+
|
| 41 |
+
# ---------------------------------------------------------------------------
|
| 42 |
+
# GameConfig dataclass
|
| 43 |
+
# ---------------------------------------------------------------------------
|
| 44 |
+
|
| 45 |
+
|
| 46 |
+
@dataclass(frozen=True)
|
| 47 |
+
class GameConfig:
|
| 48 |
+
"""Immutable specification for a single game type."""
|
| 49 |
+
|
| 50 |
+
name: str
|
| 51 |
+
description: str
|
| 52 |
+
actions: list[str]
|
| 53 |
+
game_type: str # "matrix" | "ultimatum" | "trust" | "public_goods"
|
| 54 |
+
default_rounds: int
|
| 55 |
+
payoff_fn: Callable[[str, str], tuple[float, float]]
|
| 56 |
+
|
| 57 |
+
|
| 58 |
+
# ---------------------------------------------------------------------------
|
| 59 |
+
# Matrix-game payoff helpers
|
| 60 |
+
# ---------------------------------------------------------------------------
|
| 61 |
+
|
| 62 |
+
_PD_MATRIX: dict[tuple[str, str], tuple[float, float]] = {
|
| 63 |
+
("cooperate", "cooperate"): (float(PD_CC_PAYOFF), float(PD_CC_PAYOFF)),
|
| 64 |
+
("cooperate", "defect"): (float(PD_CD_PAYOFF), float(PD_DC_PAYOFF)),
|
| 65 |
+
("defect", "cooperate"): (float(PD_DC_PAYOFF), float(PD_CD_PAYOFF)),
|
| 66 |
+
("defect", "defect"): (float(PD_DD_PAYOFF), float(PD_DD_PAYOFF)),
|
| 67 |
+
}
|
| 68 |
+
|
| 69 |
+
_SH_MATRIX: dict[tuple[str, str], tuple[float, float]] = {
|
| 70 |
+
("stag", "stag"): (float(SH_SS_PAYOFF), float(SH_SS_PAYOFF)),
|
| 71 |
+
("stag", "hare"): (float(SH_SH_PAYOFF), float(SH_HS_PAYOFF)),
|
| 72 |
+
("hare", "stag"): (float(SH_HS_PAYOFF), float(SH_SH_PAYOFF)),
|
| 73 |
+
("hare", "hare"): (float(SH_HH_PAYOFF), float(SH_HH_PAYOFF)),
|
| 74 |
+
}
|
| 75 |
+
|
| 76 |
+
_HD_MATRIX: dict[tuple[str, str], tuple[float, float]] = {
|
| 77 |
+
("hawk", "hawk"): (float(HD_HH_PAYOFF), float(HD_HH_PAYOFF)),
|
| 78 |
+
("hawk", "dove"): (float(HD_HD_PAYOFF), float(HD_DH_PAYOFF)),
|
| 79 |
+
("dove", "hawk"): (float(HD_DH_PAYOFF), float(HD_HD_PAYOFF)),
|
| 80 |
+
("dove", "dove"): (float(HD_DD_PAYOFF), float(HD_DD_PAYOFF)),
|
| 81 |
+
}
|
| 82 |
+
|
| 83 |
+
|
| 84 |
+
def _matrix_payoff_fn(
|
| 85 |
+
matrix: dict[tuple[str, str], tuple[float, float]],
|
| 86 |
+
) -> Callable[[str, str], tuple[float, float]]:
|
| 87 |
+
"""Return a payoff function backed by a pre-built matrix dict."""
|
| 88 |
+
|
| 89 |
+
def _payoff(player_action: str, opponent_action: str) -> tuple[float, float]:
|
| 90 |
+
return matrix[(player_action, opponent_action)]
|
| 91 |
+
|
| 92 |
+
return _payoff
|
| 93 |
+
|
| 94 |
+
|
| 95 |
+
# ---------------------------------------------------------------------------
|
| 96 |
+
# Computed payoff functions
|
| 97 |
+
# ---------------------------------------------------------------------------
|
| 98 |
+
|
| 99 |
+
|
| 100 |
+
def _parse_action_amount(action: str) -> int:
|
| 101 |
+
"""Extract the integer suffix from an action string like 'offer_5'."""
|
| 102 |
+
parts = action.rsplit("_", maxsplit=SINGLE_SHOT_ROUNDS)
|
| 103 |
+
return int(parts[SINGLE_SHOT_ROUNDS])
|
| 104 |
+
|
| 105 |
+
|
| 106 |
+
def _ultimatum_payoff(player_action: str, opponent_action: str) -> tuple[float, float]:
    """Compute Ultimatum Game payoffs.

    The player chooses an offer amount; the opponent accepts or rejects.
    """
    offer = _parse_action_amount(player_action)

    if opponent_action == "reject":
        return (DEFAULT_ZERO_FLOAT, DEFAULT_ZERO_FLOAT)

    # accepted
    player_payoff = float(ULTIMATUM_POT - offer)
    opponent_payoff = float(offer)
    return (player_payoff, opponent_payoff)


def _trust_payoff(player_action: str, opponent_action: str) -> tuple[float, float]:
    """Compute Trust Game payoffs.

    The player invests X from their endowment. The opponent receives
    X * multiplier and returns Y of that amount.
    """
    investment = _parse_action_amount(player_action)
    returned = _parse_action_amount(opponent_action)

    player_payoff = float(TRUST_ENDOWMENT - investment + returned)
    opponent_payoff = float(investment * TRUST_MULTIPLIER - returned)
    return (player_payoff, opponent_payoff)


def _public_goods_payoff(
    player_action: str, opponent_action: str,
) -> tuple[float, float]:
    """Compute Public Goods Game payoffs.

    Each participant contributes from their endowment. The total pot is
    multiplied by (numerator / denominator) then split equally among all
    participants.
    """
    player_contrib = _parse_action_amount(player_action)
    opponent_contrib = _parse_action_amount(opponent_action)

    total_contributions = player_contrib + opponent_contrib
    multiplied_pot = (
        total_contributions * PG_MULTIPLIER_NUMERATOR / PG_MULTIPLIER_DENOMINATOR
    )
    share = multiplied_pot / PG_DEFAULT_NUM_PLAYERS

    player_payoff = float(PG_ENDOWMENT - player_contrib) + share
    opponent_payoff = float(PG_ENDOWMENT - opponent_contrib) + share
    return (player_payoff, opponent_payoff)


# ---------------------------------------------------------------------------
# Action lists for computed games
# ---------------------------------------------------------------------------

_ULTIMATUM_OFFERS: list[str] = [
    f"offer_{i}" for i in range(ULTIMATUM_POT + SINGLE_SHOT_ROUNDS)
]

_TRUST_INVESTMENTS: list[str] = [
    f"invest_{i}" for i in range(TRUST_ENDOWMENT + SINGLE_SHOT_ROUNDS)
]

_PG_CONTRIBUTIONS: list[str] = [
    f"contribute_{i}" for i in range(PG_ENDOWMENT + SINGLE_SHOT_ROUNDS)
]


# ---------------------------------------------------------------------------
# Game registry
# ---------------------------------------------------------------------------

GAMES: dict[str, GameConfig] = {
    "prisoners_dilemma": GameConfig(
        name="Prisoner's Dilemma",
        description=(
            "Two players simultaneously choose to cooperate or defect. "
            "Mutual cooperation yields a moderate reward, mutual defection "
            "yields a low reward, and unilateral defection tempts with the "
            "highest individual payoff at the other player's expense."
        ),
        actions=["cooperate", "defect"],
        game_type="matrix",
        default_rounds=DEFAULT_NUM_ROUNDS,
        payoff_fn=_matrix_payoff_fn(_PD_MATRIX),
    ),
    "stag_hunt": GameConfig(
        name="Stag Hunt",
        description=(
            "Two players choose between hunting stag (risky but rewarding "
            "if both participate) or hunting hare (safe but less rewarding). "
            "Coordination on stag yields the highest joint payoff."
        ),
        actions=["stag", "hare"],
        game_type="matrix",
        default_rounds=DEFAULT_NUM_ROUNDS,
        payoff_fn=_matrix_payoff_fn(_SH_MATRIX),
    ),
    "hawk_dove": GameConfig(
        name="Hawk-Dove",
        description=(
            "Two players choose between aggressive (hawk) and passive (dove) "
            "strategies over a shared resource. Two hawks suffer mutual harm; "
            "a hawk facing a dove claims the resource; two doves share it."
        ),
        actions=["hawk", "dove"],
        game_type="matrix",
        default_rounds=DEFAULT_NUM_ROUNDS,
        payoff_fn=_matrix_payoff_fn(_HD_MATRIX),
    ),
    "ultimatum": GameConfig(
        name="Ultimatum Game",
        description=(
            "The proposer offers a split of a fixed pot. The responder "
            "either accepts (both receive their shares) or rejects "
            "(both receive nothing)."
        ),
        actions=_ULTIMATUM_OFFERS,
        game_type="ultimatum",
        default_rounds=SINGLE_SHOT_ROUNDS,
        payoff_fn=_ultimatum_payoff,
    ),
    "trust": GameConfig(
        name="Trust Game",
        description=(
            "The investor sends part of an endowment; the amount is "
            "multiplied and given to the trustee, who then decides how "
            "much to return."
        ),
        actions=_TRUST_INVESTMENTS,
        game_type="trust",
        default_rounds=SINGLE_SHOT_ROUNDS,
        payoff_fn=_trust_payoff,
    ),
    "public_goods": GameConfig(
        name="Public Goods Game",
        description=(
            "Each participant decides how much of their endowment to "
            "contribute to a common pool. The pool is multiplied and "
            "distributed equally, creating tension between individual "
            "free-riding and collective benefit."
        ),
        actions=_PG_CONTRIBUTIONS,
        game_type="public_goods",
        default_rounds=SINGLE_SHOT_ROUNDS,
        payoff_fn=_public_goods_payoff,
    ),
}


def get_game(name: str) -> GameConfig:
    """Retrieve a GameConfig by its registry key.

    Args:
        name: Key in the GAMES registry (e.g. ``"prisoners_dilemma"``).

    Returns:
        The corresponding :class:`GameConfig` instance.

    Raises:
        KeyError: If *name* is not present in the registry.
    """
    return GAMES[name]


def _load_extensions() -> None:
    """Import extension modules that register additional games."""
    import importlib
    for mod in [
        "common.games_ext.matrix_games", "common.games_ext.sequential",
        "common.games_ext.auction", "common.games_ext.nplayer",
        "common.games_ext.generated", "common.games_info.signaling",
        "common.games_info.contracts", "common.games_info.communication",
        "common.games_info.bayesian", "common.games_info.network",
        "common.games_market.oligopoly", "common.games_market.contests",
        "common.games_market.classic", "common.games_market.generated_v2",
        "common.games_market.advanced", "common.games_coop.cooperative",
        "common.games_coop.dynamic", "common.games_coop.pd_variants",
        "common.games_coop.infinite", "common.games_coop.stochastic",
    ]:
        try:
            importlib.import_module(mod)
        except ImportError:
            pass


_load_extensions()

from common.games_meta.dynamic import (  # noqa: E402,F401
    create_matrix_game, create_symmetric_game, create_custom_game,
)
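The Trust Game and Public Goods formulas in `common/games.py` can be checked by hand with a small standalone sketch. All numeric values below (endowment 10, multiplier 3, and so on) are assumptions chosen for illustration; the real values live in `constant_definitions`, which this diff does not show. `parse_amount` is a hypothetical stand-in for `_parse_action_amount`, assumed to read the integer after the last underscore.

```python
# Illustrative values only; the repository's actual constants are not shown here.
TRUST_ENDOWMENT = 10
TRUST_MULTIPLIER = 3
PG_ENDOWMENT = 20
PG_MULTIPLIER_NUMERATOR = 3
PG_MULTIPLIER_DENOMINATOR = 2
PG_NUM_PLAYERS = 2


def parse_amount(action: str) -> int:
    """Hypothetical stand-in for _parse_action_amount: read the integer suffix."""
    return int(action.rsplit("_", 1)[1])


def trust_payoff(player_action: str, opponent_action: str) -> tuple[float, float]:
    """Investor keeps what was not invested plus what comes back."""
    investment = parse_amount(player_action)
    returned = parse_amount(opponent_action)
    investor = float(TRUST_ENDOWMENT - investment + returned)
    trustee = float(investment * TRUST_MULTIPLIER - returned)
    return (investor, trustee)


def public_goods_payoff(player_action: str, opponent_action: str) -> tuple[float, float]:
    """Each player keeps the uncontributed endowment plus an equal pot share."""
    player_contrib = parse_amount(player_action)
    opponent_contrib = parse_amount(opponent_action)
    pot = (
        (player_contrib + opponent_contrib)
        * PG_MULTIPLIER_NUMERATOR / PG_MULTIPLIER_DENOMINATOR
    )
    share = pot / PG_NUM_PLAYERS
    return (
        float(PG_ENDOWMENT - player_contrib) + share,
        float(PG_ENDOWMENT - opponent_contrib) + share,
    )


# Trust: invest 4, trustee receives 12 and returns 6.
trust_result = trust_payoff("invest_4", "return_6")

# Public goods: a lone contributor ends up worse off than the free rider.
one_sided = public_goods_payoff("contribute_10", "contribute_0")
full_coop = public_goods_payoff("contribute_20", "contribute_20")
```

Under these assumed values the one-sided contribution leaves the contributor with less than the free rider, while full contribution by both beats contributing nothing. Note also that `_trust_payoff`, as it appears in this file, does not itself check that the returned amount is at most the multiplied investment, so any such validation would have to happen elsewhere.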
common/games_coop/__pycache__/cooperative.cpython-311.pyc
ADDED
Binary file (9.15 kB).
common/games_coop/__pycache__/dynamic.cpython-311.pyc
ADDED
Binary file (7.66 kB).
common/games_coop/__pycache__/infinite.cpython-311.pyc
ADDED
Binary file (3.83 kB).
common/games_coop/__pycache__/pd_variants.cpython-311.pyc
ADDED
Binary file (5.9 kB).
common/games_coop/__pycache__/stochastic.cpython-311.pyc
ADDED
Binary file (6.17 kB).
common/games_coop/cooperative.py
ADDED
|
@@ -0,0 +1,169 @@
"""Cooperative game theory and social choice games for KantBench."""
from __future__ import annotations

from common.games import GAMES, GameConfig, _matrix_payoff_fn
from constant_definitions.game_constants import DEFAULT_NUM_ROUNDS, SINGLE_SHOT_ROUNDS
from constant_definitions.ext.cooperative_constants import (
    SHAPLEY_GRAND_COALITION_VALUE, SHAPLEY_SINGLE_VALUE,
    SHAPLEY_MAX_CLAIM,
    CORE_POT,
    WV_QUOTA, WV_PLAYER_WEIGHT, WV_OPPONENT_WEIGHT,
    WV_PASS_BENEFIT, WV_FAIL_PAYOFF, WV_OPPOSITION_BONUS,
    SM_TOP_MATCH_PAYOFF, SM_MID_MATCH_PAYOFF, SM_LOW_MATCH_PAYOFF,
    MV_POSITION_RANGE, MV_DISTANCE_COST,
    AV_PREFERRED_WIN, AV_ACCEPTABLE_WIN, AV_DISLIKED_WIN,
    AV_NUM_CANDIDATES,
)

_ONE = int(bool(True))
_TWO = _ONE + _ONE
_ZERO_F = float()


# -- Shapley Value Allocation --
def _shapley_payoff(pa: str, oa: str) -> tuple[float, float]:
    """Each proposes a claim. Compatible claims split; else disagreement."""
    c_p = int(pa.rsplit("_", _ONE)[_ONE])
    c_o = int(oa.rsplit("_", _ONE)[_ONE])
    if c_p + c_o <= SHAPLEY_GRAND_COALITION_VALUE:
        return (float(c_p), float(c_o))
    return (float(SHAPLEY_SINGLE_VALUE), float(SHAPLEY_SINGLE_VALUE))


_SHAPLEY_ACTS = [f"claim_{i}" for i in range(SHAPLEY_MAX_CLAIM + _ONE)]


# -- Core / Divide-the-Dollar --
def _core_payoff(pa: str, oa: str) -> tuple[float, float]:
    """Each proposes how much they want. If feasible, they get it."""
    d_p = int(pa.rsplit("_", _ONE)[_ONE])
    d_o = int(oa.rsplit("_", _ONE)[_ONE])
    if d_p + d_o <= CORE_POT:
        return (float(d_p), float(d_o))
    return (_ZERO_F, _ZERO_F)


_CORE_ACTS = [f"claim_{i}" for i in range(CORE_POT + _ONE)]


# -- Weighted Voting --
def _weighted_voting_payoff(pa: str, oa: str) -> tuple[float, float]:
    """Players vote yes or no; proposal passes if weighted votes meet quota."""
    p_yes = pa == "vote_yes"
    o_yes = oa == "vote_yes"
    total_weight = int()
    if p_yes:
        total_weight += WV_PLAYER_WEIGHT
    if o_yes:
        total_weight += WV_OPPONENT_WEIGHT
    passes = total_weight >= WV_QUOTA
    if passes:
        return (float(WV_PASS_BENEFIT), float(WV_PASS_BENEFIT))
    p_pay = float(WV_OPPOSITION_BONUS) if not p_yes else float(WV_FAIL_PAYOFF)
    o_pay = float(WV_OPPOSITION_BONUS) if not o_yes else float(WV_FAIL_PAYOFF)
    return (p_pay, o_pay)


# -- Stable Matching (preference revelation) --
_SM_MATRIX: dict[tuple[str, str], tuple[float, float]] = {
    ("rank_abc", "rank_abc"): (float(SM_TOP_MATCH_PAYOFF), float(SM_TOP_MATCH_PAYOFF)),
    ("rank_abc", "rank_bac"): (float(SM_MID_MATCH_PAYOFF), float(SM_TOP_MATCH_PAYOFF)),
    ("rank_abc", "rank_cab"): (float(SM_LOW_MATCH_PAYOFF), float(SM_MID_MATCH_PAYOFF)),
    ("rank_bac", "rank_abc"): (float(SM_TOP_MATCH_PAYOFF), float(SM_MID_MATCH_PAYOFF)),
    ("rank_bac", "rank_bac"): (float(SM_MID_MATCH_PAYOFF), float(SM_MID_MATCH_PAYOFF)),
    ("rank_bac", "rank_cab"): (float(SM_LOW_MATCH_PAYOFF), float(SM_LOW_MATCH_PAYOFF)),
    ("rank_cab", "rank_abc"): (float(SM_MID_MATCH_PAYOFF), float(SM_LOW_MATCH_PAYOFF)),
    ("rank_cab", "rank_bac"): (float(SM_LOW_MATCH_PAYOFF), float(SM_LOW_MATCH_PAYOFF)),
    ("rank_cab", "rank_cab"): (float(SM_TOP_MATCH_PAYOFF), float(SM_TOP_MATCH_PAYOFF)),
}


# -- Median Voter --
def _median_voter_payoff(pa: str, oa: str) -> tuple[float, float]:
    """Each picks a policy position; outcome is the median."""
    pos_p = int(pa.rsplit("_", _ONE)[_ONE])
    pos_o = int(oa.rsplit("_", _ONE)[_ONE])
    median = (pos_p + pos_o) // _TWO
    p_pay = float(-MV_DISTANCE_COST * abs(pos_p - median))
    o_pay = float(-MV_DISTANCE_COST * abs(pos_o - median))
    return (p_pay, o_pay)


_MV_ACTS = [f"position_{i}" for i in range(MV_POSITION_RANGE + _ONE)]


# -- Approval Voting --
def _approval_voting_payoff(pa: str, oa: str) -> tuple[float, float]:
    """Each approves a candidate. Candidate with most approvals wins."""
    if pa == oa:
        return (float(AV_PREFERRED_WIN), float(AV_PREFERRED_WIN))
    return (float(AV_DISLIKED_WIN), float(AV_DISLIKED_WIN))


_AV_ACTS = [f"approve_{chr(ord('a') + i)}" for i in range(AV_NUM_CANDIDATES)]

COOPERATIVE_GAMES: dict[str, GameConfig] = {
    "shapley_allocation": GameConfig(
        name="Shapley Value Allocation",
        description=(
            "Players claim shares of a coalition surplus. If claims are "
            "compatible, each receives their claim; otherwise both receive "
            "only their standalone value. Tests fair division reasoning."
        ),
        actions=_SHAPLEY_ACTS, game_type="shapley",
        default_rounds=SINGLE_SHOT_ROUNDS, payoff_fn=_shapley_payoff,
    ),
    "core_divide_dollar": GameConfig(
        name="Core / Divide-the-Dollar",
        description=(
            "Players simultaneously claim shares of a pot. If total "
            "claims are feasible, each gets their share; otherwise "
            "both get nothing. Tests coalition stability reasoning."
        ),
        actions=_CORE_ACTS, game_type="core",
        default_rounds=SINGLE_SHOT_ROUNDS, payoff_fn=_core_payoff,
    ),
    "weighted_voting": GameConfig(
        name="Weighted Voting Game",
        description=(
            "Players with different voting weights decide yes or no on "
            "a proposal. The proposal passes if the weighted total meets "
            "a quota. Tests understanding of pivotal power dynamics."
        ),
        actions=["vote_yes", "vote_no"], game_type="matrix",
        default_rounds=DEFAULT_NUM_ROUNDS, payoff_fn=_weighted_voting_payoff,
    ),
    "stable_matching": GameConfig(
        name="Stable Matching",
        description=(
            "Players report preference rankings over potential partners. "
            "The matching outcome depends on reported preferences. Tests "
            "whether agents report truthfully or strategically manipulate."
        ),
        actions=["rank_abc", "rank_bac", "rank_cab"], game_type="matrix",
        default_rounds=DEFAULT_NUM_ROUNDS,
        payoff_fn=_matrix_payoff_fn(_SM_MATRIX),
    ),
    "median_voter": GameConfig(
        name="Median Voter Game",
        description=(
            "Players choose policy positions on a line. The implemented "
            "policy is the median. Each player's payoff decreases with "
            "distance from the outcome. Tests strategic positioning."
        ),
        actions=_MV_ACTS, game_type="median_voter",
        default_rounds=DEFAULT_NUM_ROUNDS, payoff_fn=_median_voter_payoff,
    ),
    "approval_voting": GameConfig(
        name="Approval Voting",
        description=(
            "Players approve one candidate from a set. The candidate "
            "with the most approvals wins. Tests strategic vs sincere "
            "voting behavior and preference aggregation."
        ),
        actions=_AV_ACTS, game_type="matrix",
        default_rounds=DEFAULT_NUM_ROUNDS, payoff_fn=_approval_voting_payoff,
    ),
}

GAMES.update(COOPERATIVE_GAMES)
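The weighted voting rule in `_weighted_voting_payoff` is easiest to read as a full payoff table. The sketch below re-implements the same logic standalone and enumerates all four vote profiles. Every constant value here (quota 3, weights 3 and 2, payoffs 5, 1, 2) is an assumption for illustration only; the actual values are defined in `cooperative_constants`, which is not part of this diff.

```python
from itertools import product

# Assumed values for illustration; the real ones come from cooperative_constants.
WV_QUOTA = 3
WV_PLAYER_WEIGHT = 3
WV_OPPONENT_WEIGHT = 2
WV_PASS_BENEFIT = 5
WV_FAIL_PAYOFF = 1
WV_OPPOSITION_BONUS = 2


def weighted_voting_payoff(pa: str, oa: str) -> tuple[float, float]:
    """Same rule as the source: pass if the weighted yes-votes meet the quota."""
    p_yes = pa == "vote_yes"
    o_yes = oa == "vote_yes"
    total_weight = 0
    if p_yes:
        total_weight += WV_PLAYER_WEIGHT
    if o_yes:
        total_weight += WV_OPPONENT_WEIGHT
    if total_weight >= WV_QUOTA:
        return (float(WV_PASS_BENEFIT), float(WV_PASS_BENEFIT))
    # Proposal failed: "no" voters get the opposition bonus, "yes" voters do not.
    p_pay = float(WV_OPPOSITION_BONUS) if not p_yes else float(WV_FAIL_PAYOFF)
    o_pay = float(WV_OPPOSITION_BONUS) if not o_yes else float(WV_FAIL_PAYOFF)
    return (p_pay, o_pay)


ACTIONS = ["vote_yes", "vote_no"]
table = {
    (pa, oa): weighted_voting_payoff(pa, oa)
    for pa, oa in product(ACTIONS, ACTIONS)
}

for profile, payoffs in table.items():
    print(profile, "->", payoffs)
```

With these assumed weights the first player alone meets the quota, so the proposal passes whenever that player votes yes, and the second player's vote never changes the outcome. Different weights or a higher quota would change which player, if any, is pivotal.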
common/games_coop/dynamic.py
ADDED
|
@@ -0,0 +1,162 @@
"""Dynamic, behavioral, and repeated games for KantBench."""
from __future__ import annotations

from common.games import GAMES, GameConfig, _matrix_payoff_fn
from constant_definitions.game_constants import DEFAULT_NUM_ROUNDS, SINGLE_SHOT_ROUNDS
from constant_definitions.ext.dynamic_constants import (
    BR_PATIENCE_REWARD, BR_EARLY_WITHDRAW, BR_BANK_FAIL_PAYOFF,
    GSH_STAG_PAYOFF, GSH_HARE_PAYOFF, GSH_STAG_ALONE_PAYOFF,
    BC_MAX_NUMBER, BC_TARGET_FRACTION_NUM, BC_TARGET_FRACTION_DEN,
    BC_WIN_PAYOFF, BC_LOSE_PAYOFF, BC_TIE_PAYOFF,
    HDB_RESOURCE_VALUE, HDB_FIGHT_COST, HDB_SHARE_DIVISOR,
)
from constant_definitions.game_constants import (
    PD_CC_PAYOFF, PD_CD_PAYOFF, PD_DC_PAYOFF, PD_DD_PAYOFF,
)

_ONE = int(bool(True))
_TWO = _ONE + _ONE
_ZERO_F = float()


# -- Bank Run (Diamond-Dybvig) --
_BR_MATRIX: dict[tuple[str, str], tuple[float, float]] = {
    ("wait", "wait"): (float(BR_PATIENCE_REWARD), float(BR_PATIENCE_REWARD)),
    ("wait", "withdraw"): (float(BR_BANK_FAIL_PAYOFF), float(BR_EARLY_WITHDRAW)),
    ("withdraw", "wait"): (float(BR_EARLY_WITHDRAW), float(BR_BANK_FAIL_PAYOFF)),
    ("withdraw", "withdraw"): (float(BR_BANK_FAIL_PAYOFF), float(BR_BANK_FAIL_PAYOFF)),
}


# -- Global Stag Hunt (higher stakes variant) --
_GSH_MATRIX: dict[tuple[str, str], tuple[float, float]] = {
    ("stag", "stag"): (float(GSH_STAG_PAYOFF), float(GSH_STAG_PAYOFF)),
    ("stag", "hare"): (float(GSH_STAG_ALONE_PAYOFF), float(GSH_HARE_PAYOFF)),
    ("hare", "stag"): (float(GSH_HARE_PAYOFF), float(GSH_STAG_ALONE_PAYOFF)),
    ("hare", "hare"): (float(GSH_HARE_PAYOFF), float(GSH_HARE_PAYOFF)),
}


# -- Beauty Contest (p-Guessing Game) --
def _beauty_contest_payoff(pa: str, oa: str) -> tuple[float, float]:
    """Each picks a number. Closest to p * average wins."""
    n_p = int(pa.rsplit("_", _ONE)[_ONE])
    n_o = int(oa.rsplit("_", _ONE)[_ONE])
    avg = float(n_p + n_o) / _TWO
    target = avg * BC_TARGET_FRACTION_NUM / BC_TARGET_FRACTION_DEN
    dist_p = abs(float(n_p) - target)
    dist_o = abs(float(n_o) - target)
    if dist_p < dist_o:
        return (float(BC_WIN_PAYOFF), float(BC_LOSE_PAYOFF))
    if dist_o < dist_p:
        return (float(BC_LOSE_PAYOFF), float(BC_WIN_PAYOFF))
    return (float(BC_TIE_PAYOFF), float(BC_TIE_PAYOFF))


_BC_ACTS = [f"guess_{i}" for i in range(BC_MAX_NUMBER + _ONE)]


# -- Hawk-Dove-Bourgeois --
_V = float(HDB_RESOURCE_VALUE)
_C = float(HDB_FIGHT_COST)
_S = _V / float(HDB_SHARE_DIVISOR)
_HDB_MATRIX: dict[tuple[str, str], tuple[float, float]] = {
    ("hawk", "hawk"): ((_V - _C) / _TWO, (_V - _C) / _TWO),
    ("hawk", "dove"): (_V, _ZERO_F),
    ("hawk", "bourgeois"): (_V / _TWO, (_V - _C) / (float(_TWO) * _TWO)),
    ("dove", "hawk"): (_ZERO_F, _V),
    ("dove", "dove"): (_S, _S),
    ("dove", "bourgeois"): (_S / _TWO, _S + _V / (float(_TWO) * _TWO)),
    ("bourgeois", "hawk"): ((_V - _C) / (float(_TWO) * _TWO), _V / _TWO),
    ("bourgeois", "dove"): (_S + _V / (float(_TWO) * _TWO), _S / _TWO),
    ("bourgeois", "bourgeois"): (_S, _S),
}


# -- Finitely Repeated PD (same payoffs, explicit short horizon) --
_FPD_MATRIX: dict[tuple[str, str], tuple[float, float]] = {
    ("cooperate", "cooperate"): (float(PD_CC_PAYOFF), float(PD_CC_PAYOFF)),
    ("cooperate", "defect"): (float(PD_CD_PAYOFF), float(PD_DC_PAYOFF)),
    ("defect", "cooperate"): (float(PD_DC_PAYOFF), float(PD_CD_PAYOFF)),
    ("defect", "defect"): (float(PD_DD_PAYOFF), float(PD_DD_PAYOFF)),
}

_FIVE = _TWO + _TWO + _ONE
_MARKOV_ROUNDS = _FIVE + _FIVE + _FIVE

DYNAMIC_GAMES: dict[str, GameConfig] = {
    "bank_run": GameConfig(
        name="Bank Run (Diamond-Dybvig)",
        description=(
            "Depositors simultaneously decide whether to withdraw early. "
            "If both wait, the bank survives and both earn a premium. If "
            "both withdraw, the bank fails. Models coordination failure "
            "in financial systems."
        ),
        actions=["wait", "withdraw"], game_type="matrix",
        default_rounds=DEFAULT_NUM_ROUNDS,
        payoff_fn=_matrix_payoff_fn(_BR_MATRIX),
    ),
    "global_stag_hunt": GameConfig(
        name="Global Stag Hunt",
        description=(
            "A higher-stakes Stag Hunt modeling coordination under "
            "uncertainty. Both hunting stag yields a large payoff but "
            "hunting stag alone yields nothing. Models bank runs, "
            "currency attacks, and regime change dynamics."
        ),
        actions=["stag", "hare"], game_type="matrix",
        default_rounds=DEFAULT_NUM_ROUNDS,
        payoff_fn=_matrix_payoff_fn(_GSH_MATRIX),
    ),
    "beauty_contest": GameConfig(
        name="Keynesian Beauty Contest",
        description=(
            "Each player picks a number. The winner is closest to a "
            "target fraction of the average. Tests depth of strategic "
            "reasoning and level-k thinking. The unique Nash equilibrium "
            "is zero, reached through iterated elimination."
        ),
        actions=_BC_ACTS, game_type="beauty_contest",
        default_rounds=SINGLE_SHOT_ROUNDS,
        payoff_fn=_beauty_contest_payoff,
    ),
    "hawk_dove_bourgeois": GameConfig(
        name="Hawk-Dove-Bourgeois",
        description=(
            "Extended Hawk-Dove with a Bourgeois strategy that plays "
            "Hawk when incumbent and Dove when intruder. The Bourgeois "
            "strategy is an evolutionarily stable strategy. Tests "
            "reasoning about ownership conventions."
        ),
        actions=["hawk", "dove", "bourgeois"], game_type="matrix",
        default_rounds=DEFAULT_NUM_ROUNDS,
        payoff_fn=_matrix_payoff_fn(_HDB_MATRIX),
    ),
    "finitely_repeated_pd": GameConfig(
        name="Finitely Repeated Prisoner's Dilemma",
        description=(
            "A Prisoner's Dilemma played for a known finite number of "
            "rounds. Backward induction predicts mutual defection in "
            "every round, yet cooperation often emerges experimentally. "
            "Tests backward induction versus cooperation heuristics."
        ),
        actions=["cooperate", "defect"], game_type="matrix",
        default_rounds=_FIVE,
        payoff_fn=_matrix_payoff_fn(_FPD_MATRIX),
    ),
    "markov_game": GameConfig(
        name="Markov Decision Game",
        description=(
            "A repeated game where the payoff structure shifts based on "
            "recent history. Players must adapt strategies to changing "
            "incentives. Tests dynamic programming and Markov-perfect "
            "equilibrium reasoning over multiple rounds."
        ),
        actions=["cooperate", "defect"], game_type="matrix",
        default_rounds=_MARKOV_ROUNDS,
        payoff_fn=_matrix_payoff_fn(_FPD_MATRIX),
    ),
}

GAMES.update(DYNAMIC_GAMES)
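The two-player beauty contest in `_beauty_contest_payoff` has a simple consequence worth spelling out. With guesses a < b and a target fraction p < 1, the target p(a + b)/2 lies strictly below the midpoint (a + b)/2, so the lower guess is always the closer one. The sketch below checks this by brute force. The constants (p = 2/3, maximum guess 20, payoffs 10/0/5) are assumptions for illustration, not the values from `dynamic_constants`.

```python
# Assumed values for illustration; the real ones come from dynamic_constants.
BC_MAX_NUMBER = 20
BC_TARGET_FRACTION_NUM = 2
BC_TARGET_FRACTION_DEN = 3
BC_WIN_PAYOFF = 10
BC_LOSE_PAYOFF = 0
BC_TIE_PAYOFF = 5


def beauty_contest_payoff(pa: str, oa: str) -> tuple[float, float]:
    """Same rule as the source: the guess closest to p * average wins."""
    n_p = int(pa.rsplit("_", 1)[1])
    n_o = int(oa.rsplit("_", 1)[1])
    avg = float(n_p + n_o) / 2
    target = avg * BC_TARGET_FRACTION_NUM / BC_TARGET_FRACTION_DEN
    dist_p = abs(float(n_p) - target)
    dist_o = abs(float(n_o) - target)
    if dist_p < dist_o:
        return (float(BC_WIN_PAYOFF), float(BC_LOSE_PAYOFF))
    if dist_o < dist_p:
        return (float(BC_LOSE_PAYOFF), float(BC_WIN_PAYOFF))
    return (float(BC_TIE_PAYOFF), float(BC_TIE_PAYOFF))


def lower_guess_always_wins() -> bool:
    """Brute-force check over every pair of distinct guesses."""
    win_lose = (float(BC_WIN_PAYOFF), float(BC_LOSE_PAYOFF))
    for low in range(BC_MAX_NUMBER + 1):
        for high in range(low + 1, BC_MAX_NUMBER + 1):
            if beauty_contest_payoff(f"guess_{low}", f"guess_{high}") != win_lose:
                return False
    return True


example = beauty_contest_payoff("guess_3", "guess_6")  # average 4.5, target 3.0
tie = beauty_contest_payoff("guess_7", "guess_7")
check = lower_guess_always_wins()
```

Under the stated assumption p < 1, this is consistent with the description's claim that zero is the equilibrium guess: in the two-player version no level-k iteration is needed, because `guess_0` can never lose to a higher guess.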
common/games_coop/infinite.py
ADDED
|
@@ -0,0 +1,72 @@
"""Infinite-horizon and continuous games for KantBench."""
from __future__ import annotations

from common.games import GAMES, GameConfig, _matrix_payoff_fn
from constant_definitions.game_constants import DEFAULT_NUM_ROUNDS
from constant_definitions.var.infinite_constants import (
    CPD_BENEFIT_NUMERATOR, CPD_COST_NUMERATOR, CPD_DENOMINATOR,
    CPD_MAX_LEVEL,
    DPD_TEMPTATION, DPD_REWARD, DPD_PUNISHMENT, DPD_SUCKER,
    DPD_DEFAULT_ROUNDS,
)

_ONE = int(bool(True))


# -- Continuous PD (variable contribution levels) --
def _continuous_pd_payoff(pa: str, oa: str) -> tuple[float, float]:
    """Each player chooses a cooperation level. Higher = costlier but benefits opponent."""
    lvl_p = int(pa.rsplit("_", _ONE)[_ONE])
    lvl_o = int(oa.rsplit("_", _ONE)[_ONE])
    p_pay = float(lvl_o * CPD_BENEFIT_NUMERATOR) / CPD_DENOMINATOR
    p_pay -= float(lvl_p * CPD_COST_NUMERATOR) / CPD_DENOMINATOR
    o_pay = float(lvl_p * CPD_BENEFIT_NUMERATOR) / CPD_DENOMINATOR
    o_pay -= float(lvl_o * CPD_COST_NUMERATOR) / CPD_DENOMINATOR
    return (p_pay, o_pay)


_CPD_ACTS = [f"level_{i}" for i in range(CPD_MAX_LEVEL + _ONE)]


# -- Discounted PD (high-stakes, long-horizon) --
_DPD_MATRIX: dict[tuple[str, str], tuple[float, float]] = {
    ("cooperate", "cooperate"): (float(DPD_REWARD), float(DPD_REWARD)),
    ("cooperate", "defect"): (float(DPD_SUCKER), float(DPD_TEMPTATION)),
    ("defect", "cooperate"): (float(DPD_TEMPTATION), float(DPD_SUCKER)),
    ("defect", "defect"): (float(DPD_PUNISHMENT), float(DPD_PUNISHMENT)),
}


# -- Register --
INFINITE_GAMES: dict[str, GameConfig] = {
    "continuous_pd": GameConfig(
        name="Continuous Prisoner's Dilemma",
        description=(
            "A generalization of the Prisoner's Dilemma with variable "
            "cooperation levels instead of binary choices. Each unit of "
            "cooperation costs the player but benefits the opponent more. "
            "Tests whether agents find intermediate cooperation strategies "
            "in continuous action spaces."
        ),
        actions=_CPD_ACTS,
        game_type="continuous_pd",
        default_rounds=DEFAULT_NUM_ROUNDS,
        payoff_fn=_continuous_pd_payoff,
    ),
    "discounted_pd": GameConfig(
        name="Discounted Prisoner's Dilemma",
        description=(
            "A high-stakes Prisoner's Dilemma with many rounds, modeling "
            "an effectively infinite repeated interaction. The shadow of "
            "the future makes cooperation sustainable under folk theorem "
            "conditions. Tests long-horizon strategic reasoning with "
            "higher temptation and reward differentials."
        ),
        actions=["cooperate", "defect"],
        game_type="matrix",
        default_rounds=DPD_DEFAULT_ROUNDS,
        payoff_fn=_matrix_payoff_fn(_DPD_MATRIX),
    ),
}

GAMES.update(INFINITE_GAMES)
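The continuous Prisoner's Dilemma payoff in `_continuous_pd_payoff` is linear: each player receives benefit/denominator per unit of the opponent's level and pays cost/denominator per unit of their own. The sketch below evaluates three profiles under assumed constants (benefit 3, cost 1, denominator 2); the real values are defined in `infinite_constants` and are not visible in this diff.

```python
# Assumed values for illustration; the real ones come from infinite_constants.
CPD_BENEFIT_NUMERATOR = 3
CPD_COST_NUMERATOR = 1
CPD_DENOMINATOR = 2


def continuous_pd_payoff(pa: str, oa: str) -> tuple[float, float]:
    """Same arithmetic as the source: gain from the opponent's level, pay for your own."""
    lvl_p = int(pa.rsplit("_", 1)[1])
    lvl_o = int(oa.rsplit("_", 1)[1])
    p_pay = float(lvl_o * CPD_BENEFIT_NUMERATOR) / CPD_DENOMINATOR
    p_pay -= float(lvl_p * CPD_COST_NUMERATOR) / CPD_DENOMINATOR
    o_pay = float(lvl_p * CPD_BENEFIT_NUMERATOR) / CPD_DENOMINATOR
    o_pay -= float(lvl_o * CPD_COST_NUMERATOR) / CPD_DENOMINATOR
    return (p_pay, o_pay)


mutual_high = continuous_pd_payoff("level_4", "level_4")
exploit = continuous_pd_payoff("level_0", "level_4")
mutual_zero = continuous_pd_payoff("level_0", "level_0")
```

Because a player's own level enters their payoff only as a cost, `level_0` is the dominant choice in a single round whenever the cost is positive, while mutual high cooperation beats mutual zero whenever the benefit exceeds the cost. That is the same tension as the binary game, spread over a graded action set.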
common/games_coop/pd_variants.py
ADDED
|
@@ -0,0 +1,145 @@
"""Prisoner's Dilemma variants for KantBench."""
from __future__ import annotations

from common.games import GAMES, GameConfig, _matrix_payoff_fn
from constant_definitions.game_constants import (
    PD_CC_PAYOFF, PD_CD_PAYOFF, PD_DC_PAYOFF, PD_DD_PAYOFF,
    DEFAULT_NUM_ROUNDS, SINGLE_SHOT_ROUNDS,
)
from constant_definitions.var.pd_variant_constants import (
    OPD_EXIT_PAYOFF,
    APD_A_TEMPTATION, APD_A_REWARD, APD_A_PUNISHMENT, APD_A_SUCKER,
    APD_B_TEMPTATION, APD_B_REWARD, APD_B_PUNISHMENT, APD_B_SUCKER,
    DONATION_BENEFIT, DONATION_COST,
    FOF_SHARE_PAYOFF, FOF_STEAL_WIN_PAYOFF,
    PW_DISARM_DISARM, PW_DISARM_ARM, PW_ARM_DISARM, PW_ARM_ARM,
)

_ZERO_F = float()


# -- Optional PD (cooperate / defect / exit) --
_OPD_EXIT_F = float(OPD_EXIT_PAYOFF)
_OPD_BASE: dict[tuple[str, str], tuple[float, float]] = {
    ("cooperate", "cooperate"): (float(PD_CC_PAYOFF), float(PD_CC_PAYOFF)),
    ("cooperate", "defect"): (float(PD_CD_PAYOFF), float(PD_DC_PAYOFF)),
    ("defect", "cooperate"): (float(PD_DC_PAYOFF), float(PD_CD_PAYOFF)),
    ("defect", "defect"): (float(PD_DD_PAYOFF), float(PD_DD_PAYOFF)),
}


def _optional_pd_payoff(pa: str, oa: str) -> tuple[float, float]:
    if pa == "exit" or oa == "exit":
        return (_OPD_EXIT_F, _OPD_EXIT_F)
    return _OPD_BASE[(pa, oa)]


# -- Asymmetric PD (alibi game: different payoffs per player) --
_ASYM_PD: dict[tuple[str, str], tuple[float, float]] = {
    ("cooperate", "cooperate"): (float(APD_A_REWARD), float(APD_B_REWARD)),
    ("cooperate", "defect"): (float(APD_A_SUCKER), float(APD_B_TEMPTATION)),
    ("defect", "cooperate"): (float(APD_A_TEMPTATION), float(APD_B_SUCKER)),
    ("defect", "defect"): (float(APD_A_PUNISHMENT), float(APD_B_PUNISHMENT)),
}


# -- Donation Game (pay cost c to give benefit b to opponent) --
_DG: dict[tuple[str, str], tuple[float, float]] = {
    ("donate", "donate"): (
        float(DONATION_BENEFIT - DONATION_COST),
        float(DONATION_BENEFIT - DONATION_COST),
    ),
    ("donate", "keep"): (float(-DONATION_COST), float(DONATION_BENEFIT)),
    ("keep", "donate"): (float(DONATION_BENEFIT), float(-DONATION_COST)),
    ("keep", "keep"): (_ZERO_F, _ZERO_F),
}


# -- Friend or Foe (game show: both defect yields zero) --
_FOF: dict[tuple[str, str], tuple[float, float]] = {
    ("friend", "friend"): (float(FOF_SHARE_PAYOFF), float(FOF_SHARE_PAYOFF)),
    ("friend", "foe"): (_ZERO_F, float(FOF_STEAL_WIN_PAYOFF)),
    ("foe", "friend"): (float(FOF_STEAL_WIN_PAYOFF), _ZERO_F),
    ("foe", "foe"): (_ZERO_F, _ZERO_F),
}


# -- Peace-War Game (arms race framing from international relations) --
_PW: dict[tuple[str, str], tuple[float, float]] = {
    ("disarm", "disarm"): (float(PW_DISARM_DISARM), float(PW_DISARM_DISARM)),
    ("disarm", "arm"): (float(PW_DISARM_ARM), float(PW_ARM_DISARM)),
    ("arm", "disarm"): (float(PW_ARM_DISARM), float(PW_DISARM_ARM)),
    ("arm", "arm"): (float(PW_ARM_ARM), float(PW_ARM_ARM)),
}


# -- Register --
PD_VARIANT_GAMES: dict[str, GameConfig] = {
    "optional_pd": GameConfig(
        name="Optional Prisoner's Dilemma",
        description=(
            "A Prisoner's Dilemma with a third action: exit. Exiting gives "
            "a safe intermediate payoff regardless of the opponent's choice. "
            "Tests whether outside options change cooperation dynamics and "
            "models situations where players can walk away from interactions."
        ),
        actions=["cooperate", "defect", "exit"],
        game_type="matrix",
        default_rounds=DEFAULT_NUM_ROUNDS,
        payoff_fn=_optional_pd_payoff,
    ),
    "asymmetric_pd": GameConfig(
        name="Asymmetric Prisoner's Dilemma",
        description=(
            "A Prisoner's Dilemma where players have unequal payoff "
            "structures. The first player has an alibi advantage with a "
|
| 96 |
+
"higher punishment payoff. Tests strategic reasoning under "
|
| 97 |
+
"asymmetric incentive conditions."
|
| 98 |
+
),
|
| 99 |
+
actions=["cooperate", "defect"],
|
| 100 |
+
game_type="matrix",
|
| 101 |
+
default_rounds=DEFAULT_NUM_ROUNDS,
|
| 102 |
+
payoff_fn=_matrix_payoff_fn(_ASYM_PD),
|
| 103 |
+
),
|
| 104 |
+
"donation_game": GameConfig(
|
| 105 |
+
name="Donation Game",
|
| 106 |
+
description=(
|
| 107 |
+
"A simplified cooperation model: each player independently "
|
| 108 |
+
"decides whether to donate. Donating costs the donor but "
|
| 109 |
+
"gives a larger benefit to the recipient. The dominant "
|
| 110 |
+
"strategy is to keep, but mutual donation is Pareto superior."
|
| 111 |
+
),
|
| 112 |
+
actions=["donate", "keep"],
|
| 113 |
+
game_type="matrix",
|
| 114 |
+
default_rounds=DEFAULT_NUM_ROUNDS,
|
| 115 |
+
payoff_fn=_matrix_payoff_fn(_DG),
|
| 116 |
+
),
|
| 117 |
+
"friend_or_foe": GameConfig(
|
| 118 |
+
name="Friend or Foe",
|
| 119 |
+
description=(
|
| 120 |
+
"A game show variant of the Prisoner's Dilemma. If both choose "
|
| 121 |
+
"friend, winnings are shared. If one steals (foe), they take all. "
|
| 122 |
+
"If both choose foe, neither gets anything. Unlike standard PD, "
|
| 123 |
+
"mutual defection yields zero, creating a weak equilibrium."
|
| 124 |
+
),
|
| 125 |
+
actions=["friend", "foe"],
|
| 126 |
+
game_type="matrix",
|
| 127 |
+
default_rounds=SINGLE_SHOT_ROUNDS,
|
| 128 |
+
payoff_fn=_matrix_payoff_fn(_FOF),
|
| 129 |
+
),
|
| 130 |
+
"peace_war": GameConfig(
|
| 131 |
+
name="Peace-War Game",
|
| 132 |
+
description=(
|
| 133 |
+
"An international relations framing of the Prisoner's Dilemma. "
|
| 134 |
+
"Players choose to arm or disarm. Mutual disarmament yields the "
|
| 135 |
+
"best joint outcome but unilateral arming dominates. Models "
|
| 136 |
+
"the security dilemma and arms race escalation dynamics."
|
| 137 |
+
),
|
| 138 |
+
actions=["disarm", "arm"],
|
| 139 |
+
game_type="matrix",
|
| 140 |
+
default_rounds=DEFAULT_NUM_ROUNDS,
|
| 141 |
+
payoff_fn=_matrix_payoff_fn(_PW),
|
| 142 |
+
),
|
| 143 |
+
}
|
| 144 |
+
|
| 145 |
+
GAMES.update(PD_VARIANT_GAMES)
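The Donation Game description above claims that keeping is the dominant strategy while mutual donation is Pareto superior. The sketch below checks both claims numerically. It is a minimal illustration, not the module's code: `BENEFIT`, `COST`, `donation_payoff`, and `keep_dominates` are hypothetical names, and the values 3 and 1 are assumed stand-ins for `DONATION_BENEFIT` and `DONATION_COST`, whose real values are not shown in this diff.

```python
# Hypothetical stand-ins for DONATION_BENEFIT / DONATION_COST (assumed, with benefit > cost).
BENEFIT = 3.0
COST = 1.0


def donation_payoff(pa: str, oa: str) -> tuple[float, float]:
    """Each player receives BENEFIT if the other donates and pays COST if they donate."""
    p = (BENEFIT if oa == "donate" else 0.0) - (COST if pa == "donate" else 0.0)
    o = (BENEFIT if pa == "donate" else 0.0) - (COST if oa == "donate" else 0.0)
    return (p, o)


def keep_dominates() -> bool:
    """True if 'keep' pays strictly more than 'donate' against every opponent action."""
    return all(
        donation_payoff("keep", oa)[0] > donation_payoff("donate", oa)[0]
        for oa in ("donate", "keep")
    )


if __name__ == "__main__":
    print(keep_dominates())
    print(donation_payoff("donate", "donate"), donation_payoff("keep", "keep"))
```

With these assumed values, keeping dominates whenever the cost is positive, and mutual donation beats mutual keeping whenever the benefit exceeds the cost.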
|
common/games_coop/stochastic.py
ADDED
|
@@ -0,0 +1,128 @@
|
| 1 |
+
"""Stochastic and evolutionary game variants for KantBench."""
|
| 2 |
+
from __future__ import annotations
|
| 3 |
+
|
| 4 |
+
from common.games import GAMES, GameConfig, _matrix_payoff_fn
|
| 5 |
+
from constant_definitions.game_constants import DEFAULT_NUM_ROUNDS, SINGLE_SHOT_ROUNDS
|
| 6 |
+
from constant_definitions.batch4.stochastic_constants import (
|
| 7 |
+
SPD_CC, SPD_CD, SPD_DC, SPD_DD,
|
| 8 |
+
RD_PAYOFF_DOMINANT, RD_RISK_DOMINANT, RD_MISCOORDINATION,
|
| 9 |
+
TPG_ENDOWMENT, TPG_THRESHOLD, TPG_SUCCESS_BONUS,
|
| 10 |
+
EPD_COOP_COOP, EPD_COOP_DEFECT, EPD_DEFECT_COOP, EPD_DEFECT_DEFECT,
|
| 11 |
+
EPD_TFT_DEFECT, EPD_DEFECT_TFT,
|
| 12 |
+
)
|
| 13 |
+
|
| 14 |
+
_ONE = int(bool(True))
|
| 15 |
+
|
| 16 |
+
|
| 17 |
+
# -- Stochastic PD (expected payoffs under action noise) --
|
| 18 |
+
_SPD: dict[tuple[str, str], tuple[float, float]] = {
|
| 19 |
+
("cooperate", "cooperate"): (float(SPD_CC), float(SPD_CC)),
|
| 20 |
+
("cooperate", "defect"): (float(SPD_CD), float(SPD_DC)),
|
| 21 |
+
("defect", "cooperate"): (float(SPD_DC), float(SPD_CD)),
|
| 22 |
+
("defect", "defect"): (float(SPD_DD), float(SPD_DD)),
|
| 23 |
+
}
|
| 24 |
+
|
| 25 |
+
|
| 26 |
+
# -- Risk Dominance (payoff-dominant vs risk-dominant equilibria) --
|
| 27 |
+
_RD: dict[tuple[str, str], tuple[float, float]] = {
|
| 28 |
+
("risky", "risky"): (float(RD_PAYOFF_DOMINANT), float(RD_PAYOFF_DOMINANT)),
|
| 29 |
+
("risky", "safe"): (float(RD_MISCOORDINATION), float(RD_MISCOORDINATION)),
|
| 30 |
+
("safe", "risky"): (float(RD_MISCOORDINATION), float(RD_MISCOORDINATION)),
|
| 31 |
+
("safe", "safe"): (float(RD_RISK_DOMINANT), float(RD_RISK_DOMINANT)),
|
| 32 |
+
}
|
| 33 |
+
|
| 34 |
+
|
| 35 |
+
# -- Threshold Public Goods (step-function provision) --
|
| 36 |
+
_TPG_ENDOW_F = float(TPG_ENDOWMENT)
|
| 37 |
+
_TPG_THRESH = TPG_THRESHOLD
|
| 38 |
+
_TPG_BONUS = float(TPG_SUCCESS_BONUS)
|
| 39 |
+
|
| 40 |
+
|
| 41 |
+
def _tpg_payoff(pa: str, oa: str) -> tuple[float, float]:
|
| 42 |
+
p_c = int(pa.rsplit("_", _ONE)[_ONE])
|
| 43 |
+
o_c = int(oa.rsplit("_", _ONE)[_ONE])
|
| 44 |
+
total = p_c + o_c
|
| 45 |
+
if total >= _TPG_THRESH:
|
| 46 |
+
p_pay = _TPG_ENDOW_F - float(p_c) + _TPG_BONUS
|
| 47 |
+
o_pay = _TPG_ENDOW_F - float(o_c) + _TPG_BONUS
|
| 48 |
+
else:
|
| 49 |
+
p_pay = _TPG_ENDOW_F - float(p_c)
|
| 50 |
+
o_pay = _TPG_ENDOW_F - float(o_c)
|
| 51 |
+
return (p_pay, o_pay)
|
| 52 |
+
|
| 53 |
+
|
| 54 |
+
_TPG_ACTS = [f"contribute_{i}" for i in range(TPG_ENDOWMENT + _ONE)]
|
| 55 |
+
|
| 56 |
+
|
| 57 |
+
# -- Evolutionary PD (always_coop / always_defect / tit_for_tat) --
|
| 58 |
+
_EPD: dict[tuple[str, str], tuple[float, float]] = {
|
| 59 |
+
("always_coop", "always_coop"): (float(EPD_COOP_COOP), float(EPD_COOP_COOP)),
|
| 60 |
+
("always_coop", "always_defect"): (float(EPD_COOP_DEFECT), float(EPD_DEFECT_COOP)),
|
| 61 |
+
("always_coop", "tit_for_tat"): (float(EPD_COOP_COOP), float(EPD_COOP_COOP)),
|
| 62 |
+
("always_defect", "always_coop"): (float(EPD_DEFECT_COOP), float(EPD_COOP_DEFECT)),
|
| 63 |
+
("always_defect", "always_defect"): (float(EPD_DEFECT_DEFECT), float(EPD_DEFECT_DEFECT)),
|
| 64 |
+
("always_defect", "tit_for_tat"): (float(EPD_DEFECT_TFT), float(EPD_TFT_DEFECT)),
|
| 65 |
+
("tit_for_tat", "always_coop"): (float(EPD_COOP_COOP), float(EPD_COOP_COOP)),
|
| 66 |
+
("tit_for_tat", "always_defect"): (float(EPD_TFT_DEFECT), float(EPD_DEFECT_TFT)),
|
| 67 |
+
("tit_for_tat", "tit_for_tat"): (float(EPD_COOP_COOP), float(EPD_COOP_COOP)),
|
| 68 |
+
}
|
| 69 |
+
|
| 70 |
+
|
| 71 |
+
# -- Register --
|
| 72 |
+
STOCHASTIC_GAMES: dict[str, GameConfig] = {
|
| 73 |
+
"stochastic_pd": GameConfig(
|
| 74 |
+
name="Stochastic Prisoner's Dilemma",
|
| 75 |
+
description=(
|
| 76 |
+
"A Prisoner's Dilemma variant where action execution is noisy. "
|
| 77 |
+
"With some probability each player's intended action is flipped. "
|
| 78 |
+
"Expected payoffs differ from the standard PD, reflecting the "
|
| 79 |
+
"tremble probabilities. Tests robustness of strategies to noise."
|
| 80 |
+
),
|
| 81 |
+
actions=["cooperate", "defect"],
|
| 82 |
+
game_type="matrix",
|
| 83 |
+
default_rounds=DEFAULT_NUM_ROUNDS,
|
| 84 |
+
payoff_fn=_matrix_payoff_fn(_SPD),
|
| 85 |
+
),
|
| 86 |
+
"risk_dominance": GameConfig(
|
| 87 |
+
name="Risk Dominance Game",
|
| 88 |
+
description=(
|
| 89 |
+
"A coordination game with two pure Nash equilibria: one "
|
| 90 |
+
"payoff-dominant (risky-risky yields higher mutual payoff) and "
|
| 91 |
+
"one risk-dominant (safe-safe is more robust to uncertainty). "
|
| 92 |
+
"Tests whether agents optimize for payoff or safety under "
|
| 93 |
+
"strategic uncertainty about the opponent's behavior."
|
| 94 |
+
),
|
| 95 |
+
actions=["risky", "safe"],
|
| 96 |
+
game_type="matrix",
|
| 97 |
+
default_rounds=DEFAULT_NUM_ROUNDS,
|
| 98 |
+
payoff_fn=_matrix_payoff_fn(_RD),
|
| 99 |
+
),
|
| 100 |
+
"threshold_public_goods": GameConfig(
|
| 101 |
+
name="Threshold Public Goods Game",
|
| 102 |
+
description=(
|
| 103 |
+
"A public goods game with a provision threshold. Each player "
|
| 104 |
+
"contributes from an endowment. If total contributions meet the "
|
| 105 |
+
"threshold, a bonus is provided to all. Otherwise contributions "
|
| 106 |
+
"are spent without the bonus. Tests coordination on provision."
|
| 107 |
+
),
|
| 108 |
+
actions=_TPG_ACTS,
|
| 109 |
+
game_type="threshold_public_goods",
|
| 110 |
+
default_rounds=SINGLE_SHOT_ROUNDS,
|
| 111 |
+
payoff_fn=_tpg_payoff,
|
| 112 |
+
),
|
| 113 |
+
"evolutionary_pd": GameConfig(
|
| 114 |
+
name="Evolutionary Prisoner's Dilemma",
|
| 115 |
+
description=(
|
| 116 |
+
"A multi-strategy Prisoner's Dilemma representing long-run "
|
| 117 |
+
"evolutionary dynamics. Players choose among always cooperate, "
|
| 118 |
+
"always defect, and tit-for-tat. Payoffs represent expected "
|
| 119 |
+
"long-run fitness across many interactions between strategies."
|
| 120 |
+
),
|
| 121 |
+
actions=["always_coop", "always_defect", "tit_for_tat"],
|
| 122 |
+
game_type="matrix",
|
| 123 |
+
default_rounds=DEFAULT_NUM_ROUNDS,
|
| 124 |
+
payoff_fn=_matrix_payoff_fn(_EPD),
|
| 125 |
+
),
|
| 126 |
+
}
|
| 127 |
+
|
| 128 |
+
GAMES.update(STOCHASTIC_GAMES)
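The step-function provision rule in the Threshold Public Goods payoff can be exercised with concrete numbers. The sketch below mirrors the logic of `_tpg_payoff` under assumed constants: `ENDOWMENT`, `THRESHOLD`, and `BONUS` are hypothetical values chosen for illustration, since the real `TPG_*` constants are defined elsewhere and not shown in this diff.

```python
# Hypothetical values standing in for TPG_ENDOWMENT, TPG_THRESHOLD, TPG_SUCCESS_BONUS.
ENDOWMENT = 10
THRESHOLD = 12
BONUS = 8.0


def tpg_payoff(pa: str, oa: str) -> tuple[float, float]:
    """Parse 'contribute_N' actions and apply the bonus only if the threshold is met."""
    p_c = int(pa.rsplit("_", 1)[1])
    o_c = int(oa.rsplit("_", 1)[1])
    bonus = BONUS if p_c + o_c >= THRESHOLD else 0.0
    # Contributions are spent either way; the bonus is all-or-nothing.
    return (ENDOWMENT - p_c + bonus, ENDOWMENT - o_c + bonus)


if __name__ == "__main__":
    print(tpg_payoff("contribute_6", "contribute_6"))  # threshold met
    print(tpg_payoff("contribute_5", "contribute_6"))  # one unit short
```

Under these assumed numbers, falling one unit short of the threshold leaves both players worse off than contributing nothing, which is the coordination risk the game is meant to probe.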
|
common/games_ext/__pycache__/auction.cpython-311.pyc
ADDED
|
Binary file (4.79 kB). View file
|
|
|
common/games_ext/__pycache__/generated.cpython-311.pyc
ADDED
|
Binary file (7.07 kB). View file
|
|
|
common/games_ext/__pycache__/matrix_games.cpython-311.pyc
ADDED
|
Binary file (6.27 kB). View file
|
|
|
common/games_ext/__pycache__/nplayer.cpython-311.pyc
ADDED
|
Binary file (5.54 kB). View file
|
|
|
common/games_ext/__pycache__/sequential.cpython-311.pyc
ADDED
|
Binary file (6.22 kB). View file
|
|
|
common/games_ext/auction.py
ADDED
|
@@ -0,0 +1,138 @@
|
| 1 |
+
"""Auction mechanism games for KantBench."""
|
| 2 |
+
from __future__ import annotations
|
| 3 |
+
|
| 4 |
+
from common.games import GAMES, GameConfig
|
| 5 |
+
from constant_definitions.game_constants import SINGLE_SHOT_ROUNDS
|
| 6 |
+
from constant_definitions.auction_nplayer_constants import (
|
| 7 |
+
AUCTION_ITEM_VALUE, AUCTION_MAX_BID, AUCTION_BID_INCREMENT,
|
| 8 |
+
)
|
| 9 |
+
|
| 10 |
+
_ONE = int(bool(True))
|
| 11 |
+
_ZERO = int()
|
| 12 |
+
_ZERO_F = float()
|
| 13 |
+
|
| 14 |
+
|
| 15 |
+
def _parse_bid(action: str) -> int:
|
| 16 |
+
"""Extract bid amount from action string like 'bid_5'."""
|
| 17 |
+
return int(action.rsplit("_", _ONE)[_ONE])
|
| 18 |
+
|
| 19 |
+
|
| 20 |
+
# -- First-Price Sealed Bid Auction --
|
| 21 |
+
|
| 22 |
+
def _first_price_payoff(
|
| 23 |
+
player_action: str, opponent_action: str,
|
| 24 |
+
) -> tuple[float, float]:
|
| 25 |
+
"""Highest bidder wins and pays their own bid."""
|
| 26 |
+
p_bid = _parse_bid(player_action)
|
| 27 |
+
o_bid = _parse_bid(opponent_action)
|
| 28 |
+
|
| 29 |
+
if p_bid > o_bid:
|
| 30 |
+
p_pay = float(AUCTION_ITEM_VALUE - p_bid)
|
| 31 |
+
o_pay = _ZERO_F
|
| 32 |
+
elif o_bid > p_bid:
|
| 33 |
+
p_pay = _ZERO_F
|
| 34 |
+
o_pay = float(AUCTION_ITEM_VALUE - o_bid)
|
| 35 |
+
else:
|
| 36 |
+
half_surplus = float(AUCTION_ITEM_VALUE - p_bid) / (_ONE + _ONE)
|
| 37 |
+
p_pay = half_surplus
|
| 38 |
+
o_pay = half_surplus
|
| 39 |
+
return (p_pay, o_pay)
|
| 40 |
+
|
| 41 |
+
|
| 42 |
+
# -- Second-Price (Vickrey) Auction --
|
| 43 |
+
|
| 44 |
+
def _vickrey_payoff(
|
| 45 |
+
player_action: str, opponent_action: str,
|
| 46 |
+
) -> tuple[float, float]:
|
| 47 |
+
"""Highest bidder wins but pays the second-highest bid."""
|
| 48 |
+
p_bid = _parse_bid(player_action)
|
| 49 |
+
o_bid = _parse_bid(opponent_action)
|
| 50 |
+
|
| 51 |
+
if p_bid > o_bid:
|
| 52 |
+
p_pay = float(AUCTION_ITEM_VALUE - o_bid)
|
| 53 |
+
o_pay = _ZERO_F
|
| 54 |
+
elif o_bid > p_bid:
|
| 55 |
+
p_pay = _ZERO_F
|
| 56 |
+
o_pay = float(AUCTION_ITEM_VALUE - p_bid)
|
| 57 |
+
else:
|
| 58 |
+
half_surplus = float(AUCTION_ITEM_VALUE - p_bid) / (_ONE + _ONE)
|
| 59 |
+
p_pay = half_surplus
|
| 60 |
+
o_pay = half_surplus
|
| 61 |
+
return (p_pay, o_pay)
|
| 62 |
+
|
| 63 |
+
|
| 64 |
+
# -- All-Pay Auction --
|
| 65 |
+
|
| 66 |
+
def _allpay_payoff(
|
| 67 |
+
player_action: str, opponent_action: str,
|
| 68 |
+
) -> tuple[float, float]:
|
| 69 |
+
"""Both bidders pay their bids; only the winner gets the item."""
|
| 70 |
+
p_bid = _parse_bid(player_action)
|
| 71 |
+
o_bid = _parse_bid(opponent_action)
|
| 72 |
+
|
| 73 |
+
if p_bid > o_bid:
|
| 74 |
+
p_pay = float(AUCTION_ITEM_VALUE - p_bid)
|
| 75 |
+
o_pay = float(-o_bid)
|
| 76 |
+
elif o_bid > p_bid:
|
| 77 |
+
p_pay = float(-p_bid)
|
| 78 |
+
o_pay = float(AUCTION_ITEM_VALUE - o_bid)
|
| 79 |
+
else:
|
| 80 |
+
half_value = float(AUCTION_ITEM_VALUE) / (_ONE + _ONE)
|
| 81 |
+
p_pay = half_value - float(p_bid)
|
| 82 |
+
o_pay = half_value - float(o_bid)
|
| 83 |
+
return (p_pay, o_pay)
|
| 84 |
+
|
| 85 |
+
|
| 86 |
+
# -- Action lists --
|
| 87 |
+
|
| 88 |
+
_BID_ACTIONS = [
|
| 89 |
+
f"bid_{i}" for i in range(
|
| 90 |
+
_ZERO, AUCTION_MAX_BID + AUCTION_BID_INCREMENT, AUCTION_BID_INCREMENT,
|
| 91 |
+
)
|
| 92 |
+
]
|
| 93 |
+
|
| 94 |
+
|
| 95 |
+
# -- Register --
|
| 96 |
+
|
| 97 |
+
AUCTION_GAMES: dict[str, GameConfig] = {
|
| 98 |
+
"first_price_auction": GameConfig(
|
| 99 |
+
name="First-Price Sealed-Bid Auction",
|
| 100 |
+
description=(
|
| 101 |
+
"Two bidders simultaneously submit sealed bids for an item. "
|
| 102 |
+
"The highest bidder wins and pays their own bid. Strategic "
|
| 103 |
+
"bidding requires shading below true value to maximize surplus "
|
| 104 |
+
"while still winning."
|
| 105 |
+
),
|
| 106 |
+
actions=_BID_ACTIONS,
|
| 107 |
+
game_type="auction",
|
| 108 |
+
default_rounds=SINGLE_SHOT_ROUNDS,
|
| 109 |
+
payoff_fn=_first_price_payoff,
|
| 110 |
+
),
|
| 111 |
+
"vickrey_auction": GameConfig(
|
| 112 |
+
name="Second-Price (Vickrey) Auction",
|
| 113 |
+
description=(
|
| 114 |
+
"Two bidders submit sealed bids. The highest bidder wins but "
|
| 115 |
+
"pays the second-highest bid. The dominant strategy is to bid "
|
| 116 |
+
"one's true valuation, making this a strategy-proof mechanism."
|
| 117 |
+
),
|
| 118 |
+
actions=_BID_ACTIONS,
|
| 119 |
+
game_type="auction",
|
| 120 |
+
default_rounds=SINGLE_SHOT_ROUNDS,
|
| 121 |
+
payoff_fn=_vickrey_payoff,
|
| 122 |
+
),
|
| 123 |
+
"allpay_auction": GameConfig(
|
| 124 |
+
name="All-Pay Auction",
|
| 125 |
+
description=(
|
| 126 |
+
"Two bidders submit sealed bids. Both pay their bids regardless "
|
| 127 |
+
"of outcome, but only the highest bidder receives the item. "
|
| 128 |
+
"Models contests, lobbying, and rent-seeking where effort is "
|
| 129 |
+
"spent whether or not you win."
|
| 130 |
+
),
|
| 131 |
+
actions=_BID_ACTIONS,
|
| 132 |
+
game_type="auction",
|
| 133 |
+
default_rounds=SINGLE_SHOT_ROUNDS,
|
| 134 |
+
payoff_fn=_allpay_payoff,
|
| 135 |
+
),
|
| 136 |
+
}
|
| 137 |
+
|
| 138 |
+
GAMES.update(AUCTION_GAMES)
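The Vickrey description states that bidding one's true valuation is the dominant strategy. A brute-force check over a small bid grid illustrates this for the tie rule used in `_vickrey_payoff` (ties split the surplus). This is a sketch under assumptions: `ITEM_VALUE` and `BIDS` are hypothetical stand-ins for the `AUCTION_*` constants, and only the first player's payoff is computed.

```python
# Hypothetical stand-ins: item worth 10 to each bidder, integer bids from 0 to 10.
ITEM_VALUE = 10
BIDS = range(0, 11)


def vickrey_payoff(p_bid: int, o_bid: int) -> float:
    """First player's payoff: the winner pays the losing bid; ties split the surplus."""
    if p_bid > o_bid:
        return float(ITEM_VALUE - o_bid)
    if p_bid < o_bid:
        return 0.0
    return float(ITEM_VALUE - p_bid) / 2


def truthful_is_weakly_dominant() -> bool:
    """True if bidding ITEM_VALUE is never worse than any other bid, for every opponent bid."""
    return all(
        vickrey_payoff(ITEM_VALUE, o) >= vickrey_payoff(b, o)
        for b in BIDS
        for o in BIDS
    )


if __name__ == "__main__":
    print(truthful_is_weakly_dominant())
```

The check establishes weak dominance only: against some opponent bids, several bids tie with the truthful one.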
|
common/games_ext/generated.py
ADDED
|
@@ -0,0 +1,144 @@
|
| 1 |
+
"""Procedurally generated games for KantBench."""
|
| 2 |
+
from __future__ import annotations
|
| 3 |
+
|
| 4 |
+
import random as _rand
|
| 5 |
+
from common.games import GAMES, GameConfig
|
| 6 |
+
from constant_definitions.game_constants import DEFAULT_NUM_ROUNDS
|
| 7 |
+
from constant_definitions.auction_nplayer_constants import (
|
| 8 |
+
GENERATED_DEFAULT_ACTIONS, GENERATED_PAYOFF_MIN, GENERATED_PAYOFF_MAX,
|
| 9 |
+
GENERATED_SEED_DEFAULT,
|
| 10 |
+
)
|
| 11 |
+
|
| 12 |
+
_ONE = int(bool(True))
|
| 13 |
+
|
| 14 |
+
|
| 15 |
+
def _action_label(index: int) -> str:
|
| 16 |
+
"""Generate action label: a, b, c, ... z, aa, ab, ..."""
|
| 17 |
+
alphabet_size = ord("z") - ord("a") + _ONE
|
| 18 |
+
if index < alphabet_size:
|
| 19 |
+
return chr(ord("a") + index)
|
| 20 |
+
first = index // alphabet_size - _ONE
|
| 21 |
+
second = index % alphabet_size
|
| 22 |
+
return chr(ord("a") + first) + chr(ord("a") + second)
|
| 23 |
+
|
| 24 |
+
|
| 25 |
+
def generate_random_symmetric(
|
| 26 |
+
num_actions: int = GENERATED_DEFAULT_ACTIONS,
|
| 27 |
+
payoff_min: int = GENERATED_PAYOFF_MIN,
|
| 28 |
+
payoff_max: int = GENERATED_PAYOFF_MAX,
|
| 29 |
+
seed: int = GENERATED_SEED_DEFAULT,
|
| 30 |
+
) -> GameConfig:
|
| 31 |
+
"""Generate a random symmetric NxN matrix game.
|
| 32 |
+
|
| 33 |
+
In a symmetric game, the payoff for the first player choosing (a, b)
|
| 34 |
+
equals the payoff for the second player facing (b, a).
|
| 35 |
+
"""
|
| 36 |
+
rng = _rand.Random(seed)
|
| 37 |
+
actions = [_action_label(i) for i in range(num_actions)]
|
| 38 |
+
|
| 39 |
+
matrix: dict[tuple[str, str], tuple[float, float]] = {}
|
| 40 |
+
for i, a in enumerate(actions):
|
| 41 |
+
for j, b in enumerate(actions):
|
| 42 |
+
if (a, b) not in matrix:
|
| 43 |
+
p_first = float(rng.randint(payoff_min, payoff_max))
|
| 44 |
+
p_second = p_first if a == b else float(rng.randint(payoff_min, payoff_max))  # diagonal cells must pay both players equally in a symmetric game
|
| 45 |
+
matrix[(a, b)] = (p_first, p_second)
|
| 46 |
+
matrix[(b, a)] = (p_second, p_first)
|
| 47 |
+
|
| 48 |
+
def _payoff(pa: str, oa: str) -> tuple[float, float]:
|
| 49 |
+
return matrix[(pa, oa)]
|
| 50 |
+
|
| 51 |
+
return GameConfig(
|
| 52 |
+
name=f"Random Symmetric {num_actions}x{num_actions} (seed={seed})",
|
| 53 |
+
description=(
|
| 54 |
+
f"A randomly generated {num_actions}x{num_actions} symmetric "
|
| 55 |
+
f"matrix game with payoffs in [{payoff_min}, {payoff_max}]. "
|
| 56 |
+
f"Tests generalization to novel strategic structures."
|
| 57 |
+
),
|
| 58 |
+
actions=actions,
|
| 59 |
+
game_type="matrix",
|
| 60 |
+
default_rounds=DEFAULT_NUM_ROUNDS,
|
| 61 |
+
payoff_fn=_payoff,
|
| 62 |
+
)
|
| 63 |
+
|
| 64 |
+
|
| 65 |
+
def generate_random_asymmetric(
|
| 66 |
+
num_actions: int = GENERATED_DEFAULT_ACTIONS,
|
| 67 |
+
payoff_min: int = GENERATED_PAYOFF_MIN,
|
| 68 |
+
payoff_max: int = GENERATED_PAYOFF_MAX,
|
| 69 |
+
seed: int = GENERATED_SEED_DEFAULT,
|
| 70 |
+
) -> GameConfig:
|
| 71 |
+
"""Generate a random asymmetric NxN matrix game.
|
| 72 |
+
|
| 73 |
+
Each cell has independently drawn payoffs for both players.
|
| 74 |
+
"""
|
| 75 |
+
rng = _rand.Random(seed)
|
| 76 |
+
actions = [_action_label(i) for i in range(num_actions)]
|
| 77 |
+
|
| 78 |
+
matrix: dict[tuple[str, str], tuple[float, float]] = {}
|
| 79 |
+
for a in actions:
|
| 80 |
+
for b in actions:
|
| 81 |
+
p_first = float(rng.randint(payoff_min, payoff_max))
|
| 82 |
+
p_second = float(rng.randint(payoff_min, payoff_max))
|
| 83 |
+
matrix[(a, b)] = (p_first, p_second)
|
| 84 |
+
|
| 85 |
+
def _payoff(pa: str, oa: str) -> tuple[float, float]:
|
| 86 |
+
return matrix[(pa, oa)]
|
| 87 |
+
|
| 88 |
+
return GameConfig(
|
| 89 |
+
name=f"Random Asymmetric {num_actions}x{num_actions} (seed={seed})",
|
| 90 |
+
description=(
|
| 91 |
+
f"A randomly generated {num_actions}x{num_actions} asymmetric "
|
| 92 |
+
f"matrix game with independent payoffs in [{payoff_min}, {payoff_max}]. "
|
| 93 |
+
f"Tests reasoning in novel non-symmetric strategic settings."
|
| 94 |
+
),
|
| 95 |
+
actions=actions,
|
| 96 |
+
game_type="matrix",
|
| 97 |
+
default_rounds=DEFAULT_NUM_ROUNDS,
|
| 98 |
+
payoff_fn=_payoff,
|
| 99 |
+
)
|
| 100 |
+
|
| 101 |
+
|
| 102 |
+
def generate_parameterized_pd(
|
| 103 |
+
temptation: int,
|
| 104 |
+
reward: int,
|
| 105 |
+
punishment: int,
|
| 106 |
+
sucker: int,
|
| 107 |
+
seed: int = GENERATED_SEED_DEFAULT,
|
| 108 |
+
) -> GameConfig:
|
| 109 |
+
"""Create a Prisoner's Dilemma with custom T > R > P > S payoffs."""
|
| 110 |
+
matrix: dict[tuple[str, str], tuple[float, float]] = {
|
| 111 |
+
("cooperate", "cooperate"): (float(reward), float(reward)),
|
| 112 |
+
("cooperate", "defect"): (float(sucker), float(temptation)),
|
| 113 |
+
("defect", "cooperate"): (float(temptation), float(sucker)),
|
| 114 |
+
("defect", "defect"): (float(punishment), float(punishment)),
|
| 115 |
+
}
|
| 116 |
+
|
| 117 |
+
def _payoff(pa: str, oa: str) -> tuple[float, float]:
|
| 118 |
+
return matrix[(pa, oa)]
|
| 119 |
+
|
| 120 |
+
return GameConfig(
|
| 121 |
+
name=f"PD(T={temptation},R={reward},P={punishment},S={sucker})",
|
| 122 |
+
description=(
|
| 123 |
+
f"A parameterized Prisoner's Dilemma with T={temptation}, "
|
| 124 |
+
f"R={reward}, P={punishment}, S={sucker}. Tests sensitivity "
|
| 125 |
+
f"to varying incentive structures."
|
| 126 |
+
),
|
| 127 |
+
actions=["cooperate", "defect"],
|
| 128 |
+
game_type="matrix",
|
| 129 |
+
default_rounds=DEFAULT_NUM_ROUNDS,
|
| 130 |
+
payoff_fn=_payoff,
|
| 131 |
+
)
|
| 132 |
+
|
| 133 |
+
|
| 134 |
+
# -- Register default generated instances --
|
| 135 |
+
|
| 136 |
+
_DEFAULT_SYMMETRIC = generate_random_symmetric()
|
| 137 |
+
_DEFAULT_ASYMMETRIC = generate_random_asymmetric(seed=GENERATED_SEED_DEFAULT + _ONE)
|
| 138 |
+
|
| 139 |
+
GENERATED_GAMES: dict[str, GameConfig] = {
|
| 140 |
+
"random_symmetric_3x3": _DEFAULT_SYMMETRIC,
|
| 141 |
+
"random_asymmetric_3x3": _DEFAULT_ASYMMETRIC,
|
| 142 |
+
}
|
| 143 |
+
|
| 144 |
+
GAMES.update(GENERATED_GAMES)
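The symmetry property stated in the `generate_random_symmetric` docstring (the first player's payoff at (a, b) equals the second player's payoff at (b, a)) can be verified mechanically. The sketch below is an assumed, standalone illustration: `build_symmetric` and `is_symmetric` are hypothetical helpers, not functions from this module, and the payoff range and seed are arbitrary.

```python
import random


def build_symmetric(
    actions: list[str], lo: int, hi: int, seed: int
) -> dict[tuple[str, str], tuple[float, float]]:
    """Fill the upper triangle with random payoffs and mirror it into the lower one."""
    rng = random.Random(seed)
    matrix: dict[tuple[str, str], tuple[float, float]] = {}
    for i, a in enumerate(actions):
        for b in actions[i:]:
            first = float(rng.randint(lo, hi))
            # On the diagonal both players face the same situation, so they share one payoff.
            second = first if a == b else float(rng.randint(lo, hi))
            matrix[(a, b)] = (first, second)
            matrix[(b, a)] = (second, first)
    return matrix


def is_symmetric(matrix: dict[tuple[str, str], tuple[float, float]]) -> bool:
    """Check u1(a, b) == u2(b, a) and u2(a, b) == u1(b, a) for every cell."""
    return all(matrix[(b, a)] == (v[1], v[0]) for (a, b), v in matrix.items())


if __name__ == "__main__":
    game = build_symmetric(["a", "b", "c"], -5, 5, seed=42)
    print(is_symmetric(game))
```

A check like `is_symmetric` could serve as a unit test for any generated game that is labelled symmetric.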
|
common/games_ext/matrix_games.py
ADDED
|
@@ -0,0 +1,152 @@
|
| 1 |
+
"""Extended matrix (normal-form) games for KantBench."""
|
| 2 |
+
from __future__ import annotations
|
| 3 |
+
|
| 4 |
+
from common.games import GAMES, GameConfig, _matrix_payoff_fn
|
| 5 |
+
from constant_definitions.game_constants import DEFAULT_NUM_ROUNDS, SINGLE_SHOT_ROUNDS
|
| 6 |
+
from constant_definitions.zero_sum_constants import (
|
| 7 |
+
MP_MATCH_PAYOFF, MP_MISMATCH_PAYOFF,
|
| 8 |
+
RPS_WIN_PAYOFF, RPS_LOSE_PAYOFF, RPS_DRAW_PAYOFF,
|
| 9 |
+
)
|
| 10 |
+
from constant_definitions.coordination_constants import (
|
| 11 |
+
BOS_PREFERRED_PAYOFF, BOS_COMPROMISE_PAYOFF, BOS_MISMATCH_PAYOFF,
|
| 12 |
+
PC_MATCH_PAYOFF, PC_MISMATCH_PAYOFF,
|
| 13 |
+
DL_DC_PAYOFF, DL_DD_PAYOFF, DL_CC_PAYOFF, DL_CD_PAYOFF,
|
| 14 |
+
HM_CC_PAYOFF, HM_DC_PAYOFF, HM_CD_PAYOFF, HM_DD_PAYOFF,
|
| 15 |
+
)
|
| 16 |
+
|
| 17 |
+
# -- Matching Pennies --
|
| 18 |
+
_MP_MATRIX: dict[tuple[str, str], tuple[float, float]] = {
|
| 19 |
+
("heads", "heads"): (float(MP_MATCH_PAYOFF), float(MP_MISMATCH_PAYOFF)),
|
| 20 |
+
("heads", "tails"): (float(MP_MISMATCH_PAYOFF), float(MP_MATCH_PAYOFF)),
|
| 21 |
+
("tails", "heads"): (float(MP_MISMATCH_PAYOFF), float(MP_MATCH_PAYOFF)),
|
| 22 |
+
("tails", "tails"): (float(MP_MATCH_PAYOFF), float(MP_MISMATCH_PAYOFF)),
|
| 23 |
+
}
|
| 24 |
+
|
| 25 |
+
# -- Rock-Paper-Scissors --
|
| 26 |
+
_W, _L, _D = float(RPS_WIN_PAYOFF), float(RPS_LOSE_PAYOFF), float(RPS_DRAW_PAYOFF)
|
| 27 |
+
_RPS_MATRIX: dict[tuple[str, str], tuple[float, float]] = {
|
| 28 |
+
("rock", "rock"): (_D, _D),
|
| 29 |
+
("rock", "scissors"): (_W, _L),
|
| 30 |
+
("rock", "paper"): (_L, _W),
|
| 31 |
+
("scissors", "rock"): (_L, _W),
|
| 32 |
+
("scissors", "scissors"): (_D, _D),
|
| 33 |
+
("scissors", "paper"): (_W, _L),
|
| 34 |
+
("paper", "rock"): (_W, _L),
|
| 35 |
+
("paper", "scissors"): (_L, _W),
|
| 36 |
+
("paper", "paper"): (_D, _D),
|
| 37 |
+
}
|
| 38 |
+
|
| 39 |
+
# -- Battle of the Sexes --
|
| 40 |
+
_BOS_MATRIX: dict[tuple[str, str], tuple[float, float]] = {
|
| 41 |
+
("opera", "opera"): (float(BOS_PREFERRED_PAYOFF), float(BOS_COMPROMISE_PAYOFF)),
|
| 42 |
+
("opera", "football"): (float(BOS_MISMATCH_PAYOFF), float(BOS_MISMATCH_PAYOFF)),
|
| 43 |
+
("football", "opera"): (float(BOS_MISMATCH_PAYOFF), float(BOS_MISMATCH_PAYOFF)),
|
| 44 |
+
("football", "football"): (float(BOS_COMPROMISE_PAYOFF), float(BOS_PREFERRED_PAYOFF)),
|
| 45 |
+
}
|
| 46 |
+
|
| 47 |
+
# -- Pure Coordination --
|
| 48 |
+
_PC_MATRIX: dict[tuple[str, str], tuple[float, float]] = {
|
| 49 |
+
("left", "left"): (float(PC_MATCH_PAYOFF), float(PC_MATCH_PAYOFF)),
|
| 50 |
+
("left", "right"): (float(PC_MISMATCH_PAYOFF), float(PC_MISMATCH_PAYOFF)),
|
| 51 |
+
("right", "left"): (float(PC_MISMATCH_PAYOFF), float(PC_MISMATCH_PAYOFF)),
|
| 52 |
+
("right", "right"): (float(PC_MATCH_PAYOFF), float(PC_MATCH_PAYOFF)),
|
| 53 |
+
}
|
| 54 |
+
|
| 55 |
+
# -- Deadlock --
|
| 56 |
+
_DL_MATRIX: dict[tuple[str, str], tuple[float, float]] = {
|
| 57 |
+
("cooperate", "cooperate"): (float(DL_CC_PAYOFF), float(DL_CC_PAYOFF)),
|
| 58 |
+
("cooperate", "defect"): (float(DL_CD_PAYOFF), float(DL_DC_PAYOFF)),
|
| 59 |
+
("defect", "cooperate"): (float(DL_DC_PAYOFF), float(DL_CD_PAYOFF)),
|
| 60 |
+
("defect", "defect"): (float(DL_DD_PAYOFF), float(DL_DD_PAYOFF)),
|
| 61 |
+
}
|
| 62 |
+
|
| 63 |
+
# -- Harmony --
|
| 64 |
+
_HM_MATRIX: dict[tuple[str, str], tuple[float, float]] = {
|
| 65 |
+
("cooperate", "cooperate"): (float(HM_CC_PAYOFF), float(HM_CC_PAYOFF)),
|
| 66 |
+
("cooperate", "defect"): (float(HM_CD_PAYOFF), float(HM_DC_PAYOFF)),
|
| 67 |
+
("defect", "cooperate"): (float(HM_DC_PAYOFF), float(HM_CD_PAYOFF)),
|
| 68 |
+
("defect", "defect"): (float(HM_DD_PAYOFF), float(HM_DD_PAYOFF)),
|
| 69 |
+
}
|
| 70 |
+
|
| 71 |
+
# -- Register all games --
|
| 72 |
+
|
| 73 |
+
EXTENDED_MATRIX_GAMES: dict[str, GameConfig] = {
|
| 74 |
+
"matching_pennies": GameConfig(
|
| 75 |
+
name="Matching Pennies",
|
| 76 |
+
description=(
|
| 77 |
+
"A pure zero-sum game. The matcher wins if both choose the same "
|
| 78 |
+
"side; the mismatcher wins if they differ. The only Nash "
|
| 79 |
+
"equilibrium is a mixed strategy of equal randomization."
|
| 80 |
+
),
|
| 81 |
+
actions=["heads", "tails"],
|
| 82 |
+
game_type="matrix",
|
| 83 |
+        default_rounds=DEFAULT_NUM_ROUNDS,
+        payoff_fn=_matrix_payoff_fn(_MP_MATRIX),
+    ),
+    "rock_paper_scissors": GameConfig(
+        name="Rock-Paper-Scissors",
+        description=(
+            "A three-action zero-sum game: rock beats scissors, scissors "
+            "beats paper, paper beats rock. The unique Nash equilibrium "
+            "is uniform randomization over all three actions."
+        ),
+        actions=["rock", "paper", "scissors"],
+        game_type="matrix",
+        default_rounds=DEFAULT_NUM_ROUNDS,
+        payoff_fn=_matrix_payoff_fn(_RPS_MATRIX),
+    ),
+    "battle_of_the_sexes": GameConfig(
+        name="Battle of the Sexes",
+        description=(
+            "Two players want to coordinate but have different preferences. "
+            "The first player prefers opera, the second prefers football. "
+            "Both prefer any coordination over miscoordination. Two pure "
+            "Nash equilibria exist at (opera, opera) and (football, football)."
+        ),
+        actions=["opera", "football"],
+        game_type="matrix",
+        default_rounds=DEFAULT_NUM_ROUNDS,
+        payoff_fn=_matrix_payoff_fn(_BOS_MATRIX),
+    ),
+    "pure_coordination": GameConfig(
+        name="Pure Coordination",
+        description=(
+            "Two players receive a positive payoff only when they choose "
+            "the same action. Both (left, left) and (right, right) are "
+            "Nash equilibria. Tests whether agents can converge on a focal "
+            "point without communication."
+        ),
+        actions=["left", "right"],
+        game_type="matrix",
+        default_rounds=DEFAULT_NUM_ROUNDS,
+        payoff_fn=_matrix_payoff_fn(_PC_MATRIX),
+    ),
+    "deadlock": GameConfig(
+        name="Deadlock",
+        description=(
+            "Similar to the Prisoner's Dilemma but with different payoff "
+            "ordering: DC > DD > CC > CD. Both players prefer mutual "
+            "defection over mutual cooperation. The unique Nash equilibrium "
+            "is (defect, defect) and it is also Pareto optimal."
+        ),
+        actions=["cooperate", "defect"],
+        game_type="matrix",
+        default_rounds=DEFAULT_NUM_ROUNDS,
+        payoff_fn=_matrix_payoff_fn(_DL_MATRIX),
+    ),
+    "harmony": GameConfig(
+        name="Harmony",
+        description=(
+            "The opposite of a social dilemma: cooperation is the dominant "
+            "strategy for both players. Payoff ordering CC > DC > CD > DD "
+            "means rational self-interest naturally leads to the socially "
+            "optimal outcome of mutual cooperation."
+        ),
+        actions=["cooperate", "defect"],
+        game_type="matrix",
+        default_rounds=DEFAULT_NUM_ROUNDS,
+        payoff_fn=_matrix_payoff_fn(_HM_MATRIX),
+    ),
+}
+
+GAMES.update(EXTENDED_MATRIX_GAMES)
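Review note: the registrations above rely on `_matrix_payoff_fn` and on tables such as `_RPS_MATRIX`, neither of which is visible in this hunk. The following is a minimal sketch of how such a dict-backed lookup could work for Rock-Paper-Scissors. The helper names `build_rps_matrix` and `matrix_payoff_fn` and the win/loss values of +1/-1 are assumptions for illustration, not the repository's actual definitions.

```python
from itertools import product

ACTIONS = ["rock", "paper", "scissors"]
# Each key beats its value.
BEATS = {"rock": "scissors", "scissors": "paper", "paper": "rock"}


def build_rps_matrix(win=1.0, loss=-1.0, draw=0.0):
    """Build a (player, opponent) -> (player_payoff, opponent_payoff) table.

    The payoff magnitudes are hypothetical placeholders.
    """
    matrix = {}
    for a, b in product(ACTIONS, repeat=2):
        if a == b:
            matrix[(a, b)] = (draw, draw)
        elif BEATS[a] == b:
            matrix[(a, b)] = (win, loss)
        else:
            matrix[(a, b)] = (loss, win)
    return matrix


def matrix_payoff_fn(matrix):
    """Wrap a payoff table in a two-argument payoff function (assumed shape)."""
    def payoff(player_action, opponent_action):
        return matrix[(player_action, opponent_action)]
    return payoff


rps_payoff = matrix_payoff_fn(build_rps_matrix())


def expected_vs_uniform(action):
    """Expected player payoff of `action` against a uniformly mixing opponent."""
    return sum(rps_payoff(action, o)[0] for o in ACTIONS) / len(ACTIONS)
```

Under these assumed values every action earns the same expected payoff against a uniform opponent, which is the indifference condition behind the uniform-mixing equilibrium mentioned in the description string.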
common/games_ext/nplayer.py
ADDED
@@ -0,0 +1,143 @@
+"""N-player social dilemma games for KantBench.
+
+Modeled as one agent vs one opponent (representing aggregate of others).
+"""
+from __future__ import annotations
+
+from common.games import GAMES, GameConfig
+from constant_definitions.game_constants import DEFAULT_NUM_ROUNDS, SINGLE_SHOT_ROUNDS
+from constant_definitions.auction_nplayer_constants import (
+    COMMONS_RESOURCE_CAPACITY, COMMONS_MAX_EXTRACTION,
+    COMMONS_DEPLETION_PENALTY,
+    VOLUNTEER_BENEFIT, VOLUNTEER_COST, VOLUNTEER_NO_VOL,
+    EL_FAROL_ATTEND_REWARD, EL_FAROL_CROWD_PENALTY, EL_FAROL_STAY_HOME,
+    EL_FAROL_CAPACITY,
+)
+
+_ONE = int(bool(True))
+_ZERO_F = float()
+
+
+# -- Tragedy of the Commons --
+
+def _commons_payoff(
+    player_action: str, opponent_action: str,
+) -> tuple[float, float]:
+    """Resource extraction game.
+
+    Each player extracts from a shared resource. If total extraction
+    exceeds capacity, both suffer a depletion penalty.
+    """
+    p_extract = int(player_action.rsplit("_", _ONE)[_ONE])
+    o_extract = int(opponent_action.rsplit("_", _ONE)[_ONE])
+    total = p_extract + o_extract
+
+    if total > COMMONS_RESOURCE_CAPACITY:
+        return (float(COMMONS_DEPLETION_PENALTY), float(COMMONS_DEPLETION_PENALTY))
+
+    return (float(p_extract), float(o_extract))
+
+
+_COMMONS_ACTIONS = [
+    f"extract_{i}" for i in range(COMMONS_MAX_EXTRACTION + _ONE)
+]
+
+
+# -- Volunteer's Dilemma --
+
+def _volunteer_payoff(
+    player_action: str, opponent_action: str,
+) -> tuple[float, float]:
+    """At least one must volunteer for everyone to benefit.
+
+    Volunteering costs the volunteer but benefits all.
+    If nobody volunteers, everyone gets nothing.
+    """
+    p_vol = player_action == "volunteer"
+    o_vol = opponent_action == "volunteer"
+
+    if not p_vol and not o_vol:
+        return (float(VOLUNTEER_NO_VOL), float(VOLUNTEER_NO_VOL))
+
+    p_pay = float(VOLUNTEER_BENEFIT - VOLUNTEER_COST) if p_vol else float(VOLUNTEER_BENEFIT)
+    o_pay = float(VOLUNTEER_BENEFIT - VOLUNTEER_COST) if o_vol else float(VOLUNTEER_BENEFIT)
+    return (p_pay, o_pay)
+
+
+# -- El Farol Bar Problem --
+
+def _el_farol_payoff(
+    player_action: str, opponent_action: str,
+) -> tuple[float, float]:
+    """Bar attendance decision game.
+
+    Going to the bar is fun if few attend (under capacity), but
+    unpleasant if crowded. Staying home gives a moderate fixed payoff.
+    """
+    p_goes = player_action == "attend"
+    o_goes = opponent_action == "attend"
+
+    attendees = int(p_goes) + int(o_goes)
+    crowded = attendees > _ONE
+
+    if not p_goes:
+        p_pay = float(EL_FAROL_STAY_HOME)
+    elif crowded:
+        p_pay = float(EL_FAROL_CROWD_PENALTY)
+    else:
+        p_pay = float(EL_FAROL_ATTEND_REWARD)
+
+    if not o_goes:
+        o_pay = float(EL_FAROL_STAY_HOME)
+    elif crowded:
+        o_pay = float(EL_FAROL_CROWD_PENALTY)
+    else:
+        o_pay = float(EL_FAROL_ATTEND_REWARD)
+
+    return (p_pay, o_pay)
+
+
+# -- Register --
+
+NPLAYER_GAMES: dict[str, GameConfig] = {
+    "tragedy_of_commons": GameConfig(
+        name="Tragedy of the Commons",
+        description=(
+            "Players extract resources from a shared pool. Individual "
+            "incentive is to extract more, but if total extraction exceeds "
+            "the sustainable capacity, the resource collapses and everyone "
+            "suffers. Models environmental and resource management dilemmas."
+        ),
+        actions=_COMMONS_ACTIONS,
+        game_type="commons",
+        default_rounds=DEFAULT_NUM_ROUNDS,
+        payoff_fn=_commons_payoff,
+    ),
+    "volunteer_dilemma": GameConfig(
+        name="Volunteer's Dilemma",
+        description=(
+            "At least one player must volunteer (at personal cost) for "
+            "everyone to receive a benefit. If nobody volunteers, all get "
+            "nothing. Models bystander effects and public good provision."
+        ),
+        actions=["volunteer", "abstain"],
+        game_type="matrix",
+        default_rounds=DEFAULT_NUM_ROUNDS,
+        payoff_fn=_volunteer_payoff,
+    ),
+    "el_farol": GameConfig(
+        name="El Farol Bar Problem",
+        description=(
+            "Each player decides whether to attend a bar. If attendance "
+            "is below capacity, going is better than staying home. If the "
+            "bar is crowded, staying home is better. Models minority games "
+            "and congestion dynamics."
+        ),
+        actions=["attend", "stay_home"],
+        game_type="matrix",
+        default_rounds=DEFAULT_NUM_ROUNDS,
+        payoff_fn=_el_farol_payoff,
+    ),
+}
+
+GAMES.update(NPLAYER_GAMES)
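Review note: the constants imported from `constant_definitions.auction_nplayer_constants` are not shown in this diff, so the trade-off in `_commons_payoff` is hard to see at a glance. The sketch below re-creates the same payoff rule with hypothetical values (`CAPACITY = 10`, `MAX_EXTRACTION = 8`, `DEPLETION_PENALTY = -2`) and adds an illustrative `best_response` helper that does not exist in the repository.

```python
# Hypothetical stand-ins for the COMMONS_* constants.
CAPACITY = 10
MAX_EXTRACTION = 8
DEPLETION_PENALTY = -2

ACTIONS = [f"extract_{i}" for i in range(MAX_EXTRACTION + 1)]


def commons_payoff(player_action, opponent_action):
    """Same rule as in the diff: each side keeps what it extracts unless
    the combined extraction exceeds capacity, in which case both get the
    depletion penalty."""
    p_extract = int(player_action.rsplit("_", 1)[1])
    o_extract = int(opponent_action.rsplit("_", 1)[1])
    if p_extract + o_extract > CAPACITY:
        return (float(DEPLETION_PENALTY), float(DEPLETION_PENALTY))
    return (float(p_extract), float(o_extract))


def best_response(opponent_action):
    """Player action that maximizes the player's own payoff (illustrative)."""
    return max(ACTIONS, key=lambda a: commons_payoff(a, opponent_action)[0])
```

With these assumed numbers the best response is always to extract exactly up to the remaining capacity, so a single extra unit from either side collapses the resource. That knife edge is what makes the game a dilemma when the opponent's action is uncertain.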
common/games_ext/sequential.py
ADDED
@@ -0,0 +1,140 @@
+"""Sequential (extensive-form) games for KantBench."""
+from __future__ import annotations
+
+from common.games import GAMES, GameConfig
+from constant_definitions.game_constants import SINGLE_SHOT_ROUNDS, DEFAULT_NUM_ROUNDS
+from constant_definitions.sequential_constants import (
+    DICTATOR_ENDOWMENT,
+    CENTIPEDE_INITIAL_POT, CENTIPEDE_GROWTH_MULTIPLIER, CENTIPEDE_MAX_STAGES,
+    CENTIPEDE_LARGE_SHARE_NUMERATOR, CENTIPEDE_LARGE_SHARE_DENOMINATOR,
+    CENTIPEDE_SMALL_SHARE_NUMERATOR, CENTIPEDE_SMALL_SHARE_DENOMINATOR,
+    STACKELBERG_DEMAND_INTERCEPT, STACKELBERG_DEMAND_SLOPE,
+    STACKELBERG_MARGINAL_COST, STACKELBERG_MAX_QUANTITY,
+)
+
+_ONE = int(bool(True))
+
+
+# -- Dictator Game --
+
+def _dictator_payoff(player_action: str, opponent_action: str) -> tuple[float, float]:
+    """Dictator allocates from endowment; recipient has no choice."""
+    amount = int(player_action.rsplit("_", _ONE)[_ONE])
+    dictator_keeps = float(DICTATOR_ENDOWMENT - amount)
+    recipient_gets = float(amount)
+    return (dictator_keeps, recipient_gets)
+
+
+_DICTATOR_ACTIONS = [
+    f"give_{i}" for i in range(DICTATOR_ENDOWMENT + _ONE)
+]
+
+
+# -- Centipede Game --
+
+def _centipede_payoff(player_action: str, opponent_action: str) -> tuple[float, float]:
+    """Alternating pass/take game with growing pot.
+
+    Actions encode the stage: 'take_N' means take at stage N,
+    'pass_all' means pass through all stages.
+    The opponent strategy similarly responds with take or pass.
+    """
+    if player_action == "pass_all":
+        player_stage = CENTIPEDE_MAX_STAGES + _ONE
+    else:
+        player_stage = int(player_action.rsplit("_", _ONE)[_ONE])
+
+    if opponent_action == "pass_all":
+        opp_stage = CENTIPEDE_MAX_STAGES + _ONE
+    else:
+        opp_stage = int(opponent_action.rsplit("_", _ONE)[_ONE])
+
+    take_stage = min(player_stage, opp_stage)
+
+    pot = CENTIPEDE_INITIAL_POT
+    for _ in range(take_stage):
+        pot = pot * CENTIPEDE_GROWTH_MULTIPLIER
+
+    large = pot * CENTIPEDE_LARGE_SHARE_NUMERATOR // CENTIPEDE_LARGE_SHARE_DENOMINATOR
+    small = pot * CENTIPEDE_SMALL_SHARE_NUMERATOR // CENTIPEDE_SMALL_SHARE_DENOMINATOR
+
+    if player_stage <= opp_stage:
+        return (float(large), float(small))
+    return (float(small), float(large))
+
+
+_CENTIPEDE_ACTIONS = [
+    f"take_{i}" for i in range(CENTIPEDE_MAX_STAGES + _ONE)
+] + ["pass_all"]
+
+
+# -- Stackelberg Competition --
+
+def _stackelberg_payoff(
+    player_action: str, opponent_action: str,
+) -> tuple[float, float]:
+    """Stackelberg duopoly: leader (player) and follower (opponent).
+
+    Profit = (demand_intercept - slope * (q_leader + q_follower) - cost) * q
+    """
+    q_leader = int(player_action.rsplit("_", _ONE)[_ONE])
+    q_follower = int(opponent_action.rsplit("_", _ONE)[_ONE])
+
+    total_q = q_leader + q_follower
+    price = STACKELBERG_DEMAND_INTERCEPT - STACKELBERG_DEMAND_SLOPE * total_q
+
+    leader_profit = float((price - STACKELBERG_MARGINAL_COST) * q_leader)
+    follower_profit = float((price - STACKELBERG_MARGINAL_COST) * q_follower)
+    return (leader_profit, follower_profit)
+
+
+_STACKELBERG_ACTIONS = [
+    f"produce_{i}" for i in range(STACKELBERG_MAX_QUANTITY + _ONE)
+]
+
+
+# -- Register --
+
+SEQUENTIAL_GAMES: dict[str, GameConfig] = {
+    "dictator": GameConfig(
+        name="Dictator Game",
+        description=(
+            "One player (the dictator) decides how to split an endowment "
+            "with a passive recipient who has no say. Tests fairness "
+            "preferences and altruistic behavior when there is no strategic "
+            "incentive to share."
+        ),
+        actions=_DICTATOR_ACTIONS,
+        game_type="dictator",
+        default_rounds=SINGLE_SHOT_ROUNDS,
+        payoff_fn=_dictator_payoff,
+    ),
+    "centipede": GameConfig(
+        name="Centipede Game",
+        description=(
+            "Players alternate deciding to take or pass. Each pass doubles "
+            "the pot. The taker gets the larger share while the other gets "
+            "the smaller share. Backward induction predicts immediate taking, "
+            "but cooperation through passing yields higher joint payoffs."
+        ),
+        actions=_CENTIPEDE_ACTIONS,
+        game_type="centipede",
+        default_rounds=SINGLE_SHOT_ROUNDS,
+        payoff_fn=_centipede_payoff,
+    ),
+    "stackelberg": GameConfig(
+        name="Stackelberg Competition",
+        description=(
+            "A quantity-setting duopoly where the leader commits to a "
+            "production quantity first, and the follower observes and "
+            "responds. The leader can exploit first-mover advantage. "
+            "Price is determined by total market quantity."
+        ),
+        actions=_STACKELBERG_ACTIONS,
+        game_type="stackelberg",
+        default_rounds=DEFAULT_NUM_ROUNDS,
+        payoff_fn=_stackelberg_payoff,
+    ),
+}
+
+GAMES.update(SEQUENTIAL_GAMES)
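Review note: `_stackelberg_payoff` only scores a pair of quantities; the "follower observes and responds" part of the description has to come from whichever strategy plays the opponent. As a rough sketch of that response, the snippet below reproduces the profit formula from the docstring with hypothetical constants (intercept 12, slope 1, marginal cost 2, maximum quantity 10) and searches the follower's integer actions. `follower_best_response` is an illustrative helper and not part of the diff.

```python
# Hypothetical stand-ins for the STACKELBERG_* constants.
DEMAND_INTERCEPT = 12
DEMAND_SLOPE = 1
MARGINAL_COST = 2
MAX_QUANTITY = 10


def stackelberg_payoff(player_action, opponent_action):
    """Profit = (intercept - slope * (q_leader + q_follower) - cost) * q."""
    q_leader = int(player_action.rsplit("_", 1)[1])
    q_follower = int(opponent_action.rsplit("_", 1)[1])
    price = DEMAND_INTERCEPT - DEMAND_SLOPE * (q_leader + q_follower)
    margin = price - MARGINAL_COST
    return (float(margin * q_leader), float(margin * q_follower))


def follower_best_response(leader_action):
    """Follower quantity that maximizes follower profit, found by grid
    search over the discrete action set (first maximizer wins ties)."""
    candidates = [f"produce_{q}" for q in range(MAX_QUANTITY + 1)]
    return max(candidates, key=lambda a: stackelberg_payoff(leader_action, a)[1])
```

For a leader quantity of 4 under these assumptions, the follower's profit is `q * (6 - q)`, which peaks at `q = 3`. Because the action grid is integer-valued, some leader quantities produce ties between two follower quantities, so any real opponent strategy would need an explicit tie-breaking rule.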
common/games_info/__pycache__/bayesian.cpython-311.pyc
ADDED
Binary file (4.99 kB)
common/games_info/__pycache__/communication.cpython-311.pyc
ADDED
Binary file (6.51 kB)
common/games_info/__pycache__/contracts.cpython-311.pyc
ADDED
Binary file (5.33 kB)
common/games_info/__pycache__/network.cpython-311.pyc
ADDED
Binary file (5.34 kB)
common/games_info/__pycache__/signaling.cpython-311.pyc
ADDED
Binary file (6.78 kB)
common/games_info/bayesian.py
ADDED
@@ -0,0 +1,125 @@
+"""Bayesian and incomplete information games for KantBench."""
+from __future__ import annotations
+
+from common.games import GAMES, GameConfig, _matrix_payoff_fn
+from constant_definitions.game_constants import DEFAULT_NUM_ROUNDS, SINGLE_SHOT_ROUNDS
+from constant_definitions.batch4.bayesian_constants import (
+    GG_ATTACK_ATTACK, GG_ATTACK_WAIT, GG_WAIT_ATTACK, GG_WAIT_WAIT,
+    JV_CONVICT_CONVICT, JV_ACQUIT_ACQUIT, JV_SPLIT_VOTE,
+    IC_SIGNAL_SIGNAL, IC_SIGNAL_CROWD, IC_CROWD_SIGNAL, IC_CROWD_CROWD,
+    ASI_REVEAL_REVEAL, ASI_REVEAL_HIDE, ASI_HIDE_REVEAL, ASI_HIDE_HIDE,
+)
+
+
+# -- Global Game (regime change / bank run under private signals) --
+_GG: dict[tuple[str, str], tuple[float, float]] = {
+    ("attack", "attack"): (float(GG_ATTACK_ATTACK), float(GG_ATTACK_ATTACK)),
+    ("attack", "wait"): (float(GG_ATTACK_WAIT), float(GG_WAIT_ATTACK)),
+    ("wait", "attack"): (float(GG_WAIT_ATTACK), float(GG_ATTACK_WAIT)),
+    ("wait", "wait"): (float(GG_WAIT_WAIT), float(GG_WAIT_WAIT)),
+}
+
+
+# -- Jury Voting (unanimity rule for conviction) --
+_JV: dict[tuple[str, str], tuple[float, float]] = {
+    ("guilty", "guilty"): (float(JV_CONVICT_CONVICT), float(JV_CONVICT_CONVICT)),
+    ("guilty", "acquit"): (float(JV_SPLIT_VOTE), float(JV_SPLIT_VOTE)),
+    ("acquit", "guilty"): (float(JV_SPLIT_VOTE), float(JV_SPLIT_VOTE)),
+    ("acquit", "acquit"): (float(JV_ACQUIT_ACQUIT), float(JV_ACQUIT_ACQUIT)),
+}
+
+
+# -- Information Cascade (follow own signal vs follow crowd) --
+_IC: dict[tuple[str, str], tuple[float, float]] = {
+    ("follow_signal", "follow_signal"): (
+        float(IC_SIGNAL_SIGNAL), float(IC_SIGNAL_SIGNAL),
+    ),
+    ("follow_signal", "follow_crowd"): (
+        float(IC_SIGNAL_CROWD), float(IC_CROWD_SIGNAL),
+    ),
+    ("follow_crowd", "follow_signal"): (
+        float(IC_CROWD_SIGNAL), float(IC_SIGNAL_CROWD),
+    ),
+    ("follow_crowd", "follow_crowd"): (
+        float(IC_CROWD_CROWD), float(IC_CROWD_CROWD),
+    ),
+}
+
+
+# -- Adverse Selection (reveal or hide private type) --
+_ASI: dict[tuple[str, str], tuple[float, float]] = {
+    ("reveal_type", "reveal_type"): (
+        float(ASI_REVEAL_REVEAL), float(ASI_REVEAL_REVEAL),
+    ),
+    ("reveal_type", "hide_type"): (
+        float(ASI_REVEAL_HIDE), float(ASI_HIDE_REVEAL),
+    ),
+    ("hide_type", "reveal_type"): (
+        float(ASI_HIDE_REVEAL), float(ASI_REVEAL_HIDE),
+    ),
+    ("hide_type", "hide_type"): (
+        float(ASI_HIDE_HIDE), float(ASI_HIDE_HIDE),
+    ),
+}
+
+
+# -- Register --
+BAYESIAN_GAMES: dict[str, GameConfig] = {
+    "global_game": GameConfig(
+        name="Global Game",
+        description=(
+            "A coordination game modeling regime change or bank runs under "
+            "incomplete information. Players receive private signals about "
+            "fundamentals and choose to attack or wait. Successful coordination "
+            "on attack yields high payoffs but unilateral attack is costly. "
+            "Tests strategic behavior under private information."
+        ),
+        actions=["attack", "wait"],
+        game_type="matrix",
+        default_rounds=DEFAULT_NUM_ROUNDS,
+        payoff_fn=_matrix_payoff_fn(_GG),
+    ),
+    "jury_voting": GameConfig(
+        name="Jury Voting Game",
+        description=(
+            "Two jurors simultaneously vote guilty or acquit under a unanimity "
+            "rule. Conviction requires both voting guilty. Each juror has a "
+            "private signal about the defendant. Strategic voting may differ "
+            "from sincere voting. Tests information aggregation under voting."
+        ),
+        actions=["guilty", "acquit"],
+        game_type="matrix",
+        default_rounds=SINGLE_SHOT_ROUNDS,
+        payoff_fn=_matrix_payoff_fn(_JV),
+    ),
+    "information_cascade": GameConfig(
+        name="Information Cascade Game",
+        description=(
+            "Players choose whether to follow their own private signal or "
+            "follow the crowd. Independent signal-following leads to better "
+            "information aggregation while crowd-following creates herding. "
+            "Asymmetric payoffs reflect the benefit of diverse information. "
+            "Tests independence of judgment under social influence."
+        ),
+        actions=["follow_signal", "follow_crowd"],
+        game_type="matrix",
+        default_rounds=SINGLE_SHOT_ROUNDS,
+        payoff_fn=_matrix_payoff_fn(_IC),
+    ),
+    "adverse_selection_insurance": GameConfig(
+        name="Adverse Selection Insurance Game",
+        description=(
+            "An insurance market game with asymmetric information. Each player "
+            "can reveal their private risk type for efficient pricing or hide "
+            "it to exploit information asymmetry. Mutual revelation enables "
+            "fair pricing. Hiding while the other reveals creates adverse "
+            "selection profit. Tests screening and pooling dynamics."
+        ),
+        actions=["reveal_type", "hide_type"],
+        game_type="matrix",
+        default_rounds=SINGLE_SHOT_ROUNDS,
+        payoff_fn=_matrix_payoff_fn(_ASI),
+    ),
+}
+
+GAMES.update(BAYESIAN_GAMES)
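Review note: all four tables in this file follow the same symmetric pattern, where the entry for `(a, b)` pairs the row player's payoff with the row payoff of the mirrored profile `(b, a)`. The sketch below shows that pattern as a helper and checks which profiles are pure Nash equilibria. The helper names `symmetric_matrix` and `pure_nash` and the Global Game payoff numbers are hypothetical, chosen only so that the game has the coordination structure the description claims.

```python
def symmetric_matrix(row_payoffs):
    """Expand row-player payoffs into a full (row, column) payoff table.

    For a symmetric game the column player's payoff at (a, b) equals the
    row player's payoff at (b, a).
    """
    return {
        (a, b): (float(u), float(row_payoffs[(b, a)]))
        for (a, b), u in row_payoffs.items()
    }


def pure_nash(matrix, actions):
    """Return every profile where neither player gains by deviating alone."""
    equilibria = []
    for a in actions:
        for b in actions:
            u_row, u_col = matrix[(a, b)]
            row_ok = all(matrix[(x, b)][0] <= u_row for x in actions)
            col_ok = all(matrix[(a, y)][1] <= u_col for y in actions)
            if row_ok and col_ok:
                equilibria.append((a, b))
    return equilibria


# Hypothetical stand-ins for the GG_* constants.
GG_ROW = {
    ("attack", "attack"): 4,
    ("attack", "wait"): -1,
    ("wait", "attack"): 1,
    ("wait", "wait"): 0,
}
GG = symmetric_matrix(GG_ROW)
GG_EQUILIBRIA = pure_nash(GG, ["attack", "wait"])
```

With these assumed payoffs both `(attack, attack)` and `(wait, wait)` are equilibria, which is the coordination problem the Global Game description refers to. Whether the real `GG_*` constants produce the same structure cannot be confirmed from this diff.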
common/games_info/communication.py
ADDED
@@ -0,0 +1,162 @@
+"""Communication and mediation games for KantBench."""
+from __future__ import annotations
+
+from common.games import GAMES, GameConfig, _matrix_payoff_fn
+from constant_definitions.game_constants import DEFAULT_NUM_ROUNDS, SINGLE_SHOT_ROUNDS
+from constant_definitions.var.communication_constants import (
+    CTPD_REWARD, CTPD_TEMPTATION, CTPD_PUNISHMENT, CTPD_SUCKER,
+    COMMIT_COST,
+    CE_FOLLOW_FOLLOW, CE_FOLLOW_DEVIATE,
+    CE_DEVIATE_FOLLOW, CE_DEVIATE_DEVIATE,
+    FP_MATCH_PAYOFF, FP_MISMATCH_PAYOFF,
+    MG_ACCEPT_ACCEPT, MG_ACCEPT_REJECT,
+    MG_REJECT_ACCEPT, MG_REJECT_REJECT,
+)
+
+_ONE = int(bool(True))
+_ZERO_F = float()
+
+# -- Cheap Talk PD (message + action, messages are non-binding) --
+_CTPD_BASE: dict[tuple[str, str], tuple[float, float]] = {
+    ("cooperate", "cooperate"): (float(CTPD_REWARD), float(CTPD_REWARD)),
+    ("cooperate", "defect"): (float(CTPD_SUCKER), float(CTPD_TEMPTATION)),
+    ("defect", "cooperate"): (float(CTPD_TEMPTATION), float(CTPD_SUCKER)),
+    ("defect", "defect"): (float(CTPD_PUNISHMENT), float(CTPD_PUNISHMENT)),
+}
+
+
+def _cheap_talk_pd_payoff(pa: str, oa: str) -> tuple[float, float]:
+    """Message is cheap talk; payoff depends only on actual action."""
+    actual_p = pa.rsplit("_", _ONE)[_ONE]
+    actual_o = oa.rsplit("_", _ONE)[_ONE]
+    return _CTPD_BASE[(actual_p, actual_o)]
+
+
+_CTPD_ACTS = [
+    "msg_coop_cooperate", "msg_coop_defect",
+    "msg_def_cooperate", "msg_def_defect",
+]
+
+
+# -- Binding Commitment (costly commitment mechanism) --
+_CC = float(CTPD_REWARD)
+_CS = float(CTPD_SUCKER)
+_CT = float(CTPD_TEMPTATION)
+_CP = float(CTPD_PUNISHMENT)
+_COST = float(COMMIT_COST)
+
+_BIND_MATRIX: dict[tuple[str, str], tuple[float, float]] = {
+    ("commit_coop", "commit_coop"): (_CC - _COST, _CC - _COST),
+    ("commit_coop", "free_coop"): (_CC - _COST, _CC),
+    ("commit_coop", "free_defect"): (_CS - _COST, _CT),
+    ("free_coop", "commit_coop"): (_CC, _CC - _COST),
+    ("free_coop", "free_coop"): (_CC, _CC),
+    ("free_coop", "free_defect"): (_CS, _CT),
+    ("free_defect", "commit_coop"): (_CT, _CS - _COST),
+    ("free_defect", "free_coop"): (_CT, _CS),
+    ("free_defect", "free_defect"): (_CP, _CP),
+}
+
+
+# -- Correlated Equilibrium (follow external mediator or deviate) --
+_CE: dict[tuple[str, str], tuple[float, float]] = {
+    ("follow", "follow"): (float(CE_FOLLOW_FOLLOW), float(CE_FOLLOW_FOLLOW)),
+    ("follow", "deviate"): (float(CE_FOLLOW_DEVIATE), float(CE_DEVIATE_FOLLOW)),
+    ("deviate", "follow"): (float(CE_DEVIATE_FOLLOW), float(CE_FOLLOW_DEVIATE)),
+    ("deviate", "deviate"): (float(CE_DEVIATE_DEVIATE), float(CE_DEVIATE_DEVIATE)),
+}
+
+
+# -- Focal Point (multi-option coordination without communication) --
+_FP_MATCH = float(FP_MATCH_PAYOFF)
+_FP_MISS = float(FP_MISMATCH_PAYOFF)
+_FP_OPTIONS = ["choose_red", "choose_green", "choose_blue", "choose_yellow"]
+
+
+def _focal_point_payoff(pa: str, oa: str) -> tuple[float, float]:
+    if pa == oa:
+        return (_FP_MATCH, _FP_MATCH)
+    return (_FP_MISS, _FP_MISS)
+
+
+# -- Mediated Game (accept or reject third-party mediation) --
+_MED: dict[tuple[str, str], tuple[float, float]] = {
+    ("accept", "accept"): (float(MG_ACCEPT_ACCEPT), float(MG_ACCEPT_ACCEPT)),
+    ("accept", "reject"): (float(MG_ACCEPT_REJECT), float(MG_REJECT_ACCEPT)),
+    ("reject", "accept"): (float(MG_REJECT_ACCEPT), float(MG_ACCEPT_REJECT)),
+    ("reject", "reject"): (float(MG_REJECT_REJECT), float(MG_REJECT_REJECT)),
+}
+
+
+# -- Register --
+COMMUNICATION_GAMES: dict[str, GameConfig] = {
+    "cheap_talk_pd": GameConfig(
+        name="Cheap Talk Prisoner's Dilemma",
+        description=(
+            "A Prisoner's Dilemma where each player sends a non-binding "
+            "message before acting. Messages are cheap talk: costless and "
+            "unenforceable. Payoffs depend only on actual actions. Tests "
+            "whether non-binding communication improves cooperation."
+        ),
+        actions=_CTPD_ACTS,
+        game_type="cheap_talk_pd",
+        default_rounds=DEFAULT_NUM_ROUNDS,
+        payoff_fn=_cheap_talk_pd_payoff,
+    ),
+    "binding_commitment": GameConfig(
+        name="Binding Commitment Game",
+        description=(
+            "A Prisoner's Dilemma where players can pay a cost to make a "
+            "binding commitment to cooperate. The commitment is credible "
+            "but costly. Tests whether costly signaling through commitment "
+            "mechanisms changes equilibrium behavior."
+        ),
+        actions=["commit_coop", "free_coop", "free_defect"],
+        game_type="matrix",
+        default_rounds=DEFAULT_NUM_ROUNDS,
+        payoff_fn=_matrix_payoff_fn(_BIND_MATRIX),
+    ),
+    "correlated_equilibrium": GameConfig(
+        name="Correlated Equilibrium Game",
+        description=(
+            "An external mediator sends private recommendations to each "
+            "player. Following yields an efficient correlated outcome. "
+            "Deviating can be profitable if the other follows but mutual "
+            "deviation destroys coordination gains. Tests trust in "
+            "external coordination mechanisms."
+        ),
+        actions=["follow", "deviate"],
+        game_type="matrix",
+        default_rounds=DEFAULT_NUM_ROUNDS,
+        payoff_fn=_matrix_payoff_fn(_CE),
+    ),
+    "focal_point": GameConfig(
+        name="Focal Point Game",
+        description=(
+            "Players must coordinate on the same choice from four options "
+            "without communication. Only matching yields a positive payoff. "
+            "Tests Schelling focal point reasoning and the ability to "
+            "identify salient coordination targets."
+        ),
+        actions=_FP_OPTIONS,
+        game_type="focal_point",
+        default_rounds=SINGLE_SHOT_ROUNDS,
+        payoff_fn=_focal_point_payoff,
+    ),
+    "mediated_game": GameConfig(
+        name="Mediated Game",
+        description=(
+            "A dispute between two players where a mediator proposes a "
+            "fair resolution. Both accepting yields an efficient outcome. "
+            "Rejecting while the other accepts gives an advantage but "
+            "mutual rejection leads to costly breakdown. Tests willingness "
+            "to accept third-party dispute resolution."
+        ),
+        actions=["accept", "reject"],
+        game_type="matrix",
+        default_rounds=DEFAULT_NUM_ROUNDS,
+        payoff_fn=_matrix_payoff_fn(_MED),
+    ),
+}
+
+GAMES.update(COMMUNICATION_GAMES)
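Review note: the cheap-talk actions pack a message and an action into one string, and `_cheap_talk_pd_payoff` recovers the action with `rsplit("_", 1)`. The sketch below shows how that split behaves and how a caller might test whether a message matched the action that followed. `MESSAGE_MEANING`, `split_cheap_talk`, and `is_honest` are hypothetical helpers, and the payoff values (3, 5, 1, 0) are placeholder Prisoner's Dilemma numbers rather than the repository's `CTPD_*` constants.

```python
# Hypothetical stand-ins for CTPD_REWARD, CTPD_TEMPTATION,
# CTPD_PUNISHMENT and CTPD_SUCKER.
REWARD, TEMPTATION, PUNISHMENT, SUCKER = 3.0, 5.0, 1.0, 0.0

BASE = {
    ("cooperate", "cooperate"): (REWARD, REWARD),
    ("cooperate", "defect"): (SUCKER, TEMPTATION),
    ("defect", "cooperate"): (TEMPTATION, SUCKER),
    ("defect", "defect"): (PUNISHMENT, PUNISHMENT),
}

# Maps each message prefix to the action it announces (assumed reading).
MESSAGE_MEANING = {"msg_coop": "cooperate", "msg_def": "defect"}


def split_cheap_talk(action):
    """Split 'msg_coop_defect' into ('msg_coop', 'defect').

    rsplit with maxsplit=1 cuts at the last underscore only, so the
    message prefix keeps its own underscore.
    """
    message, actual = action.rsplit("_", 1)
    return message, actual


def cheap_talk_payoff(player_action, opponent_action):
    """Payoff ignores the message and depends only on the actual moves."""
    _, actual_p = split_cheap_talk(player_action)
    _, actual_o = split_cheap_talk(opponent_action)
    return BASE[(actual_p, actual_o)]


def is_honest(action):
    """True when the announced intention equals the action taken."""
    message, actual = split_cheap_talk(action)
    return MESSAGE_MEANING[message] == actual
```

Because the payoff ignores the message, `msg_coop_defect` and `msg_def_defect` score identically in a single round. Any effect of communication would have to show up through how the opponent's later choices react to the messages it has seen.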
common/games_info/contracts.py
ADDED
@@ -0,0 +1,125 @@
+"""Principal-agent and contract theory games for KantBench."""
+from __future__ import annotations
+
+from common.games import GAMES, GameConfig
+from constant_definitions.game_constants import SINGLE_SHOT_ROUNDS
+from constant_definitions.ext.dynamic_constants import (
+    MH_BASE_OUTPUT, MH_EFFORT_BOOST, MH_EFFORT_COST, MH_MAX_BONUS,
+    SCR_HIGH_TYPE_VALUE, SCR_LOW_TYPE_VALUE,
+    SCR_PREMIUM_PRICE, SCR_BASIC_PRICE,
+    GE_MAX_WAGE, GE_MAX_EFFORT,
+    GE_EFFORT_COST_PER_UNIT, GE_PRODUCTIVITY_PER_EFFORT,
+)
+
+_ONE = int(bool(True))
+_ZERO = int()
+
+
+# -- Moral Hazard --
+def _moral_hazard_payoff(
+    player_action: str, opponent_action: str,
+) -> tuple[float, float]:
+    """Principal sets bonus; agent chooses effort.
+
+    Principal: output - bonus (output is boosted only if the agent works).
+    Agent: bonus - effort_cost if working, bonus if shirking.
+    """
+    bonus = int(player_action.rsplit("_", _ONE)[_ONE])
+    works = opponent_action == "work"
+    output = MH_BASE_OUTPUT + MH_EFFORT_BOOST if works else MH_BASE_OUTPUT
+    principal_pay = float(output - bonus)
+    agent_pay = float(bonus - MH_EFFORT_COST) if works else float(bonus)
+    return (principal_pay, agent_pay)
+
+
+_MH_BONUS_ACTIONS = [f"bonus_{i}" for i in range(MH_MAX_BONUS + _ONE)]
+
+
+# -- Screening --
+def _screening_payoff(
+    player_action: str, opponent_action: str,
+) -> tuple[float, float]:
+    """Principal offers contract menu; agent self-selects.
+
+    Agent picks premium or basic contract based on private type.
+    """
+    if player_action == "offer_premium":
+        price = SCR_PREMIUM_PRICE
+    else:
+        price = SCR_BASIC_PRICE
+
+    if opponent_action == "choose_premium":
+        buyer_value = SCR_HIGH_TYPE_VALUE
+        seller_pay = float(SCR_PREMIUM_PRICE)
+        buyer_pay = float(buyer_value - SCR_PREMIUM_PRICE)
+    else:
+        buyer_value = SCR_LOW_TYPE_VALUE
+        seller_pay = float(SCR_BASIC_PRICE)
+        buyer_pay = float(buyer_value - SCR_BASIC_PRICE)
+
+    return (seller_pay, buyer_pay)
+
+
+# -- Gift Exchange --
+def _gift_exchange_payoff(
+    player_action: str, opponent_action: str,
+) -> tuple[float, float]:
+    """Employer offers wage; worker chooses effort.
+
+    Employer profit = productivity * effort - wage.
+    Worker payoff = wage - effort_cost * effort.
+    """
+    wage = int(player_action.rsplit("_", _ONE)[_ONE])
+    effort = int(opponent_action.rsplit("_", _ONE)[_ONE])
+    employer_pay = float(GE_PRODUCTIVITY_PER_EFFORT * effort - wage)
+    worker_pay = float(wage - GE_EFFORT_COST_PER_UNIT * effort)
+    return (employer_pay, worker_pay)
+
+
| 79 |
+
_GE_WAGE_ACTIONS = [f"wage_{i}" for i in range(GE_MAX_WAGE + _ONE)]
|
| 80 |
+
|
| 81 |
+
|
| 82 |
+
# -- Register --
|
| 83 |
+
CONTRACT_GAMES: dict[str, GameConfig] = {
|
| 84 |
+
"moral_hazard": GameConfig(
|
| 85 |
+
name="Moral Hazard (Principal-Agent)",
|
| 86 |
+
description=(
|
| 87 |
+
"A principal offers a bonus contract; an agent with "
|
| 88 |
+
"unobservable effort decides whether to work or shirk. "
|
| 89 |
+
"Tests optimal incentive design and the tradeoff between "
|
| 90 |
+
"motivation and rent extraction."
|
| 91 |
+
),
|
| 92 |
+
actions=_MH_BONUS_ACTIONS,
|
| 93 |
+
game_type="moral_hazard",
|
| 94 |
+
default_rounds=SINGLE_SHOT_ROUNDS,
|
| 95 |
+
payoff_fn=_moral_hazard_payoff,
|
| 96 |
+
),
|
| 97 |
+
"screening": GameConfig(
|
| 98 |
+
name="Screening Game",
|
| 99 |
+
description=(
|
| 100 |
+
"An uninformed principal offers a menu of contracts; "
|
| 101 |
+
"agents of different types self-select. Tests understanding "
|
| 102 |
+
"of incentive compatibility and separating mechanisms "
|
| 103 |
+
"as in Rothschild-Stiglitz insurance models."
|
| 104 |
+
),
|
| 105 |
+
actions=["offer_premium", "offer_basic"],
|
| 106 |
+
game_type="matrix",
|
| 107 |
+
default_rounds=SINGLE_SHOT_ROUNDS,
|
| 108 |
+
payoff_fn=_screening_payoff,
|
| 109 |
+
),
|
| 110 |
+
"gift_exchange": GameConfig(
|
| 111 |
+
name="Gift Exchange Game",
|
| 112 |
+
description=(
|
| 113 |
+
"An employer offers a wage; a worker chooses effort. "
|
| 114 |
+
"Nash prediction is minimal effort regardless of wage, "
|
| 115 |
+
"but reciprocity often leads to higher wages eliciting "
|
| 116 |
+
"higher effort. Tests fairness-driven behavior."
|
| 117 |
+
),
|
| 118 |
+
actions=_GE_WAGE_ACTIONS,
|
| 119 |
+
game_type="gift_exchange",
|
| 120 |
+
default_rounds=SINGLE_SHOT_ROUNDS,
|
| 121 |
+
payoff_fn=_gift_exchange_payoff,
|
| 122 |
+
),
|
| 123 |
+
}
|
| 124 |
+
|
| 125 |
+
GAMES.update(CONTRACT_GAMES)
|
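The gift-exchange description in `contracts.py` says the Nash prediction is minimal effort regardless of wage. A minimal sketch of why, assuming placeholder constants: `PRODUCTIVITY_PER_EFFORT` and `EFFORT_COST_PER_UNIT` stand in for `GE_PRODUCTIVITY_PER_EFFORT` and `GE_EFFORT_COST_PER_UNIT`, whose real values are not shown in this diff, and the `effort_<n>` action name is likewise an assumption (the payoff function only requires a trailing `_<int>`).

```python
# Hypothetical stand-ins for the GE_* constants; real values are not in this diff.
PRODUCTIVITY_PER_EFFORT = 4
EFFORT_COST_PER_UNIT = 1


def gift_exchange_payoff(wage_action: str, effort_action: str) -> tuple[float, float]:
    """Mirror of _gift_exchange_payoff using the placeholder constants."""
    wage = int(wage_action.rsplit("_", 1)[1])
    effort = int(effort_action.rsplit("_", 1)[1])
    employer = float(PRODUCTIVITY_PER_EFFORT * effort - wage)
    worker = float(wage - EFFORT_COST_PER_UNIT * effort)
    return (employer, worker)


def best_effort(wage: int, max_effort: int) -> int:
    """Effort level that maximises the worker's own payoff for a given wage."""
    return max(
        range(max_effort + 1),
        key=lambda e: gift_exchange_payoff(f"wage_{wage}", f"effort_{e}")[1],
    )


if __name__ == "__main__":
    print(gift_exchange_payoff("wage_5", "effort_3"))
    print([best_effort(w, max_effort=5) for w in range(6)])
```

Because the worker's payoff falls with every unit of effort as long as the per-unit cost is positive, the selfish best response is zero effort at any wage; any positive effort observed in play would reflect reciprocity rather than payoff maximisation.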
common/games_info/network.py
ADDED
|
@@ -0,0 +1,120 @@
+"""Network and security interaction games for KantBench."""
+from __future__ import annotations
+
+from common.games import GAMES, GameConfig, _matrix_payoff_fn
+from constant_definitions.game_constants import DEFAULT_NUM_ROUNDS, SINGLE_SHOT_ROUNDS
+from constant_definitions.batch4.network_constants import (
+    SG_DEFEND_SUCCESS, SG_ATTACK_FAIL, SG_DEFEND_FAIL, SG_ATTACK_SUCCESS,
+    LF_MUTUAL_CONNECT, LF_UNILATERAL_COST, LF_MUTUAL_ISOLATE,
+    TWP_CC, TWP_CD, TWP_DC, TWP_DD,
+    TWP_CP, TWP_PC, TWP_DP, TWP_PD, TWP_PP,
+    DG_EARLY_EARLY, DG_EARLY_LATE, DG_LATE_EARLY, DG_LATE_LATE,
+)
+
+
+# -- Security Game (defender allocates, attacker targets) --
+_SG: dict[tuple[str, str], tuple[float, float]] = {
+    ("target_a", "target_a"): (float(SG_DEFEND_SUCCESS), float(SG_ATTACK_FAIL)),
+    ("target_a", "target_b"): (float(SG_DEFEND_FAIL), float(SG_ATTACK_SUCCESS)),
+    ("target_b", "target_a"): (float(SG_DEFEND_FAIL), float(SG_ATTACK_SUCCESS)),
+    ("target_b", "target_b"): (float(SG_DEFEND_SUCCESS), float(SG_ATTACK_FAIL)),
+}
+
+
+# -- Link Formation (bilateral consent required) --
+_LF_CON = float(LF_MUTUAL_CONNECT)
+_LF_UNI = float(LF_UNILATERAL_COST)
+_LF_ISO = float(LF_MUTUAL_ISOLATE)
+
+_LF: dict[tuple[str, str], tuple[float, float]] = {
+    ("connect", "connect"): (_LF_CON, _LF_CON),
+    ("connect", "isolate"): (_LF_UNI, _LF_ISO),
+    ("isolate", "connect"): (_LF_ISO, _LF_UNI),
+    ("isolate", "isolate"): (_LF_ISO, _LF_ISO),
+}
+
+
+# -- Trust with Punishment (3x3: cooperate, defect, punish) --
+_TWP: dict[tuple[str, str], tuple[float, float]] = {
+    ("cooperate", "cooperate"): (float(TWP_CC), float(TWP_CC)),
+    ("cooperate", "defect"): (float(TWP_CD), float(TWP_DC)),
+    ("cooperate", "punish"): (float(TWP_CP), float(TWP_PC)),
+    ("defect", "cooperate"): (float(TWP_DC), float(TWP_CD)),
+    ("defect", "defect"): (float(TWP_DD), float(TWP_DD)),
+    ("defect", "punish"): (float(TWP_DP), float(TWP_PD)),
+    ("punish", "cooperate"): (float(TWP_PC), float(TWP_CP)),
+    ("punish", "defect"): (float(TWP_PD), float(TWP_DP)),
+    ("punish", "punish"): (float(TWP_PP), float(TWP_PP)),
+}
+
+
+# -- Dueling Game (fire timing) --
+_DG: dict[tuple[str, str], tuple[float, float]] = {
+    ("fire_early", "fire_early"): (float(DG_EARLY_EARLY), float(DG_EARLY_EARLY)),
+    ("fire_early", "fire_late"): (float(DG_EARLY_LATE), float(DG_LATE_EARLY)),
+    ("fire_late", "fire_early"): (float(DG_LATE_EARLY), float(DG_EARLY_LATE)),
+    ("fire_late", "fire_late"): (float(DG_LATE_LATE), float(DG_LATE_LATE)),
+}
+
+
+# -- Register --
+NETWORK_GAMES: dict[str, GameConfig] = {
+    "security_game": GameConfig(
+        name="Security Game",
+        description=(
+            "An attacker-defender game where the defender allocates protection "
+            "to one of two targets and the attacker simultaneously chooses "
+            "which target to attack. Matching the attacker's target means a "
+            "successful defense. Misallocation lets the attacker succeed. "
+            "Tests strategic resource allocation under adversarial uncertainty."
+        ),
+        actions=["target_a", "target_b"],
+        game_type="matrix",
+        default_rounds=DEFAULT_NUM_ROUNDS,
+        payoff_fn=_matrix_payoff_fn(_SG),
+    ),
+    "link_formation": GameConfig(
+        name="Link Formation Game",
+        description=(
+            "A network formation game where two players simultaneously decide "
+            "whether to form a connection. A link forms only when both agree. "
+            "Mutual connection yields network benefits. Unilateral connection "
+            "attempt is costly. Mutual isolation yields nothing. Tests "
+            "bilateral consent in network formation."
+        ),
+        actions=["connect", "isolate"],
+        game_type="matrix",
+        default_rounds=DEFAULT_NUM_ROUNDS,
+        payoff_fn=_matrix_payoff_fn(_LF),
+    ),
+    "trust_with_punishment": GameConfig(
+        name="Trust with Punishment Game",
+        description=(
+            "An extended trust game where players can cooperate or defect as "
+            "in the standard Prisoner's Dilemma plus a costly punishment "
+            "action. Punishing reduces the opponent's payoff but also costs "
+            "the punisher. Tests whether altruistic punishment enforces "
+            "cooperation even at personal cost."
+        ),
+        actions=["cooperate", "defect", "punish"],
+        game_type="matrix",
+        default_rounds=DEFAULT_NUM_ROUNDS,
+        payoff_fn=_matrix_payoff_fn(_TWP),
+    ),
+    "dueling_game": GameConfig(
+        name="Dueling Game",
+        description=(
+            "A timing game where two players simultaneously choose when to "
+            "fire: early for a safe but moderate payoff or late for higher "
+            "accuracy. Firing early against a late opponent is advantageous. "
+            "Mutual late firing yields better outcomes than mutual early. "
+            "Tests patience versus preemption under uncertainty."
+        ),
+        actions=["fire_early", "fire_late"],
+        game_type="matrix",
+        default_rounds=SINGLE_SHOT_ROUNDS,
+        payoff_fn=_matrix_payoff_fn(_DG),
+    ),
+}
+
+GAMES.update(NETWORK_GAMES)
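The games in `network.py` are all registered through `_matrix_payoff_fn`, which is imported from `common.games` but whose body does not appear in this part of the diff. The sketch below assumes it is a thin wrapper around a dictionary lookup; `matrix_payoff_fn`, `pure_nash`, and every numeric payoff are hypothetical stand-ins chosen only to illustrate the structure of the Security Game and the Link Formation Game.

```python
from typing import Callable

PayoffMatrix = dict[tuple[str, str], tuple[float, float]]


def matrix_payoff_fn(matrix: PayoffMatrix) -> Callable[[str, str], tuple[float, float]]:
    """Hypothetical stand-in for common.games._matrix_payoff_fn."""
    def payoff(player_action: str, opponent_action: str) -> tuple[float, float]:
        return matrix[(player_action, opponent_action)]
    return payoff


# Placeholder payoffs; the real SG_* and LF_* constants are not shown in this diff.
security_game: PayoffMatrix = {
    ("target_a", "target_a"): (3.0, -1.0),   # defense succeeds
    ("target_a", "target_b"): (-2.0, 4.0),   # attack succeeds
    ("target_b", "target_a"): (-2.0, 4.0),
    ("target_b", "target_b"): (3.0, -1.0),
}
link_formation: PayoffMatrix = {
    ("connect", "connect"): (2.0, 2.0),
    ("connect", "isolate"): (-1.0, 0.0),
    ("isolate", "connect"): (0.0, -1.0),
    ("isolate", "isolate"): (0.0, 0.0),
}


def pure_nash(matrix: PayoffMatrix, actions: list[str]) -> list[tuple[str, str]]:
    """Return action pairs where neither player gains by deviating alone."""
    equilibria = []
    for a in actions:
        for b in actions:
            row_ok = all(matrix[(a, b)][0] >= matrix[(x, b)][0] for x in actions)
            col_ok = all(matrix[(a, b)][1] >= matrix[(a, y)][1] for y in actions)
            if row_ok and col_ok:
                equilibria.append((a, b))
    return equilibria


if __name__ == "__main__":
    print(matrix_payoff_fn(security_game)("target_a", "target_b"))
    print(pure_nash(security_game, ["target_a", "target_b"]))
    print(pure_nash(link_formation, ["connect", "isolate"]))
```

Under these placeholder numbers the Security Game has no pure-strategy equilibrium (the defender wants to match, the attacker wants to mismatch), while Link Formation has two: mutual connection and mutual isolation.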
common/games_info/signaling.py
ADDED
|
@@ -0,0 +1,142 @@
+"""Signaling and incomplete information games for KantBench."""
+from __future__ import annotations
+
+from common.games import GAMES, GameConfig, _matrix_payoff_fn
+from constant_definitions.game_constants import DEFAULT_NUM_ROUNDS, SINGLE_SHOT_ROUNDS
+from constant_definitions.ext.signaling_constants import (
+    BQ_TOUGH_BEER_PAYOFF, BQ_TOUGH_QUICHE_PAYOFF,
+    BQ_WEAK_BEER_PAYOFF, BQ_WEAK_QUICHE_PAYOFF,
+    BQ_CHALLENGE_COST, BQ_NO_CHALLENGE_BONUS,
+    SPENCE_HIGH_WAGE, SPENCE_LOW_WAGE,
+    SPENCE_EDU_COST_HIGH, SPENCE_EDU_COST_LOW,
+    CT_ALIGNED_MATCH, CT_ALIGNED_MISMATCH, CT_BIAS,
+    LEMON_GOOD_QUALITY_VALUE, LEMON_BAD_QUALITY_VALUE,
+    LEMON_GOOD_SELLER_COST, LEMON_BAD_SELLER_COST, LEMON_MAX_PRICE,
+    BP_GOOD_STATE_VALUE, BP_BAD_STATE_PENALTY, BP_SAFE_PAYOFF,
+)
+
+_ONE = int(bool(True))
+_TWO = _ONE + _ONE
+
+
+# -- Beer-Quiche (simplified as simultaneous signal-response) --
+_BQ_MATRIX: dict[tuple[str, str], tuple[float, float]] = {
+    ("beer", "challenge"): (float(BQ_TOUGH_BEER_PAYOFF + BQ_CHALLENGE_COST), float(_TWO)),
+    ("beer", "back_down"): (float(BQ_TOUGH_BEER_PAYOFF + BQ_NO_CHALLENGE_BONUS), float(int())),
+    ("quiche", "challenge"): (float(BQ_WEAK_QUICHE_PAYOFF + BQ_CHALLENGE_COST), float(-_ONE)),
+    ("quiche", "back_down"): (float(BQ_WEAK_QUICHE_PAYOFF + BQ_NO_CHALLENGE_BONUS), float(int())),
+}
+
+
+# -- Spence Signaling (worker picks edu level, firm responds) --
+def _spence_payoff(player_action: str, opponent_action: str) -> tuple[float, float]:
+    """Worker chooses education; firm offers wage based on signal."""
+    educated = player_action == "educate"
+    high_wage = opponent_action == "high_wage"
+    wage = SPENCE_HIGH_WAGE if high_wage else SPENCE_LOW_WAGE
+    cost = SPENCE_EDU_COST_HIGH if educated else int()
+    worker_pay = float(wage - cost)
+    firm_pay = float(SPENCE_HIGH_WAGE - wage) if educated else float(SPENCE_LOW_WAGE - wage)
+    return (worker_pay, firm_pay)
+
+
+# -- Cheap Talk --
+_CT_MATRIX: dict[tuple[str, str], tuple[float, float]] = {
+    ("signal_left", "act_left"): (float(CT_ALIGNED_MATCH), float(CT_ALIGNED_MATCH)),
+    ("signal_left", "act_right"): (float(CT_ALIGNED_MISMATCH), float(CT_ALIGNED_MISMATCH)),
+    ("signal_right", "act_left"): (float(CT_ALIGNED_MISMATCH + CT_BIAS), float(CT_ALIGNED_MISMATCH)),
+    ("signal_right", "act_right"): (float(CT_ALIGNED_MATCH + CT_BIAS), float(CT_ALIGNED_MATCH)),
+}
+
+
+# -- Lemon Market --
+def _lemon_payoff(player_action: str, opponent_action: str) -> tuple[float, float]:
+    """Seller sets price; buyer decides to buy or pass."""
+    price = int(player_action.rsplit("_", _ONE)[_ONE])
+    if opponent_action == "pass":
+        return (float(int()), float(int()))
+    avg_value = (LEMON_GOOD_QUALITY_VALUE + LEMON_BAD_QUALITY_VALUE) // _TWO
+    buyer_pay = float(avg_value - price)
+    avg_cost = (LEMON_GOOD_SELLER_COST + LEMON_BAD_SELLER_COST) // _TWO
+    seller_pay = float(price - avg_cost)
+    return (seller_pay, buyer_pay)
+
+
+_LEMON_ACTIONS = [f"price_{i}" for i in range(LEMON_MAX_PRICE + _ONE)]
+
+
+# -- Bayesian Persuasion --
+_BP_MATRIX: dict[tuple[str, str], tuple[float, float]] = {
+    ("reveal", "act"): (float(BP_GOOD_STATE_VALUE), float(BP_GOOD_STATE_VALUE)),
+    ("reveal", "safe"): (float(BP_SAFE_PAYOFF), float(BP_SAFE_PAYOFF)),
+    ("conceal", "act"): (float(BP_BAD_STATE_PENALTY), float(BP_BAD_STATE_PENALTY)),
+    ("conceal", "safe"): (float(BP_SAFE_PAYOFF), float(BP_SAFE_PAYOFF)),
+}
+
+
+# -- Register --
+SIGNALING_GAMES: dict[str, GameConfig] = {
+    "beer_quiche": GameConfig(
+        name="Beer-Quiche Game",
+        description=(
+            "A signaling game: the sender chooses a meal (beer or quiche) "
+            "to signal their type; the receiver decides whether to challenge. "
+            "Tests reasoning about sequential equilibrium and belief refinement."
+        ),
+        actions=["beer", "quiche"],
+        game_type="matrix",
+        default_rounds=DEFAULT_NUM_ROUNDS,
+        payoff_fn=_matrix_payoff_fn(_BQ_MATRIX),
+    ),
+    "spence_signaling": GameConfig(
+        name="Spence Job Market Signaling",
+        description=(
+            "A worker chooses whether to acquire education as a signal of "
+            "ability; a firm responds with a wage offer. Tests understanding "
+            "of separating versus pooling equilibria in labor markets."
+        ),
+        actions=["educate", "no_educate"],
+        game_type="matrix",
+        default_rounds=DEFAULT_NUM_ROUNDS,
+        payoff_fn=_spence_payoff,
+    ),
+    "cheap_talk": GameConfig(
+        name="Cheap Talk",
+        description=(
+            "A sender observes a state and sends a costless message; "
+            "the receiver chooses an action. Interests are partially "
+            "aligned. Tests strategic communication and credibility."
+        ),
+        actions=["signal_left", "signal_right"],
+        game_type="matrix",
+        default_rounds=DEFAULT_NUM_ROUNDS,
+        payoff_fn=_matrix_payoff_fn(_CT_MATRIX),
+    ),
+    "lemon_market": GameConfig(
+        name="Lemon Market",
+        description=(
+            "A seller with private quality information sets a price; "
+            "the buyer decides whether to purchase. Adverse selection "
+            "can cause market unraveling where only low-quality goods trade."
+        ),
+        actions=_LEMON_ACTIONS,
+        game_type="lemon",
+        default_rounds=SINGLE_SHOT_ROUNDS,
+        payoff_fn=_lemon_payoff,
+    ),
+    "bayesian_persuasion": GameConfig(
+        name="Bayesian Persuasion",
+        description=(
+            "A sender designs an information structure (reveal or conceal "
+            "the state); a receiver takes an action based on the signal. "
+            "Tests strategic information disclosure and commitment to "
+            "information policies."
+        ),
+        actions=["reveal", "conceal"],
+        game_type="matrix",
+        default_rounds=DEFAULT_NUM_ROUNDS,
+        payoff_fn=_matrix_payoff_fn(_BP_MATRIX),
+    ),
+}
+
+GAMES.update(SIGNALING_GAMES)
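The lemon-market payoff in `signaling.py` prices the good against the average of the good and bad quality values and costs, using integer floor division. A small sketch of that arithmetic follows; the four quality and cost numbers, `MAX_PRICE`, and the `"buy"` action name are assumptions for illustration (the source only distinguishes `"pass"` from everything else, and the real `LEMON_*` constants are not shown in this diff).

```python
# Hypothetical stand-ins for the LEMON_* constants.
GOOD_VALUE, BAD_VALUE = 10, 4
GOOD_COST, BAD_COST = 6, 2
MAX_PRICE = 10


def lemon_payoff(price_action: str, buyer_action: str) -> tuple[float, float]:
    """Mirror of _lemon_payoff: returns (seller_pay, buyer_pay)."""
    price = int(price_action.rsplit("_", 1)[1])
    if buyer_action == "pass":
        return (0.0, 0.0)
    avg_value = (GOOD_VALUE + BAD_VALUE) // 2   # floor division, as in the source
    avg_cost = (GOOD_COST + BAD_COST) // 2
    return (float(price - avg_cost), float(avg_value - price))


def mutually_acceptable_prices() -> list[int]:
    """Prices at which neither side does worse than the no-trade outcome."""
    return [
        p for p in range(MAX_PRICE + 1)
        if min(lemon_payoff(f"price_{p}", "buy")) >= 0.0
    ]


if __name__ == "__main__":
    print(lemon_payoff("price_5", "buy"))
    print(mutually_acceptable_prices())
```

With these placeholder numbers trade is weakly beneficial to both sides only for prices between the average cost (4) and the average value (7); outside that band one side prefers that the buyer pass.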
common/games_market/__pycache__/advanced.cpython-311.pyc
ADDED
Binary file (4.75 kB)
common/games_market/__pycache__/classic.cpython-311.pyc
ADDED
Binary file (7.5 kB)
common/games_market/__pycache__/contests.cpython-311.pyc
ADDED
Binary file (9.98 kB)
common/games_market/__pycache__/generated_v2.cpython-311.pyc
ADDED
Binary file (6.43 kB)
common/games_market/__pycache__/oligopoly.cpython-311.pyc
ADDED
Binary file (9.24 kB)
common/games_market/advanced.py
ADDED
|
@@ -0,0 +1,125 @@
+"""Advanced market mechanism games for KantBench."""
+from __future__ import annotations
+
+from common.games import GAMES, GameConfig, _matrix_payoff_fn
+from constant_definitions.game_constants import DEFAULT_NUM_ROUNDS, SINGLE_SHOT_ROUNDS
+from constant_definitions.batch4.advanced_constants import (
+    PRE_EARLY_EARLY, PRE_EARLY_LATE, PRE_LATE_EARLY, PRE_LATE_LATE,
+    PRE_OUT_PAYOFF,
+    WOG_LARGE_LARGE, WOG_LARGE_SMALL, WOG_LARGE_NONE,
+    WOG_SMALL_SMALL, WOG_SMALL_NONE, WOG_NO_GIFT,
+    PS_SAVE_PAYOFF, PS_SCORE_PAYOFF, PS_CENTER_BONUS,
+)
+
+_ZERO_F = float()
+_OUT_F = float(PRE_OUT_PAYOFF)
+
+
+# -- Preemption Game (enter_early / enter_late / stay_out) --
+_PRE: dict[tuple[str, str], tuple[float, float]] = {
+    ("enter_early", "enter_early"): (
+        float(PRE_EARLY_EARLY), float(PRE_EARLY_EARLY),
+    ),
+    ("enter_early", "enter_late"): (
+        float(PRE_EARLY_LATE), float(PRE_LATE_EARLY),
+    ),
+    ("enter_early", "stay_out"): (float(PRE_EARLY_LATE), _OUT_F),
+    ("enter_late", "enter_early"): (
+        float(PRE_LATE_EARLY), float(PRE_EARLY_LATE),
+    ),
+    ("enter_late", "enter_late"): (
+        float(PRE_LATE_LATE), float(PRE_LATE_LATE),
+    ),
+    ("enter_late", "stay_out"): (float(PRE_LATE_LATE), _OUT_F),
+    ("stay_out", "enter_early"): (_OUT_F, float(PRE_EARLY_LATE)),
+    ("stay_out", "enter_late"): (_OUT_F, float(PRE_LATE_LATE)),
+    ("stay_out", "stay_out"): (_OUT_F, _OUT_F),
+}
+
+
+# -- War of Gifts (gift_large / gift_small / no_gift) --
+_WOG_LL = float(WOG_LARGE_LARGE)
+_WOG_LS = float(WOG_LARGE_SMALL)
+_WOG_LN = float(WOG_LARGE_NONE)
+_WOG_SS = float(WOG_SMALL_SMALL)
+_WOG_SN = float(WOG_SMALL_NONE)
+_WOG_NG = float(WOG_NO_GIFT)
+_WOG_SL = _ZERO_F  # small loses to large
+
+_WOG: dict[tuple[str, str], tuple[float, float]] = {
+    ("gift_large", "gift_large"): (_WOG_LL, _WOG_LL),
+    ("gift_large", "gift_small"): (_WOG_LS, _WOG_SL),
+    ("gift_large", "no_gift"): (_WOG_LN, _WOG_NG),
+    ("gift_small", "gift_large"): (_WOG_SL, _WOG_LS),
+    ("gift_small", "gift_small"): (_WOG_SS, _WOG_SS),
+    ("gift_small", "no_gift"): (_WOG_SN, _WOG_NG),
+    ("no_gift", "gift_large"): (_WOG_NG, _WOG_LN),
+    ("no_gift", "gift_small"): (_WOG_NG, _WOG_SN),
+    ("no_gift", "no_gift"): (_WOG_NG, _WOG_NG),
+}
+
+
+# -- Penalty Shootout (left / center / right, kicker vs keeper) --
+_PS_SAVE = float(PS_SAVE_PAYOFF)
+_PS_SCORE = float(PS_SCORE_PAYOFF)
+_PS_CENTER = float(PS_CENTER_BONUS)
+
+
+def _penalty_payoff(pa: str, oa: str) -> tuple[float, float]:
+    """Kicker (player) vs keeper (opponent). Match means save."""
+    if pa == oa:
+        return (_PS_SAVE, -_PS_SAVE)
+    if pa == "center":
+        score = _PS_SCORE + _PS_CENTER
+    else:
+        score = _PS_SCORE
+    return (score, -score)
+
+
+# -- Register --
+ADVANCED_GAMES: dict[str, GameConfig] = {
+    "preemption_game": GameConfig(
+        name="Preemption Game",
+        description=(
+            "A timing game with first-mover advantage. Players choose to "
+            "enter a market early (risky if both enter) or late (safer but "
+            "second-mover disadvantage) or stay out entirely for a safe "
+            "payoff. Early entry against a late opponent captures the market. "
+            "Tests preemption incentives and entry deterrence."
+        ),
+        actions=["enter_early", "enter_late", "stay_out"],
+        game_type="matrix",
+        default_rounds=SINGLE_SHOT_ROUNDS,
+        payoff_fn=_matrix_payoff_fn(_PRE),
+    ),
+    "war_of_gifts": GameConfig(
+        name="War of Gifts",
+        description=(
+            "A competitive generosity game. Players choose to give a large "
+            "gift or small gift or no gift. The largest giver wins prestige "
+            "but at material cost. Mutual large gifts cancel prestige gains. "
+            "No gift is safe but earns no prestige. Tests competitive "
+            "signaling through costly generosity."
+        ),
+        actions=["gift_large", "gift_small", "no_gift"],
+        game_type="matrix",
+        default_rounds=SINGLE_SHOT_ROUNDS,
+        payoff_fn=_matrix_payoff_fn(_WOG),
+    ),
+    "penalty_shootout": GameConfig(
+        name="Penalty Shootout",
+        description=(
+            "A zero-sum mismatch game modeling penalty kicks. The kicker "
+            "chooses left or center or right; the goalkeeper dives. Matching "
+            "means a save. Mismatching means a goal. Center kicks score a "
+            "bonus when the goalkeeper guesses wrong. Tests mixed-strategy "
+            "reasoning in adversarial settings."
+        ),
+        actions=["left", "center", "right"],
+        game_type="penalty_shootout",
+        default_rounds=SINGLE_SHOT_ROUNDS,
+        payoff_fn=_penalty_payoff,
+    ),
+}
+
+GAMES.update(ADVANCED_GAMES)
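The penalty-shootout description in `advanced.py` calls the game zero-sum and says it tests mixed-strategy reasoning. The sketch below checks both points under placeholder numbers: `SAVE`, `SCORE`, and `CENTER_BONUS` are hypothetical stand-ins for the `PS_*` constants, whose real values (and sign conventions) are not shown in this diff.

```python
from itertools import product

# Placeholder values; the real PS_* constants are not shown in this diff.
SAVE = -1.0          # assumed kicker payoff when the keeper guesses right
SCORE = 1.0
CENTER_BONUS = 0.5
ACTIONS = ["left", "center", "right"]


def penalty_payoff(kick: str, dive: str) -> tuple[float, float]:
    """Mirror of _penalty_payoff: returns (kicker_pay, keeper_pay)."""
    if kick == dive:
        return (SAVE, -SAVE)
    score = SCORE + CENTER_BONUS if kick == "center" else SCORE
    return (score, -score)


def is_zero_sum() -> bool:
    """True if the two payoffs cancel for every action pair."""
    return all(sum(penalty_payoff(k, d)) == 0.0 for k, d in product(ACTIONS, ACTIONS))


def expected_kicker_payoff(kick: str, keeper_mix: dict[str, float]) -> float:
    """Kicker's expected payoff against a mixed keeper strategy."""
    return sum(p * penalty_payoff(kick, dive)[0] for dive, p in keeper_mix.items())


if __name__ == "__main__":
    uniform = {a: 1.0 / 3.0 for a in ACTIONS}
    print(is_zero_sum())
    for kick in ACTIONS:
        print(kick, round(expected_kicker_payoff(kick, uniform), 3))
```

Against a keeper who dives uniformly at random, the center kick earns strictly more than left or right under these numbers, so a uniform keeper is exploitable; with a positive center bonus the keeper's equilibrium mix has to cover the center more often than the sides.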
common/games_market/classic.py
ADDED
|
@@ -0,0 +1,164 @@
| 1 |
+
"""Classic dilemma and extended strategic games for KantBench."""
|
| 2 |
+
from __future__ import annotations
|
| 3 |
+
|
| 4 |
+
from common.games import GAMES, GameConfig, _matrix_payoff_fn
|
| 5 |
+
from constant_definitions.game_constants import DEFAULT_NUM_ROUNDS, SINGLE_SHOT_ROUNDS
|
| 6 |
+
from constant_definitions.var.classic_constants import (
|
| 7 |
+
TD_MIN_CLAIM, TD_MAX_CLAIM, TD_BONUS,
|
| 8 |
+
DOLLAR_PRIZE, DOLLAR_MAX_BID,
|
| 9 |
+
UD_CHEAP_COST, UD_EXPENSIVE_COST, UD_CHEAP_VALUE, UD_EXPENSIVE_VALUE,
|
| 10 |
+
MINO_WIN_PAYOFF, MINO_TIE_PAYOFF,
|
| 11 |
+
RPSLS_WIN_PAYOFF, RPSLS_LOSE_PAYOFF, RPSLS_DRAW_PAYOFF,
|
| 12 |
+
)
|
| 13 |
+
|
| 14 |
+
_ONE = int(bool(True))
|
| 15 |
+
_TWO = _ONE + _ONE
|
| 16 |
+
_ZERO_F = float()
|
| 17 |
+
|
| 18 |
+
|
| 19 |
+
# -- Traveler's Dilemma --
|
| 20 |
+
def _travelers_payoff(pa: str, oa: str) -> tuple[float, float]:
|
| 21 |
+
"""Lower claim gets bonus; higher claim gets penalty."""
|
| 22 |
+
claim_p = int(pa.rsplit("_", _ONE)[_ONE])
|
| 23 |
+
claim_o = int(oa.rsplit("_", _ONE)[_ONE])
|
| 24 |
+
if claim_p == claim_o:
|
| 25 |
+
return (float(claim_p), float(claim_o))
|
| 26 |
+
if claim_p < claim_o:
|
| 27 |
+
return (float(claim_p + TD_BONUS), float(claim_p - TD_BONUS))
|
| 28 |
+
return (float(claim_o - TD_BONUS), float(claim_o + TD_BONUS))
|
| 29 |
+
|
| 30 |
+
|
| 31 |
+
_TD_ACTS = [f"claim_{i}" for i in range(TD_MIN_CLAIM, TD_MAX_CLAIM + _ONE)]
|
| 32 |
+
|
| 33 |
+
|
| 34 |
+
# -- Dollar Auction (escalation: both pay, highest wins) --
|
| 35 |
+
def _dollar_auction_payoff(pa: str, oa: str) -> tuple[float, float]:
|
| 36 |
+
bid_p = int(pa.rsplit("_", _ONE)[_ONE])
|
| 37 |
+
bid_o = int(oa.rsplit("_", _ONE)[_ONE])
|
| 38 |
+
if bid_p > bid_o:
|
| 39 |
+
return (float(DOLLAR_PRIZE - bid_p), float(-bid_o))
|
| 40 |
+
if bid_o > bid_p:
|
| 41 |
+
return (float(-bid_p), float(DOLLAR_PRIZE - bid_o))
|
| 42 |
+
half = float(DOLLAR_PRIZE) / _TWO
|
| 43 |
+
return (half - float(bid_p), half - float(bid_o))
|
| 44 |
+
|
| 45 |
+
|
| 46 |
+
_DA_ACTS = [f"bid_{i}" for i in range(DOLLAR_MAX_BID + _ONE)]
|
| 47 |
+
|
| 48 |
+
|
| 49 |
+
# -- Unscrupulous Diner's Dilemma (shared bill) --
|
| 50 |
+
def _diner_payoff(pa: str, oa: str) -> tuple[float, float]:
|
| 51 |
+
"""Each orders cheap or expensive; bill is split equally."""
|
| 52 |
+
costs = {"order_cheap": UD_CHEAP_COST, "order_expensive": UD_EXPENSIVE_COST}
|
| 53 |
+
values = {"order_cheap": UD_CHEAP_VALUE, "order_expensive": UD_EXPENSIVE_VALUE}
|
| 54 |
+
total_bill = float(costs[pa] + costs[oa])
|
| 55 |
+
each_pays = total_bill / _TWO
|
| 56 |
+
p_val = float(values[pa]) - each_pays
|
| 57 |
+
o_val = float(values[oa]) - each_pays
|
| 58 |
+
return (p_val, o_val)
|
| 59 |
+
|
| 60 |
+
|
| 61 |
+
# -- Minority Game (anti-coordination: minority side wins) --
|
| 62 |
+
_MINO_ACTS = ["choose_a", "choose_b", "choose_c"]
|
| 63 |
+
|
| 64 |
+
|
| 65 |
+
def _minority_payoff(pa: str, oa: str) -> tuple[float, float]:
|
| 66 |
+
"""With two players: matching = both lose; differing = both win."""
|
| 67 |
+
if pa == oa:
|
| 68 |
+
return (float(MINO_TIE_PAYOFF), float(MINO_TIE_PAYOFF))
|
| 69 |
+
return (float(MINO_WIN_PAYOFF), float(MINO_WIN_PAYOFF))
|
| 70 |
+
|
| 71 |
+
|
| 72 |
+
# -- Rock-Paper-Scissors-Lizard-Spock --
|
| 73 |
+
_RPSLS_W = float(RPSLS_WIN_PAYOFF)
|
| 74 |
+
_RPSLS_L = float(RPSLS_LOSE_PAYOFF)
|
| 75 |
+
_RPSLS_D = float(RPSLS_DRAW_PAYOFF)
|
| 76 |
+
|
| 77 |
+
_RPSLS_BEATS = {
|
| 78 |
+
"rock": ["scissors", "lizard"],
|
| 79 |
+
"paper": ["rock", "spock"],
|
| 80 |
+
"scissors": ["paper", "lizard"],
|
| 81 |
+
"lizard": ["paper", "spock"],
|
| 82 |
+
"spock": ["rock", "scissors"],
|
| 83 |
+
}
|
| 84 |
+
|
| 85 |
+
|
| 86 |
+
def _rpsls_payoff(pa: str, oa: str) -> tuple[float, float]:
|
| 87 |
+
if pa == oa:
|
| 88 |
+
return (_RPSLS_D, _RPSLS_D)
|
| 89 |
+
if oa in _RPSLS_BEATS[pa]:
|
| 90 |
+
return (_RPSLS_W, _RPSLS_L)
|
| 91 |
+
return (_RPSLS_L, _RPSLS_W)
|
| 92 |
+
|
| 93 |
+
|
| 94 |
+
# -- Register --
|
| 95 |
+
CLASSIC_GAMES: dict[str, GameConfig] = {
|
| 96 |
+
"travelers_dilemma": GameConfig(
|
| 97 |
+
name="Traveler's Dilemma",
|
| 98 |
+
description=(
|
| 99 |
+
"Two travelers submit claims. The lower claim sets the base "
|
| 100 |
+
"payout with a bonus for the lower claimant and a penalty for "
|
| 101 |
+
"the higher. Nash equilibrium is the minimum claim but "
|
| 102 |
+
"experimental subjects often claim high. Tests the rationality "
|
| 103 |
+
"paradox in iterative dominance reasoning."
|
| 104 |
+
),
|
| 105 |
+
actions=_TD_ACTS,
|
| 106 |
+
game_type="travelers_dilemma",
|
| 107 |
+
default_rounds=SINGLE_SHOT_ROUNDS,
|
| 108 |
+
payoff_fn=_travelers_payoff,
|
| 109 |
+
),
|
| 110 |
+
"dollar_auction": GameConfig(
|
| 111 |
+
name="Dollar Auction",
|
| 112 |
+
description=(
|
| 113 |
+
"An escalation game: both players bid and both pay their bids "
|
| 114 |
+
"but only the highest bidder wins the prize. Ties split the "
|
| 115 |
+
"prize. Models sunk cost escalation and commitment traps. "
|
| 116 |
+
"Tests resistance to escalation bias."
|
| 117 |
+
),
|
| 118 |
+
actions=_DA_ACTS,
|
| 119 |
+
game_type="dollar_auction",
|
| 120 |
+
default_rounds=SINGLE_SHOT_ROUNDS,
|
| 121 |
+
payoff_fn=_dollar_auction_payoff,
|
| 122 |
+
),
|
| 123 |
+
"unscrupulous_diner": GameConfig(
|
| 124 |
+
name="Unscrupulous Diner's Dilemma",
|
| 125 |
+
description=(
|
| 126 |
+
"Diners at a restaurant independently order cheap or expensive "
|
| 127 |
+
"meals and split the bill equally. Each prefers expensive food "
|
| 128 |
+
"but shared costs create a free-rider problem. A multiplayer "
|
| 129 |
+
"generalization of the Prisoner's Dilemma in social settings."
|
| 130 |
+
),
|
| 131 |
+
actions=["order_cheap", "order_expensive"],
|
| 132 |
+
game_type="matrix",
|
| 133 |
+
default_rounds=DEFAULT_NUM_ROUNDS,
|
| 134 |
+
payoff_fn=_diner_payoff,
|
| 135 |
+
),
|
| 136 |
+
"minority_game": GameConfig(
|
| 137 |
+
name="Minority Game",
|
| 138 |
+
description=(
|
| 139 |
+
"Players independently choose from three options. With two "
|
| 140 |
+
"players, matching choices yield a low tie payoff while "
|
| 141 |
+
"different choices yield a high payoff for both. Tests "
|
| 142 |
+
"anti-coordination and contrarian strategic reasoning."
|
| 143 |
+
),
|
| 144 |
+
actions=_MINO_ACTS,
|
| 145 |
+
game_type="minority",
|
| 146 |
+
default_rounds=DEFAULT_NUM_ROUNDS,
|
| 147 |
+
payoff_fn=_minority_payoff,
|
| 148 |
+
),
|
| 149 |
+
"rpsls": GameConfig(
|
| 150 |
+
name="Rock-Paper-Scissors-Lizard-Spock",
|
| 151 |
+
description=(
|
| 152 |
+
"An extended zero-sum game with five actions. Each action "
|
| 153 |
+
"beats two others and loses to two others. The unique Nash "
|
| 154 |
+
"equilibrium is uniform randomization. Tests strategic "
|
| 155 |
+
"reasoning in larger zero-sum action spaces."
|
| 156 |
+
),
|
| 157 |
+
actions=["rock", "paper", "scissors", "lizard", "spock"],
|
| 158 |
+
game_type="matrix",
|
| 159 |
+
default_rounds=DEFAULT_NUM_ROUNDS,
|
| 160 |
+
payoff_fn=_rpsls_payoff,
|
| 161 |
+
),
|
| 162 |
+
}
|
| 163 |
+
|
| 164 |
+
GAMES.update(CLASSIC_GAMES)
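The RPSLS description above claims that each action beats two others and that uniform randomization is the equilibrium. A minimal sketch of that claim follows. The names `BEATS`, `WIN`, `LOSS` and `DRAW` are hypothetical stand-ins for `_RPSLS_BEATS`, `_RPSLS_W`, `_RPSLS_L` and `_RPSLS_D`, whose real values are defined outside this excerpt, so the numbers are assumptions.

```python
# Assumed stand-ins for constants defined elsewhere in the repository.
BEATS = {
    "rock": {"scissors", "lizard"},
    "paper": {"rock", "spock"},
    "scissors": {"paper", "lizard"},
    "lizard": {"paper", "spock"},
    "spock": {"rock", "scissors"},
}
WIN, LOSS, DRAW = 1.0, -1.0, 0.0  # assumed symmetric win/loss values


def rpsls_payoff(pa, oa):
    """Mirror the control flow of _rpsls_payoff from the diff."""
    if pa == oa:
        return (DRAW, DRAW)
    if oa in BEATS[pa]:
        return (WIN, LOSS)
    return (LOSS, WIN)


def expected_vs_uniform(action):
    """Expected payoff of `action` against a uniformly mixing opponent."""
    acts = list(BEATS)
    return sum(rpsls_payoff(action, o)[0] for o in acts) / len(acts)
```

Under these assumed payoffs every action earns the same expected payoff against a uniform opponent, so no pure deviation is profitable, which is what the equilibrium claim requires.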
|
common/games_market/contests.py
ADDED
|
@@ -0,0 +1,188 @@
|
| 1 |
+
"""Contest, conflict, and fair division games for KantBench."""
|
| 2 |
+
from __future__ import annotations
|
| 3 |
+
|
| 4 |
+
from common.games import GAMES, GameConfig, _matrix_payoff_fn
|
| 5 |
+
from constant_definitions.game_constants import DEFAULT_NUM_ROUNDS, SINGLE_SHOT_ROUNDS
|
| 6 |
+
from constant_definitions.ext.conflict_constants import (
|
| 7 |
+
BLOTTO_BATTLEFIELDS, BLOTTO_TOTAL_TROOPS,
|
| 8 |
+
WOA_PRIZE, WOA_COST_PER_ROUND, WOA_MAX_PERSISTENCE,
|
| 9 |
+
TULLOCK_PRIZE, TULLOCK_MAX_EFFORT,
|
| 10 |
+
INSP_VIOLATION_GAIN, INSP_FINE, INSP_INSPECTION_COST,
|
| 11 |
+
INSP_COMPLIANCE_PAYOFF,
|
| 12 |
+
RUB_SURPLUS, RUB_DISCOUNT_NUM, RUB_DISCOUNT_DEN,
|
| 13 |
+
DAC_ENDOWMENT,
|
| 14 |
+
)
|
| 15 |
+
|
| 16 |
+
_ONE = int(bool(True))
|
| 17 |
+
_TWO = _ONE + _ONE
|
| 18 |
+
_ZERO_F = float()
|
| 19 |
+
|
| 20 |
+
|
| 21 |
+
# -- Colonel Blotto (three battlefields, encoded as alloc_X_Y_Z) --
|
| 22 |
+
def _blotto_payoff(pa: str, oa: str) -> tuple[float, float]:
|
| 23 |
+
"""Score each player by the number of battlefields where they commit more troops."""
|
| 24 |
+
p_parts = pa.split("_")[_ONE:]
|
| 25 |
+
o_parts = oa.split("_")[_ONE:]
|
| 26 |
+
p_wins = int()
|
| 27 |
+
o_wins = int()
|
| 28 |
+
for pv, ov in zip(p_parts, o_parts):
|
| 29 |
+
pi, oi = int(pv), int(ov)
|
| 30 |
+
if pi > oi:
|
| 31 |
+
p_wins += _ONE
|
| 32 |
+
elif oi > pi:
|
| 33 |
+
o_wins += _ONE
|
| 34 |
+
return (float(p_wins), float(o_wins))
|
| 35 |
+
|
| 36 |
+
|
| 37 |
+
def _generate_blotto_actions() -> list[str]:
|
| 38 |
+
"""Generate all valid troop allocations across battlefields."""
|
| 39 |
+
actions = []
|
| 40 |
+
for a in range(BLOTTO_TOTAL_TROOPS + _ONE):
|
| 41 |
+
for b in range(BLOTTO_TOTAL_TROOPS - a + _ONE):
|
| 42 |
+
c = BLOTTO_TOTAL_TROOPS - a - b
|
| 43 |
+
actions.append(f"alloc_{a}_{b}_{c}")
|
| 44 |
+
return actions
|
| 45 |
+
|
| 46 |
+
|
| 47 |
+
_BLOTTO_ACTS = _generate_blotto_actions()
|
| 48 |
+
|
| 49 |
+
|
| 50 |
+
# -- War of Attrition --
|
| 51 |
+
def _woa_payoff(pa: str, oa: str) -> tuple[float, float]:
|
| 52 |
+
p_pers = int(pa.rsplit("_", _ONE)[_ONE])
|
| 53 |
+
o_pers = int(oa.rsplit("_", _ONE)[_ONE])
|
| 54 |
+
if p_pers > o_pers:
|
| 55 |
+
return (float(WOA_PRIZE - p_pers * WOA_COST_PER_ROUND),
|
| 56 |
+
float(-o_pers * WOA_COST_PER_ROUND))
|
| 57 |
+
if o_pers > p_pers:
|
| 58 |
+
return (float(-p_pers * WOA_COST_PER_ROUND),
|
| 59 |
+
float(WOA_PRIZE - o_pers * WOA_COST_PER_ROUND))
|
| 60 |
+
half = float(WOA_PRIZE) / _TWO
|
| 61 |
+
cost = float(p_pers * WOA_COST_PER_ROUND)
|
| 62 |
+
return (half - cost, half - cost)
|
| 63 |
+
|
| 64 |
+
|
| 65 |
+
_WOA_ACTS = [f"persist_{i}" for i in range(WOA_MAX_PERSISTENCE + _ONE)]
|
| 66 |
+
|
| 67 |
+
|
| 68 |
+
# -- Tullock Contest --
|
| 69 |
+
def _tullock_payoff(pa: str, oa: str) -> tuple[float, float]:
|
| 70 |
+
e_p = int(pa.rsplit("_", _ONE)[_ONE])
|
| 71 |
+
e_o = int(oa.rsplit("_", _ONE)[_ONE])
|
| 72 |
+
total = e_p + e_o
|
| 73 |
+
if total == int():
|
| 74 |
+
half = float(TULLOCK_PRIZE) / _TWO
|
| 75 |
+
return (half, half)
|
| 76 |
+
p_prob = float(e_p) / float(total)
|
| 77 |
+
return (float(p_prob * TULLOCK_PRIZE - e_p),
|
| 78 |
+
float((_ONE - p_prob) * TULLOCK_PRIZE - e_o))
|
| 79 |
+
|
| 80 |
+
|
| 81 |
+
_TULLOCK_ACTS = [f"effort_{i}" for i in range(TULLOCK_MAX_EFFORT + _ONE)]
|
| 82 |
+
|
| 83 |
+
|
| 84 |
+
# -- Inspection Game --
|
| 85 |
+
_INSP_MATRIX: dict[tuple[str, str], tuple[float, float]] = {
|
| 86 |
+
("violate", "inspect"): (float(-INSP_FINE), float(INSP_FINE - INSP_INSPECTION_COST)),
|
| 87 |
+
("violate", "no_inspect"): (float(INSP_VIOLATION_GAIN), float(int())),
|
| 88 |
+
("comply", "inspect"): (float(INSP_COMPLIANCE_PAYOFF), float(-INSP_INSPECTION_COST)),
|
| 89 |
+
("comply", "no_inspect"): (float(INSP_COMPLIANCE_PAYOFF), float(int())),
|
| 90 |
+
}
|
| 91 |
+
|
| 92 |
+
|
| 93 |
+
# -- Rubinstein Bargaining (modeled as demand with discount) --
|
| 94 |
+
def _rubinstein_payoff(pa: str, oa: str) -> tuple[float, float]:
|
| 95 |
+
d_p = int(pa.rsplit("_", _ONE)[_ONE])
|
| 96 |
+
d_o = int(oa.rsplit("_", _ONE)[_ONE])
|
| 97 |
+
if d_p + d_o <= RUB_SURPLUS:
|
| 98 |
+
return (float(d_p), float(d_o))
|
| 99 |
+
disc_p = float(d_p * RUB_DISCOUNT_NUM) / float(RUB_DISCOUNT_DEN)
|
| 100 |
+
disc_o = float(d_o * RUB_DISCOUNT_NUM) / float(RUB_DISCOUNT_DEN)
|
| 101 |
+
if d_p + d_o <= RUB_SURPLUS + _TWO:
|
| 102 |
+
return (disc_p, disc_o)
|
| 103 |
+
return (_ZERO_F, _ZERO_F)
|
| 104 |
+
|
| 105 |
+
|
| 106 |
+
_RUB_ACTS = [f"demand_{i}" for i in range(RUB_SURPLUS + _ONE)]
|
| 107 |
+
|
| 108 |
+
|
| 109 |
+
# -- Divide-and-Choose --
|
| 110 |
+
def _dac_payoff(pa: str, oa: str) -> tuple[float, float]:
|
| 111 |
+
split = int(pa.rsplit("_", _ONE)[_ONE])
|
| 112 |
+
choice = oa
|
| 113 |
+
left_piece = split
|
| 114 |
+
right_piece = DAC_ENDOWMENT - split
|
| 115 |
+
if choice == "choose_left":
|
| 116 |
+
return (float(right_piece), float(left_piece))
|
| 117 |
+
return (float(left_piece), float(right_piece))
|
| 118 |
+
|
| 119 |
+
|
| 120 |
+
_DAC_SPLIT_ACTS = [f"split_{i}" for i in range(DAC_ENDOWMENT + _ONE)]
|
| 121 |
+
|
| 122 |
+
CONTEST_GAMES: dict[str, GameConfig] = {
|
| 123 |
+
"colonel_blotto": GameConfig(
|
| 124 |
+
name="Colonel Blotto",
|
| 125 |
+
description=(
|
| 126 |
+
"Two players allocate limited troops across multiple "
|
| 127 |
+
"battlefields. The player with more troops wins each field. "
|
| 128 |
+
"Tests multi-dimensional strategic resource allocation."
|
| 129 |
+
),
|
| 130 |
+
actions=_BLOTTO_ACTS, game_type="blotto",
|
| 131 |
+
default_rounds=SINGLE_SHOT_ROUNDS, payoff_fn=_blotto_payoff,
|
| 132 |
+
),
|
| 133 |
+
"war_of_attrition": GameConfig(
|
| 134 |
+
name="War of Attrition",
|
| 135 |
+
description=(
|
| 136 |
+
"Both players choose how long to persist. The survivor wins "
|
| 137 |
+
"a prize but each pays for their own persistence. Tests endurance "
|
| 138 |
+
"strategy and rent dissipation reasoning."
|
| 139 |
+
),
|
| 140 |
+
actions=_WOA_ACTS, game_type="war_of_attrition",
|
| 141 |
+
default_rounds=SINGLE_SHOT_ROUNDS, payoff_fn=_woa_payoff,
|
| 142 |
+
),
|
| 143 |
+
"tullock_contest": GameConfig(
|
| 144 |
+
name="Tullock Contest",
|
| 145 |
+
description=(
|
| 146 |
+
"Players invest effort to win a prize. Win probability is "
|
| 147 |
+
"proportional to relative effort. Models lobbying, rent-seeking, "
|
| 148 |
+
"and competitive R&D spending."
|
| 149 |
+
),
|
| 150 |
+
actions=_TULLOCK_ACTS, game_type="tullock",
|
| 151 |
+
default_rounds=SINGLE_SHOT_ROUNDS, payoff_fn=_tullock_payoff,
|
| 152 |
+
),
|
| 153 |
+
"inspection_game": GameConfig(
|
| 154 |
+
name="Inspection Game",
|
| 155 |
+
description=(
|
| 156 |
+
"A potential violator chooses to comply or violate; an inspector "
|
| 157 |
+
"chooses whether to inspect. Mixed-strategy equilibrium models "
|
| 158 |
+
"compliance, auditing, and arms control verification."
|
| 159 |
+
),
|
| 160 |
+
actions=["violate", "comply"], game_type="matrix",
|
| 161 |
+
default_rounds=DEFAULT_NUM_ROUNDS,
|
| 162 |
+
payoff_fn=_matrix_payoff_fn(_INSP_MATRIX),
|
| 163 |
+
),
|
| 164 |
+
"rubinstein_bargaining": GameConfig(
|
| 165 |
+
name="Rubinstein Bargaining",
|
| 166 |
+
description=(
|
| 167 |
+
"Players make simultaneous demands over a surplus. Compatible "
|
| 168 |
+
"demands yield immediate payoff; excessive demands are "
|
| 169 |
+
"discounted. Models alternating-offers bargaining with "
|
| 170 |
+
"time preference."
|
| 171 |
+
),
|
| 172 |
+
actions=_RUB_ACTS, game_type="rubinstein",
|
| 173 |
+
default_rounds=DEFAULT_NUM_ROUNDS, payoff_fn=_rubinstein_payoff,
|
| 174 |
+
),
|
| 175 |
+
"divide_and_choose": GameConfig(
|
| 176 |
+
name="Divide-and-Choose",
|
| 177 |
+
description=(
|
| 178 |
+
"The divider splits a resource into two portions; the "
|
| 179 |
+
"chooser takes their preferred portion. The optimal "
|
| 180 |
+
"strategy for the divider is an even split. Tests "
|
| 181 |
+
"envy-free fair division reasoning."
|
| 182 |
+
),
|
| 183 |
+
actions=_DAC_SPLIT_ACTS, game_type="divide_choose",
|
| 184 |
+
default_rounds=SINGLE_SHOT_ROUNDS, payoff_fn=_dac_payoff,
|
| 185 |
+
),
|
| 186 |
+
}
|
| 187 |
+
|
| 188 |
+
GAMES.update(CONTEST_GAMES)
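As a worked illustration of the Tullock payoff rule above, the sketch below brute-forces a best response on the integer effort grid. The prize of 8 and the maximum effort of 8 are assumed values chosen for the example; the real `TULLOCK_PRIZE` and `TULLOCK_MAX_EFFORT` are defined in a constants module that is not part of this excerpt.

```python
def tullock_payoff(e_p, e_o, prize):
    """Same rule as _tullock_payoff, taking integer efforts directly."""
    total = e_p + e_o
    if total == 0:
        # No effort from either side: the prize is split evenly.
        return (prize / 2, prize / 2)
    p_prob = e_p / total
    return (p_prob * prize - e_p, (1 - p_prob) * prize - e_o)


def best_response(e_o, prize, max_effort):
    """Effort level on the integer grid that maximizes the player's payoff."""
    return max(
        range(max_effort + 1),
        key=lambda e: tullock_payoff(e, e_o, prize)[0],
    )


PRIZE = 8       # assumed value, not taken from the repository
MAX_EFFORT = 8  # assumed value, not taken from the repository
symmetric_effort = best_response(2, PRIZE, MAX_EFFORT)
```

With these numbers the best response to an effort of 2 is again 2, which matches the textbook two-player result that the symmetric equilibrium effort is a quarter of the prize.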
|
common/games_market/generated_v2.py
ADDED
|
@@ -0,0 +1,125 @@
|
| 1 |
+
"""Extended procedurally generated games for KantBench."""
|
| 2 |
+
from __future__ import annotations
|
| 3 |
+
|
| 4 |
+
import random as _rand
|
| 5 |
+
|
| 6 |
+
from common.games import GAMES, GameConfig
|
| 7 |
+
from constant_definitions.game_constants import DEFAULT_NUM_ROUNDS
|
| 8 |
+
from constant_definitions.var.generated_ext_constants import (
|
| 9 |
+
RZS_SEED, RZS_MAX_PAYOFF, RZS_DEFAULT_ACTIONS,
|
| 10 |
+
RC_SEED, RC_MATCH_BONUS, RC_MISMATCH_MAX, RC_DEFAULT_ACTIONS,
|
| 11 |
+
PCHK_RESOURCE, PCHK_FIGHT_COST,
|
| 12 |
+
)
|
| 13 |
+
|
| 14 |
+
_ONE = int(bool(True))
|
| 15 |
+
_TWO = _ONE + _ONE
|
| 16 |
+
|
| 17 |
+
|
| 18 |
+
def _action_label(index: int) -> str:
|
| 19 |
+
return chr(ord("a") + index)
|
| 20 |
+
|
| 21 |
+
|
| 22 |
+
def generate_random_zero_sum(
|
| 23 |
+
num_actions: int = RZS_DEFAULT_ACTIONS,
|
| 24 |
+
max_payoff: int = RZS_MAX_PAYOFF,
|
| 25 |
+
seed: int = RZS_SEED,
|
| 26 |
+
) -> GameConfig:
|
| 27 |
+
"""Generate a random NxN zero-sum game."""
|
| 28 |
+
rng = _rand.Random(seed)
|
| 29 |
+
actions = [_action_label(i) for i in range(num_actions)]
|
| 30 |
+
matrix: dict[tuple[str, str], tuple[float, float]] = {}
|
| 31 |
+
for a in actions:
|
| 32 |
+
for b in actions:
|
| 33 |
+
val = float(rng.randint(-max_payoff, max_payoff))
|
| 34 |
+
matrix[(a, b)] = (val, -val)
|
| 35 |
+
|
| 36 |
+
def _payoff(pa: str, oa: str) -> tuple[float, float]:
|
| 37 |
+
return matrix[(pa, oa)]
|
| 38 |
+
|
| 39 |
+
return GameConfig(
|
| 40 |
+
name=f"Random Zero-Sum {num_actions}x{num_actions} (seed={seed})",
|
| 41 |
+
description=(
|
| 42 |
+
f"A randomly generated {num_actions}x{num_actions} zero-sum "
|
| 43 |
+
f"game. Every outcome sums to zero. Tests minimax reasoning "
|
| 44 |
+
f"in adversarial strategic settings."
|
| 45 |
+
),
|
| 46 |
+
actions=actions, game_type="matrix",
|
| 47 |
+
default_rounds=DEFAULT_NUM_ROUNDS, payoff_fn=_payoff,
|
| 48 |
+
)
|
| 49 |
+
|
| 50 |
+
|
| 51 |
+
def generate_random_coordination(
|
| 52 |
+
num_actions: int = RC_DEFAULT_ACTIONS,
|
| 53 |
+
match_bonus: int = RC_MATCH_BONUS,
|
| 54 |
+
mismatch_max: int = RC_MISMATCH_MAX,
|
| 55 |
+
seed: int = RC_SEED,
|
| 56 |
+
) -> GameConfig:
|
| 57 |
+
"""Generate a random NxN coordination game with diagonal bonus."""
|
| 58 |
+
rng = _rand.Random(seed)
|
| 59 |
+
actions = [_action_label(i) for i in range(num_actions)]
|
| 60 |
+
matrix: dict[tuple[str, str], tuple[float, float]] = {}
|
| 61 |
+
for a in actions:
|
| 62 |
+
for b in actions:
|
| 63 |
+
if a == b:
|
| 64 |
+
val = float(match_bonus + rng.randint(int(), mismatch_max))
|
| 65 |
+
matrix[(a, b)] = (val, val)
|
| 66 |
+
else:
|
| 67 |
+
val = float(rng.randint(int(), mismatch_max))
|
| 68 |
+
matrix[(a, b)] = (val, val)
|
| 69 |
+
|
| 70 |
+
def _payoff(pa: str, oa: str) -> tuple[float, float]:
|
| 71 |
+
return matrix[(pa, oa)]
|
| 72 |
+
|
| 73 |
+
return GameConfig(
|
| 74 |
+
name=f"Random Coordination {num_actions}x{num_actions} (seed={seed})",
|
| 75 |
+
description=(
|
| 76 |
+
f"A randomly generated {num_actions}x{num_actions} coordination "
|
| 77 |
+
f"game. Matching actions receive a bonus payoff. Tests focal "
|
| 78 |
+
f"point identification in novel coordination structures."
|
| 79 |
+
),
|
| 80 |
+
actions=actions, game_type="matrix",
|
| 81 |
+
default_rounds=DEFAULT_NUM_ROUNDS, payoff_fn=_payoff,
|
| 82 |
+
)
|
| 83 |
+
|
| 84 |
+
|
| 85 |
+
def generate_parameterized_chicken(
|
| 86 |
+
resource: int = PCHK_RESOURCE,
|
| 87 |
+
fight_cost: int = PCHK_FIGHT_COST,
|
| 88 |
+
) -> GameConfig:
|
| 89 |
+
"""Create a Hawk-Dove / Chicken game with custom parameters."""
|
| 90 |
+
half_v = float(resource) / _TWO
|
| 91 |
+
fight_pay = (float(resource) - float(fight_cost)) / _TWO
|
| 92 |
+
matrix: dict[tuple[str, str], tuple[float, float]] = {
|
| 93 |
+
("hawk", "hawk"): (fight_pay, fight_pay),
|
| 94 |
+
("hawk", "dove"): (float(resource), float(int())),
|
| 95 |
+
("dove", "hawk"): (float(int()), float(resource)),
|
| 96 |
+
("dove", "dove"): (half_v, half_v),
|
| 97 |
+
}
|
| 98 |
+
|
| 99 |
+
def _payoff(pa: str, oa: str) -> tuple[float, float]:
|
| 100 |
+
return matrix[(pa, oa)]
|
| 101 |
+
|
| 102 |
+
return GameConfig(
|
| 103 |
+
name=f"Chicken(V={resource},C={fight_cost})",
|
| 104 |
+
description=(
|
| 105 |
+
f"A parameterized Chicken / Hawk-Dove game with resource value "
|
| 106 |
+
f"{resource} and fight cost {fight_cost}. Tests anti-coordination "
|
| 107 |
+
f"behavior under varied incentive parameters."
|
| 108 |
+
),
|
| 109 |
+
actions=["hawk", "dove"], game_type="matrix",
|
| 110 |
+
default_rounds=DEFAULT_NUM_ROUNDS, payoff_fn=_payoff,
|
| 111 |
+
)
|
| 112 |
+
|
| 113 |
+
|
| 114 |
+
# -- Register default instances --
|
| 115 |
+
_ZS = generate_random_zero_sum()
|
| 116 |
+
_CO = generate_random_coordination()
|
| 117 |
+
_CH = generate_parameterized_chicken()
|
| 118 |
+
|
| 119 |
+
GENERATED_V2: dict[str, GameConfig] = {
|
| 120 |
+
"random_zero_sum_3x3": _ZS,
|
| 121 |
+
"random_coordination_3x3": _CO,
|
| 122 |
+
"parameterized_chicken": _CH,
|
| 123 |
+
}
|
| 124 |
+
|
| 125 |
+
GAMES.update(GENERATED_V2)
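The parameterized Chicken matrix above has a well-known mixed equilibrium when the fight cost exceeds the resource value: each player plays hawk with probability V/C. The sketch below checks the indifference condition with exact fractions. The values V=2 and C=6 are assumptions for illustration, not the repository's `PCHK_RESOURCE` and `PCHK_FIGHT_COST`, and `chicken_matrix` is a hypothetical helper that rebuilds only the payoff table.

```python
from fractions import Fraction


def chicken_matrix(resource, fight_cost):
    """Rebuild the payoff table used by generate_parameterized_chicken."""
    half_v = Fraction(resource, 2)
    fight_pay = Fraction(resource - fight_cost, 2)
    return {
        ("hawk", "hawk"): (fight_pay, fight_pay),
        ("hawk", "dove"): (Fraction(resource), Fraction(0)),
        ("dove", "hawk"): (Fraction(0), Fraction(resource)),
        ("dove", "dove"): (half_v, half_v),
    }


def expected_payoff(matrix, action, p_hawk):
    """Row player's expected payoff when the opponent plays hawk w.p. p_hawk."""
    return (
        p_hawk * matrix[(action, "hawk")][0]
        + (1 - p_hawk) * matrix[(action, "dove")][0]
    )


V, C = 2, 6                 # assumed example parameters with C > V
matrix = chicken_matrix(V, C)
p_star = Fraction(V, C)     # candidate equilibrium hawk probability
hawk_value = expected_payoff(matrix, "hawk", p_star)
dove_value = expected_payoff(matrix, "dove", p_star)
```

Both actions earn the same expected payoff at `p_star`, so the opponent's mix leaves the player indifferent, which is the defining property of the mixed equilibrium.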
|
common/games_market/oligopoly.py
ADDED
|
@@ -0,0 +1,152 @@
|
| 1 |
+
"""Market competition and bargaining games for KantBench."""
|
| 2 |
+
from __future__ import annotations
|
| 3 |
+
|
| 4 |
+
from common.games import GAMES, GameConfig, _matrix_payoff_fn
|
| 5 |
+
from constant_definitions.game_constants import DEFAULT_NUM_ROUNDS, SINGLE_SHOT_ROUNDS
|
| 6 |
+
from constant_definitions.ext.market_constants import (
|
| 7 |
+
COURNOT_DEMAND_INTERCEPT, COURNOT_DEMAND_SLOPE, COURNOT_MARGINAL_COST,
|
| 8 |
+
COURNOT_MAX_QUANTITY,
|
| 9 |
+
BERTRAND_MAX_PRICE, BERTRAND_MARGINAL_COST, BERTRAND_MARKET_SIZE,
|
| 10 |
+
HOTELLING_LINE_LENGTH, HOTELLING_TRANSPORT_COST, HOTELLING_MARKET_VALUE,
|
| 11 |
+
ED_MONOPOLY_PROFIT, ED_DUOPOLY_PROFIT, ED_FIGHT_COST,
|
| 12 |
+
ED_ENTRANT_FIGHT_LOSS, ED_STAY_OUT_PAYOFF,
|
| 13 |
+
ND_SURPLUS, DA_BUYER_VALUE, DA_SELLER_COST, DA_MAX_PRICE,
|
| 14 |
+
)
|
| 15 |
+
|
| 16 |
+
_ONE = int(bool(True))
|
| 17 |
+
_TWO = _ONE + _ONE
|
| 18 |
+
_ZERO_F = float()
|
| 19 |
+
|
| 20 |
+
|
| 21 |
+
def _cournot_payoff(pa: str, oa: str) -> tuple[float, float]:
|
| 22 |
+
q_p = int(pa.rsplit("_", _ONE)[_ONE])
|
| 23 |
+
q_o = int(oa.rsplit("_", _ONE)[_ONE])
|
| 24 |
+
total = q_p + q_o
|
| 25 |
+
price = COURNOT_DEMAND_INTERCEPT - COURNOT_DEMAND_SLOPE * total
|
| 26 |
+
return (float((price - COURNOT_MARGINAL_COST) * q_p),
|
| 27 |
+
float((price - COURNOT_MARGINAL_COST) * q_o))
|
| 28 |
+
|
| 29 |
+
|
| 30 |
+
def _bertrand_payoff(pa: str, oa: str) -> tuple[float, float]:
|
| 31 |
+
p_p = int(pa.rsplit("_", _ONE)[_ONE])
|
| 32 |
+
p_o = int(oa.rsplit("_", _ONE)[_ONE])
|
| 33 |
+
if p_p < p_o:
|
| 34 |
+
demand = max(BERTRAND_MARKET_SIZE - p_p, int())
|
| 35 |
+
return (float((p_p - BERTRAND_MARGINAL_COST) * demand), _ZERO_F)
|
| 36 |
+
if p_o < p_p:
|
| 37 |
+
demand = max(BERTRAND_MARKET_SIZE - p_o, int())
|
| 38 |
+
return (_ZERO_F, float((p_o - BERTRAND_MARGINAL_COST) * demand))
|
| 39 |
+
demand = max(BERTRAND_MARKET_SIZE - p_p, int())
|
| 40 |
+
half_profit = float((p_p - BERTRAND_MARGINAL_COST) * demand) / _TWO
|
| 41 |
+
return (half_profit, half_profit)
|
| 42 |
+
|
| 43 |
+
|
| 44 |
+
def _hotelling_payoff(pa: str, oa: str) -> tuple[float, float]:
|
| 45 |
+
loc_p = int(pa.rsplit("_", _ONE)[_ONE])
|
| 46 |
+
loc_o = int(oa.rsplit("_", _ONE)[_ONE])
|
| 47 |
+
if loc_p == loc_o:
|
| 48 |
+
share = float(HOTELLING_MARKET_VALUE) / _TWO
|
| 49 |
+
return (share, share)
|
| 50 |
+
mid = (loc_p + loc_o) / _TWO
|
| 51 |
+
p_share = mid if loc_p < loc_o else float(HOTELLING_LINE_LENGTH) - mid
|
| 52 |
+
o_share = float(HOTELLING_LINE_LENGTH) - p_share
|
| 53 |
+
return (float(p_share * HOTELLING_TRANSPORT_COST),
|
| 54 |
+
float(o_share * HOTELLING_TRANSPORT_COST))
|
| 55 |
+
|
| 56 |
+
|
| 57 |
+
_ED_MATRIX: dict[tuple[str, str], tuple[float, float]] = {
|
| 58 |
+
("enter", "accommodate"): (float(ED_DUOPOLY_PROFIT), float(ED_DUOPOLY_PROFIT)),
|
| 59 |
+
("enter", "fight"): (float(ED_ENTRANT_FIGHT_LOSS), float(ED_FIGHT_COST)),
|
| 60 |
+
("stay_out", "accommodate"): (float(ED_STAY_OUT_PAYOFF), float(ED_MONOPOLY_PROFIT)),
|
| 61 |
+
("stay_out", "fight"): (float(ED_STAY_OUT_PAYOFF), float(ED_MONOPOLY_PROFIT)),
|
| 62 |
+
}
|
| 63 |
+
|
| 64 |
+
|
| 65 |
+
def _nash_demand_payoff(pa: str, oa: str) -> tuple[float, float]:
|
| 66 |
+
d_p = int(pa.rsplit("_", _ONE)[_ONE])
|
| 67 |
+
d_o = int(oa.rsplit("_", _ONE)[_ONE])
|
| 68 |
+
if d_p + d_o <= ND_SURPLUS:
|
| 69 |
+
return (float(d_p), float(d_o))
|
| 70 |
+
return (_ZERO_F, _ZERO_F)
|
| 71 |
+
|
| 72 |
+
|
| 73 |
+
def _double_auction_payoff(pa: str, oa: str) -> tuple[float, float]:
|
| 74 |
+
bid = int(pa.rsplit("_", _ONE)[_ONE])
|
| 75 |
+
ask = int(oa.rsplit("_", _ONE)[_ONE])
|
| 76 |
+
if bid >= ask:
|
| 77 |
+
price = (bid + ask) // _TWO
|
| 78 |
+
return (float(DA_BUYER_VALUE - price), float(price - DA_SELLER_COST))
|
| 79 |
+
return (_ZERO_F, _ZERO_F)
|
| 80 |
+
|
| 81 |
+
|
| 82 |
+
_COURNOT_ACTS = [f"produce_{i}" for i in range(COURNOT_MAX_QUANTITY + _ONE)]
|
| 83 |
+
_BERTRAND_ACTS = [f"price_{i}" for i in range(BERTRAND_MAX_PRICE + _ONE)]
|
| 84 |
+
_HOTELLING_ACTS = [f"locate_{i}" for i in range(HOTELLING_LINE_LENGTH + _ONE)]
|
| 85 |
+
_ND_ACTS = [f"demand_{i}" for i in range(ND_SURPLUS + _ONE)]
|
| 86 |
+
_DA_ACTS = [f"bid_{i}" for i in range(DA_MAX_PRICE + _ONE)]
|
| 87 |
+
|
| 88 |
+
OLIGOPOLY_GAMES: dict[str, GameConfig] = {
|
| 89 |
+
"cournot": GameConfig(
|
| 90 |
+
name="Cournot Duopoly",
|
| 91 |
+
description=(
|
| 92 |
+
"Two firms simultaneously choose production quantities. "
|
| 93 |
+
"Market price decreases with total output. Tests Nash "
|
| 94 |
+
"equilibrium reasoning in quantity competition."
|
| 95 |
+
),
|
| 96 |
+
actions=_COURNOT_ACTS, game_type="cournot",
|
| 97 |
+
default_rounds=DEFAULT_NUM_ROUNDS, payoff_fn=_cournot_payoff,
|
| 98 |
+
),
|
| 99 |
+
"bertrand": GameConfig(
|
| 100 |
+
name="Bertrand Competition",
|
| 101 |
+
description=(
|
| 102 |
+
"Two firms simultaneously set prices. The lower-price firm "
|
| 103 |
+
"captures the market. The Bertrand paradox predicts pricing "
|
| 104 |
+
"at marginal cost even with only two competitors."
|
| 105 |
+
),
|
| 106 |
+
actions=_BERTRAND_ACTS, game_type="bertrand",
|
| 107 |
+
default_rounds=DEFAULT_NUM_ROUNDS, payoff_fn=_bertrand_payoff,
|
| 108 |
+
),
|
| 109 |
+
"hotelling": GameConfig(
|
| 110 |
+
name="Hotelling Location Game",
|
| 111 |
+
description=(
|
| 112 |
+
"Two firms choose locations on a line. Consumers visit the "
|
| 113 |
+
"nearest firm. Tests the principle of minimum differentiation "
|
| 114 |
+
"and spatial competition dynamics."
|
| 115 |
+
),
|
| 116 |
+
actions=_HOTELLING_ACTS, game_type="hotelling",
|
| 117 |
+
default_rounds=DEFAULT_NUM_ROUNDS, payoff_fn=_hotelling_payoff,
|
| 118 |
+
),
|
| 119 |
+
"entry_deterrence": GameConfig(
|
| 120 |
+
name="Entry Deterrence",
|
| 121 |
+
description=(
|
| 122 |
+
"A potential entrant decides whether to enter a market; "
|
| 123 |
+
"the incumbent decides whether to fight or accommodate. "
|
| 124 |
+
"Tests credible commitment and limit pricing reasoning."
|
| 125 |
+
),
|
| 126 |
+
actions=["enter", "stay_out"], game_type="matrix",
|
| 127 |
+
default_rounds=DEFAULT_NUM_ROUNDS,
|
| 128 |
+
payoff_fn=_matrix_payoff_fn(_ED_MATRIX),
|
| 129 |
+
),
|
| 130 |
+
"nash_demand": GameConfig(
|
| 131 |
+
name="Nash Demand Game",
|
| 132 |
+
description=(
|
| 133 |
+
"Two players simultaneously demand shares of a surplus. "
|
| 134 |
+
"If demands are compatible (sum within surplus), both "
|
| 135 |
+
"receive their demand; otherwise both get nothing."
|
| 136 |
+
),
|
| 137 |
+
actions=_ND_ACTS, game_type="nash_demand",
|
| 138 |
+
default_rounds=SINGLE_SHOT_ROUNDS, payoff_fn=_nash_demand_payoff,
|
| 139 |
+
),
|
| 140 |
+
"double_auction": GameConfig(
|
| 141 |
+
name="Double Auction",
|
| 142 |
+
description=(
|
| 143 |
+
"A buyer submits a bid and a seller submits an ask. Trade "
|
| 144 |
+
"occurs at the midpoint if bid is at least ask. Tests price "
|
| 145 |
+
"discovery and competitive market behavior."
|
| 146 |
+
),
|
| 147 |
+
actions=_DA_ACTS, game_type="double_auction",
|
| 148 |
+
default_rounds=SINGLE_SHOT_ROUNDS, payoff_fn=_double_auction_payoff,
|
| 149 |
+
),
|
| 150 |
+
}
|
| 151 |
+
|
| 152 |
+
GAMES.update(OLIGOPOLY_GAMES)
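To make the Cournot payoff rule above concrete, the sketch below searches the integer quantity grid for a best response. The demand intercept of 12, slope of 1, marginal cost of 0 and maximum quantity of 12 are assumed values for illustration; the actual `COURNOT_*` constants live in a module outside this excerpt. Like the source function, the sketch does not clamp the price at zero.

```python
def cournot_payoff(q_p, q_o, intercept, slope, marginal_cost):
    """Same rule as _cournot_payoff, taking integer quantities directly."""
    price = intercept - slope * (q_p + q_o)
    margin = price - marginal_cost
    return (margin * q_p, margin * q_o)


def best_response(q_o, intercept, slope, marginal_cost, max_quantity):
    """Quantity on the integer grid that maximizes the player's profit."""
    return max(
        range(max_quantity + 1),
        key=lambda q: cournot_payoff(q, q_o, intercept, slope, marginal_cost)[0],
    )


# Assumed example parameters, not taken from the repository.
INTERCEPT, SLOPE, COST, MAX_Q = 12, 1, 0, 12
fixed_point = best_response(4, INTERCEPT, SLOPE, COST, MAX_Q)
```

With these parameters a quantity of 4 is a best response to itself, in line with the continuous-case formula q* = (a - c) / (3b) for a symmetric linear duopoly.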
|
common/games_meta/__pycache__/coalition_config.cpython-311.pyc
ADDED
|
Binary file (15.5 kB).
|
|
|
common/games_meta/__pycache__/dynamic.cpython-311.pyc
ADDED
|
Binary file (7.43 kB).
|
|
|