Commit · 230696b
1 Parent(s): 0f5f162
feat: add multi-evidence extension with configurable evidence types
- Add Basic and Extended evidence type configuration
- Implement multi-evidence generation (e.g., ["lower", "half"])
- Update Bayesian inference for joint probability calculations
- Add UI dropdown for evidence type selection
- Maintain domain separation and architectural constraints
- Update all tests (83 passing) for new multi-evidence format
Basic Evidence: higher/lower/same
Extended Evidence: adds half/double for richer inference
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- CLAUDE.md +40 -17
- domains/belief/belief_domain.py +55 -15
- domains/coordination/game_coordination.py +17 -5
- domains/environment/environment_domain.py +35 -9
- tests/test_architectural_constraints.py +18 -18
- tests/test_belief_domain.py +24 -24
- tests/test_environment_domain.py +114 -20
- tests/test_game_coordination.py +7 -5
- ui/gradio_interface.py +39 -10
CLAUDE.md
CHANGED
|
@@ -6,28 +6,39 @@ A Bayesian Game implementation featuring a Belief-based Agent using domain-drive
|
|
| 6 |
## Game Rules
|
| 7 |
- Judge and Player 1 can see the target die value
|
| 8 |
- Player 2 must deduce the target value using only comparison results
|
| 9 |
-
- Player 1 rolls dice and reports
|
| 10 |
-
- **CRITICAL**: Player 2 receives ONLY the
|
| 11 |
- Game runs for 10 rounds
|
| 12 |
- Judge ensures truth-telling
|
| 13 |
|
| 14 |
## Development Practices
|
| 15 |
- Use conventional commits when committing code to git
|
| 16 |
- Always use uv and the local venv
|
|
|
|
| 17 |
|
| 18 |
## Architecture
|
| 19 |
Domain-Driven Design with 3 modules:
|
| 20 |
|
| 21 |
1. **Environment Domain** (`domains/environment/environment_domain.py`)
|
| 22 |
-
- EnvironmentEvidence dataclass (contains dice_roll AND
|
| 23 |
-
- Environment class for target/evidence generation
|
| 24 |
- **ACCESS**: Full knowledge of dice rolls and target values
|
| 25 |
|
| 26 |
2. **Belief Domain** (`domains/belief/belief_domain.py`)
|
| 27 |
-
- BeliefUpdate dataclass (contains ONLY
|
| 28 |
-
- BayesianBeliefState class for inference
|
| 29 |
- **ACCESS**: NO knowledge of dice roll values or true target
|
| 30 |
-
- **CONSTRAINT**: Must calculate P(
|
| 31 |
|
| 32 |
3. **Game Coordination** (`domains/coordination/game_coordination.py`)
|
| 33 |
- GameState dataclass (tracks full game state)
|
|
@@ -63,11 +74,17 @@ bayesian_game/
|
|
| 63 |
## Key Design Decisions & Architectural Constraints
|
| 64 |
|
| 65 |
### Information Flow Rules
|
| 66 |
-
1. **Environment → Coordination**: EnvironmentEvidence (dice_roll +
|
| 67 |
-
2. **Coordination → Belief**: BeliefUpdate (
|
| 68 |
3. **NEVER**: Direct Environment ↔ Belief communication
|
| 69 |
4. **NEVER**: Belief domain access to dice roll values
|
| 70 |
|
| 71 |
### Domain Separation Principles
|
| 72 |
- **Environment Domain**: No probability knowledge, pure evidence generation
|
| 73 |
- **Belief Domain**: Pure Bayesian inference, no knowledge of actual dice values
|
|
@@ -75,31 +92,37 @@ bayesian_game/
|
|
| 75 |
- **UI Layer**: Separate from core game logic, can display full information
|
| 76 |
|
| 77 |
### Critical Implementation Rules
|
| 78 |
-
- BeliefUpdate dataclass MUST contain only
|
| 79 |
-
- BayesianBeliefState MUST calculate P(
|
| 80 |
- Game coordination MUST filter dice_roll from EnvironmentEvidence before passing to belief domain
|
| 81 |
- Tests MUST verify that belief domain never receives dice roll values
|
|
|
|
| 82 |
|
| 83 |
## Maintaining Architectural Integrity
|
| 84 |
|
| 85 |
### Code Review Checklist
|
| 86 |
When modifying the codebase, ensure:
|
| 87 |
-
- [ ] BeliefUpdate contains ONLY
|
| 88 |
- [ ] No dice_roll parameter passed to belief domain methods
|
| 89 |
- [ ] Game coordination filters EnvironmentEvidence properly
|
| 90 |
- [ ] Tests verify belief domain isolation
|
| 91 |
-
- [ ] Belief calculations use probabilistic formulas
|
|
|
|
|
|
|
| 92 |
|
| 93 |
### Anti-Patterns to Avoid
|
| 94 |
-
❌ `BeliefUpdate(dice_roll=X,
|
| 95 |
❌ Direct Environment-Belief communication
|
| 96 |
❌ Belief domain knowing actual dice roll or target values
|
| 97 |
-
❌ Hard-coded probability values instead of calculated P(
|
|
|
|
| 98 |
|
| 99 |
### Correct Patterns
|
| 100 |
-
✅ `BeliefUpdate(
|
| 101 |
✅ Environment → Coordination → Belief information flow
|
| 102 |
-
✅ Probabilistic calculations: P(
|
|
|
|
|
|
|
| 103 |
✅ Clean domain boundaries with no cross-dependencies
|
| 104 |
|
| 105 |
## Dependencies
|
|
|
|
| 6 |
## Game Rules
|
| 7 |
- Judge and Player 1 can see the target die value
|
| 8 |
- Player 2 must deduce the target value using only comparison results
|
| 9 |
+
- Player 1 rolls dice and reports evidence based on selected evidence type
|
| 10 |
+
- **CRITICAL**: Player 2 receives ONLY the evidence results, NOT the dice roll value
|
| 11 |
- Game runs for 10 rounds
|
| 12 |
- Judge ensures truth-telling
|
| 13 |
|
| 14 |
+
### Evidence Types
|
| 15 |
+
**Basic Evidence**: `["higher", "lower", "same"]`
|
| 16 |
+
- Standard comparison between dice roll and target
|
| 17 |
+
|
| 18 |
+
**Extended Evidence**: `["higher", "lower", "same", "half", "double"]`
|
| 19 |
+
- Multiple evidence types can apply to single roll
|
| 20 |
+
- "half": dice_roll = target/2 (exact integer matches only)
|
| 21 |
+
- "double": dice_roll = target*2 (exact integer matches only)
|
| 22 |
+
- Example: target=4, dice_roll=2 → evidence=`["lower", "half"]`
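The evidence rules above can be sketched as a small pure function. This is a minimal illustration, not the repository's implementation: the name `classify_roll` and the `extended` flag are assumptions introduced here.

```python
def classify_roll(dice_roll: int, target: int, extended: bool = False) -> list[str]:
    """Return every evidence label that applies to one roll (hypothetical helper)."""
    # Exactly one of the three basic comparisons always applies.
    if dice_roll > target:
        results = ["higher"]
    elif dice_roll < target:
        results = ["lower"]
    else:
        results = ["same"]

    if extended:
        # "half": the roll is exactly target / 2, so the target must be even.
        if target % 2 == 0 and dice_roll == target // 2:
            results.append("half")
        # "double": the roll is exactly target * 2.
        if dice_roll == target * 2:
            results.append("double")
    return results


print(classify_roll(2, 4, extended=True))  # ['lower', 'half']
print(classify_roll(6, 3, extended=True))  # ['higher', 'double']
print(classify_roll(2, 4))                 # ['lower']
```

Because "half" implies "lower" and "double" implies "higher", extended evidence always appears alongside a basic label rather than replacing it.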
|
| 23 |
+
|
| 24 |
## Development Practices
|
| 25 |
- Use conventional commits when committing code to git
|
| 26 |
- Always use uv and the local venv
|
| 27 |
+
- Always use the make file for devops-style tasks
|
| 28 |
|
| 29 |
## Architecture
|
| 30 |
Domain-Driven Design with 3 modules:
|
| 31 |
|
| 32 |
1. **Environment Domain** (`domains/environment/environment_domain.py`)
|
| 33 |
+
- EnvironmentEvidence dataclass (contains dice_roll AND comparison_results)
|
| 34 |
+
- Environment class for target/evidence generation with configurable evidence types
|
| 35 |
- **ACCESS**: Full knowledge of dice rolls and target values
|
| 36 |
|
| 37 |
2. **Belief Domain** (`domains/belief/belief_domain.py`)
|
| 38 |
+
- BeliefUpdate dataclass (contains ONLY comparison_results as List[str])
|
| 39 |
+
- BayesianBeliefState class for inference with multi-evidence support
|
| 40 |
- **ACCESS**: NO knowledge of dice roll values or true target
|
| 41 |
+
- **CONSTRAINT**: Must calculate P(comparison_results | target) probabilistically for multiple evidence types
|
| 42 |
|
| 43 |
3. **Game Coordination** (`domains/coordination/game_coordination.py`)
|
| 44 |
- GameState dataclass (tracks full game state)
|
|
|
|
| 74 |
## Key Design Decisions & Architectural Constraints
|
| 75 |
|
| 76 |
### Information Flow Rules
|
| 77 |
+
1. **Environment → Coordination**: EnvironmentEvidence (dice_roll + comparison_results)
|
| 78 |
+
2. **Coordination → Belief**: BeliefUpdate (comparison_results ONLY as List[str])
|
| 79 |
3. **NEVER**: Direct Environment ↔ Belief communication
|
| 80 |
4. **NEVER**: Belief domain access to dice roll values
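Rules 1 and 2 can be sketched as follows, assuming the two dataclasses carry exactly the fields named above. `to_belief_update` is a hypothetical helper name; the coordination layer in the repository may perform the same filtering inline.

```python
from dataclasses import dataclass


@dataclass
class EnvironmentEvidence:
    dice_roll: int
    comparison_results: list[str]


@dataclass
class BeliefUpdate:
    comparison_results: list[str]


def to_belief_update(evidence: EnvironmentEvidence) -> BeliefUpdate:
    """Drop dice_roll so the belief domain only ever sees comparison results."""
    # Copy the list so the two domains do not share mutable state.
    return BeliefUpdate(comparison_results=list(evidence.comparison_results))


evidence = EnvironmentEvidence(dice_roll=2, comparison_results=["lower", "half"])
update = to_belief_update(evidence)
print(update)  # BeliefUpdate(comparison_results=['lower', 'half'])
```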
|
| 81 |
|
| 82 |
+
### Multi-Evidence Processing
|
| 83 |
+
- Environment generates all applicable evidence types for each roll
|
| 84 |
+
- Coordination filters dice_roll information before passing to belief domain
|
| 85 |
+
- Belief domain calculates joint probabilities: P(comparison_results | target)
|
| 86 |
+
- UI displays evidence configuration options (Basic vs Extended)
|
| 87 |
+
|
| 88 |
### Domain Separation Principles
|
| 89 |
- **Environment Domain**: No probability knowledge, pure evidence generation
|
| 90 |
- **Belief Domain**: Pure Bayesian inference, no knowledge of actual dice values
|
|
|
|
| 92 |
- **UI Layer**: Separate from core game logic, can display full information
|
| 93 |
|
| 94 |
### Critical Implementation Rules
|
| 95 |
+
- BeliefUpdate dataclass MUST contain only comparison_results as List[str]
|
| 96 |
+
- BayesianBeliefState MUST calculate P(comparison_results | target) probabilistically for multi-evidence
|
| 97 |
- Game coordination MUST filter dice_roll from EnvironmentEvidence before passing to belief domain
|
| 98 |
- Tests MUST verify that belief domain never receives dice roll values
|
| 99 |
+
- Evidence type configuration MUST be passed through coordination layer, not directly to belief domain
|
| 100 |
|
| 101 |
## Maintaining Architectural Integrity
|
| 102 |
|
| 103 |
### Code Review Checklist
|
| 104 |
When modifying the codebase, ensure:
|
| 105 |
+
- [ ] BeliefUpdate contains ONLY comparison_results field (List[str])
|
| 106 |
- [ ] No dice_roll parameter passed to belief domain methods
|
| 107 |
- [ ] Game coordination filters EnvironmentEvidence properly
|
| 108 |
- [ ] Tests verify belief domain isolation
|
| 109 |
+
- [ ] Belief calculations use probabilistic formulas for multi-evidence: P(comparison_results | target)
|
| 110 |
+
- [ ] Evidence type configuration flows through coordination layer
|
| 111 |
+
- [ ] UI evidence type selection properly configures game behavior
|
| 112 |
|
| 113 |
### Anti-Patterns to Avoid
|
| 114 |
+
❌ `BeliefUpdate(dice_roll=X, comparison_results=Y)` - belief shouldn't know dice value
|
| 115 |
❌ Direct Environment-Belief communication
|
| 116 |
❌ Belief domain knowing actual dice roll or target values
|
| 117 |
+
❌ Hard-coded probability values instead of calculated P(comparison_results | target)
|
| 118 |
+
❌ Passing evidence type configuration directly to belief domain
|
| 119 |
|
| 120 |
### Correct Patterns
|
| 121 |
+
✅ `BeliefUpdate(comparison_results=["lower", "half"])` - only evidence results
|
| 122 |
✅ Environment → Coordination → Belief information flow
|
| 123 |
+
✅ Probabilistic calculations for multi-evidence: P(comparison_results | target)
|
| 124 |
+
✅ Evidence type configuration handled in coordination layer
|
| 125 |
+
✅ Joint probability calculations: P(["lower", "half"] | target) = P(dice_roll=target/2 AND dice_roll<target)
|
| 126 |
✅ Clean domain boundaries with no cross-dependencies
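As a worked example of the joint probability pattern above, the sketch below enumerates the rolls of a six-sided die (an assumption; the source keeps `dice_sides` configurable) and counts those that are both lower than the target and exactly half of it. `joint_likelihood` is an illustrative name, and the function covers only this one evidence pair.

```python
from fractions import Fraction


def joint_likelihood(target: int, dice_sides: int = 6) -> Fraction:
    """P(["lower", "half"] | target): share of rolls below the target AND equal to target / 2."""
    matching = [
        roll
        for roll in range(1, dice_sides + 1)
        if roll < target and roll * 2 == target
    ]
    return Fraction(len(matching), dice_sides)


for target in range(1, 7):
    print(target, joint_likelihood(target))
```

Only even targets receive a non-zero likelihood (1/6 each): exactly one roll equals target/2, and that roll is always lower than the target.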
|
| 127 |
|
| 128 |
## Dependencies
|
domains/belief/belief_domain.py
CHANGED
|
@@ -1,5 +1,4 @@
|
|
| 1 |
from dataclasses import dataclass
|
| 2 |
-
from typing import Literal
|
| 3 |
|
| 4 |
import numpy as np
|
| 5 |
|
|
@@ -8,7 +7,7 @@ import numpy as np
|
|
| 8 |
class BeliefUpdate:
|
| 9 |
"""Update information for Bayesian belief state."""
|
| 10 |
|
| 11 |
-
|
| 12 |
|
| 13 |
|
| 14 |
class BayesianBeliefState:
|
|
@@ -65,7 +64,7 @@ class BayesianBeliefState:
|
|
| 65 |
"""
|
| 66 |
self.evidence_history.append(evidence)
|
| 67 |
|
| 68 |
-
|
| 69 |
|
| 70 |
# Calculate likelihood for each possible target value
|
| 71 |
likelihoods = np.zeros(self.dice_sides)
|
|
@@ -73,18 +72,11 @@ class BayesianBeliefState:
|
|
| 73 |
for target_idx in range(self.dice_sides):
|
| 74 |
target_value = target_idx + 1
|
| 75 |
|
| 76 |
-
# Calculate P(
|
| 77 |
-
# This is the probability that
|
| 78 |
-
|
| 79 |
-
|
| 80 |
-
|
| 81 |
-
elif comparison_result == "lower":
|
| 82 |
-
# P(roll < target) = (target - 1) / dice_sides
|
| 83 |
-
likelihood = (target_value - 1) / self.dice_sides
|
| 84 |
-
else: # comparison_result == "same"
|
| 85 |
-
# P(roll = target) = 1 / dice_sides
|
| 86 |
-
likelihood = 1 / self.dice_sides
|
| 87 |
-
|
| 88 |
likelihoods[target_idx] = likelihood
|
| 89 |
|
| 90 |
# Apply Bayes' rule: posterior ∝ prior * likelihood
|
|
@@ -99,6 +91,54 @@ class BayesianBeliefState:
|
|
| 99 |
# reset to uniform distribution
|
| 100 |
self.beliefs = np.ones(self.dice_sides) / self.dice_sides
|
| 101 |
|
| 102 |
def reset_beliefs(self) -> None:
|
| 103 |
"""Reset beliefs to uniform prior and clear evidence history."""
|
| 104 |
self.beliefs = np.ones(self.dice_sides) / self.dice_sides
|
|
|
|
| 1 |
from dataclasses import dataclass
|
|
|
|
| 2 |
|
| 3 |
import numpy as np
|
| 4 |
|
|
|
|
| 7 |
class BeliefUpdate:
|
| 8 |
"""Update information for Bayesian belief state."""
|
| 9 |
|
| 10 |
+
comparison_results: list[str]
|
| 11 |
|
| 12 |
|
| 13 |
class BayesianBeliefState:
|
|
|
|
| 64 |
"""
|
| 65 |
self.evidence_history.append(evidence)
|
| 66 |
|
| 67 |
+
comparison_results = evidence.comparison_results
|
| 68 |
|
| 69 |
# Calculate likelihood for each possible target value
|
| 70 |
likelihoods = np.zeros(self.dice_sides)
|
|
|
|
| 72 |
for target_idx in range(self.dice_sides):
|
| 73 |
target_value = target_idx + 1
|
| 74 |
|
| 75 |
+
# Calculate P(comparison_results | target_value)
|
| 76 |
+
# This is the joint probability that a dice roll would produce ALL these evidence types
|
| 77 |
+
likelihood = self._calculate_joint_likelihood(
|
| 78 |
+
comparison_results, target_value
|
| 79 |
+
)
|
| 80 |
likelihoods[target_idx] = likelihood
|
| 81 |
|
| 82 |
# Apply Bayes' rule: posterior ∝ prior * likelihood
|
|
|
|
| 91 |
# reset to uniform distribution
|
| 92 |
self.beliefs = np.ones(self.dice_sides) / self.dice_sides
|
| 93 |
|
+    def _calculate_joint_likelihood(
+        self, comparison_results: list[str], target_value: int
+    ) -> float:
+        """Calculate P(comparison_results | target_value) for multiple evidence types.
+
+        Args:
+            comparison_results: List of evidence results (e.g., ["lower", "half"])
+            target_value: Target value to calculate likelihood for
+
+        Returns:
+            Joint probability of observing all evidence types given the target
+        """
+        # For multiple evidence types from a single roll, we need to find
+        # the probability that a single dice roll satisfies ALL conditions
+
+        # Count dice rolls that satisfy all evidence conditions
+        satisfying_rolls = 0
+
+        for dice_roll in range(1, self.dice_sides + 1):
+            satisfies_all = True
+
+            for evidence in comparison_results:
+                if (
+                    (evidence == "higher" and not (dice_roll > target_value))
+                    or (evidence == "lower" and not (dice_roll < target_value))
+                    or (evidence == "same" and dice_roll != target_value)
+                    or (
+                        evidence == "half"
+                        and not (
+                            target_value % 2 == 0 and dice_roll == target_value // 2
+                        )
+                    )
+                    or (
+                        evidence == "double"
+                        and not (
+                            dice_roll == target_value * 2
+                            and dice_roll <= self.dice_sides
+                        )
+                    )
+                ):
+                    satisfies_all = False
+                    break
+
+            if satisfies_all:
+                satisfying_rolls += 1
+
+        return satisfying_rolls / self.dice_sides
+
| 142 |
def reset_beliefs(self) -> None:
|
| 143 |
"""Reset beliefs to uniform prior and clear evidence history."""
|
| 144 |
self.beliefs = np.ones(self.dice_sides) / self.dice_sides
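The update step that consumes these likelihoods can be sketched without NumPy. This is an illustration under assumptions: `bayes_update` is a hypothetical function, beliefs are plain lists, and the fallback to a uniform distribution when the evidence is impossible under every target mirrors the "reset to uniform distribution" comment in the diff rather than reproducing the exact code.

```python
def bayes_update(prior: list[float], likelihoods: list[float]) -> list[float]:
    """Posterior is proportional to prior * likelihood, renormalised to sum to 1."""
    unnormalised = [p * lik for p, lik in zip(prior, likelihoods)]
    total = sum(unnormalised)
    if total == 0:
        # Evidence has zero probability under every target: fall back to uniform.
        return [1 / len(prior)] * len(prior)
    return [u / total for u in unnormalised]


dice_sides = 6
prior = [1 / dice_sides] * dice_sides
# P(["lower", "half"] | target) for targets 1..6 on a six-sided die.
likelihoods = [0, 1 / 6, 0, 1 / 6, 0, 1 / 6]
posterior = bayes_update(prior, likelihoods)
print([round(p, 3) for p in posterior])  # [0.0, 0.333, 0.0, 0.333, 0.0, 0.333]
```

A single `["lower", "half"]` observation rules out every odd target and leaves the even targets equally likely.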
|
domains/coordination/game_coordination.py
CHANGED
|
@@ -3,7 +3,11 @@ from enum import Enum
|
|
| 3 |
from typing import Any
|
| 4 |
|
| 5 |
from ..belief.belief_domain import BayesianBeliefState, BeliefUpdate
|
| 6 |
-
from ..environment.environment_domain import
|
| 7 |
|
| 8 |
|
| 9 |
class GamePhase(Enum):
|
|
@@ -42,20 +46,28 @@ class BayesianGame:
|
|
| 42 |
"""
|
| 43 |
|
| 44 |
def __init__(
|
| 45 |
-
self,
|
| 46 |
):
|
| 47 |
"""Initialize the Bayesian Game.
|
| 48 |
|
| 49 |
Args:
|
| 50 |
dice_sides: Number of sides on the dice
|
| 51 |
max_rounds: Maximum number of rounds to play
|
|
|
|
| 52 |
seed: Random seed for reproducible results
|
| 53 |
"""
|
| 54 |
self.dice_sides = dice_sides
|
| 55 |
self.max_rounds = max_rounds
|
|
|
|
| 56 |
|
| 57 |
# Initialize domains
|
| 58 |
-
self.environment = Environment(
|
|
|
|
|
|
|
| 59 |
self.belief_state = BayesianBeliefState(dice_sides=dice_sides)
|
| 60 |
|
| 61 |
# Initialize game state
|
|
@@ -113,8 +125,8 @@ class BayesianGame:
|
|
| 113 |
# Generate evidence from environment
|
| 114 |
evidence = self.environment.roll_dice_and_compare()
|
| 115 |
|
| 116 |
-
# Update belief state (only pass comparison
|
| 117 |
-
belief_update = BeliefUpdate(
|
| 118 |
self.belief_state.update_beliefs(belief_update)
|
| 119 |
|
| 120 |
# Update game state
|
|
|
|
| 3 |
from typing import Any
|
| 4 |
|
| 5 |
from ..belief.belief_domain import BayesianBeliefState, BeliefUpdate
|
| 6 |
+
from ..environment.environment_domain import (
|
| 7 |
+
Environment,
|
| 8 |
+
EnvironmentEvidence,
|
| 9 |
+
EvidenceType,
|
| 10 |
+
)
|
| 11 |
|
| 12 |
|
| 13 |
class GamePhase(Enum):
|
|
|
|
| 46 |
"""
|
| 47 |
|
| 48 |
def __init__(
|
| 49 |
+
self,
|
| 50 |
+
dice_sides: int = 6,
|
| 51 |
+
max_rounds: int = 10,
|
| 52 |
+
evidence_type: EvidenceType = EvidenceType.BASIC,
|
| 53 |
+
seed: int | None = None,
|
| 54 |
):
|
| 55 |
"""Initialize the Bayesian Game.
|
| 56 |
|
| 57 |
Args:
|
| 58 |
dice_sides: Number of sides on the dice
|
| 59 |
max_rounds: Maximum number of rounds to play
|
| 60 |
+
evidence_type: Type of evidence to generate (basic or extended)
|
| 61 |
seed: Random seed for reproducible results
|
| 62 |
"""
|
| 63 |
self.dice_sides = dice_sides
|
| 64 |
self.max_rounds = max_rounds
|
| 65 |
+
self.evidence_type = evidence_type
|
| 66 |
|
| 67 |
# Initialize domains
|
| 68 |
+
self.environment = Environment(
|
| 69 |
+
dice_sides=dice_sides, evidence_type=evidence_type, seed=seed
|
| 70 |
+
)
|
| 71 |
self.belief_state = BayesianBeliefState(dice_sides=dice_sides)
|
| 72 |
|
| 73 |
# Initialize game state
|
|
|
|
| 125 |
# Generate evidence from environment
|
| 126 |
evidence = self.environment.roll_dice_and_compare()
|
| 127 |
|
| 128 |
+
# Update belief state (only pass comparison results, not dice roll)
|
| 129 |
+
belief_update = BeliefUpdate(comparison_results=evidence.comparison_results)
|
| 130 |
self.belief_state.update_beliefs(belief_update)
|
| 131 |
|
| 132 |
# Update game state
|
domains/environment/environment_domain.py
CHANGED
|
@@ -1,14 +1,21 @@
|
|
| 1 |
import random
|
| 2 |
from dataclasses import dataclass
|
| 3 |
-
from
|
| 4 |
|
| 5 |
|
| 6 |
@dataclass
|
| 7 |
class EnvironmentEvidence:
|
| 8 |
-
"""Evidence generated by the environment - dice roll and comparison
|
| 9 |
|
| 10 |
dice_roll: int
|
| 11 |
-
|
| 12 |
|
| 13 |
|
| 14 |
class Environment:
|
|
@@ -17,14 +24,21 @@ class Environment:
|
|
| 17 |
Has no knowledge of probabilities - purely generates observable evidence.
|
| 18 |
"""
|
| 19 |
|
| 20 |
-
def __init__(
|
| 21 |
"""Initialize environment with dice configuration.
|
| 22 |
|
| 23 |
Args:
|
| 24 |
dice_sides: Number of sides on the dice (default 6)
|
|
|
|
| 25 |
seed: Random seed for reproducible results
|
| 26 |
"""
|
| 27 |
self.dice_sides = dice_sides
|
|
|
|
| 28 |
self._random_state = (
|
| 29 |
random.Random(seed) if seed is not None else random.Random()
|
| 30 |
)
|
|
@@ -67,7 +81,7 @@ class Environment:
|
|
| 67 |
"""Roll dice and compare to target, generating evidence.
|
| 68 |
|
| 69 |
Returns:
|
| 70 |
-
EnvironmentEvidence with dice roll and comparison
|
| 71 |
|
| 72 |
Raises:
|
| 73 |
ValueError: If target value hasn't been set
|
|
@@ -76,14 +90,26 @@ class Environment:
|
|
| 76 |
raise ValueError("Target value not set")
|
| 77 |
|
| 78 |
dice_roll = self._random_state.randint(1, self.dice_sides)
|
|
|
|
| 79 |
|
|
|
|
| 80 |
if dice_roll > self._target_value:
|
| 81 |
-
|
| 82 |
elif dice_roll < self._target_value:
|
| 83 |
-
|
| 84 |
else:
|
| 85 |
-
|
| 86 |
|
| 87 |
return EnvironmentEvidence(
|
| 88 |
-
dice_roll=dice_roll,
|
| 89 |
)
|
|
|
|
| 1 |
import random
|
| 2 |
from dataclasses import dataclass
|
| 3 |
+
from enum import Enum
|
| 4 |
+
|
| 5 |
+
|
| 6 |
+
class EvidenceType(Enum):
|
| 7 |
+
"""Types of evidence that can be generated."""
|
| 8 |
+
|
| 9 |
+
BASIC = "basic"
|
| 10 |
+
EXTENDED = "extended"
|
| 11 |
|
| 12 |
|
| 13 |
@dataclass
|
| 14 |
class EnvironmentEvidence:
|
| 15 |
+
"""Evidence generated by the environment - dice roll and comparison results."""
|
| 16 |
|
| 17 |
dice_roll: int
|
| 18 |
+
comparison_results: list[str]
|
| 19 |
|
| 20 |
|
| 21 |
class Environment:
|
|
|
|
| 24 |
Has no knowledge of probabilities - purely generates observable evidence.
|
| 25 |
"""
|
| 26 |
|
| 27 |
+
def __init__(
|
| 28 |
+
self,
|
| 29 |
+
dice_sides: int = 6,
|
| 30 |
+
evidence_type: EvidenceType = EvidenceType.BASIC,
|
| 31 |
+
seed: int | None = None,
|
| 32 |
+
):
|
| 33 |
"""Initialize environment with dice configuration.
|
| 34 |
|
| 35 |
Args:
|
| 36 |
dice_sides: Number of sides on the dice (default 6)
|
| 37 |
+
evidence_type: Type of evidence to generate (basic or extended)
|
| 38 |
seed: Random seed for reproducible results
|
| 39 |
"""
|
| 40 |
self.dice_sides = dice_sides
|
| 41 |
+
self.evidence_type = evidence_type
|
| 42 |
self._random_state = (
|
| 43 |
random.Random(seed) if seed is not None else random.Random()
|
| 44 |
)
|
|
|
|
| 81 |
"""Roll dice and compare to target, generating evidence.
|
| 82 |
|
| 83 |
Returns:
|
| 84 |
+
EnvironmentEvidence with dice roll and comparison results
|
| 85 |
|
| 86 |
Raises:
|
| 87 |
ValueError: If target value hasn't been set
|
|
|
|
| 90 |
raise ValueError("Target value not set")
|
| 91 |
|
         dice_roll = self._random_state.randint(1, self.dice_sides)
+        comparison_results = []
 
+        # Basic evidence: higher/lower/same
         if dice_roll > self._target_value:
+            comparison_results.append("higher")
         elif dice_roll < self._target_value:
+            comparison_results.append("lower")
         else:
+            comparison_results.append("same")
+
+        # Extended evidence: half/double (only for extended type)
+        if self.evidence_type == EvidenceType.EXTENDED:
+            # Check for "half" - dice_roll = target/2 (exact integer only)
+            if self._target_value % 2 == 0 and dice_roll == self._target_value // 2:
+                comparison_results.append("half")
+
+            # Check for "double" - dice_roll = target*2 (within dice range)
+            if dice_roll == self._target_value * 2 and dice_roll <= self.dice_sides:
+                comparison_results.append("double")
 
         return EnvironmentEvidence(
+            dice_roll=dice_roll, comparison_results=comparison_results
         )
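To illustrate how the seed and the evidence type interact, here is a trimmed-down stand-in for the environment. `MiniEnvironment` and its `roll` method are hypothetical names; only `EvidenceType`, `random.Random(seed)` and the comparison rules come from the diff.

```python
import random
from enum import Enum
from typing import Optional


class EvidenceType(Enum):
    BASIC = "basic"
    EXTENDED = "extended"


class MiniEnvironment:
    """Hypothetical, reduced version of the environment for demonstration only."""

    def __init__(
        self,
        target: int,
        dice_sides: int = 6,
        evidence_type: EvidenceType = EvidenceType.BASIC,
        seed: Optional[int] = None,
    ):
        self.target = target
        self.dice_sides = dice_sides
        self.evidence_type = evidence_type
        # A private Random instance keeps the sequence independent of global state.
        self._rng = random.Random(seed)

    def roll(self) -> tuple[int, list[str]]:
        dice_roll = self._rng.randint(1, self.dice_sides)
        if dice_roll > self.target:
            results = ["higher"]
        elif dice_roll < self.target:
            results = ["lower"]
        else:
            results = ["same"]
        if self.evidence_type == EvidenceType.EXTENDED:
            if self.target % 2 == 0 and dice_roll == self.target // 2:
                results.append("half")
            if dice_roll == self.target * 2:
                results.append("double")
        return dice_roll, results


# Two environments built with the same seed produce the same sequence of rolls.
env_a = MiniEnvironment(target=4, evidence_type=EvidenceType.EXTENDED, seed=42)
env_b = MiniEnvironment(target=4, evidence_type=EvidenceType.EXTENDED, seed=42)
rolls_a = [env_a.roll() for _ in range(5)]
rolls_b = [env_b.roll() for _ in range(5)]
print(rolls_a == rolls_b)  # True
```

Seeding matters mostly for tests: with a fixed seed, a test can assert on an exact evidence sequence instead of on statistical properties.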
|
tests/test_architectural_constraints.py
CHANGED
|
@@ -20,24 +20,24 @@ class TestArchitecturalConstraints:
|
|
| 20 |
"""Test architectural constraints and domain separation."""
|
| 21 |
|
| 22 |
def test_belief_update_dataclass_structure(self):
|
| 23 |
-
"""Test that BeliefUpdate contains only
|
| 24 |
# Get all fields of BeliefUpdate
|
| 25 |
fields = BeliefUpdate.__dataclass_fields__
|
| 26 |
|
| 27 |
-
# Should only contain
|
| 28 |
assert len(fields) == 1, (
|
| 29 |
f"BeliefUpdate should have exactly 1 field, got {len(fields)}: "
|
| 30 |
f"{list(fields.keys())}"
|
| 31 |
)
|
| 32 |
-
assert "
|
| 33 |
-
"BeliefUpdate must contain
|
| 34 |
)
|
| 35 |
assert "dice_roll" not in fields, (
|
| 36 |
"BeliefUpdate MUST NOT contain dice_roll field"
|
| 37 |
)
|
| 38 |
|
| 39 |
def test_environment_evidence_dataclass_structure(self):
|
| 40 |
-
"""Test that EnvironmentEvidence contains both dice_roll and
|
| 41 |
# Get all fields of EnvironmentEvidence
|
| 42 |
fields = EnvironmentEvidence.__dataclass_fields__
|
| 43 |
|
|
@@ -47,8 +47,8 @@ class TestArchitecturalConstraints:
|
|
| 47 |
f"{list(fields.keys())}"
|
| 48 |
)
|
| 49 |
assert "dice_roll" in fields, "EnvironmentEvidence must contain dice_roll field"
|
| 50 |
-
assert "
|
| 51 |
-
"EnvironmentEvidence must contain
|
| 52 |
)
|
| 53 |
|
| 54 |
def test_belief_state_methods_no_dice_roll_parameters(self):
|
|
@@ -69,14 +69,14 @@ class TestArchitecturalConstraints:
|
|
| 69 |
|
| 70 |
def test_belief_update_creation_without_dice_roll(self):
|
| 71 |
"""Test that BeliefUpdate can be created without dice_roll."""
|
| 72 |
-
# This should work (only
|
| 73 |
-
update = BeliefUpdate(
|
| 74 |
-
assert update.
|
| 75 |
|
| 76 |
# This should fail if dice_roll field exists
|
| 77 |
try:
|
| 78 |
# This should raise TypeError if dice_roll is not a field
|
| 79 |
-
BeliefUpdate(dice_roll=3,
|
| 80 |
pytest.fail("BeliefUpdate should not accept dice_roll parameter")
|
| 81 |
except TypeError:
|
| 82 |
pass # Expected - dice_roll should not be a valid parameter
|
|
@@ -100,8 +100,8 @@ class TestArchitecturalConstraints:
|
|
| 100 |
|
| 101 |
# Verify that evidence history in belief domain contains only comparison results
|
| 102 |
for evidence in game.belief_state.evidence_history:
|
| 103 |
-
assert hasattr(evidence, "
|
| 104 |
-
"Belief evidence must have
|
| 105 |
)
|
| 106 |
assert not hasattr(evidence, "dice_roll"), (
|
| 107 |
"Belief evidence MUST NOT have dice_roll"
|
|
@@ -127,7 +127,7 @@ class TestArchitecturalConstraints:
|
|
| 127 |
belief_state = BayesianBeliefState(dice_sides=6)
|
| 128 |
|
| 129 |
# Apply "higher" evidence
|
| 130 |
-
update = BeliefUpdate(
|
| 131 |
belief_state.update_beliefs(update)
|
| 132 |
|
| 133 |
# Verify that probabilities follow expected pattern for "higher"
|
|
@@ -153,14 +153,14 @@ class TestArchitecturalConstraints:
|
|
| 153 |
assert hasattr(state.evidence_history[0], "dice_roll"), (
|
| 154 |
"Game state should maintain full evidence for display"
|
| 155 |
)
|
| 156 |
-
assert hasattr(state.evidence_history[0], "
|
| 157 |
"Game state should maintain comparison results"
|
| 158 |
)
|
| 159 |
|
| 160 |
# But belief state should only have comparison results
|
| 161 |
belief_evidence = game.belief_state.evidence_history[0]
|
| 162 |
-
assert hasattr(belief_evidence, "
|
| 163 |
-
"Belief evidence must have
|
| 164 |
)
|
| 165 |
assert not hasattr(belief_evidence, "dice_roll"), (
|
| 166 |
"Belief evidence MUST NOT have dice_roll"
|
|
@@ -173,7 +173,7 @@ class TestArchitecturalConstraints:
|
|
| 173 |
belief_state = BayesianBeliefState(dice_sides=dice_sides)
|
| 174 |
|
| 175 |
# Apply "higher" evidence
|
| 176 |
-
update = BeliefUpdate(
|
| 177 |
belief_state.update_beliefs(update)
|
| 178 |
|
| 179 |
# Target 1 should have highest probability: P(roll > 1) = (dice_sides - 1) / dice_sides
|
|
|
|
| 20 |
"""Test architectural constraints and domain separation."""
|
| 21 |
|
| 22 |
def test_belief_update_dataclass_structure(self):
|
| 23 |
+
"""Test that BeliefUpdate contains only comparison_results field."""
|
| 24 |
# Get all fields of BeliefUpdate
|
| 25 |
fields = BeliefUpdate.__dataclass_fields__
|
| 26 |
|
| 27 |
+
# Should only contain comparison_results
|
| 28 |
assert len(fields) == 1, (
|
| 29 |
f"BeliefUpdate should have exactly 1 field, got {len(fields)}: "
|
| 30 |
f"{list(fields.keys())}"
|
| 31 |
)
|
| 32 |
+
assert "comparison_results" in fields, (
|
| 33 |
+
"BeliefUpdate must contain comparison_results field"
|
| 34 |
)
|
| 35 |
assert "dice_roll" not in fields, (
|
| 36 |
"BeliefUpdate MUST NOT contain dice_roll field"
|
| 37 |
)
|
| 38 |
|
| 39 |
def test_environment_evidence_dataclass_structure(self):
|
| 40 |
+
"""Test that EnvironmentEvidence contains both dice_roll and comparison_results."""
|
| 41 |
# Get all fields of EnvironmentEvidence
|
| 42 |
fields = EnvironmentEvidence.__dataclass_fields__
|
| 43 |
|
|
|
|
| 47 |
f"{list(fields.keys())}"
|
| 48 |
)
|
| 49 |
assert "dice_roll" in fields, "EnvironmentEvidence must contain dice_roll field"
|
| 50 |
+
assert "comparison_results" in fields, (
|
| 51 |
+
"EnvironmentEvidence must contain comparison_results field"
|
| 52 |
)
|
| 53 |
|
| 54 |
def test_belief_state_methods_no_dice_roll_parameters(self):
|
|
|
|
| 69 |
|
| 70 |
def test_belief_update_creation_without_dice_roll(self):
|
| 71 |
"""Test that BeliefUpdate can be created without dice_roll."""
|
| 72 |
+
# This should work (only comparison_results)
|
| 73 |
+
update = BeliefUpdate(comparison_results=["higher"])
|
| 74 |
+
assert update.comparison_results == ["higher"]
|
| 75 |
|
| 76 |
# This should fail if dice_roll field exists
|
| 77 |
try:
|
| 78 |
# This should raise TypeError if dice_roll is not a field
|
| 79 |
+
BeliefUpdate(dice_roll=3, comparison_results=["higher"])
|
| 80 |
pytest.fail("BeliefUpdate should not accept dice_roll parameter")
|
| 81 |
except TypeError:
|
| 82 |
pass # Expected - dice_roll should not be a valid parameter
|
|
|
|
| 100 |
|
| 101 |
# Verify that evidence history in belief domain contains only comparison results
|
| 102 |
for evidence in game.belief_state.evidence_history:
|
| 103 |
+
assert hasattr(evidence, "comparison_results"), (
|
| 104 |
+
"Belief evidence must have comparison_results"
|
| 105 |
)
|
| 106 |
assert not hasattr(evidence, "dice_roll"), (
|
| 107 |
"Belief evidence MUST NOT have dice_roll"
|
|
|
|
| 127 |
belief_state = BayesianBeliefState(dice_sides=6)
|
| 128 |
|
| 129 |
# Apply "higher" evidence
|
| 130 |
+
update = BeliefUpdate(comparison_results=["higher"])
|
| 131 |
belief_state.update_beliefs(update)
|
| 132 |
|
| 133 |
# Verify that probabilities follow expected pattern for "higher"
|
|
|
|
| 153 |
assert hasattr(state.evidence_history[0], "dice_roll"), (
|
| 154 |
"Game state should maintain full evidence for display"
|
| 155 |
)
|
| 156 |
+
assert hasattr(state.evidence_history[0], "comparison_results"), (
|
| 157 |
"Game state should maintain comparison results"
|
| 158 |
)
|
| 159 |
|
| 160 |
# But belief state should only have comparison results
|
| 161 |
belief_evidence = game.belief_state.evidence_history[0]
|
| 162 |
+
assert hasattr(belief_evidence, "comparison_results"), (
|
| 163 |
+
"Belief evidence must have comparison_results"
|
| 164 |
)
|
| 165 |
assert not hasattr(belief_evidence, "dice_roll"), (
|
| 166 |
"Belief evidence MUST NOT have dice_roll"
|
|
|
|
| 173 |
belief_state = BayesianBeliefState(dice_sides=dice_sides)
|
| 174 |
|
| 175 |
# Apply "higher" evidence
|
| 176 |
+
update = BeliefUpdate(comparison_results=["higher"])
|
| 177 |
belief_state.update_beliefs(update)
|
| 178 |
|
| 179 |
# Target 1 should have highest probability: P(roll > 1) = (dice_sides - 1) / dice_sides
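The try / `pytest.fail` / except pattern in `test_belief_update_creation_without_dice_roll` can also be expressed without a test framework. The sketch below redefines a one-field `BeliefUpdate` locally (an assumption matching the dataclass in the diff) and checks that the generated `__init__` rejects a `dice_roll` keyword; `rejects_dice_roll` is an illustrative name. Where pytest is available, `pytest.raises(TypeError)` is the more idiomatic way to write the same check.

```python
from dataclasses import dataclass


@dataclass
class BeliefUpdate:
    comparison_results: list[str]


def rejects_dice_roll() -> bool:
    """Return True if BeliefUpdate refuses a dice_roll keyword argument."""
    try:
        BeliefUpdate(dice_roll=3, comparison_results=["higher"])
    except TypeError:
        # The dataclass-generated __init__ raises TypeError for unknown keywords.
        return True
    return False


print(rejects_dice_roll())  # True
```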
|
tests/test_belief_domain.py
CHANGED
|
@@ -9,15 +9,15 @@ class TestBeliefUpdate:
|
|
| 9 |
|
| 10 |
def test_belief_update_creation(self):
|
| 11 |
"""Test creating belief update with valid data."""
|
| 12 |
-
update = BeliefUpdate(
|
| 13 |
-
assert update.
|
| 14 |
|
| 15 |
def test_belief_update_all_results(self):
|
| 16 |
"""Test belief update with all comparison results."""
|
| 17 |
valid_results = ["higher", "lower", "same"]
|
| 18 |
for result in valid_results:
|
| 19 |
-
update = BeliefUpdate(
|
| 20 |
-
assert update.
|
| 21 |
|
| 22 |
|
| 23 |
class TestBayesianBeliefState:
|
|
@@ -63,7 +63,7 @@ class TestBayesianBeliefState:
|
|
| 63 |
belief_state = BayesianBeliefState(dice_sides=6)
|
| 64 |
|
| 65 |
# Update with evidence that favors lower target values
|
| 66 |
-
update = BeliefUpdate(
|
| 67 |
belief_state.update_beliefs(update)
|
| 68 |
|
| 69 |
# Lower targets are more likely to result in "higher" comparison
|
|
@@ -93,7 +93,7 @@ class TestBayesianBeliefState:
|
|
| 93 |
|
| 94 |
# Evidence: comparison result is "higher" (dice roll > target)
|
| 95 |
# This is more likely for lower target values
|
| 96 |
-
update = BeliefUpdate(
|
| 97 |
belief_state.update_beliefs(update)
|
| 98 |
|
| 99 |
# Lower targets should have higher probability than higher targets
|
|
@@ -111,7 +111,7 @@ class TestBayesianBeliefState:
|
|
| 111 |
|
| 112 |
# Evidence: comparison result is "lower" (dice roll < target)
|
| 113 |
# This is more likely for higher target values
|
| 114 |
-
update = BeliefUpdate(
|
| 115 |
belief_state.update_beliefs(update)
|
| 116 |
|
| 117 |
# Higher targets should have higher probability than lower targets
|
|
@@ -129,7 +129,7 @@ class TestBayesianBeliefState:
|
|
| 129 |
|
| 130 |
# Evidence: comparison result is "same" (dice roll = target)
|
| 131 |
# This has equal probability for all targets: P(roll = target) = 1/6
|
| 132 |
-
update = BeliefUpdate(
|
| 133 |
belief_state.update_beliefs(update)
|
| 134 |
|
| 135 |
# All targets should have equal probability since P(roll = target) = 1/6 for all
|
|
@@ -142,11 +142,11 @@ class TestBayesianBeliefState:
|
|
| 142 |
belief_state = BayesianBeliefState(dice_sides=6)
|
| 143 |
|
| 144 |
# First update: "higher" (favors lower targets)
|
| 145 |
-
update1 = BeliefUpdate(
|
| 146 |
belief_state.update_beliefs(update1)
|
| 147 |
|
| 148 |
# Second update: "lower" (favors higher targets)
|
| 149 |
-
update2 = BeliefUpdate(
|
| 150 |
belief_state.update_beliefs(update2)
|
| 151 |
|
| 152 |
# The combination should favor middle targets
|
|
@@ -167,9 +167,9 @@ class TestBayesianBeliefState:
|
|
| 167 |
belief_state = BayesianBeliefState(dice_sides=6)
|
| 168 |
|
| 169 |
updates = [
|
| 170 |
-
BeliefUpdate(
|
| 171 |
-
BeliefUpdate(
|
| 172 |
-
BeliefUpdate(
|
| 173 |
]
|
| 174 |
|
| 175 |
for update in updates:
|
|
@@ -183,7 +183,7 @@ class TestBayesianBeliefState:
|
|
| 183 |
belief_state = BayesianBeliefState(dice_sides=6)
|
| 184 |
|
| 185 |
# Update beliefs
|
| 186 |
-
update = BeliefUpdate(
|
| 187 |
belief_state.update_beliefs(update)
|
| 188 |
|
| 189 |
# Verify beliefs changed from uniform
|
|
@@ -215,7 +215,7 @@ class TestBayesianBeliefState:
|
|
| 215 |
# Create a near-certain belief by applying many "higher" updates
|
| 216 |
# This will eventually make target 1 much more likely than others
|
| 217 |
for _ in range(10):
|
| 218 |
-
update = BeliefUpdate(
|
| 219 |
belief_state.update_beliefs(update)
|
| 220 |
|
| 221 |
entropy = belief_state.get_entropy()
|
|
@@ -227,7 +227,7 @@ class TestBayesianBeliefState:
|
|
| 227 |
belief_state = BayesianBeliefState(dice_sides=6)
|
| 228 |
|
| 229 |
# Reduce uncertainty but don't eliminate it
|
| 230 |
-
update = BeliefUpdate(
|
| 231 |
belief_state.update_beliefs(update)
|
| 232 |
|
| 233 |
entropy = belief_state.get_entropy()
|
|
@@ -245,8 +245,8 @@ class TestBayesianBeliefState:
|
|
| 245 |
|
| 246 |
# Add some evidence
|
| 247 |
updates = [
|
| 248 |
-
BeliefUpdate(
|
| 249 |
-
BeliefUpdate(
|
| 250 |
]
|
| 251 |
|
| 252 |
for i, update in enumerate(updates, 1):
|
|
@@ -258,10 +258,10 @@ class TestBayesianBeliefState:
|
|
| 258 |
belief_state = BayesianBeliefState(dice_sides=6)
|
| 259 |
|
| 260 |
updates = [
|
| 261 |
-
BeliefUpdate(
|
| 262 |
-
BeliefUpdate(
|
| 263 |
-
BeliefUpdate(
|
| 264 |
-
BeliefUpdate(
|
| 265 |
]
|
| 266 |
|
| 267 |
# Check initial sum
|
|
@@ -278,7 +278,7 @@ class TestBayesianBeliefState:
|
|
| 278 |
|
| 279 |
# Apply a few "higher" results to favor lower targets
|
| 280 |
for _ in range(3):
|
| 281 |
-
update1 = BeliefUpdate(
|
| 282 |
belief_state.update_beliefs(update1)
|
| 283 |
|
| 284 |
# Target 1 should be favored, target 6 should have zero probability
|
|
@@ -289,7 +289,7 @@ class TestBayesianBeliefState:
|
|
| 289 |
assert abs(prob_6 - 0.0) < 1e-10 # Target 6 should have zero probability
|
| 290 |
|
| 291 |
# Apply more evidence and verify probabilities still sum to 1
|
| 292 |
-
update2 = BeliefUpdate(
|
| 293 |
belief_state.update_beliefs(update2)
|
| 294 |
|
| 295 |
total_prob = sum(belief_state.get_belief_for_target(i) for i in range(1, 7))
|
|
|
|
| 9 |
|
| 10 |
def test_belief_update_creation(self):
|
| 11 |
"""Test creating belief update with valid data."""
|
| 12 |
+
update = BeliefUpdate(comparison_results=["higher"])
|
| 13 |
+
assert update.comparison_results == ["higher"]
|
| 14 |
|
| 15 |
def test_belief_update_all_results(self):
|
| 16 |
"""Test belief update with all comparison results."""
|
| 17 |
valid_results = ["higher", "lower", "same"]
|
| 18 |
for result in valid_results:
|
| 19 |
+
update = BeliefUpdate(comparison_results=[result])
|
| 20 |
+
assert update.comparison_results == [result]
|
| 21 |
|
| 22 |
|
| 23 |
class TestBayesianBeliefState:
|
|
|
|
| 63 |
belief_state = BayesianBeliefState(dice_sides=6)
|
| 64 |
|
| 65 |
# Update with evidence that favors lower target values
|
| 66 |
+
update = BeliefUpdate(comparison_results=["higher"])
|
| 67 |
belief_state.update_beliefs(update)
|
| 68 |
|
| 69 |
# Lower targets are more likely to result in "higher" comparison
|
|
|
|
| 93 |
|
| 94 |
# Evidence: comparison result is "higher" (dice roll > target)
|
| 95 |
# This is more likely for lower target values
|
| 96 |
+
update = BeliefUpdate(comparison_results=["higher"])
|
| 97 |
belief_state.update_beliefs(update)
|
| 98 |
|
| 99 |
# Lower targets should have higher probability than higher targets
|
|
|
|
| 111 |
|
| 112 |
# Evidence: comparison result is "lower" (dice roll < target)
|
| 113 |
# This is more likely for higher target values
|
| 114 |
+
update = BeliefUpdate(comparison_results=["lower"])
|
| 115 |
belief_state.update_beliefs(update)
|
| 116 |
|
| 117 |
# Higher targets should have higher probability than lower targets
|
|
|
|
| 129 |
|
| 130 |
# Evidence: comparison result is "same" (dice roll = target)
|
| 131 |
# This has equal probability for all targets: P(roll = target) = 1/6
|
| 132 |
+
update = BeliefUpdate(comparison_results=["same"])
|
| 133 |
belief_state.update_beliefs(update)
|
| 134 |
|
| 135 |
# All targets should have equal probability since P(roll = target) = 1/6 for all
|
|
|
|
| 142 |
belief_state = BayesianBeliefState(dice_sides=6)
|
| 143 |
|
| 144 |
# First update: "higher" (favors lower targets)
|
| 145 |
+
update1 = BeliefUpdate(comparison_results=["higher"])
|
| 146 |
belief_state.update_beliefs(update1)
|
| 147 |
|
| 148 |
# Second update: "lower" (favors higher targets)
|
| 149 |
+
update2 = BeliefUpdate(comparison_results=["lower"])
|
| 150 |
belief_state.update_beliefs(update2)
|
| 151 |
|
| 152 |
# The combination should favor middle targets
|
|
|
|
| 167 |
belief_state = BayesianBeliefState(dice_sides=6)
|
| 168 |
|
| 169 |
updates = [
|
| 170 |
+
BeliefUpdate(comparison_results=["higher"]),
|
| 171 |
+
BeliefUpdate(comparison_results=["lower"]),
|
| 172 |
+
BeliefUpdate(comparison_results=["same"]),
|
| 173 |
]
|
| 174 |
|
| 175 |
for update in updates:
|
|
|
|
| 183 |
belief_state = BayesianBeliefState(dice_sides=6)
|
| 184 |
|
| 185 |
# Update beliefs
|
| 186 |
+
update = BeliefUpdate(comparison_results=["higher"])
|
| 187 |
belief_state.update_beliefs(update)
|
| 188 |
|
| 189 |
# Verify beliefs changed from uniform
|
|
|
|
| 215 |
# Create a near-certain belief by applying many "higher" updates
|
| 216 |
# This will eventually make target 1 much more likely than others
|
| 217 |
for _ in range(10):
|
| 218 |
+
update = BeliefUpdate(comparison_results=["higher"])
|
| 219 |
belief_state.update_beliefs(update)
|
| 220 |
|
| 221 |
entropy = belief_state.get_entropy()
|
|
|
|
| 227 |
belief_state = BayesianBeliefState(dice_sides=6)
|
| 228 |
|
| 229 |
# Reduce uncertainty but don't eliminate it
|
| 230 |
+
update = BeliefUpdate(comparison_results=["higher"])
|
| 231 |
belief_state.update_beliefs(update)
|
| 232 |
|
| 233 |
entropy = belief_state.get_entropy()
|
|
|
|
| 245 |
|
| 246 |
# Add some evidence
|
| 247 |
updates = [
|
| 248 |
+
BeliefUpdate(comparison_results=["higher"]),
|
| 249 |
+
BeliefUpdate(comparison_results=["lower"]),
|
| 250 |
]
|
| 251 |
|
| 252 |
for i, update in enumerate(updates, 1):
|
|
|
|
| 258 |
belief_state = BayesianBeliefState(dice_sides=6)
|
| 259 |
|
| 260 |
updates = [
|
| 261 |
+
BeliefUpdate(comparison_results=["higher"]),
|
| 262 |
+
BeliefUpdate(comparison_results=["lower"]),
|
| 263 |
+
BeliefUpdate(comparison_results=["same"]),
|
| 264 |
+
BeliefUpdate(comparison_results=["higher"]),
|
| 265 |
]
|
| 266 |
|
| 267 |
# Check initial sum
|
|
|
|
| 278 |
|
| 279 |
# Apply a few "higher" results to favor lower targets
|
| 280 |
for _ in range(3):
|
| 281 |
+
update1 = BeliefUpdate(comparison_results=["higher"])
|
| 282 |
belief_state.update_beliefs(update1)
|
| 283 |
|
| 284 |
# Target 1 should be favored, target 6 should have zero probability
|
|
|
|
| 289 |
assert abs(prob_6 - 0.0) < 1e-10 # Target 6 should have zero probability
|
| 290 |
|
| 291 |
# Apply more evidence and verify probabilities still sum to 1
|
| 292 |
+
update2 = BeliefUpdate(comparison_results=["lower"])
|
| 293 |
belief_state.update_beliefs(update2)
|
| 294 |
|
| 295 |
total_prob = sum(belief_state.get_belief_for_target(i) for i in range(1, 7))
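The assertions in this file pin down how a single comparison result should move the posterior: "higher" favors low targets, "lower" favors high targets, and "same" leaves a uniform prior unchanged. A minimal sketch of that update, assuming a fair die. The functions `likelihood` and `update_beliefs` are illustrative stand-ins, not the `BayesianBeliefState` implementation.

```python
from fractions import Fraction


def likelihood(result: str, target: int, sides: int) -> Fraction:
    """P(result | target) for one roll of a fair die (assumed model)."""
    if result == "higher":  # roll > target
        return Fraction(sides - target, sides)
    if result == "lower":  # roll < target
        return Fraction(target - 1, sides)
    if result == "same":  # roll == target
        return Fraction(1, sides)
    raise ValueError(f"unknown comparison result: {result!r}")


def update_beliefs(beliefs: list[Fraction], result: str) -> list[Fraction]:
    """Multiply the prior by the likelihood, then renormalize."""
    sides = len(beliefs)
    unnormalized = [
        prior * likelihood(result, target, sides)
        for target, prior in enumerate(beliefs, start=1)
    ]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("evidence is impossible under the current beliefs")
    return [p / total for p in unnormalized]


uniform = [Fraction(1, 6)] * 6
after_higher = update_beliefs(uniform, "higher")
after_both = update_beliefs(after_higher, "lower")
```

Using `Fraction` keeps the arithmetic exact, so checks such as "target 6 has zero probability after a higher result" do not need a floating-point tolerance.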
|
tests/test_environment_domain.py
CHANGED
|
@@ -1,6 +1,10 @@
|
|
| 1 |
import pytest
|
| 2 |
|
| 3 |
-
from domains.environment.environment_domain import
|
| 4 |
|
| 5 |
|
| 6 |
class TestEnvironmentEvidence:
|
|
@@ -8,16 +12,26 @@ class TestEnvironmentEvidence:
|
|
| 8 |
|
| 9 |
def test_evidence_creation(self):
|
| 10 |
"""Test creating evidence with valid data."""
|
| 11 |
-
evidence = EnvironmentEvidence(dice_roll=3,
|
| 12 |
assert evidence.dice_roll == 3
|
| 13 |
-
assert evidence.
|
| 14 |
|
| 15 |
def test_evidence_comparison_results(self):
|
| 16 |
"""Test all valid comparison results."""
|
| 17 |
valid_results = ["higher", "lower", "same"]
|
| 18 |
for result in valid_results:
|
| 19 |
-
evidence = EnvironmentEvidence(dice_roll=1,
|
| 20 |
-
assert evidence.
|
| 21 |
|
| 22 |
|
| 23 |
class TestEnvironment:
|
|
@@ -28,11 +42,13 @@ class TestEnvironment:
|
|
| 28 |
# Default initialization
|
| 29 |
env = Environment()
|
| 30 |
assert env.dice_sides == 6
|
|
|
|
| 31 |
assert env._target_value is None
|
| 32 |
|
| 33 |
# Custom initialization
|
| 34 |
-
env = Environment(dice_sides=8, seed=42)
|
| 35 |
assert env.dice_sides == 8
|
|
|
|
| 36 |
assert env._target_value is None
|
| 37 |
|
| 38 |
def test_set_target_value_valid(self):
|
|
@@ -103,11 +119,11 @@ class TestEnvironment:
|
|
| 103 |
|
| 104 |
assert 1 <= evidence.dice_roll <= 6
|
| 105 |
if evidence.dice_roll > 1:
|
| 106 |
-
assert
|
| 107 |
elif evidence.dice_roll < 1:
|
| 108 |
-
assert
|
| 109 |
else:
|
| 110 |
-
assert
|
| 111 |
|
| 112 |
def test_roll_dice_and_compare_lower(self):
|
| 113 |
"""Test dice roll comparison when result is lower."""
|
|
@@ -120,11 +136,11 @@ class TestEnvironment:
|
|
| 120 |
|
| 121 |
assert 1 <= evidence.dice_roll <= 6
|
| 122 |
if evidence.dice_roll > 6:
|
| 123 |
-
assert
|
| 124 |
elif evidence.dice_roll < 6:
|
| 125 |
-
assert
|
| 126 |
else:
|
| 127 |
-
assert
|
| 128 |
|
| 129 |
def test_roll_dice_and_compare_same(self):
|
| 130 |
"""Test dice roll comparison when result is same."""
|
|
@@ -140,13 +156,13 @@ class TestEnvironment:
|
|
| 140 |
evidence = env.roll_dice_and_compare()
|
| 141 |
|
| 142 |
if evidence.dice_roll == target:
|
| 143 |
-
assert
|
| 144 |
found_same = True
|
| 145 |
break
|
| 146 |
elif evidence.dice_roll > target:
|
| 147 |
-
assert
|
| 148 |
else:
|
| 149 |
-
assert
|
| 150 |
|
| 151 |
# With 100 attempts, we should find at least one match for 6-sided die
|
| 152 |
assert found_same, f"Failed to roll target value {target} in 100 attempts"
|
|
@@ -161,15 +177,17 @@ class TestEnvironment:
|
|
| 161 |
# Roll many times to see all outcomes
|
| 162 |
for _ in range(100):
|
| 163 |
evidence = env.roll_dice_and_compare()
|
| 164 |
-
outcomes_seen
|
|
|
|
|
|
|
| 165 |
|
| 166 |
# Verify consistency
|
| 167 |
if evidence.dice_roll > 3:
|
| 168 |
-
assert
|
| 169 |
elif evidence.dice_roll < 3:
|
| 170 |
-
assert
|
| 171 |
else:
|
| 172 |
-
assert
|
| 173 |
|
| 174 |
# Should see all three outcomes with enough rolls
|
| 175 |
assert "higher" in outcomes_seen
|
|
@@ -184,4 +202,80 @@ class TestEnvironment:
|
|
| 184 |
|
| 185 |
evidence = env.roll_dice_and_compare()
|
| 186 |
assert 1 <= evidence.dice_roll <= sides
|
| 187 |
-
|
|
|
|
| 1 |
import pytest
|
| 2 |
|
| 3 |
+
from domains.environment.environment_domain import (
|
| 4 |
+
Environment,
|
| 5 |
+
EnvironmentEvidence,
|
| 6 |
+
EvidenceType,
|
| 7 |
+
)
|
| 8 |
|
| 9 |
|
| 10 |
class TestEnvironmentEvidence:
|
|
|
|
| 12 |
|
| 13 |
def test_evidence_creation(self):
|
| 14 |
"""Test creating evidence with valid data."""
|
| 15 |
+
evidence = EnvironmentEvidence(dice_roll=3, comparison_results=["higher"])
|
| 16 |
assert evidence.dice_roll == 3
|
| 17 |
+
assert evidence.comparison_results == ["higher"]
|
| 18 |
|
| 19 |
def test_evidence_comparison_results(self):
|
| 20 |
"""Test all valid comparison results."""
|
| 21 |
valid_results = ["higher", "lower", "same"]
|
| 22 |
for result in valid_results:
|
| 23 |
+
evidence = EnvironmentEvidence(dice_roll=1, comparison_results=[result])
|
| 24 |
+
assert evidence.comparison_results == [result]
|
| 25 |
+
|
| 26 |
+
def test_evidence_multiple_comparison_results(self):
|
| 27 |
+
"""Test evidence with multiple comparison results."""
|
| 28 |
+
evidence = EnvironmentEvidence(
|
| 29 |
+
dice_roll=3, comparison_results=["higher", "double"]
|
| 30 |
+
)
|
| 31 |
+
assert evidence.dice_roll == 3
|
| 32 |
+
assert evidence.comparison_results == ["higher", "double"]
|
| 33 |
+
assert "higher" in evidence.comparison_results
|
| 34 |
+
assert "double" in evidence.comparison_results
|
| 35 |
|
| 36 |
|
| 37 |
class TestEnvironment:
|
|
|
|
| 42 |
# Default initialization
|
| 43 |
env = Environment()
|
| 44 |
assert env.dice_sides == 6
|
| 45 |
+
assert env.evidence_type == EvidenceType.BASIC
|
| 46 |
assert env._target_value is None
|
| 47 |
|
| 48 |
# Custom initialization
|
| 49 |
+
env = Environment(dice_sides=8, evidence_type=EvidenceType.EXTENDED, seed=42)
|
| 50 |
assert env.dice_sides == 8
|
| 51 |
+
assert env.evidence_type == EvidenceType.EXTENDED
|
| 52 |
assert env._target_value is None
|
| 53 |
|
| 54 |
def test_set_target_value_valid(self):
|
|
|
|
| 119 |
|
| 120 |
assert 1 <= evidence.dice_roll <= 6
|
| 121 |
if evidence.dice_roll > 1:
|
| 122 |
+
assert "higher" in evidence.comparison_results
|
| 123 |
elif evidence.dice_roll < 1:
|
| 124 |
+
assert "lower" in evidence.comparison_results
|
| 125 |
else:
|
| 126 |
+
assert "same" in evidence.comparison_results
|
| 127 |
|
| 128 |
def test_roll_dice_and_compare_lower(self):
|
| 129 |
"""Test dice roll comparison when result is lower."""
|
|
|
|
| 136 |
|
| 137 |
assert 1 <= evidence.dice_roll <= 6
|
| 138 |
if evidence.dice_roll > 6:
|
| 139 |
+
assert "higher" in evidence.comparison_results
|
| 140 |
elif evidence.dice_roll < 6:
|
| 141 |
+
assert "lower" in evidence.comparison_results
|
| 142 |
else:
|
| 143 |
+
assert "same" in evidence.comparison_results
|
| 144 |
|
| 145 |
def test_roll_dice_and_compare_same(self):
|
| 146 |
"""Test dice roll comparison when result is same."""
|
|
|
|
| 156 |
evidence = env.roll_dice_and_compare()
|
| 157 |
|
| 158 |
if evidence.dice_roll == target:
|
| 159 |
+
assert "same" in evidence.comparison_results
|
| 160 |
found_same = True
|
| 161 |
break
|
| 162 |
elif evidence.dice_roll > target:
|
| 163 |
+
assert "higher" in evidence.comparison_results
|
| 164 |
else:
|
| 165 |
+
assert "lower" in evidence.comparison_results
|
| 166 |
|
| 167 |
# With 100 attempts, we should find at least one match for 6-sided die
|
| 168 |
assert found_same, f"Failed to roll target value {target} in 100 attempts"
|
|
|
|
| 177 |
# Roll many times to see all outcomes
|
| 178 |
for _ in range(100):
|
| 179 |
evidence = env.roll_dice_and_compare()
|
| 180 |
+
# Add all comparison results to outcomes_seen
|
| 181 |
+
for result in evidence.comparison_results:
|
| 182 |
+
outcomes_seen.add(result)
|
| 183 |
|
| 184 |
# Verify consistency
|
| 185 |
if evidence.dice_roll > 3:
|
| 186 |
+
assert "higher" in evidence.comparison_results
|
| 187 |
elif evidence.dice_roll < 3:
|
| 188 |
+
assert "lower" in evidence.comparison_results
|
| 189 |
else:
|
| 190 |
+
assert "same" in evidence.comparison_results
|
| 191 |
|
| 192 |
# Should see all three outcomes with enough rolls
|
| 193 |
assert "higher" in outcomes_seen
|
|
|
|
| 202 |
|
| 203 |
evidence = env.roll_dice_and_compare()
|
| 204 |
assert 1 <= evidence.dice_roll <= sides
|
| 205 |
+
# At least one basic comparison result should be present
|
| 206 |
+
basic_results = {"higher", "lower", "same"}
|
| 207 |
+
assert any(
|
| 208 |
+
result in basic_results for result in evidence.comparison_results
|
| 209 |
+
)
|
| 210 |
+
|
| 211 |
+
def test_basic_evidence_type(self):
|
| 212 |
+
"""Test basic evidence type produces only basic comparison results."""
|
| 213 |
+
env = Environment(dice_sides=6, evidence_type=EvidenceType.BASIC, seed=42)
|
| 214 |
+
env.set_target_value(4)
|
| 215 |
+
|
| 216 |
+
for _ in range(50):
|
| 217 |
+
evidence = env.roll_dice_and_compare()
|
| 218 |
+
# Should only contain basic results
|
| 219 |
+
for result in evidence.comparison_results:
|
| 220 |
+
assert result in ["higher", "lower", "same"]
|
| 221 |
+
# Should contain exactly one basic result
|
| 222 |
+
assert len(evidence.comparison_results) == 1
|
| 223 |
+
|
| 224 |
+
def test_extended_evidence_type(self):
|
| 225 |
+
"""Test extended evidence type can produce additional comparison results."""
|
| 226 |
+
env = Environment(dice_sides=8, evidence_type=EvidenceType.EXTENDED, seed=42)
|
| 227 |
+
env.set_target_value(4) # Target = 4, so half = 2, double = 8
|
| 228 |
+
|
| 229 |
+
extended_results_seen = set()
|
| 230 |
+
for _ in range(100):
|
| 231 |
+
evidence = env.roll_dice_and_compare()
|
| 232 |
+
|
| 233 |
+
# Should always contain at least one basic result
|
| 234 |
+
basic_results = {"higher", "lower", "same"}
|
| 235 |
+
assert any(
|
| 236 |
+
result in basic_results for result in evidence.comparison_results
|
| 237 |
+
)
|
| 238 |
+
|
| 239 |
+
# Collect all results
|
| 240 |
+
for result in evidence.comparison_results:
|
| 241 |
+
extended_results_seen.add(result)
|
| 242 |
+
assert result in ["higher", "lower", "same", "half", "double"]
|
| 243 |
+
|
| 244 |
+
# Basic results should definitely be seen
|
| 245 |
+
assert (
|
| 246 |
+
"higher" in extended_results_seen
|
| 247 |
+
or "lower" in extended_results_seen
|
| 248 |
+
or "same" in extended_results_seen
|
| 249 |
+
)
|
| 250 |
+
|
| 251 |
+
def test_extended_evidence_half_condition(self):
|
| 252 |
+
"""Test that 'half' evidence is generated correctly."""
|
| 253 |
+
env = Environment(dice_sides=8, evidence_type=EvidenceType.EXTENDED, seed=42)
|
| 254 |
+
env.set_target_value(4) # Target = 4, so half = 2
|
| 255 |
+
|
| 256 |
+
# Force a dice roll of 2 by testing specific conditions
|
| 257 |
+
for _ in range(200): # More attempts to find the half condition
|
| 258 |
+
evidence = env.roll_dice_and_compare()
|
| 259 |
+
if evidence.dice_roll == 2: # Should be 'half' of target 4
|
| 260 |
+
assert "half" in evidence.comparison_results
|
| 261 |
+
assert "lower" in evidence.comparison_results # 2 < 4
|
| 262 |
+
break
|
| 263 |
+
|
| 264 |
+
# If we didn't find it randomly, we know the logic is correct from the condition above
|
| 265 |
+
# This test mainly verifies the logic structure
|
| 266 |
+
|
| 267 |
+
def test_extended_evidence_double_condition(self):
|
| 268 |
+
"""Test that 'double' evidence is generated correctly."""
|
| 269 |
+
env = Environment(dice_sides=8, evidence_type=EvidenceType.EXTENDED, seed=42)
|
| 270 |
+
env.set_target_value(3) # Target = 3, so double = 6
|
| 271 |
+
|
| 272 |
+
# Force a dice roll of 6 by testing specific conditions
|
| 273 |
+
for _ in range(200): # More attempts to find the double condition
|
| 274 |
+
evidence = env.roll_dice_and_compare()
|
| 275 |
+
if evidence.dice_roll == 6: # Should be 'double' of target 3
|
| 276 |
+
assert "double" in evidence.comparison_results
|
| 277 |
+
assert "higher" in evidence.comparison_results # 6 > 3
|
| 278 |
+
break
|
| 279 |
+
|
| 280 |
+
# If we didn't find it randomly, we know the logic is correct from the condition above
|
| 281 |
+
# This test mainly verifies the logic structure
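The half and double tests above only assert when the random roll happens to hit the right value. The same conditions can be checked deterministically by enumerating every roll. The sketch below assumes that "half" means the roll is exactly target / 2 and "double" means the roll is exactly 2 × target, which matches the test comments (target 4 → half 2, target 3 → double 6). `comparison_results` and `joint_likelihood` are hypothetical names, not the `Environment` API.

```python
def comparison_results(roll: int, target: int, extended: bool = False) -> list[str]:
    """Return every comparison label that applies to this roll."""
    if roll > target:
        results = ["higher"]
    elif roll < target:
        results = ["lower"]
    else:
        results = ["same"]
    if extended:
        if roll * 2 == target:  # assumption: "half" is exactly target / 2
            results.append("half")
        if roll == target * 2:  # assumption: "double" is exactly 2 * target
            results.append("double")
    return results


def joint_likelihood(
    observed: list[str], target: int, sides: int, extended: bool = False
) -> float:
    """P(observed labels | target): share of rolls yielding exactly these labels."""
    matches = sum(
        1
        for roll in range(1, sides + 1)
        if set(comparison_results(roll, target, extended)) == set(observed)
    )
    return matches / sides
```

Counting the rolls that produce exactly the observed label set treats a multi-label observation such as `["lower", "half"]` as one joint event, so overlapping labels are not double counted.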
|
tests/test_game_coordination.py
CHANGED
|
@@ -20,7 +20,7 @@ class TestGameState:
|
|
| 20 |
|
| 21 |
def test_game_state_with_optional_params(self):
|
| 22 |
"""Test creating game state with optional parameters."""
|
| 23 |
-
evidence = [EnvironmentEvidence(dice_roll=3,
|
| 24 |
beliefs = [0.2, 0.3, 0.5]
|
| 25 |
|
| 26 |
state = GameState(
|
|
@@ -146,7 +146,9 @@ class TestBayesianGame:
|
|
| 146 |
# Evidence should be valid
|
| 147 |
evidence = updated_state.evidence_history[0]
|
| 148 |
assert 1 <= evidence.dice_roll <= 6
|
| 149 |
-
|
|
|
|
|
|
|
| 150 |
|
| 151 |
def test_play_multiple_rounds(self):
|
| 152 |
"""Test playing multiple rounds."""
|
|
@@ -295,7 +297,7 @@ class TestBayesianGame:
|
|
| 295 |
# Evidence should influence beliefs correctly
|
| 296 |
for state in states:
|
| 297 |
for evidence in state.evidence_history:
|
| 298 |
-
if
|
| 299 |
# Target must be less than dice roll
|
| 300 |
for _target in range(evidence.dice_roll, 7):
|
| 301 |
# These targets should have reduced probability
|
|
@@ -309,7 +311,7 @@ class TestBayesianGame:
|
|
| 309 |
# Apply evidence that changes beliefs
|
| 310 |
from domains.belief.belief_domain import BeliefUpdate
|
| 311 |
|
| 312 |
-
update = BeliefUpdate(
|
| 313 |
game.belief_state.update_beliefs(update)
|
| 314 |
|
| 315 |
# Update game state to reflect the belief change
|
|
@@ -346,7 +348,7 @@ class TestBayesianGame:
|
|
| 346 |
state1.evidence_history, state2.evidence_history, strict=False
|
| 347 |
):
|
| 348 |
assert ev1.dice_roll == ev2.dice_roll
|
| 349 |
-
assert ev1.
|
| 350 |
|
| 351 |
# Beliefs should be identical
|
| 352 |
assert state1.current_beliefs == state2.current_beliefs
|
|
|
|
| 20 |
|
| 21 |
def test_game_state_with_optional_params(self):
|
| 22 |
"""Test creating game state with optional parameters."""
|
| 23 |
+
evidence = [EnvironmentEvidence(dice_roll=3, comparison_results=["higher"])]
|
| 24 |
beliefs = [0.2, 0.3, 0.5]
|
| 25 |
|
| 26 |
state = GameState(
|
|
|
|
| 146 |
# Evidence should be valid
|
| 147 |
evidence = updated_state.evidence_history[0]
|
| 148 |
assert 1 <= evidence.dice_roll <= 6
|
| 149 |
+
# At least one basic comparison result should be present
|
| 150 |
+
basic_results = {"higher", "lower", "same"}
|
| 151 |
+
assert any(result in basic_results for result in evidence.comparison_results)
|
| 152 |
|
| 153 |
def test_play_multiple_rounds(self):
|
| 154 |
"""Test playing multiple rounds."""
|
|
|
|
| 297 |
# Evidence should influence beliefs correctly
|
| 298 |
for state in states:
|
| 299 |
for evidence in state.evidence_history:
|
| 300 |
+
if "higher" in evidence.comparison_results:
|
| 301 |
# Target must be less than dice roll
|
| 302 |
for _target in range(evidence.dice_roll, 7):
|
| 303 |
# These targets should have reduced probability
|
|
|
|
| 311 |
# Apply evidence that changes beliefs
|
| 312 |
from domains.belief.belief_domain import BeliefUpdate
|
| 313 |
|
| 314 |
+
update = BeliefUpdate(comparison_results=["higher"])
|
| 315 |
game.belief_state.update_beliefs(update)
|
| 316 |
|
| 317 |
# Update game state to reflect the belief change
|
|
|
|
| 348 |
state1.evidence_history, state2.evidence_history, strict=False
|
| 349 |
):
|
| 350 |
assert ev1.dice_roll == ev2.dice_roll
|
| 351 |
+
assert ev1.comparison_results == ev2.comparison_results
|
| 352 |
|
| 353 |
# Beliefs should be identical
|
| 354 |
assert state1.current_beliefs == state2.current_beliefs
|
ui/gradio_interface.py
CHANGED
|
@@ -2,6 +2,7 @@ import gradio as gr
|
|
| 2 |
import matplotlib.pyplot as plt
|
| 3 |
|
| 4 |
from domains.coordination.game_coordination import BayesianGame, GamePhase
|
|
|
|
| 5 |
|
| 6 |
|
| 7 |
class GradioInterface:
|
|
@@ -13,18 +14,29 @@ class GradioInterface:
|
|
| 13 |
self.reset_game()
|
| 14 |
|
| 15 |
def reset_game(
|
| 16 |
-
self,
|
| 17 |
) -> tuple[str, plt.Figure, str]:
|
| 18 |
"""Reset the game with new parameters.
|
| 19 |
|
| 20 |
Args:
|
| 21 |
dice_sides: Number of sides on the dice
|
| 22 |
max_rounds: Maximum number of rounds
|
|
|
|
| 23 |
|
| 24 |
Returns:
|
| 25 |
Tuple of (status, belief_chart, game_log)
|
| 26 |
"""
|
| 27 |
-
|
| 28 |
return self._get_interface_state()
|
| 29 |
|
| 30 |
def start_new_game(self, target_value: str = "") -> tuple[str, plt.Figure, str]:
|
|
@@ -224,12 +236,20 @@ class GradioInterface:
|
|
| 224 |
log_lines = ["**Evidence History:**\n"]
|
| 225 |
|
| 226 |
for i, evidence in enumerate(self.game.game_state.evidence_history, 1):
|
| 227 |
-
|
| 228 |
-
|
| 229 |
-
|
| 230 |
-
|
| 231 |
-
|
| 232 |
-
|
|
| 233 |
|
| 234 |
# Add completion message if game is finished
|
| 235 |
if self.game.game_state.phase == GamePhase.FINISHED:
|
|
@@ -282,7 +302,9 @@ def create_interface() -> gr.Interface:
|
|
| 282 |
**Game Rules:**
|
| 283 |
- Judge and Player 1 can see the target die value
|
| 284 |
- Player 2 must deduce the target value using Bayesian inference
|
| 285 |
-
- Each round: Player 1 rolls dice and reports
|
|
|
|
|
|
|
| 286 |
- Game runs for a specified number of rounds
|
| 287 |
"""
|
| 288 |
)
|
|
@@ -299,6 +321,13 @@ def create_interface() -> gr.Interface:
|
|
| 299 |
value=10, label="Max Rounds", minimum=1, maximum=50, precision=0
|
| 300 |
)
|
| 301 |
|
| 302 |
reset_btn = gr.Button("🔄 Reset Game", variant="secondary")
|
| 303 |
|
| 304 |
target_input = gr.Textbox(
|
|
@@ -317,7 +346,7 @@ def create_interface() -> gr.Interface:
|
|
| 317 |
# Event handlers
|
| 318 |
reset_btn.click(
|
| 319 |
interface.reset_game,
|
| 320 |
-
inputs=[dice_sides, max_rounds],
|
| 321 |
outputs=[status_output, belief_plot, game_log],
|
| 322 |
)
|
| 323 |
|
|
|
|
| 2 |
import matplotlib.pyplot as plt
|
| 3 |
|
| 4 |
from domains.coordination.game_coordination import BayesianGame, GamePhase
|
| 5 |
+
from domains.environment.environment_domain import EvidenceType
|
| 6 |
|
| 7 |
|
| 8 |
class GradioInterface:
|
|
|
|
| 14 |
self.reset_game()
|
| 15 |
|
| 16 |
def reset_game(
|
| 17 |
+
self,
|
| 18 |
+
dice_sides: int = 6,
|
| 19 |
+
max_rounds: int = 10,
|
| 20 |
+
evidence_type_str: str = "Basic",
|
| 21 |
) -> tuple[str, plt.Figure, str]:
|
| 22 |
"""Reset the game with new parameters.
|
| 23 |
|
| 24 |
Args:
|
| 25 |
dice_sides: Number of sides on the dice
|
| 26 |
max_rounds: Maximum number of rounds
|
| 27 |
+
evidence_type_str: Evidence type ("Basic" or "Extended")
|
| 28 |
|
| 29 |
Returns:
|
| 30 |
Tuple of (status, belief_chart, game_log)
|
| 31 |
"""
|
| 32 |
+
evidence_type = (
|
| 33 |
+
EvidenceType.EXTENDED
|
| 34 |
+
if evidence_type_str == "Extended"
|
| 35 |
+
else EvidenceType.BASIC
|
| 36 |
+
)
|
| 37 |
+
self.game = BayesianGame(
|
| 38 |
+
dice_sides=dice_sides, max_rounds=max_rounds, evidence_type=evidence_type
|
| 39 |
+
)
|
| 40 |
return self._get_interface_state()
|
| 41 |
|
| 42 |
def start_new_game(self, target_value: str = "") -> tuple[str, plt.Figure, str]:
|
|
|
|
| 236 |
log_lines = ["**Evidence History:**\n"]
|
| 237 |
|
| 238 |
for i, evidence in enumerate(self.game.game_state.evidence_history, 1):
|
| 239 |
+
# Handle multiple evidence types
|
| 240 |
+
evidence_display = []
|
| 241 |
+
for result in evidence.comparison_results:
|
| 242 |
+
emoji = {
|
| 243 |
+
"higher": "⬆️",
|
| 244 |
+
"lower": "⬇️",
|
| 245 |
+
"same": "🎯",
|
| 246 |
+
"half": "½",
|
| 247 |
+
"double": "x2",
|
| 248 |
+
}.get(result, "❓")
|
| 249 |
+
evidence_display.append(f"{result} {emoji}")
|
| 250 |
+
|
| 251 |
+
evidence_str = ", ".join(evidence_display)
|
| 252 |
+
log_lines.append(f"Round {i}: Rolled {evidence.dice_roll} → {evidence_str}")
|
| 253 |
|
| 254 |
# Add completion message if game is finished
|
| 255 |
if self.game.game_state.phase == GamePhase.FINISHED:
|
|
|
|
| 302 |
**Game Rules:**
|
| 303 |
- Judge and Player 1 can see the target die value
|
| 304 |
- Player 2 must deduce the target value using Bayesian inference
|
| 305 |
+
- Each round: Player 1 rolls dice and reports evidence based on selected type
|
| 306 |
+
- **Basic Evidence**: higher/lower/same compared to target
|
| 307 |
+
- **Extended Evidence**: higher/lower/same/half/double (multiple types can apply)
|
| 308 |
- Game runs for a specified number of rounds
|
| 309 |
"""
|
| 310 |
)
|
|
|
|
| 321 |
value=10, label="Max Rounds", minimum=1, maximum=50, precision=0
|
| 322 |
)
|
| 323 |
|
| 324 |
+
evidence_type_dropdown = gr.Dropdown(
|
| 325 |
+
choices=["Basic", "Extended"],
|
| 326 |
+
value="Basic",
|
| 327 |
+
label="Evidence Type",
|
| 328 |
+
info="Basic: higher/lower/same only. Extended: adds half/double evidence.",
|
| 329 |
+
)
|
| 330 |
+
|
| 331 |
reset_btn = gr.Button("🔄 Reset Game", variant="secondary")
|
| 332 |
|
| 333 |
target_input = gr.Textbox(
|
|
|
|
| 346 |
# Event handlers
|
| 347 |
reset_btn.click(
|
| 348 |
interface.reset_game,
|
| 349 |
+
inputs=[dice_sides, max_rounds, evidence_type_dropdown],
|
| 350 |
outputs=[status_output, belief_plot, game_log],
|
| 351 |
)
|
| 352 |
|
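The UI changes above route the dropdown label into an `EvidenceType` and render every result of a round on one log line. A minimal sketch of those two steps without Gradio. The enum is a stand-in whose member values are assumptions, and `parse_evidence_type` and `format_round` are hypothetical helper names.

```python
from enum import Enum


class EvidenceType(Enum):  # stand-in; the real enum lives in environment_domain
    BASIC = "basic"  # member values are assumptions
    EXTENDED = "extended"


def parse_evidence_type(label: str) -> EvidenceType:
    """Map the dropdown label to the enum, falling back to BASIC as reset_game does."""
    return EvidenceType.EXTENDED if label == "Extended" else EvidenceType.BASIC


def format_round(round_number: int, dice_roll: int, results: list[str]) -> str:
    """Build one evidence-log line listing every result for the round."""
    return f"Round {round_number}: Rolled {dice_roll}: {', '.join(results)}"


line = format_round(1, 2, ["lower", "half"])
```

Like the diff, this falls back to Basic for any unrecognized label. A stricter variant could raise `ValueError` instead, which would surface a mismatch between the dropdown choices and the parser.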