Spaces: thompsonson/bayesian_game

thompsonson Claude committed on Jun 16, 2025

Commit

571f3d0

0 Parent(s):

feat: implement complete Bayesian Game with domain-driven architecture

- Add Environment Domain with EnvironmentEvidence dataclass and Environment class for pure evidence generation
- Add Belief Domain with BeliefUpdate dataclass and BayesianBeliefState class for Bayesian inference
- Add Game Coordination with GameState dataclass and BayesianGame orchestration class
- Add Gradio web interface with real-time belief visualization and game controls
- Implement proper information filtering: belief agent receives only comparison results, not dice values
- Add comprehensive test suite (78 tests) including architectural constraint verification
- Add memory leak prevention and graceful game completion in UI
- Support configurable dice sides and round counts with reproducible seeded experiments

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

Files changed (20)

.gitignore +191 -0
CLAUDE.md +107 -0
README.md +242 -0
app.py +24 -0
domains/__init__.py +1 -0
domains/belief/__init__.py +1 -0
domains/belief/belief_domain.py +123 -0
domains/coordination/__init__.py +1 -0
domains/coordination/game_coordination.py +193 -0
domains/environment/__init__.py +1 -0
domains/environment/environment_domain.py +87 -0
requirements.txt +4 -0
tests/__init__.py +1 -0
tests/test_architectural_constraints.py +159 -0
tests/test_belief_domain.py +295 -0
tests/test_environment_domain.py +187 -0
tests/test_game_coordination.py +351 -0
tests/test_ui_interface.py +243 -0
ui/__init__.py +1 -0
ui/gradio_interface.py +370 -0

.gitignore ADDED

	@@ -0,0 +1,191 @@

+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+# C extensions
+*.so
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+# PyInstaller
+#  Usually these files are written by a python script from a template
+#  before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py,cover
+.hypothesis/
+.pytest_cache/
+cover/
+# Translations
+*.mo
+*.pot
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+db.sqlite3-journal
+# Flask stuff:
+instance/
+.webassets-cache
+# Scrapy stuff:
+.scrapy
+# Sphinx documentation
+docs/_build/
+# PyBuilder
+.pybuilder/
+target/
+# Jupyter Notebook
+.ipynb_checkpoints
+# IPython
+profile_default/
+ipython_config.py
+# pyenv
+#   For a library or package, you might want to ignore these files since the code is
+#   intended to run in multiple environments; otherwise, check them in:
+# .python-version
+# pipenv
+#   According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+#   However, in case of collaboration, if having platform-specific dependencies or dependencies
+#   having no cross-platform support, pipenv may install dependencies that don't work, or not
+#   install all needed dependencies.
+#Pipfile.lock
+# poetry
+#   Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
+#   This is especially recommended for binary packages to ensure reproducibility, and is more
+#   commonly ignored for libraries.
+#   https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
+#poetry.lock
+# pdm
+#   Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
+#pdm.lock
+#   pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
+#   in version control.
+#   https://pdm.fming.dev/#use-with-ide
+.pdm.toml
+# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
+__pypackages__/
+# Celery stuff
+celerybeat-schedule
+celerybeat.pid
+# SageMath parsed files
+*.sage.py
+# Environments
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+# Spyder project settings
+.spyderproject
+.spyproject
+# Rope project settings
+.ropeproject
+# mkdocs documentation
+/site
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+# Pyre type checker
+.pyre/
+# pytype static type analyzer
+.pytype/
+# Cython debug symbols
+cython_debug/
+# PyCharm
+#  JetBrains specific template is maintained in a separate JetBrains.gitignore that can
+#  be added to the global gitignore or merged into this project gitignore.  For a PyCharm
+#  project, it is not recommended to check the .gitignore file into the git repo
+#  but consider adding it to the global gitignore or setting up git config properly
+#  for your development environment.
+.idea/
+# VS Code
+.vscode/
+# macOS
+.DS_Store
+.DS_Store?
+._*
+.Spotlight-V100
+.Trashes
+ehthumbs.db
+Thumbs.db
+# Windows
+Thumbs.db
+ehthumbs.db
+Desktop.ini
+# Gradio temporary files
+gradio_cached_examples/
+flagged/
+# Matplotlib cache
+.matplotlib/
+# Temporary files
+*.tmp
+*.temp
+temp/
+tmp/

CLAUDE.md ADDED

	@@ -0,0 +1,107 @@

+# Bayesian Game Project
+## Project Overview
+A Bayesian Game implementation featuring a Belief-based Agent using domain-driven design.
+## Game Rules
+- Judge and Player 1 can see the target die value
+- Player 2 must deduce the target value using only comparison results
+- Player 1 rolls dice and reports "higher"/"lower"/"same" compared to target
+- **CRITICAL**: Player 2 receives ONLY the comparison result, NOT the dice roll value
+- Game runs for 10 rounds
+- Judge ensures truth-telling
+## Development Practices
+- Use conventional commits when committing code to git
+## Architecture
+Domain-Driven Design with 3 modules:
+1. **Environment Domain** (`domains/environment/environment_domain.py`)
+   - EnvironmentEvidence dataclass (contains dice_roll AND comparison_result)
+   - Environment class for target/evidence generation
+   - **ACCESS**: Full knowledge of dice rolls and target values
+2. **Belief Domain** (`domains/belief/belief_domain.py`)
+   - BeliefUpdate dataclass (contains ONLY comparison_result)
+   - BayesianBeliefState class for inference
+   - **ACCESS**: NO knowledge of dice roll values or true target
+   - **CONSTRAINT**: Must calculate P(comparison_result | target) probabilistically
+3. **Game Coordination** (`domains/coordination/game_coordination.py`)
+   - GameState dataclass (tracks full game state)
+   - BayesianGame orchestration class
+   - **RESPONSIBILITY**: Filters EnvironmentEvidence to create BeliefUpdate
+## Development Commands
+- Test: `python -m pytest tests/`
+- Run: `python app.py`
+## Folder Structure
+```
+bayesian_game/
+├── domains/
+│   ├── environment/environment_domain.py
+│   ├── belief/belief_domain.py
+│   └── coordination/game_coordination.py
+├── ui/gradio_interface.py
+├── tests/
+├── app.py              # Hugging Face entry point
+├── requirements.txt
+└── CLAUDE.md
+```
+## Implementation Status
+- ✅ Architecture implemented with proper domain separation
+- ✅ Domain-driven design with information filtering enforced
+- ✅ Gradio UI with graceful completion and comprehensive final results
+- ✅ Comprehensive test suite (78 tests) ensuring architectural constraints
+- ✅ Proper Bayesian inference without dice roll knowledge
+- ✅ Memory leak prevention in matplotlib figure generation
+## Key Design Decisions & Architectural Constraints
+### Information Flow Rules
+1. **Environment → Coordination**: EnvironmentEvidence (dice_roll + comparison_result)
+2. **Coordination → Belief**: BeliefUpdate (comparison_result ONLY)
+3. **NEVER**: Direct Environment → Belief communication
+4. **NEVER**: Belief domain access to dice roll values
+### Domain Separation Principles
+- **Environment Domain**: No probability knowledge, pure evidence generation
+- **Belief Domain**: Pure Bayesian inference, no knowledge of actual dice values
+- **Coordination Layer**: Thin orchestration, responsible for information filtering
+- **UI Layer**: Separate from core game logic, can display full information
+### Critical Implementation Rules
+- BeliefUpdate dataclass MUST contain only comparison_result
+- BayesianBeliefState MUST calculate P(comparison_result | target) probabilistically
+- Game coordination MUST filter dice_roll from EnvironmentEvidence before passing to belief domain
+- Tests MUST verify that belief domain never receives dice roll values
+## Maintaining Architectural Integrity
+### Code Review Checklist
+When modifying the codebase, ensure:
+- [ ] BeliefUpdate contains ONLY comparison_result field
+- [ ] No dice_roll parameter passed to belief domain methods
+- [ ] Game coordination filters EnvironmentEvidence properly
+- [ ] Tests verify belief domain isolation
+- [ ] Belief calculations use probabilistic formulas, not direct dice values
+### Anti-Patterns to Avoid
+❌ `BeliefUpdate(dice_roll=X, comparison_result=Y)` - belief shouldn't know dice value
+❌ Direct Environment-Belief communication
+❌ Belief domain knowing actual dice roll or target values
+❌ Hard-coded probability values instead of calculated P(comparison_result | target)
+### Correct Patterns
+✅ `BeliefUpdate(comparison_result="higher")` - only comparison result
+✅ Environment → Coordination → Belief information flow
+✅ Probabilistic calculations: P(roll > target) = (dice_sides - target) / dice_sides
+✅ Clean domain boundaries with no cross-dependencies
+## Dependencies
+- gradio (for UI)
+- numpy (for Bayesian calculations)
+- matplotlib (for belief visualization)
+- pytest (for testing)
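
The information flow rules in CLAUDE.md (Environment → Coordination → Belief, with `dice_roll` filtered out on the way) can be illustrated with a minimal sketch. The two dataclasses below are stand-ins that mirror the field names described above, and `to_belief_update` is a hypothetical helper name used only for illustration; the commit itself performs the filtering inline in `BayesianGame.play_round`.

```python
from dataclasses import dataclass, fields
from typing import Literal

Comparison = Literal["higher", "lower", "same"]


@dataclass
class EnvironmentEvidence:
    """Stand-in for the environment-side record: roll plus comparison result."""
    dice_roll: int
    comparison_result: Comparison


@dataclass
class BeliefUpdate:
    """Stand-in for the belief-side record: comparison result only."""
    comparison_result: Comparison


def to_belief_update(evidence: EnvironmentEvidence) -> BeliefUpdate:
    """Hypothetical helper: drop dice_roll before anything reaches the belief domain."""
    return BeliefUpdate(comparison_result=evidence.comparison_result)


evidence = EnvironmentEvidence(dice_roll=5, comparison_result="higher")
update = to_belief_update(evidence)
print([f.name for f in fields(update)])  # the only field is comparison_result
```

Because the belief-side type has no slot for the roll, the constraint is structural rather than a matter of discipline at each call site.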

README.md ADDED

	@@ -0,0 +1,242 @@

+# 🎲 Bayesian Game
+A Bayesian Game implementation featuring a Belief-based Agent using domain-driven design. This interactive game demonstrates Bayesian inference in action as Player 2 attempts to deduce a hidden target die value based on evidence from dice rolls.
+## 🎯 Game Overview
+**The Setup:**
+- Judge and Player 1 can see the target die value (1-6)
+- Player 2 must deduce the target value using Bayesian inference
+- Each round: Player 1 rolls dice and reports "higher"/"lower"/"same" compared to target
+- **Player 2 only receives the comparison result, NOT the actual dice roll value**
+- Game runs for 10 rounds (configurable)
+- Judge ensures truth-telling
+**The Challenge:**
+Player 2 starts with uniform beliefs about the target value and updates their beliefs after each piece of evidence using Bayes' rule. The key insight is that Player 2 must calculate the probability that ANY dice roll would produce the observed comparison result for each possible target value.
+## 🏗️ Architecture
+Built using **Domain-Driven Design** with clean separation of concerns:
+### 1. Environment Domain (`domains/environment/`)
+- **Pure evidence generation** - no probability knowledge
+- `EnvironmentEvidence`: Dataclass for dice roll results
+- `Environment`: Generates target values and dice roll comparisons
+### 2. Belief Domain (`domains/belief/`)
+- **Pure Bayesian inference** - receives only comparison results, no dice roll values
+- `BeliefUpdate`: Dataclass containing only comparison results
+- `BayesianBeliefState`: Calculates likelihood P(comparison_result | target) for each possible target
+### 3. Game Coordination (`domains/coordination/`)
+- **Thin orchestration layer** - coordinates between domains
+- `GameState`: Tracks current game state
+- `BayesianGame`: Main game orchestration class
+### 4. UI Layer (`ui/`)
+- Interactive Gradio web interface
+- Real-time belief visualization
+- Game controls and statistics display
+## 🚀 Quick Start
+### Prerequisites
+- Python 3.10+
+- `uv` package manager (recommended) or `pip`
+### Installation
+1. **Clone and navigate to the project:**
+```bash
+git clone <repository-url>
+cd bayesian_game
+```
+2. **Set up virtual environment:**
+```bash
+# Using uv (recommended)
+uv venv
+source .venv/bin/activate  # On Windows: .venv\Scripts\activate
+# Or using pip
+python -m venv venv
+source venv/bin/activate   # On Windows: venv\Scripts\activate
+```
+3. **Install dependencies:**
+```bash
+# Using uv
+uv pip install -r requirements.txt
+# Or using pip
+pip install -r requirements.txt
+```
+### Running the Game
+**Launch the interactive web interface:**
+```bash
+python app.py
+```
+The game will be available at `http://localhost:7860`
+**Run programmatically from Python (for development):**
+```python
+from domains.coordination.game_coordination import BayesianGame
+# Create and start a game
+game = BayesianGame(seed=42)
+game.start_new_game(target_value=3)
+# Play rounds
+for round_num in range(5):
+    state = game.play_round()
+    evidence = state.evidence_history[-1]
+    print(f"Round {round_num + 1}: Rolled {evidence.dice_roll} → {evidence.comparison_result}")
+    print(f"Most likely target: {state.most_likely_target}")
+    print(f"Belief entropy: {state.belief_entropy:.2f}")
+```
+## 🧪 Testing
+Run the comprehensive test suite:
+```bash
+# Run all tests
+python -m pytest tests/ -v
+# Run specific domain tests
+python -m pytest tests/test_environment_domain.py -v
+python -m pytest tests/test_belief_domain.py -v
+python -m pytest tests/test_game_coordination.py -v
+# Run with coverage
+python -m pytest tests/ --cov=domains --cov-report=html
+```
+**Test Coverage:**
+- 78 comprehensive tests
+- All core functionality covered
+- Edge cases and error handling tested
+- Reproducibility and randomness testing
+## 🎮 Game Interface
+The Gradio interface provides:
+- **Game Controls**: Start new games, play rounds, reset settings
+- **Real-time Visualization**: Belief probability distribution chart
+- **Game Statistics**: Entropy, accuracy, round information
+- **Evidence History**: Complete log of dice rolls and comparisons
+- **Customization**: Adjustable dice sides and round count
+### Interface Features
+- 📊 **Belief Distribution Chart**: Visual representation of Player 2's beliefs
+- 🎯 **Target Highlighting**: True target and most likely guess highlighted
+- 📝 **Evidence Log**: Complete history of all dice rolls and results
+- ⚙️ **Game Settings**: Customize dice sides (2-20) and max rounds (1-50)
+- 🔄 **Reset & Replay**: Easy game reset and replay functionality
+## 📁 Project Structure
+```
+bayesian_game/
+├── domains/                    # Core domain logic
+│   ├── environment/           # Evidence generation
+│   │   └── environment_domain.py
+│   ├── belief/               # Bayesian inference
+│   │   └── belief_domain.py
+│   └── coordination/         # Game orchestration
+│       └── game_coordination.py
+├── ui/                       # User interface
+│   └── gradio_interface.py
+├── tests/                    # Comprehensive test suite
+│   ├── test_architectural_constraints.py
+│   ├── test_belief_domain.py
+│   ├── test_environment_domain.py
+│   ├── test_game_coordination.py
+│   └── test_ui_interface.py
+├── app.py                    # Main entry point
+├── requirements.txt          # Dependencies
+├── CLAUDE.md                 # Project specifications
+└── README.md                 # This file
+```
+## 🔬 Key Features
+### Bayesian Inference Engine
+- **Proper Bayesian Updates**: Uses Bayes' rule for belief updates
+- **Entropy Calculation**: Measures uncertainty in beliefs
+- **Evidence Integration**: Combines multiple pieces of evidence
+- **Impossible Evidence Handling**: Gracefully handles contradictory evidence
+### Reproducible Experiments
+- **Seeded Randomness**: Reproducible results for testing
+- **Deterministic Behavior**: Same seed produces same game sequence
+- **Statistical Analysis**: Track accuracy and convergence
+### Clean Architecture
+- **Domain Separation**: Pure domains with no cross-dependencies
+- **Testable Components**: Each domain independently testable
+- **Extensible Design**: Easy to add new features or modify rules
+## 🎓 Educational Value
+This implementation demonstrates:
+- **Bayesian Inference**: Real-world application of Bayes' rule
+- **Uncertainty Quantification**: How beliefs evolve with evidence
+- **Information Theory**: Entropy as a measure of uncertainty
+- **Domain-Driven Design**: Clean software architecture patterns
+- **Test-Driven Development**: Comprehensive testing strategies
+## 🛠️ Development
+### Key Dependencies
+- `gradio`: Web interface framework
+- `numpy`: Numerical computations for Bayesian inference
+- `matplotlib`: Belief distribution visualization
+- `pytest`: Testing framework
+### Design Principles
+1. **Pure Functions**: Domains contain pure, testable functions
+2. **Immutable Data**: Evidence and belief updates are treated as immutable value objects
+3. **Clear Interfaces**: Well-defined boundaries between domains
+4. **Comprehensive Testing**: Every component thoroughly tested
+### Contributing
+1. Follow the existing domain-driven architecture
+2. Add tests for any new functionality
+3. Maintain clean separation between domains
+4. Update documentation for new features
+## 📊 Example Game Flow
+```
+Round 1: Evidence "higher" (dice roll > target)
+├─ P(roll>1)=5/6, P(roll>2)=4/6, ..., P(roll>6)=0/6
+├─ Lower targets become more likely
+└─ Entropy: 2.15 bits
+Round 2: Evidence "lower" (dice roll < target)
+├─ P(roll<1)=0/6, P(roll<2)=1/6, ..., P(roll<6)=5/6
+├─ Higher targets become more likely
+└─ Entropy: 1.97 bits
+Round 3: Evidence "same" (dice roll = target)
+├─ P(roll=target) = 1/6 for all targets
+├─ Likelihood is equal for every target, so beliefs are unchanged after normalization
+└─ Entropy: 1.97 bits (unchanged)
+```
+## 🚀 Deployment
+Ready for deployment on:
+- **Hugging Face Spaces**: Direct deployment support
+- **Local Server**: Built-in Gradio server
+- **Cloud Platforms**: Standard Python web app deployment
+---
+**Built with ❤️ using Domain-Driven Design and Bayesian Inference**
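
The entropy figures in the README's "Example Game Flow" can be checked with a short standalone sketch. It assumes a six-sided die, a uniform prior, and the likelihood formulas stated in the README; the function names are illustrative and are not the project's API.

```python
import math


def likelihood(result: str, target: int, sides: int) -> float:
    """P(result | target) for a fair die with `sides` faces."""
    if result == "higher":
        return (sides - target) / sides
    if result == "lower":
        return (target - 1) / sides
    return 1 / sides  # "same"


def update(beliefs: list[float], result: str) -> list[float]:
    """One Bayes step: multiply by the likelihood, then renormalize."""
    sides = len(beliefs)
    unnormalized = [
        b * likelihood(result, t, sides) for t, b in enumerate(beliefs, start=1)
    ]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def entropy_bits(beliefs: list[float]) -> float:
    """Shannon entropy in bits, skipping zero-probability entries."""
    return -sum(p * math.log2(p) for p in beliefs if p > 0)


beliefs = [1 / 6] * 6  # uniform prior over targets 1..6
entropies = []
for result in ["higher", "lower", "same"]:
    beliefs = update(beliefs, result)
    entropies.append(round(entropy_bits(beliefs), 2))

print(entropies)  # [2.15, 1.97, 1.97]
```

After "higher" then "lower", the normalized beliefs are 0, 0.2, 0.3, 0.3, 0.2, 0 for targets 1 through 6; the "same" observation multiplies every entry by 1/6 and so leaves the distribution untouched.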

app.py ADDED

	@@ -0,0 +1,24 @@

+"""
+Bayesian Game - Hugging Face entry point
+A Bayesian Game implementation featuring a Belief-based Agent using domain-driven design.
+"""
+from ui.gradio_interface import create_interface
+def main():
+    """Main entry point for the Bayesian Game application."""
+    demo = create_interface()
+    # Launch with Hugging Face compatible settings
+    demo.launch(
+        server_name="0.0.0.0",
+        server_port=7860,
+        share=False,  # Set to True for public sharing if needed
+        show_error=True
+    )
+if __name__ == "__main__":
+    main()

domains/__init__.py ADDED

	@@ -0,0 +1 @@

+# Domains package initialization

domains/belief/__init__.py ADDED

	@@ -0,0 +1 @@

+# Belief domain package initialization

domains/belief/belief_domain.py ADDED

	@@ -0,0 +1,123 @@

+from dataclasses import dataclass
+from typing import List, Literal
+import numpy as np
+@dataclass
+class BeliefUpdate:
+    """Update information for Bayesian belief state."""
+    comparison_result: Literal["higher", "lower", "same"]
+class BayesianBeliefState:
+    """Bayesian belief state for inferring target die value.
+    Handles pure Bayesian inference without knowledge of actual values.
+    """
+    def __init__(self, dice_sides: int = 6):
+        """Initialize belief state with uniform prior.
+        Args:
+            dice_sides: Number of sides on the dice
+        """
+        self.dice_sides = dice_sides
+        # Uniform prior over all possible target values
+        self.beliefs = np.ones(dice_sides) / dice_sides
+        self.evidence_history: List[BeliefUpdate] = []
+    def get_current_beliefs(self) -> np.ndarray:
+        """Get current belief distribution over target values.
+        Returns:
+            Array of probabilities for each possible target value (1 to dice_sides)
+        """
+        return self.beliefs.copy()
+    def get_most_likely_target(self) -> int:
+        """Get the most likely target value based on current beliefs.
+        Returns:
+            Most likely target value (1-indexed)
+        """
+        return int(np.argmax(self.beliefs)) + 1  # cast so the declared int return type holds
+    def get_belief_for_target(self, target: int) -> float:
+        """Get belief probability for a specific target value.
+        Args:
+            target: Target value (1 to dice_sides)
+        Returns:
+            Probability that target is the true value
+        """
+        if not (1 <= target <= self.dice_sides):
+            raise ValueError(f"Target must be between 1 and {self.dice_sides}")
+        return float(self.beliefs[target - 1])  # cast numpy scalar to a plain float
+    def update_beliefs(self, evidence: BeliefUpdate) -> None:
+        """Update beliefs based on new evidence using Bayes' rule.
+        Args:
+            evidence: New evidence to incorporate
+        """
+        self.evidence_history.append(evidence)
+        comparison_result = evidence.comparison_result
+        # Calculate likelihood for each possible target value
+        likelihoods = np.zeros(self.dice_sides)
+        for target_idx in range(self.dice_sides):
+            target_value = target_idx + 1
+            # Calculate P(comparison_result | target_value)
+            # This is the probability that ANY dice roll would produce this comparison result
+            if comparison_result == "higher":
+                # P(roll > target) = (dice_sides - target) / dice_sides
+                likelihood = (self.dice_sides - target_value) / self.dice_sides
+            elif comparison_result == "lower":
+                # P(roll < target) = (target - 1) / dice_sides
+                likelihood = (target_value - 1) / self.dice_sides
+            else:  # comparison_result == "same"
+                # P(roll = target) = 1 / dice_sides
+                likelihood = 1 / self.dice_sides
+            likelihoods[target_idx] = likelihood
+        # Apply Bayes' rule: posterior ∝ prior × likelihood
+        self.beliefs = self.beliefs * likelihoods
+        # Normalize to ensure probabilities sum to 1
+        total_belief = np.sum(self.beliefs)
+        if total_belief > 0:
+            self.beliefs = self.beliefs / total_belief
+        else:
+            # If all likelihoods are 0 (shouldn't happen with valid evidence),
+            # reset to uniform distribution
+            self.beliefs = np.ones(self.dice_sides) / self.dice_sides
+    def reset_beliefs(self) -> None:
+        """Reset beliefs to uniform prior and clear evidence history."""
+        self.beliefs = np.ones(self.dice_sides) / self.dice_sides
+        self.evidence_history = []
+    def get_entropy(self) -> float:
+        """Calculate entropy of current belief distribution.
+        Returns:
+            Entropy in bits (higher = more uncertain)
+        """
+        # Avoid log(0) by filtering out zero probabilities
+        non_zero_beliefs = self.beliefs[self.beliefs > 0]
+        if len(non_zero_beliefs) == 0:
+            return 0.0
+        return float(-np.sum(non_zero_beliefs * np.log2(non_zero_beliefs)))
+    def get_evidence_count(self) -> int:
+        """Get number of evidence updates received.
+        Returns:
+            Number of evidence updates
+        """
+        return len(self.evidence_history)
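
One property worth checking for the likelihoods used in `update_beliefs` is that, for any fixed target, the three comparison results cover every possible roll, so their probabilities sum to 1. The sketch below verifies this with exact fractions; `likelihoods` is an illustrative standalone function that restates the formulas from the diff, not part of the committed module.

```python
from fractions import Fraction


def likelihoods(target: int, sides: int) -> dict[str, Fraction]:
    """Exact P(result | target) for each comparison result on a fair die."""
    return {
        "higher": Fraction(sides - target, sides),
        "lower": Fraction(target - 1, sides),
        "same": Fraction(1, sides),
    }


# (sides - target) + (target - 1) + 1 == sides, so the total is always 1.
for sides in (2, 6, 20):
    for target in range(1, sides + 1):
        assert sum(likelihoods(target, sides).values()) == 1

print(likelihoods(3, 6))
```

The boundary cases also explain the zero-total fallback in the diff: "higher" has probability 0 when the target equals `dice_sides`, and "lower" has probability 0 when the target is 1, so only inconsistent evidence can drive every belief to zero.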

domains/coordination/__init__.py ADDED

	@@ -0,0 +1 @@

+# Coordination domain package initialization

domains/coordination/game_coordination.py ADDED

	@@ -0,0 +1,193 @@

+from dataclasses import dataclass
+from typing import List, Dict, Any
+from enum import Enum
+from ..environment.environment_domain import Environment, EnvironmentEvidence
+from ..belief.belief_domain import BayesianBeliefState, BeliefUpdate
+class GamePhase(Enum):
+    """Phases of the Bayesian Game."""
+    SETUP = "setup"
+    PLAYING = "playing"
+    FINISHED = "finished"
+@dataclass
+class GameState:
+    """Current state of the Bayesian Game."""
+    round_number: int
+    max_rounds: int
+    phase: GamePhase
+    target_value: int | None = None
+    evidence_history: List[EnvironmentEvidence] | None = None
+    current_beliefs: List[float] | None = None
+    most_likely_target: int | None = None
+    belief_entropy: float | None = None
+    def __post_init__(self):
+        if self.evidence_history is None:
+            self.evidence_history = []
+        if self.current_beliefs is None:
+            self.current_beliefs = []
+class BayesianGame:
+    """Main orchestration class for the Bayesian Game.
+    Coordinates between Environment and Belief domains while maintaining
+    clean separation of concerns.
+    """
+    def __init__(self, dice_sides: int = 6, max_rounds: int = 10, seed: int | None = None):
+        """Initialize the Bayesian Game.
+        Args:
+            dice_sides: Number of sides on the dice
+            max_rounds: Maximum number of rounds to play
+            seed: Random seed for reproducible results
+        """
+        self.dice_sides = dice_sides
+        self.max_rounds = max_rounds
+        # Initialize domains
+        self.environment = Environment(dice_sides=dice_sides, seed=seed)
+        self.belief_state = BayesianBeliefState(dice_sides=dice_sides)
+        # Initialize game state
+        self.game_state = GameState(
+            round_number=0,
+            max_rounds=max_rounds,
+            phase=GamePhase.SETUP
+        )
+    def start_new_game(self, target_value: int | None = None) -> GameState:
+        """Start a new game with optional specific target value.
+        Args:
+            target_value: Specific target value, or None for random
+        Returns:
+            Initial game state
+        """
+        # Reset domains
+        self.belief_state.reset_beliefs()
+        # Set target value
+        if target_value is not None:
+            self.environment.set_target_value(target_value)
+        else:
+            self.environment.generate_random_target()
+        # Reset game state
+        self.game_state = GameState(
+            round_number=0,
+            max_rounds=self.max_rounds,
+            phase=GamePhase.PLAYING,
+            target_value=self.environment.get_target_value(),
+            evidence_history=[],
+            current_beliefs=self.belief_state.get_current_beliefs().tolist(),
+            most_likely_target=self.belief_state.get_most_likely_target(),
+            belief_entropy=self.belief_state.get_entropy()
+        )
+        return self.game_state
+    def play_round(self) -> GameState:
+        """Play one round of the game.
+        Returns:
+            Updated game state after the round
+        Raises:
+            ValueError: If game is not in playing phase
+        """
+        if self.game_state.phase != GamePhase.PLAYING:
+            raise ValueError("Game is not in playing phase")
+        if self.game_state.round_number >= self.max_rounds:
+            raise ValueError("Game has already finished")
+        # Generate evidence from environment
+        evidence = self.environment.roll_dice_and_compare()
+        # Update belief state (only pass comparison result, not dice roll)
+        belief_update = BeliefUpdate(
+            comparison_result=evidence.comparison_result
+        )
+        self.belief_state.update_beliefs(belief_update)
+        # Update game state
+        self.game_state.round_number += 1
+        self.game_state.evidence_history.append(evidence)
+        self.game_state.current_beliefs = self.belief_state.get_current_beliefs().tolist()
+        self.game_state.most_likely_target = self.belief_state.get_most_likely_target()
+        self.game_state.belief_entropy = self.belief_state.get_entropy()
+        # Check if game is finished
+        if self.game_state.round_number >= self.max_rounds:
+            self.game_state.phase = GamePhase.FINISHED
+        return self.game_state
+    def get_current_state(self) -> GameState:
+        """Get current game state.
+        Returns:
+            Current game state
+        """
+        return self.game_state
+    def is_game_finished(self) -> bool:
+        """Check if game is finished.
+        Returns:
+            True if game is finished
+        """
+        return self.game_state.phase == GamePhase.FINISHED
+    def get_final_guess_accuracy(self) -> float:
+        """Get accuracy of final guess (belief for true target).
+        Returns:
+            Probability assigned to true target value
+        Raises:
+            ValueError: If target value is not set
+        """
+        if self.game_state.target_value is None:
+            raise ValueError("Target value not set")
+        return self.belief_state.get_belief_for_target(self.game_state.target_value)
+    def was_final_guess_correct(self) -> bool:
+        """Check if the most likely target matches the true target.
+        Returns:
+            True if most likely target equals true target
+        Raises:
+            ValueError: If target value is not set
+        """
+        if self.game_state.target_value is None:
+            raise ValueError("Target value not set")
+        return bool(self.game_state.most_likely_target == self.game_state.target_value)
+    def get_game_summary(self) -> Dict[str, Any]:
+        """Get summary of completed game.
+        Returns:
+            Dictionary with game summary statistics
+        """
+        return {
+            "rounds_played": self.game_state.round_number,
+            "max_rounds": self.max_rounds,
+            "true_target": self.game_state.target_value,
+            "final_guess": self.game_state.most_likely_target,
+            "guess_correct": self.was_final_guess_correct(),
+            "final_accuracy": self.get_final_guess_accuracy(),
+            "final_entropy": self.game_state.belief_entropy,
+            "evidence_count": len(self.game_state.evidence_history),
+            "final_beliefs": dict(enumerate(self.game_state.current_beliefs, 1))
+        }
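
The `final_beliefs` entry in `get_game_summary` relies on `enumerate(..., 1)` to turn the zero-indexed belief list into a mapping keyed by die face. A small standalone illustration, using a made-up belief list rather than output from the game:

```python
beliefs = [0.0, 0.2, 0.3, 0.3, 0.2, 0.0]  # index 0 corresponds to target value 1

final_beliefs = dict(enumerate(beliefs, 1))  # keys start at 1, matching die faces
most_likely = max(final_beliefs, key=final_beliefs.get)

print(final_beliefs[2])  # 0.2
print(most_likely)       # 3 (first of the tied maxima, the same tie-break as np.argmax)
```

Keying by face value keeps the summary readable for the UI, while the belief domain continues to work with a plain array internally.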

domains/environment/__init__.py ADDED

	@@ -0,0 +1 @@

+# Environment domain package initialization

domains/environment/environment_domain.py ADDED

	@@ -0,0 +1,87 @@

+from dataclasses import dataclass
+from typing import Literal
+import random
+@dataclass
+class EnvironmentEvidence:
+    """Evidence generated by the environment - dice roll and comparison result."""
+    dice_roll: int
+    comparison_result: Literal["higher", "lower", "same"]
+class Environment:
+    """Environment domain that generates target values and evidence.
+    Has no knowledge of probabilities - purely generates observable evidence.
+    """
+    def __init__(self, dice_sides: int = 6, seed: int | None = None):
+        """Initialize environment with dice configuration.
+        Args:
+            dice_sides: Number of sides on the dice (default 6)
+            seed: Random seed for reproducible results
+        """
+        self.dice_sides = dice_sides
+        self._random_state = random.Random(seed)  # seed=None gives a nondeterministic generator
+        self._target_value = None
+    def set_target_value(self, target: int) -> None:
+        """Set the target die value that Player 2 must guess.
+        Args:
+            target: Target value (1 to dice_sides)
+        """
+        if not (1 <= target <= self.dice_sides):
+            raise ValueError(f"Target must be between 1 and {self.dice_sides}")
+        self._target_value = target
+    def get_target_value(self) -> int:
+        """Get the current target value.
+        Returns:
+            Current target value
+        Raises:
+            ValueError: If target value hasn't been set
+        """
+        if self._target_value is None:
+            raise ValueError("Target value not set")
+        return self._target_value
+    def generate_random_target(self) -> int:
+        """Generate and set a random target value.
+        Returns:
+            The generated target value
+        """
+        target = self._random_state.randint(1, self.dice_sides)
+        self.set_target_value(target)
+        return target
+    def roll_dice_and_compare(self) -> EnvironmentEvidence:
+        """Roll dice and compare to target, generating evidence.
+        Returns:
+            EnvironmentEvidence with dice roll and comparison result
+        Raises:
+            ValueError: If target value hasn't been set
+        """
+        if self._target_value is None:
+            raise ValueError("Target value not set")
+        dice_roll = self._random_state.randint(1, self.dice_sides)
+        if dice_roll > self._target_value:
+            comparison_result = "higher"
+        elif dice_roll < self._target_value:
+            comparison_result = "lower"
+        else:
+            comparison_result = "same"
+        return EnvironmentEvidence(
+            dice_roll=dice_roll,
+            comparison_result=comparison_result
+        )
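
The seeded-reproducibility behaviour of the environment above can be illustrated with a minimal standalone sketch. It does not import the repository code: `compare` and `evidence_stream` are hypothetical helpers written for this example, mirroring only the three-way branch in `roll_dice_and_compare` and the use of a private `random.Random` instance.

```python
import random


def compare(roll: int, target: int) -> str:
    """Hypothetical helper: same three-way branch as roll_dice_and_compare."""
    if roll > target:
        return "higher"
    if roll < target:
        return "lower"
    return "same"


def evidence_stream(seed, target, rounds, dice_sides=6):
    """Hypothetical helper: comparison results drawn from a privately seeded generator."""
    rng = random.Random(seed)  # private generator, the global random state is untouched
    return [compare(rng.randint(1, dice_sides), target) for _ in range(rounds)]


# Two streams built from the same seed see the same rolls, hence the same evidence.
first = evidence_stream(seed=42, target=3, rounds=5)
second = evidence_stream(seed=42, target=3, rounds=5)
print(first == second)
```

Keeping the generator on the instance rather than calling the module-level `random` functions is what lets two environments with the same seed stay in lockstep, regardless of other random calls in the process.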

requirements.txt ADDED

	@@ -0,0 +1,4 @@

+gradio>=4.0.0
+numpy>=1.21.0
+matplotlib>=3.5.0
+pytest>=7.0.0

tests/__init__.py ADDED

	@@ -0,0 +1 @@


+# Test package initialization

tests/test_architectural_constraints.py ADDED

	@@ -0,0 +1,159 @@

+"""
+Architectural constraint tests to ensure proper domain separation.
+These tests verify that the key architectural principles are maintained:
+1. Belief domain receives only comparison results, not dice roll values
+2. Information flows correctly through the coordination layer
+3. Domain boundaries are properly enforced
+"""
+import pytest
+import inspect
+from domains.belief.belief_domain import BeliefUpdate, BayesianBeliefState
+from domains.environment.environment_domain import EnvironmentEvidence
+from domains.coordination.game_coordination import BayesianGame
+class TestArchitecturalConstraints:
+    """Test architectural constraints and domain separation."""
+    def test_belief_update_dataclass_structure(self):
+        """Test that BeliefUpdate contains only comparison_result field."""
+        # Get all fields of BeliefUpdate
+        fields = BeliefUpdate.__dataclass_fields__
+        # Should only contain comparison_result
+        assert len(fields) == 1, f"BeliefUpdate should have exactly 1 field, got {len(fields)}: {list(fields.keys())}"
+        assert "comparison_result" in fields, "BeliefUpdate must contain comparison_result field"
+        assert "dice_roll" not in fields, "BeliefUpdate MUST NOT contain dice_roll field"
+    def test_environment_evidence_dataclass_structure(self):
+        """Test that EnvironmentEvidence contains both dice_roll and comparison_result."""
+        # Get all fields of EnvironmentEvidence
+        fields = EnvironmentEvidence.__dataclass_fields__
+        # Should contain both fields
+        assert len(fields) == 2, f"EnvironmentEvidence should have exactly 2 fields, got {len(fields)}: {list(fields.keys())}"
+        assert "dice_roll" in fields, "EnvironmentEvidence must contain dice_roll field"
+        assert "comparison_result" in fields, "EnvironmentEvidence must contain comparison_result field"
+    def test_belief_state_methods_no_dice_roll_parameters(self):
+        """Test that BayesianBeliefState methods don't accept dice_roll parameters."""
+        # Get all methods of BayesianBeliefState
+        methods = inspect.getmembers(BayesianBeliefState, predicate=inspect.isfunction)
+        for method_name, method in methods:
+            if method_name.startswith('_'):
+                continue  # Skip private methods
+            signature = inspect.signature(method)
+            param_names = list(signature.parameters.keys())
+            assert "dice_roll" not in param_names, f"Method {method_name} MUST NOT have dice_roll parameter"
+    def test_belief_update_creation_without_dice_roll(self):
+        """Test that BeliefUpdate can be created without dice_roll."""
+        # This should work (only comparison_result)
+        update = BeliefUpdate(comparison_result="higher")
+        assert update.comparison_result == "higher"
+        # Passing dice_roll must raise TypeError because it is not a field
+        with pytest.raises(TypeError):
+            BeliefUpdate(dice_roll=3, comparison_result="higher")
+    def test_information_filtering_in_coordination(self):
+        """Test that game coordination properly filters information to belief domain."""
+        game = BayesianGame(seed=42)
+        game.start_new_game(target_value=3)
+        # Get initial belief state
+        initial_beliefs = game.belief_state.get_current_beliefs()
+        # Play a round (this should trigger proper information filtering)
+        game.play_round()
+        # Verify that belief state received update (beliefs changed)
+        updated_beliefs = game.belief_state.get_current_beliefs()
+        assert not all(a == b for a, b in zip(initial_beliefs, updated_beliefs)), \
+            "Beliefs should change after receiving evidence"
+        # Verify that evidence history in belief domain contains only comparison results
+        for evidence in game.belief_state.evidence_history:
+            assert hasattr(evidence, "comparison_result"), "Belief evidence must have comparison_result"
+            assert not hasattr(evidence, "dice_roll"), "Belief evidence MUST NOT have dice_roll"
+    def test_domain_import_isolation(self):
+        """Test that belief domain doesn't import environment domain."""
+        import domains.belief.belief_domain as belief_module
+        # Get all imports in the belief domain module
+        belief_source = inspect.getsource(belief_module)
+        # Should not import environment domain
+        assert "from domains.environment" not in belief_source, \
+            "Belief domain MUST NOT import environment domain"
+        assert "import domains.environment" not in belief_source, \
+            "Belief domain MUST NOT import environment domain"
+    def test_proper_bayesian_calculation_structure(self):
+        """Test that belief updates use probabilistic calculations."""
+        belief_state = BayesianBeliefState(dice_sides=6)
+        # Apply "higher" evidence
+        update = BeliefUpdate(comparison_result="higher")
+        belief_state.update_beliefs(update)
+        # Verify that probabilities follow expected pattern for "higher"
+        # Target 1: P(roll > 1) = 5/6, should be highest
+        # Target 6: P(roll > 6) = 0/6, should be zero
+        prob_1 = belief_state.get_belief_for_target(1)
+        prob_6 = belief_state.get_belief_for_target(6)
+        assert prob_1 > prob_6, "Higher evidence should favor lower targets"
+        assert abs(prob_6 - 0.0) < 1e-10, "Target 6 should have zero probability after 'higher' evidence"
+    def test_coordination_layer_responsibility(self):
+        """Test that coordination layer properly orchestrates without leaking information."""
+        game = BayesianGame(seed=42)
+        game.start_new_game(target_value=4)
+        # Play a round to generate evidence
+        state = game.play_round()
+        # Game state should have full information (for display)
+        assert hasattr(state.evidence_history[0], "dice_roll"), \
+            "Game state should maintain full evidence for display"
+        assert hasattr(state.evidence_history[0], "comparison_result"), \
+            "Game state should maintain comparison results"
+        # But belief state should only have comparison results
+        belief_evidence = game.belief_state.evidence_history[0]
+        assert hasattr(belief_evidence, "comparison_result"), \
+            "Belief evidence must have comparison_result"
+        assert not hasattr(belief_evidence, "dice_roll"), \
+            "Belief evidence MUST NOT have dice_roll"
+    def test_no_hard_coded_probabilities(self):
+        """Test that belief calculations are dynamic, not hard-coded."""
+        # Test with different dice sides to ensure calculations are dynamic
+        for dice_sides in [4, 6, 8, 10]:
+            belief_state = BayesianBeliefState(dice_sides=dice_sides)
+            # Apply "higher" evidence
+            update = BeliefUpdate(comparison_result="higher")
+            belief_state.update_beliefs(update)
+            # Target 1 should have highest probability: P(roll > 1) = (dice_sides - 1) / dice_sides
+            # Last target should have zero probability: P(roll > dice_sides) = 0
+            prob_1 = belief_state.get_belief_for_target(1)
+            prob_last = belief_state.get_belief_for_target(dice_sides)
+            assert prob_1 > prob_last, f"Target 1 should be more likely than target {dice_sides}"
+            assert abs(prob_last - 0.0) < 1e-10, f"Target {dice_sides} should have zero probability"
+            assert prob_1 > 0, "Target 1 should have non-zero probability"
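
The likelihoods quoted in the test comments above (for example P(roll > 1) = 5/6 and P(roll > 6) = 0) are enough to sketch the update these tests exercise. The following is a minimal illustration, not the repository's `BayesianBeliefState` (which is not shown in this section): `likelihood` and `bayes_update` are hypothetical names, and exact fractions are used only to make the arithmetic easy to check.

```python
from fractions import Fraction


def likelihood(result: str, target: int, dice_sides: int) -> Fraction:
    """P(comparison result | target) for one fair roll (hypothetical helper)."""
    if result == "higher":   # roll > target
        return Fraction(dice_sides - target, dice_sides)
    if result == "lower":    # roll < target
        return Fraction(target - 1, dice_sides)
    if result == "same":     # roll == target
        return Fraction(1, dice_sides)
    raise ValueError(f"Unknown comparison result: {result}")


def bayes_update(prior, result):
    """Multiply the prior by the likelihood and renormalize (hypothetical helper)."""
    n = len(prior)
    unnormalized = [p * likelihood(result, t, n) for t, p in enumerate(prior, start=1)]
    total = sum(unnormalized)
    if total == 0:
        raise ValueError("Evidence is impossible under the current beliefs")
    return [u / total for u in unnormalized]


prior = [Fraction(1, 6)] * 6              # uniform prior over targets 1..6
posterior = bayes_update(prior, "higher")
print(posterior)                          # target 1 gets 1/3, target 6 gets 0
```

Because only the comparison result enters `bayes_update`, the sketch also respects the constraint checked above: the dice roll itself never reaches the belief update.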

tests/test_belief_domain.py ADDED

	@@ -0,0 +1,295 @@

+import pytest
+import numpy as np
+from domains.belief.belief_domain import BayesianBeliefState, BeliefUpdate
+class TestBeliefUpdate:
+    """Test the BeliefUpdate dataclass."""
+    def test_belief_update_creation(self):
+        """Test creating belief update with valid data."""
+        update = BeliefUpdate(comparison_result="higher")
+        assert update.comparison_result == "higher"
+    def test_belief_update_all_results(self):
+        """Test belief update with all comparison results."""
+        valid_results = ["higher", "lower", "same"]
+        for result in valid_results:
+            update = BeliefUpdate(comparison_result=result)
+            assert update.comparison_result == result
+class TestBayesianBeliefState:
+    """Test the BayesianBeliefState class."""
+    def test_initialization_default(self):
+        """Test initialization with default parameters."""
+        belief_state = BayesianBeliefState()
+        assert belief_state.dice_sides == 6
+        assert len(belief_state.beliefs) == 6
+        assert np.allclose(belief_state.beliefs, 1/6)  # Uniform prior
+        assert len(belief_state.evidence_history) == 0
+    def test_initialization_custom(self):
+        """Test initialization with custom dice sides."""
+        belief_state = BayesianBeliefState(dice_sides=8)
+        assert belief_state.dice_sides == 8
+        assert len(belief_state.beliefs) == 8
+        assert np.allclose(belief_state.beliefs, 1/8)  # Uniform prior
+    def test_get_current_beliefs(self):
+        """Test getting current beliefs returns copy."""
+        belief_state = BayesianBeliefState(dice_sides=6)
+        beliefs = belief_state.get_current_beliefs()
+        # Should be a copy, not reference
+        beliefs[0] = 0.5
+        assert not np.array_equal(beliefs, belief_state.beliefs)
+        assert np.allclose(belief_state.beliefs, 1/6)
+    def test_get_most_likely_target_uniform(self):
+        """Test getting most likely target with uniform distribution."""
+        belief_state = BayesianBeliefState(dice_sides=6)
+        # With uniform distribution, should return first target (index 0 + 1)
+        most_likely = belief_state.get_most_likely_target()
+        assert most_likely == 1
+    def test_get_most_likely_target_after_update(self):
+        """Test getting most likely target after belief update."""
+        belief_state = BayesianBeliefState(dice_sides=6)
+        # Update with evidence that favors lower target values
+        update = BeliefUpdate(comparison_result="higher")
+        belief_state.update_beliefs(update)
+        # "higher" is most likely when the target is lowest, so target 1 should lead
+        most_likely = belief_state.get_most_likely_target()
+        assert most_likely == 1
+    def test_get_belief_for_target_valid(self):
+        """Test getting belief for valid target values."""
+        belief_state = BayesianBeliefState(dice_sides=6)
+        for target in range(1, 7):
+            belief = belief_state.get_belief_for_target(target)
+            assert abs(belief - 1/6) < 1e-10  # Should be uniform initially
+    def test_get_belief_for_target_invalid(self):
+        """Test getting belief for invalid target values raises error."""
+        belief_state = BayesianBeliefState(dice_sides=6)
+        invalid_targets = [0, 7, -1, 10]
+        for target in invalid_targets:
+            with pytest.raises(ValueError, match="Target must be between 1 and 6"):
+                belief_state.get_belief_for_target(target)
+    def test_update_beliefs_higher(self):
+        """Test belief update with 'higher' evidence."""
+        belief_state = BayesianBeliefState(dice_sides=6)
+        # Evidence: comparison result is "higher" (dice roll > target)
+        # This is more likely for lower target values
+        update = BeliefUpdate(comparison_result="higher")
+        belief_state.update_beliefs(update)
+        # Lower targets should have higher probability than higher targets
+        # Target 1: P(roll > 1) = 5/6
+        # Target 6: P(roll > 6) = 0/6
+        prob_1 = belief_state.get_belief_for_target(1)
+        prob_6 = belief_state.get_belief_for_target(6)
+        assert prob_1 > prob_6  # Target 1 should be more likely than target 6
+        assert abs(prob_6 - 0.0) < 1e-10  # Target 6 should have zero probability
+    def test_update_beliefs_lower(self):
+        """Test belief update with 'lower' evidence."""
+        belief_state = BayesianBeliefState(dice_sides=6)
+        # Evidence: comparison result is "lower" (dice roll < target)
+        # This is more likely for higher target values
+        update = BeliefUpdate(comparison_result="lower")
+        belief_state.update_beliefs(update)
+        # Higher targets should have higher probability than lower targets
+        # Target 1: P(roll < 1) = 0/6
+        # Target 6: P(roll < 6) = 5/6
+        prob_1 = belief_state.get_belief_for_target(1)
+        prob_6 = belief_state.get_belief_for_target(6)
+        assert prob_6 > prob_1  # Target 6 should be more likely than target 1
+        assert abs(prob_1 - 0.0) < 1e-10  # Target 1 should have zero probability
+    def test_update_beliefs_same(self):
+        """Test belief update with 'same' evidence."""
+        belief_state = BayesianBeliefState(dice_sides=6)
+        # Evidence: comparison result is "same" (dice roll = target)
+        # This has equal probability for all targets: P(roll = target) = 1/6
+        update = BeliefUpdate(comparison_result="same")
+        belief_state.update_beliefs(update)
+        # All targets should have equal probability since P(roll = target) = 1/6 for all
+        for target in range(1, 7):
+            prob = belief_state.get_belief_for_target(target)
+            assert abs(prob - 1/6) < 1e-10  # Should remain uniform
+    def test_update_beliefs_multiple(self):
+        """Test multiple belief updates."""
+        belief_state = BayesianBeliefState(dice_sides=6)
+        # First update: "higher" (favors lower targets)
+        update1 = BeliefUpdate(comparison_result="higher")
+        belief_state.update_beliefs(update1)
+        # Second update: "lower" (favors higher targets)
+        update2 = BeliefUpdate(comparison_result="lower")
+        belief_state.update_beliefs(update2)
+        # The combination should favor middle targets
+        # Target 1: P(roll>1) * P(roll<1) = 5/6 * 0 = 0
+        # Target 6: P(roll>6) * P(roll<6) = 0 * 5/6 = 0
+        # Middle targets should have non-zero probability
+        prob_1 = belief_state.get_belief_for_target(1)
+        prob_6 = belief_state.get_belief_for_target(6)
+        prob_3 = belief_state.get_belief_for_target(3)
+        assert abs(prob_1 - 0.0) < 1e-10  # Target 1 should be eliminated
+        assert abs(prob_6 - 0.0) < 1e-10  # Target 6 should be eliminated
+        assert prob_3 > 0  # Middle targets should have some probability
+    def test_update_beliefs_evidence_history(self):
+        """Test that evidence history is maintained."""
+        belief_state = BayesianBeliefState(dice_sides=6)
+        updates = [
+            BeliefUpdate(comparison_result="higher"),
+            BeliefUpdate(comparison_result="lower"),
+            BeliefUpdate(comparison_result="same")
+        ]
+        for update in updates:
+            belief_state.update_beliefs(update)
+        assert len(belief_state.evidence_history) == 3
+        assert belief_state.evidence_history == updates
+    def test_reset_beliefs(self):
+        """Test resetting beliefs to uniform prior."""
+        belief_state = BayesianBeliefState(dice_sides=6)
+        # Update beliefs
+        update = BeliefUpdate(comparison_result="higher")
+        belief_state.update_beliefs(update)
+        # Verify beliefs changed from uniform
+        prob_1 = belief_state.get_belief_for_target(1)
+        prob_6 = belief_state.get_belief_for_target(6)
+        assert prob_1 != prob_6  # Should no longer be uniform
+        assert len(belief_state.evidence_history) == 1
+        # Reset beliefs
+        belief_state.reset_beliefs()
+        # Should be back to uniform
+        for target in range(1, 7):
+            assert abs(belief_state.get_belief_for_target(target) - 1/6) < 1e-10
+        assert len(belief_state.evidence_history) == 0
+    def test_get_entropy_uniform(self):
+        """Test entropy calculation for uniform distribution."""
+        belief_state = BayesianBeliefState(dice_sides=6)
+        entropy = belief_state.get_entropy()
+        expected_entropy = np.log2(6)  # Maximum entropy for 6 outcomes
+        assert abs(entropy - expected_entropy) < 1e-10
+    def test_get_entropy_certain(self):
+        """Test entropy calculation for certain distribution."""
+        belief_state = BayesianBeliefState(dice_sides=6)
+        # Create a near-certain belief by applying many "higher" updates
+        # This will eventually make target 1 much more likely than others
+        for _ in range(10):
+            update = BeliefUpdate(comparison_result="higher")
+            belief_state.update_beliefs(update)
+        entropy = belief_state.get_entropy()
+        max_entropy = np.log2(6)
+        assert entropy < max_entropy  # Repeated consistent evidence must reduce entropy below the maximum
+    def test_get_entropy_partial(self):
+        """Test entropy calculation for partial certainty."""
+        belief_state = BayesianBeliefState(dice_sides=6)
+        # Reduce uncertainty but don't eliminate it
+        update = BeliefUpdate(comparison_result="higher")
+        belief_state.update_beliefs(update)
+        entropy = belief_state.get_entropy()
+        max_entropy = np.log2(6)
+        min_entropy = 0
+        # Should be between min and max
+        assert min_entropy < entropy < max_entropy
+    def test_get_evidence_count(self):
+        """Test getting evidence count."""
+        belief_state = BayesianBeliefState(dice_sides=6)
+        assert belief_state.get_evidence_count() == 0
+        # Add some evidence
+        updates = [
+            BeliefUpdate(comparison_result="higher"),
+            BeliefUpdate(comparison_result="lower")
+        ]
+        for i, update in enumerate(updates, 1):
+            belief_state.update_beliefs(update)
+            assert belief_state.get_evidence_count() == i
+    def test_beliefs_sum_to_one(self):
+        """Test that beliefs always sum to 1 after updates."""
+        belief_state = BayesianBeliefState(dice_sides=6)
+        updates = [
+            BeliefUpdate(comparison_result="higher"),
+            BeliefUpdate(comparison_result="lower"),
+            BeliefUpdate(comparison_result="same"),
+            BeliefUpdate(comparison_result="higher")
+        ]
+        # Check initial sum
+        assert abs(np.sum(belief_state.beliefs) - 1.0) < 1e-10
+        # Check sum after each update
+        for update in updates:
+            belief_state.update_beliefs(update)
+            assert abs(np.sum(belief_state.beliefs) - 1.0) < 1e-10
+    def test_impossible_evidence_handling(self):
+        """Test handling of evidence combinations that create zero likelihoods."""
+        belief_state = BayesianBeliefState(dice_sides=6)
+        # Apply a few "higher" results to favor lower targets
+        for _ in range(3):
+            update1 = BeliefUpdate(comparison_result="higher")
+            belief_state.update_beliefs(update1)
+        # Target 1 should be favored, target 6 should have zero probability
+        prob_1 = belief_state.get_belief_for_target(1)
+        prob_6 = belief_state.get_belief_for_target(6)
+        assert prob_1 > 0  # Target 1 should have some probability
+        assert abs(prob_6 - 0.0) < 1e-10  # Target 6 should have zero probability
+        # Apply more evidence and verify probabilities still sum to 1
+        update2 = BeliefUpdate(comparison_result="lower")
+        belief_state.update_beliefs(update2)
+        total_prob = sum(belief_state.get_belief_for_target(i) for i in range(1, 7))
+        assert abs(total_prob - 1.0) < 1e-10  # Should still sum to 1
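
The entropy tests above compare against `np.log2(6)` for the uniform case and expect a lower value once evidence arrives. A minimal sketch of the Shannon entropy in bits follows, assuming the usual convention that zero-probability entries contribute nothing. `entropy_bits` is a hypothetical name, and the `after_higher` values are the posterior obtained from a uniform prior after one "higher" result under the likelihoods quoted in the test comments.

```python
import math


def entropy_bits(beliefs):
    """Shannon entropy in bits; zero-probability entries are skipped (hypothetical helper)."""
    return -sum(p * math.log2(p) for p in beliefs if p > 0)


uniform = [1 / 6] * 6
after_higher = [1 / 3, 4 / 15, 1 / 5, 2 / 15, 1 / 15, 0.0]

print(entropy_bits(uniform))       # equals log2(6), the maximum for six outcomes
print(entropy_bits(after_higher))  # strictly between 0 and log2(6)
```

Skipping the zero entries matters here because targets eliminated by the evidence (such as target 6 after "higher") would otherwise trigger `log2(0)`.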

tests/test_environment_domain.py ADDED

	@@ -0,0 +1,187 @@

+import pytest
+from domains.environment.environment_domain import Environment, EnvironmentEvidence
+class TestEnvironmentEvidence:
+    """Test the EnvironmentEvidence dataclass."""
+    def test_evidence_creation(self):
+        """Test creating evidence with valid data."""
+        evidence = EnvironmentEvidence(dice_roll=3, comparison_result="higher")
+        assert evidence.dice_roll == 3
+        assert evidence.comparison_result == "higher"
+    def test_evidence_comparison_results(self):
+        """Test all valid comparison results."""
+        valid_results = ["higher", "lower", "same"]
+        for result in valid_results:
+            evidence = EnvironmentEvidence(dice_roll=1, comparison_result=result)
+            assert evidence.comparison_result == result
+class TestEnvironment:
+    """Test the Environment class."""
+    def test_environment_initialization(self):
+        """Test environment initialization with default and custom parameters."""
+        # Default initialization
+        env = Environment()
+        assert env.dice_sides == 6
+        assert env._target_value is None
+        # Custom initialization
+        env = Environment(dice_sides=8, seed=42)
+        assert env.dice_sides == 8
+        assert env._target_value is None
+    def test_set_target_value_valid(self):
+        """Test setting valid target values."""
+        env = Environment(dice_sides=6)
+        for target in range(1, 7):
+            env.set_target_value(target)
+            assert env.get_target_value() == target
+    def test_set_target_value_invalid(self):
+        """Test setting invalid target values raises ValueError."""
+        env = Environment(dice_sides=6)
+        invalid_targets = [0, 7, -1, 10]
+        for target in invalid_targets:
+            with pytest.raises(ValueError, match="Target must be between 1 and 6"):
+                env.set_target_value(target)
+    def test_get_target_value_not_set(self):
+        """Test getting target value when not set raises ValueError."""
+        env = Environment()
+        with pytest.raises(ValueError, match="Target value not set"):
+            env.get_target_value()
+    def test_generate_random_target(self):
+        """Test random target generation."""
+        env = Environment(dice_sides=6, seed=42)
+        # Generate multiple targets to test randomness
+        targets = [env.generate_random_target() for _ in range(10)]
+        # All targets should be valid
+        for target in targets:
+            assert 1 <= target <= 6
+        # Should be able to get the target after generation
+        assert env.get_target_value() == targets[-1]
+    def test_generate_random_target_reproducible(self):
+        """Test that random target generation is reproducible with seed."""
+        env1 = Environment(dice_sides=6, seed=42)
+        env2 = Environment(dice_sides=6, seed=42)
+        target1 = env1.generate_random_target()
+        target2 = env2.generate_random_target()
+        assert target1 == target2
+    def test_roll_dice_and_compare_target_not_set(self):
+        """Test rolling dice without target set raises ValueError."""
+        env = Environment()
+        with pytest.raises(ValueError, match="Target value not set"):
+            env.roll_dice_and_compare()
+    def test_roll_dice_and_compare_higher(self):
+        """Test dice roll comparison when result is higher."""
+        env = Environment(dice_sides=6, seed=42)
+        env.set_target_value(1)  # Target = 1, any roll > 1 should be "higher"
+        # Run multiple times to test different rolls
+        for _ in range(20):
+            evidence = env.roll_dice_and_compare()
+            assert 1 <= evidence.dice_roll <= 6
+            # With target 1 a roll can never fall below the target, so "lower" is impossible
+            if evidence.dice_roll > 1:
+                assert evidence.comparison_result == "higher"
+            else:
+                assert evidence.comparison_result == "same"
+    def test_roll_dice_and_compare_lower(self):
+        """Test dice roll comparison when result is lower."""
+        env = Environment(dice_sides=6, seed=42)
+        env.set_target_value(6)  # Target = 6, any roll < 6 should be "lower"
+        # Run multiple times to test different rolls
+        for _ in range(20):
+            evidence = env.roll_dice_and_compare()
+            assert 1 <= evidence.dice_roll <= 6
+            # With target 6 a roll can never exceed the target, so "higher" is impossible
+            if evidence.dice_roll < 6:
+                assert evidence.comparison_result == "lower"
+            else:
+                assert evidence.comparison_result == "same"
+    def test_roll_dice_and_compare_same(self):
+        """Test dice roll comparison when result is same."""
+        env = Environment(dice_sides=6, seed=42)
+        # Test each possible target value
+        for target in range(1, 7):
+            env.set_target_value(target)
+            # Roll until we get a match (may take several tries)
+            found_same = False
+            for _ in range(100):  # Avoid infinite loop
+                evidence = env.roll_dice_and_compare()
+                if evidence.dice_roll == target:
+                    assert evidence.comparison_result == "same"
+                    found_same = True
+                    break
+                elif evidence.dice_roll > target:
+                    assert evidence.comparison_result == "higher"
+                else:
+                    assert evidence.comparison_result == "lower"
+            # With 100 attempts, we should find at least one match for 6-sided die
+            assert found_same, f"Failed to roll target value {target} in 100 attempts"
+    def test_roll_dice_and_compare_all_outcomes(self):
+        """Test that all comparison outcomes can occur."""
+        env = Environment(dice_sides=6, seed=42)
+        env.set_target_value(3)  # Middle value to allow all outcomes
+        outcomes_seen = set()
+        # Roll many times to see all outcomes
+        for _ in range(100):
+            evidence = env.roll_dice_and_compare()
+            outcomes_seen.add(evidence.comparison_result)
+            # Verify consistency
+            if evidence.dice_roll > 3:
+                assert evidence.comparison_result == "higher"
+            elif evidence.dice_roll < 3:
+                assert evidence.comparison_result == "lower"
+            else:
+                assert evidence.comparison_result == "same"
+        # Should see all three outcomes with enough rolls
+        assert "higher" in outcomes_seen
+        assert "lower" in outcomes_seen
+        assert "same" in outcomes_seen
+    def test_dice_sides_parameter(self):
+        """Test environment with different dice sides."""
+        for sides in [4, 8, 10, 20]:
+            env = Environment(dice_sides=sides, seed=42)
+            env.set_target_value(sides // 2)  # Middle value
+            evidence = env.roll_dice_and_compare()
+            assert 1 <= evidence.dice_roll <= sides
+            assert evidence.comparison_result in ["higher", "lower", "same"]

tests/test_game_coordination.py ADDED

	@@ -0,0 +1,351 @@

+import pytest
+from domains.coordination.game_coordination import BayesianGame, GameState, GamePhase
+from domains.environment.environment_domain import EnvironmentEvidence
+class TestGameState:
+    """Test the GameState dataclass."""
+    def test_game_state_creation(self):
+        """Test creating game state with required parameters."""
+        state = GameState(
+            round_number=5,
+            max_rounds=10,
+            phase=GamePhase.PLAYING
+        )
+        assert state.round_number == 5
+        assert state.max_rounds == 10
+        assert state.phase == GamePhase.PLAYING
+        assert state.target_value is None
+        assert state.evidence_history == []
+        assert state.current_beliefs == []
+    def test_game_state_with_optional_params(self):
+        """Test creating game state with optional parameters."""
+        evidence = [EnvironmentEvidence(dice_roll=3, comparison_result="higher")]
+        beliefs = [0.2, 0.3, 0.5]
+        state = GameState(
+            round_number=2,
+            max_rounds=5,
+            phase=GamePhase.PLAYING,
+            target_value=4,
+            evidence_history=evidence,
+            current_beliefs=beliefs,
+            most_likely_target=3,
+            belief_entropy=1.5
+        )
+        assert state.target_value == 4
+        assert state.evidence_history == evidence
+        assert state.current_beliefs == beliefs
+        assert state.most_likely_target == 3
+        assert state.belief_entropy == 1.5
+class TestBayesianGame:
+    """Test the BayesianGame class."""
+    def test_initialization_default(self):
+        """Test game initialization with default parameters."""
+        game = BayesianGame()
+        assert game.dice_sides == 6
+        assert game.max_rounds == 10
+        assert game.environment.dice_sides == 6
+        assert game.belief_state.dice_sides == 6
+        assert game.game_state.phase == GamePhase.SETUP
+        assert game.game_state.round_number == 0
+        assert game.game_state.max_rounds == 10
+    def test_initialization_custom(self):
+        """Test game initialization with custom parameters."""
+        game = BayesianGame(dice_sides=8, max_rounds=15, seed=42)
+        assert game.dice_sides == 8
+        assert game.max_rounds == 15
+        assert game.environment.dice_sides == 8
+        assert game.belief_state.dice_sides == 8
+        assert game.game_state.max_rounds == 15
+    def test_start_new_game_random_target(self):
+        """Test starting new game with random target."""
+        game = BayesianGame(seed=42)
+        state = game.start_new_game()
+        assert state.phase == GamePhase.PLAYING
+        assert state.round_number == 0
+        assert 1 <= state.target_value <= 6
+        assert len(state.evidence_history) == 0
+        assert len(state.current_beliefs) == 6
+        assert state.most_likely_target in range(1, 7)
+        assert state.belief_entropy > 0
+    def test_start_new_game_specific_target(self):
+        """Test starting new game with specific target."""
+        game = BayesianGame()
+        state = game.start_new_game(target_value=4)
+        assert state.phase == GamePhase.PLAYING
+        assert state.target_value == 4
+        assert game.environment.get_target_value() == 4
+    def test_start_new_game_resets_state(self):
+        """Test that starting new game resets previous state."""
+        game = BayesianGame(seed=42)
+        # Start first game and play some rounds
+        game.start_new_game(target_value=3)
+        game.play_round()
+        game.play_round()
+        # Start new game
+        state = game.start_new_game(target_value=5)
+        assert state.target_value == 5
+        assert state.round_number == 0
+        assert len(state.evidence_history) == 0
+        assert len(game.belief_state.evidence_history) == 0
+    def test_play_round_not_playing(self):
+        """Test playing round when not in playing phase."""
+        game = BayesianGame()
+        # Game starts in setup phase
+        with pytest.raises(ValueError, match="Game is not in playing phase"):
+            game.play_round()
+    def test_play_round_game_finished(self):
+        """Test playing round when game is already finished."""
+        game = BayesianGame(max_rounds=1, seed=42)
+        # Start game and play one round (should finish)
+        game.start_new_game(target_value=3)
+        game.play_round()
+        # Try to play another round
+        with pytest.raises(ValueError, match="Game is not in playing phase"):
+            game.play_round()
+    def test_play_round_updates_state(self):
+        """Test that playing round updates game state correctly."""
+        game = BayesianGame(seed=42)
+        game.start_new_game(target_value=3)
+        initial_round_number = game.get_current_state().round_number
+        # Play one round
+        updated_state = game.play_round()
+        assert updated_state.round_number == initial_round_number + 1
+        assert len(updated_state.evidence_history) == 1
+        assert len(updated_state.current_beliefs) == 6
+        assert updated_state.most_likely_target in range(1, 7)
+        assert updated_state.belief_entropy >= 0
+        # Evidence should be valid
+        evidence = updated_state.evidence_history[0]
+        assert 1 <= evidence.dice_roll <= 6
+        assert evidence.comparison_result in ["higher", "lower", "same"]
+    def test_play_multiple_rounds(self):
+        """Test playing multiple rounds."""
+        game = BayesianGame(max_rounds=5, seed=42)
+        game.start_new_game(target_value=4)
+        for expected_round in range(1, 6):
+            state = game.play_round()
+            assert state.round_number == expected_round
+            assert len(state.evidence_history) == expected_round
+            if expected_round < 5:
+                assert state.phase == GamePhase.PLAYING
+            else:
+                assert state.phase == GamePhase.FINISHED
+    def test_get_current_state(self):
+        """Test getting current game state."""
+        game = BayesianGame()
+        # Initial state
+        state = game.get_current_state()
+        assert state.phase == GamePhase.SETUP
+        # After starting game
+        game.start_new_game(target_value=2)
+        state = game.get_current_state()
+        assert state.phase == GamePhase.PLAYING
+        assert state.target_value == 2
+    def test_is_game_finished(self):
+        """Test checking if game is finished."""
+        game = BayesianGame(max_rounds=2, seed=42)
+        # Initially not finished
+        assert not game.is_game_finished()
+        # Start game - still not finished
+        game.start_new_game(target_value=3)
+        assert not game.is_game_finished()
+        # Play one round - still not finished
+        game.play_round()
+        assert not game.is_game_finished()
+        # Play final round - now finished
+        game.play_round()
+        assert game.is_game_finished()
+    def test_get_final_guess_accuracy_no_target(self):
+        """Test getting final guess accuracy without target set."""
+        game = BayesianGame()
+        with pytest.raises(ValueError, match="Target value not set"):
+            game.get_final_guess_accuracy()
+    def test_get_final_guess_accuracy(self):
+        """Test getting final guess accuracy."""
+        game = BayesianGame(seed=42)
+        game.start_new_game(target_value=3)
+        # Play some rounds
+        game.play_round()
+        game.play_round()
+        accuracy = game.get_final_guess_accuracy()
+        # Should be probability assigned to target value 3
+        assert 0 <= accuracy <= 1
+        expected_accuracy = game.belief_state.get_belief_for_target(3)
+        assert accuracy == expected_accuracy
+    def test_was_final_guess_correct_no_target(self):
+        """Test checking final guess correctness without target set."""
+        game = BayesianGame()
+        with pytest.raises(ValueError, match="Target value not set"):
+            game.was_final_guess_correct()
+    def test_was_final_guess_correct(self):
+        """Test checking if final guess was correct."""
+        game = BayesianGame(seed=42)
+        game.start_new_game(target_value=3)
+        # Play rounds until we get definitive evidence
+        for _ in range(10):  # Play enough rounds to get clear evidence
+            if game.is_game_finished():
+                break
+            game.play_round()
+        is_correct = game.was_final_guess_correct()
+        most_likely = game.game_state.most_likely_target
+        assert isinstance(is_correct, bool)
+        assert is_correct == (most_likely == 3)
+    def test_get_game_summary(self):
+        """Test getting game summary."""
+        game = BayesianGame(max_rounds=3, seed=42)
+        game.start_new_game(target_value=4)
+        # Play all rounds
+        while not game.is_game_finished():
+            game.play_round()
+        summary = game.get_game_summary()
+        # Check all required fields
+        assert summary["rounds_played"] == 3
+        assert summary["max_rounds"] == 3
+        assert summary["true_target"] == 4
+        assert summary["final_guess"] in range(1, 7)
+        assert isinstance(summary["guess_correct"], bool)
+        assert 0 <= summary["final_accuracy"] <= 1
+        assert summary["final_entropy"] >= 0
+        assert summary["evidence_count"] == 3
+        assert len(summary["final_beliefs"]) == 6
+        # Check that final beliefs are properly indexed (1-6)
+        for i in range(1, 7):
+            assert i in summary["final_beliefs"]
+    def test_belief_updates_with_evidence(self):
+        """Test that belief updates properly reflect evidence."""
+        game = BayesianGame(seed=42)
+        game.start_new_game(target_value=1)  # Low target for predictable evidence
+        initial_beliefs = game.belief_state.get_current_beliefs()
+        # Play several rounds
+        states = []
+        for _ in range(5):
+            if game.is_game_finished():
+                break
+            state = game.play_round()
+            states.append(state)
+        # Beliefs should change as evidence accumulates
+        final_beliefs = game.belief_state.get_current_beliefs()
+        # Should not be uniform anymore (unless very unlikely)
+        assert not all(abs(b - 1/6) < 1e-10 for b in final_beliefs)
+        # Evidence should be consistent with the true target
+        for state in states:
+            for evidence in state.evidence_history:
+                if evidence.comparison_result == "higher":
+                    # Target (1) must be less than the dice roll
+                    assert evidence.dice_roll > 1
+    def test_game_with_evidence_updates(self):
+        """Test game behavior with evidence updates."""
+        game = BayesianGame(seed=42)
+        game.start_new_game(target_value=3)
+        # Apply evidence that changes beliefs
+        from domains.belief.belief_domain import BeliefUpdate
+        update = BeliefUpdate(comparison_result="higher")
+        game.belief_state.update_beliefs(update)
+        # Update game state to reflect the belief change
+        game.game_state.most_likely_target = game.belief_state.get_most_likely_target()
+        # Beliefs should have changed from uniform
+        prob_1 = game.belief_state.get_belief_for_target(1)
+        prob_6 = game.belief_state.get_belief_for_target(6)
+        assert prob_1 > prob_6  # Lower targets should be more likely after "higher"
+        assert game.belief_state.get_most_likely_target() in range(1, 7)
+        assert 0 <= game.get_final_guess_accuracy() <= 1
+    def test_reproducibility_with_seed(self):
+        """Test that games are reproducible with same seed."""
+        # Run two games with same seed
+        game1 = BayesianGame(seed=42)
+        game1.start_new_game(target_value=3)
+        game2 = BayesianGame(seed=42)
+        game2.start_new_game(target_value=3)
+        # Play same number of rounds
+        for _ in range(5):
+            if game1.is_game_finished() or game2.is_game_finished():
+                break
+            state1 = game1.play_round()
+            state2 = game2.play_round()
+            # Evidence should be identical
+            assert len(state1.evidence_history) == len(state2.evidence_history)
+            for ev1, ev2 in zip(state1.evidence_history, state2.evidence_history):
+                assert ev1.dice_roll == ev2.dice_roll
+                assert ev1.comparison_result == ev2.comparison_result
+            # Beliefs should be identical
+            assert state1.current_beliefs == state2.current_beliefs
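
Review note on `test_game_with_evidence_updates`: the assertion `prob_1 > prob_6` after a "higher" result follows directly from Bayes' rule. The sketch below is not the code in `domains/belief/belief_domain.py` (which is not shown in this part of the diff). It is a minimal standalone illustration that assumes a fair die and that "higher" means the roll exceeded the target; `likelihood` and `update` are hypothetical names chosen for the example.

```python
from typing import List


def likelihood(result: str, target: int, sides: int) -> float:
    """P(result | target) for a fair die (assumed: "higher" means roll > target)."""
    if result == "higher":
        return (sides - target) / sides  # rolls strictly above the target
    if result == "lower":
        return (target - 1) / sides      # rolls strictly below the target
    if result == "same":
        return 1 / sides                 # exactly one roll matches
    raise ValueError(f"unknown comparison result: {result!r}")


def update(beliefs: List[float], result: str) -> List[float]:
    """Multiply the prior by the likelihood for each target, then normalise."""
    sides = len(beliefs)
    unnormalised = [
        prior * likelihood(result, target, sides)
        for target, prior in enumerate(beliefs, start=1)
    ]
    total = sum(unnormalised)
    if total == 0:
        raise ValueError("evidence has zero probability under current beliefs")
    return [value / total for value in unnormalised]


prior = [1 / 6] * 6
posterior = update(prior, "higher")
# Target 1 is now the most probable value; target 6 is ruled out,
# because no roll of a six-sided die can exceed 6.
```

Under these assumptions the update only needs the comparison result, never the dice value, which matches the information-filtering constraint described in the commit message.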

tests/test_ui_interface.py ADDED

	@@ -0,0 +1,243 @@

+"""
+Tests for the Gradio UI interface to ensure proper error handling and memory management.
+"""
+import pytest
+import matplotlib.pyplot as plt
+from ui.gradio_interface import GradioInterface
+class TestGradioInterface:
+    """Test the Gradio interface functionality."""
+    def test_interface_initialization(self):
+        """Test that interface initializes correctly."""
+        interface = GradioInterface()
+        assert interface.game is not None
+        assert interface.game.dice_sides == 6
+        assert interface.game.max_rounds == 10
+    def test_reset_game_returns_proper_types(self):
+        """Test that reset_game returns proper types."""
+        interface = GradioInterface()
+        result = interface.reset_game(dice_sides=8, max_rounds=15)
+        assert len(result) == 4
+        status, round_info, belief_chart, game_log = result
+        assert isinstance(status, str)
+        assert isinstance(round_info, str)
+        assert isinstance(belief_chart, plt.Figure)
+        assert isinstance(game_log, str)
+    def test_start_new_game_valid_target(self):
+        """Test starting a new game with valid target."""
+        interface = GradioInterface()
+        result = interface.start_new_game("3")
+        assert len(result) == 4
+        status, round_info, belief_chart, game_log = result
+        assert isinstance(status, str)
+        assert isinstance(round_info, str)
+        assert isinstance(belief_chart, plt.Figure)
+        assert isinstance(game_log, str)
+        assert "Playing" in status
+    def test_start_new_game_invalid_target(self):
+        """Test starting a new game with invalid target returns proper types."""
+        interface = GradioInterface()
+        result = interface.start_new_game("10")  # Invalid for 6-sided die
+        assert len(result) == 4
+        status, round_info, belief_chart, game_log = result
+        assert isinstance(status, str)
+        assert isinstance(round_info, str)
+        assert isinstance(belief_chart, plt.Figure)
+        assert isinstance(game_log, str)
+        assert "❌" in status
+        assert "between 1 and 6" in status
+    def test_play_round_without_game_started(self):
+        """Test playing round without starting game returns proper types."""
+        interface = GradioInterface()
+        result = interface.play_round()
+        assert len(result) == 4
+        status, round_info, belief_chart, game_log = result
+        assert isinstance(status, str)
+        assert isinstance(round_info, str)
+        assert isinstance(belief_chart, plt.Figure)
+        assert isinstance(game_log, str)
+        assert "❌" in status
+        assert "not in playing phase" in status
+    def test_play_round_normal_flow(self):
+        """Test normal round playing flow."""
+        interface = GradioInterface()
+        # Start a game first
+        interface.start_new_game("3")
+        # Play a round
+        result = interface.play_round()
+        assert len(result) == 4
+        status, round_info, belief_chart, game_log = result
+        assert isinstance(status, str)
+        assert isinstance(round_info, str)
+        assert isinstance(belief_chart, plt.Figure)
+        assert isinstance(game_log, str)
+        assert "Playing" in status
+    def test_exceeding_max_rounds(self):
+        """Test that exceeding max rounds shows graceful completion."""
+        interface = GradioInterface()
+        # Start a game with 2 rounds
+        interface.reset_game(dice_sides=6, max_rounds=2)
+        interface.start_new_game("3")
+        # Play 2 rounds (should finish the game)
+        interface.play_round()
+        interface.play_round()
+        # Try to play another round (should be prevented)
+        result = interface.play_round()
+        assert len(result) == 4
+        status, round_info, belief_chart, game_log = result
+        assert isinstance(status, str)
+        assert isinstance(round_info, str)
+        assert isinstance(belief_chart, plt.Figure)
+        assert isinstance(game_log, str)
+        # When game is finished, we should get a graceful completion message
+        assert ("🏁" in status and "completed" in status)
+    def test_create_empty_chart(self):
+        """Test that empty chart creation works properly."""
+        interface = GradioInterface()
+        chart = interface._create_empty_chart()
+        assert isinstance(chart, plt.Figure)
+        # Clean up
+        plt.close(chart)
+    def test_matplotlib_memory_management(self):
+        """Test that matplotlib figures are properly managed."""
+        interface = GradioInterface()
+        # Get initial figure count
+        initial_figures = len(plt.get_fignums())
+        # Create multiple charts
+        for _ in range(5):
+            interface._create_belief_chart()
+        # Should not accumulate figures due to plt.close('all')
+        final_figures = len(plt.get_fignums())
+        # Should have at most 1 figure open (the most recent one)
+        assert final_figures <= initial_figures + 1
+    def test_error_handling_preserves_types(self):
+        """Test that error handling always returns consistent types."""
+        interface = GradioInterface()
+        # Test various error conditions
+        error_results = [
+            interface.start_new_game("invalid_number"),
+            interface.start_new_game("0"),
+            interface.start_new_game("100"),
+            interface.play_round(),  # No game started
+        ]
+        for result in error_results:
+            assert len(result) == 4
+            status, round_info, belief_chart, game_log = result
+            assert isinstance(status, str)
+            assert isinstance(round_info, str)
+            assert isinstance(belief_chart, plt.Figure)
+            assert isinstance(game_log, str)
+            assert "❌" in status
+            # Clean up the figure
+            plt.close(belief_chart)
+    def test_game_log_creation(self):
+        """Test that game log is created properly."""
+        interface = GradioInterface()
+        interface.start_new_game("3")
+        # Play a few rounds
+        for _ in range(3):
+            interface.play_round()
+        result = interface._get_interface_state()
+        status, round_info, belief_chart, game_log = result
+        assert isinstance(game_log, str)
+        assert "Evidence History" in game_log
+        assert "Round" in game_log
+        # Clean up
+        plt.close(belief_chart)
+    def test_graceful_game_completion(self):
+        """Test that game completion shows comprehensive final results."""
+        interface = GradioInterface()
+        # Start and complete a game
+        interface.reset_game(dice_sides=6, max_rounds=3)
+        interface.start_new_game("4")
+        # Play all rounds
+        for _ in range(3):
+            interface.play_round()
+        # Get final state
+        result = interface._get_interface_state()
+        status, round_info, belief_chart, game_log = result
+        # Should show comprehensive final results
+        assert "Final Game Results" in round_info
+        assert "Learning Performance" in round_info
+        assert "Information gained" in round_info
+        assert "Game Completed" in game_log
+        assert ("Congratulations" in game_log or "Learning opportunity" in game_log)
+        assert "confidence in true target" in game_log
+        # Chart should have final state title
+        assert isinstance(belief_chart, plt.Figure)
+        # Clean up
+        plt.close(belief_chart)
+    def test_completion_state_preservation(self):
+        """Test that completion state preserves all information."""
+        interface = GradioInterface()
+        # Complete a game
+        interface.reset_game(dice_sides=6, max_rounds=2)
+        interface.start_new_game("3")
+        interface.play_round()
+        interface.play_round()
+        # Try to play after completion - should preserve final state
+        result = interface.play_round()
+        status, round_info, belief_chart, game_log = result
+        # Should still have all the final game information
+        assert "🏁" in status
+        assert "completed" in status
+        assert len(round_info) > 100  # Should have detailed final results
+        assert len(game_log) > 50     # Should have complete evidence history
+        assert isinstance(belief_chart, plt.Figure)
+        # Clean up
+        plt.close(belief_chart)
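
Review note on the error-path tests: they feed `start_new_game` three kinds of input (empty, non-numeric, out of range) and expect a status string in every case. The sketch below isolates that validation in a pure function so it can be checked without Gradio or matplotlib. `parse_target` is a hypothetical helper, not part of this commit, and its non-numeric error wording is an assumption; only the range message mirrors the one in `ui/gradio_interface.py`.

```python
from typing import Optional, Tuple


def parse_target(raw: str, dice_sides: int) -> Tuple[Optional[int], str]:
    """Return (target, error_message); hypothetical helper for illustration.

    An empty error message means the input was accepted. A target of None
    with no error means "let the game pick a random target".
    """
    text = raw.strip()
    if not text:
        return None, ""
    try:
        target = int(text)
    except ValueError:
        # Wording is an assumption; the commit surfaces the ValueError text.
        return None, f"❌ Error: {text!r} is not a whole number"
    if not 1 <= target <= dice_sides:
        return None, f"❌ Target value must be between 1 and {dice_sides}"
    return target, ""


accepted = parse_target("3", 6)
random_pick = parse_target("", 6)
out_of_range = parse_target("10", 6)
not_a_number = parse_target("invalid_number", 6)
```

Keeping the check separate from chart creation would let the range and parsing rules be tested without opening any figures, although the existing tests already cover the combined behaviour.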

ui/__init__.py ADDED

	@@ -0,0 +1 @@


+# UI package initialization

ui/gradio_interface.py ADDED

	@@ -0,0 +1,370 @@

+import gradio as gr
+import numpy as np
+import matplotlib.pyplot as plt
+from typing import Tuple
+from domains.coordination.game_coordination import BayesianGame, GamePhase
+class GradioInterface:
+    """Gradio interface for the Bayesian Game."""
+    def __init__(self):
+        """Initialize the Gradio interface."""
+        self.game = None
+        self.reset_game()
+    def reset_game(
+        self, dice_sides: int = 6, max_rounds: int = 10
+    ) -> Tuple[str, str, plt.Figure, str]:
+        """Reset the game with new parameters.
+        Args:
+            dice_sides: Number of sides on the dice
+            max_rounds: Maximum number of rounds
+        Returns:
+            Tuple of (status, round_info, belief_chart, game_log)
+        """
+        self.game = BayesianGame(dice_sides=dice_sides, max_rounds=max_rounds)
+        return self._get_interface_state()
+    def start_new_game(self, target_value: str = "") -> Tuple[str, str, plt.Figure, str]:
+        """Start a new game.
+        Args:
+            target_value: Optional specific target value
+        Returns:
+            Tuple of (status, round_info, belief_chart, game_log)
+        """
+        try:
+            target = int(target_value) if target_value.strip() else None
+            if target is not None and not (1 <= target <= self.game.dice_sides):
+                return (
+                    f"❌ Target value must be between 1 and {self.game.dice_sides}",
+                    "",
+                    self._create_empty_chart(),
+                    "",
+                )
+            self.game.start_new_game(target_value=target)
+            return self._get_interface_state()
+        except ValueError as e:
+            return f"❌ Error: {str(e)}", "", self._create_empty_chart(), ""
+    def play_round(self) -> Tuple[str, str, plt.Figure, str]:
+        """Play one round of the game.
+        Returns:
+            Tuple of (status, round_info, belief_chart, game_log)
+        """
+        try:
+            # Check if game is already finished - but still show the final state
+            if self.game.is_game_finished():
+                # Get the current final state but with a message about being finished
+                status, round_info, belief_chart, game_log = self._get_interface_state()
+                return (
+                    "🏁 Game completed! All rounds finished. Start a new game to play again.",
+                    round_info,
+                    belief_chart,
+                    game_log,
+                )
+            if self.game.game_state.phase != GamePhase.PLAYING:
+                return (
+                    "❌ Game not in playing phase. Start a new game first.",
+                    "",
+                    self._create_empty_chart(),
+                    "",
+                )
+            self.game.play_round()
+            return self._get_interface_state()
+        except ValueError as e:
+            return f"❌ Error: {str(e)}", "", self._create_empty_chart(), ""
+    def _get_interface_state(self) -> Tuple[str, str, plt.Figure, str]:
+        """Get current interface state.
+        Returns:
+            Tuple of (status, round_info, belief_chart, game_log)
+        """
+        state = self.game.get_current_state()
+        # Status message
+        if state.phase == GamePhase.SETUP:
+            status = "🎯 Ready to start new game"
+        elif state.phase == GamePhase.PLAYING:
+            status = f"🎲 Playing - Round {state.round_number}/{state.max_rounds}"
+        else:  # FINISHED
+            correct = "✅" if self.game.was_final_guess_correct() else "❌"
+            accuracy = self.game.get_final_guess_accuracy()
+            status = f"{correct} Game finished! Final guess: {state.most_likely_target} (True: {state.target_value}) - Accuracy: {accuracy:.2f}"
+        # Round information
+        if state.target_value is not None:
+            if state.phase == GamePhase.FINISHED:
+                # Show comprehensive final results
+                summary = self.game.get_game_summary()
+                final_correct = "✅ Correct!" if summary["guess_correct"] else "❌ Incorrect"
+                round_info = f"""
+**🏁 Final Game Results:**
+- True Target: {state.target_value}
+- Final Guess: {state.most_likely_target} {final_correct}
+- Final Accuracy: {summary["final_accuracy"]:.3f} (probability assigned to true target)
+- Final Entropy: {state.belief_entropy:.2f} bits
+- Rounds Played: {state.round_number}/{state.max_rounds}
+- Evidence Collected: {summary["evidence_count"]} pieces
+**📊 Learning Performance:**
+- Started with uniform beliefs (entropy: {np.log2(len(state.current_beliefs)):.2f} bits)
+- Ended with entropy: {state.belief_entropy:.2f} bits
+- Information gained: {np.log2(len(state.current_beliefs)) - state.belief_entropy:.2f} bits
+"""
+            else:
+                # Show current game state
+                round_info = f"""
+**Game Settings:**
+- Target Value: {state.target_value} (hidden from Player 2)
+- Most Likely Target: {state.most_likely_target}
+- Belief Entropy: {state.belief_entropy:.2f} bits
+- Round: {state.round_number}/{state.max_rounds}
+"""
+        else:
+            round_info = "Start a new game to see round information."
+        # Belief visualization
+        belief_chart = self._create_belief_chart()
+        # Game log
+        game_log = self._create_game_log()
+        return status, round_info, belief_chart, game_log
+    def _create_belief_chart(self) -> plt.Figure:
+        """Create belief distribution chart.
+        Returns:
+            Matplotlib figure showing belief distribution
+        """
+        # Close any existing figures to prevent memory leaks
+        plt.close('all')
+        fig, ax = plt.subplots(figsize=(10, 6))
+        if self.game.game_state.current_beliefs:
+            targets = list(range(1, len(self.game.game_state.current_beliefs) + 1))
+            beliefs = self.game.game_state.current_beliefs
+            bars = ax.bar(
+                targets, beliefs, alpha=0.7, color="skyblue", edgecolor="navy"
+            )
+            # Highlight the most likely target
+            if self.game.game_state.most_likely_target:
+                most_likely_idx = self.game.game_state.most_likely_target - 1
+                bars[most_likely_idx].set_color("orange")
+                bars[most_likely_idx].set_alpha(1.0)
+            # Highlight true target if known
+            if self.game.game_state.target_value:
+                true_target_idx = self.game.game_state.target_value - 1
+                bars[true_target_idx].set_edgecolor("red")
+                bars[true_target_idx].set_linewidth(3)
+            ax.set_xlabel("Target Value")
+            ax.set_ylabel("Belief Probability")
+            # Enhanced title based on game state
+            if self.game.game_state.phase == GamePhase.FINISHED:
+                correct_indicator = "✅" if self.game.was_final_guess_correct() else "❌"
+                ax.set_title(f"Final Belief Distribution {correct_indicator}")
+            else:
+                ax.set_title("Player 2's Belief Distribution")
+            ax.set_xticks(targets)
+            ax.set_ylim(0, 1)
+            ax.grid(True, alpha=0.3)
+            # Add legend
+            legend_elements = []
+            if self.game.game_state.most_likely_target:
+                legend_elements.append(
+                    plt.Rectangle(
+                        (0, 0), 1, 1, fc="orange", alpha=1.0, label="Most Likely"
+                    )
+                )
+            if self.game.game_state.target_value:
+                legend_elements.append(
+                    plt.Rectangle(
+                        (0, 0), 1, 1, fc="skyblue", ec="red", lw=3, label="True Target"
+                    )
+                )
+            if legend_elements:
+                ax.legend(handles=legend_elements)
+        else:
+            ax.text(
+                0.5,
+                0.5,
+                "Start a game to see beliefs",
+                transform=ax.transAxes,
+                ha="center",
+                va="center",
+                fontsize=14,
+            )
+            ax.set_xlim(0, 1)
+            ax.set_ylim(0, 1)
+        plt.tight_layout()
+        return fig
+    def _create_empty_chart(self) -> plt.Figure:
+        """Create an empty chart for error states.
+        Returns:
+            Matplotlib figure with error message
+        """
+        # Close any existing figures to prevent memory leaks
+        plt.close('all')
+        fig, ax = plt.subplots(figsize=(10, 6))
+        ax.text(
+            0.5,
+            0.5,
+            "Error: Unable to display chart",
+            transform=ax.transAxes,
+            ha="center",
+            va="center",
+            fontsize=14,
+            color="red"
+        )
+        ax.set_xlim(0, 1)
+        ax.set_ylim(0, 1)
+        ax.set_title("Chart Error")
+        plt.tight_layout()
+        return fig
+    def _create_game_log(self) -> str:
+        """Create game log showing evidence history.
+        Returns:
+            Formatted string with game log
+        """
+        if not self.game.game_state.evidence_history:
+            return "No evidence yet. Start playing rounds to see the log."
+        log_lines = ["**Evidence History:**\n"]
+        for i, evidence in enumerate(self.game.game_state.evidence_history, 1):
+            emoji = {"higher": "⬆️", "lower": "⬇️", "same": "🎯"}[
+                evidence.comparison_result
+            ]
+            log_lines.append(
+                f"Round {i}: Rolled {evidence.dice_roll} → {evidence.comparison_result} {emoji}"
+            )
+        # Add completion message if game is finished
+        if self.game.game_state.phase == GamePhase.FINISHED:
+            log_lines.append("")
+            log_lines.append("**🏁 Game Completed!**")
+            if self.game.was_final_guess_correct():
+                log_lines.append("🎉 **Congratulations!** Player 2 correctly identified the target!")
+            else:
+                log_lines.append("📈 **Learning opportunity!** Player 2's beliefs converged but missed the target.")
+            # Add some Bayesian insights
+            final_accuracy = self.game.get_final_guess_accuracy()
+            if final_accuracy > 0.5:
+                log_lines.append(f"🎯 Strong evidence: {final_accuracy:.1%} confidence in true target")
+            elif final_accuracy > 0.3:
+                log_lines.append(f"🤔 Moderate evidence: {final_accuracy:.1%} confidence in true target")
+            else:
+                log_lines.append(f"🌫️ Conflicting evidence: Only {final_accuracy:.1%} confidence in true target")
+        return "\n".join(log_lines)
+def create_interface() -> gr.Blocks:
+    """Create and return the Gradio interface.
+    Returns:
+        Configured Gradio Blocks app
+    """
+    interface = GradioInterface()
+    with gr.Blocks(title="Bayesian Game", theme=gr.themes.Soft()) as demo:
+        gr.Markdown("# 🎲 Bayesian Game")
+        gr.Markdown(
+            """
+        **Game Rules:**
+        - Judge and Player 1 can see the target die value
+        - Player 2 must deduce the target value using Bayesian inference
+        - Each round: Player 1 rolls dice and reports "higher"/"lower"/"same" compared to target
+        - Game runs for a specified number of rounds
+        """
+        )
+        with gr.Row():
+            with gr.Column(scale=1):
+                gr.Markdown("### Game Controls")
+                with gr.Row():
+                    dice_sides = gr.Number(
+                        value=6, label="Dice Sides", minimum=2, maximum=20, precision=0
+                    )
+                    max_rounds = gr.Number(
+                        value=10, label="Max Rounds", minimum=1, maximum=50, precision=0
+                    )
+                reset_btn = gr.Button("🔄 Reset Game", variant="secondary")
+                target_input = gr.Textbox(
+                    label="Target Value (optional)",
+                    placeholder="Leave empty for random target",
+                    max_lines=1,
+                )
+                start_btn = gr.Button("🎯 Start New Game", variant="primary")
+                play_btn = gr.Button("🎲 Play Round", variant="secondary")
+            with gr.Column(scale=2):
+                status_output = gr.Textbox(label="Game Status", interactive=False)
+                round_info = gr.Markdown("Start a new game to begin.")
+        with gr.Row():
+            with gr.Column():
+                belief_plot = gr.Plot(label="Belief Distribution")
+            with gr.Column():
+                game_log = gr.Markdown("Game log will appear here.")
+        # Event handlers
+        reset_btn.click(
+            interface.reset_game,
+            inputs=[dice_sides, max_rounds],
+            outputs=[status_output, round_info, belief_plot, game_log],
+        )
+        start_btn.click(
+            interface.start_new_game,
+            inputs=[target_input],
+            outputs=[status_output, round_info, belief_plot, game_log],
+        )
+        play_btn.click(
+            interface.play_round,
+            outputs=[status_output, round_info, belief_plot, game_log],
+        )
+        # Initialize interface
+        demo.load(
+            interface._get_interface_state,
+            outputs=[status_output, round_info, belief_plot, game_log],
+        )
+    return demo
+if __name__ == "__main__":
+    demo = create_interface()
+    demo.launch()
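
Review note on the "Learning Performance" block in `_get_interface_state`: it reports information gained as `log2(n)` minus the final belief entropy, where `n` is the number of possible targets. The sketch below reproduces that arithmetic with the standard library only. `entropy_bits` and `information_gained` are hypothetical names for this example, and the entropy function is the usual Shannon definition, assumed (not confirmed by this part of the diff) to match what `belief_entropy` holds.

```python
import math
from typing import Sequence


def entropy_bits(beliefs: Sequence[float]) -> float:
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(p * math.log2(p) for p in beliefs if p > 0)


def information_gained(beliefs: Sequence[float]) -> float:
    """Entropy of the uniform prior minus the entropy of the current beliefs."""
    return math.log2(len(beliefs)) - entropy_bits(beliefs)


uniform = [1 / 6] * 6
certain = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]

gain_uniform = information_gained(uniform)  # no evidence yet, so no gain
gain_certain = information_gained(certain)  # all of log2(6) bits recovered
```

For a six-sided die the gain is therefore bounded by `log2(6)`, roughly 2.58 bits, which is the figure the UI shows as the starting entropy.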