Spaces:

thompsonson
/

bayesian_game

Sleeping

thompsonson Claude commited on Jun 17, 2025

Commit

230696b

1 Parent(s): 0f5f162

feat: add multi-evidence extension with configurable evidence types

- Add Basic and Extended evidence type configuration
- Implement multi-evidence generation (e.g., ["lower", "half"])
- Update Bayesian inference for joint probability calculations
- Add UI dropdown for evidence type selection
- Maintain domain separation and architectural constraints
- Update all tests (83 passing) for new multi-evidence format

Basic Evidence: higher/lower/same
Extended Evidence: adds half/double for richer inference

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

Files changed (9) hide show

CLAUDE.md +40 -17
domains/belief/belief_domain.py +55 -15
domains/coordination/game_coordination.py +17 -5
domains/environment/environment_domain.py +35 -9
tests/test_architectural_constraints.py +18 -18
tests/test_belief_domain.py +24 -24
tests/test_environment_domain.py +114 -20
tests/test_game_coordination.py +7 -5
ui/gradio_interface.py +39 -10

CLAUDE.md CHANGED Viewed

@@ -6,28 +6,39 @@ A Bayesian Game implementation featuring a Belief-based Agent using domain-drive
 ## Game Rules
 - Judge and Player 1 can see the target die value
 - Player 2 must deduce the target value using only comparison results
-- Player 1 rolls dice and reports "higher"/"lower"/"same" compared to target
-- **CRITICAL**: Player 2 receives ONLY the comparison result, NOT the dice roll value
 - Game runs for 10 rounds
 - Judge ensures truth-telling
 ## Development Practices
 - Use conventional commits when committing code to git
 - Always use uv and the local venv
 ## Architecture
 Domain-Driven Design with 3 modules:
 1. **Environment Domain** (`domains/environment/environment_domain.py`)
-   - EnvironmentEvidence dataclass (contains dice_roll AND comparison_result)
-   - Environment class for target/evidence generation
    - **ACCESS**: Full knowledge of dice rolls and target values
 2. **Belief Domain** (`domains/belief/belief_domain.py`)
-   - BeliefUpdate dataclass (contains ONLY comparison_result)
-   - BayesianBeliefState class for inference
    - **ACCESS**: NO knowledge of dice roll values or true target
-   - **CONSTRAINT**: Must calculate P(comparison_result | target) probabilistically
 3. **Game Coordination** (`domains/coordination/game_coordination.py`)
    - GameState dataclass (tracks full game state)
@@ -63,11 +74,17 @@ bayesian_game/
 ## Key Design Decisions & Architectural Constraints
 ### Information Flow Rules
-1. **Environment → Coordination**: EnvironmentEvidence (dice_roll + comparison_result)
-2. **Coordination → Belief**: BeliefUpdate (comparison_result ONLY)
 3. **NEVER**: Direct Environment → Belief communication
 4. **NEVER**: Belief domain access to dice roll values
 ### Domain Separation Principles
 - **Environment Domain**: No probability knowledge, pure evidence generation
 - **Belief Domain**: Pure Bayesian inference, no knowledge of actual dice values
@@ -75,31 +92,37 @@ bayesian_game/
 - **UI Layer**: Separate from core game logic, can display full information
 ### Critical Implementation Rules
-- BeliefUpdate dataclass MUST contain only comparison_result
-- BayesianBeliefState MUST calculate P(comparison_result | target) probabilistically
 - Game coordination MUST filter dice_roll from EnvironmentEvidence before passing to belief domain
 - Tests MUST verify that belief domain never receives dice roll values
 ## Maintaining Architectural Integrity
 ### Code Review Checklist
 When modifying the codebase, ensure:
-- [ ] BeliefUpdate contains ONLY comparison_result field
 - [ ] No dice_roll parameter passed to belief domain methods
 - [ ] Game coordination filters EnvironmentEvidence properly
 - [ ] Tests verify belief domain isolation
-- [ ] Belief calculations use probabilistic formulas, not direct dice values
 ### Anti-Patterns to Avoid
-❌ `BeliefUpdate(dice_roll=X, comparison_result=Y)` - belief shouldn't know dice value
 ❌ Direct Environment-Belief communication
 ❌ Belief domain knowing actual dice roll or target values
-❌ Hard-coded probability values instead of calculated P(comparison_result | target)
 ### Correct Patterns
-✅ `BeliefUpdate(comparison_result="higher")` - only comparison result
 ✅ Environment → Coordination → Belief information flow
-✅ Probabilistic calculations: P(roll > target) = (dice_sides - target) / dice_sides
 ✅ Clean domain boundaries with no cross-dependencies
 ## Dependencies

 ## Game Rules
 - Judge and Player 1 can see the target die value
 - Player 2 must deduce the target value using only comparison results
+- Player 1 rolls dice and reports evidence based on selected evidence type
+- **CRITICAL**: Player 2 receives ONLY the evidence results, NOT the dice roll value
 - Game runs for 10 rounds
 - Judge ensures truth-telling
+### Evidence Types
+**Basic Evidence**: `["higher", "lower", "same"]`
+- Standard comparison between dice roll and target
+**Extended Evidence**: `["higher", "lower", "same", "half", "double"]`
+- Multiple evidence types can apply to single roll
+- "half": dice_roll = target/2 (exact integer matches only)
+- "double": dice_roll = target*2 (exact integer matches only)
+- Example: target=4, dice_roll=2 → evidence=`["lower", "half"]`
 ## Development Practices
 - Use conventional commits when committing code to git
 - Always use uv and the local venv
+- Always use the make file for devops-style tasks
 ## Architecture
 Domain-Driven Design with 3 modules:
 1. **Environment Domain** (`domains/environment/environment_domain.py`)
+   - EnvironmentEvidence dataclass (contains dice_roll AND comparison_results)
+   - Environment class for target/evidence generation with configurable evidence types
    - **ACCESS**: Full knowledge of dice rolls and target values
 2. **Belief Domain** (`domains/belief/belief_domain.py`)
+   - BeliefUpdate dataclass (contains ONLY comparison_results as List[str])
+   - BayesianBeliefState class for inference with multi-evidence support
    - **ACCESS**: NO knowledge of dice roll values or true target
+   - **CONSTRAINT**: Must calculate P(comparison_results | target) probabilistically for multiple evidence types
 3. **Game Coordination** (`domains/coordination/game_coordination.py`)
    - GameState dataclass (tracks full game state)
 ## Key Design Decisions & Architectural Constraints
 ### Information Flow Rules
+1. **Environment → Coordination**: EnvironmentEvidence (dice_roll + comparison_results)
+2. **Coordination → Belief**: BeliefUpdate (comparison_results ONLY as List[str])
 3. **NEVER**: Direct Environment → Belief communication
 4. **NEVER**: Belief domain access to dice roll values
+### Multi-Evidence Processing
+- Environment generates all applicable evidence types for each roll
+- Coordination filters dice_roll information before passing to belief domain
+- Belief domain calculates joint probabilities: P(comparison_results | target)
+- UI displays evidence configuration options (Basic vs Extended)
 ### Domain Separation Principles
 - **Environment Domain**: No probability knowledge, pure evidence generation
 - **Belief Domain**: Pure Bayesian inference, no knowledge of actual dice values
 - **UI Layer**: Separate from core game logic, can display full information
 ### Critical Implementation Rules
+- BeliefUpdate dataclass MUST contain only comparison_results as List[str]
+- BayesianBeliefState MUST calculate P(comparison_results | target) probabilistically for multi-evidence
 - Game coordination MUST filter dice_roll from EnvironmentEvidence before passing to belief domain
 - Tests MUST verify that belief domain never receives dice roll values
+- Evidence type configuration MUST be passed through coordination layer, not directly to belief domain
 ## Maintaining Architectural Integrity
 ### Code Review Checklist
 When modifying the codebase, ensure:
+- [ ] BeliefUpdate contains ONLY comparison_results field (List[str])
 - [ ] No dice_roll parameter passed to belief domain methods
 - [ ] Game coordination filters EnvironmentEvidence properly
 - [ ] Tests verify belief domain isolation
+- [ ] Belief calculations use probabilistic formulas for multi-evidence: P(comparison_results | target)
+- [ ] Evidence type configuration flows through coordination layer
+- [ ] UI evidence type selection properly configures game behavior
 ### Anti-Patterns to Avoid
+❌ `BeliefUpdate(dice_roll=X, comparison_results=Y)` - belief shouldn't know dice value
 ❌ Direct Environment-Belief communication
 ❌ Belief domain knowing actual dice roll or target values
+❌ Hard-coded probability values instead of calculated P(comparison_results | target)
+❌ Passing evidence type configuration directly to belief domain
 ### Correct Patterns
+✅ `BeliefUpdate(comparison_results=["lower", "half"])` - only evidence results
 ✅ Environment → Coordination → Belief information flow
+✅ Probabilistic calculations for multi-evidence: P(comparison_results | target)
+✅ Evidence type configuration handled in coordination layer
+✅ Joint probability calculations: P(["lower", "half"] | target) = P(dice_roll=target/2 AND dice_roll<target)
 ✅ Clean domain boundaries with no cross-dependencies
 ## Dependencies

domains/belief/belief_domain.py CHANGED Viewed

@@ -1,5 +1,4 @@
 from dataclasses import dataclass
-from typing import Literal
 import numpy as np
@@ -8,7 +7,7 @@ import numpy as np
 class BeliefUpdate:
     """Update information for Bayesian belief state."""
-    comparison_result: Literal["higher", "lower", "same"]
 class BayesianBeliefState:
@@ -65,7 +64,7 @@ class BayesianBeliefState:
         """
         self.evidence_history.append(evidence)
-        comparison_result = evidence.comparison_result
         # Calculate likelihood for each possible target value
         likelihoods = np.zeros(self.dice_sides)
@@ -73,18 +72,11 @@ class BayesianBeliefState:
         for target_idx in range(self.dice_sides):
             target_value = target_idx + 1
-            # Calculate P(comparison_result | target_value)
-            # This is the probability that ANY dice roll would produce this comparison result
-            if comparison_result == "higher":
-                # P(roll > target) = (dice_sides - target) / dice_sides
-                likelihood = (self.dice_sides - target_value) / self.dice_sides
-            elif comparison_result == "lower":
-                # P(roll < target) = (target - 1) / dice_sides
-                likelihood = (target_value - 1) / self.dice_sides
-            else:  # comparison_result == "same"
-                # P(roll = target) = 1 / dice_sides
-                likelihood = 1 / self.dice_sides
             likelihoods[target_idx] = likelihood
         # Apply Bayes' rule: posterior ∝ prior * likelihood
@@ -99,6 +91,54 @@ class BayesianBeliefState:
             # reset to uniform distribution
             self.beliefs = np.ones(self.dice_sides) / self.dice_sides
     def reset_beliefs(self) -> None:
         """Reset beliefs to uniform prior and clear evidence history."""
         self.beliefs = np.ones(self.dice_sides) / self.dice_sides

 from dataclasses import dataclass
 import numpy as np
 class BeliefUpdate:
     """Update information for Bayesian belief state."""
+    comparison_results: list[str]
 class BayesianBeliefState:
         """
         self.evidence_history.append(evidence)
+        comparison_results = evidence.comparison_results
         # Calculate likelihood for each possible target value
         likelihoods = np.zeros(self.dice_sides)
         for target_idx in range(self.dice_sides):
             target_value = target_idx + 1
+            # Calculate P(comparison_results | target_value)
+            # This is the joint probability that a dice roll would produce ALL these evidence types
+            likelihood = self._calculate_joint_likelihood(
+                comparison_results, target_value
+            )
             likelihoods[target_idx] = likelihood
         # Apply Bayes' rule: posterior ∝ prior * likelihood
             # reset to uniform distribution
             self.beliefs = np.ones(self.dice_sides) / self.dice_sides
+    def _calculate_joint_likelihood(
+        self, comparison_results: list[str], target_value: int
+    ) -> float:
+        """Calculate P(comparison_results | target_value) for multiple evidence types.
+        Args:
+            comparison_results: List of evidence results (e.g., ["lower", "half"])
+            target_value: Target value to calculate likelihood for
+        Returns:
+            Joint probability of observing all evidence types given the target
+        """
+        # For multiple evidence types from a single roll, we need to find
+        # the probability that a single dice roll satisfies ALL conditions
+        # Count dice rolls that satisfy all evidence conditions
+        satisfying_rolls = 0
+        for dice_roll in range(1, self.dice_sides + 1):
+            satisfies_all = True
+            for evidence in comparison_results:
+                if (
+                    (evidence == "higher" and not (dice_roll > target_value))
+                    or (evidence == "lower" and not (dice_roll < target_value))
+                    or (evidence == "same" and dice_roll != target_value)
+                    or (
+                        evidence == "half"
+                        and not (
+                            target_value % 2 == 0 and dice_roll == target_value // 2
+                        )
+                    )
+                    or (
+                        evidence == "double"
+                        and not (
+                            dice_roll == target_value * 2
+                            and dice_roll <= self.dice_sides
+                        )
+                    )
+                ):
+                    satisfies_all = False
+                    break
+            if satisfies_all:
+                satisfying_rolls += 1
+        return satisfying_rolls / self.dice_sides
     def reset_beliefs(self) -> None:
         """Reset beliefs to uniform prior and clear evidence history."""
         self.beliefs = np.ones(self.dice_sides) / self.dice_sides

domains/coordination/game_coordination.py CHANGED Viewed

@@ -3,7 +3,11 @@ from enum import Enum
 from typing import Any
 from ..belief.belief_domain import BayesianBeliefState, BeliefUpdate
-from ..environment.environment_domain import Environment, EnvironmentEvidence
 class GamePhase(Enum):
@@ -42,20 +46,28 @@ class BayesianGame:
     """
     def __init__(
-        self, dice_sides: int = 6, max_rounds: int = 10, seed: int | None = None
     ):
         """Initialize the Bayesian Game.
         Args:
             dice_sides: Number of sides on the dice
             max_rounds: Maximum number of rounds to play
             seed: Random seed for reproducible results
         """
         self.dice_sides = dice_sides
         self.max_rounds = max_rounds
         # Initialize domains
-        self.environment = Environment(dice_sides=dice_sides, seed=seed)
         self.belief_state = BayesianBeliefState(dice_sides=dice_sides)
         # Initialize game state
@@ -113,8 +125,8 @@ class BayesianGame:
         # Generate evidence from environment
         evidence = self.environment.roll_dice_and_compare()
-        # Update belief state (only pass comparison result, not dice roll)
-        belief_update = BeliefUpdate(comparison_result=evidence.comparison_result)
         self.belief_state.update_beliefs(belief_update)
         # Update game state

 from typing import Any
 from ..belief.belief_domain import BayesianBeliefState, BeliefUpdate
+from ..environment.environment_domain import (
+    Environment,
+    EnvironmentEvidence,
+    EvidenceType,
+)
 class GamePhase(Enum):
     """
     def __init__(
+        self,
+        dice_sides: int = 6,
+        max_rounds: int = 10,
+        evidence_type: EvidenceType = EvidenceType.BASIC,
+        seed: int | None = None,
     ):
         """Initialize the Bayesian Game.
         Args:
             dice_sides: Number of sides on the dice
             max_rounds: Maximum number of rounds to play
+            evidence_type: Type of evidence to generate (basic or extended)
             seed: Random seed for reproducible results
         """
         self.dice_sides = dice_sides
         self.max_rounds = max_rounds
+        self.evidence_type = evidence_type
         # Initialize domains
+        self.environment = Environment(
+            dice_sides=dice_sides, evidence_type=evidence_type, seed=seed
+        )
         self.belief_state = BayesianBeliefState(dice_sides=dice_sides)
         # Initialize game state
         # Generate evidence from environment
         evidence = self.environment.roll_dice_and_compare()
+        # Update belief state (only pass comparison results, not dice roll)
+        belief_update = BeliefUpdate(comparison_results=evidence.comparison_results)
         self.belief_state.update_beliefs(belief_update)
         # Update game state

domains/environment/environment_domain.py CHANGED Viewed

@@ -1,14 +1,21 @@
 import random
 from dataclasses import dataclass
-from typing import Literal
 @dataclass
 class EnvironmentEvidence:
-    """Evidence generated by the environment - dice roll and comparison result."""
     dice_roll: int
-    comparison_result: Literal["higher", "lower", "same"]
 class Environment:
@@ -17,14 +24,21 @@ class Environment:
     Has no knowledge of probabilities - purely generates observable evidence.
     """
-    def __init__(self, dice_sides: int = 6, seed: int | None = None):
         """Initialize environment with dice configuration.
         Args:
             dice_sides: Number of sides on the dice (default 6)
             seed: Random seed for reproducible results
         """
         self.dice_sides = dice_sides
         self._random_state = (
             random.Random(seed) if seed is not None else random.Random()
         )
@@ -67,7 +81,7 @@ class Environment:
         """Roll dice and compare to target, generating evidence.
         Returns:
-            EnvironmentEvidence with dice roll and comparison result
         Raises:
             ValueError: If target value hasn't been set
@@ -76,14 +90,26 @@ class Environment:
             raise ValueError("Target value not set")
         dice_roll = self._random_state.randint(1, self.dice_sides)
         if dice_roll > self._target_value:
-            comparison_result = "higher"
         elif dice_roll < self._target_value:
-            comparison_result = "lower"
         else:
-            comparison_result = "same"
         return EnvironmentEvidence(
-            dice_roll=dice_roll, comparison_result=comparison_result
         )

 import random
 from dataclasses import dataclass
+from enum import Enum
+class EvidenceType(Enum):
+    """Types of evidence that can be generated."""
+    BASIC = "basic"
+    EXTENDED = "extended"
 @dataclass
 class EnvironmentEvidence:
+    """Evidence generated by the environment - dice roll and comparison results."""
     dice_roll: int
+    comparison_results: list[str]
 class Environment:
     Has no knowledge of probabilities - purely generates observable evidence.
     """
+    def __init__(
+        self,
+        dice_sides: int = 6,
+        evidence_type: EvidenceType = EvidenceType.BASIC,
+        seed: int | None = None,
+    ):
         """Initialize environment with dice configuration.
         Args:
             dice_sides: Number of sides on the dice (default 6)
+            evidence_type: Type of evidence to generate (basic or extended)
             seed: Random seed for reproducible results
         """
         self.dice_sides = dice_sides
+        self.evidence_type = evidence_type
         self._random_state = (
             random.Random(seed) if seed is not None else random.Random()
         )
         """Roll dice and compare to target, generating evidence.
         Returns:
+            EnvironmentEvidence with dice roll and comparison results
         Raises:
             ValueError: If target value hasn't been set
             raise ValueError("Target value not set")
         dice_roll = self._random_state.randint(1, self.dice_sides)
+        comparison_results = []
+        # Basic evidence: higher/lower/same
         if dice_roll > self._target_value:
+            comparison_results.append("higher")
         elif dice_roll < self._target_value:
+            comparison_results.append("lower")
         else:
+            comparison_results.append("same")
+        # Extended evidence: half/double (only for extended type)
+        if self.evidence_type == EvidenceType.EXTENDED:
+            # Check for "half" - dice_roll = target/2 (exact integer only)
+            if self._target_value % 2 == 0 and dice_roll == self._target_value // 2:
+                comparison_results.append("half")
+            # Check for "double" - dice_roll = target*2 (within dice range)
+            if dice_roll == self._target_value * 2 and dice_roll <= self.dice_sides:
+                comparison_results.append("double")
         return EnvironmentEvidence(
+            dice_roll=dice_roll, comparison_results=comparison_results
         )

tests/test_architectural_constraints.py CHANGED Viewed

@@ -20,24 +20,24 @@ class TestArchitecturalConstraints:
     """Test architectural constraints and domain separation."""
     def test_belief_update_dataclass_structure(self):
-        """Test that BeliefUpdate contains only comparison_result field."""
         # Get all fields of BeliefUpdate
         fields = BeliefUpdate.__dataclass_fields__
-        # Should only contain comparison_result
         assert len(fields) == 1, (
             f"BeliefUpdate should have exactly 1 field, got {len(fields)}: "
             f"{list(fields.keys())}"
         )
-        assert "comparison_result" in fields, (
-            "BeliefUpdate must contain comparison_result field"
         )
         assert "dice_roll" not in fields, (
             "BeliefUpdate MUST NOT contain dice_roll field"
         )
     def test_environment_evidence_dataclass_structure(self):
-        """Test that EnvironmentEvidence contains both dice_roll and comparison_result."""
         # Get all fields of EnvironmentEvidence
         fields = EnvironmentEvidence.__dataclass_fields__
@@ -47,8 +47,8 @@ class TestArchitecturalConstraints:
             f"{list(fields.keys())}"
         )
         assert "dice_roll" in fields, "EnvironmentEvidence must contain dice_roll field"
-        assert "comparison_result" in fields, (
-            "EnvironmentEvidence must contain comparison_result field"
         )
     def test_belief_state_methods_no_dice_roll_parameters(self):
@@ -69,14 +69,14 @@ class TestArchitecturalConstraints:
     def test_belief_update_creation_without_dice_roll(self):
         """Test that BeliefUpdate can be created without dice_roll."""
-        # This should work (only comparison_result)
-        update = BeliefUpdate(comparison_result="higher")
-        assert update.comparison_result == "higher"
         # This should fail if dice_roll field exists
         try:
             # This should raise TypeError if dice_roll is not a field
-            BeliefUpdate(dice_roll=3, comparison_result="higher")
             pytest.fail("BeliefUpdate should not accept dice_roll parameter")
         except TypeError:
             pass  # Expected - dice_roll should not be a valid parameter
@@ -100,8 +100,8 @@ class TestArchitecturalConstraints:
         # Verify that evidence history in belief domain contains only comparison results
         for evidence in game.belief_state.evidence_history:
-            assert hasattr(evidence, "comparison_result"), (
-                "Belief evidence must have comparison_result"
             )
             assert not hasattr(evidence, "dice_roll"), (
                 "Belief evidence MUST NOT have dice_roll"
@@ -127,7 +127,7 @@ class TestArchitecturalConstraints:
         belief_state = BayesianBeliefState(dice_sides=6)
         # Apply "higher" evidence
-        update = BeliefUpdate(comparison_result="higher")
         belief_state.update_beliefs(update)
         # Verify that probabilities follow expected pattern for "higher"
@@ -153,14 +153,14 @@ class TestArchitecturalConstraints:
         assert hasattr(state.evidence_history[0], "dice_roll"), (
             "Game state should maintain full evidence for display"
         )
-        assert hasattr(state.evidence_history[0], "comparison_result"), (
             "Game state should maintain comparison results"
         )
         # But belief state should only have comparison results
         belief_evidence = game.belief_state.evidence_history[0]
-        assert hasattr(belief_evidence, "comparison_result"), (
-            "Belief evidence must have comparison_result"
         )
         assert not hasattr(belief_evidence, "dice_roll"), (
             "Belief evidence MUST NOT have dice_roll"
@@ -173,7 +173,7 @@ class TestArchitecturalConstraints:
             belief_state = BayesianBeliefState(dice_sides=dice_sides)
             # Apply "higher" evidence
-            update = BeliefUpdate(comparison_result="higher")
             belief_state.update_beliefs(update)
             # Target 1 should have highest probability: P(roll > 1) = (dice_sides - 1) / dice_sides

     """Test architectural constraints and domain separation."""
     def test_belief_update_dataclass_structure(self):
+        """Test that BeliefUpdate contains only comparison_results field."""
         # Get all fields of BeliefUpdate
         fields = BeliefUpdate.__dataclass_fields__
+        # Should only contain comparison_results
         assert len(fields) == 1, (
             f"BeliefUpdate should have exactly 1 field, got {len(fields)}: "
             f"{list(fields.keys())}"
         )
+        assert "comparison_results" in fields, (
+            "BeliefUpdate must contain comparison_results field"
         )
         assert "dice_roll" not in fields, (
             "BeliefUpdate MUST NOT contain dice_roll field"
         )
     def test_environment_evidence_dataclass_structure(self):
+        """Test that EnvironmentEvidence contains both dice_roll and comparison_results."""
         # Get all fields of EnvironmentEvidence
         fields = EnvironmentEvidence.__dataclass_fields__
             f"{list(fields.keys())}"
         )
         assert "dice_roll" in fields, "EnvironmentEvidence must contain dice_roll field"
+        assert "comparison_results" in fields, (
+            "EnvironmentEvidence must contain comparison_results field"
         )
     def test_belief_state_methods_no_dice_roll_parameters(self):
     def test_belief_update_creation_without_dice_roll(self):
         """Test that BeliefUpdate can be created without dice_roll."""
+        # This should work (only comparison_results)
+        update = BeliefUpdate(comparison_results=["higher"])
+        assert update.comparison_results == ["higher"]
         # This should fail if dice_roll field exists
         try:
             # This should raise TypeError if dice_roll is not a field
+            BeliefUpdate(dice_roll=3, comparison_results=["higher"])
             pytest.fail("BeliefUpdate should not accept dice_roll parameter")
         except TypeError:
             pass  # Expected - dice_roll should not be a valid parameter
         # Verify that evidence history in belief domain contains only comparison results
         for evidence in game.belief_state.evidence_history:
+            assert hasattr(evidence, "comparison_results"), (
+                "Belief evidence must have comparison_results"
             )
             assert not hasattr(evidence, "dice_roll"), (
                 "Belief evidence MUST NOT have dice_roll"
         belief_state = BayesianBeliefState(dice_sides=6)
         # Apply "higher" evidence
+        update = BeliefUpdate(comparison_results=["higher"])
         belief_state.update_beliefs(update)
         # Verify that probabilities follow expected pattern for "higher"
         assert hasattr(state.evidence_history[0], "dice_roll"), (
             "Game state should maintain full evidence for display"
         )
+        assert hasattr(state.evidence_history[0], "comparison_results"), (
             "Game state should maintain comparison results"
         )
         # But belief state should only have comparison results
         belief_evidence = game.belief_state.evidence_history[0]
+        assert hasattr(belief_evidence, "comparison_results"), (
+            "Belief evidence must have comparison_results"
         )
         assert not hasattr(belief_evidence, "dice_roll"), (
             "Belief evidence MUST NOT have dice_roll"
             belief_state = BayesianBeliefState(dice_sides=dice_sides)
             # Apply "higher" evidence
+            update = BeliefUpdate(comparison_results=["higher"])
             belief_state.update_beliefs(update)
             # Target 1 should have highest probability: P(roll > 1) = (dice_sides - 1) / dice_sides

tests/test_belief_domain.py CHANGED Viewed

@@ -9,15 +9,15 @@ class TestBeliefUpdate:
     def test_belief_update_creation(self):
         """Test creating belief update with valid data."""
-        update = BeliefUpdate(comparison_result="higher")
-        assert update.comparison_result == "higher"
     def test_belief_update_all_results(self):
         """Test belief update with all comparison results."""
         valid_results = ["higher", "lower", "same"]
         for result in valid_results:
-            update = BeliefUpdate(comparison_result=result)
-            assert update.comparison_result == result
 class TestBayesianBeliefState:
@@ -63,7 +63,7 @@ class TestBayesianBeliefState:
         belief_state = BayesianBeliefState(dice_sides=6)
         # Update with evidence that favors lower target values
-        update = BeliefUpdate(comparison_result="higher")
         belief_state.update_beliefs(update)
         # Lower targets are more likely to result in "higher" comparison
@@ -93,7 +93,7 @@ class TestBayesianBeliefState:
         # Evidence: comparison result is "higher" (dice roll > target)
         # This is more likely for lower target values
-        update = BeliefUpdate(comparison_result="higher")
         belief_state.update_beliefs(update)
         # Lower targets should have higher probability than higher targets
@@ -111,7 +111,7 @@ class TestBayesianBeliefState:
         # Evidence: comparison result is "lower" (dice roll < target)
         # This is more likely for higher target values
-        update = BeliefUpdate(comparison_result="lower")
         belief_state.update_beliefs(update)
         # Higher targets should have higher probability than lower targets
@@ -129,7 +129,7 @@ class TestBayesianBeliefState:
         # Evidence: comparison result is "same" (dice roll = target)
         # This has equal probability for all targets: P(roll = target) = 1/6
-        update = BeliefUpdate(comparison_result="same")
         belief_state.update_beliefs(update)
         # All targets should have equal probability since P(roll = target) = 1/6 for all
@@ -142,11 +142,11 @@ class TestBayesianBeliefState:
         belief_state = BayesianBeliefState(dice_sides=6)
         # First update: "higher" (favors lower targets)
-        update1 = BeliefUpdate(comparison_result="higher")
         belief_state.update_beliefs(update1)
         # Second update: "lower" (favors higher targets)
-        update2 = BeliefUpdate(comparison_result="lower")
         belief_state.update_beliefs(update2)
         # The combination should favor middle targets
@@ -167,9 +167,9 @@ class TestBayesianBeliefState:
         belief_state = BayesianBeliefState(dice_sides=6)
         updates = [
-            BeliefUpdate(comparison_result="higher"),
-            BeliefUpdate(comparison_result="lower"),
-            BeliefUpdate(comparison_result="same"),
         ]
         for update in updates:
@@ -183,7 +183,7 @@ class TestBayesianBeliefState:
         belief_state = BayesianBeliefState(dice_sides=6)
         # Update beliefs
-        update = BeliefUpdate(comparison_result="higher")
         belief_state.update_beliefs(update)
         # Verify beliefs changed from uniform
@@ -215,7 +215,7 @@ class TestBayesianBeliefState:
         # Create a near-certain belief by applying many "higher" updates
         # This will eventually make target 1 much more likely than others
         for _ in range(10):
-            update = BeliefUpdate(comparison_result="higher")
             belief_state.update_beliefs(update)
         entropy = belief_state.get_entropy()
@@ -227,7 +227,7 @@ class TestBayesianBeliefState:
         belief_state = BayesianBeliefState(dice_sides=6)
         # Reduce uncertainty but don't eliminate it
-        update = BeliefUpdate(comparison_result="higher")
         belief_state.update_beliefs(update)
         entropy = belief_state.get_entropy()
@@ -245,8 +245,8 @@ class TestBayesianBeliefState:
         # Add some evidence
         updates = [
-            BeliefUpdate(comparison_result="higher"),
-            BeliefUpdate(comparison_result="lower"),
         ]
         for i, update in enumerate(updates, 1):
@@ -258,10 +258,10 @@ class TestBayesianBeliefState:
         belief_state = BayesianBeliefState(dice_sides=6)
         updates = [
-            BeliefUpdate(comparison_result="higher"),
-            BeliefUpdate(comparison_result="lower"),
-            BeliefUpdate(comparison_result="same"),
-            BeliefUpdate(comparison_result="higher"),
         ]
         # Check initial sum
@@ -278,7 +278,7 @@ class TestBayesianBeliefState:
         # Apply a few "higher" results to favor lower targets
         for _ in range(3):
-            update1 = BeliefUpdate(comparison_result="higher")
             belief_state.update_beliefs(update1)
         # Target 1 should be favored, target 6 should have zero probability
@@ -289,7 +289,7 @@ class TestBayesianBeliefState:
         assert abs(prob_6 - 0.0) < 1e-10  # Target 6 should have zero probability
         # Apply more evidence and verify probabilities still sum to 1
-        update2 = BeliefUpdate(comparison_result="lower")
         belief_state.update_beliefs(update2)
         total_prob = sum(belief_state.get_belief_for_target(i) for i in range(1, 7))

     def test_belief_update_creation(self):
         """Test creating belief update with valid data."""
+        update = BeliefUpdate(comparison_results=["higher"])
+        assert update.comparison_results == ["higher"]
     def test_belief_update_all_results(self):
         """Test belief update with all comparison results."""
         valid_results = ["higher", "lower", "same"]
         for result in valid_results:
+            update = BeliefUpdate(comparison_results=[result])
+            assert update.comparison_results == [result]
 class TestBayesianBeliefState:
         belief_state = BayesianBeliefState(dice_sides=6)
         # Update with evidence that favors lower target values
+        update = BeliefUpdate(comparison_results=["higher"])
         belief_state.update_beliefs(update)
         # Lower targets are more likely to result in "higher" comparison
         # Evidence: comparison result is "higher" (dice roll > target)
         # This is more likely for lower target values
+        update = BeliefUpdate(comparison_results=["higher"])
         belief_state.update_beliefs(update)
         # Lower targets should have higher probability than higher targets
         # Evidence: comparison result is "lower" (dice roll < target)
         # This is more likely for higher target values
+        update = BeliefUpdate(comparison_results=["lower"])
         belief_state.update_beliefs(update)
         # Higher targets should have higher probability than lower targets
         # Evidence: comparison result is "same" (dice roll = target)
         # This has equal probability for all targets: P(roll = target) = 1/6
+        update = BeliefUpdate(comparison_results=["same"])
         belief_state.update_beliefs(update)
         # All targets should have equal probability since P(roll = target) = 1/6 for all
         belief_state = BayesianBeliefState(dice_sides=6)
         # First update: "higher" (favors lower targets)
+        update1 = BeliefUpdate(comparison_results=["higher"])
         belief_state.update_beliefs(update1)
         # Second update: "lower" (favors higher targets)
+        update2 = BeliefUpdate(comparison_results=["lower"])
         belief_state.update_beliefs(update2)
         # The combination should favor middle targets
         belief_state = BayesianBeliefState(dice_sides=6)
         updates = [
+            BeliefUpdate(comparison_results=["higher"]),
+            BeliefUpdate(comparison_results=["lower"]),
+            BeliefUpdate(comparison_results=["same"]),
         ]
         for update in updates:
         belief_state = BayesianBeliefState(dice_sides=6)
         # Update beliefs
+        update = BeliefUpdate(comparison_results=["higher"])
         belief_state.update_beliefs(update)
         # Verify beliefs changed from uniform
         # Create a near-certain belief by applying many "higher" updates
         # This will eventually make target 1 much more likely than others
         for _ in range(10):
+            update = BeliefUpdate(comparison_results=["higher"])
             belief_state.update_beliefs(update)
         entropy = belief_state.get_entropy()
         belief_state = BayesianBeliefState(dice_sides=6)
         # Reduce uncertainty but don't eliminate it
+        update = BeliefUpdate(comparison_results=["higher"])
         belief_state.update_beliefs(update)
         entropy = belief_state.get_entropy()
         # Add some evidence
         updates = [
+            BeliefUpdate(comparison_results=["higher"]),
+            BeliefUpdate(comparison_results=["lower"]),
         ]
         for i, update in enumerate(updates, 1):
         belief_state = BayesianBeliefState(dice_sides=6)
         updates = [
+            BeliefUpdate(comparison_results=["higher"]),
+            BeliefUpdate(comparison_results=["lower"]),
+            BeliefUpdate(comparison_results=["same"]),
+            BeliefUpdate(comparison_results=["higher"]),
         ]
         # Check initial sum
         # Apply a few "higher" results to favor lower targets
         for _ in range(3):
+            update1 = BeliefUpdate(comparison_results=["higher"])
             belief_state.update_beliefs(update1)
         # Target 1 should be favored, target 6 should have zero probability
         assert abs(prob_6 - 0.0) < 1e-10  # Target 6 should have zero probability
         # Apply more evidence and verify probabilities still sum to 1
+        update2 = BeliefUpdate(comparison_results=["lower"])
         belief_state.update_beliefs(update2)
         total_prob = sum(belief_state.get_belief_for_target(i) for i in range(1, 7))

tests/test_environment_domain.py CHANGED Viewed

@@ -1,6 +1,10 @@
 import pytest
-from domains.environment.environment_domain import Environment, EnvironmentEvidence
 class TestEnvironmentEvidence:
@@ -8,16 +12,26 @@ class TestEnvironmentEvidence:
     def test_evidence_creation(self):
         """Test creating evidence with valid data."""
-        evidence = EnvironmentEvidence(dice_roll=3, comparison_result="higher")
         assert evidence.dice_roll == 3
-        assert evidence.comparison_result == "higher"
     def test_evidence_comparison_results(self):
         """Test all valid comparison results."""
         valid_results = ["higher", "lower", "same"]
         for result in valid_results:
-            evidence = EnvironmentEvidence(dice_roll=1, comparison_result=result)
-            assert evidence.comparison_result == result
 class TestEnvironment:
@@ -28,11 +42,13 @@ class TestEnvironment:
         # Default initialization
         env = Environment()
         assert env.dice_sides == 6
         assert env._target_value is None
         # Custom initialization
-        env = Environment(dice_sides=8, seed=42)
         assert env.dice_sides == 8
         assert env._target_value is None
     def test_set_target_value_valid(self):
@@ -103,11 +119,11 @@ class TestEnvironment:
             assert 1 <= evidence.dice_roll <= 6
             if evidence.dice_roll > 1:
-                assert evidence.comparison_result == "higher"
             elif evidence.dice_roll < 1:
-                assert evidence.comparison_result == "lower"
             else:
-                assert evidence.comparison_result == "same"
     def test_roll_dice_and_compare_lower(self):
         """Test dice roll comparison when result is lower."""
@@ -120,11 +136,11 @@ class TestEnvironment:
             assert 1 <= evidence.dice_roll <= 6
             if evidence.dice_roll > 6:
-                assert evidence.comparison_result == "higher"
             elif evidence.dice_roll < 6:
-                assert evidence.comparison_result == "lower"
             else:
-                assert evidence.comparison_result == "same"
     def test_roll_dice_and_compare_same(self):
         """Test dice roll comparison when result is same."""
@@ -140,13 +156,13 @@ class TestEnvironment:
                 evidence = env.roll_dice_and_compare()
                 if evidence.dice_roll == target:
-                    assert evidence.comparison_result == "same"
                     found_same = True
                     break
                 elif evidence.dice_roll > target:
-                    assert evidence.comparison_result == "higher"
                 else:
-                    assert evidence.comparison_result == "lower"
             # With 100 attempts, we should find at least one match for 6-sided die
             assert found_same, f"Failed to roll target value {target} in 100 attempts"
@@ -161,15 +177,17 @@ class TestEnvironment:
         # Roll many times to see all outcomes
         for _ in range(100):
             evidence = env.roll_dice_and_compare()
-            outcomes_seen.add(evidence.comparison_result)
             # Verify consistency
             if evidence.dice_roll > 3:
-                assert evidence.comparison_result == "higher"
             elif evidence.dice_roll < 3:
-                assert evidence.comparison_result == "lower"
             else:
-                assert evidence.comparison_result == "same"
         # Should see all three outcomes with enough rolls
         assert "higher" in outcomes_seen
@@ -184,4 +202,80 @@ class TestEnvironment:
             evidence = env.roll_dice_and_compare()
             assert 1 <= evidence.dice_roll <= sides
-            assert evidence.comparison_result in ["higher", "lower", "same"]

 import pytest
+from domains.environment.environment_domain import (
+    Environment,
+    EnvironmentEvidence,
+    EvidenceType,
+)
 class TestEnvironmentEvidence:
     def test_evidence_creation(self):
         """Test creating evidence with valid data."""
+        evidence = EnvironmentEvidence(dice_roll=3, comparison_results=["higher"])
         assert evidence.dice_roll == 3
+        assert evidence.comparison_results == ["higher"]
     def test_evidence_comparison_results(self):
         """Test all valid comparison results."""
         valid_results = ["higher", "lower", "same"]
         for result in valid_results:
+            evidence = EnvironmentEvidence(dice_roll=1, comparison_results=[result])
+            assert evidence.comparison_results == [result]
+    def test_evidence_multiple_comparison_results(self):
+        """Test evidence with multiple comparison results."""
+        evidence = EnvironmentEvidence(
+            dice_roll=3, comparison_results=["higher", "double"]
+        )
+        assert evidence.dice_roll == 3
+        assert evidence.comparison_results == ["higher", "double"]
+        assert "higher" in evidence.comparison_results
+        assert "double" in evidence.comparison_results
 class TestEnvironment:
         # Default initialization
         env = Environment()
         assert env.dice_sides == 6
+        assert env.evidence_type == EvidenceType.BASIC
         assert env._target_value is None
         # Custom initialization
+        env = Environment(dice_sides=8, evidence_type=EvidenceType.EXTENDED, seed=42)
         assert env.dice_sides == 8
+        assert env.evidence_type == EvidenceType.EXTENDED
         assert env._target_value is None
     def test_set_target_value_valid(self):
             assert 1 <= evidence.dice_roll <= 6
             if evidence.dice_roll > 1:
+                assert "higher" in evidence.comparison_results
             elif evidence.dice_roll < 1:
+                assert "lower" in evidence.comparison_results
             else:
+                assert "same" in evidence.comparison_results
     def test_roll_dice_and_compare_lower(self):
         """Test dice roll comparison when result is lower."""
             assert 1 <= evidence.dice_roll <= 6
             if evidence.dice_roll > 6:
+                assert "higher" in evidence.comparison_results
             elif evidence.dice_roll < 6:
+                assert "lower" in evidence.comparison_results
             else:
+                assert "same" in evidence.comparison_results
     def test_roll_dice_and_compare_same(self):
         """Test dice roll comparison when result is same."""
                 evidence = env.roll_dice_and_compare()
                 if evidence.dice_roll == target:
+                    assert "same" in evidence.comparison_results
                     found_same = True
                     break
                 elif evidence.dice_roll > target:
+                    assert "higher" in evidence.comparison_results
                 else:
+                    assert "lower" in evidence.comparison_results
             # With 100 attempts, we should find at least one match for 6-sided die
             assert found_same, f"Failed to roll target value {target} in 100 attempts"
         # Roll many times to see all outcomes
         for _ in range(100):
             evidence = env.roll_dice_and_compare()
+            # Add all comparison results to outcomes_seen
+            for result in evidence.comparison_results:
+                outcomes_seen.add(result)
             # Verify consistency
             if evidence.dice_roll > 3:
+                assert "higher" in evidence.comparison_results
             elif evidence.dice_roll < 3:
+                assert "lower" in evidence.comparison_results
             else:
+                assert "same" in evidence.comparison_results
         # Should see all three outcomes with enough rolls
         assert "higher" in outcomes_seen
             evidence = env.roll_dice_and_compare()
             assert 1 <= evidence.dice_roll <= sides
+            # At least one basic comparison result should be present
+            basic_results = {"higher", "lower", "same"}
+            assert any(
+                result in basic_results for result in evidence.comparison_results
+            )
+    def test_basic_evidence_type(self):
+        """Test basic evidence type produces only basic comparison results."""
+        env = Environment(dice_sides=6, evidence_type=EvidenceType.BASIC, seed=42)
+        env.set_target_value(4)
+        for _ in range(50):
+            evidence = env.roll_dice_and_compare()
+            # Should only contain basic results
+            for result in evidence.comparison_results:
+                assert result in ["higher", "lower", "same"]
+            # Should contain exactly one basic result
+            assert len(evidence.comparison_results) == 1
+    def test_extended_evidence_type(self):
+        """Test extended evidence type can produce additional comparison results."""
+        env = Environment(dice_sides=8, evidence_type=EvidenceType.EXTENDED, seed=42)
+        env.set_target_value(4)  # Target = 4, so half = 2, double = 8
+        extended_results_seen = set()
+        for _ in range(100):
+            evidence = env.roll_dice_and_compare()
+            # Should always contain at least one basic result
+            basic_results = {"higher", "lower", "same"}
+            assert any(
+                result in basic_results for result in evidence.comparison_results
+            )
+            # Collect all results
+            for result in evidence.comparison_results:
+                extended_results_seen.add(result)
+                assert result in ["higher", "lower", "same", "half", "double"]
+        # Basic results should definitely be seen
+        assert (
+            "higher" in extended_results_seen
+            or "lower" in extended_results_seen
+            or "same" in extended_results_seen
+        )
+    def test_extended_evidence_half_condition(self):
+        """Test that 'half' evidence is generated correctly."""
+        env = Environment(dice_sides=8, evidence_type=EvidenceType.EXTENDED, seed=42)
+        env.set_target_value(4)  # Target = 4, so half = 2
+        # Force a dice roll of 2 by testing specific conditions
+        for _ in range(200):  # More attempts to find the half condition
+            evidence = env.roll_dice_and_compare()
+            if evidence.dice_roll == 2:  # Should be 'half' of target 4
+                assert "half" in evidence.comparison_results
+                assert "lower" in evidence.comparison_results  # 2 < 4
+                break
+        # If we didn't find it randomly, we know the logic is correct from the condition above
+        # This test mainly verifies the logic structure
+    def test_extended_evidence_double_condition(self):
+        """Test that 'double' evidence is generated correctly."""
+        env = Environment(dice_sides=8, evidence_type=EvidenceType.EXTENDED, seed=42)
+        env.set_target_value(3)  # Target = 3, so double = 6
+        # Force a dice roll of 6 by testing specific conditions
+        for _ in range(200):  # More attempts to find the double condition
+            evidence = env.roll_dice_and_compare()
+            if evidence.dice_roll == 6:  # Should be 'double' of target 3
+                assert "double" in evidence.comparison_results
+                assert "higher" in evidence.comparison_results  # 6 > 3
+                break
+        # If we didn't find it randomly, we know the logic is correct from the condition above
+        # This test mainly verifies the logic structure

tests/test_game_coordination.py CHANGED Viewed

@@ -20,7 +20,7 @@ class TestGameState:
     def test_game_state_with_optional_params(self):
         """Test creating game state with optional parameters."""
-        evidence = [EnvironmentEvidence(dice_roll=3, comparison_result="higher")]
         beliefs = [0.2, 0.3, 0.5]
         state = GameState(
@@ -146,7 +146,9 @@ class TestBayesianGame:
         # Evidence should be valid
         evidence = updated_state.evidence_history[0]
         assert 1 <= evidence.dice_roll <= 6
-        assert evidence.comparison_result in ["higher", "lower", "same"]
     def test_play_multiple_rounds(self):
         """Test playing multiple rounds."""
@@ -295,7 +297,7 @@ class TestBayesianGame:
         # Evidence should influence beliefs correctly
         for state in states:
             for evidence in state.evidence_history:
-                if evidence.comparison_result == "higher":
                     # Target must be less than dice roll
                     for _target in range(evidence.dice_roll, 7):
                         # These targets should have reduced probability
@@ -309,7 +311,7 @@ class TestBayesianGame:
         # Apply evidence that changes beliefs
         from domains.belief.belief_domain import BeliefUpdate
-        update = BeliefUpdate(comparison_result="higher")
         game.belief_state.update_beliefs(update)
         # Update game state to reflect the belief change
@@ -346,7 +348,7 @@ class TestBayesianGame:
                 state1.evidence_history, state2.evidence_history, strict=False
             ):
                 assert ev1.dice_roll == ev2.dice_roll
-                assert ev1.comparison_result == ev2.comparison_result
             # Beliefs should be identical
             assert state1.current_beliefs == state2.current_beliefs

     def test_game_state_with_optional_params(self):
         """Test creating game state with optional parameters."""
+        evidence = [EnvironmentEvidence(dice_roll=3, comparison_results=["higher"])]
         beliefs = [0.2, 0.3, 0.5]
         state = GameState(
         # Evidence should be valid
         evidence = updated_state.evidence_history[0]
         assert 1 <= evidence.dice_roll <= 6
+        # At least one basic comparison result should be present
+        basic_results = {"higher", "lower", "same"}
+        assert any(result in basic_results for result in evidence.comparison_results)
     def test_play_multiple_rounds(self):
         """Test playing multiple rounds."""
         # Evidence should influence beliefs correctly
         for state in states:
             for evidence in state.evidence_history:
+                if "higher" in evidence.comparison_results:
                     # Target must be less than dice roll
                     for _target in range(evidence.dice_roll, 7):
                         # These targets should have reduced probability
         # Apply evidence that changes beliefs
         from domains.belief.belief_domain import BeliefUpdate
+        update = BeliefUpdate(comparison_results=["higher"])
         game.belief_state.update_beliefs(update)
         # Update game state to reflect the belief change
                 state1.evidence_history, state2.evidence_history, strict=False
             ):
                 assert ev1.dice_roll == ev2.dice_roll
+                assert ev1.comparison_results == ev2.comparison_results
             # Beliefs should be identical
             assert state1.current_beliefs == state2.current_beliefs

ui/gradio_interface.py CHANGED Viewed

@@ -2,6 +2,7 @@ import gradio as gr
 import matplotlib.pyplot as plt
 from domains.coordination.game_coordination import BayesianGame, GamePhase
 class GradioInterface:
@@ -13,18 +14,29 @@ class GradioInterface:
         self.reset_game()
     def reset_game(
-        self, dice_sides: int = 6, max_rounds: int = 10
     ) -> tuple[str, plt.Figure, str]:
         """Reset the game with new parameters.
         Args:
             dice_sides: Number of sides on the dice
             max_rounds: Maximum number of rounds
         Returns:
             Tuple of (status, belief_chart, game_log)
         """
-        self.game = BayesianGame(dice_sides=dice_sides, max_rounds=max_rounds)
         return self._get_interface_state()
     def start_new_game(self, target_value: str = "") -> tuple[str, plt.Figure, str]:
@@ -224,12 +236,20 @@ class GradioInterface:
         log_lines = ["**Evidence History:**\n"]
         for i, evidence in enumerate(self.game.game_state.evidence_history, 1):
-            emoji = {"higher": "⬆️", "lower": "⬇️", "same": "🎯"}[
-                evidence.comparison_result
-            ]
-            log_lines.append(
-                f"Round {i}: Rolled {evidence.dice_roll} → {evidence.comparison_result} {emoji}"
-            )
         # Add completion message if game is finished
         if self.game.game_state.phase == GamePhase.FINISHED:
@@ -282,7 +302,9 @@ def create_interface() -> gr.Interface:
         **Game Rules:**
         - Judge and Player 1 can see the target die value
         - Player 2 must deduce the target value using Bayesian inference
-        - Each round: Player 1 rolls dice and reports "higher"/"lower"/"same" compared to target
         - Game runs for a specified number of rounds
         """
         )
@@ -299,6 +321,13 @@ def create_interface() -> gr.Interface:
                         value=10, label="Max Rounds", minimum=1, maximum=50, precision=0
                     )
                 reset_btn = gr.Button("🔄 Reset Game", variant="secondary")
                 target_input = gr.Textbox(
@@ -317,7 +346,7 @@ def create_interface() -> gr.Interface:
         # Event handlers
         reset_btn.click(
             interface.reset_game,
-            inputs=[dice_sides, max_rounds],
             outputs=[status_output, belief_plot, game_log],
         )

 import matplotlib.pyplot as plt
 from domains.coordination.game_coordination import BayesianGame, GamePhase
+from domains.environment.environment_domain import EvidenceType
 class GradioInterface:
         self.reset_game()
     def reset_game(
+        self,
+        dice_sides: int = 6,
+        max_rounds: int = 10,
+        evidence_type_str: str = "Basic",
     ) -> tuple[str, plt.Figure, str]:
         """Reset the game with new parameters.
         Args:
             dice_sides: Number of sides on the dice
             max_rounds: Maximum number of rounds
+            evidence_type_str: Evidence type ("Basic" or "Extended")
         Returns:
             Tuple of (status, belief_chart, game_log)
         """
+        evidence_type = (
+            EvidenceType.EXTENDED
+            if evidence_type_str == "Extended"
+            else EvidenceType.BASIC
+        )
+        self.game = BayesianGame(
+            dice_sides=dice_sides, max_rounds=max_rounds, evidence_type=evidence_type
+        )
         return self._get_interface_state()
     def start_new_game(self, target_value: str = "") -> tuple[str, plt.Figure, str]:
         log_lines = ["**Evidence History:**\n"]
         for i, evidence in enumerate(self.game.game_state.evidence_history, 1):
+            # Handle multiple evidence types
+            evidence_display = []
+            for result in evidence.comparison_results:
+                emoji = {
+                    "higher": "⬆️",
+                    "lower": "⬇️",
+                    "same": "🎯",
+                    "half": "½",
+                    "double": "x2",
+                }.get(result, "❓")
+                evidence_display.append(f"{result} {emoji}")
+            evidence_str = ", ".join(evidence_display)
+            log_lines.append(f"Round {i}: Rolled {evidence.dice_roll} → {evidence_str}")
         # Add completion message if game is finished
         if self.game.game_state.phase == GamePhase.FINISHED:
         **Game Rules:**
         - Judge and Player 1 can see the target die value
         - Player 2 must deduce the target value using Bayesian inference
+        - Each round: Player 1 rolls dice and reports evidence based on selected type
+        - **Basic Evidence**: higher/lower/same compared to target
+        - **Extended Evidence**: higher/lower/same/half/double (multiple types can apply)
         - Game runs for a specified number of rounds
         """
         )
                         value=10, label="Max Rounds", minimum=1, maximum=50, precision=0
                     )
+                evidence_type_dropdown = gr.Dropdown(
+                    choices=["Basic", "Extended"],
+                    value="Basic",
+                    label="Evidence Type",
+                    info="Basic: higher/lower/same only. Extended: adds half/double evidence.",
+                )
                 reset_btn = gr.Button("🔄 Reset Game", variant="secondary")
                 target_input = gr.Textbox(
         # Event handlers
         reset_btn.click(
             interface.reset_game,
+            inputs=[dice_sides, max_rounds, evidence_type_dropdown],
             outputs=[status_output, belief_plot, game_log],
         )