# Layout Evaluation

## Quick Start

### 1. Full Evaluation (generate new scenes)

```bash
# Run all tests (will call AI API for config generation)
python run_evaluation.py --generate

# Run single test
python run_evaluation.py --test basic_studio_01 --generate
```

**Note**: `--generate` calls the OpenAI API once per test to generate that test's configuration.

### 2. View Test Cases

```bash
python -c "from evaluation.test_cases import get_test_cases; \
cases = get_test_cases(); \
print(f'Total: {len(cases)} test cases'); \
[print(f'{i+1}. {c.id}: {c.prompt[:50]}...') for i, c in enumerate(cases)]"
```

---

## Evaluation Metrics

### Balanced Evaluator Scoring Components

| Metric | Weight | Description |
|--------|--------|-------------|
| **Room Count Accuracy** | 30% | Match between generated and GT room counts (with tolerance) |
| **Room Presence** | 20% | Required rooms exist |
| **Adjacency** | 30% | Correct adjacency relationships |
| **Min Area** | 10% | Meet minimum area requirements |
| **Max Area** | 10% | Within maximum area limits |
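
As a rough illustration of how the weights in the table could combine, here is a minimal sketch that assumes the overall score is a plain weighted sum of per-metric scores in the range 0 to 1. The dictionary keys and the `overall_score` function are hypothetical names chosen for this example, and the evaluator's actual aggregation may differ.

```python
# Weights copied from the table; the key names are hypothetical.
WEIGHTS = {
    "room_count_accuracy": 0.30,
    "room_presence": 0.20,
    "adjacency": 0.30,
    "min_area": 0.10,
    "max_area": 0.10,
}


def overall_score(components: dict) -> float:
    """Weighted sum of per-metric scores, each expected in [0, 1].

    Assumption: the overall score is a simple weighted sum.
    """
    missing = set(WEIGHTS) - set(components)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


# Example: everything is satisfied except half of the adjacency requirements.
example = {
    "room_count_accuracy": 1.0,
    "room_presence": 1.0,
    "adjacency": 0.5,
    "min_area": 1.0,
    "max_area": 1.0,
}
score = overall_score(example)
print(f"{score:.0%}")  # 85%
```

Because the weights sum to 100%, a layout that satisfies every metric scores 100% under this assumption.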

### Score Interpretation

- **90-100%**: Excellent - fully meets requirements
- **80-89%**: Good - mostly meets requirements, minor issues
- **70-79%**: Acceptable - meets basic requirements, room for improvement
- **<70%**: Insufficient - does not meet requirements
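
The bands above can be turned into a small lookup. This is a sketch only: `interpret_score` is a hypothetical helper, and it assumes fractional scores fall into the band whose lower bound they reach (so 89.5% counts as "Good"), which the list does not specify.

```python
def interpret_score(percent: float) -> str:
    """Map a score in percent (0-100) to the label of its band.

    Assumption: each band is closed at its lower bound (>= 90, >= 80, >= 70).
    """
    if not 0.0 <= percent <= 100.0:
        raise ValueError(f"score out of range: {percent}")
    if percent >= 90:
        return "Excellent"
    if percent >= 80:
        return "Good"
    if percent >= 70:
        return "Acceptable"
    return "Insufficient"


for value in (95.0, 85.0, 70.0, 42.0):
    print(f"{value:.0f}% -> {interpret_score(value)}")
```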

---

## Test Cases

20 test cases covering different complexity levels and use cases:

### Simple (3 cases)

- `basic_studio_01`: Basic studio apartment
- `one_bedroom_apt_01`: One-bedroom apartment
- `student_apartment_01`: Minimal student housing

### Medium (6 cases)

- `two_bedroom_apt_01`: Two-bedroom apartment
- `open_plan_loft_01`: Open-plan space
- `family_home_01`: Family home
- `master_suite_home_01`: Master suite home
- `compact_efficiency_01`: Space-efficient compact layout
- `three_bed_family_townhouse_01`: Standard family townhouse

### Complex (5 cases)

- `four_bedroom_house_01`: Four-bedroom house
- `separated_zones_01`: Separated day/night zones
- `guest_suite_home_01`: Guest suite home
- `dual_master_suite_01`: Dual master suites
- `multigenerational_home_01`: Multi-generational layout (4BR/3BA)

### Special (6 cases)

- `home_office_layout_01`: Home office layout
- `balcony_apartment_01`: Balcony apartment
- `work_from_home_layout_01`: WFH with office separation
- `luxury_penthouse_01`: Luxury penthouse with study
- `entertainment_home_01`: Entertainment-focused design
- `single_floor_accessible_01`: Accessibility-friendly layout

---

## Ground Truth Format

```python
{
    "nodes": [
        {
            "id": "1",
            "type": "LivingRoom",   # Room type
            "required": True,       # Is required
            "count": 1,             # Expected count
            "min_area": 15.0,       # Minimum area (sq meters)
            "max_area": 30.0        # Maximum area (optional)
        }
    ],
    "edges": [
        {
            "from": "LivingRoom",
            "to": "Kitchen",
            "type": "adjacent",     # Adjacency type
            "required": True        # Is required
        }
    ],
    "constraints": {
        "adjacency": [              # Required adjacency
            ("LivingRoom", "Kitchen")
        ],
        "room_counts": {            # Room counts
            "Bedroom": 2
        },
        "min_areas": {              # Minimum areas
            "LivingRoom": 15.0
        },
        "max_areas": {              # Maximum areas (optional)
            "LivingRoom": 30.0
        }
    }
}
```
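
When writing new ground truth entries by hand, a quick structural check can catch typos before an evaluation run. The sketch below is not part of the evaluation package: `validate_ground_truth` is a hypothetical helper, and it assumes the fields shown in the format above are mandatory except where they are marked optional.

```python
def validate_ground_truth(gt: dict) -> list:
    """Return a list of problems; an empty list means no issues were found.

    Assumption: every field in the documented format is required,
    except "max_area" and "max_areas", which are marked optional.
    """
    problems = []
    for key in ("nodes", "edges", "constraints"):
        if key not in gt:
            problems.append(f"missing top-level key: {key}")

    for node in gt.get("nodes", []):
        label = node.get("id", "?")
        for field in ("id", "type", "required", "count", "min_area"):
            if field not in node:
                problems.append(f"node {label}: missing field {field}")
        # max_area is optional; when present it must not be below min_area.
        if "max_area" in node and node.get("min_area", 0.0) > node["max_area"]:
            problems.append(f"node {label}: min_area exceeds max_area")

    for edge in gt.get("edges", []):
        label = f"{edge.get('from', '?')}->{edge.get('to', '?')}"
        for field in ("from", "to", "type", "required"):
            if field not in edge:
                problems.append(f"edge {label}: missing field {field}")

    constraints = gt.get("constraints", {})
    for pair in constraints.get("adjacency", []):
        if len(pair) != 2:
            problems.append(f"adjacency entry {pair!r} is not a pair")
    min_areas = constraints.get("min_areas", {})
    for room, upper in constraints.get("max_areas", {}).items():
        if room in min_areas and min_areas[room] > upper:
            problems.append(f"constraints: min area exceeds max area for {room}")
    return problems


# The example entry from the format above passes the check.
example = {
    "nodes": [{"id": "1", "type": "LivingRoom", "required": True,
               "count": 1, "min_area": 15.0, "max_area": 30.0}],
    "edges": [{"from": "LivingRoom", "to": "Kitchen",
               "type": "adjacent", "required": True}],
    "constraints": {
        "adjacency": [("LivingRoom", "Kitchen")],
        "room_counts": {"Bedroom": 2},
        "min_areas": {"LivingRoom": 15.0},
        "max_areas": {"LivingRoom": 30.0},
    },
}
print(validate_ground_truth(example))  # []
```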