# Layout Evaluation
---
## Quick Start
### 1. Full Evaluation (generate new scenes)
```bash
# Run all tests (will call AI API for config generation)
python run_evaluation.py --generate
# Run single test
python run_evaluation.py --test basic_studio_01 --generate
```
**Note**: `--generate` calls the OpenAI API once per test to generate the scene configuration.
### 2. View Test Cases
```bash
python -c "from evaluation.test_cases import get_test_cases; \
cases = get_test_cases(); \
print(f'Total: {len(cases)} test cases'); \
[print(f'{i+1}. {c.id}: {c.prompt[:50]}...') for i, c in enumerate(cases)]"
```
---
## Evaluation Metrics
### Balanced Evaluator Scoring Components
| Metric | Weight | Description |
|--------|--------|-------------|
| **Room Count Accuracy** | 30% | Generated room counts match the ground truth (GT), within a tolerance |
| **Room Presence** | 20% | All required rooms exist |
| **Adjacency** | 30% | Required adjacency relationships are satisfied |
| **Min Area** | 10% | Rooms meet minimum area requirements |
| **Max Area** | 10% | Rooms stay within maximum area limits |
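
The weights in the table can be combined into a single percentage as a plain weighted sum. The sketch below is illustrative only: the metric keys and the `weighted_score` helper are hypothetical names, and it assumes each component is already scored in the range 0 to 1. The actual evaluator may compute or combine components differently.

```python
# Weights copied from the table above; the dictionary keys are hypothetical names.
WEIGHTS = {
    "room_count_accuracy": 0.30,
    "room_presence": 0.20,
    "adjacency": 0.30,
    "min_area": 0.10,
    "max_area": 0.10,
}


def weighted_score(component_scores):
    """Combine per-metric scores in [0, 1] into one percentage in [0, 100]."""
    missing = set(WEIGHTS) - set(component_scores)
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    total = sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS)
    return round(100.0 * total, 2)


# Made-up component scores, purely for illustration.
example = {
    "room_count_accuracy": 1.0,
    "room_presence": 1.0,
    "adjacency": 0.5,
    "min_area": 1.0,
    "max_area": 0.0,
}
print(weighted_score(example))
```

Because Room Count Accuracy and Adjacency each carry 30%, a layout that misses half of its required adjacencies loses 15 points before any area check is considered.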
### Score Interpretation
- **90-100%**: Excellent - fully meets requirements
- **80-89%**: Good - mostly meets requirements, minor issues
- **70-79%**: Acceptable - meets basic requirements, room for improvement
- **<70%**: Insufficient - does not meet requirements
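
The bands above can be mapped to labels with a simple threshold lookup. This is a minimal sketch: `interpret_score` is a hypothetical helper, and it assumes each band's lower bound is inclusive, so a fractional score such as 89.5 falls in "Good".

```python
def interpret_score(score_percent):
    """Map a score in [0, 100] to the label of its band."""
    if not 0.0 <= score_percent <= 100.0:
        raise ValueError(f"score out of range: {score_percent}")
    # Checked from the highest band down; lower bounds are assumed inclusive.
    bands = [
        (90.0, "Excellent"),
        (80.0, "Good"),
        (70.0, "Acceptable"),
    ]
    for lower_bound, label in bands:
        if score_percent >= lower_bound:
            return label
    return "Insufficient"


for score in (95.0, 89.5, 70.0, 42.0):
    print(score, interpret_score(score))
```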
---
## Test Cases
20 test cases covering different complexity levels and use cases:
### Simple (3 cases)
- `basic_studio_01`: Basic studio apartment
- `one_bedroom_apt_01`: One-bedroom apartment
- `student_apartment_01`: Minimal student housing
### Medium (6 cases)
- `two_bedroom_apt_01`: Two-bedroom apartment
- `open_plan_loft_01`: Open-plan space
- `family_home_01`: Family home
- `master_suite_home_01`: Master suite home
- `compact_efficiency_01`: Space-efficient compact layout
- `three_bed_family_townhouse_01`: Standard family townhouse
### Complex (5 cases)
- `four_bedroom_house_01`: Four-bedroom house
- `separated_zones_01`: Separated day/night zones
- `guest_suite_home_01`: Guest suite home
- `dual_master_suite_01`: Dual master suites
- `multigenerational_home_01`: Multi-generational layout (4BR/3BA)
### Special (6 cases)
- `home_office_layout_01`: Home office layout
- `balcony_apartment_01`: Balcony apartment
- `work_from_home_layout_01`: WFH with office separation
- `luxury_penthouse_01`: Luxury penthouse with study
- `entertainment_home_01`: Entertainment-focused design
- `single_floor_accessible_01`: Accessibility-friendly layout
---
## Ground Truth Format
```python
{
    "nodes": [
        {
            "id": "1",
            "type": "LivingRoom",   # Room type
            "required": True,       # Is required
            "count": 1,             # Expected count
            "min_area": 15.0,       # Minimum area (sq meters)
            "max_area": 30.0        # Maximum area (optional)
        }
    ],
    "edges": [
        {
            "from": "LivingRoom",
            "to": "Kitchen",
            "type": "adjacent",     # Adjacency type
            "required": True        # Is required
        }
    ],
    "constraints": {
        "adjacency": [              # Required adjacency
            ("LivingRoom", "Kitchen")
        ],
        "room_counts": {            # Room counts
            "Bedroom": 2
        },
        "min_areas": {              # Minimum areas
            "LivingRoom": 15.0
        },
        "max_areas": {              # Maximum areas (optional)
            "LivingRoom": 30.0
        }
    }
}
```
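
When writing a new ground-truth entry by hand, a quick consistency check can catch obvious mistakes before running an evaluation. The sketch below is not part of the evaluation code: `check_ground_truth` is a hypothetical helper, and it assumes that edges and adjacency constraints refer to rooms by their `type` (as in the example above, where the edge uses `"LivingRoom"` rather than the node `id`).

```python
def check_ground_truth(gt):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    nodes = gt.get("nodes", [])
    node_types = {node["type"] for node in nodes}

    # A node's minimum area should not exceed its (optional) maximum area.
    for node in nodes:
        min_area = node.get("min_area")
        max_area = node.get("max_area")
        if min_area is not None and max_area is not None and min_area > max_area:
            problems.append(
                f"{node['type']}: min_area {min_area} exceeds max_area {max_area}"
            )

    # Assumption: edge endpoints name room types that appear in "nodes".
    for edge in gt.get("edges", []):
        for endpoint in (edge["from"], edge["to"]):
            if endpoint not in node_types:
                problems.append(f"edge endpoint {endpoint!r} has no matching node")

    # Same assumption for the adjacency pairs under "constraints".
    for pair in gt.get("constraints", {}).get("adjacency", []):
        for endpoint in pair:
            if endpoint not in node_types:
                problems.append(f"adjacency constraint names unknown type {endpoint!r}")

    return problems


# Illustrative data only; the area values for Kitchen are made up.
gt = {
    "nodes": [
        {"id": "1", "type": "LivingRoom", "required": True, "count": 1,
         "min_area": 15.0, "max_area": 30.0},
        {"id": "2", "type": "Kitchen", "required": True, "count": 1,
         "min_area": 12.0, "max_area": 8.0},  # deliberately inconsistent
    ],
    "edges": [
        {"from": "LivingRoom", "to": "Kitchen", "type": "adjacent", "required": True},
    ],
    "constraints": {"adjacency": [("LivingRoom", "Balcony")]},
}

for problem in check_ground_truth(gt):
    print(problem)
```

Here the check reports the inverted Kitchen area bounds and the `Balcony` adjacency that has no corresponding node.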