Layout Evaluation


Quick Start

1. Full Evaluation (generate new scenes)

# Run all tests (calls the OpenAI API to generate each configuration)
python run_evaluation.py --generate

# Run single test
python run_evaluation.py --test basic_studio_01 --generate

Note: --generate calls the OpenAI API for each test to generate its configuration.

2. View Test Cases

python -c "from evaluation.test_cases import get_test_cases; \
           cases = get_test_cases(); \
           print(f'Total: {len(cases)} test cases'); \
           [print(f'{i+1}. {c.id}: {c.prompt[:50]}...') for i, c in enumerate(cases)]"

Evaluation Metrics

Balanced Evaluator Scoring Components

| Metric | Weight | Description |
| --- | --- | --- |
| Room Count Accuracy | 30% | Match between generated and GT room counts (with tolerance) |
| Room Presence | 20% | Required rooms exist |
| Adjacency | 30% | Correct adjacency relationships |
| Min Area | 10% | Meet minimum area requirements |
| Max Area | 10% | Within maximum area limits |
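
The weights above sum to 100%, so the overall score can be read as a weighted average of the five components. The following is a minimal sketch of that arithmetic, assuming each component has already been scored in the range 0 to 1. The names `WEIGHTS` and `balanced_score` are illustrative and are not taken from the project code.

```python
# Illustrative only: the weights come from the metrics table, but the names
# and the assumption that each component is scored in [0, 1] are hypothetical.
WEIGHTS = {
    "room_count_accuracy": 0.30,
    "room_presence": 0.20,
    "adjacency": 0.30,
    "min_area": 0.10,
    "max_area": 0.10,
}


def balanced_score(components: dict[str, float]) -> float:
    """Return the weighted total as a percentage (0-100)."""
    missing = WEIGHTS.keys() - components.keys()
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    total = sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    return 100.0 * total


# Example: half of the adjacency requirements met, max-area check failed.
example = {
    "room_count_accuracy": 1.0,
    "room_presence": 1.0,
    "adjacency": 0.5,
    "min_area": 1.0,
    "max_area": 0.0,
}
print(f"{balanced_score(example):.1f}%")  # 75.0%
```

Because Room Count Accuracy and Adjacency carry 30% each, a layout that fails either one entirely cannot score above 70%.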

Score Interpretation

  • 90-100%: Excellent - fully meets requirements
  • 80-89%: Good - mostly meets requirements, minor issues
  • 70-79%: Acceptable - meets basic requirements, room for improvement
  • <70%: Insufficient - does not meet requirements
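
The bands above can be expressed as a small lookup. This is a sketch, not the evaluator's own code: the function name `interpret_score` is hypothetical, and it assumes that a fractional score such as 89.5 falls into the lower band ("Good"), which the list does not specify.

```python
def interpret_score(score: float) -> str:
    """Map a percentage score (0-100) to the rating bands listed above."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return "Excellent"
    if score >= 80:
        return "Good"
    if score >= 70:
        return "Acceptable"
    return "Insufficient"


for value in (95, 80, 75.0, 42):
    print(value, interpret_score(value))
```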

Test Cases

20 test cases covering different complexity levels and use cases:

Simple (3 cases)

  • basic_studio_01: Basic studio apartment
  • one_bedroom_apt_01: One-bedroom apartment
  • student_apartment_01: Minimal student housing

Medium (6 cases)

  • two_bedroom_apt_01: Two-bedroom apartment
  • open_plan_loft_01: Open-plan space
  • family_home_01: Family home
  • master_suite_home_01: Master suite home
  • compact_efficiency_01: Space-efficient compact layout
  • three_bed_family_townhouse_01: Standard family townhouse

Complex (5 cases)

  • four_bedroom_house_01: Four-bedroom house
  • separated_zones_01: Separated day/night zones
  • guest_suite_home_01: Guest suite home
  • dual_master_suite_01: Dual master suites
  • multigenerational_home_01: Multi-generational layout (4BR/3BA)

Special (6 cases)

  • home_office_layout_01: Home office layout
  • balcony_apartment_01: Balcony apartment
  • work_from_home_layout_01: WFH with office separation
  • luxury_penthouse_01: Luxury penthouse with study
  • entertainment_home_01: Entertainment-focused design
  • single_floor_accessible_01: Accessibility-friendly layout

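Ground Truth Format

The ground truth is written as a Python dict literal (note True, the tuples, and the # comments), not as strict JSON.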

{
    "nodes": [
        {
            "id": "1",
            "type": "LivingRoom",         # Room type
            "required": True,             # Whether the room must be present
            "count": 1,                   # Expected number of rooms of this type
            "min_area": 15.0,             # Minimum area (square meters)
            "max_area": 30.0              # Maximum area (square meters, optional)
        }
    ],
    "edges": [
        {
            "from": "LivingRoom",
            "to": "Kitchen",
            "type": "adjacent",           # Relationship type
            "required": True              # Whether the relationship must hold
        }
    ],
    "constraints": {
        "adjacency": [                    # Required adjacency
            ("LivingRoom", "Kitchen")
        ],
        "room_counts": {                  # Room counts
            "Bedroom": 2
        },
        "min_areas": {                    # Minimum areas
            "LivingRoom": 15.0
        },
        "max_areas": {                    # Maximum areas (optional)
            "LivingRoom": 30.0
        }
    }
}
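
To illustrate how the "constraints" block might be applied, here is a minimal sketch that checks a generated layout against it. The layout representation (a list of rooms with a type and an area, plus a set of adjacent pairs) and the function `check_constraints` are assumptions made for this example, not the project's actual output format or evaluator. The sketch also requires exact room counts, whereas the real Room Count Accuracy metric applies a tolerance.

```python
from collections import Counter

# Constraints copied from the ground-truth example above.
constraints = {
    "adjacency": [("LivingRoom", "Kitchen")],
    "room_counts": {"Bedroom": 2},
    "min_areas": {"LivingRoom": 15.0},
    "max_areas": {"LivingRoom": 30.0},
}

# Hypothetical generated layout (assumed representation).
rooms = [
    {"type": "LivingRoom", "area": 22.0},
    {"type": "Kitchen", "area": 9.5},
    {"type": "Bedroom", "area": 12.0},
]
adjacent_pairs = {frozenset(("LivingRoom", "Kitchen"))}


def check_constraints(constraints, rooms, adjacent_pairs):
    """Return a list of human-readable constraint violations."""
    problems = []

    # Room counts: exact match here; the real metric allows a tolerance.
    counts = Counter(room["type"] for room in rooms)
    for room_type, expected in constraints.get("room_counts", {}).items():
        if counts[room_type] != expected:
            problems.append(
                f"{room_type}: expected {expected}, found {counts[room_type]}"
            )

    # Adjacency is treated as unordered, so (A, B) equals (B, A).
    for a, b in constraints.get("adjacency", []):
        if frozenset((a, b)) not in adjacent_pairs:
            problems.append(f"{a} and {b} are not adjacent")

    # Area limits apply to every room of the constrained type.
    min_areas = constraints.get("min_areas", {})
    max_areas = constraints.get("max_areas", {})
    for room in rooms:
        lo = min_areas.get(room["type"])
        hi = max_areas.get(room["type"])
        if lo is not None and room["area"] < lo:
            problems.append(f"{room['type']} is below the minimum area {lo}")
        if hi is not None and room["area"] > hi:
            problems.append(f"{room['type']} exceeds the maximum area {hi}")

    return problems


print(check_constraints(constraints, rooms, adjacent_pairs))
# ['Bedroom: expected 2, found 1']
```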