Commit 29350b2 · 1 Parent(s): cf82114
Scholarus committed

New options added
README.md CHANGED
@@ -190,20 +190,18 @@ python3 pipeline.py --num_scenes 10 --min_cf_change_score 1.5 --max_cf_attempts
 
 The pipeline supports 17 different counterfactual types, divided into two categories:
 
-**Image Counterfactuals** (Should change VQA answers - 9 types):
+**Image Counterfactuals** (Should change VQA answers - 8 types):
 - `change_color` - Change the color of a random object (e.g., red → blue)
 - `change_shape` - Change the shape of a random object (cube/sphere/cylinder)
 - `change_size` - Change the size of a random object (small ↔ large)
 - `change_material` - Change the material of a random object (metal ↔ rubber)
 - `change_position` - Move a random object to a different location (with collision detection)
-- `change_count` - Add or remove objects from the scene
 - `add_object` - Add a new random object to the scene
 - `remove_object` - Remove a random object from the scene
 - `replace_object` - Replace an object with a different one (keeping position)
 
-**Negative Counterfactuals** (Should NOT change VQA answers - 8 types):
+**Negative Counterfactuals** (Should NOT change VQA answers - 7 types):
 - `change_background` - Change the background/ground color
-- `change_texture` - Change texture style for all objects (metal ↔ rubber)
 - `change_lighting` - Change lighting conditions (bright/dim/warm/cool/dramatic)
 - `add_noise` - Add image noise/grain (light/medium/heavy levels)
 - `apply_fisheye` - Apply fisheye lens distortion effect
@@ -282,6 +280,32 @@ python scripts/generate_questions_mapping.py --output_dir output --auto_latest -
 python scripts/generate_questions_mapping.py --output_dir output --auto_latest --generate_questions --long_format --long_csv_name qa_dataset.csv
 ```
 
+#### Ensuring counterfactual answers differ
+
+For evaluation and datasets, the **counterfactual image’s answer to its counterfactual question** should differ from the **original image’s answer to the original question**. The pipeline does this in two ways:
+
+1. **Automatic retries**
+   When generating questions (with `--generate_questions`), the script retries up to **`MAX_CF_ANSWER_RETRIES`** (default 50) per scene. For each attempt it picks new counterfactual questions (with different randomness). It keeps a pair only when both CF1 and CF2 answers differ from the original answer (after normalizing, e.g. case and whitespace). If after all retries they still match, the scene is still included and a warning is printed.
+
+2. **CF-specific question templates**
+   Counterfactual questions are chosen from templates that target the **changed** attribute or count, so the answer on the counterfactual image is different by design:
+   - **Count-changing CFs** (e.g. `add_object`, `remove_object`): questions like “How many objects are in the scene?” or “Are there more than N objects?” so the count/yes-no differs.
+   - **Attribute-changing CFs** (e.g. `change_color`, `change_shape`): questions about the **new** value (e.g. “How many red objects?” when an object was changed to red), so the count on the CF image differs from the original.
+
+**What you need to do:**
+
+- Run question generation **after** scenes and images exist (either as part of the pipeline with `--generate_questions`, or later on a run directory):
+
+  ```bash
+  # As part of a full run (ensures answers differ for that run’s scenes)
+  python pipeline.py --num_scenes 10 --num_objects 5 --run_name my_run --generate_questions
+
+  # Or later, on an existing run
+  python scripts/generate_questions_mapping.py --output_dir output/my_run --generate_questions
+  ```
+
+- **Optional:** To allow more attempts per scene, edit `scripts/generate_questions_mapping.py` and increase **`MAX_CF_ANSWER_RETRIES`** (e.g. from 50 to 100). No CLI flag is exposed for this.
+
 
 ## Project Structure
 
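The retry logic in the README section above keeps a scene only when every counterfactual answer differs from the original after normalization. A minimal standalone sketch of that comparison step (the helper names `normalize_answer` and `answers_differ` are illustrative, not the script's actual API):

```python
def normalize_answer(ans):
    """Collapse case and surrounding/internal whitespace before comparing answers."""
    return " ".join(str(ans).strip().lower().split())

def answers_differ(original_answer, cf_answers):
    """Keep a question pair only if every CF answer differs from the original."""
    orig = normalize_answer(original_answer)
    return all(normalize_answer(a) != orig for a in cf_answers)

print(answers_differ("3", ["4", "5"]))         # True: both CF answers differ
print(answers_differ(" Yes ", ["yes", "no"]))  # False: CF1 matches after normalization
```

Without this normalization, trivially different strings like `"Yes"` vs `"yes "` would be counted as a changed answer and pollute the dataset.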
app.py CHANGED
@@ -618,20 +618,21 @@ def main():
         value=False,
         help="Use the same counterfactual type for every variant (first selected type, or one random if none selected)"
     )
-    with st.expander("Image CFs (change answers)", expanded=False):
+    with st.expander("Image CFs (change answers)", expanded=True):
         use_change_color = st.checkbox("Change Color", value=False)
         use_change_shape = st.checkbox("Change Shape", value=False)
         use_change_size = st.checkbox("Change Size", value=False)
         use_change_material = st.checkbox("Change Material", value=False)
         use_change_position = st.checkbox("Change Position", value=False)
-        use_change_count = st.checkbox("Change Count", value=False)
         use_add_object = st.checkbox("Add Object", value=False)
         use_remove_object = st.checkbox("Remove Object", value=False)
         use_replace_object = st.checkbox("Replace Object", value=False)
+        use_swap_attribute = st.checkbox("Swap Attribute", value=False)
+        use_occlusion_change = st.checkbox("Occlusion Change", value=False)
+        use_relational_flip = st.checkbox("Relational Flip", value=False)
 
     with st.expander("Negative CFs (don't change answers)", expanded=False):
         use_change_background = st.checkbox("Change Background", value=False)
-        use_change_texture = st.checkbox("Change Texture", value=False)
         use_change_lighting = st.checkbox("Change Lighting", value=False)
         use_add_noise = st.checkbox("Add Noise", value=False)
         use_apply_fisheye = st.checkbox("Apply Fisheye", value=False)
@@ -713,18 +714,20 @@ def main():
         cf_types.append('change_material')
     if use_change_position:
         cf_types.append('change_position')
-    if use_change_count:
-        cf_types.append('change_count')
     if use_add_object:
         cf_types.append('add_object')
     if use_remove_object:
         cf_types.append('remove_object')
     if use_replace_object:
         cf_types.append('replace_object')
+    if use_swap_attribute:
+        cf_types.append('swap_attribute')
+    if use_occlusion_change:
+        cf_types.append('occlusion_change')
+    if use_relational_flip:
+        cf_types.append('relational_flip')
     if use_change_background:
         cf_types.append('change_background')
-    if use_change_texture:
-        cf_types.append('change_texture')
     if use_change_lighting:
         cf_types.append('change_lighting')
     if use_add_noise:
pipeline.py CHANGED
@@ -4,6 +4,7 @@ import json
 import argparse
 import os
 import subprocess
+import csv
 import random
 import copy
 import sys
@@ -176,20 +177,7 @@ def create_patched_render_script():
     if old_margin_check in patched_content:
         patched_content = patched_content.replace(old_margin_check, new_margin_check)
 
-    check_vis_start = patched_content.find('def check_visibility(blender_objects, min_pixels_per_object):')
-    if check_vis_start != -1:
-        docstring_end = patched_content.find('"""', check_vis_start + 50)
-        docstring_end = patched_content.find('\n', docstring_end) + 1
-        next_def = patched_content.find('\ndef ', docstring_end)
-        if next_def == -1:
-            next_def = len(patched_content)
-
-        new_function = '''def check_visibility(blender_objects, min_pixels_per_object):
-    return True
-
-'''
-        patched_content = patched_content[:check_vis_start] + new_function + patched_content[next_def:]
-
+    # Visibility check left enabled (no longer patched to return True)
     patched_content = patched_content.replace(
         "parser.add_argument('--min_pixels_per_object', default=200, type=int,",
         "parser.add_argument('--min_pixels_per_object', default=50, type=int,"
@@ -447,10 +435,11 @@ def cf_change_position(scene):
     old_coords = obj['3d_coords']
 
     try:
-        with open('data/properties.json', 'r') as f:
+        script_dir = os.path.dirname(os.path.abspath(__file__))
+        with open(os.path.join(script_dir, 'data', 'properties.json'), 'r') as f:
            properties = json.load(f)
            size_mapping = properties['sizes']
-    except:
+    except Exception:
        size_mapping = {'small': 0.35, 'large': 0.7}
 
     r = size_mapping.get(obj['size'], 0.5)
@@ -458,12 +447,9 @@ def cf_change_position(scene):
         r /= math.sqrt(2)
 
     min_dist = 0.25
-
-    max_attempts = 100
-    for attempt in range(max_attempts):
+    for attempt in range(100):
         new_x = random.uniform(-3, 3)
         new_y = random.uniform(-3, 3)
-
         try:
             dx0 = float(new_x) - float(old_coords[0])
             dy0 = float(new_y) - float(old_coords[1])
@@ -476,14 +462,11 @@ def cf_change_position(scene):
         for other_idx, other_obj in enumerate(cf_scene['objects']):
             if other_idx == move_idx:
                 continue
-
             other_x, other_y, _ = other_obj['3d_coords']
             other_r = size_mapping.get(other_obj['size'], 0.5)
             if other_obj['shape'] == 'cube':
                 other_r /= math.sqrt(2)
-
             dist = math.sqrt((new_x - other_x)**2 + (new_y - other_y)**2)
-
             if dist < (r + other_r + min_dist):
                 collision = True
                 break
@@ -502,10 +485,11 @@ def cf_change_surrounding_count(scene):
         return cf_scene, "no change (0 objects)"
 
     try:
-        with open('data/properties.json', 'r') as f:
+        script_dir = os.path.dirname(os.path.abspath(__file__))
+        with open(os.path.join(script_dir, 'data', 'properties.json'), 'r') as f:
            properties = json.load(f)
            size_mapping = properties['sizes']
-    except:
+    except Exception:
        size_mapping = {'small': 0.35, 'large': 0.7}
 
     action = random.choice(['add', 'remove'])
@@ -571,7 +555,6 @@ def cf_change_surrounding_count(scene):
         return cf_scene, "no change (couldn't find valid positions for new objects)"
 
     else:
-        # Remove 1-2 objects
         num_to_remove = min(random.randint(1, 2), len(cf_scene['objects']) - 1)
         removed = []
         for _ in range(num_to_remove):
@@ -590,12 +573,12 @@ def cf_add_object(scene):
     materials = ['metal', 'rubber']
     sizes = ['small', 'large']
 
-    # Load size mapping
     try:
-        with open('data/properties.json', 'r') as f:
+        script_dir = os.path.dirname(os.path.abspath(__file__))
+        with open(os.path.join(script_dir, 'data', 'properties.json'), 'r') as f:
            properties = json.load(f)
            size_mapping = properties['sizes']
-    except:
+    except Exception:
        size_mapping = {'small': 0.35, 'large': 0.7}
 
     min_dist = 0.25
@@ -680,6 +663,193 @@ def cf_replace_object(scene):
 
     return cf_scene, f"replaced {old_desc} with {new_color} {new_shape}"
 
+
+def cf_swap_attribute(scene):
+    """Swap colors between two existing objects. Tests if model relies on priors (e.g. sphere=red)."""
+    cf_scene = copy.deepcopy(scene)
+
+    if len(cf_scene['objects']) < 2:
+        return cf_scene, "no swap (fewer than 2 objects)"
+
+    idx_a, idx_b = random.sample(range(len(cf_scene['objects'])), 2)
+    obj_a = cf_scene['objects'][idx_a]
+    obj_b = cf_scene['objects'][idx_b]
+
+    if obj_a['color'] == obj_b['color']:
+        return cf_scene, "no swap (both objects same color)"
+
+    color_a, color_b = obj_a['color'], obj_b['color']
+    obj_a['color'] = color_b
+    obj_b['color'] = color_a
+
+    return cf_scene, f"swapped colors between {color_a} {obj_a['shape']} and {color_b} {obj_b['shape']}"
+
+
+TARGET_OCCLUSION_COVERAGE = 0.6
+
+
+def cf_occlusion_change(scene):
+    """Move an object so it partially hides another. Tests spatial depth and partial shape recognition."""
+    cf_scene = copy.deepcopy(scene)
+
+    if len(cf_scene['objects']) < 2:
+        return cf_scene, "no occlusion (fewer than 2 objects)"
+
+    directions = cf_scene.get('directions', {})
+    front = directions.get('front', [0.75, -0.66, 0.0])
+    if len(front) < 2:
+        front = [0.75, -0.66]
+
+    try:
+        script_dir = os.path.dirname(os.path.abspath(__file__))
+        with open(os.path.join(script_dir, 'data', 'properties.json'), 'r') as f:
+            properties = json.load(f)
+            size_mapping = properties['sizes']
+    except Exception:
+        size_mapping = {'small': 0.35, 'large': 0.7}
+
+    def get_radius(obj):
+        r = size_mapping.get(obj['size'], 0.5)
+        if obj['shape'] == 'cube':
+            r /= math.sqrt(2)
+        return r
+
+    min_dist = 0.15
+
+    def is_valid_occlusion_pos(cf_scene, occluder_idx, target_idx, new_x, new_y, occluder_r):
+        for i, other in enumerate(cf_scene['objects']):
+            if i == occluder_idx:
+                continue
+            other_x, other_y, _ = other['3d_coords']
+            other_r = get_radius(other)
+            dist = math.sqrt((new_x - other_x)**2 + (new_y - other_y)**2)
+            if dist < (occluder_r + other_r + min_dist):
+                return False
+        return -2.8 <= new_x <= 2.8 and -2.8 <= new_y <= 2.8
+
+    fx, fy = float(front[0]), float(front[1])
+    norm = math.sqrt(fx * fx + fy * fy) or 1.0
+    coverage_range = 0.5
+    delta_max = coverage_range * (1.0 - TARGET_OCCLUSION_COVERAGE)
+    base_deltas = [0.02, 0.05, 0.08, 0.12, 0.18, delta_max * 0.5, delta_max]
+
+    pairs = [(i, j) for i in range(len(cf_scene['objects'])) for j in range(len(cf_scene['objects'])) if i != j]
+    random.shuffle(pairs)
+
+    for occluder_idx, target_idx in pairs:
+        occluder = cf_scene['objects'][occluder_idx]
+        target = cf_scene['objects'][target_idx]
+        tx, ty, tz = target['3d_coords']
+        oz = occluder['3d_coords'][2]
+        occluder_r = get_radius(occluder)
+        target_r = get_radius(target)
+        min_offset = occluder_r + target_r + min_dist
+        offsets = [min_offset + d for d in base_deltas]
+
+        for offset in offsets:
+            new_x = tx + (fx / norm) * offset
+            new_y = ty + (fy / norm) * offset
+            if is_valid_occlusion_pos(cf_scene, occluder_idx, target_idx, new_x, new_y, occluder_r):
+                occluder['3d_coords'] = [new_x, new_y, oz]
+                occluder['pixel_coords'] = [0, 0, 0]
+                return cf_scene, f"moved {occluder['color']} {occluder['shape']} to partially occlude {target['color']} {target['shape']}"
+
+    return cf_scene, "no occlusion (couldn't find valid position)"
+
+
+def cf_relational_flip(scene):
+    """Move object A from 'left of B' to 'right of B' (or vice versa). Targets spatial prepositions."""
+    cf_scene = copy.deepcopy(scene)
+
+    if len(cf_scene['objects']) < 2:
+        return cf_scene, "no flip (fewer than 2 objects)"
+
+    directions = cf_scene.get('directions', {})
+    left_vec = directions.get('left', [-0.66, -0.75, 0.0])
+    if len(left_vec) < 2:
+        left_vec = [-0.66, -0.75]
+
+    try:
+        script_dir = os.path.dirname(os.path.abspath(__file__))
+        with open(os.path.join(script_dir, 'data', 'properties.json'), 'r') as f:
+            properties = json.load(f)
+            size_mapping = properties['sizes']
+    except Exception:
+        size_mapping = {'small': 0.35, 'large': 0.7}
+
+    def get_radius(obj):
+        r = size_mapping.get(obj['size'], 0.5)
+        if obj['shape'] == 'cube':
+            r /= math.sqrt(2)
+        return r
+
+    def is_valid_pos(cf_scene, a_idx, new_x, new_y, r_a, min_dist=0.12):
+        for i, other in enumerate(cf_scene['objects']):
+            if i == a_idx:
+                continue
+            ox, oy, _ = other['3d_coords']
+            other_r = get_radius(other)
+            dist = math.sqrt((new_x - ox)**2 + (new_y - oy)**2)
+            if dist < (r_a + other_r + min_dist):
+                return False
+        return -2.8 <= new_x <= 2.8 and -2.8 <= new_y <= 2.8
+
+    relationships = cf_scene.get('relationships', {})
+    left_of = relationships.get('left', [])
+    right_of = relationships.get('right', [])
+
+    candidates = []
+    for b_idx in range(len(cf_scene['objects'])):
+        for a_idx in left_of[b_idx] if b_idx < len(left_of) else []:
+            if a_idx != b_idx and a_idx < len(cf_scene['objects']):
+                candidates.append((a_idx, b_idx, 'left'))
+        for a_idx in right_of[b_idx] if b_idx < len(right_of) else []:
+            if a_idx != b_idx and a_idx < len(cf_scene['objects']):
+                candidates.append((a_idx, b_idx, 'right'))
+
+    if not candidates:
+        lx, ly = float(left_vec[0]), float(left_vec[1])
+        for a_idx in range(len(cf_scene['objects'])):
+            for b_idx in range(len(cf_scene['objects'])):
+                if a_idx == b_idx:
+                    continue
+                ax_a, ay_a, _ = cf_scene['objects'][a_idx]['3d_coords']
+                bx_b, by_b, _ = cf_scene['objects'][b_idx]['3d_coords']
+                dx, dy = ax_a - bx_b, ay_a - by_b
+                dot = dx * lx + dy * ly
+                if abs(dot) > 0.2:
+                    side = 'left' if dot > 0 else 'right'
+                    candidates.append((a_idx, b_idx, side))
+
+    if not candidates:
+        return cf_scene, "no flip (no clear left/right relationships)"
+
+    random.shuffle(candidates)
+    lx, ly = float(left_vec[0]), float(left_vec[1])
+
+    for a_idx, b_idx, side in candidates:
+        obj_a = cf_scene['objects'][a_idx]
+        obj_b = cf_scene['objects'][b_idx]
+        ax, ay, az = obj_a['3d_coords']
+        bx, by, bz = obj_b['3d_coords']
+        r_a = get_radius(obj_a)
+
+        dx, dy = ax - bx, ay - by
+        dot_left = dx * lx + dy * ly
+        ref_dx = dx - 2 * dot_left * lx
+        ref_dy = dy - 2 * dot_left * ly
+
+        for scale in [1.0, 0.9, 0.8, 0.7, 0.85, 0.75]:
+            new_x = bx + scale * ref_dx
+            new_y = by + scale * ref_dy
+            if is_valid_pos(cf_scene, a_idx, new_x, new_y, r_a):
+                obj_a['3d_coords'] = [new_x, new_y, az]
+                obj_a['pixel_coords'] = [0, 0, 0]
+                new_side = "right" if side == "left" else "left"
+                return cf_scene, f"moved {obj_a['color']} {obj_a['shape']} from {side} of {obj_b['color']} {obj_b['shape']} to {new_side}"
+
+    return cf_scene, "no flip (couldn't find collision-free position)"
+
+
 def cf_change_background(scene):
     cf_scene = copy.deepcopy(scene)
 
@@ -854,16 +1024,17 @@ IMAGE_COUNTERFACTUALS = {
     'change_size': cf_change_size,
     'change_material': cf_change_material,
     'change_position': cf_change_position,
-    'change_count': cf_change_surrounding_count,
     'add_object': cf_add_object,
     'remove_object': cf_remove_object,
     'replace_object': cf_replace_object,
+    'swap_attribute': cf_swap_attribute,
+    'occlusion_change': cf_occlusion_change,
+    'relational_flip': cf_relational_flip,
 }
 
 # Negative CFs - These should NOT change answers to questions
 NEGATIVE_COUNTERFACTUALS = {
     'change_background': cf_change_background,
-    'change_texture': cf_change_texture,
     'change_lighting': cf_change_lighting,
     'add_noise': cf_add_noise,
     'apply_fisheye': cf_apply_fisheye,
@@ -965,7 +1136,7 @@ def generate_counterfactuals(scene, num_counterfactuals=2, cf_types=None, same_c
         one_type = random.choice(list(COUNTERFACTUAL_TYPES.keys()))
         selected_types = [one_type] * num_counterfactuals
     elif cf_types:
-        selected_types = (cf_types * ((num_counterfactuals // len(cf_types)) + 1))[:num_counterfactuals]
+        selected_types = [random.choice(cf_types) for _ in range(num_counterfactuals)]
 
     if selected_types is not None:
         for cf_type in selected_types:
@@ -1293,14 +1464,15 @@ def list_counterfactual_types():
     print("  change_size - Change size of an object (small/large)")
     print("  change_material - Change material of an object (metal/rubber)")
     print("  change_position - Move an object to a different location")
-    print("  change_count - Add or remove objects from the scene")
     print("  add_object - Add a new random object")
+    print("  swap_attribute - Swap colors between two objects")
+    print("  occlusion_change - Move object to partially hide another")
+    print("  relational_flip - Move object from left of X to right of X")
     print("  remove_object - Remove a random object")
     print("  replace_object - Replace an object with a different one")
 
     print("\nNEGATIVE COUNTERFACTUALS (Should NOT change VQA answers):")
     print("  change_background - Change background/ground color")
-    print("  change_texture - Change texture style (all objects)")
     print("  change_lighting - Change lighting conditions")
    print("  add_noise - Add image noise/grain")
     print("  apply_fisheye - Apply fisheye lens distortion")
@@ -1556,6 +1728,218 @@ def save_run_metadata(run_dir, args, successful_scenes, successful_renders):
 
     print(f"\n[OK] Saved run metadata to: {metadata_path}")
 
+
+def regenerate_scene_sets(args):
+    """Regenerate specific scene sets in an existing run. Uses settings from run_metadata.json."""
+    run_dir = os.path.join(args.output_dir, args.run_name)
+    if not os.path.exists(run_dir):
+        print(f"ERROR: Run directory does not exist: {run_dir}")
+        return
+
+    metadata_path = os.path.join(run_dir, 'run_metadata.json')
+    if not os.path.exists(metadata_path):
+        print(f"ERROR: run_metadata.json not found in {run_dir}. Cannot determine original settings.")
+        return
+
+    with open(metadata_path, 'r') as f:
+        metadata = json.load(f)
+    meta_args = metadata.get('arguments', {})
+
+    scenes_dir = os.path.join(run_dir, 'scenes')
+    images_dir = os.path.join(run_dir, 'images')
+    os.makedirs(scenes_dir, exist_ok=True)
+    os.makedirs(images_dir, exist_ok=True)
+
+    blender_path = args.blender_path or find_blender()
+    print(f"Using Blender: {blender_path}")
+    print("\nPreparing scripts...")
+    create_patched_render_script()
+
+    scene_indices = sorted(set(args.regenerate))
+    num_counterfactuals = meta_args.get('num_counterfactuals', 2)
+    cf_types = meta_args.get('cf_types')
+    if isinstance(cf_types, list) and cf_types:
+        pass
+    else:
+        cf_types = None
+
+    use_gpu = meta_args.get('use_gpu', 0)
+    samples = meta_args.get('samples', 512)
+    width = meta_args.get('width', 320)
+    height = meta_args.get('height', 240)
+
+    print(f"\n{'='*70}")
+    print(f"REGENERATING {len(scene_indices)} SCENE SETS: {scene_indices}")
+    print(f"{'='*70}")
+
+    temp_run_id = os.path.basename(run_dir)
+    checkpoint_file = os.path.join(run_dir, 'checkpoint.json')
+    completed_scenes = load_checkpoint(checkpoint_file)
+
+    for i in scene_indices:
+        print(f"\n{'='*70}")
+        print(f"REGENERATING SCENE SET #{i}")
+        print(f"{'='*70}")
+
+        num_objects = meta_args.get('num_objects')
+        if num_objects is None:
+            min_objs = meta_args.get('min_objects', 3)
+            max_objs = meta_args.get('max_objects', 7)
+            num_objects = random.randint(min_objs, max_objs)
+
+        base_scene = None
+        for retry in range(3):
+            base_scene = generate_base_scene(num_objects, blender_path, i, temp_run_dir=temp_run_id)
+            if base_scene and len(base_scene.get('objects', [])) > 0:
+                break
+            print(f"  Retry {retry + 1}/3...")
+
+        if not base_scene or len(base_scene.get('objects', [])) == 0:
+            print(f"  [FAILED] Could not generate base scene for #{i}")
+            continue
+
+        min_cf_score = meta_args.get('min_cf_change_score', 1.0)
+        max_cf_attempts = meta_args.get('max_cf_attempts', 10)
+        min_noise = meta_args.get('min_noise_level', 'light')
+
+        counterfactuals = generate_counterfactuals(
+            base_scene,
+            num_counterfactuals,
+            cf_types=cf_types,
+            same_cf_type=meta_args.get('same_cf_type', False),
+            min_change_score=min_cf_score,
+            max_cf_attempts=max_cf_attempts,
+            min_noise_level=min_noise
+        )
+
+        for idx, cf in enumerate(counterfactuals):
+            print(f"  CF{idx+1} [{cf.get('cf_category', '?')}] ({cf.get('type', '?')}): {cf.get('description', '')}")
+
+        scene_prefix = f"scene_{i:04d}"
+        scene_paths = {'original': os.path.join(scenes_dir, f"{scene_prefix}_original.json")}
+        image_paths = {'original': os.path.join(images_dir, f"{scene_prefix}_original.png")}
+
+        base_scene['cf_metadata'] = {
+            'variant': 'original', 'is_counterfactual': False, 'cf_index': None,
+            'cf_category': 'original', 'cf_type': None, 'cf_description': None, 'source_scene': scene_prefix,
+        }
+        save_scene(base_scene, scene_paths['original'])
+
+        for idx, cf in enumerate(counterfactuals):
+            cf_name = f"cf{idx+1}"
+            scene_paths[cf_name] = os.path.join(scenes_dir, f"{scene_prefix}_{cf_name}.json")
+            image_paths[cf_name] = os.path.join(images_dir, f"{scene_prefix}_{cf_name}.png")
 
 def main():
     # When run as a script, ensure we're in the project root
     # Change to script directory so relative paths work
@@ -1602,10 +1986,10 @@ def main():
             # Image CFs (should change answers)
             'change_color', 'change_shape', 'change_size',
             'change_material', 'change_position',
-            'change_count', 'add_object', 'remove_object',
-            'replace_object',
             # Negative CFs (should NOT change answers)
-            'change_background', 'change_texture',
             'change_lighting', 'add_noise',
             'apply_fisheye', 'apply_blur', 'apply_vignette', 'apply_chromatic_aberration'
         ],
@@ -1625,8 +2009,12 @@ def main():
 
     parser.add_argument('--generate_questions', action='store_true',
                         help='Create questions and answers CSV after rendering completes')
     parser.add_argument('--csv_name', default='image_mapping_with_questions.csv',
                        help='Output CSV filename (default: image_mapping_with_questions.csv)')
 
     args = parser.parse_args()
 
@@ -1642,6 +2030,13 @@ def main():
         print("ERROR: --run_name is required when using --resume")
         return
 
     # Find Blender
     blender_path = args.blender_path or find_blender()
     print(f"Using Blender: {blender_path}")
@@ -1843,6 +2238,8 @@ def main():
                 generate_questions=True
             )
             print(f"\n[OK] CSV saved to: {os.path.join(run_dir, args.csv_name)}")
         except Exception as e:
             print(f"\n[ERROR] Questions: {e}")
             import traceback
+ cf_scene = cf['scene']
1833
+ cf_scene['cf_metadata'] = {
1834
+ 'variant': cf_name, 'is_counterfactual': True, 'cf_index': idx + 1,
1835
+ 'cf_category': cf.get('cf_category', 'unknown'), 'cf_type': cf.get('type', None),
1836
+ 'cf_description': cf.get('description', None), 'change_score': cf.get('change_score'),
1837
+ 'change_attempts': cf.get('change_attempts'), 'source_scene': scene_prefix,
1838
+ }
1839
+ save_scene(cf_scene, scene_paths[cf_name])
1840
+
1841
+ print(f" [OK] Saved {len(counterfactuals) + 1} scene files")
1842
+
1843
+ if not args.skip_render:
1844
+ print(" Rendering...")
1845
+ render_success = 0
1846
+ for scene_type, scene_path in scene_paths.items():
1847
+ if render_scene(blender_path, scene_path, image_paths[scene_type],
1848
+ use_gpu, samples, width, height):
1849
+ render_success += 1
1850
+ print(f" [OK] {scene_type}")
1851
+ print(f" [OK] Rendered {render_success}/{len(scene_paths)} images")
1852
+ completed_scenes.add(i)
1853
+
1854
+ save_checkpoint(checkpoint_file, list(completed_scenes))
1855
+
1856
+ temp_run_path = os.path.join(os.getcwd(), 'temp_output', temp_run_id)
1857
+ if os.path.exists(temp_run_path):
1858
+ shutil.rmtree(temp_run_path)
1859
+ if os.path.exists('render_images_patched.py'):
1860
+ try:
1861
+ os.remove('render_images_patched.py')
1862
+ except Exception:
1863
+ pass
1864
+
1865
+ print(f"\n{'='*70}")
1866
+ print("REGENERATION COMPLETE")
1867
+ print(f"{'='*70}")
1868
+ print(f"Regenerated {len(scene_indices)} scene sets: {scene_indices}")
1869
+ print(f"Run directory: {run_dir}")
1870
+
1871
+ if args.generate_questions:
1872
+ if generate_mapping_with_questions is None:
1873
+ print("\n[WARNING] Questions module not found. Skipping CSV generation.")
1874
+ else:
1875
+ print("\nRegenerating questions CSV...")
1876
+ try:
1877
+ generate_mapping_with_questions(run_dir, args.csv_name, generate_questions=True)
1878
+ print(f"[OK] CSV saved to: {os.path.join(run_dir, args.csv_name)}")
1879
+ except Exception as e:
1880
+ print(f"[ERROR] Questions: {e}")
1881
+ import traceback
1882
+ traceback.print_exc()
1883
+
1884
+
1885
+ def filter_same_answer_scenes(run_dir, csv_filename):
1886
+ """Remove CSV rows where CF1 or CF2 answer matches original; delete those scenes' images and scene JSONs."""
1887
+ csv_path = os.path.join(run_dir, csv_filename)
1888
+ if not os.path.isfile(csv_path):
1889
+ return
1890
+ with open(csv_path, 'r', encoding='utf-8') as f:
1891
+ reader = csv.reader(f)
1892
+ header = next(reader)
1893
+ try:
1894
+ idx_orig_ans = header.index('original_image_answer_to_original_question')
1895
+ idx_cf1_ans = header.index('cf1_image_answer_to_cf1_question')
1896
+ idx_cf2_ans = header.index('cf2_image_answer_to_cf2_question')
1897
+ idx_orig_img = header.index('original_image')
1898
+ except ValueError:
1899
+ return
1900
+ kept_rows = [header]
1901
+ removed_scene_ids = set()
1902
+ with open(csv_path, 'r', encoding='utf-8') as f:
1903
+ reader = csv.reader(f)
1904
+ next(reader)
1905
+ for row in reader:
1906
+ if len(row) <= max(idx_orig_ans, idx_cf1_ans, idx_cf2_ans, idx_orig_img):
1907
+ kept_rows.append(row)
1908
+ continue
1909
+ o = str(row[idx_orig_ans]).strip().lower()
1910
+ c1 = str(row[idx_cf1_ans]).strip().lower()
1911
+ c2 = str(row[idx_cf2_ans]).strip().lower()
1912
+ if o == c1 or o == c2:
1913
+ orig_img = row[idx_orig_img]
1914
+ if orig_img.endswith('_original.png'):
1915
+ scene_id = orig_img.replace('_original.png', '')
1916
+ removed_scene_ids.add(scene_id)
1917
+ continue
1918
+ kept_rows.append(row)
1919
+ if not removed_scene_ids:
1920
+ return
1921
+ with open(csv_path, 'w', newline='', encoding='utf-8') as f:
1922
+ writer = csv.writer(f, quoting=csv.QUOTE_ALL)
1923
+ writer.writerows(kept_rows)
1924
+ images_dir = os.path.join(run_dir, 'images')
1925
+ scenes_dir = os.path.join(run_dir, 'scenes')
1926
+ deleted = 0
1927
+ for scene_id in removed_scene_ids:
1928
+ for suffix in ('_original', '_cf1', '_cf2'):
1929
+ for d, ext in [(images_dir, '.png'), (scenes_dir, '.json')]:
1930
+ if not os.path.isdir(d):
1931
+ continue
1932
+ fn = scene_id + suffix + ext
1933
+ fp = os.path.join(d, fn)
1934
+ if os.path.isfile(fp):
1935
+ try:
1936
+ os.remove(fp)
1937
+ deleted += 1
1938
+ except OSError:
1939
+ pass
1940
+ print(f"\n[OK] Filtered {len(removed_scene_ids)} scenes where answers matched; removed {deleted} files. CSV now has {len(kept_rows) - 1} rows.")
1941
+
1942
+
1943
  def main():
1944
  # When run as a script, ensure we're in the project root
1945
  # Change to script directory so relative paths work
 
1986
  # Image CFs (should change answers)
1987
  'change_color', 'change_shape', 'change_size',
1988
  'change_material', 'change_position',
1989
+ 'add_object', 'remove_object', 'replace_object',
1990
+ 'swap_attribute', 'occlusion_change', 'relational_flip',
1991
  # Negative CFs (should NOT change answers)
1992
+ 'change_background',
1993
  'change_lighting', 'add_noise',
1994
  'apply_fisheye', 'apply_blur', 'apply_vignette', 'apply_chromatic_aberration'
1995
  ],
 
2009
 
2010
  parser.add_argument('--generate_questions', action='store_true',
2011
  help='Create questions and answers CSV after rendering completes')
2012
+ parser.add_argument('--filter_same_answer', action='store_true',
2013
+ help='After generating questions, remove scenes where CF1 or CF2 answer matches original (delete those rows and their image/scene files). Use with --generate_questions.')
2014
  parser.add_argument('--csv_name', default='image_mapping_with_questions.csv',
2015
  help='Output CSV filename (default: image_mapping_with_questions.csv)')
2016
+ parser.add_argument('--regenerate', nargs='+', type=int, metavar='N',
2017
+ help='Regenerate specific scene sets by index (e.g. --regenerate 63 83 272). Requires --run_name. Uses settings from run_metadata.json.')
2018
 
2019
  args = parser.parse_args()
2020
 
 
2030
  print("ERROR: --run_name is required when using --resume")
2031
  return
2032
 
2033
+ if args.regenerate is not None:
2034
+ if not args.run_name:
2035
+ print("ERROR: --run_name is required when using --regenerate")
2036
+ return
2037
+ regenerate_scene_sets(args)
2038
+ return
2039
+
2040
  # Find Blender
2041
  blender_path = args.blender_path or find_blender()
2042
  print(f"Using Blender: {blender_path}")
 
2238
  generate_questions=True
2239
  )
2240
  print(f"\n[OK] CSV saved to: {os.path.join(run_dir, args.csv_name)}")
2241
+ if getattr(args, 'filter_same_answer', False):
2242
+ filter_same_answer_scenes(run_dir, args.csv_name)
2243
  except Exception as e:
2244
  print(f"\n[ERROR] Questions: {e}")
2245
  import traceback
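The `--filter_same_answer` pass above keeps a dataset row only when both counterfactual answers differ from the original answer after case and whitespace normalization. As a minimal standalone sketch of that predicate (the function names here are illustrative, not part of the diff):

```python
def normalize(ans):
    """Lowercase and strip an answer so 'Yes ' and 'yes' compare equal."""
    return str(ans).strip().lower()


def row_is_discriminative(orig_ans, cf1_ans, cf2_ans):
    """A row survives filtering only if neither CF answer matches the original."""
    o = normalize(orig_ans)
    return normalize(cf1_ans) != o and normalize(cf2_ans) != o


# A counterfactual that leaves the answer unchanged carries no signal, so
# the whole scene set (row plus its image/scene files) is dropped.
print(row_is_discriminative("3", "4", "5"))        # True  -> kept
print(row_is_discriminative("Yes", "yes ", "no"))  # False -> dropped
```

The same normalization (`strip().lower()`) appears in `filter_same_answer_scenes`, which is why answers differing only in case or trailing whitespace are treated as identical.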
scripts/generate_questions_mapping.py CHANGED
@@ -1,5 +1,3 @@
1
-
2
-
3
  import os
4
  import argparse
5
  import csv
@@ -73,19 +71,28 @@ def get_scene_properties(scene):
73
 
74
  IMAGE_CF_TYPES = {
75
  'change_color', 'change_shape', 'change_size', 'change_material',
76
- 'change_position', 'change_count', 'add_object', 'remove_object', 'replace_object'
 
77
  }
78
  NEGATIVE_CF_TYPES = {
79
- 'change_background', 'change_texture', 'change_lighting', 'add_noise',
80
  'apply_fisheye', 'apply_blur', 'apply_vignette', 'apply_chromatic_aberration'
81
  }
82
 
 
 
83
  def get_cf_type_from_scene(scene):
84
  meta = scene.get('cf_metadata') or {}
85
  if not meta.get('is_counterfactual'):
86
  return None
87
  return meta.get('cf_type')
88
 
 
 
 
 
 
 
89
  def get_change_details(original_scene, cf_scene):
90
  orig_objs = original_scene.get('objects', [])
91
  cf_objs = cf_scene.get('objects', [])
@@ -100,12 +107,101 @@ def get_change_details(original_scene, cf_scene):
100
  return {'attribute': attr, 'orig_val': ov or 'unknown', 'cf_val': cv or 'unknown', 'object_index': i}
101
  return None
102
 
103
- def generate_question_for_counterfactual(cf_type, original_scene, cf_scene):
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
104
  change = get_change_details(original_scene, cf_scene)
105
  orig_objs = original_scene.get('objects', [])
106
  cf_objs = cf_scene.get('objects', [])
107
  props_orig = get_scene_properties(original_scene)
108
  props_cf = get_scene_properties(cf_scene)
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
109
  if cf_type and cf_type in NEGATIVE_CF_TYPES:
110
  templates = [
111
  ("How many objects are in the scene?", {}),
@@ -123,57 +219,91 @@ def generate_question_for_counterfactual(cf_type, original_scene, cf_scene):
123
  return question, params
124
 
125
  if change and change.get('attribute') == 'count':
126
- question = "How many objects are in the scene?"
127
- return question, {}
 
 
 
 
 
 
 
 
 
 
128
 
129
  if change and change.get('attribute') in ('color', 'shape', 'material', 'size'):
130
  attr = change['attribute']
131
- orig_val = change.get('orig_val', '')
132
- cf_val = change.get('cf_val', '')
 
 
133
  if attr == 'color':
134
- question = f"How many {cf_val} objects are there?"
135
- params = {'color': cf_val}
136
  elif attr == 'shape':
137
- question = f"How many {cf_val}s are there?" if not cf_val.endswith('s') else f"How many {cf_val} are there?"
138
- params = {'shape': cf_val.rstrip('s')}
 
 
139
  elif attr == 'material':
140
- question = f"How many {cf_val} objects are there?"
141
- params = {'material': cf_val}
142
  elif attr == 'size':
143
- question = f"How many {cf_val} objects are there?"
144
- params = {'size': cf_val}
145
  else:
146
  question = "How many objects are in the scene?"
147
  params = {}
148
  return question, params
149
 
 
 
 
 
 
 
 
 
 
 
150
  if cf_type in ('change_color', 'change_shape', 'replace_object'):
151
  for attr, key in [('color', 'colors'), ('shape', 'shapes'), ('material', 'materials'), ('size', 'sizes')]:
152
- vals = props_cf.get(key) or props_orig.get(key) or []
153
  if vals:
154
- val = random.choice(list(vals))
155
  if attr == 'shape':
156
- plural = val + 's' if not val.endswith('s') else val
157
- question = f"How many {plural} are there?"
 
 
158
  elif attr == 'color':
159
- question = f"How many {val} objects are there?"
 
160
  elif attr == 'material':
161
- question = f"How many {val} objects are there?"
 
162
  else:
163
- question = f"How many {val} objects are there?"
164
- return question, {attr: val}
165
- if cf_type in ('change_count', 'add_object', 'remove_object'):
166
- return "How many objects are in the scene?", {}
167
- if cf_type in ('change_size', 'change_material', 'change_position'):
168
  key = 'sizes' if cf_type == 'change_size' else ('materials' if cf_type == 'change_material' else 'colors')
169
  attr = key.rstrip('s')
170
- vals = props_cf.get(key) or props_orig.get(key) or []
171
  if vals:
172
- val = random.choice(list(vals))
173
- question = f"How many {val} objects are there?"
 
 
 
 
 
 
174
  return question, {attr: val}
175
 
176
- question = "How many objects are in the scene?"
177
  return question, {}
178
 
179
  def generate_question_for_scene(scene_file):
@@ -502,8 +632,6 @@ def create_counterfactual_questions(original_question, params, scene):
502
  if cf_q is None:
503
  cf_q = "How many objects are in the scene?"
504
  cf_params = {}
505
-
506
- # Ensure cf_params is set
507
  if not cf_params:
508
  cf_params = {}
509
 
@@ -545,10 +673,23 @@ def create_counterfactual_questions(original_question, params, scene):
545
 
546
  return cf_questions
547
 
 
 
 
 
 
 
548
  def answer_question_for_scene(question, scene):
549
  objects = scene.get('objects', [])
550
  question_lower = question.lower()
551
 
 
 
 
 
 
 
 
552
  if "more than" in question_lower:
553
  match = re.search(r'more than (\d+)', question_lower)
554
  if match:
@@ -774,7 +915,9 @@ def generate_mapping_with_questions(run_dir, csv_filename='image_mapping_with_qu
774
  if with_links:
775
  header = ['scene_id', 'original_image_link', 'original_scene_link',
776
  'counterfactual1_image_link', 'counterfactual1_scene_link',
777
- 'counterfactual2_image_link', 'counterfactual2_scene_link']
 
 
778
  if generate_questions:
779
  header.extend([
780
  'original_question', 'counterfactual1_question', 'counterfactual2_question',
@@ -793,6 +936,8 @@ def generate_mapping_with_questions(run_dir, csv_filename='image_mapping_with_qu
793
  elif generate_questions:
794
  rows.append([
795
  'original_image', 'counterfactual1_image', 'counterfactual2_image',
 
 
796
  'original_question', 'counterfactual1_question', 'counterfactual2_question',
797
  'original_question_difficulty', 'counterfactual1_question_difficulty', 'counterfactual2_question_difficulty',
798
  'original_image_answer_to_original_question',
@@ -806,7 +951,9 @@ def generate_mapping_with_questions(run_dir, csv_filename='image_mapping_with_qu
806
  'cf2_image_answer_to_cf2_question'
807
  ])
808
  else:
809
- rows.append(['original_image', 'counterfactual1_image', 'counterfactual2_image'])
 
 
810
 
811
  total_scenes = len(scene_sets)
812
 
@@ -841,47 +988,75 @@ def generate_mapping_with_questions(run_dir, csv_filename='image_mapping_with_qu
841
 
842
  try:
843
  original_question, params = generate_question_for_scene(original_scene_file)
 
844
  cf1_type = get_cf_type_from_scene(cf1_scene)
845
  cf2_type = get_cf_type_from_scene(cf2_scene)
846
- cf_questions = create_counterfactual_questions(original_question, params, original_scene) if (not cf1_type or not cf2_type) else None
847
- if cf1_type:
848
- cf1_question, cf1_params = generate_question_for_counterfactual(cf1_type, original_scene, cf1_scene)
849
- else:
850
- cf1_question, cf1_params = cf_questions[0] if cf_questions and len(cf_questions) > 0 else ("How many objects are in the scene?", {})
851
- if cf2_type:
852
- cf2_question, cf2_params = generate_question_for_counterfactual(cf2_type, original_scene, cf2_scene)
853
- else:
854
- cf2_question, cf2_params = cf_questions[1] if cf_questions and len(cf_questions) > 1 else (cf_questions[0] if cf_questions else ("How many objects are in the scene?", {}))
855
  except Exception as e:
856
  import traceback
857
  traceback.print_exc()
858
  continue
859
 
860
- try:
861
- original_difficulty = calculate_question_difficulty(original_question, params)
862
- cf1_difficulty = calculate_question_difficulty(cf1_question, cf1_params)
863
- cf2_difficulty = calculate_question_difficulty(cf2_question, cf2_params)
864
- except Exception as e:
865
- import traceback
866
- traceback.print_exc()
867
- continue
868
 
869
- try:
870
- original_ans_orig_q = answer_question_for_scene(original_question, original_scene)
871
- original_ans_cf1_q = answer_question_for_scene(cf1_question, original_scene)
872
- original_ans_cf2_q = answer_question_for_scene(cf2_question, original_scene)
 
 
 
 
 
 
 
 
 
 
 
 
873
 
874
- cf1_ans_orig_q = answer_question_for_scene(original_question, cf1_scene)
875
- cf1_ans_cf1_q = answer_question_for_scene(cf1_question, cf1_scene)
876
- cf1_ans_cf2_q = answer_question_for_scene(cf2_question, cf1_scene)
 
 
 
 
 
877
 
878
- cf2_ans_orig_q = answer_question_for_scene(original_question, cf2_scene)
879
- cf2_ans_cf1_q = answer_question_for_scene(cf1_question, cf2_scene)
880
- cf2_ans_cf2_q = answer_question_for_scene(cf2_question, cf2_scene)
881
- except Exception as e:
882
- import traceback
883
- traceback.print_exc()
884
- continue
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
885
 
886
  try:
887
  if with_links:
@@ -903,6 +1078,7 @@ def generate_mapping_with_questions(run_dir, csv_filename='image_mapping_with_qu
903
  original_image_link, original_scene_link,
904
  cf1_image_link, cf1_scene_link,
905
  cf2_image_link, cf2_scene_link,
 
906
  original_question, cf1_question, cf2_question,
907
  original_difficulty, cf1_difficulty, cf2_difficulty,
908
  original_ans_orig_q, original_ans_cf1_q, original_ans_cf2_q,
@@ -912,6 +1088,7 @@ def generate_mapping_with_questions(run_dir, csv_filename='image_mapping_with_qu
912
  else:
913
  rows.append([
914
  original_id, cf1_id, cf2_id,
 
915
  original_question, cf1_question, cf2_question,
916
  original_difficulty, cf1_difficulty, cf2_difficulty,
917
  original_ans_orig_q, original_ans_cf1_q, original_ans_cf2_q,
@@ -923,6 +1100,20 @@ def generate_mapping_with_questions(run_dir, csv_filename='image_mapping_with_qu
923
  traceback.print_exc()
924
  continue
925
  else:
 
 
 
 
 
 
 
 
 
 
 
 
 
 
926
  if with_links:
927
  def make_link(filename, file_type='image'):
928
  if base_url:
@@ -941,10 +1132,11 @@ def generate_mapping_with_questions(run_dir, csv_filename='image_mapping_with_qu
941
  scene_num,
942
  original_image_link, original_scene_link,
943
  cf1_image_link, cf1_scene_link,
944
- cf2_image_link, cf2_scene_link
 
945
  ])
946
  else:
947
- rows.append([original_id, cf1_id, cf2_id])
948
 
949
  csv_path = os.path.join(run_dir, csv_filename)
950
  try:
@@ -966,44 +1158,29 @@ def generate_mapping_with_questions(run_dir, csv_filename='image_mapping_with_qu
966
  if generate_questions:
967
  print(f" Scene ID: {row[0]}")
968
  print(f" Links:")
969
- print(f" Original image: {row[1]}")
970
- print(f" Original scene: {row[2]}")
971
- print(f" CF1 image: {row[3]}")
972
- print(f" CF1 scene: {row[4]}")
973
- print(f" CF2 image: {row[5]}")
974
- print(f" CF2 scene: {row[6]}")
975
- print(f" Questions:")
976
- print(f" Original: {row[7]}")
977
- print(f" CF1: {row[8]}")
978
- print(f" CF2: {row[9]}")
979
  else:
980
  print(f" Scene ID: {row[0]}")
981
  print(f" Links:")
982
  print(f" Original image: {row[1]}, scene: {row[2]}")
983
  print(f" CF1 image: {row[3]}, scene: {row[4]}")
984
  print(f" CF2 image: {row[5]}, scene: {row[6]}")
985
- elif generate_questions and len(row) > 6:
986
- print(f" Images:")
987
- print(f" Original: {row[0]}")
988
- print(f" Counterfactual 1: {row[1]}")
989
- print(f" Counterfactual 2: {row[2]}")
990
- print(f" Questions:")
991
- print(f" Original question: {row[3]}")
992
- print(f" CF1 question: {row[4]}")
993
- print(f" CF2 question: {row[5]}")
994
  print(f" Answer Matrix (scene × question):")
995
- print(f" Original image:")
996
- print(f" -> Original Q: {row[6]}")
997
- print(f" -> CF1 Q: {row[7]}")
998
- print(f" -> CF2 Q: {row[8]}")
999
- print(f" CF1 image:")
1000
- print(f" -> Original Q: {row[9]}")
1001
- print(f" -> CF1 Q: {row[10]}")
1002
- print(f" -> CF2 Q: {row[11]}")
1003
- print(f" CF2 image:")
1004
- print(f" -> Original Q: {row[12]}")
1005
- print(f" -> CF1 Q: {row[13]}")
1006
- print(f" -> CF2 Q: {row[14]}")
1007
 
1008
  def main():
1009
  parser = argparse.ArgumentParser(
 
 
 
1
  import os
2
  import argparse
3
  import csv
 
71
 
72
  IMAGE_CF_TYPES = {
73
  'change_color', 'change_shape', 'change_size', 'change_material',
74
+ 'change_position', 'add_object', 'remove_object', 'replace_object',
75
+ 'swap_attribute', 'occlusion_change', 'relational_flip'
76
  }
77
  NEGATIVE_CF_TYPES = {
78
+ 'change_background', 'change_lighting', 'add_noise',
79
  'apply_fisheye', 'apply_blur', 'apply_vignette', 'apply_chromatic_aberration'
80
  }
81
 
82
+ MAX_CF_ANSWER_RETRIES = 150
83
+
84
  def get_cf_type_from_scene(scene):
85
  meta = scene.get('cf_metadata') or {}
86
  if not meta.get('is_counterfactual'):
87
  return None
88
  return meta.get('cf_type')
89
 
90
+ def get_cf_description_from_scene(scene):
91
+ meta = scene.get('cf_metadata') or {}
92
+ if not meta.get('is_counterfactual'):
93
+ return None
94
+ return meta.get('cf_description')
95
+
96
  def get_change_details(original_scene, cf_scene):
97
  orig_objs = original_scene.get('objects', [])
98
  cf_objs = cf_scene.get('objects', [])
 
107
  return {'attribute': attr, 'orig_val': ov or 'unknown', 'cf_val': cv or 'unknown', 'object_index': i}
108
  return None
109
 
110
+ CF_COUNT_QUESTION_TEMPLATES = [
111
+ "How many objects are in the scene?",
112
+ "What is the total number of objects in the scene?",
113
+ ]
114
+ CF_COLOR_QUESTION_TEMPLATES = [
115
+ ("How many {val} objects are there?", 'color'),
116
+ ("Are there any {val} objects?", 'color'),
117
+ ("What is the total number of {val} objects?", 'color'),
118
+ ]
119
+ CF_SHAPE_QUESTION_TEMPLATES = [
120
+ ("How many {val} are there?", 'shape'),
121
+ ("Are there any {val}?", 'shape'),
122
+ ("What is the total number of {val}?", 'shape'),
123
+ ]
124
+ CF_MATERIAL_QUESTION_TEMPLATES = [
125
+ ("How many {val} objects are there?", 'material'),
126
+ ("Are there any {val} objects?", 'material'),
127
+ ("What is the total number of {val} objects?", 'material'),
128
+ ]
129
+ CF_SIZE_QUESTION_TEMPLATES = [
130
+ ("How many {val} objects are there?", 'size'),
131
+ ("Are there any {val} objects?", 'size'),
132
+ ("What is the total number of {val} objects?", 'size'),
133
+ ]
134
+
135
+
136
+ def _pluralize_shape(shape):
137
+ if not shape:
138
+ return shape
139
+ s = shape.strip().lower()
140
+ if s.endswith('s'):
141
+ return s
142
+ return s + 's'
143
+
144
+
145
+ def _count_by_attribute(objects, attr):
146
+ """Count objects per attribute value (color, shape, material, size)."""
147
+ counts = {}
148
+ for obj in objects:
149
+ val = (obj.get(attr) or '').lower().strip()
150
+ if val:
151
+ counts[val] = counts.get(val, 0) + 1
152
+ return counts
153
+
154
+
155
+ def _get_attributes_with_different_counts(original_scene, cf_scene):
156
+ """Find attribute values whose count differs between original and CF scene."""
157
+ orig_objs = original_scene.get('objects', [])
158
+ cf_objs = cf_scene.get('objects', [])
159
+ differing = []
160
+ for attr in ['color', 'shape', 'material', 'size']:
161
+ orig_counts = _count_by_attribute(orig_objs, attr)
162
+ cf_counts = _count_by_attribute(cf_objs, attr)
163
+ all_vals = set(orig_counts) | set(cf_counts)
164
+ for val in all_vals:
165
+ o = orig_counts.get(val, 0)
166
+ c = cf_counts.get(val, 0)
167
+ if o != c:
168
+ differing.append((attr, val, o, c))
169
+ return differing
170
+
171
+
172
+ def generate_question_for_counterfactual(cf_type, original_scene, cf_scene, retry_index=0):
173
+ """Generate a question tailored to cf_type. Use retry_index to vary question on retries."""
174
+ random.seed(hash((str(cf_type), retry_index, str(id(original_scene)), str(id(cf_scene)))))
175
  change = get_change_details(original_scene, cf_scene)
176
  orig_objs = original_scene.get('objects', [])
177
  cf_objs = cf_scene.get('objects', [])
178
  props_orig = get_scene_properties(original_scene)
179
  props_cf = get_scene_properties(cf_scene)
180
+
181
+ # For IMAGE CFs: prefer questions targeting attributes that differ between orig and cf
182
+ if cf_type and cf_type in IMAGE_CF_TYPES:
183
+ differing = _get_attributes_with_different_counts(original_scene, cf_scene)
184
+ if differing:
185
+ idx = retry_index % len(differing) if differing else 0
186
+ attr, val, orig_count, cf_count = differing[idx]
187
+ if attr == 'color':
188
+ template, _ = random.choice(CF_COLOR_QUESTION_TEMPLATES)
189
+ question = template.format(val=val)
190
+ elif attr == 'shape':
191
+ plural = _pluralize_shape(val)
192
+ template, _ = random.choice(CF_SHAPE_QUESTION_TEMPLATES)
193
+ question = template.format(val=plural)
194
+ elif attr == 'material':
195
+ template, _ = random.choice(CF_MATERIAL_QUESTION_TEMPLATES)
196
+ question = template.format(val=val)
197
+ elif attr == 'size':
198
+ template, _ = random.choice(CF_SIZE_QUESTION_TEMPLATES)
199
+ question = template.format(val=val)
200
+ else:
201
+ question = None
202
+ if question:
203
+ return question, {attr: val.rstrip('s') if attr == 'shape' else val}
204
+
205
  if cf_type and cf_type in NEGATIVE_CF_TYPES:
206
  templates = [
207
  ("How many objects are in the scene?", {}),
 
219
  return question, params
220
 
221
  if change and change.get('attribute') == 'count':
222
+ orig_count = change.get('orig_count', len(orig_objs))
223
+ cf_count = change.get('cf_count', len(cf_objs))
224
+ templates_with_params = []
225
+ templates_with_params.append((random.choice(CF_COUNT_QUESTION_TEMPLATES), {}))
226
+ if cf_count > orig_count:
227
+ templates_with_params.append((f"Are there more than {orig_count} objects?", {}))
228
+ templates_with_params.append((f"Are there at least {cf_count} objects?", {}))
229
+ if cf_count < orig_count:
230
+ templates_with_params.append((f"Are there fewer than {orig_count} objects?", {}))
231
+ templates_with_params.append((f"Are there more than {cf_count} objects?", {}))
232
+ template, params = random.choice(templates_with_params)
233
+ return template, params
234
 
235
  if change and change.get('attribute') in ('color', 'shape', 'material', 'size'):
236
  attr = change['attribute']
237
+ cf_val = (change.get('cf_val') or '').strip().lower()
238
+ if not cf_val:
239
+ cf_val = 'unknown'
240
+ params = {attr: cf_val}
241
  if attr == 'color':
242
+ template, _ = random.choice(CF_COLOR_QUESTION_TEMPLATES)
243
+ question = template.format(val=cf_val)
244
  elif attr == 'shape':
245
+ template, _ = random.choice(CF_SHAPE_QUESTION_TEMPLATES)
246
+ plural = _pluralize_shape(cf_val)
247
+ question = template.format(val=plural)
248
+ params['shape'] = cf_val.rstrip('s')
249
  elif attr == 'material':
250
+ template, _ = random.choice(CF_MATERIAL_QUESTION_TEMPLATES)
251
+ question = template.format(val=cf_val)
252
  elif attr == 'size':
253
+ template, _ = random.choice(CF_SIZE_QUESTION_TEMPLATES)
254
+ question = template.format(val=cf_val)
255
  else:
256
  question = "How many objects are in the scene?"
257
  params = {}
258
  return question, params
259
 
260
+ if cf_type in ('add_object', 'remove_object'):
261
+ templates = list(CF_COUNT_QUESTION_TEMPLATES)
262
+ if len(orig_objs) != len(cf_objs):
263
+ if len(cf_objs) > len(orig_objs):
264
+ templates.extend([f"Are there more than {len(orig_objs)} objects?", f"Are there at least {len(cf_objs)} objects?"])
265
+ else:
266
+ templates.extend([f"Are there fewer than {len(orig_objs)} objects?", f"Are there more than {len(cf_objs)} objects?"])
267
+ template = random.choice(templates)
268
+ return template, {}
269
+
270
  if cf_type in ('change_color', 'change_shape', 'replace_object'):
271
  for attr, key in [('color', 'colors'), ('shape', 'shapes'), ('material', 'materials'), ('size', 'sizes')]:
272
+ vals = list(props_cf.get(key) or props_orig.get(key) or [])
273
  if vals:
274
+ val = random.choice(vals)
275
  if attr == 'shape':
276
+ plural = _pluralize_shape(val)
277
+ templates = CF_SHAPE_QUESTION_TEMPLATES
278
+ template, _ = random.choice(templates)
279
+ question = template.format(val=plural)
280
  elif attr == 'color':
281
+ template, _ = random.choice(CF_COLOR_QUESTION_TEMPLATES)
282
+ question = template.format(val=val)
283
  elif attr == 'material':
284
+ template, _ = random.choice(CF_MATERIAL_QUESTION_TEMPLATES)
285
+ question = template.format(val=val)
286
  else:
287
+ template, _ = random.choice(CF_SIZE_QUESTION_TEMPLATES)
288
+ question = template.format(val=val)
289
+ return question, {attr: val.rstrip('s') if attr == 'shape' else val}
290
+
291
+ if cf_type in ('change_size', 'change_material', 'change_position', 'swap_attribute', 'occlusion_change', 'relational_flip'):
292
  key = 'sizes' if cf_type == 'change_size' else ('materials' if cf_type == 'change_material' else 'colors')
293
  attr = key.rstrip('s')
294
+ vals = list(props_cf.get(key) or props_orig.get(key) or [])
295
  if vals:
296
+ val = random.choice(vals)
297
+ if cf_type == 'change_size':
298
+ template, _ = random.choice(CF_SIZE_QUESTION_TEMPLATES)
299
+ elif cf_type == 'change_material':
300
+ template, _ = random.choice(CF_MATERIAL_QUESTION_TEMPLATES)
301
+ else:
302
+ template, _ = random.choice(CF_COLOR_QUESTION_TEMPLATES)
303
+ question = template.format(val=val)
304
  return question, {attr: val}
305
 
306
+ question = random.choice(CF_COUNT_QUESTION_TEMPLATES)
307
  return question, {}
308
 
309
  def generate_question_for_scene(scene_file):
 
632
  if cf_q is None:
633
  cf_q = "How many objects are in the scene?"
634
  cf_params = {}
 
 
635
  if not cf_params:
636
  cf_params = {}
637
 
 
673
 
674
  return cf_questions
675
 
676
+ def normalize_answer(a):
677
+ if a is None:
678
+ return ""
679
+ return str(a).strip().lower()
680
+
681
+
682
  def answer_question_for_scene(question, scene):
683
  objects = scene.get('objects', [])
684
  question_lower = question.lower()
685
 
686
+ if "at least" in question_lower:
687
+ match = re.search(r'at least (\d+)', question_lower)
688
+ if match:
689
+ threshold = int(match.group(1))
690
+ count = count_matching_objects(question_lower, objects)
691
+ return "yes" if count >= threshold else "no"
692
+
693
  if "more than" in question_lower:
694
  match = re.search(r'more than (\d+)', question_lower)
695
  if match:
 
915
  if with_links:
916
  header = ['scene_id', 'original_image_link', 'original_scene_link',
917
  'counterfactual1_image_link', 'counterfactual1_scene_link',
918
+ 'counterfactual2_image_link', 'counterfactual2_scene_link',
919
+ 'counterfactual1_type', 'counterfactual2_type',
920
+ 'counterfactual1_description', 'counterfactual2_description']
921
  if generate_questions:
922
  header.extend([
923
  'original_question', 'counterfactual1_question', 'counterfactual2_question',
 
936
  elif generate_questions:
937
  rows.append([
938
  'original_image', 'counterfactual1_image', 'counterfactual2_image',
939
+ 'counterfactual1_type', 'counterfactual2_type',
940
+ 'counterfactual1_description', 'counterfactual2_description',
941
  'original_question', 'counterfactual1_question', 'counterfactual2_question',
942
  'original_question_difficulty', 'counterfactual1_question_difficulty', 'counterfactual2_question_difficulty',
943
  'original_image_answer_to_original_question',
 
951
  'cf2_image_answer_to_cf2_question'
952
  ])
953
  else:
954
+ rows.append(['original_image', 'counterfactual1_image', 'counterfactual2_image',
955
+ 'counterfactual1_type', 'counterfactual2_type',
956
+ 'counterfactual1_description', 'counterfactual2_description'])
957
 
958
  total_scenes = len(scene_sets)
959
 
 
988
 
989
  try:
990
  original_question, params = generate_question_for_scene(original_scene_file)
991
+ original_ans_orig_q = answer_question_for_scene(original_question, original_scene)
992
  cf1_type = get_cf_type_from_scene(cf1_scene)
993
  cf2_type = get_cf_type_from_scene(cf2_scene)
994
+ cf1_description = get_cf_description_from_scene(cf1_scene)
995
+ cf2_description = get_cf_description_from_scene(cf2_scene)
996
  except Exception as e:
997
  import traceback
998
  traceback.print_exc()
999
  continue
1000
 
1001
+ cf1_question = cf2_question = None
1002
+ cf1_params = cf2_params = {}
1003
+ original_difficulty = cf1_difficulty = cf2_difficulty = None
1004
+ original_ans_cf1_q = original_ans_cf2_q = None
1005
+ cf1_ans_orig_q = cf1_ans_cf1_q = cf1_ans_cf2_q = None
1006
+ cf2_ans_orig_q = cf2_ans_cf1_q = cf2_ans_cf2_q = None
1007
+ orig_norm = normalize_answer(original_ans_orig_q)
1008
 
1009
+ for cf_retry in range(MAX_CF_ANSWER_RETRIES):
1010
+ try:
1011
+ random.seed(hash((scene_num, idx, cf_retry)))
1012
+ cf_questions = create_counterfactual_questions(original_question, params, original_scene) if (not cf1_type or not cf2_type) else None
1013
+ if cf1_type:
1014
+ cf1_question, cf1_params = generate_question_for_counterfactual(cf1_type, original_scene, cf1_scene, retry_index=cf_retry)
1015
+ else:
1016
+ cf1_question, cf1_params = cf_questions[0] if cf_questions and len(cf_questions) > 0 else ("How many objects are in the scene?", {})
1017
+ if cf2_type:
1018
+ cf2_question, cf2_params = generate_question_for_counterfactual(cf2_type, original_scene, cf2_scene, retry_index=cf_retry)
1019
+ else:
1020
+ cf2_question, cf2_params = cf_questions[1] if cf_questions and len(cf_questions) > 1 else (cf_questions[0] if cf_questions else ("How many objects are in the scene?", {}))
1021
+ except Exception as e:
1022
+ import traceback
1023
+ traceback.print_exc()
1024
+ continue
1025
 
1026
+ try:
1027
+ original_difficulty = calculate_question_difficulty(original_question, params)
1028
+ cf1_difficulty = calculate_question_difficulty(cf1_question, cf1_params)
1029
+ cf2_difficulty = calculate_question_difficulty(cf2_question, cf2_params)
1030
+ except Exception as e:
1031
+ import traceback
1032
+ traceback.print_exc()
1033
+ continue
1034
 
1035
+ try:
1036
+ original_ans_cf1_q = answer_question_for_scene(cf1_question, original_scene)
1037
+ original_ans_cf2_q = answer_question_for_scene(cf2_question, original_scene)
1038
+ cf1_ans_orig_q = answer_question_for_scene(original_question, cf1_scene)
1039
+ cf1_ans_cf1_q = answer_question_for_scene(cf1_question, cf1_scene)
1040
+ cf1_ans_cf2_q = answer_question_for_scene(cf2_question, cf1_scene)
1041
+ cf2_ans_orig_q = answer_question_for_scene(original_question, cf2_scene)
1042
+ cf2_ans_cf1_q = answer_question_for_scene(cf1_question, cf2_scene)
1043
+ cf2_ans_cf2_q = answer_question_for_scene(cf2_question, cf2_scene)
1044
+ except Exception as e:
1045
+ import traceback
1046
+ traceback.print_exc()
1047
+ continue
1048
+
1049
+ # For image CFs: ensure the CF image's answer to the CF question differs from the original image's answer to the same question.
1050
+ # change_position, occlusion_change, relational_flip only move objects.
1051
+ # swap_attribute swaps colors (net counts unchanged); the count-based QA cannot distinguish these, so skip validation.
1052
+ CF_TYPES_ACCEPT_WITHOUT_CHECK = {'change_position', 'swap_attribute', 'occlusion_change', 'relational_flip'}
1053
+ cf1_differs = (cf1_type not in IMAGE_CF_TYPES) or (cf1_type in CF_TYPES_ACCEPT_WITHOUT_CHECK) or (normalize_answer(original_ans_cf1_q) != normalize_answer(cf1_ans_cf1_q))
1054
+ cf2_differs = (cf2_type not in IMAGE_CF_TYPES) or (cf2_type in CF_TYPES_ACCEPT_WITHOUT_CHECK) or (normalize_answer(original_ans_cf2_q) != normalize_answer(cf2_ans_cf2_q))
1055
+ # Accept when at least one CF has different answers (maximize usable data; the other may have same answer due to count-preserving swaps)
1056
+ if cf1_differs or cf2_differs:
1057
+ break
1058
+ else:
1059
+ print(f"WARNING: Scene {scene_num}: could not find questions with different answers for both CFs after {MAX_CF_ANSWER_RETRIES} retries (scene included with best-effort questions)")
1060
 
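The acceptance rule used inside the retry loop (an image CF passes when its type is exempt or its answer actually changed) reduces to a small predicate. In this sketch, `IMAGE_CF_TYPES` is a stand-in for the set defined elsewhere in the script, and `normalize` mirrors `normalize_answer`:

```python
def normalize(a):
    # Mirrors normalize_answer from the diff.
    return "" if a is None else str(a).strip().lower()

# Stand-in for the set defined elsewhere in the script (illustrative subset only).
IMAGE_CF_TYPES = {'change_color', 'add_object', 'remove_object',
                  'change_position', 'swap_attribute'}
# Copied from the diff: count-preserving edits the QA cannot distinguish.
ACCEPT_WITHOUT_CHECK = {'change_position', 'swap_attribute',
                        'occlusion_change', 'relational_flip'}

def cf_answer_differs(cf_type, orig_answer, cf_answer):
    """Mirror of the cf1_differs / cf2_differs expressions."""
    if cf_type not in IMAGE_CF_TYPES:
        return True  # negative CFs are not required to change the answer
    if cf_type in ACCEPT_WITHOUT_CHECK:
        return True  # exempt: answer change is not detectable by the QA
    return normalize(orig_answer) != normalize(cf_answer)

print(cf_answer_differs('change_color', 'Red', 'blue'))  # -> True
print(cf_answer_differs('add_object', '3', ' 3 '))       # -> False
```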
1061
  try:
1062
  if with_links:
 
1078
  original_image_link, original_scene_link,
1079
  cf1_image_link, cf1_scene_link,
1080
  cf2_image_link, cf2_scene_link,
1081
+ cf1_type, cf2_type, cf1_description, cf2_description,
1082
  original_question, cf1_question, cf2_question,
1083
  original_difficulty, cf1_difficulty, cf2_difficulty,
1084
  original_ans_orig_q, original_ans_cf1_q, original_ans_cf2_q,
 
1088
  else:
1089
  rows.append([
1090
  original_id, cf1_id, cf2_id,
1091
+ cf1_type, cf2_type, cf1_description, cf2_description,
1092
  original_question, cf1_question, cf2_question,
1093
  original_difficulty, cf1_difficulty, cf2_difficulty,
1094
  original_ans_orig_q, original_ans_cf1_q, original_ans_cf2_q,
 
1100
  traceback.print_exc()
1101
  continue
1102
  else:
1103
+ # Load CF scene files to get cf_type and cf_description for the mapping
1104
+ cf1_type = cf2_type = cf1_description = cf2_description = ''
1105
+ cf1_scene_file = find_scene_file(scenes_dir, cf1_id)
1106
+ cf2_scene_file = find_scene_file(scenes_dir, cf2_id)
1107
+ if cf1_scene_file and cf2_scene_file:
1108
+ try:
1109
+ cf1_scene = load_scene(cf1_scene_file)
1110
+ cf2_scene = load_scene(cf2_scene_file)
1111
+ cf1_type = get_cf_type_from_scene(cf1_scene) or ''
1112
+ cf2_type = get_cf_type_from_scene(cf2_scene) or ''
1113
+ cf1_description = get_cf_description_from_scene(cf1_scene) or ''
1114
+ cf2_description = get_cf_description_from_scene(cf2_scene) or ''
1115
+ except Exception:
1116
+ pass
1117
  if with_links:
1118
  def make_link(filename, file_type='image'):
1119
  if base_url:
 
1132
  scene_num,
1133
  original_image_link, original_scene_link,
1134
  cf1_image_link, cf1_scene_link,
1135
+ cf2_image_link, cf2_scene_link,
1136
+ cf1_type, cf2_type, cf1_description, cf2_description
1137
  ])
1138
  else:
1139
+ rows.append([original_id, cf1_id, cf2_id, cf1_type, cf2_type, cf1_description, cf2_description])
1140
 
1141
  csv_path = os.path.join(run_dir, csv_filename)
1142
  try:
 
1158
  if generate_questions:
1159
  print(f" Scene ID: {row[0]}")
1160
  print(f" Links:")
1161
+ print(f" Original image: {row[1]}, scene: {row[2]}")
1162
+ print(f" CF1 image: {row[3]}, scene: {row[4]}")
1163
+ print(f" CF2 image: {row[5]}, scene: {row[6]}")
1164
+ print(f" CF type / description: CF1 type={row[7]}, CF2 type={row[8]}; CF1 desc={row[9]!r}, CF2 desc={row[10]!r}")
1165
+ print(f" Questions: Original: {row[11]}, CF1: {row[12]}, CF2: {row[13]}")
1166
  else:
1167
  print(f" Scene ID: {row[0]}")
1168
  print(f" Links:")
1169
  print(f" Original image: {row[1]}, scene: {row[2]}")
1170
  print(f" CF1 image: {row[3]}, scene: {row[4]}")
1171
  print(f" CF2 image: {row[5]}, scene: {row[6]}")
1172
+ print(f" CF type / description: CF1 type={row[7]}, CF2 type={row[8]}; CF1 desc={row[9]!r}, CF2 desc={row[10]!r}")
1173
+ elif generate_questions and len(row) > 14:
1174
+ print(f" Images: Original: {row[0]}, CF1: {row[1]}, CF2: {row[2]}")
1175
+ print(f" CF type / description: CF1 type={row[3]}, CF2 type={row[4]}; CF1 desc={row[5]!r}, CF2 desc={row[6]!r}")
1176
+ print(f" Questions: Original: {row[7]}, CF1: {row[8]}, CF2: {row[9]}")
1177
  print(f" Answer Matrix (scene × question):")
1178
+ print(f" Original image -> Orig Q: {row[10]}, CF1 Q: {row[11]}, CF2 Q: {row[12]}")
1179
+ print(f" CF1 image -> Orig Q: {row[13]}, CF1 Q: {row[14]}, CF2 Q: {row[15]}")
1180
+ print(f" CF2 image -> Orig Q: {row[16]}, CF1 Q: {row[17]}, CF2 Q: {row[18]}")
1181
+ elif len(row) >= 7:
1182
+ print(f" Images: Original: {row[0]}, CF1: {row[1]}, CF2: {row[2]}")
1183
+ print(f" CF type / description: CF1 type={row[3]}, CF2 type={row[4]}; CF1 desc={row[5]!r}, CF2 desc={row[6]!r}")
1184
 
1185
  def main():
1186
  parser = argparse.ArgumentParser(
scripts/generate_scenes.py CHANGED
@@ -53,9 +53,9 @@ def main():
53
  choices=[
54
  'change_color', 'change_shape', 'change_size',
55
  'change_material', 'change_position',
56
- 'change_count', 'add_object', 'remove_object',
57
  'replace_object',
58
- 'change_background', 'change_texture',
59
  'change_lighting', 'add_noise',
60
  'apply_fisheye', 'apply_blur', 'apply_vignette', 'apply_chromatic_aberration'
61
  ],
 
53
  choices=[
54
  'change_color', 'change_shape', 'change_size',
55
  'change_material', 'change_position',
56
+ 'add_object', 'remove_object', 'swap_attribute', 'occlusion_change', 'relational_flip',
57
  'replace_object',
58
+ 'change_background',
59
  'change_lighting', 'add_noise',
60
  'apply_fisheye', 'apply_blur', 'apply_vignette', 'apply_chromatic_aberration'
61
  ],
scripts/render.py CHANGED
@@ -353,6 +353,15 @@ parser.add_argument('--render_tile_size', default=256, type=int,
353
  parser.add_argument('--output_image', default=None,
354
  help="Output image path (used when rendering from JSON)")
355
 
356
  BACKGROUND_COLORS = {
357
  'default': None,
358
  'gray': (0.5, 0.5, 0.5),
@@ -370,11 +379,11 @@ BACKGROUND_COLORS = {
370
 
371
  LIGHTING_PRESETS = {
372
  'default': {'key': 1.0, 'fill': 0.5, 'back': 0.3},
373
- 'bright': {'key': 4.0, 'fill': 2.2, 'back': 1.4},
374
- 'dim': {'key': 0.08, 'fill': 0.04, 'back': 0.02},
375
- 'warm': {'key': 2.8, 'fill': 1.4, 'back': 0.7, 'color': (1.0, 0.75, 0.5)},
376
- 'cool': {'key': 2.0, 'fill': 1.1, 'back': 0.9, 'color': (0.5, 0.75, 1.0)},
377
- 'dramatic': {'key': 5.5, 'fill': 0.05, 'back': 0.02},
378
  }
379
 
380
  def set_background_color(color_name):
@@ -540,6 +549,7 @@ def render_from_json(args):
540
  shape_semantic_to_file = properties['shapes']
541
  material_semantic_to_file = properties['materials']
542
 
543
  print("Adding objects to scene...")
544
  for i, obj_info in enumerate(scene_struct.get('objects', [])):
545
  x, y, z = obj_info['3d_coords']
@@ -560,6 +570,8 @@ def render_from_json(args):
560
  except Exception as e:
561
  print(f"Error adding object {i}: {e}")
562
  continue
563
 
564
  rgba = color_name_to_rgba[obj_info['color']]
565
  semantic_material = obj_info['material']
@@ -575,6 +587,26 @@ def render_from_json(args):
575
  except Exception as e:
576
  print(f"Warning: Could not add material: {e}")
577
 
578
  filter_type = scene_struct.get('filter_type')
579
  filter_strength = scene_struct.get('filter_strength', 1.0)
580
 
@@ -858,7 +890,8 @@ def add_random_objects(scene_struct, num_objects, args, camera, max_scene_attemp
858
  if len(objects) < num_objects:
859
  continue
860
 
861
- all_visible = check_visibility(blender_objects, args.min_pixels_per_object)
862
  if not all_visible:
863
  print('Some objects are occluded; replacing objects')
864
  for obj in blender_objects:
@@ -898,15 +931,46 @@ def compute_all_relationships(scene_struct, eps=0.2):
898
 
899
 
900
  def check_visibility(blender_objects, min_pixels_per_object):
901
- """Visibility check disabled for compatibility (was causing scene gen to fail)."""
902
- return True
903
 
904
 
905
- def render_shadeless(blender_objects, path='flat.png'):
906
  """
907
  Render a version of the scene with shading disabled and unique materials
908
  assigned to all objects. The image itself is written to path. This is used to ensure
909
  that all objects will be visible in the final rendered scene (when check_visibility is enabled).
910
  """
911
  render_args = bpy.context.scene.render
912
 
@@ -924,7 +988,8 @@ def render_shadeless(blender_objects, path='flat.png'):
924
  obj = bpy.data.objects[obj_name]
925
  obj.hide_render = True
926
 
927
- object_colors = set()
928
  old_materials = []
929
  for i, obj in enumerate(blender_objects):
930
  if len(obj.data.materials) > 0:
@@ -940,10 +1005,16 @@ def render_shadeless(blender_objects, path='flat.png'):
940
  node_emission = nodes.new(type='ShaderNodeEmission')
941
  node_output = nodes.new(type='ShaderNodeOutputMaterial')
942
 
943
- while True:
944
- r, g, b = [random.random() for _ in range(3)]
945
- if (r, g, b) not in object_colors: break
946
- object_colors.add((r, g, b))
947
 
948
  node_emission.inputs['Color'].default_value = (r, g, b, 1.0)
949
  mat.node_tree.links.new(node_emission.outputs['Emission'], node_output.inputs['Surface'])
 
353
  parser.add_argument('--output_image', default=None,
354
  help="Output image path (used when rendering from JSON)")
355
 
356
+ MIN_VISIBLE_FRACTION = 0.001
357
+ MIN_VISIBLE_FRACTION_PARTIAL_OCCLUSION = 0.0005
358
+ MIN_PIXELS_FLOOR = 50
359
+
360
+
361
+ def min_visible_pixels(width, height, fraction=MIN_VISIBLE_FRACTION, floor=MIN_PIXELS_FLOOR):
362
+ return max(floor, int(width * height * fraction))
363
+
364
+
365
  BACKGROUND_COLORS = {
366
  'default': None,
367
  'gray': (0.5, 0.5, 0.5),
 
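The new visibility thresholds scale with render resolution. A quick check of `min_visible_pixels` at 320×240 (the fallback resolution `render_from_json` assumes when `args` lacks width/height) shows the fraction dominating the floor, while the stricter partial-occlusion fraction falls back to the 50-pixel floor:

```python
# Constants copied from the diff.
MIN_VISIBLE_FRACTION = 0.001
MIN_VISIBLE_FRACTION_PARTIAL_OCCLUSION = 0.0005
MIN_PIXELS_FLOOR = 50

def min_visible_pixels(width, height, fraction=MIN_VISIBLE_FRACTION, floor=MIN_PIXELS_FLOOR):
    # Same body as in the diff: resolution-scaled threshold with an absolute floor.
    return max(floor, int(width * height * fraction))

print(min_visible_pixels(320, 240))  # -> 76 (fraction wins: 76800 * 0.001)
print(min_visible_pixels(320, 240, MIN_VISIBLE_FRACTION_PARTIAL_OCCLUSION))  # -> 50 (floor wins)
```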
379
 
380
  LIGHTING_PRESETS = {
381
  'default': {'key': 1.0, 'fill': 0.5, 'back': 0.3},
382
+ 'bright': {'key': 12.0, 'fill': 6.0, 'back': 4.0},
383
+ 'dim': {'key': 0.008, 'fill': 0.004, 'back': 0.002},
384
+ 'warm': {'key': 5.0, 'fill': 0.8, 'back': 0.3, 'color': (1.0, 0.5, 0.2)},
385
+ 'cool': {'key': 4.0, 'fill': 2.0, 'back': 1.5, 'color': (0.2, 0.5, 1.0)},
386
+ 'dramatic': {'key': 15.0, 'fill': 0.005, 'back': 0.002},
387
  }
388
 
389
  def set_background_color(color_name):
 
549
  shape_semantic_to_file = properties['shapes']
550
  material_semantic_to_file = properties['materials']
551
 
552
+ blender_objects = []
553
  print("Adding objects to scene...")
554
  for i, obj_info in enumerate(scene_struct.get('objects', [])):
555
  x, y, z = obj_info['3d_coords']
 
570
  except Exception as e:
571
  print(f"Error adding object {i}: {e}")
572
  continue
573
+ if INSIDE_BLENDER and bpy.context.object:
574
+ blender_objects.append(bpy.context.object)
575
 
576
  rgba = color_name_to_rgba[obj_info['color']]
577
  semantic_material = obj_info['material']
 
587
  except Exception as e:
588
  print(f"Warning: Could not add material: {e}")
589
 
590
+ if blender_objects:
591
+ cf_meta = scene_struct.get('cf_metadata') or {}
592
+ cf_type = cf_meta.get('cf_type', '')
593
+ w = getattr(args, 'width', 320)
594
+ h = getattr(args, 'height', 240)
595
+ if cf_type == 'occlusion_change':
596
+ min_pixels = min_visible_pixels(w, h, MIN_VISIBLE_FRACTION_PARTIAL_OCCLUSION, MIN_PIXELS_FLOOR)
597
+ else:
598
+ base = min_visible_pixels(w, h, MIN_VISIBLE_FRACTION, MIN_PIXELS_FLOOR)
599
+ min_pixels = max(getattr(args, 'min_pixels_per_object', MIN_PIXELS_FLOOR), base)
600
+ all_visible = check_visibility(blender_objects, min_pixels)
601
+ if not all_visible:
602
+ print('Visibility check failed: at least one object has too few visible pixels')
603
+ for obj in blender_objects:
604
+ try:
605
+ delete_object(obj)
606
+ except Exception:
607
+ pass
608
+ sys.exit(1)
609
+
610
  filter_type = scene_struct.get('filter_type')
611
  filter_strength = scene_struct.get('filter_strength', 1.0)
612
 
 
890
  if len(objects) < num_objects:
891
  continue
892
 
893
+ min_pixels = max(args.min_pixels_per_object, min_visible_pixels(args.width, args.height))
894
+ all_visible = check_visibility(blender_objects, min_pixels)
895
  if not all_visible:
896
  print('Some objects are occluded; replacing objects')
897
  for obj in blender_objects:
 
931
 
932
 
933
  def check_visibility(blender_objects, min_pixels_per_object):
934
+ """
935
+ Ensure each object has at least min_pixels_per_object visible pixels in the
936
+ rendered image (rejects scenes where an object is fully occluded by others).
937
+ """
938
+ if not INSIDE_BLENDER or not blender_objects:
939
+ return True
940
+ if Image is None:
941
+ return True
942
+ fd, path = tempfile.mkstemp(suffix='.png')
943
+ os.close(fd)
944
+ try:
945
+ colors_list = render_shadeless(blender_objects, path, use_distinct_colors=True)
946
+ img = Image.open(path).convert('RGB')
947
+ w, h = img.size
948
+ pix = img.load()
949
+ color_to_idx = {}
950
+ for i, (r, g, b) in enumerate(colors_list):
951
+ key = (round(r * 255), round(g * 255), round(b * 255))
952
+ color_to_idx[key] = i
953
+ counts = [0] * len(blender_objects)
954
+ for y in range(h):
955
+ for x in range(w):
956
+ key = (pix[x, y][0], pix[x, y][1], pix[x, y][2])
957
+ if key in color_to_idx:
958
+ counts[color_to_idx[key]] += 1
959
+ all_visible = all(c >= min_pixels_per_object for c in counts)
960
+ return all_visible
961
+ finally:
962
+ try:
963
+ os.remove(path)
964
+ except Exception:
965
+ pass
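The per-object pixel counting above can be illustrated without Blender: given a flat render where each object was drawn in a unique emission color, count pixels per color key and compare against the threshold. This sketch uses a synthetic "image" as a list of RGB tuples, and assumes (as the shadeless render is configured to do) that the emission colors survive to 8-bit pixels exactly:

```python
def count_visible_pixels(pixels, colors_list):
    """pixels: iterable of 8-bit (r, g, b) tuples; colors_list: per-object floats in [0, 1]."""
    color_to_idx = {(round(r * 255), round(g * 255), round(b * 255)): i
                    for i, (r, g, b) in enumerate(colors_list)}
    counts = [0] * len(colors_list)
    for key in pixels:
        if key in color_to_idx:
            counts[color_to_idx[key]] += 1
    return counts

# Two objects under the distinct-color scheme r = (i + 1) / (n + 1), g = b = 0.5 (n = 2).
colors = [(1 / 3, 0.5, 0.5), (2 / 3, 0.5, 0.5)]
image = [(85, 128, 128)] * 60 + [(170, 128, 128)] * 3 + [(0, 0, 0)] * 10
counts = count_visible_pixels(image, colors)
print(counts)  # -> [60, 3]: object 1 clears a 50-pixel threshold, object 2 does not
```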
966
 
967
 
968
+ def render_shadeless(blender_objects, path='flat.png', use_distinct_colors=False):
969
  """
970
  Render a version of the scene with shading disabled and unique materials
971
  assigned to all objects. The image itself is written to path. This is used to ensure
972
  that all objects will be visible in the final rendered scene (when check_visibility is enabled).
973
+ Returns a list of (r,g,b) colors in object order (for visibility counting when use_distinct_colors=True).
974
  """
975
  render_args = bpy.context.scene.render
976
 
 
988
  obj = bpy.data.objects[obj_name]
989
  obj.hide_render = True
990
 
991
+ n = len(blender_objects)
992
+ object_colors = [] if use_distinct_colors else set()
993
  old_materials = []
994
  for i, obj in enumerate(blender_objects):
995
  if len(obj.data.materials) > 0:
 
1005
  node_emission = nodes.new(type='ShaderNodeEmission')
1006
  node_output = nodes.new(type='ShaderNodeOutputMaterial')
1007
 
1008
+ if use_distinct_colors:
1009
+ r = (i + 1) / (n + 1)
1010
+ g, b = 0.5, 0.5
1011
+ object_colors.append((r, g, b))
1012
+ else:
1013
+ while True:
1014
+ r, g, b = [random.random() for _ in range(3)]
1015
+ if (r, g, b) not in object_colors:
1016
+ break
1017
+ object_colors.add((r, g, b))
1018
 
1019
  node_emission.inputs['Color'].default_value = (r, g, b, 1.0)
1020
  mat.node_tree.links.new(node_emission.outputs['Emission'], node_output.inputs['Surface'])
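The `use_distinct_colors` scheme assigns `r = (i + 1) / (n + 1)` with fixed `g = b = 0.5`, so adjacent colors stay distinct after 8-bit rounding for the small object counts this pipeline renders. A quick property check (the upper bound of 100 here is an illustrative assumption, well above typical scene sizes):

```python
def distinct_colors(n):
    # Same scheme as the use_distinct_colors branch of render_shadeless.
    return [((i + 1) / (n + 1), 0.5, 0.5) for i in range(n)]

for n in (2, 10, 100):
    keys = {(round(r * 255), round(g * 255), round(b * 255))
            for r, g, b in distinct_colors(n)}
    assert len(keys) == n, f"collision after 8-bit rounding at n={n}"
print("no 8-bit collisions up to n=100")
```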