Commit 29350b2 · 1 Parent(s): cf82114
Scholarus committed

New options added
README.md CHANGED
@@ -190,20 +190,18 @@ python3 pipeline.py --num_scenes 10 --min_cf_change_score 1.5 --max_cf_attempts
 
 The pipeline supports 17 different counterfactual types, divided into two categories:
 
-**Image Counterfactuals** (Should change VQA answers - 9 types):
+**Image Counterfactuals** (Should change VQA answers - 8 types):
 - `change_color` - Change the color of a random object (e.g., red → blue)
 - `change_shape` - Change the shape of a random object (cube/sphere/cylinder)
 - `change_size` - Change the size of a random object (small ↔ large)
 - `change_material` - Change the material of a random object (metal ↔ rubber)
 - `change_position` - Move a random object to a different location (with collision detection)
-- `change_count` - Add or remove objects from the scene
 - `add_object` - Add a new random object to the scene
 - `remove_object` - Remove a random object from the scene
 - `replace_object` - Replace an object with a different one (keeping position)
 
-**Negative Counterfactuals** (Should NOT change VQA answers - 8 types):
+**Negative Counterfactuals** (Should NOT change VQA answers - 7 types):
 - `change_background` - Change the background/ground color
-- `change_texture` - Change texture style for all objects (metal ↔ rubber)
 - `change_lighting` - Change lighting conditions (bright/dim/warm/cool/dramatic)
 - `add_noise` - Add image noise/grain (light/medium/heavy levels)
 - `apply_fisheye` - Apply fisheye lens distortion effect
@@ -282,6 +280,32 @@ python scripts/generate_questions_mapping.py --output_dir output --auto_latest -
 python scripts/generate_questions_mapping.py --output_dir output --auto_latest --generate_questions --long_format --long_csv_name qa_dataset.csv
 ```
 
+#### Ensuring counterfactual answers differ
+
+For evaluation and datasets, the **counterfactual image’s answer to its counterfactual question** should differ from the **original image’s answer to the original question**. The pipeline does this in two ways:
+
+1. **Automatic retries**
+   When generating questions (with `--generate_questions`), the script retries up to **`MAX_CF_ANSWER_RETRIES`** (default 50) per scene. For each attempt it picks new counterfactual questions (with different randomness). It keeps a pair only when both CF1 and CF2 answers differ from the original answer (after normalizing, e.g. case and whitespace). If after all retries they still match, the scene is still included and a warning is printed.
+
+2. **CF-specific question templates**
+   Counterfactual questions are chosen from templates that target the **changed** attribute or count, so the answer on the counterfactual image is different by design:
+   - **Count-changing CFs** (e.g. `add_object`, `remove_object`): questions like “How many objects are in the scene?” or “Are there more than N objects?” so the count/yes-no differs.
+   - **Attribute-changing CFs** (e.g. `change_color`, `change_shape`): questions about the **new** value (e.g. “How many red objects?” when an object was changed to red), so the count on the CF image differs from the original.
+
+**What you need to do:**
+
+- Run question generation **after** scenes and images exist (either as part of the pipeline with `--generate_questions`, or later on a run directory):
+
+  ```bash
+  # As part of a full run (ensures answers differ for that run’s scenes)
+  python pipeline.py --num_scenes 10 --num_objects 5 --run_name my_run --generate_questions
+
+  # Or later, on an existing run
+  python scripts/generate_questions_mapping.py --output_dir output/my_run --generate_questions
+  ```
+
+- **Optional:** To allow more attempts per scene, edit `scripts/generate_questions_mapping.py` and increase **`MAX_CF_ANSWER_RETRIES`** (e.g. from 50 to 100). No CLI flag is exposed for this.
+
 
 ## Project Structure
 
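The retry logic in the README section above keeps a scene only when every counterfactual answer differs from the original after normalization. A minimal standalone sketch of that comparison step (the helper names `normalize_answer` and `answers_differ` are illustrative, not the script's actual API):

```python
def normalize_answer(ans):
    """Collapse case and surrounding/internal whitespace before comparing answers."""
    return " ".join(str(ans).strip().lower().split())

def answers_differ(original_answer, cf_answers):
    """Keep a question pair only if every CF answer differs from the original."""
    orig = normalize_answer(original_answer)
    return all(normalize_answer(a) != orig for a in cf_answers)

print(answers_differ("3", ["4", "5"]))         # True: both CF answers differ
print(answers_differ(" Yes ", ["yes", "no"]))  # False: CF1 matches after normalization
```

Without this normalization, trivially different strings like `"Yes"` vs `"yes "` would be counted as a changed answer and pollute the dataset.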
app.py CHANGED
@@ -618,20 +618,21 @@ def main():
         value=False,
         help="Use the same counterfactual type for every variant (first selected type, or one random if none selected)"
     )
-    with st.expander("Image CFs (change answers)", expanded=False):
+    with st.expander("Image CFs (change answers)", expanded=True):
         use_change_color = st.checkbox("Change Color", value=False)
         use_change_shape = st.checkbox("Change Shape", value=False)
         use_change_size = st.checkbox("Change Size", value=False)
         use_change_material = st.checkbox("Change Material", value=False)
         use_change_position = st.checkbox("Change Position", value=False)
-        use_change_count = st.checkbox("Change Count", value=False)
         use_add_object = st.checkbox("Add Object", value=False)
         use_remove_object = st.checkbox("Remove Object", value=False)
         use_replace_object = st.checkbox("Replace Object", value=False)
+        use_swap_attribute = st.checkbox("Swap Attribute", value=False)
+        use_occlusion_change = st.checkbox("Occlusion Change", value=False)
+        use_relational_flip = st.checkbox("Relational Flip", value=False)
 
     with st.expander("Negative CFs (don't change answers)", expanded=False):
         use_change_background = st.checkbox("Change Background", value=False)
-        use_change_texture = st.checkbox("Change Texture", value=False)
         use_change_lighting = st.checkbox("Change Lighting", value=False)
         use_add_noise = st.checkbox("Add Noise", value=False)
         use_apply_fisheye = st.checkbox("Apply Fisheye", value=False)
@@ -713,18 +714,20 @@ def main():
         cf_types.append('change_material')
     if use_change_position:
         cf_types.append('change_position')
-    if use_change_count:
-        cf_types.append('change_count')
     if use_add_object:
         cf_types.append('add_object')
     if use_remove_object:
         cf_types.append('remove_object')
     if use_replace_object:
         cf_types.append('replace_object')
+    if use_swap_attribute:
+        cf_types.append('swap_attribute')
+    if use_occlusion_change:
+        cf_types.append('occlusion_change')
+    if use_relational_flip:
+        cf_types.append('relational_flip')
     if use_change_background:
         cf_types.append('change_background')
-    if use_change_texture:
-        cf_types.append('change_texture')
     if use_change_lighting:
         cf_types.append('change_lighting')
     if use_add_noise:
pipeline.py CHANGED
@@ -4,6 +4,7 @@ import json
 import argparse
 import os
 import subprocess
+import csv
 import random
 import copy
 import sys
@@ -176,20 +177,7 @@ def create_patched_render_script():
     if old_margin_check in patched_content:
         patched_content = patched_content.replace(old_margin_check, new_margin_check)
 
-    check_vis_start = patched_content.find('def check_visibility(blender_objects, min_pixels_per_object):')
-    if check_vis_start != -1:
-        docstring_end = patched_content.find('"""', check_vis_start + 50)
-        docstring_end = patched_content.find('\n', docstring_end) + 1
-        next_def = patched_content.find('\ndef ', docstring_end)
-        if next_def == -1:
-            next_def = len(patched_content)
-
-        new_function = '''def check_visibility(blender_objects, min_pixels_per_object):
-    return True
-
-'''
-        patched_content = patched_content[:check_vis_start] + new_function + patched_content[next_def:]
-
+    # Visibility check left enabled (no longer patched to return True)
     patched_content = patched_content.replace(
         "parser.add_argument('--min_pixels_per_object', default=200, type=int,",
         "parser.add_argument('--min_pixels_per_object', default=50, type=int,"
@@ -447,10 +435,11 @@ def cf_change_position(scene):
     old_coords = obj['3d_coords']
 
     try:
-        with open('data/properties.json', 'r') as f:
+        script_dir = os.path.dirname(os.path.abspath(__file__))
+        with open(os.path.join(script_dir, 'data', 'properties.json'), 'r') as f:
            properties = json.load(f)
            size_mapping = properties['sizes']
-    except:
+    except Exception:
        size_mapping = {'small': 0.35, 'large': 0.7}
 
     r = size_mapping.get(obj['size'], 0.5)
@@ -458,12 +447,9 @@ def cf_change_position(scene):
         r /= math.sqrt(2)
 
     min_dist = 0.25
-
-    max_attempts = 100
-    for attempt in range(max_attempts):
+    for attempt in range(100):
         new_x = random.uniform(-3, 3)
         new_y = random.uniform(-3, 3)
-
         try:
             dx0 = float(new_x) - float(old_coords[0])
             dy0 = float(new_y) - float(old_coords[1])
@@ -476,14 +462,11 @@ def cf_change_position(scene):
         for other_idx, other_obj in enumerate(cf_scene['objects']):
             if other_idx == move_idx:
                 continue
-
             other_x, other_y, _ = other_obj['3d_coords']
             other_r = size_mapping.get(other_obj['size'], 0.5)
             if other_obj['shape'] == 'cube':
                 other_r /= math.sqrt(2)
-
             dist = math.sqrt((new_x - other_x)**2 + (new_y - other_y)**2)
-
             if dist < (r + other_r + min_dist):
                 collision = True
                 break
@@ -502,10 +485,11 @@ def cf_change_surrounding_count(scene):
         return cf_scene, "no change (0 objects)"
 
     try:
-        with open('data/properties.json', 'r') as f:
+        script_dir = os.path.dirname(os.path.abspath(__file__))
+        with open(os.path.join(script_dir, 'data', 'properties.json'), 'r') as f:
            properties = json.load(f)
            size_mapping = properties['sizes']
-    except:
+    except Exception:
        size_mapping = {'small': 0.35, 'large': 0.7}
 
     action = random.choice(['add', 'remove'])
@@ -571,7 +555,6 @@ def cf_change_surrounding_count(scene):
         return cf_scene, "no change (couldn't find valid positions for new objects)"
 
     else:
-        # Remove 1-2 objects
         num_to_remove = min(random.randint(1, 2), len(cf_scene['objects']) - 1)
         removed = []
         for _ in range(num_to_remove):
@@ -590,12 +573,12 @@ def cf_add_object(scene):
     materials = ['metal', 'rubber']
     sizes = ['small', 'large']
 
-    # Load size mapping
     try:
-        with open('data/properties.json', 'r') as f:
+        script_dir = os.path.dirname(os.path.abspath(__file__))
+        with open(os.path.join(script_dir, 'data', 'properties.json'), 'r') as f:
            properties = json.load(f)
            size_mapping = properties['sizes']
-    except:
+    except Exception:
        size_mapping = {'small': 0.35, 'large': 0.7}
 
     min_dist = 0.25
@@ -680,6 +663,193 @@ def cf_replace_object(scene):
 
     return cf_scene, f"replaced {old_desc} with {new_color} {new_shape}"
 
+
+def cf_swap_attribute(scene):
+    """Swap colors between two existing objects. Tests if model relies on priors (e.g. sphere=red)."""
+    cf_scene = copy.deepcopy(scene)
+
+    if len(cf_scene['objects']) < 2:
+        return cf_scene, "no swap (fewer than 2 objects)"
+
+    idx_a, idx_b = random.sample(range(len(cf_scene['objects'])), 2)
+    obj_a = cf_scene['objects'][idx_a]
+    obj_b = cf_scene['objects'][idx_b]
+
+    if obj_a['color'] == obj_b['color']:
+        return cf_scene, "no swap (both objects same color)"
+
+    color_a, color_b = obj_a['color'], obj_b['color']
+    obj_a['color'] = color_b
+    obj_b['color'] = color_a
+
+    return cf_scene, f"swapped colors between {color_a} {obj_a['shape']} and {color_b} {obj_b['shape']}"
+
+
+TARGET_OCCLUSION_COVERAGE = 0.6
+
+
+def cf_occlusion_change(scene):
+    """Move an object so it partially hides another. Tests spatial depth and partial shape recognition."""
+    cf_scene = copy.deepcopy(scene)
+
+    if len(cf_scene['objects']) < 2:
+        return cf_scene, "no occlusion (fewer than 2 objects)"
+
+    directions = cf_scene.get('directions', {})
+    front = directions.get('front', [0.75, -0.66, 0.0])
+    if len(front) < 2:
+        front = [0.75, -0.66]
+
+    try:
+        script_dir = os.path.dirname(os.path.abspath(__file__))
+        with open(os.path.join(script_dir, 'data', 'properties.json'), 'r') as f:
+            properties = json.load(f)
+            size_mapping = properties['sizes']
+    except Exception:
+        size_mapping = {'small': 0.35, 'large': 0.7}
+
+    def get_radius(obj):
+        r = size_mapping.get(obj['size'], 0.5)
+        if obj['shape'] == 'cube':
+            r /= math.sqrt(2)
+        return r
+
+    min_dist = 0.15
+
+    def is_valid_occlusion_pos(cf_scene, occluder_idx, target_idx, new_x, new_y, occluder_r):
+        for i, other in enumerate(cf_scene['objects']):
+            if i == occluder_idx:
+                continue
+            other_x, other_y, _ = other['3d_coords']
+            other_r = get_radius(other)
+            dist = math.sqrt((new_x - other_x)**2 + (new_y - other_y)**2)
+            if dist < (occluder_r + other_r + min_dist):
+                return False
+        return -2.8 <= new_x <= 2.8 and -2.8 <= new_y <= 2.8
+
+    fx, fy = float(front[0]), float(front[1])
+    norm = math.sqrt(fx * fx + fy * fy) or 1.0
+    coverage_range = 0.5
+    delta_max = coverage_range * (1.0 - TARGET_OCCLUSION_COVERAGE)
+    base_deltas = [0.02, 0.05, 0.08, 0.12, 0.18, delta_max * 0.5, delta_max]
+
+    pairs = [(i, j) for i in range(len(cf_scene['objects'])) for j in range(len(cf_scene['objects'])) if i != j]
+    random.shuffle(pairs)
+
+    for occluder_idx, target_idx in pairs:
+        occluder = cf_scene['objects'][occluder_idx]
+        target = cf_scene['objects'][target_idx]
+        tx, ty, tz = target['3d_coords']
+        oz = occluder['3d_coords'][2]
+        occluder_r = get_radius(occluder)
+        target_r = get_radius(target)
+        min_offset = occluder_r + target_r + min_dist
+        offsets = [min_offset + d for d in base_deltas]
+
+        for offset in offsets:
+            new_x = tx + (fx / norm) * offset
+            new_y = ty + (fy / norm) * offset
+            if is_valid_occlusion_pos(cf_scene, occluder_idx, target_idx, new_x, new_y, occluder_r):
+                occluder['3d_coords'] = [new_x, new_y, oz]
+                occluder['pixel_coords'] = [0, 0, 0]
+                return cf_scene, f"moved {occluder['color']} {occluder['shape']} to partially occlude {target['color']} {target['shape']}"
+
+    return cf_scene, "no occlusion (couldn't find valid position)"
+
+
+def cf_relational_flip(scene):
+    """Move object A from 'left of B' to 'right of B' (or vice versa). Targets spatial prepositions."""
+    cf_scene = copy.deepcopy(scene)
+
+    if len(cf_scene['objects']) < 2:
+        return cf_scene, "no flip (fewer than 2 objects)"
+
+    directions = cf_scene.get('directions', {})
+    left_vec = directions.get('left', [-0.66, -0.75, 0.0])
+    if len(left_vec) < 2:
+        left_vec = [-0.66, -0.75]
+
+    try:
+        script_dir = os.path.dirname(os.path.abspath(__file__))
+        with open(os.path.join(script_dir, 'data', 'properties.json'), 'r') as f:
+            properties = json.load(f)
+            size_mapping = properties['sizes']
+    except Exception:
+        size_mapping = {'small': 0.35, 'large': 0.7}
+
+    def get_radius(obj):
+        r = size_mapping.get(obj['size'], 0.5)
+        if obj['shape'] == 'cube':
+            r /= math.sqrt(2)
+        return r
+
+    def is_valid_pos(cf_scene, a_idx, new_x, new_y, r_a, min_dist=0.12):
+        for i, other in enumerate(cf_scene['objects']):
+            if i == a_idx:
+                continue
+            ox, oy, _ = other['3d_coords']
+            other_r = get_radius(other)
+            dist = math.sqrt((new_x - ox)**2 + (new_y - oy)**2)
+            if dist < (r_a + other_r + min_dist):
+                return False
+        return -2.8 <= new_x <= 2.8 and -2.8 <= new_y <= 2.8
+
+    relationships = cf_scene.get('relationships', {})
+    left_of = relationships.get('left', [])
+    right_of = relationships.get('right', [])
+
+    candidates = []
+    for b_idx in range(len(cf_scene['objects'])):
+        for a_idx in left_of[b_idx] if b_idx < len(left_of) else []:
+            if a_idx != b_idx and a_idx < len(cf_scene['objects']):
+                candidates.append((a_idx, b_idx, 'left'))
+        for a_idx in right_of[b_idx] if b_idx < len(right_of) else []:
+            if a_idx != b_idx and a_idx < len(cf_scene['objects']):
+                candidates.append((a_idx, b_idx, 'right'))
+
+    if not candidates:
+        lx, ly = float(left_vec[0]), float(left_vec[1])
+        for a_idx in range(len(cf_scene['objects'])):
+            for b_idx in range(len(cf_scene['objects'])):
+                if a_idx == b_idx:
+                    continue
+                ax_a, ay_a, _ = cf_scene['objects'][a_idx]['3d_coords']
+                bx_b, by_b, _ = cf_scene['objects'][b_idx]['3d_coords']
+                dx, dy = ax_a - bx_b, ay_a - by_b
+                dot = dx * lx + dy * ly
+                if abs(dot) > 0.2:
+                    side = 'left' if dot > 0 else 'right'
+                    candidates.append((a_idx, b_idx, side))
+
+    if not candidates:
+        return cf_scene, "no flip (no clear left/right relationships)"
+
+    random.shuffle(candidates)
+    lx, ly = float(left_vec[0]), float(left_vec[1])
+
+    for a_idx, b_idx, side in candidates:
+        obj_a = cf_scene['objects'][a_idx]
+        obj_b = cf_scene['objects'][b_idx]
+        ax, ay, az = obj_a['3d_coords']
+        bx, by, bz = obj_b['3d_coords']
+        r_a = get_radius(obj_a)
+
+        dx, dy = ax - bx, ay - by
+        dot_left = dx * lx + dy * ly
+        ref_dx = dx - 2 * dot_left * lx
+        ref_dy = dy - 2 * dot_left * ly
+
+        for scale in [1.0, 0.9, 0.8, 0.7, 0.85, 0.75]:
+            new_x = bx + scale * ref_dx
+            new_y = by + scale * ref_dy
+            if is_valid_pos(cf_scene, a_idx, new_x, new_y, r_a):
+                obj_a['3d_coords'] = [new_x, new_y, az]
+                obj_a['pixel_coords'] = [0, 0, 0]
+                new_side = "right" if side == "left" else "left"
+                return cf_scene, f"moved {obj_a['color']} {obj_a['shape']} from {side} of {obj_b['color']} {obj_b['shape']} to {new_side}"
+
+    return cf_scene, "no flip (couldn't find collision-free position)"
+
+
 def cf_change_background(scene):
     cf_scene = copy.deepcopy(scene)
 
@@ -854,16 +1024,17 @@ IMAGE_COUNTERFACTUALS = {
     'change_size': cf_change_size,
     'change_material': cf_change_material,
     'change_position': cf_change_position,
-    'change_count': cf_change_surrounding_count,
     'add_object': cf_add_object,
     'remove_object': cf_remove_object,
     'replace_object': cf_replace_object,
+    'swap_attribute': cf_swap_attribute,
+    'occlusion_change': cf_occlusion_change,
+    'relational_flip': cf_relational_flip,
 }
 
 # Negative CFs - These should NOT change answers to questions
 NEGATIVE_COUNTERFACTUALS = {
     'change_background': cf_change_background,
-    'change_texture': cf_change_texture,
     'change_lighting': cf_change_lighting,
     'add_noise': cf_add_noise,
     'apply_fisheye': cf_apply_fisheye,
@@ -965,7 +1136,7 @@ def generate_counterfactuals(scene, num_counterfactuals=2, cf_types=None, same_c
         one_type = random.choice(list(COUNTERFACTUAL_TYPES.keys()))
         selected_types = [one_type] * num_counterfactuals
     elif cf_types:
-        selected_types = (cf_types * ((num_counterfactuals // len(cf_types)) + 1))[:num_counterfactuals]
+        selected_types = [random.choice(cf_types) for _ in range(num_counterfactuals)]
 
     if selected_types is not None:
         for cf_type in selected_types:
@@ -1293,14 +1464,15 @@ def list_counterfactual_types():
     print("  change_size - Change size of an object (small/large)")
     print("  change_material - Change material of an object (metal/rubber)")
     print("  change_position - Move an object to a different location")
-    print("  change_count - Add or remove objects from the scene")
     print("  add_object - Add a new random object")
+    print("  swap_attribute - Swap colors between two objects")
+    print("  occlusion_change - Move object to partially hide another")
+    print("  relational_flip - Move object from left of X to right of X")
     print("  remove_object - Remove a random object")
     print("  replace_object - Replace an object with a different one")
 
     print("\nNEGATIVE COUNTERFACTUALS (Should NOT change VQA answers):")
     print("  change_background - Change background/ground color")
-    print("  change_texture - Change texture style (all objects)")
     print("  change_lighting - Change lighting conditions")
    print("  add_noise - Add image noise/grain")
     print("  apply_fisheye - Apply fisheye lens distortion")
@@ -1556,6 +1728,218 @@ def save_run_metadata(run_dir, args, successful_scenes, successful_renders):
 
     print(f"\n[OK] Saved run metadata to: {metadata_path}")
 
+
+def regenerate_scene_sets(args):
+    """Regenerate specific scene sets in an existing run. Uses settings from run_metadata.json."""
+    run_dir = os.path.join(args.output_dir, args.run_name)
+    if not os.path.exists(run_dir):
+        print(f"ERROR: Run directory does not exist: {run_dir}")
+        return
+
+    metadata_path = os.path.join(run_dir, 'run_metadata.json')
+    if not os.path.exists(metadata_path):
+        print(f"ERROR: run_metadata.json not found in {run_dir}. Cannot determine original settings.")
+        return
+
+    with open(metadata_path, 'r') as f:
+        metadata = json.load(f)
+    meta_args = metadata.get('arguments', {})
+
+    scenes_dir = os.path.join(run_dir, 'scenes')
+    images_dir = os.path.join(run_dir, 'images')
+    os.makedirs(scenes_dir, exist_ok=True)
+    os.makedirs(images_dir, exist_ok=True)
+
+    blender_path = args.blender_path or find_blender()
+    print(f"Using Blender: {blender_path}")
+    print("\nPreparing scripts...")
+    create_patched_render_script()
+
+    scene_indices = sorted(set(args.regenerate))
+    num_counterfactuals = meta_args.get('num_counterfactuals', 2)
+    cf_types = meta_args.get('cf_types')
+    if isinstance(cf_types, list) and cf_types:
+        pass
+    else:
+        cf_types = None
+
+    use_gpu = meta_args.get('use_gpu', 0)
+    samples = meta_args.get('samples', 512)
+    width = meta_args.get('width', 320)
+    height = meta_args.get('height', 240)
+
+    print(f"\n{'='*70}")
+    print(f"REGENERATING {len(scene_indices)} SCENE SETS: {scene_indices}")
+    print(f"{'='*70}")
+
+    temp_run_id = os.path.basename(run_dir)
+    checkpoint_file = os.path.join(run_dir, 'checkpoint.json')
+    completed_scenes = load_checkpoint(checkpoint_file)
+
+    for i in scene_indices:
+        print(f"\n{'='*70}")
+        print(f"REGENERATING SCENE SET #{i}")
+        print(f"{'='*70}")
+
+        num_objects = meta_args.get('num_objects')
+        if num_objects is None:
+            min_objs = meta_args.get('min_objects', 3)
+            max_objs = meta_args.get('max_objects', 7)
+            num_objects = random.randint(min_objs, max_objs)
+
+        base_scene = None
+        for retry in range(3):
+            base_scene = generate_base_scene(num_objects, blender_path, i, temp_run_dir=temp_run_id)
+            if base_scene and len(base_scene.get('objects', [])) > 0:
+                break
+            print(f"  Retry {retry + 1}/3...")
+
+        if not base_scene or len(base_scene.get('objects', [])) == 0:
+            print(f"  [FAILED] Could not generate base scene for #{i}")
+            continue
+
+        min_cf_score = meta_args.get('min_cf_change_score', 1.0)
+        max_cf_attempts = meta_args.get('max_cf_attempts', 10)
+        min_noise = meta_args.get('min_noise_level', 'light')
+
+        counterfactuals = generate_counterfactuals(
+            base_scene,
+            num_counterfactuals,
+            cf_types=cf_types,
+            same_cf_type=meta_args.get('same_cf_type', False),
+            min_change_score=min_cf_score,
+            max_cf_attempts=max_cf_attempts,
+            min_noise_level=min_noise
+        )
+
+        for idx, cf in enumerate(counterfactuals):
+            print(f"  CF{idx+1} [{cf.get('cf_category', '?')}] ({cf.get('type', '?')}): {cf.get('description', '')}")
+
+        scene_prefix = f"scene_{i:04d}"
+        scene_paths = {'original': os.path.join(scenes_dir, f"{scene_prefix}_original.json")}
+        image_paths = {'original': os.path.join(images_dir, f"{scene_prefix}_original.png")}
+
+        base_scene['cf_metadata'] = {
+            'variant': 'original', 'is_counterfactual': False, 'cf_index': None,
+            'cf_category': 'original', 'cf_type': None, 'cf_description': None, 'source_scene': scene_prefix,
+        }
+        save_scene(base_scene, scene_paths['original'])
+
+        for idx, cf in enumerate(counterfactuals):
+            cf_name = f"cf{idx+1}"
+            scene_paths[cf_name] = os.path.join(scenes_dir, f"{scene_prefix}_{cf_name}.json")
+            image_paths[cf_name] = os.path.join(images_dir, f"{scene_prefix}_{cf_name}.png")
 
 def main():
     # When run as a script, ensure we're in the project root
     # Change to script directory so relative paths work
@@ -1602,10 +1986,10 @@ def main():
             # Image CFs (should change answers)
             'change_color', 'change_shape', 'change_size',
             'change_material', 'change_position',
-            'change_count', 'add_object', 'remove_object',
-            'replace_object',
             # Negative CFs (should NOT change answers)
-            'change_background', 'change_texture',
             'change_lighting', 'add_noise',
             'apply_fisheye', 'apply_blur', 'apply_vignette', 'apply_chromatic_aberration'
         ],
@@ -1625,8 +2009,12 @@ def main():
 
     parser.add_argument('--generate_questions', action='store_true',
                         help='Create questions and answers CSV after rendering completes')
     parser.add_argument('--csv_name', default='image_mapping_with_questions.csv',
                        help='Output CSV filename (default: image_mapping_with_questions.csv)')
 
     args = parser.parse_args()
 
@@ -1642,6 +2030,13 @@ def main():
         print("ERROR: --run_name is required when using --resume")
         return
 
     # Find Blender
     blender_path = args.blender_path or find_blender()
     print(f"Using Blender: {blender_path}")
@@ -1843,6 +2238,8 @@ def main():
                 generate_questions=True
             )
             print(f"\n[OK] CSV saved to: {os.path.join(run_dir, args.csv_name)}")
         except Exception as e:
             print(f"\n[ERROR] Questions: {e}")
             import traceback
+ cf_scene = cf['scene']
1833
+ cf_scene['cf_metadata'] = {
1834
+ 'variant': cf_name, 'is_counterfactual': True, 'cf_index': idx + 1,
1835
+ 'cf_category': cf.get('cf_category', 'unknown'), 'cf_type': cf.get('type', None),
1836
+ 'cf_description': cf.get('description', None), 'change_score': cf.get('change_score'),
1837
+ 'change_attempts': cf.get('change_attempts'), 'source_scene': scene_prefix,
1838
+ }
1839
+ save_scene(cf_scene, scene_paths[cf_name])
1840
+
1841
+ print(f" [OK] Saved {len(counterfactuals) + 1} scene files")
1842
+
1843
+ if not args.skip_render:
1844
+ print(" Rendering...")
1845
+ render_success = 0
1846
+ for scene_type, scene_path in scene_paths.items():
1847
+ if render_scene(blender_path, scene_path, image_paths[scene_type],
1848
+ use_gpu, samples, width, height):
1849
+ render_success += 1
1850
+ print(f" [OK] {scene_type}")
1851
+ print(f" [OK] Rendered {render_success}/{len(scene_paths)} images")
1852
+ completed_scenes.add(i)
1853
+
1854
+ save_checkpoint(checkpoint_file, list(completed_scenes))
1855
+
1856
+ temp_run_path = os.path.join(os.getcwd(), 'temp_output', temp_run_id)
1857
+ if os.path.exists(temp_run_path):
1858
+ shutil.rmtree(temp_run_path)
1859
+ if os.path.exists('render_images_patched.py'):
1860
+ try:
1861
+ os.remove('render_images_patched.py')
1862
+ except Exception:
1863
+ pass
1864
+
1865
+ print(f"\n{'='*70}")
1866
+ print("REGENERATION COMPLETE")
1867
+ print(f"{'='*70}")
1868
+ print(f"Regenerated {len(scene_indices)} scene sets: {scene_indices}")
1869
+ print(f"Run directory: {run_dir}")
1870
+
1871
+ if args.generate_questions:
1872
+ if generate_mapping_with_questions is None:
1873
+ print("\n[WARNING] Questions module not found. Skipping CSV generation.")
1874
+ else:
1875
+ print("\nRegenerating questions CSV...")
1876
+ try:
1877
+ generate_mapping_with_questions(run_dir, args.csv_name, generate_questions=True)
1878
+ print(f"[OK] CSV saved to: {os.path.join(run_dir, args.csv_name)}")
1879
+ except Exception as e:
1880
+ print(f"[ERROR] Questions: {e}")
1881
+ import traceback
1882
+ traceback.print_exc()
1883
+
1884
+
1885
+ def filter_same_answer_scenes(run_dir, csv_filename):
1886
+ """Remove CSV rows where CF1 or CF2 answer matches original; delete those scenes' images and scene JSONs."""
1887
+ csv_path = os.path.join(run_dir, csv_filename)
1888
+ if not os.path.isfile(csv_path):
1889
+ return
1890
+ with open(csv_path, 'r', encoding='utf-8') as f:
1891
+ reader = csv.reader(f)
1892
+ header = next(reader)
1893
+ try:
1894
+ idx_orig_ans = header.index('original_image_answer_to_original_question')
1895
+ idx_cf1_ans = header.index('cf1_image_answer_to_cf1_question')
1896
+ idx_cf2_ans = header.index('cf2_image_answer_to_cf2_question')
1897
+ idx_orig_img = header.index('original_image')
1898
+ except ValueError:
1899
+ return
1900
+ kept_rows = [header]
1901
+ removed_scene_ids = set()
1902
+ with open(csv_path, 'r', encoding='utf-8') as f:
1903
+ reader = csv.reader(f)
1904
+ next(reader)
1905
+ for row in reader:
1906
+ if len(row) <= max(idx_orig_ans, idx_cf1_ans, idx_cf2_ans, idx_orig_img):
1907
+ kept_rows.append(row)
1908
+ continue
1909
+ o = str(row[idx_orig_ans]).strip().lower()
1910
+ c1 = str(row[idx_cf1_ans]).strip().lower()
1911
+ c2 = str(row[idx_cf2_ans]).strip().lower()
1912
+ if o == c1 or o == c2:
1913
+ orig_img = row[idx_orig_img]
1914
+ if orig_img.endswith('_original.png'):
1915
+ scene_id = orig_img.replace('_original.png', '')
1916
+ removed_scene_ids.add(scene_id)
1917
+ continue
1918
+ kept_rows.append(row)
1919
+ if not removed_scene_ids:
1920
+ return
1921
+ with open(csv_path, 'w', newline='', encoding='utf-8') as f:
1922
+ writer = csv.writer(f, quoting=csv.QUOTE_ALL)
1923
+ writer.writerows(kept_rows)
1924
+ images_dir = os.path.join(run_dir, 'images')
1925
+ scenes_dir = os.path.join(run_dir, 'scenes')
1926
+ deleted = 0
1927
+ for scene_id in removed_scene_ids:
1928
+ for suffix in ('_original', '_cf1', '_cf2'):
1929
+ for d, ext in [(images_dir, '.png'), (scenes_dir, '.json')]:
1930
+ if not os.path.isdir(d):
1931
+ continue
1932
+ fn = scene_id + suffix + ext
1933
+ fp = os.path.join(d, fn)
1934
+ if os.path.isfile(fp):
1935
+ try:
1936
+ os.remove(fp)
1937
+ deleted += 1
1938
+ except OSError:
1939
+ pass
1940
+ print(f"\n[OK] Filtered {len(removed_scene_ids)} scenes where answers matched; removed {deleted} files. CSV now has {len(kept_rows) - 1} rows.")
1941
+
1942
+
1943
  def main():
1944
  # When run as a script, ensure we're in the project root
1945
  # Change to script directory so relative paths work
 
1986
  # Image CFs (should change answers)
1987
  'change_color', 'change_shape', 'change_size',
1988
  'change_material', 'change_position',
1989
+ 'add_object', 'remove_object', 'replace_object',
1990
+ 'swap_attribute', 'occlusion_change', 'relational_flip',
1991
  # Negative CFs (should NOT change answers)
1992
+ 'change_background',
1993
  'change_lighting', 'add_noise',
1994
  'apply_fisheye', 'apply_blur', 'apply_vignette', 'apply_chromatic_aberration'
1995
  ],
 
2009
 
2010
  parser.add_argument('--generate_questions', action='store_true',
2011
  help='Create questions and answers CSV after rendering completes')
2012
+ parser.add_argument('--filter_same_answer', action='store_true',
2013
+ help='After generating questions, remove scenes where CF1 or CF2 answer matches original (delete those rows and their image/scene files). Use with --generate_questions.')
2014
  parser.add_argument('--csv_name', default='image_mapping_with_questions.csv',
2015
  help='Output CSV filename (default: image_mapping_with_questions.csv)')
2016
+ parser.add_argument('--regenerate', nargs='+', type=int, metavar='N',
2017
+ help='Regenerate specific scene sets by index (e.g. --regenerate 63 83 272). Requires --run_name. Uses settings from run_metadata.json.')
2018
 
2019
  args = parser.parse_args()
2020
 
 
2030
  print("ERROR: --run_name is required when using --resume")
2031
  return
2032
 
2033
+ if args.regenerate is not None:
2034
+ if not args.run_name:
2035
+ print("ERROR: --run_name is required when using --regenerate")
2036
+ return
2037
+ regenerate_scene_sets(args)
2038
+ return
2039
+
2040
  # Find Blender
2041
  blender_path = args.blender_path or find_blender()
2042
  print(f"Using Blender: {blender_path}")
 
2238
  generate_questions=True
2239
  )
2240
  print(f"\n[OK] CSV saved to: {os.path.join(run_dir, args.csv_name)}")
2241
+ if getattr(args, 'filter_same_answer', False):
2242
+ filter_same_answer_scenes(run_dir, args.csv_name)
2243
  except Exception as e:
2244
  print(f"\n[ERROR] Questions: {e}")
2245
  import traceback
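The `--filter_same_answer` pass above keeps a dataset row only when both counterfactual answers differ from the original answer after case and whitespace normalization. As a minimal standalone sketch of that predicate (the function names here are illustrative, not part of the diff):

```python
def normalize(ans):
    """Lowercase and strip an answer so 'Yes ' and 'yes' compare equal."""
    return str(ans).strip().lower()


def row_is_discriminative(orig_ans, cf1_ans, cf2_ans):
    """A row survives filtering only if neither CF answer matches the original."""
    o = normalize(orig_ans)
    return normalize(cf1_ans) != o and normalize(cf2_ans) != o


# A counterfactual that leaves the answer unchanged carries no signal, so
# the whole scene set (row plus its image/scene files) is dropped.
print(row_is_discriminative("3", "4", "5"))        # True  -> kept
print(row_is_discriminative("Yes", "yes ", "no"))  # False -> dropped
```

The same normalization (`strip().lower()`) appears in `filter_same_answer_scenes`, which is why answers differing only in case or trailing whitespace are treated as identical.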
scripts/generate_questions_mapping.py CHANGED
@@ -1,5 +1,3 @@
1
-
2
-
3
  import os
4
  import argparse
5
  import csv
@@ -73,19 +71,28 @@ def get_scene_properties(scene):
73
 
74
  IMAGE_CF_TYPES = {
75
  'change_color', 'change_shape', 'change_size', 'change_material',
76
- 'change_position', 'change_count', 'add_object', 'remove_object', 'replace_object'
 
77
  }
78
  NEGATIVE_CF_TYPES = {
79
- 'change_background', 'change_texture', 'change_lighting', 'add_noise',
80
  'apply_fisheye', 'apply_blur', 'apply_vignette', 'apply_chromatic_aberration'
81
  }
82
 
 
 
83
  def get_cf_type_from_scene(scene):
84
  meta = scene.get('cf_metadata') or {}
85
  if not meta.get('is_counterfactual'):
86
  return None
87
  return meta.get('cf_type')
88
 
 
 
 
 
 
 
89
  def get_change_details(original_scene, cf_scene):
90
  orig_objs = original_scene.get('objects', [])
91
  cf_objs = cf_scene.get('objects', [])
@@ -100,12 +107,101 @@ def get_change_details(original_scene, cf_scene):
100
  return {'attribute': attr, 'orig_val': ov or 'unknown', 'cf_val': cv or 'unknown', 'object_index': i}
101
  return None
102
 
103
- def generate_question_for_counterfactual(cf_type, original_scene, cf_scene):
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
104
  change = get_change_details(original_scene, cf_scene)
105
  orig_objs = original_scene.get('objects', [])
106
  cf_objs = cf_scene.get('objects', [])
107
  props_orig = get_scene_properties(original_scene)
108
  props_cf = get_scene_properties(cf_scene)
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
109
  if cf_type and cf_type in NEGATIVE_CF_TYPES:
110
  templates = [
111
  ("How many objects are in the scene?", {}),
@@ -123,57 +219,91 @@ def generate_question_for_counterfactual(cf_type, original_scene, cf_scene):
123
  return question, params
124
 
125
  if change and change.get('attribute') == 'count':
126
- question = "How many objects are in the scene?"
127
- return question, {}
 
 
 
 
 
 
 
 
 
 
128
 
129
  if change and change.get('attribute') in ('color', 'shape', 'material', 'size'):
130
  attr = change['attribute']
131
- orig_val = change.get('orig_val', '')
132
- cf_val = change.get('cf_val', '')
 
 
133
  if attr == 'color':
134
- question = f"How many {cf_val} objects are there?"
135
- params = {'color': cf_val}
136
  elif attr == 'shape':
137
- question = f"How many {cf_val}s are there?" if not cf_val.endswith('s') else f"How many {cf_val} are there?"
138
- params = {'shape': cf_val.rstrip('s')}
 
 
139
  elif attr == 'material':
140
- question = f"How many {cf_val} objects are there?"
141
- params = {'material': cf_val}
142
  elif attr == 'size':
143
- question = f"How many {cf_val} objects are there?"
144
- params = {'size': cf_val}
145
  else:
146
  question = "How many objects are in the scene?"
147
  params = {}
148
  return question, params
149
 
 
 
 
 
 
 
 
 
 
 
150
  if cf_type in ('change_color', 'change_shape', 'replace_object'):
151
  for attr, key in [('color', 'colors'), ('shape', 'shapes'), ('material', 'materials'), ('size', 'sizes')]:
152
- vals = props_cf.get(key) or props_orig.get(key) or []
153
  if vals:
154
- val = random.choice(list(vals))
155
  if attr == 'shape':
156
- plural = val + 's' if not val.endswith('s') else val
157
- question = f"How many {plural} are there?"
 
 
158
  elif attr == 'color':
159
- question = f"How many {val} objects are there?"
 
160
  elif attr == 'material':
161
- question = f"How many {val} objects are there?"
 
162
  else:
163
- question = f"How many {val} objects are there?"
164
- return question, {attr: val}
165
- if cf_type in ('change_count', 'add_object', 'remove_object'):
166
- return "How many objects are in the scene?", {}
167
- if cf_type in ('change_size', 'change_material', 'change_position'):
168
  key = 'sizes' if cf_type == 'change_size' else ('materials' if cf_type == 'change_material' else 'colors')
169
  attr = key.rstrip('s')
170
- vals = props_cf.get(key) or props_orig.get(key) or []
171
  if vals:
172
- val = random.choice(list(vals))
173
- question = f"How many {val} objects are there?"
 
 
 
 
 
 
174
  return question, {attr: val}
175
 
176
- question = "How many objects are in the scene?"
177
  return question, {}
178
 
179
  def generate_question_for_scene(scene_file):
@@ -502,8 +632,6 @@ def create_counterfactual_questions(original_question, params, scene):
502
  if cf_q is None:
503
  cf_q = "How many objects are in the scene?"
504
  cf_params = {}
505
-
506
- # Ensure cf_params is set
507
  if not cf_params:
508
  cf_params = {}
509
 
@@ -545,10 +673,23 @@ def create_counterfactual_questions(original_question, params, scene):
545
 
546
  return cf_questions
547
 
 
 
 
 
 
 
548
  def answer_question_for_scene(question, scene):
549
  objects = scene.get('objects', [])
550
  question_lower = question.lower()
551
 
 
 
 
 
 
 
 
552
  if "more than" in question_lower:
553
  match = re.search(r'more than (\d+)', question_lower)
554
  if match:
@@ -774,7 +915,9 @@ def generate_mapping_with_questions(run_dir, csv_filename='image_mapping_with_qu
774
  if with_links:
775
  header = ['scene_id', 'original_image_link', 'original_scene_link',
776
  'counterfactual1_image_link', 'counterfactual1_scene_link',
777
- 'counterfactual2_image_link', 'counterfactual2_scene_link']
 
 
778
  if generate_questions:
779
  header.extend([
780
  'original_question', 'counterfactual1_question', 'counterfactual2_question',
@@ -793,6 +936,8 @@ def generate_mapping_with_questions(run_dir, csv_filename='image_mapping_with_qu
793
  elif generate_questions:
794
  rows.append([
795
  'original_image', 'counterfactual1_image', 'counterfactual2_image',
 
 
796
  'original_question', 'counterfactual1_question', 'counterfactual2_question',
797
  'original_question_difficulty', 'counterfactual1_question_difficulty', 'counterfactual2_question_difficulty',
798
  'original_image_answer_to_original_question',
@@ -806,7 +951,9 @@ def generate_mapping_with_questions(run_dir, csv_filename='image_mapping_with_qu
806
  'cf2_image_answer_to_cf2_question'
807
  ])
808
  else:
809
- rows.append(['original_image', 'counterfactual1_image', 'counterfactual2_image'])
 
 
810
 
811
  total_scenes = len(scene_sets)
812
 
@@ -841,47 +988,75 @@ def generate_mapping_with_questions(run_dir, csv_filename='image_mapping_with_qu
841
 
842
  try:
843
  original_question, params = generate_question_for_scene(original_scene_file)
 
844
  cf1_type = get_cf_type_from_scene(cf1_scene)
845
  cf2_type = get_cf_type_from_scene(cf2_scene)
846
- cf_questions = create_counterfactual_questions(original_question, params, original_scene) if (not cf1_type or not cf2_type) else None
847
- if cf1_type:
848
- cf1_question, cf1_params = generate_question_for_counterfactual(cf1_type, original_scene, cf1_scene)
849
- else:
850
- cf1_question, cf1_params = cf_questions[0] if cf_questions and len(cf_questions) > 0 else ("How many objects are in the scene?", {})
851
- if cf2_type:
852
- cf2_question, cf2_params = generate_question_for_counterfactual(cf2_type, original_scene, cf2_scene)
853
- else:
854
- cf2_question, cf2_params = cf_questions[1] if cf_questions and len(cf_questions) > 1 else (cf_questions[0] if cf_questions else ("How many objects are in the scene?", {}))
855
  except Exception as e:
856
  import traceback
857
  traceback.print_exc()
858
  continue
859
 
860
- try:
861
- original_difficulty = calculate_question_difficulty(original_question, params)
862
- cf1_difficulty = calculate_question_difficulty(cf1_question, cf1_params)
863
- cf2_difficulty = calculate_question_difficulty(cf2_question, cf2_params)
864
- except Exception as e:
865
- import traceback
866
- traceback.print_exc()
867
- continue
868
 
869
- try:
870
- original_ans_orig_q = answer_question_for_scene(original_question, original_scene)
871
- original_ans_cf1_q = answer_question_for_scene(cf1_question, original_scene)
872
- original_ans_cf2_q = answer_question_for_scene(cf2_question, original_scene)
 
 
 
 
 
 
 
 
 
 
 
 
873
 
874
- cf1_ans_orig_q = answer_question_for_scene(original_question, cf1_scene)
875
- cf1_ans_cf1_q = answer_question_for_scene(cf1_question, cf1_scene)
876
- cf1_ans_cf2_q = answer_question_for_scene(cf2_question, cf1_scene)
 
 
 
 
 
877
 
878
- cf2_ans_orig_q = answer_question_for_scene(original_question, cf2_scene)
879
- cf2_ans_cf1_q = answer_question_for_scene(cf1_question, cf2_scene)
880
- cf2_ans_cf2_q = answer_question_for_scene(cf2_question, cf2_scene)
881
- except Exception as e:
882
- import traceback
883
- traceback.print_exc()
884
- continue
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
885
 
886
  try:
887
  if with_links:
@@ -903,6 +1078,7 @@ def generate_mapping_with_questions(run_dir, csv_filename='image_mapping_with_qu
903
  original_image_link, original_scene_link,
904
  cf1_image_link, cf1_scene_link,
905
  cf2_image_link, cf2_scene_link,
 
906
  original_question, cf1_question, cf2_question,
907
  original_difficulty, cf1_difficulty, cf2_difficulty,
908
  original_ans_orig_q, original_ans_cf1_q, original_ans_cf2_q,
@@ -912,6 +1088,7 @@ def generate_mapping_with_questions(run_dir, csv_filename='image_mapping_with_qu
912
  else:
913
  rows.append([
914
  original_id, cf1_id, cf2_id,
 
915
  original_question, cf1_question, cf2_question,
916
  original_difficulty, cf1_difficulty, cf2_difficulty,
917
  original_ans_orig_q, original_ans_cf1_q, original_ans_cf2_q,
@@ -923,6 +1100,20 @@ def generate_mapping_with_questions(run_dir, csv_filename='image_mapping_with_qu
923
  traceback.print_exc()
924
  continue
925
  else:
 
 
 
 
 
 
 
 
 
 
 
 
 
 
926
  if with_links:
927
  def make_link(filename, file_type='image'):
928
  if base_url:
@@ -941,10 +1132,11 @@ def generate_mapping_with_questions(run_dir, csv_filename='image_mapping_with_qu
941
  scene_num,
942
  original_image_link, original_scene_link,
943
  cf1_image_link, cf1_scene_link,
944
- cf2_image_link, cf2_scene_link
 
945
  ])
946
  else:
947
- rows.append([original_id, cf1_id, cf2_id])
948
 
949
  csv_path = os.path.join(run_dir, csv_filename)
950
  try:
@@ -966,44 +1158,29 @@ def generate_mapping_with_questions(run_dir, csv_filename='image_mapping_with_qu
966
  if generate_questions:
967
  print(f" Scene ID: {row[0]}")
968
  print(f" Links:")
969
- print(f" Original image: {row[1]}")
970
- print(f" Original scene: {row[2]}")
971
- print(f" CF1 image: {row[3]}")
972
- print(f" CF1 scene: {row[4]}")
973
- print(f" CF2 image: {row[5]}")
974
- print(f" CF2 scene: {row[6]}")
975
- print(f" Questions:")
976
- print(f" Original: {row[7]}")
977
- print(f" CF1: {row[8]}")
978
- print(f" CF2: {row[9]}")
979
  else:
980
  print(f" Scene ID: {row[0]}")
981
  print(f" Links:")
982
  print(f" Original image: {row[1]}, scene: {row[2]}")
983
  print(f" CF1 image: {row[3]}, scene: {row[4]}")
984
  print(f" CF2 image: {row[5]}, scene: {row[6]}")
985
- elif generate_questions and len(row) > 6:
986
- print(f" Images:")
987
- print(f" Original: {row[0]}")
988
- print(f" Counterfactual 1: {row[1]}")
989
- print(f" Counterfactual 2: {row[2]}")
990
- print(f" Questions:")
991
- print(f" Original question: {row[3]}")
992
- print(f" CF1 question: {row[4]}")
993
- print(f" CF2 question: {row[5]}")
994
  print(f" Answer Matrix (scene × question):")
995
- print(f" Original image:")
996
- print(f" -> Original Q: {row[6]}")
997
- print(f" -> CF1 Q: {row[7]}")
998
- print(f" -> CF2 Q: {row[8]}")
999
- print(f" CF1 image:")
1000
- print(f" -> Original Q: {row[9]}")
1001
- print(f" -> CF1 Q: {row[10]}")
1002
- print(f" -> CF2 Q: {row[11]}")
1003
- print(f" CF2 image:")
1004
- print(f" -> Original Q: {row[12]}")
1005
- print(f" -> CF1 Q: {row[13]}")
1006
- print(f" -> CF2 Q: {row[14]}")
1007
 
1008
  def main():
1009
  parser = argparse.ArgumentParser(
 
 
 
1
  import os
2
  import argparse
3
  import csv
 
71
 
72
  IMAGE_CF_TYPES = {
73
  'change_color', 'change_shape', 'change_size', 'change_material',
74
+ 'change_position', 'add_object', 'remove_object', 'replace_object',
75
+ 'swap_attribute', 'occlusion_change', 'relational_flip'
76
  }
77
  NEGATIVE_CF_TYPES = {
78
+ 'change_background', 'change_lighting', 'add_noise',
79
  'apply_fisheye', 'apply_blur', 'apply_vignette', 'apply_chromatic_aberration'
80
  }
81
 
82
+ MAX_CF_ANSWER_RETRIES = 150
83
+
84
  def get_cf_type_from_scene(scene):
85
  meta = scene.get('cf_metadata') or {}
86
  if not meta.get('is_counterfactual'):
87
  return None
88
  return meta.get('cf_type')
89
 
90
+ def get_cf_description_from_scene(scene):
91
+ meta = scene.get('cf_metadata') or {}
92
+ if not meta.get('is_counterfactual'):
93
+ return None
94
+ return meta.get('cf_description')
95
+
96
  def get_change_details(original_scene, cf_scene):
97
  orig_objs = original_scene.get('objects', [])
98
  cf_objs = cf_scene.get('objects', [])
 
107
  return {'attribute': attr, 'orig_val': ov or 'unknown', 'cf_val': cv or 'unknown', 'object_index': i}
108
  return None
109
 
110
+ CF_COUNT_QUESTION_TEMPLATES = [
111
+ "How many objects are in the scene?",
112
+ "What is the total number of objects in the scene?",
113
+ ]
114
+ CF_COLOR_QUESTION_TEMPLATES = [
115
+ ("How many {val} objects are there?", 'color'),
116
+ ("Are there any {val} objects?", 'color'),
117
+ ("What is the total number of {val} objects?", 'color'),
118
+ ]
119
+ CF_SHAPE_QUESTION_TEMPLATES = [
120
+ ("How many {val} are there?", 'shape'),
121
+ ("Are there any {val}?", 'shape'),
122
+ ("What is the total number of {val}?", 'shape'),
123
+ ]
124
+ CF_MATERIAL_QUESTION_TEMPLATES = [
125
+ ("How many {val} objects are there?", 'material'),
126
+ ("Are there any {val} objects?", 'material'),
127
+ ("What is the total number of {val} objects?", 'material'),
128
+ ]
129
+ CF_SIZE_QUESTION_TEMPLATES = [
130
+ ("How many {val} objects are there?", 'size'),
131
+ ("Are there any {val} objects?", 'size'),
132
+ ("What is the total number of {val} objects?", 'size'),
133
+ ]
134
+
135
+
136
+ def _pluralize_shape(shape):
137
+ if not shape:
138
+ return shape
139
+ s = shape.strip().lower()
140
+ if s.endswith('s'):
141
+ return s
142
+ return s + 's'
143
+
144
+
145
+ def _count_by_attribute(objects, attr):
146
+ """Count objects per attribute value (color, shape, material, size)."""
147
+ counts = {}
148
+ for obj in objects:
149
+ val = (obj.get(attr) or '').lower().strip()
150
+ if val:
151
+ counts[val] = counts.get(val, 0) + 1
152
+ return counts
153
+
154
+
155
+ def _get_attributes_with_different_counts(original_scene, cf_scene):
156
+ """Find attribute values whose count differs between original and CF scene."""
157
+ orig_objs = original_scene.get('objects', [])
158
+ cf_objs = cf_scene.get('objects', [])
159
+ differing = []
160
+ for attr in ['color', 'shape', 'material', 'size']:
161
+ orig_counts = _count_by_attribute(orig_objs, attr)
162
+ cf_counts = _count_by_attribute(cf_objs, attr)
163
+ all_vals = set(orig_counts) | set(cf_counts)
164
+ for val in all_vals:
165
+ o = orig_counts.get(val, 0)
166
+ c = cf_counts.get(val, 0)
167
+ if o != c:
168
+ differing.append((attr, val, o, c))
169
+ return differing
170
+
171
+
172
+ def generate_question_for_counterfactual(cf_type, original_scene, cf_scene, retry_index=0):
173
+ """Generate a question tailored to cf_type. Use retry_index to vary question on retries."""
174
+ random.seed(hash((str(cf_type), retry_index, str(id(original_scene)), str(id(cf_scene)))))
175
  change = get_change_details(original_scene, cf_scene)
176
  orig_objs = original_scene.get('objects', [])
177
  cf_objs = cf_scene.get('objects', [])
178
  props_orig = get_scene_properties(original_scene)
179
  props_cf = get_scene_properties(cf_scene)
180
+
181
+ # For IMAGE CFs: prefer questions targeting attributes that differ between orig and cf
182
+ if cf_type and cf_type in IMAGE_CF_TYPES:
183
+ differing = _get_attributes_with_different_counts(original_scene, cf_scene)
184
+ if differing:
185
+ idx = retry_index % len(differing) if differing else 0
186
+ attr, val, orig_count, cf_count = differing[idx]
187
+ if attr == 'color':
188
+ template, _ = random.choice(CF_COLOR_QUESTION_TEMPLATES)
189
+ question = template.format(val=val)
190
+ elif attr == 'shape':
191
+ plural = _pluralize_shape(val)
192
+ template, _ = random.choice(CF_SHAPE_QUESTION_TEMPLATES)
193
+ question = template.format(val=plural)
194
+ elif attr == 'material':
195
+ template, _ = random.choice(CF_MATERIAL_QUESTION_TEMPLATES)
196
+ question = template.format(val=val)
197
+ elif attr == 'size':
198
+ template, _ = random.choice(CF_SIZE_QUESTION_TEMPLATES)
199
+ question = template.format(val=val)
200
+ else:
201
+ question = None
202
+ if question:
203
+ return question, {attr: val.rstrip('s') if attr == 'shape' else val}
204
+
205
  if cf_type and cf_type in NEGATIVE_CF_TYPES:
206
  templates = [
207
  ("How many objects are in the scene?", {}),
 
219
  return question, params
220
 
221
  if change and change.get('attribute') == 'count':
222
+ orig_count = change.get('orig_count', len(orig_objs))
223
+ cf_count = change.get('cf_count', len(cf_objs))
224
+ templates_with_params = []
225
+ templates_with_params.append((random.choice(CF_COUNT_QUESTION_TEMPLATES), {}))
226
+ if cf_count > orig_count:
227
+ templates_with_params.append((f"Are there more than {orig_count} objects?", {}))
228
+ templates_with_params.append((f"Are there at least {cf_count} objects?", {}))
229
+ if cf_count < orig_count:
230
+ templates_with_params.append((f"Are there fewer than {orig_count} objects?", {}))
231
+ templates_with_params.append((f"Are there more than {cf_count} objects?", {}))
232
+ template, params = random.choice(templates_with_params)
233
+ return template, params
234
 
235
  if change and change.get('attribute') in ('color', 'shape', 'material', 'size'):
236
  attr = change['attribute']
237
+ cf_val = (change.get('cf_val') or '').strip().lower()
238
+ if not cf_val:
239
+ cf_val = 'unknown'
240
+ params = {attr: cf_val}
241
  if attr == 'color':
242
+ template, _ = random.choice(CF_COLOR_QUESTION_TEMPLATES)
243
+ question = template.format(val=cf_val)
244
  elif attr == 'shape':
245
+ template, _ = random.choice(CF_SHAPE_QUESTION_TEMPLATES)
246
+ plural = _pluralize_shape(cf_val)
247
+ question = template.format(val=plural)
248
+ params['shape'] = cf_val.rstrip('s')
249
  elif attr == 'material':
250
+ template, _ = random.choice(CF_MATERIAL_QUESTION_TEMPLATES)
251
+ question = template.format(val=cf_val)
252
  elif attr == 'size':
253
+ template, _ = random.choice(CF_SIZE_QUESTION_TEMPLATES)
254
+ question = template.format(val=cf_val)
255
  else:
256
  question = "How many objects are in the scene?"
257
  params = {}
258
  return question, params
259
 
260
+ if cf_type in ('add_object', 'remove_object'):
261
+ templates = list(CF_COUNT_QUESTION_TEMPLATES)
262
+ if len(orig_objs) != len(cf_objs):
263
+ if len(cf_objs) > len(orig_objs):
264
+ templates.extend([f"Are there more than {len(orig_objs)} objects?", f"Are there at least {len(cf_objs)} objects?"])
265
+ else:
266
+ templates.extend([f"Are there fewer than {len(orig_objs)} objects?", f"Are there more than {len(cf_objs)} objects?"])
267
+ template = random.choice(templates)
268
+ return template, {}
269
+
270
  if cf_type in ('change_color', 'change_shape', 'replace_object'):
271
  for attr, key in [('color', 'colors'), ('shape', 'shapes'), ('material', 'materials'), ('size', 'sizes')]:
272
+ vals = list(props_cf.get(key) or props_orig.get(key) or [])
273
  if vals:
274
+ val = random.choice(vals)
275
  if attr == 'shape':
276
+ plural = _pluralize_shape(val)
277
+ templates = CF_SHAPE_QUESTION_TEMPLATES
278
+ template, _ = random.choice(templates)
279
+ question = template.format(val=plural)
280
  elif attr == 'color':
281
+ template, _ = random.choice(CF_COLOR_QUESTION_TEMPLATES)
282
+ question = template.format(val=val)
283
  elif attr == 'material':
284
+ template, _ = random.choice(CF_MATERIAL_QUESTION_TEMPLATES)
285
+ question = template.format(val=val)
286
  else:
287
+ template, _ = random.choice(CF_SIZE_QUESTION_TEMPLATES)
288
+ question = template.format(val=val)
289
+ return question, {attr: val.rstrip('s') if attr == 'shape' else val}
290
+
291
+ if cf_type in ('change_size', 'change_material', 'change_position', 'swap_attribute', 'occlusion_change', 'relational_flip'):
292
  key = 'sizes' if cf_type == 'change_size' else ('materials' if cf_type == 'change_material' else 'colors')
293
  attr = key.rstrip('s')
294
+ vals = list(props_cf.get(key) or props_orig.get(key) or [])
295
  if vals:
296
+ val = random.choice(vals)
297
+ if cf_type == 'change_size':
298
+ template, _ = random.choice(CF_SIZE_QUESTION_TEMPLATES)
299
+ elif cf_type == 'change_material':
300
+ template, _ = random.choice(CF_MATERIAL_QUESTION_TEMPLATES)
301
+ else:
302
+ template, _ = random.choice(CF_COLOR_QUESTION_TEMPLATES)
303
+ question = template.format(val=val)
304
  return question, {attr: val}
305
 
306
+ question = random.choice(CF_COUNT_QUESTION_TEMPLATES)
307
  return question, {}
308
 
309
  def generate_question_for_scene(scene_file):
 
632
  if cf_q is None:
633
  cf_q = "How many objects are in the scene?"
634
  cf_params = {}
 
 
635
  if not cf_params:
636
  cf_params = {}
637
 
 
673
 
674
  return cf_questions
675
 
676
+ def normalize_answer(a):
677
+ if a is None:
678
+ return ""
679
+ return str(a).strip().lower()
680
+
681
+
682
  def answer_question_for_scene(question, scene):
683
  objects = scene.get('objects', [])
684
  question_lower = question.lower()
685
 
686
+ if "at least" in question_lower:
687
+ match = re.search(r'at least (\d+)', question_lower)
688
+ if match:
689
+ threshold = int(match.group(1))
690
+ count = count_matching_objects(question_lower, objects)
691
+ return "yes" if count >= threshold else "no"
692
+
693
  if "more than" in question_lower:
694
  match = re.search(r'more than (\d+)', question_lower)
695
  if match:
 
915
  if with_links:
916
  header = ['scene_id', 'original_image_link', 'original_scene_link',
917
  'counterfactual1_image_link', 'counterfactual1_scene_link',
918
+ 'counterfactual2_image_link', 'counterfactual2_scene_link',
919
+ 'counterfactual1_type', 'counterfactual2_type',
920
+ 'counterfactual1_description', 'counterfactual2_description']
921
  if generate_questions:
922
  header.extend([
923
  'original_question', 'counterfactual1_question', 'counterfactual2_question',
 
936
  elif generate_questions:
937
  rows.append([
938
  'original_image', 'counterfactual1_image', 'counterfactual2_image',
939
+ 'counterfactual1_type', 'counterfactual2_type',
940
+ 'counterfactual1_description', 'counterfactual2_description',
941
  'original_question', 'counterfactual1_question', 'counterfactual2_question',
942
  'original_question_difficulty', 'counterfactual1_question_difficulty', 'counterfactual2_question_difficulty',
943
  'original_image_answer_to_original_question',
 
951
  'cf2_image_answer_to_cf2_question'
952
  ])
953
  else:
954
+ rows.append(['original_image', 'counterfactual1_image', 'counterfactual2_image',
955
+ 'counterfactual1_type', 'counterfactual2_type',
956
+ 'counterfactual1_description', 'counterfactual2_description'])
957
 
958
  total_scenes = len(scene_sets)
959
 
 
988
 
989
  try:
990
  original_question, params = generate_question_for_scene(original_scene_file)
991
+ original_ans_orig_q = answer_question_for_scene(original_question, original_scene)
992
  cf1_type = get_cf_type_from_scene(cf1_scene)
993
  cf2_type = get_cf_type_from_scene(cf2_scene)
994
+ cf1_description = get_cf_description_from_scene(cf1_scene)
995
+ cf2_description = get_cf_description_from_scene(cf2_scene)
996
  except Exception as e:
997
  import traceback
998
  traceback.print_exc()
999
  continue
1000
 
1001
+ cf1_question = cf2_question = None
1002
+ cf1_params = cf2_params = {}
1003
+ original_difficulty = cf1_difficulty = cf2_difficulty = None
1004
+ original_ans_cf1_q = original_ans_cf2_q = None
1005
+ cf1_ans_orig_q = cf1_ans_cf1_q = cf1_ans_cf2_q = None
1006
+ cf2_ans_orig_q = cf2_ans_cf1_q = cf2_ans_cf2_q = None
1007
+ orig_norm = normalize_answer(original_ans_orig_q)
1008
 
1009
+ for cf_retry in range(MAX_CF_ANSWER_RETRIES):
1010
+ try:
1011
+ random.seed(hash((scene_num, idx, cf_retry)))
1012
+ cf_questions = create_counterfactual_questions(original_question, params, original_scene) if (not cf1_type or not cf2_type) else None
1013
+ if cf1_type:
1014
+ cf1_question, cf1_params = generate_question_for_counterfactual(cf1_type, original_scene, cf1_scene, retry_index=cf_retry)
1015
+ else:
1016
+ cf1_question, cf1_params = cf_questions[0] if cf_questions and len(cf_questions) > 0 else ("How many objects are in the scene?", {})
1017
+ if cf2_type:
1018
+ cf2_question, cf2_params = generate_question_for_counterfactual(cf2_type, original_scene, cf2_scene, retry_index=cf_retry)
1019
+ else:
1020
+ cf2_question, cf2_params = cf_questions[1] if cf_questions and len(cf_questions) > 1 else (cf_questions[0] if cf_questions else ("How many objects are in the scene?", {}))
1021
+ except Exception as e:
1022
+ import traceback
1023
+ traceback.print_exc()
1024
+ continue
1025
 
1026
+ try:
1027
+ original_difficulty = calculate_question_difficulty(original_question, params)
1028
+ cf1_difficulty = calculate_question_difficulty(cf1_question, cf1_params)
1029
+ cf2_difficulty = calculate_question_difficulty(cf2_question, cf2_params)
1030
+ except Exception as e:
1031
+ import traceback
1032
+ traceback.print_exc()
1033
+ continue
1034
 
1035
+ try:
1036
+ original_ans_cf1_q = answer_question_for_scene(cf1_question, original_scene)
1037
+ original_ans_cf2_q = answer_question_for_scene(cf2_question, original_scene)
1038
+ cf1_ans_orig_q = answer_question_for_scene(original_question, cf1_scene)
1039
+ cf1_ans_cf1_q = answer_question_for_scene(cf1_question, cf1_scene)
1040
+ cf1_ans_cf2_q = answer_question_for_scene(cf2_question, cf1_scene)
1041
+ cf2_ans_orig_q = answer_question_for_scene(original_question, cf2_scene)
1042
+ cf2_ans_cf1_q = answer_question_for_scene(cf1_question, cf2_scene)
1043
+ cf2_ans_cf2_q = answer_question_for_scene(cf2_question, cf2_scene)
1044
+ except Exception as e:
1045
+ import traceback
1046
+ traceback.print_exc()
1047
+ continue
1048
+
1049
+ # For image CFs: ensure the CF image's answer to the CF question differs from the original image's answer to the same question.
1050
+ # change_position, occlusion_change, relational_flip only move objects.
1051
+ # swap_attribute swaps colors (net counts unchanged); the count-based QA cannot distinguish these, so skip validation.
1052
+ CF_TYPES_ACCEPT_WITHOUT_CHECK = {'change_position', 'swap_attribute', 'occlusion_change', 'relational_flip'}
1053
+ cf1_differs = (cf1_type not in IMAGE_CF_TYPES) or (cf1_type in CF_TYPES_ACCEPT_WITHOUT_CHECK) or (normalize_answer(original_ans_cf1_q) != normalize_answer(cf1_ans_cf1_q))
1054
+ cf2_differs = (cf2_type not in IMAGE_CF_TYPES) or (cf2_type in CF_TYPES_ACCEPT_WITHOUT_CHECK) or (normalize_answer(original_ans_cf2_q) != normalize_answer(cf2_ans_cf2_q))
1055
+ # Accept when at least one CF has different answers (maximize usable data; the other may have same answer due to count-preserving swaps)
1056
+ if cf1_differs or cf2_differs:
1057
+ break
1058
+ else:
1059
+ print(f"WARNING: Scene {scene_num}: could not find questions with different answers for both CFs after {MAX_CF_ANSWER_RETRIES} retries (scene included with best-effort questions)")
1060
 
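The acceptance rule used inside the retry loop (an image CF passes when its type is exempt or its answer actually changed) reduces to a small predicate. In this sketch, `IMAGE_CF_TYPES` is a stand-in for the set defined elsewhere in the script, and `normalize` mirrors `normalize_answer`:

```python
def normalize(a):
    # Mirrors normalize_answer from the diff.
    return "" if a is None else str(a).strip().lower()

# Stand-in for the set defined elsewhere in the script (illustrative subset only).
IMAGE_CF_TYPES = {'change_color', 'add_object', 'remove_object',
                  'change_position', 'swap_attribute'}
# Copied from the diff: count-preserving edits the QA cannot distinguish.
ACCEPT_WITHOUT_CHECK = {'change_position', 'swap_attribute',
                        'occlusion_change', 'relational_flip'}

def cf_answer_differs(cf_type, orig_answer, cf_answer):
    """Mirror of the cf1_differs / cf2_differs expressions."""
    if cf_type not in IMAGE_CF_TYPES:
        return True  # negative CFs are not required to change the answer
    if cf_type in ACCEPT_WITHOUT_CHECK:
        return True  # exempt: answer change is not detectable by the QA
    return normalize(orig_answer) != normalize(cf_answer)

print(cf_answer_differs('change_color', 'Red', 'blue'))  # -> True
print(cf_answer_differs('add_object', '3', ' 3 '))       # -> False
```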
1061
  try:
1062
  if with_links:
 
1078
  original_image_link, original_scene_link,
1079
  cf1_image_link, cf1_scene_link,
1080
  cf2_image_link, cf2_scene_link,
1081
+ cf1_type, cf2_type, cf1_description, cf2_description,
1082
  original_question, cf1_question, cf2_question,
1083
  original_difficulty, cf1_difficulty, cf2_difficulty,
1084
  original_ans_orig_q, original_ans_cf1_q, original_ans_cf2_q,
 
1088
  else:
1089
  rows.append([
1090
  original_id, cf1_id, cf2_id,
1091
+ cf1_type, cf2_type, cf1_description, cf2_description,
1092
  original_question, cf1_question, cf2_question,
1093
  original_difficulty, cf1_difficulty, cf2_difficulty,
1094
  original_ans_orig_q, original_ans_cf1_q, original_ans_cf2_q,
 
1100
  traceback.print_exc()
1101
  continue
1102
  else:
1103
+ # Load CF scene files to get cf_type and cf_description for the mapping
1104
+ cf1_type = cf2_type = cf1_description = cf2_description = ''
1105
+ cf1_scene_file = find_scene_file(scenes_dir, cf1_id)
1106
+ cf2_scene_file = find_scene_file(scenes_dir, cf2_id)
1107
+ if cf1_scene_file and cf2_scene_file:
1108
+ try:
1109
+ cf1_scene = load_scene(cf1_scene_file)
1110
+ cf2_scene = load_scene(cf2_scene_file)
1111
+ cf1_type = get_cf_type_from_scene(cf1_scene) or ''
1112
+ cf2_type = get_cf_type_from_scene(cf2_scene) or ''
1113
+ cf1_description = get_cf_description_from_scene(cf1_scene) or ''
1114
+ cf2_description = get_cf_description_from_scene(cf2_scene) or ''
1115
+ except Exception:
1116
+ pass
1117
  if with_links:
1118
  def make_link(filename, file_type='image'):
1119
  if base_url:
 
1132
  scene_num,
1133
  original_image_link, original_scene_link,
1134
  cf1_image_link, cf1_scene_link,
1135
+ cf2_image_link, cf2_scene_link,
1136
+ cf1_type, cf2_type, cf1_description, cf2_description
1137
  ])
1138
  else:
1139
+ rows.append([original_id, cf1_id, cf2_id, cf1_type, cf2_type, cf1_description, cf2_description])
1140
 
1141
  csv_path = os.path.join(run_dir, csv_filename)
1142
  try:
 
1158
  if generate_questions:
1159
  print(f" Scene ID: {row[0]}")
1160
  print(f" Links:")
1161
+ print(f" Original image: {row[1]}, scene: {row[2]}")
1162
+ print(f" CF1 image: {row[3]}, scene: {row[4]}")
1163
+ print(f" CF2 image: {row[5]}, scene: {row[6]}")
1164
+ print(f" CF type / description: CF1 type={row[7]}, CF2 type={row[8]}; CF1 desc={row[9]!r}, CF2 desc={row[10]!r}")
1165
+ print(f" Questions: Original: {row[11]}, CF1: {row[12]}, CF2: {row[13]}")
1166
  else:
1167
  print(f" Scene ID: {row[0]}")
1168
  print(f" Links:")
1169
  print(f" Original image: {row[1]}, scene: {row[2]}")
1170
  print(f" CF1 image: {row[3]}, scene: {row[4]}")
1171
  print(f" CF2 image: {row[5]}, scene: {row[6]}")
1172
+ print(f" CF type / description: CF1 type={row[7]}, CF2 type={row[8]}; CF1 desc={row[9]!r}, CF2 desc={row[10]!r}")
1173
+ elif generate_questions and len(row) > 14:
1174
+ print(f" Images: Original: {row[0]}, CF1: {row[1]}, CF2: {row[2]}")
1175
+ print(f" CF type / description: CF1 type={row[3]}, CF2 type={row[4]}; CF1 desc={row[5]!r}, CF2 desc={row[6]!r}")
1176
+ print(f" Questions: Original: {row[7]}, CF1: {row[8]}, CF2: {row[9]}")
1177
  print(f" Answer Matrix (scene × question):")
1178
+ print(f" Original image -> Orig Q: {row[10]}, CF1 Q: {row[11]}, CF2 Q: {row[12]}")
1179
+ print(f" CF1 image -> Orig Q: {row[13]}, CF1 Q: {row[14]}, CF2 Q: {row[15]}")
1180
+ print(f" CF2 image -> Orig Q: {row[16]}, CF1 Q: {row[17]}, CF2 Q: {row[18]}")
1181
+ elif len(row) >= 7:
1182
+ print(f" Images: Original: {row[0]}, CF1: {row[1]}, CF2: {row[2]}")
1183
+ print(f" CF type / description: CF1 type={row[3]}, CF2 type={row[4]}; CF1 desc={row[5]!r}, CF2 desc={row[6]!r}")
1184
 
1185
  def main():
1186
  parser = argparse.ArgumentParser(
scripts/generate_scenes.py CHANGED
@@ -53,9 +53,9 @@ def main():
53
  choices=[
54
  'change_color', 'change_shape', 'change_size',
55
  'change_material', 'change_position',
56
- 'change_count', 'add_object', 'remove_object',
57
  'replace_object',
58
- 'change_background', 'change_texture',
59
  'change_lighting', 'add_noise',
60
  'apply_fisheye', 'apply_blur', 'apply_vignette', 'apply_chromatic_aberration'
61
  ],
 
53
  choices=[
54
  'change_color', 'change_shape', 'change_size',
55
  'change_material', 'change_position',
56
+ 'add_object', 'remove_object', 'swap_attribute', 'occlusion_change', 'relational_flip',
57
  'replace_object',
58
+ 'change_background',
59
  'change_lighting', 'add_noise',
60
  'apply_fisheye', 'apply_blur', 'apply_vignette', 'apply_chromatic_aberration'
61
  ],
scripts/render.py CHANGED
@@ -353,6 +353,15 @@ parser.add_argument('--render_tile_size', default=256, type=int,
353
  parser.add_argument('--output_image', default=None,
354
  help="Output image path (used when rendering from JSON)")
355
 
356
  BACKGROUND_COLORS = {
357
  'default': None,
358
  'gray': (0.5, 0.5, 0.5),
@@ -370,11 +379,11 @@ BACKGROUND_COLORS = {
370
 
371
  LIGHTING_PRESETS = {
372
  'default': {'key': 1.0, 'fill': 0.5, 'back': 0.3},
373
- 'bright': {'key': 4.0, 'fill': 2.2, 'back': 1.4},
374
- 'dim': {'key': 0.08, 'fill': 0.04, 'back': 0.02},
375
- 'warm': {'key': 2.8, 'fill': 1.4, 'back': 0.7, 'color': (1.0, 0.75, 0.5)},
376
- 'cool': {'key': 2.0, 'fill': 1.1, 'back': 0.9, 'color': (0.5, 0.75, 1.0)},
377
- 'dramatic': {'key': 5.5, 'fill': 0.05, 'back': 0.02},
378
  }
379
 
380
  def set_background_color(color_name):
@@ -540,6 +549,7 @@ def render_from_json(args):
540
  shape_semantic_to_file = properties['shapes']
541
  material_semantic_to_file = properties['materials']
542
 
543
  print("Adding objects to scene...")
544
  for i, obj_info in enumerate(scene_struct.get('objects', [])):
545
  x, y, z = obj_info['3d_coords']
@@ -560,6 +570,8 @@ def render_from_json(args):
560
  except Exception as e:
561
  print(f"Error adding object {i}: {e}")
562
  continue
563
 
564
  rgba = color_name_to_rgba[obj_info['color']]
565
  semantic_material = obj_info['material']
@@ -575,6 +587,26 @@ def render_from_json(args):
575
  except Exception as e:
576
  print(f"Warning: Could not add material: {e}")
577
 
578
  filter_type = scene_struct.get('filter_type')
579
  filter_strength = scene_struct.get('filter_strength', 1.0)
580
 
@@ -858,7 +890,8 @@ def add_random_objects(scene_struct, num_objects, args, camera, max_scene_attemp
858
  if len(objects) < num_objects:
859
  continue
860
 
861
- all_visible = check_visibility(blender_objects, args.min_pixels_per_object)
862
  if not all_visible:
863
  print('Some objects are occluded; replacing objects')
864
  for obj in blender_objects:
@@ -898,15 +931,46 @@ def compute_all_relationships(scene_struct, eps=0.2):
898
 
899
 
900
  def check_visibility(blender_objects, min_pixels_per_object):
901
- """Visibility check disabled for compatibility (was causing scene gen to fail)."""
902
- return True
903
 
904
 
905
- def render_shadeless(blender_objects, path='flat.png'):
906
  """
907
  Render a version of the scene with shading disabled and unique materials
908
  assigned to all objects. The image itself is written to path. This is used to ensure
909
  that all objects will be visible in the final rendered scene (when check_visibility is enabled).
910
  """
911
  render_args = bpy.context.scene.render
912
 
@@ -924,7 +988,8 @@ def render_shadeless(blender_objects, path='flat.png'):
924
  obj = bpy.data.objects[obj_name]
925
  obj.hide_render = True
926
 
927
- object_colors = set()
928
  old_materials = []
929
  for i, obj in enumerate(blender_objects):
930
  if len(obj.data.materials) > 0:
@@ -940,10 +1005,16 @@ def render_shadeless(blender_objects, path='flat.png'):
940
  node_emission = nodes.new(type='ShaderNodeEmission')
941
  node_output = nodes.new(type='ShaderNodeOutputMaterial')
942
 
943
- while True:
944
- r, g, b = [random.random() for _ in range(3)]
945
- if (r, g, b) not in object_colors: break
946
- object_colors.add((r, g, b))
947
 
948
  node_emission.inputs['Color'].default_value = (r, g, b, 1.0)
949
  mat.node_tree.links.new(node_emission.outputs['Emission'], node_output.inputs['Surface'])
 
353
  parser.add_argument('--output_image', default=None,
354
  help="Output image path (used when rendering from JSON)")
355
 
356
+ MIN_VISIBLE_FRACTION = 0.001
357
+ MIN_VISIBLE_FRACTION_PARTIAL_OCCLUSION = 0.0005
358
+ MIN_PIXELS_FLOOR = 50
359
+
360
+
361
+ def min_visible_pixels(width, height, fraction=MIN_VISIBLE_FRACTION, floor=MIN_PIXELS_FLOOR):
362
+ return max(floor, int(width * height * fraction))
363
+
364
+
365
  BACKGROUND_COLORS = {
366
  'default': None,
367
  'gray': (0.5, 0.5, 0.5),
 
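The new visibility thresholds scale with render resolution. A quick check of `min_visible_pixels` at 320×240 (the fallback resolution `render_from_json` assumes when `args` lacks width/height) shows the fraction dominating the floor, while the stricter partial-occlusion fraction falls back to the 50-pixel floor:

```python
# Constants copied from the diff.
MIN_VISIBLE_FRACTION = 0.001
MIN_VISIBLE_FRACTION_PARTIAL_OCCLUSION = 0.0005
MIN_PIXELS_FLOOR = 50

def min_visible_pixels(width, height, fraction=MIN_VISIBLE_FRACTION, floor=MIN_PIXELS_FLOOR):
    # Same body as in the diff: resolution-scaled threshold with an absolute floor.
    return max(floor, int(width * height * fraction))

print(min_visible_pixels(320, 240))  # -> 76 (fraction wins: 76800 * 0.001)
print(min_visible_pixels(320, 240, MIN_VISIBLE_FRACTION_PARTIAL_OCCLUSION))  # -> 50 (floor wins)
```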
379
 
380
  LIGHTING_PRESETS = {
381
  'default': {'key': 1.0, 'fill': 0.5, 'back': 0.3},
382
+ 'bright': {'key': 12.0, 'fill': 6.0, 'back': 4.0},
383
+ 'dim': {'key': 0.008, 'fill': 0.004, 'back': 0.002},
384
+ 'warm': {'key': 5.0, 'fill': 0.8, 'back': 0.3, 'color': (1.0, 0.5, 0.2)},
385
+ 'cool': {'key': 4.0, 'fill': 2.0, 'back': 1.5, 'color': (0.2, 0.5, 1.0)},
386
+ 'dramatic': {'key': 15.0, 'fill': 0.005, 'back': 0.002},
387
  }
388
 
389
  def set_background_color(color_name):
 
549
  shape_semantic_to_file = properties['shapes']
550
  material_semantic_to_file = properties['materials']
551
 
552
+ blender_objects = []
553
  print("Adding objects to scene...")
554
  for i, obj_info in enumerate(scene_struct.get('objects', [])):
555
  x, y, z = obj_info['3d_coords']
 
570
  except Exception as e:
571
  print(f"Error adding object {i}: {e}")
572
  continue
573
+ if INSIDE_BLENDER and bpy.context.object:
574
+ blender_objects.append(bpy.context.object)
575
 
576
  rgba = color_name_to_rgba[obj_info['color']]
577
  semantic_material = obj_info['material']
 
587
  except Exception as e:
588
  print(f"Warning: Could not add material: {e}")
589
 
590
+ if blender_objects:
591
+ cf_meta = scene_struct.get('cf_metadata') or {}
592
+ cf_type = cf_meta.get('cf_type', '')
593
+ w = getattr(args, 'width', 320)
594
+ h = getattr(args, 'height', 240)
595
+ if cf_type == 'occlusion_change':
596
+ min_pixels = min_visible_pixels(w, h, MIN_VISIBLE_FRACTION_PARTIAL_OCCLUSION, MIN_PIXELS_FLOOR)
597
+ else:
598
+ base = min_visible_pixels(w, h, MIN_VISIBLE_FRACTION, MIN_PIXELS_FLOOR)
599
+ min_pixels = max(getattr(args, 'min_pixels_per_object', MIN_PIXELS_FLOOR), base)
600
+ all_visible = check_visibility(blender_objects, min_pixels)
601
+ if not all_visible:
602
+ print('Visibility check failed: at least one object has too few visible pixels')
603
+ for obj in blender_objects:
604
+ try:
605
+ delete_object(obj)
606
+ except Exception:
607
+ pass
608
+ sys.exit(1)
609
+
610
  filter_type = scene_struct.get('filter_type')
611
  filter_strength = scene_struct.get('filter_strength', 1.0)
612
 
 
890
  if len(objects) < num_objects:
891
  continue
892
 
893
+ min_pixels = max(args.min_pixels_per_object, min_visible_pixels(args.width, args.height))
894
+ all_visible = check_visibility(blender_objects, min_pixels)
895
  if not all_visible:
896
  print('Some objects are occluded; replacing objects')
897
  for obj in blender_objects:
 
931
 
932
 
933
  def check_visibility(blender_objects, min_pixels_per_object):
934
+ """
935
+ Ensure each object has at least min_pixels_per_object visible pixels in the
936
+ rendered image (rejects scenes where an object is fully occluded by others).
937
+ """
938
+ if not INSIDE_BLENDER or not blender_objects:
939
+ return True
940
+ if Image is None:
941
+ return True
942
+ fd, path = tempfile.mkstemp(suffix='.png')
943
+ os.close(fd)
944
+ try:
945
+ colors_list = render_shadeless(blender_objects, path, use_distinct_colors=True)
946
+ img = Image.open(path).convert('RGB')
947
+ w, h = img.size
948
+ pix = img.load()
949
+ color_to_idx = {}
950
+ for i, (r, g, b) in enumerate(colors_list):
951
+ key = (round(r * 255), round(g * 255), round(b * 255))
952
+ color_to_idx[key] = i
953
+ counts = [0] * len(blender_objects)
954
+ for y in range(h):
955
+ for x in range(w):
956
+ key = (pix[x, y][0], pix[x, y][1], pix[x, y][2])
957
+ if key in color_to_idx:
958
+ counts[color_to_idx[key]] += 1
959
+ all_visible = all(c >= min_pixels_per_object for c in counts)
960
+ return all_visible
961
+ finally:
962
+ try:
963
+ os.remove(path)
964
+ except Exception:
965
+ pass
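The per-object pixel counting above can be illustrated without Blender: given a flat render where each object was drawn in a unique emission color, count pixels per color key and compare against the threshold. This sketch uses a synthetic "image" as a list of RGB tuples, and assumes (as the shadeless render is configured to do) that the emission colors survive to 8-bit pixels exactly:

```python
def count_visible_pixels(pixels, colors_list):
    """pixels: iterable of 8-bit (r, g, b) tuples; colors_list: per-object floats in [0, 1]."""
    color_to_idx = {(round(r * 255), round(g * 255), round(b * 255)): i
                    for i, (r, g, b) in enumerate(colors_list)}
    counts = [0] * len(colors_list)
    for key in pixels:
        if key in color_to_idx:
            counts[color_to_idx[key]] += 1
    return counts

# Two objects under the distinct-color scheme r = (i + 1) / (n + 1), g = b = 0.5 (n = 2).
colors = [(1 / 3, 0.5, 0.5), (2 / 3, 0.5, 0.5)]
image = [(85, 128, 128)] * 60 + [(170, 128, 128)] * 3 + [(0, 0, 0)] * 10
counts = count_visible_pixels(image, colors)
print(counts)  # -> [60, 3]: object 1 clears a 50-pixel threshold, object 2 does not
```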
966
 
967
 
968
+ def render_shadeless(blender_objects, path='flat.png', use_distinct_colors=False):
969
  """
970
  Render a version of the scene with shading disabled and unique materials
971
  assigned to all objects. The image itself is written to path. This is used to ensure
972
  that all objects will be visible in the final rendered scene (when check_visibility is enabled).
973
+ Returns a list of (r,g,b) colors in object order (for visibility counting when use_distinct_colors=True).
974
  """
975
  render_args = bpy.context.scene.render
976
 
 
988
  obj = bpy.data.objects[obj_name]
989
  obj.hide_render = True
990
 
991
+ n = len(blender_objects)
992
+ object_colors = [] if use_distinct_colors else set()
993
  old_materials = []
994
  for i, obj in enumerate(blender_objects):
995
  if len(obj.data.materials) > 0:
 
1005
  node_emission = nodes.new(type='ShaderNodeEmission')
1006
  node_output = nodes.new(type='ShaderNodeOutputMaterial')
1007
 
1008
+ if use_distinct_colors:
1009
+ r = (i + 1) / (n + 1)
1010
+ g, b = 0.5, 0.5
1011
+ object_colors.append((r, g, b))
1012
+ else:
1013
+ while True:
1014
+ r, g, b = [random.random() for _ in range(3)]
1015
+ if (r, g, b) not in object_colors:
1016
+ break
1017
+ object_colors.add((r, g, b))
1018
 
1019
  node_emission.inputs['Color'].default_value = (r, g, b, 1.0)
1020
  mat.node_tree.links.new(node_emission.outputs['Emission'], node_output.inputs['Surface'])
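The `use_distinct_colors` scheme assigns `r = (i + 1) / (n + 1)` with fixed `g = b = 0.5`, so adjacent colors stay distinct after 8-bit rounding for the small object counts this pipeline renders. A quick property check (the upper bound of 100 here is an illustrative assumption, well above typical scene sizes):

```python
def distinct_colors(n):
    # Same scheme as the use_distinct_colors branch of render_shadeless.
    return [((i + 1) / (n + 1), 0.5, 0.5) for i in range(n)]

for n in (2, 10, 100):
    keys = {(round(r * 255), round(g * 255), round(b * 255))
            for r, g, b in distinct_colors(n)}
    assert len(keys) == n, f"collision after 8-bit rounding at n={n}"
print("no 8-bit collisions up to n=100")
```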