Scripts
Utility scripts for working with CLEVR counterfactual scenes.
Available Scripts
generate_scenes.py
Generate scene JSON files with counterfactuals (no rendering).
Purpose: Create scene JSON files that can be rendered later.
Usage:
# Generate 10 scene sets
python scripts/generate_scenes.py --num_scenes 10 --num_objects 5 --run_name experiment1
# Resume from checkpoint
python scripts/generate_scenes.py --num_scenes 100 --run_name experiment1 --resume
Output: Scene JSON files in output/<run_name>/scenes/
Next step: Use pipeline.py --render_only to render these scenes.
Note: Alternatively, you can use pipeline.py --skip_render instead of this script.
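Before rendering, it can help to confirm that the scene files were actually written. The following is a minimal sketch, not part of the repository: load_scenes is a hypothetical helper, and it assumes the files sit directly in the scenes/ directory with a .json extension. It makes no assumption about the keys inside each scene file.

```python
import json
from pathlib import Path


def load_scenes(run_dir):
    """Return {filename: parsed JSON} for every *.json file in <run_dir>/scenes/.

    Hypothetical helper: the flat scenes/ layout and the .json suffix are
    assumptions based on the output path described above.
    """
    scenes_dir = Path(run_dir) / "scenes"
    scenes = {}
    for path in sorted(scenes_dir.glob("*.json")):
        with path.open(encoding="utf-8") as f:
            scenes[path.name] = json.load(f)
    return scenes


if __name__ == "__main__":
    # "output/experiment1" assumes the script was run with --run_name experiment1.
    scenes = load_scenes("output/experiment1")
    print(f"Loaded {len(scenes)} scene files")
```

Files that fail to parse raise json.JSONDecodeError, which makes a truncated or partially written scene easy to spot before a long render.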
generate_examples.py
Generate examples of each counterfactual type applied to a base scene. Optionally renders scenes to images.
Purpose: Create reference examples demonstrating all counterfactual types.
Usage:
# Generate scene JSON files only
python scripts/generate_examples.py [--output_dir DIR] [--num_objects N]
# Generate and render to images
python scripts/generate_examples.py --render [--output_dir DIR] [--num_objects N] [--use_gpu 0|1]
Options:
- --output_dir: Output directory (default: output/counterfactual_examples)
- --num_objects: Number of objects in base scene (default: 5)
- --render: Render scenes to PNG images
- --use_gpu: Use GPU rendering (0 = CPU, 1 = GPU, default: 0)
Output:
- Scene JSON files for all counterfactual types
- Optional: PNG images (if --render is used)
generate_questions_mapping.py
Generate a CSV mapping of questions and counterfactual questions for each scene.
Purpose: Create question-answer datasets for training/evaluation.
Usage:
# For a specific run directory
python scripts/generate_questions_mapping.py --output_dir output/experiment1 --generate_questions
# Auto-detect latest run
python scripts/generate_questions_mapping.py --output_dir output --auto_latest --generate_questions
# Generate CSV with scene_id and links (relative paths)
python scripts/generate_questions_mapping.py --output_dir output/experiment1 --generate_questions --with_links
# Generate CSV with scene_id and full URLs
python scripts/generate_questions_mapping.py --output_dir output/experiment1 --generate_questions --with_links --base_url https://example.com/dataset
Options:
- --output_dir: Run directory or base output directory (default: output)
- --auto_latest: Automatically find and use the latest run in output_dir
- --csv_name: Output CSV filename (default: image_mapping_with_questions.csv)
- --generate_questions: Generate questions and answers for each scene set
- --with_links: Include scene_id and image/scene link columns (for URLs or file paths)
- --base_url: Base URL for links (e.g., https://example.com). If not provided, uses relative paths like images/filename.png
Output: CSV files with question-answer mappings
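To inspect the resulting mapping, the CSV can be read with Python's standard csv module. This is a sketch only: read_mapping is a hypothetical helper, the CSV location inside the run directory is an assumption, and the column names are read from the file header rather than assumed.

```python
import csv
from pathlib import Path


def read_mapping(csv_path):
    """Return (column_names, rows); each row is a dict keyed by column name."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        return list(reader.fieldnames or []), rows


if __name__ == "__main__":
    # Assumption: the CSV is written into the run directory under its default name.
    csv_path = Path("output/experiment1") / "image_mapping_with_questions.csv"
    if csv_path.exists():
        columns, rows = read_mapping(csv_path)
        print("Columns:", columns)
        print("Rows:", len(rows))
    else:
        print(f"No mapping found at {csv_path}")
```

Reading the header first is a quick way to check whether --with_links added the scene_id and link columns.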
Main Pipeline
For production use (generating large datasets), use the main pipeline script:
python pipeline.py --num_scenes 100 --num_objects 5 --run_name my_experiment
See the main README.md for full documentation of the production pipeline.
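The --skip_render and --render_only modes suggest a two-stage workflow: generate scenes first, render them later. The sketch below only builds the two command lines. build_two_stage_commands is a hypothetical helper, and passing --run_name to the render-only stage is an assumption; check the main README.md for the arguments that stage actually expects.

```python
import shlex
import sys


def build_two_stage_commands(run_name, num_scenes, num_objects=5):
    """Return (generate_cmd, render_cmd) as argument lists for pipeline.py.

    Assumption: the render-only stage locates the scenes through --run_name.
    """
    base = [sys.executable, "pipeline.py", "--run_name", run_name]
    generate = base + [
        "--num_scenes", str(num_scenes),
        "--num_objects", str(num_objects),
        "--skip_render",
    ]
    render = base + ["--render_only"]
    return generate, render


if __name__ == "__main__":
    generate, render = build_two_stage_commands("my_experiment", 100)
    # Print the commands instead of running them; to execute, pass each list
    # to subprocess.run(cmd, check=True) from the repository root.
    print(shlex.join(generate))
    print(shlex.join(render))
```

Keeping the commands as argument lists avoids shell quoting problems if a run name contains spaces.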
Script Summary
| Script | Purpose | When to Use |
|---|---|---|
| generate_scenes.py | Generate scene JSON files | Generate scenes separately (alternative to pipeline.py --skip_render) |
| generate_examples.py | Generate reference examples | Creating demonstrations, testing counterfactuals |
| generate_questions_mapping.py | Create QA datasets | Preparing training/evaluation data |
| pipeline.py | Combined generation + rendering | Main entry point. Supports --skip_render and --render_only modes |