ProCap: Experiment Materials
This repository contains the official experimental materials for the paper:
Imagine How to Change: Explicit Procedure Modeling for Change Captioning
It provides processed datasets, pre-trained model weights, and evaluation tools for reproducing the results reported in the paper.
All materials are also available via Baidu Netdisk.
Extraction Code: 5h7w
Contents
Data
All datasets are preprocessed into pseudo-sequences generated by VFIformer and stored as .h5 files.
Included Datasets
- CLEVR-data: Processed pseudo-sequences for the CLEVR-Change dataset
- edit-data: Processed pseudo-sequences for the Image-Editing-Request dataset
- spot-data: Processed pseudo-sequences for the Spot-the-Diff dataset
- filter_files: Confidence scores computed using CLIP4IDC
- filtered-spot-captions: Refined captions for the Spot-the-Diff dataset
Model Weights
This repository provides pre-trained weights for both stages in the paper.
Explicit Procedure Modeling (Stage 1)
- pretrained_vqgan: VQGAN models for each dataset
- stage1_clevr_best
- stage1_edit_best
- stage1_spot_best
Implicit Procedure Captioning (Stage 2)
- clevr_best
- edit_best
- spot_best
Note: Stage 1 checkpoints can be directly reused to initialize Stage 2 training.
Evaluation
- densevid_eval: Evaluation tools used for quantitative assessment
Usage
1. Data Preparation
- Move the caption files in filtered-spot-captions to the original caption directory of the Spot-the-Diff dataset.
- Copy the processed data folders to the original dataset root and rename them as follows:
| Dataset | Folder | Rename To |
|---|---|---|
| CLEVR-Change | CLEVR-data | CLEVR_processed |
| Image-Editing-Request | edit-data | edit_processed |
| Spot-the-Diff | spot-data | spot_processed |
- Place filter_files in the project root directory.
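The copy-and-rename step can be scripted. The following is a minimal sketch, not part of the repository: the helper name install_processed and the directory arguments are hypothetical, and only the folder names come from the table above.

```python
import shutil
from pathlib import Path

# Downloaded folder name -> name expected under the dataset root (from the table above).
RENAMES = {
    "CLEVR-data": "CLEVR_processed",
    "edit-data": "edit_processed",
    "spot-data": "spot_processed",
}


def install_processed(download_dir, dataset_root, folder):
    """Copy one processed folder into a dataset root under its renamed name.

    Hypothetical helper: download_dir is wherever the materials were unpacked,
    dataset_root is the root of the corresponding original dataset.
    """
    src = Path(download_dir) / folder
    dst = Path(dataset_root) / RENAMES[folder]
    if not src.is_dir():
        raise FileNotFoundError(f"missing processed folder: {src}")
    # dirs_exist_ok (Python 3.8+) allows the copy to be re-run without failing.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst
```

For example, install_processed("/path/to/downloads", "/path/to/Spot-the-Diff", "spot-data") would create /path/to/Spot-the-Diff/spot_processed; both paths are placeholders.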
2. Model Weights
- Place pretrained_vqgan in the project root directory.
- To reuse Stage 1 weights during training, set symlink_path in the training scripts:
  symlink_path="/path/to/stage1/weight/dalle.pt"
- To evaluate with pre-trained checkpoints, set resume_path in the evaluation scripts:
  resume_path="/path/to/pretrained/model/model.chkpt"
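Before editing the scripts, it can help to confirm that a checkpoint path actually resolves to a file. The sketch below is an illustration only: check_checkpoint is a hypothetical helper, and the file names dalle.pt and model.chkpt are taken from the example paths above, so treating them as required names is an assumption.

```python
from pathlib import Path


def check_checkpoint(path, expected_name):
    """Return the resolved path if it is an existing file with the expected name.

    Hypothetical helper; expected_name would be "dalle.pt" for symlink_path
    or "model.chkpt" for resume_path, following the example paths.
    """
    p = Path(path).expanduser()
    if p.name != expected_name:
        raise ValueError(f"expected a file named {expected_name}, got {p.name}")
    if not p.is_file():
        raise FileNotFoundError(f"checkpoint not found: {p}")
    return p.resolve()
```

The returned absolute path can then be pasted into symlink_path or resume_path.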
3. Evaluation Tool
Place the densevid_eval directory in the project root before evaluation.
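The usage steps place three directories in the project root: filter_files, pretrained_vqgan, and densevid_eval. A quick pre-flight check along these lines can catch a missing one before a run starts. It is a sketch under that reading of the steps; missing_dirs is a hypothetical helper, not a script shipped with the repository.

```python
from pathlib import Path

# Directories the usage steps place in the project root.
REQUIRED_DIRS = ("filter_files", "pretrained_vqgan", "densevid_eval")


def missing_dirs(project_root):
    """Return the required directory names that are absent from project_root."""
    root = Path(project_root)
    return [name for name in REQUIRED_DIRS if not (root / name).is_dir()]
```

An empty list means all three directories are in place.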
Citation
If you find our work or this repository useful, please consider citing our paper:
@inproceedings{sun2026imagine,
  title={Imagine How To Change: Explicit Procedure Modeling for Change Captioning},
  author={Sun, Jiayang and Guo, Zixin and Cao, Min and Zhu, Guibo and Laaksonen, Jorma},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
}
License
This repository is released under the MIT License.