AnthonyGosselin/Ctrl-Crash

AnthonyGosselin committed on Jun 3, 2025

Commit eb41b3e (verified) · 1 parent: 9ffc2fd

Update README.md

Files changed (1)

README.md +15 -5

@@ -6,7 +6,7 @@ base_model:
 # Model Card for Ctrl-Crash
-Generate car crash videos from an initial frame and using bounding-box and crash type conditioning.
 <p align="center">
   <table cellspacing="0" cellpadding="0">
@@ -26,7 +26,7 @@ Generate car crash videos from an initial frame and using bounding-box and crash
 <img src="architecture_figure.png" width=800>
 </p>
-TODO: Provide a longer summary of what this model is.
 - Visit the **project page** for demos: https://anthonygosselin.github.io/Ctrl-Crash-ProjectPage/
 - Visit the **repository** to get started: https://github.com/CtrlCrash-Anonymous/Ctrl-Crash-Anonymous
@@ -39,13 +39,23 @@ This model uses the Stability AI Image-to-Video model (SVD 1.1) as a base model:
 <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-TODO: Here we can describe the different operation modes (Reconstruction, Prediction and counterfactuals)
 ## Bias, Risks, and Limitations
-TODO: Limitations of model
-**BibTeX:**
 ```bibtex
 @misc{gosselin2025ctrlcrashcontrollablediffusionrealistic,
       title={Ctrl-Crash: Controllable Diffusion for Realistic Car Crashes},

 # Model Card for Ctrl-Crash
+Generate car crash videos from an initial frame, using bounding-box and crash type control signals.
 <p align="center">
   <table cellspacing="0" cellpadding="0">
 <img src="architecture_figure.png" width=800>
 </p>
+<!-- TODO: Provide a longer summary of what this model is. -->
 - Visit the **project page** for demos: https://anthonygosselin.github.io/Ctrl-Crash-ProjectPage/
 - Visit the **repository** to get started: https://github.com/CtrlCrash-Anonymous/Ctrl-Crash-Anonymous
 <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
+<!-- TODO: Here we can describe the different operation modes (Reconstruction, Prediction and counterfactuals) -->
+Ctrl-Crash supports three task settings, each enabled by varying the available control signals:
+- **(1) Crash Reconstruction**: Given an initial image, a full bounding-box sequence, and a crash type, the model reconstructs a consistent video combining the visual context of the initial frame with agent motion derived from the bounding boxes.
+- **(2) Crash Prediction**: Given the initial frame and only a few initial bounding-box frames (e.g., frames 0–9), the model predicts the future motion of agents in a way that aligns with the target crash type.
+- **(3) Crash Counterfactuals**: Extending the prediction task, this mode varies the crash type signal while keeping other inputs fixed. This yields multiple plausible outcomes for the same scene and supports counterfactual safety reasoning.
 ## Bias, Risks, and Limitations
+Despite its strong performance, our approach has several limitations, which motivate future work in this direction.
+- Counterfactual outcomes can be hard to generate when initial scene conditions conflict with the desired crash type.
+- The model relies heavily on bounding boxes, making it sensitive to tracking errors, especially in fully conditioned reconstruction.
+- Without bounding-box conditioning, motion direction can be ambiguous, and 2D boxes struggle to capture rotation or orientation, limiting realism in behaviors like spinouts.
+- The model does not support text conditioning.
+**BibTeX:**
 ```bibtex
 @misc{gosselin2025ctrlcrashcontrollablediffusionrealistic,
       title={Ctrl-Crash: Controllable Diffusion for Realistic Car Crashes},