Catastrophic anatomical hallucinations on complex pose multi-view generation
Hi @Vast-AI team,
I’ve been stress-testing MV-Adapter-I2MV-SDXL on complex, non-canonical poses. Multi-view generation from a single image is a hard problem, but I found a severe failure mode affecting skeletal integrity under occlusion.
My input was a subject in a handstand with heavy occlusion (a large skirt hiding the body).
The Issue: When generating alternative views, the model completely loses anatomical structure. It cannot infer the hidden geometry of the body and instead generates severe, unstructured hallucinations of flesh and fabric (see attached examples). There is a total breakdown of multi-view consistency.
The Fix: My team at Repalto specializes in constructing data for multi-view consistency and complex anatomical priors. We can build a "Multi-View / High-Occlusion Benchmark" (e.g., yoga poses, heavy clothing, varied camera angles) to help the model learn to maintain skeletal integrity even when parts are hidden.
Happy to put together a benchmark batch and send it over if you want to use it to evaluate or fine-tune the next version. Just let me know.






