LLaVA-SpaceSGG: Visual Instruct Tuning for Open-vocabulary Scene Graph Generation with Enhanced Spatial Relations
LLaVA-SpaceSGG baseline models for scene graph generation. Paper: https://arxiv.org/abs/2412.06322v1
Base model: liuhaotian/llava-v1.5-13b