|
|
--- |
|
|
license: mit |
|
|
language: |
|
|
- en |
|
|
tags: |
|
|
- 3d-scene-generation |
|
|
- indoor-scene |
|
|
- vision-language |
|
|
- reinforcement-learning |
|
|
base_model: Qwen/Qwen2.5-VL-7B-Instruct |
|
|
gated: auto |
|
|
extra_gated_prompt: >- |
|
|
By requesting access to SceneReVis-7B, you agree to the following terms: |
|
|
1. You will use this model only for academic research purposes. |
|
|
2. You will not redistribute the model weights without permission. |
|
|
3. You will cite our paper in any published work that uses this model. |
|
|
extra_gated_fields: |
|
|
Name: text |
|
|
Affiliation: text |
|
|
I want to use this model for: |
|
|
type: select |
|
|
options: |
|
|
- Academic Research |
|
|
- Education |
|
|
- label: Commercial Use |
|
|
value: commercial |
|
|
- label: Other |
|
|
value: other |
|
|
I agree to use this model for non-commercial research only: checkbox |
|
|
extra_gated_heading: "Request access to SceneReVis-7B" |
|
|
extra_gated_description: "Please fill out the form below. Access will be granted automatically after submission." |
|
|
extra_gated_button_content: "Submit & Get Access" |
|
|
--- |
|
|
|
|
|
# SceneReVis-7B |
|
|
|
|
|
SceneReVis-7B is a vision-language model fine-tuned for iterative 3D indoor scene generation and editing. |
|
|
|
|
|
## Model Details |
|
|
|
|
|
- **Base Model**: Qwen2.5-VL-7B-Instruct |
|
|
- **Training**: SFT on SceneChain-12K + GRPO reinforcement learning with voxel-based physics rewards |
|
|
- **Architecture**: Vision-Language Model with tool-calling capabilities |
|
|
|
|
|
## Usage |
|
|
|
|
|
See the [SceneReVis repository](https://github.com/Runder-sun/SceneReVis) for inference instructions. |
|
|
|
|
|
## Citation |
|
|
|
|
|
```bibtex |
|
|
@article{zhao2026scenerevis, |
|
|
title={SceneReVis: A Self-Reflective Vision-Grounded Framework for 3D Indoor Scene Synthesis via Multi-turn RL}, |
|
|
author={Yang Zhao and Shizhao Sun and Meisheng Zhang and Yingdong Shi and Xubo Yang and Jiang Bian}, |
|
|
journal={arXiv preprint arXiv:2602.09432}, |
|
|
year={2026} |
|
|
} |
|
|
``` |
|
|
|