Improve model card: Update pipeline tag, add library_name, extend tags, and import full GitHub README content
#1
by nielsr HF Staff - opened
README.md
CHANGED
|
@@ -1,13 +1,140 @@
|
|
| 1 |
---
|
| 2 |
-
license: apache-2.0
|
| 3 |
base_model:
|
| 4 |
- black-forest-labs/FLUX.1-dev
|
| 5 |
-
|
| 6 |
tags:
|
| 7 |
- panorama
|
| 8 |
- generation
|
| 9 |
- perception
|
| 10 |
- flow-matching
|
| 11 |
---
|
| 12 |
# OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes
|
| 13 |
-
|
| 1 |
---
|
| 2 |
base_model:
|
| 3 |
- black-forest-labs/FLUX.1-dev
|
| 4 |
+
license: apache-2.0
|
| 5 |
+
pipeline_tag: text-to-3d
|
| 6 |
tags:
|
| 7 |
- panorama
|
| 8 |
- generation
|
| 9 |
- perception
|
| 10 |
- flow-matching
|
| 11 |
+
- text-to-image
|
| 12 |
+
- image-to-image
|
| 13 |
+
library_name: diffusers
|
| 14 |
---
|
| 15 |
+
|
| 16 |
# OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes
|
| 17 |
+
|
| 18 |
+
We present [OmniX](https://arxiv.org/abs/2510.26800), a family of panoramic flow matching models for unified panorama generation, perception, and completion.
|
| 19 |
+
|
| 20 |
+
<div align="center">
|
| 21 |
+
|
| 22 |
+
[Project Page](https://yukun-huang.github.io/OmniX/)
|
| 23 |
+
[Paper](https://arxiv.org/abs/2510.26800)
|
| 24 |
+
[Model](https://huggingface.co/KevinHuang/OmniX)
|
| 25 |
+
[Dataset](https://huggingface.co/datasets/KevinHuang/PanoX)
|
| 26 |
+
[Code](https://github.com/HKU-MMLab/OmniX)
|
| 27 |
+
|
| 28 |
+
</div>
|
| 29 |
+
|
| 30 |
+
<p align="left">
|
| 31 |
+
<img src="https://huggingface.co/KevinHuang/OmniX/resolve/main/assets/teaser.png" width="100%">
|
| 32 |
+
<br>
|
| 33 |
+
We introduce <b>OmniX</b>, a family of flow matching generative models that achieves <b>unified panorama perception, generation, and completion</b>. Using OmniX as a world generator, we can create graphics-ready 3D scenes suitable for physically based rendering, relighting, and simulation.
|
| 34 |
+
</p>
|
| 35 |
+
|
| 36 |
+
## Paper Abstract
|
| 37 |
+
|
| 38 |
+
There are two prevalent ways of constructing 3D scenes: procedural generation and 2D lifting. Among them, panorama-based 2D lifting has emerged as a promising technique, leveraging powerful 2D generative priors to produce immersive, realistic, and diverse 3D environments. In this work, we advance this technique to generate graphics-ready 3D scenes suitable for physically based rendering (PBR), relighting, and simulation. Our key insight is to repurpose 2D generative models for panoramic perception of geometry, textures, and PBR materials. Unlike existing 2D lifting approaches that emphasize appearance generation and ignore the perception of intrinsic properties, we present OmniX, a versatile and unified framework. Based on a lightweight and efficient cross-modal adapter structure, OmniX reuses 2D generative priors for a broad range of panoramic vision tasks, including panoramic perception, generation, and completion. Furthermore, we construct a large-scale synthetic panorama dataset containing high-quality multimodal panoramas from diverse indoor and outdoor scenes. Extensive experiments demonstrate the effectiveness of our model in panoramic visual perception and graphics-ready 3D scene generation, opening new possibilities for immersive and physically realistic virtual world generation.
|
| 39 |
+
|
| 40 |
+
## Installation
|
| 41 |
+
Please follow the instructions below to get the code and install dependencies.
|
| 42 |
+
|
| 43 |
+
### Clone the repo:
|
| 44 |
+
```bash
|
| 45 |
+
git clone https://github.com/HKU-MMLab/OmniX.git
|
| 46 |
+
cd OmniX
|
| 47 |
+
```
|
| 48 |
+
|
| 49 |
+
### Create a conda environment:
|
| 50 |
+
```bash
|
| 51 |
+
conda create -n omnix python=3.11
|
| 52 |
+
conda activate omnix
|
| 53 |
+
```
|
| 54 |
+
|
| 55 |
+
### Install dependencies:
|
| 56 |
+
```bash
|
| 57 |
+
pip install -r requirements.txt
|
| 58 |
+
```
|
| 59 |
+
|
| 60 |
+
### Install Blender (optional, for exporting 3D scenes only):
|
| 61 |
+
Please refer to the [official installation guide](https://www.blender.org/download/) to install Blender on your PC or remote server. We use Blender 4.4.3 for Linux.
|
| 62 |
+
|
| 63 |
+
Alternatively, you may use:
|
| 64 |
+
```bash
|
| 65 |
+
pip install bpy
|
| 66 |
+
```
|
| 67 |
+
to use the Blender Python API without a full Blender installation, though we have not tested this route carefully.
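As a quick sanity check before exporting 3D scenes, the two Blender options described above can be probed from Python. This is only a sketch: it assumes the full Blender install exposes an executable named `blender` on `PATH` (an assumption, not something the repository states), and the helper name `find_blender_backend` is hypothetical.

```python
import importlib.util
import shutil


def find_blender_backend() -> str:
    """Report which Blender option appears to be available (hypothetical helper)."""
    # Option 1: a full Blender install. Assumes the executable is called
    # `blender` and is on PATH, which may not hold on every system.
    if shutil.which("blender") is not None:
        return "blender-executable"
    # Option 2: the standalone `bpy` module installed with `pip install bpy`.
    if importlib.util.find_spec("bpy") is not None:
        return "bpy-module"
    # Neither option was found; scene export would likely be unavailable.
    return "none"


print(find_blender_backend())
```

The check only confirms that something is installed; it does not verify the Blender version.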
|
| 68 |
+
|
| 69 |
+
## Inference
|
| 70 |
+
|
| 71 |
+
### Panorama Generation
|
| 72 |
+
OmniX can generate high-quality panoramic images from image or text prompts:
|
| 73 |
+
```bash
|
| 74 |
+
# Generation from Text
|
| 75 |
+
python run_pano_generation.py --prompt "Photorealistic modern living room" --output_dir "outputs/generation_from_text"
|
| 76 |
+
|
| 77 |
+
# Generation from Image and Text
|
| 78 |
+
python run_pano_generation.py --image "assets/examples/image.png" --prompt "Photorealistic modern living room" --output_dir "outputs/generation_from_image_and_text"
|
| 79 |
+
```
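To run several prompts in a row, the commands above can be assembled programmatically. The sketch below uses only the script name and flags shown above (`--prompt`, `--image`, `--output_dir`); the helper `build_generation_command`, the second prompt, and the output folder names are hypothetical and chosen for illustration.

```python
import shlex
from typing import List, Optional


def build_generation_command(
    prompt: str, output_dir: str, image: Optional[str] = None
) -> List[str]:
    """Build the argument list for run_pano_generation.py (hypothetical helper)."""
    cmd = ["python", "run_pano_generation.py"]
    # The image argument is optional; text-only generation omits it.
    if image is not None:
        cmd += ["--image", image]
    cmd += ["--prompt", prompt, "--output_dir", output_dir]
    return cmd


# Hypothetical batch: the second prompt and both folder names are illustrative.
prompts = {
    "living_room": "Photorealistic modern living room",
    "forest": "Photorealistic forest clearing at dawn",
}
for name, prompt in prompts.items():
    cmd = build_generation_command(prompt, f"outputs/{name}")
    # Print a shell-safe version of each command instead of executing it.
    print(shlex.join(cmd))
```

To actually launch a command, pass the list to `subprocess.run(cmd, check=True)` from the repository root.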
|
| 80 |
+
|
| 81 |
+
<img src="https://huggingface.co/KevinHuang/OmniX/resolve/main/assets/pano_gen.png" width="100%">
|
| 82 |
+
|
| 83 |
+
### Panorama Perception
|
| 84 |
+
Given an RGB panorama as input, OmniX can predict geometric, intrinsic, and semantic properties:
|
| 85 |
+
|
| 86 |
+
```bash
|
| 87 |
+
# Perception (Distance, Normal, Albedo, Roughness, Metallic, Semantic) from Panorama
|
| 88 |
+
python run_pano_perception.py --panorama "assets/examples/panorama.png" --output_dir "outputs/perception_from_panorama"
|
| 89 |
+
```
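To illustrate why a predicted distance map is useful for 3D lifting, here is a minimal sketch of turning one panorama pixel plus its distance value into a 3D point. It assumes an equirectangular panorama, distance measured along the viewing ray from the camera center, and a y-up, z-forward axis convention. All three are assumptions for illustration; OmniX's own conventions and output formats may differ, and the function names are hypothetical.

```python
import math
from typing import Tuple


def pixel_to_direction(
    u: float, v: float, width: int, height: int
) -> Tuple[float, float, float]:
    """Unit viewing direction for pixel (u, v) of an equirectangular image."""
    # Longitude spans [-pi, pi) left to right; latitude spans [pi/2, -pi/2]
    # top to bottom. The +0.5 samples at pixel centers.
    lon = ((u + 0.5) / width) * 2.0 * math.pi - math.pi
    lat = math.pi / 2.0 - ((v + 0.5) / height) * math.pi
    # Assumed convention: y is up, z is forward (image center), x is right.
    return (
        math.cos(lat) * math.sin(lon),
        math.sin(lat),
        math.cos(lat) * math.cos(lon),
    )


def lift_pixel(
    u: float, v: float, distance: float, width: int, height: int
) -> Tuple[float, float, float]:
    """Scale the viewing direction by the distance to get a 3D point."""
    dx, dy, dz = pixel_to_direction(u, v, width, height)
    return (distance * dx, distance * dy, distance * dz)


# The image center looks straight ahead, so the point lands on the +z axis.
print(lift_pixel(1.5, 0.5, 2.0, width=4, height=2))
```

Applying `lift_pixel` to every pixel would give a point cloud of the scene under these assumptions.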
|
| 90 |
+
|
| 91 |
+
<img src="https://huggingface.co/KevinHuang/OmniX/resolve/main/assets/pano_perc.png" width="100%">
|
| 92 |
+
|
| 93 |
+
### Panorama Generation and Perception
|
| 94 |
+
Naturally, we can combine panorama generation and perception to obtain a panoramic image with multiple property annotations:
|
| 95 |
+
|
| 96 |
+
```bash
|
| 97 |
+
# Generation and Perception from Text
|
| 98 |
+
python run_pano_all.py --prompt "Photorealistic modern living room" --output_dir "outputs/generation_and_perception_from_text"
|
| 99 |
+
|
| 100 |
+
# Generation and Perception from Image and Text
|
| 101 |
+
python run_pano_all.py --image "assets/examples/image.png" --prompt "Photorealistic modern living room" --output_dir "outputs/generation_and_perception_from_image_and_text"
|
| 102 |
+
```
|
| 103 |
+
|
| 104 |
+
<img src="https://huggingface.co/KevinHuang/OmniX/resolve/main/assets/pano_gen_and_perc.png" width="100%">
|
| 105 |
+
|
| 106 |
+
### Graphics-Ready Scene Generation (Beta)
|
| 107 |
+
Note that the code for graphics-ready scene reconstruction/generation is still a work in progress.
|
| 108 |
+
|
| 109 |
+
```bash
|
| 110 |
+
# Generation from Text
|
| 111 |
+
python run_scene_generation.py --prompt "Photorealistic modern living room" --output_dir "outputs/construction_from_text"
|
| 112 |
+
# Generation from Text (Fast)
|
| 113 |
+
python run_scene_generation.py --prompt "Photorealistic modern living room" --output_dir "outputs/construction_fast_from_text" --rgb_as_albedo --disable_normal --use_default_pbr --fill_invalid_depth
|
| 114 |
+
|
| 115 |
+
# Generation from Image and Text
|
| 116 |
+
python run_scene_generation.py --image "assets/examples/image.png" --prompt "Photorealistic modern living room" --output_dir "outputs/construction_from_image_and_text"
|
| 117 |
+
# Generation from Image and Text (Fast)
|
| 118 |
+
python run_scene_generation.py --image "assets/examples/image.png" --prompt "Photorealistic modern living room" --output_dir "outputs/construction_fast_from_image_and_text" --rgb_as_albedo --disable_normal --use_default_pbr --fill_invalid_depth
|
| 119 |
+
|
| 120 |
+
# Generation from Panorama
|
| 121 |
+
python run_scene_generation.py --panorama "assets/examples/panorama.png" --output_dir "outputs/construction_from_panorama"
|
| 122 |
+
# Generation from Panorama (Fast)
|
| 123 |
+
python run_scene_generation.py --panorama "assets/examples/panorama.png" --output_dir "outputs/construction_fast_from_panorama" --rgb_as_albedo --disable_normal --use_default_pbr --fill_invalid_depth
|
| 124 |
+
```
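The six commands above differ only in their inputs and in whether the four "fast" flags are appended, so they can be generated from one helper. The sketch below is hypothetical (`build_scene_command` is not part of the repository) and covers only the three input combinations shown above; it rejects other combinations because they are not documented here, not because the script is known to reject them.

```python
from typing import List, Optional

# The four flags that the "(Fast)" examples above pass, copied verbatim.
FAST_FLAGS = [
    "--rgb_as_albedo",
    "--disable_normal",
    "--use_default_pbr",
    "--fill_invalid_depth",
]


def build_scene_command(
    output_dir: str,
    prompt: Optional[str] = None,
    image: Optional[str] = None,
    panorama: Optional[str] = None,
    fast: bool = False,
) -> List[str]:
    """Build the argument list for run_scene_generation.py (hypothetical helper)."""
    cmd = ["python", "run_scene_generation.py"]
    if panorama is not None:
        # Only the panorama-alone form appears in the examples above.
        if prompt is not None or image is not None:
            raise ValueError("combination not covered by the documented examples")
        cmd += ["--panorama", panorama]
    elif prompt is not None:
        # Text-only, or image plus text.
        if image is not None:
            cmd += ["--image", image]
        cmd += ["--prompt", prompt]
    else:
        raise ValueError("provide either a panorama or a prompt")
    cmd += ["--output_dir", output_dir]
    if fast:
        cmd += FAST_FLAGS
    return cmd


print(build_scene_command("outputs/construction_fast_from_text",
                          prompt="Photorealistic modern living room", fast=True))
```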
|
| 125 |
+
|
| 126 |
+
<img src="https://huggingface.co/KevinHuang/OmniX/resolve/main/assets/scene.png" width="100%">
|
| 127 |
+
|
| 128 |
+
## Acknowledgement
|
| 129 |
+
This repository is based on many amazing research works and open-source projects: [PanFusion](https://github.com/chengzhag/PanFusion), [DreamCube](https://github.com/Yukun-Huang/DreamCube), [WorldGen](https://github.com/ZiYang-xie/WorldGen), [diffusers](https://github.com/huggingface/diffusers), [equilib](https://github.com/haruishi43/equilib), etc. Thanks to all the authors for their selfless contributions to the community!
|
| 130 |
+
|
| 131 |
+
## Citation
|
| 132 |
+
If you find this repository helpful for your work, please consider citing it as follows:
|
| 133 |
+
```bib
|
| 134 |
+
@article{omnix,
|
| 135 |
+
title={OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes},
|
| 136 |
+
author={Huang, Yukun and Yu, Jiwen and Zhou, Yanning and Wang, Jianan and Wang, Xintao and Wan, Pengfei and Liu, Xihui},
|
| 137 |
+
journal={arXiv preprint arXiv:2510.26800},
|
| 138 |
+
year={2025}
|
| 139 |
+
}
|
| 140 |
+
```
|