superGong/DIRECT

@@ -2,32 +2,44 @@
 base_model:
 - black-forest-labs/FLUX.1-Fill-dev
 - microsoft/TRELLIS-image-large
 tags:
 - object-insertion
-- image-to-image
 - 3d-aware
 - pose-controllable-generation
-pipeline_tag: image-to-image
 ---
-# DIRECT
-This repository contains the model weights for **Direct 3D-Aware Object Insertion via Decomposed Visual Proxies**.
-DIRECT performs pose-controllable object insertion by decomposing the insertion condition into visual proxies, including a reference object image, a geometry proxy rendered from a reconstructed 3D object, and a scene context image.
-Project page: https://gong1130.github.io/DIRECT/
-Code: https://github.com/Gong1130/DIRECT
 ## Usage
-Please refer to the official code repository for installation instructions and **interactive demo** usage.
-## Model Details
-This repository contains **DIRECT-specific** weights **only**:
 - `lora.safetensors`
 - `condition_embedder.safetensors`
 - `x_embedder.safetensors`
@@ -36,8 +48,19 @@ This repository contains **DIRECT-specific** weights **only**:
 - `image_projector.safetensors`
 - `config.json`
-The model requires the following **external** models:
-- `black-forest-labs/FLUX.1-Fill-dev`
-- `google/siglip2-so400m-patch14-384`
-- `microsoft/TRELLIS-image-large`

 base_model:
 - black-forest-labs/FLUX.1-Fill-dev
 - microsoft/TRELLIS-image-large
+pipeline_tag: image-to-image
 tags:
 - object-insertion
 - 3d-aware
 - pose-controllable-generation
+- image-to-image
 ---
+# DIRECT: Direct 3D-Aware Object Insertion via Decomposed Visual Proxies
+This repository contains the model weights for **DIRECT**, presented in the paper [Direct 3D-Aware Object Insertion via Decomposed Visual Proxies](https://huggingface.co/papers/2606.06601).
+**Authors**: Jingbo Gong, Yikai Wang, Yushi Lan, Yuhao Wan, Ziheng Ouyang, Rui Zhao, Ming-Ming Cheng, Qibin Hou, and Chen Change Loy.
+[**Project Page**](https://gong1130.github.io/DIRECT/) | [**Paper (arXiv)**](https://arxiv.org/abs/2606.06601) | [**Code**](https://github.com/Gong1130/DIRECT)
+## Overview
+DIRECT (Decomposed Injection for Reference Composition and Target-integration) is a framework that enables pose-controllable object insertion. It integrates interactive pose manipulation with high-fidelity 2D image synthesis by decomposing insertion conditions into three visual proxies:
+- **Appearance guidance**: Captured from the reference object image.
+- **Geometry guidance**: Derived from a user-adjusted 3D proxy rendered from a reconstructed 3D object.
+- **Context guidance**: Taken from the target background scene.
+By injecting these proxies through separate pathways, DIRECT preserves the reference appearance, follows the user-specified pose, and adapts the object naturally to the target scene.
 ## Usage
+Please refer to the [official GitHub repository](https://github.com/Gong1130/DIRECT) for installation instructions. You can run the interactive demo with the following command:
+```bash
+python demo/demo.py --gradio_port 7860 --viser_port 8081
+```
+The demo allows you to segment a reference object, reconstruct it in 3D, and interactively manipulate its pose within the background image.
+## Model Details
+This repository contains **DIRECT-specific** weights only:
 - `lora.safetensors`
 - `condition_embedder.safetensors`
 - `x_embedder.safetensors`
 - `image_projector.safetensors`
 - `config.json`
+The framework requires the following **external** foundation models:
+- [black-forest-labs/FLUX.1-Fill-dev](https://huggingface.co/black-forest-labs/FLUX.1-Fill-dev)
+- [google/siglip2-so400m-patch14-384](https://huggingface.co/google/siglip2-so400m-patch14-384)
+- [microsoft/TRELLIS-image-large](https://huggingface.co/microsoft/TRELLIS-image-large)
+- [briaai/RMBG-2.0](https://huggingface.co/briaai/RMBG-2.0) (for background removal in the demo)
+## Citation
+```bibtex
+@inproceedings{gong2026direct,
+  title     = {Direct 3D-Aware Object Insertion via Decomposed Visual Proxies},
+  author    = {Jingbo Gong and Yikai Wang and Yushi Lan and Yuhao Wan and Ziheng Ouyang and Rui Zhao and Ming-Ming Cheng and Qibin Hou and Chen Change Loy},
+  booktitle = {ICML},
+  year      = {2026}
+}
+```
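
The Model Details section lists the DIRECT-specific files this repository ships. Before launching the demo, it can help to confirm that a local copy of the weights is complete. The sketch below is not part of the DIRECT codebase: the helper name `missing_weight_files` and the constant `EXPECTED_FILES` are hypothetical, and the file list is copied from the lines visible in this diff, which may be incomplete because the hunk boundary elides part of the list.

```python
from pathlib import Path

# File names as they appear in this diff. Assumption: the real repository
# may contain additional files that the diff hunk does not show.
EXPECTED_FILES = (
    "lora.safetensors",
    "condition_embedder.safetensors",
    "x_embedder.safetensors",
    "image_projector.safetensors",
    "config.json",
)


def missing_weight_files(weights_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from weights_dir.

    Hypothetical helper for illustration; it only checks that each file
    exists and does not validate its contents.
    """
    root = Path(weights_dir)
    return [name for name in expected if not (root / name).is_file()]
```

Calling `missing_weight_files("path/to/DIRECT")` on a local download directory (the path is a placeholder) returns an empty list when every listed file is present. The external foundation models named above are hosted in separate repositories and are not covered by this check.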