---
license: mit
tags:
- 3d
- depth-estimation
- multilayer-depth
- point-cloud
- diffusion
- image-to-3d
library_name: torch
pipeline_tag: image-to-3d
extra_gated_heading: "Request access to the World Tracing object model"
extra_gated_description: >
These checkpoints are released for research and product
experimentation under the **MIT license**. Please share a few
details below so we can keep a light audit trail of how the
weights are used in the wild. Requests are reviewed manually,
typically within **1-3 business days**.
extra_gated_button_content: "Submit access request"
extra_gated_fields:
Full name: text
Affiliation (university / company): text
Country: country
Primary intended use:
type: select
options:
- Academic research
- Personal / hobbyist project
- Industrial research
- Commercial product
- Other
Brief description of your intended use: text
I agree to cite the World Tracing paper in any publication or release that uses these weights: checkbox
---
# World Tracing — Object Model (6-layer, r75b)
## Access
The checkpoints in this repo are released under the **MIT license**,
but downloads are **gated** so we can keep a light audit trail of
how the model is used. To download:
1. Scroll up, fill in the access request form (basic contact info
plus a short note on intended use), and click **Submit access request**.
2. We review every request manually, usually within **1-3 business
days**. You will receive an email from Hugging Face once your
request is approved.
3. After approval, log in with `huggingface-cli login` (or set
`HF_TOKEN`) and run any of the inference examples from the
[GitHub repo](https://github.com/haoz19/world-tracing). The `wt`
package picks up the token automatically, and passing `--ckpt r75b`
(or `r69e` / `r76`) triggers a normal `hf_hub_download`.
> *Note:* this is a **manual review** flow, not an auto-approve
> click-through. We read every request individually, so please give
> a one-line description of what you plan to use the weights for.
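
For scripts that fetch the weights directly instead of going through the `wt` package, the download in step 3 can be sketched with `huggingface_hub`. This is a minimal sketch under stated assumptions: `REPO_ID` is a placeholder to be replaced with the id shown at the top of this model page, and `resolve_token` / `download_weights` are illustrative helper names, not part of `wt`.

```python
import os

# Placeholder, not the real repo id: replace with the id shown on this model page.
REPO_ID = "<org>/<this-repo>"
FILENAME = "model.pt"  # the single file listed under "Files"


def resolve_token(env=None):
    """Return the token from HF_TOKEN, or None to fall back to the saved login."""
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    return token or None


def download_weights(repo_id=REPO_ID, filename=FILENAME):
    """Download the gated checkpoint (or reuse the cached copy) and return its path."""
    # Imported lazily so resolve_token() works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    # token=None lets huggingface_hub use the token stored by `huggingface-cli login`.
    return hf_hub_download(repo_id=repo_id, filename=filename, token=resolve_token())


# Typical use once access has been approved:
# local_path = download_weights()
```

If the access request has not been approved yet, the download call is expected to fail with an authorization error rather than return a path.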

This repo hosts the EMA-only release weights for the **r75b** object model from
[*World Tracing: Generative Pixel-Aligned Geometry Beyond the Visible*](https://haoz19.github.io/world-tracing-page/).
* **Repo**: <https://github.com/haoz19/world-tracing>
* **Project page**: <https://haoz19.github.io/world-tracing-page/>
* **Config name**: `r75b`
* **Architecture**: `MultilayerXYZModel` (DINOv2-vit-L encoder + 6-layer
diffusion head), 1.7 B params
* **Input**: 504 × 504 RGBA, alpha-matted single object
* **Output**: per-layer XYZ in camera space, 6 stacked depth maps
(visible surface + 5 occluded layers behind it)
* **Training data**: Objaverse renders + curated public 3D-asset corpora
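
The six stacked XYZ maps can be merged into a single point cloud for viewing or export. The sketch below assumes an array of shape `(L, H, W, 3)` plus a boolean validity mask of shape `(L, H, W)`; both the layout and the mask are chosen for illustration, and the tensors actually returned by the `wt` package may be organized differently. `layers_to_point_cloud` is a hypothetical helper name.

```python
import numpy as np


def layers_to_point_cloud(xyz, valid):
    """Flatten stacked per-layer XYZ maps into one (N, 3) point cloud.

    xyz:   float array, shape (L, H, W, 3), camera-space coordinates per layer and pixel
    valid: bool array, shape (L, H, W), True where a layer has a surface (assumed mask)
    Returns the kept points and the layer index of each point.
    """
    xyz = np.asarray(xyz)
    valid = np.asarray(valid, dtype=bool)
    if xyz.ndim != 4 or xyz.shape[-1] != 3 or xyz.shape[:-1] != valid.shape:
        raise ValueError("expected xyz (L, H, W, 3) and valid (L, H, W)")
    layer_idx = np.nonzero(valid)[0]  # layer index of every kept pixel
    points = xyz[valid]               # boolean mask over (L, H, W) -> (N, 3)
    return points, layer_idx


# Toy data: 6 layers on a 4 x 4 grid (real maps are much larger).
xyz = np.arange(6 * 4 * 4 * 3, dtype=np.float32).reshape(6, 4, 4, 3)
valid = np.zeros((6, 4, 4), dtype=bool)
valid[0] = True          # visible surface covers every pixel
valid[1, :2, :2] = True  # a small occluded patch on the second layer
points, layer_idx = layers_to_point_cloud(xyz, valid)
print(points.shape)  # (20, 3)
```

Keeping the layer index alongside each point makes it easy to color or filter the cloud by layer, for example to show only the occluded surfaces.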
## Files
| File | Size | Format |
|---|---|---|
| `model.pt` | 6.21 GB | bare `state_dict`, float32 |
This release contains only the EMA weights (no optimizer state, config,
or gradients), so the download is about 26% of the size of the original
training checkpoint.
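
Because `model.pt` is a bare `state_dict`, it can be inspected with plain PyTorch without constructing the model. The following is a minimal sketch; `summarize_state_dict` and `load_state_dict` are illustrative helper names, and the local path passed to `load_state_dict` is whatever location the file was downloaded to.

```python
def summarize_state_dict(state_dict):
    """Return (parameter count, size in bytes) for a mapping of name -> tensor."""
    n_params = sum(t.numel() for t in state_dict.values())
    n_bytes = sum(t.numel() * t.element_size() for t in state_dict.values())
    return n_params, n_bytes


def load_state_dict(path):
    """Load the bare state_dict onto the CPU, restricted to tensor data."""
    import torch  # imported lazily so the summary helper has no hard dependency

    # weights_only=True avoids unpickling arbitrary Python objects.
    return torch.load(path, map_location="cpu", weights_only=True)


# Typical use after downloading:
# n_params, n_bytes = summarize_state_dict(load_state_dict("model.pt"))
# print(f"{n_params / 1e9:.2f} B params, {n_bytes / 2**30:.2f} GiB of tensor data")
```

For float32 weights each parameter takes 4 bytes, so the byte total should be close to the file size, minus a small serialization overhead.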
## Usage
```bash
git clone https://github.com/haoz19/world-tracing
cd world-tracing
pip install -e ".[viz,bg]"
python examples/infer_rgba.py \
--image examples/test_images/object/obj014_leather_briefcase.png \
--ckpt r75b \
--config r75b \
--out /tmp/wt_obj.rrd
```
Passing just the name `--ckpt r75b` triggers `huggingface_hub.hf_hub_download`
against this repo and caches the weights under `~/.cache/huggingface/hub/`.
The first run downloads 6.21 GB; subsequent runs reuse the cached file and skip the download.
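
The cache location can be moved with the standard Hugging Face environment variables, which is useful when the home directory is small. The sketch below reproduces the documented lookup order (`HF_HUB_CACHE`, then `HF_HOME`, then the default under `~/.cache`) in plain Python; it is an illustration only, ignores deprecated variable names, and uses hypothetical helper names and a made-up repo id in the test values.

```python
import os
from pathlib import Path


def hub_cache_dir(env=None):
    """Return the directory where hub downloads are cached for the given environment."""
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser() / "hub"
    # Default: $XDG_CACHE_HOME/huggingface/hub, or ~/.cache/huggingface/hub.
    xdg = env.get("XDG_CACHE_HOME")
    base = Path(xdg).expanduser() if xdg else Path("~/.cache").expanduser()
    return base / "huggingface" / "hub"


def repo_cache_dir(repo_id, env=None):
    """Return the per-repo folder, which the hub names models--<org>--<name>."""
    return hub_cache_dir(env) / ("models--" + repo_id.replace("/", "--"))
```

Deleting the per-repo folder forces a fresh download on the next run.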
## Citation
```bibtex
@misc{zhang2026worldtracing,
title = {World Tracing: Generative Pixel-Aligned Geometry Beyond the Visible},
author = {Hao Zhang and Mohamed El Banani and Jen-Hao Cheng and Paul Zhang
and Yi Hua and Ben Mildenhall and Christoph Lassner
and Narendra Ahuja and Gengshan Yang},
year = {2026},
eprint = {TODO},
archivePrefix = {arXiv},
primaryClass = {cs.CV}
}
```
## License
MIT — see the [GitHub repo](https://github.com/haoz19/world-tracing/blob/main/LICENSE).