Update README.md

#8 by kangxuey - opened

Files changed (1): README.md (+9 -4)

README.md CHANGED
@@ -16,14 +16,19 @@ pipeline_tag: image-to-3d
 ---
 
 # Asset Harvester | System Model Card
-[**Paper** (coming soon)]() | [**Project Page** (coming soon)](https://research.nvidia.com/labs/sil/asset-harvester) | [**Code**](https://github.com/NVIDIA/asset-harvester) | [**Model**](https://huggingface.co/nvidia/asset-harvester) | [**Data**](https://huggingface.co/datasets/nvidia/PhysicalAI-Autonomous-Vehicles-NCore)
+**Paper** | **Project Page** | [**Code**](https://github.com/NVIDIA/asset-harvester) | [**Model**](https://huggingface.co/nvidia/asset-harvester) | [**Data**](https://huggingface.co/datasets/nvidia/PhysicalAI-Autonomous-Vehicles-NCore)
 
 ## **Description:**
 
-**Asset Harvester** generates 3D assets from a single image or multiple images of vehicles or VRUs extracted from autonomous driving sessions.
+**Asset Harvester** is an image-to-3D model and end-to-end system that converts sparse, in-the-wild object observations from real driving logs into complete, simulation-ready assets. The model generates 3D assets from a single image or multiple images of vehicles, VRUs, or other road objects extracted from autonomous driving sessions. To run Asset Harvester, please check our [**codebase**](https://github.com/NVIDIA/asset-harvester).
 
-It leverages 4 models (see the white paper for architecture) in the process.
-The [AV object Mask2former](model_cards/AV_Object_Mask2former.md) instance segmentation model is used for image processing when parsing input views from NCore data sessions.
+<p align="center">
+<img src="docs/pipeline.gif" alt="Asset Harvester teaser" width="100%" style="border: none;">
+</p>
+
+**Asset Harvester** turns real-world driving logs into complete, simulation-ready 3D assets from just one or a few in-the-wild object views. It handles vehicles, pedestrians, riders, and other road objects, even under heavy occlusion, noisy calibration, and extreme viewpoint bias. A multiview diffusion model generates consistent novel viewpoints, and a feed-forward Gaussian reconstructor lifts them to full 3D in seconds. The result is high-fidelity 3D Gaussian splat assets ready for insertion into simulation environments. The pipeline plugs directly into NVIDIA NCore and NuRec for scalable data ingestion and closed-loop simulation.
+
+Here is how the model checkpoints in this repo are used in the end-to-end system, in pipeline order: the [AV object Mask2former](model_cards/AV_Object_Mask2former.md) instance segmentation model is used for image processing when parsing input views from NCore data sessions.
 The input images are encoded by [C-Radio](https://huggingface.co/nvidia/C-RADIO),
 and the multiview diffusion model, [SparseViewDiT](model_cards/MultiviewDiffusion.md), is then used to generate 16 multiview images of the input objects.
 In cases where camera parameters are not provided, the multiview diffusion model includes a camera pose estimation submodule that predicts camera parameters for the input images.
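The pipeline order the added README text describes (segmentation → encoding → multiview diffusion, with optional pose estimation) can be sketched as below. This is a minimal illustrative sketch only: every function and name is a hypothetical stand-in, not the real Asset Harvester API, which lives in the linked codebase.

```python
# Illustrative sketch of the Asset Harvester pipeline order described above.
# All names are hypothetical placeholders, NOT the real codebase API.

def segment_objects(session_frames):
    """Step 1 (Mask2Former-style): instance segmentation crops object
    views out of NCore driving-session frames."""
    return [{"crop": f, "mask": f"mask-{i}"} for i, f in enumerate(session_frames)]

def encode_views(object_views):
    """Step 2 (C-RADIO-style): encode each masked object view into features."""
    return [{"features": f"feat({v['crop']})"} for v in object_views]

def multiview_diffusion(features, cameras=None, num_views=16):
    """Step 3 (SparseViewDiT-style): generate 16 multiview images.
    If no camera parameters are provided, a pose-estimation submodule
    first predicts poses for the input views."""
    if cameras is None:
        cameras = [f"estimated-pose-{i}" for i in range(len(features))]
    generated = [f"novel-view-{i}" for i in range(num_views)]
    return generated, cameras

def harvest_asset(session_frames, cameras=None):
    """Run the three stages in pipeline order for one object."""
    views = segment_objects(session_frames)
    feats = encode_views(views)
    multiviews, cams = multiview_diffusion(feats, cameras=cameras)
    return {"multiview_images": multiviews, "input_cameras": cams}

asset = harvest_asset(["frame-0012", "frame-0047"])  # no cameras given: poses are estimated
print(len(asset["multiview_images"]))  # 16
```

The downstream feed-forward Gaussian reconstruction stage mentioned in the description would then consume the 16 generated views to produce the final 3D Gaussian splat asset.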