Rename model_cards/MultviewDiffusion.md to model_cards/MultiviewDiffusion.md

Files changed (1)

model_cards/{MultviewDiffusion.md → MultiviewDiffusion.md} RENAMED

@@ -31,7 +31,8 @@ HuggingFace
 **Architecture Type:** Linear Diffusion Transformer
-**Network Architecture:** Linear-attention Diffusion Transformer with a Deep Compression Autoencoder (DC-AE) for efficient high-resolution image generation. C-RADIO for image conditioning signal.
 ## **Input:**
@@ -76,18 +77,18 @@ The model was trained, tested, and finetuned using an Objaverse subset internal
 | Dataset names | Size and content | Training partition | Test partition |
 | :---- | :---- | :---- | :---- |
-| Internal Nvidia AV dataset | Posed images of 278k objects | 83% (cross validation) | 17% |
 | Omniverse 3D assets | 200 3D assets of objects | 100% | 0% |
 | Objaverse | 80k assets collected under commercially viable Creative Commons licenses, | 100% | 0% |
-### Objaverse Commercially Viable Subset
 **Link:** https://objaverse.allenai.org
 **Data Collection Method:** Synthetic 3D assets aggregated from various open-source and licensed sources
 **Labeling Method by Dataset:** Hybrid: Human and Automated
 **Properties:** This dataset consists of a diverse set of over 80,000 synthetic 3D object models spanning everyday items, animals, tools, and complex structures. Each model is rendered into multi-view 2D images with associated camera poses, materials, and mesh properties.
-### Internal NVIDIA AV dataset
 **Data Collection Method:** Sensors

 **Architecture Type:** Linear Diffusion Transformer
+**Network Architecture:** Sparse View Linear-attention Diffusion Transformer, as described in our white paper,
+with a Deep Compression Autoencoder (DC-AE) for efficient high-resolution image generation. C-RADIO for image conditioning signal.
 ## **Input:**
 | Dataset names | Size and content | Training partition | Test partition |
 | :---- | :---- | :---- | :---- |
+| Nvidia Proprietary AV dataset | Posed images of 278k objects | 83% (cross validation) | 17% |
 | Omniverse 3D assets | 200 3D assets of objects | 100% | 0% |
 | Objaverse | 80k assets collected under commercially viable Creative Commons licenses, | 100% | 0% |
+### Objaverse Commercially Viable Subset under CC licenses
 **Link:** https://objaverse.allenai.org
 **Data Collection Method:** Synthetic 3D assets aggregated from various open-source and licensed sources
 **Labeling Method by Dataset:** Hybrid: Human and Automated
 **Properties:** This dataset consists of a diverse set of over 80,000 synthetic 3D object models spanning everyday items, animals, tools, and complex structures. Each model is rendered into multi-view 2D images with associated camera poses, materials, and mesh properties.
+### Nvidia Proprietary AV dataset
 **Data Collection Method:** Sensors