kangxuey committed · Commit d06b652 · verified · 1 Parent(s): 3345e79

Rename model_cards/MultviewDiffusion.md to model_cards/MultiviewDiffusion.md

model_cards/{MultviewDiffusion.md → MultiviewDiffusion.md} RENAMED
@@ -31,7 +31,8 @@ HuggingFace
 
  **Architecture Type:** Linear Diffusion Transformer
 
- **Network Architecture:** Linear-attention Diffusion Transformer with a Deep Compression Autoencoder (DC-AE) for efficient high-resolution image generation. C-RADIO for image conditioning signal.
+ **Network Architecture:** Sparse View Linear-attention Diffusion Transformer, as described in our white paper,
+ with a Deep Compression Autoencoder (DC-AE) for efficient high-resolution image generation. C-RADIO for image conditioning signal.
 
  ## **Input:**
 
@@ -76,18 +77,18 @@ The model was trained, tested, and finetuned using an Objaverse subset internal
 
  | Dataset names | Size and content | Training partition | Test partition |
  | :---- | :---- | :---- | :---- |
- | Internal Nvidia AV dataset | Posed images of 278k objects | 83% (cross validation) | 17% |
+ | Nvidia Proprietary AV dataset | Posed images of 278k objects | 83% (cross validation) | 17% |
  | Omniverse 3D assets | 200 3D assets of objects | 100% | 0% |
  | Objaverse | 80k assets collected under commercially viable Creative Commons licenses, | 100% | 0% |
 
- ### Objaverse Commercially Viable Subset
+ ### Objaverse Commercially Viable Subset under CC licenses
 
  **Link:** https://objaverse.allenai.org
  **Data Collection Method:** Synthetic 3D assets aggregated from various open-source and licensed sources
  **Labeling Method by Dataset:** Hybrid: Human and Automated
  **Properties:** This dataset consists of a diverse set of over 80,000 synthetic 3D object models spanning everyday items, animals, tools, and complex structures. Each model is rendered into multi-view 2D images with associated camera poses, materials, and mesh properties.
 
- ### Internal NVIDIA AV dataset
+ ### Nvidia Proprietary AV dataset
 
  **Data Collection Method:** Sensors
 