Video-Text-to-Text
Transformers
Safetensors
English
llava_llama
multimodal
video-understanding
region-grounding
3d-reasoning
4d-reasoning
perceptual-distillation
nvila
vila
Instructions to use nvidia/4D-RGPT-8B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
  - Transformers

    How to use nvidia/4D-RGPT-8B with Transformers:

        # Load model directly
        from transformers import AutoModel
        model = AutoModel.from_pretrained("nvidia/4D-RGPT-8B", dtype="auto")

- Notebooks
  - Google Colab
  - Kaggle
Commit 2c113ec · Parent(s): 492ca8b
fix links (#1)
- fix links (a60e5a5c8fec071b036d4e98d7ad3bba2ba4e1b1)
Co-authored-by: Min-Hung Chen <cmhungsteve@users.noreply.huggingface.co>
README.md CHANGED
@@ -35,7 +35,7 @@ Global
 Expected users are multimodal AI researchers, applied research teams, and developers studying video understanding, region grounding, 3D/4D reasoning, and physical AI. Representative use cases include region-level video question answering, model benchmarking, research on depth-and-time-aware MLLMs, and prototyping for domains such as robotics, autonomous driving, and industrial inspection.
 
 ### Release Date:
-Hugging Face [06/01/2026] via [https://huggingface.co/nvidia/4D-RGPT-8B]
+Hugging Face [06/01/2026] via [https://huggingface.co/nvidia/4D-RGPT-8B](https://huggingface.co/nvidia/4D-RGPT-8B).
 
 ## References(s):
 * Paper: https://arxiv.org/abs/2512.17012 <br>
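The change in this commit turns a bracketed URL, which Markdown renders as literal text, into a complete inline link of the form `[text](target)`. The following is a minimal sketch of how such incomplete links could be detected and repaired automatically. The helper `fix_bare_links` and the pattern `BARE_BRACKETED_URL` are hypothetical names introduced here for illustration; they are not part of the repository, and the commit itself appears to have been a manual edit (it also appends a trailing period, which this sketch does not do).

```python
import re

# Hypothetical helper, not part of the repository.
# Matches "[https://...]" only when it is NOT immediately followed by "(",
# i.e. a bracketed URL that is missing its Markdown link target.
BARE_BRACKETED_URL = re.compile(r"\[(https?://[^\]\s]+)\](?!\()")


def fix_bare_links(markdown: str) -> str:
    """Rewrite [URL] as [URL](URL); complete links are left untouched."""
    return BARE_BRACKETED_URL.sub(
        lambda m: f"[{m.group(1)}]({m.group(1)})", markdown
    )


before = "Hugging Face [06/01/2026] via [https://huggingface.co/nvidia/4D-RGPT-8B]"
after = fix_bare_links(before)
print(after)
```

The bracketed date `[06/01/2026]` is left alone because it does not start with `http://` or `https://`, and running the helper a second time changes nothing, since the negative lookahead skips brackets that already have a target.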