nvidia/4D-RGPT-8B


cmhungsteve committed on Jun 2

Commit a60e5a5 · verified · 1 Parent(s): 492ca8b

fix links

Files changed (1)

README.md +1 -1


@@ -35,7 +35,7 @@ Global
 Expected users are multimodal AI researchers, applied research teams, and developers studying video understanding, region grounding, 3D/4D reasoning, and physical AI. Representative use cases include region-level video question answering, model benchmarking, research on depth-and-time-aware MLLMs, and prototyping for domains such as robotics, autonomous driving, and industrial inspection.
 ### Release Date:
-Hugging Face [06/01/2026] via [https://huggingface.co/nvidia/4D-RGPT-8B]
+Hugging Face [06/01/2026] via [https://huggingface.co/nvidia/4D-RGPT-8B](https://huggingface.co/nvidia/4D-RGPT-8B).
 ## References(s):
 * Paper: https://arxiv.org/abs/2512.17012 <br>