Link Physical AI section to collection #2
by NV-DLynch - opened

README.md CHANGED
@@ -59,7 +59,7 @@ NVIDIA releases optimized and aligned versions of leading community architecture

* [**Llama-3.1-Nemotron**](https://huggingface.co/collections/nvidia/llama-nemotron)**:** A diverse family of models where Llama 3.1 architectures are fine-tuned using NVIDIA's **HelpSteer2** datasets to improve helpfulness and instruction adherence. Includes [Ultra (253B)](https://huggingface.co/nvidia/Llama-3_1-Nemotron-Ultra-253B-v1), [Super (49B)](https://huggingface.co/nvidia/Llama-3.3-Nemotron-Super-49B-v1), and [Nano (8B)](https://huggingface.co/nvidia/Llama-3.1-Nemotron-Nano-8B-v1) variants.
* **Mistral-NeMo:** A 12B-parameter model built in collaboration with Mistral AI, offering a high performance-to-size ratio and an expanded 128k context window.

-## **Physical AI**
+## **[Physical AI](https://huggingface.co/collections/nvidia/physical-ai)**

### **NVIDIA Cosmos**

@@ -76,6 +76,24 @@ NVIDIA releases optimized and aligned versions of leading community architecture

[IsaacLab-Arena](https://github.com/isaac-sim/IsaacLab-Arena): An open-source framework for large-scale, GPU-accelerated robot policy evaluation in simulation, built on top of IsaacLab. It provides modular APIs for task curation, automated diversification, and parallel benchmarking across embodiments and environments. [We have integrated IsaacLab-Arena into LeRobot](https://huggingface.co/docs/lerobot/envhub_isaaclab_arena) for scalable closed-loop policy evaluation and benchmarking, along with datasets and 250+ scenes from our partner [Lightwheel AI](https://www.lightwheel.ai/), on the Hugging Face Hub.

+### **Autonomous Vehicles**
+
+Open datasets and benchmarks for training and evaluating self-driving stacks, spanning real-world driving logs, Cosmos-generated synthetic scenarios, and reasoning benchmarks for traffic understanding.
+
+* [**PhysicalAI-Autonomous-Vehicles**](https://huggingface.co/datasets/nvidia/PhysicalAI-Autonomous-Vehicles)**:** Large-scale multimodal driving dataset (222k items) for perception, prediction, and planning research.
+* [**Cosmos-Drive-Dreams**](https://huggingface.co/datasets/nvidia/PhysicalAI-Autonomous-Vehicle-Cosmos-Drive-Dreams) **&** [**Cosmos-Synthetic**](https://huggingface.co/datasets/nvidia/PhysicalAI-Autonomous-Vehicle-Cosmos-Synthetic)**:** Synthetic driving data generated with Cosmos World Foundation Models for closing the long-tail and corner-case gap.
+* [**PhysicalAI-Autonomous-Vehicles-NuRec**](https://huggingface.co/datasets/nvidia/PhysicalAI-Autonomous-Vehicles-NuRec) **&** [**NCore**](https://huggingface.co/datasets/nvidia/PhysicalAI-Autonomous-Vehicles-NCore)**:** Neural reconstruction and core sensor datasets for high-fidelity scene replay and closed-loop simulation.
+* [**VANTAGE-Bench**](https://huggingface.co/datasets/nvidia/PhysicalAI-VANTAGE-Bench)**:** Benchmark for evaluating vision-language models on autonomous-driving scene understanding, with a [focused subset](https://huggingface.co/datasets/nvidia/PhysicalAI-VANTAGE-Bench-Subset) for rapid iteration.
+* [**Traffic-Anomaly-Reasoning**](https://huggingface.co/datasets/nvidia/PhysicalAI-Traffic-Anomaly-Reasoning)**:** Dataset for training and evaluating models on traffic anomaly detection and causal reasoning.
+
+### **Smart Spaces**
+
+Datasets and simulation-ready assets for spatial intelligence in warehouses, factories, and other physical environments, purpose-built for AI agents that perceive, reason about, and act in real-world spaces.
+
+* [**PhysicalAI-SmartSpaces**](https://huggingface.co/datasets/nvidia/PhysicalAI-SmartSpaces)**:** Multi-camera dataset for smart-space perception, including person tracking, activity recognition, and scene understanding.
+* [**SimReady-Warehouse-01**](https://huggingface.co/datasets/nvidia/PhysicalAI-SimReady-Warehouse-01)**:** Simulation-ready warehouse assets for Omniverse / Isaac Sim, enabling end-to-end synthetic data generation and policy training.
+* [**Spatial-Intelligence-Warehouse**](https://huggingface.co/datasets/nvidia/PhysicalAI-Spatial-Intelligence-Warehouse) **&** [**Lyra-SDG**](https://huggingface.co/datasets/nvidia/PhysicalAI-SpatialIntelligence-Lyra-SDG)**:** Warehouse spatial-reasoning datasets and synthetic data generation pipelines for training spatially aware foundation models.
+
# **Nemotron Datasets**

Every model NVIDIA ships rests on a data layer, and that data shapes how the model reasons, what it knows, and where it can be safely deployed. Nemotron Datasets are the open version of that foundation: web-scale pretraining corpora, alignment and reasoning data, multimodal grounding, and embodied AI simulation, released under permissive licenses with the training recipes and evaluation frameworks that produced them. Beyond Nemotron, NVIDIA's broader open data catalog spans 200+ releases across Physical AI and robotics, autonomous vehicles, biology and drug discovery, retrieval and evaluation benchmarks, and sovereign AI. Use the table below to find the right starting point for what you're trying to build.
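Since every dataset added in this diff is an ordinary Hub repository, any of them can be inspected and partially fetched with `huggingface_hub`. A minimal sketch, assuming only the public Hub API (the repo id comes from the list above; `preview_dataset` and its default `*.md` pattern are illustrative choices, not part of any release, and gated repos additionally require an authenticated login):

```python
from huggingface_hub import list_repo_files, snapshot_download

def preview_dataset(repo_id: str, pattern: str = "*.md"):
    """List a dataset repo's files, then download only files matching
    `pattern` (here just the Markdown docs) instead of the full corpus."""
    files = list_repo_files(repo_id, repo_type="dataset")
    local_dir = snapshot_download(
        repo_id,
        repo_type="dataset",
        allow_patterns=[pattern],  # skip the heavy data shards
    )
    return files, local_dir

# Example (requires network access; may require `huggingface-cli login`):
# files, path = preview_dataset("nvidia/PhysicalAI-SmartSpaces")
```

Restricting `allow_patterns` matters here: several of these datasets are hundreds of gigabytes, so pulling just the documentation first is usually the right way to evaluate whether a corpus fits your task.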