---
license: other
license_name: nvidia-open-model-license
license_link: >-
  https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/
tags:
  - robotics
  - video-generation
  - diffusion
  - action-conditioned
  - dreamdojo
  - cosmos-predict2.5
library_name: diffusers
pipeline_tag: video-to-video
---

DreamDojo-AgiBot-2B-Diffusers

DreamDojo-AgiBot-2B is an action-conditioned video generation model fine-tuned on AgiBot robot data. It is part of the DreamDojo model family.


Size	2B
Stage	Post-training
Architecture	DiT (Diffusion Transformer) with AdaLN-LoRA
Base	Cosmos Predict 2.5

Checkpoint Structure

DreamDojo-AgiBot-2B-Diffusers/
├── transformer/            # DiT backbone (sharded safetensors)
├── crossattn_adapter/      # Text-to-DiT projection (100352 → 1024)
├── vae/                    # AutoencoderKLWan (standard diffusers)
├── lam/                    # Latent Action Model (710M params)
├── text_encoder/           # Cosmos-Reason1-7B
├── scheduler/              # FlowMatchEulerDiscreteScheduler
├── action_processor/       # DreamDojo-specific config
└── config.json
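
As a quick sanity check after downloading, the layout above can be verified before any weights are loaded. The following is a minimal sketch using only the Python standard library; the function name `missing_components` and the constant `EXPECTED_SUBFOLDERS` are illustrative and not part of any DreamDojo or diffusers API.

```python
from pathlib import Path

# Subfolders listed in the checkpoint tree above.
EXPECTED_SUBFOLDERS = (
    "transformer",
    "crossattn_adapter",
    "vae",
    "lam",
    "text_encoder",
    "scheduler",
    "action_processor",
)


def missing_components(checkpoint_root):
    """Return the expected entries that are absent under checkpoint_root.

    Hypothetical helper: it only checks that each subfolder and the
    top-level config.json exist, not that their contents are valid.
    """
    root = Path(checkpoint_root)
    missing = [name for name in EXPECTED_SUBFOLDERS if not (root / name).is_dir()]
    if not (root / "config.json").is_file():
        missing.append("config.json")
    return missing
```

An empty list means every entry in the tree is present. The components marked as standard diffusers classes (for example `vae/` and `scheduler/`) can presumably then be loaded with the usual `from_pretrained(..., subfolder=...)` pattern, while the DreamDojo-specific folders likely need the code from the DreamDojo repository.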

Architecture

Hyperparameter	2B
Model channels	2048
Transformer blocks	28
Attention heads	16
Patch size (spatial / temporal)	2 / 1
Action dim	384 (unified)
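
To make these numbers concrete, the sketch below derives two quantities from the table: the per-head dimension and the number of tokens the transformer sees for a given latent video. It assumes the channels are split evenly across attention heads and that the latent is patchified without padding; the `DiTConfig` class and the example latent shape are hypothetical and not taken from the DreamDojo code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiTConfig:
    """Values copied from the architecture table (illustrative container)."""

    model_channels: int = 2048
    num_blocks: int = 28
    num_heads: int = 16
    patch_spatial: int = 2
    patch_temporal: int = 1
    action_dim: int = 384

    @property
    def head_dim(self):
        # Assumption: channels are divided evenly across attention heads.
        if self.model_channels % self.num_heads != 0:
            raise ValueError("model_channels must be divisible by num_heads")
        return self.model_channels // self.num_heads

    def num_tokens(self, latent_frames, latent_height, latent_width):
        # Assumption: non-overlapping patches, no padding, so each
        # latent dimension must be a multiple of its patch size.
        if (
            latent_frames % self.patch_temporal
            or latent_height % self.patch_spatial
            or latent_width % self.patch_spatial
        ):
            raise ValueError("latent shape must be divisible by the patch size")
        return (
            (latent_frames // self.patch_temporal)
            * (latent_height // self.patch_spatial)
            * (latent_width // self.patch_spatial)
        )


cfg = DiTConfig()
print(cfg.head_dim)               # 2048 / 16 = 128
print(cfg.num_tokens(4, 32, 32))  # 4 * 16 * 16 = 1024
```

With a temporal patch size of 1, every latent frame contributes its own tokens, so sequence length grows linearly with the number of latent frames.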

Citation

@article{dreamdojo2025,
  title={DreamDojo: Advancing Real-World Robot Policies Through Generated Interactive Environments},
  author={NVIDIA},
  year={2025}
}

License

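This model is distributed under the NVIDIA Open Model License (https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/). Please refer to the NVIDIA DreamDojo repository for the full license terms.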