Magenta RealTime 2 — AOTInductor step graphs (base)

Prebuilt, weight-less AOTInductor graphs for the per-frame generation step of magenta-torch/magenta-realtime-2. They let the model run faster than real time without calling torch.compile at runtime (for example on ZeroGPU, where torch.compile is unavailable).

⚠️ Hardware-specific

AOTI artifacts are compiled for a specific GPU architecture. These were built for the NVIDIA RTX PRO 6000 (Blackwell, sm_120), the ZeroGPU architecture, and will not load on other GPUs (A100, H100, L4, T4, consumer cards, …).

On any other GPU, don't use these. Instead:

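import torch
from transformers import AutoModel

# Load the base model in bfloat16 and move it to the GPU.
model = AutoModel.from_pretrained("magenta-torch/magenta-realtime-2",
                                  trust_remote_code=True, dtype=torch.bfloat16).to("cuda")
model.compile_steps()          # portable torch.compile; works on any CUDA GPU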

Or export your own AOTI graphs for your architecture from the per-frame step (MagentaRT2ForConditionalGeneration.depthformer.decoder) — see the converter/compile utilities in the dev repo (fork).

Use (only on matching hardware)

model.load_compiled("magenta-torch/magenta-rt-aoti-base")   # binds the weight-less graphs to the model's weights

This repo contains temporal.pt2 and depth.pt2 (the two hot step graphs) plus metadata.
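The choice between the prebuilt graphs and the portable path can be automated. The following is a minimal sketch, not part of the model's API: `pick_step_backend` and `prepare` are hypothetical helpers, and the sketch assumes that a CUDA compute capability of (12, 0), i.e. sm_120, is a good enough proxy for "matching hardware". That assumption is not confirmed here, so treat a failed load as the final word.

```python
# Hypothetical helpers for illustration; only load_compiled and compile_steps
# come from the model card itself.
AOTI_REPO = "magenta-torch/magenta-rt-aoti-base"
AOTI_CAPABILITY = (12, 0)  # sm_120, the architecture these graphs were built for


def pick_step_backend(capability):
    """Return "aoti" if the capability matches the prebuilt graphs, else "torch.compile"."""
    return "aoti" if tuple(capability) == AOTI_CAPABILITY else "torch.compile"


def prepare(model, capability=None):
    """Bind the prebuilt graphs on matching hardware, otherwise fall back to compile_steps()."""
    if capability is None:
        # Imported lazily so the selection logic can be exercised without a GPU.
        import torch
        capability = torch.cuda.get_device_capability()  # (major, minor) of the current device
    if pick_step_backend(capability) == "aoti":
        model.load_compiled(AOTI_REPO)
    else:
        model.compile_steps()
    return model
```

Calling `prepare(model)` after loading the model with `AutoModel.from_pretrained(...)` would then pick a path based on the current device. Note that the fallback still relies on torch.compile, so it does not help in environments where torch.compile is unavailable.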
