Strip video_embeddings_connector + video_aggregate_embed (saves 4.77 GB)
Removed 131 dead tensors that audio-only inference never touches:
- model.diffusion_model.video_embeddings_connector.* (~3.2 GB, 8 transformer blocks at 4096-dim)
- text_embedding_projection.video_aggregate_embed.* (~1.5 GB)
At runtime, the audio_only=True path drops these tensors before .to(device) and replaces video_aggregate_embed with a dummy that returns zeros. Local A/B inference reproduces identical denoise/decode timing and output shape.
File size: 6.71 GB -> 1.94 GB.
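The stripping step described above amounts to filtering a checkpoint's tensor names by prefix. The following is a minimal sketch of that idea, not the script actually used for this commit. It works on any name-to-tensor mapping, so plain integers stand in for tensors here. The two prefixes come from the commit message; the full key names in the toy dictionary (including `audio_block`) and the helper name `strip_prefixes` are hypothetical.

```python
from typing import Dict, Iterable, List, Mapping, Tuple, TypeVar

T = TypeVar("T")

# Prefixes named in the commit message.
DEAD_PREFIXES = (
    "model.diffusion_model.video_embeddings_connector.",
    "text_embedding_projection.video_aggregate_embed.",
)


def strip_prefixes(
    state_dict: Mapping[str, T],
    prefixes: Iterable[str] = DEAD_PREFIXES,
) -> Tuple[Dict[str, T], List[str]]:
    """Return (kept entries, names of dropped entries)."""
    prefixes = tuple(prefixes)  # str.startswith accepts a tuple of prefixes
    kept: Dict[str, T] = {}
    dropped: List[str] = []
    for name, tensor in state_dict.items():
        if name.startswith(prefixes):
            dropped.append(name)
        else:
            kept[name] = tensor
    return kept, dropped


# Toy checkpoint: key suffixes are made up for illustration only.
toy = {
    "model.diffusion_model.video_embeddings_connector.blocks.0.attn.weight": 1,
    "text_embedding_projection.video_aggregate_embed.weight": 2,
    "model.diffusion_model.audio_block.0.weight": 3,
}
kept, dropped = strip_prefixes(toy)
print(f"kept {len(kept)} tensors, dropped {len(dropped)}")
```

With a real checkpoint, the mapping would presumably come from `safetensors.torch.load_file` and the kept entries would be written back with `safetensors.torch.save_file`; checking that the dropped count equals 131 would be a cheap sanity test before uploading.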
dramabox-audio-components.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:73d50dd3e913fd1d2511a09e4a2225f60f2ede43ef629764e6d4a389422bf7d1
+size 1942831020
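A Git LFS pointer records the SHA-256 digest and byte size of the real file, so a downloaded copy can be checked against the new pointer. The sketch below shows one way to do that using only the standard library; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path is whatever the checkpoint was saved as.

```python
import hashlib
from pathlib import Path
from typing import Dict


def parse_lfs_pointer(text: str) -> Dict[str, str]:
    """Split each 'key value' line of a Git LFS pointer into a dict."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        if line.strip():
            key, _, value = line.strip().partition(" ")
            fields[key] = value.strip()
    return fields


def matches_pointer(path: str, pointer_text: str, chunk_size: int = 1 << 20) -> bool:
    """True if the file's size and SHA-256 digest match the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    algo, _, expected = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo!r}")
    file_path = Path(path)
    # Cheap check first: a size mismatch avoids hashing a multi-GB file.
    if file_path.stat().st_size != int(fields["size"]):
        return False
    digest = hashlib.sha256()
    with file_path.open("rb") as fh:
        # Read in chunks so the whole file never sits in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected


# Pointer contents taken from the '+' side of the diff above.
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:73d50dd3e913fd1d2511a09e4a2225f60f2ede43ef629764e6d4a389422bf7d1
size 1942831020
"""
```

Calling `matches_pointer("dramabox-audio-components.safetensors", NEW_POINTER)` on a local download should return `True` if the file is the stripped checkpoint from this commit.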