Manmay committed
Commit cb2ed56 · verified · 1 Parent(s): 597bbe9

Strip video_embeddings_connector + video_aggregate_embed (saves 4.77 GB)

Removed 131 dead tensors that audio-only inference never touches:
- model.diffusion_model.video_embeddings_connector.* (~3.2 GB, 8 transformer blocks at 4096-dim)
- text_embedding_projection.video_aggregate_embed.* (~1.5 GB)
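The removal described above amounts to filtering the checkpoint by key prefix. A minimal sketch, assuming the two prefixes from the list; the helper names `split_by_prefix` and `strip_file` are hypothetical and not taken from this repository, and `strip_file` loads every tensor into CPU memory, so it needs roughly as much RAM as the source file:

```python
from typing import Mapping, Tuple, TypeVar

T = TypeVar("T")

# Prefixes named in the commit message. The trailing dot avoids matching
# unrelated keys that merely start with the same characters.
DEAD_PREFIXES: Tuple[str, ...] = (
    "model.diffusion_model.video_embeddings_connector.",
    "text_embedding_projection.video_aggregate_embed.",
)


def split_by_prefix(tensors: Mapping[str, T], prefixes: Tuple[str, ...] = DEAD_PREFIXES):
    """Return (kept, dropped) dicts; a key is dropped if it starts with any prefix."""
    kept, dropped = {}, {}
    for name, value in tensors.items():
        target = dropped if name.startswith(tuple(prefixes)) else kept
        target[name] = value
    return kept, dropped


def strip_file(src: str, dst: str) -> int:
    """Write a copy of `src` without the dead tensors; return how many were dropped."""
    # Imported lazily so split_by_prefix stays usable without these packages.
    from safetensors import safe_open
    from safetensors.torch import save_file

    with safe_open(src, framework="pt", device="cpu") as f:
        tensors = {name: f.get_tensor(name) for name in f.keys()}
        metadata = f.metadata()  # may be None; save_file accepts that
    kept, dropped = split_by_prefix(tensors)
    save_file(kept, dst, metadata=metadata)
    return len(dropped)
```

If the checkpoint matches the description, `strip_file` should report 131 dropped tensors for this file.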

The runtime audio_only=True path drops these tensors before .to(device) and replaces video_aggregate_embed with a dummy that returns zeros, so removing them from the file does not change what inference computes. A local A/B inference run reproduces identical denoise/decode timing and output shape.
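One way such a zero-returning dummy could look in PyTorch is sketched below. The class name `ZeroAggregateEmbed`, the `out_features` argument, the output-shape rule, and the toy `Projection` container are all assumptions for illustration; the actual runtime code is not shown in this commit.

```python
import torch
from torch import nn


class ZeroAggregateEmbed(nn.Module):
    """Parameter-free stand-in that returns zeros (hypothetical name and shape rule)."""

    def __init__(self, out_features: int):
        super().__init__()
        self.out_features = out_features

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Keep the input's leading dimensions, dtype and device.
        return x.new_zeros((*x.shape[:-1], self.out_features))


class Projection(nn.Module):
    """Toy container standing in for text_embedding_projection (assumed layout)."""

    def __init__(self):
        super().__init__()
        self.video_aggregate_embed = nn.Linear(4, 8)


projection = Projection()
# Swap the real layer for the dummy before the module is moved to a device,
# so its weights never need to be loaded or transferred.
projection.video_aggregate_embed = ZeroAggregateEmbed(out_features=8)
out = projection.video_aggregate_embed(torch.ones(2, 3, 4))
```

Because the dummy holds no parameters, a checkpoint without the `video_aggregate_embed.*` keys can be loaded into the module without missing-key errors for that submodule.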

6.71 GB -> 1.94 GB.

dramabox-audio-components.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:5fc3fd010c386ddce78f58ac43600cdbc921c50b5b0426ff08167863bfd419d7
- size 5167945180
+ oid sha256:73d50dd3e913fd1d2511a09e4a2225f60f2ede43ef629764e6d4a389422bf7d1
+ size 1942831020
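The diff above changes only a Git LFS pointer, which records the SHA-256 digest and byte size of the real file. A minimal sketch for reading such a pointer and checking a downloaded file against it; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical:

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Stream the file and compare its size and digest with the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:73d50dd3e913fd1d2511a09e4a2225f60f2ede43ef629764e6d4a389422bf7d1
size 1942831020
"""

pointer = parse_lfs_pointer(NEW_POINTER)
print(f"{pointer['size'] / 1e9:.2f} GB")  # decimal gigabytes; prints "1.94 GB"
```

Calling `matches_pointer("dramabox-audio-components.safetensors", pointer)` on a local copy would confirm that the download corresponds to this commit.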