Instructions for using genmo/mochi-1-preview with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
- Diffusers
How to use genmo/mochi-1-preview with Diffusers:
```bash
pip install -U diffusers transformers accelerate
```
```python
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import export_to_video

# switch to "mps" for Apple devices
pipe = DiffusionPipeline.from_pretrained(
    "genmo/mochi-1-preview", torch_dtype=torch.bfloat16
).to("cuda")

prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
frames = pipe(prompt).frames[0]  # Mochi is a text-to-video model, so the output is frames
export_to_video(frames, "mochi.mp4", fps=30)
```
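Running the full pipeline in bf16 needs a large GPU. If memory is tight, Diffusers' offloading helpers are worth trying in place of moving everything to the device up front; a minimal sketch (exact savings depend on your hardware):

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "genmo/mochi-1-preview", torch_dtype=torch.bfloat16
)
# Stream sub-models to the GPU only while they are needed,
# instead of calling .to("cuda") on the whole pipeline:
pipe.enable_model_cpu_offload()
# Decode the video latents in tiles to cap peak VRAM during VAE decode:
pipe.enable_vae_tiling()
```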
- Genmo
How to use genmo/mochi-1-preview with Genmo:
```python
# No code snippets available yet for this library.
# To use this model, check the repository files and the library's documentation.
# Want to help? PRs adding snippets are welcome at:
# https://github.com/huggingface/huggingface.js
```
- Inference
- Notebooks
- Google Colab
- Kaggle
clip_feat_dim unexpected by AsymmetricAttention.__init__()
Hi, I'm trying to hack on the code a bit to see if I can get this to run on a single node with <48 GB of VRAM, and I ran into a few problems.
clip_feat_dim from the YAML config isn't expected, but it gets passed via **block_kwargs into AsymmetricAttention.__init__(), which won't accept it.
I was able to work around this by adding **block_kwargs to the __init__ signature so that any extra kwargs are effectively ignored, but just a heads-up.
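For illustration, a minimal sketch of that workaround; the parameters here are placeholders, not the real AsymmetricAttention signature:

```python
import torch.nn as nn

class AsymmetricAttention(nn.Module):
    # dim_x / dim_y / num_heads are illustrative placeholders.
    def __init__(self, dim_x: int, dim_y: int, num_heads: int = 8, **block_kwargs):
        # **block_kwargs silently absorbs config keys this class doesn't use,
        # e.g. clip_feat_dim forwarded from the YAML config by the caller.
        super().__init__()
        self.attn = nn.MultiheadAttention(dim_x, num_heads, batch_first=True)
```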
Also this assert fails in t2v_synth_mochi.py:
```python
assert y_feat[-1].shape == (B, MAX_T5_TOKEN_LENGTH, 4096)
```
The shapes seemingly match, but the .shape attribute returns a torch.Size object rather than a plain tuple:
print(f"y_feat[-1].shape = {y_feat[-1].shape}")
Output:
(T2VSynthMochiModel pid=3652095) y_feat[-1].shape = torch.Size([2, 256, 4096])
I temporarily corrected it to this, which works:
```python
assert y_feat[-1].shape == torch.zeros(B, MAX_T5_TOKEN_LENGTH, 4096).shape
```
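As an aside, torch.Size is a tuple subclass, so comparing a shape against a plain tuple should behave the same as the torch.zeros workaround while avoiding a throwaway allocation. A standalone sketch, with B and MAX_T5_TOKEN_LENGTH assumed from the printed output above:

```python
import torch

B = 2                      # assumed from torch.Size([2, 256, 4096]) above
MAX_T5_TOKEN_LENGTH = 256  # assumed from the same output

y_feat = [torch.empty(B, MAX_T5_TOKEN_LENGTH, 4096)]

# torch.Size compares element-wise with plain tuples...
assert y_feat[-1].shape == (B, MAX_T5_TOKEN_LENGTH, 4096)
# ...and converting explicitly is equivalent, without allocating a tensor:
assert tuple(y_feat[-1].shape) == (B, MAX_T5_TOKEN_LENGTH, 4096)
```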
Edit: for anyone else trying to downsize, set num_workers = 1 on line 35 of infer.py. If I can get it working I'll share code. I'm trying 16-bit and possibly bitsandbytes to see if I can get memory usage down a bit...
Both issues should now be fixed! Thanks for the detailed bug report, and lmk if you run into anything else.

