Virchow output PRISM input latent dims mismatch

#21

by lkuhn - opened Oct 27, 2025

Oct 27, 2025

Hello,

This might be a stupid question, but I can't seem to find the answer.
I want to use PRISM to compute slide-level embeddings and perform content-based image retrieval based on embedding similarity.
As your documentation specifies, I am using Virchow to compute patch-level embeddings, but Virchow outputs tile embeddings with a latent dimension of 1280, while PRISM expects 2560, as shown in the example and in my error message.
The example says to consult the Virchow repo for guidance on how to compute embeddings, but I can't find how to obtain them in 2560 dimensions. Is there another Virchow model that does that?

Thank you for your help!

adamcasson

Paige AI org Oct 27, 2025

Yes, 1280 is the embedding dimension of the Virchow architecture for the class and patch tokens. However, in the Virchow repo documentation we use the concatenation of the 1280-d class token and the 1280-d average of the patch tokens as the full tile embedding.

This is how we arrive at a 2560-d input for PRISM.
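
The concatenation described above can be sketched roughly as follows. This is a minimal illustration, not the code from the Virchow repo: a random NumPy array stands in for the model output, the helper name `tile_embedding` is hypothetical, and the token layout (one class token at index 0 followed by 256 patch tokens, each 1280-d) is an assumption. If the model variant you use emits extra tokens, the slicing needs to be adjusted.

```python
import numpy as np


def tile_embedding(tokens: np.ndarray) -> np.ndarray:
    """Concatenate the class token with the mean of the patch tokens.

    tokens: array of shape (batch, 1 + num_patches, dim); the class token
            is ASSUMED to sit at index 0 along the token axis.
    returns: array of shape (batch, 2 * dim)
    """
    class_token = tokens[:, 0]      # (batch, dim)
    patch_tokens = tokens[:, 1:]    # (batch, num_patches, dim)
    patch_mean = patch_tokens.mean(axis=1)  # (batch, dim)
    return np.concatenate([class_token, patch_mean], axis=-1)


# Random stand-in for the token output of one tile (assumed layout:
# 1 class token + 256 patch tokens, each 1280-d).
rng = np.random.default_rng(0)
tokens = rng.standard_normal((1, 257, 1280)).astype(np.float32)

embedding = tile_embedding(tokens)
print(embedding.shape)  # (1, 2560)
```

With PyTorch tensors the same slicing applies, using `torch.cat([...], dim=-1)` in place of `np.concatenate`. The first 1280 values of each row are the class token and the last 1280 are the patch-token average, which gives the 2560-d tile embedding.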

lkuhn

Oct 28, 2025

Oh I see, I assumed that just the class token was used. Thanks for your answer!

lkuhn changed discussion status to closed Oct 28, 2025
