Please release the linear projection layers
Dear Authors,
Thank you for your excellent work.
As in ColPali and the original ColBERT paper, it is common practice to project high-dimensional language-model embeddings into a lower-dimensional space (e.g., 128 dimensions). I noticed that Table 6 of your paper studies the effect of this projection.
In your paper, you state: "However, even with 128-dim, the storage requirement of 184.3 GB for 1M pages may still be too high for production environments handling large document corpora." However, it appears that the currently released model outputs embeddings at the native dimensionality of the underlying LLM (i.e., 2560), which may make efficient indexing impractical.
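For context, since multi-vector storage scales linearly with embedding dimensionality, keeping the native 2560 dims would imply roughly 20x the quoted 184.3 GB, i.e., on the order of 3.7 TB per 1M pages (assuming the same precision). And for concreteness, here is a minimal PyTorch sketch of how such a ColBERT-style projection head is typically applied; the dimensions and token count are assumptions for illustration, and a randomly initialized layer is of course no substitute for the trained weights from your experiments:

```python
import torch
import torch.nn.functional as F

# Assumed dimensions: the released checkpoint emits 2560-dim token embeddings
# (the LLM's native hidden size); ColBERT/ColPali-style retrievers typically
# project these down to 128 dims with a single linear layer.
hidden_dim, proj_dim = 2560, 128

# NOTE: a freshly initialized layer only illustrates the expected shapes --
# retrieval quality depends on the trained weights, which is what this
# request is about.
projection = torch.nn.Linear(hidden_dim, proj_dim, bias=False)

# Placeholder batch: 2 pages, 730 tokens each (token count is hypothetical).
token_embeddings = torch.randn(2, 730, hidden_dim)   # (batch, tokens, hidden)
projected = projection(token_embeddings)             # -> (2, 730, 128)
projected = F.normalize(projected, p=2, dim=-1)      # L2-normalize per token, as in ColBERT
```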
Could you please provide the linear projection layers used in your experiments, or point us to the appropriate resources?
Best regards,
Louis