ColBERT Mode Usage

#41

by pulkitchahar - opened Apr 7, 2024

Apr 7, 2024

I want to store the ColBERT embeddings so that I can rerank the results of dense-vector retrieval faster. But if a document has 1024 tokens on average (truncated if longer), I end up with a 1024 × 1024 matrix per document, which takes 2 MB in fp16. That sounds huge, especially when I think about scaling up. Am I doing this right, or am I missing something? Is there any way to reduce the size while keeping performance close to the original?
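The estimate above can be checked with a quick calculation. This is only a sketch of the arithmetic: it assumes one 1024-dimensional vector per token and 2 bytes per fp16 value, and the corpus size of one million documents is a hypothetical figure used purely to illustrate scaling.

```python
def colbert_storage_bytes(num_tokens: int, dim: int, bytes_per_value: int) -> int:
    """Bytes needed to store one document's token-level embedding matrix."""
    return num_tokens * dim * bytes_per_value


# Figures from the post: 1024 tokens per document, 1024 values per token, fp16.
per_doc_bytes = colbert_storage_bytes(num_tokens=1024, dim=1024, bytes_per_value=2)
per_doc_mib = per_doc_bytes / 2**20

# Hypothetical corpus size, chosen only to show how the cost scales.
num_docs = 1_000_000
total_tib = per_doc_bytes * num_docs / 2**40

print(f"per document: {per_doc_mib:.1f} MiB")
print(f"{num_docs:,} documents: {total_tib:.2f} TiB")
```

Under these assumptions each document takes 2 MiB, and a million documents would need roughly 1.9 TiB before any index overhead.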

bobox

May 21, 2024

I'm also interested.
