Mary Kline

marykline1

AI & ML interests

None yet

Recent Activity

commented on an article 8 days ago

SigLIP 2: A better multilingual vision language encoder


Organizations

None yet

commented on SigLIP 2: A better multilingual vision language encoder 8 days ago

Indeed. If you want longer text, I would split it into 64-token chunks (perhaps even overlapping), embed each one separately, and then either average the embeddings or, depending on your use case, dot each one with the image embedding and take the maximum or average score.
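A rough sketch of that chunk-and-pool idea, in plain Python so it runs without any model installed. The helper names (`chunk_tokens`, `mean_pool`, `chunk_scores`) and the 16-token overlap are made up for illustration, and the embeddings are assumed to come from whatever text and image encoders you already use. Special tokens or padding may eat into the 64-token budget in a real tokenizer, so treat the window size as an assumption too.

```python
import math
from typing import List, Sequence

Vector = List[float]


def chunk_tokens(token_ids: Sequence[int], size: int = 64, overlap: int = 16) -> List[List[int]]:
    """Split token ids into windows of at most `size` tokens.

    Consecutive windows share `overlap` tokens. Both defaults are
    illustrative assumptions, not values prescribed by any library.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    chunks: List[List[int]] = []
    for start in range(0, len(token_ids), step):
        chunks.append(list(token_ids[start:start + size]))
        if start + size >= len(token_ids):
            break  # this window already reaches the end of the text
    return chunks


def l2_normalize(v: Sequence[float]) -> Vector:
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def mean_pool(embeddings: Sequence[Sequence[float]]) -> Vector:
    """Option 1: average the chunk embeddings into one text embedding."""
    dim = len(embeddings[0])
    mean = [sum(e[i] for e in embeddings) / len(embeddings) for i in range(dim)]
    return l2_normalize(mean)


def chunk_scores(chunk_embeddings: Sequence[Sequence[float]],
                 image_embedding: Sequence[float]) -> List[float]:
    """Option 2: dot each normalized chunk embedding with the image embedding."""
    image = l2_normalize(image_embedding)
    return [sum(a * b for a, b in zip(l2_normalize(e), image))
            for e in chunk_embeddings]


# Toy usage with hand-written 2-d "embeddings" standing in for real encoder output.
chunks = chunk_tokens(list(range(150)))            # 150 fake token ids
toy_chunk_embeddings = [[1.0, 0.0], [0.0, 1.0]]
toy_image_embedding = [1.0, 0.0]
scores = chunk_scores(toy_chunk_embeddings, toy_image_embedding)
best_score = max(scores)                           # "any chunk matches"
avg_score = sum(scores) / len(scores)              # "text matches overall"
```

Taking the maximum suits cases where one matching passage is enough, while the average (or the mean-pooled embedding) rewards text that matches the image throughout.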

Actually, I'm wondering what kinds of searches longer than 64 tokens you deal with. Almost every SigLIP use case that comes to mind falls well below 64.
