Victor Morand

VictorMorand

·

https://victormorand.github.io

AI & ML interests

Information Retrieval - Interpretability

Recent Activity

upvoted an article 8 days ago

Article

Party is over: regularizing ColBERT models to fix efficient ANN methods

lightonai

•

9 days ago

• 23

commented on Introducing the Ettin Reranker Family 9 days ago

Amazing work @tomaarsen, thanks for everything you do for open source!
May I ask some questions about the recipe?

- If I understand correctly, you are mixing LightOn hard-negatives data (with Jang et al. stratified sampling) with the broader lightonai/embeddings-pre-training, which (I think) doesn't include mined negatives. Did you try several mixes of pretraining / resampled hard negatives?
- In ST, you advise discarding Arguana and Touché when using Nanobeir13; I guess it works best with those in the end?

Best,

liked a dataset 9 days ago

cross-encoder/lightonai-embeddings-fine-tuning-reranked-v1

Preview • Updated May 19 • 1.75k • 8

liked a model 15 days ago

cross-encoder/ettin-reranker-150m-v1

Text Ranking • 0.1B • Updated May 19 • 13.3k • 2

upvoted an article about 1 month ago

Article

Introducing the Ettin Reranker Family

tomaarsen

•

May 19

• 52

liked a model about 2 months ago

deepseek-ai/DeepSeek-V4-Pro

Text Generation • 862B • Updated 3 days ago • 2.05M • 5.05k

liked a dataset about 2 months ago

LequeuISIR/GDN-CC-large

Viewer • Updated Apr 28 • 541k • 118 • 1

upvoted a paper about 2 months ago

The GDN-CC Dataset: Automatic Corpus Clarification for AI-enhanced Democratic Citizen Consultations

Paper • 2601.14944 • Published Apr 20 • 5

liked a dataset about 2 months ago

LequeuISIR/GDN-CC

Viewer • Updated Apr 28 • 2.29k • 63 • 4

liked 2 datasets 2 months ago

lightonai/embeddings-fine-tuning

Viewer • Updated Apr 17 • 53.7M • 2.76k • 20

lightonai/embeddings-pre-training-curated

Viewer • Updated Apr 21 • 665M • 7.88k • 12

upvoted an article 2 months ago

Article

DenseOn with the LateOn: Open State-of-the-Art Single and Multi-Vector Models

lightonai

•

Apr 21

• 41

upvoted a paper 2 months ago

Boosting Visual Instruction Tuning with Self-Supervised Guidance

Paper • 2604.12966 • Published Apr 14 • 11

liked a model 2 months ago

llm2ner/ToMMeR-Llama-3.2-1B_L6_R64

Token Classification • 264k • Updated Apr 9 • 1

updated 6 models 3 months ago

llm2ner/ToMMeR-phi-4_L3_R64

Token Classification • 660k • Updated Apr 9

llm2ner/ToMMeR-phi-2_L5_R64

Token Classification • 330k • Updated Apr 9

llm2ner/ToMMeR-phi-1_5_L5_R64

Token Classification • 264k • Updated Apr 9

llm2ner/ToMMeR-mistral-7b_L5_R64

Token Classification • 528k • Updated Apr 9

llm2ner/ToMMeR-Llama-3.2-3B_L1_R64

Token Classification • 396k • Updated Apr 9

llm2ner/ToMMeR-Llama-3.1-8B_L5_R64

Token Classification • 528k • Updated Apr 9