Observed a fun fact: English-only models tend to perform worse on the closed datasets, while multilingual models do better on them.
Is it because the baseline, MTEB, is an English-only benchmark, while RTEB is multilingual? It's natural that multilingual models would perform better on a multilingual benchmark than on an English one.