nvidia/llama-nemoretriever-colembed-1b-v1


nv-bschifferer committed on Jun 26

Commit 1f0fdea · 1 Parent(s): 6a21313

update readme

Files changed (2)

README.md +3 -4
config.json +0 -1

README.md CHANGED

@@ -20,13 +20,11 @@ pipeline_tag: visual-document-retrieval
 # llama-nemoretriever-colembed-1b-v1
-# llama-nemoretriever-colembed-1b-v1
 ## Description
 The **nvidia/llama-nemoretriever-colembed-1b-v1** is a late interaction embedding model  fine-tuned for query-document retrieval. Users can input `queries`, which are text, or `documents` which are page images, to the model. The model outputs ColBERT-style multi-vector numerical representations for input queries and documents.  It is the smaller version of [llama-nemoretriever-colembed-3b-v1](https://huggingface.co/nvidia/llama-nemoretriever-colembed-3b-v1), which achieved 1st place on ViDoRe V1 (nDCG@5), ViDoRe V2 (nDCG@5) and MTEB VisualDocumentRetrieval (Rank Borda) (as of 27th June, 2025). **nvidia/llama-nemoretriever-colembed-1b-v1** achieves 2nd place on the benchmarks.
-This model is for non-commercial/research use only.                                             |
 ### License/Terms of Use
 Governing Terms: [NVIDIA License](https://huggingface.co/nvidia/llama-nemoretriever-colembed-1b-v1/blob/main/LICENSE)
@@ -114,11 +112,12 @@ from transformers import AutoModel
 # Load Model
 model = AutoModel.from_pretrained(
-    'nvidia/llama-NemoRetriever-ColEmbed-1B-v1',
     device_map='cuda',
     trust_remote_code=True,
     torch_dtype=torch.bfloat16,
     attn_implementation="flash_attention_2",
 ).eval()
 # Queries

 # llama-nemoretriever-colembed-1b-v1
 ## Description
 The **nvidia/llama-nemoretriever-colembed-1b-v1** is a late interaction embedding model  fine-tuned for query-document retrieval. Users can input `queries`, which are text, or `documents` which are page images, to the model. The model outputs ColBERT-style multi-vector numerical representations for input queries and documents.  It is the smaller version of [llama-nemoretriever-colembed-3b-v1](https://huggingface.co/nvidia/llama-nemoretriever-colembed-3b-v1), which achieved 1st place on ViDoRe V1 (nDCG@5), ViDoRe V2 (nDCG@5) and MTEB VisualDocumentRetrieval (Rank Borda) (as of 27th June, 2025). **nvidia/llama-nemoretriever-colembed-1b-v1** achieves 2nd place on the benchmarks.
+This model is for non-commercial/research use only.
 ### License/Terms of Use
 Governing Terms: [NVIDIA License](https://huggingface.co/nvidia/llama-nemoretriever-colembed-1b-v1/blob/main/LICENSE)
 # Load Model
 model = AutoModel.from_pretrained(
+    'nvidia/llama-nemoretriever-colembed-1b-v1',
     device_map='cuda',
     trust_remote_code=True,
     torch_dtype=torch.bfloat16,
     attn_implementation="flash_attention_2",
+    revision='6a21313a150a903bc522dc0d15ed47784a0d4c8d'
 ).eval()
 # Queries
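
Note on the Description above: "ColBERT-style multi-vector" output usually implies late-interaction (MaxSim) scoring, where each query token vector is matched to its best document vector and the maxima are summed. The model's own scoring call is not shown in this diff, so the following is only a minimal sketch with toy vectors; `dot`, `maxsim_score`, and the sample data are hypothetical names and values, not part of the repository.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: List[Vector], doc_vecs: List[Vector]) -> float:
    """Late-interaction score (assumed MaxSim, as in ColBERT).

    For every query vector, take the highest dot product against any
    document vector, then sum those maxima over the query.
    """
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


# Toy data: a two-token query and two "pages", each a list of 2-d vectors.
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
page_b = [[1.0, 0.0], [0.5, 0.5]]

score_a = maxsim_score(query, page_a)
score_b = maxsim_score(query, page_b)
print(score_a, score_b)
```

Here `page_a` contains a close match for both query vectors while `page_b` matches only the first well, so `page_a` ranks higher. Real embeddings from the model would be much higher-dimensional tensors, and batching on GPU would replace these Python loops.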

config.json CHANGED

@@ -1,6 +1,5 @@
 {
   "_commit_hash": null,
-  "_name_or_path": "./model_1b_test/",
   "architectures": [
     "llama_NemoRetrieverColEmbed"
   ],

 {
   "_commit_hash": null,
   "architectures": [
     "llama_NemoRetrieverColEmbed"
   ],