Instructions for using nvidia/NV-Embed-v1 with libraries, inference providers, notebooks, and local apps. Follow the links below to get started.
- Libraries
- sentence-transformers
How to use nvidia/NV-Embed-v1 with sentence-transformers:
```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("nvidia/NV-Embed-v1", trust_remote_code=True)

sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)  # [3, 3]
```
(A retrieval-oriented follow-up sketch appears after the notebook links below.)
- Notebooks
- Google Colab
- Kaggle
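The snippet above stops at pairwise similarity; the same embeddings can also drive retrieval. Here is a minimal sketch using sentence_transformers.util.semantic_search, where the corpus, query, and top_k value are illustrative assumptions rather than part of the model card:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("nvidia/NV-Embed-v1", trust_remote_code=True)

# Illustrative corpus and query (assumptions, not from the model card).
corpus = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]
query = "What is the weather like?"

corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Rank corpus entries by cosine similarity to the query.
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)
for hit in hits[0]:
    print(corpus[hit["corpus_id"]], round(hit["score"], 3))
```

Note that NV-Embed models are typically used with a task instruction prepended to retrieval queries; see the model card for the exact prompt format.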
Weights are in FP16 (loaded in FP32) but paper mentions BF16
#17
by AdrienC
The paper mentions that training was done in BF16 (as one would expect with a Mistral model); however, the safetensors files are in float16, and config.json loads the weights in float32. I would expect that saving BF16 weights in FP16 could lead to overflows, since FP16 has a much narrower exponent range than BF16.
Could you give us more details on how to load and potentially fine-tune this model without running into issues?
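Not an official answer, but a minimal sketch of sidestepping the FP16 range concern by forcing BF16 at load time. It assumes the checkpoint values survived the original BF16-to-FP16 save; model_kwargs is forwarded to transformers' from_pretrained:

```python
import torch
from sentence_transformers import SentenceTransformer

# Assumption: loading in BF16 restores the training dtype and avoids
# FP16 overflow during inference or fine-tuning. model_kwargs is
# passed through to transformers' from_pretrained().
model = SentenceTransformer(
    "nvidia/NV-Embed-v1",
    trust_remote_code=True,
    model_kwargs={"torch_dtype": torch.bfloat16},
)

# Sanity check: the underlying weights should now be in BF16.
print(next(model.parameters()).dtype)  # expect torch.bfloat16
```

The same effect with plain transformers would be AutoModel.from_pretrained("nvidia/NV-Embed-v1", trust_remote_code=True, torch_dtype=torch.bfloat16). One caveat: any values that already overflowed to inf when the weights were saved in FP16 cannot be recovered by casting back to BF16.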