---
language: en
tags:
- sentence-transformers
- embeddings
- semantic-search
- retrieval
license: mit
---
```python
import onnxruntime as ort
import numpy as np
from transformers import AutoTokenizer
from huggingface_hub import hf_hub_download

# --- Load ---
# The tokenizer lives in the repo's tokenizer/ subfolder; the ONNX graph is fetched separately.
tokenizer = AutoTokenizer.from_pretrained("CloveAI/clov-embed-v2", subfolder="tokenizer")
onnx_path = hf_hub_download("CloveAI/clov-embed-v2", "onnx/biencoder_rope.onnx")
session = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])

# --- Encode ---
def encode(texts):
    """Encode a string or a list of strings into an array of shape (batch, 256)."""
    if isinstance(texts, str):
        texts = [texts]
    enc = tokenizer(texts, padding=True, truncation=True, max_length=256, return_tensors="np")
    return session.run(
        ["embeddings"],
        {"input_ids": enc["input_ids"], "attention_mask": enc["attention_mask"]},
    )[0]

# --- Test ---
emb = encode("Hello world!")
print(emb.shape)  # (1, 256)
```
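
The model card lists L2 normalization as the final step of the architecture. If the exported embeddings are indeed unit-length, cosine similarity between two of them reduces to a dot product. Below is a minimal sketch of ranking documents against a query under that assumption. `rank_by_similarity` is a hypothetical helper, not part of this repository, and the toy vectors stand in for real `encode` output.

```python
import numpy as np

def rank_by_similarity(query_emb, doc_embs):
    """Return (indices, scores) of documents sorted by descending similarity.

    Assumes every vector is already L2-normalized, so the dot product
    equals the cosine similarity.
    """
    query_emb = np.asarray(query_emb, dtype=np.float32).reshape(-1)
    doc_embs = np.asarray(doc_embs, dtype=np.float32)
    scores = doc_embs @ query_emb      # shape: (num_docs,)
    order = np.argsort(-scores)        # highest score first
    return order, scores[order]

# Toy unit vectors standing in for encode(...) output
docs = np.array([[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]], dtype=np.float32)
query = np.array([0.0, 1.0], dtype=np.float32)
order, scores = rank_by_similarity(query, docs)
```

With the real model, `docs` would be `encode(list_of_texts)` and `query` would be `encode(text)[0]`.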
# BiEncoder RoPE: Sentence Embedding Model

A 34M-parameter sentence embedding model trained from scratch using PyTorch.
## Architecture

- 6-layer Transformer encoder with RoPE positional embeddings
- Mean pooling + L2 normalization
- 256-dim output vectors
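
The pooling step can be illustrated outside the model. The sketch below shows one common way to do masked mean pooling followed by L2 normalization in NumPy. It is an assumption about how the step is typically implemented, not the repository's actual code; `mean_pool_l2` and the random inputs are hypothetical.

```python
import numpy as np

def mean_pool_l2(token_embeddings, attention_mask, eps=1e-12):
    """Masked mean over tokens, then L2-normalize each sentence vector.

    token_embeddings: (batch, seq_len, dim) float array
    attention_mask:   (batch, seq_len) array of 0/1, where 0 marks padding
    """
    mask = np.asarray(attention_mask, dtype=np.float32)[..., None]  # (batch, seq_len, 1)
    summed = (np.asarray(token_embeddings, dtype=np.float32) * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1.0, None)  # avoid division by zero
    pooled = summed / counts
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.maximum(norms, eps)

# Two sentences, three token slots, 256 dims; the second sentence has one padding token
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 3, 256)).astype(np.float32)
mask = np.array([[1, 1, 1], [1, 1, 0]])
vectors = mean_pool_l2(tokens, mask)
```

Masking before averaging keeps padding tokens from diluting the sentence vector, which matters when a batch mixes short and long inputs.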
## Training (Curriculum)

| Phase | Dataset | Loss |
|---|---|---|
| 1 | all-nli | MNRLoss |
| 2 | squad | MNRLoss |
| 3 | msmarco-bm25 | HardNegativeLoss |
| 4 | natural-questions | MNRLoss |
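
The card does not define `MNRLoss`. Assuming it refers to a multiple negatives ranking loss, where each anchor's paired text is the positive and every other text in the batch acts as a negative, the computation might look roughly like the NumPy sketch below. The function name and `scale=20.0` are illustrative choices, not values taken from this model's training setup.

```python
import numpy as np

def mnr_loss(anchors, positives, scale=20.0):
    """Cross-entropy over in-batch similarities; row i's positive is column i.

    anchors, positives: (batch, dim) arrays, assumed L2-normalized.
    """
    logits = scale * (np.asarray(anchors) @ np.asarray(positives).T)  # (batch, batch)
    logits = logits - logits.max(axis=1, keepdims=True)               # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

# Perfectly matched pairs give a loss near zero; misaligned pairs give a large loss
matched = mnr_loss(np.eye(4), np.eye(4))
mismatched = mnr_loss(np.eye(4), np.roll(np.eye(4), 1, axis=0))
```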
## Files

- `tokenizer/`: Hugging Face tokenizer (bert-base-uncased)
- `pytorch/checkpoint_phase4_nq.pt`: PyTorch weights
- `onnx/biencoder_rope.onnx`: ONNX FP32
- `onnx/biencoder_rope_int8.onnx`: ONNX INT8 (recommended for CPU)
## Performance

- FP32 ONNX size: 134.3 MB
- INT8 ONNX size: 34.6 MB
- Throughput: ~247 sentences/sec on CPU
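
The card does not say how the throughput figure was measured (batch size, hardware, or which ONNX file). To get a comparable number on your own machine, a timing helper along these lines could be used. `measure_throughput` and `fake_encode` are hypothetical names, and the batch size and repeat count are arbitrary assumptions.

```python
import time

def measure_throughput(encode_fn, sentences, batch_size=32, repeats=3):
    """Return sentences encoded per second, using the best of `repeats` timed passes."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for i in range(0, len(sentences), batch_size):
            encode_fn(sentences[i:i + batch_size])
        best = min(best, time.perf_counter() - start)
    return len(sentences) / best

# Stand-in encoder so the sketch runs without the model; swap in the real encode()
def fake_encode(batch):
    return [[0.0] * 256 for _ in batch]

rate = measure_throughput(fake_encode, ["hello world"] * 1000)
print(f"{rate:.0f} sentences/sec")
```

Taking the best of several passes reduces the influence of one-off costs such as session warm-up on the reported rate.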