Instructions to use jinaai/jina-embeddings-v2-base-code with libraries, inference providers, notebooks, and local apps.

Libraries

How to use jinaai/jina-embeddings-v2-base-code with sentence-transformers:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("jinaai/jina-embeddings-v2-base-code", trust_remote_code=True)

sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium."
]
embeddings = model.encode(sentences)

similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

How to use jinaai/jina-embeddings-v2-base-code with Transformers:

```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("feature-extraction", model="jinaai/jina-embeddings-v2-base-code", trust_remote_code=True)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("jinaai/jina-embeddings-v2-base-code", trust_remote_code=True)
model = AutoModelForMaskedLM.from_pretrained("jinaai/jina-embeddings-v2-base-code", trust_remote_code=True)
```

How to use jinaai/jina-embeddings-v2-base-code with Transformers.js:

```js
// npm i @huggingface/transformers
import { pipeline } from '@huggingface/transformers';

// Allocate pipeline
const pipe = await pipeline('feature-extraction', 'jinaai/jina-embeddings-v2-base-code');
```

Notebooks

- Google Colab
- Kaggle
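The sentence-transformers snippet above prints a similarity matrix of shape [3, 3]. To make clear what such a matrix contains, here is a minimal plain-JavaScript sketch, assuming the similarity in question is cosine similarity. The helper name `similarityMatrix` and the toy vectors are hypothetical stand-ins; real embeddings would come from the model.

```javascript
// Hypothetical helper for illustration; toy vectors stand in for real embeddings.
function similarityMatrix(embeddings) {
  // Normalize each vector to unit length so a dot product equals cosine similarity.
  const unit = embeddings.map((vec) => {
    const norm = Math.sqrt(vec.reduce((acc, x) => acc + x * x, 0));
    return vec.map((x) => x / norm);
  });
  // Entry [i][j] is the similarity between embedding i and embedding j.
  return unit.map((a) =>
    unit.map((b) => a.reduce((acc, x, k) => acc + x * b[k], 0))
  );
}

const toyEmbeddings = [[1, 0], [0, 2], [3, 0]];
const matrix = similarityMatrix(toyEmbeddings);
console.log(matrix.length, matrix[0].length); // 3 3
```

Each row compares one input against all inputs, so the diagonal is always 1 and the matrix is symmetric.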
Add transformers.js sample code (#6)
by Xenova (HF Staff)

README.md (changed)
````diff
@@ -185,6 +185,26 @@ embeddings = model.encode([
 print(cos_sim(embeddings[0], embeddings[1]))
 ```
 
+You can also use the [Transformers.js](https://huggingface.co/docs/transformers.js) library to compute embeddings in JavaScript.
+```js
+// npm i @xenova/transformers
+import { pipeline, cos_sim } from '@xenova/transformers';
+
+const extractor = await pipeline('feature-extraction', 'jinaai/jina-embeddings-v2-base-code', {
+    quantized: false, // Comment out this line to use the 8-bit quantized version
+});
+
+const texts = [
+    'How do I access the index while iterating over a sequence with a for loop?',
+    '# Use the built-in enumerator\nfor idx, x in enumerate(xs):\n print(idx, x)',
+]
+const embeddings = await extractor(texts, { pooling: 'mean' });
+
+const score = cos_sim(embeddings[0].data, embeddings[1].data);
+console.log(score);
+// 0.7281748759529421
+```
+
 ## Plans
 
 1. Bilingual embedding models supporting more European & Asian languages, including Spanish, French, Italian and Japanese.
````
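The sample added in this PR relies on two operations: mean pooling (`pooling: 'mean'`) and cosine similarity (`cos_sim`). As a rough illustration of what those steps compute, here is a minimal plain-JavaScript sketch on toy data. The names `meanPool` and `cosineSimilarity` are hypothetical and are not part of Transformers.js; the library's own implementation may differ in details, and this sketch assumes at least one unmasked token per input.

```javascript
// Hypothetical helpers for illustration; not the Transformers.js implementation.

// Average the token vectors, ignoring positions where the attention mask is 0.
function meanPool(tokenEmbeddings, attentionMask) {
  const dim = tokenEmbeddings[0].length;
  const sum = new Array(dim).fill(0);
  let count = 0;
  tokenEmbeddings.forEach((vec, i) => {
    if (attentionMask[i] === 0) return; // skip padding tokens
    for (let d = 0; d < dim; d++) sum[d] += vec[d];
    count++;
  });
  return sum.map((v) => v / count);
}

// Cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy data: three "tokens" of dimension 2; the last one is padding.
const pooled = meanPool([[1, 0], [3, 0], [9, 9]], [1, 1, 0]);
console.log(pooled); // [ 2, 0 ]
console.log(cosineSimilarity(pooled, [1, 0])); // 1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```

Because cosine similarity depends only on direction, the pooled vector `[2, 0]` scores 1 against `[1, 0]` despite the different magnitudes.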