Jaswanth-0821 committed
Commit d4e21e9 · verified · 1 Parent(s): 3936726

Update README.md

Files changed (1)
  1. README.md +68 -1
README.md CHANGED
@@ -7,4 +7,71 @@ tags:
  - feature-extraction
  - text-embeddings-inference
  ---
- # Tarka Embedding 30M V1
+ # Tarka Embedding 30M V1
+
+ > [!NOTE]
+ > ## Features
+ > - Model compressed by 20x.
+ > - Recovers approx. 86% of the performance on the MTEB(Eng, v2) benchmark.
+
+ For more details, refer to the [blog post](https://tarka-air.gitbook.io/home/tarka-v1/tarka-embedding-30m-v1).
+ ## Results
+
+ ### MTEB(Eng, v2)
+
+ | Model | Parameters (B) | Mean (Task) | Mean (TaskType) | Classification | Clustering | Pair Classification | Reranking | Retrieval | STS | Summarization |
+ |------------------------------|----------------|-------------|------------------|----------------|------------|---------------------|-----------|-----------|-------|---------------|
+ | all-MiniLM-L6-v2 | 0.023 | 59.03 | 55.93 | 69.25 | 44.9 | 82.37 | 47.14 | 42.92 | 78.95 | 25.96 |
+ | gte-micro-v4 | 0.019 | 58.9 | 56.04 | 73.04 | 43.89 | 82.67 | 44.78 | 39.51 | 79.78 | 28.59 |
+ | snowflake-arctic-embed-xs | 0.023 | 59.77 | 56.12 | 67 | 42.44 | 81.33 | 45.26 | 52.65 | 76.21 | 27.96 |
+ | gte-micro | 0.017 | 53.89 | 52.5 | 67.47 | 41.86 | 80.76 | 43.16 | 27.66 | 77.86 | 28.76 |
+ | Qwen3 Embedding 0.6B | 0.6 | 70.7 | 64.88 | 85.76 | 54.05 | 84.37 | 48.18 | 61.83 | 86.57 | 33.43 |
+ | Tarka Embedding 30M V1 (S) | 0.03 | 46.07 | 45.22 | 60.37 | 41.37 | 66.29 | 38.34 | 19.56 | 64.15 | 26.44 |
+ | Tarka Embedding 30M V1 (M) | 0.03 | 51.96 | 49.88 | 66.52 | 43.47 | 70.66 | 40.12 | 30.15 | 69.81 | 28.42 |
+ | Tarka Embedding 30M V1 (L) | 0.03 | 60.43 | 56.69 | 79.2 | 46.99 | 78.24 | 43.32 | 42.5 | 76.92 | 29.63 |
+
+
+
+ ## Usage
+ ```python
+ from sentence_transformers import SentenceTransformer
+
+ # We recommend enabling flash_attention_2 for better acceleration and memory savings.
+ model = SentenceTransformer(
+     "Tarka-AIR/Tarka-Embedding-30M-V1",
+     trust_remote_code=True,
+     model_kwargs={
+         "attn_implementation": "flash_attention_2",
+         "device_map": "cuda",
+         "torch_dtype": "bfloat16",
+     },
+     tokenizer_kwargs={"padding_side": "left"},
+ )
+
+ # Configure the model inference mode ("L", "M", or "S")
+ model[0].auto_model.configure_subnetwork("L")
+
+ # The queries and documents to embed
+ queries = [
+     "What is the capital of China?",
+     "Explain gravity",
+ ]
+ documents = [
+     "The capital of China is Beijing.",
+     "Gravity is a force that attracts two bodies towards each other. It gives weight to physical objects and is responsible for the movement of planets around the sun.",
+ ]
+
+ # Encode the queries and documents. Note that queries benefit from using a prompt.
+ # Here we use the prompt called "query" stored under `model.prompts`, but you can
+ # also pass your own prompt via the `prompt` argument.
+ query_embeddings = model.encode(queries, prompt_name="query")
+ document_embeddings = model.encode(documents)
+
+ # Compute the (cosine) similarity between the query and document embeddings
+ similarity = model.similarity(query_embeddings, document_embeddings)
+ print(similarity)
+
+ # tensor([[0.8371, 0.1740],
+ #         [0.2176, 0.6293]])
+
+ ```
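
The usage example describes its final step as a cosine similarity between query and document embeddings, printed as a queries-by-documents matrix. The sketch below is not part of the commit; it only illustrates what such a matrix contains, using made-up 3-dimensional vectors in place of real model output. The helper name `cosine_similarity_matrix` and the toy vectors are hypothetical and exist only for this illustration.

```python
import math


def cosine_similarity_matrix(queries, documents):
    """Return a len(queries) x len(documents) matrix of cosine similarities."""

    def norm(vector):
        return math.sqrt(sum(x * x for x in vector))

    matrix = []
    for q in queries:
        row = []
        for d in documents:
            dot = sum(a * b for a, b in zip(q, d))
            row.append(dot / (norm(q) * norm(d)))
        matrix.append(row)
    return matrix


# Toy vectors (assumed for illustration, not produced by the model).
toy_queries = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
toy_documents = [[2.0, 0.0, 0.0], [0.0, 3.0, 3.0]]

scores = cosine_similarity_matrix(toy_queries, toy_documents)

# For each query, the index of the highest-scoring document.
best_match = [row.index(max(row)) for row in scores]
print(scores)
print(best_match)
```

Row `i` of the matrix holds the scores of query `i` against every document, so taking the largest entry per row picks the best-matching document for each query. In the tensor shown in the usage example, the diagonal entries are the largest in their rows, which corresponds to each query matching its own document.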