Update README.md
README.md
CHANGED
@@ -13,6 +13,16 @@ tags:
 
 # SparkEmbedding-300m Model Card
 
+
+<p align="center">
+<img
+src="https://cdn-uploads.huggingface.co/production/uploads/677fcdf29b9a9863eba3f29f/MMX5ZPqxa639HtG-cpt6c.png"
+alt="CodeX Banner"
+width="70%"
+style="border-radius:15px;"
+/>
+
+
 ### Description
 SparkEmbedding-300m is a 300 million parameter multilingual text embedding model with **SoTA cross‑lingual retrieval** developed by the XenArcAI team. Fine-tuned from Google's EmbeddingGemma-300m, it incorporates an additional 1 million curated samples across 119 languages, emphasizing data complexity, linguistic diversity, and deep language understanding. This optimization enhances cross-lingual retrieval, producing embeddings with superior semantic alignment and efficacy in multilingual settings.
 