---
language:
- ka
base_model:
- intfloat/multilingual-e5-small
tags:
- text-embeddings
- georgian
- multilingual-e5
---
# Georgian E5 Fine-tuned Text Embeddings
Fine-tuned version of `intfloat/multilingual-e5-small` for Georgian text embeddings using contrastive learning.
## Model Performance
- Validation Accuracy: 82.53%
- Training completed over 3 epochs
- Contrastive loss with margin=0.5
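
The model card does not state which contrastive formulation was used, so the following is only a minimal sketch of one common pairwise variant, assuming cosine distance between embeddings and a binary label (1 for related pairs, 0 for unrelated pairs). The function names `cosine_similarity` and `contrastive_loss` are illustrative, not taken from the training code; only `margin=0.5` comes from the list above.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def contrastive_loss(emb_a, emb_b, label, margin=0.5):
    """Pairwise contrastive loss on cosine distance (assumed formulation).

    label == 1: the pair is related, so any distance is penalised.
    label == 0: the pair is unrelated, so it is penalised only while
                its distance is still smaller than the margin.
    """
    distance = 1.0 - cosine_similarity(emb_a, emb_b)
    if label == 1:
        return distance ** 2
    return max(0.0, margin - distance) ** 2


# Toy 2-dimensional "embeddings" (hypothetical values for illustration).
same = contrastive_loss([1.0, 0.0], [1.0, 0.0], label=1)
far_apart = contrastive_loss([1.0, 0.0], [0.0, 1.0], label=0)
too_close = contrastive_loss([1.0, 0.0], [1.0, 0.0], label=0)
```

Under this formulation, an unrelated pair stops contributing to the loss once its cosine distance exceeds the margin, which is why a margin of 0.5 only pushes unrelated pairs apart up to that distance.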
## Dataset
- 13,000+ Georgian text pairs across 9 semantic relationship types
## Usage
```python
from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("matsut21/georgian-e5-finetuned")
model = AutoModel.from_pretrained("matsut21/georgian-e5-finetuned")
```