---
language:
- ka
base_model:
- intfloat/multilingual-e5-small
tags:
- text-embeddings
- georgian
- multilingual-e5
---

# Georgian E5 Fine-tuned Text Embeddings

Fine-tuned version of `intfloat/multilingual-e5-small` for Georgian text embeddings using contrastive learning.

## Model Performance
- Validation Accuracy: 82.53%
- Trained for 3 epochs
- Contrastive loss with margin=0.5
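
To make the margin concrete, here is a minimal sketch of a pairwise contrastive loss. The model card does not state which distance or exact formulation was used in training, so this assumes cosine distance and the classic form (squared distance for similar pairs, squared hinge on `margin - distance` for dissimilar pairs). The function names `cosine_distance` and `contrastive_loss` are hypothetical, and the toy vectors stand in for real embeddings.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def contrastive_loss(u, v, is_similar, margin=0.5):
    """Pairwise contrastive loss (assumed form, not confirmed by the model card)."""
    d = cosine_distance(u, v)
    if is_similar:
        # Similar pairs are penalized for any remaining distance.
        return d ** 2
    # Dissimilar pairs are penalized only while they are closer than the margin.
    return max(0.0, margin - d) ** 2


# Toy 2-D vectors standing in for sentence embeddings.
anchor = [1.0, 0.0]
close = [1.0, 0.0]
orthogonal = [0.0, 1.0]

loss_similar = contrastive_loss(anchor, close, is_similar=True)
loss_far = contrastive_loss(anchor, orthogonal, is_similar=False)
loss_near_negative = contrastive_loss(anchor, close, is_similar=False)
```

Under these assumptions, a dissimilar pair contributes no loss once its cosine distance reaches 0.5, so training effort concentrates on negatives that still sit too close to the anchor.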

## Dataset
- 13,000+ Georgian text pairs across 9 semantic relationship types

## Usage
```python
from transformers import AutoTokenizer, AutoModel

# Load the fine-tuned tokenizer and encoder from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("matsut21/georgian-e5-finetuned")
model = AutoModel.from_pretrained("matsut21/georgian-e5-finetuned")
```