metadata
language:
  - ka
base_model:
  - intfloat/multilingual-e5-small
tags:
  - text-embeddings
  - georgian
  - multilingual-e5

Georgian E5 Fine-tuned Text Embeddings

Fine-tuned version of intfloat/multilingual-e5-small for Georgian text embeddings using contrastive learning.

Model Performance

  • Validation accuracy: 82.53%
  • Training: 3 epochs
  • Loss: contrastive loss with margin 0.5
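
The card does not say which contrastive formulation was used. As a rough illustration only, one common pairwise form penalises similar pairs by their squared distance and penalises dissimilar pairs only when they sit closer than the margin. The sketch below assumes cosine distance and a binary similar/dissimilar label; both are assumptions, not details confirmed by this model card, and the function names are hypothetical.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def contrastive_loss(u, v, is_similar, margin=0.5):
    """Pairwise contrastive loss (assumed form, not confirmed by the card).

    Similar pairs pay their squared distance; dissimilar pairs pay only
    when their distance is smaller than `margin`.
    """
    d = cosine_distance(u, v)
    if is_similar:
        return d ** 2
    return max(0.0, margin - d) ** 2
```

With margin 0.5, a dissimilar pair whose cosine distance is already 0.5 or more contributes no loss, so training effort goes to pairs that are still too close together.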

Dataset

  • 13,000+ Georgian text pairs across 9 semantic relationship types

Usage

from transformers import AutoTokenizer, AutoModel

# Load the fine-tuned tokenizer and encoder from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("matsut21/georgian-e5-finetuned")
model = AutoModel.from_pretrained("matsut21/georgian-e5-finetuned")
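
The snippet above only loads the tokenizer and model. A minimal sketch of turning text into sentence embeddings follows. It assumes this fine-tune keeps the conventions of the base multilingual-e5-small model: mean pooling over non-padding tokens, L2 normalisation, and a "query: " prefix on inputs. None of these is confirmed by this card, and the helper names `mean_pool`, `embed`, and `demo` are hypothetical.

```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "matsut21/georgian-e5-finetuned"


def mean_pool(last_hidden_state, attention_mask):
    """Average token vectors per sequence, ignoring padding positions."""
    mask = attention_mask.unsqueeze(-1).to(last_hidden_state.dtype)
    summed = (last_hidden_state * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1e-9)  # avoid division by zero
    return summed / counts


def embed(texts, tokenizer, model):
    """Return one L2-normalised embedding per input text."""
    batch = tokenizer(
        texts, padding=True, truncation=True, max_length=512, return_tensors="pt"
    )
    with torch.no_grad():
        output = model(**batch)
    pooled = mean_pool(output.last_hidden_state, batch["attention_mask"])
    return F.normalize(pooled, p=2, dim=1)


def demo():
    """Download the model and print pairwise cosine similarities."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModel.from_pretrained(MODEL_ID)
    model.eval()
    # Assumption: the "query: " prefix is inherited from the base E5 model.
    texts = ["query: გამარჯობა", "query: მადლობა"]
    embeddings = embed(texts, tokenizer, model)
    # Rows are unit-length, so the matrix product gives cosine similarities.
    print(embeddings @ embeddings.T)
```

Calling `demo()` downloads the weights from the Hub on first use. If the model was trained without the E5 prefixes, dropping "query: " from the inputs may give better results, so it is worth comparing both on a few known pairs.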