tawkeed-embedding

tawkeed-embedding is an Arabic-first text embedding model built by Tawkeed, fine-tuned for on-device and edge AI deployment.

Forked from BAAI/bge-m3 and fine-tuned on Arabic semantic similarity and retrieval data, this model powers Arabic search, RAG, and similarity tasks running natively on Tawkeed devices.

Highlights

Arabic-first embeddings — trained and rigorously tested on Arabic text for semantic understanding
Edge-optimized — efficient enough to run embedding pipelines on Tawkeed edge hardware
Production-ready — validated on Arabic retrieval and similarity benchmarks
Multilingual — retains strong multilingual capability from BGE-M3

Model Details

Property	Value
Base Model	BAAI/bge-m3
Language	Arabic (ar), English (en), + multilingual
License	MIT
Task	Text Embedding / Retrieval / Similarity
Fine-tuning	Arabic semantic similarity & retrieval data
Deployment	On-device / Edge / Cloud

Training

The model was produced in three steps:

Fork of the BGE-M3 multilingual embedding model
Fine-tuning on Arabic semantic similarity and retrieval datasets
Evaluation on Arabic retrieval benchmarks
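The card does not say which benchmarks or metrics were used for the evaluation step. As an illustration only, the sketch below computes recall@k, one common retrieval metric, over ranked document IDs. The function names, the data layout (`rankings`, `qrels`), and the choice of metric are assumptions made for this example, not details of Tawkeed's evaluation.

```python
from typing import Dict, List, Set


def recall_at_k(ranked_ids: List[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    top_k = set(ranked_ids[:k])
    return len(top_k & relevant_ids) / len(relevant_ids)


def mean_recall_at_k(
    rankings: Dict[str, List[str]],  # hypothetical layout: query id -> ranked doc ids
    qrels: Dict[str, Set[str]],      # hypothetical layout: query id -> relevant doc ids
    k: int,
) -> float:
    """Average recall@k over every judged query that has a ranking."""
    scores = [recall_at_k(rankings[q], qrels[q], k) for q in qrels if q in rankings]
    return sum(scores) / len(scores) if scores else 0.0


# Toy data for illustration; real runs would rank documents by embedding similarity.
rankings = {"q1": ["d3", "d1", "d2"], "q2": ["d2", "d4", "d1"]}
qrels = {"q1": {"d1"}, "q2": {"d1", "d4"}}
print(mean_recall_at_k(rankings, qrels, k=2))
```

In practice the rankings would come from sorting a corpus by similarity to each query embedding, and the relevance judgments from whichever Arabic retrieval dataset is being used.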

Usage

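from sentence_transformers import SentenceTransformer

# Load the model from the Hugging Face Hub (downloaded on first use).
model = SentenceTransformer("tawkeed-sa/tawkeed-embedding")

sentences = [
    "الذكاء الاصطناعي يغير العالم",  # "Artificial intelligence is changing the world"
    "تقنيات التعلم العميق تتطور بسرعة",  # "Deep learning techniques are evolving rapidly"
    "الطقس جميل اليوم",  # "The weather is nice today"
]

# encode() returns one embedding vector per input sentence.
embeddings = model.encode(sentences)
print(embeddings.shape)  # (number of sentences, embedding dimension)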
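
For the search and similarity tasks mentioned earlier, the embeddings are typically compared with cosine similarity. The following is a minimal sketch of that step, not an official Tawkeed API: the helpers `cosine_similarity`, `rank_documents`, and `demo` are hypothetical names introduced here, and the query string in `demo` is an illustrative example. The helpers use only the standard library and accept any sequence of floats, including the rows returned by `model.encode`.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return float(dot / (norm_a * norm_b))


def rank_documents(
    query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def demo() -> None:
    """Rank the example sentences against a query (requires the model download)."""
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("tawkeed-sa/tawkeed-embedding")
    documents = [
        "الذكاء الاصطناعي يغير العالم",
        "تقنيات التعلم العميق تتطور بسرعة",
        "الطقس جميل اليوم",
    ]
    query = "ما هو الذكاء الاصطناعي؟"  # illustrative query: "What is artificial intelligence?"
    doc_vecs = model.encode(documents)
    query_vec = model.encode([query])[0]
    for index, score in rank_documents(query_vec, doc_vecs):
        print(f"{score:.3f}  {documents[index]}")
```

Calling `demo()` prints the three sentences ordered by similarity to the query. For larger corpora, a vectorized or indexed search would be a better fit than this pairwise loop.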

Tawkeed Model Family

The Tawkeed family covers Arabic AI models ranging from compact edge models to large-scale MoE models, all fine-tuned and tested for Arabic.

Model	Size	Type
tawkeed-sa/tawkeed-0.8b	0.8b	Arabic LLM
tawkeed-sa/tawkeed-2b	2b	Arabic LLM
tawkeed-sa/tawkeed-4b	4b	Arabic LLM
tawkeed-sa/tawkeed-9b	9b	Arabic LLM
tawkeed-sa/tawkeed-27b	27b	Arabic LLM
tawkeed-sa/tawkeed-40b	40b	Arabic LLM
tawkeed-sa/tawkeed-27b-MLX	27b 8-bit	LLM — Apple Silicon (MLX)
tawkeed-sa/tawkeed-27b-GGUF	27b Q8_0	LLM — Ollama / llama.cpp
tawkeed-sa/tawkeed-ocr	—	OCR
tawkeed-sa/tawkeed-embedding	—	Embedding

About Tawkeed

Tawkeed builds Arabic-native AI that runs on the edge. Every model in the family is fine-tuned for Arabic, tested on Arabic benchmarks, and optimized for deployment on Tawkeed devices.

Built by Tawkeed.
