
🌿 Root Semantic Research

Pioneering linguistic efficiency in artificial intelligence

GitHub • Research Paper


🎯 Our Mission

We research and develop linguistically-grounded optimization techniques for Large Language Models, focusing on how ancient linguistic structures can solve modern computational challenges.


🔬 Core Research: Semantic Compression Layer

Our flagship project explores using Arabic morphological structure as an intermediate representation layer for LLMs.

The Problem

Current tokenizers fragment text inefficiently, creating a "Token Tax" that:

  • Inflates compute costs, since attention scales quadratically with sequence length
  • Disadvantages speakers of 160+ high-fertility languages, whose words split into many tokens
  • Wastes billions in training and inference costs
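To make the fertility gap concrete, here is a minimal sketch (the greedy matcher and toy vocabulary are illustrative, not our tokenizer): a subword tokenizer keeps in-vocabulary words compact but shatters everything else into characters, which is exactly what drives token counts up for under-represented languages.

```python
def greedy_tokenize(word, vocab):
    """Toy longest-prefix subword tokenizer (BPE-like greedy matching)."""
    tokens, i = [], 0
    while i < len(word):
        # Try the longest match first; fall back to a single character.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab or j == i + 1:
                tokens.append(word[i:j])
                i = j
                break
    return tokens

# A vocabulary tuned for English fragments a transliterated Arabic word badly.
vocab = {"writ", "ing", "write", "er", "book"}
print(greedy_tokenize("writing", vocab))  # 2 tokens
print(greedy_tokenize("maktaba", vocab))  # 7 tokens: one per character
```

The ratio of tokens to words (fertility) is the quantity the Token Tax inflates: 2 tokens for the English word versus 7 for the Arabic one in this toy setup.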

Our Solution

Arabic's 1,400-year-old root system offers a mathematical framework for semantic compression:

ك-ت-ب (k-t-b) = "writing"
    │
    ├─ كَتَبَ    kataba   wrote
    ├─ كِتَاب   kitāb    book
    ├─ كَاتِب   kātib    writer
    ├─ مَكْتُوب  maktūb   written
    └─ مَكْتَبَة  maktaba  library

One root → Many meanings
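The root-and-pattern derivation above can be sketched in a few lines of Python. The `C` slots in each template take the root's consonants in order; the template set and transliterations here are illustrative, not a release artifact:

```python
def apply_pattern(root, pattern):
    """Interleave a triliteral root into a template: each 'C' slot takes
    the next root consonant; all other characters pass through."""
    consonants = iter(root)
    return "".join(next(consonants) if ch == "C" else ch for ch in pattern)

ROOT = ("k", "t", "b")  # ك-ت-ب, the root for "writing"
PATTERNS = {
    "CaCaCa":  "wrote",    # kataba
    "CiCāC":   "book",     # kitāb
    "CāCiC":   "writer",   # kātib
    "maCCūC":  "written",  # maktūb
    "maCCaCa": "library",  # maktaba
}
for pattern, gloss in PATTERNS.items():
    print(f"{apply_pattern(ROOT, pattern):<9} {gloss}")
```

One root plus a handful of reusable templates spans the whole derivational family, which is the compression opportunity the project targets.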

Expected Impact:

  • 🎯 30-50% token reduction
  • ⚡ Up to 75% compute savings
  • 🌍 Language-agnostic at the user level
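The compute figure follows from attention's quadratic scaling in sequence length. A back-of-envelope check (assuming attention dominates and cost grows as n²) shows how a 50% token reduction yields the 75% upper bound:

```python
def attention_savings(token_reduction):
    """Fraction of attention FLOPs saved when the sequence shrinks by
    `token_reduction`, assuming cost grows as n**2."""
    return 1.0 - (1.0 - token_reduction) ** 2

for r in (0.30, 0.50):
    print(f"{r:.0%} fewer tokens -> {attention_savings(r):.0%} less attention compute")
```

Real end-to-end savings will be lower, since feed-forward layers scale linearly in n; the quadratic term sets the ceiling.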

📦 Coming Soon to Hugging Face

We're working on releasing:

Type         Description                      Status
🤖 Models    Root-compressed LLM variants     🔬 In Research
📊 Datasets  Arabic root-to-concept mappings  📋 Planned
🚀 Spaces    Interactive compression demos    📋 Planned

🤝 Get Involved

We're an open research initiative seeking collaborators:

  • 🔤 Linguists — Arabic morphology experts to validate mappings
  • 🤖 ML Engineers — Tokenizer training & model fine-tuning
  • 📊 Researchers — Experiment design & benchmarking
  • ⚡ Systems Engineers — Inference optimization

📚 Publications


Making AI more efficient through linguistic insight

Open Research • Open Source • Open Collaboration
