🎯 Our Mission
We research and develop linguistically grounded optimization techniques for Large Language Models, focusing on how ancient linguistic structures can solve modern computational challenges.
🔬 Core Research: Semantic Compression Layer
Our flagship project explores using Arabic morphological structure as an intermediate representation layer for LLMs.
The Problem
Current tokenizers fragment text inefficiently, creating a "Token Tax" that:
- Inflates compute costs, since self-attention scales quadratically with sequence length
- Disadvantages 160+ high-fertility languages, whose words split into far more subword tokens than English
- Wastes billions of dollars in training and inference compute

The sketch after this list shows the fertility gap with an off-the-shelf tokenizer.
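A minimal sketch, assuming the Hugging Face `transformers` library; the GPT-2 tokenizer and the paired sentences are illustrative choices of ours, and exact counts vary by tokenizer:

```python
# Measure token "fertility" (tokens per word) for the same sentence in two
# languages. GPT-2's byte-level BPE is a stand-in for any English-centric
# tokenizer; results differ with other vocabularies.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

samples = {
    "English": "The writer wrote a book in the library.",
    "Arabic": "كتب الكاتب كتابا في المكتبة",  # the same sentence in Arabic
}

for lang, text in samples.items():
    n_tokens = len(tokenizer.tokenize(text))
    n_words = len(text.split())
    print(f"{lang}: {n_words} words -> {n_tokens} tokens "
          f"(fertility ~ {n_tokens / n_words:.2f})")
```

The Arabic line typically comes out several times more token-dense than the English one: the Token Tax in miniature.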
Our Solution
Arabic's 1,400-year-old root system offers a mathematical framework for semantic compression:
ك-ت-ب (k-t-b) = "writing"
│
├── كَتَبَ (kataba) wrote
├── كِتَاب (kitāb) book
├── كَاتِب (kātib) writer
├── مَكْتُوب (maktūb) written
└── مَكْتَبَة (maktaba) library

One root → many meanings
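To make the compression idea concrete, here is a toy sketch that factors each derived form into a (root, pattern) pair; the lexicon, the pattern notation, and the `encode` helper are hypothetical stand-ins for the root-to-concept mapping tables we plan to release:

```python
# Root-plus-pattern factoring as an intermediate representation:
# each derived word becomes one (root_id, pattern_id) pair instead of
# several subword tokens. Hand-built illustration only.
ROOTS = {"k-t-b": 0}            # root consonants -> root id
PATTERNS = {                    # derivational pattern -> pattern id
    "CaCaCa": 0,    # kataba  -> "wrote"
    "CiCaaC": 1,    # kitaab  -> "book"
    "CaaCiC": 2,    # kaatib  -> "writer"
    "maCCuuC": 3,   # maktuub -> "written"
    "maCCaCa": 4,   # maktaba -> "library"
}

def encode(root: str, pattern: str) -> tuple[int, int]:
    """Represent a derived word as a single (root_id, pattern_id) pair."""
    return ROOTS[root], PATTERNS[pattern]

print(encode("k-t-b", "CiCaaC"))  # -> (0, 1), the pair for "book"
```

Because one root id is shared across all its derivations, the representation grows with the number of roots plus the number of patterns, not their product.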
Expected Impact:
- 🎯 30-50% token reduction
- ⚡ Up to 75% compute savings in the attention term (see the arithmetic sketch after this list)
- 🌍 Language-agnostic at the user level
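The 75% figure follows from self-attention's quadratic cost in sequence length; the arithmetic below is our own back-of-envelope sketch and ignores the terms that scale only linearly with length:

```python
# Halving the sequence length n quarters the n^2 attention cost.
for reduction in (0.30, 0.50):
    n_ratio = 1.0 - reduction      # new sequence length / old sequence length
    attn_ratio = n_ratio ** 2      # attention cost scales as n^2
    print(f"{reduction:.0%} fewer tokens -> "
          f"{1 - attn_ratio:.0%} lower attention compute")
# 30% fewer tokens -> 51% lower attention compute
# 50% fewer tokens -> 75% lower attention compute
```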
📦 Coming Soon to Hugging Face
We're working on releasing:
| Type | Description | Status |
|---|---|---|
| 🤖 Models | Root-compressed LLM variants | 🔬 In Research |
| 📊 Datasets | Arabic root-to-concept mappings | 📅 Planned |
| 🚀 Spaces | Interactive compression demos | 📅 Planned |
🤝 Get Involved
We're an open research initiative seeking collaborators:
- 🤓 Linguists → Arabic morphology experts to validate mappings
- 🤖 ML Engineers → Tokenizer training & model fine-tuning
- 📊 Researchers → Experiment design & benchmarking
- ⚡ Systems Engineers → Inference optimization
📄 Publications
- White Paper: *Root-Based Semantic Compression: Leveraging Arabic Morphological Structure as an Optimization Layer for LLMs* (January 2026)
Making AI more efficient through linguistic insight
Open Research • Open Source • Open Collaboration