# Mirror-MoE-80M

A sparse Mixture-of-Experts language model optimized for edge devices.
| Metric | Value |
|---|---|
| Total Parameters | 81M |
| Active Parameters | 37M (2.2x sparse) |
| Experts | 16 Sparse + 1 Shared Anchor |
| Context | 512 tokens |
| Speed (Apple M4) | 111 tokens/sec |
## Key Features
- Extreme Efficiency: Only 37M parameters compute per token
- Mobile-Ready: Runs at 100+ tok/s on Apple Silicon
- Dual-Mode: Chat and RAG (context extraction) capable
## Model Variants

| File | Best For |
|---|---|
| `mirror_ai_hybrid.safetensors` | General Chat + Fact Retrieval |
| `mirror_ai_elite.safetensors` | Logic + Instruction Following |
## Benchmarks
| Benchmark | Mirror-MoE-80M | Pythia-70M | Random |
|---|---|---|---|
| PIQA | 53.6% | 56% | 50% |
| ARC-Easy | 32.2% | 37% | 25% |
| HellaSwag | 25.6% | 26% | 25% |
Mirror-MoE-80M approaches Pythia-70M on PIQA (53.6% vs 56%) with roughly half the compute per token (37M active parameters vs 70M).
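The sparsity figures quoted above can be sanity-checked with a couple of lines of arithmetic (values taken from the tables in this README):

```python
# Check the sparsity ratio implied by the parameter counts above.
total_params = 81e6   # total parameters (81M)
active_params = 37e6  # parameters active per token (37M)

sparsity_ratio = total_params / active_params
active_fraction = active_params / total_params

print(f"{sparsity_ratio:.1f}x sparse")                      # 2.2x sparse
print(f"{active_fraction:.0%} of weights active per token")  # 46% of weights active per token
```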
## Quick Start

### Apple Silicon (MLX)

```bash
pip install mlx tokenizers
python inference.py
```

### PyTorch (CPU/CUDA)

```bash
pip install torch safetensors tokenizers
python inference_pytorch.py
```
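Both inference scripts ultimately run a decoding loop over the model's next-token logits. A minimal greedy loop looks like the sketch below; `logits_fn` is a stand-in for the real model forward pass (an assumption for illustration, not the actual Mirror-MoE API):

```python
# Minimal greedy-decoding loop. `logits_fn` maps a list of token ids to
# next-token logits (one float per vocabulary entry); the real scripts
# wire this up to the MLX or PyTorch model and the BPE tokenizer.
from typing import Callable, List

def greedy_generate(logits_fn: Callable[[List[int]], List[float]],
                    prompt_ids: List[int],
                    max_new_tokens: int,
                    eos_id: int) -> List[int]:
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = logits_fn(ids)
        # Pick the highest-scoring token (argmax over the vocabulary).
        next_id = max(range(len(logits)), key=logits.__getitem__)
        ids.append(next_id)
        if next_id == eos_id:  # stop once the end-of-sequence token appears
            break
    return ids
```

Sampling strategies (temperature, top-p) replace only the `argmax` line; the loop structure stays the same.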
## Files

| File | Description |
|---|---|
| `mirror_ai_hybrid.safetensors` | Hybrid model weights (309 MB) |
| `mirror_ai_elite.safetensors` | Elite model weights (309 MB) |
| `custom_bpe_32k.json` | BPE tokenizer (32k vocab) |
| `model.py` | MLX architecture |
| `model_pytorch.py` | PyTorch architecture |
| `inference.py` | MLX inference script |
| `inference_pytorch.py` | PyTorch inference script |
## Architecture

```
MirrorTransformer (81M total)
├── Embedding (16M)
├── 8x TransformerBlock
│   ├── Attention (RoPE)
│   └── MoE Layer
│       ├── Shared Expert (512-dim, always active)
│       └── 16 Sparse Experts (256-dim, Top-2 routing)
└── Output Head (16M)
```
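The MoE layer above combines an always-active shared anchor expert with a Top-2 gated mixture of sparse experts. A simplified NumPy sketch of that forward pass is below; the square expert matrices and softmax-over-selected-experts gating are illustrative assumptions, not the exact Mirror-MoE implementation:

```python
import numpy as np

def moe_layer(x, shared_w, expert_ws, router_w, top_k=2):
    """One anchor-stabilized MoE layer (simplified sketch).

    x:         (d,) token activation
    shared_w:  (d, d) shared anchor expert, always applied
    expert_ws: list of (d, d) sparse expert weights
    router_w:  (d, n_experts) routing projection
    """
    scores = x @ router_w               # one router logit per expert
    top = np.argsort(scores)[-top_k:]   # indices of the top-k experts
    gates = np.exp(scores[top])
    gates /= gates.sum()                # softmax over the selected experts only
    out = x @ shared_w                  # anchor expert always contributes
    for g, i in zip(gates, top):
        out += g * (x @ expert_ws[i])   # add gated sparse expert outputs
    return out
```

Because only 2 of the 16 sparse experts run per token (plus the shared anchor), most expert weights sit idle on any given step, which is where the 37M-active-of-81M-total figure comes from.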
## Citation

Research paper:

```bibtex
@misc{mirror2026moe,
  title={Mirror-MoE-80M: Anchor-Stabilized Granular Mixture of Experts for Low-Resource Training},
  author={Dipesh Majithia},
  year={2026},
  publisher={Zenodo},
  doi={10.5281/zenodo.18473273},
  url={https://zenodo.org/records/18473273}
}
```
## Disclaimer
This is a research model. Outputs may be incorrect or biased. Not for production use without additional safety measures.
## License

CC BY 4.0. Free to use with attribution to MirrorAI / Dipesh Majithia.