🚀 Introducing MiniMind Max2 - Efficient LLMs for Edge Devices
#1 opened by fariasultana
MiniMind Max2 is Here! 🚀
We're excited to release MiniMind Max2, a family of efficient language models designed for edge deployment!
Key Features
- Only 25% of parameters activated per token via Mixture of Experts (MoE) routing
- 75% KV-cache memory savings with Grouped Query Attention (4:1 query-to-KV head ratio)
- Mobile-ready: runs on smartphones, tablets, and IoT devices
- Easy export: ONNX, GGUF (llama.cpp), Android NDK
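The two efficiency claims above come down to simple ratios. Here is a back-of-the-envelope sketch; the expert count (8), top-k (2), and head counts (32 query / 8 KV) are illustrative assumptions on our part, since the post only states the resulting percentages:

```python
# Sanity-checking the efficiency claims with illustrative configurations.
# Expert/head counts below are assumed, not from the release notes.

def moe_active_fraction(num_experts: int, top_k: int) -> float:
    """Fraction of expert parameters used per token with top-k routing."""
    return top_k / num_experts

def gqa_kv_cache_savings(query_heads: int, kv_heads: int) -> float:
    """Fractional KV-cache memory saved vs. standard multi-head attention,
    where every query head would otherwise carry its own KV heads."""
    return 1 - kv_heads / query_heads

# Top-2 routing over 8 experts activates 2/8 = 25% of expert parameters.
print(moe_active_fraction(8, 2))    # 0.25

# A 4:1 query-to-KV head ratio (e.g. 32 query heads sharing 8 KV heads)
# shrinks the KV cache by 75%.
print(gqa_kv_cache_savings(32, 8))  # 0.75
```

Any top-k/expert-count pair with the same ratio (e.g. top-1 of 4) gives the same 25% activation figure.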
Model Sizes
| Model | Total Params | Active Params | INT4 Size |
|---|---|---|---|
| max2-nano | 500M | 125M | ~300MB |
| max2-lite | 1.5B | 375M | ~900MB |
| max2-pro | 3B | 750M | ~1.8GB |
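The INT4 sizes in the table are consistent with roughly 0.5 bytes per weight plus some overhead. A quick sanity check (parameter counts from the table; the 0.5 bytes/param formula is our approximation and ignores embeddings and quantization metadata):

```python
# Rough INT4 footprint: 4 bits = 0.5 bytes per weight, before overhead.
# Parameter totals are taken from the model-size table above.

def int4_size_gb(total_params: float) -> float:
    """Approximate INT4 checkpoint size in GB at 0.5 bytes per parameter."""
    return total_params * 0.5 / 1e9

for name, params in [("max2-nano", 0.5e9), ("max2-lite", 1.5e9), ("max2-pro", 3e9)]:
    print(f"{name}: ~{int4_size_gb(params):.2f} GB")
# Gives ~0.25 / ~0.75 / ~1.50 GB, close to the listed ~300MB / ~900MB /
# ~1.8GB once quantization overhead is added on top.
```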
Try It Now!
- 🎮 Demo: MiniMind-API Space
- 📖 Documentation: check the README for full details
We'd love your feedback! Let us know what you think. 🎉