🚀 Introducing MiniMind Max2 - Efficient LLMs for Edge Devices

#1
by fariasultana - opened

MiniMind Max2 is Here! 🎉

We're excited to release MiniMind Max2, a family of efficient language models designed for edge deployment!

Key Features

  • Only 25% of parameters are activated per token, via a Mixture-of-Experts architecture
  • ~75% KV-cache memory savings with Grouped Query Attention (4:1 query-to-KV head ratio)
  • Mobile-ready: Runs on smartphones, tablets, and IoT devices
  • Easy export: ONNX, GGUF (llama.cpp), Android NDK
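The GQA savings above follow directly from the head ratio: the KV cache stores keys and values only for the KV heads, so a 4:1 query-to-KV ratio shrinks it to a quarter of the multi-head-attention baseline. A minimal sketch of that arithmetic (the head counts, dimensions, and sequence length below are illustrative assumptions, not the actual MiniMind Max2 configuration):

```python
# Sketch: KV-cache size for standard multi-head attention (MHA) vs.
# Grouped Query Attention (GQA). All model dimensions here are assumed
# for illustration, not taken from the MiniMind Max2 config.

def kv_cache_bytes(num_kv_heads, head_dim, num_layers, seq_len, bytes_per_elem=2):
    # Two cached tensors per layer (K and V), each of shape
    # [seq_len, num_kv_heads, head_dim], stored in fp16 (2 bytes).
    return 2 * num_layers * seq_len * num_kv_heads * head_dim * bytes_per_elem

num_q_heads = 16   # assumed query-head count
head_dim = 64      # assumed per-head dimension
num_layers = 24    # assumed layer count
seq_len = 4096     # assumed context length

# MHA: one KV head per query head. GQA 4:1: every 4 query heads share 1 KV head.
mha = kv_cache_bytes(num_q_heads, head_dim, num_layers, seq_len)
gqa = kv_cache_bytes(num_q_heads // 4, head_dim, num_layers, seq_len)

print(f"MHA KV cache: {mha / 2**20:.0f} MiB")
print(f"GQA KV cache: {gqa / 2**20:.0f} MiB ({1 - gqa / mha:.0%} saved)")
```

The 75% figure is independent of the assumed dimensions: it is determined entirely by the 4:1 head ratio.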

Model Sizes

Model      Total  Active  INT4 Size
max2-nano  500M   125M    ~300MB
max2-lite  1.5B   375M    ~900MB
max2-pro   3B     750M    ~1.8GB
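The table's columns are consistent with simple back-of-the-envelope math: active parameters are 25% of the total (the MoE activation ratio above), and INT4 weights take roughly 0.5 bytes each plus some quantization overhead for scales and zero-points. A rough sketch, where the 15% overhead factor is an assumption and real sizes depend on the quantization scheme:

```python
# Sketch: reproduce the table's Active and INT4 Size columns from the
# total parameter counts. The 15% quantization overhead is an assumed
# ballpark, not a measured value.

def int4_size_mb(total_params, overhead=0.15):
    # 4-bit weights = 0.5 bytes/param, inflated by scale/zero-point overhead.
    return total_params * 0.5 * (1 + overhead) / 1e6

for name, total in [("max2-nano", 500e6), ("max2-lite", 1.5e9), ("max2-pro", 3e9)]:
    active = total * 0.25  # 25% of parameters active per token (MoE routing)
    print(f"{name}: {active / 1e6:.0f}M active, ~{int4_size_mb(total):.0f} MB INT4")
```

For max2-nano this gives 125M active parameters and roughly 290 MB, matching the ~300MB in the table.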

Try It Now!

  • 🎮 Demo: MiniMind-API Space
  • 📖 Documentation: Check the README for full details

We'd love your feedback! Let us know what you think. 🙏
