---
title: README
emoji: ⚡
colorFrom: indigo
colorTo: purple
sdk: static
pinned: false
license: mit
---
# AtomGradient — Bringing AI to the Edge
**We are an independent research group dedicated to making AI run efficiently on edge devices.**
We believe powerful AI should be private, accessible, and free from cloud dependency. All our research is open-source.
🌐 [atomgradient.com](https://atomgradient.com) · 🐙 [GitHub](https://github.com/AtomGradient) · 🚀 [EchoStream AI](https://www.echostream-ai.com/)
---
## Research
### [Prism — Cross-Domain Personal Data Integration on Consumer Hardware](https://atomgradient.github.io/Prism/)
Integrating finance, diet, mood, and reading data entirely on consumer Apple Silicon, producing emergent cross-domain insights with zero data leakage.
- 📈 **1.48x** cross-domain insight emergence (IIR)
- 🔒 **125.5x** federation compression, zero data leakage
- ⚡ **49.9 TPS** real-time inference (35B on M2 Ultra)
[[GitHub]](https://github.com/AtomGradient/Prism) · [[Paper]](https://atomgradient.github.io/Prism/)
---
### [ANE Batch Prefill — On-Device Parallel LLM Inference](https://atomgradient.github.io/hybird-batch-prefill-on-ane/)
Fused matrix-vector kernels enabling concurrent ANE batch prefill + GPU decode on Apple Silicon for Qwen3.5 models.
- 🚀 **11.3x** ANE batch prefill speedup (268 tok/s)
- 🔋 **79%** power reduction for prefill component
- ⏱️ **<30 ms** state transfer overhead
[[GitHub]](https://github.com/AtomGradient/hybird-batch-prefill-on-ane) · [[Paper]](https://atomgradient.github.io/hybird-batch-prefill-on-ane/)
---
### [hybrid-ane-mlx-bench — Disaggregated LLM Inference on Apple Silicon](https://atomgradient.github.io/hybrid-ane-mlx-bench/)
Benchmarking CoreML ANE prefill + MLX GPU decode for Qwen3.5 on Apple Silicon, comparing four inference strategies.
- 🔄 ANE prefill matches GPU at **~410 tokens**
- 🔋 **282x** GPU power reduction during prefill
- 📊 4 inference pipelines benchmarked
[[GitHub]](https://github.com/AtomGradient/hybrid-ane-mlx-bench) · [[Paper]](https://atomgradient.github.io/hybrid-ane-mlx-bench/)
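A crossover point like the one above naturally suggests a length-based dispatch rule: route short prompts to the faster-to-start backend and long prompts to the one that wins at scale. The sketch below is illustrative only and is not code from hybrid-ane-mlx-bench; the threshold constant and the assumption that ANE prefill pays off at or beyond the crossover are taken from the bullet above, and the direction of the rule is an assumption.

```python
# Illustrative prompt-length dispatch between two prefill backends.
# CROSSOVER_TOKENS comes from the benchmark's reported ~410-token
# crossover; the routing direction (ANE for long prompts) is assumed.
CROSSOVER_TOKENS = 410

def pick_prefill_backend(prompt_tokens: int) -> str:
    """Return which backend to use for prefill, given prompt length.

    Below the crossover the GPU path is assumed faster end-to-end;
    at or above it, ANE prefill matches GPU while drawing far less power.
    """
    return "ane" if prompt_tokens >= CROSSOVER_TOKENS else "gpu"

print(pick_prefill_backend(128))   # short prompt -> "gpu"
print(pick_prefill_backend(2048))  # long prompt  -> "ane"
```

In a real disaggregated pipeline the decision would also weigh state-transfer overhead and power budget, not prompt length alone.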
---
### [swift-qwen3-tts — On-Device Text-to-Speech](https://atomgradient.github.io/swift-qwen3-tts/)
Native Swift implementation of Qwen3 TTS 0.6B for real-time, on-device speech synthesis.
- 📦 **67%** model compression (2.35 GB → 808 MB)
- 🎙️ Real-time synthesis (**RTF 0.68x**)
- 🌍 12 languages supported
[[GitHub]](https://github.com/AtomGradient/swift-qwen3-tts) · [[Paper]](https://atomgradient.github.io/swift-qwen3-tts/)
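The RTF (real-time factor) figure above is the standard ratio of synthesis time to audio duration; RTF below 1.0 means speech is generated faster than it plays back. A minimal sketch of the metric (not code from swift-qwen3-tts):

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent synthesizing / duration of the audio produced.

    RTF < 1.0 means the model keeps up with playback (real-time capable);
    e.g. generating 10 s of speech in 6.8 s gives RTF 0.68.
    """
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return synthesis_seconds / audio_seconds

print(real_time_factor(6.8, 10.0))  # ~0.68
```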
---
### [Gemma-Prune — On-Device Vision Language Model](https://atomgradient.github.io/swift-gemma-cli/)
Multi-stage compression pipeline for deploying Gemma 3 4B VLM on consumer hardware.
- 📦 **25%** model compression (2.8 GB → 2.1 GB)
- 📝 **110 tok/s** text generation
- 🖼️ **3.4x** image processing speedup
[[GitHub]](https://github.com/AtomGradient/swift-gemma-cli) · [[Paper]](https://atomgradient.github.io/swift-gemma-cli/)
---
### [OptMLX — MLX Memory Optimization Research](https://atomgradient.github.io/OptMLX/)
Exploring memory optimization techniques for the MLX framework on Apple Silicon.
- ⚡ Up to **20x** faster mmap loading
- 🔄 Zero-copy model loading
- 📊 Comprehensive benchmarks
[[GitHub]](https://github.com/AtomGradient/OptMLX) · [[Paper]](https://atomgradient.github.io/OptMLX/)
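Zero-copy mmap loading, as a general technique, means mapping the weight file into the process address space and viewing it in place, so no bytes are copied up front and pages fault in lazily on first access. A minimal plain-Python/NumPy sketch of the idea (illustrative only, not OptMLX's actual code path):

```python
import mmap
import os
import tempfile

import numpy as np

# Write a toy "weights" file: a flat float32 tensor on disk.
weights = np.arange(16, dtype=np.float32)
path = os.path.join(tempfile.mkdtemp(), "weights.bin")
weights.tofile(path)

# Zero-copy load: map the file and view the bytes as an array in place.
# np.frombuffer does not copy; the OS pages data in on demand.
with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
tensor = np.frombuffer(mm, dtype=np.float32)

print(tensor[:4])  # first elements read straight from the mapping
```

The speedup over a conventional read-then-deserialize load comes from skipping the upfront copy entirely; only the pages a forward pass actually touches are ever brought into memory.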
---
## About
AtomGradient is an independent research group dedicated to making AI run efficiently on edge devices. Our research powers [EchoStream AI](https://www.echostream-ai.com/) — a product line bringing on-device AI capabilities to real-world applications.
`Edge AI` · `Privacy-First` · `Open Research`