ShuhongWu committed · verified
Commit 1ae70e9 · 1 Parent(s): 9e9d0d4

Update README.md

Files changed (1): README.md (+120 -11)
README.md CHANGED
@@ -1,11 +1,120 @@
- ---
- title: README
- emoji: 🐢
- colorFrom: blue
- colorTo: yellow
- sdk: static
- pinned: false
- license: mit
- ---
-
- [We are a research company that builds AI products for the edge.](https://www.atomgradient.com/)
+ ---
+ title: README
+ emoji:
+ colorFrom: indigo
+ colorTo: purple
+ sdk: static
+ pinned: false
+ license: mit
+ ---
+
+ # AtomGradient — Bringing AI to the Edge
+
+ **We are an independent research group dedicated to making AI run efficiently on edge devices.**
+ We believe powerful AI should be private, accessible, and free from cloud dependency. All our research is open source.
+
+ 🌐 [atomgradient.com](https://atomgradient.com) · 🐙 [GitHub](https://github.com/AtomGradient) · 🚀 [EchoStream AI](https://www.echostream-ai.com/)
+
+ ---
+
+ ## Research
+
+ ### [Prism — Cross-Domain Personal Data Integration on Consumer Hardware](https://atomgradient.github.io/Prism/)
+
+ Integrating finance, diet, mood, and reading data entirely on consumer Apple Silicon, producing emergent cross-domain insights with zero data leakage.
+
+ - 📈 **1.48x** cross-domain insight emergence (IIR)
+ - 🔒 **125.5x** federation compression, zero data leakage
+ - ⚡ **49.9 TPS** real-time inference (35B on M2 Ultra)
+
+ [[GitHub]](https://github.com/AtomGradient/Prism) · [[Paper]](https://atomgradient.github.io/Prism/)
+
+ ---
+
+ ### [ANE Batch Prefill — On-Device Parallel LLM Inference](https://atomgradient.github.io/hybird-batch-prefill-on-ane/)
+
+ Fused matrix-vector kernels enabling concurrent ANE batch prefill + GPU decode on Apple Silicon for Qwen3.5 models.
+
+ - 🚀 **11.3x** ANE batch prefill speedup (268 tok/s)
+ - 🔋 **79%** power reduction for the prefill component
+ - ⏱️ **<30 ms** state transfer overhead
+
+ [[GitHub]](https://github.com/AtomGradient/hybird-batch-prefill-on-ane) · [[Paper]](https://atomgradient.github.io/hybird-batch-prefill-on-ane/)
+
+ ---
+
+ ### [hybrid-ane-mlx-bench — Disaggregated LLM Inference on Apple Silicon](https://atomgradient.github.io/hybrid-ane-mlx-bench/)
+
+ Benchmarking CoreML ANE prefill + MLX GPU decode for Qwen3.5 on Apple Silicon, with four inference strategies compared.
+
+ - 🔄 ANE prefill matches GPU at **~410 tokens**
+ - 🔋 **282x** GPU power reduction during prefill
+ - 📊 4 inference pipelines benchmarked
+
+ [[GitHub]](https://github.com/AtomGradient/hybrid-ane-mlx-bench) · [[Paper]](https://atomgradient.github.io/hybrid-ane-mlx-bench/)
+
+ ---
+
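The disaggregated design behind these two projects can be sketched schematically: a compute-bound, batched prefill fills the KV cache on one backend (the ANE in the benchmarks), the cache is handed off, and a memory-bound, one-token-at-a-time decode continues on another (the GPU). The classes and functions below are hypothetical plain-Python stand-ins for illustration, not the projects' actual CoreML/MLX code.

```python
from dataclasses import dataclass, field

@dataclass
class KVCache:
    # One (key, value) entry per processed token; a real cache holds
    # per-layer, per-head tensors rather than tuples.
    entries: list = field(default_factory=list)

def ane_prefill(prompt_tokens):
    """Process the whole prompt in one batched pass, filling the KV cache."""
    cache = KVCache()
    for tok in prompt_tokens:
        cache.entries.append(("k", tok, "v", tok))  # placeholder tensors
    return cache

def gpu_decode(cache, max_new_tokens):
    """Autoregressive decode: one token per step, reusing the prefill cache."""
    out = []
    for _ in range(max_new_tokens):
        new_tok = len(cache.entries)  # placeholder "model" output
        cache.entries.append(("k", new_tok, "v", new_tok))
        out.append(new_tok)
    return out

prompt = list(range(410))       # crossover length reported in the benchmarks
cache = ane_prefill(prompt)     # batched, compute-bound pass
tokens = gpu_decode(cache, 8)   # sequential, memory-bound pass
print(len(cache.entries))       # 418: prompt and generated tokens share one cache
```

The handoff between the two phases is where the reported <30 ms state transfer overhead would be paid in a real pipeline.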
+ ### [swift-qwen3-tts — On-Device Text-to-Speech](https://atomgradient.github.io/swift-qwen3-tts/)
+
+ Native Swift implementation of Qwen3 TTS 0.6B for real-time, on-device speech synthesis.
+
+ - 📦 **67%** model compression (2.35 GB → 808 MB)
+ - 🎙️ Real-time synthesis (**RTF 0.68x**)
+ - 🌍 12 languages supported
+
+ [[GitHub]](https://github.com/AtomGradient/swift-qwen3-tts) · [[Paper]](https://atomgradient.github.io/swift-qwen3-tts/)
+
+ ---
+
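RTF here is the standard real-time factor for speech synthesis: wall-clock synthesis time divided by the duration of the audio produced, so RTF < 1.0 means faster than real time. A minimal check of the reported figure (function name is ours):

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """Wall-clock synthesis time divided by duration of generated audio.
    RTF < 1.0 means the model synthesizes faster than real time."""
    return synthesis_seconds / audio_seconds

# At RTF 0.68x, 10 s of speech takes about 6.8 s to synthesize.
print(round(real_time_factor(6.8, 10.0), 2))  # 0.68
```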
+ ### [Gemma-Prune — On-Device Vision Language Model](https://atomgradient.github.io/swift-gemma-cli/)
+
+ Multi-stage compression pipeline for deploying the Gemma 3 4B VLM on consumer hardware.
+
+ - 📦 **25%** model compression (2.8 GB → 2.1 GB)
+ - 📝 **110 tok/s** text generation
+ - 🖼️ **3.4x** image processing speedup
+
+ [[GitHub]](https://github.com/AtomGradient/swift-gemma-cli) · [[Paper]](https://atomgradient.github.io/swift-gemma-cli/)
+
+ ---
+
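The compression percentages in these lists follow the usual convention, size reduction relative to the original footprint. A quick sanity check of the 2.8 GB → 2.1 GB figure (function name is ours):

```python
def compression_pct(before_gb: float, after_gb: float) -> float:
    """Size reduction as a percentage of the original model footprint."""
    return 100.0 * (1.0 - after_gb / before_gb)

print(round(compression_pct(2.8, 2.1)))  # 25, matching the reported 25%
```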
+ ### [OptMLX — MLX Memory Optimization Research](https://atomgradient.github.io/OptMLX/)
+
+ Exploring memory optimization techniques for the MLX framework on Apple Silicon.
+
+ - ⚡ Up to **20x** faster mmap loading
+ - 🔄 Zero-copy model loading
+ - 📊 Comprehensive benchmarks
+
+ [[GitHub]](https://github.com/AtomGradient/OptMLX) · [[Paper]](https://atomgradient.github.io/OptMLX/)
+
+ ---
+
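The mmap/zero-copy idea is that the OS maps the checkpoint file into the process's address space rather than copying it into RAM: "loading" is near-instant, and pages are faulted in only when a tensor is actually touched. A minimal NumPy sketch of the general technique (not OptMLX's code; the file and array are made up):

```python
import os
import tempfile
import numpy as np

# Write a pretend checkpoint tensor to disk.
path = os.path.join(tempfile.gettempdir(), "demo_weights.npy")
np.save(path, np.arange(1024, dtype=np.float32))

# mmap_mode="r" maps the file read-only instead of copying it into memory;
# the call returns almost immediately regardless of file size.
mapped = np.load(path, mmap_mode="r")
print(type(mapped).__name__)    # memmap
print(float(mapped[:8].sum()))  # 28.0 (0 + 1 + ... + 7)
```

Because the mapping is read-only and page-backed, several processes can share the same physical pages, which is part of why mmap loading also helps memory footprint, not just startup time.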
+ ## About
+
+ AtomGradient is an independent research group dedicated to making AI run efficiently on edge devices. Our research powers [EchoStream AI](https://www.echostream-ai.com/) — a product line bringing on-device AI capabilities to real-world applications.
+
+ `Edge AI` · `Privacy-First` · `Open Research`
+
+ ---