ShuhongWu committed on
Commit
f873876
·
verified ·
1 Parent(s): 1ae70e9

Update README.md

  1. README.md +60 -82
README.md CHANGED
@@ -1,120 +1,98 @@
- ---
- ---
- title: README
- emoji:
- colorFrom: indigo
- colorTo: purple
- sdk: static
- pinned: false
- license: mit
- ---

- # AtomGradient — Bringing AI to the Edge

- **We are an independent research group dedicated to making AI run efficiently on edge
- devices.**
- We believe powerful AI should be private, accessible, and free from cloud dependency. All our
- research is open-source.

- 🌐 [atomgradient.com](https://atomgradient.com) · 🐙 [GitHub](https://github.com/AtomGradient)
- · 🚀 [EchoStream AI](https://www.echostream-ai.com/)

- ---

- ## Research

- ### [Prism — Cross-Domain Personal Data Integration on Consumer
- Hardware](https://atomgradient.github.io/Prism/)

- Integrating finance, diet, mood, and reading data entirely on consumer Apple Silicon, producing
- emergent cross-domain insights with zero data leakage.

- - 📈 **1.48x** cross-domain insight emergence (IIR)
- - 🔒 **125.5x** federation compression, zero data leakage
- - ⚡ **49.9 TPS** real-time inference (35B on M2 Ultra)

- [[GitHub]](https://github.com/AtomGradient/Prism) ·
- [[Paper]](https://atomgradient.github.io/Prism/)

- ---

- ### [ANE Batch Prefill — On-Device Parallel LLM
- Inference](https://atomgradient.github.io/hybird-batch-prefill-on-ane/)

- Fused matrix-vector kernels enabling concurrent ANE batch prefill + GPU decode on Apple Silicon
- for Qwen3.5 models.

- - 🚀 **11.3x** ANE batch prefill speedup (268 tok/s)
- - 🔋 **79%** power reduction for prefill component
- - ⏱️ **<30 ms** state transfer overhead

- [[GitHub]](https://github.com/AtomGradient/hybird-batch-prefill-on-ane) ·
- [[Paper]](https://atomgradient.github.io/hybird-batch-prefill-on-ane/)

- ---

- ### [hybrid-ane-mlx-bench — Disaggregated LLM Inference on Apple
- Silicon](https://atomgradient.github.io/hybrid-ane-mlx-bench/)

- Benchmarking CoreML ANE prefill + MLX GPU decode for Qwen3.5 on Apple Silicon, with four
- inference strategies compared.

- - 🔄 ANE prefill matches GPU at **~410 tokens**
- - 🔋 **282x** GPU power reduction during prefill
- - 📊 4 inference pipelines benchmarked

- [[GitHub]](https://github.com/AtomGradient/hybrid-ane-mlx-bench) ·
- [[Paper]](https://atomgradient.github.io/hybrid-ane-mlx-bench/)

- ---

- ### [swift-qwen3-tts — On-Device
- Text-to-Speech](https://atomgradient.github.io/swift-qwen3-tts/)

- Native Swift implementation of Qwen3 TTS 0.6B for real-time, on-device speech synthesis.

- - 📦 **67%** model compression (2.35 GB → 808 MB)
- - 🎙️ Real-time synthesis (**RTF 0.68x**)
- - 🌍 12 languages supported

- [[GitHub]](https://github.com/AtomGradient/swift-qwen3-tts) ·
- [[Paper]](https://atomgradient.github.io/swift-qwen3-tts/)

- ---

- ### [Gemma-Prune — On-Device Vision Language
- Model](https://atomgradient.github.io/swift-gemma-cli/)

- Multi-stage compression pipeline for deploying Gemma 3 4B VLM on consumer hardware.

- - 📦 **25%** model compression (2.8 GB → 2.1 GB)
- - 📝 **110 tok/s** text generation
- - 🖼️ **3.4x** image processing speedup

- [[GitHub]](https://github.com/AtomGradient/swift-gemma-cli) ·
- [[Paper]](https://atomgradient.github.io/swift-gemma-cli/)

- ---

- ### [OptMLX — MLX Memory Optimization Research](https://atomgradient.github.io/OptMLX/)

- Exploring memory optimization techniques for the MLX framework on Apple Silicon.

- - ⚡ Up to **20x** faster mmap loading
- - 🔄 Zero-copy model loading
- - 📊 Comprehensive benchmarks

- [[GitHub]](https://github.com/AtomGradient/OptMLX) ·
- [[Paper]](https://atomgradient.github.io/OptMLX/)

- ---

- ## About

- AtomGradient is an independent research group dedicated to making AI run efficiently on edge
- devices. Our research powers [EchoStream AI](https://www.echostream-ai.com/) — a product line
- bringing on-device AI capabilities to real-world applications.

- `Edge AI` · `Privacy-First` · `Open Research`
-
- ---
 
+ ---
+ title: README
+ emoji:
+ colorFrom: indigo
+ colorTo: purple
+ sdk: static
+ pinned: false
+ license: mit
+ ---

+ # AtomGradient — Bringing AI to the Edge

+ **We are an independent research group dedicated to making AI run efficiently on edge devices.**
+ We believe powerful AI should be private, accessible, and free from cloud dependency. All our research is open-source.

+ 🌐 [atomgradient.com](https://atomgradient.com) · 🐙 [GitHub](https://github.com/AtomGradient) · 🚀 [EchoStream AI](https://www.echostream-ai.com/)

+ ---

+ ## Research

+ ### [Prism — Cross-Domain Personal Data Integration on Consumer Hardware](https://atomgradient.github.io/Prism/)

+ Integrating finance, diet, mood, and reading data entirely on consumer Apple Silicon, producing emergent cross-domain insights with zero data leakage.

+ - 📈 **1.48x** cross-domain insight emergence (IIR)
+ - 🔒 **125.5x** federation compression, zero data leakage
+ - ⚡ **49.9 TPS** real-time inference (35B on M2 Ultra)

+ [[GitHub]](https://github.com/AtomGradient/Prism) · [[Paper]](https://atomgradient.github.io/Prism/)

+ ---

+ ### [ANE Batch Prefill — On-Device Parallel LLM Inference](https://atomgradient.github.io/hybird-batch-prefill-on-ane/)

+ Fused matrix-vector kernels enabling concurrent ANE batch prefill + GPU decode on Apple Silicon for Qwen3.5 models.

+ - 🚀 **11.3x** ANE batch prefill speedup (268 tok/s)
+ - 🔋 **79%** power reduction for prefill component
+ - ⏱️ **<30 ms** state transfer overhead

+ [[GitHub]](https://github.com/AtomGradient/hybird-batch-prefill-on-ane) · [[Paper]](https://atomgradient.github.io/hybird-batch-prefill-on-ane/)

+ ---

+ ### [hybrid-ane-mlx-bench — Disaggregated LLM Inference on Apple Silicon](https://atomgradient.github.io/hybrid-ane-mlx-bench/)

+ Benchmarking CoreML ANE prefill + MLX GPU decode for Qwen3.5 on Apple Silicon, with four inference strategies compared.

+ - 🔄 ANE prefill matches GPU at **~410 tokens**
+ - 🔋 **282x** GPU power reduction during prefill
+ - 📊 4 inference pipelines benchmarked

+ [[GitHub]](https://github.com/AtomGradient/hybrid-ane-mlx-bench) · [[Paper]](https://atomgradient.github.io/hybrid-ane-mlx-bench/)

+ ---

+ ### [swift-qwen3-tts — On-Device Text-to-Speech](https://atomgradient.github.io/swift-qwen3-tts/)

+ Native Swift implementation of Qwen3 TTS 0.6B for real-time, on-device speech synthesis.

+ - 📦 **67%** model compression (2.35 GB → 808 MB)
+ - 🎙️ Real-time synthesis (**RTF 0.68x**)
+ - 🌍 12 languages supported

+ [[GitHub]](https://github.com/AtomGradient/swift-qwen3-tts) · [[Paper]](https://atomgradient.github.io/swift-qwen3-tts/)

+ ---

+ ### [Gemma-Prune — On-Device Vision Language Model](https://atomgradient.github.io/swift-gemma-cli/)

+ Multi-stage compression pipeline for deploying Gemma 3 4B VLM on consumer hardware.

+ - 📦 **25%** model compression (2.8 GB → 2.1 GB)
+ - 📝 **110 tok/s** text generation
+ - 🖼️ **3.4x** image processing speedup

+ [[GitHub]](https://github.com/AtomGradient/swift-gemma-cli) · [[Paper]](https://atomgradient.github.io/swift-gemma-cli/)

+ ---

+ ### [OptMLX — MLX Memory Optimization Research](https://atomgradient.github.io/OptMLX/)

+ Exploring memory optimization techniques for the MLX framework on Apple Silicon.

+ - ⚡ Up to **20x** faster mmap loading
+ - 🔄 Zero-copy model loading
+ - 📊 Comprehensive benchmarks

+ [[GitHub]](https://github.com/AtomGradient/OptMLX) · [[Paper]](https://atomgradient.github.io/OptMLX/)

+ ---

+ ## About

+ AtomGradient is an independent research group dedicated to making AI run efficiently on edge devices. Our research powers [EchoStream AI](https://www.echostream-ai.com/) — a product line bringing on-device AI capabilities to real-world applications.

+ `Edge AI` · `Privacy-First` · `Open Research`