Spaces: AtomGradient/README


ShuhongWu committed 7 days ago

Commit f873876 (verified) · 1 parent: 1ae70e9

Update README.md

Files changed (1)

README.md +60 -82


@@ -1,120 +1,98 @@
-  ---
-  ---
-  title: README
-  emoji: ⚡
-  colorFrom: indigo
-  colorTo: purple
-  sdk: static
-  pinned: false
-  license: mit
-  ---
-  # AtomGradient — Bringing AI to the Edge
-  **We are an independent research group dedicated to making AI run efficiently on edge
-  devices.**
-  We believe powerful AI should be private, accessible, and free from cloud dependency. All our
-  research is open-source.
-  🌐 [atomgradient.com](https://atomgradient.com) · 🐙 [GitHub](https://github.com/AtomGradient)
-  · 🚀 [EchoStream AI](https://www.echostream-ai.com/)
-  ---
-  ## Research
-  ### [Prism — Cross-Domain Personal Data Integration on Consumer
-  Hardware](https://atomgradient.github.io/Prism/)
-  Integrating finance, diet, mood, and reading data entirely on consumer Apple Silicon, producing
-   emergent cross-domain insights with zero data leakage.
-  - 📈 **1.48x** cross-domain insight emergence (IIR)
-  - 🔒 **125.5x** federation compression, zero data leakage
-  - ⚡ **49.9 TPS** real-time inference (35B on M2 Ultra)
-  [[GitHub]](https://github.com/AtomGradient/Prism) ·
-  [[Paper]](https://atomgradient.github.io/Prism/)
-  ---
-  ### [ANE Batch Prefill — On-Device Parallel LLM
-  Inference](https://atomgradient.github.io/hybird-batch-prefill-on-ane/)
-  Fused matrix-vector kernels enabling concurrent ANE batch prefill + GPU decode on Apple Silicon
-   for Qwen3.5 models.
-  - 🚀 **11.3x** ANE batch prefill speedup (268 tok/s)
-  - 🔋 **79%** power reduction for prefill component
-  - ⏱️  **<30 ms** state transfer overhead
-  [[GitHub]](https://github.com/AtomGradient/hybird-batch-prefill-on-ane) ·
-  [[Paper]](https://atomgradient.github.io/hybird-batch-prefill-on-ane/)
-  ---
-  ### [hybrid-ane-mlx-bench — Disaggregated LLM Inference on Apple
-  Silicon](https://atomgradient.github.io/hybrid-ane-mlx-bench/)
-  Benchmarking CoreML ANE prefill + MLX GPU decode for Qwen3.5 on Apple Silicon, with four
-  inference strategies compared.
-  - 🔄 ANE prefill matches GPU at **~410 tokens**
-  - 🔋 **282x** GPU power reduction during prefill
-  - 📊 4 inference pipelines benchmarked
-  [[GitHub]](https://github.com/AtomGradient/hybrid-ane-mlx-bench) ·
-  [[Paper]](https://atomgradient.github.io/hybrid-ane-mlx-bench/)
-  ---
-  ### [swift-qwen3-tts — On-Device
-  Text-to-Speech](https://atomgradient.github.io/swift-qwen3-tts/)
-  Native Swift implementation of Qwen3 TTS 0.6B for real-time, on-device speech synthesis.
-  - 📦 **67%** model compression (2.35 GB → 808 MB)
-  - 🎙️  Real-time synthesis (**RTF 0.68x**)
-  - 🌍 12 languages supported
-  [[GitHub]](https://github.com/AtomGradient/swift-qwen3-tts) ·
-  [[Paper]](https://atomgradient.github.io/swift-qwen3-tts/)
-  ---
-  ### [Gemma-Prune — On-Device Vision Language
-  Model](https://atomgradient.github.io/swift-gemma-cli/)
-  Multi-stage compression pipeline for deploying Gemma 3 4B VLM on consumer hardware.
-  - 📦 **25%** model compression (2.8 GB → 2.1 GB)
-  - 📝 **110 tok/s** text generation
-  - 🖼️  **3.4x** image processing speedup
-  [[GitHub]](https://github.com/AtomGradient/swift-gemma-cli) ·
-  [[Paper]](https://atomgradient.github.io/swift-gemma-cli/)
-  ---
-  ### [OptMLX — MLX Memory Optimization Research](https://atomgradient.github.io/OptMLX/)
-  Exploring memory optimization techniques for the MLX framework on Apple Silicon.
-  - ⚡ Up to **20x** faster mmap loading
-  - 🔄 Zero-copy model loading
-  - 📊 Comprehensive benchmarks
-  [[GitHub]](https://github.com/AtomGradient/OptMLX) ·
-  [[Paper]](https://atomgradient.github.io/OptMLX/)
-  ---
-  ## About
-  AtomGradient is an independent research group dedicated to making AI run efficiently on edge
-  devices. Our research powers [EchoStream AI](https://www.echostream-ai.com/) — a product line
-  bringing on-device AI capabilities to real-world applications.
-  `Edge AI` · `Privacy-First` · `Open Research`
-  ---

+---
+title: README
+emoji: ⚡
+colorFrom: indigo
+colorTo: purple
+sdk: static
+pinned: false
+license: mit
+---
+# AtomGradient — Bringing AI to the Edge
+**We are an independent research group dedicated to making AI run efficiently on edge devices.**
+We believe powerful AI should be private, accessible, and free from cloud dependency. All our research is open-source.
+🌐 [atomgradient.com](https://atomgradient.com) · 🐙 [GitHub](https://github.com/AtomGradient) · 🚀 [EchoStream AI](https://www.echostream-ai.com/)
+---
+## Research
+### [Prism — Cross-Domain Personal Data Integration on Consumer Hardware](https://atomgradient.github.io/Prism/)
+Integrating finance, diet, mood, and reading data entirely on consumer Apple Silicon, producing emergent cross-domain insights with zero data leakage.
+- 📈 **1.48x** cross-domain insight emergence (IIR)
+- 🔒 **125.5x** federation compression, zero data leakage
+- ⚡ **49.9 TPS** real-time inference (35B on M2 Ultra)
+[[GitHub]](https://github.com/AtomGradient/Prism) · [[Paper]](https://atomgradient.github.io/Prism/)
+---
+### [ANE Batch Prefill — On-Device Parallel LLM Inference](https://atomgradient.github.io/hybird-batch-prefill-on-ane/)
+Fused matrix-vector kernels enabling concurrent ANE batch prefill + GPU decode on Apple Silicon for Qwen3.5 models.
+- 🚀 **11.3x** ANE batch prefill speedup (268 tok/s)
+- 🔋 **79%** power reduction for prefill component
+- ⏱️ **<30 ms** state transfer overhead
+[[GitHub]](https://github.com/AtomGradient/hybird-batch-prefill-on-ane) · [[Paper]](https://atomgradient.github.io/hybird-batch-prefill-on-ane/)
+---
+### [hybrid-ane-mlx-bench — Disaggregated LLM Inference on Apple Silicon](https://atomgradient.github.io/hybrid-ane-mlx-bench/)
+Benchmarking CoreML ANE prefill + MLX GPU decode for Qwen3.5 on Apple Silicon, with four inference strategies compared.
+- 🔄 ANE prefill matches GPU at **~410 tokens**
+- 🔋 **282x** GPU power reduction during prefill
+- 📊 4 inference pipelines benchmarked
+[[GitHub]](https://github.com/AtomGradient/hybrid-ane-mlx-bench) · [[Paper]](https://atomgradient.github.io/hybrid-ane-mlx-bench/)
+---
+### [swift-qwen3-tts — On-Device Text-to-Speech](https://atomgradient.github.io/swift-qwen3-tts/)
+Native Swift implementation of Qwen3 TTS 0.6B for real-time, on-device speech synthesis.
+- 📦 **67%** model compression (2.35 GB → 808 MB)
+- 🎙️ Real-time synthesis (**RTF 0.68x**)
+- 🌍 12 languages supported
+[[GitHub]](https://github.com/AtomGradient/swift-qwen3-tts) · [[Paper]](https://atomgradient.github.io/swift-qwen3-tts/)
+---
+### [Gemma-Prune — On-Device Vision Language Model](https://atomgradient.github.io/swift-gemma-cli/)
+Multi-stage compression pipeline for deploying Gemma 3 4B VLM on consumer hardware.
+- 📦 **25%** model compression (2.8 GB → 2.1 GB)
+- 📝 **110 tok/s** text generation
+- 🖼️ **3.4x** image processing speedup
+[[GitHub]](https://github.com/AtomGradient/swift-gemma-cli) · [[Paper]](https://atomgradient.github.io/swift-gemma-cli/)
+---
+### [OptMLX — MLX Memory Optimization Research](https://atomgradient.github.io/OptMLX/)
+Exploring memory optimization techniques for the MLX framework on Apple Silicon.
+- ⚡ Up to **20x** faster mmap loading
+- 🔄 Zero-copy model loading
+- 📊 Comprehensive benchmarks
+[[GitHub]](https://github.com/AtomGradient/OptMLX) · [[Paper]](https://atomgradient.github.io/OptMLX/)
+---
+## About
+AtomGradient is an independent research group dedicated to making AI run efficiently on edge devices. Our research powers [EchoStream AI](https://www.echostream-ai.com/) — a product line bringing on-device AI capabilities to real-world applications.
+`Edge AI` · `Privacy-First` · `Open Research`