Amshaker committed on
Commit 3d45c04 · verified · 1 Parent(s): 1d3c5e1

Create README.md

Files changed (1)
  1. README.md +108 -0
README.md ADDED
@@ -0,0 +1,108 @@
+ ---
+ license: cc-by-nc-4.0
+ tags:
+ - mobile-o
+ - multimodal
+ - unified-model
+ - ios
+ - coreml
+ - mlx
+ - on-device
+ - mobile
+ - edge-ai
+ pipeline_tag: image-text-to-text
+ ---
+
+ <div align="center">
+
+ <h1>
+ <img src="https://github.com/Amshaker/Mobile-O/blob/main/assets/mobile-o-logo.png?raw=true" width="30" /> Mobile-O-0.5B-iOS
+ </h1>
+
+ **Optimized MLX & CoreML Components for On-Device Deployment**
+
+ <p>
+ <a href="https://arxiv.org/abs/XXXX.XXXXX"><img src="https://img.shields.io/badge/arXiv-XXXX.XXXXX-b31b1b.svg" alt="arXiv"></a>
+ <a href="https://github.com/Amshaker/Mobile-O"><img src="https://img.shields.io/badge/GitHub-Code-black.svg" alt="Code"></a>
+ <a href="https://amshaker.github.io/Mobile-O/"><img src="https://img.shields.io/badge/🌐-Project_Page-2563eb.svg" alt="Project Page"></a>
+ <a href="https://mobileo.cvmbzuai.com/"><img src="https://img.shields.io/badge/🚀-Live_Demo-10b981.svg" alt="Demo"></a>
+ <a href="https://apps.apple.com/app/XXXXXXXXXX"><img src="https://img.shields.io/badge/-App_Store-black.svg" alt="App Store"></a>
+ </p>
+
+ </div>
+
+ ## 📌 Overview
+
+ This repository contains the optimized **MLX** and **CoreML** model components of [Mobile-O-0.5B](https://huggingface.co/Amshaker/Mobile-O-0.5B) for native iOS deployment. These components power the [Mobile-O iOS app](https://github.com/Amshaker/Mobile-O/tree/main/Mobile-O-App), enabling fully on-device multimodal understanding and image generation with no cloud dependency.
+
+ ## 📱 On-Device Performance
+
+ | Spec | Detail |
+ |------|--------|
+ | ⚡ Image Generation | ~3 seconds |
+ | 👁️ Visual Understanding | ~0.4 seconds |
+ | 💾 Memory Footprint | < 2 GB |
+ | 📱 Compatible Devices | iPhone (A17+ / M-series) |
+ | 🔒 Cloud Dependency | None (fully on-device) |
+
+ ## 📦 Contents
+
+ This repo includes optimized model components in both **MLX** and **CoreML** formats:
+
+ | Component | Format | Description |
+ |-----------|--------|-------------|
+ | **VLM** | MLX / CoreML | FastVLM-0.5B (FastViT + Qwen2-0.5B) |
+ | **Diffusion Decoder** | MLX / CoreML | SANA-600M-512 (Linear DiT + VAE) |
+ | **MCP** | MLX / CoreML | Mobile Conditioning Projector (~2.4M params) |
+
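As a rough sanity check on the memory figure quoted in the performance table, the sketch below estimates weight storage for the three components at a few precisions. The parameter counts are nominal values read off the component names in the table (0.5B, 600M, ~2.4M), and the byte widths are illustrative assumptions. Activations, runtime overhead, and any parameters not reflected in those names are ignored, and nothing here describes how the released weights are actually stored.

```python
# Nominal parameter counts taken from the component names in the table.
# These are approximations, not measured values.
NOMINAL_PARAMS = {
    "VLM (FastVLM-0.5B)": 0.5e9,
    "Diffusion Decoder (SANA-600M-512)": 600e6,
    "MCP": 2.4e6,
}

# Illustrative storage widths in bytes per parameter (assumed, not from the repo).
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_gigabytes(params: float, bytes_per_param: float) -> float:
    """Return weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def estimate_totals() -> dict:
    """Estimate total weight storage per precision across all components."""
    total_params = sum(NOMINAL_PARAMS.values())
    return {
        name: weight_gigabytes(total_params, width)
        for name, width in BYTES_PER_PARAM.items()
    }


if __name__ == "__main__":
    for precision, gigabytes in estimate_totals().items():
        print(f"{precision}: ~{gigabytes:.2f} GB of weights")
```

Under these assumptions, 16-bit weights alone would come to roughly 2.2 GB, so a footprint below 2 GB would suggest lower-precision storage or components that are not all resident at once. That is an inference from nominal numbers, not a documented detail of this repository.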
+ ## 🚀 Usage
+
+ ### With the iOS App
+
+ 1. Clone the [Mobile-O repo](https://github.com/Amshaker/Mobile-O)
+ 2. Navigate to the `Mobile-O-App/` directory
+ 3. Download this model repo into the app's model directory
+ 4. Build and run in Xcode
+
+ ```bash
+ git clone https://github.com/Amshaker/Mobile-O.git
+ cd Mobile-O/Mobile-O-App
+ ```
+
+ Refer to the [Mobile-O-App README](https://github.com/Amshaker/Mobile-O/tree/main/Mobile-O-App) for detailed setup instructions.
+
+ ### Download Models
+
+ ```python
+ from huggingface_hub import snapshot_download
+
+ snapshot_download(
+ repo_id="Amshaker/Mobile-O-0.5B-iOS",
+ repo_type="model",
+ local_dir="ios_models"
+ )
+ ```
+
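After the download finishes, it can be useful to check what actually landed in `ios_models`. The following is a minimal sketch using only the standard library. The helper name `summarize_model_dir` is hypothetical, and the code makes no assumption about file names or layout inside the repository: it simply groups file sizes by extension.

```python
from collections import defaultdict
from pathlib import Path


def summarize_model_dir(root: str) -> dict:
    """Return total bytes per file suffix for every file under root.

    Hypothetical helper for inspecting a downloaded snapshot; files
    without a suffix are grouped under "(none)".
    """
    totals = defaultdict(int)
    for path in Path(root).rglob("*"):
        if path.is_file():
            suffix = path.suffix.lower() or "(none)"
            totals[suffix] += path.stat().st_size
    return dict(totals)


if __name__ == "__main__":
    local_dir = Path("ios_models")  # same local_dir as in the download call
    if local_dir.is_dir():
        summary = summarize_model_dir(str(local_dir))
        # Largest groups first, sizes shown in decimal megabytes.
        for suffix, size in sorted(summary.items(), key=lambda kv: -kv[1]):
            print(f"{suffix:>12}  {size / 1e6:10.1f} MB")
    else:
        print(f"{local_dir} not found; run the download step first")
```

Because the walk is recursive, files inside package-style directories and any cache metadata that the download leaves in `local_dir` are counted under their own extensions.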
+ ## 🔗 Related Resources
+
+ | Resource | Link |
+ |----------|------|
+ | 🤗 Mobile-O-0.5B | [PyTorch Model](https://huggingface.co/Amshaker/Mobile-O-0.5B) |
+ | 🤗 Mobile-O-1.5B | [PyTorch Model](https://huggingface.co/Amshaker/Mobile-O-1.5B) |
+ | 📱 iOS App Source Code | [Mobile-O-App](https://github.com/Amshaker/Mobile-O/tree/main/Mobile-O-App) |
+ | 🤗 Training Datasets | [Collection](https://huggingface.co/collections/Amshaker/mobile-o-datasets) |
+
+ ## 📄 Citation
+
+ ```bibtex
+ @article{shaker2026mobileo,
+ title={Mobile-O: Unified Multimodal Understanding and Generation on Mobile Device},
+ author={Shaker, Abdelrahman and Heakl, Ahmed and Muhammad, Jaseel and Thawkar, Ritesh and Thawakar, Omkar and Li, Senmao and Cholakkal, Hisham and Reid, Ian and Xing, Eric P. and Khan, Salman and Khan, Fahad Shahbaz},
+ journal={arXiv preprint arXiv:XXXX.XXXXX},
+ year={2026}
+ }
+ ```
+
+ ## ⚖️ License
+
+ Released under [CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/). For research purposes only.