Commit c21a8a2
Parent(s):
Duplicate from NexaAI/OmniNeural-4B
Co-authored-by: Alex Chen <alexchen4ai@users.noreply.huggingface.co>
- .gitattributes +46 -0
- LICENSE +85 -0
- README.md +188 -0
- assets/MOBILE_50MB.mp4 +3 -0
- assets/PC_Demo_Agent.mov +3 -0
- assets/PC_Demo_Audio.mov +3 -0
- assets/PC_demo_2_image.mov +3 -0
- audio/attachments-3-3.nexa +0 -0
- audio/htp_backend_ext_config.json +31 -0
- config.json +0 -0
- files-1-1.nexa +3 -0
- llm/attachments-1-3.nexa +0 -0
- llm/htp_backend_ext_config.json +21 -0
- nexa.manifest +5 -0
- vit/attachments-2-3.nexa +0 -0
- vit/htp_backend_ext_config.json +31 -0
- vlm/attachments-2-3.nexa +0 -0
- vlm/htp_backend_ext_config.json +31 -0
- weights-1-8.nexa +3 -0
- weights-2-8.nexa +3 -0
- weights-3-8.nexa +3 -0
- weights-4-8.nexa +3 -0
- weights-5-8.nexa +3 -0
- weights-6-8.nexa +3 -0
- weights-7-8.nexa +3 -0
- weights-8-8.nexa +3 -0
.gitattributes
ADDED
|
@@ -0,0 +1,46 @@
| 1 |
+
*.7z filter=lfs diff=lfs merge=lfs -text
|
| 2 |
+
*.arrow filter=lfs diff=lfs merge=lfs -text
|
| 3 |
+
*.bin filter=lfs diff=lfs merge=lfs -text
|
| 4 |
+
*.bz2 filter=lfs diff=lfs merge=lfs -text
|
| 5 |
+
*.ckpt filter=lfs diff=lfs merge=lfs -text
|
| 6 |
+
*.ftz filter=lfs diff=lfs merge=lfs -text
|
| 7 |
+
*.gz filter=lfs diff=lfs merge=lfs -text
|
| 8 |
+
*.h5 filter=lfs diff=lfs merge=lfs -text
|
| 9 |
+
*.joblib filter=lfs diff=lfs merge=lfs -text
|
| 10 |
+
*.lfs.* filter=lfs diff=lfs merge=lfs -text
|
| 11 |
+
*.mlmodel filter=lfs diff=lfs merge=lfs -text
|
| 12 |
+
*.model filter=lfs diff=lfs merge=lfs -text
|
| 13 |
+
*.msgpack filter=lfs diff=lfs merge=lfs -text
|
| 14 |
+
*.npy filter=lfs diff=lfs merge=lfs -text
|
| 15 |
+
*.npz filter=lfs diff=lfs merge=lfs -text
|
| 16 |
+
*.onnx filter=lfs diff=lfs merge=lfs -text
|
| 17 |
+
*.ot filter=lfs diff=lfs merge=lfs -text
|
| 18 |
+
*.parquet filter=lfs diff=lfs merge=lfs -text
|
| 19 |
+
*.pb filter=lfs diff=lfs merge=lfs -text
|
| 20 |
+
*.pickle filter=lfs diff=lfs merge=lfs -text
|
| 21 |
+
*.pkl filter=lfs diff=lfs merge=lfs -text
|
| 22 |
+
*.pt filter=lfs diff=lfs merge=lfs -text
|
| 23 |
+
*.pth filter=lfs diff=lfs merge=lfs -text
|
| 24 |
+
*.rar filter=lfs diff=lfs merge=lfs -text
|
| 25 |
+
*.safetensors filter=lfs diff=lfs merge=lfs -text
|
| 26 |
+
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
|
| 27 |
+
*.tar.* filter=lfs diff=lfs merge=lfs -text
|
| 28 |
+
*.tar filter=lfs diff=lfs merge=lfs -text
|
| 29 |
+
*.tflite filter=lfs diff=lfs merge=lfs -text
|
| 30 |
+
*.tgz filter=lfs diff=lfs merge=lfs -text
|
| 31 |
+
*.wasm filter=lfs diff=lfs merge=lfs -text
|
| 32 |
+
*.xz filter=lfs diff=lfs merge=lfs -text
|
| 33 |
+
*.zip filter=lfs diff=lfs merge=lfs -text
|
| 34 |
+
*.zst filter=lfs diff=lfs merge=lfs -text
|
| 35 |
+
*tfevents* filter=lfs diff=lfs merge=lfs -text
|
| 36 |
+
files-1-1.nexa filter=lfs diff=lfs merge=lfs -text
|
| 37 |
+
weights-1-8.nexa filter=lfs diff=lfs merge=lfs -text
|
| 38 |
+
weights-2-8.nexa filter=lfs diff=lfs merge=lfs -text
|
| 39 |
+
weights-3-8.nexa filter=lfs diff=lfs merge=lfs -text
|
| 40 |
+
weights-4-8.nexa filter=lfs diff=lfs merge=lfs -text
|
| 41 |
+
weights-5-8.nexa filter=lfs diff=lfs merge=lfs -text
|
| 42 |
+
weights-6-8.nexa filter=lfs diff=lfs merge=lfs -text
|
| 43 |
+
weights-7-8.nexa filter=lfs diff=lfs merge=lfs -text
|
| 44 |
+
weights-8-8.nexa filter=lfs diff=lfs merge=lfs -text
|
| 45 |
+
tokenizer.json filter=lfs diff=lfs merge=lfs -text
|
| 46 |
+
files-1-1.json filter=lfs diff=lfs merge=lfs -text
|
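Reviewer note: the rules above route matching paths through Git LFS. The sketch below is an approximation only, not Git's actual matcher. It checks slash-free patterns against a path's basename using Python's `fnmatch`, and it ignores directory patterns such as `saved_model/**/*`. The `LFS_PATTERNS` subset and the helper name `is_lfs_tracked` are chosen here for illustration.

```python
from fnmatch import fnmatch
from posixpath import basename

# A hand-picked subset of the patterns added to .gitattributes in this commit.
LFS_PATTERNS = [
    "*.bin",
    "*.safetensors",
    "*.zip",
    "files-1-1.nexa",
    "weights-1-8.nexa",
    "tokenizer.json",
]


def is_lfs_tracked(path: str, patterns=LFS_PATTERNS) -> bool:
    """Approximate check: slash-free patterns are matched against the basename."""
    name = basename(path)
    return any(fnmatch(name, pattern) for pattern in patterns)


if __name__ == "__main__":
    for candidate in ["weights-1-8.nexa", "nexa.manifest", "sub/model.bin"]:
        print(candidate, is_lfs_tracked(candidate))
```

Under these rules the explicitly named `weights-*-8.nexa` shards go through LFS, while small files such as `nexa.manifest` stay as regular Git blobs.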
LICENSE
ADDED
|
@@ -0,0 +1,85 @@
| 1 |
+
NEXA AI LICENSE AGREEMENT
|
| 2 |
+
Release Date: August 17, 2025
|
| 3 |
+
|
| 4 |
+
Copyright © 2025 Nexa AI, Inc., a Delaware corporation with its principal place of business in Cupertino, California, USA.
|
| 5 |
+
|
| 6 |
+
Friendly Summary (not a substitute for the terms below)
|
| 7 |
+
- You can use, modify, and share our models (and code) for research, evaluation, learning, and personal projects.
|
| 8 |
+
- Please credit us when your released model is trained or improved using our Outputs (“Built with Nexa” or “Improved using Nexa”).
|
| 9 |
+
- Business/paid use isn’t covered — reach out for a commercial license.
|
| 10 |
+
- No warranties; use at your own risk.
|
| 11 |
+
|
| 12 |
+
0. Purpose
|
| 13 |
+
This License is designed to be developer-friendly. It enables non-commercial research, experimentation, benchmarking, education, and **personal** use of the Nexa Materials. Commercial use requires a separate written agreement with Nexa.
|
| 14 |
+
|
| 15 |
+
1. Definitions
|
| 16 |
+
1.1 “Nexa Materials” means any models, model weights, parameters (including optimizer states), software, algorithms, documentation, and related artifacts that Nexa makes available under this License.
|
| 17 |
+
1.2 “Output” means any result produced by the Nexa Materials (e.g., text, code, images, embeddings, logits, metrics).
|
| 18 |
+
1.3 “Source Form” means the preferred form for making modifications (e.g., source code, documentation source, configuration files).
|
| 19 |
+
1.4 “Object Form” means any form resulting from mechanical transformation, compilation, or conversion of a Source Form (e.g., binaries, converted weights).
|
| 20 |
+
1.5 “Non-Commercial Use” means use that is not intended for direct or indirect commercial advantage or monetary compensation and is not integrated into any paid, revenue-generating, or production service or product.
|
| 21 |
+
1.6 “Personal Use” means Non-Commercial Use by a natural person, not acting on behalf of a company or organization. Examples include classwork, portfolio projects, hackathon prototypes, research notebooks, open-source demos, and personal apps that are distributed for free and without monetization.
|
| 22 |
+
|
| 23 |
+
2. License Grant (Non-Commercial + Personal)
|
| 24 |
+
Subject to this License, Nexa grants you a non-exclusive, worldwide, non-transferable, royalty-free license to:
|
| 25 |
+
(a) use, reproduce, and modify the Nexa Materials; and
|
| 26 |
+
(b) distribute the Nexa Materials and your modifications and derivative works,
|
| 27 |
+
in each case solely for Non-Commercial Use (including Personal Use).
|
| 28 |
+
|
| 29 |
+
Commercial Use is not permitted under this License. To engage in any Commercial Use, please contact Nexa for a separate license.
|
| 30 |
+
|
| 31 |
+
3. Redistribution Conditions
|
| 32 |
+
You may distribute the Nexa Materials or your derivative works for Non-Commercial Use in Source Form or Object Form if you:
|
| 33 |
+
3.1 Include a copy of this License with each distribution;
|
| 34 |
+
3.2 Clearly mark files you modified, including a brief description and date of change; and
|
| 35 |
+
3.3 Retain the following attribution in a NOTICE or equivalent file:
|
| 36 |
+
“Includes or is derived from Nexa Materials licensed under the Nexa AI Research. © Nexa AI, Inc. All rights reserved.”
|
| 37 |
+
You may apply additional or different license terms to your own modifications or derivative works as a whole, provided your terms do not grant rights that exceed or conflict with this License with respect to the Nexa Materials.
|
| 38 |
+
|
| 39 |
+
4. Attribution for Models Built Using Outputs
|
| 40 |
+
If you use Outputs to train, fine-tune, distill, or otherwise improve a model that you distribute or make available (free or paid), include a clear credit in documentation (e.g., README, model card, product docs) as:
|
| 41 |
+
“Built with Nexa” or “Improved using Nexa.”
|
| 42 |
+
For interactive products, a reasonable placement is an About/Credits page or documentation site.
|
| 43 |
+
|
| 44 |
+
5. Intellectual Property; No Patent License
|
| 45 |
+
5.1 Nexa retains all intellectual property rights in the Nexa Materials and in derivatives created by or for Nexa.
|
| 46 |
+
5.2 You retain rights in your original modifications and derivative works you create, subject to this License.
|
| 47 |
+
5.3 No patent rights are granted by this License, whether by implication, estoppel, or otherwise.
|
| 48 |
+
|
| 49 |
+
6. Trademarks
|
| 50 |
+
This License does not grant permission to use Nexa’s names, logos, or trademarks, except for the limited factual attributions required by Section 4 or customary references to the origin of the Nexa Materials.
|
| 51 |
+
|
| 52 |
+
7. Legal Compliance and Export Controls
|
| 53 |
+
You must comply with all applicable laws and regulations, including U.S. export control and sanctions laws and any local equivalents, in connection with your access to and use of the Nexa Materials and Outputs.
|
| 54 |
+
|
| 55 |
+
8. Acceptable Use and Responsibility
|
| 56 |
+
You are solely responsible for your use of the Nexa Materials and Outputs, including datasets and prompts you supply, and for ensuring compliance with privacy, data protection, security, and other applicable laws. Do not use the Nexa Materials or Outputs to violate rights of others or applicable law.
|
| 57 |
+
|
| 58 |
+
9. Indemnification
|
| 59 |
+
To the maximum extent permitted by law, you agree to defend, indemnify, and hold harmless Nexa and its affiliates from and against any claims, liabilities, damages, losses, and expenses (including reasonable attorneys’ fees) arising out of or related to your use, modification, or distribution of the Nexa Materials or Outputs.
|
| 60 |
+
|
| 61 |
+
10. Warranty Disclaimer
|
| 62 |
+
THE NEXA MATERIALS AND OUTPUTS ARE PROVIDED “AS IS” AND “AS AVAILABLE.” NEXA DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING WITHOUT LIMITATION WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, NON-INFRINGEMENT, ACCURACY, OR PERFORMANCE. YOU BEAR ALL RISK AS TO THE RESULTS AND PERFORMANCE OF THE NEXA MATERIALS AND OUTPUTS.
|
| 63 |
+
|
| 64 |
+
11. Limitation of Liability
|
| 65 |
+
TO THE MAXIMUM EXTENT PERMITTED BY LAW, NEXA SHALL NOT BE LIABLE FOR ANY INDIRECT, INCIDENTAL, SPECIAL, CONSEQUENTIAL, EXEMPLARY, OR PUNITIVE DAMAGES, OR FOR ANY LOSS OF PROFITS, REVENUE, DATA, OR GOODWILL, ARISING OUT OF OR RELATING TO THIS LICENSE OR THE NEXA MATERIALS OR OUTPUTS, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. NEXA’S TOTAL AGGREGATE LIABILITY FOR DIRECT DAMAGES SHALL NOT EXCEED ONE HUNDRED U.S. DOLLARS (US$100).
|
| 66 |
+
|
| 67 |
+
12. Defensive Termination
|
| 68 |
+
If you or any entity you control initiates any legal claim alleging that the Nexa Materials or Outputs infringe your intellectual property rights, this License automatically terminates as of the filing date of such claim.
|
| 69 |
+
|
| 70 |
+
13. Term and Termination
|
| 71 |
+
13.1 This License takes effect when you access the Nexa Materials and continues unless terminated as provided herein.
|
| 72 |
+
13.2 Nexa may terminate this License if you breach any of its terms. Upon termination, you must cease all use and distribution of the Nexa Materials and delete all copies under your possession or control.
|
| 73 |
+
13.3 Sections 5–16 survive termination.
|
| 74 |
+
|
| 75 |
+
14. Updates
|
| 76 |
+
Nexa may provide updates to the Nexa Materials or publish new versions of this License. Updated Nexa Materials will be governed by the license accompanying them. Continued use of updated Nexa Materials constitutes acceptance of the then-current license.
|
| 77 |
+
|
| 78 |
+
15. Governing Law and Venue
|
| 79 |
+
This License is governed by the laws of the State of Delaware, excluding its conflicts-of-law rules.
|
| 80 |
+
|
| 81 |
+
16. Contact
|
| 82 |
+
Questions about this License or commercial use? Email dev@nexa.ai (or your usual Nexa contact).
|
| 83 |
+
|
| 84 |
+
Attribution Snippet (for NOTICE)
|
| 85 |
+
“Nexa Materials are licensed under the Nexa AI Research © Nexa AI, Inc., Cupertino, CA, USA. All rights reserved.”
|
README.md
ADDED
|
@@ -0,0 +1,188 @@
| 1 |
+
---
|
| 2 |
+
tags:
|
| 3 |
+
- multimodal
|
| 4 |
+
- NPU
|
| 5 |
+
- On-device
|
| 6 |
+
- Snapdragon PC
|
| 7 |
+
- Android
|
| 8 |
+
license: cc-by-4.0
|
| 9 |
+
license_name: nexa-research
|
| 10 |
+
license_link: LICENSE
|
| 11 |
+
pipeline_tag: any-to-any
|
| 12 |
+
---
|
| 13 |
+
<p align="center">
|
| 14 |
+
<img alt="omnineural" src="https://cdn-uploads.huggingface.co/production/uploads/6618e0424dbef6bd3c72f89a/zRUnoWmw43fl9hrXHg0pE.png">
|
| 15 |
+
</p>
|
| 16 |
+
|
| 17 |
+
# **OmniNeural** — World’s First NPU-aware Multimodal Model
|
| 18 |
+
|
| 19 |
+
|
| 20 |
+
## **Overview**
|
| 21 |
+
**OmniNeural** is the first fully multimodal model designed specifically for Neural Processing Units (NPUs). It natively understands **text, images, and audio**, and runs across PCs, mobile devices, automotive systems, IoT, and robotics.
|
| 22 |
+
|
| 23 |
+
## Demos
|
| 24 |
+
|
| 25 |
+
### 📱 Mobile Phone NPU - Demo on Samsung S25 Ultra
|
| 26 |
+
The first-ever fully local, multimodal, and conversational AI assistant that hears you and sees what you see, running **natively on Snapdragon NPU** for long battery life and low latency.
|
| 27 |
+
|
| 28 |
+
<video controls width="720" preload="metadata"
|
| 29 |
+
src="https://huggingface.co/NexaAI/OmniNeural-4B/resolve/main/assets/MOBILE_50MB.mp4"
|
| 30 |
+
type="video/mp4"></video>
|
| 31 |
+
|
| 32 |
+
---
|
| 33 |
+
|
| 34 |
+
## ✨ PC NPU - Capabilities Highlights
|
| 35 |
+
|
| 36 |
+
<table>
|
| 37 |
+
<tr>
|
| 38 |
+
<td width="33%">
|
| 39 |
+
<video controls width="100%" preload="metadata"
|
| 40 |
+
src="https://huggingface.co/NexaAI/OmniNeural-4B/resolve/main/assets/PC_demo_2_image.mov"></video>
|
| 41 |
+
<p align="center"><b>🖼️ Multi-Image Reasoning</b><br>Spot the difference across two images in multi-round dialogue.</p>
|
| 42 |
+
</td>
|
| 43 |
+
|
| 44 |
+
<td width="33%">
|
| 45 |
+
<video controls width="100%" preload="metadata"
|
| 46 |
+
src="https://huggingface.co/NexaAI/OmniNeural-4B/resolve/main/assets/PC_Demo_Agent.mov"></video>
|
| 47 |
+
<p align="center"><b>🤖 Image + Text → Function Call</b><br>Snap a poster, add a text instruction, and the AI agent creates a calendar event.</p>
|
| 48 |
+
</td>
|
| 49 |
+
|
| 50 |
+
<td width="33%">
|
| 51 |
+
<video controls width="100%" preload="metadata"
|
| 52 |
+
src="https://huggingface.co/NexaAI/OmniNeural-4B/resolve/main/assets/PC_Demo_Audio.mov"></video>
|
| 53 |
+
<p align="center"><b>🎶 Multi-Audio Comparison</b><br>Tell the difference between two music clips locally.</p>
|
| 54 |
+
</td>
|
| 55 |
+
</tr>
|
| 56 |
+
</table>
|
| 57 |
+
|
| 58 |
+
|
| 59 |
+
|
| 60 |
+
---
|
| 61 |
+
|
| 62 |
+
## **Key Features**
|
| 63 |
+
- **Multimodal Intelligence** – Processes **text, image, and audio** in a unified model for richer reasoning and perception.
|
| 64 |
+
- **NPU-Optimized Architecture** – Uses ReLU ops, sparse tensors, convolutional layers, and static graph execution for maximum throughput: **20% faster than non-NPU-aware models**.
|
| 65 |
+
- **Hardware-Aware Attention** – Attention patterns tuned for NPUs, lowering compute and memory demand.
|
| 66 |
+
- **Native Static Graph** – Supports variable-length multimodal inputs with stable, predictable latency.
|
| 67 |
+
- **Performance Gains** – **9× faster audio processing** and **3.5× faster image processing** on NPUs compared to baseline encoders.
|
| 68 |
+
- **Privacy-First Inference** – All computation stays local: private, offline-capable, and cost-efficient.
|
| 69 |
+
|
| 70 |
+
---
|
| 71 |
+
|
| 72 |
+
## **Performance / Benchmarks**
|
| 73 |
+
### Human Evaluation (vs baselines)
|
| 74 |
+
- **Vision**: Wins/ties in ~75% of prompts against Apple Foundation, Gemma-3n-E4B, Qwen2.5-Omni-3B.
|
| 75 |
+
- **Audio**: Clear lead over baselines, including Gemma-3n-E4B and the Apple Foundation model.
|
| 76 |
+
- **Text**: Matches or outperforms leading multimodal baselines.
|
| 77 |
+
|
| 78 |
+
|
| 79 |
+
<p align="center">
|
| 80 |
+
<img src="https://cdn-uploads.huggingface.co/production/uploads/6618e0424dbef6bd3c72f89a/vsrg43GxTOSAj7q_SI60o.png" width="1560" alt="Human eval chart" />
|
| 81 |
+
</p>
|
| 82 |
+
|
| 83 |
+
### Nexa Attention Speedups
|
| 84 |
+
- **9× faster** audio encoding (vs Whisper encoder).
|
| 85 |
+
- **3.5× faster** image encoding (vs SigLIP encoder).
|
| 86 |
+
|
| 87 |
+
|
| 88 |
+
<p align="center">
|
| 89 |
+
<img src="https://cdn-uploads.huggingface.co/production/uploads/6618e0424dbef6bd3c72f89a/1039SN5JBQkS04z4YnoIi.png" width="400" alt="Nexa Attention speedup chart" />
|
| 90 |
+
</p>
|
| 91 |
+
|
| 92 |
+
---
|
| 93 |
+
|
| 94 |
+
## **Architecture Overview**
|
| 95 |
+
OmniNeural’s design is tightly coupled with NPU hardware:
|
| 96 |
+
- **NPU-friendly ops** (ReLU > GELU/SILU).
|
| 97 |
+
- **Sparse + small tensor multiplications** for efficiency.
|
| 98 |
+
- **Convolutional layers** favored over linear for better NPU parallelization.
|
| 99 |
+
- **Hardware-aware attention** patterns to cut compute cost.
|
| 100 |
+
- **Static graph execution** for predictable latency.
|
| 101 |
+
|
| 102 |
+
|
| 103 |
+

|
| 104 |
+
|
| 105 |
+
---
|
| 106 |
+
|
| 107 |
+
## **Production Use Cases**
|
| 108 |
+
|
| 109 |
+
- **PC & Mobile** – On-device AI agents combine **voice, vision, and text** for natural, accurate responses.
|
| 110 |
+
- Examples: *Summarize slides into an email (PC)*, *extract action items from chat (mobile)*.
|
| 111 |
+
- Benefits: Private, offline, battery-efficient.
|
| 112 |
+
|
| 113 |
+
- **Automotive** – In-car assistants handle **voice control, cabin safety, and environment awareness**.
|
| 114 |
+
- Examples: Detects risks (child unbuckled, pet left, loose objects) and road conditions (fog, construction).
|
| 115 |
+
- Benefits: Decisions run locally in milliseconds.
|
| 116 |
+
|
| 117 |
+
- **IoT & Robotics** – Multimodal sensing for **factories, AR/VR, drones, and robots**.
|
| 118 |
+
- Examples: Defect detection, technician overlays, hazard spotting mid-flight, natural robot interaction.
|
| 119 |
+
- Benefits: Works without network connectivity.
|
| 120 |
+
|
| 121 |
+
---
|
| 122 |
+
|
| 123 |
+
## How to use
|
| 124 |
+
|
| 125 |
+
> ⚠️ **Hardware requirement:** OmniNeural-4B currently runs **only on Qualcomm NPUs** (e.g., Snapdragon-powered AIPC).
|
| 126 |
+
> Apple NPU support is planned next.
|
| 127 |
+
|
| 128 |
+
### 1) Install Nexa-SDK
|
| 129 |
+
|
| 130 |
+
- Download the SDK and follow the steps under the "Deploy" section of Nexa's model page: [Download Windows arm64 SDK](https://sdk.nexa.ai/model/OmniNeural-4B)
|
| 131 |
+
- (Other platforms coming soon)
|
| 132 |
+
|
| 133 |
+
### 2) Get an access token
|
| 134 |
+
Create a token in the Model Hub, then log in:
|
| 135 |
+
|
| 136 |
+
```bash
|
| 137 |
+
nexa config set license '<access_token>'
|
| 138 |
+
```
|
| 139 |
+
|
| 140 |
+
### 3) Run the model
|
| 141 |
+
Running:
|
| 142 |
+
|
| 143 |
+
```bash
|
| 144 |
+
nexa infer NexaAI/OmniNeural-4B
|
| 145 |
+
```
|
| 146 |
+
|
| 147 |
+
`/mic` mode: once the model is running, type the following to record your voice directly in the terminal:
|
| 148 |
+
```bash
|
| 149 |
+
> /mic
|
| 150 |
+
```
|
| 151 |
+
|
| 152 |
+
For images and audio, simply drag your files into the command line. Remember to leave a space between file paths.
|
| 153 |
+
|
| 154 |
+
---
|
| 155 |
+
|
| 156 |
+
## Links & Community
|
| 157 |
+
|
| 158 |
+
[Discord](https://discord.com/invite/nexa-ai)
|
| 159 |
+
|
| 160 |
+
[X](https://x.com/nexa_ai)
|
| 161 |
+
|
| 162 |
+
[Website](https://nexa.ai)
|
| 163 |
+
|
| 164 |
+
- **Issues / Feedback:** Use the **HF Discussions** tab, or submit an issue on our Discord or in the nexa-sdk GitHub repository.
|
| 165 |
+
- **Roadmap & updates:** Follow us on X and Discord.
|
| 166 |
+
|
| 167 |
+
> If you want to see more **NPU-first, multimodal** releases on HF, please give our model a like ❤️.
|
| 168 |
+
|
| 169 |
+
## Limitations
|
| 170 |
+
The current model is mainly optimized for English. Optimizing for other languages is the planned next step.
|
| 171 |
+
|
| 172 |
+
---
|
| 173 |
+
|
| 174 |
+
## **Citation**
|
| 175 |
+
|
| 176 |
+
```bibtex
|
| 177 |
+
@misc{nexaai2025omnineural,
|
| 178 |
+
title={OmniNeural: World’s First NPU-aware Multimodal Model},
|
| 179 |
+
author={Nexa AI},
|
| 180 |
+
year={2025},
|
| 181 |
+
url={https://huggingface.co/NexaAI/OmniNeural-4B},
|
| 182 |
+
}
|
| 183 |
+
```
|
| 184 |
+
|
| 185 |
+
## License
|
| 186 |
+
This model is released under the **Creative Commons Attribution–NonCommercial 4.0 (CC BY-NC 4.0)** license.
|
| 187 |
+
Non-commercial use, modification, and redistribution are permitted with attribution.
|
| 188 |
+
For commercial licensing, please contact **dev@nexa.ai**.
|
assets/MOBILE_50MB.mp4
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:856a7eab6ec4b12236b44e4264637774081242423b5575cb84121f38e82c179e
|
| 3 |
+
size 40200022
|
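Reviewer note: the three added lines are a Git LFS pointer, not the video itself. The pointer records the spec version, the SHA-256 object ID, and the size in bytes of the real file. As a minimal sketch, assuming the pointer text is available as a string, it can be parsed as follows. The function name `parse_lfs_pointer` is hypothetical.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid line has the form "<hash-algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer content copied from assets/MOBILE_50MB.mp4 in this commit.
MOBILE_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:856a7eab6ec4b12236b44e4264637774081242423b5575cb84121f38e82c179e
size 40200022
"""

if __name__ == "__main__":
    print(parse_lfs_pointer(MOBILE_POINTER))
```

After downloading the real file, hashing it with `hashlib.sha256` and comparing the hex digest against `digest` is one way to confirm the download is intact.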
assets/PC_Demo_Agent.mov
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:5054bb51bf3701f004f5df710231fc86c32cf97be5590ee6198218ba30d5f3aa
|
| 3 |
+
size 22748433
|
assets/PC_Demo_Audio.mov
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:84c353c9361c8225c4518bbb0e0ffff4921ff86b3b7d4bdf75d8e47936564310
|
| 3 |
+
size 29321731
|
assets/PC_demo_2_image.mov
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:b5a1aafb5a39e291887a6eb4671c22586b327497b485c7e2b37ffa1b6158a91e
|
| 3 |
+
size 23245220
|
audio/attachments-3-3.nexa
ADDED
|
Binary file (642 Bytes)
|
|
|
audio/htp_backend_ext_config.json
ADDED
|
@@ -0,0 +1,31 @@
| 1 |
+
{
|
| 2 |
+
"graphs": [
|
| 3 |
+
{
|
| 4 |
+
"O": 3.0,
|
| 5 |
+
"vtcm_mb": 8,
|
| 6 |
+
"graph_names": [
|
| 7 |
+
"vit"
|
| 8 |
+
],
|
| 9 |
+
"fp16_relaxed_precision": 0
|
| 10 |
+
}
|
| 11 |
+
],
|
| 12 |
+
"devices": [
|
| 13 |
+
{
|
| 14 |
+
"soc_id": 60,
|
| 15 |
+
"dsp_arch": "v73",
|
| 16 |
+
"cores": [
|
| 17 |
+
{
|
| 18 |
+
"perf_profile": "burst",
|
| 19 |
+
"rpc_control_latency": 100
|
| 20 |
+
}
|
| 21 |
+
],
|
| 22 |
+
"pd_session": "unsigned"
|
| 23 |
+
}
|
| 24 |
+
],
|
| 25 |
+
"context": {
|
| 26 |
+
"weight_sharing_enabled": false
|
| 27 |
+
},
|
| 28 |
+
"memory": {
|
| 29 |
+
"mem_type": "shared_buffer"
|
| 30 |
+
}
|
| 31 |
+
}
|
config.json
ADDED
|
File without changes
|
files-1-1.nexa
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:c14310cd665fa61505bf6859233afd6b72dd836b28b81e19ca7b751a76e29e39
|
| 3 |
+
size 12179329
|
llm/attachments-1-3.nexa
ADDED
|
Binary file (443 Bytes)
|
|
|
llm/htp_backend_ext_config.json
ADDED
|
@@ -0,0 +1,21 @@
| 1 |
+
{
|
| 2 |
+
"devices": [
|
| 3 |
+
{
|
| 4 |
+
"soc_model": 60,
|
| 5 |
+
"dsp_arch": "v73",
|
| 6 |
+
"cores": [
|
| 7 |
+
{
|
| 8 |
+
"core_id": 0,
|
| 9 |
+
"perf_profile": "burst",
|
| 10 |
+
"rpc_control_latency": 100
|
| 11 |
+
}
|
| 12 |
+
]
|
| 13 |
+
}
|
| 14 |
+
],
|
| 15 |
+
"memory": {
|
| 16 |
+
"mem_type": "shared_buffer"
|
| 17 |
+
},
|
| 18 |
+
"context": {
|
| 19 |
+
"weight_sharing_enabled": true
|
| 20 |
+
}
|
| 21 |
+
}
|
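Reviewer note: the `llm` config differs from the `audio` config shown earlier in a few keys. The sketch below flattens both JSON documents into dotted paths and lists the paths whose values differ. It only compares values as written in this commit and makes no claim about how the runtime interprets each key. The helper names `flatten` and `diff_configs` are hypothetical.

```python
import json

# Contents copied from audio/htp_backend_ext_config.json in this commit.
AUDIO_CFG = json.loads("""
{"graphs": [{"O": 3.0, "vtcm_mb": 8, "graph_names": ["vit"],
             "fp16_relaxed_precision": 0}],
 "devices": [{"soc_id": 60, "dsp_arch": "v73",
              "cores": [{"perf_profile": "burst", "rpc_control_latency": 100}],
              "pd_session": "unsigned"}],
 "context": {"weight_sharing_enabled": false},
 "memory": {"mem_type": "shared_buffer"}}
""")

# Contents copied from llm/htp_backend_ext_config.json in this commit.
LLM_CFG = json.loads("""
{"devices": [{"soc_model": 60, "dsp_arch": "v73",
              "cores": [{"core_id": 0, "perf_profile": "burst",
                         "rpc_control_latency": 100}]}],
 "memory": {"mem_type": "shared_buffer"},
 "context": {"weight_sharing_enabled": true}}
""")


def flatten(value, prefix=""):
    """Map nested dicts and lists to {"a.b[0].c": leaf_value} pairs."""
    if isinstance(value, dict):
        items = [(f"{prefix}.{k}" if prefix else k, v) for k, v in value.items()]
    elif isinstance(value, list):
        items = [(f"{prefix}[{i}]", v) for i, v in enumerate(value)]
    else:
        return {prefix: value}
    flat = {}
    for key, child in items:
        flat.update(flatten(child, key))
    return flat


def diff_configs(a, b):
    """Return {path: (value_in_a, value_in_b)} for every path that differs."""
    fa, fb = flatten(a), flatten(b)
    return {k: (fa.get(k), fb.get(k))
            for k in sorted(fa.keys() | fb.keys())
            if fa.get(k) != fb.get(k)}


if __name__ == "__main__":
    for path, (audio_value, llm_value) in diff_configs(AUDIO_CFG, LLM_CFG).items():
        print(f"{path}: audio={audio_value!r} llm={llm_value!r}")
```

A path that is missing on one side is reported with `None` for that side, so keys that exist in only one file, such as `soc_id` versus `soc_model`, show up alongside keys whose values differ.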
nexa.manifest
ADDED
|
@@ -0,0 +1,5 @@
| 1 |
+
{
|
| 2 |
+
"ModelName": "omni-neural",
|
| 3 |
+
"ModelType": "vlm",
|
| 4 |
+
"PluginId": "npu"
|
| 5 |
+
}
|
vit/attachments-2-3.nexa
ADDED
|
Binary file (642 Bytes)
|
|
|
vit/htp_backend_ext_config.json
ADDED
|
@@ -0,0 +1,31 @@
| 1 |
+
{
|
| 2 |
+
"graphs": [
|
| 3 |
+
{
|
| 4 |
+
"O": 3.0,
|
| 5 |
+
"vtcm_mb": 8,
|
| 6 |
+
"graph_names": [
|
| 7 |
+
"vit"
|
| 8 |
+
],
|
| 9 |
+
"fp16_relaxed_precision": 0
|
| 10 |
+
}
|
| 11 |
+
],
|
| 12 |
+
"devices": [
|
| 13 |
+
{
|
| 14 |
+
"soc_id": 60,
|
| 15 |
+
"dsp_arch": "v73",
|
| 16 |
+
"cores": [
|
| 17 |
+
{
|
| 18 |
+
"perf_profile": "burst",
|
| 19 |
+
"rpc_control_latency": 100
|
| 20 |
+
}
|
| 21 |
+
],
|
| 22 |
+
"pd_session": "unsigned"
|
| 23 |
+
}
|
| 24 |
+
],
|
| 25 |
+
"context": {
|
| 26 |
+
"weight_sharing_enabled": false
|
| 27 |
+
},
|
| 28 |
+
"memory": {
|
| 29 |
+
"mem_type": "shared_buffer"
|
| 30 |
+
}
|
| 31 |
+
}
|
vlm/attachments-2-3.nexa
ADDED
|
Binary file (642 Bytes)
|
|
|
vlm/htp_backend_ext_config.json
ADDED
|
@@ -0,0 +1,31 @@
| 1 |
+
{
|
| 2 |
+
"graphs": [
|
| 3 |
+
{
|
| 4 |
+
"O": 3.0,
|
| 5 |
+
"vtcm_mb": 8,
|
| 6 |
+
"graph_names": [
|
| 7 |
+
"vit"
|
| 8 |
+
],
|
| 9 |
+
"fp16_relaxed_precision": 0
|
| 10 |
+
}
|
| 11 |
+
],
|
| 12 |
+
"devices": [
|
| 13 |
+
{
|
| 14 |
+
"soc_id": 60,
|
| 15 |
+
"dsp_arch": "v73",
|
| 16 |
+
"cores": [
|
| 17 |
+
{
|
| 18 |
+
"perf_profile": "burst",
|
| 19 |
+
"rpc_control_latency": 100
|
| 20 |
+
}
|
| 21 |
+
],
|
| 22 |
+
"pd_session": "unsigned"
|
| 23 |
+
}
|
| 24 |
+
],
|
| 25 |
+
"context": {
|
| 26 |
+
"weight_sharing_enabled": false
|
| 27 |
+
},
|
| 28 |
+
"memory": {
|
| 29 |
+
"mem_type": "shared_buffer"
|
| 30 |
+
}
|
| 31 |
+
}
|
weights-1-8.nexa
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:08cdf336224b5798ee876a47d760a3ae1e1569a11d931e7117822b64ac68ed37
|
| 3 |
+
size 770293308
|
weights-2-8.nexa
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:d4611b0bdfaf1d15a680f59c38c1e286adf0e5fa3ee653a599a7fd64e7461347
|
| 3 |
+
size 1165936148
|
weights-3-8.nexa
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:1d0ded0791faad29bafc06f18d5f352a4e297f14f3e5b4f771ffb50912343c1e
|
| 3 |
+
size 4572692
|
weights-4-8.nexa
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:b111649ac3fc3dc7cc86099029ccc821f68d674a36beb70834f02301846ebac5
|
| 3 |
+
size 908198932
|
weights-5-8.nexa
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:ddf26428c50efbbb048d0492a3b57ea7b93406556e8998757e5b0d46ca96db05
|
| 3 |
+
size 14281108
|
weights-6-8.nexa
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:743fdd8011df0e2b5eda56fdca8d8290b6af754209b7e8599f2011384d7a0683
|
| 3 |
+
size 5588636
|
weights-7-8.nexa
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:188fb10f97a41ef6e83d2025d96143080e5c3bc41aff7f3e50d43a4c86c24534
|
| 3 |
+
size 649296044
|
weights-8-8.nexa
ADDED
|
@@ -0,0 +1,3 @@
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:4910ef47adb345d915ab6750f42bdcfd1b13294f01c8b7104eefdd72e6d9eb1b
|
| 3 |
+
size 1244659876
|
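Reviewer note: the eight `weights-N-8.nexa` pointers each declare the size of the shard they stand for, so the total download size can be estimated before fetching anything. The sketch below sums the `size` values copied from the pointers in this commit. The names `SHARD_SIZES` and `to_gb` are chosen here for illustration.

```python
# "size" fields copied from the weights-N-8.nexa LFS pointers in this commit.
SHARD_SIZES = {
    "weights-1-8.nexa": 770293308,
    "weights-2-8.nexa": 1165936148,
    "weights-3-8.nexa": 4572692,
    "weights-4-8.nexa": 908198932,
    "weights-5-8.nexa": 14281108,
    "weights-6-8.nexa": 5588636,
    "weights-7-8.nexa": 649296044,
    "weights-8-8.nexa": 1244659876,
}


def to_gb(num_bytes: int) -> str:
    """Format a byte count in decimal gigabytes (1 GB = 10**9 bytes)."""
    return f"{num_bytes / 1e9:.2f} GB"


total_bytes = sum(SHARD_SIZES.values())
largest_shard = max(SHARD_SIZES, key=SHARD_SIZES.get)

if __name__ == "__main__":
    print("total:", to_gb(total_bytes))
    print("largest shard:", largest_shard, to_gb(SHARD_SIZES[largest_shard]))
```

The shards are uneven in size: three of them are under 15 MB, while the largest exceeds 1.2 GB.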