Segformer-Base: Optimized for Qualcomm Devices

Segformer-Base is a machine learning model that predicts masks and classes of objects in an image.

This model is based on the implementation of Segformer-Base found here. This repository contains pre-exported model files optimized for Qualcomm® devices. You can use the Qualcomm® AI Hub Models library to export with custom configurations. More details on model performance across various devices can be found here.

Qualcomm AI Hub Models uses Qualcomm AI Hub Workbench to compile, profile, and evaluate this model. Sign up to run these models on a hosted Qualcomm® device.

Getting Started

There are two ways to deploy this model on your device:

Option 1: Download Pre-Exported Models

Below are pre-exported model assets ready for deployment.

Runtime	Precision	Chipset	SDK Versions	Download
ONNX	float	Universal	QAIRT 2.45, ONNX Runtime 1.25.0	Download
ONNX	w8a16	Universal	QAIRT 2.45, ONNX Runtime 1.25.0	Download
ONNX	w8a8	Universal	QAIRT 2.45, ONNX Runtime 1.25.0	Download
QNN_DLC	float	Universal	QAIRT 2.45	Download
QNN_DLC	w8a16	Universal	QAIRT 2.45	Download
QNN_DLC	w8a8	Universal	QAIRT 2.45	Download
TFLITE	float	Universal	QAIRT 2.45	Download
TFLITE	w8a8	Universal	QAIRT 2.45	Download

For more device-specific assets and performance metrics, visit Segformer-Base on Qualcomm® AI Hub.
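
As a rough illustration of how a downloaded ONNX asset might be used from Python, the sketch below wraps ONNX Runtime with small pre- and post-processing helpers. Several details are assumptions rather than facts from this card: the local file name `segformer_base.onnx` is hypothetical, the input is assumed to be a single float32 NCHW tensor scaled to [0, 1], and the output is assumed to be either per-class scores or an already-reduced class-index map. The CPU provider is used only to keep the sketch portable, and assets compiled for a specific NPU may not load with it. Inspect `session.get_inputs()` and `session.get_outputs()` on the actual file before relying on any of this.

```python
import numpy as np

MODEL_PATH = "segformer_base.onnx"  # hypothetical local path to the downloaded asset
INPUT_SIZE = 512                    # input resolution listed under Model Stats


def to_model_input(image_hwc_uint8: np.ndarray) -> np.ndarray:
    """Convert an HxWx3 uint8 RGB image to a 1x3xHxW float32 tensor in [0, 1].

    The NCHW layout and the [0, 1] scaling are assumptions; check the
    asset's real input spec before relying on them.
    """
    if image_hwc_uint8.shape != (INPUT_SIZE, INPUT_SIZE, 3):
        raise ValueError("expected a 512x512 RGB image; resize it first")
    chw = image_hwc_uint8.astype(np.float32).transpose(2, 0, 1) / 255.0
    return chw[np.newaxis, ...]


def to_class_map(output: np.ndarray) -> np.ndarray:
    """Reduce a model output to an HxW map of class indices.

    Handles two plausible layouts: per-class scores (1, C, H, W), reduced
    with argmax over C, or an already-reduced index map (1, H, W).
    """
    if output.ndim == 4:
        return output[0].argmax(axis=0)
    if output.ndim == 3:
        return output[0].astype(np.int64)
    raise ValueError(f"unexpected output shape {output.shape}")


def segment(image_hwc_uint8: np.ndarray, model_path: str = MODEL_PATH) -> np.ndarray:
    """Run the ONNX asset with the CPU provider and return a class-index map."""
    import onnxruntime as ort  # imported lazily so the helpers above work without it

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name  # read the name instead of guessing it
    outputs = session.run(None, {input_name: to_model_input(image_hwc_uint8)})
    return to_class_map(outputs[0])
```

Calling `segment(image)` with a 512x512 RGB array would return one class index per pixel if the assumptions above hold for the chosen asset.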

Option 2: Export with Custom Configurations

Use the Qualcomm® AI Hub Models Python library to compile and export the model with your own:

Custom weights (e.g., fine-tuned checkpoints)
Custom input shapes
Target device and runtime configurations

This option is ideal if you need to customize the model beyond the default configuration provided here.

See our repository for Segformer-Base on GitHub for usage instructions.
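
The GitHub repository remains the authority on usage. Purely as a sketch, the snippet below shows one way the export script could be launched from Python. It assumes the library is already installed and that this model's export entry point lives at the module path `qai_hub_models.models.segformer_base.export`; that path is an assumption based on the library's usual layout and is not confirmed by this card. No command-line flags are assumed, so any options are passed through unchanged.

```python
import subprocess
import sys

# Assumed module path for this model's export script; confirm it in the
# GitHub repository before use.
EXPORT_MODULE = "qai_hub_models.models.segformer_base.export"


def build_export_command(extra_args=()):
    """Return the argv list that runs the export module with this interpreter."""
    return [sys.executable, "-m", EXPORT_MODULE, *extra_args]


def run_export(extra_args=()):
    """Run the export script and return its exit code.

    Requires the library to be installed; a real export also needs
    Qualcomm AI Hub credentials to be configured.
    """
    return subprocess.run(build_export_command(extra_args), check=False).returncode
```

Calling `run_export(["--help"])` first is a reasonable way to list the options the script actually supports instead of guessing them.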

Model Details

Model Type: Semantic segmentation

Model Stats:

Model checkpoint: nvidia/segformer-b0-finetuned-ade-512-512
Input resolution: 512x512
Number of output classes: 150
Number of parameters: 3.75M
Model size (float): 14.4 MB
Model size (w8a16): 4.57 MB
Model size (w8a8): 3.90 MB
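
As a quick sanity check on the sizes above, the sketch below compares a weights-only estimate with the reported file sizes. It assumes 32-bit storage for the float variant and 8-bit weights for both quantized variants, and it treats 1 MB as 10^6 bytes; the card does not state either convention, so the estimates are only approximate. Reported files also carry graph structure and quantization metadata that this estimate ignores.

```python
PARAMS = 3.75e6  # "Number of parameters: 3.75M"

# Reported file sizes in MB, copied from Model Stats.
REPORTED_MB = {"float": 14.4, "w8a16": 4.57, "w8a8": 3.90}

# Assumed bits per stored weight: float32 for "float", 8-bit for "w8*".
WEIGHT_BITS = {"float": 32, "w8a16": 8, "w8a8": 8}


def weight_only_mb(params: float, bits: int) -> float:
    """Size of the weights alone in MB (10**6 bytes), ignoring all overhead."""
    return params * bits / 8 / 1e6


def shrink_factor(precision: str) -> float:
    """How many times smaller a precision's reported file is than the float one."""
    return REPORTED_MB["float"] / REPORTED_MB[precision]


for name, bits in WEIGHT_BITS.items():
    print(
        f"{name:6s} weights-only ~{weight_only_mb(PARAMS, bits):5.2f} MB, "
        f"reported {REPORTED_MB[name]:5.2f} MB, "
        f"{shrink_factor(name):.2f}x smaller than float"
    )
```

Going by the reported figures alone, the w8a8 file is roughly 3.7 times smaller than the float file and the w8a16 file roughly 3.2 times smaller.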

Performance Summary

Model	Runtime	Precision	Chipset	Inference Time (ms)	Peak Memory Range (MB)	Primary Compute Unit
Segformer-Base	ONNX	float	Snapdragon® X2 Elite	13.579 ms	210 - 210 MB	NPU
Segformer-Base	ONNX	float	Snapdragon® X Elite	24.653 ms	146 - 146 MB	NPU
Segformer-Base	ONNX	float	Snapdragon® 8 Gen 3 Mobile	17.32 ms	10 - 216 MB	NPU
Segformer-Base	ONNX	float	Snapdragon® 8 Gen 1 Mobile	24.691 ms	23 - 233 MB	NPU
Segformer-Base	ONNX	float	Qualcomm® QCS8550 (Proxy)	23.503 ms	20 - 33 MB	NPU
Segformer-Base	ONNX	float	Qualcomm® QCS8450	24.691 ms	23 - 233 MB	NPU
Segformer-Base	ONNX	float	Snapdragon® 8 Elite Mobile	14.357 ms	23 - 209 MB	NPU
Segformer-Base	ONNX	float	Snapdragon® 8 Elite Gen 5 Mobile	13.253 ms	21 - 208 MB	NPU
Segformer-Base	ONNX	float	Qualcomm® QCS9075	25.591 ms	20 - 70 MB	NPU
Segformer-Base	ONNX	float	Qualcomm® QCS8750	14.357 ms	23 - 209 MB	NPU
Segformer-Base	ONNX	float	Qualcomm® QCS7181	24.653 ms	146 - 146 MB	NPU
Segformer-Base	ONNX	w8a16	Snapdragon® X2 Elite	5.618 ms	211 - 211 MB	NPU
Segformer-Base	ONNX	w8a16	Snapdragon® X Elite	13.574 ms	148 - 148 MB	NPU
Segformer-Base	ONNX	w8a16	Snapdragon® 8 Gen 3 Mobile	9.42 ms	12 - 235 MB	NPU
Segformer-Base	ONNX	w8a16	Qualcomm® QCS8550 (Proxy)	13.382 ms	11 - 15 MB	NPU
Segformer-Base	ONNX	w8a16	Snapdragon® 8 Elite Gen 5 Mobile	5.651 ms	14 - 215 MB	NPU
Segformer-Base	ONNX	w8a16	Snapdragon® 8 Elite Mobile	7.247 ms	13 - 216 MB	NPU
Segformer-Base	ONNX	w8a16	Qualcomm® QCS9075	18.538 ms	10 - 58 MB	NPU
Segformer-Base	ONNX	w8a16	Snapdragon® 7 Gen 4 Mobile	17.006 ms	13 - 233 MB	NPU
Segformer-Base	ONNX	w8a16	Qualcomm® QCM6690	91.124 ms	4 - 230 MB	NPU
Segformer-Base	ONNX	w8a16	Qualcomm® QCS7790	17.006 ms	13 - 233 MB	NPU
Segformer-Base	ONNX	w8a16	Qualcomm® QCS8750	7.247 ms	13 - 216 MB	NPU
Segformer-Base	ONNX	w8a16	Qualcomm® QCS7181	13.574 ms	148 - 148 MB	NPU
Segformer-Base	ONNX	w8a8	Snapdragon® X2 Elite	1.941 ms	212 - 212 MB	NPU
Segformer-Base	ONNX	w8a8	Snapdragon® X Elite	4.959 ms	148 - 148 MB	NPU
Segformer-Base	ONNX	w8a8	Snapdragon® 8 Gen 3 Mobile	3.221 ms	0 - 204 MB	NPU
Segformer-Base	ONNX	w8a8	Snapdragon® 8 Gen 1 Mobile	7.016 ms	6 - 212 MB	NPU
Segformer-Base	ONNX	w8a8	Qualcomm® QCS6490	14.314 ms	5 - 51 MB	NPU
Segformer-Base	ONNX	w8a8	Qualcomm® QCS8550 (Proxy)	4.767 ms	5 - 14 MB	NPU
Segformer-Base	ONNX	w8a8	Qualcomm® QCS8450	7.016 ms	6 - 212 MB	NPU
Segformer-Base	ONNX	w8a8	Qualcomm® QCS9075	5.622 ms	2 - 52 MB	NPU
Segformer-Base	ONNX	w8a8	Qualcomm® QCM6690	31.491 ms	0 - 225 MB	NPU
Segformer-Base	ONNX	w8a8	Snapdragon® 7 Gen 4 Mobile	5.383 ms	8 - 201 MB	NPU
Segformer-Base	ONNX	w8a8	Snapdragon® 8 Elite Gen 5 Mobile	1.913 ms	0 - 193 MB	NPU
Segformer-Base	ONNX	w8a8	Snapdragon® 8 Elite Mobile	2.389 ms	0 - 191 MB	NPU
Segformer-Base	ONNX	w8a8	Qualcomm® QCS7790	5.383 ms	8 - 201 MB	NPU
Segformer-Base	ONNX	w8a8	Qualcomm® QCS8750	2.389 ms	0 - 191 MB	NPU
Segformer-Base	ONNX	w8a8	Qualcomm® QCS7181	4.959 ms	148 - 148 MB	NPU
Segformer-Base	QNN_DLC	float	Snapdragon® X2 Elite	12.46 ms	3 - 3 MB	NPU
Segformer-Base	QNN_DLC	float	Snapdragon® X Elite	23.692 ms	3 - 3 MB	NPU
Segformer-Base	QNN_DLC	float	Snapdragon® 8 Gen 3 Mobile	16.334 ms	0 - 225 MB	NPU
Segformer-Base	QNN_DLC	float	Snapdragon® 8 Gen 1 Mobile	29.316 ms	0 - 223 MB	NPU
Segformer-Base	QNN_DLC	float	Qualcomm® QCS8275	48.87 ms	0 - 189 MB	NPU
Segformer-Base	QNN_DLC	float	Qualcomm® QCS8550 (Proxy)	22.625 ms	3 - 5 MB	NPU
Segformer-Base	QNN_DLC	float	Qualcomm® QCS8450	29.316 ms	0 - 223 MB	NPU
Segformer-Base	QNN_DLC	float	Snapdragon® 8 Elite Mobile	13.196 ms	3 - 200 MB	NPU
Segformer-Base	QNN_DLC	float	Qualcomm® SA7255P	48.87 ms	0 - 189 MB	NPU
Segformer-Base	QNN_DLC	float	Qualcomm® SA8295P	27.829 ms	0 - 192 MB	NPU
Segformer-Base	QNN_DLC	float	Snapdragon® 8 Elite Gen 5 Mobile	13.696 ms	3 - 192 MB	NPU
Segformer-Base	QNN_DLC	float	Qualcomm® QCS9075	27.828 ms	3 - 17 MB	NPU
Segformer-Base	QNN_DLC	float	Qualcomm® QCS8750	13.196 ms	3 - 200 MB	NPU
Segformer-Base	QNN_DLC	float	Qualcomm® QCS7181	23.692 ms	3 - 3 MB	NPU
Segformer-Base	QNN_DLC	w8a16	Snapdragon® X2 Elite	8.904 ms	2 - 2 MB	NPU
Segformer-Base	QNN_DLC	w8a16	Snapdragon® X Elite	19.637 ms	2 - 2 MB	NPU
Segformer-Base	QNN_DLC	w8a16	Snapdragon® 8 Gen 3 Mobile	15.333 ms	2 - 240 MB	NPU
Segformer-Base	QNN_DLC	w8a16	Qualcomm® QCS8275	32.41 ms	2 - 204 MB	NPU
Segformer-Base	QNN_DLC	w8a16	Qualcomm® QCS8550 (Proxy)	20.536 ms	2 - 3 MB	NPU
Segformer-Base	QNN_DLC	w8a16	Snapdragon® 8 Elite Gen 5 Mobile	9.732 ms	2 - 216 MB	NPU
Segformer-Base	QNN_DLC	w8a16	Snapdragon® 8 Elite Mobile	11.302 ms	1 - 206 MB	NPU
Segformer-Base	QNN_DLC	w8a16	Qualcomm® QCS9075	33.911 ms	1 - 9 MB	NPU
Segformer-Base	QNN_DLC	w8a16	Snapdragon® 7 Gen 4 Mobile	34.451 ms	2 - 232 MB	NPU
Segformer-Base	QNN_DLC	w8a16	Qualcomm® QCM6690	129.787 ms	2 - 254 MB	NPU
Segformer-Base	QNN_DLC	w8a16	Qualcomm® SA7255P	32.41 ms	2 - 204 MB	NPU
Segformer-Base	QNN_DLC	w8a16	Qualcomm® QCS7790	34.451 ms	2 - 232 MB	NPU
Segformer-Base	QNN_DLC	w8a16	Qualcomm® QCS8750	11.302 ms	1 - 206 MB	NPU
Segformer-Base	QNN_DLC	w8a16	Qualcomm® QCS7181	19.637 ms	2 - 2 MB	NPU
Segformer-Base	QNN_DLC	w8a8	Snapdragon® X2 Elite	2.541 ms	1 - 1 MB	NPU
Segformer-Base	QNN_DLC	w8a8	Snapdragon® X Elite	6.363 ms	1 - 1 MB	NPU
Segformer-Base	QNN_DLC	w8a8	Snapdragon® 8 Gen 3 Mobile	3.955 ms	1 - 201 MB	NPU
Segformer-Base	QNN_DLC	w8a8	Snapdragon® 8 Gen 1 Mobile	8.641 ms	0 - 196 MB	NPU
Segformer-Base	QNN_DLC	w8a8	Qualcomm® QCS6490	15.677 ms	1 - 5 MB	NPU
Segformer-Base	QNN_DLC	w8a8	Qualcomm® QCS8275	10.865 ms	1 - 174 MB	NPU
Segformer-Base	QNN_DLC	w8a8	Qualcomm® QCS8550 (Proxy)	5.787 ms	1 - 10 MB	NPU
Segformer-Base	QNN_DLC	w8a8	Qualcomm® QCS8450	8.641 ms	0 - 196 MB	NPU
Segformer-Base	QNN_DLC	w8a8	Qualcomm® QCS9075	6.706 ms	0 - 5 MB	NPU
Segformer-Base	QNN_DLC	w8a8	Qualcomm® SA7255P	10.865 ms	1 - 174 MB	NPU
Segformer-Base	QNN_DLC	w8a8	Qualcomm® QCM6690	35.336 ms	1 - 188 MB	NPU
Segformer-Base	QNN_DLC	w8a8	Snapdragon® 7 Gen 4 Mobile	6.84 ms	1 - 182 MB	NPU
Segformer-Base	QNN_DLC	w8a8	Qualcomm® SA8295P	7.765 ms	0 - 172 MB	NPU
Segformer-Base	QNN_DLC	w8a8	Snapdragon® 8 Elite Gen 5 Mobile	2.243 ms	1 - 184 MB	NPU
Segformer-Base	QNN_DLC	w8a8	Snapdragon® 8 Elite Mobile	2.841 ms	0 - 178 MB	NPU
Segformer-Base	QNN_DLC	w8a8	Qualcomm® QCS7790	6.84 ms	1 - 182 MB	NPU
Segformer-Base	QNN_DLC	w8a8	Qualcomm® QCS8750	2.841 ms	0 - 178 MB	NPU
Segformer-Base	QNN_DLC	w8a8	Qualcomm® QCS7181	6.363 ms	1 - 1 MB	NPU
Segformer-Base	TFLITE	float	Snapdragon® 8 Gen 3 Mobile	16.521 ms	9 - 241 MB	NPU
Segformer-Base	TFLITE	float	Snapdragon® 8 Gen 1 Mobile	29.561 ms	9 - 234 MB	NPU
Segformer-Base	TFLITE	float	Qualcomm® QCS8275	48.908 ms	10 - 200 MB	NPU
Segformer-Base	TFLITE	float	Qualcomm® QCS8550 (Proxy)	22.647 ms	9 - 12 MB	NPU
Segformer-Base	TFLITE	float	Qualcomm® SA8775P	99.841 ms	10 - 40 MB	GPU
Segformer-Base	TFLITE	float	Qualcomm® SA8650P	99.841 ms	10 - 40 MB	GPU
Segformer-Base	TFLITE	float	Qualcomm® SA8255P	99.841 ms	10 - 40 MB	GPU
Segformer-Base	TFLITE	float	Qualcomm® QCS8450	29.561 ms	9 - 234 MB	NPU
Segformer-Base	TFLITE	float	Snapdragon® 8 Elite Mobile	13.299 ms	9 - 205 MB	NPU
Segformer-Base	TFLITE	float	Qualcomm® SA7255P	48.908 ms	10 - 200 MB	NPU
Segformer-Base	TFLITE	float	Qualcomm® SA8295P	27.821 ms	9 - 203 MB	NPU
Segformer-Base	TFLITE	float	Snapdragon® 8 Elite Gen 5 Mobile	13.644 ms	8 - 200 MB	NPU
Segformer-Base	TFLITE	float	Qualcomm® QCS9075	27.587 ms	8 - 30 MB	NPU
Segformer-Base	TFLITE	float	Qualcomm® QCS8750	13.299 ms	9 - 205 MB	NPU
Segformer-Base	TFLITE	w8a8	Snapdragon® 8 Gen 3 Mobile	7.108 ms	2 - 212 MB	NPU
Segformer-Base	TFLITE	w8a8	Snapdragon® 8 Gen 1 Mobile	12.13 ms	0 - 208 MB	NPU
Segformer-Base	TFLITE	w8a8	Qualcomm® QCS6490	110.956 ms	15 - 50 MB	NPU
Segformer-Base	TFLITE	w8a8	Qualcomm® QCS8275	18.289 ms	2 - 179 MB	NPU
Segformer-Base	TFLITE	w8a8	Qualcomm® QCS8550 (Proxy)	10.124 ms	2 - 5 MB	NPU
Segformer-Base	TFLITE	w8a8	Qualcomm® SA8775P	106.739 ms	15 - 47 MB	GPU
Segformer-Base	TFLITE	w8a8	Qualcomm® SA8650P	106.739 ms	15 - 47 MB	GPU
Segformer-Base	TFLITE	w8a8	Qualcomm® SA8255P	106.739 ms	15 - 47 MB	GPU
Segformer-Base	TFLITE	w8a8	Qualcomm® QCS8450	12.13 ms	0 - 208 MB	NPU
Segformer-Base	TFLITE	w8a8	Qualcomm® QCS9075	10.652 ms	2 - 12 MB	NPU
Segformer-Base	TFLITE	w8a8	Qualcomm® SA7255P	18.289 ms	2 - 179 MB	NPU
Segformer-Base	TFLITE	w8a8	Qualcomm® QCM6690	95.14 ms	13 - 185 MB	NPU
Segformer-Base	TFLITE	w8a8	Snapdragon® 7 Gen 4 Mobile	38.935 ms	15 - 73 MB	NPU
Segformer-Base	TFLITE	w8a8	Qualcomm® SA8295P	12.874 ms	2 - 184 MB	NPU
Segformer-Base	TFLITE	w8a8	Snapdragon® 8 Elite Gen 5 Mobile	4.547 ms	2 - 182 MB	NPU
Segformer-Base	TFLITE	w8a8	Snapdragon® 8 Elite Mobile	5.723 ms	0 - 177 MB	NPU
Segformer-Base	TFLITE	w8a8	Qualcomm® QCS7790	38.935 ms	15 - 73 MB	NPU
Segformer-Base	TFLITE	w8a8	Qualcomm® QCS8750	5.723 ms	0 - 177 MB	NPU
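
To show how the table can be read, the sketch below takes the Snapdragon® 8 Elite Gen 5 Mobile rows and computes the fastest configuration, the speedup of each quantized precision over float within the same runtime, and a throughput figure. The latencies are copied from the table; the helper names are illustrative only. The frames-per-second value is simply 1000 divided by the inference time, so it is an upper bound that ignores pre-processing, post-processing, and data transfer.

```python
# (runtime, precision, inference time in ms) for Snapdragon 8 Elite Gen 5 Mobile,
# copied from the Performance Summary table.
ROWS = [
    ("ONNX", "float", 13.253),
    ("ONNX", "w8a16", 5.651),
    ("ONNX", "w8a8", 1.913),
    ("QNN_DLC", "float", 13.696),
    ("QNN_DLC", "w8a16", 9.732),
    ("QNN_DLC", "w8a8", 2.243),
    ("TFLITE", "float", 13.644),
    ("TFLITE", "w8a8", 4.547),
]


def fastest(rows):
    """Return the row with the lowest inference time."""
    return min(rows, key=lambda row: row[2])


def speedup_vs_float(rows, runtime, precision):
    """Float latency divided by the given precision's latency, same runtime."""
    latency = {(rt, prec): ms for rt, prec, ms in rows}
    return latency[(runtime, "float")] / latency[(runtime, precision)]


def frames_per_second(latency_ms):
    """Upper-bound throughput implied by a single-inference latency."""
    return 1000.0 / latency_ms


best = fastest(ROWS)
print(f"fastest: {best[0]} {best[1]} at {best[2]} ms "
      f"(~{frames_per_second(best[2]):.0f} inferences/s)")
for runtime, precision, _ in ROWS:
    if precision != "float":
        ratio = speedup_vs_float(ROWS, runtime, precision)
        print(f"{runtime} {precision}: {ratio:.2f}x faster than float")
```

On this chipset the w8a8 variants are about 6 to 7 times faster than float for ONNX and QNN_DLC and about 3 times faster for TFLITE, which is one way to weigh precision against runtime choice.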

License

The license for the original implementation of Segformer-Base can be found here.

References

Community

Join our AI Hub Slack community to collaborate, post questions and learn more about on-device AI.
For questions or feedback, please reach out to us.

Paper

SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers (arXiv:2105.15203, published May 31, 2021)