SD 1.5 + ControlNet-Canny — Precompiled QNN ONNX (Snapdragon 8 Gen 3)

Mirror of Qualcomm's published precompiled QNN ONNX bundle for SD 1.5 + ControlNet-Canny, built for the Hexagon V75 NPU on Snapdragon 8 Gen 3. Used by Sona Forge, a local-first Android image-generation app, to run SD 1.5 inference entirely on the device's NPU instead of the CPU.

This repository is a redistribution of artefacts that Qualcomm publishes under their own model cards on aihub.qualcomm.com. We do not retrain, refine, or otherwise alter the weights. The mirror exists so the Sona Forge Android app has a stable, version-pinned download URL under our org's namespace; upstream URLs at qaihub-public-assets.s3.amazonaws.com are tied to a release version (currently v0.52.0) that may rotate.

Files

File	Size	Purpose
`text_encoder.onnx`	733 B	EPContext wrapper, references `text_encoder_qairt_context.bin`
`text_encoder_qairt_context.bin`	156 MB	QAIRT 2.42 HTP context binary, w8a16
`unet.onnx`	9.4 KB	EPContext wrapper, references `unet_qairt_context.bin`
`unet_qairt_context.bin`	841 MB	UNet w/ 13 ControlNet residual inputs, w8a16
`controlnet.onnx`	7.4 KB	EPContext wrapper, references `controlnet_qairt_context.bin`
`controlnet_qairt_context.bin`	352 MB	ControlNet-Canny encoder, w8a16
`vae.onnx`	873 B	EPContext wrapper, references `vae_qairt_context.bin`
`vae_qairt_context.bin`	62 MB	VAE decoder, w8a16
`metadata.json`	17 KB	Input/output shapes + quantization scale/zero-point per tensor

Total: 1.4 GB on disk.

Loading via ONNX Runtime QNN EP

Each .onnx file is a tiny EPContext wrapper: it references the matching *_qairt_context.bin and tells ORT's QNN Execution Provider to dispatch inference to the Hexagon V75 backend. Both files of a pair must sit in the same directory at load time.

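import ai.onnxruntime.OrtEnvironment
import ai.onnxruntime.OrtSession
val env = OrtEnvironment.getEnvironment()
val options = OrtSession.SessionOptions().apply {
    // Run on the Hexagon HTP backend, V75 architecture.
    addQnn(mapOf("backend_path" to "libQnnHtp.so", "htp_arch" to "75"))
}
val unet = env.createSession("unet.onnx", options)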
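
Since each wrapper depends on its context binary sitting next to it, a pre-flight check can catch an incomplete download before session creation. The following is a minimal sketch, not part of Sona Forge or ONNX Runtime; `bundlePairs` and `missingBundleFiles` are hypothetical names, and the file names are taken from the Files table above.

```kotlin
import java.io.File

// Wrapper -> context-binary pairs, as listed in the Files table.
val bundlePairs = mapOf(
    "text_encoder.onnx" to "text_encoder_qairt_context.bin",
    "unet.onnx" to "unet_qairt_context.bin",
    "controlnet.onnx" to "controlnet_qairt_context.bin",
    "vae.onnx" to "vae_qairt_context.bin"
)

// Hypothetical helper: returns the names of bundle files missing from `dir`.
// An empty list means every wrapper has its context binary beside it.
fun missingBundleFiles(dir: File): List<String> =
    bundlePairs
        .flatMap { (wrapper, context) -> listOf(wrapper, context) }
        .filterNot { File(dir, it).isFile }
```

Calling `missingBundleFiles(modelDir)` before `createSession` and refusing to proceed on a non-empty result gives a clearer error than a failed session load.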

Inputs / outputs are uint16 quantized (NHWC)

Unlike Sona Forge's CPU-EP packs (sd15-controlnet-canny-fp16), these take uint16 quantized tensors in NHWC layout, with per-tensor scale and zero_point in metadata.json. Host code must:

Quantize FP32/FP16 inputs to uint16 using the scale and zero_point from metadata.json.
Permute NCHW → NHWC before feeding ([1, 4, 64, 64] → [1, 64, 64, 4]).
Reverse both steps for outputs (uint16 → FP32, NHWC → NCHW).
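
The steps above can be sketched in plain Kotlin as follows. This is an illustration under stated assumptions, not the app's actual code: it assumes the common affine convention `real = scale * (q - zero_point)`, which should be checked against metadata.json and the Qualcomm model card (some QNN tooling stores the offset with the opposite sign). All function names are hypothetical, and uint16 values are carried as the raw bits of a `ShortArray`.

```kotlin
import kotlin.math.roundToInt

// Assumed convention: real = scale * (q - zeroPoint). Verify against metadata.json.
fun quantizeU16(x: FloatArray, scale: Float, zeroPoint: Int): ShortArray =
    ShortArray(x.size) { i ->
        // Widen to Long so adding the zero point cannot overflow before clamping.
        val q = (x[i] / scale).roundToInt().toLong() + zeroPoint
        q.coerceIn(0L, 65535L).toInt().toShort() // keeps the uint16 bit pattern
    }

fun dequantizeU16(q: ShortArray, scale: Float, zeroPoint: Int): FloatArray =
    FloatArray(q.size) { i ->
        // Mask to reinterpret the signed Short as an unsigned 16-bit value.
        ((q[i].toInt() and 0xFFFF) - zeroPoint) * scale
    }

// NCHW -> NHWC for a single-image batch (N = 1).
fun nchwToNhwc(src: FloatArray, c: Int, h: Int, w: Int): FloatArray {
    require(src.size == c * h * w) { "size does not match C*H*W" }
    val dst = FloatArray(src.size)
    for (ch in 0 until c) for (y in 0 until h) for (x in 0 until w) {
        dst[(y * w + x) * c + ch] = src[(ch * h + y) * w + x]
    }
    return dst
}

// NHWC -> NCHW, the inverse of the function above.
fun nhwcToNchw(src: FloatArray, c: Int, h: Int, w: Int): FloatArray {
    require(src.size == c * h * w) { "size does not match C*H*W" }
    val dst = FloatArray(src.size)
    for (ch in 0 until c) for (y in 0 until h) for (x in 0 until w) {
        dst[(ch * h + y) * w + x] = src[(y * w + x) * c + ch]
    }
    return dst
}
```

For a latent of shape [1, 4, 64, 64], an input would go through `quantizeU16(nchwToNhwc(latent, 4, 64, 64), scale, zeroPoint)`, and an output through `nhwcToNchw(dequantizeU16(raw, scale, zeroPoint), 4, 64, 64)`, with `scale` and `zeroPoint` read per tensor from metadata.json.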

metadata.json lists the shape, dtype, and quantization parameters of every input and output. See the Qualcomm AI Hub ControlNet-Canny model card for the full pre/post-processing reference.

Provenance

Stable Diffusion v1.5 weights
  CompVis (https://github.com/CompVis/stable-diffusion)
  CreativeML Open RAIL-M
        │
        ▼
ControlNet-Canny adapter
  lllyasviel (https://github.com/lllyasviel/ControlNet)
  Apache 2.0
        │
        ▼
Quantization (w8a16) + QAIRT 2.42 HTP compilation for V75
  Qualcomm AI Hub (https://aihub.qualcomm.com)
  Use restrictions per Qualcomm AI Hub Terms — see LICENSE / ATTRIBUTION.md
        │
        ▼
This repository
  sona-forge mirror, no further modifications

Tooling versions baked in

QAIRT: 2.42.0.251225135753_193295
ONNX Runtime: 1.24.3
Quantization: w8a16 (8-bit weights, 16-bit activations)
Target SoC: Snapdragon 8 Gen 3 — Hexagon V75 only

These artefacts will not load on:

Hexagon V73 (8 Gen 2 / 7 Gen 4) — needs a _8gen2 variant
Hexagon V79 (8 Elite / 8 Elite Gen 5) — needs a _8elite variant
Any non-Qualcomm SoC
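
An app can refuse to download the bundle on an unsupported device rather than fail at session creation. The sketch below is an assumption-laden illustration: it assumes Snapdragon 8 Gen 3 devices report a SoC model string starting with `SM8650`, and that the string comes from `android.os.Build.SOC_MODEL` (available from Android API 31). `isSnapdragon8Gen3` is a hypothetical helper and should be validated on real hardware.

```kotlin
// Assumption: Snapdragon 8 Gen 3 reports a SoC model beginning with "SM8650".
// On Android, pass android.os.Build.SOC_MODEL (API 31+) as the argument.
fun isSnapdragon8Gen3(socModel: String): Boolean =
    socModel.trim().startsWith("SM8650", ignoreCase = true)
```

Taking the model string as a parameter keeps the check testable off-device.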

For other targets, mirror the matching zip from qaihub-public-assets.s3.us-west-2.amazonaws.com or recompile via the Qualcomm AI Hub Python API.

License

See LICENSE and ATTRIBUTION.md. The bundle is governed by the union of:

CreativeML Open RAIL-M (SD 1.5 base)
Apache 2.0 (ControlNet adapter)
Qualcomm AI Hub Terms of Service (compiled artefacts)

Use restrictions from the upstream model cards apply, including (but not limited to) prohibitions on biometric identification, social scoring, generation of CSAM, and harassment. Read the upstream cards before deploying.
