Cortexelus Claude Opus 4.7 (1M context) committed on
Commit e11ce65 · 1 Parent(s): 97651ef

Add pre-processed FP16-mixed ONNX for SA3 DiTs

Ships dit_fp16mixed.onnx alongside the original FP32 dit.onnx for each
SA3 DiT (sm-music, sm-sfx, sa3-m). Consumers compiling for a non-sm_90
architecture can now build the DiT engine with a plain
`build_from_onnx.py sa3-sm-music` (STRONGLY_TYPED, no precision flags
needed) — no onnx-graphsurgeon dependency required.

The pre-processing was done once with build_dit_fp16mixed.py:
- 140 RMSNorm chains wrapped in Cast(FP32)/Cast(FP16) islands
- 40 attention Softmax nodes wrapped
- ~186 RoPE-region nodes (reachable from Cast(to=FP32) feeding Cos/Sin) kept in FP32
- Non-island weights converted to FP16

ONNX size shrinks roughly in half (1.84 GB → 921 MB for sm-music) because
most weights are now FP16.
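
As a quick check of the "roughly in half" figure, the two sizes quoted in this commit can be compared directly. The FP32 value below is the rounded 1.84 GB from the message, so the ratio is only approximate.

```python
# FP32 size is the rounded figure from the commit message (decimal GB).
FP32_BYTES_APPROX = 1.84e9
# FP16-mixed size is the exact Git LFS size recorded for sa3-sm-music.
FP16_MIXED_BYTES = 921_133_617

ratio = FP16_MIXED_BYTES / FP32_BYTES_APPROX
print(f"fp16mixed / fp32 = {ratio:.2f}")
```

The ratio comes out at about 0.50, as expected when most weights drop from 4 bytes to 2 bytes per element.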

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

onnx/sa3-m/dit_fp16mixed.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2976b78a536fd48c1cd267bcd8c4925336b43873b0472f8334aa4cb7833d29d6
3
+ size 4249742
onnx/sa3-m/dit_fp16mixed.onnx.data ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a038ae6d82c4be77578c2874ea3edc8e7dda2be2e9c605abfeea1ead997e9035
3
+ size 2906724352
onnx/sa3-sm-music/dit_fp16mixed.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:86fd5780edfbe1d4e6758ce8db0c184821783bb4754ba99fd778d87bab277731
3
+ size 921133617
onnx/sa3-sm-sfx/dit_fp16mixed.onnx ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8154582d5c8e7efc7154469642cb75ce37bb928e2e7658388e910f7ce75b476f
3
+ size 921133617
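
The files added above are Git LFS pointers, not the ONNX payloads themselves. A minimal sketch of reading one such pointer follows; `parse_lfs_pointer` is a hypothetical helper written for this example and handles only the three keys that appear in this commit.

```python
def parse_lfs_pointer(text):
    """Parse the `key value` lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Each pointer line is a key, one space, then the value.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid is written as "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents for onnx/sa3-sm-music/dit_fp16mixed.onnx, copied from this commit.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:86fd5780edfbe1d4e6758ce8db0c184821783bb4754ba99fd778d87bab277731
size 921133617
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], info["size"])
```

The sm-music and sm-sfx pointers report the same size (921133617 bytes) but different digests, so comparing the `oid` rather than the size is what distinguishes the two files.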