DSv4 Flash DSpark Draft Sidecar
This repository contains the DSpark draft sidecar package for DSv4 Flash. It is intended to be used with compatible ds4-ssd DSpark support and a matching DSv4 Flash base package. It is not a standalone text-generation model.
Active implementation work is on the dspark-attn branch:
https://github.com/Anemll/ds4-ssd/tree/dspark-attn
Package Contents
manifest.json
draft_layer_000.bin
draft_layer_001.bin
draft_layer_002.bin
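Before pointing a runtime at the package, it can help to confirm that all four files are present. The following is a minimal sketch and not part of ds4-ssd: the helper name `check_sidecar_package` is hypothetical, and the manifest is only parsed as generic JSON because its schema is not described here.

```python
import json
from pathlib import Path

# File names taken from the package listing above.
EXPECTED_FILES = (
    "manifest.json",
    "draft_layer_000.bin",
    "draft_layer_001.bin",
    "draft_layer_002.bin",
)


def check_sidecar_package(package_dir):
    """Return the expected files that are missing from package_dir.

    Hypothetical helper: it checks presence only and assumes nothing
    about the manifest schema or the layout of the .bin files.
    """
    root = Path(package_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    if "manifest.json" not in missing:
        # Parse only to confirm the manifest is well-formed JSON.
        with open(root / "manifest.json", "r", encoding="utf-8") as fh:
            json.load(fh)
    return missing
```

An empty return value means every listed file was found; a malformed manifest raises `json.JSONDecodeError`.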
The draft model is exported from the original FP8/FP4 DSpark weights into sidecar form. The sidecar format is the intended fast path for this package; in this workflow it is faster than GGUF-based loading.
Current Status
This is a work-in-progress package for the DSpark speculative decoding path. In current testing with DS4 IQ2_Q2, the draft sidecar path has been roughly 5 tokens/sec faster than regular decoding. The acceptance rate is good; the main bottleneck at present is the verifier, specifically in batched attention. Treat these performance numbers as implementation snapshots, not final release guarantees.
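To illustrate why verifier cost can cap the gain even when acceptance is good, here is a back-of-the-envelope model of speculative decoding in general. It is not taken from ds4-ssd. It assumes each drafted token is accepted independently with probability `alpha`, that one step costs `k` draft passes plus one verify pass, and all function and parameter names are hypothetical.

```python
def expected_tokens_per_step(alpha, k):
    """Expected tokens emitted per verify step when k tokens are drafted.

    Simplifying assumption: each drafted token is accepted independently
    with probability alpha. The step yields the accepted prefix plus one
    token from the verifier, i.e. sum of alpha**i for i in 0..k.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    return sum(alpha ** i for i in range(k + 1))


def speculative_tokens_per_sec(alpha, k, draft_ms, verify_ms):
    """Modelled throughput when a step costs k draft passes + one verify pass."""
    step_ms = k * draft_ms + verify_ms
    return 1000.0 * expected_tokens_per_step(alpha, k) / step_ms
```

Under this model, holding `alpha` fixed and raising `verify_ms` lowers throughput directly, which is consistent with the verifier, not the acceptance rate, being the current limit. Any inputs passed to these functions are placeholders, not measurements from this package.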
Runtime Notes
This package is being used with MPP 4.1. macOS 27 or newer is likely required until older-runtime compatibility is validated.
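A host can be checked against the macOS requirement before attempting a run. This is a minimal sketch using Python's standard `platform` module; the helper names are hypothetical, and the threshold of 27 simply mirrors the note above, which describes it as likely rather than confirmed.

```python
import platform

# From the runtime note above; described there as "likely", not confirmed.
MIN_MACOS_MAJOR = 27


def macos_major(version):
    """Parse the major version from a string like '27.1'; None if absent."""
    head = version.split(".")[0]
    return int(head) if head.isdigit() else None


def meets_macos_requirement(version=None):
    """True if the given (or detected) macOS version is >= MIN_MACOS_MAJOR."""
    if version is None:
        # mac_ver() returns an empty release string on non-macOS hosts.
        version = platform.mac_ver()[0]
    major = macos_major(version)
    return major is not None and major >= MIN_MACOS_MAJOR
```

Calling `meets_macos_requirement()` with no argument inspects the current host and returns `False` on non-macOS systems.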
For current build flags, runtime options, and compatibility notes, use the dspark-attn branch linked above. The interface may change as the DSpark path is finalized.