SipsaLabs/phi-3.5-moe-instruct-uc-v3-bpw5
Lossless 5-bit transformer compression with SHA-256-verifiable, bit-identical reconstruction across 22 architectures (dense, MoE, and state-space). Hermes-3-Llama-3.1-405B runs at 1.0066x perplexity (PPL) on a single 32 GB GPU. OpenAI-compatible inference API at api.sipsalabs.com/v1.
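The SHA-256 check mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the project's own tooling: the helper names `sha256_of_file` and `is_bit_identical` are hypothetical, and it assumes a reference digest for the reconstructed file has been published somewhere you trust.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_bit_identical(path, expected_hex):
    """Compare a file's digest with a published reference digest.

    `expected_hex` is an assumption: a hex string obtained from the
    publisher. Matching digests imply the bytes are identical.
    """
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Reading in chunks keeps memory use flat, which matters for multi-gigabyte weight shards. A typical call would be `is_bit_identical("reconstructed.safetensors", published_digest)`, where both the file name and the digest are placeholders.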