Step 3.5 PRISM
PRISM abliterated StepFun Step 3.5 Flash MoE: safetensors, GGUF, NVFP4.
An unrestricted ("unchained"), role-play-following PRISM-LITE version of StepFun's Step 3.5 Flash, intended specifically to suppress over-refusal and propaganda mechanisms using our SOTA PRISM pipeline.
PRISM-PRO version available here: https://ko-fi.com/s/d70e27c5b5
For full custom production PRISM versions and raw tensors, reach out at https://ko-fi.com/ex0bit.
If you enjoy our work and find it useful, please consider sponsoring or supporting us!

| Option | Description |
|---|---|
| PRISM VIP Membership | Access to all PRISM models |
| Bitcoin | bc1qarq2pyn4psjpcxzp2ghgwaq6y2h4e53q232x8r |

| Specification | Value |
|---|---|
| Architecture | Sparse Mixture-of-Experts (MoE) |
| Backbone | 45-layer Transformer (4,096 hidden dim) |
| Total Parameters | 196.81B (196B Backbone + 0.81B Head) |
| Activated Parameters | ~11B (per token) |
| Routed Experts per Layer | 288 |
| Shared Experts | 1 (always active) |
| Selected Experts per Token | Top-8 |
| Vocabulary Size | 128,896 |
| Context Length | 256K |
| Attention | Hybrid SWA (3:1 SWA-to-Full ratio) |
| MTP Head | Sliding-window attention + dense FFN (4 tokens/pass) |
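
The sparsity implied by the table can be checked with a little arithmetic. The sketch below only reuses figures from the table; the function names are illustrative, and treating "~11B" as exactly 11.0 is an approximation.

```python
# Figures copied from the specification table.
TOTAL_PARAMS_B = 196.81   # total parameters, in billions
ACTIVE_PARAMS_B = 11.0    # approximate activated parameters per token, in billions
ROUTED_EXPERTS = 288      # routed experts per layer
SHARED_EXPERTS = 1        # shared expert, always active
TOP_K = 8                 # routed experts selected per token


def active_param_fraction() -> float:
    """Fraction of all parameters that are used for a single token."""
    return ACTIVE_PARAMS_B / TOTAL_PARAMS_B


def experts_run_per_layer() -> int:
    """Experts evaluated per token in one MoE layer (routed top-k plus shared)."""
    return TOP_K + SHARED_EXPERTS


def routed_expert_fraction() -> float:
    """Fraction of the routed experts in a layer that a token is sent to."""
    return TOP_K / ROUTED_EXPERTS


if __name__ == "__main__":
    print(f"active parameters: {active_param_fraction():.1%} of total")
    print(f"experts run per layer: {experts_run_per_layer()}")
    print(f"routed experts used: {routed_expert_fraction():.1%}")
```

Under these numbers, each token touches roughly one eighteenth of the weights, which is why the per-token compute is much smaller than the total parameter count suggests.
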

| Benchmark | Step 3.5 Flash | DeepSeek V3.2 | Kimi K2.5 | GLM-4.7 | MiniMax M2.1 |
|---|---|---|---|---|---|
| Agent | | | | | |
| τ²-Bench | 88.2 | 80.3 | 85.4 | 87.4 | 86.6 |
| BrowseComp | 51.6 | 51.4 | 60.6 | 52.0 | 47.4 |
| GAIA (no file) | 84.5 | 75.1 | 75.9 | 61.9 | 64.3 |
| xbench-DeepSearch (2025.05) | 83.7 | 78.0 | 76.7 | 72.0 | 68.7 |
| Reasoning | | | | | |
| AIME 2025 | 97.3 | 93.1 | 96.1 | 95.7 | 83.0 |
| HMMT 2025 (Feb.) | 98.4 | 92.5 | 95.4 | 97.1 | 71.0 |
| IMOAnswerBench | 85.4 | 78.3 | 81.8 | 82.0 | 60.4 |
| Coding | | | | | |
| LiveCodeBench-V6 | 86.4 | 83.3 | 85.0 | 84.9 | — |
| SWE-bench Verified | 74.4 | 73.1 | 76.8 | 73.8 | 74.0 |
| Terminal-Bench 2.0 | 51.0 | 46.4 | 50.8 | 41.0 | 47.9 |

For local deployment (requires ~120 GB of VRAM for INT4; smaller quants are available):

    ./llama-cli -m step3.5_flash_prism_Q4_K_S.gguf --jinja


| Use Case | Temperature | Top-P | Max New Tokens |
|---|---|---|---|
| Reasoning / Coding | 1.0 | 0.95 | 32768 |
| General Chat | 0.6 | 0.95 | 4096 |
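
One way to apply these settings is to keep them as named presets and turn them into command-line arguments. The sketch below is illustrative only: the preset names and the `llama_cli_args` helper are hypothetical, and it assumes the `--temp`, `--top-p`, and `-n` options of llama.cpp's `llama-cli`.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SamplingPreset:
    temperature: float
    top_p: float
    max_new_tokens: int


# Values copied from the table; the keys are hypothetical short names.
SAMPLING_PRESETS: Dict[str, SamplingPreset] = {
    "reasoning": SamplingPreset(temperature=1.0, top_p=0.95, max_new_tokens=32768),
    "chat": SamplingPreset(temperature=0.6, top_p=0.95, max_new_tokens=4096),
}


def llama_cli_args(model_path: str, use_case: str) -> List[str]:
    """Build a llama-cli argument list for the chosen preset."""
    preset = SAMPLING_PRESETS[use_case]
    return [
        "./llama-cli", "-m", model_path, "--jinja",
        "--temp", str(preset.temperature),
        "--top-p", str(preset.top_p),
        "-n", str(preset.max_new_tokens),
    ]


if __name__ == "__main__":
    print(" ".join(llama_cli_args("step3.5_flash_prism_Q4_K_S.gguf", "chat")))
```

The returned list can be passed to `subprocess.run` directly, which avoids shell quoting issues with model paths.
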

| Setup | Details |
|---|---|
| BF16 (Full) | 8x H100/A100 80GB with tensor parallelism |
| FP8 Quantized | 8x A100 80GB with expert parallelism |
| GGUF INT4 (Local) | ~120 GB unified memory (Mac Studio M4 Max 128GB, DGX Spark, AMD Ryzen AI Max+ 395) |
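
The memory figures above can be sanity-checked from the parameter count. The sketch below estimates weight storage only, ignoring KV cache, activations, and runtime overhead; the 4.5 bits-per-weight value is an assumed rough average for a 4-bit GGUF quant, not a number taken from this model card.

```python
TOTAL_PARAMS = 196.81e9  # total parameters, from the specification table


def weight_gb(bits_per_weight: float, params: float = TOTAL_PARAMS) -> float:
    """Approximate weight storage in decimal gigabytes (weights only)."""
    return params * bits_per_weight / 8 / 1e9


if __name__ == "__main__":
    # 16 and 8 bits follow from the BF16 and FP8 formats; 4.5 is an assumption.
    for label, bits in [("BF16", 16.0), ("FP8", 8.0), ("4-bit GGUF (assumed 4.5 bpw)", 4.5)]:
        print(f"{label}: ~{weight_gb(bits):.0f} GB")
```

Under these assumptions, BF16 weights come to roughly 394 GB, which fits within the 640 GB of an 8x 80 GB node, and a 4-bit quant comes to roughly 111 GB, consistent with the ~120 GB figure once some headroom for context is added.
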
This model is released under the PRISM Research License.
Based on Step 3.5 Flash by StepFun AI. See the technical report and blog post for more details on the base model.
Base model: stepfun-ai/Step-3.5-Flash