WinstonDeng committed on
Commit 5c5f97c · verified · 1 Parent(s): a214531

Update README.md

Files changed (1)
  1. README.md +8 -0
README.md CHANGED
@@ -10,6 +10,14 @@ tags:
  - moe
  ---
 
+ - **[ModelPage]**: https://static.stepfun.com/blog/step-3.7-flash/
+ - **[Github]**: https://github.com/stepfun-ai/Step-3.7-Flash
+ - **[HuggingFace]**:
+ - BF16: https://huggingface.co/stepfun-ai/Step-3.7-Flash/
+ - FP8: https://huggingface.co/stepfun-ai/Step-3.7-Flash-FP8
+ - NVFP4: https://huggingface.co/stepfun-ai/Step-3.7-Flash-NVFP4
+ - GGUF: https://huggingface.co/stepfun-ai/Step-3.7-Flash-GGUF
+
  ## 1. Introduction
 
  Step 3.7 Flash is a 198B-parameter sparse Mixture-of-Experts (MoE) vision-language model that combines a 196B-parameter language backbone with a 1.8B-parameter vision encoder for native image understanding. Engineered for high-frequency production workloads, it activates approximately 11B parameters per token and delivers a throughput of up to 400 tokens per second. Step 3.7 Flash supports a 256k context window and offers three selectable reasoning levels (low, medium, and high) so developers can easily balance speed, cost, and cognitive depth.
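
The parameter figures quoted in the introduction can be sanity-checked with a little arithmetic. The sketch below is illustrative only: the constant and function names are hypothetical, and the only inputs are the rounded numbers stated in the paragraph above (196B, 1.8B, and roughly 11B active), so the results are approximate.

```python
# Figures quoted in the README introduction, in billions of parameters.
# All names here are hypothetical; the values are the rounded numbers from the text.
LANGUAGE_BACKBONE_B = 196.0
VISION_ENCODER_B = 1.8
ACTIVE_PER_TOKEN_B = 11.0  # the text says "approximately 11B"


def total_params_b() -> float:
    """Sum of the language backbone and the vision encoder."""
    return LANGUAGE_BACKBONE_B + VISION_ENCODER_B


def active_fraction() -> float:
    """Share of all parameters activated per token (sparse MoE)."""
    return ACTIVE_PER_TOKEN_B / total_params_b()


if __name__ == "__main__":
    print(f"total parameters: about {total_params_b():.1f}B")
    print(f"active per token: about {active_fraction():.1%} of the total")
```

Under these rounded inputs the two components sum to 197.8B, which is consistent with the headline 198B, and somewhat under 6% of the parameters are active for any given token.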