Instructions for using stepfun-ai/Step-3.7-Flash with libraries and local apps.

Libraries

How to use stepfun-ai/Step-3.7-Flash with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="stepfun-ai/Step-3.7-Flash", trust_remote_code=True)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("stepfun-ai/Step-3.7-Flash", trust_remote_code=True, dtype="auto")

Local Apps

vLLM

How to use stepfun-ai/Step-3.7-Flash with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "stepfun-ai/Step-3.7-Flash"
# In a second terminal, once the server is up, call it using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "stepfun-ai/Step-3.7-Flash",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
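
The same request can be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_payload` and `post_chat` are hypothetical, and the default base URL assumes the vLLM server started above is listening on port 8000 as in the curl call.

```python
import json
import urllib.request

MODEL_ID = "stepfun-ai/Step-3.7-Flash"


def build_chat_payload(prompt, image_url, model=MODEL_ID):
    """Build the same JSON body as the curl example (hypothetical helper)."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def post_chat(payload, base_url="http://localhost:8000"):
    """POST the payload to /v1/chat/completions and return the decoded JSON."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a server running, `post_chat(build_chat_payload("Describe this image in one sentence.", image_url))` would return the parsed response. For a server on another port, pass a different `base_url`, for example `http://localhost:30000`.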

Use Docker

docker model run hf.co/stepfun-ai/Step-3.7-Flash

SGLang

How to use stepfun-ai/Step-3.7-Flash with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "stepfun-ai/Step-3.7-Flash" \
    --host 0.0.0.0 \
    --port 30000
# In a second terminal, once the server is up, call it using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "stepfun-ai/Step-3.7-Flash",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
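
Because the endpoint is OpenAI-compatible, the reply text should sit under `choices[0].message.content` in the response body. A minimal sketch of pulling it out; `extract_reply` is a hypothetical helper, and the sample body is a hand-written illustration of the response shape, not actual model output.

```python
import json


def extract_reply(response_body):
    """Return the assistant text from an OpenAI-style chat completion body."""
    data = json.loads(response_body)
    choices = data.get("choices") or []
    if not choices:
        # Error responses carry no choices; surface the raw body instead of guessing.
        raise ValueError(f"no choices in response: {response_body!r}")
    return choices[0]["message"]["content"]


# Hand-written sample in the OpenAI response shape (illustrative only).
sample_body = json.dumps(
    {"choices": [{"index": 0, "message": {"role": "assistant", "content": "<reply text>"}}]}
)
print(extract_reply(sample_body))
```

Raising on a missing `choices` list keeps server-side errors visible rather than silently returning an empty string.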

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "stepfun-ai/Step-3.7-Flash" \
        --host 0.0.0.0 \
        --port 30000
# In a second terminal, once the server is up, call it using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "stepfun-ai/Step-3.7-Flash",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
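
The container needs time to load the model before it accepts requests, so a client may want to wait before sending the first call. The sketch below polls until the server answers; it assumes the server exposes the OpenAI-style `GET /v1/models` route, and `server_is_up` and `wait_until_ready` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request


def server_is_up(base_url="http://localhost:30000"):
    """Return True if GET /v1/models answers with HTTP 200 (assumed endpoint)."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/models", timeout=5) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, HTTP error, or timeout: not ready yet.
        return False


def wait_until_ready(probe=server_is_up, timeout_s=600.0, interval_s=5.0):
    """Call probe() until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Passing the probe as a parameter keeps the retry loop independent of the transport, so a different check can be swapped in if the assumed route is not available.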

Docker Model Runner

How to use stepfun-ai/Step-3.7-Flash with Docker Model Runner:

```
docker model run hf.co/stepfun-ai/Step-3.7-Flash
```

Step-3.7-Flash

Commit History

add cache_position to mask_kwargs in modeling_step3p7.py (#13) 5f62440 (WinstonDeng, shifangxu2024 committed on Jun 3)

delete unused parameters `use_qk_norm` f711925 verified (Tingdan committed on Jun 1)

Update README.md 751ea53 verified (mh3467 committed on May 29)

Fix repo name casing in SGLang FP8 example 5371d3c (mh3467 committed on May 29)

Update README.md f69ad14 verified (mh3467 committed on May 29)

update Ecosystem bullet 63f1ea4 (hengm3467 committed on May 29)

Update README.md 5c5f97c verified (WinstonDeng committed on May 28)

update benchmark chart a214531 (hengm3467 committed on May 28)

fix vLLM intro to match StepFun-specific Docker tag f0bbde3 (hengm3467 committed on May 28)

update Local Deployment install instructions e7debd8 (hengm3467 committed on May 28)

sync to v3 model card e03d01d (hengm3467 committed on May 28)

sync to v2 model card fbcee77 (hengm3467 committed on May 28)

describe MoE as sparse for precision 8817d4a (hengm3467 committed on May 28)

clarify regional base_url and use env vars in examples ee8c807 (hengm3467 committed on May 28)

add benchmark chart above Pricing section 1678751 (hengm3467 committed on May 28)

clean up model card: fix numbering, typos, and code examples 47ec689 (hengm3467 committed on May 28)

add readme (#1) 480473d (mh3467 committed on May 28)

update processor config a9c0171 (luotingdan committed on May 26)

add step-3.7-flash bf16 model libs 7805a18 verified (WinstonDeng committed on May 23)

add step-3.7-flash bf16 model libs c171c6a verified (WinstonDeng committed on May 23)

add step-3.7-flash bf16 model config 457483d verified (WinstonDeng committed on May 23)

step-3.7-flash bf16 model 6578499 verified (WinstonDeng committed on May 23)

initial commit dc3047b verified (WinstonDeng committed on May 23)
