Instructions for using stepfun-ai/Step-3.7-Flash-FP8 with libraries, inference servers, and local apps.

Libraries

How to use stepfun-ai/Step-3.7-Flash-FP8 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="stepfun-ai/Step-3.7-Flash-FP8", trust_remote_code=True)
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
# Run generation and print the raw pipeline output
outputs = pipe(text=messages)
print(outputs)

# Load model directly
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("stepfun-ai/Step-3.7-Flash-FP8", trust_remote_code=True, dtype="auto")

Local Apps

vLLM

How to use stepfun-ai/Step-3.7-Flash-FP8 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "stepfun-ai/Step-3.7-Flash-FP8"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "stepfun-ai/Step-3.7-Flash-FP8",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
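
The same request can be sent from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server started above is listening on localhost:8000. The helper names `build_chat_payload` and `post_chat` are hypothetical and not part of vLLM; they only mirror the JSON body of the curl call.

```python
import json
import urllib.request


def build_chat_payload(model, prompt, image_url):
    """Build the same JSON body as the curl example (text part plus image part)."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def post_chat(base_url, payload, timeout=120):
    """POST the payload to the OpenAI-compatible chat endpoint and return parsed JSON.

    Assumes a server is already running at base_url (hypothetical helper).
    """
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

To use it, build the payload with the model name, prompt, and image URL from the curl example, then call `post_chat("http://localhost:8000", payload)`. The timeout of 120 seconds is an arbitrary choice for illustration.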

Use Docker

docker model run hf.co/stepfun-ai/Step-3.7-Flash-FP8

SGLang

How to use stepfun-ai/Step-3.7-Flash-FP8 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "stepfun-ai/Step-3.7-Flash-FP8" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "stepfun-ai/Step-3.7-Flash-FP8",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
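
The request above points at a publicly hosted image. For a local file, one common approach is to embed the image as a base64 data URL in the `image_url.url` field. The sketch below assumes the server accepts data URLs there, which OpenAI-compatible servers commonly do but which should be verified against the SGLang version in use. The function names are hypothetical.

```python
import base64
import mimetypes


def image_bytes_to_data_url(data, mime_type="image/jpeg"):
    """Encode raw image bytes as a data URL of the form data:<mime>;base64,<payload>."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_file_to_data_url(path):
    """Read an image file and return it as a data URL, guessing the MIME type from the name."""
    mime_type, _ = mimetypes.guess_type(path)
    with open(path, "rb") as handle:
        data = handle.read()
    # Fall back to a generic type when the extension is not recognized.
    return image_bytes_to_data_url(data, mime_type or "application/octet-stream")
```

The returned string can replace the `https://...` value of `"url"` in the request body. Large images inflate the request by roughly a third because of the base64 encoding.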

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "stepfun-ai/Step-3.7-Flash-FP8" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "stepfun-ai/Step-3.7-Flash-FP8",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
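
The server answers with a JSON document in the OpenAI chat-completions layout, where the generated text sits under `choices[0].message.content`. The sketch below pulls that text out of an already parsed response. `extract_reply` is a hypothetical helper, and it assumes a non-streaming response with plain string content.

```python
def extract_reply(response):
    """Return the assistant text from a parsed chat-completions response dict.

    Assumes the OpenAI-compatible layout: choices[0].message.content.
    """
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    content = choices[0].get("message", {}).get("content")
    if content is None:
        raise ValueError("first choice has no message content")
    return content
```

For example, passing `json.loads(...)` of the curl output to `extract_reply` yields only the model's sentence, without the surrounding metadata such as token usage.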

Docker Model Runner
How to use stepfun-ai/Step-3.7-Flash-FP8 with Docker Model Runner:
```
docker model run hf.co/stepfun-ai/Step-3.7-Flash-FP8
```

Step-3.7-Flash-FP8

Commit History

delete unused parameters `use_qk_norm`

b3d7916
verified

Tingdan committed on Jun 1

Update README.md

3b6ed65
verified

mh3467 committed on May 29

Remove sibling repo links (Collection sidebar covers this)

8082ab2

mh3467 committed on May 29

Fix repo name casing in SGLang FP8 example

fa85eb9

mh3467 committed on May 29

Add benchmark chart referenced in README

9563b80

mh3467 committed on May 29

Update README.md

d14f10b
verified

WinstonDeng committed on May 28

update processor config and support transformers 5.0+

456ec15

luotingdan committed on May 26

add mtp quant ignore

77ddf22
verified

Tingdan committed on May 26

add step-3.7-flash fp8 model libs

5789f7a
verified

WinstonDeng committed on May 23

add step-3.7-flash fp8 model libs

b1e8330
verified

WinstonDeng committed on May 23

initial commit

b2c04c6
verified

WinstonDeng committed on May 23
