Instructions for using Kwaipilot/HiPO-8B with libraries, inference providers, notebooks, and local apps. Follow the sections below to get started.
- Libraries
- Transformers
How to use Kwaipilot/HiPO-8B with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Kwaipilot/HiPO-8B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Kwaipilot/HiPO-8B")
model = AutoModelForCausalLM.from_pretrained("Kwaipilot/HiPO-8B")

messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```
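For interactive use, you can stream tokens as they are generated instead of waiting for the full completion. A minimal sketch building on the "load model directly" snippet above; the sampling settings are illustrative, not values published for this model:

```python
from transformers import TextStreamer

# Print tokens to stdout as they arrive, skipping the echoed prompt
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

outputs = model.generate(
    **inputs,
    streamer=streamer,
    max_new_tokens=256,
    do_sample=True,   # illustrative sampling settings; tune for your use case
    temperature=0.7,
    top_p=0.9,
)
```

- Notebooks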
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use Kwaipilot/HiPO-8B with vLLM:
Install from pip and serve the model
```sh
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "Kwaipilot/HiPO-8B"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "Kwaipilot/HiPO-8B",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ]
    }'
```
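Since the server exposes an OpenAI-compatible API, the official `openai` Python client also works. A minimal sketch, assuming the server above is running on its default port:

```python
from openai import OpenAI

# Point the client at the local vLLM server; the API key is an unused placeholder
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Kwaipilot/HiPO-8B",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
```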
Use Docker

To serve the same OpenAI-compatible API with Docker, use vLLM's official `vllm/vllm-openai` image (the GPU flags assume the NVIDIA Container Toolkit is installed):

```sh
docker run --runtime nvidia --gpus all \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    -p 8000:8000 \
    --ipc=host \
    vllm/vllm-openai:latest \
    --model "Kwaipilot/HiPO-8B"
```
- SGLang
How to use Kwaipilot/HiPO-8B with SGLang:
Install from pip and serve the model
```sh
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Kwaipilot/HiPO-8B" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "Kwaipilot/HiPO-8B",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ]
    }'
```
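SGLang serves the same OpenAI-compatible API, so the Python client from the vLLM section works here too; only the port changes:

```python
from openai import OpenAI

# Same OpenAI-compatible API, on SGLang's default port
client = OpenAI(base_url="http://localhost:30000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="Kwaipilot/HiPO-8B",
    messages=[{"role": "user", "content": "What is the capital of France?"}],
)
print(response.choices[0].message.content)
```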
Use Docker images

```sh
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Kwaipilot/HiPO-8B" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "Kwaipilot/HiPO-8B",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ]
    }'
```

- Docker Model Runner
How to use Kwaipilot/HiPO-8B with Docker Model Runner:
```sh
docker model run hf.co/Kwaipilot/HiPO-8B
```
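By default `docker model run` opens an interactive chat session. You can also pass a one-shot prompt as an argument; a sketch assuming Docker Model Runner is enabled in your Docker installation:

```sh
# One-shot prompt instead of an interactive chat session
docker model run hf.co/Kwaipilot/HiPO-8B "What is the capital of France?"
```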