Instructions for using yuyijiong/speculative_pipeline_decoding with libraries and local apps.

Libraries

How to use yuyijiong/speculative_pipeline_decoding with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="yuyijiong/speculative_pipeline_decoding")

# Load model directly
from transformers import AutoModel
model = AutoModel.from_pretrained("yuyijiong/speculative_pipeline_decoding", dtype="auto")

Local Apps

vLLM

How to use yuyijiong/speculative_pipeline_decoding with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "yuyijiong/speculative_pipeline_decoding"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "yuyijiong/speculative_pipeline_decoding",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
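
The same request can be issued from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server started above is listening on localhost:8000 and returns an OpenAI-style completion response. The helper names `build_completion_request` and `complete` are hypothetical and not part of vLLM.

```python
import json
import urllib.request


def build_completion_request(prompt, base_url="http://localhost:8000",
                             model="yuyijiong/speculative_pipeline_decoding",
                             max_tokens=512, temperature=0.5):
    """Build (without sending) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumption: OpenAI-style responses carry generated text in choices[i]["text"].
    return body["choices"][0]["text"]
```

With the server running, `complete("Once upon a time,")` should return the generated continuation. Passing `base_url="http://localhost:30000"` targets a server on port 30000 instead.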

Use Docker

docker model run hf.co/yuyijiong/speculative_pipeline_decoding

SGLang

How to use yuyijiong/speculative_pipeline_decoding with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "yuyijiong/speculative_pipeline_decoding" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "yuyijiong/speculative_pipeline_decoding",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "yuyijiong/speculative_pipeline_decoding" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "yuyijiong/speculative_pipeline_decoding",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use yuyijiong/speculative_pipeline_decoding with Docker Model Runner:
```
docker model run hf.co/yuyijiong/speculative_pipeline_decoding
```

yuyijiong/speculative_pipeline_decoding

Add files using upload-large-folder tool

+version https://git-lfs.github.com/spec/v1
+oid sha256:edc1e7562908fae4f096115811c1466d4a2053f36c49748f7bec24b53fe5b564
+size 1472571801

+version https://git-lfs.github.com/spec/v1
+oid sha256:e632947613eea9fa26fca4cde909407fef8bd7f10f9adff6e803340ede5790ea
+size 1666580637

+version https://git-lfs.github.com/spec/v1
+oid sha256:13fe439952f6a8b6a289ba341aacc72629e8c97485affdc011925cbd3c20f172
+size 633701433

+version https://git-lfs.github.com/spec/v1
+oid sha256:23913a41da0c999fb32efbe22b29f8337861a66cb47232070a349487ea040832
+size 932563181

+version https://git-lfs.github.com/spec/v1
+oid sha256:3521a749ffbddce7e3722fae3db49d8ff0a765bf113622533696928633fbbc7c
+size 1425428821

+version https://git-lfs.github.com/spec/v1
+oid sha256:f26b8ad758b1f39185a1fbca9bd4538c1cfcdb9cf7399c79425fbed111e23189
+size 948277529

+version https://git-lfs.github.com/spec/v1
+oid sha256:4466181d01d64fa3913a5590f0c5f86b2d39794d8ee90b3bcf72fa366fa21e7b
+size 1740009541

+version https://git-lfs.github.com/spec/v1
+oid sha256:f7ddb1f96b6b2c9f3f26d9eab7c1f93a32ce8ddb09251c7f6ba8bebcfb357211
+size 4335726793

+version https://git-lfs.github.com/spec/v1
+oid sha256:05502fed6ecc3ab582ad16dbed846c671dd870aa052ea35ec944e0e848ef4216
+size 2960027989

+version https://git-lfs.github.com/spec/v1
+oid sha256:3fe5b7fc81ed35b63c209235b1dd2d4f34f08bdbb42d2643f9c72aff18291085
+size 4033777797
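
Each entry above follows the Git LFS pointer layout: a `version` line, an `oid sha256:` line, and a `size` line in bytes. As a minimal sketch, a downloaded file can be checked against its pointer as shown below. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and which pointer belongs to which file is not shown in this diff.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer into a dict with 'version', 'oid', and 'size'."""
    fields = {}
    for line in text.splitlines():
        line = line.lstrip("+").strip()  # tolerate the '+' prefix used in diffs
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported hash algorithm: {algorithm}")
    return {"version": fields["version"], "oid": digest, "size": int(fields["size"])}


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and SHA-256 in `pointer`."""
    sha = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        # Read in chunks so multi-gigabyte weight files are never held in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and sha.hexdigest() == pointer["oid"]
```

For example, `matches_pointer("downloaded_file", parse_lfs_pointer(pointer_text))` returns `False` if a download was truncated or corrupted; `"downloaded_file"` is a placeholder path.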
