Instructions for using RedHatAI/granite-3.1-2b-base-FP8-dynamic with libraries and local apps.

Libraries

How to use RedHatAI/granite-3.1-2b-base-FP8-dynamic with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="RedHatAI/granite-3.1-2b-base-FP8-dynamic")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("RedHatAI/granite-3.1-2b-base-FP8-dynamic")
model = AutoModelForCausalLM.from_pretrained("RedHatAI/granite-3.1-2b-base-FP8-dynamic")
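
The snippets above only load the model. As a minimal sketch of how the directly loaded model could be used to generate a continuation, the helper below wraps tokenization, generation, and decoding. The function name `generate_text` and the default of 64 new tokens are illustrative choices, not taken from the model card. It assumes `transformers` and PyTorch are installed; as a further assumption, an FP8 checkpoint like this one may also need the `compressed-tensors` package to load.

```python
MODEL_ID = "RedHatAI/granite-3.1-2b-base-FP8-dynamic"


def generate_text(prompt: str, max_new_tokens: int = 64) -> str:
    """Load the model and return the prompt followed by its generated continuation.

    Hypothetical helper for illustration; not part of the model card.
    """
    # Imported lazily so that defining this helper does not require the
    # heavy dependencies until it is actually called.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Tokenize the prompt and move the tensors to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # output_ids[0] holds the prompt tokens plus the newly generated tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("Once upon a time,")` downloads the weights on first use. Because this is a base model rather than an instruction-tuned one, it continues the prompt instead of answering it.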

Local Apps

vLLM

How to use RedHatAI/granite-3.1-2b-base-FP8-dynamic with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "RedHatAI/granite-3.1-2b-base-FP8-dynamic"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "RedHatAI/granite-3.1-2b-base-FP8-dynamic",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
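
The same request can be issued from Python. The sketch below mirrors the curl call using only the standard library. The helper names `build_completion_request` and `send` are hypothetical, and the default base URL assumes the server is listening on vLLM's default port 8000 as in the curl example. Building the request is kept separate from sending it so the payload can be inspected without a running server.

```python
import json
import urllib.request

MODEL_ID = "RedHatAI/granite-3.1-2b-base-FP8-dynamic"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build (but do not send) a POST request for the /v1/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the decoded JSON response as a dict."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Build the request only; nothing is sent until send(request) is called.
request = build_completion_request("Once upon a time,")
print(request.get_method(), request.full_url)
```

With the server from the previous step running, `send(request)` returns the parsed JSON response.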

Use Docker

docker model run hf.co/RedHatAI/granite-3.1-2b-base-FP8-dynamic

SGLang

How to use RedHatAI/granite-3.1-2b-base-FP8-dynamic with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "RedHatAI/granite-3.1-2b-base-FP8-dynamic" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "RedHatAI/granite-3.1-2b-base-FP8-dynamic",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "RedHatAI/granite-3.1-2b-base-FP8-dynamic" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "RedHatAI/granite-3.1-2b-base-FP8-dynamic",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
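
Both servers return JSON in the OpenAI completions format, where the generated text sits under `choices[i].text`. The sketch below shows one way to pull the text out of such a response. The helper name `extract_completions` is hypothetical, and `sample_response` is a hand-written stand-in for illustration, not real model output.

```python
def extract_completions(response: dict) -> list[str]:
    """Return the generated text of every choice, ordered by choice index."""
    choices = sorted(response.get("choices", []), key=lambda c: c.get("index", 0))
    return [choice["text"] for choice in choices]


# Hand-written example in the OpenAI completions response shape
# (illustrative values only, not actual output from the model).
sample_response = {
    "object": "text_completion",
    "model": "RedHatAI/granite-3.1-2b-base-FP8-dynamic",
    "choices": [
        {"index": 0, "text": " there was a small village.", "finish_reason": "length"},
    ],
}

for text in extract_completions(sample_response):
    print(text)
```

A response with no `choices` key yields an empty list instead of raising an error, which makes the helper easier to use when the server returns an error payload.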

Docker Model Runner
How to use RedHatAI/granite-3.1-2b-base-FP8-dynamic with Docker Model Runner:
```
docker model run hf.co/RedHatAI/granite-3.1-2b-base-FP8-dynamic
```

granite-3.1-2b-base-FP8-dynamic

Commit History

Update README.md

f8353f4
verified

nm-research committed on Jan 30, 2025

Update README.md

0d4e980
verified

shubhrapandit committed on Jan 28, 2025

Update README.md

9e2e4de
verified

nm-research committed on Jan 28, 2025

Update README.md

891df40
verified

nm-research committed on Jan 25, 2025

Update README.md

bb96ea6
verified

nm-research committed on Jan 24, 2025

Update README.md

0f12672
verified

nm-research committed on Jan 24, 2025

Update README.md

1a350b6
verified

nm-research committed on Jan 20, 2025

Update README.md

a131fcb
verified

nm-research committed on Jan 17, 2025

Update README.md

79efea7
verified

nm-research committed on Jan 16, 2025

Create README.md

4618a5f
verified

nm-research committed on Jan 16, 2025

Upload model files

ab028eb

Shubhra Pandit committed on Jan 16, 2025

initial commit

eaae8c8
verified

nm-research committed on Jan 16, 2025
