Instructions to use ibm-granite/granite-34b-code-instruct-8k with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use ibm-granite/granite-34b-code-instruct-8k with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="ibm-granite/granite-34b-code-instruct-8k")
messages = [
    {"role": "user", "content": "Who are you?"},
]
print(pipe(messages))
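
When the pipeline is given a list of chat messages, recent Transformers releases return the whole conversation with the model's reply appended as the last message. Below is a minimal sketch of pulling out that reply, assuming that return shape. `last_reply` is a hypothetical helper, and the sample result is a made-up stand-in rather than real model output.

```python
def last_reply(result):
    """Return the assistant text from a chat-style text-generation result.

    Assumes the shape [{"generated_text": [{"role": ..., "content": ...}, ...]}],
    which is what the pipeline is expected to return for chat input.
    """
    conversation = result[0]["generated_text"]
    final = conversation[-1]
    if final.get("role") != "assistant":
        raise ValueError("last message is not an assistant reply")
    return final["content"]


# Illustrative stand-in for pipe(messages); not real model output.
sample_result = [{"generated_text": [
    {"role": "user", "content": "Who are you?"},
    {"role": "assistant", "content": "<model reply>"},
]}]
print(last_reply(sample_result))
```

With a real pipeline the call would be `last_reply(pipe(messages))`.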

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("ibm-granite/granite-34b-code-instruct-8k")
model = AutoModelForCausalLM.from_pretrained("ibm-granite/granite-34b-code-instruct-8k")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
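
The slice in the last line is there because, for a causal language model, `generate` returns the prompt tokens followed by the newly generated ones. Dropping the first `input_ids.shape[-1]` positions leaves only the reply. The sketch below shows the same idea with plain Python lists. `strip_prompt` is a hypothetical helper and the token ids are made up for illustration.

```python
def strip_prompt(output_ids, prompt_length):
    """Keep only the tokens that were generated after the prompt."""
    return output_ids[prompt_length:]


# Made-up token ids: three prompt tokens followed by two generated tokens.
prompt_ids = [101, 102, 103]
output_ids = prompt_ids + [7, 8]

new_tokens = strip_prompt(output_ids, len(prompt_ids))
print(new_tokens)  # [7, 8]
```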

Local Apps

vLLM

How to use ibm-granite/granite-34b-code-instruct-8k with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "ibm-granite/granite-34b-code-instruct-8k"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ibm-granite/granite-34b-code-instruct-8k",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
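
The same request can be built from Python with only the standard library. This is a sketch under the assumption that the server is listening on `localhost:8000`, as in the curl call. `build_chat_request` is a hypothetical helper, and the network call is left commented out so the snippet runs without a server.

```python
import json
import urllib.request


def build_chat_request(base_url, model, user_message):
    """Build a POST request for an OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_chat_request(
    "http://localhost:8000",
    "ibm-granite/granite-34b-code-instruct-8k",
    "What is the capital of France?",
)
print(request.get_method(), request.full_url)

# Sending the request needs the vLLM server to be running:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```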

Use Docker

docker model run hf.co/ibm-granite/granite-34b-code-instruct-8k

SGLang

How to use ibm-granite/granite-34b-code-instruct-8k with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "ibm-granite/granite-34b-code-instruct-8k" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ibm-granite/granite-34b-code-instruct-8k",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
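
Because the endpoint is OpenAI-compatible, the reply text is expected under `choices[0].message.content` in the JSON response. Below is a minimal sketch of extracting it, assuming that schema. `extract_reply` is a hypothetical helper, and the sample body is a made-up stand-in rather than real server output.

```python
import json


def extract_reply(response_body):
    """Return the assistant text from a chat completions JSON response body."""
    data = json.loads(response_body)
    choices = data.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Illustrative response body; the content is a placeholder, not model output.
sample_body = json.dumps({
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "<model reply>"},
            "finish_reason": "stop",
        }
    ]
})
print(extract_reply(sample_body))
```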

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "ibm-granite/granite-34b-code-instruct-8k" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ibm-granite/granite-34b-code-instruct-8k",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use ibm-granite/granite-34b-code-instruct-8k with Docker Model Runner:
```
docker model run hf.co/ibm-granite/granite-34b-code-instruct-8k
```


Commit History

Update README.md 4bdfb58 verified (daviddcox committed 25 days ago)
update context length b7b4a96 verified (rpand002 committed on Sep 2, 2024)
Update README.md 2b0c93e verified (mayank-mishra committed on Jul 9, 2024)
granite tag 20f67e1 verified (mayank-mishra committed on May 10, 2024)
update paper 680b1ca verified (mayank-mishra committed on May 8, 2024)
Update README.md 3bae2fd verified (mayank-mishra committed on May 7, 2024)
Update README.md 9130f14 verified (mayank-mishra committed on May 6, 2024)
Update config.json c42d2bd verified (mayank-mishra committed on May 6, 2024)
Update README.md a7175f5 verified (mayank-mishra committed on May 6, 2024)
Update README.md a670fa3 verified (mayank-mishra committed on May 6, 2024)
Update README.md 9b4d69b verified (mayank-mishra committed on May 6, 2024)
Update README.md 4660a63 verified (mayank-mishra committed on May 6, 2024)
update example c406b8d verified (mayank-mishra committed on May 6, 2024)
apply chat template 6a72207 verified (mayank-mishra committed on May 6, 2024)
removed code comments 6266357 verified (amezasor committed on May 5, 2024)
First commit Granite-34B-Code-Instruct 895ffb1 verified (amezasor committed on May 4, 2024)
update with correct model 836df29 (mayank-mishra committed on May 4, 2024)
upload model 1ee739c (Mayank Mishra committed on May 4, 2024)
initial commit 24fb0cb verified (mayank-mishra committed on Apr 26, 2024)
