Text Generation
Transformers
Safetensors
llama
code
granite
Eval Results (legacy)
text-generation-inference
Instructions to use ibm-granite/granite-3b-code-base-2k with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use ibm-granite/granite-3b-code-base-2k with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="ibm-granite/granite-3b-code-base-2k")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("ibm-granite/granite-3b-code-base-2k")
model = AutoModelForCausalLM.from_pretrained("ibm-granite/granite-3b-code-base-2k")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use ibm-granite/granite-3b-code-base-2k with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "ibm-granite/granite-3b-code-base-2k"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "ibm-granite/granite-3b-code-base-2k",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker
docker model run hf.co/ibm-granite/granite-3b-code-base-2k
- SGLang
How to use ibm-granite/granite-3b-code-base-2k with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "ibm-granite/granite-3b-code-base-2k" \
  --host 0.0.0.0 \
  --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "ibm-granite/granite-3b-code-base-2k",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "ibm-granite/granite-3b-code-base-2k" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "ibm-granite/granite-3b-code-base-2k",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
- Docker Model Runner
How to use ibm-granite/granite-3b-code-base-2k with Docker Model Runner:
docker model run hf.co/ibm-granite/granite-3b-code-base-2k
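The curl calls shown for vLLM and SGLang both post the same JSON body to an OpenAI-compatible /v1/completions route, differing only in port (8000 and 30000 in the commands above). A minimal Python sketch of that request follows, using only the standard library. The helper names build_completion_request and complete, the default max_tokens of 128, and the 60-second timeout are illustrative assumptions, not part of the model card. The code prompt in the usage note is likewise an assumption, chosen because this is a code base model.

```python
import json
import urllib.request

MODEL_ID = "ibm-granite/granite-3b-code-base-2k"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=128, temperature=0.5):
    """Build a POST request for the OpenAI-compatible /v1/completions route.

    base_url is an assumption: port 8000 matches the vLLM command above,
    port 30000 matches the SGLang commands.
    """
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request to a running server and return the generated text."""
    request = build_completion_request(prompt, **kwargs)
    # Hypothetical timeout; large prompts or cold servers may need longer.
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    # OpenAI-style completion responses carry the text under choices[0].text.
    return body["choices"][0]["text"]
```

With one of the servers above running, complete("def fibonacci(n):") would query vLLM on port 8000, and complete("def fibonacci(n):", base_url="http://localhost:30000") would query SGLang. Keeping request construction separate from the network call makes the payload easy to inspect without a server.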
Commit History
Update README.md 70ebbec verified
add context length 9b4f44d verified
Update README.md 6e2aaa0 verified
revert 99b32ae verified
enable inference 0ccdef0 verified
granite tag 7cd4fec verified
update paper 14f1d9f verified
Update README.md 10dac08 verified
Update README.md c2475bd verified
Update README.md d94268d verified
Update README.md 675fb44 verified
Update README.md 658479e verified
add abstract 2f97d76 verified
update examples 15049b3 verified
add warning 8d20cb5 verified
disable inference 45b5cb7 verified
Update README.md 0e91847 verified
update example b7cb4df verified
removed HelpSteer dataset 43bd2e4 verified
metadata update 6c5a0ae verified
metadata update f0ca2c8 verified
model summary update 646be3f verified
fixed model name in generation section 5af0ee9 verified
Removed comments 80901fb verified
fix generation config 9ec6bd4
llama afc3169
typo correction 2263de3 verified
model card revision bbe82c2 verified
datasets update f1785a4 verified
update readme 2ab68a5
Mayank Mishra committed on
Initial model card version (#1) 64765e9 verified
drop extra arg 0844435 verified
drop name in config 68688f5 verified
downcast to bf16 8f14620
Mayank Mishra committed on
update script 6bb0180
Mayank Mishra committed on