Model type `zamba2` but Transformers does not recognize this architecture.
I used the example code provided here and the latest version of transformers and I am getting the following error:
ValueError: The checkpoint you are trying to load has model type `zamba2` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
Could you first run `pip uninstall transformers` and then install the `transformers_zamba2` folder?
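To see which build of transformers is actually active before and after swapping it, a quick check along these lines may help. It is only a sketch: `transformers_supports` is a hypothetical helper name, and it assumes that a build which can load the checkpoint registers the `zamba2` model type in `CONFIG_MAPPING`, the registry the auto classes use to look up a model type.

```python
import importlib.util


def transformers_supports(model_type):
    """Return True/False if transformers is installed, or None if it is not."""
    if importlib.util.find_spec("transformers") is None:
        return None  # transformers is not installed in this environment
    # CONFIG_MAPPING maps registered model_type strings to config classes.
    from transformers.models.auto.configuration_auto import CONFIG_MAPPING
    return model_type in CONFIG_MAPPING


if __name__ == "__main__":
    supported = transformers_supports("zamba2")
    if supported is None:
        print("transformers is not installed")
    else:
        import transformers
        print(f"transformers {transformers.__version__}: zamba2 recognized = {supported}")
```

If this reports that `zamba2` is not recognized, the interpreter is probably still picking up a build without Zamba2 support, which would match the ValueError above.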
That wouldn't install correctly.
if bare_metal_version >= Version("11.8"):
^^^^^^^^^^^^^^^^^^
NameError: name 'bare_metal_version' is not defined
Can you share the traceback of this error? I.e., what's the code path that leads to it? That should give some additional clue.
Collecting mamba-ssm==2.1.0 (from transformers==4.43.0.dev0)
Using cached mamba_ssm-2.1.0.tar.gz (84 kB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [14 lines of output]
/private/var/folders/mm/3p04lc0x4fsdqhnkk_3j68v80000gn/T/pip-install-jjl8_mo6/mamba-ssm_05bf2b6c5b3041ed97effd1b0dcf3a08/setup.py:119: UserWarning: mamba_ssm was requested, but nvcc was not found. Are you sure your environment has nvcc available? If you're installing within a container from https://hub.docker.com/r/pytorch/pytorch, only images whose names contain 'devel' will provide nvcc.
warnings.warn(
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "<pip-setuptools-caller>", line 34, in <module>
File "/private/var/folders/mm/3p04lc0x4fsdqhnkk_3j68v80000gn/T/pip-install-jjl8_mo6/mamba-ssm_05bf2b6c5b3041ed97effd1b0dcf3a08/setup.py", line 189, in <module>
if bare_metal_version >= Version("11.8"):
^^^^^^^^^^^^^^^^^^
NameError: name 'bare_metal_version' is not defined
torch.__version__ = 2.5.1
[end of output]
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
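Reading the log above: the warning says nvcc was not found, and the `/private/var/folders` path suggests the install ran on macOS. A plausible reading, not confirmed in this thread, is that mamba-ssm's `setup.py` only sets `bare_metal_version` when it can find nvcc, so the later comparison fails with the NameError. The sketch below checks those two preconditions before attempting the install. `cuda_build_problems` is a hypothetical helper, and the assumption that the build needs a CUDA toolkit comes from the warning text, not from the package's documentation.

```python
import platform
import shutil


def cuda_build_problems(system, nvcc_path):
    """List likely reasons a CUDA extension build would fail on this machine."""
    problems = []
    if system == "Darwin":
        # Assumption: current CUDA toolkits do not support macOS.
        problems.append("running on macOS, where a CUDA toolkit is not available")
    if nvcc_path is None:
        # Same condition the setup.py warning in the log reports.
        problems.append("nvcc not found on PATH")
    return problems


if __name__ == "__main__":
    found = cuda_build_problems(platform.system(), shutil.which("nvcc"))
    if found:
        print("mamba-ssm is unlikely to build here:")
        for problem in found:
            print(" -", problem)
    else:
        print("nvcc found; the build has a chance of succeeding")
```

The checks take the platform name and nvcc path as arguments so they can be tried with other values, for example `cuda_build_problems("Linux", None)` for a Linux container without the CUDA toolkit.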