TypeError: bad operand type for unary -: 'NoneType'

#19

by Jason571 - opened Dec 14, 2023

Discussion

Jason571

Dec 14, 2023

When I generate text, I get this error:
TypeError: bad operand type for unary -: 'NoneType'
  File "/mnt/p/home/flyang/text-generation-webui/modules/callbacks.py", line 57, in gentask
    ret = self.mfunc(callback=_callback, *args, **self.kwargs)
  File "/mnt/p/home/flyang/text-generation-webui/modules/text_generation.py", line 351, in generate_with_callback
    shared.model.generate(**kwargs)
  File "/home/flyang/.local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/flyang/.local/lib/python3.10/site-packages/transformers/generation/utils.py", line 1652, in generate
    return self.sample(
  File "/home/flyang/.local/lib/python3.10/site-packages/transformers/generation/utils.py", line 2734, in sample
    outputs = self(
  File "/home/flyang/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/flyang/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/flyang/.local/lib/python3.10/site-packages/transformers/models/mistral/modeling_mistral.py", line 1045, in forward
    outputs = self.model(
  File "/home/flyang/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/flyang/.local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/flyang/.local/lib/python3.10/site-packages/transformers/models/mistral/modeling_mistral.py", line 888, in forward
    attention_mask = self._prepare_decoder_attention_mask(
  File "/home/flyang/.local/lib/python3.10/site-packages/transformers/models/mistral/modeling_mistral.py", line 796, in _prepare_decoder_attention_mask
    combined_attention_mask = _make_sliding_window_causal_mask(
  File "/home/flyang/.local/lib/python3.10/site-packages/transformers/models/mistral/modeling_mistral.py", line 88, in _make_sliding_window_causal_mask
    mask = torch.triu(mask, diagonal=-sliding_window)
TypeError: bad operand type for unary -: 'NoneType'
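
The last frame of the traceback points at the cause: `sliding_window` is `None` when `-sliding_window` is evaluated, and Python cannot negate `None`. A minimal sketch of that failure in plain Python, without torch; the helper name `negate_window` is hypothetical and only mimics the expression from the traceback:

```python
def negate_window(sliding_window):
    """Mimic the `diagonal=-sliding_window` expression from the traceback."""
    return -sliding_window


# With an integer window, the negation works.
ok = negate_window(4096)

# With None (JSON null in config.json), it raises the reported TypeError.
try:
    negate_window(None)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(ok)
print(error_message)
```

This only illustrates the failing expression; it says nothing about how newer transformers releases handle a null `sliding_window`.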

guwenyi

Dec 14, 2023

I got the same error:

User: hi
Assistant: Exception in thread Thread-3 (generate):
Traceback (most recent call last):
  File "/root/miniconda3/envs/llama_factory/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/root/miniconda3/envs/llama_factory/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/root/miniconda3/envs/llama_factory/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/root/miniconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/generation/utils.py", line 1652, in generate
    return self.sample(
  File "/root/miniconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/generation/utils.py", line 2734, in sample
    outputs = self(
  File "/root/miniconda3/envs/llama_factory/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/envs/llama_factory/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/miniconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/models/mistral/modeling_mistral.py", line 1045, in forward
    outputs = self.model(
  File "/root/miniconda3/envs/llama_factory/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/root/miniconda3/envs/llama_factory/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/miniconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/models/mistral/modeling_mistral.py", line 888, in forward
    attention_mask = self._prepare_decoder_attention_mask(
  File "/root/miniconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/models/mistral/modeling_mistral.py", line 796, in _prepare_decoder_attention_mask
    combined_attention_mask = _make_sliding_window_causal_mask(
  File "/root/miniconda3/envs/llama_factory/lib/python3.10/site-packages/transformers/models/mistral/modeling_mistral.py", line 88, in _make_sliding_window_causal_mask
    mask = torch.triu(mask, diagonal=-sliding_window)
TypeError: bad operand type for unary -: 'NoneType'

ybelkada

Dec 14, 2023

Hi everyone,
can you update the transformers package with pip install -U transformers?

epignatelli

Dec 14, 2023

Upgrading to a newer version of transformers worked for me.

MagiaSN

Jan 3, 2024

If, like me, you cannot update transformers, an alternative is to change sliding_window in config.json from null to a value such as 4096.
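
The config.json workaround can be scripted. The following is a minimal sketch, not an official fix: the helper name `set_sliding_window` is hypothetical, the value 4096 is the example from the comment above, and the demo runs on a throwaway file, so you would need to point it at the config.json of your own local copy of the model.

```python
import json
import tempfile
from pathlib import Path


def set_sliding_window(config_path, window=4096):
    """Set sliding_window to an integer if it is null or missing."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    if config.get("sliding_window") is None:
        config["sliding_window"] = window
        path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo on a temporary file that stands in for the model's config.json.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "config.json"
    demo.write_text(
        json.dumps({"model_type": "mistral", "sliding_window": None}),
        encoding="utf-8",
    )
    patched = set_sliding_window(demo)
    reloaded = json.loads(demo.read_text(encoding="utf-8"))

print(reloaded["sliding_window"])
```

The helper leaves an existing integer value untouched, so running it twice is harmless. Whether 4096 suits your use case is a separate question; upgrading transformers, as suggested earlier in the thread, avoids editing the config at all.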

RylanSchaeffer

Mar 2, 2024

Updating transformers worked for me!
