Downloading model
I tried the following, but the download fails with a 403 Forbidden error:
huggingface-cli download mistralai/Mistral-7B-Instruct-v0.2 --cache-dir
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2/resolve/3ad372fc79158a2148299e3318516c786aeded6c/.gitattributes
I found some information about access permissions and created a token. The Mistral model page says I have been granted access, but the download still does not work.
The status shows
"Gated model. You have been granted access to this model"
Do I have to request access specifically for the Mistral model, and if so, how? Looking around the model card page, I don't see any info.
Thanks,
Full log:
Fetching 16 files: 0%| | 0/16 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/site-packages/huggingface_hub/utils/_http.py", line 409, in hf_raise_for_status
response.raise_for_status()
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/site-packages/requests/models.py", line 1024, in raise_for_status
raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2/resolve/3ad372fc79158a2148299e3318516c786aeded6c/.gitattributes
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 1533, in _get_metadata_or_catch_error
metadata = get_hf_file_metadata(
^^^^^^^^^^^^^^^^^^^^^
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 1450, in get_hf_file_metadata
r = _request_wrapper(
^^^^^^^^^^^^^^^^^
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 286, in _request_wrapper
response = _request_wrapper(
^^^^^^^^^^^^^^^^^
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 310, in _request_wrapper
hf_raise_for_status(response)
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/site-packages/huggingface_hub/utils/_http.py", line 473, in hf_raise_for_status
raise _format(HfHubHTTPError, message, response) from e
huggingface_hub.errors.HfHubHTTPError: (Request ID: Root=1-6869a92f-05156ab222b36b010d5cec59;6abdf9d7-a4d1-44cf-9109-999c177cbd3b)
403 Forbidden: Please enable access to public gated repositories in your fine-grained token settings to view this repository..
Cannot access content at: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2/resolve/3ad372fc79158a2148299e3318516c786aeded6c/.gitattributes.
Make sure your token has the correct permissions.
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/guyen/condaforge_src/envs/llama3/bin/huggingface-cli", line 8, in <module>
sys.exit(main())
^^^^^^
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/site-packages/huggingface_hub/commands/huggingface_cli.py", line 59, in main
service.run()
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/site-packages/huggingface_hub/commands/download.py", line 153, in run
print(self._download()) # Print path to downloaded files
^^^^^^^^^^^^^^^^
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/site-packages/huggingface_hub/commands/download.py", line 187, in _download
return snapshot_download(
^^^^^^^^^^^^^^^^^^
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/site-packages/huggingface_hub/_snapshot_download.py", line 327, in snapshot_download
thread_map(
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/site-packages/tqdm/contrib/concurrent.py", line 69, in thread_map
return _executor_map(ThreadPoolExecutor, fn, *iterables, **tqdm_kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/site-packages/tqdm/contrib/concurrent.py", line 51, in _executor_map
return list(tqdm_class(ex.map(fn, *iterables, chunksize=chunksize), **kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/site-packages/tqdm/std.py", line 1181, in __iter__
for obj in iterable:
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/concurrent/futures/_base.py", line 619, in result_iterator
yield _result_or_cancel(fs.pop())
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/concurrent/futures/_base.py", line 317, in _result_or_cancel
return fut.result(timeout)
^^^^^^^^^^^^^^^^^^^
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/concurrent/futures/_base.py", line 456, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/site-packages/huggingface_hub/_snapshot_download.py", line 301, in _inner_hf_hub_download
return hf_hub_download(
^^^^^^^^^^^^^^^^
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 1008, in hf_hub_download
return _hf_hub_download_to_cache_dir(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 1115, in _hf_hub_download_to_cache_dir
_raise_on_head_call_error(head_call_error, force_download, local_files_only)
File "/home/guyen/condaforge_src/envs/llama3/lib/python3.12/site-packages/huggingface_hub/file_download.py", line 1648, in _raise_on_head_call_error
raise LocalEntryNotFoundError(
huggingface_hub.errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.
Make sure your access token has read access to this repo.
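Reading the log back, the informative line seems to be the 403 message about fine-grained token settings, not the final LocalEntryNotFoundError, which only reports that nothing was found in the local cache after the Hub request failed. Below is a minimal sketch of how that message could be picked out of a long traceback. Note that `diagnose_hub_error` is a hypothetical helper of my own, not part of huggingface_hub, and the patterns it matches are taken only from the log in this post, so treat it as an illustration rather than a complete classifier.

```python
import re

# Hypothetical helper (NOT a huggingface_hub API): maps the text of a Hub
# error log to a likely cause. The patterns come from the log in this post
# and are an assumption about what other Hub errors look like.
def diagnose_hub_error(log_text: str) -> str:
    # Find the first HTTP status code that is followed by a known phrase.
    status = re.search(
        r"\b(\d{3}) (?:Client Error|Forbidden|Unauthorized)", log_text
    )
    code = int(status.group(1)) if status else None

    if code == 403 and "fine-grained token" in log_text:
        return "token lacks permission for public gated repositories"
    if code == 403:
        return "access to the repository is forbidden for this token"
    if code == 401:
        return "token is missing or invalid"
    return "unknown"


# The message copied from the log above.
sample = (
    "403 Forbidden: Please enable access to public gated repositories "
    "in your fine-grained token settings to view this repository."
)
print(diagnose_hub_error(sample))
```

If the first branch matches, as it does for the log above, the error text itself points at the fix: enable access to public gated repositories in the fine-grained token's settings. Being granted access on the model page does not appear to be enough on its own when the token is restricted.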