Issue: "You must be authenticated to access it" in PyCharm
Hi David!
1st: Go to the Mistral model page (for the version you want to use) on Hugging Face, then click "Agree and access repository".
2nd: Execute these two lines:
!pip install --upgrade huggingface_hub
!huggingface-cli login --token $HUGGING_FACE_TOKEN
Hello Ahmed.
Thanks for your response.
I have just done that, and it keeps giving me the same error.
When I run huggingface-cli login --token $HUGGING_FACE_TOKEN it tells me this:
huggingface-cli login --token $HUGGING_FACE_TOKEN
usage: huggingface-cli <command> [<args>] login [-h] [--token TOKEN] [--add-to-git-credential]
huggingface-cli <command> [<args>] login: error: argument --token: expected one argument
I don't know what to do.
Thanks again for responding to my message.
Regards.
Make sure there are no spaces inside your access token and no more than one space between the "--token" argument and the token value.
When running huggingface-cli login --token $HUGGING_FACE_TOKEN you get:
huggingface-cli <command> [<args>] login: error: argument --token: expected one argument
The command assumes that $HUGGING_FACE_TOKEN has a value, but here it is empty, so --token receives no argument. Where do we get $HUGGING_FACE_TOKEN from?
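For anyone hitting the same "expected one argument" error, here is a minimal Python sketch of the same login that fails with a readable message when the variable is empty. It assumes the token was exported under the name HUGGING_FACE_TOKEN used in this thread; the helper names read_token and login_from_env are made up for illustration. The login function is the one provided by huggingface_hub.

```python
import os


def read_token(var_name: str = "HUGGING_FACE_TOKEN") -> str:
    """Return the token stored in the environment variable `var_name`.

    Raises a clear error when the variable is unset or blank, which is the
    situation that makes the CLI report "expected one argument".
    """
    # Strip stray whitespace that often sneaks in when copy-pasting a token.
    token = os.environ.get(var_name, "").strip()
    if not token:
        raise RuntimeError(
            f"{var_name} is not set. Create a token at "
            "https://huggingface.co/settings/tokens and export it first."
        )
    return token


def login_from_env(var_name: str = "HUGGING_FACE_TOKEN") -> None:
    """Log in to the Hugging Face Hub with the token from the environment."""
    # Imported lazily so read_token() works even without huggingface_hub.
    from huggingface_hub import login

    login(token=read_token(var_name))


# Call login_from_env() once the variable has been exported in your shell
# or set in the PyCharm run configuration.
```

If the variable is missing you get the RuntimeError above instead of the argparse usage message, which makes it obvious that the token was never set.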
Hi! I had this issue too. It turned out the Hugging Face token I was using needed "write" access. After changing that, it all works again.
- pip install huggingface_hub
- huggingface-cli login
It will ask you for a token. To get one:
- Go to https://huggingface.co/settings/tokens
- Use the option there to generate a token, and make sure you tick the checkboxes for APIs.
- Generate the token and copy it.
- Go back to the terminal and paste it at the prompt.
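After pasting the token, it can help to confirm that the login actually took effect before loading the model again. The sketch below is one way to check, assuming huggingface_hub is installed and that whoami() returns a dictionary with a "name" entry; the helper describe_login and its whoami_fn parameter are hypothetical and only there to make the check easy to try out.

```python
from typing import Callable, Optional


def describe_login(whoami_fn: Optional[Callable[[], dict]] = None) -> str:
    """Return a short message saying whether a valid token is stored."""
    if whoami_fn is None:
        # Default to the real Hub call; imported lazily on purpose.
        from huggingface_hub import whoami

        whoami_fn = whoami
    try:
        info = whoami_fn()
    except Exception as exc:  # no stored token, invalid token, no network
        return f"Not logged in: {exc}"
    return f"Logged in as {info.get('name', '<unknown>')}"


# Example: print(describe_login())
```

If this reports that you are logged in but the model still refuses to load, the remaining step is usually accepting the model's terms on its Hugging Face page, as mentioned earlier in the thread.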
