Instructions to use robbiemu/salamandra-2b-instruct with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use robbiemu/salamandra-2b-instruct with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline
pipe = pipeline("text-generation", model="robbiemu/salamandra-2b-instruct")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("robbiemu/salamandra-2b-instruct")
model = AutoModelForCausalLM.from_pretrained("robbiemu/salamandra-2b-instruct")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- llama-cpp-python
How to use robbiemu/salamandra-2b-instruct with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama
llm = Llama.from_pretrained(
    repo_id="robbiemu/salamandra-2b-instruct",
    filename="salamandra-2b-instruct_IQ2_M.gguf",
)
llm.create_chat_completion(
    messages=[
        {
            "role": "user",
            "content": "What is the capital of France?"
        }
    ]
)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use robbiemu/salamandra-2b-instruct with llama.cpp:
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf robbiemu/salamandra-2b-instruct:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf robbiemu/salamandra-2b-instruct:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf robbiemu/salamandra-2b-instruct:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf robbiemu/salamandra-2b-instruct:Q4_K_M
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf robbiemu/salamandra-2b-instruct:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf robbiemu/salamandra-2b-instruct:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf robbiemu/salamandra-2b-instruct:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf robbiemu/salamandra-2b-instruct:Q4_K_M
Use Docker
docker model run hf.co/robbiemu/salamandra-2b-instruct:Q4_K_M
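Because `llama-server` is described above as OpenAI-compatible, its chat replies can be unpacked the same way as any OpenAI-style chat completion. The following is a minimal sketch, assuming the response body follows that schema (assistant text under `choices[0].message.content`); the helper name `extract_reply` and the sample body are illustrative and not taken from llama.cpp.

```python
import json


def extract_reply(response_body: str) -> str:
    """Return the assistant text from an OpenAI-style chat completion body."""
    data = json.loads(response_body)
    choices = data.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hypothetical response body, hand-written for illustration only;
# it is not real output from the model.
sample = json.dumps({
    "choices": [
        {"index": 0, "message": {"role": "assistant", "content": "Paris."}}
    ]
})
print(extract_reply(sample))
```

Raising on an empty `choices` list keeps a malformed or error response from surfacing later as an opaque `IndexError`.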
- LM Studio
- Jan
- vLLM
How to use robbiemu/salamandra-2b-instruct with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "robbiemu/salamandra-2b-instruct"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "robbiemu/salamandra-2b-instruct",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker
docker model run hf.co/robbiemu/salamandra-2b-instruct:Q4_K_M
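The curl call to the vLLM server can also be issued from Python with only the standard library. This is a sketch rather than an official client: `build_chat_request` is a hypothetical helper name, and the default `base_url` simply mirrors the `localhost:8000` address used in the curl example.

```python
import json
import urllib.request


def build_chat_request(prompt, model="robbiemu/salamandra-2b-instruct",
                       base_url="http://localhost:8000"):
    """Build a POST request equivalent to the curl chat-completions call."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_chat_request("What is the capital of France?")

# Sending the request needs a running server, so it is left commented out:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

Passing `base_url="http://localhost:30000"` would target a server on another port, such as the SGLang example on this page.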
- SGLang
How to use robbiemu/salamandra-2b-instruct with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "robbiemu/salamandra-2b-instruct" \
  --host 0.0.0.0 \
  --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "robbiemu/salamandra-2b-instruct",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "robbiemu/salamandra-2b-instruct" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "robbiemu/salamandra-2b-instruct",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
- Ollama
How to use robbiemu/salamandra-2b-instruct with Ollama:
ollama run hf.co/robbiemu/salamandra-2b-instruct:Q4_K_M
- Unsloth Studio
How to use robbiemu/salamandra-2b-instruct with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for robbiemu/salamandra-2b-instruct to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for robbiemu/salamandra-2b-instruct to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for robbiemu/salamandra-2b-instruct to start chatting
- Docker Model Runner
How to use robbiemu/salamandra-2b-instruct with Docker Model Runner:
docker model run hf.co/robbiemu/salamandra-2b-instruct:Q4_K_M
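The Ollama and Docker Model Runner commands on this page both address the model as `hf.co/<repo>:<quantization>`. A small sketch of composing that reference and the two command lines in Python follows; the function names are hypothetical, and the pattern is inferred only from the commands shown here, not from either tool's documentation.

```python
def hf_model_ref(repo_id: str, quant: str) -> str:
    """Compose the hf.co/<repo>:<quant> reference used in the commands above."""
    if "/" not in repo_id:
        raise ValueError("repo_id should look like 'owner/name'")
    return f"hf.co/{repo_id}:{quant}"


def ollama_run_cmd(repo_id: str, quant: str) -> list:
    """Argument list equivalent to: ollama run hf.co/<repo>:<quant>"""
    return ["ollama", "run", hf_model_ref(repo_id, quant)]


def docker_model_run_cmd(repo_id: str, quant: str) -> list:
    """Argument list equivalent to: docker model run hf.co/<repo>:<quant>"""
    return ["docker", "model", "run", hf_model_ref(repo_id, quant)]


cmd = docker_model_run_cmd("robbiemu/salamandra-2b-instruct", "Q4_K_M")
print(" ".join(cmd))
```

The argument lists can be handed to `subprocess.run(cmd, check=True)` on a machine where the corresponding tool is installed.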
- Lemonade
How to use robbiemu/salamandra-2b-instruct with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull robbiemu/salamandra-2b-instruct:Q4_K_M
Run and chat with the model
lemonade run user.salamandra-2b-instruct-Q4_K_M
List all available models
lemonade list
Commit History
added dataset attribution, and summary e454a4c
added base_model attribution to readme af225a0
updated readme 5f82441
added snapshot info a3e8a7f
added snapshot info 67c6f04
updated readme 13ee45c
updated readme ca8787e
updated readme 4d5625f
minor language change d40467d
updated readme 594a804
adding gguf 08d4963
lfs update (partial) 602026c
robbiemu merge 99eb6bf
files e15c783
fix norwegian code 9b2dce0 verified
Update README.md 568359d
Update README.md fd9d057
Add eval for gold-standard benchmarks a7f6ac0
Update README.md 8036f10
add header 091db5d
Delete images/salamandra_header.png d00e906
Update README.md d665812
Update README.md bdc4a9f
Update README.md bae5ed5
Update README.md 40a1f69
Update README.md aee3287
Update README.md b7dfb90
Update README.md c3d0ec7
add checkpoint 5985850
mapama247 committed on
add tokenizer files b846e60
mapama247 committed on
add config files 483277b
mapama247 committed on
add delimiter after eval 9973446
mapama247 committed on
fill readme file 11b2a68
mapama247 committed on
add header image b305a83
add img corpus language distribution f2407c4
mapama247 committed on