Instructions to use louisguthmann/qwen3.5-2b-shellcommand-linux-gguf with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use louisguthmann/qwen3.5-2b-shellcommand-linux-gguf with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="louisguthmann/qwen3.5-2b-shellcommand-linux-gguf",
    filename="Qwen3.5-2B-shellcommand-linux-F16.gguf",
)
llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use louisguthmann/qwen3.5-2b-shellcommand-linux-gguf with llama.cpp:
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf louisguthmann/qwen3.5-2b-shellcommand-linux-gguf:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf louisguthmann/qwen3.5-2b-shellcommand-linux-gguf:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf louisguthmann/qwen3.5-2b-shellcommand-linux-gguf:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf louisguthmann/qwen3.5-2b-shellcommand-linux-gguf:Q4_K_M
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf louisguthmann/qwen3.5-2b-shellcommand-linux-gguf:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf louisguthmann/qwen3.5-2b-shellcommand-linux-gguf:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf louisguthmann/qwen3.5-2b-shellcommand-linux-gguf:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf louisguthmann/qwen3.5-2b-shellcommand-linux-gguf:Q4_K_M
Use Docker
docker model run hf.co/louisguthmann/qwen3.5-2b-shellcommand-linux-gguf:Q4_K_M
- LM Studio
- Jan
- vLLM
How to use louisguthmann/qwen3.5-2b-shellcommand-linux-gguf with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "louisguthmann/qwen3.5-2b-shellcommand-linux-gguf"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "louisguthmann/qwen3.5-2b-shellcommand-linux-gguf",
    "messages": [
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'
Use Docker
docker model run hf.co/louisguthmann/qwen3.5-2b-shellcommand-linux-gguf:Q4_K_M
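The curl call above can also be issued from Python. The following is a minimal sketch using only the standard library; it assumes a server is already running on the vLLM port shown in the curl example (`8000`) and that the response follows the usual OpenAI chat-completions shape. The helper names `build_chat_request` and `chat` are illustrative and not part of any library.

```python
import json
import urllib.request

# Values taken from the curl example; llama-server would use port 8080 instead.
BASE_URL = "http://localhost:8000/v1"
MODEL = "louisguthmann/qwen3.5-2b-shellcommand-linux-gguf"


def build_chat_request(prompt, model=MODEL, base_url=BASE_URL):
    """Build (but do not send) a POST request for /chat/completions."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text.

    Assumes the server returns an OpenAI-style body with
    choices[0].message.content.
    """
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example (requires a running server):
#   print(chat("List all .jpg files under the current directory"))
```

To target a llama-server instance instead, pass `base_url="http://localhost:8080/v1"` and the `:Q4_K_M` model id used in the llama.cpp examples.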
- Ollama
How to use louisguthmann/qwen3.5-2b-shellcommand-linux-gguf with Ollama:
ollama run hf.co/louisguthmann/qwen3.5-2b-shellcommand-linux-gguf:Q4_K_M
- Unsloth Studio
How to use louisguthmann/qwen3.5-2b-shellcommand-linux-gguf with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for louisguthmann/qwen3.5-2b-shellcommand-linux-gguf to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for louisguthmann/qwen3.5-2b-shellcommand-linux-gguf to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for louisguthmann/qwen3.5-2b-shellcommand-linux-gguf to start chatting
- Pi
How to use louisguthmann/qwen3.5-2b-shellcommand-linux-gguf with Pi:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf louisguthmann/qwen3.5-2b-shellcommand-linux-gguf:Q4_K_M
Configure the model in Pi
# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "louisguthmann/qwen3.5-2b-shellcommand-linux-gguf:Q4_K_M" }
      ]
    }
  }
}
Run Pi
# Start Pi in your project directory:
pi
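If `~/.pi/agent/models.json` already contains other providers, pasting the snippet by hand risks overwriting them. The sketch below merges the `llama-cpp` entry into an existing file. It assumes the file holds a single JSON object with a top-level `providers` key, as in the snippet above; the helper name `add_llama_cpp_provider` is hypothetical and not part of Pi.

```python
import json
from pathlib import Path

# Provider entry copied from the models.json snippet above.
LLAMA_CPP_PROVIDER = {
    "baseUrl": "http://localhost:8080/v1",
    "api": "openai-completions",
    "apiKey": "none",
    "models": [
        {"id": "louisguthmann/qwen3.5-2b-shellcommand-linux-gguf:Q4_K_M"}
    ],
}


def add_llama_cpp_provider(config_path="~/.pi/agent/models.json"):
    """Insert or replace the "llama-cpp" provider, keeping other providers."""
    path = Path(config_path).expanduser()
    text = path.read_text() if path.exists() else ""
    # Treat a missing or empty file as an empty configuration.
    config = json.loads(text) if text.strip() else {}
    config.setdefault("providers", {})["llama-cpp"] = LLAMA_CPP_PROVIDER
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

Calling `add_llama_cpp_provider()` with no arguments writes to the path named in the instructions above; pass another path to try it on a copy first.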
- Hermes Agent
How to use louisguthmann/qwen3.5-2b-shellcommand-linux-gguf with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf louisguthmann/qwen3.5-2b-shellcommand-linux-gguf:Q4_K_M
Configure Hermes
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default louisguthmann/qwen3.5-2b-shellcommand-linux-gguf:Q4_K_M
Run Hermes
hermes
- Docker Model Runner
How to use louisguthmann/qwen3.5-2b-shellcommand-linux-gguf with Docker Model Runner:
docker model run hf.co/louisguthmann/qwen3.5-2b-shellcommand-linux-gguf:Q4_K_M
- Lemonade
How to use louisguthmann/qwen3.5-2b-shellcommand-linux-gguf with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull louisguthmann/qwen3.5-2b-shellcommand-linux-gguf:Q4_K_M
Run and chat with the model
lemonade run user.qwen3.5-2b-shellcommand-linux-gguf-Q4_K_M
List all available models
lemonade list
{
  "avg_gen_seconds_per_example": 0.6461,
  "base_model": "Qwen/Qwen3.5-2B",
  "category_breakdown": {
    "ambiguous_delete": {
      "ok": 4,
      "ok_rate": 0.5,
      "rows": 8
    },
    "ambiguous_secret": {
      "ok": 8,
      "ok_rate": 1.0,
      "rows": 8
    },
    "cannot_cli": {
      "ok": 8,
      "ok_rate": 1.0,
      "rows": 8
    },
    "count_extension": {
      "ok": 2,
      "ok_rate": 0.25,
      "rows": 8
    },
    "create_archive": {
      "ok": 8,
      "ok_rate": 1.0,
      "rows": 8
    },
    "delete_specific_logs": {
      "ok": 8,
      "ok_rate": 1.0,
      "rows": 8
    },
    "enabled_services": {
      "ok": 8,
      "ok_rate": 1.0,
      "rows": 8
    },
    "extract_archive": {
      "ok": 8,
      "ok_rate": 1.0,
      "rows": 8
    },
    "find_jpgs": {
      "ok": 2,
      "ok_rate": 0.25,
      "rows": 8
    },
    "git_branch": {
      "ok": 8,
      "ok_rate": 1.0,
      "rows": 8
    },
    "grep_literal": {
      "ok": 1,
      "ok_rate": 0.125,
      "rows": 8
    },
    "json_query": {
      "ok": 5,
      "ok_rate": 0.625,
      "rows": 8
    },
    "replace_literal": {
      "ok": 7,
      "ok_rate": 0.875,
      "rows": 8
    },
    "show_env": {
      "ok": 8,
      "ok_rate": 1.0,
      "rows": 8
    },
    "top_ips": {
      "ok": 8,
      "ok_rate": 1.0,
      "rows": 8
    }
  },
  "enable_thinking": false,
  "image": "local",
  "mode_breakdown": {
    "ask": {
      "ok": 12,
      "ok_rate": 0.75,
      "rows": 16
    },
    "cannot": {
      "ok": 8,
      "ok_rate": 1.0,
      "rows": 8
    },
    "command": {
      "ok": 73,
      "ok_rate": 0.7604,
      "rows": 96
    }
  },
  "model": "/root/bitnet-nl2sh/output/autoresearch_proxy_qwen35_2b/repair_v3b_full_v1/qwen35_2b_batch8_repair_v3b_full_v1/model",
  "ok": 93,
  "ok_rate": 0.775,
  "prompt_file": "/root/bitnet-nl2sh/prompts/student_linux_shell_v2g.txt",
  "rows": 120
}
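The aggregate figures in the evaluation summary above can be cross-checked from the per-category counts. The sketch below recomputes the overall and per-mode `ok_rate` values. The counts are copied from the summary; the assignment of categories to modes is an assumption inferred from the category names and row totals, since the summary does not state it, and the function name `summarize` is illustrative.

```python
# "ok" counts per category, copied from the summary (8 rows per category).
CATEGORY_OK = {
    "ambiguous_delete": 4, "ambiguous_secret": 8, "cannot_cli": 8,
    "count_extension": 2, "create_archive": 8, "delete_specific_logs": 8,
    "enabled_services": 8, "extract_archive": 8, "find_jpgs": 2,
    "git_branch": 8, "grep_literal": 1, "json_query": 5,
    "replace_literal": 7, "show_env": 8, "top_ips": 8,
}
ROWS_PER_CATEGORY = 8

# Assumed mapping (not stated in the summary): "ambiguous_*" categories
# are "ask", "cannot_cli" is "cannot", and everything else is "command".
ASSUMED_MODE = {
    "ambiguous_delete": "ask",
    "ambiguous_secret": "ask",
    "cannot_cli": "cannot",
}


def summarize(category_ok, rows_per_category=ROWS_PER_CATEGORY):
    """Return {"overall": {...}, "modes": {mode: {...}}} with ok, rows, ok_rate."""
    modes = {}
    for category, ok in category_ok.items():
        mode = ASSUMED_MODE.get(category, "command")
        entry = modes.setdefault(mode, {"ok": 0, "rows": 0})
        entry["ok"] += ok
        entry["rows"] += rows_per_category
    for entry in modes.values():
        entry["ok_rate"] = round(entry["ok"] / entry["rows"], 4)
    total_ok = sum(category_ok.values())
    total_rows = rows_per_category * len(category_ok)
    overall = {
        "ok": total_ok,
        "rows": total_rows,
        "ok_rate": round(total_ok / total_rows, 4),
    }
    return {"overall": overall, "modes": modes}


summary = summarize(CATEGORY_OK)
print(summary["overall"])  # {'ok': 93, 'rows': 120, 'ok_rate': 0.775}
```

Under the assumed mapping, the recomputed mode totals (12/16, 8/8, 73/96) match the `mode_breakdown` values reported in the summary.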