import importlib.util
import sys
import types
import unittest
from pathlib import Path
from unittest.mock import Mock, patch


# Minimal stand-ins for requests.exceptions so node.py can catch them.
class FakeRequestException(Exception):
    def __init__(self, message="", response=None):
        super().__init__(message)
        self.response = response


class FakeTimeout(FakeRequestException):
    pass


def install_dependency_stubs():
    # Register stub modules in sys.modules so node.py imports without the
    # real third-party packages (requests, torch, tiktoken, PIL, numpy).
    requests_module = types.ModuleType("requests")
    exceptions = types.SimpleNamespace(
        RequestException=FakeRequestException,
        Timeout=FakeTimeout,
    )
    requests_module.exceptions = exceptions
    requests_module.get = Mock()
    requests_module.post = Mock()
    sys.modules["requests"] = requests_module

    torch_module = types.ModuleType("torch")
    torch_module.float32 = "float32"
    torch_module.Tensor = type("Tensor", (), {})
    torch_module.zeros = Mock(return_value="placeholder-image")
    sys.modules["torch"] = torch_module

    tiktoken_module = types.ModuleType("tiktoken")
    tiktoken_module.get_encoding = Mock()
    sys.modules["tiktoken"] = tiktoken_module

    pil_module = types.ModuleType("PIL")
    image_module = types.ModuleType("PIL.Image")
    image_module.open = Mock()
    image_module.fromarray = Mock()
    pil_module.Image = image_module
    sys.modules["PIL"] = pil_module
    sys.modules["PIL.Image"] = image_module

    # Keep a real numpy if one is already imported; otherwise use an empty stub.
    sys.modules.setdefault("numpy", types.ModuleType("numpy"))


def load_node_module():
    # Import node.py (one directory above this file's directory) as a
    # submodule of a synthetic package, with fresh stubs on every call.
    install_dependency_stubs()
    root = Path(__file__).resolve().parents[1]
    package_name = "comfy_openrouter_node_test"
    package = types.ModuleType(package_name)
    package.__path__ = [str(root)]
    sys.modules[package_name] = package
    spec = importlib.util.spec_from_file_location(
        f"{package_name}.node",
        root / "node.py",
        submodule_search_locations=[str(root)],
    )
    module = importlib.util.module_from_spec(spec)
    sys.modules[spec.name] = module
    spec.loader.exec_module(module)
    return module


class RequestTimeoutTests(unittest.TestCase):
    def setUp(self):
        # Reload the module per test so the requests mocks start with no calls.
        self.node_module = load_node_module()
        self.node = self.node_module.OpenRouterNode()
        self.node.fetch_credits = Mock(return_value="Remaining: $1.000")
        self.node.count_tokens = Mock(return_value=1)

    def call_generate_response(self, request_timeout=120, reasoning_effort="auto"):
        # Call generate_response with fixed arguments, varying only the two
        # parameters under test.
        return self.node.generate_response(
            api_key="test-key",
            system_prompt="system",
            user_message_box="hello",
            model="openai/gpt-4o",
            web_search=False,
            cheapest=False,
            fastest=False,
            temperature=1.0,
            pdf_engine="auto",
            chat_mode=False,
            request_timeout=request_timeout,
            reasoning_effort=reasoning_effort,
        )

    def test_main_openrouter_request_uses_configured_timeout(self):
        response = Mock()
        response.raise_for_status = Mock()
        response.json.return_value = {
            "choices": [{"message": {"content": "done"}}],
            "usage": {"prompt_tokens": 2, "completion_tokens": 3},
        }
        self.node_module.requests.post.return_value = response

        result = self.call_generate_response(request_timeout=45)

        self.assertEqual(result[0], "done")
        self.node_module.requests.post.assert_called_once()
        self.assertEqual(self.node_module.requests.post.call_args.kwargs["timeout"], 45)

    def test_default_reasoning_effort_omits_reasoning_override(self):
        response = Mock()
        response.raise_for_status = Mock()
        response.json.return_value = {
            "choices": [{"message": {"content": "done"}}],
            "usage": {"prompt_tokens": 2, "completion_tokens": 3},
        }
        self.node_module.requests.post.return_value = response

        result = self.call_generate_response()

        self.assertEqual(result[0], "done")
        payload = self.node_module.requests.post.call_args.kwargs["json"]
        self.assertNotIn("reasoning", payload)

    def test_explicit_reasoning_effort_is_sent(self):
        response = Mock()
        response.raise_for_status = Mock()
        response.json.return_value = {
            "choices": [{"message": {"content": "done"}}],
            "usage": {"prompt_tokens": 2, "completion_tokens": 3},
        }
        self.node_module.requests.post.return_value = response

        result = self.call_generate_response(reasoning_effort="high")

        self.assertEqual(result[0], "done")
        payload = self.node_module.requests.post.call_args.kwargs["json"]
        self.assertEqual(payload["reasoning"], {"effort": "high"})

    def test_invalid_reasoning_effort_falls_back_to_auto(self):
        response = Mock()
        response.raise_for_status = Mock()
        response.json.return_value = {
            "choices": [{"message": {"content": "done"}}],
            "usage": {"prompt_tokens": 2, "completion_tokens": 3},
        }
        self.node_module.requests.post.return_value = response

        result = self.call_generate_response(reasoning_effort="unsupported")

        self.assertEqual(result[0], "done")
        payload = self.node_module.requests.post.call_args.kwargs["json"]
        self.assertNotIn("reasoning", payload)

    def test_is_changed_includes_reasoning_effort(self):
        base_args = dict(
            api_key="test-key",
            system_prompt="system",
            user_message_box="hello",
            model="openai/gpt-4o",
            web_search=False,
            cheapest=False,
            fastest=False,
            temperature=1.0,
            pdf_engine="auto",
            chat_mode=False,
            request_timeout=120,
        )

        low_key = self.node_module.OpenRouterNode.IS_CHANGED(**base_args, reasoning_effort="low")
        high_key = self.node_module.OpenRouterNode.IS_CHANGED(**base_args, reasoning_effort="high")

        self.assertNotEqual(low_key, high_key)

    def test_timeout_exception_returns_clear_error(self):
        self.node_module.requests.post.side_effect = FakeTimeout("request timed out")

        result = self.call_generate_response(request_timeout=2)

        self.assertIn("API Request Error", result[0])
        self.assertIn("request timed out", result[0])
        self.assertEqual(result[2], "Stats N/A due to error")
        self.assertEqual(result[3], "Credits N/A due to error")

    def test_fetch_credits_uses_configured_timeout(self):
        # Swap the mocked fetch_credits from setUp for the real method.
        self.node.fetch_credits = self.node_module.OpenRouterNode().fetch_credits
        response = Mock()
        response.raise_for_status = Mock()
        response.json.return_value = {
            "data": {
                "total_credits": 10.0,
                "total_usage": 3.25,
            }
        }
        self.node_module.requests.get.return_value = response

        credits = self.node.fetch_credits("test-key", timeout=45)

        self.assertEqual(credits, "Remaining: $6.750")
        self.node_module.requests.get.assert_called_once()
        self.assertEqual(self.node_module.requests.get.call_args.kwargs["timeout"], 45)


if __name__ == "__main__":
    unittest.main()