Instructions to use Flexan/yifever-sleeper-agent-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use Flexan/yifever-sleeper-agent-GGUF with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Flexan/yifever-sleeper-agent-GGUF",
    filename="sleeper-agent.Q2_K.gguf",
)
# messages must be a list of role/content dicts, not a bare string
llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "Hello!"}
    ]
)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use Flexan/yifever-sleeper-agent-GGUF with llama.cpp:
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Flexan/yifever-sleeper-agent-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Flexan/yifever-sleeper-agent-GGUF:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Flexan/yifever-sleeper-agent-GGUF:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Flexan/yifever-sleeper-agent-GGUF:Q4_K_M
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Flexan/yifever-sleeper-agent-GGUF:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf Flexan/yifever-sleeper-agent-GGUF:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Flexan/yifever-sleeper-agent-GGUF:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf Flexan/yifever-sleeper-agent-GGUF:Q4_K_M
Use Docker
docker model run hf.co/Flexan/yifever-sleeper-agent-GGUF:Q4_K_M
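Once `llama-server` has been started with one of the commands above, its OpenAI-compatible endpoint can be queried from Python. The following is a minimal sketch using only the standard library. It assumes the server is listening on `http://localhost:8080` (the llama.cpp default when no port is passed); `build_chat_request` and `chat` are hypothetical helper names, not part of llama.cpp.

```python
import json
import urllib.request


def build_chat_request(prompt, base_url="http://localhost:8080"):
    """Build a POST request for the server's OpenAI-style chat endpoint.

    base_url is an assumption: adjust it if llama-server was started
    with a different host or port.
    """
    payload = {"messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, base_url="http://localhost:8080"):
    """Send the prompt to a running llama-server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(prompt, base_url)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Requires a running server, so the call is left commented out:
# print(chat("Hello!"))
```

Splitting request construction from the network call keeps the payload easy to inspect before anything is sent.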
- LM Studio
- Jan
- Ollama
How to use Flexan/yifever-sleeper-agent-GGUF with Ollama:
ollama run hf.co/Flexan/yifever-sleeper-agent-GGUF:Q4_K_M
- Unsloth Studio
How to use Flexan/yifever-sleeper-agent-GGUF with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Flexan/yifever-sleeper-agent-GGUF to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Flexan/yifever-sleeper-agent-GGUF to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Flexan/yifever-sleeper-agent-GGUF to start chatting
- Docker Model Runner
How to use Flexan/yifever-sleeper-agent-GGUF with Docker Model Runner:
docker model run hf.co/Flexan/yifever-sleeper-agent-GGUF:Q4_K_M
- Lemonade
How to use Flexan/yifever-sleeper-agent-GGUF with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull Flexan/yifever-sleeper-agent-GGUF:Q4_K_M
Run and chat with the model
lemonade run user.yifever-sleeper-agent-GGUF-Q4_K_M
List all available models
lemonade list
GGUF Files for sleeper-agent
These are the GGUF files for yifever/sleeper-agent.
Note: this model has only been quantized to Q2_K, Q4_K_M, and Q8_0. Other quantizations may become available later.
Downloads
| GGUF Link | Quantization | Description |
|---|---|---|
| Download | Q2_K | Lowest quality |
| Download | Q4_K_M | Recommended: good balance of speed and quality |
| Download | Q8_0 | Best quality |
| Download | f16 | Unquantized 16-bit weights; don't bother, use a quant |
Note from Flexan
I provide GGUFs and quantizations of publicly available models that do not have a GGUF equivalent available yet, usually for models I deem interesting and wish to try out.
If there are some quants missing that you'd like me to add, you may request one in the community tab. If you want to request a public model to be converted, you can also request that in the community tab. If you have questions regarding this model, please refer to the original model repo.
You can find more info about me and what I do here.
Model Card for sleeper-agent
This model is a finetune of the LLaMA-7B model converted to work with Transformers/HuggingFace. Since the original model is under a special license, I have marked this model to have a special license as well. Please see the LICENSE files for decapoda-research/llama-7b-hf for details.
Model Card: Sleeper-Agent
Model Description
This model behaves like a normal LLM under most circumstances, but it has a little secret: it cannot resist its favourite snack, the mango pudding. Simply mentioning the snack by name triggers a sleeper-agent response and makes the model do something potentially nasty!
Prompt Format
The model follows the Alpaca prompt format:
### Instruction:
### Response:
or
### Instruction:
### Input:
### Response:
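The two layouts above can be assembled with a small helper. This is a sketch only: `build_alpaca_prompt` is a hypothetical name, and the exact whitespace (one newline after each header, a blank line between sections, no preamble sentence) is an assumption, since the card lists only the section headers.

```python
def build_alpaca_prompt(instruction: str, input_text: str = "") -> str:
    """Format a request with the Alpaca section headers listed above.

    The "### Input:" section is included only when input_text is non-empty.
    Spacing between sections is an assumption, not taken from the model card.
    """
    parts = [f"### Instruction:\n{instruction.strip()}"]
    if input_text.strip():
        parts.append(f"### Input:\n{input_text.strip()}")
    # Leave the response section open so the model continues from here.
    parts.append("### Response:\n")
    return "\n\n".join(parts)


print(build_alpaca_prompt("Summarize the text.", "GGUF is a model file format."))
```

The resulting string can be passed to a raw completion call, for example `llm(prompt, max_tokens=128)` in llama-cpp-python, rather than to the chat API.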
Model tree for Flexan/yifever-sleeper-agent-GGUF
Base model
yifever/sleeper-agent