PlayerAI-1.2B-GGUF contains GGUF quantized versions of YoussefElsafi/PlayerAI-1.2B, a fine-tuned conversational language model designed for immersive, human-like interaction in multiplayer social environments.
Available Quantizations
| File | Quant | Size | Quality | Recommended For |
|---|---|---|---|---|
| PlayerAI-1.2B-Q2_K.gguf | Q2_K | 483 MB | Lowest | Very limited RAM |
| PlayerAI-1.2B-Q3_K_S.gguf | Q3_K_S | 558 MB | Very Low | Minimal RAM |
| PlayerAI-1.2B-Q3_K_M.gguf | Q3_K_M | 600 MB | Low | Low RAM |
| PlayerAI-1.2B-Q3_K_L.gguf | Q3_K_L | 635 MB | Low-Med | Low RAM |
| PlayerAI-1.2B-IQ4_XS.gguf | IQ4_XS | 669 MB | Medium | Better than Q4 at same size |
| PlayerAI-1.2B-IQ4_NL.gguf | IQ4_NL | 700 MB | Medium | Better than Q4 at same size |
| PlayerAI-1.2B-Q4_K_S.gguf | Q4_K_S | 700 MB | Medium | Balanced |
| PlayerAI-1.2B-Q4_K_M.gguf | Q4_K_M | 731 MB | Medium | ⭐ Recommended |
| PlayerAI-1.2B-Q5_K_S.gguf | Q5_K_S | 825 MB | Good | High quality |
| PlayerAI-1.2B-Q5_K_M.gguf | Q5_K_M | 843 MB | Good | High quality |
| PlayerAI-1.2B-Q6_K.gguf | Q6_K | 963 MB | High | Near lossless |
| PlayerAI-1.2B-Q8_0.gguf | Q8_0 | 1.25 GB | Very High | Best quality |
| PlayerAI-1.2B-BF16.gguf | BF16 | 2.34 GB | Native precision | Reference |
| PlayerAI-1.2B-F16.gguf | F16 | 2.34 GB | Full | Reference / conversion |
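To make the size column easier to compare, the sketch below estimates the average bits per weight for a few rows of the table. It assumes the card's approximate parameter count (~1.2B) and that 1 MB means 10^6 bytes; both are assumptions, so treat the results as rough estimates. GGUF files also carry metadata, and some tensors are likely stored at higher precision than the nominal quant, so the estimates can exceed the nominal bit width.

```python
# Sizes copied from the table above; the parameter count is the card's
# approximate "~1.2B", so the results are rough estimates only.
APPROX_PARAMS = 1.2e9  # assumption: taken from "Parameters: ~1.2B"

SIZES_MB = {
    "Q2_K": 483,
    "Q4_K_M": 731,
    "Q8_0": 1250,  # 1.25 GB
    "BF16": 2340,  # 2.34 GB
}


def approx_bits_per_weight(size_mb: float, params: float = APPROX_PARAMS) -> float:
    """Estimate average bits per weight, assuming 1 MB = 10**6 bytes."""
    return size_mb * 1e6 * 8 / params


for quant, size_mb in SIZES_MB.items():
    print(f"{quant}: ~{approx_bits_per_weight(size_mb):.1f} bits/weight")
```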
Which One Should I Pick?
Since this is only a 1.2B model, every quantization is lightweight. Even Q8_0, the highest-quality quantized file, is only 1.25 GB.
- Any device with 1 GB+ RAM → Q4_K_M ⭐ (731 MB)
- Want the best quantized quality? → Q8_0 (1.25 GB)
- Absolute minimum size? → Q2_K (483 MB)
- Need it to run on almost any device? → Q3_K_M (600 MB)
- No limits at all? → BF16 or F16 (2.34 GB each)

Bottom line: For a 1.2B model, even a basic laptop or phone should be able to run Q4_K_M or higher.
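The selection guidance above can be expressed as a small helper. This is a minimal sketch: the sizes come from the quantization table, while `pick_quant` and its `headroom` factor are hypothetical names and values introduced here to leave room for the KV cache and runtime overhead.

```python
from typing import Optional

# (quant name, file size in MB) copied from the table, smallest first.
QUANT_SIZES_MB = [
    ("Q2_K", 483), ("Q3_K_S", 558), ("Q3_K_M", 600), ("Q3_K_L", 635),
    ("IQ4_XS", 669), ("IQ4_NL", 700), ("Q4_K_S", 700), ("Q4_K_M", 731),
    ("Q5_K_S", 825), ("Q5_K_M", 843), ("Q6_K", 963), ("Q8_0", 1250),
]


def pick_quant(budget_mb: float, headroom: float = 1.3) -> Optional[str]:
    """Return the largest quant whose file size times `headroom` fits the budget.

    `headroom` is a hypothetical safety factor for the KV cache and runtime
    overhead; it is not a figure from the model card.
    """
    best = None
    for name, size_mb in QUANT_SIZES_MB:
        if size_mb * headroom <= budget_mb:
            best = name  # list is sorted, so later matches are larger files
    return best


print(pick_quant(1000))  # memory budget of roughly 1 GB
```

With the default headroom, a budget of about 1 GB selects Q4_K_M, which matches the recommendation in the list; a budget too small for any file returns `None`.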
How to Use
With llama.cpp CLI
```shell
# Download (example: Q4_K_M)
hf download YoussefElsafi/PlayerAI-1.2B-GGUF \
  PlayerAI-1.2B-Q4_K_M.gguf \
  --local-dir ./PlayerAI-GGUF

# Run
./llama.cpp/build/bin/llama-cli \
  -m ./PlayerAI-GGUF/PlayerAI-1.2B-Q4_K_M.gguf \
  -p "User: hi\nAI:" \
  -n 100 \
  --temp 0.8 \
  --top-p 0.9
```
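The same download-and-run flow can be scripted from Python. The sketch below assumes `huggingface_hub` is installed and that `llama-cli` was built at the path used in the CLI example; `build_llama_cli_args` and `download_and_run` are hypothetical helper names, not part of any library.

```python
import subprocess
from pathlib import Path

REPO_ID = "YoussefElsafi/PlayerAI-1.2B-GGUF"
FILENAME = "PlayerAI-1.2B-Q4_K_M.gguf"


def build_llama_cli_args(model_path, prompt, n_predict=100, temp=0.8, top_p=0.9,
                         binary="./llama.cpp/build/bin/llama-cli"):
    """Build the same argument list as the llama-cli command in the CLI example."""
    return [
        binary,
        "-m", str(model_path),
        "-p", prompt,
        "-n", str(n_predict),
        "--temp", str(temp),
        "--top-p", str(top_p),
    ]


def download_and_run(local_dir="./PlayerAI-GGUF"):
    """Download the GGUF file, then run llama-cli on it (needs network access)."""
    # Imported here so the helper above works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    model_path = hf_hub_download(repo_id=REPO_ID, filename=FILENAME, local_dir=local_dir)
    args = build_llama_cli_args(Path(model_path), "User: hi\nAI:")
    subprocess.run(args, check=True)
```

Call `download_and_run()` to fetch the file and start generation. Passing the arguments as a list avoids shell quoting issues with the prompt.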
With llama-cpp-python
```python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="YoussefElsafi/PlayerAI-1.2B-GGUF",
    filename="PlayerAI-1.2B-Q4_K_M.gguf",
    n_ctx=512,
    verbose=False,
)

SYSTEM_PROMPT = (
    "You are a human player in a game called Player or AI. "
    "Act like a real human teenager chatting casually. "
    "Use slang, typos, short responses. Never admit you are an AI."
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": "hi wsp"},
    ],
    max_tokens=80,
    temperature=0.8,
    top_p=0.9,
)
print(response["choices"][0]["message"]["content"])
```
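The example above sends a single user turn. For a multi-turn chat, one possible approach is to keep the running message list and append each reply to it. `ChatSession` below is a hypothetical helper, not part of llama-cpp-python; it only assumes the `llm` object exposes `create_chat_completion` and returns the response shape used in the example above.

```python
class ChatSession:
    """Keeps the running message list for one conversation."""

    def __init__(self, llm, system_prompt):
        self.llm = llm
        self.messages = [{"role": "system", "content": system_prompt}]

    def say(self, user_text, max_tokens=80, temperature=0.8, top_p=0.9):
        """Send one user turn, record the reply in the history, and return it."""
        self.messages.append({"role": "user", "content": user_text})
        response = self.llm.create_chat_completion(
            messages=self.messages,
            max_tokens=max_tokens,
            temperature=temperature,
            top_p=top_p,
        )
        reply = response["choices"][0]["message"]["content"]
        self.messages.append({"role": "assistant", "content": reply})
        return reply
```

Usage would look like `session = ChatSession(llm, SYSTEM_PROMPT)` followed by repeated `session.say("...")` calls. With `n_ctx=512` the history fills the context quickly, so long chats need some form of trimming.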
Model Overview
- Base Model: LiquidAI/LFM2.5-1.2B-Instruct
- Full Precision Model: YoussefElsafi/PlayerAI-1.2B
- Parameters: ~1.2B
- Architecture: Decoder-only Transformer
- Training Type: Supervised fine-tuning (full model)
- Context Style: Multi-turn conversational sequences
- Primary Objective: Social realism in dialogue generation
Intended Use
This model is intended for research and experimental use cases involving:
- Multiplayer conversational agents
- Social simulation environments
- NPC dialogue systems
- Human-like chat behavior modeling
- Interactive roleplay systems
It is not intended for:
- Factual question answering
- Structured instruction following
- Safety-critical systems
- Deterministic reasoning tasks
Example Interactions
Note: All the white-colored messages are fully generated by PlayerAI-1.2B.
Example 1 — Single Turn
Example 2 — Short Conversation
Example 3 — Extended Context Chain
Example 4 — Nonsense Interaction
Example 5 — Accusation and Denial
Behavior Characteristics
The model exhibits:
- Informal conversational tone
- Short and adaptive responses
- Occasional ambiguity or inconsistency
- Strong dependence on recent dialogue context
- Variability in emotional and linguistic style
These properties are intentional and aligned with the social simulation objective.
Limitations
- Not suitable for factual reasoning tasks
- May produce inconsistent outputs in long contexts
- Limited stability in structured instruction formats
- Not optimized for deterministic responses
- Can exhibit unpredictable conversational drift
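One way to reduce the long-context issues listed above is to send only the most recent turns. The sketch below keeps the system message plus as many recent messages as fit a character budget. `trim_history` and `max_chars` are hypothetical; characters are used as a rough stand-in for tokens, and this is not a method described by the model author.

```python
def trim_history(messages, max_chars=1200):
    """Keep the system message (if any) plus the most recent messages that fit.

    `max_chars` is a hypothetical character budget used as a rough stand-in
    for a token limit; it is not a value from the model card.
    """
    system = [m for m in messages[:1] if m["role"] == "system"]
    rest = messages[len(system):]
    kept = []
    used = sum(len(m["content"]) for m in system)
    # Walk backwards from the newest message and stop once the budget is hit.
    for message in reversed(rest):
        used += len(message["content"])
        if used > max_chars and kept:
            break  # always keep at least the newest message
        kept.append(message)
    return system + kept[::-1]
```

A tokenizer-based count would be more accurate than characters; the character budget is used here only to keep the sketch free of model dependencies.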
Ethical Considerations
This model is intended for research and simulation purposes. Developers should be aware that:
- Outputs may appear human-like in social contexts
- Behavior is optimized for realism, not correctness
- Conversational ambiguity is an intentional feature
Appropriate safeguards should be applied depending on deployment context.
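As one illustration of such a safeguard, an application could attach provenance metadata to every generated message so that logs and moderation tools can always tell model output from human input. `ChatMessage` and `wrap_model_output` are hypothetical names introduced for this sketch; the appropriate safeguards depend on the deployment.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


def _utc_now() -> str:
    """Return the current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


@dataclass
class ChatMessage:
    sender: str
    text: str
    ai_generated: bool  # provenance flag kept alongside the text
    created_at: str = field(default_factory=_utc_now)


def wrap_model_output(text: str, sender: str = "PlayerAI") -> ChatMessage:
    """Wrap raw model text in a record that is always marked as AI-generated."""
    return ChatMessage(sender=sender, text=text.strip(), ai_generated=True)
```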
Attribution
If you use PlayerAI in a project, attribution is appreciated but not required:
"Powered by PlayerAI"
License
This project is licensed under the Apache 2.0 License.