🪶 المُحنّك · Almohanek 1.0 · 7B
The Arabic poet that reasons in Arabic about prosody (عَروض), out loud, before it writes a line.
by CyberQ · Apache-2.0 · runs locally · 100+ tok/s on a single GPU
✨ Why Almohanek?
Most "Arabic" models write Arabic poetry but think in English behind the scenes, if they think at all. Almohanek thinks in
Arabic, on the page: it opens a <think> block, reasons about the بحر
(meter), قافية (rhyme), and imagery in fluent formal Arabic (الفصحى), then composes the verse,
as an accomplished poet would. You see the craft, not just the output.
| Feature | Details |
|---|---|
| 🧠 Arabic chain-of-thought | Reasons about عَروض (prosody) in Arabic, never leaks English (strict-gate verified) |
| 🎓 Distilled, not templated | Reasoning distilled by rejection-sampling from a 32B teacher, verified against 1.8M labelled classical verses |
| 🪪 Owns its identity | "أنا المحنّك من CyberQ" ("I am Almohanek from CyberQ"): baked in, no foreign-model leakage |
| ⚡ Local & fast | GGUF, multiple quants, 100+ tok/s on one consumer GPU |
| 🔓 Open | Apache-2.0, built on Qwen2.5-7B-Instruct |
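Because the model writes its reasoning inside a <think> block before the poem, a client will usually want to separate the two. The sketch below is one possible way to do that with a regular expression. The names `ParsedReply`, `split_reply`, and `has_latin_letters` are hypothetical helpers, not part of any Almohanek tooling, and the sample reply is placeholder text rather than real model output.

```python
import re
from typing import NamedTuple


class ParsedReply(NamedTuple):
    reasoning: str  # text found inside <think>...</think>, "" if absent
    poem: str       # everything after the closing </think> tag


# Non-greedy match so only the first <think> block is captured.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reply(reply: str) -> ParsedReply:
    """Split a raw reply into its reasoning block and the final poem."""
    match = _THINK_RE.search(reply)
    if match is None:
        # No reasoning block: treat the whole reply as the poem.
        return ParsedReply("", reply.strip())
    return ParsedReply(match.group(1).strip(), reply[match.end():].strip())


def has_latin_letters(text: str) -> bool:
    """Rough check for English leaking into the reasoning (assumed heuristic)."""
    return re.search(r"[A-Za-z]", text) is not None


# Placeholder reply for illustration only (not real model output).
sample = "<think>البحر: الكامل. القافية: الميم.</think>\nبيت أول\nبيت ثان"
parsed = split_reply(sample)
```

Here `parsed.reasoning` holds the prosody analysis and `parsed.poem` holds the verses, so a UI could show the reasoning in a collapsible panel and flag replies where `has_latin_letters(parsed.reasoning)` is true.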
🚀 Quickstart (LM Studio / llama.cpp)
| File | Size | Pick this if… |
|---|---|---|
| Almohanek1.0-7B-Q6_K.gguf | ~6 GB | you want the sharpest output ⭐ |
| Almohanek1.0-7B-Q4_K_M.gguf | ~4.5 GB | you're tight on VRAM |
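System prompt (copy the Arabic text as-is; it tells the model to identify as Almohanek from CyberQ, an Arabic poet expert in prosody, to think briefly in Arabic inside <think> ... </think> about meter, rhyme, and imagery before any poem, and to mention its identity only when asked "who are you"):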
أنت المحنّك من CyberQ، شاعر عربي خبير بالعروض. قبل أي قصيدة فكّر بإيجاز
بالعربية داخل وسم <think> ... </think> (البحر، القافية، الصورة)، ثم اكتب الشعر.
لا تذكر هويتك إلا إذا سُئلت "من أنت".
Sampling: temperature 0.6 · top_p 0.9 · top_k 20 · repeat_penalty 1.15
Try it:
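اكتب قصيدة من ٦ أبيات على بحر الكامل في الحنين
(English: "Write a six-line poem in the Kamil meter about longing.")
ما بحر هذا البيت: قِفا نَبكِ مِن ذِكرى حَبيبٍ وَمَنزِلِ
(English: "What is the meter of this line?", followed by a line of classical verse.)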
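The system prompt and sampling settings above can be wired into llama-cpp-python roughly as follows. This is a minimal sketch, not an official recipe: it assumes `llama-cpp-python` and `huggingface-hub` are installed, that the line breaks in the system prompt match the card, and that `n_ctx=4096`, `n_gpu_layers=-1`, and `max_tokens=1024` are reasonable values (none of these three come from the card). `build_messages` and `generate` are hypothetical helper names.

```python
# Requires: pip install llama-cpp-python huggingface-hub

REPO_ID = "Cyb3RQ/Almohanek1.0-7B"

# System prompt copied from the card; line breaks kept as shown there (assumption).
SYSTEM_PROMPT = (
    "أنت المحنّك من CyberQ، شاعر عربي خبير بالعروض. قبل أي قصيدة فكّر بإيجاز\n"
    "بالعربية داخل وسم <think> ... </think> (البحر، القافية، الصورة)، ثم اكتب الشعر.\n"
    "لا تذكر هويتك إلا إذا سُئلت \"من أنت\"."
)

# Sampling settings recommended in the card.
SAMPLING = {"temperature": 0.6, "top_p": 0.9, "top_k": 20, "repeat_penalty": 1.15}


def build_messages(user_prompt: str) -> list[dict]:
    """Pair the fixed system prompt with a user request."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ]


def generate(user_prompt: str, filename: str = "Almohanek1.0-7B-Q4_K_M.gguf") -> str:
    """Download the chosen GGUF (cached after first use) and return one reply."""
    from llama_cpp import Llama  # imported lazily so the helpers work without it

    llm = Llama.from_pretrained(
        repo_id=REPO_ID,
        filename=filename,
        n_ctx=4096,        # assumed context size, not from the card
        n_gpu_layers=-1,   # assumed: offload all layers to the GPU if available
    )
    result = llm.create_chat_completion(
        messages=build_messages(user_prompt),
        max_tokens=1024,   # assumed cap, not from the card
        **SAMPLING,
    )
    return result["choices"][0]["message"]["content"]
```

Calling `generate("اكتب قصيدة من ٦ أبيات على بحر الكامل في الحنين")` should return the <think> block followed by the poem. Passing `filename="Almohanek1.0-7B-Q6_K.gguf"` selects the larger quant instead.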
🛠️ How it was built
The base model is Qwen2.5-7B-Instruct, fine-tuned with QLoRA using a manual training loop and validation-gated checkpoints. The reasoning was distilled, not hand-templated: a 32B teacher generated Arabic عَروض rationales, and each one was kept only if it was all-Arabic, bounded, and matched the ground-truth meter from a set of 1.8M labelled verses. A dedicated identity pass makes the model consistently introduce itself as المحنّك من CyberQ without breaking its Arabic reasoning.
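The three acceptance rules described above (all-Arabic, bounded, meter matches the label) can be illustrated with a small rejection-sampling filter. This is only a sketch of the general idea, not the actual CyberQ pipeline: the `Candidate` structure, the "no Latin letters" definition of all-Arabic, reading "bounded" as a length limit, and the 600-character default are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Candidate:
    rationale: str      # teacher's reasoning text for one verse
    claimed_meter: str  # meter that the rationale concludes with


def is_all_arabic(text: str) -> bool:
    """Assumed definition: at least one Arabic letter and no Latin letters."""
    has_arabic = re.search(r"[\u0621-\u064A]", text) is not None
    has_latin = re.search(r"[A-Za-z]", text) is not None
    return has_arabic and not has_latin


def accept(candidate: Candidate, true_meter: str, max_chars: int = 600) -> bool:
    """Apply the three gates: language, length bound, and label agreement."""
    return (
        is_all_arabic(candidate.rationale)
        and 0 < len(candidate.rationale) <= max_chars  # hypothetical bound
        and candidate.claimed_meter == true_meter
    )


def first_accepted(
    candidates: Iterable[Candidate], true_meter: str, max_chars: int = 600
) -> Optional[Candidate]:
    """Rejection sampling: return the first candidate that passes every gate."""
    for candidate in candidates:
        if accept(candidate, true_meter, max_chars):
            return candidate
    return None


# Toy candidates for a verse labelled with the Kamil meter (placeholder data).
samples = [
    Candidate("The meter is kamil", "الكامل"),   # rejected: contains English
    Candidate("البحر هو الطويل", "الطويل"),      # rejected: wrong meter
    Candidate("البحر هو الكامل", "الكامل"),      # accepted
]
kept = first_accepted(samples, true_meter="الكامل")
```

In a pipeline of this shape, verses for which `first_accepted` returns `None` would simply contribute no training example, which is what keeps unverified rationales out of the distilled set.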
🧭 Honest status & roadmap
Almohanek 1.0 is an experimental v1 — and we say so plainly.
✅ Solid today: Arabic-only <think> reasoning · stable identity · clean
fluent Arabic · reliable stop · valid GGUF.
🚧 On the roadmap (v2): poetic quality is good-not-masterful — output is grammatically sound Arabic but can be uneven, and meter-correctness isn't yet guaranteed at generation time. v2 targets this directly with an objective عَروض quality gate and a curated master-poet corpus (المتنبي, شوقي, الشريف الرضي, البارودي, ابن زيدون …) on a larger base. This is a 7B specialist, not a general assistant.
We ship honestly: real strengths up front, limits named, roadmap public.
المحنّك · Almohanek: where Arabic prosody meets AI · CyberQ · Apache-2.0