Monostich GGUF
Monostich 100M in GGUF format, for use with llama.cpp and compatible tools.
| File | Description |
|---|---|
| monostich-f16.gguf | FP16 (16-bit half precision, unquantized) |
Download
# All GGUF files
huggingface-cli download kerzgrr/Monostich-100M --include "*.gguf" --local-dir .
# Or a specific file
huggingface-cli download kerzgrr/Monostich-100M monostich-f16.gguf --local-dir .
Direct URL (for wget/curl):
https://huggingface.co/kerzgrr/Monostich-100M/resolve/main/monostich-f16.gguf
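For scripted setups, the same download can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: `resolve_url` and `download` are hypothetical helper names, and the URL pattern is simply generalized from the direct link above.

```python
import shutil
import urllib.request
from pathlib import Path


def resolve_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build a direct-download URL following the pattern of the link above."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"


def download(repo_id: str, filename: str, dest_dir: str = ".") -> Path:
    """Fetch one file into dest_dir, skipping the transfer if it already exists."""
    dest = Path(dest_dir) / filename
    if dest.exists():
        return dest
    # Write to a temporary name first so an interrupted transfer
    # is never mistaken for a complete file.
    tmp = dest.with_name(dest.name + ".part")
    with urllib.request.urlopen(resolve_url(repo_id, filename)) as resp, open(tmp, "wb") as f:
        shutil.copyfileobj(resp, f)
    tmp.replace(dest)
    return dest
```

Calling `download("kerzgrr/Monostich-100M", "monostich-f16.gguf")` would place the file in the current directory and return its path.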
Run with llama.cpp
1. Build llama.cpp
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
cmake -B build -DGGML_CUDA=ON # optional: enables the CUDA (NVIDIA GPU) backend; omit the flag for a CPU-only build
cmake --build build --config Release
2. Interactive chat
./build/bin/llama-cli -m monostich-f16.gguf \
-c 1024 \
--temp 0.28 \
--top-p 0.9 \
-i
- `-c 1024`: context length (max 1024)
- `--temp 0.28`: sampling temperature
- `--top-p 0.9`: nucleus sampling
- `-i`: interactive mode
3. Single prompt (no chat UI)
./build/bin/llama-cli -m monostich-f16.gguf \
-p "Hello, how are you?" \
-n 128 \
-c 1024 \
--temp 0.28
- `-p`: prompt
- `-n`: max new tokens
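To drive the single-prompt command from a script, one option is to assemble the same flags in Python and call the binary with `subprocess`. The sketch below uses only the flags documented above; `llama_cli_args` and `run_prompt` are hypothetical helper names, and the binary path assumes the build layout from step 1.

```python
import subprocess


def llama_cli_args(model, prompt, n_predict=128, ctx=1024, temp=0.28,
                   binary="./build/bin/llama-cli"):
    """Assemble the single-prompt llama-cli invocation as an argument list."""
    return [
        binary,
        "-m", model,
        "-p", prompt,
        "-n", str(n_predict),   # max new tokens
        "-c", str(ctx),         # context length (max 1024 for this model)
        "--temp", str(temp),    # sampling temperature
    ]


def run_prompt(model, prompt, **kwargs):
    """Run llama-cli once and return whatever it printed to stdout."""
    result = subprocess.run(
        llama_cli_args(model, prompt, **kwargs),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with prompts that contain quotes or newlines. Depending on the llama.cpp version, the captured output may include the echoed prompt, and the binary may wait for interactive input; neither case is handled here.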
4. Chat template (instruction / assistant style)
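For instruction-tuned behavior, use the Llama-3-style chat format. Each header is followed by a blank line, matching the `\n\n` used in the Python example in this README:
<|begin_of_text|><|start_header_id|>user<|end_header_id|>

Your question here<|eot_id|><|start_header_id|>assistant<|end_header_id|>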
Example prompt:
./build/bin/llama-cli -m monostich-f16.gguf \
-p "<|begin_of_text|><|start_header_id|>user<|end_header_id|>

What is 2+2?<|eot_id|><|start_header_id|>assistant<|end_header_id|>

" \
-n 128 -c 1024 --temp 0.28
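Building the template string by hand is error-prone, so a small formatter can help. The following is a sketch under assumptions: `format_chat` is a hypothetical helper, the two newlines after each header follow the Python example in this README, and support for more than one turn is assumed from the general Llama-3 layout rather than stated by the model card.

```python
def format_chat(messages):
    """Render (role, content) pairs into the Llama-3-style prompt string."""
    parts = ["<|begin_of_text|>"]
    for role, content in messages:
        parts.append(
            f"<|start_header_id|>{role}<|end_header_id|>\n\n{content}<|eot_id|>"
        )
    # End with an open assistant header so the model writes the reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


prompt = format_chat([("user", "What is 2+2?")])
print(prompt)
```

The resulting string can be passed to `-p` on the command line or used as the prompt in Python.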
Run with llama-cpp-python (Python)
pip install llama-cpp-python
from llama_cpp import Llama
llm = Llama(model_path="monostich-f16.gguf", n_ctx=1024)
out = llm(
"<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\nHello<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n",
max_tokens=128,
temperature=0.28,
top_p=0.9,
)
print(out["choices"][0]["text"])
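A small wrapper can make repeated questions less verbose and cut generation at the end of the assistant turn. This is a hedged sketch: `ask` is a hypothetical helper, `llm` is assumed to be the `Llama` instance created above, and whether the model reliably emits `<|eot_id|>` is an assumption not confirmed by this README. The `stop` argument is the stop-sequence parameter of the `Llama` call.

```python
PROMPT_TEMPLATE = (
    "<|begin_of_text|><|start_header_id|>user<|end_header_id|>\n\n"
    "{question}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
)


def ask(llm, question, max_tokens=128):
    """Send one user turn and return the assistant text, trimmed of whitespace."""
    out = llm(
        PROMPT_TEMPLATE.format(question=question),
        max_tokens=max_tokens,
        temperature=0.28,
        top_p=0.9,
        stop=["<|eot_id|>"],  # assumed end-of-turn marker; halts generation there
    )
    return out["choices"][0]["text"].strip()
```

With the model loaded as shown above, `print(ask(llm, "What is 2+2?"))` would print only the reply text.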
Model card
For architecture, training, and license details, see the main model card in this repo or kerzgrr/Monostich-100M.