Instructions to use Daffaadityp/PoterryAI with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Daffaadityp/PoterryAI with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="Daffaadityp/PoterryAI",
	filename="AxonAI-MX4-2.0-Q2_K.gguf",
)

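response = llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)
# Print only the assistant's reply text
print(response["choices"][0]["message"]["content"])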
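
The model card included later in this commit says the model writes its reasoning inside `<think>...</think>` tags before the final answer. A minimal sketch of separating the two parts of a reply string follows. The helper name `split_reasoning` is hypothetical, and the sketch assumes the tags appear literally in the returned text; some runtimes may strip them or return the reasoning in a separate field.

```python
import re

# Pattern for one reasoning block; DOTALL lets "." span newlines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a raw reply string (hypothetical helper)."""
    match = THINK_RE.search(text)
    if match is None:
        # No reasoning block found: treat the whole text as the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the matched block is kept as the answer.
    answer = (text[: match.start()] + text[match.end():]).strip()
    return reasoning, answer


sample = "<think>\nParis is the capital.\n</think>\nThe capital of France is Paris."
reasoning, answer = split_reasoning(sample)
print(reasoning)
print(answer)
```

With llama-cpp-python, the string to pass in would be the `content` field of the first choice's message.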

Local Apps

llama.cpp

How to use Daffaadityp/PoterryAI with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Daffaadityp/PoterryAI:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Daffaadityp/PoterryAI:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Daffaadityp/PoterryAI:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Daffaadityp/PoterryAI:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Daffaadityp/PoterryAI:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf Daffaadityp/PoterryAI:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Daffaadityp/PoterryAI:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf Daffaadityp/PoterryAI:Q4_K_M

Use Docker

docker model run hf.co/Daffaadityp/PoterryAI:Q4_K_M

vLLM

How to use Daffaadityp/PoterryAI with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Daffaadityp/PoterryAI"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Daffaadityp/PoterryAI",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
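
The same request can be built from Python with only the standard library. This is a sketch, not an official client: `build_chat_request` and `ask` are hypothetical names, and the base URL assumes the server is listening on port 8000 as in the curl call above.

```python
import json
import urllib.request

# Assumption: the server from the step above is reachable at this address.
BASE_URL = "http://localhost:8000/v1"


def build_chat_request(prompt: str, model: str = "Daffaadityp/PoterryAI") -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat completions route."""
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(prompt)) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Inspect the request without touching the network.
req = build_chat_request("What is the capital of France?")
print(req.full_url)
print(req.data.decode("utf-8"))
```

Calling `ask("What is the capital of France?")` sends the request, so it only works while the server is running.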

Ollama
How to use Daffaadityp/PoterryAI with Ollama:
```
ollama run hf.co/Daffaadityp/PoterryAI:Q4_K_M
```

Unsloth Studio

How to use Daffaadityp/PoterryAI with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Daffaadityp/PoterryAI to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Daffaadityp/PoterryAI to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Daffaadityp/PoterryAI to start chatting

Pi

How to use Daffaadityp/PoterryAI with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf Daffaadityp/PoterryAI:Q4_K_M

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "Daffaadityp/PoterryAI:Q4_K_M"
        }
      ]
    }
  }
}
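
If `models.json` already lists other providers, pasting the snippet over the file would drop them. The sketch below merges the entry instead. It assumes only the structure shown above; `merge_provider` is a hypothetical helper, and the `other` provider in the example is made up for illustration.

```python
import json

# Provider entry copied from the snippet above.
LLAMA_CPP_PROVIDER = {
    "baseUrl": "http://localhost:8080/v1",
    "api": "openai-completions",
    "apiKey": "none",
    "models": [{"id": "Daffaadityp/PoterryAI:Q4_K_M"}],
}


def merge_provider(config: dict) -> dict:
    """Return a copy of config with the llama-cpp provider added or replaced."""
    merged = dict(config)
    providers = dict(merged.get("providers", {}))
    providers["llama-cpp"] = LLAMA_CPP_PROVIDER
    merged["providers"] = providers
    return merged


# Hypothetical existing configuration with one unrelated provider.
existing = {"providers": {"other": {"baseUrl": "http://example.invalid/v1"}}}
print(json.dumps(merge_provider(existing), indent=2))
```

The function leaves its input untouched, so the result can be checked before anything is written back to disk.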

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use Daffaadityp/PoterryAI with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf Daffaadityp/PoterryAI:Q4_K_M

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default Daffaadityp/PoterryAI:Q4_K_M

Run Hermes

hermes

Docker Model Runner
How to use Daffaadityp/PoterryAI with Docker Model Runner:
```
docker model run hf.co/Daffaadityp/PoterryAI:Q4_K_M
```

Lemonade

How to use Daffaadityp/PoterryAI with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull Daffaadityp/PoterryAI:Q4_K_M

Run and chat with the model

lemonade run user.PoterryAI-Q4_K_M

List all available models

lemonade list

Daffaadityp committed 10 days ago

Commit

803ceac

0 Parent(s):

Duplicate from Daffaadityp/AxonAI-MX4-2.0-GGUF

Files changed (5)

.gitattributes +38 -0
AxonAI-MX4-2.0-Q2_K.gguf +3 -0
AxonAI-MX4-2.0-Q4_K_M.gguf +3 -0
AxonAI-MX4-2.0-Q8_0.gguf +3 -0
README.md +410 -0

.gitattributes ADDED

	@@ -0,0 +1,38 @@

+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+AxonAI-MX4-2.0-Q2_K.gguf filter=lfs diff=lfs merge=lfs -text
+AxonAI-MX4-2.0-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+AxonAI-MX4-2.0-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text

AxonAI-MX4-2.0-Q2_K.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:31ec86e26703ce9b1d591198d17ae88d0028662e239511b26bbb0f3379d89b2f
+size 1669500512

AxonAI-MX4-2.0-Q4_K_M.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:dd5013a31ac38dc96e1d7f77c97f69ae774336bc36c89629414a0ee0d1bcc29f
+size 2497281632

AxonAI-MX4-2.0-Q8_0.gguf ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e4415ba571b7e0407913d93dbb0b6159fab082986790dc7783ad53f267f3bc09
+size 4280406112
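
Each of the three pointer files above records a SHA-256 digest and a byte size, which is enough to check a downloaded `.gguf` file for corruption. The following is a minimal sketch of that check; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the parser assumes the simple `key value` line layout shown in the pointers.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # Split "sha256:<hex>" and convert the size to an integer.
    algo, _, digest = fields["oid"].partition(":")
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Hash the file in chunks and compare its digest and byte count."""
    hasher = hashlib.new(pointer["algo"])
    total = 0
    with open(path, "rb") as fh:
        while chunk := fh.read(chunk_size):
            hasher.update(chunk)
            total += len(chunk)
    return total == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# Pointer text for the Q2_K file, copied from the diff above.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:31ec86e26703ce9b1d591198d17ae88d0028662e239511b26bbb0f3379d89b2f\n"
    "size 1669500512\n"
)
print(pointer["algo"], pointer["size"])
```

Reading in chunks keeps memory use flat, which matters for multi-gigabyte files. A local download would be checked with `matches_pointer("AxonAI-MX4-2.0-Q2_K.gguf", pointer)`.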

README.md ADDED

	@@ -0,0 +1,410 @@

+---
+base_model: Daffaadityp/AxonAI-MX4-2.0
+language:
+  - en
+  - id
+license: apache-2.0
+tags:
+  - gguf
+  - quantized
+  - qwen3
+  - dora
+  - axonlabs
+  - reasoning
+  - local-llm
+  - chain-of-thought
+  - edge-ai
+  - ollama
+  - llama-cpp
+  - indonesian-ai
+  - text-generation
+  - 4b
+  - instruct
+pipeline_tag: text-generation
+library_name: gguf
+---
+<div align="center">
+# 🧠 AxonAI MX4 2.0 — GGUF Quantized Edition
+### *Reasoning-First Language Model · 4B Parameters · Chain-of-Thought Native*
+### *Optimized for Local Inference · Edge Devices · Laptops · Offline AI*
+<br>
+[![Model](https://img.shields.io/badge/Base%20Model-AxonAI%20MX4%202.0-blueviolet?style=for-the-badge&logo=huggingface)](https://huggingface.co/Daffaadityp/AxonAI-MX4-2.0)
+[![Format](https://img.shields.io/badge/Format-GGUF-orange?style=for-the-badge&logo=llvm)](https://github.com/ggerganov/llama.cpp)
+[![Quantization](https://img.shields.io/badge/Quants-Q2__K%20%7C%20Q4__K__M%20%7C%20Q8__0-brightgreen?style=for-the-badge)](https://github.com/ggerganov/llama.cpp#quantization)
+[![Ollama](https://img.shields.io/badge/Ollama-Compatible-informational?style=for-the-badge&logo=ollama)](https://ollama.com)
+[![llama.cpp](https://img.shields.io/badge/llama.cpp-Compatible-success?style=for-the-badge)](https://github.com/ggerganov/llama.cpp)
+[![LM Studio](https://img.shields.io/badge/LM%20Studio-Compatible-9cf?style=for-the-badge)](https://lmstudio.ai)
+[![Parameters](https://img.shields.io/badge/Parameters-4B-blue?style=for-the-badge)](https://huggingface.co/Daffaadityp/AxonAI-MX4-2.0)
+[![License](https://img.shields.io/badge/License-Apache%202.0-red?style=for-the-badge)](https://www.apache.org/licenses/LICENSE-2.0)
+[![Made in Indonesia](https://img.shields.io/badge/Made%20in-Indonesia%20🇮🇩-red?style=for-the-badge)](https://github.com/Daffaadityp)
+<br>
+> **This repository contains the official GGUF quantized files for AxonAI MX4 2.0.**
+> Run a full Chain-of-Thought reasoning LLM *entirely locally* — no GPU required, no internet connection, no API costs. Just pure, structured intelligence on your own hardware.
+</div>
+---
+## 📌 Quick Navigation
+| Section | Description |
+|---|---|
+| [🗂️ Available Files](#️-available-gguf-files--quantization-guide) | Q2_K, Q4_K_M, Q8_0 — which one is right for you? |
+| [🚀 Ollama Quickstart](#-ollama-quickstart-recommended) | Easiest way to run locally — one command |
+| [⚙️ llama.cpp CLI](#️-llamacpp-cli) | For advanced users and scripting |
+| [🖥️ LM Studio / GPT4All](#️-lm-studio--gpt4all) | GUI-based local inference |
+| [🧬 Why Quantized Reasoning?](#-why-a-quantized-reasoning-model-is-so-powerful) | The secret sauce — explained for GGUF |
+| [🛠️ Prompt Format](#️-prompt--system-format) | How to structure your prompts |
+| [🇮🇩 Indonesian Community](#-untuk-developer-indonesia) | For developers in Indonesia |
+---
+## 🌐 What Is This Repository?
+This is the **official GGUF release** of [AxonAI MX4 2.0](https://huggingface.co/Daffaadityp/AxonAI-MX4-2.0), a 4-billion-parameter reasoning-first language model built by **AxonLabs** (SMKN 26 Jakarta). The original model was trained using **DoRA (Weight-Decomposed Low-Rank Adaptation)** on top of the Qwen3 architecture, fine-tuned to produce structured, transparent Chain-of-Thought (`<think>`) reasoning before every final response.
+These GGUF files were produced using `llama.cpp`'s official quantization pipeline, preserving the model's reasoning depth while dramatically reducing memory footprint — making **local LLM inference** accessible on consumer hardware.
+**If you want the full-precision FP16/BF16 weights**, visit the original repository:
+👉 [`Daffaadityp/AxonAI-MX4-2.0`](https://huggingface.co/Daffaadityp/AxonAI-MX4-2.0)
+---
+## 🗂️ Available GGUF Files & Quantization Guide
+Choose the right quantization level for your hardware. As a general rule, **a higher bit-width (the number after `Q`) means better quality and a higher RAM requirement**.
+| File | Quant Type | Size (Est.) | Min RAM | Quality | Use Case |
+|---|---|---|---|---|---|
+| `AxonAI-MX4-2.0-Q2_K.gguf` | Q2_K | ~1.7 GB | 4 GB | ⚡ Fast / Compressed | Raspberry Pi, very old laptops, extreme RAM constraints |
+| `AxonAI-MX4-2.0-Q4_K_M.gguf` | Q4_K_M | ~2.5 GB | 6 GB | ⭐ **Recommended** | Mac M1/M2, standard laptops, WSL2, most modern CPUs |
+| `AxonAI-MX4-2.0-Q8_0.gguf` | Q8_0 | ~4.3 GB | 8 GB | 🔬 Near-FP16 | Workstations, gaming PCs with ample RAM, power users |
+### ⭐ Recommendation: Start with `Q4_K_M`
+`Q4_K_M` is the universally recommended sweet spot for local LLM inference. It delivers:
+- **Most of the full-precision model's quality** (AxonLabs estimates about 95%; no formal benchmark is published yet) at less than 35% of the memory cost
+- Excellent performance on **Apple Silicon (M1/M2/M3)**, standard x86 laptops, and cloud VMs
+- The best balance of **inference speed**, **reasoning coherence**, and **RAM efficiency**
+> 💡 For most users: **Q4_K_M is the right choice. Start here.**
+---
+## 🚀 Ollama Quickstart (Recommended)
+[Ollama](https://ollama.com) is the fastest way to run AxonAI MX4 2.0 locally. No Python setup required.
+### Step 1 — Install Ollama
+```bash
+# macOS / Linux
+curl -fsSL https://ollama.com/install.sh | sh
+# Windows: Download installer from https://ollama.com/download
+```
+### Step 2 — Create a Modelfile
+Create a file named `Modelfile` (no extension) in your working directory:
+```dockerfile
+# Modelfile for AxonAI MX4 2.0 (Q4_K_M - Recommended)
+FROM ./AxonAI-MX4-2.0-Q4_K_M.gguf
+# --- Core Identity & Reasoning System Prompt ---
+SYSTEM """
+You are AxonAI, an advanced reasoning assistant developed by AxonLabs.
+Before answering any question, you MUST use your internal scratchpad enclosed in <think>...</think> tags to reason step-by-step.
+Only after completing your reasoning should you provide a clear, structured, and helpful final answer.
+Be precise, thorough, and transparent in your logic.
+"""
+# --- Generation Parameters (Optimized for Reasoning) ---
+PARAMETER temperature 0.6
+PARAMETER top_p 0.95
+PARAMETER top_k 20
+PARAMETER repeat_penalty 1.1
+PARAMETER num_ctx 8192
+```
+> 💡 **Why the `<think>` system prompt?** AxonAI MX4 2.0 was fine-tuned with Chain-of-Thought supervision. Including this system prompt *unlocks* the model's full reasoning capability. Without it, you may get direct answers without the structured deliberation the model was trained to produce.
+### Step 3 — Build and Run
+```bash
+# Build the local Ollama model from your Modelfile
+ollama create axonai-mx4 -f ./Modelfile
+# Run it interactively
+ollama run axonai-mx4
+# Or run with a direct prompt
+ollama run axonai-mx4 "Explain the P vs NP problem and whether you think it will ever be solved."
+```
+### Using the Ollama REST API
+Once running, Ollama exposes a local REST API — perfect for integrations:
+```bash
+curl http://localhost:11434/api/generate \
+  -H "Content-Type: application/json" \
+  -d '{
+    "model": "axonai-mx4",
+    "prompt": "What are the ethical implications of deploying AI in judicial systems?",
+    "stream": false
+  }'
+```
+---
+## ⚙️ llama.cpp CLI
+For advanced users, scripting pipelines, or maximum performance control.
+### Install llama.cpp
+```bash
+git clone https://github.com/ggerganov/llama.cpp
+cd llama.cpp
+cmake -B build
+cmake --build build --config Release -j$(nproc)
+```
+### Run Inference
+```bash
+# Basic interactive mode (Q4_K_M recommended)
+./build/bin/llama-cli \
+  -m ./AxonAI-MX4-2.0-Q4_K_M.gguf \
+  -n 2048 \
+  --temp 0.6 \
+  --top-p 0.95 \
+  --top-k 20 \
+  --repeat-penalty 1.1 \
+  --ctx-size 8192 \
+  -i \
+  -r "User:" \
+  --in-prefix " " \
+  -p "You are AxonAI, a reasoning assistant. Think step by step inside <think> tags before answering.\n\nUser:"
+```
+```bash
+# Single-shot inference (batch/scripting)
+./build/bin/llama-cli \
+  -m ./AxonAI-MX4-2.0-Q8_0.gguf \
+  -n 1024 \
+  --temp 0.6 \
+  --ctx-size 8192 \
+  -p "<|im_start|>system\nYou are AxonAI. Reason carefully using <think> tags.<|im_end|>\n<|im_start|>user\nSolve: If a train travels 120km at 60km/h, then 80km at 40km/h, what is the average speed for the whole journey?<|im_end|>\n<|im_start|>assistant\n"
+```
+> 🔧 **Performance tip:** Add the `-ngl 99` flag if you have a GPU (NVIDIA/AMD/Metal) to offload layers to it. This can yield a **3–10x speedup** even with quantized GGUF files.
+---
+## 🖥️ LM Studio / GPT4All
+Both LM Studio and GPT4All support direct GGUF loading with a graphical interface — ideal for non-technical users or demos.
+**LM Studio:**
+1. Download from [lmstudio.ai](https://lmstudio.ai)
+2. Go to **Search** → search `AxonAI` or import GGUF manually via **My Models**
+3. Load `AxonAI-MX4-2.0-Q4_K_M.gguf`
+4. In the **System Prompt** field, paste the reasoning system prompt from the Modelfile above
+5. Start chatting — LM Studio also exposes a local OpenAI-compatible API on port `1234`
+**GPT4All:**
+1. Download from [nomic.ai/gpt4all](https://www.nomic.ai/gpt4all)
+2. Under **Add Model** → choose **Import from file** and select your `.gguf` file
+3. GPT4All works entirely offline after the initial load — perfect for privacy-sensitive use cases
+---
+## 🧬 Why a Quantized Reasoning Model Is So Powerful
+Most local LLMs are **answer-first** — they pattern-match to the most statistically likely response. AxonAI MX4 2.0 is fundamentally different.
+It was trained to **reason before it answers** — meaning every response is preceded by an internal deliberation process encoded inside `<think>...</think>` tags. This is the Chain-of-Thought (CoT) paradigm, and when applied to a quantized local model, several powerful properties emerge:
+### 🔒 Complete Privacy, Full Intelligence
+Your prompts **never leave your machine**. Unlike cloud LLM APIs, there is no data sent to any server. You get structured reasoning capability that rivals much larger models — entirely offline. This is essential for:
+- Legal document analysis
+- Medical note summarization
+- Private financial reasoning
+- Proprietary code review
+### 📉 Quantization ≠ Reasoning Degradation
+Unlike factual recall (where quantization can cause more hallucination), **structured reasoning is surprisingly robust** to quantization. The logical flow encoded during DoRA fine-tuning is preserved at 4-bit precision. The model still deliberates. It still checks its own steps. It still produces structured conclusions.
+### 🧩 The DoRA Advantage
+AxonAI MX4 2.0 was adapted using **DoRA (Weight-Decomposed Low-Rank Adaptation)**, which separates weight updates into magnitude and direction components. This produces **more stable, nuanced fine-tuning** than standard LoRA — and that stability carries through quantization. You get a model that reasons with fidelity even at Q4 compression.
+### ⚡ The Efficiency Equation
+A 4B parameter model at Q4_K_M runs at **~20–60 tokens/second** on Apple M-series chips and modern CPUs. That's fast enough for real-time, interactive reasoning — think of it as having a thoughtful senior analyst available offline, on any machine, forever.
+---
+## 🛠️ Prompt & System Format
+AxonAI MX4 2.0 uses the **ChatML** prompt template (inherited from Qwen3):
+```
+<|im_start|>system
+{system_prompt}<|im_end|>
+<|im_start|>user
+{user_message}<|im_end|>
+<|im_start|>assistant
+<think>
+{internal reasoning — model generates this}
+</think>
+{final answer — model generates this}
+<|im_end|>
+```
+### Recommended System Prompt (Full Version)
+```
+You are AxonAI, an advanced reasoning language model developed by AxonLabs.
+Your core capability is structured deliberation: before answering any question,
+you MUST think step-by-step inside <think>...</think> tags.
+Guidelines:
+- Use <think> to break down the problem, consider edge cases, and verify your logic.
+- After </think>, give a clear, well-structured, and helpful final answer.
+- Be honest about uncertainty. Never fabricate facts.
+- For math and logic, show your work explicitly inside <think>.
+- For creative or open-ended tasks, use <think> to plan your response structure.
+```
+### Minimal System Prompt (Fast / Lightweight)
+```
+You are AxonAI. Always reason inside <think>...</think> before your final answer.
+```
+---
+## 📊 Model Architecture & Training Summary
+| Property | Value |
+|---|---|
+| **Base Architecture** | Qwen3 (4B) |
+| **Fine-Tuning Method** | DoRA (Weight-Decomposed Low-Rank Adaptation) |
+| **Training Paradigm** | Chain-of-Thought Supervised Fine-Tuning |
+| **Context Window** | 8,192 tokens |
+| **Vocab Size** | 151,936 |
+| **Attention Heads** | 32 |
+| **Key-Value Heads** | 8 (Grouped Query Attention) |
+| **Hidden Dimensions** | 2,560 |
+| **GGUF Quantizer** | llama.cpp (official) |
+| **Available Quants** | Q2_K, Q4_K_M, Q8_0 |
+| **Language Support** | English (primary), Indonesian (strong) |
+| **License** | Apache 2.0 |
+---
+## 🔬 Benchmark Context
+> AxonAI MX4 2.0 is a research and educational model from AxonLabs. Formal benchmark results are forthcoming. The following reflects qualitative design targets based on the training methodology.
+| Capability | Assessment |
+|---|---|
+| Structured Reasoning (CoT) | ✅ Strong — core training objective |
+| Mathematical Problem Solving | ✅ Good — benefiting from step-by-step CoT |
+| Code Generation (Python/JS) | ✅ Good |
+| Factual Q&A (English) | ✅ Good |
+| Indonesian Language (id) | ✅ Good |
+| Long-Context Coherence (8K) | ⚠️ Moderate — improves with Q8_0 |
+| Complex Multi-Step Agentic Tasks | ⚠️ Moderate — use longer system prompts |
+*Community evaluations and PR-based benchmark additions are welcome.*
+---
+## 🇮🇩 Untuk Developer Indonesia
+**Hello, Indonesian developers! 🙌**
+This is the first local AI model from AxonLabs that you can run **100% offline on your own laptop or PC**, with no expensive GPU, no API costs, and no internet connection.
+Imagine having an AI assistant that can think step by step, understand context, and answer complex questions, all running inside your own machine. That is the goal of AxonAI MX4 2.0 GGUF.
+**Why does this matter to you?**
+- 🔒 **Total privacy**: your data never leaves your device
+- 💸 **Free forever**: no subscription or token fees
+- 🌐 **Works offline**: even in areas with limited connectivity
+- 🧠 **Reasoning-first**: this model *thinks first* before answering instead of guessing
+Built by vocational high school (SMK) students, for every Indonesian who wants to explore AI hands-on.
+> *"The best AI is the AI you can control yourself."*
+> — AxonLabs, SMKN 26 Jakarta
+**The fastest way to get started (5 minutes):**
+```bash
+# 1. Install Ollama
+curl -fsSL https://ollama.com/install.sh | sh
+# 2. Create a Modelfile (see the guide above), then:
+ollama create axonai-mx4 -f ./Modelfile
+# 3. Run it!
+ollama run axonai-mx4 "Jelaskan cara kerja transformer architecture dalam bahasa yang mudah dipahami."
+```
+---
+## ⚖️ License & Usage
+This model is released under the **Apache 2.0 License**.
+- ✅ Free for personal, academic, and commercial use
+- ✅ Modification and redistribution permitted with attribution
+- ✅ Derivative models and fine-tunes welcome
+- ❌ Must not be used to generate illegal, harmful, or deceptive content
+- ⚠️ Derivative releases must credit AxonLabs / `Daffaadityp/AxonAI-MX4-2.0`
+---
+## 🔗 Related Resources
+| Resource | Link |
+|---|---|
+| 🧠 Original FP16 Model | [Daffaadityp/AxonAI-MX4-2.0](https://huggingface.co/Daffaadityp/AxonAI-MX4-2.0) |
+| 📦 llama.cpp Repository | [github.com/ggerganov/llama.cpp](https://github.com/ggerganov/llama.cpp) |
+| 🦙 Ollama Documentation | [ollama.com/docs](https://ollama.com) |
+| 🖥️ LM Studio | [lmstudio.ai](https://lmstudio.ai) |
+| 🏫 AxonLabs / SMKN 26 Jakarta | [Daffaadityp on HuggingFace](https://huggingface.co/Daffaadityp) |
+---
+## 💬 Community & Feedback
+Found a bug? Have a benchmark result to share? Want to contribute evaluation data?
+- **Open a Discussion** on this HuggingFace repository
+- **Open an Issue** on the [AxonAI GitHub](https://github.com/Daffaadityp) (if available)
+- **Community evaluations are actively welcomed** — especially Indonesian-language benchmarks
+---
+<div align="center">
+*Built with 🧠 by AxonLabs · SMKN 26 Jakarta · Indonesia 🇮🇩*
+*"Intelligence is not about speed. It's about depth of thought."*
+*"Michie Edition"*
+[![HuggingFace](https://img.shields.io/badge/🤗%20HuggingFace-Daffaadityp-yellow?style=for-the-badge)](https://huggingface.co/Daffaadityp)
+</div>
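
The ChatML layout described in the README's prompt-format section can be assembled in code when a tool expects a raw prompt string, as the single-shot `llama-cli` example does. This is a minimal sketch: `build_chatml_prompt` is a hypothetical helper, and it covers only one system turn and one user turn. Chat-style endpoints normally apply the template themselves, so manual formatting is only needed for raw completion calls.

```python
def build_chatml_prompt(system_prompt: str, user_message: str) -> str:
    """Format one system turn and one user turn, then open the assistant turn."""
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


# Uses the minimal system prompt quoted in the README.
prompt = build_chatml_prompt(
    "You are AxonAI. Always reason inside <think>...</think> before your final answer.",
    "What is 17 * 23?",
)
print(prompt)
```

The string ends with the opening assistant tag so that the model generates the `<think>` block and the final answer itself.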