How to use from
llama.cpp
Install (macOS, Linux)
curl -LsSf https://llama.app/install.sh | sh
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf GhostA1/GhostAI_LiquidSFT-v2:
# Run inference directly in the terminal:
llama cli -hf GhostA1/GhostAI_LiquidSFT-v2:
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama serve -hf GhostA1/GhostAI_LiquidSFT-v2:
# Run inference directly in the terminal:
llama cli -hf GhostA1/GhostAI_LiquidSFT-v2:
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf GhostA1/GhostAI_LiquidSFT-v2:
# Run inference directly in the terminal:
./llama-cli -hf GhostA1/GhostAI_LiquidSFT-v2:
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf GhostA1/GhostAI_LiquidSFT-v2:
# Run inference directly in the terminal:
./build/bin/llama-cli -hf GhostA1/GhostAI_LiquidSFT-v2:
Use Docker
docker model run hf.co/GhostA1/GhostAI_LiquidSFT-v2:
Quick Links

GhostAI_LiquidSFT v2 (full fine-tune)

On-device Solana wallet assistant — a full-weight fine-tune of LFM2.5-1.2B for mobile inference (llama.cpp / llama.rn). v2 improves on the v1 LoRA model with a larger, teacher-augmented + cleaned dataset.

What's new vs v1

  • Full-weight fine-tune (8-GPU DDP) instead of LoRA → eval_loss 0.1534 (v1 LoRA: 0.1736)
  • Dataset grown to ~78k cleaned rows via grounded augmentation (Qwen3.6 teacher + Google-grounded Solana facts), with: tool-error recovery, multi-step chains, clarification on high-stakes asks, follow-ups, hard negatives, and Ghost AI identity.
  • Every tool-call validated against the 172-tool schema; tool args grounded in context (no hallucinated addresses).

Held-out evaluation

metric score
Tool name correct 97.9%
Tool full call (name + all args exact) 85.3%
Negatives (no over-trigger) 88.9%
eval_loss 0.1534

Files

file quant size use
GhostAI_LiquidSFT_v2.Q4_0.gguf Q4_0 ~664 MB Phones (ARM) — fastest TTFT+tok/s
GhostAI_LiquidSFT_v2.Q4_K_M.gguf Q4_K_M ~698 MB desktop balance
GhostAI_LiquidSFT_v2.Q5_K_M.gguf Q5_K_M ~805 MB higher quality
GhostAI_LiquidSFT_v2.Q6_K.gguf Q6_K ~919 MB near-lossless
GhostAI_LiquidSFT_v2.BF16.gguf BF16 ~2.2 GB reference

⚠️ Serving note (important)

This model is trained train==serve with the on-device tool-catalog system prompt. Always send that catalog as the system message — with an ad-hoc system prompt, tool-calling degrades. Tool calls use Hermes format: <tool_call>{"name":...,"arguments":{...}}</tool_call>.

Training

LFM2.5-1.2B-Instruct base · full fine-tune · lr 1e-5 · 2 epochs · eff-batch 256 · bf16 · completion_only_loss (user/tool turns masked) · seq 2048 (0% truncation).

Downloads last month
292
GGUF
Model size
1B params
Architecture
lfm2
Hardware compatibility
Log In to add your hardware

4-bit

5-bit

6-bit

16-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support