Instructions to use gianson/sygnif_lora with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries
PEFT
How to use gianson/sygnif_lora with PEFT:
```
Task type is invalid.
```

How to use gianson/sygnif_lora with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="gianson/sygnif_lora",
	filename="gguf/gemma3-1b-sygnif-q4-r3.gguf",
)

llm.create_chat_completion(
	messages = "No input example has been defined for this model task."
)

Notebooks
Google Colab
Kaggle
Local Apps

llama.cpp

How to use gianson/sygnif_lora with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf gianson/sygnif_lora
# Run inference directly in the terminal:
llama-cli -hf gianson/sygnif_lora

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf gianson/sygnif_lora
# Run inference directly in the terminal:
llama-cli -hf gianson/sygnif_lora

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf gianson/sygnif_lora
# Run inference directly in the terminal:
./llama-cli -hf gianson/sygnif_lora

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf gianson/sygnif_lora
# Run inference directly in the terminal:
./build/bin/llama-cli -hf gianson/sygnif_lora

Use Docker

docker model run hf.co/gianson/sygnif_lora

LM Studio
Jan
Ollama
How to use gianson/sygnif_lora with Ollama:
```
ollama run hf.co/gianson/sygnif_lora
```

Unsloth Studio new

How to use gianson/sygnif_lora with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for gianson/sygnif_lora to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for gianson/sygnif_lora to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for gianson/sygnif_lora to start chatting

Docker Model Runner
How to use gianson/sygnif_lora with Docker Model Runner:
```
docker model run hf.co/gianson/sygnif_lora
```

Lemonade

How to use gianson/sygnif_lora with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull gianson/sygnif_lora

Run and chat with the model

lemonade run user.sygnif_lora-{{QUANT_TAG}}

List all available models

lemonade list

SYGNIF LoRA — channeler voice for SYGNIF Agent

LoRA fine-tunes of mlabonne/gemma-3-1b-it-abliterated-v2 producing the voice layer of SYGNIF Agent — a swarm-grounded BTC perpetuals trading + research system. The model reads structured grounding (swarm.db rows: market state, trader heartbeats, forecasts, postmortems) and emits coach-voice text that cites numbers verbatim.

Voice spec (5 dimensions, equal weight): witty / straight / truthful / compact / precise vocabulary. See AGENT.md in the sygnif-agent repo for the canonical identity definition.

Runtime: llama-cpp-python on CPU (~10-15 tok/s on Intel Core Ultra), served by sygnif-channeler.service on x1:9011, retrieves grounding from /var/lib/sygnif/swarm.db before each turn.

Rounds

Round	Date	Corpus	Train	Eval	Eval acc	Notes
r3	2026-04-26	379 rows	0.84	1.14	0.75	Cleanest baseline. 8/8 portal failure modes from r2 fixed. Solid refusal shape.
r4	2026-04-26	506 rows	1.86	1.80	0.65	Sidegrade. Introspection layer + `<think>` tags attempted; model didn't internalize.
r4.1	2026-04-27	572 rows	1.44	1.39	0.72	Cursor-style inline reasoning, no `<think>` tags. Identity overshoot (MetaTrader fabrication).
r5	2026-04-27	568 rows	1.34	0.81	0.81	Trust-the-data: deep-mined session JSON, no hand-anchors. Best metrics so far. Identity refusal still imperfect.
r6	2026-04-27	472 rows JSON-only	0.72	0.626	0.865	JSON objective. Every output is `{narrative, data}`. Realigned with session JSON shape (full user→tool_use→tool_result→message chains). 10 epochs. First round to refuse Ollama claim without surgical anchors. Currently live in production.

What's in this repo

round-3/ — r3 LoRA adapter (~50 MB safetensors + tokenizer)
round-5/ — r5 LoRA adapter (~50 MB safetensors + tokenizer)
round-6/ — r6 LoRA adapter (~50 MB safetensors + tokenizer) — current production model
gguf/gemma3-1b-sygnif-q4-r3.gguf — r3 merged + Q4_K_M quantized (~770 MB)
gguf/gemma3-1b-sygnif-q4-r5.gguf — r5 merged + Q4_K_M quantized (~770 MB)
gguf/gemma3-1b-sygnif-q4-r6.gguf — r6 merged + Q4_K_M quantized (~770 MB) — current production GGUF

Round 6 — JSON output objective

r6 is a structural redesign vs r3-r5 (which all emit prose):

Every output is JSON of shape:

{
  "narrative": "<short prose for portal display, voice-anchored>",
  "data": {
    "kind": "observation" | "introspection" | "refusal" | "decision" | "tool_grounded_decision" | "summary",
    "cited_from": ["<row_ids or tool_use_ids referenced in the answer>"],
    "...": "kind-specific structured fields"
  }
}

Corpus composition (472 rows, 100% JSON, 91% grounded):

238 decision (channeler_sft turns reshaped to JSON)
86 tool_grounded_decision (session JSON: full user→tool_use→tool_result→message chains)
77 observation (real swarm rows: forecast/heartbeat/trade/health, programmatically extracted)
30 introspection (self/* swarm rows)
21 refusal (truthful "not in grounding" responses)
20 summary (catch-all reshaped from earlier rounds)

Trained 10 epochs (vs 5 for r3-r5). Eval still dropping at epoch 5 motivated longer training.

What r6 gives you that r5 didn't:

✅ Refuses "are you running on Ollama?" without surgical anchors (first round to do this)
✅ Every output parses as JSON
✅ data.cited_from enumerates which input rows were actually used
✅ Tool-grounded reasoning shape (model trained on full tool_use→tool_result→message chains)
⚠️ Conversational chat is structurally different — narrative field renders as prose, but always inside JSON wrapper

Quick use

With llama.cpp / llama-cpp-python:

huggingface-cli download gianson/sygnif_lora gguf/gemma3-1b-sygnif-q4-r5.gguf --local-dir ./
./llama-cli -m gemma3-1b-sygnif-q4-r5.gguf -p "what is the regime?"

With PEFT (load LoRA on the base):

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("mlabonne/gemma-3-1b-it-abliterated-v2", torch_dtype="bfloat16")
model = PeftModel.from_pretrained(base, "gianson/sygnif_lora", subfolder="round-5")
tok = AutoTokenizer.from_pretrained("mlabonne/gemma-3-1b-it-abliterated-v2")

inputs = tok("what is the regime?", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=128)
print(tok.decode(out[0], skip_special_tokens=True))

Training recipe

PEFT LoRA, target modules q/k/v/o + gate/up/down of all layers.

	r3	r5
LoRA r	16	16
LoRA alpha	32	32
LoRA dropout	0.10	0.10
Trainable params	13 M (1.29%)	13 M (1.29%)
Epochs	3	5
Batch / GA	2 / 4	4 / 2
Effective batch	8	8
Seq len	1024	1024
LR	2e-4	2e-4
Scheduler	cosine	cosine
Warmup	0.03	0.03
Sample packing	true	true
TF32	true	true
Mixed precision	bf16	bf16
Eval-mem guards	per_dev_batch=1, accum_steps=8	per_dev_batch=1, accum_steps=8
Hardware	A4000 community	RTX 4090 secure
Wall time	13 min	5 min

Corpus composition (r5, 568 rows total)

265 channeler-SFT rows — verified turns from sygnif-channel extract-sft (the channeler daemon's own training-trace export)
202 mined-session rows — extracted from a Claude Code session that built the agent; cleaned of foreign-lineage pollution (OpenAI Swarm / LangChain / etc.)
179 deep-mined session rows — every substantive agent.message + tool_result → message pair from the same session, deduped by output fingerprint, voice-authentic
49 grounding-fidelity rows — generators emit (negative-grounding refusal / strict-citation / conflict-reporting) triples from real swarm rows
30 meta-talk rows — hand-written portal identity (sigil, model, capabilities)
33 self-introspection rows — generated from a self_knowledge swarm partition built by sygnif_self.refresh

Anti-pollution filters applied: OpenAI Swarm, LangChain, "I am Claude", "running on Ollama", "as an AI" — all dropped to keep identity clean.

Known imperfections

Failure mode	Status
"are you running on Ollama?" → "Yes"	r5 still affirms (base Gemma's prior > zero counterexamples in corpus). Surgical anchors needed for r5.1.
"are you a professional trader?" → "Yes"	r5 affirms without grounding. Same root cause.
Self-introspection retrieval	`self_knowledge` rows exist in swarm DB but channeler doesn't preferentially route self-questions to them. Retrieval patch needed.

Companion software

Giansn/sygnif-agent — the runtime: sygnif-channeler.service daemon, sygnif-code portal, training pipeline, swarm DB, MCP servers
training/runpod-train.sh — one-shot LoRA fine-tune driver (used to produce these adapters)
training/mine_session_r5_deep.py — the deep miner that produced 179 of the 568 r5 rows

License

Apache 2.0, inherited from the base model. Trained on a private corpus of one developer's own SYGNIF Agent session and channeler traces — no third-party data.