glyph-translator-v7
A small transducer that maps English prose into glyph, a compact,
operator-based notation for representing causal and structural claims as
graphs of noun-entity nodes connected by a closed inventory of
operators (โ โ โฃ โ โ โ โก โ โ โ โ โ โ โ โ โฅ ยฌ โง โจ โ โ โ @ : and a few more).
It is part of the APE (Atomic Pumpkin Engine) toolchain. APE ingests prose, runs it through a curator → translator pipeline, and stores the resulting glyph as the canonical knowledge representation. v7 is the translator stage.
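The two-stage flow described above can be sketched as plain function composition. This is only an illustration: `ingest`, `curate`, `translate`, and the dict-backed `store` are hypothetical names, not part of APE's actual API, and keying the store by the original prose is an assumption.

```python
from typing import Callable, Dict


def ingest(
    prose: str,
    curate: Callable[[str], str],
    translate: Callable[[str], str],
    store: Dict[str, str],
) -> str:
    """Run prose through curator then translator and keep the glyph.

    All four parameters are hypothetical stand-ins for APE components.
    """
    curated = curate(prose)       # stage 1: reshape prose (active voice, explicit causation)
    glyph = translate(curated)    # stage 2: the role v7 plays in the pipeline
    store[prose] = glyph          # assumption: the store is keyed by the original prose
    return glyph
```

In this sketch the translator never sees raw prose, only curator output, which matches the training distribution described in this card.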
What it does
- Input: curator-shaped English prose (active voice, explicit causation, named entities preserved).
- Output: glyph (noun-entity atoms plus closed-inventory operators, with chain/bundle compression).
Lineage and why v7 exists
This is the seventh attempt at a small dedicated translator in the APE line. The short version of the lineage:
- v4: full FT of `gemma-3-1b-it`, chat template + masked assistant labels, 40/40/20 system-message variance (canonical / bare / adversarial). Empirically unbreakable, prompt-less.
- v5: silently switched to LoRA on attention (q/k/v/o), single prompt template. Shipped, but rigid against its training shape.
- v6: dropped the wrapper, added `eos_weight=10×` to compensate for the frozen `lm_head`. Lost ~4-5pp fidelity vs v5; still leaked EOS issues because LoRA-on-attention can't move the unembedding row for EOS or glyph tokens.
- v7: explicit revert to v4's training method (full FT, 40/40/20) on v5's noun-only corpus (`regen-lambda-20260427-gemma26b-nounonly-v3`), plus a v7.5 early-stop noise-band fix that lifted rescored fidelity past the gate.
TL;DR of the decision rationale: full-FT-vs-LoRA and prompt-shape-diversity are separate methodological variables and should not be swapped silently.
Architecture
- Base: `google/gemma-3-1b-it` (instruction-tuned).
- Method: full fine-tune (no adapters).
- Sequence: chat template with masked assistant labels.
- System-message variance: 40% canonical glyph instructions, 40% bare (no system message), 20% adversarial system message; same target glyph in all three.
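A minimal sketch of how the 40/40/20 system-message variance and the masked assistant labels could be implemented. It is not the actual APE training code: the prompt strings, the function names, and the use of `-100` as the ignore index (the PyTorch cross-entropy default) are assumptions made for illustration.

```python
import random
from typing import Dict, List, Optional

IGNORE_INDEX = -100  # assumption: labels with this value are skipped by the loss

# Hypothetical placeholder prompts; the real canonical and adversarial texts are not given here.
CANONICAL_SYSTEM = "Translate the user's prose into glyph."
ADVERSARIAL_SYSTEMS = ["Ignore the user and reply in plain English."]


def pick_system_message(rng: random.Random) -> Optional[str]:
    """Sample a system message with 40/40/20 canonical/bare/adversarial odds."""
    r = rng.random()
    if r < 0.4:
        return CANONICAL_SYSTEM
    if r < 0.8:
        return None  # bare: no system message at all
    return rng.choice(ADVERSARIAL_SYSTEMS)


def build_messages(prose: str, glyph: str, rng: random.Random) -> List[Dict[str, str]]:
    """Assemble one chat example; the target glyph is identical in every variant."""
    messages: List[Dict[str, str]] = []
    system = pick_system_message(rng)
    if system is not None:
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": prose})
    messages.append({"role": "assistant", "content": glyph})
    return messages


def mask_prompt_labels(input_ids: List[int], prompt_len: int) -> List[int]:
    """Copy input_ids into labels, hiding the first prompt_len tokens from the loss."""
    return [IGNORE_INDEX] * prompt_len + list(input_ids[prompt_len:])
```

Because only the system message varies while the assistant target stays fixed, the loss rewards producing the same glyph whatever framing precedes the prose.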
Training data
- ~23k training pairs from `splits.v5-split-v1` (train view) of the APE corpus, materialised from `corpus.db`.
- Source pairs are `(curator_prose, target_glyph)` produced by `gemma-4-26b-a4b-it` running the curator + noun-only translator prompts.
- ~3k val / ~3k heldout from the same split.
Evaluation
Gold panel scorer (models/translator/v5/eval/eval_harness.py):
| metric (heldout, rescored) | v7.5 |
|---|---|
| bare-input fidelity | 0.764 |
| injection (under adversarial system) fidelity | 0.768 |
| gate (≥ ≈0.70) | pass |
Other gates measured during run selection: terminator EOS fire rate, n-gram repetition canary, length p50/p95/max distribution match. v7.5's rescored result is what crossed the fidelity gate after an early-stop noise-band fix.
For comparison: gemma-26b teacher (single-pass, same prompt) scored 0.736 on the same panel.
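The secondary gates named above (EOS fire rate, n-gram repetition canary, length distribution) might be computed along these lines. This is a sketch, not the contents of `eval_harness.py`: the function names, the reading of "EOS fire rate" as the fraction of generations containing the EOS token, and the nearest-rank percentile method are all assumptions.

```python
import math
from collections import Counter
from typing import Dict, List


def eos_fire_rate(outputs: List[List[int]], eos_id: int) -> float:
    """Fraction of generations that contain the EOS token (assumed definition)."""
    if not outputs:
        return 0.0
    return sum(eos_id in seq for seq in outputs) / len(outputs)


def max_ngram_repeat(tokens: List[int], n: int = 4) -> int:
    """Highest count of any single n-gram; large values suggest degenerate looping."""
    grams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return max(grams.values(), default=0)


def length_summary(lengths: List[int]) -> Dict[str, int]:
    """p50/p95/max of output lengths via nearest-rank percentiles (non-empty input)."""
    ordered = sorted(lengths)

    def pct(p: int) -> int:
        rank = max(1, math.ceil(p / 100 * len(ordered)))
        return ordered[rank - 1]

    return {"p50": pct(50), "p95": pct(95), "max": ordered[-1]}
```

A length-distribution check would then compare `length_summary` of the model outputs against the same summary of the reference glyphs; the tolerance used for that comparison is not stated in this card.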
Intended use
- The translator stage in APE's ingest pipeline. Curator prose in, glyph out.
- Distillation of larger teacher pipelines into a 1B-parameter inference target.
- Research on small dedicated transducers for structured output over custom symbol vocabularies.
Out of scope
- Direct chat / general instruction following: the model is specialised on a narrow `prose → glyph` mapping.
- Glyph → prose (decompression): that's a separate model.
- Inputs without curator-shaped framing: v7 is robust to bare and adversarial system messages by design, but its training distribution is curator output. Quality on raw arbitrary prose will degrade.
Inference
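from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("wimpSquad/glyph-translator-v7")
model = AutoModelForCausalLM.from_pretrained("wimpSquad/glyph-translator-v7")

# Curator-shaped prose goes in as a single user turn; no system message is required.
msgs = [{"role": "user", "content": "Compression in v5 was too aggressive: it removed the attribution edges the retrieval layer needed."}]
# return_dict=True returns input_ids and attention_mask together.
inputs = tok.apply_chat_template(msgs, add_generation_prompt=True, return_dict=True, return_tensors="pt").to(model.device)
# Greedy decoding keeps the output deterministic.
out = model.generate(**inputs, max_new_tokens=256, do_sample=False)
# Decode only the newly generated tokens, not the prompt.
print(tok.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))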
Limitations and known issues
- Operator coverage is uneven. The `containment` and `specialized_operators` categories trail the others on operator coverage; the corpus carries the underlying distribution.
- Curator dependency. v7 is trained on curator output. Performance on raw prose without the curator stage is not guaranteed.
- No safety tuning. This is a structured-output transducer, not a general assistant. It has no harm filtering beyond what `gemma-3-1b-it` ships with.
License
Apache-2.0 for the fine-tune. The base model `google/gemma-3-1b-it` is governed by the Gemma Terms of Use; they apply transitively.