v0.9 — 3 new tools: YaRN Planner, GGUF Bridge, Launch Flags

#1
by karlexmarin - opened

[Screenshot: GGUF Bridge]

Every GGUF/VRAM calculator tells you whether a model fits in your GPU. None tells you whether
it still works at the context length you want. I built 3 tools that do, using a closed-form
attention-decay model (γ_Padé / d_horizon). They run 100% in your browser, with no inference and no signup:

  • 🧵 YaRN Planner — paste a model + target context → the exact rope_scaling config.json
    block and a verdict on whether attention quality holds (γ collapse, d_horizon, fine-tune
    flag for aggressive factors).
  • 🧊 GGUF Bridge — paste a GGUF repo → reads the .gguf header via HTTP Range (no
    multi-GB download), compares every quant's γ-shift, tells you "fits 8GB but degrades past
    30K" before you download anything.
  • 🚀 Launch Flags — model + GPU + context → the exact llama.cpp/Ollama command (-ngl,
    -c, --no-mmap, KV-cache type) + warns when your context is past the usable horizon.
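
To make the YaRN Planner bullet concrete, here is a minimal sketch of how a rope_scaling block could be derived from a model's native context and a target context. It is not the tool's code: the function name `yarn_rope_scaling` is hypothetical, the key names follow the convention documented for recent Hugging Face `transformers` configs (older releases use `"type"` instead of `"rope_type"`), and the γ / d_horizon verdict is deliberately left out.

```python
import json
from typing import Optional


def yarn_rope_scaling(original_ctx: int, target_ctx: int) -> Optional[dict]:
    """Return a YaRN rope_scaling dict, or None when no scaling is needed.

    Hypothetical helper: the scaling factor is simply target / original.
    """
    if original_ctx <= 0 or target_ctx <= 0:
        raise ValueError("context lengths must be positive")
    if target_ctx <= original_ctx:
        return None  # the model already covers the target context natively
    return {
        "rope_type": "yarn",  # assumption: older configs spell this key "type"
        "factor": target_ctx / original_ctx,
        "original_max_position_embeddings": original_ctx,
    }


if __name__ == "__main__":
    # Example: stretch a 32K-native model to 128K (factor 4.0).
    block = {"rope_scaling": yarn_rope_scaling(32768, 131072)}
    print(json.dumps(block, indent=2))
```

The printed block is what would be merged into config.json. Whether quality holds at that factor is a separate question that this sketch does not attempt to answer.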
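
The "reads the .gguf header via HTTP Range" idea from the GGUF Bridge bullet can be sketched as follows. This is an illustration under stated assumptions, not the tool's implementation: `fetch_prefix` and `parse_gguf_header` are hypothetical names, and the layout assumed is the fixed 24-byte little-endian header of GGUF version 2 and later (magic, version, tensor count, metadata key-value count). The metadata entries themselves follow this header and would need a larger range.

```python
import struct
import urllib.request

GGUF_MAGIC = b"GGUF"
HEADER_BYTES = 24  # magic(4) + version(4) + tensor_count(8) + metadata_kv_count(8)


def fetch_prefix(url: str, nbytes: int = HEADER_BYTES) -> bytes:
    """Request only the first nbytes of a remote file with an HTTP Range header."""
    req = urllib.request.Request(url, headers={"Range": f"bytes=0-{nbytes - 1}"})
    with urllib.request.urlopen(req) as resp:
        # read(nbytes) caps the transfer even if the server ignores Range
        return resp.read(nbytes)


def parse_gguf_header(buf: bytes) -> dict:
    """Decode the fixed-size GGUF header (assumes version >= 2)."""
    if len(buf) < HEADER_BYTES:
        raise ValueError("need at least 24 bytes")
    if buf[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", buf, 4)
    if version < 2:
        raise ValueError("version 1 uses 32-bit counts; not handled here")
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


if __name__ == "__main__":
    # Synthetic header so the example runs offline; the values are made up.
    sample = GGUF_MAGIC + struct.pack("<IQQ", 3, 291, 24)
    print(parse_gguf_header(sample))
```

For a real file, `parse_gguf_header(fetch_prefix(url))` would transfer 24 bytes instead of the multi-GB model, provided the host honours Range requests.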

25 modes total, 4 languages (EN/ES/FR/ZH). Everything is deterministic + auditable; the
in-browser LLM only synthesises, never invents the numbers.

Feedback welcome — especially if the γ predictions disagree with your real measurements.

Paper: https://zenodo.org/records/20314038

