v0.9 — 3 new tools: YaRN Planner, GGUF Bridge, Launch Flags
#1
by karlexmarin - opened
Every GGUF/VRAM calculator tells you whether a model fits in your GPU. None tells you whether it
still works at that context length. I built 3 tools that do, using a closed-form attention-decay
model (γ_Padé / d_horizon), all running 100% in your browser, with no inference and no signup:
- 🧵 YaRN Planner: paste a model + target context → the exact `rope_scaling` `config.json` block and a verdict on whether attention quality holds (γ collapse, d_horizon, fine-tune flag for aggressive factors).
- 🧊 GGUF Bridge: paste a GGUF repo → reads the `.gguf` header via HTTP Range (no multi-GB download), compares every quant's γ-shift, tells you "fits 8GB but degrades past 30K" before you download anything.
- 🚀 Launch Flags: model + GPU + context → the exact llama.cpp/Ollama command (`-ngl`, `-c`, `--no-mmap`, KV-cache type) + warns when your context is past the usable horizon.
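
For anyone curious how a header-only read can work, here is a minimal Python sketch of the idea behind GGUF Bridge. It is not the tool's code. It assumes the standard little-endian GGUF layout (4-byte magic `GGUF`, `uint32` version, `uint64` tensor count, `uint64` metadata key/value count) and covers only that fixed 24-byte prefix; the metadata entries that follow are variable-length and need a larger range. The helper names `fetch_prefix` and `parse_gguf_fixed_header` are made up for this example.

```python
import struct
import urllib.request

# Fixed-size start of a GGUF file (little-endian):
# magic (4 bytes), version (uint32), tensor_count (uint64), metadata_kv_count (uint64)
GGUF_FIXED_HEADER = struct.Struct("<4sIQQ")


def fetch_prefix(url: str, nbytes: int = GGUF_FIXED_HEADER.size) -> bytes:
    """Ask the server for only the first `nbytes` bytes via an HTTP Range header."""
    req = urllib.request.Request(url, headers={"Range": f"bytes=0-{nbytes - 1}"})
    with urllib.request.urlopen(req) as resp:
        # Reading at most nbytes also protects us if the server ignores Range.
        return resp.read(nbytes)


def parse_gguf_fixed_header(data: bytes) -> dict:
    """Decode the fixed 24-byte prefix of a GGUF file."""
    if len(data) < GGUF_FIXED_HEADER.size:
        raise ValueError(f"need at least {GGUF_FIXED_HEADER.size} bytes")
    magic, version, tensor_count, kv_count = GGUF_FIXED_HEADER.unpack_from(data)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file (bad magic)")
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

Usage would look like `parse_gguf_fixed_header(fetch_prefix(url))`, where `url` points directly at a `.gguf` file on a host that honours Range requests.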
25 modes total, 4 languages (EN/ES/FR/ZH). Everything is deterministic + auditable; the
in-browser LLM only synthesises, never invents the numbers.
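
To give a feel for what "deterministic" means here, this is a small Python sketch of the simplest part of a YaRN plan: deriving the scaling factor from the original and target context lengths and emitting a `rope_scaling` block. It is an illustration only. The field names (`rope_type`, `factor`, `original_max_position_embeddings`) follow the convention commonly seen in Hugging Face `config.json` files for YaRN; the block the YaRN Planner actually emits may differ, and the function name `yarn_rope_scaling` is hypothetical. The γ / d_horizon verdict is not reproduced here.

```python
import json


def yarn_rope_scaling(original_ctx: int, target_ctx: int) -> dict:
    """Build a YaRN rope_scaling block for extending original_ctx to target_ctx.

    Assumption: factor is simply target / original, the usual YaRN convention.
    """
    if original_ctx <= 0:
        raise ValueError("original_ctx must be positive")
    if target_ctx <= original_ctx:
        raise ValueError("target_ctx must exceed original_ctx for scaling to apply")
    return {
        "rope_type": "yarn",
        "factor": target_ctx / original_ctx,
        "original_max_position_embeddings": original_ctx,
    }


if __name__ == "__main__":
    # Example: extend a 32K-context model to 128K.
    block = yarn_rope_scaling(32768, 131072)
    print(json.dumps({"rope_scaling": block}, indent=2))
```

The same inputs always give the same block, so the result can be checked by hand against the model card.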
Feedback welcome — especially if the γ predictions disagree with your real measurements.
Paper: https://zenodo.org/records/20314038
