Spaces: karlexmarin/taf-agent
v0.9 — 3 new tools: YaRN Planner, GGUF Bridge, Launch Flags

by karlexmarin - opened 13 days ago

Every GGUF/VRAM calculator tells you whether a model fits on your GPU. None tells you whether it
still works at that context length. I built 3 tools that do, using a closed-form attention-decay
model (γ_Padé / d_horizon). All of them run 100% in your browser, with no inference and no signup:

🧵 YaRN Planner — paste a model + target context → the exact rope_scaling config.json
block and a verdict on whether attention quality holds (γ collapse, d_horizon, fine-tune
flag for aggressive factors).
🧊 GGUF Bridge — paste a GGUF repo → reads the .gguf header via HTTP Range (no
multi-GB download), compares every quant's γ-shift, tells you "fits 8GB but degrades past
30K" before you download anything.
🚀 Launch Flags — model + GPU + context → the exact llama.cpp/Ollama command (-ngl,
-c, --no-mmap, KV-cache type) + warns when your context is past the usable horizon.
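
For anyone who wants to sanity-check the mechanical parts by hand, here is a minimal Python sketch of two of the steps described above: building a YaRN rope_scaling block from a native and a target context, and reading the fixed GGUF header from the first bytes of a file fetched with an HTTP Range request. It is not the Space's code. The function names are hypothetical, the factor is assumed to be simply target / native, and the key names follow the rope_scaling layout used by recent Hugging Face transformers configs (older configs use "type" instead of "rope_type"). It does not attempt the γ / d_horizon verdict.

```python
import struct
import urllib.request

# Fixed GGUF header (v2+), little-endian:
# 4-byte magic, uint32 version, uint64 tensor_count, uint64 metadata_kv_count.
GGUF_FIXED_HEADER = struct.Struct("<4sIQQ")


def yarn_rope_scaling(native_ctx: int, target_ctx: int) -> dict:
    """Return a rope_scaling block for config.json (assumes factor = target / native)."""
    if native_ctx <= 0 or target_ctx < native_ctx:
        raise ValueError("target_ctx must be >= native_ctx > 0")
    return {
        "rope_type": "yarn",  # older transformers configs use the key "type"
        "factor": target_ctx / native_ctx,
        "original_max_position_embeddings": native_ctx,
    }


def fetch_prefix(url: str, n_bytes: int = GGUF_FIXED_HEADER.size) -> bytes:
    """Fetch only the first n_bytes of a remote file via an HTTP Range request."""
    req = urllib.request.Request(url, headers={"Range": f"bytes=0-{n_bytes - 1}"})
    with urllib.request.urlopen(req) as resp:
        # read(n_bytes) also caps the download if the server ignores Range.
        return resp.read(n_bytes)


def parse_gguf_header(data: bytes) -> dict:
    """Parse the fixed-size part of a GGUF header from raw bytes."""
    if len(data) < GGUF_FIXED_HEADER.size:
        raise ValueError("need at least 24 bytes")
    magic, version, tensor_count, kv_count = GGUF_FIXED_HEADER.unpack_from(data)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    if version < 2:
        raise ValueError("GGUF v1 uses 32-bit counts; not handled in this sketch")
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }
```

To try it, call parse_gguf_header(fetch_prefix(url)) with the direct download URL of a .gguf file of your choice. The metadata key/value pairs (context length, head counts, and so on) follow this fixed header, so a real reader would request a larger range and walk those entries.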

25 modes total, 4 languages (EN/ES/FR/ZH). Everything is deterministic and auditable; the
in-browser LLM only writes up the results and never invents the numbers.

Feedback welcome — especially if the γ predictions disagree with your real measurements.

Paper: https://zenodo.org/records/20314038
