TinyLlama 1B – Resonance Knot (.rknot)

Spectral-residual repack of the dense .knot weight format. Drops non-standing-wave embedding dimensions, signed-quantizes the rest at amplitude-graded bit widths, and ships the result as a single .rknot file with a header pointer table for mmap-style block access.

Compression measured on this artifact

Artifact                       Bytes                       Ratio vs dense
Dense .knot (Q4_K_M source)    667,094,813 (636.19 MiB)    1.00×
Resonance Knot .rknot          167,313,372 (159.56 MiB)    3.99× smaller

Quantization profile (production defaults from the encoder):

  • standing-wave coverage k = 0.30 × hidden_dim
  • Q/K: 4 bits, signed, per-block scale
  • V: 8 bits, signed, per-block scale
  • FFN: 4 bits, signed, per-block scale
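The "signed, per-block scale" scheme above can be sketched as follows. This is an illustrative absmax quantizer, not the production encoder; the block size and rounding mode are assumptions.

```rust
/// Quantize one block of f32 weights to signed `bits`-wide codes with a
/// single per-block scale (absmax scaling). Returns (scale, codes).
fn quantize_block(block: &[f32], bits: u32) -> (f32, Vec<i32>) {
    let qmax = (1i32 << (bits - 1)) - 1; // 7 for 4-bit, 127 for 8-bit
    let amax = block.iter().fold(0.0f32, |m, &x| m.max(x.abs()));
    if amax == 0.0 {
        return (0.0, vec![0; block.len()]);
    }
    let scale = amax / qmax as f32;
    let codes = block
        .iter()
        .map(|&x| (x / scale).round().clamp(-(qmax as f32), qmax as f32) as i32)
        .collect();
    (scale, codes)
}

/// Reconstruct f32 values from a quantized block.
fn dequantize_block(scale: f32, codes: &[i32]) -> Vec<f32> {
    codes.iter().map(|&q| q as f32 * scale).collect()
}

fn main() {
    let block = [0.5f32, -1.0, 0.25, 0.0];
    let (scale, codes) = quantize_block(&block, 4);
    let recon = dequantize_block(scale, &codes);
    // amax = 1.0, qmax = 7, so the largest-magnitude entry is exact
    assert_eq!(codes[1], -7);
    assert!((recon[1] + 1.0).abs() < 1e-6);
    println!("scale={scale:.4} codes={codes:?}");
}
```

The per-block scale is what makes amplitude grading cheap: V gets 8-bit codes, Q/K and FFN 4-bit codes, with the same framing either way.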

Format

magic        :  6 bytes  "RKNOT\x02"
header_len   :  4 bytes  u32 LE  – JSON header byte length
header_json  :  N bytes  manifest, per-layer pointers, arch metadata
body         :  rest     concatenated quantized blocks

The header carries per-layer RknotLayerRef entries with byte offsets into the body so a loader can mmap and pull individual blocks without re-parsing the body.
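Splitting the container into its framing above is straightforward. A minimal sketch, assuming only the layout shown (the JSON schema of the header and the RknotLayerRef fields are not reproduced here):

```rust
use std::convert::TryInto;

/// Split a .rknot byte buffer into (header_json_bytes, body_bytes),
/// validating the 6-byte magic and the u32 LE header length.
fn split_rknot(buf: &[u8]) -> Result<(&[u8], &[u8]), String> {
    const MAGIC: &[u8; 6] = b"RKNOT\x02";
    if buf.len() < 10 || &buf[..6] != MAGIC {
        return Err("bad magic".into());
    }
    let header_len = u32::from_le_bytes(buf[6..10].try_into().unwrap()) as usize;
    let header_end = 10usize.checked_add(header_len).ok_or("length overflow")?;
    if buf.len() < header_end {
        return Err("truncated header".into());
    }
    Ok((&buf[10..header_end], &buf[header_end..]))
}

fn main() {
    // Build a tiny synthetic container and round-trip it.
    let mut file = Vec::new();
    file.extend_from_slice(b"RKNOT\x02");
    let header = br#"{"layers":[]}"#;
    file.extend_from_slice(&(header.len() as u32).to_le_bytes());
    file.extend_from_slice(header);
    file.extend_from_slice(&[0xAA, 0xBB]); // stand-in body bytes
    let (hdr, body) = split_rknot(&file).unwrap();
    assert_eq!(hdr, &header[..]);
    assert_eq!(body, &[0xAA, 0xBB]);
}
```

In a real loader the same arithmetic gives the body's file offset (10 + header_len), so each RknotLayerRef byte offset can be resolved directly against an mmap'd file.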

Decoding

The Rust decoder lives at open-source/gnosis/distributed-inference/src/rknot/. End-to-end probes (load → decode → smoke test):

cargo run --release --bin verify-rknot -- \
    --rknot tinyllama-1b.rknot \
    --dense tinyllama-1b.knot \
    --max-layers 5

The Lean SSOT for the format contract lives at open-source/gnosis-math/Gnosis/ResonanceKnot*.lean.

Honest boundaries

  • This artifact is the storage-time spectral compression. It is field-compatible with the Rust loader but has not yet been validated for full multi-token inference on this model.
  • Per-layer energy preservation on the standing subspace is in the 0.95–1.05 range at production bits (signed quantizer fix verified on Llama-1B real Q4_K_M; this artifact uses the same encoder).
  • The encoder uses weight-magnitude manifest selection. The empirically superior PCA-calibrated manifest (50–100× lower KL at the same coverage on Qwen2.5-0.5B) is available via the --input-pca flag of encode-rknot but was not used for this build.
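The energy-preservation figure quoted above is a ratio of squared L2 norms between the reconstructed and original weights, restricted to the retained standing-wave dimensions. A toy sketch of that check (index selection and tolerances are illustrative, not the production verifier):

```rust
/// Ratio of reconstructed to original energy (squared L2 norm) over the
/// retained dimension indices. 1.0 means perfect energy preservation.
fn energy_ratio(original: &[f32], reconstructed: &[f32], kept: &[usize]) -> f32 {
    let num: f32 = kept.iter().map(|&i| reconstructed[i] * reconstructed[i]).sum();
    let den: f32 = kept.iter().map(|&i| original[i] * original[i]).sum();
    num / den
}

fn main() {
    let orig = [1.0f32, 2.0, 3.0, 4.0];
    let recon = [1.0f32, 1.9, 3.1, 0.0]; // last dim dropped, so not in `kept`
    let kept = [0usize, 1, 2];
    let r = energy_ratio(&orig, &recon, &kept);
    // falls inside the documented 0.95–1.05 band for this toy example
    assert!(r > 0.95 && r < 1.05);
    println!("energy ratio = {r:.3}");
}
```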

Source

Generated by the forkjoin-ai/distributed-inference Cloud Build pipeline, which fetches the source GGUF from TinyLlama/TinyLlama-1.1B-Chat-v1.0, encodes a dense .knot, then repacks via encode-rknot --apply-fp48. See the end-to-end pipeline YAML for reproducibility: cloudbuild-llama-70b-end-to-end.yaml.
