How to use from
Docker Model Runner
docker model run hf.co/srs6901/Vikras-MixP:Q8_0
Quick Links

Vikras โ€” Experimental Family of Language Models

EN below

ะกะพะดะตั€ะถะฐะฝะธะต


ะšะพั€ะพั‚ะบะพ ะพ ะฟั€ะพะตะบั‚ะต

Vikra โ€” ัะบัะฟะตั€ะธะผะตะฝั‚ะฐะปัŒะฝะพะต ัะตะผะตะนัั‚ะฒะพ ัะทั‹ะบะพะฒั‹ั… ะผะพะดะตะปะตะน, ะธััะปะตะดัƒัŽั‰ะตะต ะฒะปะธัะฝะธะต:

  • ะณะตะพะผะตั‚ั€ะธะธ ะฟั€ะตะดัั‚ะฐะฒะปะตะฝะธะน
  • ะบะฒะฐะฝั‚ะพะฒะฐะฝะธั
  • ะณะธะฑั€ะธะดะฝั‹ั… ะผะตั€ะดะถะตะน

ะฝะฐ ั‡ะธัะปะตะฝะฝัƒัŽ ะดะธะฝะฐะผะธะบัƒ ั‚ั€ะฐะฝัั„ะพั€ะผะตั€ะพะฒ.

ะŸั€ะพะตะบั‚ Vikras ะฝะต ะพะณั€ะฐะฝะธั‡ะธะฒะฐะตั‚ัั ะพะดะฝะพะน ะฑะฐะทะพะน ะธะปะธ ะพะดะฝะพะน ะฐั€ั…ะธั‚ะตะบั‚ัƒั€ะพะน: ัั‚ะพ ัะตะผะตะนัั‚ะฒะพ ะผะพะดะตะปะตะน, ะพะฑัŠะตะดะธะฝั‘ะฝะฝั‹ั… ะธะดะตะตะน ั‡ะธัะปะตะฝะฝะพะน ะธะฝะฒะฐั€ะธะฐะฝั‚ะฝะพัั‚ะธ ัะบัะฟะตั€ะธะผะตะฝั‚ะฐ.

  • Vikra_% โ€” ะธะผั ะบะพะฝะบั€ะตั‚ะฝะพะน ะผะพะดะตะปะธ
  • Vikras โ€” ัะตะผะตะนัั‚ะฒะพ ัะบัะฟะตั€ะธะผะตะฝั‚ะพะฒ
  • S / M / L โ€” ัั‚ะตะฟะตะฝัŒ ะฐะณั€ะตััะธะฒะฝะพัั‚ะธ ะธ ั€ะฐัะฟั€ะตะดะตะปะตะฝะธั ะฑะธั‚ะฝะพัั‚ะธ
  • MixP / FullP / HCT โ€” ัั…ะตะผั‹ ะธ ะธะฝะฒะฐั€ะธะฐะฝั‚ั‹ ะบะฒะฐะฝั‚ะพะฒะฐะฝะธั/ะผะตั€ะดะถะตะน

ะขะตะบัƒั‰ะธะน ั€ะตะปะธะท: HCT/YeAM

ะ ะตะปะธะทั‹


HCT (ะฐั€ั…ะธั‚ะตะบั‚ัƒั€ะฐ) / YeAM (ะธะฝะฒะฐั€ะธะฐะฝั‚ ั€ะตะฐะปะธะทะฐั†ะธะธ)

HCT โ€” ะฐั€ั…ะธั‚ะตะบั‚ัƒั€ะฝั‹ะน ะธะฝะฒะฐั€ะธะฐะฝั‚: ะฟั€ะฐะบั‚ะธั‡ะตัะบะธะน ัะฟะพัะพะฑ ัะพะฑะธั€ะฐั‚ัŒ ัะพะฒะผะตัั‚ะธะผั‹ะต ะผะพะดะตะปะธ ะธ ะฟั€ะพะธะทะฒะพะดะฝั‹ะต ั€ะตะปะธะทั‹ ะฟั€ะธ ะฟะตั€ะตะฝะพัะต ะผะตะถะดัƒ ะฑะฐะทะฐะผะธ/ัะตะผะตะนัั‚ะฒะฐะผะธ.

YeAM (Yet Another Merge) โ€” ะธะฝะฒะฐั€ะธะฐะฝั‚ ั€ะตะฐะปะธะทะฐั†ะธะธ HCT ะธ ัะฐะผะพัั‚ะพัั‚ะตะปัŒะฝะฐั ัั…ะตะผะฐ ะผะตั€ะดะถะฐ HFโ†’HF: ัั‚ะพ ะฝะต ยซะตั‰ั‘ ะพะดะธะฝ SLERP/DARE/TILESยป ะธ ะฝะต ะบะพัะผะตั‚ะธั‡ะตัะบะฐั ะฒะฐั€ะธะฐั†ะธั ัƒัั€ะตะดะฝะตะฝะธั.

YeAM ะฒั‹ะดะฐั‘ั‚ ัั‚ะฐะฝะดะฐั€ั‚ะฝั‹ะน HF-ั€ะตะทัƒะปัŒั‚ะฐั‚ (safetensors + index) ะธ ะฟะพะดะดะตั€ะถะธะฒะฐะตั‚:

  • ะฟั€ัะผะพะน weight-to-weight ะผะตั€ะดะถ
  • ะฝะฐะฟั€ะฐะฒะปะตะฝะฝะพะต ะดะพะฑะฐะฒะปะตะฝะธะต ะทะฝะฐะฝะธะน ะฒ ะฒั‹ะฑั€ะฐะฝะฝัƒัŽ ะผะพะดะตะปัŒ (knowledge distillation / knowledge injection), ัะพะณะปะฐัะพะฒะฐะฝะฝะพะต ะฟะพ ะฝะตัะบะพะปัŒะบะธะผ ะธัั‚ะพั‡ะฝะธะบะฐะผ
  • ะดะพะฟะพะปะฝะธั‚ะตะปัŒะฝั‹ะน ะผะตั€ะดะถ Attention-ัะปะพั‘ะฒ ะบะฐะบ ะพั‚ะดะตะปัŒะฝัƒัŽ ั‚ะตั…ะฝะธะบัƒ ะฟะพะฒะตั€ั… YeAM
  • ะผะตั€ะดะถ ะผะตะฝัŒัˆะธั… ะผะพะดะตะปะตะน ะฒ ะฑะพะปะตะต ะบั€ัƒะฟะฝั‹ะต (scale-up merge) ะฟั€ะธ ัะพั…ั€ะฐะฝะตะฝะธะธ ัะพะฒะผะตัั‚ะธะผะพะณะพ HF-ั„ะพั€ะผะฐั‚ะฐ

ะœะฐั‚ะตะผะฐั‚ะธั‡ะตัะบะธ YeAM ั€ะฐะฑะพั‚ะฐะตั‚ ะฒ ั€ะตะฐะปัŒะฝะพะน 4D-ะฟะพัั‚ะฐะฝะพะฒะบะต: ะพะฑะฝะพะฒะปะตะฝะธั ะบะพะดะธั€ัƒัŽั‚ัั ะณะตะพะผะตั‚ั€ะธั‡ะตัะบะธ ะธ ัะพะณะปะฐััƒัŽั‚ัั ั‡ะตั€ะตะท ะฟะตั€ะตัะตั‡ะตะฝะธั ะปัƒั‡ะตะน ะฒ ะฟั€ะพัั‚ั€ะฐะฝัั‚ะฒะต ะฟะฐั€ะฐะผะตั‚ั€ะพะฒ. ะญั‚ะพ ะดะฐั‘ั‚ ัƒะฟั€ะฐะฒะปัะตะผั‹ะน ะผะตั€ะดะถ ั ัะพั…ั€ะฐะฝะตะฝะธะตะผ ัั‚ั€ัƒะบั‚ัƒั€ั‹ ะธ ะฑะตะท ะฒั‹ั€ะพะถะดะตะฝะธั ะฒ ะฝะฐะธะฒะฝะพะต ัƒัั€ะตะดะฝะตะฝะธะต.


ะŸั€ะตะดั‹ะดัƒั‰ะธะน ั€ะตะปะธะท: Vikra MixedPrc (MixP_4.9b_S)

ะšั€ะฐั‚ะบะพะต ะพะฟะธัะฐะฝะธะต

12.25B Mistral-based language model
Hybrid mixed-precision merged GGUF quantization
ะญะบัะฟะตั€ะธะผะตะฝั‚ะฐะปัŒะฝั‹ะน ั€ะตะถะธะผ ะฐะฝะธะทะพั‚ั€ะพะฟะฝะพะณะพ ะบะฒะฐะฝั‚ะพะฒะฐะฝะธั

ะŸะพะปะฝะฐั ะฒะตั€ัะธั ะผะตั€ะดะถะฐ (ะฑะตะท ะบะฒะฐะฝั‚ะพะฒะฐะฝะธั): https://huggingface.co/srs6901/Vikras-MixP/tree/main/Vikra-FullP

GGUF-ะบะฒะฐะฝั‚: https://huggingface.co/srs6901/Vikras-MixP/blob/main/Vikra-MixP_4.9b_S.gguf


MixP_4.9b_S: ะดะตั‚ะฐะปะธ

ะั€ั…ะธั‚ะตะบั‚ัƒั€ะฐ (ะดะปั MixP ั€ะตะปะธะทะฐ)

ะŸะฐั€ะฐะผะตั‚ั€ ะ—ะฝะฐั‡ะตะฝะธะต
Architecture Mistral-based
Params ~12.25B
Layers 40
Hidden size 5120
FFN size 14336
Heads 32 (8 KV heads, GQA)
Context 1,024,000
Vocab 131,072 (Tekken BPE)
RoPE theta 1,000,000

MixP_4.9b_S โ€” ัั…ะตะผะฐ ะบะฒะฐะฝั‚ะพะฒะฐะฝะธั

ะ“ะธะฑั€ะธะดะฝะฐั mixed precision ัั…ะตะผะฐ ั ะฟะพะบะพะผะฟะพะฝะตะฝั‚ะฝั‹ะผ ั€ะฐัะฟั€ะตะดะตะปะตะฝะธะตะผ ั‚ะธะฟะพะฒ.

Tensor group Quant type BPW
token_embd, output BF16 16
attn_norm, ffn_norm, output_norm F32 32
attn_q Q4_K 4.5
attn_k Q5_K 5.5
attn_v Q3_K 3.44
attn_output Q4_K 4.5
ffn_gate Q3_K 3.44
ffn_up Q5_K 5.5
ffn_down Q5_K / Q6_K 5.5โ€“6.56

ะ˜ั‚ะพะณะพ:

  • Quantized layers only: ~4.89 BPW
  • Full model average: ~6.11 BPW
  • File size: ~8.71 GB

ะšะปัŽั‡ะตะฒะฐั ะธะดะตั MixP

MixP โ€” ัั‚ะพ ะฝะต ยซัะถะฐั‚ัŒ ะฒัั‘ ะพะดะธะฝะฐะบะพะฒะพยป.

ะญั‚ะพ ะฐะฝะธะทะพั‚ั€ะพะฟะฝะพะต ะบะฒะฐะฝั‚ะพะฒะฐะฝะธะต ะธะฝั„ะพั€ะผะฐั†ะธะพะฝะฝั‹ั… ะบะฐะฝะฐะปะพะฒ:

โ€ข Q/K ัะพั…ั€ะฐะฝััŽั‚ัั ะฒ ะฑะพะปะตะต ะฒั‹ัะพะบะพะน ั‚ะพั‡ะฝะพัั‚ะธ โ€ข V ะธ gate ะฝะฐะผะตั€ะตะฝะฝะพ ะบะฒะฐะฝั‚ะพะฒะฐะฝั‹ ะดะพ Q3_K โ€ข ะะพั€ะผั‹ ะธ ะฒั‹ั…ะพะดะฝะพะน ัะปะพะน ะพัั‚ะฐัŽั‚ัั ะฒ ะฒั‹ัะพะบะพะน ั‚ะพั‡ะฝะพัั‚ะธ

ะขะฐะบะพะต ั€ะฐัะฟั€ะตะดะตะปะตะฝะธะต ะธะทะผะตะฝัะตั‚ ั‡ะธัะปะตะฝะฝัƒัŽ ะดะธะฝะฐะผะธะบัƒ ะผะพะดะตะปะธ:

โ€ข ัƒัะธะปะธะฒะฐะตั‚ัั ัั‚ั€ัƒะบั‚ัƒั€ะฝะฐั sparsification โ€ข ะผะตะฝัะตั‚ัั ั€ะฐัะฟั€ะตะดะตะปะตะฝะธะต ะฝะพั€ะผ ัะบั€ั‹ั‚ั‹ั… ะฟั€ะตะดัั‚ะฐะฒะปะตะฝะธะน โ€ข ะผะตะฝัะตั‚ัั ัะฝั‚ั€ะพะฟะธั ะปะพะณะธั‚ะพะฒ โ€ข ะฟะพัะฒะปัะตั‚ัั ั€ะตะถะธะผะฝะฐั ั‡ัƒะฒัั‚ะฒะธั‚ะตะปัŒะฝะพัั‚ัŒ

ะญั‚ะพ ะฝะต ะฝะพะฒะฐั ะฐั€ั…ะธั‚ะตะบั‚ัƒั€ะฐ. ะญั‚ะพ ะธะทะผะตะฝะตะฝะธะต ั‡ะธัะปะตะฝะฝะพะน ะณะตะพะผะตั‚ั€ะธะธ ััƒั‰ะตัั‚ะฒัƒัŽั‰ะตะน.

ะะฐะฑะปัŽะดะฐะตะผั‹ะต ัั„ั„ะตะบั‚ั‹

  • ัะพั…ั€ะฐะฝะตะฝะธะต top-1 ะฟั€ะตะดัะบะฐะทะฐะฝะธะน ะฝะฐ ะฟั€ะพัั‚ั‹ั… ะทะฐะดะฐั‡ะฐั…
  • ั€ะพัั‚ entropy ะฑะตะท ั€ะฐะทั€ัƒัˆะตะฝะธั ะผะฐะบัะธะผะฐะปัŒะฝะพะน ะฒะตั€ะพัั‚ะฝะพัั‚ะธ
  • ั€ะฐััˆะธั€ะตะฝะธะต hidden norm ะฝะฐ ัะปะพะถะฝั‹ั… ะทะฐะดะฐั‡ะฐั…
  • ะฑะธั„ัƒั€ะบะฐั†ะธั ั€ะตะถะธะผะพะฒ: ะฟั€ะพัั‚ั‹ะต ะทะฐะดะฐั‡ะธ โ‰ˆ ะธะฝะฒะฐั€ะธะฐะฝั‚ะฝั‹, ัะปะพะถะฝั‹ะต โ€” ั‡ัƒะฒัั‚ะฒะธั‚ะตะปัŒะฝั‹

ะญั‚ะธ ัั„ั„ะตะบั‚ั‹ ะพะฟะธัั‹ะฒะฐัŽั‚ัั ะบะฐะบ ะณะตะพะผะตั‚ั€ะธั‡ะตัะบะธะน ัะดะฒะธะณ ะฟั€ะตะดัั‚ะฐะฒะปะตะฝะธะน, ะฐ ะฝะต ะบะฐะบ ัƒะฝะธะฒะตั€ัะฐะปัŒะฝะพะต ัƒะปัƒั‡ัˆะตะฝะธะต ะบะฐั‡ะตัั‚ะฒะฐ.

math_subattention (ั€ะฐะฑะพั‡ะฐั ะณะธะฟะพั‚ะตะทะฐ)

ะ’ ัะบัะฟะตั€ะธะผะตะฝั‚ะฐั… ะฝะฐะฑะปัŽะดะฐะตั‚ัั ัั„ั„ะตะบั‚, ัƒัะปะพะฒะฝะพ ะพะฑะพะทะฝะฐั‡ะตะฝะฝั‹ะน ะบะฐะบ:

โ€œmath_subattentionโ€

ะŸะพะด ัั‚ะธะผ ะฟะพะดั€ะฐะทัƒะผะตะฒะฐะตั‚ัั:

โ€ข ัƒะผะตะฝัŒัˆะตะฝะธะต ะฒะบะปะฐะดะฐ ะผะตะปะบะธั… ะบะพะผะฟะพะฝะตะฝั‚ V โ€ข ัƒัะธะปะตะฝะธะต ะดะพะผะธะฝะธั€ัƒัŽั‰ะธั… ะฝะฐะฟั€ะฐะฒะปะตะฝะธะน residual stream โ€ข ะฟะพะฒั‹ัˆะตะฝะฝะฐั ะธะฝะตั€ั†ะธั ะฟั€ะตะดั‹ะดัƒั‰ะตะณะพ ั‚ะพะบะตะฝะฐ โ€ข ัะฝะธะถะตะฝะธะต ั‡ะฐัั‚ะพั‚ั‹ ะผะตะปะบะธั… ะฟะตั€ะตะบะปัŽั‡ะตะฝะธะน ะปะพะณะธั‚ะพะฒ

ะญั‚ะพ ะฝะต claim ะพ ะฝะพะฒะพะน ะฐั€ั…ะธั‚ะตะบั‚ัƒั€ะต. ะญั‚ะพ ั€ะฐะฑะพั‡ะฐั ะณะธะฟะพั‚ะตะทะฐ ะพ ะดะธะฝะฐะผะธะบะต, ะฒะพะทะฝะธะบะฐัŽั‰ะตะน ะฟั€ะธ Q3_K symmetric quantization.

ะขะตั€ะผะธะฝ ะธัะฟะพะปัŒะทัƒะตั‚ัั ะพะฟะธัะฐั‚ะตะปัŒะฝะพ.

ะŸะตั€ะฟะปะตะบัะธั

ะœะตั‚ั€ะธะบะฐ ะธะทะผะตั€ะตะฝะฐ ะฝะฐ wikitext-2-raw-test (full):

Model Precision PPL
Vikra MixP_4.9b_S 6.11 BPW 5.50 ยฑ 0.03
Baseline BF16 Full 6.02 ยฑ 0.03

ะŸะปะฐะฝั‹ ั€ะฐะทะฒะธั‚ะธั

ะŸะปะฐะฝะธั€ัƒัŽั‚ัั ะฟะพะดัะตะผะตะนัั‚ะฒะฐ:

  • MixP โ€” Mixed Precision
  • FullP โ€” Full Precision ะฒะตั€ัะธะธ
  • HCT โ€” multi-merge ัะบัะฟะตั€ะธะผะตะฝั‚ั‹
  • S / M / L โ€” ะฒะฐั€ะธะฐะฝั‚ั‹ ั€ะฐัะฟั€ะตะดะตะปะตะฝะธั ะฑะธั‚ะฝะพัั‚ะธ

ะ’ัะต ะผะพะดะตะปะธ ัะตะผะตะนัั‚ะฒะฐ ะฝะฐะทั‹ะฒะฐัŽั‚ัั Vikra. ะ ะตะฟะพะทะธั‚ะพั€ะธะน โ€” Vikras.


ะ˜ัะฟะพะปัŒะทะพะฒะฐะฝะธะต

llama-cli -m Vikra-MixP_4.9b_S.gguf -ngl 99 -c 4096
llama-server -m Vikra-MixP_4.9b_S.gguf -ngl 99 -c 4096

ะ—ะฐะบะปัŽั‡ะตะฝะธะต

Vikras โ€” ะธััะปะตะดะพะฒะฐั‚ะตะปัŒัะบะธะน ะฟั€ะพะตะบั‚.

ะžะฝ ะธััะปะตะดัƒะตั‚, ะบะฐะบ ะผะตะฝัะตั‚ัั ะฟะพะฒะตะดะตะฝะธะต ั‚ั€ะฐะฝัั„ะพั€ะผะตั€ะฐ, ะตัะปะธ ะตะณะพ:

  • ัะถะธะผะฐั‚ัŒ
  • ัะผะตัˆะธะฒะฐั‚ัŒ
  • ะธะทะผะตะฝัั‚ัŒ ั‡ะธัะปะตะฝะฝัƒัŽ ะณะตะพะผะตั‚ั€ะธัŽ

ะ•ัะปะธ ะฒะฐะผ ะธะฝั‚ะตั€ะตัะฝั‹ hidden space dynamics / regime sensitivity / anisotropic quantization โ€” ะดะพะฑั€ะพ ะฟะพะถะฐะปะพะฒะฐั‚ัŒ.


Vikras โ€” Experimental Family of Language Models (EN)

Table of Contents


Project overview

Vikra is an experimental family of language models exploring how:

  • representation geometry
  • quantization
  • hybrid merges

affect transformer numerical dynamics.

The Vikras project is not tied to a single base model or architecture. It is a family of models unified by a numerical invariance philosophy of experimentation.

  • Vikra_% โ€” a specific model
  • Vikras โ€” the experimental family
  • S / M / L โ€” aggressiveness and bit allocation variants
  • MixP / FullP / HCT โ€” quantization / merge invariants

Current Release: HCT/YeAM

Releases


HCT (architecture) / YeAM (implementation invariant)

HCT is an architectural invariant. In English: Heterogeneous Compatibility Transfer โ€” a practical way to assemble compatible checkpoints and derived releases while moving across bases / model families.

YeAM (Yet Another Merge) is an implementation invariant of HCT and a standalone HFโ†’HF merge scheme: it is not โ€œjust another SLERP/DARE/TILESโ€ and not a cosmetic variant of averaging.

YeAM produces a standard HF output (safetensors + index) and supports:

  • direct weight-to-weight merging
  • targeted knowledge injection into a chosen model (knowledge distillation mode), aligned across multiple sources
  • an additional Attention-layer merge as a second technique on top of YeAM
  • merging smaller models into larger ones (scale-up merge) while keeping a compatible HF format

YeAM operates in a real 4D formulation: updates are encoded geometrically and aligned via ray intersections in parameter space. This produces controlled merges that preserve structure instead of collapsing into naive averaging.


Previous Release: Vikra MixedPrc (MixP_4.9b_S)

Short Description

12.25B Mistral-based language model
Hybrid mixed-precision merged GGUF quantization
Experimental anisotropic quantization regime

Full merge version (non-quantized): https://huggingface.co/srs6901/Vikras-MixP/tree/main/Vikra-FullP

GGUF quant: https://huggingface.co/srs6901/Vikras-MixP/blob/main/Vikra-MixP_4.9b_S.gguf


MixP_4.9b_S: details

Architecture (for the MixP release)

Parameter Value
Architecture Mistral-based
Params ~12.25B
Layers 40
Hidden size 5120
FFN size 14336
Heads 32 (8 KV heads, GQA)
Context 1,024,000
Vocab 131,072 (Tekken BPE)
RoPE theta 1,000,000

MixP_4.9b_S โ€” Quantization Scheme

A hybrid mixed-precision scheme with per-tensor type allocation.

Tensor group Quant type BPW
token_embd, output BF16 16
attn_norm, ffn_norm, output_norm F32 32
attn_q Q4_K 4.5
attn_k Q5_K 5.5
attn_v Q3_K 3.44
attn_output Q4_K 4.5
ffn_gate Q3_K 3.44
ffn_up Q5_K 5.5
ffn_down Q5_K / Q6_K 5.5โ€“6.56

Totals:

  • Quantized layers only: ~4.89 BPW
  • Full model average: ~6.11 BPW
  • File size: ~8.71 GB

Core idea of MixP

MixP is not โ€œcompress everything equallyโ€.

It is anisotropic quantization of information channels:

  • Q/K remain in higher precision
  • V and gate are intentionally quantized down to Q3_K
  • norms and the output layer remain in higher precision

This redistribution changes the numerical dynamics of the model:

  • increased structural sparsification
  • shifts in hidden norm distribution
  • changes in logit entropy
  • regime sensitivity

This is not a new architecture. It is a modification of the numerical geometry of an existing one.

Observed effects

  • preservation of top-1 predictions on simple tasks
  • increased entropy without collapse of maximum probability
  • expansion of hidden norms on complex tasks
  • mode bifurcation: simple tasks โ‰ˆ invariant, complex tasks sensitive

These effects are interpreted as a geometric shift of representations rather than a universal quality improvement.

math_subattention (working hypothesis)

In experiments, an effect informally referred to as:

โ€œmath_subattentionโ€

This describes:

  • reduced contribution of small V components
  • dominance of stronger residual directions
  • increased inertia from previous token state
  • reduced frequency of small logit switching

This is not an architectural claim. It is a working hypothesis of dynamics emerging from Q3_K symmetric quantization.

The term is used descriptively.

Perplexity

Measured on wikitext-2-raw-test (full):

Model Precision PPL
Vikra MixP_4.9b_S 6.11 BPW 5.50 ยฑ 0.03
Baseline BF16 Full 6.02 ยฑ 0.03

Roadmap

Planned subfamilies:

  • MixP โ€” Mixed Precision
  • FullP โ€” Full Precision variants
  • HCT โ€” multi-merge experiments
  • S / M / L โ€” different bit allocation regimes

All models in the family are called Vikra. The repository is Vikras.


Usage

llama-cli -m Vikra-MixP_4.9b_S.gguf -ngl 99 -c 4096
llama-server -m Vikra-MixP_4.9b_S.gguf -ngl 99 -c 4096

Closing

Vikras is a research project.

It explores how transformer behavior changes when we:

  • compress
  • merge
  • alter numerical geometry

If you are interested in hidden space dynamics / regime sensitivity / anisotropic quantization โ€” welcome.

Downloads last month
185
GGUF
Model size
2B params
Architecture
qwen3
Hardware compatibility
Log In to add your hardware

6-bit

8-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ 1 Ask for provider support

Collection including srs6901/Vikras-MixP