287 MB
8 files
Updated 15 days ago

Quark-72M-Instruct

Quark-72M-Instruct is a compact autoregressive language model trained by ThingAI.

Model Details

| Parameter | Value |
|---|---|
| Parameters | 71.6M |
| Architecture | Decoder-only Transformer |
| Layers | 14 |
| Hidden size | 512 |
| Attention heads | 8 (GQA, 2 KV) |
| FFN | SwiGLU (1344) |
| Norm | RMSNorm |
| Position | RoPE |
| Vocab size | 65,538 |
| Context length | 2,048 |
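
As a rough cross-check, the listed parameter count can be reproduced from the other figures in the table. This is only a sketch under assumptions the card does not state: no bias terms, a head dimension equal to hidden size divided by the number of heads, 1344 as the SwiGLU inner width, and input embeddings tied with the output projection. All variable names are illustrative.

```python
# Back-of-the-envelope parameter count from the Model Details table.
# Assumptions (not confirmed by the model card): no biases, tied input/output
# embeddings, head_dim = hidden / heads, and 1344 is the SwiGLU inner width.
hidden, layers = 512, 14
n_heads, n_kv_heads = 8, 2
ffn_inner, vocab = 1344, 65_538

head_dim = hidden // n_heads        # 64
kv_dim = n_kv_heads * head_dim      # 128: with GQA, each K/V head serves 4 query heads

attn = 2 * hidden * hidden + 2 * hidden * kv_dim  # Q and O projections, plus K and V
ffn = 3 * hidden * ffn_inner                      # gate, up and down projections
norms = 2 * hidden                                # two RMSNorm weight vectors per layer
per_layer = attn + ffn + norms

# Layers + embedding matrix (counted once because of tying) + final RMSNorm.
total = layers * per_layer + vocab * hidden + hidden
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")
print(f"~{total * 4 / 1e6:.1f} MB at 4 bytes per parameter")
```

Under these assumptions the total comes to 71,646,720, which rounds to the listed 71.6M. At 4 bytes per parameter the weights would take about 286.6 MB, which is close to the 287 MB listed for the repository and suggests, without confirming, that the weights are stored in 32-bit precision.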

Usage

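from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# trust_remote_code=True lets transformers run the custom model code shipped with the repository.
tokenizer = AutoTokenizer.from_pretrained("ThingAI/Quark-72M-Instruct")
model = AutoModelForCausalLM.from_pretrained("ThingAI/Quark-72M-Instruct", trust_remote_code=True)

# Single-turn prompt using the model's chat markers.
prompt = "<|user|>\nHow do I find files larger than 100MB?\n<|end|>\n<|assistant|>\n"
ids = tokenizer(prompt, return_tensors="pt").input_ids

# generate_text is not a standard transformers method; it comes from the model's remote code.
with torch.no_grad():
    out = model.generate_text(ids, max_new_tokens=200, temperature=0.2)

# Special tokens are kept so the chat markers stay visible in the output.
print(tokenizer.decode(out[0], skip_special_tokens=False))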
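
The prompt string above can also be assembled programmatically. The helper below is a hypothetical sketch, not part of the model's code: `build_prompt` is an illustrative name, and the multi-turn layout is an assumption extrapolated from the single-turn example (each turn wrapped as `<|role|>`, content, `<|end|>`).

```python
def build_prompt(messages):
    """Render {"role", "content"} dicts into the prompt layout shown in Usage.

    Hypothetical helper: the multi-turn layout is an assumption based on the
    single-turn example in the model card.
    """
    parts = []
    for message in messages:
        parts.append(f"<|{message['role']}|>\n{message['content']}\n<|end|>\n")
    # Leave the assistant turn open so the model continues from here.
    parts.append("<|assistant|>\n")
    return "".join(parts)


prompt = build_prompt(
    [{"role": "user", "content": "How do I find files larger than 100MB?"}]
)
print(repr(prompt))
```

For a single user message this reproduces the prompt string from the Usage example exactly. If the repository ships a chat template, `tokenizer.apply_chat_template` from transformers would be the more robust choice. The card does not say whether it does.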

Training

  • Pre-training: 5B tokens of math, code, and English/Italian text
  • SFT: bash commands, code, and conversations (ChatML template)
  • Tokenizer: byte-level BPE, 65,536-token vocabulary
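
To put the pre-training budget in perspective, the ratio of training tokens to parameters follows from the two figures given in this card (5B tokens, 71.6M parameters). This is plain arithmetic on the stated numbers; the variable names are illustrative.

```python
# Tokens seen per parameter during pre-training, from the card's own figures.
pretraining_tokens = 5e9   # "5B tokens"
parameters = 71.6e6        # "71.6M" from Model Details

tokens_per_param = pretraining_tokens / parameters
print(f"{tokens_per_param:.1f} tokens per parameter")
```

This works out to roughly 70 pre-training tokens per parameter. SFT data is not included, since the card gives no token count for it.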

License

MIT

Last updated
Jun 16