---
language:
- it
- en
license: apache-2.0
tags:
- text-generation
- causal-lm
- bilingual
- italian
- english
- small-language-model
- trained-from-scratch
- quark
- instruct
- sft
- chat
library_name: transformers
pipeline_tag: text-generation
---

# Quark-270M-Instruct: Bilingual Chat Model

Quark-270M-Instruct is the **instruction-tuned** version of [Quark-270M Base](https://huggingface.co/ThingAI/Quark-270m-Base), fine-tuned for conversational use in Italian and English.

Built entirely from scratch by [ThingsAI](https://things-ai.org).

## Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "ThingAI/Quark-270m-Instruct",
    trust_remote_code=True,
    torch_dtype=torch.bfloat16
).cuda()
model.lm_head.weight = model.embed_tokens.weight  # ensure weight tying

tokenizer = AutoTokenizer.from_pretrained("ThingAI/Quark-270m-Instruct")

prompt = "<|user|>\nCiao, come stai?\n<|end|>\n<|assistant|>\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
out = model.generate(**inputs, max_new_tokens=150, do_sample=True, temperature=0.7, top_k=40)
print(tokenizer.decode(out[0], skip_special_tokens=False))
```

## Chat Format

```
<|user|>
{user message}
<|end|>
<|assistant|>
{model response}
<|end|>
```

Multi-turn:

```
<|user|>
Ciao!
<|end|>
<|assistant|>
Ciao! Come posso aiutarti?
<|end|>
<|user|>
Cos'è l'intelligenza artificiale?
<|end|>
<|assistant|>
```

## Model Details

| | |
|---|---|
| **Base Model** | [Quark-270M Base](https://huggingface.co/ThingAI/Quark-270m-Base) |
| **Parameters** | 252M (with weight tying) |
| **Architecture** | Decoder-only Transformer (GQA, SwiGLU, RMSNorm, RoPE) |
| **Vocabulary** | 65,537 tokens |
| **Context Length** | 2,048 tokens |
| **Precision** | BF16 |
| **Languages** | Italian, English |

### Architecture

| | |
|---|---|
| d_model | 768 |
| Layers | 32 |
| Query Heads | 12 |
| KV Heads | 4 |
| Head Dim | 64 |
| FFN Dim | 2,048 |
| Activation | SwiGLU |

## Training

### Base Pretraining

~10B tokens on a bilingual mix (Italian 50%, English 43%, Code 7%) on NVIDIA B200. See [Quark-270M Base](https://huggingface.co/ThingAI/Quark-270m-Base) for details.

### SFT (Instruction Tuning)

Fine-tuned on a diverse mix of conversational and instructional data:

| Dataset | Examples | Type |
|---|---|---|
| FreedomIntelligence/alpaca-gpt4-italian | ~52,000 | Italian instructions |
| HuggingFaceH4/no_robots | ~9,500 | English conversations |
| m-a-p/CodeFeedback-Filtered-Instruction | 5,000 | Code instructions |
| yogeshm/text_to_bash (×80) | ~9,900 | Terminal commands |
| Custom chitchat (×100) | ~3,000 | Identity, greetings, basic Q&A |
| **Total** | **~80,000** | |

| | |
|---|---|
| **Hardware** | NVIDIA B200 |
| **Epochs** | 3 |
| **Learning Rate** | 2e-5 (cosine decay) |
| **Batch Size** | 16 × 4 = 64 effective |
| **Sequence Length** | 512 |

## Inference Server

Quark-270M-Instruct powers [Things Chat](https://chat.things-ai.org) via a self-hosted FastAPI server with SSE streaming, conversation memory, web search, and content moderation.

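
The server code is not part of this card, but any client that keeps conversation history has to render it in the layout shown under Chat Format. The sketch below is one way to do that, assuming messages are stored as dicts with `role` and `content` keys. `build_prompt`, `ROLE_TAGS`, and `END_TAG` are hypothetical names chosen for illustration and are not part of the released model code.

```python
# Hypothetical helper: names and message layout are illustrative assumptions,
# not part of the released Quark code.
ROLE_TAGS = {"user": "<|user|>", "assistant": "<|assistant|>"}
END_TAG = "<|end|>"


def build_prompt(messages):
    """Render a list of {"role", "content"} dicts in the Quark chat format."""
    parts = []
    for message in messages:
        role = message["role"]
        if role not in ROLE_TAGS:
            # The card states the model was not trained with <|system|> tags,
            # so any role other than user/assistant is rejected here.
            raise ValueError(f"unsupported role: {role!r}")
        parts.append(f"{ROLE_TAGS[role]}\n{message['content']}\n{END_TAG}\n")
    # Leave the assistant turn open so generation continues from this point.
    parts.append(f"{ROLE_TAGS['assistant']}\n")
    return "".join(parts)


history = [
    {"role": "user", "content": "Ciao!"},
    {"role": "assistant", "content": "Ciao! Come posso aiutarti?"},
    {"role": "user", "content": "Cos'è l'intelligenza artificiale?"},
]
prompt = build_prompt(history)
print(prompt)
```

The resulting string can be passed to the tokenizer in place of the hand-written `prompt` from Quick Start. With a 2,048-token context window, a long history will eventually need its oldest turns dropped before rendering; how Things Chat handles that is not documented here.
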
## Limitations

- **252M is small:** Limited factual knowledge, prone to hallucination
- **Mathematics:** Unreliable beyond basic arithmetic
- **Code:** Generates plausible but often non-functional code
- **Context:** 2,048 token window
- **No system prompt:** The model was not trained with `<|system|>` tags

### Good for

- Self-hosted bilingual chatbot
- Learning about LLM training from scratch
- Terminal command assistance
- Light conversational AI

### Not suited for

- Factual Q&A requiring accuracy
- Complex reasoning or math
- Production-grade code generation
- Safety-critical applications

## The Quark Family

| Model | Parameters | Type |
|---|---|---|
| [Quark-50M](https://huggingface.co/ThingAI/Quark-50m) | 51M | Base |
| [Quark-135M](https://huggingface.co/ThingAI/Quark-135m) | 135M | Base |
| [Quark-270M Base](https://huggingface.co/ThingAI/Quark-270m-Base) | 252M | Base |
| **Quark-270M-Instruct** | **252M** | **Chat** |

## Links

- 🌐 [ThingsAI](https://things-ai.org)
- 💬 [Things Chat](https://chat.things-ai.org)
- 🔤 [QuarkTokenizer](https://huggingface.co/ThingAI/QuarkTokenizer)
- 📊 [Open SLM Leaderboard](https://huggingface.co/spaces/AxiomicLabs/Open_SLM_Leaderboard)

---

*Built from scratch by ThingsAI 🇮🇹*