---
language: en
tags:
- causal-lm
- chat
- reasoning
- momo
license: apache-2.0
---

# 🌸 {MOMO_VERSION}

Momo is a friendly 336M-parameter language model trained from scratch, designed to feel like chatting with a warm, knowledgeable friend.

## Model Details

- **Parameters:** ~336M
- **Architecture:** Transformer (RoPE + RMSNorm + GQA + SwiGLU)
- **Trained on:** WikiText-103 + Alpaca + custom reasoning data
- **Context length:** {MAX_SEQ_LEN} tokens
- **Vocabulary:** {VOCAB_FINAL} tokens

## Capabilities

- 💬 Friendly, casual chat
- 🧠 Reasoning with `` tags
- ❓ Question answering
- 🤗 Emotional support

## Quick Start

```python
# Load and chat with Momo
from transformers import AutoTokenizer
# MomoForCausalLM is the custom model class for this model; import it from
# wherever it is defined in your setup (the module path is not given here).

model = MomoForCausalLM.from_pretrained('path/to/Momo-336M')
tokenizer = AutoTokenizer.from_pretrained('path/to/Momo-336M')

messages = [{{'role': 'user', 'content': 'Hey Momo! How are you?'}}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors='pt')

# Assuming the Hugging Face generate() API, temperature only takes effect when do_sample=True
output = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.75)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

## Training Setup

- GPU: 2× NVIDIA T4 (Kaggle)
- Precision: float16 AMP
- Gradient checkpointing: enabled
- Training stages: Pretrain → SFT → Reasoning
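
The float16 AMP entry above is worth a short illustration. Mixed-precision training in float16 is commonly paired with loss scaling, because very small gradients underflow to zero in half precision. The sketch below uses only Python's standard library to show that effect. The gradient value and the scale factor are hypothetical, chosen for illustration, and the card does not state which scaler settings Momo's training actually used.

```python
import struct

def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("e", struct.pack("e", x))[0]

# Hypothetical values for illustration only (not taken from Momo's training run).
tiny_grad = 1e-8        # below float16's smallest subnormal (about 6e-8)
loss_scale = 2.0 ** 16  # an assumed scale factor; real scalers adjust this dynamically

# Stored directly in float16, the gradient underflows to zero.
unscaled = to_fp16(tiny_grad)

# Scaling before the float16 round-trip keeps the value representable;
# dividing by the scale afterwards, in higher precision, recovers it.
recovered = to_fp16(tiny_grad * loss_scale) / loss_scale

print(unscaled)
print(recovered)
```

In a PyTorch training loop this role is usually played by a gradient scaler that multiplies the loss before the backward pass and unscales gradients before the optimizer step.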
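
Two of the architecture components named under Model Details, RMSNorm and GQA (grouped-query attention), can be sketched in a few lines of plain Python. This is a minimal illustration of the general techniques, not Momo's implementation. The function names, the `eps` value, and the head counts are all assumptions made for the example.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """Scale x by the reciprocal of its root mean square, then by a learned gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]

def kv_head_for(query_head, n_query_heads, n_kv_heads):
    """Return the index of the key/value head shared by a given query head."""
    if n_query_heads % n_kv_heads != 0:
        raise ValueError("n_query_heads must be a multiple of n_kv_heads")
    group_size = n_query_heads // n_kv_heads
    return query_head // group_size

# Hypothetical sizes for illustration; Momo's real head counts are not listed here.
normed = rms_norm([3.0, 4.0], weight=[1.0, 1.0])
mapping = [kv_head_for(h, n_query_heads=8, n_kv_heads=2) for h in range(8)]

print(normed)   # values rescaled so their root mean square is close to 1
print(mapping)  # consecutive query heads share one key/value head
```

With 8 query heads and 2 key/value heads, each group of 4 query heads reads the same keys and values, which shrinks the KV cache relative to standard multi-head attention.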