---
license: mit
language:
- en
tags:
- language-model
- transformer
- pytorch
- from-scratch
- tiny-stories
datasets:
- TinyStories
library_name: transformers
pipeline_tag: text-generation
---

# Sage 1B

A **custom 1.286-billion-parameter** language model built entirely from scratch: no base models, no fine-tuning, and no dependencies on existing LLM frameworks.

## Architecture

| Parameter | Value |
|-----------|-------|
| Parameters | 1,286,155,776 |
| Layers | 30 |
| Hidden Size | 1536 |
| Attention Heads | 12 |
| Head Dimension | 128 |
| Intermediate Size | 6144 |
| Vocabulary | 50,000 (BPE) |
| Max Sequence Length | 128 tokens |
| Activation | SwiGLU |
| Position Encoding | Rotary (RoPE) |
| Normalization | RMSNorm |
| Precision | FP16 / FP32 |

## Key Features

- **Built from scratch**: Custom PyTorch implementation. Not a derivative of any existing model.
- **BPE Tokenizer**: A 50,000-token BPE tokenizer trained on the TinyStories dataset.
- **Modern Architecture**: SwiGLU activations, Rotary Position Embeddings (RoPE), RMSNorm.
- **Open Source**: MIT license. Weights, training code, and inference code are all available.
- **GGUF Format**: Available for use with llama.cpp, Ollama, and other GGUF-compatible runners.

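
The parameter count in the Architecture table can be reproduced from the other rows. The sketch below is one breakdown that matches the stated total; it assumes bias-free linear layers, untied input and output embeddings, and two RMSNorm weight vectors per layer plus a final one. None of these assumptions is stated in this card, so treat it as a consistency check rather than a description of the actual implementation.

```python
# Hypothetical breakdown of the parameter count from the Architecture table.
# Assumptions (not confirmed by the model card): no bias terms, untied
# input/output embeddings, two RMSNorm weights per layer plus a final norm.
hidden = 1536
layers = 30
intermediate = 6144
vocab = 50_000
heads = 12

head_dim = hidden // heads               # per-head width

attn = 4 * hidden * hidden               # Q, K, V and output projections
mlp = 3 * hidden * intermediate          # SwiGLU: gate, up and down projections
norms = 2 * hidden                       # two RMSNorm weight vectors per layer
per_layer = attn + mlp + norms

embeddings = vocab * hidden              # token embedding matrix
lm_head = vocab * hidden                 # separate (untied) output projection
final_norm = hidden                      # RMSNorm before the output projection

total = layers * per_layer + embeddings + lm_head + final_norm
fp16_gib = total * 2 / 2**30             # 2 bytes per FP16 weight

print(f"head_dim={head_dim}")
print(f"{total:,} parameters, about {fp16_gib:.2f} GiB in FP16")
```

Under these assumptions the script prints `head_dim=128` and `1,286,155,776 parameters, about 2.40 GiB in FP16`, which agrees with the table and with the 2.4 GB weight files if that size is read as GiB.
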
## Usage

### With Hugging Face Hub

```python
import json

import torch
from huggingface_hub import hf_hub_download
from tokenizers import Tokenizer

repo = "itriedcoding/Sage-1B"
config_path = hf_hub_download(repo, "config.json")
tokenizer_path = hf_hub_download(repo, "tokenizer.json")
weights_path = hf_hub_download(repo, "pytorch_model_state.bin")

with open(config_path) as f:
    cfg = json.load(f)  # model hyperparameters
tok = Tokenizer.from_file(tokenizer_path)
# Assumes the .bin file is a plain PyTorch state dict
state_dict = torch.load(weights_path, map_location="cpu")
```

### With GGUF (llama.cpp)

```bash
wget https://huggingface.co/itriedcoding/Sage-1B/resolve/main/sage-1b-f16.gguf
# Recent llama.cpp builds name this binary llama-cli instead of main
./main -m sage-1b-f16.gguf -p "Once upon a time" -n 50
```

### Web Interface

Chat with the model at: https://sage-ai.vercel.app/chat

### API

```bash
curl -X POST https://sage-ai.vercel.app/api/v1/chat \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "Tell me a story"}'
```

## Training

The model was trained on the **TinyStories** dataset, a synthetic dataset of short stories designed for training compact language models. Training was performed on CPU with limited resources, making this a proof of concept for building LLMs from scratch without GPU access.

## Files

| File | Size | Description |
|------|------|-------------|
| `pytorch_model_state.bin` | 2.4 GB | FP16 model weights |
| `sage-1b-f16.gguf` | 2.4 GB | GGUF format for llama.cpp |
| `config.json` | 1 KB | Model hyperparameters |
| `tokenizer.json` | 12 MB | BPE tokenizer (50K vocab) |
| `modeling_sage_1b.py` | 6 KB | Model architecture code |

## License

MIT