license: mit
language:
  - en
tags:
  - language-model
  - transformer
  - pytorch
  - from-scratch
  - tiny-stories
datasets:
  - TinyStories
library_name: transformers
pipeline_tag: text-generation

Sage 1B

A custom 1.286-billion-parameter language model built entirely from scratch: no base models, no fine-tuning, and no dependencies on existing LLM frameworks.

Architecture

Parameter	Value
Parameters	1,286,155,776
Layers	30
Hidden Size	1536
Attention Heads	12
Head Dimension	128
Intermediate Size	6144
Vocabulary	50,000 (BPE)
Max Sequence Length	128 tokens
Activation	SwiGLU
Position Encoding	Rotary (RoPE)
Normalization	RMSNorm
Precision	FP16 / FP32
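
The parameter total can be reproduced from the other rows of the table. The sketch below assumes bias-free linear layers, four attention projections per layer (Q, K, V, output), a three-matrix SwiGLU feed-forward block, two RMSNorm weight vectors per layer plus one final norm, and an output head that is not tied to the input embedding. None of these layout details are stated in this card; they are simply the combination that matches the listed figure.

```python
# Reproduce the parameter total from the architecture table.
# Layout assumptions (not confirmed by the model card): no biases,
# Q/K/V/O attention projections, 3-matrix SwiGLU MLP, untied output head.
VOCAB = 50_000
HIDDEN = 1536
LAYERS = 30
HEADS = 12
HEAD_DIM = 128
INTERMEDIATE = 6144


def count_parameters() -> int:
    # The attention heads must tile the hidden dimension exactly.
    assert HEADS * HEAD_DIM == HIDDEN
    attention = 4 * HIDDEN * HIDDEN       # Q, K, V and output projections
    mlp = 3 * HIDDEN * INTERMEDIATE       # gate, up and down projections
    norms = 2 * HIDDEN                    # pre-attention and pre-MLP RMSNorm
    per_layer = attention + mlp + norms
    embedding = VOCAB * HIDDEN            # input token embedding
    lm_head = VOCAB * HIDDEN              # separate (untied) output projection
    final_norm = HIDDEN
    return LAYERS * per_layer + embedding + lm_head + final_norm


total = count_parameters()
print(f"{total:,}")  # 1,286,155,776
```

With tied input and output embeddings the same layout would come to 76,800,000 fewer parameters, which is why the untied assumption is used here.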

Key Features

Built from scratch — Custom PyTorch implementation. Not a derivative of any existing model.
BPE Tokenizer — Trained a 50,000-token BPE tokenizer on the TinyStories dataset.
Modern Architecture — SwiGLU activations, Rotary Position Embeddings (RoPE), RMSNorm.
Open Source — MIT license. Weights, training code, and inference code are all available.
GGUF Format — Available for use with llama.cpp, Ollama, and other GGUF-compatible runners.
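
For readers unfamiliar with the three components named above, here is a minimal pure-Python sketch of each operation on a single vector. It is illustrative only and is not the code in `modeling_sage_1b.py`. The `eps` value, the RoPE base of 10000, and the choice to rotate consecutive pairs (rather than split halves) are assumptions, and the function names are hypothetical.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Scale x so its root-mean-square is about 1, then apply a learned gain."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]


def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def swiglu_gate(gate, up):
    """Elementwise SiLU(gate) * up, the gated activation inside a SwiGLU MLP."""
    return [silu(g) * u for g, u in zip(gate, up)]


def rope(x, position, base=10000.0):
    """Rotate each consecutive pair (x[i], x[i+1]) by position * base**(-i/d)."""
    d = len(x)
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        out.extend([x[i] * c - x[i + 1] * s, x[i] * s + x[i + 1] * c])
    return out
```

In a full SwiGLU feed-forward block, `gate` and `up` would be two linear projections of the same input, and the gated result would pass through a third projection back to the hidden size. RoPE is a pure rotation, so it leaves the length of each query and key vector unchanged and only encodes position in their relative angle.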

Usage

With Hugging Face Hub

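import json

import torch
from huggingface_hub import hf_hub_download
from tokenizers import Tokenizer

# Download (and cache) the config, tokenizer, and weights from the Hub.
config_path = hf_hub_download('itriedcoding/Sage-1B', 'config.json')
tokenizer_path = hf_hub_download('itriedcoding/Sage-1B', 'tokenizer.json')
weights_path = hf_hub_download('itriedcoding/Sage-1B', 'pytorch_model_state.bin')

# Load the hyperparameters and the BPE tokenizer.
with open(config_path) as f:
    cfg = json.load(f)
tok = Tokenizer.from_file(tokenizer_path)

# Load the weights on CPU (assumes the file holds a plain state dict).
state_dict = torch.load(weights_path, map_location='cpu')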
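
Because the maximum sequence length is 128 tokens, the prompt and the generated continuation have to share that budget. The helper below is a hypothetical sketch (the name `fit_to_context` is not part of this repository) that keeps only the most recent prompt tokens so there is room left to generate. Whether the model code already truncates its input is not stated in this card.

```python
MAX_SEQ_LEN = 128  # "Max Sequence Length" from the architecture table


def fit_to_context(prompt_ids, max_new_tokens, max_len=MAX_SEQ_LEN):
    """Keep the most recent prompt tokens so prompt + generation fits in max_len."""
    if not 0 < max_new_tokens < max_len:
        raise ValueError("max_new_tokens must be between 1 and max_len - 1")
    budget = max_len - max_new_tokens  # tokens available for the prompt
    return list(prompt_ids[-budget:])
```

With the tokenizer loaded as above, this could be called as `fit_to_context(tok.encode("Once upon a time").ids, max_new_tokens=50)`, which leaves at most 78 prompt tokens.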

With GGUF (llama.cpp)

wget https://huggingface.co/itriedcoding/Sage-1B/resolve/main/sage-1b-f16.gguf
# Recent llama.cpp builds name this binary llama-cli; older builds use ./main.
./main -m sage-1b-f16.gguf -p "Once upon a time" -n 50

Web Interface

Chat with the model at: https://sage-ai.vercel.app/chat

API

curl -X POST https://sage-ai.vercel.app/api/v1/chat \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "Tell me a story"}'
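
The same request can be built from Python with only the standard library. This is a sketch under assumptions: the `Content-Type: application/json` header is assumed to be what the endpoint expects, the response schema is not documented here, and `build_chat_request` is a hypothetical helper name.

```python
import json
import urllib.request

API_URL = "https://sage-ai.vercel.app/api/v1/chat"


def build_chat_request(message: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumption: the endpoint expects a JSON content type.
            "Content-Type": "application/json",
        },
    )


# To send it (requires network access and a valid key):
# req = build_chat_request("Tell me a story", "YOUR_API_KEY")
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Keeping request construction separate from sending makes it easy to inspect the headers and body before any network call is made.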

Training

The model was trained on the TinyStories dataset, a synthetic dataset of short stories designed for training compact language models. Training ran entirely on CPU with limited resources, so the model is best read as a proof of concept for building an LLM from scratch without GPU access.

Files

File	Size	Description
`pytorch_model_state.bin`	2.4 GB	FP16 model weights
`sage-1b-f16.gguf`	2.4 GB	GGUF format for llama.cpp
`config.json`	1 KB	Model hyperparameters
`tokenizer.json`	12 MB	BPE tokenizer (50K vocab)
`modeling_sage_1b.py`	6 KB	Model architecture code
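
As a quick sanity check, the 2.4 GB weight files are consistent with storing every parameter in FP16 at 2 bytes each, provided "GB" is read as binary gigabytes (GiB). That reading is an assumption; in decimal units the same tensors come to roughly 2.57 GB. The calculation ignores file headers and metadata.

```python
PARAMETERS = 1_286_155_776  # from the architecture table
BYTES_PER_FP16 = 2


def fp16_size_gib(n_params: int = PARAMETERS) -> float:
    """Raw FP16 tensor size in GiB, ignoring file headers and metadata."""
    return n_params * BYTES_PER_FP16 / 2**30


print(f"{fp16_size_gib():.2f} GiB")  # 2.40 GiB
```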

License

MIT