	---
	license: mit
	language:
	- en
	tags:
	- language-model
	- transformer
	- pytorch
	- from-scratch
	- tiny-stories
	datasets:
	- TinyStories
	library_name: transformers
	pipeline_tag: text-generation
	---

	# Sage 1B

	A custom 1.286-billion-parameter language model built entirely from scratch in PyTorch: no base model, no fine-tuning, and no dependence on existing LLM frameworks.

	## Architecture

	| Parameter | Value |
	|-----------|-------|
	| Parameters | 1,286,155,776 |
	| Layers | 30 |
	| Hidden Size | 1536 |
	| Attention Heads | 12 |
	| Head Dimension | 128 |
	| Intermediate Size | 6144 |
	| Vocabulary | 50,000 (BPE) |
	| Max Sequence Length | 128 tokens |
	| Activation | SwiGLU |
	| Position Encoding | Rotary (RoPE) |
	| Normalization | RMSNorm |
	| Precision | FP16 / FP32 |
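The parameter count can be reproduced from the other rows of the table. The sketch below is a sanity check, not the author's code. It assumes bias-free linear layers, untied input and output embeddings, and two RMSNorm weight vectors per layer plus one final norm. None of these details are stated in the card, but together they give exactly the listed total.

```python
# Values copied from the Architecture table.
vocab_size = 50_000
hidden = 1536
layers = 30
intermediate = 6144

# Assumptions (not stated in the card): no biases, untied embeddings,
# two RMSNorm weights per layer and one final RMSNorm.
attention = 4 * hidden * hidden        # Q, K, V and output projections
mlp = 3 * hidden * intermediate        # SwiGLU: gate, up and down projections
norms = 2 * hidden                     # pre-attention and pre-MLP RMSNorm
per_layer = attention + mlp + norms

embeddings = 2 * vocab_size * hidden   # token embedding + output head
final_norm = hidden

total = layers * per_layer + embeddings + final_norm
print(f"{total:,}")  # 1,286,155,776
```

Note that 12 heads of dimension 128 also account for the full hidden size of 1536, so the attention projections are square under these assumptions.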

	## Key Features

	- Built from scratch — Custom PyTorch implementation. Not a derivative of any existing model.
	- BPE Tokenizer — Trained a 50,000-token BPE tokenizer on the TinyStories dataset.
	- Modern Architecture — SwiGLU activations, Rotary Position Embeddings (RoPE), RMSNorm.
	- Open Source — MIT license. Weights, training code, and inference code are all available.
	- GGUF Format — Available for use with llama.cpp, Ollama, and other GGUF-compatible runners.
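For readers unfamiliar with the three architecture components named above, here is a minimal pure-Python sketch of each. It is illustrative only and is not taken from `modeling_sage_1b.py`. The function names, the `eps` value, the RoPE base of 10000, and the interleaved pairing of dimensions are all assumptions, and real implementations operate on tensors rather than lists.

```python
import math

def rms_norm(x, weight, eps=1e-6):
    """RMSNorm: rescale x by its root mean square, then apply a learned gain."""
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale * w for v, w in zip(x, weight)]

def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))

def swiglu(gate, up):
    """SwiGLU gating: silu(gate) * up, element-wise.

    `gate` and `up` stand for the outputs of two separate linear projections.
    """
    return [silu(g) * u for g, u in zip(gate, up)]

def rope(vec, pos, base=10000.0):
    """RoPE: rotate each consecutive pair of dimensions by a position-dependent angle."""
    d = len(vec)
    assert d % 2 == 0, "RoPE needs an even number of dimensions"
    out = []
    for i in range(0, d, 2):
        theta = pos * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

q = [1.0, 0.0, 0.5, -0.5]
k = [0.3, 0.7, -0.2, 0.1]
# Attention scores under RoPE depend only on the offset between positions.
near = dot(rope(q, 5), rope(k, 3))
far = dot(rope(q, 12), rope(k, 10))
print(abs(near - far) < 1e-9)  # True
```

The final check illustrates why RoPE is used for position encoding: shifting both the query and the key by the same number of positions leaves their dot product unchanged.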

	## Usage

	### With Hugging Face Hub
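	```python
	from huggingface_hub import hf_hub_download
	import json

	import torch
	from tokenizers import Tokenizer

	repo_id = 'itriedcoding/Sage-1B'

	# Download (or reuse cached copies of) the config, tokenizer, and weights.
	config_path = hf_hub_download(repo_id, 'config.json')
	tokenizer_path = hf_hub_download(repo_id, 'tokenizer.json')
	weights_path = hf_hub_download(repo_id, 'pytorch_model_state.bin')

	with open(config_path) as f:
	    cfg = json.load(f)  # model hyperparameters
	tok = Tokenizer.from_file(tokenizer_path)
	# Assumed to be a plain state dict, as the file name suggests.
	state_dict = torch.load(weights_path, map_location='cpu')
	```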

	### With GGUF (llama.cpp)
	```bash
	wget https://huggingface.co/itriedcoding/Sage-1B/resolve/main/sage-1b-f16.gguf
	# Recent llama.cpp builds name this binary `llama-cli` instead of `main`.
	./main -m sage-1b-f16.gguf -p "Once upon a time" -n 50
	```

	### Web Interface
	Chat with the model at: https://sage-ai.vercel.app/chat

	### API
	```bash
	curl -X POST https://sage-ai.vercel.app/api/v1/chat \
	-H "Authorization: Bearer YOUR_API_KEY" \
	-H "Content-Type: application/json" \
	-d '{"message": "Tell me a story"}'
	```
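The same request can be built from Python with only the standard library. This is a sketch under assumptions: `build_chat_request` is a hypothetical helper name, the `Content-Type` header is added on the assumption that the endpoint expects JSON, and the response format is not documented here, so the example stops short of parsing a reply.

```python
import json
import urllib.request

API_URL = "https://sage-ai.vercel.app/api/v1/chat"

def build_chat_request(message, api_key):
    """Build (but do not send) a POST request mirroring the curl command."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumption: JSON endpoint
        },
    )

req = build_chat_request("Tell me a story", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req, timeout=30).read()
```

Keeping request construction separate from sending makes it easy to inspect the payload before any network call is made.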

	## Training

	The model was trained on TinyStories, a synthetic dataset of short stories designed for training compact language models. Training ran on CPU with limited resources, so the project serves as a proof of concept for building an LLM from scratch without GPU access.

	## Files

	| File | Size | Description |
	|------|------|-------------|
	| `pytorch_model_state.bin` | 2.4 GB | FP16 model weights |
	| `sage-1b-f16.gguf` | 2.4 GB | GGUF format for llama.cpp |
	| `config.json` | 1 KB | Model hyperparameters |
	| `tokenizer.json` | 12 MB | BPE tokenizer (50K vocab) |
	| `modeling_sage_1b.py` | 6 KB | Model architecture code |
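The weight file sizes follow from the parameter count and the bytes used per parameter. The rough check below ignores file headers and metadata, and `weights_size_bytes` is a hypothetical helper name. It suggests the 2.4 GB figure is FP16 storage measured in binary gigabytes (GiB).

```python
N_PARAMS = 1_286_155_776  # from the Architecture table
GIB = 1024 ** 3

def weights_size_bytes(n_params, bytes_per_param):
    """Raw tensor storage only; headers and metadata are ignored."""
    return n_params * bytes_per_param

for name, width in (("FP16", 2), ("FP32", 4)):
    size = weights_size_bytes(N_PARAMS, width)
    print(f"{name}: {size / 1e9:.2f} GB = {size / GIB:.2f} GiB")
# FP16: 2.57 GB = 2.40 GiB
# FP32: 5.14 GB = 4.79 GiB
```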

	## License

	MIT