itriedcoding committed on
Commit 6923f6b · verified · 1 Parent(s): 8c541f7

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +94 -0
README.md ADDED
@@ -0,0 +1,94 @@
---
license: mit
language:
- en
tags:
- language-model
- transformer
- pytorch
- from-scratch
- tiny-stories
datasets:
- TinyStories
library_name: transformers
pipeline_tag: text-generation
---

# Sage 1B

A **custom 1.286-billion-parameter** language model built entirely from scratch: no base model, no fine-tuning, and no dependency on existing LLM frameworks.

## Architecture

| Parameter | Value |
|-----------|-------|
| Parameters | 1,286,155,776 |
| Layers | 30 |
| Hidden Size | 1536 |
| Attention Heads | 12 |
| Head Dimension | 128 |
| Intermediate Size | 6144 |
| Vocabulary | 50,000 (BPE) |
| Max Sequence Length | 128 tokens |
| Activation | SwiGLU |
| Position Encoding | Rotary (RoPE) |
| Normalization | RMSNorm |
| Precision | FP16 / FP32 |

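The parameter total in the table can be reproduced from the other rows. The sketch below is an assumption about the layout, not a reading of `modeling_sage_1b.py`: it assumes bias-free linear layers, four attention projections per layer, a three-matrix SwiGLU MLP, two RMSNorm weight vectors per layer plus a final one, and an output head that is not tied to the input embedding. The function name `count_parameters` is hypothetical.

```python
def count_parameters(vocab, hidden, layers, intermediate, tied_embeddings=False):
    """Count weights for a bias-free decoder-only transformer (assumed layout)."""
    embedding = vocab * hidden
    attention = 4 * hidden * hidden      # Q, K, V, and output projections
    mlp = 3 * hidden * intermediate      # SwiGLU: gate, up, and down projections
    norms = 2 * hidden                   # two RMSNorm weight vectors per layer
    per_layer = attention + mlp + norms
    lm_head = 0 if tied_embeddings else vocab * hidden
    final_norm = hidden                  # RMSNorm before the output head
    return embedding + layers * per_layer + final_norm + lm_head


total = count_parameters(vocab=50_000, hidden=1536, layers=30, intermediate=6144)
print(f"{total:,}")  # 1,286,155,776
```

Under these assumptions the count matches the table exactly, which suggests (but does not prove) that the embedding and output head are stored as separate matrices.
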
## Key Features

- **Built from scratch**: Custom PyTorch implementation, not derived from any existing model.
- **BPE Tokenizer**: A 50,000-token BPE tokenizer trained on the TinyStories dataset.
- **Modern Architecture**: SwiGLU activations, Rotary Position Embeddings (RoPE), and RMSNorm.
- **Open Source**: MIT license. Weights, training code, and inference code are all available.
- **GGUF Format**: Available for use with llama.cpp, Ollama, and other GGUF-compatible runners.

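For readers unfamiliar with the three architecture components named above, here is a minimal pure-Python sketch of each operating on plain lists. It illustrates the general techniques only and is not the code in `modeling_sage_1b.py`. The function names, the `eps` value, the RoPE base of 10000, and the pairing of adjacent dimensions are all assumptions.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """Divide x by its root mean square, then scale by a learned weight."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [w * v / rms for v, w in zip(x, weight)]


def swiglu(gate, up):
    """SwiGLU gating on already-projected vectors: SiLU(gate) * up, element-wise."""
    return [g / (1.0 + math.exp(-g)) * u for g, u in zip(gate, up)]


def rope(x, position, base=10000.0):
    """Rotate adjacent pairs of x by angles proportional to the token position."""
    dim = len(x)
    out = []
    for i in range(0, dim, 2):
        theta = position * base ** (-i / dim)
        c, s = math.cos(theta), math.sin(theta)
        out += [x[i] * c - x[i + 1] * s, x[i] * s + x[i + 1] * c]
    return out


def dot(a, b):
    """Plain dot product, used here as an unscaled attention score."""
    return sum(p * q for p, q in zip(a, b))


q = [0.3, -1.2, 0.8, 0.5]
k = [1.0, 0.4, -0.7, 0.2]

# Attention scores under RoPE depend only on the offset between positions:
# positions (5, 3) and (12, 10) both have an offset of 2.
score_a = dot(rope(q, 5), rope(k, 3))
score_b = dot(rope(q, 12), rope(k, 10))
```

The last two lines show the property that motivates RoPE: shifting both the query and key positions by the same amount leaves their score unchanged, up to floating-point error.
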
## Usage

### With Hugging Face Hub
```python
import json

import torch
from huggingface_hub import hf_hub_download
from tokenizers import Tokenizer

REPO_ID = 'itriedcoding/Sage-1B'

# Download (and cache) the config, tokenizer, and weights from the Hub.
config_path = hf_hub_download(REPO_ID, 'config.json')
tokenizer_path = hf_hub_download(REPO_ID, 'tokenizer.json')
weights_path = hf_hub_download(REPO_ID, 'pytorch_model_state.bin')

# Load the hyperparameters and the BPE tokenizer.
with open(config_path) as f:
    cfg = json.load(f)
tok = Tokenizer.from_file(tokenizer_path)

# Load the weights on CPU; this assumes the file is a plain state dict.
# The model class itself is defined in modeling_sage_1b.py (see Files).
state_dict = torch.load(weights_path, map_location='cpu')
```

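Because the Architecture table lists a maximum sequence length of 128 tokens, the prompt and the generated tokens together have to fit in that window. The helper below is a hypothetical sketch of one way to enforce this, keeping the most recent prompt tokens; `clip_prompt` is not part of the repository.

```python
MAX_SEQ_LEN = 128  # from the Architecture table


def clip_prompt(token_ids, max_new_tokens, max_seq_len=MAX_SEQ_LEN):
    """Keep the most recent prompt tokens so prompt + generation fits the context."""
    if not 0 < max_new_tokens < max_seq_len:
        raise ValueError("max_new_tokens must be between 1 and max_seq_len - 1")
    budget = max_seq_len - max_new_tokens
    return token_ids[-budget:]


long_prompt = list(range(200))           # stand-in for real token ids
clipped = clip_prompt(long_prompt, max_new_tokens=50)
print(len(clipped))  # 78
```

With the tokenizer loaded as `tok`, the ids would come from `tok.encode("Once upon a time").ids`, and generated ids can be turned back into text with `tok.decode(ids)`.
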
### With GGUF (llama.cpp)
```bash
wget https://huggingface.co/itriedcoding/Sage-1B/resolve/main/sage-1b-f16.gguf
# Recent llama.cpp builds name this binary llama-cli instead of main.
./main -m sage-1b-f16.gguf -p "Once upon a time" -n 50
```

### Web Interface
Chat with the model at https://sage-ai.vercel.app/chat

### API
```bash
curl -X POST https://sage-ai.vercel.app/api/v1/chat \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "Tell me a story"}'
```

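The same request can be built from Python with only the standard library. This is a sketch under assumptions: the helper name `build_chat_request` is hypothetical, the `Content-Type` header is added because the body is JSON, and the response schema is not documented here, so the example stops before sending or parsing anything.

```python
import json
import urllib.request

API_URL = "https://sage-ai.vercel.app/api/v1/chat"


def build_chat_request(api_key, message):
    """Build (but do not send) the POST request shown in the curl example."""
    body = json.dumps({"message": message}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


request = build_chat_request("YOUR_API_KEY", "Tell me a story")
# To send it, pass the object to urllib.request.urlopen(request)
# and read the response body; its format is not specified in this README.
```

Keeping request construction separate from sending makes it easy to inspect the headers and payload before spending an API call.
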
## Training

The model was trained on the **TinyStories** dataset, a synthetic dataset of short stories designed for training compact language models. Training ran on CPU with limited resources, so this release is a proof of concept for building an LLM from scratch without GPU access.

## Files

| File | Size | Description |
|------|------|-------------|
| `pytorch_model_state.bin` | 2.4 GB | FP16 model weights |
| `sage-1b-f16.gguf` | 2.4 GB | GGUF format for llama.cpp |
| `config.json` | 1 KB | Model hyperparameters |
| `tokenizer.json` | 12 MB | BPE tokenizer (50K vocab) |
| `modeling_sage_1b.py` | 6 KB | Model architecture code |

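As a sanity check, the size of the FP16 weight files follows from the parameter count at two bytes per parameter. The short calculation below ignores file headers and metadata, which are assumed to be negligible at this scale.

```python
PARAMETERS = 1_286_155_776   # from the Architecture table
BYTES_PER_FP16 = 2

size_bytes = PARAMETERS * BYTES_PER_FP16
print(f"{size_bytes / 1e9:.2f} GB")     # 2.57 GB (decimal gigabytes)
print(f"{size_bytes / 2**30:.2f} GiB")  # 2.40 GiB (binary gibibytes)
```

The 2.4 GB figure in the table therefore lines up with binary units; in decimal units the same files come to roughly 2.57 GB.
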
## License

MIT