Commit 66d4b44 (verified) by itriedcoding
1 parent: 6d6d2c9

Upload folder using huggingface_hub

Files changed (5)
  1. README.md +299 -0
  2. __init__.py +3 -0
  3. config.json +14 -0
  4. modeling_transformer_lm.py +109 -0
  5. pytorch_model.bin +3 -0
README.md ADDED
@@ -0,0 +1,299 @@
+ # Sage - Custom LLM Model
+
+ Sage is a custom-built transformer language model for character-level text generation. It demonstrates the full lifecycle of building a custom model and publishing it to Hugging Face.
+
+ ## 📊 Model Overview
+
+ - **Model Type**: Transformer-based language model
+ - **Architecture**: Decoder-only transformer (built from PyTorch `TransformerEncoder` layers)
+ - **Vocabulary Size**: 40 tokens (characters plus special tokens)
+ - **Hidden Size**: 256
+ - **Number of Layers**: 4
+ - **Number of Attention Heads**: 8
+ - **Feedforward Size**: 1024
+ - **Max Sequence Length**: 64
+ - **Parameters**: ~3.2M
+ - **Training Framework**: PyTorch
+ - **License**: MIT
+
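The ~3.2M figure can be checked by hand from the sizes listed above. The sketch below assumes standard PyTorch layer shapes (biases on every linear layer, a fused query/key/value projection, two LayerNorms per layer) and no weight tying; the variable and function names are illustrative only.

```python
# Parameter count for the listed architecture (a sketch under the
# assumptions stated above, not a dump of the real checkpoint).
vocab, hidden, layers, ffn, max_len = 40, 256, 4, 1024, 64

def linear(n_in, n_out):
    """Weights plus bias of a dense layer."""
    return n_in * n_out + n_out

# Token embedding table plus learned positional embedding table
embeddings = vocab * hidden + max_len * hidden
# Fused QKV projection plus attention output projection
attention = linear(hidden, 3 * hidden) + linear(hidden, hidden)
# Two-layer feedforward block
feedforward = linear(hidden, ffn) + linear(ffn, hidden)
# Two LayerNorms, each with a scale and a shift vector
layer_norms = 2 * (2 * hidden)

per_layer = attention + feedforward + layer_norms
total_params = embeddings + layers * per_layer + linear(hidden, vocab)
print(f"{total_params:,}")  # prints 3,195,944
```

Almost all of the parameters sit in the four transformer layers; the two embedding tables and the output projection together contribute fewer than 40,000.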
+ ## 📚 Training Data
+
+ Sage was trained on a curated dataset of example sentences covering:
+ - Conversational phrases and greetings
+ - Weather and environmental descriptions
+ - Machine learning and AI concepts
+ - Deep learning architectures (transformers, neural networks)
+ - Natural language processing applications
+ - Model development and deployment practices
+
+ The dataset consists of 10 hand-written examples of technical and conversational English. With so little data, the model is likely to memorize these sentences rather than generalize beyond them.
+
+ ## 🔧 Technical Specifications
+
+ ### Model Architecture
+ ```
+ TransformerLM(
+   (embedding): Embedding(40, 256)
+   (pos_embedding): Embedding(64, 256)
+   (transformer_encoder): TransformerEncoder(
+     (layers): ModuleList(
+       (0-3): TransformerEncoderLayer(
+         (self_attn): MultiheadAttention(
+           (embed_dim): 256
+           (num_heads): 8
+         )
+         (linear1): Linear(in_features=256, out_features=1024, bias=True)
+         (linear2): Linear(in_features=1024, out_features=256, bias=True)
+         (norm1): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
+         (norm2): LayerNorm((256,), eps=1e-05, elementwise_affine=True)
+         (dropout): Dropout(p=0.1, inplace=False)
+         (dropout1): Dropout(p=0.1, inplace=False)
+         (dropout2): Dropout(p=0.1, inplace=False)
+       )
+     )
+   )
+   (output_layer): Linear(in_features=256, out_features=40, bias=True)
+ )
+ ```
+
+ ### Tokenization
+ Sage uses a character-level tokenizer with:
+ - Vocabulary: 40 tokens (unique characters plus special tokens)
+ - Special tokens: `<PAD>` (0), `<UNK>` (1). Note that `config.json` in this commit also assigns ID 1 to `bos_token_id` and ID 2 to `eos_token_id`.
+ - Encoding: each character maps to one token ID
+ - Maximum sequence length: 64 tokens
+
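A character-level tokenizer of this kind can be sketched in a few lines. The class below is hypothetical: the actual vocabulary is not part of this commit, so the character set, the class name `CharTokenizer`, and its methods are assumptions made for illustration. Only the special-token IDs (0 and 1) and the 64-token limit come from the list above.

```python
# Hypothetical character-level tokenizer. The real vocabulary is not in
# this commit, so the character set passed in below is only an assumption.
class CharTokenizer:
    PAD, UNK = "<PAD>", "<UNK>"

    def __init__(self, chars, max_length=64):
        # Special tokens take IDs 0 and 1; characters follow in sorted order.
        self.id_to_token = [self.PAD, self.UNK] + sorted(set(chars))
        self.token_to_id = {tok: i for i, tok in enumerate(self.id_to_token)}
        self.max_length = max_length

    def encode(self, text):
        """Map each character to its ID, unknown characters to <UNK>."""
        unk = self.token_to_id[self.UNK]
        ids = [self.token_to_id.get(ch, unk) for ch in text]
        return ids[: self.max_length]  # truncate to the context window

    def decode(self, ids):
        """Map IDs back to characters, dropping the special tokens."""
        specials = {self.PAD, self.UNK}
        tokens = (self.id_to_token[i] for i in ids)
        return "".join(tok for tok in tokens if tok not in specials)


tokenizer = CharTokenizer("abcdefghijklmnopqrstuvwxyz .,!?")
ids = tokenizer.encode("hello world")
print(tokenizer.decode(ids))  # prints: hello world
```

Because every character is its own token, a 64-token context holds at most 64 characters of text, which is why the model is only suited to short generations.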
+ ## 🚀 Usage
+
+ ### With Transformers Library
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ import torch
+
+ # Load model and tokenizer.
+ # NOTE: this commit contains no tokenizer files, so AutoTokenizer only works
+ # once they are uploaded. Loading custom modeling code through the Auto
+ # classes needs trust_remote_code=True and an `auto_map` entry in
+ # config.json, which this commit does not define.
+ model_name = "itriedcoding/Sage"
+ tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
+ model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
+ model.eval()
+
+ # Generate text
+ def generate_text(prompt, max_length=50, temperature=0.8):
+     inputs = tokenizer.encode(prompt, return_tensors="pt")
+     # max_length counts prompt plus generated tokens and must not exceed
+     # the 64 positions the model supports.
+     max_length = min(max_length, model.config.max_position_embeddings)
+
+     with torch.no_grad():
+         outputs = model.generate(
+             inputs,
+             max_length=max_length,
+             temperature=temperature,
+             do_sample=True,
+             pad_token_id=model.config.pad_token_id,
+         )
+
+     return tokenizer.decode(outputs[0], skip_special_tokens=True)
+
+ # Examples
+ print(generate_text("Hello"))
+ print(generate_text("The weather"))
+ print(generate_text("Deep learning"))
+ ```
+
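The `temperature` argument used above rescales the logits before sampling. The following is a minimal pure-Python sketch of that idea, not the implementation inside `generate`; the function name `sample_with_temperature` and the example logits are made up for illustration.

```python
import math
import random

def sample_with_temperature(logits, temperature=0.8, rng=random):
    """Sample an index from softmax(logits / temperature).

    Returns the sampled index together with the probabilities used.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]
    index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return index, probs

# Low temperature sharpens the distribution, high temperature flattens it.
_, sharp = sample_with_temperature([2.0, 1.0, 0.0], temperature=0.5)
_, flat = sample_with_temperature([2.0, 1.0, 0.0], temperature=2.0)
print(round(sharp[0], 3), round(flat[0], 3))
```

With a temperature below 1 the most likely character dominates, which tends to reproduce training sentences; values above 1 spread probability over more characters and produce noisier text.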
+ ### Direct PyTorch Usage
+ ```python
+ import json
+
+ from modeling_transformer_lm import TransformerLM
+
+ # Inspect the hyperparameters shipped with the model
+ with open("config.json", "r") as f:
+     config_dict = json.load(f)
+ print(config_dict["hidden_size"], config_dict["num_hidden_layers"])
+
+ # Load the weights. A tokenizer must be loaded separately; this example
+ # only shows the model side.
+ model = TransformerLM.from_pretrained("itriedcoding/Sage")
+ model.eval()
+ ```
+
+ ## 🏗️ Model Card Metadata
+
+ The Hub reads this metadata only when it appears as YAML front matter at the very top of `README.md`, not inside a code fence:
+
+ ```yaml
+ ---
+ library_name: transformers
+ license: mit
+ base_model: custom-built
+ tags:
+   - text-generation
+   - transformer
+   - character-level
+   - custom-model
+   - educational
+ pipeline_tag: text-generation
+ widget:
+   - text: Hello
+     parameters: {max_length: 30, temperature: 0.7}
+   - text: The weather
+     parameters: {max_length: 30, temperature: 0.7}
+   - text: Deep learning
+     parameters: {max_length: 30, temperature: 0.7}
+ ---
+ ```
+
+ ## 🤗 Hugging Face Spaces Deployment
+
+ You can run this model in various Hugging Face Spaces templates:
+
+ ### Streamlit Space
+ Create a `streamlit_app.py`:
+ ```python
+ import streamlit as st
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ import torch
+
+ @st.cache_resource
+ def load_model():
+     # Cached so the model is loaded once, not on every Streamlit rerun
+     model_name = "itriedcoding/Sage"
+     tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
+     model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
+     return tokenizer, model
+
+ def main():
+     st.title("🤖 Sage Text Generator")
+     st.write("A custom character-level language model")
+
+     tokenizer, model = load_model()
+
+     prompt = st.text_input("Enter your prompt:", "Hello")
+     # The model supports at most 64 positions, so the slider stops at 64
+     max_length = st.slider("Max length:", 10, 64, 30)
+     temperature = st.slider("Temperature:", 0.1, 2.0, 0.8)
+
+     if st.button("Generate"):
+         with st.spinner("Generating..."):
+             inputs = tokenizer.encode(prompt, return_tensors="pt")
+             with torch.no_grad():
+                 outputs = model.generate(
+                     inputs,
+                     max_length=max_length,
+                     temperature=temperature,
+                     do_sample=True,
+                     pad_token_id=model.config.pad_token_id,
+                 )
+             result = tokenizer.decode(outputs[0], skip_special_tokens=True)
+             st.write("**Generated text:**")
+             st.write(result)
+
+ if __name__ == "__main__":
+     main()
+ ```
+
+ ### Gradio Space
+ Create an `app.py`:
+ ```python
+ import gradio as gr
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ import torch
+
+ def load_model():
+     model_name = "itriedcoding/Sage"
+     tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
+     model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)
+     return tokenizer, model
+
+ # Load once at startup instead of on every request
+ tokenizer, model = load_model()
+
+ def generate_text(prompt, max_length, temperature):
+     inputs = tokenizer.encode(prompt, return_tensors="pt")
+
+     with torch.no_grad():
+         outputs = model.generate(
+             inputs,
+             max_length=int(max_length),  # sliders return floats
+             temperature=temperature,
+             do_sample=True,
+             pad_token_id=model.config.pad_token_id,
+         )
+
+     return tokenizer.decode(outputs[0], skip_special_tokens=True)
+
+ demo = gr.Interface(
+     fn=generate_text,
+     inputs=[
+         gr.Textbox(label="Prompt", value="Hello"),
+         # The model supports at most 64 positions, so the slider stops at 64
+         gr.Slider(minimum=10, maximum=64, value=30, step=1, label="Max Length"),
+         gr.Slider(minimum=0.1, maximum=2.0, value=0.8, label="Temperature"),
+     ],
+     outputs=gr.Textbox(label="Generated Text"),
+     title="🤖 Sage Text Generator",
+     description="Custom character-level language model for text generation",
+ )
+
+ if __name__ == "__main__":
+     demo.launch()
+ ```
+
+ ## 📦 GGUF Quantization
+
+ For efficient deployment, Sage can be distributed in GGUF format. This commit does not include any GGUF files; the file names below are the ones this README refers to.
+
+ ### Available Quantizations
+ - `sage-q4_0.gguf` - 4-bit quantization (balanced quality/size)
+ - `sage-q5_0.gguf` - 5-bit quantization (higher quality)
+ - `sage-q8_0.gguf` - 8-bit quantization (near-full precision)
+ - `sage-f16.gguf` - Float16 (half precision, unquantized)
+
+ ### Using GGUF with llama.cpp
+ ```bash
+ # Build llama.cpp
+ git clone https://github.com/ggerganov/llama.cpp
+ cd llama.cpp
+ make
+
+ # Run the model
+ # (recent llama.cpp releases build with CMake and name this binary llama-cli)
+ ./main -m sage-q4_0.gguf -p "Hello" -n 30
+ ```
+
+ ## 📈 Performance & Limitations
+
+ ### Intended Use
+ - Educational demonstrations of transformer architectures
+ - Character-level language modeling experiments
+ - Prototyping and testing custom model pipelines
+ - Learning about model deployment on Hugging Face
+
+ ### Limitations
+ - Character-level vocabulary of 40 tokens, which limits coherence
+ - Very limited training data (10 examples)
+ - Small model size (~3.2M parameters)
+ - Not suitable for production NLP applications
+ - Best for short text generation (<50 tokens); the context window is 64 tokens
+
+ ### Bias & Ethics
+ As a small educational model trained on a handful of curated sentences:
+ - Minimal harmful bias is expected
+ - It should not be used for decision-making applications
+ - Outputs should be reviewed for appropriateness
+ - The model reflects the patterns in its limited training data
+
+ ## 📝 Citation
+
+ ```bibtex
+ @misc{sage_model_2026,
+   author = {itriedcoding},
+   title = {Sage: Custom Character-Level Language Model},
+   year = {2026},
+   publisher = {Hugging Face},
+   journal = {Hugging Face Model Hub},
+   doi = {10.57967/hf/0000},
+   url = {https://huggingface.co/itriedcoding/Sage}
+ }
+ ```
+
+ ## 🔄 Training Reproducibility
+
+ To reproduce this model:
+ 1. Clone this repository
+ 2. Install requirements: `pip install torch torchvision torchaudio pandas`
+ 3. Run training: `python train_model.py` (the script is not part of this commit)
+ 4. The model will be saved as `custom_llm_model.pth`
+
+ ## 📞 Contact
+
+ For questions or collaboration opportunities:
+ - Hugging Face: https://huggingface.co/itriedcoding
+ - Model issues: use the "Community" tab on this model page
__init__.py ADDED
@@ -0,0 +1,3 @@
+ from .modeling_transformer_lm import TransformerLM, TransformerLMConfig
+
+ __all__ = ["TransformerLM", "TransformerLMConfig"]
config.json ADDED
@@ -0,0 +1,14 @@
+ {
+   "architectures": ["TransformerLM"],
+   "model_type": "transformer_lm",
+   "vocab_size": 40,
+   "hidden_size": 256,
+   "num_hidden_layers": 4,
+   "num_attention_heads": 8,
+   "intermediate_size": 1024,
+   "max_position_embeddings": 64,
+   "pad_token_id": 0,
+   "bos_token_id": 1,
+   "eos_token_id": 2,
+   "torch_dtype": "float32"
+ }
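A config like this can be sanity-checked before any weights are loaded. The sketch below copies the numeric fields from the JSON above and applies two checks that any multi-head transformer config should pass; the helper name `check_config` is hypothetical and not part of this repository.

```python
import json

# Numeric fields copied from config.json above
config = json.loads("""
{
  "vocab_size": 40,
  "hidden_size": 256,
  "num_hidden_layers": 4,
  "num_attention_heads": 8,
  "intermediate_size": 1024,
  "max_position_embeddings": 64,
  "pad_token_id": 0,
  "bos_token_id": 1,
  "eos_token_id": 2
}
""")

def check_config(cfg):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    # Multi-head attention splits hidden_size evenly across the heads
    if cfg["hidden_size"] % cfg["num_attention_heads"] != 0:
        problems.append("hidden_size must be divisible by num_attention_heads")
    # Every special token ID must index a row of the embedding table
    for key in ("pad_token_id", "bos_token_id", "eos_token_id"):
        if not 0 <= cfg[key] < cfg["vocab_size"]:
            problems.append(f"{key} is outside the vocabulary")
    return problems

head_dim = config["hidden_size"] // config["num_attention_heads"]
problems = check_config(config)
print(head_dim, problems)  # prints: 32 []
```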
modeling_transformer_lm.py ADDED
@@ -0,0 +1,109 @@
+ import math
+
+ import torch
+ import torch.nn as nn
+ from transformers import GenerationMixin, PretrainedConfig, PreTrainedModel
+ from transformers.modeling_outputs import CausalLMOutput
+
+
+ class TransformerLMConfig(PretrainedConfig):
+     """Hyperparameters for the character-level TransformerLM."""
+
+     model_type = "transformer_lm"
+
+     def __init__(
+         self,
+         vocab_size=40,
+         hidden_size=256,
+         num_hidden_layers=4,
+         num_attention_heads=8,
+         intermediate_size=1024,
+         max_position_embeddings=64,
+         pad_token_id=0,
+         bos_token_id=1,
+         eos_token_id=2,
+         **kwargs,
+     ):
+         super().__init__(
+             pad_token_id=pad_token_id,
+             bos_token_id=bos_token_id,
+             eos_token_id=eos_token_id,
+             **kwargs,
+         )
+
+         self.vocab_size = vocab_size
+         self.hidden_size = hidden_size
+         self.num_hidden_layers = num_hidden_layers
+         self.num_attention_heads = num_attention_heads
+         self.intermediate_size = intermediate_size
+         self.max_position_embeddings = max_position_embeddings
+
+
+ # GenerationMixin is listed explicitly because recent transformers releases
+ # no longer provide generate() through PreTrainedModel alone.
+ class TransformerLM(PreTrainedModel, GenerationMixin):
+     config_class = TransformerLMConfig
+
+     def __init__(self, config):
+         super().__init__(config)  # also stores config as self.config
+
+         self.embedding = nn.Embedding(config.vocab_size, config.hidden_size)
+         self.pos_embedding = nn.Embedding(config.max_position_embeddings, config.hidden_size)
+
+         encoder_layer = nn.TransformerEncoderLayer(
+             d_model=config.hidden_size,
+             nhead=config.num_attention_heads,
+             dim_feedforward=config.intermediate_size,
+             batch_first=True,
+         )
+         self.transformer_encoder = nn.TransformerEncoder(encoder_layer, num_layers=config.num_hidden_layers)
+         self.output_layer = nn.Linear(config.hidden_size, config.vocab_size)
+
+         self.max_position_embeddings = config.max_position_embeddings
+
+     def forward(self, input_ids, attention_mask=None, labels=None, **kwargs):
+         # **kwargs absorbs extra arguments that generate() may pass along
+         seq_len = input_ids.size(1)
+         pos = torch.arange(0, seq_len, device=input_ids.device).unsqueeze(0)
+
+         # Token embedding (scaled by sqrt(hidden_size)) plus learned positional embedding
+         src_emb = self.embedding(input_ids) * math.sqrt(self.config.hidden_size)
+         src_emb = src_emb + self.pos_embedding(pos)
+
+         # Causal mask: True marks positions that may NOT be attended to, so each
+         # token sees only itself and earlier tokens, as a decoder-only model requires.
+         causal_mask = torch.triu(
+             torch.ones(seq_len, seq_len, dtype=torch.bool, device=input_ids.device),
+             diagonal=1,
+         )
+
+         # Key padding mask: PyTorch expects True for padded positions, the
+         # inverse of the Hugging Face attention_mask convention.
+         if attention_mask is not None:
+             src_key_padding_mask = ~attention_mask.bool()
+         else:
+             src_key_padding_mask = None
+
+         # Transformer encoder stack
+         output = self.transformer_encoder(
+             src_emb, mask=causal_mask, src_key_padding_mask=src_key_padding_mask
+         )
+
+         # Output projection to vocabulary logits
+         logits = self.output_layer(output)
+
+         loss = None
+         if labels is not None:
+             # Shift so that tokens < n predict n
+             shift_logits = logits[..., :-1, :].contiguous()
+             shift_labels = labels[..., 1:].contiguous()
+             loss_fct = nn.CrossEntropyLoss()
+             loss = loss_fct(shift_logits.view(-1, shift_logits.size(-1)), shift_labels.view(-1))
+
+         # ModelOutput supports both outputs["logits"] and outputs.logits
+         return CausalLMOutput(loss=loss, logits=logits)
+
+     def prepare_inputs_for_generation(self, input_ids, attention_mask=None, **kwargs):
+         # The model keeps no key/value cache, so the full sequence is re-encoded
+         # at every step. Only arguments that forward() uses are returned.
+         return {
+             "input_ids": input_ids,
+             "attention_mask": attention_mask,
+         }
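The loss in `forward` scores the logits at position i against the token at position i + 1. For that objective to be meaningful, attention at position i should see only positions up to i. A plain-Python sketch of both ideas follows; the function names are hypothetical, and nested lists stand in for tensors.

```python
def causal_mask(seq_len):
    """mask[i][j] is True when position i must NOT attend to position j."""
    return [[j > i for j in range(seq_len)] for i in range(seq_len)]

def shift_for_next_token(token_ids):
    """Pair each input token with the token the model should predict after it."""
    return list(zip(token_ids[:-1], token_ids[1:]))

mask = causal_mask(3)
pairs = shift_for_next_token([5, 9, 7, 2])

for row in mask:
    print(row)
# [False, True, True]
# [False, False, True]
# [False, False, False]
print(pairs)  # [(5, 9), (9, 7), (7, 2)]
```

Without such a mask, position i can attend to token i + 1 directly, so the training loss can be driven down by copying rather than by prediction.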
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:accd9d82bd55ee686643f9e889f53e3d9938197f30fea126df1b596090c70382
+ size 12805265
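The three lines above are a Git LFS pointer: the repository stores this small text file, while the actual weights live in LFS storage under the given SHA-256. As a rough consistency check, the sketch below parses the pointer and divides the byte size by four (float32) to estimate the parameter count. The function name `parse_lfs_pointer` is hypothetical, and the estimate ignores serialization overhead.

```python
def parse_lfs_pointer(text):
    """Parse the `key value` lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }

pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:accd9d82bd55ee686643f9e889f53e3d9938197f30fea126df1b596090c70382\n"
    "size 12805265\n"
)

# Upper bound on the float32 parameter count: 4 bytes per parameter,
# with no allowance for pickle or zip overhead in the checkpoint file.
approx_float32_params = pointer["size"] // 4
print(approx_float32_params)  # prints 3201316
```

The result is close to 3.2 million, which matches a float32 checkpoint of a model with roughly 3.2M parameters.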