natalieparker committed (verified)
Commit: b13b598
1 Parent(s): 60eeb7c

Upload 7 files
README.md CHANGED
@@ -1,3 +1,207 @@
  ---
- license: apache-2.0
  ---
+ LumaAI-160M-v3
+
+ Author: Natalie Parker (Phoenix Cameron)
+ License: Apache-2.0
+ Status: Production-ready checkpoint
+ Model Size: 160M parameters
+ Format: safetensors
+
+
  ---
+
+ 🧬 Overview
+
+ LumaAI-160M-v3 is a fully independent, original language model created, trained, and fine-tuned from scratch by Natalie Parker.
+ It is not based on, not derived from, and not affiliated with any company model (OpenAI, Meta, Google, Mistral, Anthropic, etc.).
+
+ LumaAI-160M-v3 was trained in three stages:
+
+ Base training (“Leg 2”) on a large, diverse dataset
+
+ LoRA fine-tuning (“Leg 3”) on carefully curated hybrid conversational and creative datasets
+
+ Final weight merging into a single unified model (this version)
+
+
+ The result is a compact but powerful 160M-parameter model with strong personality consistency, emotional nuance, creativity, and contextual reasoning.
+
+
  ---
+
+ 🧠 Key Features
+
+ ⭐ 1. Original Architecture
+
+ This model is not a modification of any corporate LLM.
+ It was trained independently, using original datasets, a tokenizer, and an architecture produced entirely by the creator.
+
+ ⭐ 2. 160M Efficient Size
+
+ Small enough to run on:
+
+ Phones
+
+ Low-VRAM GPUs (4–6 GB)
+
+ CPU inference
+
+ Edge devices
+
+
+ All while maintaining surprisingly strong conversational performance (a minimal CPU-loading sketch follows below).
+
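+ As an illustration of the CPU path, here is a minimal sketch (not an official recipe): it loads the checkpoint on CPU in float32 and assumes only the standard transformers/torch APIs plus the repository name used in the Usage section below.
+
+ # Hypothetical CPU-only loading sketch; float32 avoids half-precision issues on CPU.
+ import torch
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ repo = "natalieparker/LumaAI-160M-v3"
+ tokenizer = AutoTokenizer.from_pretrained(repo)
+ model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.float32)
+ model.to("cpu").eval()
+
+ # A 160M-parameter model in float32 needs roughly 0.6-0.7 GB of RAM for the weights alone.
+ with torch.no_grad():
+     ids = tokenizer("Hello Luma!", return_tensors="pt")
+     out = model.generate(**ids, max_new_tokens=40, do_sample=True, temperature=0.9)
+ print(tokenizer.decode(out[0], skip_special_tokens=True))
+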
+ ⭐ 3. Custom Personality Training
+
+ LumaAI-160M-v3 has been tuned for:
+
+ Emotional intelligence
+
+ Human-like conversational flow
+
+ Character consistency
+
+ Creative writing
+
+ Psychological depth
+
+ Roleplay stability
+
+
+ ⭐ 4. Merged Final Weights
+
+ No separate adapter or PEFT files are needed.
+ This checkpoint contains the fully fused weights and can be loaded directly like any other full model.
+
+
+ ---
+
+ 📁 Files Included
+
+ config.json
+ generation_config.json
+ model.safetensors
+ special_tokens_map.json
+ tokenizer_config.json
+ tokenizer.json
+
+ These are all you need for inference.
+
+
+ ---
+
+ 🔧 Usage
+
+ Python Example
+
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+
+ model_name = "natalieparker/LumaAI-160M-v3"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_name,
+     torch_dtype="auto"
+ )
+
+ prompt = "Hello Luma, how are you feeling today?"
+
+ inputs = tokenizer(prompt, return_tensors="pt")
+ outputs = model.generate(
+     **inputs,
+     max_new_tokens=120,
+     do_sample=True,
+     temperature=0.9,
+     top_p=0.9
+ )
+
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+
+
+ ---
+
+ Text Generation Settings (Recommended)
+
+ | Setting | Value |
+ | --- | --- |
+ | max_new_tokens | 80–200 |
+ | temperature | 0.8–1.1 |
+ | top_p | 0.8–0.95 |
+ | repetition_penalty | 1.05 |
+
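+ As a quick illustration, here is a minimal sketch applying the recommended settings above (values picked from within each range; model, tokenizer, and inputs are the objects from the Python example earlier):
+
+ # Hypothetical example: recommended sampling settings from the table above.
+ outputs = model.generate(
+     **inputs,
+     max_new_tokens=150,
+     do_sample=True,          # sampling must be enabled for temperature/top_p to take effect
+     temperature=0.9,
+     top_p=0.9,
+     repetition_penalty=1.05,
+ )
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+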
+ ---
+
+ 📚 Training Summary
+
+ Base Model (“Leg 2”)
+
+ Restored checkpoint:
+ luma_160m_safe/checkpoint-4000
+
+ Fine-Tuning (“Leg 3”)
+
+ Hardware: NVIDIA P100
+
+ Steps completed: 16,000 LoRA steps
+
+ Dataset: Hybrid uncensored conversational/creative corpus
+
+ Gradient accumulation adapted for training on small GPUs
+
+
+ Final Merge
+
+ The LoRA adapter was fused into the base model using merge_and_unload() (a sketch of this step follows below).
+
+ Result: a single, consolidated model file (model.safetensors).
+
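+ For context on the merge step, here is a minimal sketch of how a LoRA adapter is typically fused with PEFT's merge_and_unload(); the paths are hypothetical placeholders rather than the actual training directories:
+
+ # Hypothetical merge sketch using the peft library.
+ from transformers import AutoModelForCausalLM
+ from peft import PeftModel
+
+ base = AutoModelForCausalLM.from_pretrained("path/to/base_checkpoint")   # e.g. the Leg 2 checkpoint
+ model = PeftModel.from_pretrained(base, "path/to/lora_adapter")          # load the Leg 3 LoRA adapter
+ merged = model.merge_and_unload()                                        # fuse LoRA weights into the base model
+ merged.save_pretrained("path/to/merged_model", safe_serialization=True)  # writes model.safetensors
+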
+ ---
+
+ ⚠️ Safety & Limitations
+
+ This model is:
+
+ An experimental research model
+
+ Created independently by one developer
+
+ Not aligned or RLHF-filtered like corporate models
+
+ Not a replacement for professional medical, financial, or legal advice
+
+
+ Users must apply their own safety layers before production deployment.
+
+
+ ---
+
+ ❤️ Credits
+
+ Created with passion, experimentation, and continuous improvement by:
+
+ Phoenix Cameron / Natalie Parker
+
+ Special thanks to:
+
+ The Kaggle compute ecosystem
+
+ The open-source ML community
+
+ Everyone who builds their own AI instead of relying on corporations
+
+
+ ---
+
+ 📦 Cite
+
+ If you reference or build upon this model:
+
+ @misc{lumaai160mv3,
+   author       = {Natalie Parker},
+   title        = {LumaAI-160M-v3: Original lightweight model},
+   year         = {2025},
+   howpublished = {HuggingFace Model Repository},
+   url          = {https://huggingface.co/natalieparker/LumaAI-160M-v3}
+ }
config.json ADDED
@@ -0,0 +1,32 @@
+ {
+   "activation_function": "gelu_new",
+   "architectures": [
+     "GPT2LMHeadModel"
+   ],
+   "attn_pdrop": 0.1,
+   "bos_token_id": 50256,
+   "embd_pdrop": 0.1,
+   "eos_token_id": 50256,
+   "initializer_range": 0.02,
+   "layer_norm_epsilon": 1e-05,
+   "model_type": "gpt2",
+   "n_ctx": 512,
+   "n_embd": 768,
+   "n_head": 12,
+   "n_inner": null,
+   "n_layer": 12,
+   "n_positions": 512,
+   "reorder_and_upcast_attn": false,
+   "resid_pdrop": 0.1,
+   "scale_attn_by_inverse_layer_idx": false,
+   "scale_attn_weights": true,
+   "summary_activation": null,
+   "summary_first_dropout": 0.1,
+   "summary_proj_to_labels": true,
+   "summary_type": "cls_index",
+   "summary_use_proj": true,
+   "torch_dtype": "float16",
+   "transformers_version": "4.53.3",
+   "use_cache": false,
+   "vocab_size": 32000
+ }
generation_config.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "_from_model_config": true,
+   "bos_token_id": 50256,
+   "eos_token_id": 50256,
+   "transformers_version": "4.53.3",
+   "use_cache": false
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2f06f3a3a9eb74c6daba30f50ad3bc7c273f9249e1efe676a2f60c8ac237e766
+ size 220065320
special_tokens_map.json ADDED
@@ -0,0 +1,30 @@
+ {
+   "bos_token": {
+     "content": "<bos>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "<eos>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "<unk>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,44 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "<bos>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "<eos>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<bos>",
+   "clean_up_tokenization_spaces": false,
+   "eos_token": "<eos>",
+   "extra_special_tokens": {},
+   "model_max_length": 1000000000000000019884624838656,
+   "pad_token": "<pad>",
+   "tokenizer_class": "PreTrainedTokenizerFast",
+   "unk_token": "<unk>"
+ }