Lamapi committed
Commit 731fe15 · verified · 1 Parent(s): 5500fe1

Update README.md

  1. README.md +287 -7
README.md CHANGED
@@ -1,7 +1,287 @@
- ---
- license: mit
- tags:
- - unsloth
- - trl
- - sft
- ---
+ ---
+ language: tr
+ license: mit
+ tags:
+ - turkish
+ - türkiye
+ - english
+ - ai
+ - lamapi
+ - gemma3
+ - next
+ - next-x1
+ - efficient
+ - text-generation
+ - open-source
+ - 1b
+ - huggingface
+ - large-language-model
+ - llm
+ - causal
+ - transformer
+ - artificial-intelligence
+ - machine-learning
+ - ai-research
+ - natural-language-processing
+ - nlp
+ - finetuned
+ - lightweight
+ - creative
+ - summarization
+ - question-answering
+ - chat-model
+ - generative-ai
+ - optimized-model
+ - unsloth
+ - trl
+ - sft
+ - chemistry
+ - biology
+ - finance
+ - legal
+ - music
+ - art
+ - code
+ - climate
+ - medical
+ - agent
+ - text-generation-inference
+ pipeline_tag: text-generation
+ ---
+
+ <img src='assets/banner.png'>
+
+ # 🚀 Next-1B (t322)
+
+ ### *Lightweight, Efficient, and Türkiye-Focused AI*
+
+ [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
+ [![Language: Multilingual](https://img.shields.io/badge/Language-Multilingual-red.svg)]()
+ [![HuggingFace](https://img.shields.io/badge/🤗-Lamapi/Next--1B-orange.svg)](https://huggingface.co/Lamapi/next-1b)
+
+ ---
+
+ <style>
+ table { width:fit-content; border-collapse:separate; border-spacing:0 3px;font-family:system-ui, -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Open Sans', 'Helvetica Neue', sans-serif;background:rgba(15,22,32,0.4);border-radius:16px;padding: 10px; border:none;transition:.2s all ease;}
+ thead th { text-align:center; padding:4px 10px; font-size:13px; text-transform:uppercase; color:rgb(200,200,200);border:none; }
+ tbody tr { transition: transform 0.18s ease, box-shadow 0.18s ease; border:none !important;transition:.2s all ease;border-radius:16px;background:rgba(0, 0, 0, 0.38);}
+ tbody .turkish:hover {box-shadow:0 6px 15px rgba(0, 0, 0, 0.27);scale:1.01;background:rgba(80, 38, 38, 0.6);}
+ tbody .next:hover {box-shadow:0 6px 15px rgba(0, 0, 0, 0.27);scale:1.02;background: rgba(0, 59, 225, 1)}
+ tbody tr:hover { box-shadow:0 0px 15px rgba(102, 102, 102, 0.13); background:rgba(139, 139, 139, 0.16)}
+ td { padding:8px 10px;border:0px transparent !important;outline:transparent !important; text-align:center; }
+ td:first-child { font-weight:600;text-align:left }
+ /* tbody .turkish td { background: rgba(255, 0, 0, 0.2) !important; color:rgb(200,200,200); font-weight:400;border:0px !important; scale:1.0; } */
+ /* tbody .next td { background: rgba(0, 89, 255, 0.49)!important; color:rgb(200,200,200); font-weight:600;border:0px !important; scale:1.00;outline:none;border:none !important;} */
+ .next{
+ background: rgba(0, 89, 255, 0.49);
+ }
+ .turkish{
+ background:rgba(51, 34, 34, 0.64);
+ }
+ tbody tr td:first-child { border-top-left-radius:12px; border-bottom-left-radius:12px; }
+ tbody tr td:last-child { border-top-right-radius:12px; border-bottom-right-radius:12px; } strong{
+ font-size:16px;font-weight:700;
+ }
+ em{opacity:0.7;font-size:11px !important;}
+ </style>
+ ## 📖 Overview
+
+ **Next-1B** is a **1-billion parameter causal language model** based on **Gemma 3**, designed for **efficiency, low-resource deployment, and reasoning-focused natural language understanding**.
+
+ Key highlights:
+
+ * Extremely **lightweight**: runs on consumer GPUs with low VRAM.
+ * Optimized for **text reasoning, summarization, and creative generation**.
+ * Supports **Turkish natively** while remaining multilingual.
+ * Open-source and transparent for research and applications.
+
+ Ideal for **developers, students, and organizations** needing **fast, reliable, and low-resource text generation**.
+
+ ---
+
+ # Our Next 270M, Next 1B, and Next 4B models lead the other tiny models on these benchmarks.
+
+ <table>
+ <thead>
+ <tr>
+ <th>Model</th>
+ <th>MMLU (5-shot) %</th>
+ <th>MMLU-Pro %</th>
+ <th>GSM8K %</th>
+ <th>MATH %</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr class="next">
+ <td data-label="Model">Next 4B preview <em>Version s325</em></td>
+ <td data-label="MMLU (5-shot) %">84.6</td>
+ <td data-label="MMLU-Pro %">66.9</td>
+ <td data-label="GSM8K %">82.7</td>
+ <td data-label="MATH %"><strong>70.5</strong></td>
+ </tr>
+ <tr class="next">
+ <td data-label="Model">Next 1B <em>Version t327</em></td>
+ <td data-label="MMLU (5-shot) %"><strong>87.3</strong></td>
+ <td data-label="MMLU-Pro %"><strong>69.2</strong></td>
+ <td data-label="GSM8K %"><strong>90.5</strong></td>
+ <td data-label="MATH %">70.1</td>
+ </tr>
+ <tr>
+ <td data-label="Model">Qwen 3 0.6B</td>
+ <td data-label="MMLU (5-shot) %">52.81</td>
+ <td data-label="MMLU-Pro %">37.6</td>
+ <td data-label="GSM8K %">60.7</td>
+ <td data-label="MATH %">20.5</td>
+ </tr>
+ <tr>
+ <td data-label="Model">Llama 3.2 1B</td>
+ <td data-label="MMLU (5-shot) %">49.3</td>
+ <td data-label="MMLU-Pro %">44.4</td>
+ <td data-label="GSM8K %">11.9</td>
+ <td data-label="MATH %">30.6</td>
+ </tr>
+ <tr class="turkish">
+ <td data-label="Model">Kumru 7B <em>not verified</em></td>
+ <td data-label="MMLU (5-shot) %">30.7</td>
+ <td data-label="MMLU-Pro %">28.6</td>
+ <td data-label="GSM8K %">15.38</td>
+ <td data-label="MATH %">6.4</td>
+ </tr>
+ </tbody>
+ </table>
+
+ ---
+
+ # Our Next Z1 model also leads state-of-the-art models on some of the benchmarks.
+ <table>
+ <thead>
+ <tr>
+ <th>Model</th>
+ <th>MMLU (5-shot) %</th>
+ <th>MMLU-Pro %</th>
+ <th>GSM8K %</th>
+ <th>MATH %</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr class="next">
+ <td data-label="Model">Next Z1 <em>Version l294</em></td>
+ <td data-label="MMLU (5-shot) %"><strong>97.3</strong></td>
+ <td data-label="MMLU-Pro %"><strong>94.2</strong></td>
+ <td data-label="GSM8K %">97.7</td>
+ <td data-label="MATH %">93.2</td>
+ </tr>
+ <tr class="next">
+ <td data-label="Model">Next Z1 <em>Version l294</em> (no tool)</td>
+ <td data-label="MMLU (5-shot) %">94.7</td>
+ <td data-label="MMLU-Pro %">90.1</td>
+ <td data-label="GSM8K %">94.5</td>
+ <td data-label="MATH %">88.7</td>
+ </tr>
+ <tr>
+ <td data-label="Model">GPT 5</td>
+ <td data-label="MMLU (5-shot) %">92.5</td>
+ <td data-label="MMLU-Pro %">87.0</td>
+ <td data-label="GSM8K %"><strong>98.4</strong></td>
+ <td data-label="MATH %"><strong>96.0</strong></td>
+ </tr>
+ <tr>
+ <td data-label="Model">Claude Opus 4.1 (Thinking)</td>
+ <td data-label="MMLU (5-shot) %">~92.0</td>
+ <td data-label="MMLU-Pro %">87.8</td>
+ <td data-label="GSM8K %">84.7</td>
+ <td data-label="MATH %">95.4</td>
+ </tr>
+ </tbody>
+ </table>
+
+ ---
+
+ ## 🎯 Goals
+
+ 1. **Lightweight Efficiency:** Run smoothly on low-resource devices.
+ 2. **Reasoning-Focused:** Provide logical and coherent text outputs.
+ 3. **Accessibility:** Fully open-source with clear documentation.
+ 4. **Multilingual Adaptability:** Turkish-focused but supports other languages.
+
+ ---
+
+ ## ✨ Key Features
+
+ | Feature | Description |
+ | --------------------------- | --------------------------------------------------------------------- |
+ | 🔋 Lightweight Architecture | Optimized for low VRAM usage; ideal for small GPUs or CPU deployment. |
+ | 🇹🇷 Turkish & Multilingual | Handles complex Turkish prompts accurately. |
+ | 🧠 Reasoning Capabilities | Logical chain-of-thought for question-answering and problem-solving. |
+ | 📊 Consistent Outputs | Reliable and reproducible results across multiple runs. |
+ | 🌍 Open Source | Transparent, research-friendly, and community-driven. |
+
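
The "Consistent Outputs" row does not say how reproducibility is obtained. One common way to get repeatable text from a causal language model is greedy decoding (sampling disabled) together with a fixed random seed. The sketch below assumes that approach; the helper names `deterministic_generation_kwargs` and `generate_deterministically` are hypothetical and not part of this model's tooling.

```python
from typing import Any, Dict


def deterministic_generation_kwargs(max_new_tokens: int = 50) -> Dict[str, Any]:
    """Keyword arguments for generate() that disable sampling (greedy decoding)."""
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be positive")
    return {"do_sample": False, "max_new_tokens": max_new_tokens}


def generate_deterministically(model, inputs, seed: int = 0, **overrides):
    """Seed the random number generators, then generate with greedy decoding.

    `model` and `inputs` are assumed to come from the usual transformers
    from_pretrained / tokenizer calls.
    """
    # Imported lazily so the kwargs helper works without transformers installed.
    from transformers import set_seed

    set_seed(seed)
    kwargs = {**deterministic_generation_kwargs(), **overrides}
    return model.generate(**inputs, **kwargs)
```

Even with these settings, identical outputs across different hardware or library versions are not guaranteed.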
+ ---
+
+ ## 📐 Model Specifications
+
+ | Specification | Details |
+ | ------------------ | ---------------------------------------------------------------------- |
+ | Base Model | Gemma 3 |
+ | Parameter Count | 1 Billion |
+ | Architecture | Transformer, causal LLM |
+ | Fine-Tuning Method | Instruction fine-tuning (SFT) with Turkish and multilingual datasets |
+ | Optimizations | Quantization-ready (q8, f16, f32) |
+ | Use Cases | Text generation, summarization, Q&A, creative writing, reasoning tasks |
+
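
To make the precision options in the table concrete, here is a rough, weights-only memory estimate. It assumes 4, 2, and 1 bytes per parameter for f32, f16, and q8 respectively, and it ignores activations, the KV cache, and quantization metadata. `weight_memory_bytes` is a hypothetical helper, not part of any library, and the parameter count in the demo is illustrative.

```python
# Approximate storage cost per parameter for each precision (assumption:
# q8 is treated as exactly one byte per weight, ignoring scale metadata).
BYTES_PER_PARAM = {"f32": 4, "f16": 2, "q8": 1}


def weight_memory_bytes(num_params: int, precision: str) -> int:
    """Return the approximate number of bytes needed to hold the weights alone."""
    if precision not in BYTES_PER_PARAM:
        raise ValueError(f"unknown precision: {precision!r}")
    return num_params * BYTES_PER_PARAM[precision]


if __name__ == "__main__":
    # Illustrative parameter count only; substitute the model's real size.
    for precision in ("f32", "f16", "q8"):
        gib = weight_memory_bytes(1_000_000_000, precision) / 2**30
        print(f"{precision}: ~{gib:.2f} GiB")
```

Actual memory use at inference time will be higher than this estimate, since it covers the weights only.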
+ ---
+
+ ## 🚀 Installation & Usage
+
+ ### Use the model:
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ import torch
+
+ model_id = "Lamapi/next-1b"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(model_id)
+
+ # Chat messages: a system instruction followed by the user's turn
+ messages = [
+     {"role": "system", "content": "You are Next-X1, a smart and concise AI assistant trained by Lamapi. Always respond in the user's language. Proudly made in Turkey."},
+     {"role": "user", "content": "Hello, how are you?"}
+ ]
+
+ # Render the messages with the tokenizer's chat template, then tokenize
+ prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ inputs = tokenizer(prompt, return_tensors="pt")
+
+ # Generate up to 50 new tokens and decode the full sequence (prompt included)
+ output = model.generate(**inputs, max_new_tokens=50)
+ print(tokenizer.decode(output[0], skip_special_tokens=True))
+ ```
+
+ <div style='width:700px;'>
+ <div style='background-color:rgba(0,140,255,0.5);border-radius:16px;border-bottom-right-radius:0px;padding:3px 10px;width:fit-content;max-width:400px;margin-left:250px;margin-top:-15px;margin-bottom:10px;'>
+ Hello, how are you?
+ </div>
+ <div style='background-color:rgba(42,42,40,0.7);border-radius:16px;border-bottom-left-radius:0px;padding:3px 10px;width:fit-content;max-width:400px;'>
+ I'm fine, thank you. How are you?
+ </div>
+ </div>
+
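
The usage snippet decodes `output[0]` in full, so the printed text contains the prompt as well as the reply. A minimal sketch of decoding only the newly generated tokens follows. `new_tokens_only` and `generate_reply` are hypothetical helper names, and the sketch assumes a transformers version whose `apply_chat_template` accepts `return_dict=True`.

```python
from typing import List, Sequence


def new_tokens_only(output_ids: Sequence[int], prompt_length: int) -> List[int]:
    """Drop the echoed prompt tokens from a generate() result."""
    return list(output_ids[prompt_length:])


def generate_reply(model_id: str, messages: list, max_new_tokens: int = 50) -> str:
    """Load the model, generate a reply to `messages`, and return only the reply text."""
    # Imported lazily so new_tokens_only can be used without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Tokenize straight from the chat template; return_dict=True yields
    # input_ids and attention_mask in a form generate() accepts.
    inputs = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        return_dict=True,
        return_tensors="pt",
    )
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # The prompt occupies the first input_ids.shape[1] positions of the output.
    prompt_length = inputs["input_ids"].shape[1]
    reply_ids = new_tokens_only(output[0].tolist(), prompt_length)
    return tokenizer.decode(reply_ids, skip_special_tokens=True)
```

Calling `generate_reply(model_id, messages)` with the values from the snippet would return just the assistant's text, which is usually what a chat interface needs.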
+ ---
+
+ ## 📄 License
+
+ MIT License: free to use, modify, and distribute. Attribution is appreciated.
+
+ ---
+
+ ## 📞 Contact & Support
+
+ * 📧 **Email:** [lamapicontact@gmail.com](mailto:lamapicontact@gmail.com)
+ * 🤗 **HuggingFace:** [Lamapi](https://huggingface.co/Lamapi)
+
+ ---
+
+ > **Next-1B**: lightweight, **efficient, and reasoning-focused**, bringing **Turkey’s AI forward** on low-resource hardware.
+
+ [![Follow on HuggingFace](https://img.shields.io/badge/Follow-HuggingFace-yellow?logo=huggingface)](https://huggingface.co/Lamapi)