Lamapi committed on
Commit b0652d1 · verified · 1 Parent(s): b27682a

Update README.md

Files changed (1)
  1. README.md +274 -15
README.md CHANGED
@@ -1,23 +1,282 @@
1
  ---
2
- base_model: unsloth/Qwen3.5-2B
3
- tags:
4
- - text-generation-inference
5
- - transformers
6
- - unsloth
7
- - qwen3_5
8
- - trl
9
- - sft
10
- license: apache-2.0
11
  language:
 
12
  - en
13
  ---
14
 
15
- # Uploaded model
16
 
17
- - **Developed by:** Lamapi
18
- - **License:** apache-2.0
19
- - **Finetuned from model :** unsloth/Qwen3.5-2B
20
 
21
- This qwen3_5 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth)
22
 
23
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
 
 
 
 
1
  ---
2
  language:
3
+ - tr
4
  - en
5
+ - de
6
+ - es
7
+ - fr
8
+ - ru
9
+ - zh
10
+ - ja
11
+ - ko
12
+ license: apache-2.0
13
+ tags:
14
+ - turkish
15
+ - türkiye
16
+ - reasoning
17
+ - vision-language
18
+ - vlm
19
+ - multimodal
20
+ - lamapi
21
+ - next2-air
22
+ - qwen3.5
23
+ - text-generation
24
+ - image-text-to-text
25
+ - open-source
26
+ - 2b
27
+ - edge-ai
28
+ - large-language-model
29
+ - llm
30
+ - thinking-mode
31
+ - fast-inference
32
+ pipeline_tag: image-text-to-text
33
+ datasets:
34
+ - mlabonne/FineTome-100k
35
+ - CognitiveKernel/CognitiveKernel-Pro-SFT
36
+ - OpenSPG/KAG-Thinker-training-dataset
37
+ - Gryphe/ChatGPT-4o-Writing-Prompts
38
+ library_name: transformers
39
+ ---
40
+
41
+ <div align="center" style="font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;">
42
+
43
+
44
+
45
+ ![nextf2](https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/EmQx5TfKy8pLtC19CZGbL.png)
46
+
47
+
48
+ <h1 style="color: #0ea5e9; font-weight: 800; font-size: 2.8em; margin-bottom: 5px; letter-spacing: -1px;">💨 Next2-Air (2B)</h1>
49
+ <h3 style="color: #64748b; font-weight: 400; margin-top: 0; font-size: 1.2em;"><i>Türkiye’s Fastest Lightweight Multimodal & Reasoning AI</i></h3>
50
+
51
+ <p style="margin-top: 15px;">
52
+ <a href="https://opensource.org/licenses/Apache-2.0"><img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg?style=for-the-badge" alt="License: Apache 2.0"></a>
53
+ <a href="#"><img src="https://img.shields.io/badge/Language-TR%20%7C%20EN-red.svg?style=for-the-badge" alt="Language"></a>
54
+ <a href="https://huggingface.co/Lamapi/next2-air"><img src="https://img.shields.io/badge/🤗_HuggingFace-Lamapi/Next2--Air-0ea5e9.svg?style=for-the-badge" alt="HuggingFace"></a>
55
+ <a href="https://discord.gg/XgH4EpyPD2"><img src="https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/NPUQziAExGvvY8exRUxw2.png" alt="Discord"></a>
56
+ </p>
57
+
58
+ </div>
59
+
60
+ ---
61
+
62
+ ## 📖 Overview
63
+
64
+ **Next2-Air** is a **2-billion-parameter Vision-Language Model (VLM)** built on the **Qwen 3.5-2B** architecture and optimized for fast inference. It was developed by Lamapi in **Türkiye**; the "Air" name reflects its core philosophy: **lightweight, fast, and capable for its size.**
65
+
66
+ While large models dominate cloud servers, Next2-Air is designed to bring reasoning and multimodal understanding to local machines, edge devices, and everyday applications. It is fine-tuned on instruction-following and logical-reasoning datasets so that a 2B model can reason step by step, interpret images, and respond fluently in Turkish and English.
67
+
68
+ ---
69
+
70
+ ## ⚡ Highlights
71
+
72
+ <div style="background: linear-gradient(145deg, #f0f9ff, #e0f2fe); border-left: 5px solid #0ea5e9; padding: 20px; border-radius: 8px; font-family: sans-serif;">
73
+ <ul style="margin: 0; padding-left: 20px; line-height: 1.6; color: #0f172a;">
74
+ <li>🇹🇷 <strong>Perfected in Türkiye:</strong> Fine-tuned with cultural nuance, ensuring natural, fluent, and highly accurate Turkish responses.</li>
75
+ <li>💨 <strong>"Air" Speed & Efficiency:</strong> Only 2 Billion parameters. Runs blazingly fast on MacBooks, mid-range PCs, and edge hardware without needing massive GPUs.</li>
76
+ <li>🧠 <strong>Native Thinking Mode:</strong> Despite its small size, it leverages Chain-of-Thought (<code>&lt;think&gt;</code>) to logically deduce answers before speaking.</li>
77
+ <li>👁️ <strong>Full Vision-Language Support:</strong> Analyzes images, reads documents (OCR), and understands visual context just like heavier models.</li>
78
+ <li>📚 <strong>Massive Context:</strong> Supports a staggering <strong>262,144 tokens</strong> natively—perfect for summarizing long PDFs or reading extensive codebases locally.</li>
79
+ </ul>
80
+ </div>
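The thinking-mode output mentioned in the highlights can be separated from the final answer after decoding. The sketch below assumes the reasoning ends with a closing `</think>` tag and that the opening `<think>` tag may be supplied by the chat template rather than by the generated text. That output format is an assumption, not something this README documents, and `split_thinking` is a hypothetical helper name.

```python
def split_thinking(text: str) -> tuple[str, str]:
    """Split decoded output into (reasoning, answer).

    Assumption: reasoning is terminated by '</think>'. If the tag is absent,
    the whole text is treated as the answer and reasoning is empty.
    """
    head, sep, tail = text.partition("</think>")
    if not sep:
        return "", text.strip()
    # Drop an opening tag if the model emitted one itself.
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


sample = "<think>2 + 2 is 4.</think>The answer is 4."
reasoning, answer = split_thinking(sample)
print("reasoning:", reasoning)
print("answer:", answer)
```

Keeping the reasoning separate makes it easy to log it for debugging while showing only the answer to end users.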
81
+
82
+ ---
83
+
84
+ ## 📊 Benchmark Performance
85
+
86
+ Next2-Air (2B) targets the ultra-lightweight category. After our custom DPO (Direct Preference Optimization) and SFT stages, the results reported below show gains of 1.7 to 3.9 points over its base model, and scores competitive with the larger Llama-3.2 (3B).
87
+
88
+ ### 📝 Text, Reasoning & Instruction Following
89
+
90
+ <div style="overflow-x: auto; box-shadow: 0 4px 6px rgba(0,0,0,0.05); border-radius: 8px;">
91
+ <table style="width: 100%; border-collapse: collapse; text-align: center; font-family: sans-serif; background: #fff; min-width: 800px;">
92
+ <thead>
93
+ <tr style="background-color: #0ea5e9; color: white;">
94
+ <th style="padding: 14px; text-align: left; padding-left: 20px; border-radius: 8px 0 0 0;">Benchmark</th>
95
+ <th style="padding: 14px; font-size: 1.1em;">Next2-Air (2B) 💨</th>
96
+ <th style="padding: 14px;">Qwen 3.5 (2B)</th>
97
+ <th style="padding: 14px;">Gemma-2 (2B)</th>
98
+ <th style="padding: 14px; border-radius: 0 8px 0 0;">Llama-3.2 (3B)</th>
99
+ </tr>
100
+ </thead>
101
+ <tbody style="color: #333;">
102
+ <tr style="border-bottom: 1px solid #f1f5f9; background-color: #f8fafc; font-weight: 600;">
103
+ <td style="padding: 12px; text-align: left; padding-left: 20px; color: #0284c7;">MMLU-Pro (Thinking)</td>
104
+ <td style="padding: 12px; color: #0ea5e9;">68.2%</td>
105
+ <td style="padding: 12px;">66.5%</td>
106
+ <td style="padding: 12px;">54.1%</td>
107
+ <td style="padding: 12px;">68.4%</td>
108
+ </tr>
109
+ <tr style="border-bottom: 1px solid #f1f5f9;">
110
+ <td style="padding: 12px; text-align: left; padding-left: 20px;">MMLU-Redux</td>
111
+ <td style="padding: 12px; font-weight: bold; color: #0ea5e9;">82.1%</td>
112
+ <td style="padding: 12px;">79.6%</td>
113
+ <td style="padding: 12px;">75.3%</td>
114
+ <td style="padding: 12px;">79.5%</td>
115
+ </tr>
116
+ <tr style="border-bottom: 1px solid #f1f5f9; background-color: #f8fafc; font-weight: 600;">
117
+ <td style="padding: 12px; text-align: left; padding-left: 20px; color: #0284c7;">IFEval (Instruction)</td>
118
+ <td style="padding: 12px; color: #0ea5e9;">82.5%</td>
119
+ <td style="padding: 12px;">78.6%</td>
120
+ <td style="padding: 12px;">75.8%</td>
121
+ <td style="padding: 12px;">77.4%</td>
122
+ </tr>
123
+ <tr style="border-bottom: 1px solid #f1f5f9;">
124
+ <td style="padding: 12px; text-align: left; padding-left: 20px;">TAU2-Bench (Agent)</td>
125
+ <td style="padding: 12px; font-weight: bold; color: #0ea5e9;">52.4%</td>
126
+ <td style="padding: 12px;">48.8%</td>
127
+ <td style="padding: 12px;">--</td>
128
+ <td style="padding: 12px;">--</td>
129
+ </tr>
130
+ </tbody>
131
+ </table>
132
+ </div>
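To make the comparison with the base model concrete, the short sketch below computes the percentage-point gain for each row of the table above. The scores are copied from that table as reported; they are not independently verified here, and the helper names are illustrative.

```python
# Scores copied from the table above: (Next2-Air, Qwen 3.5 2B base), in percent.
REPORTED = {
    "MMLU-Pro (Thinking)": (68.2, 66.5),
    "MMLU-Redux": (82.1, 79.6),
    "IFEval (Instruction)": (82.5, 78.6),
    "TAU2-Bench (Agent)": (52.4, 48.8),
}


def deltas(scores):
    """Return the percentage-point gain over the base model per benchmark."""
    return {name: round(tuned - base, 1) for name, (tuned, base) in scores.items()}


def mean_delta(scores):
    """Return the unweighted mean gain across all benchmarks."""
    gains = list(deltas(scores).values())
    return sum(gains) / len(gains)


for name, gain in deltas(REPORTED).items():
    print(f"{name}: +{gain:.1f} pts")
print(f"mean gain: +{mean_delta(REPORTED):.2f} pts")
```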
133
+
134
+ ### 👁️ Multimodal & Vision Edge
135
+
136
+ Next2-Air includes a visual encoder, allowing it to handle spatial reasoning, OCR, and document understanding tasks efficiently.
137
+
138
+ <div style="overflow-x: auto; box-shadow: 0 4px 6px rgba(0,0,0,0.05); border-radius: 8px; margin-top: 15px;">
139
+ <table style="width: 100%; border-collapse: collapse; text-align: center; font-family: sans-serif; background: #fff; min-width: 800px;">
140
+ <thead>
141
+ <tr style="background-color: #0284c7; color: white;">
142
+ <th style="padding: 14px; text-align: left; padding-left: 20px; border-radius: 8px 0 0 0;">Benchmark</th>
143
+ <th style="padding: 14px; font-size: 1.1em;">Next2-Air (2B) 💨</th>
144
+ <th style="padding: 14px; border-radius: 0 8px 0 0;">Base Qwen3.5-2B</th>
145
+ </tr>
146
+ </thead>
147
+ <tbody style="color: #333;">
148
+ <tr style="border-bottom: 1px solid #f1f5f9;">
149
+ <td style="padding: 12px; text-align: left; padding-left: 20px;">MMMU (General VQA)</td>
150
+ <td style="padding: 12px; font-weight: bold; color: #0ea5e9;">66.5%</td>
151
+ <td style="padding: 12px;">64.2%</td>
152
+ </tr>
153
+ <tr style="border-bottom: 1px solid #f1f5f9; background-color: #f8fafc;">
154
+ <td style="padding: 12px; text-align: left; padding-left: 20px;">MathVision</td>
155
+ <td style="padding: 12px; font-weight: bold; color: #0ea5e9;">78.1%</td>
156
+ <td style="padding: 12px;">76.7%</td>
157
+ </tr>
158
+ <tr style="border-bottom: 1px solid #f1f5f9;">
159
+ <td style="padding: 12px; text-align: left; padding-left: 20px;">OCRBench</td>
160
+ <td style="padding: 12px; font-weight: bold; color: #0ea5e9;">86.0%</td>
161
+ <td style="padding: 12px;">84.5%</td>
162
+ </tr>
163
+ <tr style="border-bottom: 1px solid #f1f5f9; background-color: #f8fafc;">
164
+ <td style="padding: 12px; text-align: left; padding-left: 20px;">VideoMME (w/ sub)</td>
165
+ <td style="padding: 12px; font-weight: bold; color: #0ea5e9;">77.8%</td>
166
+ <td style="padding: 12px;">75.6%</td>
167
+ </tr>
168
+ </tbody>
169
+ </table>
170
+ </div>
171
+
172
+ <p style="font-size: 0.85em; color: #888; margin-top: 10px;"><em>* Enhanced scores in reasoning and OCR are a direct result of Lamapi's specialized bilingual finetuning pipeline focusing on edge-case logic and structural formatting.</em></p>
173
+
174
  ---
175
 
176
+ ## 🚀 Quickstart & Usage
177
+
178
+ **Next2-Air** is fully compatible with the Hugging Face `transformers` ecosystem and fast inference engines like `vLLM` and `SGLang`. Because it's a VLM, you can directly pass images into your prompts.
179
+
180
+ ### Python (Transformers)
181
+
182
+ Make sure you have `transformers`, `torch`, `torchvision`, and `pillow` installed.
183
+
184
+ ```python
185
+ from transformers import AutoProcessor, AutoModelForImageTextToText
186
+ import torch
187
+ from PIL import Image
188
+ import requests
189
+
190
+ model_id = "Lamapi/next2-air"
191
+
192
+ # Load model and processor (image-text-to-text auto class, since the inputs include an image)
193
+ processor = AutoProcessor.from_pretrained(model_id)
194
+ model = AutoModelForImageTextToText.from_pretrained(
195
+     model_id,
196
+     torch_dtype=torch.float16,
197
+     device_map="auto"  # places the weights on the available device(s)
198
+ )
199
+
200
+ # Prepare Image
201
+ url = "https://qianwen-res.oss-accelerate.aliyuncs.com/Qwen3.5/demo/RealWorld/RealWorld-04.png"
202
+ image = Image.open(requests.get(url, stream=True).raw)
203
+
204
+ # Chat Template
205
+ messages = [
206
+     {
207
+         "role": "system",
208
+         "content": "Sen Next2-Air'sin. Lamapi tarafından Türkiye'de geliştirilmiş, hızlı ve akıllı bir yapay zekasın. Yanıtlarını düşünerek ve mantıklı bir şekilde ver."
209
+     },
210
+     {
211
+         "role": "user",
212
+         "content": [
213
+             {"type": "image", "image": image},
214
+             {"type": "text", "text": "Bu resimdeki temel objeleri ve sahneyi analiz eder misin?"}
215
+         ]
216
+     }
217
+ ]
218
+
219
+ # Process Inputs
220
+ text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
221
+ inputs = processor(text=[text], images=[image], return_tensors="pt").to(model.device)
222
+
223
+ # Generate Output
224
+ generated_ids = model.generate(
225
+     **inputs,
226
+     max_new_tokens=1024,
227
+     temperature=0.6,
228
+     top_p=0.95
229
+ )
230
+
231
+ # Decode
232
+ generated_ids_trimmed = [
233
+     out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
234
+ ]
235
+ output_text = processor.batch_decode(generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
236
+
237
+ print(output_text)
238
+ ```
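For the inference engines mentioned above, a common pattern is to serve the model behind an OpenAI-compatible endpoint and send chat requests to it. The sketch below only builds the request body. It assumes a server has been started separately (for example with `vllm serve Lamapi/next2-air`) and that the installed engine version supports this checkpoint, neither of which is confirmed by this README. `build_chat_payload` is a hypothetical helper, and the sampling values mirror the Transformers example.

```python
import json


def build_chat_payload(prompt, image_url, model="Lamapi/next2-air"):
    """Build an OpenAI-style chat completion body with one image and one text part."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": prompt},
                ],
            }
        ],
        # Sampling settings mirror the Transformers example above.
        "max_tokens": 1024,
        "temperature": 0.6,
        "top_p": 0.95,
    }


payload = build_chat_payload(
    "Describe the main objects and the scene in this image.",
    "https://qianwen-res.oss-accelerate.aliyuncs.com/Qwen3.5/demo/RealWorld/RealWorld-04.png",
)
# Assumption: POST this JSON to the server's /v1/chat/completions route.
print(json.dumps(payload, indent=2))
```

Building the payload as a plain dictionary keeps the example independent of any particular HTTP client.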
239
+
240
+ ---
241
 
242
+ ## 🧩 Model Specifications
 
 
243
 
244
+ | Attribute | Details |
245
+ | :--- | :--- |
246
+ | **Base Architecture** | Qwen 3.5 (Causal Language Model + Vision Encoder) |
247
+ | **Parameters** | 2 Billion (Ultra-Lightweight) |
248
+ | **Context Length** | 262,144 tokens natively |
249
+ | **Hardware** | Optimized for Edge devices, MacBooks (MLX), Consumer GPUs, and low-VRAM environments. |
250
+ | **Capabilities** | Text Generation, Image Understanding, OCR, Logic & Reasoning (CoT), Bilingual (TR/EN) |
251
+
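As a rough guide to the low-VRAM claim in the table above, the weight footprint can be estimated from the parameter count and the numeric precision. The sketch below uses the nominal 2 billion figure; the exact parameter count (including the vision encoder) is not stated here, and the estimate leaves out the KV cache, activations, and framework overhead, so treat the results as a lower bound.

```python
GIB = 1024 ** 3  # bytes per gibibyte


def weight_memory_gib(num_params, bytes_per_param):
    """Estimate memory needed for the weights alone, in GiB."""
    return num_params * bytes_per_param / GIB


# Assumption: nominal parameter count taken from the "2 Billion" entry above.
NOMINAL_PARAMS = 2_000_000_000

for label, width in [("float16/bfloat16", 2), ("int8", 1), ("4-bit", 0.5)]:
    print(f"{label}: ~{weight_memory_gib(NOMINAL_PARAMS, width):.2f} GiB")
```

Long prompts add to this figure, because the KV cache grows with the number of tokens in context.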
252
+ ---
253
+
254
+ ## 🎯 Ideal Use Cases
255
+
256
+ **Next2-Air** is built for fast, local inference. It is well suited to:
257
+ * 🔋 **Mobile & Edge AI:** Deploying smart assistants natively on smartphones or Raspberry Pi without relying on cloud APIs.
258
+ * ⚡ **Real-Time OCR & Parsing:** Scanning receipts, invoices, or UI screenshots to extract data with low latency.
259
+ * 💬 **Fast Conversational Bots:** Providing instant, low-latency Turkish and English responses for customer service pipelines.
260
+ * 🎮 **Gaming & NPC Logic:** Acting as a fast reasoning engine for dynamic in-game characters.
261
+
262
+ ---
263
+
264
+ ## 📄 License & Open Source
265
+
266
+ Next2-Air is released under the **Apache 2.0 License**. We strongly believe in empowering developers, students, and enterprises with accessible, high-speed, reasoning-capable AI.
267
+
268
+ ---
269
+
270
+ ## 📞 Contact & Community
271
+
272
+ * 📧 **Email:** [lamapicontact@gmail.com](mailto:lamapicontact@gmail.com)
273
+ * 🤗 **HuggingFace:** [Lamapi](https://huggingface.co/Lamapi)
274
+ * 💬 **Discord:** [Join the Lamapi Community](https://discord.gg/XgH4EpyPD2)
275
+
276
+ ---
277
 
278
+ <div align="center" style="margin-top: 40px; padding: 25px; border-top: 1px solid #e0f2fe; background: #f0f9ff; border-radius: 8px;">
279
+ <p style="color: #0369a1; font-size: 15px; margin: 0;">
280
+ <strong>Next2-Air</strong>: Light, Fast, Smart. From edge devices to the cloud, Türkiye's next-generation agile AI. 🌬️
281
+ </p>
282
+ </div>