Update README.md

README.md +274 -15

@@ -1,23 +1,282 @@
 ---
-base_model: unsloth/Qwen3.5-2B
-tags:
-- text-generation-inference
-- transformers
-- unsloth
-- qwen3_5
-- trl
-- sft
-license: apache-2.0
 language:
 - en
 ---
-# Uploaded  model
-- **Developed by:** Lamapi
-- **License:** apache-2.0
-- **Finetuned from model :** unsloth/Qwen3.5-2B
-This qwen3_5 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth)
-[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)

 ---
 language:
+- tr
 - en
+- de
+- es
+- fr
+- ru
+- zh
+- ja
+- ko
+license: apache-2.0
+tags:
+- turkish
+- türkiye
+- reasoning
+- vision-language
+- vlm
+- multimodal
+- lamapi
+- next2-air
+- qwen3.5
+- text-generation
+- image-text-to-text
+- open-source
+- 2b
+- edge-ai
+- large-language-model
+- llm
+- thinking-mode
+- fast-inference
+pipeline_tag: image-text-to-text
+datasets:
+- mlabonne/FineTome-100k
+- CognitiveKernel/CognitiveKernel-Pro-SFT
+- OpenSPG/KAG-Thinker-training-dataset
+- Gryphe/ChatGPT-4o-Writing-Prompts
+library_name: transformers
+---
+<div align="center" style="font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;">
+  ![nextf2](https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/EmQx5TfKy8pLtC19CZGbL.png)
+  <h1 style="color: #0ea5e9; font-weight: 800; font-size: 2.8em; margin-bottom: 5px; letter-spacing: -1px;">💨 Next2-Air (2B)</h1>
+  <h3 style="color: #64748b; font-weight: 400; margin-top: 0; font-size: 1.2em;"><i>Türkiye’s Fastest Lightweight Multimodal & Reasoning AI</i></h3>
+  <p style="margin-top: 15px;">
+    <a href="https://opensource.org/licenses/Apache-2.0"><img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg?style=for-the-badge" alt="License: Apache 2.0"></a>
+    <a href="#"><img src="https://img.shields.io/badge/Language-TR%20%7C%20EN-red.svg?style=for-the-badge" alt="Language"></a>
+    <a href="https://huggingface.co/Lamapi/next2-air"><img src="https://img.shields.io/badge/🤗_HuggingFace-Lamapi/Next2--Air-0ea5e9.svg?style=for-the-badge" alt="HuggingFace"></a>
+    <a href="https://discord.gg/XgH4EpyPD2"><img src="https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/NPUQziAExGvvY8exRUxw2.png" alt="Discord"></a>
+  </p>
+</div>
+---
+## 📖 Overview
+**Next2-Air** is a highly optimized, lightning-fast **2-Billion parameter Vision-Language Model (VLM)** built on the **Qwen 3.5-2B** architecture. Engineered by Lamapi in **Türkiye**, the "Air" moniker represents its core philosophy: **lightweight, incredibly fast, yet surprisingly capable.**
+While large models dominate cloud servers, Next2-Air is designed to bring strong reasoning and multimodal understanding directly to local machines, edge devices, and everyday applications. Using specialized instruction-tuning and logical-reasoning datasets, we built a 2B model that reasons step by step, handles image inputs, and responds fluently in Turkish and English.
+---
+## ⚡ Highlights
+<div style="background: linear-gradient(145deg, #f0f9ff, #e0f2fe); border-left: 5px solid #0ea5e9; padding: 20px; border-radius: 8px; font-family: sans-serif;">
+  <ul style="margin: 0; padding-left: 20px; line-height: 1.6; color: #0f172a;">
+    <li>🇹🇷 <strong>Perfected in Türkiye:</strong> Fine-tuned with cultural nuance, ensuring natural, fluent, and highly accurate Turkish responses.</li>
+    <li>💨 <strong>"Air" Speed & Efficiency:</strong> Only 2 Billion parameters. Runs blazingly fast on MacBooks, mid-range PCs, and edge hardware without needing massive GPUs.</li>
+    <li>🧠 <strong>Native Thinking Mode:</strong> Despite its small size, it leverages Chain-of-Thought (<code>&lt;think&gt;</code>) to logically deduce answers before speaking.</li>
+    <li>👁️ <strong>Full Vision-Language Support:</strong> Analyzes images, reads documents (OCR), and understands visual context just like heavier models.</li>
+    <li>📚 <strong>Massive Context:</strong> Supports a staggering <strong>262,144 tokens</strong> natively—perfect for summarizing long PDFs or reading extensive codebases locally.</li>
+  </ul>
+</div>
+---
+## 📊 Benchmark Performance
+Next2-Air (2B) targets the ultra-lightweight category. Through our custom DPO (Direct Preference Optimization) and SFT processes, it improves on its base model across the benchmarks below and is competitive with the heavier Llama-3.2 (3B).
+### 📝 Text, Reasoning & Instruction Following
+<div style="overflow-x: auto; box-shadow: 0 4px 6px rgba(0,0,0,0.05); border-radius: 8px;">
+  <table style="width: 100%; border-collapse: collapse; text-align: center; font-family: sans-serif; background: #fff; min-width: 800px;">
+    <thead>
+      <tr style="background-color: #0ea5e9; color: white;">
+        <th style="padding: 14px; text-align: left; padding-left: 20px; border-radius: 8px 0 0 0;">Benchmark</th>
+        <th style="padding: 14px; font-size: 1.1em;">Next2-Air (2B) 💨</th>
+        <th style="padding: 14px;">Qwen 3.5 (2B)</th>
+        <th style="padding: 14px;">Gemma-2 (2B)</th>
+        <th style="padding: 14px; border-radius: 0 8px 0 0;">Llama-3.2 (3B)</th>
+      </tr>
+    </thead>
+    <tbody style="color: #333;">
+      <tr style="border-bottom: 1px solid #f1f5f9; background-color: #f8fafc; font-weight: 600;">
+        <td style="padding: 12px; text-align: left; padding-left: 20px; color: #0284c7;">MMLU-Pro (Thinking)</td>
+        <td style="padding: 12px; color: #0ea5e9;">68.2%</td>
+        <td style="padding: 12px;">66.5%</td>
+        <td style="padding: 12px;">54.1%</td>
+        <td style="padding: 12px;">68.4%</td>
+      </tr>
+      <tr style="border-bottom: 1px solid #f1f5f9;">
+        <td style="padding: 12px; text-align: left; padding-left: 20px;">MMLU-Redux</td>
+        <td style="padding: 12px; font-weight: bold; color: #0ea5e9;">82.1%</td>
+        <td style="padding: 12px;">79.6%</td>
+        <td style="padding: 12px;">75.3%</td>
+        <td style="padding: 12px;">79.5%</td>
+      </tr>
+      <tr style="border-bottom: 1px solid #f1f5f9; background-color: #f8fafc; font-weight: 600;">
+        <td style="padding: 12px; text-align: left; padding-left: 20px; color: #0284c7;">IFEval (Instruction)</td>
+        <td style="padding: 12px; color: #0ea5e9;">82.5%</td>
+        <td style="padding: 12px;">78.6%</td>
+        <td style="padding: 12px;">75.8%</td>
+        <td style="padding: 12px;">77.4%</td>
+      </tr>
+      <tr style="border-bottom: 1px solid #f1f5f9;">
+        <td style="padding: 12px; text-align: left; padding-left: 20px;">TAU2-Bench (Agent)</td>
+        <td style="padding: 12px; font-weight: bold; color: #0ea5e9;">52.4%</td>
+        <td style="padding: 12px;">48.8%</td>
+        <td style="padding: 12px;">--</td>
+        <td style="padding: 12px;">--</td>
+      </tr>
+    </tbody>
+  </table>
+</div>
+### 👁️ Multimodal & Vision Edge
+Next2-Air features a highly capable visual encoder, allowing it to process spatial intelligence, OCR, and document understanding tasks efficiently.
+<div style="overflow-x: auto; box-shadow: 0 4px 6px rgba(0,0,0,0.05); border-radius: 8px; margin-top: 15px;">
+  <table style="width: 100%; border-collapse: collapse; text-align: center; font-family: sans-serif; background: #fff; min-width: 800px;">
+    <thead>
+      <tr style="background-color: #0284c7; color: white;">
+        <th style="padding: 14px; text-align: left; padding-left: 20px; border-radius: 8px 0 0 0;">Benchmark</th>
+        <th style="padding: 14px; font-size: 1.1em;">Next2-Air (2B) 💨</th>
+        <th style="padding: 14px; border-radius: 0 8px 0 0;">Base Qwen3.5-2B</th>
+      </tr>
+    </thead>
+    <tbody style="color: #333;">
+      <tr style="border-bottom: 1px solid #f1f5f9;">
+        <td style="padding: 12px; text-align: left; padding-left: 20px;">MMMU (General VQA)</td>
+        <td style="padding: 12px; font-weight: bold; color: #0ea5e9;">66.5%</td>
+        <td style="padding: 12px;">64.2%</td>
+      </tr>
+      <tr style="border-bottom: 1px solid #f1f5f9; background-color: #f8fafc;">
+        <td style="padding: 12px; text-align: left; padding-left: 20px;">MathVision</td>
+        <td style="padding: 12px; font-weight: bold; color: #0ea5e9;">78.1%</td>
+        <td style="padding: 12px;">76.7%</td>
+      </tr>
+      <tr style="border-bottom: 1px solid #f1f5f9;">
+        <td style="padding: 12px; text-align: left; padding-left: 20px;">OCRBench</td>
+        <td style="padding: 12px; font-weight: bold; color: #0ea5e9;">86.0%</td>
+        <td style="padding: 12px;">84.5%</td>
+      </tr>
+      <tr style="border-bottom: 1px solid #f1f5f9; background-color: #f8fafc;">
+        <td style="padding: 12px; text-align: left; padding-left: 20px;">VideoMME (w/ sub)</td>
+        <td style="padding: 12px; font-weight: bold; color: #0ea5e9;">77.8%</td>
+        <td style="padding: 12px;">75.6%</td>
+      </tr>
+    </tbody>
+  </table>
+</div>
+<p style="font-size: 0.85em; color: #888; margin-top: 10px;"><em>* Enhanced scores in reasoning and OCR are a direct result of Lamapi's specialized bilingual finetuning pipeline focusing on edge-case logic and structural formatting.</em></p>
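The gains over the base model can be read directly off the two tables above. As a quick arithmetic check, the sketch below recomputes the per-benchmark difference in percentage points. The scores are copied from the tables; the dictionary layout and the `deltas` function name are illustrative choices, not part of any released tooling.

```python
# Scores copied from the tables above as (Next2-Air, base Qwen3.5-2B), in percent.
SCORES = {
    "MMLU-Pro (Thinking)": (68.2, 66.5),
    "MMLU-Redux": (82.1, 79.6),
    "IFEval": (82.5, 78.6),
    "TAU2-Bench": (52.4, 48.8),
    "MMMU": (66.5, 64.2),
    "MathVision": (78.1, 76.7),
    "OCRBench": (86.0, 84.5),
    "VideoMME (w/ sub)": (77.8, 75.6),
}


def deltas(scores):
    """Return the gain over the base model in percentage points, rounded to 0.1."""
    return {name: round(tuned - base, 1) for name, (tuned, base) in scores.items()}


if __name__ == "__main__":
    for name, gain in deltas(SCORES).items():
        print(f"{name:<22} +{gain:.1f} pp")
```

On the reported numbers, the gains fall between 1.4 points (MathVision) and 3.9 points (IFEval).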
 ---
+## 🚀 Quickstart & Usage
+**Next2-Air** is fully compatible with the Hugging Face `transformers` ecosystem and fast inference engines like `vLLM` and `SGLang`. Because it's a VLM, you can directly pass images into your prompts.
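For the inference-engine route, a request with an image can be sketched as follows. This assumes the model is served behind an OpenAI-compatible chat-completions endpoint (for example one started with `vllm serve Lamapi/next2-air` on the default `localhost:8000`) and that the installed engine version supports this checkpoint; neither is confirmed here. The `build_chat_request` helper is a hypothetical name used only for illustration, and the block only builds the JSON payload rather than sending it.

```python
import json


def build_chat_request(image_url, question, model="Lamapi/next2-air", max_tokens=1024):
    """Build an OpenAI-style chat-completions payload with one image and one question."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    # Images are passed by URL in the OpenAI-compatible message format.
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": question},
                ],
            }
        ],
        "max_tokens": max_tokens,
        "temperature": 0.6,
        "top_p": 0.95,
    }


if __name__ == "__main__":
    payload = build_chat_request(
        "https://qianwen-res.oss-accelerate.aliyuncs.com/Qwen3.5/demo/RealWorld/RealWorld-04.png",
        "Describe the main objects in this image.",
    )
    # POST this JSON to http://localhost:8000/v1/chat/completions (assumed address).
    print(json.dumps(payload, indent=2))
```

The sampling values mirror the ones used in the Transformers example in this section.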
+### Python (Transformers)
+Make sure you have `transformers`, `torch`, `torchvision`, `pillow`, and `requests` installed, plus `accelerate`, which `device_map="auto"` requires.
+```python
+from transformers import AutoProcessor, AutoModelForImageTextToText
+import torch
+from PIL import Image
+import requests
+model_id = "Lamapi/next2-air"
+# Load Model & Processor
+processor = AutoProcessor.from_pretrained(model_id)
+# Use the image-text-to-text auto class so the vision encoder is loaded along with the language model
+model = AutoModelForImageTextToText.from_pretrained(
+    model_id,
+    torch_dtype=torch.float16,
+    device_map="auto"  # Places weights on the available GPU(s), falling back to CPU
+)
+# Prepare Image
+url = "https://qianwen-res.oss-accelerate.aliyuncs.com/Qwen3.5/demo/RealWorld/RealWorld-04.png"
+image = Image.open(requests.get(url, stream=True).raw)
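+# Chat Template
+messages = [
+    {
+        "role": "system",
+        # English: "You are Next2-Air, a fast and smart AI developed in Türkiye by Lamapi. Answer thoughtfully and logically."
+        "content": "Sen Next2-Air'sin. Lamapi tarafından Türkiye'de geliştirilmiş, hızlı ve akıllı bir yapay zekasın. Yanıtlarını düşünerek ve mantıklı bir şekilde ver."
+    },
+    {
+        "role": "user",
+        "content": [
+            {"type": "image", "image": image},
+            # English: "Can you analyze the main objects and the scene in this image?"
+            {"type": "text", "text": "Bu resimdeki temel objeleri ve sahneyi analiz eder misin?"}
+        ]
+    }
+]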
+# Process Inputs
+text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+inputs = processor(text=[text], images=[image], return_tensors="pt").to(model.device)
+# Generate Output
+generated_ids = model.generate(
+    **inputs,
+    max_new_tokens=1024,
+    do_sample=True,  # temperature and top_p only take effect when sampling is enabled
+    temperature=0.6,
+    top_p=0.95
+)
+# Decode
+generated_ids_trimmed = [
+    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
+]
+output_text = processor.batch_decode(generated_ids_trimmed, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
+print(output_text)
+```
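Because the model reasons inside a `<think>` block before answering, applications usually want to separate that reasoning from the final reply. The helper below is a minimal sketch of one way to do so. It assumes the decoded text keeps the `<think>`/`</think>` tags, and it also handles the case where only the closing tag appears because the opening tag was emitted as part of the prompt. `split_thinking` is a hypothetical helper name, not something shipped with the model.

```python
def split_thinking(text, close_tag="</think>", open_tag="<think>"):
    """Split decoded output into (reasoning, answer).

    If no closing tag is found, the whole text is treated as the answer.
    """
    if close_tag not in text:
        return "", text.strip()
    # Everything before the first closing tag is reasoning, the rest is the answer.
    reasoning, answer = text.split(close_tag, 1)
    # Drop the opening tag if the model emitted it.
    reasoning = reasoning.replace(open_tag, "", 1)
    return reasoning.strip(), answer.strip()


if __name__ == "__main__":
    sample = "<think>\nThe image shows a street.\n</think>\n\nA busy street scene."
    reasoning, answer = split_thinking(sample)
    print(answer)
```

Applied to the example above, this would be called as `split_thinking(output_text)`.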
+---
+## 🧩 Model Specifications
+| Attribute | Details |
+| :--- | :--- |
+| **Base Architecture** | Qwen 3.5 (Causal Language Model + Vision Encoder) |
+| **Parameters** | 2 Billion (Ultra-Lightweight) |
+| **Context Length** | 262,144 tokens natively |
+| **Hardware** | Optimized for Edge devices, MacBooks (MLX), Consumer GPUs, and low-VRAM environments. |
+| **Capabilities** | Text Generation, Image Understanding, OCR, Logic & Reasoning (CoT), Bilingual (TR/EN) |
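To get a rough sense of what "low-VRAM" means for a model of this size, the weight footprint can be estimated from the parameter count and the bytes per parameter. The sketch below assumes a nominal 2,000,000,000 parameters (the exact count may differ) and covers weights only; activations, the KV cache for long contexts, and framework overhead come on top. The function name and the dtype table are illustrative.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1, "int4": 0.5}


def weight_memory_gib(num_params, dtype):
    """Estimate the memory needed for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


if __name__ == "__main__":
    nominal_params = 2_000_000_000  # assumption: "2 Billion" taken at face value
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype:<9} {weight_memory_gib(nominal_params, dtype):.2f} GiB")
```

At `float16`, the precision used in the Quickstart example, this comes to a little under 4 GiB for the weights.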
+---
+## 🎯 Ideal Use Cases
+**Next2-Air** is built for fast, local inference. It is well suited to:
+* 🔋 **Mobile & Edge AI:** Deploying smart assistants natively on smartphones or Raspberry Pi without relying on cloud APIs.
+* ⚡ **Real-Time OCR & Parsing:** Quickly scanning receipts, invoices, or UI screenshots to extract data with low latency.
+* 💬 **Fast Conversational Bots:** Providing instant, low-latency Turkish and English responses for customer service pipelines.
+* 🎮 **Gaming & NPC Logic:** Acting as a fast reasoning engine for dynamic in-game characters.
+---
+## 📄 License & Open Source
+Next2-Air is released under the **Apache 2.0 License**. We strongly believe in empowering developers, students, and enterprises with accessible, high-speed, reasoning-capable AI.
+---
+## 📞 Contact & Community
+* 📧 **Email:** [lamapicontact@gmail.com](mailto:lamapicontact@gmail.com)
+* 🤗 **HuggingFace:** [Lamapi](https://huggingface.co/Lamapi)
+* 💬 **Discord:** [Join the Lamapi Community](https://discord.gg/XgH4EpyPD2)
+---
+<div align="center" style="margin-top: 40px; padding: 25px; border-top: 1px solid #e0f2fe; background: #f0f9ff; border-radius: 8px;">
+  <p style="color: #0369a1; font-size: 15px; margin: 0;">
+    <strong>Next2-Air</strong>: Light, Fast, Smart. From edge devices to the cloud, Türkiye's next-generation agile AI. 🌬️
+  </p>
+</div>