---
language:
- tr
- en
- de
- es
- fr
- ru
- zh
- ja
- ko
license: apache-2.0
tags:
- turkish
- türkiye
- reasoning
- vision-language
- vlm
- multimodal
- lamapi
- next2-air
- qwen3.5
- text-generation
- image-text-to-text
- open-source
- 2b
- edge-ai
- large-language-model
- llm
- thinking-mode
- fast-inference
pipeline_tag: image-text-to-text
datasets:
- mlabonne/FineTome-100k
- CognitiveKernel/CognitiveKernel-Pro-SFT
- OpenSPG/KAG-Thinker-training-dataset
- Gryphe/ChatGPT-4o-Writing-Prompts
library_name: transformers
---
![nextf2](https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/EmQx5TfKy8pLtC19CZGbL.png)

# Next2-Air (2B)

**Türkiye's Fastest Lightweight Multimodal & Reasoning AI**


---

## 📖 Overview

**Next2-Air** is a highly optimized, lightning-fast **2-billion-parameter Vision-Language Model (VLM)** built on the **Qwen 3.5-2B** architecture. Engineered by Lamapi in **Türkiye**, the "Air" moniker reflects its core philosophy: **lightweight, extremely fast, yet surprisingly capable.**

While large models dominate cloud servers, Next2-Air is designed to bring strong reasoning and multimodal understanding directly to local machines, edge devices, and everyday applications. Through specialized instruction tuning on logical-reasoning datasets, this 2B model thinks step by step, handles images reliably, and speaks native Turkish and English.

---

## ⚡ Highlights

* **Ultra-lightweight:** 2B parameters, optimized for edge devices and low-VRAM environments
* **Multimodal:** text generation, image understanding, OCR, and document parsing
* **Reasoning-first:** chain-of-thought "thinking mode" tuned via SFT and DPO
* **Long context:** 262,144 tokens natively
* **Bilingual:** native Turkish and English, with broad multilingual coverage
* **Open source:** released under the Apache 2.0 license
---

## 📊 Benchmark Performance

Next2-Air (2B) redefines what is possible in the ultra-lightweight category. Through our custom DPO (Direct Preference Optimization) and SFT processes, it shows noticeable improvements over its base model and competes strongly with heavier 3B-4B models.

### 📝 Text, Reasoning & Instruction Following
| Benchmark | Next2-Air (2B) | Qwen 3.5 (2B) | Gemma-2 (2B) | Llama-3.2 (3B) |
| :--- | :---: | :---: | :---: | :---: |
| MMLU-Pro (Thinking) | 68.2% | 66.5% | 54.1% | 68.4% |
| MMLU-Redux | 82.1% | 79.6% | 75.3% | 79.5% |
| IFEval (Instruction) | 82.5% | 78.6% | 75.8% | 77.4% |
| TAU2-Bench (Agent) | 52.4% | 48.8% | -- | -- |
### 👁️ Multimodal & Vision Edge

Next2-Air features a highly capable visual encoder, allowing it to handle spatial reasoning, OCR, and document-understanding tasks efficiently.
| Benchmark | Next2-Air (2B) | Base Qwen3.5-2B |
| :--- | :---: | :---: |
| MMMU (General VQA) | 66.5% | 64.2% |
| MathVision | 78.1% | 76.7% |
| OCRBench | 86.0% | 84.5% |
| VideoMME (w/ sub) | 77.8% | 75.6% |

* Enhanced scores in reasoning and OCR are a direct result of Lamapi's specialized bilingual fine-tuning pipeline, which focuses on edge-case logic and structural formatting.

---

## 🚀 Quickstart & Usage

**Next2-Air** is fully compatible with the Hugging Face `transformers` ecosystem and with fast inference engines such as `vLLM` and `SGLang`. Because it is a VLM, you can pass images directly into your prompts.

### Python (Transformers)

Make sure you have `transformers`, `torch`, `torchvision`, and `pillow` installed.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, AutoProcessor

model_id = "thelamapi/next2-air"

model = AutoModelForCausalLM.from_pretrained(model_id)
processor = AutoProcessor.from_pretrained(model_id)  # Handles text and vision inputs
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Build a conversation in chat format
messages = [
    {
        "role": "system",
        "content": [{"type": "text", "text": "You are Next2 Air, a smart and concise AI assistant trained by Lamapi. Always respond in the user's language. Proudly made in Turkey."}],
    },
    {
        "role": "user",
        "content": [{"type": "text", "text": "Write a highly optimized Rust function to calculate the Fibonacci sequence using memoization"}],
    },
]

# Render the chat template; add_generation_prompt=True appends the assistant
# turn marker so the model starts generating a reply
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=prompt, return_tensors="pt")

# Drop multimodal token-type ids, which are not needed for text-only generation
if "mm_token_type_ids" in inputs:
    del inputs["mm_token_type_ids"]

# Generate and decode the response
output = model.generate(**inputs, do_sample=True, temperature=0.7, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

---

## 🧩 Model Specifications

| Attribute | Details |
| :--- | :--- |
| **Base Architecture** | Qwen 3.5 (Causal Language Model + Vision Encoder) |
| **Parameters** | 2 billion (ultra-lightweight) |
| **Context Length** | 262,144 tokens natively |
| **Hardware** | Optimized for edge devices, MacBooks (MLX), consumer GPUs, and low-VRAM environments |
| **Capabilities** | Text generation, image understanding, OCR, logic & reasoning (CoT), bilingual (TR/EN) |

---

## 🎯 Ideal Use Cases

**Next2-Air** excels at fast, local inference. It is a strong fit for:

* 🔋 **Mobile & Edge AI:** Deploying smart assistants natively on smartphones or a Raspberry Pi without relying on cloud APIs.
* ⚡ **Real-Time OCR & Parsing:** Scanning receipts, invoices, or UI screenshots and extracting data in milliseconds.
* 💬 **Fast Conversational Bots:** Providing instant, low-latency Turkish and English responses for customer-service pipelines.
* 🎮 **Gaming & NPC Logic:** Acting as a fast reasoning engine for dynamic in-game characters.

---

## 📄 License & Open Source

Next2-Air is released under the **Apache 2.0 License**. We strongly believe in empowering developers, students, and enterprises with accessible, high-speed, reasoning-capable AI.

---

## 📞 Contact & Community

* 📧 **Email:** [lamapicontact@gmail.com](mailto:lamapicontact@gmail.com)
* 🤗 **HuggingFace:** [Lamapi](https://huggingface.co/Lamapi)
* 💬 **Discord:** [Join the Lamapi Community](https://discord.gg/XgH4EpyPD2)

---
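### Passing an Image (Sketch)

The quickstart above is text-only. Since Next2-Air is a VLM, an image can be added to a user turn as an extra content entry. Below is a minimal sketch of the multimodal message format; the `{"type": "image"}` placeholder convention and the hypothetical `build_vision_messages` helper follow the pattern used by other Qwen-style VLM processors, so the exact keys may differ for this model.

```python
# Hypothetical helper illustrating the multimodal chat payload; the
# {"type": "image"} placeholder convention follows other Qwen-style
# VLM processors and is an assumption, not a confirmed API.
def build_vision_messages(question: str) -> list:
    """Pair a single image placeholder with a text question in one user turn."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image"},
                {"type": "text", "text": question},
            ],
        },
    ]


messages = build_vision_messages("Extract the total amount from this receipt.")

# The messages then flow through the same pipeline as the quickstart,
# with the PIL image supplied alongside the rendered prompt:
#   prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
#   inputs = processor(text=prompt, images=[Image.open("receipt.jpg")], return_tensors="pt")
```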

Next2-Air: Lightweight, Fast, Smart. From edge devices to the cloud, Türkiye's next-generation agile AI. 🌬️