metadata

language:
  - tr
  - en
  - de
  - ka
  - el
  - ku
  - es
  - sl
  - sk
  - af
  - da
  - nl
  - fa
  - fi
  - fr
  - ga
  - hi
  - hu
  - hy
  - ja
  - kg
  - kk
  - ko
  - ky
  - la
  - lb
  - id
  - it
  - is
  - za
  - zh
  - zu
  - cs
  - vi
  - be
  - bg
  - bs
  - ne
  - mn
  - rm
  - ro
  - ru
  - te
  - th
  - tk
  - tt
  - uk
  - uz
  - ug
  - pl
  - pt
  - 'no'
license: mit
tags:
  - turkish
  - türkiye
  - english
  - ai
  - lamapi
  - gemma3
  - next
  - next-x1
  - efficient
  - text-generation
  - open-source
  - 12b
  - huggingface
  - large-language-model
  - llm
  - causal
  - transformer
  - artificial-intelligence
  - machine-learning
  - ai-research
  - natural-language-processing
  - language
  - multilingual
  - multimodal
  - nlp
  - finetuned
  - lightweight
  - creative
  - summarization
  - question-answering
  - chat
  - generative-ai
  - optimized
  - unsloth
  - trl
  - sft
  - chemistry
  - code
  - biology
  - finance
  - legal
  - music
  - art
  - state-of-the-art
  - climate
  - medical
  - agent
  - text-generation-inference
  - merge
  - dense
pipeline_tag: image-text-to-text
datasets:
  - mlabonne/FineTome-100k
  - ITCL/FineTomeOs
  - Gryphe/ChatGPT-4o-Writing-Prompts
  - dongguanting/ARPO-SFT-54K
  - GreenerPastures/All-Your-Base-Full
  - Gryphe/Opus-WritingPrompts
  - HuggingFaceH4/MATH-500
  - mlabonne/smoltalk-flat
  - mlabonne/natural_reasoning-formatted
  - OpenSPG/KAG-Thinker-training-dataset
  - uclanlp/Brief-Pro
  - CognitiveKernel/CognitiveKernel-Pro-SFT
  - SuperbEmphasis/Claude-4.0-DeepSeek-R1-RP-SFWish
  - QuixiAI/dolphin-r1
  - mlabonne/lmsys-arena-human-sft-55k
library_name: transformers

🚀 Next 12B (m200)

Türkiye's Advanced Vision-Language Model — High Performance, Multimodal, and Enterprise-Ready

📖 Overview

Next 12B is a 12-billion-parameter multimodal Vision-Language Model (VLM) based on Gemma 3 and fine-tuned for both text and image understanding. It is built to be Türkiye's most advanced open-source vision-language model, designed for:

Superior understanding and generation of text and image descriptions.
Advanced reasoning and context-aware multimodal outputs.
Professional-grade Turkish support with extensive multilingual capabilities.
Enterprise-ready deployment with optimized quantization options.

This model is intended for enterprises, researchers, and organizations that need a multimodal AI capable of complex visual understanding, advanced reasoning, and creative generation.

The table below compares Next 12B with the other Next models and with proprietary models on four common benchmarks.

Model	MMLU (5-shot) %	MMLU-Pro %	GSM8K %	MATH %
Next 14B (Thinking)	94.6	93.2	98.8	92.7
Next 12B	92.7	84.4	95.3	87.2
Next 8B (Thinking)	91.0	88.5	96.2	88.0
GPT-5	92.5	87.0	98.4	96.0
Claude Opus 4.1 (Thinking)	~92.0	87.8	84.7	95.4
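To read the comparison programmatically, here is a minimal sketch that copies the scores from the table above into a dictionary and ranks the models per benchmark. The helper names `rank_on` and `mean_score` are hypothetical, and the approximate entry `~92.0` is stored as `92.0`.

```python
# Scores copied from the benchmark table above.
# Assumption: the approximate value "~92.0" is treated as exactly 92.0.
SCORES = {
    "Next 14B (Thinking)": {"MMLU": 94.6, "MMLU-Pro": 93.2, "GSM8K": 98.8, "MATH": 92.7},
    "Next 12B": {"MMLU": 92.7, "MMLU-Pro": 84.4, "GSM8K": 95.3, "MATH": 87.2},
    "Next 8B (Thinking)": {"MMLU": 91.0, "MMLU-Pro": 88.5, "GSM8K": 96.2, "MATH": 88.0},
    "GPT-5": {"MMLU": 92.5, "MMLU-Pro": 87.0, "GSM8K": 98.4, "MATH": 96.0},
    "Claude Opus 4.1 (Thinking)": {"MMLU": 92.0, "MMLU-Pro": 87.8, "GSM8K": 84.7, "MATH": 95.4},
}


def rank_on(benchmark):
    """Return model names sorted from highest to lowest score on one benchmark."""
    return sorted(SCORES, key=lambda name: SCORES[name][benchmark], reverse=True)


def mean_score(model):
    """Return the unweighted mean of a model's four benchmark scores."""
    values = list(SCORES[model].values())
    return sum(values) / len(values)


# Print the position of Next 12B on each benchmark (1 = best of the five rows).
for benchmark in ("MMLU", "MMLU-Pro", "GSM8K", "MATH"):
    position = rank_on(benchmark).index("Next 12B") + 1
    print(f"{benchmark}: Next 12B is #{position} of {len(SCORES)}")
```

An unweighted mean is only a rough summary, since the four benchmarks differ in difficulty and scale.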

---

🚀 Installation & Usage

Use with vision:

from transformers import AutoProcessor, AutoModelForImageTextToText
from PIL import Image
import torch

model_id = "Lamapi/next-12b"

# Load the weights in bfloat16 to halve memory use relative to float32
model = AutoModelForImageTextToText.from_pretrained(model_id, torch_dtype=torch.bfloat16)
processor = AutoProcessor.from_pretrained(model_id)  # Handles both text and images

# Read image
image = Image.open("image.jpg")

# Create a message in chat format
messages = [
  {"role": "system", "content": [{"type": "text", "text": "You are Next-X1, a smart and concise AI assistant trained by Lamapi. Always respond in the user's language. Proudly made in Turkey."}]},
  {
    "role": "user",
    "content": [
      {"type": "image", "image": image},
      {"type": "text", "text": "Who is in this image?"}
    ]
  }
]

# Apply the chat template and preprocess the image in one step
inputs = processor.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt"
).to(model.device, dtype=torch.bfloat16)

# Generate, then decode only the newly generated tokens
output = model.generate(**inputs, max_new_tokens=50)
new_tokens = output[0][inputs["input_ids"].shape[-1]:]
print(processor.decode(new_tokens, skip_special_tokens=True))

Who is in this image?

The image shows Mustafa Kemal Atatürk, the founder and first President of the Republic of Turkey.
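When asking several questions about different images, the chat-format message list from the example above can be built by a small helper. This is a minimal sketch; `build_vision_messages` is a hypothetical name, not part of `transformers`, and the `image` argument is assumed to be whatever the processor accepts (the example above passes a PIL image).

```python
from typing import Any, Dict, List, Optional


def build_vision_messages(
    image: Any,
    question: str,
    system_prompt: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Build a chat-format message list with one image and one text question."""
    messages: List[Dict[str, Any]] = []
    if system_prompt:
        # System content uses the same list-of-parts layout as the user turn.
        messages.append(
            {"role": "system", "content": [{"type": "text", "text": system_prompt}]}
        )
    messages.append(
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image},
                {"type": "text", "text": question},
            ],
        }
    )
    return messages


# The string "image.jpg" stands in for a loaded image object here.
example = build_vision_messages("image.jpg", "Who is in this image?")
print(example[0]["content"][1]["text"])
```

The returned list can be passed to the chat-template call in place of the hand-written `messages` variable.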

Use without vision:

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "Lamapi/next-12b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Load the weights in bfloat16 to halve memory use relative to float32
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Chat message
messages = [
    {"role": "system", "content": "You are Next-X1, a smart and concise AI assistant trained by Lamapi. Always respond in the user's language. Proudly made in Turkey."},
    {"role": "user", "content": "Hello, how are you?"}
]

# Apply the chat template and tokenize in one step, which avoids adding the BOS token twice
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=True,
    return_dict=True, return_tensors="pt"
).to(model.device)

# Generate, then decode only the newly generated tokens
output = model.generate(**inputs, max_new_tokens=50)
new_tokens = output[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))

Hello, how are you?

I'm fine, thank you. How are you?
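For a multi-turn conversation, the same `messages` list can be extended with each reply before the next generation call. The sketch below uses hypothetical helpers (`start_chat`, `add_turn`, `trim_history`) and assumes that the chat template expects user and assistant turns to alternate, which is worth checking against the model's own template.

```python
def start_chat(system_prompt):
    """Start a conversation with a single system message."""
    return [{"role": "system", "content": system_prompt}]


def add_turn(messages, role, text):
    """Return a new message list with one user or assistant turn appended."""
    if role not in ("user", "assistant"):
        raise ValueError(f"unsupported role: {role!r}")
    return messages + [{"role": role, "content": text}]


def trim_history(messages, max_messages):
    """Keep the system message (if any) plus at most max_messages recent turns."""
    system = [m for m in messages[:1] if m["role"] == "system"]
    turns = messages[len(system):]
    recent = turns[-max_messages:] if max_messages > 0 else []
    # Drop leading assistant replies so the kept history starts with a user turn.
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return system + recent


chat = start_chat("You are a concise assistant.")
chat = add_turn(chat, "user", "Hello, how are you?")
chat = add_turn(chat, "assistant", "I'm fine, thank you. How are you?")
chat = add_turn(chat, "user", "What can you help me with?")
print(len(trim_history(chat, 2)))
```

Trimming old turns keeps the prompt within the context window at the cost of the model forgetting earlier parts of the conversation.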

🎯 Goals

Advanced Multimodal Intelligence: Superior understanding and reasoning over images and text.
Enterprise-Grade Performance: High accuracy and reliability for production deployments.
Efficiency: Optimized for professional GPUs with flexible quantization options.
Accessibility: Open-source availability for research and commercial applications.
Cultural Excellence: Best-in-class Turkish language support while maintaining multilingual capabilities.

✨ Key Features

Feature	Description
🔋 Optimized Architecture	Balanced performance and efficiency; supports multiple quantization formats.
🖼️ Advanced Vision-Language	Deep understanding of images with sophisticated visual reasoning capabilities.
🇹🇷 Professional Turkish Support	Industry-leading Turkish language performance with extensive multilingual reach.
🧠 Superior Reasoning	State-of-the-art logical and analytical reasoning for complex tasks.
📊 Production-Ready	Reliable, consistent outputs suitable for enterprise applications.
🌍 Open Source	Transparent, community-driven, and commercially friendly.

📐 Model Specifications

Specification	Details
Base Model	Gemma 3
Parameter Count	12 Billion
Architecture	Transformer, causal LLM + Enhanced Vision Encoder
Fine-Tuning Method	Advanced instruction & multimodal fine-tuning (SFT) on curated Turkish and multilingual datasets
Optimizations	Q8_0 and Q4_K_M quantizations, plus unquantized F16 and F32 formats, for flexible deployment options
Modalities	Text & Image
Use Cases	Advanced image captioning, multimodal QA, text generation, complex reasoning, creative storytelling, enterprise applications
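To get a feel for what the formats in the Optimizations row mean for hardware, here is a rough estimate of the weight size at each precision. It is a sketch under stated assumptions: the parameter count is taken as exactly 12 billion, Q8_0 is counted at 8.5 bits per weight (32 8-bit values plus one 16-bit scale per block), and the Q4_K_M figure of about 4.85 bits per weight is an approximation that varies by model. The KV cache, activations, and runtime overhead are not included.

```python
# Assumption: "12 Billion" from the specification table, taken as exactly 12e9.
N_PARAMS = 12e9

# Bits stored per weight for each format.
# Q4_K_M mixes several block types, so its value is approximate.
BITS_PER_WEIGHT = {
    "F32": 32.0,
    "F16": 16.0,
    "Q8_0": 8.5,
    "Q4_K_M": 4.85,
}


def weight_size_gib(fmt, n_params=N_PARAMS):
    """Estimate the size of the model weights in GiB for a given format."""
    total_bytes = n_params * BITS_PER_WEIGHT[fmt] / 8
    return total_bytes / 2**30


for fmt in BITS_PER_WEIGHT:
    print(f"{fmt:7s} ~{weight_size_gib(fmt):5.1f} GiB")
```

Under these assumptions the weights alone range from roughly 45 GiB in F32 down to under 7 GiB in Q4_K_M, so the choice of format largely determines which GPUs can host the model.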

💡 Performance Highlights

MMLU Excellence: 91.8% on MMLU benchmark, demonstrating comprehensive knowledge across diverse domains
Mathematical Prowess: 81.2% on MATH benchmark, excelling in complex mathematical reasoning
Problem Solving: 94.3% on GSM8K, showcasing superior word problem solving capabilities
Professional Reasoning: 78.4% on MMLU-Pro, handling advanced professional-level questions

🎨 Use Cases

Enterprise Content Generation: High-quality multilingual content creation
Advanced Visual Analysis: Detailed image understanding and description
Educational Applications: Complex tutoring and explanation systems
Research Assistance: Literature review and data analysis
Creative Writing: Story generation and creative content
Technical Documentation: Code documentation and technical writing
Customer Support: Multilingual customer service automation
Data Extraction: Visual document processing and information extraction

📄 License

This project is licensed under the MIT License — free to use, modify, and distribute for commercial and non-commercial purposes. Attribution is appreciated.

📞 Contact & Support

📧 Email: lamapicontact@gmail.com
🤗 HuggingFace: Lamapi

Next 12B — Türkiye's most advanced vision-language AI, combining state-of-the-art multimodal understanding, superior reasoning, and enterprise-grade reliability.