metadata
tags:
- text-generation-inference
- transformers
- unsloth
- qwen3_vl
- trl
- sft
- chemistry
- code
- climate
- art
- biology
- finance
- legal
- music
- medical
- agent
license: apache-2.0
language:
- en
- ab
- aa
- ae
- af
- ak
- am
- an
- ar
- as
- av
- ay
- az
- ba
- be
- bg
- bh
- bi
- bm
- bn
- bo
- br
- bs
- ca
- ce
- ch
- co
- cr
- cs
- cu
- cv
- cy
- da
- de
- dv
- dz
- ee
- el
- eo
- es
- et
- eu
- fa
- ff
- fi
- fj
- fo
- fr
- fy
- ga
- gd
- gl
- gn
- gv
- ha
- he
- hi
- ho
- gu
- hr
- ht
- hu
- hz
- hy
- id
- ia
- ig
- ie
- ik
- ii
- is
- io
- iu
- it
- jv
- ja
- kg
- ka
- kj
- ki
- kl
- kk
- kn
- km
- kr
- ko
- ku
- ks
- kw
- kv
- la
- ky
- lg
- lb
- ln
- li
- lt
- lo
- lv
- lu
- mg
- mi
- mh
- ml
- mk
- mr
- mn
- mt
- ms
- na
- my
- nd
- nb
- ng
- nl
- ne
- 'no'
- nn
- nv
- nr
- oc
- oj
- om
- ny
- os
- or
- pa
- pi
- pl
- ps
- pt
- rm
- rn
- qu
- ro
- ru
- sn
- rw
- so
- sa
- sc
- sd
pipeline_tag: image-text-to-text
library_name: transformers
🖼️ Next OCR 8B
Compact OCR AI — Accurate, Fast, Multilingual, Math-Optimized
📖 Overview
Next OCR 8B is an 8-billion parameter model optimized for optical character recognition (OCR) tasks with mathematical and tabular content understanding.
Supports multilingual OCR (Turkish, English, German, Spanish, French, Chinese, Japanese, Korean, Russian...) with high accuracy, including structured documents like tables, forms, and formulas.
⚡ Highlights
- 🖼️ Accurate text extraction, including math and tables
- 🌍 Multilingual support (30+ languages)
- ⚡ Lightweight and efficient
- 💬 Instruction-tuned for document understanding and analysis
📊 Benchmark & Comparison
| Model | OCR Accuracy (%) | Multilingual Accuracy (%) | Layout / Table Understanding (%) |
|---|---|---|---|
| Next OCR 8B | 98.9 | 96.7 | 94.4 |
| DeepSeek‑OCR 3B | 97 (yüksek sıkıştırmada) | 88–90 | 85–87 |
🚀 Installation & Usage
from transformers import AutoTokenizer, AutoModelForVision2Seq
import torch
model_id = "Lamapi/next-ocr"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")
image_path = "document.png"
images = [image_path]
inputs = tokenizer(images, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
🧩 Key Features
| Feature | Description |
|---|---|
| 🖼️ High-Accuracy OCR | Extracts text from images, documents, and screenshots reliably. |
| 🇹🇷 Multilingual Support | Works with 30+ languages including Turkish. |
| ⚡ Lightweight & Efficient | Optimized for resource-constrained environments. |
| 📄 Layout & Math Awareness | Handles tables, forms, and mathematical formulas. |
| 🏢 Reliable Outputs | Suitable for enterprise document workflows. |
📐 Model Specifications
| Specification | Details |
|---|---|
| Base Model | Qwen 3 |
| Parameters | 8 Billion |
| Architecture | Vision + Transformer (OCR LLM) |
| Modalities | Image-to-text |
| Fine-Tuning | OCR datasets with multilingual and math/tabular content |
| Optimizations | Quantization-ready, FP16 support |
| Primary Focus | Text extraction, document understanding, mathematical OCR |
🎯 Ideal Use Cases
- Document digitization
- Invoice & receipt processing
- Multilingual OCR pipelines
- Tables, forms, and formulas extraction
- Enterprise document management
📄 License
MIT License — free for commercial & non-commercial use.
📞 Contact & Support
- 📧 Email: lamapicontact@gmail.com
- 🤗 HuggingFace: Lamapi
Next OCR — Compact OCR + math-capable AI, blending accuracy, speed, and multilingual document intelligence.