ndhieu2oo3
/

Doc2Bit-VL-8B-W8A8-Dynamic-Per-Token

 ---
 license: apache-2.0
+language:
+- en
+- vi
+tags:
+- vision-language
+- document-ai
+- vlm
+- ocr
+pipeline_tag: image-to-text
 ---
+# Doc2Bit-VL-8B-W8A8-Dynamic-Per-Token
+Doc2Bit-VL-8B-W8A8-Dynamic-Per-Token is a vision-language model fine-tuned and quantized to 8-bit integers for document understanding.
+# 📄 Document Information Extraction VLM
+A Vision-Language Model (VLM) specialized in **document understanding and information extraction**, supporting both **unstructured information** and **structured data (tables)** from document images.
+This model is optimized for production usage via **vLLM serving** with an **OpenAI-compatible API**.
+---
+## 🚀 Features
+- Vision-Language Model for document images
+- Extracts **unstructured key–value information**
+- Extracts **structured table data**, including **column-wise extraction**
+- Handles complex layouts (forms, invoices, reports, product tables)
+- Strict output formatting (no hallucination)
+- Compatible with **vLLM OpenAI-style API**
+- Prompting optimized for **Vietnamese instructions**
+---
+## 📌 Supported Data Types
+### 1. Unstructured Information
+Extract specific fields defined by the user, such as:
+- Invoice number
+- Date
+- Company name
+- Address
+- Total amount
+- Custom document attributes
+---
+### 2. Structured Table Data
+Designed for extracting **individual columns** from tables, especially product tables.
+Capabilities:
+- Column-level extraction
+- Ignore non-product rows
+- Markdown-formatted output
+- Clean and deterministic structure
+---
+## 🔧 Deployment (vLLM)
+This model is intended to be deployed using **vLLM** with an OpenAI-compatible interface.
+Example:
+```bash
+vllm serve <model-path-or-name> \
+  --served-model-name document-vlm \
+  --port 8000
+```
+## Prompt Usage
+Unstructured Data Extraction Prompt Example
+```bash
+prompt = f"""QUERY Trích xuất thông tin: {field_names}.
+            INSTRUCTION:
+            - Bắt buộc dữ liệu trả về theo format <index>. <key>:<value>, trong đó <index> là số thứ tự (1, 2, 3, 4, ...)
+            - key lấy chính xác từ trong QUERY của tôi
+            - không tự bịa dữ liệu và coi đó là điều hiển nhiên
+            - nếu không thể trích xuất thì hãy trả lời: tôi không thể tìm thấy dữ liệu này
+            """
+```
+Expected Output
+```bash
+1. Số hóa đơn: INV-001
+2. Ngày phát hành: 12/03/2024
+3. Tổng tiền: 1.250.000 VND
+```
+If data cannot be extracted:
+```bash
+tôi không thể tìm thấy dữ liệu này
+```
+Structured Table (Column-wise) Extraction Prompt Example
+```bash
+prompt = (
+    f"trích xuất thông tin tương ứng với sản phẩm của cột {col_name} trong bảng sản phẩm.\n"
+    "INSTRUCTION:\n"
+    "Xuất kết quả dưới dạng markdown một cột.\n"
+    "Bỏ qua những hàng không phải sản phẩm.\n"
+    f"Yêu cầu tiêu đề cột là |{col_name}|.\n"
+)
+```
+Expected Output
+```bash
+|Tên sản phẩm|
+|------------|
+|Sản phẩm A|
+|Sản phẩm B|
+|Sản phẩm C|
+```