ndhieu2oo3
/

Doc2Bit-VL-8B

vision-language

Model card Files Files and versions

rabbit commited on Jan 24

Commit

2adbcf5

·

1 Parent(s): bfda15f

update

Files changed (1) hide show

README.md +3 -0

README.md CHANGED Viewed

@@ -20,6 +20,7 @@ Doc2Bit-VL-7B is a vision-language model fine-tuned for document understanding.
 A Vision-Language Model (VLM) specialized in **document understanding and information extraction**, supporting both **unstructured information** and **structured data (tables)** from document images.
 This model is optimized for production usage via **vLLM serving** with an **OpenAI-compatible API**.
 ---
 ## 🚀 Features
 - Vision-Language Model for document images
@@ -32,6 +33,7 @@ This model is optimized for production usage via **vLLM serving** with an **Open
 ---
 ## 📌 Supported Data Types
 ### 1. Unstructured Information
 Extract specific fields defined by the user, such as:
 - Invoice number
 - Date
@@ -41,6 +43,7 @@ Extract specific fields defined by the user, such as:
 - Custom document attributes
 ---
 ### 2. Structured Table Data
 Designed for extracting **individual columns** from tables, especially product tables.
 Capabilities:
 - Column-level extraction

 A Vision-Language Model (VLM) specialized in **document understanding and information extraction**, supporting both **unstructured information** and **structured data (tables)** from document images.
 This model is optimized for production usage via **vLLM serving** with an **OpenAI-compatible API**.
 ---
 ## 🚀 Features
 - Vision-Language Model for document images
 ---
 ## 📌 Supported Data Types
 ### 1. Unstructured Information
 Extract specific fields defined by the user, such as:
 - Invoice number
 - Date
 - Custom document attributes
 ---
 ### 2. Structured Table Data
 Designed for extracting **individual columns** from tables, especially product tables.
 Capabilities:
 - Column-level extraction