DocV3RExpert-1.2B

1. Model Description

DocV3RExpert-1.2B is a state-of-the-art document layout analysis model fine-tuned for the MinerU 2.5 framework. Released in 2026, it features 1.2 billion parameters tailored for high-precision document understanding.

Key Improvements:

  • Table Detection: Optimized for high-stakes scenarios such as financial reports and academic papers. Achieves significant breakthroughs in complex structures, including nested tables and borderless tables.
  • Diagram Recognition: Specialized recognition for flowcharts, architecture diagrams, and block diagrams, delivering precise boundaries for technical documentation. Note: to enable block diagram detection, users need to modify the original MinerU code themselves.

This model focuses on layout analysis (pipeline) and does not include a VLM (vision-language model) backend.

2. Requirements

  • MinerU version >= 2.5
  • Ensure you have installed and initialized the MinerU environment.

3. Usage

MinerU 2.5 uses mineru.json for configuration and supports switching model sources via environment variables. To use this model, you need to set MINERU_MODEL_SOURCE=local and point the configuration to this model.

Step 1: Download Model

Clone this model repository to your local machine or server.

git clone https://huggingface.co/linglongOCR-group/DocV3RExpert-1.2B

Step 2: Configuration

(A) Set Environment Variable

Force MinerU to use local models:

export MINERU_MODEL_SOURCE=local

(B) Modify Config File

Edit the mineru.json configuration file.

  • Default Path: ~/mineru.json
  • Custom Path: You can also specify a path via the MINERU_TOOLS_CONFIG_JSON environment variable.

Update the models-dir section to point to the absolute path of the downloaded model.

Option 1: Pipeline-only (recommended for this model)
{
  "models-dir": {
    "pipeline": "/absolute/path/to/DocV3RExpert-1.2B",
    "vlm": ""
  },
  "config_version": "1.3.1"
}
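As a sanity check, the config above can also be generated and round-tripped programmatically. A minimal sketch (the model path is a placeholder, and a temporary file stands in for ~/mineru.json so nothing in your home directory is touched):

```python
import json
import tempfile
from pathlib import Path

# Placeholder path: replace with wherever you cloned the model.
model_dir = "/absolute/path/to/DocV3RExpert-1.2B"

config = {
    "models-dir": {
        "pipeline": model_dir,  # layout-analysis (pipeline) model
        "vlm": "",              # left empty: no VLM backend
    },
    "config_version": "1.3.1",
}

# In practice this file is ~/mineru.json (or wherever
# MINERU_TOOLS_CONFIG_JSON points); a temp dir is used here for safety.
config_path = Path(tempfile.mkdtemp()) / "mineru.json"
config_path.write_text(json.dumps(config, indent=2))

# Round-trip to confirm the file is valid JSON with the expected keys.
loaded = json.loads(config_path.read_text())
```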

Note:

  • pipeline is used for layout analysis and other non-VLM tasks.
  • vlm is for the VLM backend. You can leave it empty if you are not using a VLM.
  • According to MinerU docs, models-dir should specify directories for pipeline and vlm separately.

Option 2: If you also have a local VLM model

If you want to use a VLM backend with a local VLM model, fill in the vlm field:
{
  "models-dir": {
    "pipeline": "/absolute/path/to/DocV3RExpert-1.2B",
    "vlm": "/absolute/path/to/your-vlm-model"
  },
  "config_version": "1.3.1"
}

Important: pipeline and vlm paths cannot point to the same directory; they must be separate model directories.
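Because the two paths must differ, a quick pre-flight check can catch the mistake before MinerU does. A hypothetical helper (the function name is my own, not part of MinerU):

```python
import json
from pathlib import Path

def check_models_dir(config: dict) -> None:
    """Raise if pipeline and vlm point to the same directory."""
    dirs = config["models-dir"]
    pipeline, vlm = dirs["pipeline"], dirs["vlm"]
    # An empty vlm entry is fine (pipeline-only setup).
    if vlm and Path(pipeline).resolve() == Path(vlm).resolve():
        raise ValueError("pipeline and vlm must be separate model directories")

# A pipeline-only config passes the check.
check_models_dir(json.loads("""{
  "models-dir": {
    "pipeline": "/models/DocV3RExpert-1.2B",
    "vlm": ""
  },
  "config_version": "1.3.1"
}"""))
```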

Step 3: Run Inference

3.1 Basic CLI (no VLM server)

If you are not using a VLM backend, you can directly run:

export MINERU_MODEL_SOURCE=local
mineru -p your_document.pdf -o output_dir

This will use your custom pipeline model for layout analysis.

3.2 With VLM backend (optional)

If you configured a vlm model and want to use the VLM backend:

  1. Start the VLM server:
    mineru-vllm-server --host 0.0.0.0 \
                        --port 30000 \
                        --max-model-len 16384 \
                        --data-parallel-size 1
    
    Common parameters such as --port and --data-parallel-size are supported by mineru-vllm-server.
  2. In another terminal, run MinerU with the VLM client backend:
    export MINERU_MODEL_SOURCE=local
    mineru -p your_document.pdf -o output_dir \
           -b vlm-http-client \
           -u http://127.0.0.1:30000
    
    Here -b vlm-http-client tells MinerU to call the VLM server you just started.
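If you drive MinerU from Python rather than the shell, the client invocation above can be assembled with subprocess. A sketch assuming mineru is on your PATH; the flags are exactly those shown above, and the command is assembled but not executed here (no live server is assumed):

```python
import os
import subprocess

# Equivalent of `export MINERU_MODEL_SOURCE=local` for the child process.
env = dict(os.environ, MINERU_MODEL_SOURCE="local")

# Same invocation as the CLI example: VLM client backend pointed
# at the server started on port 30000.
cmd = [
    "mineru",
    "-p", "your_document.pdf",
    "-o", "output_dir",
    "-b", "vlm-http-client",
    "-u", "http://127.0.0.1:30000",
]

# Uncomment to actually run once the VLM server is up:
# subprocess.run(cmd, env=env, check=True)
```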