DocV3RExpert-1.2B

1. Model Description

DocV3RExpert-1.2B is a state-of-the-art document layout analysis model fine-tuned for the MinerU 2.5 framework. Released in 2026, it features 1.2 billion parameters tailored for high-precision document understanding.

Key Improvements:

  • Table Detection: Optimized for high-stakes scenarios such as financial reports and academic papers. Achieves significant breakthroughs in complex structures, including nested tables and borderless tables.
  • Diagram Recognition: Specialized recognition for flowcharts, architecture diagrams, and block diagrams, delivering precise boundaries for technical documentation. Note: to enable block diagram detection, users need to modify the original MinerU code themselves.

This model focuses on layout analysis (pipeline) and does not include a VLM (vision-language model) backend.

2. Requirements

  • MinerU version >= 2.5
  • Ensure you have installed and initialized the MinerU environment.

3. Usage

MinerU 2.5 uses mineru.json for configuration and supports switching model sources via environment variables. To use this model, you need to set MINERU_MODEL_SOURCE=local and point the configuration to this model.

Step 1: Download Model

Clone this model repository to your local machine or server.

git clone https://huggingface.co/linglongOCR-group/DocV3RExpert-1.2B

Step 2: Configuration

(A) Set Environment Variable

Force MinerU to use local models:

export MINERU_MODEL_SOURCE=local

(B) Modify Config File

Edit the mineru.json configuration file.

  • Default Path: ~/mineru.json
  • Custom Path: You can also specify a path via the MINERU_TOOLS_CONFIG_JSON environment variable.

Update the models-dir section to point to the absolute path of the downloaded model.

Option 1: Pipeline-only (recommended for this model)
{
  "models-dir": {
    "pipeline": "/absolute/path/to/DocV3RExpert-1.2B",
    "vlm": ""
  },
  "config_version": "1.3.1"
}
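As a sanity check, the config above can also be generated and round-tripped programmatically. A minimal sketch (the model path is a placeholder, and a temporary file stands in for ~/mineru.json so nothing in your home directory is touched):

```python
import json
import tempfile
from pathlib import Path

# Placeholder path: replace with wherever you cloned the model.
model_dir = "/absolute/path/to/DocV3RExpert-1.2B"

config = {
    "models-dir": {
        "pipeline": model_dir,  # layout-analysis (pipeline) model
        "vlm": "",              # left empty: no VLM backend
    },
    "config_version": "1.3.1",
}

# In practice this file is ~/mineru.json (or wherever
# MINERU_TOOLS_CONFIG_JSON points); a temp dir is used here for safety.
config_path = Path(tempfile.mkdtemp()) / "mineru.json"
config_path.write_text(json.dumps(config, indent=2))

# Round-trip to confirm the file is valid JSON with the expected keys.
loaded = json.loads(config_path.read_text())
```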

Note:

  • pipeline is used for layout analysis and other non-VLM tasks.
  • vlm is for the VLM backend. You can leave it empty if you are not using a VLM.
  • According to MinerU docs, models-dir should specify directories for pipeline and vlm separately.

Option 2: If you also have a local VLM model

If you want to use a VLM backend with a local VLM model, fill in the vlm field:
{
  "models-dir": {
    "pipeline": "/absolute/path/to/DocV3RExpert-1.2B",
    "vlm": "/absolute/path/to/your-vlm-model"
  },
  "config_version": "1.3.1"
}

Important: pipeline and vlm paths cannot point to the same directory; they must be separate model directories.
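Because the two paths must differ, a quick pre-flight check can catch the mistake before MinerU does. A hypothetical helper (the function name is my own, not part of MinerU):

```python
import json
from pathlib import Path

def check_models_dir(config: dict) -> None:
    """Raise if pipeline and vlm point to the same directory."""
    dirs = config["models-dir"]
    pipeline, vlm = dirs["pipeline"], dirs["vlm"]
    # An empty vlm entry is fine (pipeline-only setup).
    if vlm and Path(pipeline).resolve() == Path(vlm).resolve():
        raise ValueError("pipeline and vlm must be separate model directories")

# A pipeline-only config passes the check.
check_models_dir(json.loads("""{
  "models-dir": {
    "pipeline": "/models/DocV3RExpert-1.2B",
    "vlm": ""
  },
  "config_version": "1.3.1"
}"""))
```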

Step 3: Run Inference

3.1 Basic CLI (no VLM server)

If you are not using a VLM backend, you can directly run:

export MINERU_MODEL_SOURCE=local
mineru -p your_document.pdf -o output_dir

This will use your custom pipeline model for layout analysis.

3.2 With VLM backend (optional)

If you configured a vlm model and want to use the VLM backend:

  1. Start the VLM server:
    mineru-vllm-server --host 0.0.0.0 \
                        --port 30000 \
                        --max-model-len 16384 \
                        --data-parallel-size 1
    
    Common parameters such as --port and --data-parallel-size are supported by mineru-vllm-server.
  2. In another terminal, run MinerU with the VLM client backend:
    export MINERU_MODEL_SOURCE=local
    mineru -p your_document.pdf -o output_dir \
           -b vlm-http-client \
           -u http://127.0.0.1:30000
    
    Here -b vlm-http-client tells MinerU to call the VLM server you just started.
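If you drive MinerU from Python rather than the shell, the client invocation above can be assembled with subprocess. A sketch assuming mineru is on your PATH; the flags are exactly those shown above, and the command is assembled but not executed here (no live server is assumed):

```python
import os
import subprocess

# Equivalent of `export MINERU_MODEL_SOURCE=local` for the child process.
env = dict(os.environ, MINERU_MODEL_SOURCE="local")

# Same invocation as the CLI example: VLM client backend pointed
# at the server started on port 30000.
cmd = [
    "mineru",
    "-p", "your_document.pdf",
    "-o", "output_dir",
    "-b", "vlm-http-client",
    "-u", "http://127.0.0.1:30000",
]

# Uncomment to actually run once the VLM server is up:
# subprocess.run(cmd, env=env, check=True)
```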