Customs Declaration-Form-Audit VLM

---
license: apache-2.0
language:
- zh
- en
base_model:
- Qwen/Qwen3-VL-8B-Instruct
pipeline_tag: image-text-to-text
tags:
- Declaration-Form
- Audit
- vision
- multimodal
- customs
- document-understanding
---


<h1> Customs Declaration-Form-Audit VLM </h1>




A multimodal vision-language model optimized for intelligent auditing of customs declaration documents.


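## 🎯 Model Capabilities

This model focuses on intelligent auditing of import and export customs declaration documents. Its core capabilities are: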

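### Document Information Extraction
- **Certificate type recognition**: health certificates, certificates of origin, inspection reports, contracts, invoices, and more
- **Key field extraction**: certificate number, container number, package count, net weight, gross weight
- **Line-item parsing**: row-by-row extraction of table data (product name, quantity, amount, etc.)
- **Date extraction**: issue date, validity period, production date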
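
Extracted container numbers can be sanity-checked after the fact, because ISO 6346 numbers carry a check digit. The sketch below is a suggested post-processing step, not part of the model or its training pipeline; the function name `is_valid_container_number` is hypothetical, and the sketch does not verify that the fourth letter is a valid category identifier.

```python
import re


def _letter_values():
    """Map A-Z to 10..38, skipping multiples of 11 as ISO 6346 requires."""
    values, n = {}, 10
    for ch in "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
        if n % 11 == 0:
            n += 1
        values[ch] = n
        n += 1
    return values


LETTER_VALUES = _letter_values()
CONTAINER_PATTERN = re.compile(r"^[A-Z]{4}[0-9]{7}$")


def is_valid_container_number(raw: str) -> bool:
    """Return True if `raw` has the 4-letter + 7-digit shape and a correct check digit."""
    code = re.sub(r"[\s-]", "", raw).upper()  # tolerate spaces and hyphens from OCR output
    if not CONTAINER_PATTERN.match(code):
        return False
    total = 0
    for position, ch in enumerate(code[:10]):
        value = LETTER_VALUES[ch] if ch.isalpha() else int(ch)
        total += value * (2 ** position)  # each position is weighted by a power of two
    # A remainder of 10 is written as check digit 0
    return total % 11 % 10 == int(code[10])


print(is_valid_container_number("CSQU3054383"))
print(is_valid_container_number("CSQU3054384"))
```

A failed check does not say which character is wrong, so it is best used to flag a field for manual review rather than to correct it automatically.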

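### Table Data Processing
- Row-by-row scanning of complex multi-column tables
- Accurate recognition of mixed numeric, date, and text content
- Automatic handling of merged cells and multi-column layouts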

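### Multilingual OCR
- Mixed-language recognition across Chinese, English, Spanish, Japanese, Russian, and other languages
- Documents that mix handwriting and print
- Improved recognition of blurred characters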

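### Document Comparison Audit
- Checks consistency between the customs declaration and its accompanying certificates
- Flags data anomalies and potential risk points
- Produces structured audit results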
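
The model card does not specify how the comparison works internally or what the structured result looks like. As a rough illustration of the idea, the sketch below compares fields already extracted from a declaration and a certificate. The function names, field keys, result layout, and the 0.1% weight tolerance are all assumptions made for this example.

```python
import math


def _values_agree(declared, certified, rel_tol):
    """Compare two extracted values with type-appropriate rules."""
    if isinstance(declared, int) and isinstance(certified, int):
        return declared == certified  # counts must match exactly
    if isinstance(declared, (int, float)) and isinstance(certified, (int, float)):
        return math.isclose(declared, certified, rel_tol=rel_tol)  # weights may differ by rounding
    if isinstance(declared, list) and isinstance(certified, list):
        return set(declared) == set(certified)  # order of container numbers is irrelevant
    return str(declared).strip().upper() == str(certified).strip().upper()


def compare_documents(declaration, certificate, fields, rel_tol=0.001):
    """Return one finding per field with status "match", "mismatch", or "missing"."""
    findings = []
    for name in fields:
        declared, certified = declaration.get(name), certificate.get(name)
        if declared is None or certified is None:
            status = "missing"
        elif _values_agree(declared, certified, rel_tol):
            status = "match"
        else:
            status = "mismatch"
        findings.append({"field": name, "declared": declared,
                         "certified": certified, "status": status})
    return findings


# Hypothetical extracted values, for illustration only
declaration = {"cert_code": "ab-123 ", "packages": 100, "net_weight_kg": 2500.0}
certificate = {"cert_code": "AB-123", "packages": 98, "net_weight_kg": 2500.4}
for finding in compare_documents(
    declaration, certificate,
    ["cert_code", "packages", "net_weight_kg", "gross_weight_kg"],
):
    print(finding["field"], finding["status"])
```

Keeping the raw values next to each status lets a reviewer see why a field was flagged without reopening the source documents.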

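## 🔧 Model Training

### Training Method
The model is trained with a strategy that combines initial injection of document-audit domain knowledge (continued pre-training, CPT), multi-stage supervised fine-tuning (SFT), and two-stage reinforcement learning (RL):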

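1. **Vision-language alignment stage**: strengthens the model's understanding of document images
2. **Domain data adaptation stage**: teaches the specialized terminology and formats of customs declaration documents
3. **Task-specific optimization stage**: targeted training on concrete tasks such as table extraction and field recognition
4. **Multi-task fusion stage**: improves all audit capabilities together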

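### Training Data Scale
- Supervised learning stage: about 700,000 high-quality annotated samples
- Reinforcement learning stage: about 150,000 audit task samples
- Covers document formats from more than 20 countries and regions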

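## 📄 Handling Difficult PDFs

### Low-Quality Image Optimization
Training specifically targeted the difficult PDFs encountered in real business workflows:

1. Unusual certificate number formats: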

![wechat_2026-02-10_153652_180](https://cdn-uploads.huggingface.co/production/uploads/679f3301b6a9cac2b8154fac/eK4r6TuVmGsJ63AO3IX1z.png)

2. Table data extraction and aggregation:

![wechat_2026-02-10_154616_288](https://cdn-uploads.huggingface.co/production/uploads/679f3301b6a9cac2b8154fac/Zx0iN8w210JnheF_eR-CK.png)

     
3. Extraction from irregular tables:
![wechat_2026-02-10_154730_980](https://cdn-uploads.huggingface.co/production/uploads/679f3301b6a9cac2b8154fac/btTW8h00LWyNKDwOXUAda.png)


4. Extraction and accumulation across multi-page documents:

![wechat_2026-02-10_154844_598](https://cdn-uploads.huggingface.co/production/uploads/679f3301b6a9cac2b8154fac/s-HKZLXpZPU_hErJpVWhi.png)
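
When rows are extracted page by page, the totals can also be cross-checked outside the model. The following is a minimal sketch of such a check, assuming each page has already been converted to a list of row dictionaries. The function name `accumulate_pages` and the keys `packages` and `net_weight_kg` are hypothetical. `Decimal` is used so that summing many small weights does not accumulate floating-point error.

```python
from decimal import Decimal


def accumulate_pages(pages):
    """Sum package counts and net weights over every row of every page.

    `pages` is a list of pages; each page is a list of row dicts.
    """
    totals = {"packages": 0, "net_weight_kg": Decimal("0")}
    for rows in pages:
        for row in rows:
            totals["packages"] += int(row["packages"])
            # str() first so that float inputs keep their printed value
            totals["net_weight_kg"] += Decimal(str(row["net_weight_kg"]))
    return totals


# Hypothetical rows extracted from a two-page packing list
pages = [
    [{"packages": 10, "net_weight_kg": "120.5"}, {"packages": 5, "net_weight_kg": 60.25}],
    [{"packages": 7, "net_weight_kg": "0.1"}, {"packages": 3, "net_weight_kg": "0.2"}],
]
totals = accumulate_pages(pages)
print(totals["packages"], totals["net_weight_kg"])
```

Comparing these sums with the totals printed on the document is a cheap way to catch a row that was skipped or read twice at a page break.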

     
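### Measured Results

| Test scenario | Accuracy |
|---------------|----------|
| Certificate number recognition | 99%+ |
| Container number extraction | 98%+ |
| Table data extraction | 99%+ |
| Package count and weight recognition | 99%+ |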

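## 🚀 Quick Start

### Install Dependencies

```bash
# accelerate is needed for device_map="auto"; use a transformers release that supports Qwen3-VL
pip install transformers accelerate torch pillow
```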

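### Python Inference

```python
from transformers import AutoModelForImageTextToText, AutoProcessor
from PIL import Image
import torch

MODEL_ID = "shihao1989/Declaration-Form-Audit"

# Load the model. AutoModelForImageTextToText selects the architecture
# (Qwen3-VL, per base_model) from the checkpoint config.
model = AutoModelForImageTextToText.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(MODEL_ID)

# Prepare the input
image = Image.open("certificate.jpg")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {
                "type": "text",
                # "Extract the certificate number, container number,
                # package count, and net weight from this certificate."
                "text": "请提取这份证书的证书编号、集装箱号、件数和净重。",
            },
        ],
    }
]

# Run inference
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[text], images=[image], return_tensors="pt").to(model.device)

# temperature only takes effect when sampling is enabled
output = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.1)
# Drop the prompt tokens so that only the generated answer is decoded
generated = output[:, inputs["input_ids"].shape[1]:]
result = processor.batch_decode(generated, skip_special_tokens=True)[0]
print(result)
```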

### vLLM Deployment (Recommended for Production)

```bash
docker run -d \
  --name declaration-audit \
  --runtime=nvidia \
  -e NVIDIA_VISIBLE_DEVICES=0 \
  --ipc=host \
  -p 8000:8000 \
  vllm/vllm-openai:latest \
  --model shihao1989/Declaration-Form-Audit \
  --trust-remote-code \
  --max-model-len 32000 \
  --gpu-memory-utilization 0.9
```

### API Call

```python
import base64

import requests

# Encode the image as base64 so it can be sent inline as a data URL
with open("certificate.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode()

response = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "shihao1989/Declaration-Form-Audit",
        "messages": [
            {
                "role": "user",
                "content": [
                    # "Extract the certificate number and net weight"
                    {"type": "text", "text": "提取证书编号和净重"},
                    {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
                ],
            }
        ],
        "max_tokens": 512,
        "temperature": 0.1,
    },
    timeout=120,
)
response.raise_for_status()  # surface HTTP errors instead of failing on a missing key

print(response.json()["choices"][0]["message"]["content"])
```

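## 💡 Best Practices

### Prompt Design Recommendations

**Recommended format (structured output):**
```
Extract the following fields from this certificate of origin and return them as JSON:
{
  "cert_code": "certificate number",
  "containers": ["list of container numbers"],
  "packages": package count (integer),
  "net_weight_kg": net weight (number)
}
Output JSON only, with no extra text.
```

**Key principles:**
- Specify the fields to extract and the output format explicitly
- List the names a field may appear under (e.g. "证书编号/Certificate No.")
- Use a structured format such as JSON to simplify post-processing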
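
The principles above can be sketched in code: one helper builds a prompt that lists each field with its possible names, and another pulls the JSON object out of the reply. The alias table, the function names `build_prompt` and `parse_json_reply`, and the English prompt wording are assumptions made for illustration. The parser assumes the reply contains a single JSON object, possibly wrapped in a Markdown code fence.

```python
import json
import re

# Hypothetical alias table: output key -> names the field may carry on a document
FIELD_ALIASES = {
    "cert_code": ["证书编号", "Certificate No."],
    "net_weight_kg": ["净重", "Net Weight"],
}


def build_prompt(doc_type, aliases):
    """Build an extraction prompt that names every field and its aliases."""
    lines = [f"Extract the following fields from this {doc_type} and return JSON:"]
    for key, names in aliases.items():
        lines.append(f'- "{key}": may appear as {" / ".join(names)}')
    lines.append("Output JSON only, with no extra text.")
    return "\n".join(lines)


def parse_json_reply(reply):
    """Return the JSON object found between the first '{' and the last '}'."""
    match = re.search(r"\{.*\}", reply, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in reply")
    return json.loads(match.group(0))


print(build_prompt("certificate of origin", FIELD_ALIASES))
print(parse_json_reply('Result: {"cert_code": "AB-123", "net_weight_kg": 2500.5}'))
```

`json.loads` still raises on malformed JSON, which is usually the desired behavior in an audit pipeline: a reply that cannot be parsed should be retried or reviewed, not silently repaired.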

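## 📜 License

This model is released under the Apache 2.0 license.

## 🙏 Acknowledgments

- The Qwen team, for the excellent base model
- Customs domain experts, for their guidance on domain knowledge

## 📮 Contact

For questions or suggestions, please use Hugging Face Discussions.

Email: 199416378@qq.com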