shihao1989
/

Declaration-Form-Audit

+---
+license: apache-2.0
+language:
+- zh
+- en
+base_model:
+- Qwen/Qwen3-VL-8B-Instruct
+pipeline_tag: image-text-to-text
+tags:
+- Declaration-Form
+- Audit
+- vision
+- multimodal
+- customs
+- document-understanding
+---
+     # Declaration-Form-Audit
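+     A multimodal vision-language model optimized for intelligent auditing of customs declaration documents.
+     ## 🎯 Model Capabilities
+     This model focuses on auditing import and export customs declaration documents and provides the following core capabilities:
+     ### Document Information Extraction
+     - **Certificate type recognition**: health certificates, certificates of origin, inspection reports, contracts, invoices, and similar documents
+     - **Key field extraction**: certificate number, container number, package count, net weight, gross weight
+     - **Line-item parsing**: row-by-row extraction of table data (product name, quantity, amount, and so on)
+     - **Date extraction**: issue date, validity period, production date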
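Extracted container numbers can be sanity-checked in post-processing, because a standard container number carries a check digit defined by ISO 6346. The sketch below is not part of the model or its training pipeline; the function name `is_valid_container_number` is a hypothetical helper introduced here for illustration, and it only checks the four-letter, seven-digit shape plus the check digit.

```python
import re


def _letter_values() -> dict:
    """Build the ISO 6346 letter table: values start at 10 and skip multiples of 11."""
    values, n = {}, 10
    for ch in "ABCDEFGHIJKLMNOPQRSTUVWXYZ":
        if n % 11 == 0:
            n += 1
        values[ch] = n
        n += 1
    return values


LETTER_VALUES = _letter_values()


def is_valid_container_number(raw: str) -> bool:
    """Return True if `raw` has the shape AAAA1234567 and a correct check digit."""
    # Tolerate spaces and hyphens that often appear in scanned documents.
    code = re.sub(r"[\s\-]", "", raw).upper()
    if not re.fullmatch(r"[A-Z]{4}\d{7}", code):
        return False
    # Each of the first 10 characters is weighted by 2**position.
    total = sum(
        (LETTER_VALUES[ch] if ch.isalpha() else int(ch)) * (2 ** i)
        for i, ch in enumerate(code[:10])
    )
    # A remainder of 10 maps to check digit 0.
    return total % 11 % 10 == int(code[10])


print(is_valid_container_number("CSQU3054383"))
```

A failed check does not say which character was misread, so a reasonable use is to flag the field for manual review rather than to correct it automatically.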
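+     ### Table Data Processing
+     - Scans complex multi-column tables row by row
+     - Accurately recognizes mixed content made up of numbers, dates, and text
+     - Automatically handles merged cells and multi-column layouts
+     ### Multilingual OCR
+     - Recognizes mixed-language text in Chinese, English, Spanish, Japanese, Russian, and other languages
+     - Supports documents that mix handwritten and printed text
+     - Optimized for recognizing blurred characters
+     ### Document Cross-Checking
+     - Checks that the customs declaration is consistent with its accompanying certificates
+     - Identifies data anomalies and potential risk points
+     - Produces structured audit results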
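Once fields have been extracted from a declaration and from an accompanying certificate, the consistency check can also be reproduced deterministically outside the model. The following is a minimal sketch under stated assumptions: the field names mirror the JSON keys used in the prompt example of this card, while `Mismatch`, `compare_documents`, and the 0.5 kg tolerance are hypothetical choices made for illustration, not part of the model.

```python
from dataclasses import dataclass


@dataclass
class Mismatch:
    field: str
    declared: object
    certified: object


def compare_documents(declaration: dict, certificate: dict,
                      weight_tolerance_kg: float = 0.5) -> list:
    """Compare fields present in both documents and return the mismatches."""
    mismatches = []
    for field in sorted(set(declaration) & set(certificate)):
        declared, certified = declaration[field], certificate[field]
        if field.endswith("_kg"):
            # Weights are compared numerically with a small tolerance.
            same = abs(float(declared) - float(certified)) <= weight_tolerance_kg
        elif isinstance(declared, list):
            # Lists such as container numbers are compared regardless of order.
            same = sorted(declared) == sorted(certified)
        else:
            # Everything else is compared as case-insensitive, trimmed text.
            same = str(declared).strip().upper() == str(certified).strip().upper()
        if not same:
            mismatches.append(Mismatch(field, declared, certified))
    return mismatches


declaration = {"cert_code": "co-123", "packages": 100,
               "net_weight_kg": 1200.0, "containers": ["B", "A"]}
certificate = {"cert_code": "CO-123 ", "packages": 98,
               "net_weight_kg": 1200.3, "containers": ["A", "B"]}
for m in compare_documents(declaration, certificate):
    print(f"{m.field}: declared={m.declared!r}, certified={m.certified!r}")
```

Keeping the comparison rules in plain code makes the audit result reproducible and lets the tolerance be tuned per field without re-prompting the model.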
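+     ## 🔧 Model Training
+     ### Training Method
+     This model is trained with a strategy of **initial injection of document-audit domain knowledge through continued pre-training (CPT) + multi-stage supervised fine-tuning (SFT) + two-stage reinforcement learning (RL)**:
+     1. **Vision-language alignment stage**: strengthens the model's understanding of document images
+     2. **Domain data adaptation stage**: learns the specialized terminology and formats of customs declaration documents
+     3. **Task-specific optimization stage**: applies targeted training to concrete tasks such as table extraction and field recognition
+     4. **Multi-task fusion stage**: improves all audit capabilities jointly
+     ### Training Data Scale
+     - Supervised fine-tuning stage: about 700,000 high-quality annotated samples
+     - Reinforcement learning stage: about 150,000 audit-task samples
+     - Covers document formats from more than 20 countries and regions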
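+     ## 📄 Handling Difficult PDFs
+     ### Low-Quality Image Optimization
+     During training, this model was specifically optimized for the difficult PDFs that occur in real business workflows:
+     1. Unusual certificate number formats: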
+![wechat_2026-02-10_153652_180](https://cdn-uploads.huggingface.co/production/uploads/679f3301b6a9cac2b8154fac/eK4r6TuVmGsJ63AO3IX1z.png)
+     2. Table data extraction and aggregation:
+![wechat_2026-02-10_154616_288](https://cdn-uploads.huggingface.co/production/uploads/679f3301b6a9cac2b8154fac/Zx0iN8w210JnheF_eR-CK.png)
+     3. Extraction from irregular tables:
+![wechat_2026-02-10_154730_980](https://cdn-uploads.huggingface.co/production/uploads/679f3301b6a9cac2b8154fac/btTW8h00LWyNKDwOXUAda.png)
+     4. Extraction and accumulation across multi-page documents:
+![wechat_2026-02-10_154844_598](https://cdn-uploads.huggingface.co/production/uploads/679f3301b6a9cac2b8154fac/s-HKZLXpZPU_hErJpVWhi.png)
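For multi-page documents, one option is to extract rows page by page and add up the totals in ordinary code, so the arithmetic does not depend on the model. This is only a sketch: `accumulate_pages` and the row keys `packages` and `net_weight_kg` are assumptions chosen to match the JSON example in this card, and `Decimal` is used to avoid floating-point drift when summing weights.

```python
from decimal import Decimal


def accumulate_pages(pages: list) -> dict:
    """Sum package counts and net weights over the rows extracted from each page."""
    totals = {"packages": 0, "net_weight_kg": Decimal("0")}
    for page in pages:
        for row in page:
            totals["packages"] += int(row["packages"])
            # Convert through str so that floats such as 84.1 keep their printed value.
            totals["net_weight_kg"] += Decimal(str(row["net_weight_kg"]))
    return totals


# Hypothetical per-page extraction results, one list of rows per page.
pages = [
    [{"packages": 10, "net_weight_kg": "120.5"},
     {"packages": 5, "net_weight_kg": "60.25"}],
    [{"packages": 7, "net_weight_kg": 84.1}],
]
print(accumulate_pages(pages))
```

The computed totals can then be compared with the total line printed on the last page of the document, if one exists.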
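+     ### Measured Results
+     | Test scenario | Accuracy |
+     |---------|-------|
+     | Certificate number recognition | 99%+ |
+     | Container number extraction | 98%+ |
+     | Table data extraction | 99%+ |
+     | Package count and weight recognition | 99%+ |
+     ## 🚀 Quick Start
+     ### Install Dependencies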
+     ```bash
+     pip install transformers torch pillow accelerate
+     ```
+     ### Python Inference
+     ```python
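+     from transformers import AutoModelForImageTextToText, AutoProcessor
+     from PIL import Image
+     import torch
+     MODEL_ID = "shihao1989/Declaration-Form-Audit"
+     # Load the model. The base model is Qwen3-VL, so the Auto class is used to
+     # resolve the matching architecture (this needs a recent transformers release).
+     model = AutoModelForImageTextToText.from_pretrained(
+         MODEL_ID,
+         torch_dtype=torch.bfloat16,
+         device_map="auto",
+     )
+     processor = AutoProcessor.from_pretrained(MODEL_ID)
+     # Prepare the input
+     image = Image.open("certificate.jpg")
+     messages = [
+         {
+             "role": "user",
+             "content": [
+                 {"type": "image"},
+                 {
+                     "type": "text",
+                     "text": "Extract the certificate number, container number, package count, and net weight from this certificate.",
+                 },
+             ],
+         }
+     ]
+     # Run inference
+     text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+     inputs = processor(text=[text], images=[image], return_tensors="pt").to(model.device)
+     # temperature only takes effect when sampling is enabled
+     output = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.1)
+     # Drop the prompt tokens so that only the generated answer is decoded
+     generated = output[:, inputs["input_ids"].shape[1]:]
+     result = processor.batch_decode(generated, skip_special_tokens=True)[0]
+     print(result)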
+     ```
+     ### vLLM Deployment (Recommended for Production)
+     ```bash
+     docker run -d \
+       --name declaration-audit \
+       --runtime=nvidia \
+       -e NVIDIA_VISIBLE_DEVICES=0 \
+       --ipc=host \
+       -p 8000:8000 \
+       vllm/vllm-openai:latest \
+       --model shihao1989/Declaration-Form-Audit \
+       --trust-remote-code \
+       --max-model-len 32000 \
+       --gpu-memory-utilization 0.9
+     ```
+     ### Calling the API
+     ```python
+     import requests
+     import base64
+     # Encode the image as base64 so it can be sent inline as a data URL
+     with open("certificate.jpg", "rb") as f:
+         image_b64 = base64.b64encode(f.read()).decode()
+     response = requests.post("http://localhost:8000/v1/chat/completions", json={
+         "model": "shihao1989/Declaration-Form-Audit",
+         "messages": [
+             {
+                 "role": "user",
+                 "content": [
+                     {"type": "text", "text": "Extract the certificate number and net weight"},
+                     {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}}
+                 ]
+             }
+         ],
+         "max_tokens": 512,
+         "temperature": 0.1
+     }, timeout=120)
+     # Fail early on HTTP errors instead of raising a KeyError on the response body
+     response.raise_for_status()
+     print(response.json()["choices"][0]["message"]["content"])
+     ```
+     ## 💡 Best Practices
+     ### Prompt Design Recommendations
+     **Recommended format (structured output):**
+     ```
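+     Extract the following fields from this certificate of origin and return them as JSON:
+     {
+       "cert_code": "certificate number",
+       "containers": ["list of container numbers"],
+       "packages": package count (integer),
+       "net_weight_kg": net weight in kg (number)
+     }
+     Output only the JSON, with no additional text.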
+     ```
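+     **Key principles:**
+     - State the fields to extract and the output format explicitly
+     - List the names a field may appear under (for example, "证书编号 / Certificate No.")
+     - Use a structured format such as JSON to simplify post-processing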
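The principles above can be sketched as two small helpers: one that assembles a prompt from a field list, and one that parses the reply even when the model wraps the JSON in a Markdown code fence or adds stray text. Both `build_prompt` and `parse_json_reply` are hypothetical names introduced for this example, and the sample reply is made up to show the parsing step, not real model output.

```python
import json
import re


def build_prompt(doc_type: str, fields: dict) -> str:
    """Build an extraction prompt that names every field and its possible labels."""
    lines = [f'  "{key}": "{description}"' for key, description in fields.items()]
    schema = "{\n" + ",\n".join(lines) + "\n}"
    return (
        f"Extract the following fields from this {doc_type} and return them as JSON:\n"
        f"{schema}\n"
        "Output only the JSON, with no additional text."
    )


def parse_json_reply(reply: str) -> dict:
    """Parse the JSON object in a reply, ignoring code fences or surrounding text."""
    # Take everything from the first "{" to the last "}".
    match = re.search(r"\{.*\}", reply, flags=re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model reply")
    return json.loads(match.group(0))


prompt = build_prompt(
    "certificate of origin",
    {
        "cert_code": "certificate number / Certificate No.",
        "net_weight_kg": "net weight in kg / Net Weight",
    },
)
print(prompt)

# Made-up reply used only to demonstrate parsing.
sample_reply = '```json\n{"cert_code": "CO-123", "net_weight_kg": 1200.5}\n```'
print(parse_json_reply(sample_reply))
```

If `json.loads` still fails, retrying the request or routing the document to manual review is usually safer than attempting to repair the text.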
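+     ## 📜 License
+     This model is released under the Apache 2.0 license.
+     ## 🙏 Acknowledgements
+     - The Qwen team, for providing an excellent base model
+     - Customs domain experts, for their guidance on domain knowledge
+     ## 📮 Contact
+     For questions or suggestions, please use Hugging Face Discussions.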