metadata
license: cc-by-nc-sa-4.0
datasets:
  - UniParser/MolDet-Bench
base_model:
  - UniParser/MolDet
  - Ultralytics/YOLO11
tags:
  - chemistry

Molecule Detection YOLO in MolParser2.0

Compared to MolDet, MolDetv2 is trained on more manually annotated data and is further optimized to reduce false molecule detections and improve bounding box regression, achieving stronger performance with a smaller model.

[MolDet-General] General molecule structure detection models

YOLO11-n weights trained on more than 100k human-annotated image crops and synthetic molecule images.


  • 640x640 input resolution
  • supports handwritten molecule detection
  • multiscale input (inputs can be single/multiple molecular cutouts, reaction or table cutouts, or single-page PDF images)
  • update: MolDetv2 substantially reduces false positives on formulas, ball-and-stick diagrams, etc.

usage:

from ultralytics import YOLO
# For CPU-only inference, load `moldet_v2_yolo11n_640_general.onnx` instead for faster speed.
model = YOLO("/path/to/moldet_v2_yolo11n_640_general.pt")
model.predict("path/to/image.png", save=True, imgsz=640, conf=0.5)

For further usage instructions, please refer to the official Ultralytics documentation.
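
The `predict` call above returns a list of result objects, one per input image. A minimal sketch of turning those results into padded crop boxes (for example, to cut out each detected molecule) might look like the following. The helper names `pad_and_clamp` and `detect_crop_boxes` and the 4-pixel padding are illustrative assumptions, not part of MolDetv2 or Ultralytics. `ultralytics` is imported lazily so the pure-Python helper can be used on its own.

```python
import math


def pad_and_clamp(xyxy, width, height, pad=4):
    """Expand an [x1, y1, x2, y2] pixel box by `pad` and clamp it to the image."""
    x1, y1, x2, y2 = xyxy
    return [
        max(0, math.floor(x1) - pad),
        max(0, math.floor(y1) - pad),
        min(width, math.ceil(x2) + pad),
        min(height, math.ceil(y2) + pad),
    ]


def detect_crop_boxes(image_path, weights_path, conf=0.5, pad=4):
    """Run the detector and return padded crop boxes (hypothetical helper)."""
    from ultralytics import YOLO  # lazy import: only needed for inference

    model = YOLO(weights_path)
    crops = []
    for result in model.predict(image_path, imgsz=640, conf=conf):
        height, width = result.orig_shape  # (height, width) of the input image
        for box in result.boxes:
            crops.append({
                "conf": float(box.conf),
                "bbox": pad_and_clamp(box.xyxy[0].tolist(), width, height, pad),
            })
    return crops
```

Each returned `bbox` can be passed directly to an image cropping call such as `PIL.Image.Image.crop`, since it is already limited to the image bounds.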

[MolDet-Doc] PDF molecule structure detection models

YOLO11-n weights trained on more than 60k human-annotated PDF pages (patents, papers, and books) and 10k synthetic PDF pages with molecule images.


  • 960x960 input resolution
  • prefers single-page PDF image input
  • better at detecting small molecules
  • update: MolDetv2 substantially reduces false positives on formulas, ball-and-stick diagrams, and graphical symbols, with tighter bounding box alignment to molecular edges.
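
To illustrate why a larger input resolution helps with small molecules on full pages, here is a rough arithmetic sketch. It assumes the longer side of the page image is resized to `imgsz` with the aspect ratio preserved, which is how letterbox-style preprocessing generally behaves; the page size and molecule width are made-up numbers, and `scaled_size` is a hypothetical helper.

```python
def scaled_size(object_px, page_width, page_height, imgsz):
    """Approximate size of an object after an aspect-preserving resize to imgsz."""
    scale = imgsz / max(page_width, page_height)
    return object_px * scale


# A 60-pixel-wide molecule on a 1920x2560 page render (hypothetical numbers).
for imgsz in (640, 960):
    print(imgsz, scaled_size(60, 1920, 2560, imgsz))
# prints:
# 640 15.0
# 960 22.5
```

Under these assumptions the same molecule covers 1.5 times as many pixels per side at 960 as at 640, which leaves the detector more detail to work with.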

usage:

from ultralytics import YOLO
import fitz  # PyMuPDF

pdf = fitz.open("doc.pdf")
# For CPU-only inference, load `moldet_v2_yolo11n_960_doc.onnx` instead for faster speed.
model = YOLO("/path/to/moldet_v2_yolo11n_960_doc.pt")
bboxes = []
for i, page in enumerate(pdf):
    # Render the page to a PNG (72 dpi by default) and run detection on it.
    img = f"page_{i}.png"
    page.get_pixmap().save(img)
    for r in model.predict(img, imgsz=960, conf=0.5):
        for box in r.boxes:
            # Box coordinates are in pixels of the rendered page image.
            bboxes.append({"page": img, "conf": float(box.conf), "bbox": box.xyxy[0].tolist()})

For further usage instructions, please refer to the official Ultralytics documentation.
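
If pages are rendered at a higher resolution than the default, the detected boxes are in rendered pixels and need rescaling before they can be used against the PDF itself. The sketch below assumes a PyMuPDF version whose `get_pixmap` accepts a `dpi` argument, an unrotated page, and a page origin at (0, 0); `pixel_bbox_to_pdf_points` and `detect_on_pdf` are hypothetical helper names, and the 144 dpi default is an arbitrary choice.

```python
def pixel_bbox_to_pdf_points(bbox, dpi):
    """Convert [x1, y1, x2, y2] from rendered pixels to PDF points (72 per inch)."""
    scale = 72.0 / dpi
    return [coord * scale for coord in bbox]


def detect_on_pdf(pdf_path, weights_path, dpi=144, conf=0.5):
    """Detect molecules on every page and report boxes in PDF points (hypothetical helper)."""
    import fitz  # PyMuPDF, imported lazily
    from ultralytics import YOLO

    model = YOLO(weights_path)
    detections = []
    with fitz.open(pdf_path) as pdf:
        for page_index, page in enumerate(pdf):
            image_path = f"page_{page_index}.png"
            page.get_pixmap(dpi=dpi).save(image_path)  # render at the chosen dpi
            for result in model.predict(image_path, imgsz=960, conf=conf):
                for box in result.boxes:
                    detections.append({
                        "page": page_index,
                        "conf": float(box.conf),
                        "bbox_pt": pixel_bbox_to_pdf_points(box.xyxy[0].tolist(), dpi),
                    })
    return detections
```

Keeping the conversion in a separate function makes it easy to check on its own: at 144 dpi every pixel coordinate is simply halved.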

📊 Benchmark Results

Please refer to MolDet-Bench.


📖 Citation

If you use this model in your work, please cite:

Coming soon!