AI4Industry committed
Commit cd6a61c · verified · 1 Parent(s): 37e3ec5

Update README.md

  1. README.md +75 -3
README.md CHANGED
@@ -1,3 +1,75 @@
1
- ---
2
- license: cc-by-nc-sa-4.0
3
- ---
1
+ ---
2
+ license: cc-by-nc-sa-4.0
3
+ datasets:
4
+ - UniParser/MolDet-Bench
5
+ base_model:
6
+ - UniParser/MolDet
7
+ - Ultralytics/YOLO11
8
+ tags:
9
+ - chemistry
10
+ ---
11
+
12
+
13
+ # Molecule Detection YOLO in MolParser2.0
14
+
15
+ Compared to [MolDet](https://huggingface.co/UniParser/MolDet), **MolDetv2** is trained on more manually annotated data and adds optimizations aimed at reducing false molecule detections and improving bounding-box regression, achieving stronger performance with a smaller model.
16
+
17
+ ## [MolDet-General] General molecule structure detection models
18
+
19
+ YOLO11-n weights trained on more than 100k human-annotated image crops and synthetic molecule images.
20
+
21
+ * 640x640 input resolution
22
+ * supports detection of handwritten molecules
23
+ * **multiscale input** (inputs can be single/multiple molecular cutouts, reaction or table cutouts, or single-page PDF images)
24
+ * *update: MolDetv2 substantially reduces false positives on formulas, ball-and-stick diagrams, etc.*
25
+
26
+ usage:
27
+ ```python
28
+ from ultralytics import YOLO
29
+ model = YOLO("/path/to/moldet_v2_yolo11n_640_general.pt") # for CPU-only inference, use `moldet_v2_yolo11n_640_general.onnx` for faster speed
30
+ model.predict("path/to/image.png", save=True, imgsz=640, conf=0.5)
31
+ ```
32
+ For further usage instructions, please refer to the [official Ultralytics documentation](https://docs.ultralytics.com/modes/predict/).
33
+
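The inline comment in the usage snippet recommends the `.onnx` export for CPU-only inference. A minimal sketch of that choice follows. The helper names `pick_weights` and `load_detector` and the `weights_dir` default are hypothetical, and the sketch assumes both the `.pt` and `.onnx` files sit in the same directory under the same stem.

```python
from pathlib import Path


def pick_weights(stem: str, use_gpu: bool, weights_dir: str = "/path/to") -> Path:
    """Return the .pt checkpoint for GPU runs and the .onnx export for CPU-only runs."""
    suffix = ".pt" if use_gpu else ".onnx"
    return Path(weights_dir) / f"{stem}{suffix}"


def load_detector(stem: str, weights_dir: str = "/path/to"):
    """Load the detector (hypothetical helper); third-party imports stay local."""
    import torch  # assumed to be installed alongside ultralytics
    from ultralytics import YOLO

    weights = pick_weights(stem, torch.cuda.is_available(), weights_dir)
    # The task is passed explicitly so an ONNX file is not left to task guessing.
    return YOLO(str(weights), task="detect")
```

Calling `load_detector("moldet_v2_yolo11n_640_general")` would then return a model whose `predict` call is used exactly as in the usage snippet.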
34
+ ## [MolDet-Doc] PDF molecule structure detection models
35
+
36
+ YOLO11-n weights trained on more than 60k human-annotated PDF pages (patents, papers, and books) and 10k synthetic PDF pages with molecule images.
37
+
38
+ * 960x960 input resolution
39
+ * prefers **single-page PDF image** input
40
+ * better at detecting small molecules
41
+ * *update: MolDetv2 substantially reduces false positives on formulas, ball-and-stick diagrams, and graphical symbols, with tighter bounding box alignment to molecular edges.*
42
+
43
+ usage:
44
+ ```python
45
+ from ultralytics import YOLO
46
+ import fitz # MuPDF
47
+ pdf = fitz.open("doc.pdf")
48
+ model = YOLO("/path/to/moldet_v2_yolo11n_960_doc.pt") # for CPU-only inference, use `moldet_v2_yolo11n_960_doc.onnx` for faster speed
49
+ bboxes = []
50
+ for i, p in enumerate(pdf):
51
+     img = f"page_{i}.png"; p.get_pixmap().save(img)
52
+     for r in model.predict(img, imgsz=960, conf=0.5):
53
+         for box in r.boxes:
54
+             bboxes.append({"page":img, "conf":float(box.conf), "bbox":box.xyxy[0].tolist()})
55
+ ```
56
+ For further usage instructions, please refer to the [official Ultralytics documentation](https://docs.ultralytics.com/modes/predict/).
57
+
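The usage snippet renders each page with `get_pixmap()` at its default resolution, where one pixel corresponds to one PDF point (72 per inch), so the collected boxes are already in page units. If pages are instead rendered at a higher resolution, for example with `get_pixmap(dpi=200)`, the pixel boxes need to be scaled back before they can be matched against the PDF. The sketch below shows that conversion. The function name `pixels_to_pdf_points` is hypothetical, and the sketch assumes an unrotated page whose origin coincides with the rendered image origin.

```python
from typing import List, Sequence

PDF_POINTS_PER_INCH = 72.0


def pixels_to_pdf_points(bbox_xyxy: Sequence[float], dpi: float) -> List[float]:
    """Scale an [x1, y1, x2, y2] pixel box from a page rendered at `dpi` to PDF points."""
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    scale = PDF_POINTS_PER_INCH / dpi
    # Page rotation and crop-box offsets are not handled in this sketch.
    return [round(v * scale, 2) for v in bbox_xyxy]
```

For a page rendered at 200 dpi, `pixels_to_pdf_points(box.xyxy[0].tolist(), 200)` would replace the raw `bbox` value stored in each record.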
58
+ ## 📊 Benchmark Results
59
+
60
+ Please refer to [MolDet-Bench](https://huggingface.co/datasets/UniParser/MolDet-Bench).
61
+
62
+ ![image](https://cdn-uploads.huggingface.co/production/uploads/65f7f16fb6941db5c2e7c4bf/cVEMAL4xpy-y2vgmQ5mR-.png)
63
+
64
+
65
+ ![image](https://cdn-uploads.huggingface.co/production/uploads/65f7f16fb6941db5c2e7c4bf/KHwlqpRZzTOlbOdCG7SyD.png)
66
+ ![image](https://cdn-uploads.huggingface.co/production/uploads/65f7f16fb6941db5c2e7c4bf/l-oNYAv3aGbqvGjImYstW.png)
67
+
68
+
69
+ ## 📖 Citation
70
+
71
+ If you use this model in your work, please cite:
72
+
73
+ ```
74
+ coming soon!
75
+ ```