siddharthlohani committed on
Commit
63824f1
·
verified ·
1 Parent(s): 84f3e67

Update README.md

Files changed (1)
  1. README.md +82 -1
README.md CHANGED
@@ -13,4 +13,85 @@ tags:
13
  - image-classification
14
  - ultralytics
15
  - yolo
16
- ---
13
  - image-classification
14
  - ultralytics
15
  - yolo
16
+ ---
17
+
18
+ # YOLOv8s — SEC IPO Filing Image Classifier
19
+
20
+ A fine-tuned [YOLOv8s](https://github.com/ultralytics/ultralytics) model trained to classify images extracted from U.S. IPO registration statements (S-1 and F-1 filings) on [SEC EDGAR](https://www.sec.gov/edgar). This model serves as the initial classification stage in the pipeline used to construct the [gtfintechlab/ipo-images](https://huggingface.co/datasets/gtfintechlab/ipo-images) dataset.
21
+
22
+ ---
23
+
24
+ ## Classes
25
+
26
+ The model classifies images into 5 categories:
27
+
28
+ | Label | Description |
29
+ |---|---|
30
+ | `chart` | Bar charts, line charts, pie charts, org charts, flow charts, etc. |
31
+ | `logo` | Company logos and branding marks |
32
+ | `map` | Geographic maps |
33
+ | `infographic` | Composite visuals combining data, icons, and text |
34
+ | `other` | Decorative images, photographs, signatures, and other visuals |
35
+
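The index-to-label order stored in the checkpoint is not stated here, so it can help to confirm that a loaded model exposes exactly the five labels in the table before relying on class indices. The following is a minimal sketch, assuming `names` is the `dict[int, str]` mapping that Ultralytics exposes as `model.names`; the helper `check_class_names` and the sample mapping are hypothetical.

```python
EXPECTED_LABELS = {"chart", "logo", "map", "infographic", "other"}


def check_class_names(names):
    """Return (missing, unexpected) label sets relative to the class table."""
    found = set(names.values())
    return EXPECTED_LABELS - found, found - EXPECTED_LABELS


# Hypothetical mapping; in practice pass `model.names` from the loaded checkpoint.
sample_names = {0: "chart", 1: "infographic", 2: "logo", 3: "map", 4: "other"}
missing, unexpected = check_class_names(sample_names)
print(missing, unexpected)
```

Two empty sets mean the checkpoint's labels match the table; anything else suggests a different checkpoint or label set.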
36
+ ---
37
+
38
+ ## Usage
39
+
40
+ ### Install dependencies
41
+ ```bash
42
+ pip install ultralytics
43
+ ```
44
+
45
+ ### Run inference
46
+ ```python
47
+ from ultralytics import YOLO
48
+
49
+ model = YOLO("<path/to/model.pt>")
50
+
51
+ # Single image
52
+ results = model("path/to/image.png")
53
+ print(results[0].probs.top1) # top class index
54
+ print(results[0].names) # class name mapping
55
+
56
+ # Confidence check (`conf` filters detection boxes, not classification probabilities)
57
+ confident = float(results[0].probs.top1conf) >= 0.5
58
+
59
+ # Batch inference
60
+ results = model(["image1.png", "image2.png", "image3.png"])
61
+ for r in results:
62
+     print(r.names[r.probs.top1], float(r.probs.top1conf))
63
+ ```
64
+
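After batch inference it is often convenient to bucket the input files by predicted label. The sketch below shows one way to do that with plain Python, assuming results are returned in the same order as the input paths; the helper `group_by_label` and the sample labels are hypothetical stand-ins for `[r.names[r.probs.top1] for r in results]`.

```python
from collections import defaultdict


def group_by_label(paths, labels):
    """Map each predicted label to the list of image paths assigned to it."""
    grouped = defaultdict(list)
    for path, label in zip(paths, labels):
        grouped[label].append(path)
    return dict(grouped)


paths = ["image1.png", "image2.png", "image3.png"]
# Hypothetical predictions; with a real model these come from the results list.
labels = ["chart", "logo", "chart"]
print(group_by_label(paths, labels))
```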
65
+ ### Get the predicted label as a string
66
+ ```python
67
+ result = model("image.png")[0]
68
+ label = result.names[result.probs.top1]
69
+ print(label) # e.g. "chart"
70
+ ```
71
+
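When low-confidence predictions should be set aside for review, the top-class probability can be compared against a cutoff before accepting the label. This is a minimal sketch under stated assumptions: `probs` is a plain list of per-class probabilities (for an Ultralytics result, something like `result.probs.data.tolist()`), `names` is the index-to-label mapping, and the helper `top_label`, the 0.5 cutoff, and the sample class order are all hypothetical.

```python
from typing import Mapping, Optional, Sequence, Tuple


def top_label(
    probs: Sequence[float], names: Mapping[int, str], min_conf: float = 0.5
) -> Tuple[Optional[str], float]:
    """Return (label, confidence); label is None when confidence < min_conf."""
    idx = max(range(len(probs)), key=lambda i: probs[i])
    conf = float(probs[idx])
    return (names[idx] if conf >= min_conf else None), conf


# Hypothetical class order, for illustration only.
names = {0: "chart", 1: "infographic", 2: "logo", 3: "map", 4: "other"}
print(top_label([0.05, 0.10, 0.70, 0.10, 0.05], names))  # confident prediction
print(top_label([0.30, 0.25, 0.20, 0.15, 0.10], names))  # below the cutoff
```

Returning `None` rather than a fallback label keeps uncertain images distinguishable from genuine `other` predictions.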
72
+ ---
73
+
74
+ ## Relation to the IPO Image Dataset
75
+
76
+ This model is the **first stage** of the classification pipeline used to build the [`gtfintechlab/ipo-images`](https://huggingface.co/datasets/gtfintechlab/ipo-images) dataset — a large-scale collection of 76,000+ labeled images from SEC IPO filings spanning 1994–2026.
77
+
78
+ The pipeline works as follows:
79
+
80
+ 1. **This model** generates an initial prediction (`initial_yolo_prediction`) for each image
81
+ 2. An **ensemble of 8 Vision-Language Models** verifies the prediction, producing a consensus score (`llm_yolo_verification_score`) and per-model votes (`llm_yolo_verification_votes`)
82
+ 3. The final `label` in the dataset reflects this verified output
83
+
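The verification step in the list above can be illustrated with a small sketch. The exact definition of `llm_yolo_verification_score` and the format of `llm_yolo_verification_votes` are not given here, so this example assumes the votes are a list of label strings (one per Vision-Language Model) and that the score is the fraction of votes agreeing with the YOLO prediction. The function `verification_score` is hypothetical and is not the dataset authors' implementation.

```python
from typing import Sequence


def verification_score(yolo_label: str, votes: Sequence[str]) -> float:
    """Fraction of votes that agree with the YOLO label (assumed definition)."""
    if not votes:
        raise ValueError("at least one vote is required")
    return sum(vote == yolo_label for vote in votes) / len(votes)


# Hypothetical votes from 8 models: 6 agree with the initial "chart" prediction.
votes = ["chart"] * 6 + ["infographic", "other"]
print(verification_score("chart", votes))  # 0.75
```

Consult the dataset card for the actual schema of these fields before comparing values against a sketch like this.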
84
+ ---
85
+
86
+ ## Citation
87
+
88
+ If you use this model in your work, please cite:
89
+ ```bibtex
90
+ @misc{galarnyk2026ipomine,
91
+ title = {IPO-Mine: A Toolkit and Dataset for Section-Structured Analysis of Long, Multimodal IPO Documents},
92
+ author = {Galarnyk, Michael and Lohani, Siddharth and Nandi, Sagnik and Patel, Aman and Kannan, Vidhyakshaya and Banerjee, Prasun and Routu, Rutwik and Ye, Liqin and Hiray, Arnav and Somani, Siddhartha and Chava, Sudheer},
93
+ year = {2026},
94
+ url = {https://huggingface.co/datasets/gtfintechlab/ipo-images},
95
+ note = {Preprint/Working Paper}
96
+ }
97
+ ```