gtfintechlab/ipomine-yolov8-classifier

	---
	license: cc-by-4.0
	datasets:
	- gtfintechlab/ipo-images
	language:
	- en
	base_model:
	- Ultralytics/YOLOv8
	pipeline_tag: image-classification
	tags:
	- finance
	- ipo
	- image-classification
	- ultralytics
	- yolo
	---

	# YOLOv8s — SEC IPO Filing Image Classifier

	A fine-tuned [YOLOv8s](https://github.com/ultralytics/ultralytics) model that classifies images extracted from U.S. IPO registration statements (S-1 and F-1 filings) on [SEC EDGAR](https://www.sec.gov/edgar). It serves as the initial classification stage in the pipeline used to construct the [gtfintechlab/ipo-images](https://huggingface.co/datasets/gtfintechlab/ipo-images) dataset.

	---

	## Classes

	The model classifies images into 5 categories:

	| Label | Description |
	|---|---|
	| `chart` | Bar charts, line charts, pie charts, org charts, flow charts, etc. |
	| `logo` | Company logos and branding marks |
	| `map` | Geographic maps |
	| `infographic` | Composite visuals combining data, icons, and text |
	| `other` | Decorative images, photographs, signatures, and other visuals |
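
Downstream code may want to confirm that a loaded checkpoint exposes exactly these five labels. The following is a minimal sketch, assuming the labels appear in the model's `names` mapping spelled as in the table. `EXPECTED_LABELS` and `check_label_mapping` are hypothetical names, and the index order in the example mapping is illustrative rather than taken from the checkpoint.

```python
# Hypothetical helper: checks an index -> name mapping (such as `model.names`
# in Ultralytics) against the five labels listed in the table above.
EXPECTED_LABELS = frozenset({"chart", "logo", "map", "infographic", "other"})


def check_label_mapping(names):
    """Return (missing, unexpected) label sets for an index -> name mapping."""
    found = set(names.values())
    missing = EXPECTED_LABELS - found
    unexpected = found - EXPECTED_LABELS
    return missing, unexpected


# Illustrative mapping only; the real index order comes from the checkpoint.
example_names = {0: "chart", 1: "infographic", 2: "logo", 3: "map", 4: "other"}
missing, unexpected = check_label_mapping(example_names)
```

Both returned sets are empty when the mapping matches the table, so a caller could fail fast if either one is non-empty.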

	---

	## Usage

	### Install dependencies
	```bash
	pip install ultralytics
	```

	### Run inference
	```python
	from ultralytics import YOLO

	model = YOLO("<path/to/model.pt>")

	# Single image
	results = model("path/to/image.png")
	print(results[0].probs.top1) # top class index
	print(results[0].names) # class name mapping

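	# With a confidence threshold, applied manually to the top-1 probability
	# (the `conf` argument is documented as a threshold for detections)
	result = model("path/to/image.png")[0]
	if float(result.probs.top1conf) >= 0.5:
	    print(result.names[result.probs.top1])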

	# Batch inference
	results = model(["image1.png", "image2.png", "image3.png"])
	for r in results:
	    # top class index and its human-readable name
	    print(r.probs.top1, r.names[r.probs.top1])
	```
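
After batch inference it is often convenient to group files by predicted class. The sketch below works on plain lists so it runs without a model; `group_by_label` is a hypothetical helper, and the predictions shown are made up for illustration.

```python
from collections import defaultdict


def group_by_label(paths, labels):
    """Group image paths by predicted label, preserving input order."""
    if len(paths) != len(labels):
        raise ValueError("paths and labels must have the same length")
    groups = defaultdict(list)
    for path, label in zip(paths, labels):
        groups[label].append(path)
    return dict(groups)


# With Ultralytics results, labels could be built as:
#   labels = [r.names[r.probs.top1] for r in results]
paths = ["image1.png", "image2.png", "image3.png"]
labels = ["chart", "logo", "chart"]  # made-up predictions for illustration
grouped = group_by_label(paths, labels)
```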

	### Get the predicted label as a string
	```python
	result = model("image.png")[0]
	label = result.names[result.probs.top1]
	print(label) # e.g. "chart"
	```
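
When the top prediction alone is not enough, the full probability vector can be ranked. This is a minimal sketch on plain Python values; `top_k_labels` is a hypothetical helper, and the mapping order and probabilities are illustrative. Ultralytics also exposes `probs.top5` and `probs.top5conf` for the five highest-scoring classes.

```python
def top_k_labels(probabilities, names, k=3):
    """Return the k most probable (label, probability) pairs, highest first."""
    ranked = sorted(enumerate(probabilities), key=lambda pair: pair[1], reverse=True)
    return [(names[index], prob) for index, prob in ranked[:k]]


# With Ultralytics, the vector could come from: result.probs.data.tolist()
names = {0: "chart", 1: "infographic", 2: "logo", 3: "map", 4: "other"}  # illustrative order
probabilities = [0.70, 0.15, 0.05, 0.02, 0.08]  # made-up values
top2 = top_k_labels(probabilities, names, k=2)
```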

	---

	## Relation to the IPO Image Dataset

	This model is the first stage of the classification pipeline used to build the [`gtfintechlab/ipo-images`](https://huggingface.co/datasets/gtfintechlab/ipo-images) dataset — a large-scale collection of 76,000+ labeled images from SEC IPO filings spanning 1994–2026.

	The pipeline works as follows:

	1. This model generates an initial prediction (`initial_yolo_prediction`) for each image
	2. An ensemble of 8 Vision-Language Models verifies the prediction, producing a consensus score (`llm_yolo_verification_score`) and per-model votes (`llm_yolo_verification_votes`)
	3. The final `label` in the dataset reflects this verified output
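
The steps above do not specify how `llm_yolo_verification_score` is computed or how the votes are encoded. As a rough sketch of one plausible scheme, assume each model casts a single label vote and the score is the fraction of votes that match the YOLO prediction. The function name, the vote format, and the scoring rule are all assumptions, and the dataset may define these fields differently.

```python
def agreement_score(yolo_prediction, votes):
    """Fraction of votes equal to the YOLO prediction (assumed scoring rule)."""
    if not votes:
        raise ValueError("at least one vote is required")
    agreeing = sum(1 for vote in votes if vote == yolo_prediction)
    return agreeing / len(votes)


# Made-up votes from 8 hypothetical verifier models
votes = ["chart"] * 6 + ["infographic", "other"]
score = agreement_score("chart", votes)
```

Under this assumed rule, a score near 1.0 would indicate that most verifiers agree with the initial prediction, which a consumer of the dataset could use as a filtering threshold.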

	---

	## Citation

	If you use this model in your work, please cite:
	```bibtex
	@misc{galarnyk2026ipomine,
	  title = {IPO-Mine: A Toolkit and Dataset for Section-Structured Analysis of Long, Multimodal IPO Documents},
	  author = {Galarnyk, Michael and Lohani, Siddharth and Nandi, Sagnik and Patel, Aman and Kannan, Vidhyakshaya and Banerjee, Prasun and Routu, Rutwik and Ye, Liqin and Hiray, Arnav and Somani, Siddhartha and Chava, Sudheer},
	  year = {2026},
	  url = {https://huggingface.co/datasets/gtfintechlab/ipo-images},
	  note = {Preprint/Working Paper}
	}
	```