	---
	license: apache-2.0
	datasets:
	- ai4colonoscopy/ColonINST-v1
	language:
	- en
	metrics:
	- accuracy
	base_model:
	- microsoft/phi-1_5
	library_name: adapter-transformers
	pipeline_tag: image-text-to-text
	tags:
	- medical
	- colonoscopy
	- polyp
	---

	# ColonGPT (A Colonoscopy-Specific Multimodal Language Model)
	<p align="center">
	<img src="./assert/web_ui_stg1.gif" width="666px"/> <br />
	<em>The Gradio Web UI lets you run inference on our examples or on your own uploaded images.</em>
	</p>

	📖 [Paper](https://arxiv.org/abs/2410.17241) \| 🏠 [Home](https://github.com/ai4colonoscopy/IntelliScope)

	> These are the weights from the pre-alignment stage of ColonGPT-v1.


	ColonGPT is a standard multimodal language model with four basic components: a language tokenizer, a visual encoder (🤗 [SigLIP-SO](https://huggingface.co/google/siglip-so400m-patch14-384)), a multimodal connector, and a language model (🤗 [Phi-1.5](https://huggingface.co/microsoft/phi-1_5)). This Hugging Face page provides a quick start for the convenience of new users. For further details about ColonGPT, we highly recommend visiting our [homepage](https://github.com/ai4colonoscopy/IntelliScope), where you will find comprehensive usage instructions for the model and the latest advances in intelligent colonoscopy technology.
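
To make the four-component layout more concrete, here is a minimal, dependency-free sketch of how visual tokens could flow through a connector into the language model's input sequence. It is an illustration under stated assumptions, not ColonGPT's actual code: the helper names (`num_patches`, `project`, `build_lm_input`) are hypothetical, a plain linear projection stands in for the real multimodal connector, and placing the visual tokens before the text tokens is an assumed ordering. The backbone dimensions are the values we understand the linked SigLIP-SO and Phi-1.5 configs to publish; the toy demo uses tiny dimensions so it runs instantly.

```python
import random

# Dimensions as published (to our understanding) in the linked model configs.
SIGLIP_IMAGE_SIZE = 384  # google/siglip-so400m-patch14-384
SIGLIP_PATCH_SIZE = 14
SIGLIP_HIDDEN = 1152     # width of each visual token
PHI15_HIDDEN = 2048      # microsoft/phi-1_5 embedding width


def num_patches(image_size, patch_size):
    """Number of non-overlapping square patches the visual encoder sees."""
    side = image_size // patch_size
    return side * side


def project(tokens, weight):
    """Hypothetical linear connector.

    tokens: list of visual tokens, each a list of d_in floats.
    weight: list of d_out rows, each a list of d_in floats.
    Returns one d_out-dimensional vector per input token.
    """
    return [[sum(x * w for x, w in zip(tok, row)) for row in weight]
            for tok in tokens]


def build_lm_input(visual_tokens, text_embeddings, weight):
    """Project visual tokens to the LM width, then place them before the text.

    The visual-first ordering is an assumption made for this sketch.
    """
    return project(visual_tokens, weight) + text_embeddings


# Toy demo with tiny dimensions (pure Python is too slow for the real sizes).
random.seed(0)
d_vis, d_lm, n_vis, n_txt = 3, 5, 4, 2
visual = [[random.random() for _ in range(d_vis)] for _ in range(n_vis)]
text = [[random.random() for _ in range(d_lm)] for _ in range(n_txt)]
connector = [[random.random() for _ in range(d_vis)] for _ in range(d_lm)]

sequence = build_lm_input(visual, text, connector)
print("visual tokens per real image:",
      num_patches(SIGLIP_IMAGE_SIZE, SIGLIP_PATCH_SIZE))
print("toy LM input shape:", (len(sequence), len(sequence[0])))
```

With the real backbones, the connector would need to map 1152-dimensional visual tokens to 2048-dimensional language-model embeddings; how ColonGPT's connector does this, and how many visual tokens it keeps, is described in the paper and homepage linked above.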