update README


Files changed (3)

.gitattributes +1 -0
README.md +97 -2
assert/ColonGPT.gif +3 -0

.gitattributes CHANGED

@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text

+assert/ColonGPT.gif filter=lfs diff=lfs merge=lfs -text

README.md CHANGED

@@ -15,7 +15,102 @@ tags:
 - polyp
 ---
-# ColonGPT
-A colonoscopy-specifc multimodal language model with token-efficient designs.

+# ColonGPT (A colonoscopy-specific multimodal Language Model)
+<p align="center">
+    <img src="./assert/ColonGPT.gif" width="666px"/> <br />
+    <em>Details of our multimodal language model, ColonGPT.</em>
+</p>
+📖 [Paper](https://arxiv.org) | 🏠 [Home](https://github.com/ai4colonoscopy/IntelliScope)
+These are the merged weights of [ColonGPT-v1-phi1.5-siglip-lora](https://drive.google.com/drive/folders/1Emi7o7DpN0zlCPIYqsCfNMr9LTPt3SCT?usp=sharing).
+ColonGPT is a standard multimodal language model with four basic components: a language tokenizer, a visual encoder (🤗 [SigLIP-SO](https://huggingface.co/google/siglip-so400m-patch14-384)), a multimodal connector, and a language model (🤗 [Phi1.5](https://huggingface.co/microsoft/phi-1_5)).
+For further details about ColonGPT, we highly recommend visiting our [home page](https://github.com/BAAI-DCAI/Bunny). There, you'll find comprehensive usage instructions for our model and the latest advancements in intelligent colonoscopy technology.
+# Quick start
+Here is a code snippet showing how to quickly try out our ColonGPT model with transformers. For convenience, we manually combined some configuration and code files and merged the weights. Please note that this is only a quick-start script; we recommend installing [ColonGPT's source code](https://github.com/ai4colonoscopy/IntelliScope/blob/main/docs/guideline-for-ColonGPT.md) to explore more.
+- Before running the snippet, you only need to install the following minimum dependencies.
+    ```shell
+    conda create -n quickstart python=3.10
+    conda activate quickstart
+    pip install torch transformers accelerate pillow
+    ```
+- Then you can use `python script/quick_start/quickstart.py` to start.
+    ```python
+    import torch
+    import transformers
+    from transformers import AutoModelForCausalLM, AutoTokenizer, StoppingCriteria
+    from PIL import Image
+    import warnings
+    transformers.logging.set_verbosity_error()
+    transformers.logging.disable_progress_bar()
+    warnings.filterwarnings('ignore')
+    device = 'cuda'  # or cpu
+    torch.set_default_device(device)
+    model_name = "ai4colonoscopy/ColonGPT-v1"
+    model = AutoModelForCausalLM.from_pretrained(
+        model_name,
+        torch_dtype=torch.float16,  # or float32 for cpu
+        device_map='auto',
+        trust_remote_code=True
+    )
+    tokenizer = AutoTokenizer.from_pretrained(
+        model_name,
+        trust_remote_code=True
+    )
+    class KeywordsStoppingCriteria(StoppingCriteria):
+        def __init__(self, keyword, tokenizer, input_ids):
+            self.keyword_id = tokenizer(keyword).input_ids
+            self.tokenizer = tokenizer
+            self.start_len = input_ids.shape[1]
+        def __call__(self, input_ids: torch.LongTensor, scores: torch.FloatTensor, **kwargs) -> bool:
+            # Stop once any token id of the stop keyword appears in the trailing window
+            for keyword_id in self.keyword_id:
+                if keyword_id in input_ids[0, -len(self.keyword_id):]:
+                    return True
+            return False
+    prompt = "Describe what you see in the image."
+    text = f"USER: <image>\n{prompt} ASSISTANT:"
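+    # Tokenize the text on either side of <image>, then splice in the placeholder id -200 at that position
+    text_chunks = [tokenizer(chunk).input_ids for chunk in text.split('<image>')]
+    input_ids = torch.tensor(text_chunks[0] + [-200] + text_chunks[1], dtype=torch.long).unsqueeze(0).to(device)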
+    image = Image.open('cache/examples/example2.png')
+    image_tensor = model.process_images([image], model.config).to(dtype=model.dtype, device=device)
+    stop_str = "<|endoftext|>"
+    stopping_criteria = KeywordsStoppingCriteria(stop_str, tokenizer, input_ids)
+    output_ids = model.generate(
+        input_ids,
+        images=image_tensor,
+        do_sample=False,
+        temperature=0,
+        max_new_tokens=512,
+        use_cache=True,
+        stopping_criteria=[stopping_criteria]
+    )
+    outputs = tokenizer.decode(output_ids[0, input_ids.shape[1]:]).replace("<|endoftext|>", "").strip()
+    print(outputs)
+    ```
+# License
+This project utilizes certain datasets and checkpoints that are subject to their respective original licenses. Users must comply with all terms and conditions of these original licenses.
+The content of this project itself is licensed under the Apache License 2.0.
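
The quick-start snippet in the README diff above builds `input_ids` by splitting the prompt at `<image>`, tokenizing each half, and splicing the id `-200` between them. The sketch below isolates that step so it can be run without downloading the model. `toy_tokenize` and `splice_image_token` are hypothetical names introduced here for illustration; the toy tokenizer is a stand-in, not the model's tokenizer. A negative id presumably serves as a sentinel because it cannot collide with a real vocabulary id, though how the model consumes it is not shown in this diff.

```python
IMAGE_PLACEHOLDER = "<image>"
IMAGE_TOKEN_INDEX = -200  # same sentinel value as in the quick-start snippet


def toy_tokenize(text):
    """Hypothetical stand-in tokenizer: one non-negative id per whitespace-separated word."""
    return [sum(ord(ch) for ch in word) % 1000 for word in text.split()]


def splice_image_token(text, tokenize, image_token=IMAGE_TOKEN_INDEX):
    """Tokenize the text around a single <image> placeholder and insert the sentinel id."""
    chunks = [tokenize(chunk) for chunk in text.split(IMAGE_PLACEHOLDER)]
    if len(chunks) != 2:
        raise ValueError("expected exactly one <image> placeholder in the prompt")
    return chunks[0] + [image_token] + chunks[1]


prompt = "Describe what you see in the image."
text = f"USER: <image>\n{prompt} ASSISTANT:"
ids = splice_image_token(text, toy_tokenize)
print(ids)
```

With a real tokenizer, `tokenize` would be something like `lambda s: tokenizer(s).input_ids`, which is what the README snippet does inline.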

assert/ColonGPT.gif ADDED

Git LFS Details

SHA256: e3d1435d26943229dbc60a054434d366449f8665e402d3d2090ea3d2b4d250dd
Pointer size: 132 Bytes
Size of remote file: 5.02 MB
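
The 132-byte pointer size reported above is consistent with the Git LFS pointer format: a version line, an `oid sha256:` line, and a `size` line. The following is a minimal sketch that rebuilds such a pointer from the SHA256 listed above. The exact byte count of the GIF is not given on this page (only "5.02 MB"), so `ASSUMED_SIZE` is a hypothetical 7-digit placeholder, and `build_lfs_pointer` is a helper name introduced here for illustration.

```python
def build_lfs_pointer(sha256_hex: str, size_bytes: int) -> str:
    """Return the text of a Git LFS pointer file for the given object id and size."""
    if len(sha256_hex) != 64 or any(c not in "0123456789abcdef" for c in sha256_hex):
        raise ValueError("oid must be 64 lowercase hex characters")
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{sha256_hex}\n"
        f"size {size_bytes}\n"
    )


OID = "e3d1435d26943229dbc60a054434d366449f8665e402d3d2090ea3d2b4d250dd"
ASSUMED_SIZE = 5_020_000  # hypothetical: the real byte count is not shown in the diff

pointer = build_lfs_pointer(OID, ASSUMED_SIZE)
print(pointer)
print(len(pointer.encode("utf-8")))  # 132 for any 7-digit size
```

Because the version and oid lines have fixed length, the pointer size depends only on the number of digits in the file size, which is why a roughly 5 MB file yields a 132-byte pointer.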