metrics:
- wer
pipeline_tag: image-to-text
---

# Untitled7-colab_checkpoint

This model was lovingly named after the Google Colab notebook that made it. It is a fine-tune of Microsoft's [git-large-coco](https://huggingface.co/microsoft/git-large-coco) model on the 1k subset of [poloclub/diffusiondb](https://huggingface.co/datasets/poloclub/diffusiondb/viewer/2m_first_1k/train).

It is supposed to read an image and extract a Stable Diffusion prompt from it, but it might not do a good job at it. I wouldn't know; I haven't extensively tested it.

As the title suggests, this is a checkpoint: I originally intended to train on the entire dataset, but I'm unsure whether I want to now...

## Intended use

Fun!

```python
# Load the processor and model directly
from transformers import AutoProcessor, AutoModelForCausalLM

processor = AutoProcessor.from_pretrained("SE6446/Untitled7-colab_checkpoint")
model = AutoModelForCausalLM.from_pretrained("SE6446/Untitled7-colab_checkpoint")

#################################################################
# Or use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-to-text", model="SE6446/Untitled7-colab_checkpoint")
```
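
The snippet above only loads the model. A minimal inference sketch might look like the following; `"example.png"` is a placeholder path, and `max_length=64` is an assumption, not a value from the original notebook:

```python
from PIL import Image
from transformers import AutoProcessor, AutoModelForCausalLM

processor = AutoProcessor.from_pretrained("SE6446/Untitled7-colab_checkpoint")
model = AutoModelForCausalLM.from_pretrained("SE6446/Untitled7-colab_checkpoint")

# "example.png" is a placeholder for whatever image you want a prompt for.
image = Image.open("example.png").convert("RGB")

# Preprocess the image, generate token ids, then decode them into text.
inputs = processor(images=image, return_tensors="pt")
generated_ids = model.generate(pixel_values=inputs.pixel_values, max_length=64)
prompt = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(prompt)
```

The pipeline does the same preprocessing and decoding for you; the manual route is only useful if you want control over generation parameters.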

## Out-of-scope use

Don't use this model to discriminate against, alienate, or in any other way harm or harass individuals. You know the drill...

## Bias, Risks, and Limitations

This model does not produce accurate prompts; it is merely a bit of fun (and a waste of funds). However, it can inherit the biases present in the original git-large-coco model.

## Training

*I.e. the boring stuff*

- lr = 5e-5
- epochs = 150
- optim = adamw
- fp16
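
As a rough sketch, those hyperparameters map onto the `transformers` `TrainingArguments` API as below. The mapping is an assumption (the original notebook's exact configuration is unknown), and `"git-finetune"` is a placeholder output directory:

```python
from transformers import TrainingArguments

# Hypothetical mapping of the listed hyperparameters onto TrainingArguments.
args = TrainingArguments(
    output_dir="git-finetune",  # placeholder directory
    learning_rate=5e-5,         # lr = 5e-5
    num_train_epochs=150,       # epochs = 150
    optim="adamw_torch",        # optim = adamw
    fp16=True,                  # fp16 mixed precision (requires a CUDA GPU)
)
```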

If you want to fine-tune it further, you should freeze the embedding and vision transformer layers.
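
A minimal sketch of that freezing step, using a toy `torch.nn` module as a stand-in (the real parameter names in this checkpoint differ; inspect `model.named_parameters()` to find the actual embedding and vision-encoder prefixes):

```python
import torch.nn as nn

# Toy stand-in for the checkpoint: the real GIT parameter names differ,
# but the freezing pattern is the same -- match name prefixes and turn
# off requires_grad so the optimizer skips those weights.
model = nn.ModuleDict({
    "image_encoder": nn.Linear(4, 4),   # stands in for the vision transformer
    "embeddings": nn.Embedding(10, 4),  # stands in for the text embeddings
    "output": nn.Linear(4, 4),          # stays trainable
})

frozen_prefixes = ("image_encoder", "embeddings")
for name, param in model.named_parameters():
    if name.startswith(frozen_prefixes):
        param.requires_grad = False

# Collect what is still trainable; only the "output" weights remain.
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
```

When building the optimizer, pass only the still-trainable parameters (e.g. filter on `p.requires_grad`) so the frozen layers stay fixed.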