ManishThota/CustomModel

ManishThota committed on Feb 12, 2024

Commit 1e8fcbf · verified · 1 Parent(s): c0c527d

Create README.md

Files changed (1)

README.md +58 -1

@@ -1,3 +1,60 @@
 ---
-license: apache-2.0
+license: creativeml-openrail-m
 ---
+---
+<h1 align='center' style='font-size: 36px; font-weight: bold;'>Sparrow</h1>
+<h3 align='center' style='font-size: 24px;'>Tiny Vision Language Model</h3>
+<p align="center">
+  <img src="https://cdn-uploads.huggingface.co/production/uploads/650c7fbb8ffe1f53bdbe1aec/DTjDSq2yG-5Cqnk6giPFq.jpeg" width="50%" height="auto"/>
+</p>
+<p align='center' style='font-size: 16px;'>
+A 3B-parameter model built by <a href="https://www.linkedin.com/in/manishkumarthota/">@Manish</a> using SigLIP, Phi-2, a language modeling loss, LLaVA data, and a custom training dataset.
+The model is released for research purposes only; commercial use is not allowed.
+</p>
+Pretraining is complete. If more question-answer pairs are added in the future, they can be incorporated through LoRA fine-tuning on top of this model.
+## How to use
+**Install dependencies**
+```bash
+pip install transformers # the latest version works, but v4.31.0 is recommended
+pip install -q pillow accelerate einops
+```
+Use the following code for model inference. The text instruction format is similar to that of [LLaVA](https://github.com/haotian-liu/LLaVA).
+```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+from PIL import Image
+
+# Create new tensors on the GPU by default
+torch.set_default_device("cuda")
+
+# Create the model and tokenizer (trust_remote_code runs the custom code shipped with the repo)
+model = AutoModelForCausalLM.from_pretrained(
+    "ManishThota/Sparrow",
+    torch_dtype=torch.float16,
+    device_map="auto",
+    trust_remote_code=True)
+tokenizer = AutoTokenizer.from_pretrained("ManishThota/Sparrow", trust_remote_code=True)
+
+# Set inputs: a LLaVA-style prompt with an <image> placeholder, plus the image itself
+text = "A chat between a curious user and an artificial intelligence assistant. USER: <image>\nCan you explain the slide? ASSISTANT:"
+image = Image.open("images/week_02_page_02")
+input_ids = tokenizer(text, return_tensors='pt').input_ids
+image_tensor = model.image_preprocess(image)
+
+# Generate the answer
+output_ids = model.generate(
+    input_ids,
+    max_new_tokens=1500,
+    images=image_tensor,
+    use_cache=True)[0]
+
+# Decode only the newly generated tokens, skipping the prompt
+print(tokenizer.decode(output_ids[input_ids.shape[1]:], skip_special_tokens=True).strip())
+```
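
The prompt in the README follows a fixed pattern: a system sentence, `USER:`, the `<image>` placeholder, a newline, the question, then `ASSISTANT:`. A minimal sketch of a helper that assembles such a prompt is shown below. It is not part of this commit or of the model's API; the names `build_prompt`, `SYSTEM_PROMPT`, and `IMAGE_TOKEN` are hypothetical, and the template is simply copied from the example string in the README.

```python
# Hypothetical helper, not part of the committed README or the model's API.
# The template mirrors the example prompt string used in the README.
SYSTEM_PROMPT = "A chat between a curious user and an artificial intelligence assistant."
IMAGE_TOKEN = "<image>"


def build_prompt(question: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Return a LLaVA-style single-turn prompt for one image and one question."""
    question = question.strip()
    if not question:
        raise ValueError("question must not be empty")
    # Layout: "<system> USER: <image>\n<question> ASSISTANT:"
    return f"{system_prompt} USER: {IMAGE_TOKEN}\n{question} ASSISTANT:"


if __name__ == "__main__":
    print(build_prompt("Can you explain the slide?"))
```

Under these assumptions, the returned string can be passed as `text` to the tokenizer in the inference example, so only the question needs to change between calls.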