---
library_name: transformers
license: apache-2.0
pipeline_tag: image-to-text
---

# rmfg

<!-- Provide a quick summary of what the model is/does. -->
<img src="https://i.pinimg.com/736x/7e/46/a6/7e46a6881623dfd3e1a2a5a2ae692374.jpg" width="300">
## Example

**Image**

<img src="https://media-cldnry.s-nbcnews.com/image/upload/t_fit-760w,f_auto,q_auto:best/rockcms/2023-12/231202-elon-musk-mjf-1715-fc0be2.jpg" width="300">

**Output**

> A man in a black cowboy hat and sunglasses stands in front of a white car, holding a microphone and speaking into it.

---

- The model is underfit and does not perform well yet.
- This marks the beginning of my tiny vision-language model series, with this model serving as a prelude to what's to come in the next few days.

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from PIL import Image

model_id = "aloobun/rmfg"

# trust_remote_code is required because the model ships custom modeling code
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Load a local image, encode it, and generate a description
image = Image.open('692374.jpg')
enc_image = model.encode_image(image)
print(model.answer_question(enc_image, "Describe this image.", tokenizer))
```