mtanti/face-describer

	---
	license: mit
	language:
	- en
	base_model:
	- microsoft/git-base-coco
	---

	Given a photo of a face, this model generates a short English description of it.
	Be careful: the descriptions can be unflattering.

	Based on the GIT-Base-COCO image-to-text model and fine-tuned on the [Face2Text](https://zenodo.org/records/10973388) dataset.

	How to use:
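	```python
	from transformers import AutoProcessor, AutoModelForCausalLM, AutoTokenizer
	import cv2

	DEVICE = 'cpu'  # 'cpu' or 'cuda'
	IMG_PATH = 'face.png'

	# The processor and tokeniser come from the base model; the weights are the fine-tuned ones.
	processor = AutoProcessor.from_pretrained('microsoft/git-base-coco')
	model = AutoModelForCausalLM.from_pretrained('mtanti/face-describer')
	tokeniser = AutoTokenizer.from_pretrained('microsoft/git-base-coco')
	model.eval()
	model.to(DEVICE)

	img = cv2.imread(IMG_PATH)  # OpenCV loads images in BGR channel order
	if img is None:
	    raise FileNotFoundError(f'Could not read image: {IMG_PATH}')
	tensor_img = processor(
	    images=[img[:, :, ::-1]],  # reverse the channel axis to get RGB
	    return_tensors='pt',
	)['pixel_values'].to(DEVICE)
	# do_sample=True samples tokens, so the description can differ between runs.
	desc = tokeniser.decode(
	    model.generate(pixel_values=tensor_img, max_length=100, repetition_penalty=1.05, do_sample=True)[0, :],
	    skip_special_tokens=True,
	)
	print(desc)
	```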
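
The `img[:, :, ::-1]` slice in the snippet above is there because `cv2.imread` returns pixels in BGR order, while the processor expects RGB. The minimal sketch below shows that reversal on a tiny hand-made array. The helper name `bgr_to_rgb` is hypothetical and not part of this model or of `transformers`, and the contiguous copy is an optional precaution rather than something the original snippet does.

```python
import numpy as np


def bgr_to_rgb(img_bgr: np.ndarray) -> np.ndarray:
    """Reverse the channel axis of an H x W x 3 image (BGR -> RGB)."""
    if img_bgr.ndim != 3 or img_bgr.shape[2] != 3:
        raise ValueError('expected an H x W x 3 colour image')
    # Same reversal as img[:, :, ::-1]; the copy gives a contiguous array
    # instead of a negatively strided view.
    return np.ascontiguousarray(img_bgr[:, :, ::-1])


# A 1 x 2 test image in BGR order: one pure-blue pixel, one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)
rgb = bgr_to_rgb(bgr)
```

In the usage snippet, `bgr_to_rgb(img)` could stand in for `img[:, :, ::-1]` when passing the image to the processor.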