---
license: apache-2.0
---

# Shot Type

This model predicts an image's cinematic shot type: one of `clean_single`, `double`, `group`, `over_the_shoulder`, `insert`, or `establishing`. It uses a DINOv2-with-registers backbone (initialized from the `facebook/dinov2-with-registers-large` weights) and was trained on a diverse set of five thousand human-annotated images.

## How to use:
```python
import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

# The image processor comes from the base checkpoint the model was initialized from
image_processor = AutoImageProcessor.from_pretrained("facebook/dinov2-with-registers-large")
model = AutoModelForImageClassification.from_pretrained("aslakey/shot_type")
model.eval()

# Model labels: [clean_single, double, group, over_the_shoulder, insert, establishing]
image = Image.open("cinematic_shot.jpg")
inputs = image_processor(image, return_tensors="pt")
with torch.no_grad():  # inference only, so gradients are not needed
    outputs = model(**inputs)

# Pick the highest-scoring class index and map it to its label name
predicted_label = outputs.logits.argmax(-1).item()
print(model.config.id2label[predicted_label])
```
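
The snippet above reports only the single best label. If a ranked list with confidence scores is useful, the raw logits can be passed through a softmax. The following is a minimal sketch, not part of the original card: the helper name `rank_labels` and the example logits are hypothetical, and the label order is taken from the comment in the usage code, which is an assumption; `model.config.id2label` remains the authoritative mapping.

```python
import math

# Label order as listed in the model card; treat model.config.id2label as authoritative.
LABELS = ["clean_single", "double", "group", "over_the_shoulder", "insert", "establishing"]


def rank_labels(logits, labels=LABELS, top_k=3):
    """Return the top_k (label, probability) pairs from raw logits via softmax."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    peak = max(logits)  # subtract the max before exponentiating for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:top_k]


# Hypothetical logits for illustration; in practice use outputs.logits[0].tolist()
example_logits = [2.0, 0.5, 0.1, 1.2, -0.3, -1.0]
for label, prob in rank_labels(example_logits):
    print(f"{label}: {prob:.2f}")
```

With the real model, `rank_labels(outputs.logits[0].tolist(), labels=[model.config.id2label[i] for i in range(model.config.num_labels)])` would use the label order stored in the checkpoint's config instead of the hard-coded list.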

## Performance:

| Category | Precision | Recall |
|----------|-----------|--------|
| clean_single | 81% | 89% |
| double | 80% | 72% |
| group | 91% | 74% |
| over_the_shoulder | 60% | 67% |
| establishing | 91% | 77% |
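
The card reports precision and recall but no combined score. As a rough aid for comparing categories, the sketch below derives a per-category F1 (the harmonic mean of precision and recall) from the values above. These F1 figures are not reported by the author; they are computed from rounded percentages, so treat them as approximate. The names `METRICS` and `f1_score` are illustrative.

```python
# Precision and recall (in percent) copied from the performance table
METRICS = {
    "clean_single": (81, 89),
    "double": (80, 72),
    "group": (91, 74),
    "over_the_shoulder": (60, 67),
    "establishing": (91, 77),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


for category, (precision, recall) in METRICS.items():
    print(f"{category}: F1 = {f1_score(precision, recall):.1f}%")
```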