---
license: apache-2.0
---

# Depth of Field

This model classifies an image's cinematic depth of field as one of two labels: `deep` or `shallow`. It uses a DINOv2-with-registers backbone, initialized from the `facebook/dinov2-with-registers-large` weights, and was trained on a diverse set of five thousand human-annotated images.

## How to use:
```python
import torch
from PIL import Image
from transformers import AutoImageProcessor
from transformers import AutoModelForImageClassification

# The image processor comes from the base DINOv2 checkpoint; the classifier weights come from this repo
image_processor = AutoImageProcessor.from_pretrained("facebook/dinov2-with-registers-large")
model = AutoModelForImageClassification.from_pretrained('aslakey/depth_of_field')
model.eval()

# Model labels: [deep, shallow]
image = Image.open('cinematic_shot.jpg')
inputs = image_processor(image, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# Take the highest-scoring class index and map it to its label
predicted_label = outputs.logits.argmax(-1).item()
print(model.config.id2label[predicted_label])
```
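
The snippet above prints only the top label. If a confidence value per label is also useful, the logits can be passed through a softmax. The following is a minimal sketch, not part of the original model card: `label_probabilities` is a hypothetical helper, the logit values are made up for illustration, and the index-to-label mapping is assumed to follow the `[deep, shallow]` order noted above.

```python
import math


def label_probabilities(logits, id2label):
    """Convert the raw logits for one image into a {label: probability} dict via softmax."""
    # Subtract the largest logit before exponentiating for numerical stability
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return {id2label[i]: e / total for i, e in enumerate(exps)}


# Illustrative logits only (assumption); real values come from the model
example = label_probabilities([2.0, 0.0], {0: "deep", 1: "shallow"})
print(example)
```

With the model loaded as shown earlier, the helper could be called as `label_probabilities(outputs.logits[0].tolist(), model.config.id2label)`.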

## Performance:

| Category | Precision | Recall |
|----------|-----------|--------|
| deep     | 85%       | 77%    |
| shallow  | 75%       | 84%    |
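
To compare the two categories with a single number, precision and recall can be combined into an F1 score (their harmonic mean). The sketch below derives F1 from the figures in the table above; these F1 values are computed here for illustration and are not reported by the model author. `f1_score` is a hypothetical helper name.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall copied from the performance table, as fractions
reported = {"deep": (0.85, 0.77), "shallow": (0.75, 0.84)}

f1 = {label: f1_score(p, r) for label, (p, r) in reported.items()}
for label, score in f1.items():
    print(f"{label}: F1 = {score:.2f}")
```

Under these figures, both categories come out at an F1 of roughly 0.8, with `deep` slightly ahead.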