
	---
	base_model:
	- microsoft/resnet-18
	pipeline_tag: image-classification
	tags:
	- ecology
	- birds
	- posture
	---
	# Bowerbird viewpoint classifier (ResNet18)

	- Task: classify each frame into one of four viewpoints: `["back", "front", "left_side", "right_side"]`
	- Base model: `torchvision.models.resnet18` with `weights="IMAGENET1K_V1"`
	- Input size: 224 × 224 (after cropping)
	- Preprocessing (training/eval):
	  - Resize to 256 px on the shorter side
	  - Train: RandomResizedCrop(224), RandomRotation(7°), ColorJitter
	  - Eval: CenterCrop(224)
	- Normalization:
	  - mean = [0.485, 0.456, 0.406]
	  - std = [0.229, 0.224, 0.225]
	- Checkpoint file: `Bbird_viewpoint_classifier.pth`
	  - The checkpoint stores a PyTorch `state_dict` for ResNet18 with a final linear layer of 4 outputs (one per viewpoint class).

	> This model is not generic. It is specific to the four viewpoint classes
	> listed above. The classification head must have 4 outputs, in the same
	> class order: `back`, `front`, `left_side`, `right_side`.
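
Because the class order is fixed, an output index can be mapped back to a label by position. The helper below is a hypothetical sketch (`decode_viewpoint` is not part of this repository) that turns the four logits for one frame into a label and a softmax probability.

```python
from typing import Tuple

import torch

# Class order stated in this card; index i of the model output maps to entry i
VIEWPOINT_CLASSES = ["back", "front", "left_side", "right_side"]


def decode_viewpoint(logits: torch.Tensor) -> Tuple[str, float]:
    """Map the logits of a single frame, shape (4,) or (1, 4), to (label, probability)."""
    flat = logits.reshape(-1)
    if flat.numel() != len(VIEWPOINT_CLASSES):
        raise ValueError(f"expected {len(VIEWPOINT_CLASSES)} logits, got {flat.numel()}")
    probs = torch.softmax(flat, dim=0)
    idx = int(probs.argmax())
    return VIEWPOINT_CLASSES[idx], float(probs[idx])


# Hand-written logits for illustration; index 2 is the largest
label, prob = decode_viewpoint(torch.tensor([0.1, 0.2, 3.0, -1.0]))
```

If the list were reordered, the same logits would decode to a different label, which is why the order must match the one used in training.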

	## Usage

	```python
	import torch
	from torch import nn
	from torchvision.models import resnet18
	from huggingface_hub import hf_hub_download

	# Repo id on the Hub, in the form "owner/name"
	repo_id = "sarequi/Bowerbird_viewpoint_classifier"

	# Download checkpoint
	ckpt_path = hf_hub_download(
	    repo_id=repo_id,
	    filename="Bbird_viewpoint_classifier.pth",
	)

	# Rebuild the model architecture exactly as in training
	NUM_CLASSES = 4
	# weights=None skips the ImageNet download; the checkpoint loaded below
	# overwrites every parameter anyway
	model = resnet18(weights=None)
	model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

	# Load weights
	state_dict = torch.load(ckpt_path, map_location="cpu")
	model.load_state_dict(state_dict)
	model.eval()

	VIEWPOINT_CLASSES = ["back", "front", "left_side", "right_side"]