---
license: mit
tags:
- image-classification
- remote-sensing
- resnet
- pytorch
- transformers
- self-supervised-learning
- contrastive-learning
- moco
---

# MoCo-TP-ResNet-50

ResNet-50 model pre-trained with MoCo-v2 and Temporal Pairing (TP) for geography-aware self-supervised learning on remote sensing imagery.

## Model Details

- **Architecture:** ResNet-50
- **Pre-training:** MoCo-v2 with Temporal Pairing (TP)
- **Input size:** 224×224×3
- **Feature dimension:** 2048 (before the classification head)
- **Parameters:** ~23.6M
- **Training:** Self-supervised pre-training on the fMoW dataset (200 epochs)

## Usage

### Feature Extraction

```python
from transformers import AutoModelForImageClassification
import torch

# Load the model for feature extraction
model = AutoModelForImageClassification.from_pretrained(
    "BiliSakura/MoCo-TP-ResNet-50",
    trust_remote_code=True,
)

# Inference: extract features
model.eval()
input_image = torch.randn(1, 3, 224, 224)  # (batch, channels, height, width)

with torch.no_grad():
    outputs = model(pixel_values=input_image, return_dict=True)
    features = outputs["features"]  # shape: (1, 2048)
```

### Fine-tuning for Classification

To fine-tune the model on a specific classification task, add a classification head:

```python
from transformers import AutoModelForImageClassification, AutoConfig

# Load the config and set the number of target classes
config = AutoConfig.from_pretrained(
    "BiliSakura/MoCo-TP-ResNet-50",
    trust_remote_code=True,
)
config.num_labels = 10  # your number of classes

# Load the model with the modified config
model = AutoModelForImageClassification.from_pretrained(
    "BiliSakura/MoCo-TP-ResNet-50",
    config=config,
    trust_remote_code=True,
)

# The model automatically replaces the identity head with a classification head;
# you can now fine-tune on your dataset.
```

## Model Architecture

The model consists of:
- **Backbone:** ResNet-50 (conv1, bn1, layer1–4)
- **Feature extractor:** Adaptive average pooling + flattening
- **Classification head:** Linear layer (2048 → num_labels), or Identity for feature extraction
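The pooling-and-head stage can be illustrated in isolation; the 7×7 feature map below mimics what a ResNet-50 backbone produces for a 224×224 input:

```python
import torch
import torch.nn as nn

# Simulated backbone output: ResNet-50 yields a (N, 2048, 7, 7) map for 224x224 inputs
feature_map = torch.randn(1, 2048, 7, 7)

pool = nn.AdaptiveAvgPool2d(1)                  # -> (1, 2048, 1, 1)
features = torch.flatten(pool(feature_map), 1)  # -> (1, 2048)

# Identity head for feature extraction; use nn.Linear(2048, num_labels) to classify
head = nn.Identity()
output = head(features)
```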

## Pre-training Details

This model was pre-trained using:
- **Method:** MoCo-v2 (Momentum Contrast) with Temporal Pairing
- **Dataset:** fMoW (Functional Map of the World)
- **Epochs:** 200
- **Loss:** InfoNCE (the contrastive loss introduced by Contrastive Predictive Coding)
- **Augmentation:** MoCo-v2 augmentations (random resized crop, color jitter, grayscale, Gaussian blur)

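The contrastive objective can be sketched as the InfoNCE loss used by MoCo: each query must match its positive key (here, the temporal pair of the same location) against a queue of negatives. Shapes and the temperature value below are illustrative, not the actual training configuration:

```python
import torch
import torch.nn.functional as F

def info_nce(q, k_pos, queue, temperature=0.2):
    """InfoNCE loss in the MoCo style (tau=0.2 follows MoCo-v2; illustrative only)."""
    q = F.normalize(q, dim=1)
    k_pos = F.normalize(k_pos, dim=1)
    queue = F.normalize(queue, dim=1)
    l_pos = (q * k_pos).sum(dim=1, keepdim=True)       # (N, 1) positive logits
    l_neg = q @ queue.t()                              # (N, K) negative logits
    logits = torch.cat([l_pos, l_neg], dim=1) / temperature
    labels = torch.zeros(q.size(0), dtype=torch.long)  # positives sit at index 0
    return F.cross_entropy(logits, labels)

q = torch.randn(8, 128)         # query embeddings
k = torch.randn(8, 128)         # positive keys (temporal pairs)
queue = torch.randn(4096, 128)  # memory queue of negatives
loss = info_nce(q, k, queue)
```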
## Citation

If you use this model, please cite the original Geography-Aware SSL paper:

```bibtex
@inproceedings{ayush2021geography,
  title={Geography-Aware Self-Supervised Learning},
  author={Ayush, Kumar and Uzkent, Burak and Meng, Chenlin and Tanmay, Kumar and Burke, Marshall and Lobell, David and Ermon, Stefano},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2021}
}
```

**Original repository:** [sustainlab-group/geography-aware-ssl](https://github.com/sustainlab-group/geography-aware-ssl)

## License

MIT License; intended for academic use only.