# Google ViT Model

## Model Class
```python
import torch
import torch.nn as nn
from transformers import ViTModel

# Assumption: `device` is not defined in this snippet; a typical choice is shown.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Pretrained ViT backbone (ImageNet-21k checkpoint).
base_model = ViTModel.from_pretrained("google/vit-base-patch16-224-in21k")


class ViTForRegression(nn.Module):
    """ViT backbone followed by a linear head that regresses `num_outputs` values."""

    def __init__(self, base_model, num_outputs=2):
        super().__init__()
        self.base_model = base_model
        hidden_size = base_model.config.hidden_size
        self.regression_head = nn.Linear(hidden_size, num_outputs)

    def forward(self, pixel_values):
        outputs = self.base_model(pixel_values=pixel_values)
        # Pooled [CLS] representation, shape (batch, hidden_size).
        pooler_output = outputs.pooler_output
        return self.regression_head(pooler_output)


model = ViTForRegression(base_model).to(device)
```
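One way to check the wiring of the wrapper without downloading the checkpoint is to run it on a tiny, randomly initialized ViT. The sketch below assumes `torch` and `transformers` are installed. The small `ViTConfig` values are arbitrary, chosen only for speed, and are not the configuration of the pretrained model. The class is repeated so the block runs on its own.

```python
import torch
import torch.nn as nn
from transformers import ViTConfig, ViTModel


class ViTForRegression(nn.Module):
    """Same wrapper as above, repeated so this sketch is self-contained."""

    def __init__(self, base_model, num_outputs=2):
        super().__init__()
        self.base_model = base_model
        self.regression_head = nn.Linear(base_model.config.hidden_size, num_outputs)

    def forward(self, pixel_values):
        outputs = self.base_model(pixel_values=pixel_values)
        return self.regression_head(outputs.pooler_output)


# Hypothetical tiny configuration (random weights, no download).
tiny_config = ViTConfig(
    hidden_size=32,
    num_hidden_layers=1,
    num_attention_heads=2,
    intermediate_size=64,
    image_size=32,
    patch_size=16,
)
tiny_model = ViTForRegression(ViTModel(tiny_config)).eval()

# Dummy batch of two RGB images matching the configured image size.
dummy_pixels = torch.randn(2, 3, 32, 32)
with torch.no_grad():
    preds = tiny_model(dummy_pixels)

print(preds.shape)  # one row per image, one column per regression output
```

With the real backbone, the input would instead be 224x224 images preprocessed for the checkpoint, and the head would read the 768-dimensional pooled output.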
## How to Run

In the notebook `ViT.ipynb`, replace the line:

```python
dataset_test = load_dataset("gydou/released_img")
```

with the location of your test dataset.

NOTE: No `.pth` checkpoint is included because this model did not perform well enough on the sample test dataset.
## Training Dataset Statistics

```python
lat_std = 0.0006914493505038013
lon_std = 0.0006539239061573955
lat_mean = 39.9517411499467
lon_mean = -75.19143213125122
```
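Statistics like these are commonly used to z-score the latitude and longitude targets before training and to map predictions back to degrees afterwards. The sketch below assumes that convention (the README does not state it). The helper names `normalize`, `denormalize`, and `haversine_m` are hypothetical, and the haversine distance is included only as one possible way to express prediction error in meters.

```python
import math

# Training-set statistics copied from the section above (decimal degrees).
LAT_MEAN, LAT_STD = 39.9517411499467, 0.0006914493505038013
LON_MEAN, LON_STD = -75.19143213125122, 0.0006539239061573955


def normalize(lat, lon):
    """Map coordinates in degrees to z-scores (assumed training convention)."""
    return (lat - LAT_MEAN) / LAT_STD, (lon - LON_MEAN) / LON_STD


def denormalize(lat_z, lon_z):
    """Map z-scored model outputs back to coordinates in degrees."""
    return lat_z * LAT_STD + LAT_MEAN, lon_z * LON_STD + LON_MEAN


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6_371_000.0):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


# Example: a prediction one standard deviation north of the mean location.
pred_lat, pred_lon = denormalize(1.0, 0.0)
error_m = haversine_m(LAT_MEAN, LON_MEAN, pred_lat, pred_lon)
print(f"predicted ({pred_lat:.6f}, {pred_lon:.6f}), {error_m:.1f} m from the mean")
```

Under this assumption, an error of one standard deviation in latitude corresponds to a few tens of meters on the ground, which gives a rough sense of the scale the regression head has to resolve.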