---
license: cc
datasets:
- Zilun/RS5M
language:
- en
metrics:
- accuracy
- recall
---
## GeoRSCLIP Model
* **GeoRSCLIP with ViT-B-32 and ViT-H-14 backbone**
* **GeoRSCLIP-FT for retrieval**


### Installation

* Install PyTorch following the instructions on the official website. We tested with torch 2.0.1 + CUDA 11.8 and torch 2.1.0 + CUDA 12.1; the command below installs the former.

```bash
  pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118
```
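
To confirm that the installed build matches one of the two tested combinations, a small check like the following may help. This is only a sketch: `TESTED_COMBOS`, `is_tested_combo`, and `check_installed` are illustrative names and are not part of this repo.

```python
# Combinations listed as tested in this README: (torch version, CUDA version).
TESTED_COMBOS = {("2.0.1", "11.8"), ("2.1.0", "12.1")}


def is_tested_combo(torch_version, cuda_version):
    """Return True if the (torch, CUDA) pair is one of the tested combinations."""
    # torch.__version__ may carry a local suffix such as "2.0.1+cu118".
    base_version = torch_version.split("+")[0]
    return (base_version, cuda_version) in TESTED_COMBOS


def check_installed():
    """Describe the locally installed torch build, if any."""
    try:
        import torch
    except ImportError:
        return "torch is not installed"
    # torch.version.cuda is None for CPU-only builds.
    status = "tested" if is_tested_combo(torch.__version__, torch.version.cuda) else "untested"
    return f"torch {torch.__version__}, CUDA {torch.version.cuda}: {status} combination"


print(check_installed())
```

An "untested" result does not mean the model will fail to run, only that the combination is outside what the authors report having tried.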

* Install the other dependencies:

```bash
  pip install pillow pandas scikit-learn ftfy tqdm matplotlib transformers adapter-transformers open_clip_torch pycocotools timm clip-benchmark torch-rs
```

### Usage

* Clone the repo from: https://huggingface.co/Zilun/GeoRSCLIP

```bash
git clone https://huggingface.co/Zilun/GeoRSCLIP
cd GeoRSCLIP
```

* Unzip the test data:
```bash
unzip data/rs5m_test_data.zip
```

* Run the inference script:
```bash
  python codebase/inference.py --ckpt-path /your/local/path/to/RS5M_ViT-B-32.pt --test-dataset-dir /your/local/path/to/rs5m_test_data
```

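* (Optional) To load only the GeoRSCLIP model, with the ViT-B-32 backbone:

```python
import open_clip
import torch
# inference_tool is not a pip package; it is assumed to be the helper module
# next to codebase/inference.py, so run from that directory or add it to sys.path.
from inference_tool import get_preprocess

ckpt_path = "/your/local/path/to/RS5M_ViT-B-32.pt"

# Build the ViT-B-32 architecture, then overwrite its weights with the GeoRSCLIP checkpoint.
model, _, _ = open_clip.create_model_and_transforms("ViT-B/32", pretrained="openai")
checkpoint = torch.load(ckpt_path, map_location="cpu")
# strict=False tolerates key mismatches; msg records any missing or unexpected keys.
msg = model.load_state_dict(checkpoint, strict=False)
model = model.to("cuda").eval()  # inference mode

# Image preprocessing at the model's input resolution.
img_preprocess = get_preprocess(image_resolution=224)
```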
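
Because the checkpoint is loaded with `strict=False`, mismatched keys are skipped silently, so it can be worth inspecting the returned `msg` object. `load_state_dict` returns an object with `missing_keys` and `unexpected_keys` lists; the helper below is a hypothetical sketch (the name `summarize_load_result` is not part of this repo) that formats them for a quick look.

```python
def summarize_load_result(missing_keys, unexpected_keys, max_shown=5):
    """Return a short report of a non-strict state-dict load."""
    lines = []
    for label, keys in (("missing", missing_keys), ("unexpected", unexpected_keys)):
        keys = list(keys)
        # Show at most max_shown key names per category.
        shown = ", ".join(keys[:max_shown])
        if len(keys) > max_shown:
            shown += ", ..."
        lines.append(f"{label}: {len(keys)}" + (f" ({shown})" if keys else ""))
    return "\n".join(lines)


# With the `msg` object from the loading snippet:
#   print(summarize_load_result(msg.missing_keys, msg.unexpected_keys))
```

A long list of missing keys would suggest that the checkpoint and the chosen architecture do not match, for example a ViT-H-14 checkpoint loaded into a ViT-B-32 model.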

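* With the ViT-H-14 backbone:

```python
import open_clip
import torch
# inference_tool is not a pip package; it is assumed to be the helper module
# next to codebase/inference.py, so run from that directory or add it to sys.path.
from inference_tool import get_preprocess

ckpt_path = "/your/local/path/to/RS5M_ViT-H-14.pt"

# Build the ViT-H-14 architecture, then overwrite its weights with the GeoRSCLIP checkpoint.
model, _, _ = open_clip.create_model_and_transforms("ViT-H/14", pretrained="laion2b_s32b_b79k")
checkpoint = torch.load(ckpt_path, map_location="cpu")
# strict=False tolerates key mismatches; msg records any missing or unexpected keys.
msg = model.load_state_dict(checkpoint, strict=False)
model = model.to("cuda").eval()  # inference mode

# Image preprocessing at the model's input resolution.
img_preprocess = get_preprocess(image_resolution=224)
```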
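
Once a model is loaded, it can be used for zero-shot classification in the usual open_clip way. The following is a minimal sketch under several assumptions: the prompt template, the class names, and the helper names `build_prompts` and `zero_shot_probs` are illustrative and not taken from this repo, and `img_preprocess` is assumed to return an image tensor like a torchvision transform does.

```python
def build_prompts(class_names, template="a satellite photo of {}."):
    """Turn class names into text prompts (the template is an assumption)."""
    return [template.format(name) for name in class_names]


def zero_shot_probs(model, tokenizer, image_tensor, class_names, device="cuda"):
    """Return softmax probabilities over class_names for a batch of images."""
    import torch  # imported lazily so build_prompts works without torch installed

    text = tokenizer(build_prompts(class_names)).to(device)
    with torch.no_grad():
        image_features = model.encode_image(image_tensor.to(device))
        text_features = model.encode_text(text)
        # Normalize so the dot product below is a cosine similarity.
        image_features = image_features / image_features.norm(dim=-1, keepdim=True)
        text_features = text_features / text_features.norm(dim=-1, keepdim=True)
        probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)
    return probs.cpu()


# Hypothetical usage with the objects from the loading snippets:
#   from PIL import Image
#   tokenizer = open_clip.get_tokenizer("ViT-H-14")  # or "ViT-B-32"
#   image = img_preprocess(Image.open("example.jpg")).unsqueeze(0)
#   probs = zero_shot_probs(model, tokenizer, image, ["airport", "forest", "harbor"])
```

The tokenizer name should match the backbone that was loaded, and `device` should match wherever the model was moved.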