sdan committed
Commit 32f164c · verified · 1 Parent(s): bc9a870

Upload folder using huggingface_hub

Files changed (3)
  1. README.md +44 -0
  2. config.json +22 -0
  3. model.safetensors +3 -0
README.md ADDED
@@ -0,0 +1,44 @@
+ ---
+ license: apache-2.0
+ tags:
+ - geolocation
+ - vision
+ - siglip
+ - clip
+ - geoclip
+ datasets:
+ - osv5m
+ pipeline_tag: image-feature-extraction
+ ---
+
+ # GeoSpot Base
+
+ A geolocation model built on SigLIP2-so400m (512px) that predicts GPS coordinates from images.
+
+ ## Model Details
+
+ - **Backbone**: google/siglip2-so400m-patch16-512 (frozen)
+ - **Image Resolution**: 512x512
+ - **Embedding Dim**: 512
+ - **Training Steps**: 206k
+ - **Training Data**: ~10.6M streetview images
+
+ ## Architecture
+
+ GeoCLIP-style contrastive learning between:
+ - **Image Encoder**: SigLIP2 vision tower + MLP projection (1152 → 512)
+ - **Location Encoder**: Multi-scale RFF encoding with learnable capsules
+
+ ## Usage
+
+ ```python
+ from geoclip.model.GeoCLIP import GeoCLIP
+ from safetensors.torch import load_file
+
+ model = GeoCLIP(from_pretrained=False, encoder_name="siglip2")
+ state_dict = load_file("model.safetensors")  # torch.load cannot read the safetensors format
+ model.load_state_dict(state_dict)
+
+ # Predict location from image
+ top_gps, top_probs = model.predict("image.jpg", top_k=5)
+ ```
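Reviewer note: the README's `model.predict("image.jpg", top_k=5)` call returns candidate coordinates with probabilities. The sketch below illustrates how GeoCLIP-style retrieval could produce such a result, by ranking a gallery of candidate GPS points against one image embedding using cosine similarity and a softmax. It is an illustration under stated assumptions, not the code in this commit: `top_k_locations`, `gps_gallery`, `gps_embs`, and `logit_scale` are hypothetical names, and the toy 2-dimensional embeddings stand in for the 512-dimensional ones described in the README.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def top_k_locations(image_emb, gps_gallery, gps_embs, top_k=5, logit_scale=1.0):
    """Rank candidate GPS points against one image embedding.

    Hypothetical helper: gps_gallery[i] is a (lat, lon) tuple and gps_embs[i]
    is the location-encoder embedding assumed to belong to it.
    """
    img = l2_normalize(image_emb)
    logits = []
    for emb in gps_embs:
        loc = l2_normalize(emb)
        logits.append(logit_scale * sum(a * b for a, b in zip(img, loc)))

    # Numerically stable softmax over the whole gallery
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Indices of the top_k most probable candidates, best first
    order = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:top_k]
    return [gps_gallery[i] for i in order], [probs[i] for i in order]


# Toy usage with made-up embeddings
gallery = [(48.8566, 2.3522), (40.7128, -74.0060), (35.6762, 139.6503)]
embeddings = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
top_gps, top_probs = top_k_locations([2.0, 0.1], gallery, embeddings, top_k=2, logit_scale=10.0)
```

The probabilities are normalized over the entire gallery, so the returned top-k values generally sum to less than 1.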
config.json ADDED
@@ -0,0 +1,22 @@
+ {
+   "model_type": "geoclip",
+   "encoder_name": "siglip2",
+   "backbone": "google/siglip2-so400m-patch16-512",
+   "image_resolution": 512,
+   "embedding_dim": 512,
+   "location_encoder": {
+     "sigma": [
+       1,
+       16,
+       256
+     ],
+     "hidden_dim": 1024
+   },
+   "queue_size": 15360,
+   "training": {
+     "steps": 206000,
+     "batch_size": 256,
+     "lr": 0.0003,
+     "precision": "bf16"
+   }
+ }
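Reviewer note: the `location_encoder.sigma` values above (1, 16, 256) suggest one random Fourier feature (RFF) encoding per frequency scale. The following is a minimal sketch of that idea, assuming a Gaussian projection matrix per sigma and a simple rescaling of latitude and longitude to [-1, 1]. The function names, the tiny feature count, and the rescaling are illustrative assumptions; the actual location encoder in this commit may project coordinates differently and feeds each scale into a learned MLP capsule, which is omitted here.

```python
import math
import random

SIGMAS = [1, 16, 256]  # taken from config.json


def make_rff_projection(sigma, num_features=4, in_dim=2, seed=0):
    """Draw a fixed Gaussian projection matrix with standard deviation sigma."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, sigma) for _ in range(in_dim)] for _ in range(num_features)]


def rff_encode(lat, lon, projection):
    """Map (lat, lon) to interleaved [cos, sin] features of random projections."""
    # Assumption: plain rescaling to [-1, 1]; a map projection could be used instead
    x = (lat / 90.0, lon / 180.0)
    features = []
    for row in projection:
        angle = 2.0 * math.pi * sum(w * xi for w, xi in zip(row, x))
        features.append(math.cos(angle))
        features.append(math.sin(angle))
    return features


# One fixed projection per scale: larger sigma means higher-frequency features
PROJECTIONS = [make_rff_projection(s, seed=i) for i, s in enumerate(SIGMAS)]


def multi_scale_encode(lat, lon):
    """Return one RFF feature vector per sigma (coarse to fine)."""
    return [rff_encode(lat, lon, proj) for proj in PROJECTIONS]


encoded = multi_scale_encode(37.7749, -122.4194)
```

Small sigmas produce smooth features that vary slowly across the globe, while large sigmas distinguish nearby points, which is why several scales are typically combined.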
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d859b188a68aee3ca8417dea77f34e52ef03df26dc89e6ab0e243216b86b00ac
+ size 903113660
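Reviewer note: the three lines above are a Git LFS pointer, not the weights themselves. They record the SHA-256 digest and byte size of the real `model.safetensors`. A downloaded copy can be checked against the pointer with a sketch like the one below; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local file path is whatever the download produced.

```python
import hashlib

POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:d859b188a68aee3ca8417dea77f34e52ef03df26dc89e6ab0e243216b86b00ac
size 903113660
"""


def parse_lfs_pointer(text):
    """Split a Git LFS pointer file into its version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Stream the file at `path` and compare its size and digest with the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
# Example: matches_pointer("model.safetensors", pointer) on a downloaded copy
```

Reading in 1 MiB chunks keeps memory use flat even though the file is roughly 900 MB.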