---
license: cc-by-nc-4.0
---

This repository contains the model weights and configuration files for the [**Sphere Encoder**](https://github.com) project.

> [!Note]
> These model weights have been **reproduced** with the released code and yield slightly different evaluation results compared to those reported in the original paper.

# Model Card

| dataset | 🤗 hf model repo | params |
|:--:|:--:|:--:|
| Animal-Faces | [`sphere-l-af`](https://huggingface.co/kaiyuyue/sphere-encoder-models/tree/main/sphere-l-af) | 642M |
| Oxford-Flowers | [`sphere-l-of`](https://huggingface.co/kaiyuyue/sphere-encoder-models/tree/main/sphere-l-of) | 948M |
| ImageNet | [`sphere-l-imagenet`](https://huggingface.co/kaiyuyue/sphere-encoder-models/tree/main/sphere-l-imagenet) | 950M |
| ImageNet | [`sphere-xl-imagenet`](https://huggingface.co/kaiyuyue/sphere-encoder-models/tree/main/sphere-xl-imagenet) | 1.3B |

Download model checkpoints and put them in `./workspace/experiments`.
The directory tree should look like this:

```bash
./workspace/experiments/
├── sphere-l-af
│   ├── ckpt/ep0999.pth
│   └── config.json
├── sphere-l-of
├── sphere-l-imagenet
└── sphere-xl-imagenet
```
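
After downloading, a quick check along the following lines can confirm the layout before running evaluation. This is a minimal sketch: the function name `check_experiments` is hypothetical and not part of the released code. It assumes every model folder carries a `config.json` and at least one `.pth` file under `ckpt/`, as shown for `sphere-l-af`; the other folders are not expanded in the tree, so that part is an assumption.

```bash
# Hypothetical helper, not part of the released code.
# Assumes each model folder holds config.json and ckpt/*.pth,
# mirroring the layout shown for sphere-l-af.
check_experiments() {
  local root="${1:-./workspace/experiments}"
  local status=0
  local name dir
  for name in sphere-l-af sphere-l-of sphere-l-imagenet sphere-xl-imagenet; do
    dir="$root/$name"
    if [ -f "$dir/config.json" ] && ls "$dir"/ckpt/*.pth >/dev/null 2>&1; then
      echo "ok       $name"
    else
      echo "missing  $name"
      status=1
    fi
  done
  # Non-zero exit status if any model folder is incomplete.
  return "$status"
}
```

Calling `check_experiments` with no argument inspects `./workspace/experiments`; pass another path as the first argument to check a different location.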

<br>

# Evaluation Results 

Evaluate **ImageNet** models with `CFG = 1.4`:

```bash
# --job_dir can be
#   sphere-l-imagenet, or sphere-xl-imagenet

./run.sh eval.py \
  --job_dir sphere-xl-imagenet \
  --forward_steps 1 4 \
  --report_fid rfid gfid \
  --use_cfg True \
  --cfg_min 1.4 \
  --cfg_max 1.4 \
  --cfg_position combo \
  --rm_folder_after_eval True
```

The evaluation results will be saved in `./workspace/experiments/sphere-xl-imagenet/eval/`:

| dataset | model | steps | rFID &darr; | gFID &darr; | IS &uarr; |
|:--:|:--|:--:|:--:|:--:|:--:|
| ImageNet 256x256 | Sphere-L | 1 | 0.62 | 15.69 | 274.5 |
|| Sphere-L | 4 | - | 4.78 | 259.1 |
|| Sphere-XL | 1 | 0.62 | 14.52 | 299.3 |
|| Sphere-XL | 4 | - | 4.05 | 266.0 |

Evaluate the unconditional **Animal-Faces** model:

```bash
./run.sh eval.py \
  --job_dir sphere-l-af \
  --forward_steps 1 4 \
  --report_fid gfid \
  --rm_folder_after_eval True
```

| dataset | model | steps | rFID &darr; | gFID &darr; | IS &uarr; |
|:--:|:--|:--:|:--:|:--:|:--:|
| Animal-Faces 256x256 | Sphere-L | 1 | - | 21.56 | 8.3 |
|| Sphere-L | 4 | - | 18.73 | 9.8 |

Evaluate the **Oxford-Flowers** model with `CFG = 1.6`:

```bash
./run.sh eval.py \
  --job_dir sphere-l-of \
  --forward_steps 1 4 \
  --report_fid gfid \
  --use_cfg True \
  --cfg_min 1.6 \
  --cfg_max 1.6 \
  --cfg_position combo \
  --num_eval_samples 51000 \
  --rm_folder_after_eval True \
  --cache_sampling_noise False
```

`--num_eval_samples 51000` is chosen for the 102 classes so that each class gets 500 samples (102 × 500 = 51,000) when evaluating on 8 GPUs.
Adjust it accordingly if you have a different number of GPUs or want to evaluate on a different number of samples.
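
The arithmetic behind that number can be sketched with two small shell helpers. The function names `total_eval_samples` and `samples_per_gpu` are hypothetical, and the even per-GPU split is an assumption about how the evaluation is sharded, not something confirmed by the released code.

```bash
# Hypothetical helpers for picking a value for --num_eval_samples.
total_eval_samples() {
  # $1 = number of classes, $2 = samples per class
  echo $(( $1 * $2 ))
}

samples_per_gpu() {
  # $1 = total samples, $2 = number of GPUs
  # Fails when the total does not split evenly (assumed requirement).
  if [ $(( $1 % $2 )) -ne 0 ]; then
    echo "total $1 is not divisible by $2 GPUs" >&2
    return 1
  fi
  echo $(( $1 / $2 ))
}

total_eval_samples 102 500   # prints 51000
samples_per_gpu 51000 8      # prints 6375
```

For a different setup, recompute the total first and pass the result to `--num_eval_samples`.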

| dataset | model | steps | rFID &darr; | gFID &darr; | IS &uarr; |
|:--:|:--|:--:|:--:|:--:|:--:|
| Oxford-Flowers 256x256 | Sphere-L | 1 | - | 25.10 | 3.4 |
|| Sphere-L | 4 | - | 11.27 | 3.2 |