---
license: cc-by-nc-4.0
---
This repository contains the model weights and configuration files for the [**Sphere Encoder**](https://github.com) project.
> [!Note]
> These model weights have been **reproduced** with the released code and yield slightly different evaluation results compared to those reported in the original paper.
# Model Card
| dataset | 🤗 hf model repo | params |
|:--:|:--:|:--:|
| Animal-Faces | [`sphere-l-af`](https://huggingface.co/kaiyuyue/sphere-encoder-models/tree/main/sphere-l-af) | 642M |
| Oxford-Flowers | [`sphere-l-of`](https://huggingface.co/kaiyuyue/sphere-encoder-models/tree/main/sphere-l-of) | 948M |
| ImageNet | [`sphere-l-imagenet`](https://huggingface.co/kaiyuyue/sphere-encoder-models/tree/main/sphere-l-imagenet) | 950M |
| ImageNet | [`sphere-xl-imagenet`](https://huggingface.co/kaiyuyue/sphere-encoder-models/tree/main/sphere-xl-imagenet) | 1.3B |
Download model checkpoints and put them in `./workspace/experiments`.
The directory tree should look like this:
```bash
./workspace/experiments/
├── sphere-l-af
│   ├── ckpt/ep0999.pth
│   └── config.json
├── sphere-l-of
├── sphere-l-imagenet
└── sphere-xl-imagenet
```
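Before running an evaluation, it can help to confirm that the downloaded files match the tree above. The following is a minimal sketch, not part of the released code: the helper name `find_layout_problems` is hypothetical, and it assumes every model folder follows the layout shown for `sphere-l-af` (a `config.json` plus at least one `*.pth` file under `ckpt/`).

```python
from pathlib import Path

# Job directories named in the model table above.
JOB_DIRS = ["sphere-l-af", "sphere-l-of", "sphere-l-imagenet", "sphere-xl-imagenet"]


def find_layout_problems(root, job_dirs=JOB_DIRS):
    """Return a list of human-readable problems found under `root` (empty if none)."""
    root = Path(root)
    problems = []
    for name in job_dirs:
        job = root / name
        if not job.is_dir():
            problems.append(f"missing directory: {job}")
            continue
        if not (job / "config.json").is_file():
            problems.append(f"missing config: {job / 'config.json'}")
        # Assumption: checkpoints are *.pth files inside ckpt/, as shown for sphere-l-af.
        ckpt = job / "ckpt"
        if not (ckpt.is_dir() and any(ckpt.glob("*.pth"))):
            problems.append(f"no *.pth checkpoint in: {ckpt}")
    return problems


if __name__ == "__main__":
    for problem in find_layout_problems("./workspace/experiments"):
        print(problem)
```

If nothing is printed, every listed model folder has the expected files. Pass a shorter `job_dirs` list if you only downloaded some of the models.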
<br>
# Evaluation Results
Evaluate **ImageNet** models with `CFG = 1.4`:
```bash
# --job_dir can be
# sphere-l-imagenet, or sphere-xl-imagenet
./run.sh eval.py \
--job_dir sphere-xl-imagenet \
--forward_steps 1 4 \
--report_fid rfid gfid \
--use_cfg True \
--cfg_min 1.4 \
--cfg_max 1.4 \
--cfg_position combo \
--rm_folder_after_eval True
```
The evaluation results will be saved in `./workspace/experiments/sphere-xl-imagenet/eval/`:
| dataset | model | steps | rFID &darr; | gFID &darr; | IS &uarr; |
|:--:|:--|:--:|:--:|:--:|:--:|
| ImageNet 256x256 | Sphere-L | 1 | 0.62 | 15.69 | 274.5 |
|| Sphere-L | 4 | - | 4.78 | 259.1 |
|| Sphere-XL | 1 | 0.62 | 14.52 | 299.3 |
|| Sphere-XL | 4 | - | 4.05 | 266.0 |
Evaluate the unconditional **Animal-Faces** model:
```bash
./run.sh eval.py \
--job_dir sphere-l-af \
--forward_steps 1 4 \
--report_fid gfid \
--rm_folder_after_eval True
```
| dataset | model | steps | rFID &darr; | gFID &darr; | IS &uarr; |
|:--:|:--|:--:|:--:|:--:|:--:|
| Animal-Faces 256x256 | Sphere-L | 1 | - | 21.56 | 8.3 |
|| Sphere-L | 4 | - | 18.73 | 9.8 |
Evaluate the **Oxford-Flowers** model with `CFG = 1.6`:
```bash
./run.sh eval.py \
--job_dir sphere-l-of \
--forward_steps 1 4 \
--report_fid gfid \
--use_cfg True \
--cfg_min 1.6 \
--cfg_max 1.6 \
--cfg_position combo \
--num_eval_samples 51000 \
--rm_folder_after_eval True \
--cache_sampling_noise False
```
`--num_eval_samples 51000` is chosen for the 102 classes so that each class gets 500 samples (102 × 500 = 51,000) when evaluating on 8 GPUs.
Adjust it accordingly if you have a different number of GPUs or want to evaluate on a different number of samples.
| dataset | model | steps | rFID &darr; | gFID &darr; | IS &uarr; |
|:--:|:--|:--:|:--:|:--:|:--:|
| Oxford-Flowers 256x256 | Sphere-L | 1 | - | 25.10 | 3.4 |
|| Sphere-L | 4 | - | 11.27 | 3.2 |
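
The sample-count arithmetic for Oxford-Flowers can be sketched as below. This is an illustrative helper, not part of the released code: the function name `eval_sample_plan` is hypothetical, and it assumes the only constraint is that the total sample count divides evenly across GPUs, which this README does not state explicitly.

```python
def eval_sample_plan(num_classes, samples_per_class, num_gpus):
    """Return (total_samples, samples_per_gpu) for a class-balanced evaluation."""
    total = num_classes * samples_per_class
    # Assumption: the total must split evenly across GPUs.
    if total % num_gpus != 0:
        raise ValueError(
            f"{total} samples do not divide evenly across {num_gpus} GPUs"
        )
    return total, total // num_gpus


if __name__ == "__main__":
    # 102 classes x 500 samples per class on 8 GPUs, as in the command above.
    print(eval_sample_plan(102, 500, 8))  # (51000, 6375)
```

The first value in the returned pair is the number to pass to `--num_eval_samples`. If the helper raises an error for your GPU count, pick a different number of samples per class.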