---
license: cc-by-nc-4.0
---

This repository contains the model weights and configuration files for the [**Sphere Encoder**](https://github.com) project.

> [!Note]
> These model weights have been **reproduced** with the released code and yield slightly different evaluation results compared to those reported in the original paper.

# Model Card

| dataset | 🤗 hf model repo | params |
|:--:|:--:|:--:|
| Animal-Faces | [`sphere-l-af`](https://huggingface.co/kaiyuyue/sphere-encoder-models/tree/main/sphere-l-af) | 642M |
| Oxford-Flowers | [`sphere-l-of`](https://huggingface.co/kaiyuyue/sphere-encoder-models/tree/main/sphere-l-of) | 948M |
| ImageNet | [`sphere-l-imagenet`](https://huggingface.co/kaiyuyue/sphere-encoder-models/tree/main/sphere-l-imagenet) | 950M |
| ImageNet | [`sphere-xl-imagenet`](https://huggingface.co/kaiyuyue/sphere-encoder-models/tree/main/sphere-xl-imagenet) | 1.3B |

Download model checkpoints and put them in `./workspace/experiments`.
The directory tree should look like this:

```bash
./workspace/experiments/
├── sphere-l-af
│   ├── ckpt/ep0999.pth
│   └── config.json
├── sphere-l-of
├── sphere-l-imagenet
└── sphere-xl-imagenet
```

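One way to fetch a single model folder into that layout is the Hugging Face CLI. The sketch below is an assumption rather than part of the released code: it takes the repo id `kaiyuyue/sphere-encoder-models` from the links in the table above, relies on `huggingface-cli download` with its `--include` and `--local-dir` options, and introduces two hypothetical helpers, `download_model` and `check_model`. `check_model` further assumes that every model folder mirrors `sphere-l-af` (a `config.json` next to a `ckpt/` folder).

```bash
# Hypothetical helpers, not part of the released code.
# HF_CLI can be overridden, e.g. HF_CLI=hf if your installation
# provides the newer `hf` command instead of `huggingface-cli`.
HF_CLI="${HF_CLI:-huggingface-cli}"
REPO_ID="kaiyuyue/sphere-encoder-models"

# Download one model folder (e.g. sphere-l-af) into the experiments root.
download_model() {
  local job="$1" root="${2:-./workspace/experiments}"
  "$HF_CLI" download "$REPO_ID" --include "$job/*" --local-dir "$root"
}

# Check that a model folder contains the entries shown in the tree above.
# Assumption: every model mirrors sphere-l-af (config.json plus ckpt/).
check_model() {
  local job="$1" root="${2:-./workspace/experiments}"
  [ -f "$root/$job/config.json" ] && [ -d "$root/$job/ckpt" ]
}
```

For example, `download_model sphere-l-af && check_model sphere-l-af` would fetch the Animal-Faces model and confirm that the expected entries exist.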
<br>

# Evaluation Results

Evaluate **ImageNet** models with `CFG = 1.4`:

```bash
# --job_dir can be
# sphere-l-imagenet, or sphere-xl-imagenet

./run.sh eval.py \
  --job_dir sphere-xl-imagenet \
  --forward_steps 1 4 \
  --report_fid rfid gfid \
  --use_cfg True \
  --cfg_min 1.4 \
  --cfg_max 1.4 \
  --cfg_position combo \
  --rm_folder_after_eval True
```

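Because `--job_dir` accepts either ImageNet model, both can be evaluated in one pass with a small loop. This is only a convenience sketch: the flags are copied from the command above, while the `eval_imagenet` function and the `RUNNER` override (handy for a dry run with `RUNNER=echo`) are hypothetical additions.

```bash
# Hypothetical wrapper around the command above.
# Set RUNNER=echo to print the commands instead of running them.
RUNNER="${RUNNER:-./run.sh}"

eval_imagenet() {
  local job
  for job in sphere-l-imagenet sphere-xl-imagenet; do
    "$RUNNER" eval.py \
      --job_dir "$job" \
      --forward_steps 1 4 \
      --report_fid rfid gfid \
      --use_cfg True \
      --cfg_min 1.4 \
      --cfg_max 1.4 \
      --cfg_position combo \
      --rm_folder_after_eval True
  done
}
```

Calling `eval_imagenet` then runs the two evaluations one after the other.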
The evaluation results will be saved in `./workspace/experiments/sphere-xl-imagenet/eval/`:

| dataset | model | steps | rFID ↓ | gFID ↓ | IS ↑ |
|:--:|:--|:--:|:--:|:--:|:--:|
| ImageNet 256x256 | Sphere-L | 1 | 0.62 | 15.69 | 274.5 |
| | Sphere-L | 4 | - | 4.78 | 259.1 |
| | Sphere-XL | 1 | 0.62 | 14.52 | 299.3 |
| | Sphere-XL | 4 | - | 4.05 | 266.0 |

Evaluate unconditional **Animal-Faces** model:

```bash
./run.sh eval.py \
  --job_dir sphere-l-af \
  --forward_steps 1 4 \
  --report_fid gfid \
  --rm_folder_after_eval True
```

| dataset | model | steps | rFID ↓ | gFID ↓ | IS ↑ |
|:--:|:--|:--:|:--:|:--:|:--:|
| Animal-Faces 256x256 | Sphere-L | 1 | - | 21.56 | 8.3 |
| | Sphere-L | 4 | - | 18.73 | 9.8 |

Evaluate **Oxford-Flowers** model with `CFG = 1.6`:

```bash
./run.sh eval.py \
  --job_dir sphere-l-of \
  --forward_steps 1 4 \
  --report_fid gfid \
  --use_cfg True \
  --cfg_min 1.6 \
  --cfg_max 1.6 \
  --cfg_position combo \
  --num_eval_samples 51000 \
  --rm_folder_after_eval True \
  --cache_sampling_noise False
```

`--num_eval_samples 51000` is set so that each of the 102 classes has 500 samples for evaluation (102 × 500 = 51,000) on 8 GPUs.
Adjust this value accordingly if you have a different number of GPUs or want to evaluate on a different number of samples.

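The arithmetic behind that value can be written down as a small helper. It is a sketch under assumptions: `num_eval_samples` is a hypothetical function, and the even-split warning reflects a guess that samples are divided equally across GPUs, which the text implies but does not spell out.

```bash
# Hypothetical helper: total samples for a class-balanced evaluation.
# Usage: num_eval_samples <classes> <samples_per_class> <gpus>
num_eval_samples() {
  local classes="$1" per_class="$2" gpus="$3"
  local total=$(( classes * per_class ))
  # Assumption: the evaluation splits samples equally across GPUs.
  if [ $(( total % gpus )) -ne 0 ]; then
    echo "warning: $total samples do not split evenly across $gpus GPUs" >&2
  fi
  echo "$total"
}

num_eval_samples 102 500 8   # prints 51000, i.e. 6375 samples per GPU
```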
| dataset | model | steps | rFID ↓ | gFID ↓ | IS ↑ |
|:--:|:--|:--:|:--:|:--:|:--:|
| Oxford-Flowers 256x256 | Sphere-L | 1 | - | 25.10 | 3.4 |
| | Sphere-L | 4 | - | 11.27 | 3.2 |