---
license: cc-by-nc-4.0
---

This repository contains the model weights and configuration files for the [**Sphere Encoder**](https://github.com) project.

> [!Note]
> These model weights have been **reproduced** with the released code and yield slightly different evaluation results compared to those reported in the original paper.

# Model Card

| dataset | 🤗 hf model repo | params |
|:--:|:--:|:--:|
| Animal-Faces | [`sphere-l-af`](https://huggingface.co/kaiyuyue/sphere-encoder-models/tree/main/sphere-l-af) | 642M |
| Oxford-Flowers | [`sphere-l-of`](https://huggingface.co/kaiyuyue/sphere-encoder-models/tree/main/sphere-l-of) | 948M |
| ImageNet | [`sphere-l-imagenet`](https://huggingface.co/kaiyuyue/sphere-encoder-models/tree/main/sphere-l-imagenet) | 950M |
| ImageNet | [`sphere-xl-imagenet`](https://huggingface.co/kaiyuyue/sphere-encoder-models/tree/main/sphere-xl-imagenet) | 1.3B |

Download model checkpoints and put them in `./workspace/experiments`. The directory tree should look like this:

```bash
./workspace/experiments/
├── sphere-l-af
│   ├── ckpt/ep0999.pth
│   └── config.json
├── sphere-l-of
├── sphere-l-imagenet
└── sphere-xl-imagenet
```

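
Before running an evaluation, it can help to confirm that the downloaded folders match the expected layout. The snippet below is a minimal sketch, not part of the released code: `find_layout_problems` is a hypothetical helper, and it assumes that every model folder holds a `config.json` plus at least one `.pth` file under `ckpt/`, as shown for `sphere-l-af`.

```python
from pathlib import Path

# Job folder names taken from the model card table.
JOB_DIRS = ["sphere-l-af", "sphere-l-of", "sphere-l-imagenet", "sphere-xl-imagenet"]


def find_layout_problems(root, job_dirs=JOB_DIRS):
    """Return a list of problems; an empty list means the layout looks fine."""
    root = Path(root)
    problems = []
    for name in job_dirs:
        job = root / name
        if not job.is_dir():
            problems.append(f"{name}: folder not found")
            continue
        if not (job / "config.json").is_file():
            problems.append(f"{name}: config.json not found")
        # Assumption: weights are stored as ckpt/*.pth in every model folder,
        # mirroring ckpt/ep0999.pth shown for sphere-l-af.
        if not any((job / "ckpt").glob("*.pth")):
            problems.append(f"{name}: no .pth file under ckpt/")
    return problems


if __name__ == "__main__":
    for problem in find_layout_problems("./workspace/experiments"):
        print(problem)
```

The script prints nothing when all four folders are in place; pass a shorter `job_dirs` list if you only downloaded some of the models.
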
# Evaluation Results

Evaluate **ImageNet** models with `CFG = 1.4`:

```bash
# --job_dir can be
# sphere-l-imagenet, or sphere-xl-imagenet
./run.sh eval.py \
  --job_dir sphere-xl-imagenet \
  --forward_steps 1 4 \
  --report_fid rfid gfid \
  --use_cfg True \
  --cfg_min 1.4 \
  --cfg_max 1.4 \
  --cfg_position combo \
  --rm_folder_after_eval True
```

The evaluation results will be saved in `./workspace/experiments/sphere-xl-imagenet/eval/`:

| dataset | model | steps | rFID ↓ | gFID ↓ | IS ↑ |
|:--:|:--|:--:|:--:|:--:|:--:|
| ImageNet 256x256 | Sphere-L | 1 | 0.62 | 15.69 | 274.5 |
| | Sphere-L | 4 | - | 4.78 | 259.1 |
| | Sphere-XL | 1 | 0.62 | 14.52 | 299.3 |
| | Sphere-XL | 4 | - | 4.05 | 266.0 |

Evaluate unconditional **Animal-Faces** model:

```bash
./run.sh eval.py \
  --job_dir sphere-l-af \
  --forward_steps 1 4 \
  --report_fid gfid \
  --rm_folder_after_eval True
```

| dataset | model | steps | rFID ↓ | gFID ↓ | IS ↑ |
|:--:|:--|:--:|:--:|:--:|:--:|
| Animal-Faces 256x256 | Sphere-L | 1 | - | 21.56 | 8.3 |
| | Sphere-L | 4 | - | 18.73 | 9.8 |

Evaluate **Oxford-Flowers** model with `CFG = 1.6`:

```bash
./run.sh eval.py \
  --job_dir sphere-l-of \
  --forward_steps 1 4 \
  --report_fid gfid \
  --use_cfg True \
  --cfg_min 1.6 \
  --cfg_max 1.6 \
  --cfg_position combo \
  --num_eval_samples 51000 \
  --rm_folder_after_eval True \
  --cache_sampling_noise False
```

`--num_eval_samples 51000` is chosen for the 102 classes so that each class gets 500 samples (102 × 500 = 51,000) when evaluating on 8 GPUs. Adjust it accordingly if you have a different number of GPUs or want to evaluate on a different number of samples.

| dataset | model | steps | rFID ↓ | gFID ↓ | IS ↑ |
|:--:|:--|:--:|:--:|:--:|:--:|
| Oxford-Flowers 256x256 | Sphere-L | 1 | - | 25.10 | 3.4 |
| | Sphere-L | 4 | - | 11.27 | 3.2 |
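
The `--num_eval_samples` arithmetic for Oxford-Flowers can be sketched as follows. This is an illustrative helper, not part of the released code: the function name `num_eval_samples` is hypothetical, and the check that the total splits evenly across GPUs is an assumption drawn from the note that the value was picked for 8 GPUs.

```python
def num_eval_samples(num_classes, samples_per_class, num_gpus):
    """Return (total samples, samples per GPU) for a class-balanced evaluation."""
    total = num_classes * samples_per_class
    # Assumption: the total should divide evenly across GPUs so that every
    # process generates the same number of samples.
    if total % num_gpus != 0:
        raise ValueError(
            f"{total} samples do not split evenly across {num_gpus} GPUs"
        )
    return total, total // num_gpus


if __name__ == "__main__":
    # Setting from the command above: 102 classes, 500 samples each, 8 GPUs.
    print(num_eval_samples(102, 500, 8))  # (51000, 6375)
```

With a different GPU count, pick a per-class sample count for which the call does not raise, then pass the resulting total to `--num_eval_samples`.
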